Finding Nemo Onetangi

Being a freelance contractor is often stressful – parachuted into a budget constrained project with tight timeframes and where you know next to nothing about the system you will be debugging or co-developing.

Often I find myself working on some back-end system asking myself a question that I have asked myself many times before – “that piece of information in the UI is sourced from what column in what database table?”. Put another way, Onetangi is showing up in the application user interface, the data is coming from some database table, but there are many, I do not know which one, I need to know, and I need to know independently and using desperate means if necessary. Finding the key database table/column often requires a lot of hunting and foraging too and, with hundreds of thousands of lines of code, and micro services scattered and hidden all over the place, it is often easier just to go to the backend database and examine table content hoping the table name can give me a clue as to what it contains (eg. tbl_countries, tblPaymentTypes, massSpectrometryLookup are names that reveal their likely content). However if there are many hundreds of tables, each with many columns, finding Nemo Onetangi can take some time, and most likely it is a waste of time too.

Fortunately I have encountered this problem before, many times before, and the technique I use to approach the problem depends on the technology, tools, the systems access granted to me, and the time I have to perform the task, …. The approach I am going to describe here is from an Oracle (O)RDBMS perspective, and my approach uses a piece of home-rolled PL/SQL. Other approaches involve changing content in the UI followed by grep’ing out the content from the trace or REDO log file, stepping through the code with a debugger, breakpoints, log content capture, and so on.

Some years back I wrote a short script that can be summarized as, in Oracle terms, query all text columns, everywhere, in every table, and pattern match a search string [in my case for the purposes of this blog, Onetangi (and München), but substitute for your specific search criteria]. I have decided to publish the script here, after a bit of sanitization and oversimplification, and include it inline below.

The first part of the script generates some dummy data to demonstrate code execution.  The latter part of the script shows the output from execution. The code is relatively trivial, can be modified for other needs, and if you want to use the code for your own purposes, feel free to do so and without credit.

SQL>
SQL> create table test1(a varchar2(10), b nvarchar2(10), c char(20), d date, e number);
Table created.
SQL> insert into test1(a) values('Onetangi');
1 row created.
SQL> insert into test1(a, b) values('Waiheke', 'München');
1 row created.
SQL> create table test2(a varchar2(10), b nvarchar2(10), c char(20), d date, e number);
Table created.
SQL> insert into test2(a, c) values('Waiheke', 'Onetangi');
1 row created.
SQL> create table test3(a varchar2(10), b nvarchar2(10), c char(20), d date, e number);
Table created.
SQL> insert into test3(c) values('Waiheke');

1 row created.

SQL>
SQL> create table test4(a varchar2(10), b nvarchar2(10), c char(20), d date, e number);

Table created.

SQL> insert into test4(b) values('München');

1 row created.

SQL> commit;

Commit complete.

SQL>

SQL> declare
2    found number;
3    stmt varchar2(500) := 'select count(*) from dual where exists (select 1 from #TABLE# where upper(trim(#COLUMN#)) like upper(trim(''%#SEARCHSTRING#%'')))';
4    maxRowsThreshold number := 100000; --don't want to hammer the system, initially, if I can help it
5  begin
6   for r0 in ( -- S=schema(s), D=data type(s), X=excluded table(s), F=search string(s) case insensitive
7               with f as (select 'S=MIKETEST;D=VARCHAR2;D=CHAR;D=NVARCHAR2;X=TEST4;X=TEST3;F=Onetangi;F=München' fields from dual),
8                    p as (select substr(regexp_substr(f.fields, '[^;]+', 1, level), 1, 1) field_type,
9                                 substr(regexp_substr(f.fields, '[^;]+', 1, level), 3) field_name
10                            from f
11                              connect by regexp_substr(f.fields, '[^;]+', 1, level) is not null
12                          )
13               select atc.owner || '.' || atc.table_name table_name, atc.column_name, p.field_name searchString, atc.owner
14                  from all_tab_columns atc,
15                       all_tables ats,
16                       p
17                   where atc.data_type in (select field_name from p where field_type = 'D')
18                     and atc.owner in (select field_name from p where field_type = 'S')
19                     and not atc.table_name in (select field_name from p where field_type = 'X')
20                     and p.field_type = 'F'
21                     and ats.table_name = atc.table_name
22                     and ats.owner = atc.owner
23                     and nvl(ats.num_rows, maxRowsThreshold) <= maxRowsThreshold
24            ) loop
25               execute immediate replace(
26                                          replace(
27                                                    replace(stmt, '#TABLE#', r0.table_name),
28                                                                                             '#COLUMN#', r0.column_name
29                                                  ),
30                                                   '#SEARCHSTRING#', r0.searchString
31                                        )
32                   into found;
33                if found != 0 then
34                    dbms_output.put_line('Hit, qualified table column: ' || r0.table_name || '.' || r0.column_name || ' searchString: ' || r0.searchString);
35                end if;
36              end loop;
37  end;
38  /

Hit, qualified table column: MIKETEST.TEST1.B searchString: München
Hit, qualified table column: MIKETEST.TEST2.C searchString: Onetangi
Hit, qualified table column: MIKETEST.TEST1.A searchString: Onetangi
PL/SQL procedure successfully completed.
SQL>

I need to point out that is far from code that you would implement on a production box. It is code to execute on a development database, where you are looking for some piece of data somewhere, and you do not care whether query execution takes 1 second or 1 day, you just want to know where the piece of data is coming from for the purposes of debugging, triaging, reverse engineering, …. whatever, some system, and are confident that DBA won’t come down heavy on you for exercising the database hosting environment.

I do not know what else to write other than to address the big question of course – where is Onetangi? I am not going to tell you as that would destroy the pleasure of you finding out on your own. As a clue, it is nowhere near Munich. Another clue, it was my second home as a child, a very enjoyable one too, and I think Nemo would also enjoy living there.

The code in this blog article was developed against the following Oracle version.

SQL>
SQL>
SQL> select *
2    from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for 64-bit Windows: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production
SQL>
SQL>
SQL>

— Published by Mike, 07:53:02 05 January 2018 (CET)

Leave a Reply