PostgreSQL Recursive Common Table Expression WITH Currywürste - My First Development Experience with PostgreSQL
My mother tongue is Oracle, not PostgreSQL. PostgreSQL however has exceptionally good press for being a snappy comprehensive quality product, and with great documentation to boot. Acknowledging I know little about PostgreSQL, I also recently accepted a new role where the future is likely to involve a transition of core systems from Oracle to PostgreSQL. Some ‘homework’ is overdue.
It would be a somewhat deflating experience to explore PostgreSQL using SQL 101 DML. After all, given PostgreSQL’s reputation, I think it’s a safe bet to take for granted that it can query and update relationally structured data 😉 I need to explore the database functionality/feature-set with something more exotic – an Analytic Function (termed a Window Function in PostgreSQL) or, perhaps, a Common Table Expression (CTE). As it turns out, I recently invested some grey matter in learning about CTE’s posting a blog article titled “Oracle Recursive With With Currywürste” (upgrading my knowledge from the peculiar START WITH CONNECT BY PRIOR Oracle proprietary syntax).
Here is a verbatim copy of the SQL I wrote for Oracle in the previous blog post.
SQL> WITH dd AS (SELECT DATE '-4712-1-1' minValidFrom, DATE '9999-12-31' maxValidTo FROM DUAL),
2 x (widgetName, childID, parentID, validFrom, validTo) AS
4 SELECT t0.widgetName,
5 t0.id childID,
6 t0.parentID parentID,
7 COALESCE(t0.validFrom, d0.minValidFrom),
8 COALESCE(t0.validTo, d0.maxValidTo)
9 FROM bom t0
10 CROSS JOIN dd d0
11 WHERE t0.widgetName = 'Product A'
12 UNION ALL
13 SELECT t1.widgetName,
14 t1.id childID,
15 t1.parentID parentID,
16 COALESCE(t1.validFrom, d1.minValidFrom),
17 COALESCE(t1.validTo, d1.maxValidTo)
18 FROM bom t1
19 CROSS JOIN dd d1
20 INNER JOIN x rw ON rw.childID = t1.parentID
21 WHERE COALESCE(t1.validFrom, d1.minValidFrom) <= rw.validFrom
22 OR rw.validFrom = d1.minValidFrom
24 SELECT MAX(t2.validFrom) validFrom,
25 MIN(t2.validTo) validTo
26 FROM x t2;
Can you imagine my surprise to spend only a few moments tweaking the Oracle specific code (note the removal of the DUAL table, the addition of the keyword RECURSIVE, and a change to the min/max date limitations, to then find no further changes were required. The SQL then executed as expected. I show the PostgreSQL specific query below.
WITH RECURSIVE dd AS (SELECT DATE 'January 1, 4713 BC' minValidFrom, DATE 'December 31, 99999 AD' maxValidTo),
x (widgetName, childID, parentID, validFrom, validTo) AS
FROM bom t0
CROSS JOIN dd d0
WHERE t0.widgetName = 'Product A'
FROM bom t1
CROSS JOIN dd d1
INNER JOIN x rw ON rw.childID = t1.parentID
WHERE COALESCE(t1.validFrom, d1.minValidFrom) <= rw.validFrom
OR rw.validFrom = d1.minValidFrom
SELECT MAX(t2.validFrom) validFrom,
FROM x t2;
PostgreSQL is the same, but different, and confirms that for this very specific example and likely many others, that if you use ANSI SQL then vendor lock-in can be diminished. Of course PL/SQL is not PL/pgSQL (and here the differences are more profound), but … databases are a solved technology and there is nothing so special about Oracle nowadays anyway, after all if Oracle were the bees knees PostgreSQL, SQL Server, and all the rest, and this is excluding all the NoSQL DB’s, wouldn’t have so proudly eaten into Oracle database marketshare. For PostgreSQL and big-data – the poster boy has to be Netezza SQL, used to query IBM Netezza Appliances. If you squint, Netezza SQL is a fork of PostgreSQL. If you are after a reference from a prestigious organisation that has confidence in PostgreSQL, in this industry they don’t come much more prestigious than Big Blue!
For me, my PostgreSQL journey will start by learning the syntax and idiosyncrasies/annoyances of PostgreSQL (just like the idiosyncrasies and annoyances/junk DNA in Oracle – I’ve published a few prior observations on this blog). I appreciate too that migrating this single piece of code from Oracle SQL to PostgreSQL was painless as I had adhered to ANSI standards in my original code (not using the Oracle START WITH CONNECT BY PRIOR syntax), and it would be naive to forecast that other code migration exercises will be so straightforward. Nevertheless the exercise has been a very positive first experience. There will be more updates to come from me, of that I am sure, on both PostgreSQL syntax and, once I become more confident with the technology, comments on the query costing selectivities embodied within the optimiser. I do hope however that future is as positive as my first (exploratory) development experience with PostgreSQL.
In a thread on Oracle-l two weeks ago, an attempt to ‘diss out’ an online hiding against the OP was performed as he had the temerity to post something PostgreSQL related to an Oracle forum. Ignoring the tone of some of the exchanges, several differences between the two technologies were highlighted (and refuted). Again ignoring the tone of some of the exchanges, the conversation thread raised some interesting points. I have just reviewed the tone of a few PostgreSQL newsgroups/mailing lists. It appears that the PostgreSQL community feels a lot more collaborative and, dare I write it, friendly, even when Oracle or other vendor DBMS implementations are raised as the topic of discussion. To me this is one of those intangible pull-factors that makes me want to move away from Oracle and toward PostgreSQL. It feels justified to conclude this blog post on this point.
The PostgreSQL code above was developed using: PostgreSQL 9.6beta2 on x86_64-apple-darwin, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00), 64-bit
— Published by Mike, 11:01:04 01 August 2017 (CET)