PostgreSQL is a hyperconverged DBMS offering functionalities that otherwise require specialized database technology. Developing a Graph Database is one such use case.
To explore Graph Databases in more detail, 2ndQuadrant held a live webinar, “Creating Graph Databases in PostgreSQL”, which was hosted by Simon Riggs (CEO at 2ndQuadrant).
In this webinar, Simon demonstrated how PostgreSQL can be used as a Graph Database using native functions.
Those who weren’t able to attend the live webinar can now view the recording here.
Due to limited time, some of the questions were not answered during the live webinar, so answers by the host are provided below:
Question: Infinite loop (no cycle) can be achieved by replacing the UNION ALL by UNION. The UNION performs a bit and stops when a cycle is reached. It may be interesting to benchmark as the sort may under-perform. Your thoughts?
Answer: Note that catalog tables also need to be vacuumed and analyzed. This is done automatically by autovacuum, but you could also monitor the bloat of the catalog tables by looking at `pg_stat_sys_tables`. Also, note that many of the tables we use during the talk are `views`, not physical tables.
Question: Can you suggest software that can graphically display the graph built by Postgres?
Answer: You can use any tool you want; sorry, no preferences from me.
Question: Hi Simon, apologies I missed the start of the webinar so I may have missed this. Any thoughts on using an edge table (e.g src/dst) per edge as opposed to edge array?
Answer: As I said, that is the “obvious” relational approach and exactly the design that is frequently criticized by graph database vendors. You can do that, but it can slow down queries and increase the number of writes to maintain the graph, nullifying the benefits of Postgres. There might be other reasons to adopt it…
Question: How simple would it be to implement some way of an audit/changelog? Effectively, keeping a history of relationships in the graph.
Answer: Fairly straightforward. I guess if you want historical queries on a graph that can be done also.
Question: I joined late so maybe I missed this, but how is the performance affected when you get more than a few levels deep? (as in example: https://stackoverflow.com/questions/52674380/improving-postgres-performance-on-graph-like-queries-of-multi-level-self-joins)
Answer: The example from Stack Overflow doesn’t use recursive queries. The key to query performance in that context is to avoid re-accessing nodes you already accessed. There is nothing inherent in relational systems that would make it geometrically slower, it’s just that SQL by default does not exclude multiple visits or duplicate/multiple rows in the worktables.
To stay updated on upcoming webinars by 2ndQuadrant, you can visit our Webinars page.
For any questions, comments, or feedback, please visit our website or send an email to webinar@2ndquadrant.com.