Business and Data Intelligence: Tables or networks?

7. 2. 2022

I dare say that the vast majority of business intelligence solutions in use today still run on relational databases. No wonder. I remember how amazed I was when I came to understand the simplicity and beauty of relational algebra and learned of the versatility and power of SQL. Relational databases are based on tables that are interconnected through relationships such as “this bank account belongs to this customer”. All views and outputs from relational databases come in the form of tables. Tables are the ideal world for users in the financial controlling department: every line contains a number; there is a sum at the bottom; check it; does it agree? It doesn’t agree; never mind; correct it; is it OK now? Perfect, we’re done… I know what I’m talking about. For me, tables are inside my comfort zone. However, I also understand that for marketing people, tables are more of a nightmare inside their fear zone.

As the volume and importance of the data all around us have grown, I have gradually moved into a new field called data intelligence. The object of interest here is no longer the business information expressed in data but the data itself. My reasoning was simple: what works in data warehouses (i.e., relational tables) will work in metadata warehouses. After all, in the field of data understanding (for example, when showing the dependencies between data flows, known as data lineage), it is all about the relationships between data sources and algorithms. So again, we are dealing with relationships and, therefore, relational databases are the right technology… But that reasoning leads down the wrong path. A thin invisible line has been crossed.

Data understanding is primarily about the complex network of mutual relationships. What interests us in the output is no longer a table but rather a map, a network, a directed graph. Yes, we can also perform all the required graph operations in a relational database environment, but specific operations such as graph visualization, conditional search for ancestors or descendants that is not derailed by cycles, relationship aggregation, similarity searching, and much more are performed significantly better in so-called graph databases. The basis of these databases is no longer a table and a relationship between tables but a node (vertex) and an edge (the connection between vertices).
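To make the idea of a conditional ancestor search a little more concrete, here is a minimal sketch in plain Python. It is not how any particular graph database does it; the lineage data, the asset names, and the `ancestors` function are all invented for illustration. The graph is assumed to be a dictionary that maps each data asset to the assets it reads from directly, and a set of visited nodes keeps the search from looping when the lineage contains a cycle.

```python
from collections import deque

# Hypothetical lineage: each key is a data asset, each value lists
# the assets it reads from directly (its immediate upstream sources).
LINEAGE = {
    "revenue_report": ["sales_mart"],
    "sales_mart": ["orders_clean", "customers_clean"],
    "orders_clean": ["orders_raw"],
    "customers_clean": ["customers_raw", "sales_mart"],  # deliberate cycle
    "orders_raw": [],
    "customers_raw": [],
}


def ancestors(graph, start, keep=lambda node: True):
    """Return upstream nodes reachable from `start` that satisfy `keep`.

    Breadth-first traversal; every node is visited at most once, so
    cycles in the graph cannot cause an endless loop. The condition
    filters the result only: the search still walks through nodes
    that do not satisfy it.
    """
    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        node = queue.popleft()
        for parent in graph.get(node, []):
            if parent in seen:
                continue  # already visited, possibly via a cycle
            seen.add(parent)
            queue.append(parent)
            if keep(parent):
                found.append(parent)
    return found


# All upstream sources of the report, then only the raw ones.
all_sources = ancestors(LINEAGE, "revenue_report")
raw_sources = ancestors(LINEAGE, "revenue_report",
                        keep=lambda name: name.endswith("_raw"))
print(all_sources)
print(raw_sources)
```

In a relational database the same question typically needs a recursive query over an edge table; in a graph model the traversal is the native operation, which is the point of the paragraph above.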

The difference between a relational and a graph database could be described as the difference between directions (itinerary) and a map. Directions are usually just a list of all the places where you have to turn along the way to eventually reach your destination. With maps, there is much more information, your route is plotted in the context of your surroundings, you can choose the scale, and if you get off track, you can usually still tell where you are.

The point of the story is that the future belongs to graph databases. I know that for many big data experts, this is, figuratively speaking, the discovery of the Americas a few centuries after Columbus. However, as happens on a journey of discovery, I’ve been there and back again. For I have come to realize that the vast majority of tasks, not only in data intelligence but also in business intelligence and management decision support, are essentially graph problems. So, for example, far more than a simple list of clients with a selected metric, we need to know the patterns of relationships and the connections between them and to visualize, quantify, and compare these patterns. In both data intelligence and business intelligence, we need not a list of numbers and a sum, but rather maps. Or, more precisely: first a map and then directions.

Now I’ll conclude with a digression into the world of physics. If you read Carlo Rovelli’s Seven Brief Lessons on Physics, you may notice, among other things, that loop quantum gravity offers a picture of our world that looks less like a collection of objects and more like a web of interacting relationships. So, we don’t live on a neatly squared sheet of graph paper but in a bubbling tangle of connections. To see the squares, we need to understand the connections.

Author: Petr Hájek

Information Management Advisor