Clarkston Consulting

What is a Graph Database?

Graph technology provides an alternative to storing data in a traditional relational database. Instead of storing data in tables with predetermined columns and multiple rows, a graph database defines data as nodes and relationships. A node represents a single entity, and a relationship describes how two nodes are associated. This prioritizes the relationship above all, allowing your data to be analyzed and interpreted from a different perspective.
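
To make the node-and-relationship model concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not how any particular graph database is implemented; the class names (`Node`, `Relationship`, `Graph`), the sample users, and their attributes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single entity, with free-form descriptive attributes."""
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed association between two nodes."""
    source: str
    target: str
    rel_type: str
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.nodes = {}          # node_id -> Node
        self.relationships = []  # list of Relationship

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_relationship(self, source, target, rel_type, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must already exist as nodes")
        self.relationships.append(
            Relationship(source, target, rel_type, properties)
        )

    def neighbors(self, node_id, rel_type=None):
        """Targets of outgoing relationships, optionally filtered by type."""
        return [
            r.target
            for r in self.relationships
            if r.source == node_id
            and (rel_type is None or r.rel_type == rel_type)
        ]


# Hypothetical sample data: nodes need not share the same attributes.
graph = Graph()
graph.add_node("alice", "User", city="Durham")
graph.add_node("bob", "User", joined=2019)
graph.add_node("carol", "User")
graph.add_relationship("alice", "bob", "FOLLOWS", since=2020)
graph.add_relationship("alice", "carol", "LIKES")

print(graph.neighbors("alice"))             # ['bob', 'carol']
print(graph.neighbors("alice", "FOLLOWS"))  # ['bob']
```

Note that no schema is declared up front: each node carries whatever attributes are available for it, and the query starts from the relationships rather than from a table.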

Graph Database vs. Relational Database

Social networks are a common example of a graph because the data mostly consists of how users interact with each other. In a relational database, each social platform would provide different information about its users and their interactions, resulting in several tables that serve the similar purpose of storing user and transactional data. When it’s time to join these tables to derive the relationships, the queries become extremely complex, weighing down performance and making the database difficult to maintain. If you use a graph database instead, users are the nodes, and the users’ interactions with each other (follows, likes, shares, etc.) determine the relationships between the nodes, as seen in the simplified figure below. As in a relational database, nodes and relationships can store many descriptive attributes, but because the focus of the data is on the relationship, it no longer matters whether user A and user B have the same profile data or relational data. This gives you the flexibility to combine similar data from multiple sources, no matter what or how much data exists, while maintaining the ability to extract and analyze data quickly.

[Figure: graph database]

A graph database is probably not going to replace your current data architecture. It is still important to be able to structure data for governance and reporting. However, if you utilize a graph database alongside traditional architecture, it can relieve your current database of heavily connected data or inconsistently structured data. The additional perspective on the data is also a powerful analysis tool that can enhance your insight generation.

How Does this Impact Analysis?

Graph databases still support the querying, filtering, and aggregating of data that traditional relational databases do. However, you have more opportunities to understand how relationships affect each node. Here are a few examples:

  • PageRank centrality determines the importance of a node based on the number and weight of all the relationships to that node. This can help you find the most popular users.
  • Shortest path calculations derive the distance between nodes within given constraints. It is how companies like LinkedIn can tell you whether someone is a 2nd or 3rd degree connection to you.
  • Betweenness centrality measures the number of times a node lies on the shortest path between other pairs of nodes. This allows you to understand the influence a particular node has on other nodes.
  • Community detection finds groups of nodes that are more connected with each other than they are with the rest of the graph. This can be helpful for interventions designed at a group level, instead of an individual level.
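
Two of the measures above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the `follows` data is hypothetical, the shortest-path function treats every relationship as an undirected connection, and the PageRank function is an unweighted power-iteration version (a production graph library would also account for relationship weights).

```python
from collections import deque

# Hypothetical "follows" graph: each key follows the users in its list.
follows = {
    "ann": ["bob", "cat"],
    "bob": ["cat"],
    "cat": ["ann"],
    "dan": ["cat"],
}


def degrees_of_separation(graph, start, goal):
    """Breadth-first search over undirected connections.

    Returns the number of hops between start and goal, or None if
    the two nodes are not connected.
    """
    undirected = {}
    for src, targets in graph.items():
        for dst in targets:
            undirected.setdefault(src, set()).add(dst)
            undirected.setdefault(dst, set()).add(src)
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in undirected.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def pagerank(graph, damping=0.85, iterations=50):
    """Unweighted PageRank by power iteration."""
    nodes = set(graph) | {d for ts in graph.values() for d in ts}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                # Split this node's rank among the nodes it points to.
                share = damping * rank[node] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # A node with no outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for dst in nodes:
                    new_rank[dst] += share
        rank = new_rank
    return rank


print(degrees_of_separation(follows, "dan", "bob"))  # 2
ranks = pagerank(follows)
print(max(ranks, key=ranks.get))  # cat
```

In this sample, "dan" reaches "bob" through "cat", so the two are 2nd-degree connections, and "cat" ranks highest because three of the four users follow that account.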

How Does this Relate to Your Business?

This concept can be applied to almost all facets of your business. You can have nodes that are customers and products, with relationships defined by sales transactions. This can help you understand the value and importance of both the products and the customers. Machines on a manufacturing line can be nodes, where the relationships represent the flow of the product through the line. Graph technology can allow you to easily identify bottlenecks and quality issues of both the products and the machines. You can also use a graph to assess the quality of your data and eliminate potential duplicate values. If a user’s data from Facebook has very similar connections to a user’s data from Twitter, you can assign a confidence that this person is the same on both platforms. These are just a few of the applications, but you can see how using graph technology and relying on the relationships brings additional business value.
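
The duplicate-detection idea can be sketched as a similarity score over each account's set of connections. The example below uses Jaccard similarity as one possible confidence measure; that choice, the 0.5 threshold, the account IDs, and the sample data are all assumptions for illustration. It also assumes the connections themselves have already been mapped to common identifiers across the two platforms.

```python
def jaccard(a, b):
    """Share of connections two accounts have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical data: account ID -> set of connected people.
facebook_connections = {
    "fb_jsmith": {"maria", "li", "omar", "priya"},
    "fb_kdoe": {"li", "sam"},
}
twitter_connections = {
    "tw_john_s": {"maria", "li", "omar", "zoe"},
}


def likely_matches(left, right, threshold=0.5):
    """Pairs of accounts whose connection overlap meets the threshold."""
    matches = []
    for left_id, left_conn in left.items():
        for right_id, right_conn in right.items():
            score = jaccard(left_conn, right_conn)
            if score >= threshold:
                matches.append((left_id, right_id, score))
    # Highest-confidence pairs first.
    return sorted(matches, key=lambda m: m[2], reverse=True)


for left_id, right_id, score in likely_matches(
    facebook_connections, twitter_connections
):
    print(left_id, right_id, round(score, 2))  # fb_jsmith tw_john_s 0.6
```

Here "fb_jsmith" and "tw_john_s" share three of five distinct connections, giving a score of 0.6, while "fb_kdoe" shares only one of five and falls below the threshold. In practice the score would be a signal for review rather than proof that two accounts belong to the same person.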

Tags: Data & Analytics, Data Integrity, Data Quality