Graphs & SQL

PGQL is a graph pattern matching query language for the property graph data model, inspired by Cypher, SQL, G-CORE, SPARQL and GSQL. PGQL combines graph pattern matching with familiar constructs from SQL, such as SELECT, FROM and WHERE. PGQL also provides powerful constructs for matching regular path expressions (e.g. PATH).

An example PGQL query is as follows:

SELECT p2.name AS friend_of_friend
  FROM facebook_graph                             /* In the Facebook graph..   */
 MATCH (p1:Person) -/:friend_of{2}/-> (p2:Person) /* ..match two-hop friends.. */
 WHERE p1.name = 'Mark'                           /* ..of Mark.                */

See the PGQL 1.1 Specification for a detailed description of the language.
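
To make the two-hop match in the example query concrete, here is a minimal Java sketch that computes the same kind of result over a tiny in-memory graph. It only illustrates what the pattern means and is not how a PGQL engine evaluates queries. The class TwoHopFriends, the FRIEND_OF map, and every name other than Mark are hypothetical, and the sketch assumes each friend of a friend is reported once.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class TwoHopFriends {
    // Hypothetical friend_of edges (source -> targets); not data from the document.
    static final Map<String, List<String>> FRIEND_OF = Map.of(
        "Mark", List.of("Ann", "Bob"),
        "Ann", List.of("Cleo"),
        "Bob", List.of("Cleo", "Dan"));

    // Collects every vertex reached from start by following exactly two friend_of edges.
    static SortedSet<String> friendsOfFriends(String start) {
        SortedSet<String> result = new TreeSet<>();
        for (String friend : FRIEND_OF.getOrDefault(start, List.of())) {
            result.addAll(FRIEND_OF.getOrDefault(friend, List.of()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(friendsOfFriends("Mark")); // prints [Cleo, Dan]
    }
}
```

Cleo is reachable through both Ann and Bob but appears once, because the sketch collects end points in a set rather than listing every path.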

Graph Pattern Matching

PGQL uses ASCII art syntax for matching vertices, edges, and paths:

  • (n:Person) matches a vertex (node) n with label Person
  • -[e:friend_of]-> matches an edge e with label friend_of
  • -/:friend_of+/-> matches a path consisting of one or more (+) edges, each with label friend_of

SQL Capabilities

PGQL has the following SQL-like capabilities:

  • DISTINCT to remove duplicates
  • GROUP BY to create groups of solutions, and HAVING to filter out groups of solutions
  • COUNT, MIN, MAX, AVG and SUM to aggregate over groups of solutions
  • ORDER BY to sort results
  • (NOT) EXISTS subqueries to test whether a graph pattern exists or does not exist
  • DATE, TIME, TIMESTAMP, TIME WITH TIME ZONE, and TIMESTAMP WITH TIME ZONE temporal data types

Regular Path Expressions

PGQL has regular path expressions with quantifiers (e.g. *, +, ?, {1,4}) for expressing complex traversals, such as those needed for reachability analysis:

    PATH connects_to AS (:Device|Switch) <- (:Connection) -> (d:Device|Switch) /* Devices and switches are connected by two edges. */
                  WHERE d.status IS NULL OR d.status = 'OPEN'                  /* Only consider switches with OPEN status. */
  SELECT d1.name AS source, d2.name AS destination
    FROM electric_network
   MATCH (d1:Device) -/:connects_to+/-> (d2:Device)                            /* We match the connects_to pattern one or more (+) times. */
   WHERE d1.name = 'DN'
ORDER BY d2.name
+--------+-------------+
| source | destination | /* The result of the above query is a table with columns, like in SQL. */
+--------+-------------+
| DN     | D0          | /* First result row. */
| DN     | D5          |
| DN     | D6          |
| DN     | D7          |
| DN     | D8          |
| DN     | D9          | /* Last result row. */
+--------+-------------+
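
The one-or-more (+) quantifier in the query above amounts to a reachability test. The Java sketch below shows that idea as a breadth-first search over an adjacency map. It is an illustration under stated assumptions, not PGQL's implementation: the class ReachabilitySketch, the vertex names A to D, and the edges are hypothetical, and the sketch ignores the label and status filters of the connects_to pattern.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class ReachabilitySketch {
    // Returns all vertices reachable from start via one or more edges.
    // The start vertex itself is included only if it lies on a cycle.
    static SortedSet<String> reachable(Map<String, List<String>> edges, String start) {
        SortedSet<String> seen = new TreeSet<>();
        // Seed the queue with the direct neighbours, so zero-length paths do not count.
        Deque<String> queue = new ArrayDeque<>(edges.getOrDefault(start, List.of()));
        while (!queue.isEmpty()) {
            String vertex = queue.poll();
            if (seen.add(vertex)) { // expand each vertex at most once
                queue.addAll(edges.getOrDefault(vertex, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical edges: A -> B, B -> C, B -> D, D -> B.
        Map<String, List<String>> edges = Map.of(
            "A", List.of("B"),
            "B", List.of("C", "D"),
            "D", List.of("B"));
        System.out.println(reachable(edges, "A")); // prints [B, C, D]
        System.out.println(reachable(edges, "B")); // prints [B, C, D]
    }
}
```

Starting from B, the result contains B itself because the cycle B -> D -> B is a path of at least one edge. Starting from A, the result omits A because no path leads back to it.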

Temporal Data Types

In addition to numbers, (character) strings, and booleans, PGQL has the following temporal data types:

  • DATE (java.time.LocalDate)
  • TIME (java.time.LocalTime)
  • TIMESTAMP (java.time.LocalDateTime)
  • TIME WITH TIME ZONE (java.time.OffsetTime)
  • TIMESTAMP WITH TIME ZONE (java.time.OffsetDateTime)

PGQL’s Java result set API (see ResultSet.java and ResultAccess.java) is based on the Java 8 Date and Time API (java.time.*), whose immutable, strongly typed classes offer better safety and functionality than the legacy java.util.Date and java.util.Calendar classes.
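
As a rough illustration of the Java side of this mapping, the sketch below creates one value of each java.time class listed above. It uses only the standard java.time API and deliberately does not call PGQL's result set API. The class TemporalTypesSketch, the helper asInstant, and all sample values are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.OffsetDateTime;
import java.time.OffsetTime;

class TemporalTypesSketch {
    // Converts an ISO-8601 timestamp with a zone offset to a point on the UTC timeline.
    static Instant asInstant(String timestampWithTimeZone) {
        return OffsetDateTime.parse(timestampWithTimeZone).toInstant();
    }

    public static void main(String[] args) {
        // Made-up sample values; each comment names the corresponding PGQL type.
        LocalDate date = LocalDate.parse("2017-09-21");                        // DATE
        LocalTime time = LocalTime.parse("16:15:00");                          // TIME
        LocalDateTime timestamp = LocalDateTime.parse("2017-09-21T16:15:00");  // TIMESTAMP
        OffsetTime timeTz = OffsetTime.parse("16:15:00+01:00");                // TIME WITH TIME ZONE
        OffsetDateTime timestampTz =
            OffsetDateTime.parse("2017-09-21T16:15:00-02:00");                 // TIMESTAMP WITH TIME ZONE

        // A timestamp without zone is exactly a date plus a time.
        System.out.println(timestamp.equals(LocalDateTime.of(date, time)));    // prints true
        // Zone-aware values carry their offset explicitly.
        System.out.println(timeTz.getOffset());                                // prints +01:00
        System.out.println(timestampTz.toInstant());                           // prints 2017-09-21T18:15:00Z
    }
}
```

Because the zone-aware and zone-less values are distinct Java types, mixing them up is a compile-time error rather than a silent runtime bug.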

Resources