Graphs + SQL

PGQL is a graph pattern matching query language for the property graph data model, inspired by Cypher, SQL, and G-CORE. PGQL combines Cypher-like ASCII art syntax with familiar constructs from SQL, such as SELECT, FROM and WHERE. PGQL also provides powerful constructs for matching regular path expressions (e.g. PATH).

An example PGQL query is as follows:

SELECT p2.name AS friend_of_friend
  FROM facebook_graph                             /* In the Facebook graph..   */
 MATCH (p1:Person) -/:friend_of{2}/-> (p2:Person) /* ..match two-hop friends.. */
 WHERE p1.name = 'Mark'                           /* ..of Mark.                */

See the PGQL 1.1 Specification for a detailed description of the language.
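
To make the meaning of the {2} quantifier concrete, here is a minimal Java sketch of what "two-hop friends of Mark" computes. It is only an illustration of the semantics, not how a PGQL engine evaluates queries; the class TwoHopFriends, the FRIEND_OF map, and all names in it are hypothetical sample data.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration of -/:friend_of{2}/-> on an in-memory graph (not a PGQL engine).
class TwoHopFriends {
    // Hypothetical friend_of edges, keyed by the name of the source person.
    static final Map<String, List<String>> FRIEND_OF = Map.of(
        "Mark", List.of("Ann", "Bob"),
        "Ann", List.of("Carol"),
        "Bob", List.of("Carol", "Dave"),
        "Carol", List.of("Mark"));

    // Names reachable from `start` by following exactly two friend_of edges.
    // A set is used so that each end point is reported once, however many
    // two-edge paths lead to it.
    static Set<String> twoHop(String start) {
        Set<String> result = new TreeSet<>();
        for (String mid : FRIEND_OF.getOrDefault(start, List.of())) {
            result.addAll(FRIEND_OF.getOrDefault(mid, List.of()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(twoHop("Mark")); // prints [Carol, Dave]
    }
}
```

Carol is reachable through both Ann and Bob but appears once, since the sketch collects end points rather than individual paths.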

Graph Pattern Matching

PGQL uses ASCII art syntax for matching vertices, edges, and paths:

  • (n:Person) matches a vertex (node) n with label Person
  • -[e:friend_of]-> matches an edge e with label friend_of
  • -/:friend_of+/-> matches a path consisting of one or more (+) edges, each with label friend_of

SQL Capabilities

PGQL has the following SQL-like capabilities:

  • DISTINCT to remove duplicates
  • GROUP BY to create groups of solutions, and HAVING to filter out groups of solutions
  • COUNT, MIN, MAX, AVG and SUM to aggregate over groups of solutions
  • ORDER BY to sort results
  • (NOT) EXISTS subqueries to test whether a graph pattern exists or does not exist

Regular Path Expressions

PGQL has regular path expressions (e.g. *, +, ?, {1,4}) for expressing complex traversals for all sorts of reachability analyses:

    PATH connects_to AS (:Device|Switch) <- (:Connection) -> (d:Device|Switch) /* Devices and switches are connected by two edges. */
                  WHERE d.status IS NULL OR d.status = 'OPEN'                  /* Only consider switches with OPEN status. */
  SELECT d1.name AS source, d2.name AS destination
    FROM electric_network
   MATCH (d1:Device) -/:connects_to+/-> (d2:Device)                            /* We match the connects_to pattern one or more (+) times. */
   WHERE d1.name = 'DS'
| source | destination | /* The result of the above query is a table with columns, like in SQL. */
| DN     | D0          | /* First result row. */
| DN     | D5          |
| DN     | D6          |
| DN     | D7          |
| DN     | D8          |
| DN     | D9          | /* Last result row. */
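
The one-or-more (+) quantifier combined with a filtered path pattern amounts to a reachability search. The following Java sketch shows that idea with a breadth-first traversal, under stated assumptions: the CONNECTS_TO and STATUS maps are hypothetical sample data, each CONNECTS_TO entry stands for one application of the connects_to pattern, and the sketch does not restrict end points by label. It illustrates the semantics only and says nothing about how a PGQL implementation evaluates such queries.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration of -/:connects_to+/-> as a breadth-first reachability search.
class ConnectsToReachability {
    // Hypothetical one-step relation: source and target share a Connection vertex.
    static final Map<String, List<String>> CONNECTS_TO = Map.of(
        "A", List.of("S1", "S2"),
        "S1", List.of("B"),
        "S2", List.of("C"),
        "B", List.of("A"));

    // Hypothetical status values; vertices without an entry have no status (NULL).
    static final Map<String, String> STATUS = Map.of("S1", "OPEN", "S2", "CLOSED");

    // Mirrors the filter d.status IS NULL OR d.status = 'OPEN'.
    static boolean passable(String vertex) {
        String status = STATUS.get(vertex);
        return status == null || status.equals("OPEN");
    }

    // Vertices reachable from `start` via one or more passable connects_to steps.
    static Set<String> reachable(String start) {
        Set<String> seen = new TreeSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String next : CONNECTS_TO.getOrDefault(current, List.of())) {
                // seen.add returns false for already visited vertices, so cycles terminate.
                if (passable(next) && seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(reachable("A"));
    }
}
```

In this sample data, C is never reached from A because the only route passes through the CLOSED switch S2, while A itself is reached again through the cycle A, S1, B, A.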

Temporal Data Types

In addition to numbers, (character) strings, and booleans, PGQL has the following temporal data types:

  • DATE (java.time.LocalDate)
  • TIME (java.time.LocalTime)
  • TIMESTAMP (java.time.LocalDateTime)
  • TIME WITH TIME ZONE (java.time.OffsetTime)
  • TIMESTAMP WITH TIME ZONE (java.time.OffsetDateTime)

PGQL’s Java result set API is based on the Java 8 Date and Time Library (java.time.*), offering improved safety and functionality for Java developers.
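
As a rough illustration of what working with these java.time classes looks like, the sketch below creates one value of each Java type listed above and performs two typical operations. It uses only the standard java.time API; the class TemporalTypesDemo and the literal values are hypothetical, and no PGQL result set calls are shown.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.OffsetDateTime;
import java.time.OffsetTime;

// One hypothetical value per temporal type; in practice such values
// would be read from a query result rather than parsed from literals.
class TemporalTypesDemo {
    static final LocalDate DATE = LocalDate.parse("2017-09-21");                      // DATE
    static final LocalTime TIME = LocalTime.parse("16:30:00");                        // TIME
    static final LocalDateTime TIMESTAMP = LocalDateTime.parse("2017-09-21T16:30:00"); // TIMESTAMP
    static final OffsetTime TIME_TZ = OffsetTime.parse("16:30:00+02:00");             // TIME WITH TIME ZONE
    static final OffsetDateTime TIMESTAMP_TZ =
        OffsetDateTime.parse("2017-09-21T16:30:00+02:00");                            // TIMESTAMP WITH TIME ZONE

    // java.time values are immutable: plusDays returns a new LocalDate.
    static LocalDate oneWeekLater(LocalDate date) {
        return date.plusDays(7);
    }

    // isEqual compares the instants on the time line, ignoring how the offset is written.
    static boolean sameInstant(OffsetDateTime a, OffsetDateTime b) {
        return a.isEqual(b);
    }

    public static void main(String[] args) {
        System.out.println(oneWeekLater(DATE));
        System.out.println(sameInstant(TIMESTAMP_TZ, OffsetDateTime.parse("2017-09-21T14:30:00Z")));
    }
}
```

The distinction between the zoned and non-zoned types is carried by the Java type itself, so mixing up a TIMESTAMP with a TIMESTAMP WITH TIME ZONE becomes a compile-time error rather than a runtime surprise.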

Querying Multiple Graphs

Through subqueries, PGQL allows for comparing data from different graphs.

For example, the following query finds people who are on Facebook but not on Twitter:

SELECT p1.name
  FROM facebook_graph
 MATCH (p1:Person)                           /* Match persons in the Facebook graph.. */
 WHERE NOT EXISTS (                          /* ..such that there does not exist..    */
                    SELECT p2
                      FROM twitter_graph
                     MATCH (p2:Person)       /* ..a person in the Twitter graph..     */
                     WHERE p1.name = p2.name /* ..with the same name.                 */
                  )
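
Conceptually, a NOT EXISTS subquery over a second graph behaves like an anti-join: keep each person from the first graph for whom no person with the same name is found in the second. A minimal Java sketch of that idea follows, assuming persons are represented only by their names; the class FacebookNotTwitter and the sample names are hypothetical, and the sketch is not how a PGQL engine executes the query.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Toy illustration of NOT EXISTS across two graphs as an anti-join on names.
class FacebookNotTwitter {
    // Keeps each Facebook name for which no equal name exists on Twitter.
    static List<String> onlyOnFacebook(List<String> facebookNames, Set<String> twitterNames) {
        return facebookNames.stream()
            .filter(name -> !twitterNames.contains(name))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> facebook = List.of("Ann", "Bob", "Carol"); // hypothetical sample data
        Set<String> twitter = Set.of("Bob");                    // hypothetical sample data
        System.out.println(onlyOnFacebook(facebook, twitter));  // prints [Ann, Carol]
    }
}
```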