Decipher Neo4J Cypher Query Language (CQL)

Introduction

Cypher is a declarative graph query language that allows for expressive and efficient querying and updating of the graph store. Cypher is a relatively simple but still very powerful language. Very complicated database queries can easily be expressed through Cypher. This allows us to focus on the domain instead of getting lost in database access.

Cypher is designed to be a humane query language, suitable for both developers and operations professionals. Being a declarative language, Cypher focuses on the clarity of expressing what to retrieve from a graph, not on how to retrieve it. For background, please refer to our previous blog post, “Embrace Relationships with Neo4J, R & Java”.

Use Case

Our use case is to run a series of queries against the StackOverflow dataset (graph.db) attached to this blog post. The use case requires the Neo4j web console.

What we need to do:

  • Retrieve all the languages that have at least one question
  • Retrieve all Java questions
  • Retrieve all Java questions with their corresponding answers
  • Retrieve all Java questions with the corresponding questioner
  • Retrieve all the users who answered the Java questions
  • Retrieve the maximum and minimum score for all questions in each language
  • Retrieve the questions that have more than 100 views in all languages
  • Retrieve the questions that have a score of less than 10 in all languages
  • Retrieve the questions that have the maximum score in all languages
  • Retrieve the questions that have the minimum number of views in all languages
  • Retrieve details related to the questions in all languages
  • Retrieve all users who asked questions in Java or Python

Solution

Follow the sections of the previous blog post, “Embrace Relationships with Neo4J, R & Java”, to download and install Neo4j and to start the Neo4j server web console before performing the use case.

The Cypher queries below were executed in the Neo4j server’s web console. Each one is listed with its execution time, its result size, and a snapshot, to show how the query behaves and how the different result styles look.

A snapshot of the overall StackOverflow dataset in the Neo4j graph:
Stack overflow dataset

Retrieve all the languages that have at least one question

  • Query:

    Execution Time: 137 ms
    Result: 9 nodes, 0 relationships

    Snapshot: at least one question
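
The query text itself is not reproduced above. As a rough sketch only, assuming a hypothetical schema in which questions carry a `Question` label and point to a `Language` node through a `TAGGED` relationship (none of these names are confirmed by the dataset), the request could be written as:

```cypher
// Hypothetical schema: (:Question)-[:TAGGED]->(:Language)
// A language matches only if at least one question is tagged with it.
MATCH (l:Language)<-[:TAGGED]-(:Question)
RETURN DISTINCT l
```

`DISTINCT` keeps each language to a single row even when many questions point to it. The figures reported above come from the original query, not from this sketch.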

Retrieve all Java questions

  • Query:

    Execution Time: 177 ms
    Result: 26 nodes, 25 relationships

    Snapshot:java questions

Retrieve all Java questions with their corresponding answers

  • Query:

    Execution Time: 254 ms
    Result: 36 nodes, 35 relationships

    Snapshot:Ques with corresponding ans
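
The query text is not reproduced above. A minimal sketch of a multi-hop pattern for this request, assuming hypothetical `Answer`, `Question` and `Language` labels, hypothetical `ANSWERS` and `TAGGED` relationship types, and a `name` property holding the value 'java', might look like this:

```cypher
// Hypothetical schema:
// (:Answer)-[:ANSWERS]->(:Question)-[:TAGGED]->(:Language {name: 'java'})
MATCH (a:Answer)-[:ANSWERS]->(q:Question)-[:TAGGED]->(:Language {name: 'java'})
RETURN q, a
```

With a plain `MATCH`, questions that have no answer are left out. Moving the answer part of the pattern into an `OPTIONAL MATCH` would keep them, with `a` returned as null.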

Retrieve all Java questions with the corresponding questioner

  • Query:

    Execution Time: 173 ms
    Result: 25 nodes, 28 relationships

    Snapshot: questions with corresponding questioner

Retrieve all users who answered the Java questions

  • Query:

    Execution Time: 345 ms
    Result: 35 nodes, 36 relationships

    Snapshot: answered java question

Retrieve the maximum and minimum score for all questions in each language

  • Query:

    Execution Time: 134 ms
    Result: 9 rows

    Snapshot:maximum minimum score
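
The query text is not reproduced above. As an illustration of how Cypher aggregates per group, here is a sketch that assumes hypothetical `Question` and `Language` labels, a hypothetical `TAGGED` relationship type, and hypothetical `score` and `name` properties:

```cypher
// Hypothetical schema: (:Question {score})-[:TAGGED]->(:Language {name})
// The non-aggregated column (l.name) acts as the grouping key,
// so max() and min() are computed once per language.
MATCH (q:Question)-[:TAGGED]->(l:Language)
RETURN l.name AS language,
       max(q.score) AS maxScore,
       min(q.score) AS minScore
ORDER BY language
```

Cypher has no explicit GROUP BY clause. Any returned expression that is not an aggregate becomes part of the grouping key, which is why this kind of query yields one row per language.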

Retrieve the questions that have more than 100 views in all languages

  • Query:

    Execution Time: 146 ms
    Result: 31 nodes, 25 relationships

    Snapshot:more than 100 views
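
The query text is not reproduced above. A property filter of this kind can be sketched as follows, assuming hypothetical `Question` and `Language` labels, a hypothetical `TAGGED` relationship type, and a numeric `views` property on each question:

```cypher
// Hypothetical schema: (:Question {views})-[:TAGGED]->(:Language)
// WHERE keeps only the questions whose view count exceeds 100.
MATCH (q:Question)-[:TAGGED]->(l:Language)
WHERE q.views > 100
RETURN q, l
```

The same shape covers other threshold filters: swapping the predicate for `q.score < 10` (again a hypothetical property name) would select low-scoring questions instead.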

Retrieve the questions that have a score of less than 10 in all languages

  • Query:

    Execution Time: 167 ms
    Result: 29 nodes, 25 relationships

    Snapshot:Score less than 10

Retrieve the questions that have the maximum score in all languages

  • Query:

    Execution Time: 140 ms
    Result: 2 nodes, 1 relationship
    Max Score: 91

    Snapshot:maximum score
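
The query text is not reproduced above. One common way to express "the rows that hold the maximum" is to compute the aggregate first and then match again. The sketch below assumes hypothetical `Question` and `Language` labels, a hypothetical `TAGGED` relationship type, and a `score` property:

```cypher
// Hypothetical schema: (:Question {score})-[:TAGGED]->(:Language)
// Step 1: compute the highest score over all questions.
MATCH (x:Question)
WITH max(x.score) AS topScore
// Step 2: find the question(s) holding that score, with their language.
MATCH (q:Question)-[:TAGGED]->(l:Language)
WHERE q.score = topScore
RETURN q, l
```

`WITH` passes only `topScore` into the second part of the query, so ties are all returned. Replacing `max` with `min` and `score` with a hypothetical `views` property gives the corresponding minimum-views query.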

Retrieve the questions that have the minimum number of views in all languages

  • Query:

    Execution Time: 134 ms
    Result: 6 nodes, 4 relationships
    Minimum View: 1

    Snapshot:minimum views

Retrieve details related to the questions in all languages

  • Query:

    Execution Time: 686 ms
    Result: 683 rows

    Snapshot:details related to ques

Retrieve all users who asked questions in Java or Python

  • Query:

    Execution Time: 345 ms
    Result: 219 nodes, 0 relationships

    Snapshot:question-related-to-python-java
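
The query text is not reproduced above. An "either/or" condition like this one can be sketched with the `IN` predicate, assuming hypothetical `User`, `Question` and `Language` labels, hypothetical `ASKED` and `TAGGED` relationship types, and language names stored as 'java' and 'python':

```cypher
// Hypothetical schema:
// (:User)-[:ASKED]->(:Question)-[:TAGGED]->(:Language {name})
MATCH (u:User)-[:ASKED]->(:Question)-[:TAGGED]->(l:Language)
WHERE l.name IN ['java', 'python']
RETURN DISTINCT u
```

`DISTINCT` prevents a user who asked several questions, or asked in both languages, from appearing more than once. Returning only `u` yields nodes without relationships.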

Conclusion

  • Cypher Query Language is a declarative language for querying a Neo4j database. It lets the same data be retrieved in several different ways while keeping queries expressive and legible.
  • Neo4j also provides a Java API and a REST API for running Cypher queries on the nodes and relationships of an existing graph database.

  • Michael Hunger

    Really interesting blog post and dataset.
    I’d love if you’d used Labels for your nodes, that makes it much easier to assign roles/types to them also easier to visualize and faster to look up a subset of nodes by label or label+property (indexed).

    Also the start syntax is no longer needed, match is enough and for better readability it helps to surround all nodes in a pattern with parentheses.

    Thanks again for the great blog post, I hope to see an update to labels soon.
    Hope to see you at GraphConnect in October in SFO

    Cheers, Michael

    • http://www.treselle.com/ Treselle Systems Blog

      Michael, Thanks for your appreciation and your suggestion on the use of Labels.

      I’m aware of the advantage of the Match option over the Start option, but have put it up on the blog in order to share info on both the operations. Will only be too glad to get the chance to meet you @ GraphConnect.
