Results

Capital Markets: Search anything on the network and find the relationships among the results

Find nodes and relationships across 4 million nodes and 22 million relationships at multiple depth levels

BUSINESS GOALS:

  • Business Entity Relationships: Enable users to traverse through business entities and their corresponding alphanumeric codes.
  • Marketplace Relationships: Enable users to traverse through competitors’ products, trade area competition, markets that business entities operate in, industries, and groups they relate to.
  • Asset Relationships: Enable users to traverse through business entities, assets, relationships, geo-spatial distance between these assets, and assets that compete with each other.
  • Product Relationships: Enable users to traverse through competitors, product synonyms, partners, suppliers, customers, and others.
  • Recommendations: Provide recommendations to users by connecting their portfolios and finding relevant events that they have not subscribed to but would find useful.
  • Lazy Entity Fetch: Enable users to get basic details of entities based on their relationships with other entities, and fetch more data from the underlying persistent storage only when needed.
  • Event Lens: Enable users to see graphically why a particular event surfaced and the lineage of that event.
  • Regenerate from Source: The nodes and relationships should be regenerable from the persistent store that holds the actual data.
  • Tag Search: Retrieve or search entities based on their associated tags.
  • Interoperability: Ability to swap between graph implementations such as Neo4j, OrientDB, or Titan without significant development effort.
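
The Lazy Entity Fetch goal can be illustrated with a minimal sketch. It assumes a graph node that holds only an identifier and a name, plus a loader callback that reads the full record from the backing store on first access. The class name `LazyEntity`, the `load_from_store` function, and the sample record are hypothetical and used only for illustration; they are not taken from the actual system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class LazyEntity:
    """Graph node that carries only basic details until more are requested."""
    entity_id: str
    name: str
    # Hypothetical loader that reads the full record from the backing store.
    loader: Callable[[str], Dict[str, object]]
    _details: Optional[Dict[str, object]] = field(default=None, repr=False)

    def details(self) -> Dict[str, object]:
        # Hit the persistent store on first access only, then reuse the result.
        if self._details is None:
            self._details = self.loader(self.entity_id)
        return self._details


# Made-up stand-in for a persistent store such as a MySQL table or MongoDB collection.
FAKE_STORE = {"E1": {"name": "Acme Corp", "industry": "Manufacturing", "employees": 1200}}
calls: List[str] = []  # records every store access so the laziness is observable


def load_from_store(entity_id: str) -> Dict[str, object]:
    calls.append(entity_id)
    return FAKE_STORE[entity_id]


entity = LazyEntity("E1", "Acme Corp", load_from_store)
print(entity.name)                    # basic detail, no store access
print(entity.details()["industry"])   # first call triggers one fetch
print(entity.details()["employees"])  # second call reuses the cached record
print(len(calls))                     # number of store accesses so far
```

The graph stays small because each node stores only what traversal needs, and the heavier record is read from the source only for entities a user actually opens.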

CHALLENGES:

  • The source data comes from MySQL, MongoDB, and internal and external API calls.
  • Design separate write-intensive and read-intensive components for better separation of concerns.
  • Normalize data across multiple sources and convert it into a unified node and relationship format.
  • Ability to switch the underlying GraphDB implementation among Neo4j, OrientDB, and Titan.
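
The normalization challenge can be sketched as follows. The example assumes one relational row shape and one document shape, both invented for illustration, and maps them into a single `Node` and `Relationship` format. The field names, the `MAKES` relationship type, and the sample values are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    node_id: str
    label: str
    name: str


@dataclass(frozen=True)
class Relationship:
    source_id: str
    target_id: str
    kind: str


def from_sql_row(row: Tuple[int, str]) -> Node:
    # Hypothetical relational row: (company_id, company_name).
    company_id, company_name = row
    return Node(f"company:{company_id}", "Company", company_name)


def from_document(doc: Dict[str, object]) -> Tuple[Node, List[Relationship]]:
    # Hypothetical document: {"_id": ..., "title": ..., "maker": <company id>}.
    node = Node(f"product:{doc['_id']}", "Product", str(doc["title"]))
    edges = [Relationship(f"company:{doc['maker']}", node.node_id, "MAKES")]
    return node, edges


def normalize(rows: List[Tuple[int, str]],
              docs: List[Dict[str, object]]) -> Tuple[List[Node], List[Relationship]]:
    """Merge records from both sources into one list of nodes and one of relationships."""
    nodes = [from_sql_row(row) for row in rows]
    edges: List[Relationship] = []
    for doc in docs:
        node, doc_edges = from_document(doc)
        nodes.append(node)
        edges.extend(doc_edges)
    return nodes, edges


nodes, edges = normalize(
    [(7, "Acme Corp")],
    [{"_id": "p1", "title": "Widget", "maker": 7}],
)
print(nodes)
print(edges)
```

Prefixing identifiers with the entity type (`company:`, `product:`) is one simple way to keep keys from different sources from colliding once they share a graph.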

THE SOLUTION:

  • Designed and architected a GraphDB framework based on TinkerPop Blueprints that made it possible to switch between underlying GraphDB implementations.
  • Created a plugin framework based on JPF to consume data from the respective sources and normalize it to a common graph format.
  • Implemented a REST API that uses the TinkerPop API and the Gremlin language to retrieve the relevant nodes and relationships for a search term.
  • The graph is ripped and replaced (rebuilt in full rather than updated in place) to keep it simple and avoid transactional issues such as locking and concurrent modification. This also satisfies the requirement to regenerate the graph from its sources at any time.
  • Created necessary test cases to perform data quality checks between the source and the graph data.
  • Ran multiple benchmarks and improved graph creation and retrieval performance through non-transactional mode, idempotent operations, GraphDB tuning, JVM tuning, creating vertices in memory and batch-importing them into the GraphDB, edge creation optimization, and other techniques.
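
The ideas behind this solution can be sketched in a few lines. The example below is not the TinkerPop API; it is a simplified stand-in, written in plain Python, that assumes a small backend contract (`GraphBackend`), one in-memory implementation, a depth-limited search, and a rebuild step that creates a fresh graph from source data instead of updating the old one. All names and sample edges are hypothetical.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str, str]  # (source, relationship type, target)


class GraphBackend(ABC):
    """Hypothetical minimal contract each graph database adapter would implement."""

    @abstractmethod
    def add_edge(self, source: str, kind: str, target: str) -> None: ...

    @abstractmethod
    def out_edges(self, node: str) -> List[Tuple[str, str]]: ...


class InMemoryBackend(GraphBackend):
    """Stand-in backend; a real adapter would wrap a specific graph database."""

    def __init__(self) -> None:
        self._adj: Dict[str, List[Tuple[str, str]]] = {}

    def add_edge(self, source: str, kind: str, target: str) -> None:
        self._adj.setdefault(source, []).append((kind, target))
        self._adj.setdefault(target, [])

    def out_edges(self, node: str) -> List[Tuple[str, str]]:
        return list(self._adj.get(node, []))


def search(graph: GraphBackend, start: str, max_depth: int) -> List[Edge]:
    """Breadth-first walk returning the edges reachable within max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    found: List[Edge] = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for kind, target in graph.out_edges(node):
            found.append((node, kind, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return found


def rebuild(source_edges: Iterable[Edge]) -> GraphBackend:
    """Rip and replace: build a fresh graph rather than mutating the existing one."""
    fresh = InMemoryBackend()
    for source, kind, target in source_edges:
        fresh.add_edge(source, kind, target)
    return fresh


# Made-up source data for illustration only.
SOURCE: List[Edge] = [
    ("Acme", "SUPPLIES", "Globex"),
    ("Globex", "COMPETES_WITH", "Initech"),
    ("Initech", "OPERATES_IN", "Retail"),
]
graph = rebuild(SOURCE)
one_hop = search(graph, "Acme", 1)
two_hops = search(graph, "Acme", 2)
print(one_hop)
print(two_hops)
```

Because `search` depends only on the `GraphBackend` contract, swapping the storage engine means writing a new adapter rather than changing query code, which is the property the framework described above is aiming for.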

THE SOLUTION DIAGRAM:

[Diagram: TinkerPop with Neo4j, OrientDB, and Titan]