Results

Quality Assurance of Retail Stores Data in Motion

Data quality checks on a million retail stores across different persistence storage systems with high accuracy

BUSINESS GOALS:

  • Perform quality checks on retail store data between source data in MongoDB and transformed Master Data in MySQL.
  • Perform quality checks between Master Data in MySQL and JSON cache files for accuracy.
  • Perform random checks on large retail stores plotted on Google maps and compare against the API payload response.
  • Capture screenshot of Google maps with markers at random to check the accuracy of retail store marker plots.
  • Generate dashboard with different reports such as total stores, open/close stores, history of stores over the past few months, and others between source data in MongoDB and master data in MySQL for internal curators to perform sanity checks.
  • Reduce QA manual process time and automate checks across different systems.
  • Provide confidence to the business users that data persisted across systems such as MongoDB, MySQL, and Cache file systems, and the data plotted on Google maps are in sync.

CHALLENGES:

  • Perform quality checks on half a million retail stores’ data, in different format and structure, across multiple data storage systems.
  • Data, persisted in MongoDB as JSON documents, was in different format compared to transformed Master Data in MySQL.
  • Data, persisted in JSON cache files, was in different structure compared to Master Data in MySQL.
  • Automation of quality checks between Google maps DIV markers and API payload response.
  • Capture screenshot of only those doubtful retail companies with different retail store counts based on marker DIV elements.
  • Avoid new implementations to provide unified dashboard between data persisted in MongoDB and MySQL in different formats.
  • Trigger the whole test flow whenever new data arrives at the source in MongoDB and transform it to Master Data in MySQL.

THE SOLUTION:

  • Treselle QA team implemented a polyglot test programming using different programming languages best suited for this use case and also made use of already existing tools and techniques to cut the development cost.
  • A custom Python program with PyMongo and MySQL client was used to compare the retail stores in JSON documents in MongoDB with relational data in MySQL. The test cases were flagged appropriately on mismatch.
  • DBUnit testing framework was used along with JSON.simple library to parse JSON cache files to objects and to compare with mock database loaded into XML. The test cases were flagged appropriately on mismatch.
  • Selenium was integrated with SoapUI and PhantomJS to perform interactive web page navigation in headless mode and to compare the Google maps DIV markers with API payload response. PhantomJS was used to capture the Google map page for any mismatch and to tag them appropriately for further review by QA team.
  • Metabase visualization tool was chosen to create dashboard that was in a question/answer format and to combine reports and charts from MongoDB and MySQL.
  • Publish-Subscribe model was used to start the test flow whenever new data arrives at the source in MongoDB and to move it to MySQL.
  • Cluster of test machines were launched programmatically to split the testing work as quickly as possible while testing the entire test flow as well as to shut down the machines after use.

THE SOLUTION DIAGRAM:

 

 

qa_store_quality