Behavioral Test Driven Quality Assurance on Search Data

A comprehensive test suite covering unit, integration, stress, and data quality sync tests on search data with two million entities.


  • Perform data quality checks on two million entities, comparing the master data stored in MySQL with the indexed data stored in ElasticSearch.
  • Perform random checks with different keywords and compare the web page output, both the faceted search menu and the search results, against the API payload response.
  • Capture a screenshot of the web page, with faceted search on the left-hand side (LHS) and search results on the right-hand side (RHS), whenever the web page results and the API payload do not match.
  • Perform stress tests to ensure that search results are served within 2 seconds.
  • Generate a dashboard that gives internal curators deeper insight into the variety of entities stored in ElasticSearch.
  • Automate the majority of the tests to avoid manual errors and save QA testing time.
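
The data quality check in the first bullet can be pictured as a keyed comparison between the two stores. The following is a minimal sketch in Python, not the team's implementation: the function name, field names, and sample records are hypothetical, and the rows and documents are assumed to have already been fetched from MySQL and ElasticSearch into dictionaries keyed by entity ID.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def compare_entities(master: Dict[int, Record],
                     indexed: Dict[int, Record],
                     fields: List[str]) -> Dict[str, list]:
    """Compare master rows with indexed documents, keyed by entity ID.

    Hypothetical helper: the real check would page through both stores
    rather than hold two million entities in memory at once.
    """
    report: Dict[str, list] = {
        "missing_in_index": [],   # in master data, absent from the index
        "orphaned_in_index": [],  # in the index, absent from master data
        "field_mismatches": [],   # (id, field, master value, indexed value)
    }
    for entity_id, row in master.items():
        doc = indexed.get(entity_id)
        if doc is None:
            report["missing_in_index"].append(entity_id)
            continue
        for field in fields:
            if row.get(field) != doc.get(field):
                report["field_mismatches"].append(
                    (entity_id, field, row.get(field), doc.get(field)))
    report["orphaned_in_index"] = sorted(set(indexed) - set(master))
    return report


# Illustrative sample data only (not taken from the project).
master_rows = {
    1: {"name": "Acme", "category": "retail"},
    2: {"name": "Globex", "category": "energy"},
    3: {"name": "Initech", "category": "software"},
}
indexed_docs = {
    1: {"name": "Acme", "category": "retail"},
    2: {"name": "Globex", "category": "utilities"},
    4: {"name": "Umbrella", "category": "pharma"},
}

if __name__ == "__main__":
    print(compare_entities(master_rows, indexed_docs, ["name", "category"]))
```

Separating missing, orphaned, and mismatched entities keeps the report actionable: the first two point to sync failures, the third to transformation errors.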


  • Perform quality checks on two million entities stored in relational format in MySQL and in JSON format in ElasticSearch.
  • Automate quality checks across the faceted search menu on the LHS, the search results on the RHS, and the API payload response for more than 100 sample keywords.
  • Capture a screenshot only if the facet count in the API payload response does not match the number of elements in the faceted search results on the web page.
  • Choose an appropriate performance test framework that provides visual performance reports, both overall and per API, and that allows fine-tuning of the number of concurrent users, the wait between users, and the random distribution of requests across different APIs.
  • Trigger a mini smoke test every day, soon after the data sync between MySQL and ElasticSearch, to verify key functionality on more than 50 top keywords.
  • Trigger a comprehensive test at scale twice a week, covering unit, integration, and stress tests on more than 250 keywords.
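
The "screenshot only on mismatch" rule above amounts to a small decision step per keyword. A minimal sketch follows, under stated assumptions: the function names and the file naming scheme are hypothetical, the facet counts are assumed to have been extracted already from the API payload and from the rendered page, and the screenshot itself is delegated to a callback (in a browser-driven test this might wrap the driver's screenshot call).

```python
from typing import Callable, Dict, Tuple


def find_facet_mismatches(api_counts: Dict[str, int],
                          page_counts: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return {facet: (api_count, page_count)} for every facet that differs.

    A facet present on only one side is treated as a count of 0 on the other.
    """
    mismatches: Dict[str, Tuple[int, int]] = {}
    for facet in sorted(set(api_counts) | set(page_counts)):
        api_n = api_counts.get(facet, 0)
        page_n = page_counts.get(facet, 0)
        if api_n != page_n:
            mismatches[facet] = (api_n, page_n)
    return mismatches


def check_keyword(keyword: str,
                  api_counts: Dict[str, int],
                  page_counts: Dict[str, int],
                  capture: Callable[[str], None]) -> Dict[str, Tuple[int, int]]:
    """Compare facets for one keyword; call `capture` only when they differ."""
    mismatches = find_facet_mismatches(api_counts, page_counts)
    if mismatches:
        # Hypothetical naming scheme so reviewers can trace the keyword.
        capture(f"mismatch_{keyword}.png")
    return mismatches


if __name__ == "__main__":
    taken = []
    # Illustrative counts only.
    check_keyword("laptop", {"brand": 12, "color": 5}, {"brand": 12, "color": 5}, taken.append)
    check_keyword("phone", {"brand": 8, "color": 4}, {"brand": 8, "color": 3}, taken.append)
    print(taken)
```

Passing the capture step in as a callback keeps the comparison logic testable without a browser.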


  • The Treselle QA team implemented a testing framework with the tools and technologies best suited to this use case, and reused existing frameworks to cut development cost.
  • ElasticSearch's built-in integration test class, ESIntegTestCase, was extended to query the indexed data and compare it with the master data in MySQL through the JDBC driver.
  • Selenium was integrated with Cucumber for behavior-based acceptance tests and with PhantomJS to navigate web pages interactively in headless mode and compare web page elements with the API payload response.
  • Screenshots were captured using PhantomJS whenever the results did not match, and were tagged for further review by the QA team.
  • Gatling was adopted for performance testing, and the necessary Scala scripts were written to authenticate against the API.
  • Performance tests were triggered with between 50 and 200 concurrent users. XPath was used to inspect the XML report and alert the relevant DEV team when the minimum or maximum responses-per-second threshold was exceeded.
  • Kibana visualizations were used to design a dashboard of around 60 reports, including text reports as well as graphical reports such as bar, pie, scatter, geomap, tabular, area, heatmap, and others.
  • A publish-subscribe model was used to trigger the smoke test daily and the full-blown test twice a week.
  • The AWS SDK was used to launch multiple spot instances to run tests in parallel during the full-blown test and to shut the instances down once the test jobs were done.
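
The XPath-based threshold check on the performance report can be illustrated with a short sketch. The XML layout below is an assumption made purely for illustration (the source does not describe the actual report format), as are the element and attribute names, the threshold values, and the function name. Python's standard library `xml.etree.ElementTree` supports a limited XPath subset, which is enough here.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical report layout; the real report structure is not given in the source.
SAMPLE_REPORT = """
<report>
  <request name="search" responsesPerSecond="120.0"/>
  <request name="facets" responsesPerSecond="35.5"/>
  <request name="suggest" responsesPerSecond="410.0"/>
</report>
"""


def find_threshold_breaches(report_xml: str,
                            min_rps: float,
                            max_rps: float) -> List[Tuple[str, float]]:
    """Return (request name, responses per second) for every request
    whose rate falls outside the [min_rps, max_rps] range."""
    root = ET.fromstring(report_xml.strip())
    breaches: List[Tuple[str, float]] = []
    # ".//request" is an XPath expression selecting all <request> descendants.
    for node in root.findall(".//request"):
        rps = float(node.get("responsesPerSecond"))
        if rps < min_rps or rps > max_rps:
            breaches.append((node.get("name"), rps))
    return breaches


if __name__ == "__main__":
    # Illustrative thresholds only; an alerting step would consume this list.
    for name, rps in find_threshold_breaches(SAMPLE_REPORT, 50.0, 400.0):
        print(f"ALERT: {name} at {rps} responses/sec is outside the allowed range")
```

Returning the breaches as data, rather than alerting inside the parser, lets the same check feed an email, a chat notification, or a dashboard.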