Visual Search Engine for Financial Investment Equity
A powerful business visual search engine with infinite scrolling that always finds up-to-date data for a Financial Investment Entity
BUSINESS GOALS: Provide a powerful search engine over data sourced from structured stores, unstructured stores, and geo-spatial shape files, so that the Client’s customers can always find up-to-date data through free-text and faceted search across 11 million documents with sub-second response time.
- Extract necessary data from MySQL Master Data Management (MDM), MongoDB, Geo-Spatial shape files from public and private data partners, and FTP data.
- Multiple naming representations for the same entity, and error-prone geo-locations entered by data partners' field engineers.
- Custom document scoring and boosting requirements to show private data partners' results at the top of the result list.
- Complex pattern matching and filtering capabilities during data ingestion, so that incoming documents conform to internal document standards.
- Provide search results within seconds on 11 million documents.
- An intuitive frontend to display high-performance search facets and infinite-scrolling search results.
- Set up a 25-node Elasticsearch cluster with 80 shards and 50 replicas to accommodate 11 million documents with sufficient high availability and a failover mechanism.
- Implemented the following as part of the solution:
  - Dynamic Index Templating: Groovy-based dynamic template pattern to create field mappings.
  - Built-in & Custom Analyzers: Used the Keyword, Snowball, and Pattern analyzers along with 5 custom analyzers for geo-location, exact matches, etc.
  - Tokenizers: Bi-grams, Tri-grams, Edge n-grams, and Shingles.
  - Stemming: Snowball, Porter Stem, KStem, and Hunspell token filters.
  - Scoring Methods: TF-IDF, Okapi BM25, weight, field value, and decay.
  - Aggregations: Metrics aggregations such as advanced and approximate statistics; multi-bucket aggregations such as terms, range, and histogram; and nested combinations of multi-bucket and single-bucket aggregations.
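
The pieces listed above can be illustrated with a minimal sketch of the Elasticsearch request bodies involved, written here as plain Python dictionaries. This is not the project's actual configuration: every field name (`name`, `partner_type`, `popularity`, `location`, `sector`, `market_cap`, `partner_id`), every analyzer and filter name, and every numeric value is a hypothetical chosen for illustration. The syntax follows recent Elasticsearch releases, and older versions may use different type names.

```python
import json


def build_index_body():
    """Index settings and mappings: custom analyzers plus a dynamic template.

    All names and sizes are illustrative assumptions, not the real schema.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Edge n-grams support prefix (search-as-you-type) matching.
                    "prefix_edge": {"type": "edge_ngram", "min_gram": 2, "max_gram": 15},
                    # Shingles index adjacent word pairs and triples for phrase matching.
                    "phrase_shingle": {
                        "type": "shingle",
                        "min_shingle_size": 2,
                        "max_shingle_size": 3,
                    },
                    "english_snowball": {"type": "snowball", "language": "English"},
                },
                "analyzer": {
                    "entity_prefix": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "prefix_edge"],
                    },
                    "entity_phrase": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "english_snowball", "phrase_shingle"],
                    },
                },
            }
        },
        "mappings": {
            # Dynamic template: any new string field ending in "_code"
            # is mapped as an exact-match keyword instead of analyzed text.
            "dynamic_templates": [
                {
                    "codes_as_keywords": {
                        "match": "*_code",
                        "match_mapping_type": "string",
                        "mapping": {"type": "keyword"},
                    }
                }
            ],
            "properties": {
                "name": {"type": "text", "analyzer": "entity_phrase"},
                "location": {"type": "geo_point"},
            },
        },
    }


def build_search_body(text, lat, lon):
    """Free-text query with custom scoring and facet aggregations."""
    return {
        "size": 20,
        "query": {
            "function_score": {
                "query": {"match": {"name": text}},
                "functions": [
                    # Weight: lift documents from private data partners.
                    {"filter": {"term": {"partner_type": "private"}}, "weight": 5},
                    # Field value: scale by a numeric field on the document.
                    {"field_value_factor": {"field": "popularity", "modifier": "log1p", "missing": 0}},
                    # Decay: favor documents close to the requested point.
                    {"gauss": {"location": {"origin": {"lat": lat, "lon": lon}, "scale": "25km", "decay": 0.5}}},
                ],
                "score_mode": "sum",
                "boost_mode": "multiply",
            }
        },
        "aggs": {
            # Multi-bucket terms facet with a nested metrics aggregation.
            "by_sector": {
                "terms": {"field": "sector", "size": 10},
                "aggs": {"cap_stats": {"extended_stats": {"field": "market_cap"}}},
            },
            # Approximate distinct count.
            "distinct_partners": {"cardinality": {"field": "partner_id"}},
        },
    }
```

Both functions only build JSON-serializable dictionaries, so the result of `json.dumps(build_search_body("acme", 40.7, -74.0))` could be sent to a search endpoint with whichever HTTP or Elasticsearch client is already in use.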