One of our clients had legacy geospatial data in MySQL. This setup could not support their new functionality, as a lot of code had to be written to retrieve the geo data and convert it to the GeoJSON standard.
Treselle’s engineering team did a thorough study of various geospatial frameworks, including ESRI, Google Places, QGIS, PostGIS, Maptitude, and a few others. We chose MongoDB for its geospatial capabilities, as it supports the open GeoJSON specification and has good community support; it is also scalable, highly available, and cost-effective. Our client’s requirements did not warrant the complex geospatial functionality found in commercial packages, so MongoDB fit the bill. This choice enabled them to store richer data and index on LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, and Geometry types, which are very helpful for plotting proximities, trade areas, intersections, inclusions, and more.
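To illustrate, here is a minimal sketch of how such data can be stored and indexed with the MongoDB Python driver. The connection string, database, collection name, and coordinates are illustrative assumptions for the example, not the client’s actual setup.

```python
# A minimal sketch, assuming a local MongoDB instance and a hypothetical
# "retail_locations" collection; names and coordinates are illustrative only.
from pymongo import MongoClient, GEOSPHERE

client = MongoClient("mongodb://localhost:27017")
db = client["geo_demo"]
retails = db["retail_locations"]

# Store a retail outlet as a GeoJSON Point, per the GeoJSON specification.
retails.insert_one({
    "name": "Sample Store",
    "location": {"type": "Point", "coordinates": [-73.9857, 40.7484]}
})

# A 2dsphere index enables queries on any GeoJSON geometry type
# (Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon).
retails.create_index([("location", GEOSPHERE)])
```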
- Savings of approximately $25,000 annually by not using a commercial framework.
- Adherence to the open-source GeoJSON specification.
- Ability to scale to large geospatial workloads thanks to MongoDB’s scaling capabilities.
- No special skills or training needed, as the team was already well versed in MongoDB and JSON.
- Quick turnaround time to go to market.
- Feature-rich capabilities, including inclusion to get all retail locations in a state, intersection to get all neighboring states of a particular state, and proximity to get all retail locations within a certain mile radius (see the sketch after this list).
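The three query patterns above map onto MongoDB’s geospatial operators. The sketch below is a hedged illustration reusing the hypothetical retail_locations collection and 2dsphere index from the earlier snippet; the state polygon, the states collection, and the 10-mile radius are made-up examples.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["geo_demo"]
retails = db["retail_locations"]

# An illustrative (not real) state boundary as a closed GeoJSON Polygon ring.
state_polygon = {
    "type": "Polygon",
    "coordinates": [[[-74.3, 40.4], [-73.6, 40.4], [-73.6, 41.0],
                     [-74.3, 41.0], [-74.3, 40.4]]]
}

# Inclusion: all retail locations that fall inside the state boundary.
inside_state = retails.find(
    {"location": {"$geoWithin": {"$geometry": state_polygon}}}
)

# Intersection: all state boundaries that touch or overlap the given polygon
# (assumes a hypothetical "states" collection storing boundaries as Polygons).
neighbors = db["states"].find(
    {"boundary": {"$geoIntersects": {"$geometry": state_polygon}}}
)

# Proximity: all retail locations within a 10-mile radius of a point.
# $maxDistance is expressed in meters, so miles are converted.
nearby = retails.find({
    "location": {
        "$near": {
            "$geometry": {"type": "Point", "coordinates": [-73.9857, 40.7484]},
            "$maxDistance": 10 * 1609.34
        }
    }
})
```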
One of our clients managed stock and other financial data in MySQL and found it difficult to perform statistical calculations: the SQL queries were becoming complex, and many tables had to be created to duplicate the data across different dimensions, which consumed a lot of disk space. The client had been scaling the database vertically, adding more disk space and memory, to accommodate these datasets. The queries and stored procedures were becoming very slow and complex, and a lot of data had to be filtered out and discarded to keep things fast. The client finally gave up and wanted a better solution to the problem.
Treselle’s engineering team understood the problem and moved these datasets to Cassandra: its high write throughput means the same data can be written to different column families for different needs. We also modeled the keyspace with a variety of design patterns, including composite partition keys, clustering columns, counter columns, day/month/quarter/half-yearly/annual roll-ups, dynamic column families, expiring columns, and other techniques for properly partitioning the data across the cluster so that reads for different time intervals are fast.
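As a hedged illustration of a few of these patterns, the sketch below uses the DataStax Python driver; the stocks keyspace, table names, and columns are assumptions made for the example, not the client’s actual schema.

```python
# Illustrative Cassandra modeling patterns: composite partition key,
# clustering column, counter roll-ups, and expiring (TTL) rows.
from datetime import datetime

from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS stocks
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")

# Composite partition key (symbol, trade_date) keeps each day's ticks in one
# partition; the clustering column orders rows by time for fast range reads.
session.execute("""
    CREATE TABLE IF NOT EXISTS stocks.ticks_by_day (
        symbol text,
        trade_date text,
        trade_time timestamp,
        price double,
        volume bigint,
        PRIMARY KEY ((symbol, trade_date), trade_time)
    ) WITH CLUSTERING ORDER BY (trade_time DESC)
""")

# Counter columns maintain pre-computed roll-ups (daily, monthly, quarterly,
# and so on) at write time instead of recomputing them at read time.
session.execute("""
    CREATE TABLE IF NOT EXISTS stocks.rollup_counts (
        symbol text,
        period text,
        bucket text,
        trade_count counter,
        PRIMARY KEY ((symbol, period), bucket)
    )
""")

# Expiring columns: USING TTL drops short-lived intraday rows automatically.
session.execute(
    "INSERT INTO stocks.ticks_by_day (symbol, trade_date, trade_time, price, volume) "
    "VALUES (%s, %s, %s, %s, %s) USING TTL 86400",
    ("ACME", "2014-06-15", datetime(2014, 6, 15, 9, 30), 101.25, 5000)
)

# Each incoming tick bumps the matching roll-up bucket.
session.execute(
    "UPDATE stocks.rollup_counts SET trade_count = trade_count + 1 "
    "WHERE symbol = %s AND period = %s AND bucket = %s",
    ("ACME", "day", "2014-06-15")
)
```

Partitioning by symbol and date spreads writes across the cluster while keeping each day’s reads confined to a single partition, which is what makes reads over different time intervals fast.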
- Reduced the time to insert 2 million rows daily, with all pre-computations for materialized views, from several hours to 15 minutes.
- Improved data retrieval speed roughly 50x with simplified CQL queries compared to the previous complex SQL queries.
- Created the ability to grow rows as wide as needed as part of Cassandra’s wide-column storage.
- Reduced administrative cost by 75% with Cassandra’s automatic clustering.
- Reduced lines of code from 13,000 to 2,500, lowering the cost of code maintenance.
- Saved the client $880,000 in Oracle RAC license costs by choosing Oracle Streams instead.
- Offered the ability to operate in high-availability mode across all layers of the infrastructure, which increased revenue opportunities.
- Introduced the capability to add more database servers to the syncing process to distribute the load during peak season.
- Enabled the client to run read-only databases for complex reporting applications on real-time data without impacting the performance of the online databases.
- Saved thousands of dollars annually on Amazon EC2 instances by shutting down the multi-node cluster and moving to MySQL.
- Enabled the client to find talent to manage MySQL more easily and economically than for the Hadoop ecosystem.
- Enabled the client’s business analyst team to pull the needed data directly from the database instead of waiting for a Hadoop/Pig specialist to provide the reports.
- Reduced the dataset analysis processing time by 50% compared to Hadoop/Pig execution time.
- Physical and logical database design
- Infrastructure setup
- Database migration
- Database cluster implementation
- Database replication/mirroring setup
- Database administration & monitoring
- Relational database modeling
- NoSQL database modeling
- Database PL/SQL programming
- NoSQL query language
- Database consolidation
- Indexing, sharding, & partitioning