User Analytics using Metabase and MongoDB

User Analytics using Metabase and MongoDB

Introduction

Metabase, an open source, easy-to-use database visualization tool, is built and maintained by a dedicated Metabase team and comes with a Crate driver. It is written in Clojure and offers multiple options such as Mac application, Docker image, cloud images, and a jar file, which are specifically designed for particular use cases.

Metabase is mainly used for analyzing your existing data on a daily basis by quickly fetching answers to your common queries without dealing with complex workflows. It supports around 7 different type charts, which were used to plot charts using data from both NoSQL and MySQL databases using common/basic SQL Queries.

This blog content deals with data visualization using Metabase and MongoDB to analyze the users’ access of the application on a daily basis.

Use Cases

Let us consider a use case to analyze the user data using Metabase in connection with MongoDB.

What we need to do

  • Pre-requisites
  • Data Loading into MongoDB
  • Setup MongoDB with Metabase
  • User Account Creation and Access Restriction
  • Dashboard Creation with Different Query Layers

Solution

Pre-requisites

  • JDK 1.6 +
  • Install MongoDB and Metabase on same node (both to be in running state)

Install MongoDB and Metabase

Few links for installation references and reference documents are:

https://docs.mongodb.com/manual/tutorial
http://www.metabase.com/start/jar.html

Data Loading into MongoDB

  • Connect to MongoDB shell, and create a database using the below comment:

(Note: A new database will be automatically created if not already present)

  • Insert/Import the documents using the below command.
  • Create a new collection using the below command

 (Note: A new collection will be automatically created if not already present)

  • Import the sample json to your MongoDB using the below command:

 Sample Document and its Fields:

Field

Description

id

MongoDB HashID

Username

Specific name of the user

IP

IP address of the machine accessed by the user

Item

Type of product

Date

User accessed Date and Time (it should be in ISODate format not a string)

user_agent

Browser Name and Version, OS

Action

Provides the information about the page accessed by the user

Brand

Legal marketing name of the company

Download Sample document: user_info

Setup MongoDB with Metabase

To setup MongoDB, perform the following steps:

  • Run the downloaded Metabase jar using the below command:
  •  Create your own account by providing details such as First Name, Last Name, Email Address, Password, and Company or Team Name using the below link:

http://localhost:3000/setup/

setup

  • Provide your DB details such as DB Name, Host, Port, DB User Name, Password, and so on.

add_data

User Account Creation and Access Restriction

You can create multiple new users by choosing People tab, and assign a role to each of them to enable access restriction.

admin_page

You can add multiple databases by choosing Databases tab.

Data Preprocessing

To analyze your data, you need to preprocess your data in a questioner format with Metabase. Metabase will provide answers to your questions almost instantly. It will allow you to create custom filters and provide sorting options to refine or rearrange the output answers.

Dashboard Creation with Different Query Layers

You can create your own dashboard by combining your questions and answers and can even layer them on top of each other for better comparison.
You can save your frequently used questions in the Dashboard and can review them at later point of time.

dashboard

  • Click “New Question” tab
  • Connect to the db and table/collection.
  • Add where condition in the filter by column by choosing your field and value.

Here are few sample queries:

Total Record Count

  • Choose view type as “Count of rows” and execute the queries.

total_record_count

Total Page Count by Action 

  • Choose the field in group by session as “Action”.

action

Total Page Count by Brand

  • Choose the field in group by as “Brand”.

count_by_brand

User Access Count

  • Choose the field in group by as “user account”.
  • Filter the column/field by not empty.

user_access_count

Total Page Count by Item and Brand

  • Apply not empty filter for both column and add the same in group by column.

pagecount_by_item_brand

Women – Count of User Based on Brand and User Account

  • Apply filter for action column as “women” and group by “user account and brand”.

brand_and_user_account_count

Electronics – Count of User Based on Item

  • Choose action as “electronics” and group by column as “item and user account”.

count_of_user_based_on_item

User Account Based on Brand Sort by Access Count

  • Apply “brand and user account” in group by column, and choose “sort” based on “count” with descending.

sortby_access_count

Home furniture’s with past 7 days based on user account

  • Choose filter as past 7 days in “date time” field and action as “home & future” with group by as “user account”.

user_account

  • Save all the query by clicking “Save” button.
  • Add the graph to the dashboard.

The dashboard would look similar to the one as shown below:

user_analysis

Note: Metabase time filter will work only on MongoDB version greater than 3.2.

Limitations

  • Selection of tables/collection in Metabase is limited to two for each database connection.
  • JVM size need to be increased if table size is too large so as to perform Metabase process.

Conclusion

  • Connect with different type of databases in a comparatively simple manner without requiring high technical knowledge.
  • Metabase allows you to easily analyze billions of data in databases and analyze user access on a daily basis.
  • Store billions of users’ data in a more reliable and productive manner in MongoDB.
  • Handle huge volume of request such as 50,000 requests with MongoDB and Metabase.

Reference

4224 Views 1 Views Today
  • Arindam Biswas

    Add fluentd to this mix where it basically tails the access log and inputs it to mongo / mysql and you have a homegrown analytics engine!