Crime Analysis Using H2O Autoencoders – Part 2

Overview

This is the second part of a two-part series on Crime Analysis using H2O Autoencoders. In our previous blog, Crime Analysis Using H2O Autoencoders – Part 1, we discussed building the analytical pipeline and applying Deep Learning to predict the arrest status of crimes in Los Angeles (LA). An H2O Machine Learning model can be deployed as a JAR file using POJO or MOJO objects. H2O-generated POJO and MOJO models can be easily embedded in a Java environment using the autogenerated h2o-genmodel.jar file.

In this blog, let us discuss deploying the H2O Autoencoders model into a real-time production environment by converting it into a POJO object using H2O functions. As Autoencoders do not support MOJO models, the POJO model is used in this blog.

Dataset Description

The Los Angeles crime dataset for 2016-2017, with 224K records and 27 attributes, is used as the source file. For a detailed description, refer to our previous blog on Crime Analysis Using H2O Autoencoders – Part 1.

Sample Deployment Model

Use Case

Deploy the H2O Autoencoders model into the production environment.

Synopsis

  • Generate JAR File for H2O Autoencoder Model
  • Run model
  • Deploy model into production environment
  • Implement machine learning model (Java Spring)
    • Set up model execution project
    • Set up model deployment project
  • Perform overall production deployment

Generating JAR File for H2O Autoencoder Model

The Autoencoders model created from our previous analysis is as follows:

To generate the JAR file, perform the following:

  • Download the Autoencoders model using the h2o.download_pojo() function in the H2O package.
  • Execute the below syntax to create a Java file along with the JAR file:

  • Download the Java file along with the JAR file using a Java Decompiler as shown in the below diagram:

Note: If the downloaded dependency JAR file does not contain the logic to implement the autoencoder model, an UnsupportedOperationException is thrown, similar to the one shown in the below diagram:

The error can be viewed in the PredictCsv.java file as shown in the below diagram:

Similarly, you can view the handling of other prediction types such as BinomialModelPrediction, MultinomialModelPrediction, and so on.
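To illustrate why an older dependency JAR fails, here is a minimal, self-contained sketch of category-based dispatch. It is not the actual PredictCsv source: the `Category` enum, the `formatRow` method, the `supportsAutoencoder` flag, and the exception message are hypothetical names chosen for illustration. The idea is only that a JAR without an autoencoder branch reaches a point where it has nothing to do but throw UnsupportedOperationException.

```java
import java.util.Arrays;

/** Simplified illustration of category-based dispatch; not the real PredictCsv source. */
final class PredictionDispatchSketch {

    /** Hypothetical stand-in for the model categories a scoring JAR may know about. */
    enum Category { BINOMIAL, MULTINOMIAL, REGRESSION, AUTOENCODER }

    /**
     * Formats one scored row. When supportsAutoencoder is false, the method
     * behaves like a JAR that has no autoencoder branch.
     */
    static String formatRow(Category category, double[] values, boolean supportsAutoencoder) {
        switch (category) {
            case BINOMIAL:
            case MULTINOMIAL:
                // One probability per class.
                return "classProbabilities=" + Arrays.toString(values);
            case REGRESSION:
                return "value=" + values[0];
            case AUTOENCODER:
                if (supportsAutoencoder) {
                    return "reconstructed=" + Arrays.toString(values);
                }
                // An older JAR has no logic for this category.
                throw new UnsupportedOperationException("Unknown model category " + category);
            default:
                throw new UnsupportedOperationException("Unknown model category " + category);
        }
    }

    public static void main(String[] args) {
        double[] row = {0.12, 0.80};
        System.out.println(formatRow(Category.BINOMIAL, row, false));
        try {
            formatRow(Category.AUTOENCODER, row, false);
        } catch (UnsupportedOperationException e) {
            System.out.println("Older JAR: " + e.getMessage());
        }
        System.out.println(formatRow(Category.AUTOENCODER, row, true));
    }
}
```

Running `main` shows the same call failing and then succeeding once autoencoder support is present, which mirrors the effect of swapping in a newer JAR.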

To resolve this exception, perform the following:

  • View the new JAR file, downloaded from the external site, that contains the logic for Autoencoders, as shown in the below diagram:

Running Model

To run the model, you need the Java file generated from the POJO object, an input file, and the h2o-genmodel.jar file with its dependencies.

To run the model, perform the following:

  • Use test_input.csv as the input file and output.csv as the output file.
  • Run the model with all the dependencies using the below commands:

Note: The Autoencoders model returns reconstruction MSE values for all columns instead of a predicted class, so it cannot predict the arrest status of the crimes by itself.

  • To predict the values, download the already trained Supervised Classification model, built using the pre-trained autoencoder model, as a POJO object.
  • Create a separate folder named “pre-trained” for this process.
  • Add all the JAR files to this folder.
  • Copy the dependency JAR files and the inputs into this folder.
  • Compile and run the Java file using the below commands:

  • Obtain the output of our prediction model. The output looks similar to the one shown below:

From the above results, it is evident that our model works fine as a standalone Java file. Let us convert this model into a JAR file and move it into the production environment along with h2o-genmodel.jar and input files.
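For context on the note earlier in this section, the following sketch shows what a reconstruction MSE is: the mean of the squared differences between an input row and the row the autoencoder reconstructs. It assumes numeric, already-encoded inputs; the class and method names are hypothetical, and H2O's own calculation may differ in how it normalizes and encodes columns. The result is an error measure for the row, not a class label, which is why a separate supervised model is needed to predict the arrest status.

```java
/** Illustrative reconstruction-error calculation; not H2O's internal implementation. */
final class ReconstructionErrorSketch {

    /** Squared difference between input and reconstruction, one value per column. */
    static double[] squaredErrors(double[] original, double[] reconstructed) {
        if (original.length != reconstructed.length) {
            throw new IllegalArgumentException("Rows must have the same number of columns");
        }
        double[] errors = new double[original.length];
        for (int i = 0; i < original.length; i++) {
            double diff = original[i] - reconstructed[i];
            errors[i] = diff * diff;
        }
        return errors;
    }

    /** Mean of the per-column squared errors for one row. */
    static double mse(double[] original, double[] reconstructed) {
        double[] errors = squaredErrors(original, reconstructed);
        if (errors.length == 0) {
            throw new IllegalArgumentException("Row must have at least one column");
        }
        double sum = 0.0;
        for (double e : errors) {
            sum += e;
        }
        return sum / errors.length;
    }

    public static void main(String[] args) {
        double[] input = {2.0, 4.0};          // made-up encoded input row
        double[] reconstruction = {1.0, 2.0}; // made-up autoencoder output
        System.out.println("Reconstruction MSE: " + mse(input, reconstruction));
    }
}
```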

Deploying Model into Production Environment

To deploy the model into the production environment, perform the following:

  • Convert the model into the JAR file with all the class files using the below command:

  • Place the above setup on any server and run the JAR file using the below command:

Implementing Machine Learning Model (Java Spring)

To implement the POJO model in a Java environment using the Spring Framework, set up a simple Spring web service project and pass the input as a JSON payload through a POST call.

Setting Up Model Execution Project

To set up a model execution project, perform the following:

  • Parse an input CSV file and convert it into the required Java collection objects.
  • Convert the collection objects into a JSON string to pass it as a JSON payload in the POST call.
  • Create a function that turns the JSON string into a valid request for our API call and creates all the necessary connection objects.
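The three steps above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the class and method names, the column names in `main`, and the URL are all hypothetical. The CSV parser assumes a header row and no quoted commas, the JSON escaping covers only backslashes and double quotes, and the request is built with the JDK 11+ `java.net.http` client instead of whatever connection classes the original project uses.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: CSV text -> Java collections -> JSON string -> POST request. */
final class CrimePayloadSketch {

    /** Parses simple CSV text (header first, no quoted commas) into one map per record. */
    static List<Map<String, String>> parseCsv(String csv) {
        String[] lines = csv.strip().split("\\R");
        String[] header = lines[0].split(",", -1);
        List<Map<String, String>> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isBlank()) {
                continue;
            }
            String[] cells = lines[i].split(",", -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int c = 0; c < header.length; c++) {
                // Missing trailing cells become empty strings.
                row.put(header[c].trim(), c < cells.length ? cells[c].trim() : "");
            }
            rows.add(row);
        }
        return rows;
    }

    /** Escapes backslashes and double quotes only; control characters are not handled. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Serializes the rows as a JSON array of objects with string values. */
    static String toJson(List<Map<String, String>> rows) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < rows.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('{');
            int n = 0;
            for (Map.Entry<String, String> e : rows.get(i).entrySet()) {
                if (n++ > 0) {
                    sb.append(',');
                }
                sb.append('"').append(escape(e.getKey())).append("\":\"")
                  .append(escape(e.getValue())).append('"');
            }
            sb.append('}');
        }
        return sb.append(']').toString();
    }

    /** Wraps the JSON string in a POST request; the request is built but not sent. */
    static HttpRequest buildPost(String url, String json) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String csv = "area,crime_code\nCentral,510\n"; // made-up columns and values
        String json = toJson(parseCsv(csv));
        HttpRequest request = buildPost("http://localhost:8080/predict", json); // hypothetical URL
        System.out.println(request.method() + " " + request.uri());
        System.out.println(json);
    }
}
```

In a real project a JSON library would normally replace the hand-written serializer; it is written out here only to keep the example free of external dependencies.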

Project Setup

A few of the class files in the project setup are:

  • CrimeModelExecution.java – The core file of the project. Makes all the required function calls and converts the input file string into a valid JSON string.
  • CSVParser.java – Parses a CSV file and converts it into the required Java collections.
  • URLExecution.java – Contains functions that turn the JSON string into a valid request for our API call and create all the necessary connection objects.
  • StringUtil.java – Contains all the utility functions.

Setting Up Model Deployment Project

To set up the model deployment project, perform the following:

  • Convert the execution project into a JAR file with all its dependencies.
  • Initiate a server to run all the APIs containing the logic needed to apply prediction on the dataset.
  • Set up the project in a server environment and pass the required input files as parameters.

The project setup is as follows:

A few of the class files in the project setup are:

  • CrimeController.java – Contains all the APIs required to apply model prediction to the datasets. Input can be passed through a POST call either as a JSON payload or as a file.
  • UtilHelper.java – Performs basic string datatype conversions.

The project depends on classes present in the h2o-genmodel.jar file (PredictCsv.java), so add this JAR to the classpath during implementation.
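As a rough picture of what a prediction endpoint does, the sketch below accepts a JSON payload through a POST call and returns a JSON response. To stay self-contained it uses the JDK's built-in `com.sun.net.httpserver.HttpServer` in place of a Spring controller, so it is a stand-in for, not a copy of, CrimeController.java. The `/predict` path, the class name, and the `scorer` function are hypothetical; in the real project the scorer would call the POJO model through the classes in h2o-genmodel.jar.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/** Hypothetical stand-in for a Spring controller, built on the JDK HTTP server. */
final class PredictionEndpointSketch {

    /** Starts a local server exposing POST /predict; scorer maps request JSON to response JSON. */
    static HttpServer start(int port, Function<String, String> scorer) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/predict", exchange -> {
            try {
                if (!"POST".equals(exchange.getRequestMethod())) {
                    exchange.sendResponseHeaders(405, -1); // method not allowed, no body
                    return;
                }
                String body = new String(exchange.getRequestBody().readAllBytes(),
                        StandardCharsets.UTF_8);
                byte[] out = scorer.apply(body).getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, out.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(out);
                }
            } finally {
                exchange.close();
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder scorer: reports the payload size instead of a real prediction.
        HttpServer server = start(0, body -> "{\"receivedChars\":" + body.length() + "}");
        int port = server.getAddress().getPort();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + port + "/predict"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("[{\"area\":\"Central\"}]"))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
        server.stop(0);
    }
}
```

Passing port 0 lets the operating system pick a free port, which keeps the example from clashing with anything already running on the machine.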

Performing Overall Production Deployment

The overall production deployment involves analyzing the input, implementing a model using R scripts, downloading the model as the required Java objects, and implementing these objects in the production environment.

The flow of moving the Machine Learning models into the production environment is as follows:

To deploy the model, perform the following:

  • Upload all the code to a specified location.
  • Create separate batch files (in a Windows environment) for running the R script.
  • Build the project execution JAR.
  • Deploy the model in the production environment as shown in the below diagram:

Conclusion

In this blog, we discussed setting up a simple Spring web service project in a Java environment and deploying the Machine Learning model in a real-time production environment using the command prompt and the POJO model. In our use case, the setup was performed on Windows, but the same steps can be followed in any real-time server setup. The h2o-genmodel.jar file contains all the dependencies and default functionalities required to run the exported model in Java.

To learn about building the analytical pipeline and applying Deep Learning to predict the arrest status of the crimes happening in Los Angeles, refer to our previous blog on Crime Analysis Using H2O Autoencoders – Part 1.
