April 26, 2022

Using TensorFlowJS & HarperDB for Machine Learning


Implementing a Dog Breed Classifier Using Stanford Dogs and MobileNet with HarperDB Custom Functions

Intro

HarperDB is an easy-to-use database that offers a simple way to create endpoints for interacting with data, called Custom Functions. Custom Functions can even run a machine learning algorithm that classifies incoming data. TensorFlowJS is a library released by Google for doing machine learning in JavaScript, either in the browser or on a NodeJS server, which is the approach this article takes.

Summary

What We’re Going To Do

This article will explain how to train and use a TensorFlowJS model to classify dog breeds with HarperDB Custom Functions, using the Stanford Dogs dataset and MobileNetV2 as a base for transfer learning.

Stanford Dogs

Stanford released an awesome dataset of about 20,000 images of dogs (20,580 images across 120 breeds). The images are grouped into folders, one per breed, with each folder named after its breed. Bounding-box annotations are also available, but today we’ll focus solely on classifying the breed.
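Because the folder names carry the labels, building a labeled list of training images comes down to walking the directory. The following is a minimal Node.js sketch, assuming the folders follow a naming pattern like `n02085620-Chihuahua` (an ID, a hyphen, then the breed). `breedFromFolder` and `listLabeledImages` are hypothetical helper names for illustration and are not taken from the repo.

```javascript
const fs = require('fs');
const path = require('path');

// Turn a folder name such as "n02085620-Chihuahua" into "Chihuahua".
// Folders without the assumed ID prefix keep their full name.
// Underscores are shown as spaces in both cases.
function breedFromFolder(folderName) {
  const hasIdPrefix = /^n\d+-/.test(folderName);
  const raw = hasIdPrefix
    ? folderName.slice(folderName.indexOf('-') + 1)
    : folderName;
  return raw.replace(/_/g, ' ');
}

// Walk one level of breed folders and collect { file, breed } for each JPEG.
function listLabeledImages(imagesDir) {
  const examples = [];
  for (const entry of fs.readdirSync(imagesDir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const breed = breedFromFolder(entry.name);
    const folder = path.join(imagesDir, entry.name);
    for (const file of fs.readdirSync(folder)) {
      if (/\.jpe?g$/i.test(file)) {
        examples.push({ file: path.join(folder, file), breed });
      }
    }
  }
  return examples;
}

console.log(breedFromFolder('n02085620-Chihuahua')); // prints "Chihuahua"
```

Keeping the label logic in one small function makes it easy to adjust if your own folders use a different naming scheme.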

MobileNet

There’s a SOTA (state of the art) model published by Google called MobileNet, a relatively small model that can classify images into 1,000 categories. It’s built small so it’ll run on mobile devices without taking up too many resources. We’ll be using version 2 of this model, which is available in the @tensorflow-models/mobilenet package.

Transfer Learning

Transfer learning is the technique of taking a pretrained model and adapting it to a new task instead of training a model from scratch. Like teaching an old dog new tricks! For that we’ll be using @tensorflow-models/knn-classifier.

We’ll send an image into MobileNet and take the logits, the raw output values the network produces just before its final classification step. Then we’ll pass those logits to a KNN classifier, which uses the K-Nearest Neighbors algorithm to associate them with specific dog breeds.
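To make the K-Nearest Neighbors idea concrete, here is a small plain-JavaScript sketch. It is not the @tensorflow-models/knn-classifier implementation: `TinyKnn` is a hypothetical class, the vectors are ordinary arrays standing in for MobileNet logits, and Euclidean distance is an assumed choice of similarity measure.

```javascript
// Squared Euclidean distance between two equal-length numeric vectors.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Minimal k-nearest-neighbors classifier over plain arrays (illustration only).
class TinyKnn {
  constructor() {
    this.examples = [];
  }

  // Store a feature vector (e.g. logits) together with its label.
  addExample(vector, label) {
    this.examples.push({ vector, label });
  }

  // Find the k stored examples closest to `vector` and let them vote.
  predict(vector, k = 3) {
    if (this.examples.length === 0) throw new Error('No examples added yet');
    const nearest = this.examples
      .map((ex) => ({ label: ex.label, dist: squaredDistance(ex.vector, vector) }))
      .sort((x, y) => x.dist - y.dist)
      .slice(0, k);

    // Count votes per label; a Map keeps labels in nearest-first order.
    const votes = new Map();
    for (const n of nearest) votes.set(n.label, (votes.get(n.label) || 0) + 1);

    // Most votes wins; ties go to the label whose neighbor was closest.
    let best = null;
    for (const [label, count] of votes) {
      if (best === null || count > votes.get(best)) best = label;
    }
    const confidences = Object.fromEntries(
      [...votes].map(([label, count]) => [label, count / nearest.length])
    );
    return { label: best, confidences };
  }
}

// Toy 2-D "logits" for two breeds.
const knn = new TinyKnn();
knn.addExample([0, 0], 'beagle');
knn.addExample([0, 1], 'beagle');
knn.addExample([5, 5], 'pug');
knn.addExample([6, 5], 'pug');
console.log(knn.predict([0.2, 0.1], 3).label); // prints "beagle"
```

In the real pipeline the vectors would be MobileNet outputs with many more dimensions, but the voting step works the same way: a new image gets the breed that most of its nearest stored examples share.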

Getting Started

If that all sounds complicated, *don’t worry*. This implementation will be quick and easy thanks to HarperDB Custom Functions.

Screenshot of HarperDB Studio — Classification Table w/ a Stanford Dog

Setup

Prereqs

1. A HarperDB Account

2. A HarperDB Local Database

Clone the Repo

Clone this repo into your Custom Functions folder:

git clone https://github.com/HarperDB/hdb-cf-dogml.git ~/hdb/src/custom_functions/dogml

Restart Custom Functions

Use the link in the HarperDB Studio Functions page (bottom left of the screen) to refresh the projects.

Screenshot of Custom Functions link
Screenshot of Server Restart button

Run /setup

The training data and TensorFlowJS modules need to be installed. This can be done via the `/setup` endpoint.

If you go to http://localhost:9926/dogml/setup it’ll start the setup. You can check on the progress in the logs — either in stdout from the locally running database or in the logs section of the Status page inside of the Studio.

The expected response when setup starts is {success: true, message: "ML Setup Started"}.

Setup stores all of the training materials in the $HOME/dogml directory, where $HOME is the home directory as seen by the database.

Be sure to wait for the ML Setup Complete note in the database logs.

Screenshot of HarperDB Logs

Activate

Run /train

To train the model, go to the `/train` endpoint at http://localhost:9926/dogml/train. You can follow the progress in the console logs (as with /setup) or in the logs table in the schema.

Verify Model

Once the logs indicate that the training is complete, you should be able to see the model appear in the models table in the schema.

Screenshot of HarperDB Studio — Models Table

Classify a Dog Breed!

Open the UI at http://localhost:9926/dogml/ui and try uploading an image of a dog (one of the images in the $HOME/dogml/training_data/Images directory will do).

The results should appear in the UI as well as in the classifications table.

Screenshot of HarperDB ML Dog Dashboard

Go Deeper

Add New Training Data

You can add more training data by adding new images to the $HOME/dogml/training_data/Images directory — either by putting the image in the correct folder or making a new folder (if it’s a breed without a folder already present). All images should be JPEGs.

Remove Training Data

You can also remove training data in the $HOME/dogml/training_data/Images directory to better target specific breeds.

Update the Model

If you modify the training data and use the /train endpoint to create a new model, be sure to then call the /update endpoint at http://localhost:9926/dogml/update to ensure the new model is loaded into the classifier.
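The retrain-then-reload sequence can also be scripted rather than opened in a browser. This is a rough sketch that assumes Node 18 or later (for the built-in `fetch`) and that the endpoints answer plain GET requests, as they do when visited in a browser. `endpointUrl` and `callStep` are hypothetical helper names, and the response is read as text because the exact response format of each endpoint is not assumed here.

```javascript
const BASE_URL = 'http://localhost:9926/dogml';

// Build the URL for one of the endpoints described in this article.
function endpointUrl(step, baseUrl = BASE_URL) {
  const allowed = ['setup', 'train', 'train_gpu', 'update'];
  if (!allowed.includes(step)) throw new Error(`Unknown step: ${step}`);
  return `${baseUrl}/${step}`;
}

// Call an endpoint and return the raw response body.
// `fetchImpl` is injectable so the helper can be tried without a running server.
async function callStep(step, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(endpointUrl(step));
  if (!res.ok) throw new Error(`${step} failed with HTTP ${res.status}`);
  return res.text();
}
```

A typical use would be `callStep('train').then(console.log)`, followed by `callStep('update')` once the logs show that training has finished. The script does not detect completion on its own, so the logs still need to be checked between the two calls.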

Train w/ GPU

To train the model faster (up to around 200% faster, depending on your hardware), use the /train_gpu endpoint at http://localhost:9926/dogml/train_gpu. This takes advantage of a CUDA-enabled NVIDIA GPU to run the training computations more quickly.

Be sure the necessary drivers and CUDA libraries are installed.

Here’s a guide to install CUDA on Ubuntu
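In a NodeJS project, GPU support for TensorFlowJS generally comes from the `@tensorflow/tfjs-node-gpu` package, with `@tensorflow/tfjs-node` as the CPU counterpart. The sketch below shows one way to prefer the GPU binding and fall back to the CPU one. It is an assumption about how such a switch could be written, not a description of how the /train_gpu endpoint is implemented, and `loadTensorFlow` is a hypothetical helper name.

```javascript
// Try the CUDA-backed binding first, then fall back to the CPU binding.
// `requireFn` is injectable so the selection logic can be exercised
// without either package installed.
function loadTensorFlow(requireFn = require) {
  const candidates = ['@tensorflow/tfjs-node-gpu', '@tensorflow/tfjs-node'];
  const errors = [];
  for (const name of candidates) {
    try {
      return { name, tf: requireFn(name) };
    } catch (err) {
      // Typical causes: package not installed, or CUDA libraries missing.
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`No TensorFlow binding could be loaded:\n${errors.join('\n')}`);
}
```

Calling `loadTensorFlow()` returns the name of the binding that loaded along with the module itself, which makes it easy to log whether a run is using the GPU.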

Review

There you have it: you’ve just trained a machine learning model on dog breed data and can now use it to classify images of dogs and determine the breed. To do this, we used HarperDB Custom Functions and TensorFlowJS to train a KNN classifier on top of MobileNet using the Stanford Dogs dataset.