Mobile object detector with TensorFlow Lite

This article is a logical continuation of the previous article “OBJECT DETECTION WITH RASPBERRY PI AND PYTHON”. Today we will optimize an object detection model and improve its performance with TensorFlow Lite.

TensorFlow Lite is the official solution for running machine learning models on mobile and embedded devices. It enables on-device machine learning inference with low latency and a small binary size on Android, iOS, Raspberry Pi, and other platforms. TensorFlow Lite achieves this with techniques such as quantized kernels, which allow smaller and faster (fixed-point math) models.

We will optimize the SSDLite MobileNet v2 model for a proper comparison.

You can skip the next two parts by using the provided Docker image, which comes with TensorFlow 1.9 and the Object Detection API pre-installed.

Installing TensorFlow

If you don’t have TensorFlow installed on your host machine, install it now. You can use the official instructions, or build TensorFlow from source with Bazel by following the instructions here. Also, pay attention to the TensorFlow version: TensorFlow is not fully backward compatible, so a model exported with version 1.11 may not work with version 1.9. It is safer to use version 1.9 for the TF Lite optimization.
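If you go the pip route, pinning the version keeps the export and conversion steps reproducible. A minimal sketch, assuming a CPU-only build is enough for the conversion:

```bash
# Install a pinned TensorFlow 1.9 build via pip
pip install tensorflow==1.9.0
```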

Installing TensorFlow Object Detection

If you are not familiar with TensorFlow Object Detection, welcome! To install it, you can follow the instructions from the official git repository.
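For reference, the installation from the official repository roughly boils down to cloning the models repo, compiling the protobuf definitions, and extending PYTHONPATH. Treat the snippet below as a sketch and follow the repository’s instructions if they differ:

```bash
# Clone the TensorFlow models repository with the Object Detection API
git clone https://github.com/tensorflow/models.git
cd models/research
# Compile the protobuf message definitions used by the API
protoc object_detection/protos/*.proto --python_out=.
# Make the object_detection and slim packages importable
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
```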

Then, download the SSDLite-MobileNet model from the TensorFlow detection model zoo and unpack it.
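For example, at the time of writing the SSDLite MobileNet v2 COCO archive could be fetched and unpacked like this (the exact archive name depends on the model zoo release):

```bash
# Download and unpack the SSDLite MobileNet v2 checkpoint from the model zoo
wget http://download.tensorflow.org/models/object_detection/ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz
tar -xzvf ssdlite_mobilenet_v2_coco_2018_05_09.tar.gz
```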


Converting the model with TensorFlow Lite

You can skip this part too because we’ve made a pre-trained model available here.

To make these commands easier to run, let’s set up some environment variables:
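The variable names below follow the upstream TF Lite conversion tutorial; the PATH_TO_BE_CONFIGURED placeholders should point at the unpacked model directory:

```bash
# Point these at the unpacked SSDLite checkpoint and choose an output directory
export CONFIG_FILE=PATH_TO_BE_CONFIGURED/pipeline.config
export CHECKPOINT_PATH=PATH_TO_BE_CONFIGURED/model.ckpt
export OUTPUT_DIR=/tmp/tflite
```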

We start from a checkpoint and get a TensorFlow frozen graph with ops that are compatible with TensorFlow Lite. This step assumes you have already installed TensorFlow and the Object Detection Python libraries. To get the frozen graph, run the export_tflite_ssd_graph.py script from the models/research directory with this command:
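A sketch of the export command, based on the Object Detection API documentation (flag names may differ slightly between releases); it uses the environment variables defined above:

```bash
# Export a TF Lite compatible frozen graph from the checkpoint
python object_detection/export_tflite_ssd_graph.py \
    --pipeline_config_path=$CONFIG_FILE \
    --trained_checkpoint_prefix=$CHECKPOINT_PATH \
    --output_directory=$OUTPUT_DIR \
    --add_postprocessing_op=true
```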

In the /tmp/tflite directory, you should now see two files: tflite_graph.pb and tflite_graph.pbtxt. Note that the add_postprocessing flag enables the model to take advantage of a custom optimized detection post-processing operation, which can be thought of as a replacement for tf.image.non_max_suppression. Make sure not to confuse export_tflite_ssd_graph with export_inference_graph in the same directory. Both scripts output frozen graphs, but only export_tflite_ssd_graph produces a graph that TensorFlow Lite can consume directly, and that is the one we’ll be using.

Next we’ll use TOCO, the TensorFlow Lite Optimizing Converter, to get the optimized model. It converts the resulting frozen graph (tflite_graph.pb) to the TensorFlow Lite FlatBuffer format (detect.tflite). For a floating point model, run the following command from the tensorflow/ directory:
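A sketch of the TOCO invocation as it looked around TensorFlow 1.9, when TOCO lived under tensorflow/contrib/lite; the input and output array names come from the graph exported above:

```bash
# Convert the frozen graph to a TF Lite flatbuffer (float model)
bazel run -c opt tensorflow/contrib/lite/toco:toco -- \
  --input_file=$OUTPUT_DIR/tflite_graph.pb \
  --output_file=$OUTPUT_DIR/detect.tflite \
  --input_shapes=1,300,300,3 \
  --input_arrays=normalized_input_image_tensor \
  --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
  --inference_type=FLOAT \
  --allow_custom_ops
```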

Running our model on Raspberry Pi

To run our TensorFlow Lite model on a device, we will need to set up TensorFlow and OpenCV. You can read about this process here.
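On a Raspberry Pi running Raspbian, one possible setup looks roughly like this (the package choices are an assumption; the article linked above covers the details):

```bash
# NumPy on the Pi needs the ATLAS linear algebra library
sudo apt-get install libatlas-base-dev
# Install TensorFlow (1.9 added pip packages for the Raspberry Pi) and OpenCV
pip3 install tensorflow==1.9.0
pip3 install opencv-python
```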

Now, let’s implement a class for working with the “lite” graph:
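The original class isn’t reproduced here, so below is a minimal sketch of what such a wrapper can look like, assuming the float detect.tflite produced above and the TF 1.9 interpreter under tf.contrib.lite (the class name DetectorLite and its interface are illustrative):

```python
import cv2
import numpy as np
import tensorflow as tf


class DetectorLite:
    """Minimal wrapper around the TF Lite interpreter for an SSD detector."""

    def __init__(self, model_path='detect.tflite'):
        # In TF 1.9 the interpreter lives under tf.contrib.lite;
        # in newer versions it is tf.lite.Interpreter.
        self.interpreter = tf.contrib.lite.Interpreter(model_path=model_path)
        self.interpreter.allocate_tensors()
        self.input_details = self.interpreter.get_input_details()
        self.output_details = self.interpreter.get_output_details()
        # The exported SSD graph expects a [1, 300, 300, 3] input tensor
        _, self.height, self.width, _ = self.input_details[0]['shape']

    def detect(self, frame):
        # Resize the BGR frame, convert to RGB and scale to [-1, 1],
        # which is what the float normalized_input_image_tensor expects
        image = cv2.resize(frame, (self.width, self.height))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = (np.float32(image) - 127.5) / 127.5
        image = np.expand_dims(image, axis=0)

        self.interpreter.set_tensor(self.input_details[0]['index'], image)
        self.interpreter.invoke()

        # TFLite_Detection_PostProcess outputs: boxes, classes, scores, count
        boxes = self.interpreter.get_tensor(self.output_details[0]['index'])[0]
        classes = self.interpreter.get_tensor(self.output_details[1]['index'])[0]
        scores = self.interpreter.get_tensor(self.output_details[2]['index'])[0]
        return boxes, classes, scores
```

Usage is then a matter of reading frames with OpenCV, calling detect, and filtering the results by score, for example keeping only detections above 0.5.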

All of the code above is available on GitHub.

Summary

If we run both the TFLite and non-TFLite versions of the model, we can observe the following:
– SSDLite MobileNet v2 – 1.02 avg FPS;
– SSDLite MobileNet v2 with TensorFlow Lite – 1.73 avg FPS.

Looking at the results, we can say that TensorFlow Lite gives a performance boost of about 70%, which is quite impressive for such a simple optimization.