💡 Introduction

This talk was given to the TensorFlow Deep Learning Malaysia Facebook group during its June 2022 online meetup. The group had over 7.5k members from a variety of backgrounds related to artificial intelligence in Malaysia.

The goal of the talk was to introduce the members to existing open-source tools they can use to deploy models on the cloud and at the edge.

Half of the audience had no experience with deep learning, so the talk was tailored to beginners in the field.

🪂 The Deep Gap

I started the talk by introducing my background as an academic and my experience in the field.

I started exploring the field of deep learning (DL) in 2013. Having been in the field for more than 9 years now, I shared the story of how I arrived at this point and my observations of the DL field over the years.

I also shared that in academia, we are incentivized to publish more than anything else. As a result, many “groundbreaking” works in DL stopped at the point of publication, which is a pity. Had the work continued beyond that, it could have had the potential to change the industry.

The consequence?

More than 85% of machine learning models fail to make it into production.

Gartner Survey

I revealed that the deep gap is this: not enough attention is placed on productionizing and deploying DL models in real-world applications.

⛏ Technical Walkthrough

I then transitioned to sharing some of my recent projects on deploying DL models.

I elaborated on two general categories of deployment environments:

  • Cloud Deployment.
  • Edge Deployment.

🌧 Cloud Deployment

Cloud deployment is a setting where the trained DL model is hosted on cloud infrastructure.
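To make the cloud setting concrete, here is a minimal sketch of what the client side might look like: the device only packages an image and sends it over HTTP, while the model itself runs on the server. This is an illustration under stated assumptions, not the setup used in the talk. The endpoint URL, the `"image"` field name, and the helper names `build_request` and `predict` are all hypothetical.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint: replace with the URL of your own hosted model.
ENDPOINT = "https://example.com/predict"


def build_request(image_bytes: bytes, url: str = ENDPOINT) -> urllib.request.Request:
    """Package raw image bytes into a JSON POST request for a hosted model."""
    # Binary data is base64-encoded so it can travel inside a JSON body.
    # The "image" field name is an assumption; real services define their own.
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(image_bytes: bytes, url: str = ENDPOINT) -> dict:
    """Send the image to the server and return its decoded JSON reply."""
    with urllib.request.urlopen(build_request(image_bytes, url), timeout=30) as resp:
        return json.load(resp)
```

The point of the split is that the client stays tiny: any device that can make an HTTP request can use the model, at the cost of needing a network connection for every prediction.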

I shared how I trained a state-of-the-art VFNet model with IceVision and deployed it on an Android phone using the Hugging Face Hub ecosystem.

The details can be found in the following posts:

📱 Edge Deployment

Edge deployment is a setting where the trained DL model is placed on physical computing hardware (also known as an edge device) at the location where the data is collected.

I shared how I trained a state-of-the-art object detection model, YOLOX, to accurately detect license plates on Malaysian vehicles. I also shared how I optimized the model to run 10x faster (at 50 FPS) on a CPU using the OpenVINO toolkit.
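For readers new to these numbers, a rough sketch of how frames per second (FPS) relates to per-frame latency, and how throughput might be measured, is shown below. It uses only the Python standard library. The helper names `fps_to_latency_ms` and `measure_fps` are hypothetical, and the `infer` callable is a stand-in for whatever model call you want to time, not the benchmark used in the talk. As a sanity check on the arithmetic: 50 FPS is 20 ms per frame, so if that is the result of a 10x speedup, the starting point would have been around 5 FPS, or about 200 ms per frame.

```python
import time
from typing import Callable


def fps_to_latency_ms(fps: float) -> float:
    """Convert throughput in frames per second to per-frame latency in ms."""
    return 1000.0 / fps


def measure_fps(infer: Callable[[], object], n_runs: int = 100, warmup: int = 10) -> float:
    """Time repeated calls to `infer` and return the average frames per second."""
    # Warm-up calls are excluded from timing, since the first few runs of a
    # model are often slower than steady state.
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


if __name__ == "__main__":
    # Stand-in for a real model call: sleeps about 20 ms per "frame".
    fake_infer = lambda: time.sleep(0.02)
    print(f"measured: {measure_fps(fake_infer, n_runs=20, warmup=2):.1f} FPS")
    print(f"50 FPS = {fps_to_latency_ms(50):.0f} ms per frame")
```

Measuring before and after an optimization step with the same function and the same input is what makes a claim like "10x faster" comparable.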

I briefly talked about an alternative to the OpenVINO toolkit: the DeepSparse and SparseML libraries by Neural Magic, which can accelerate inference up to 180 FPS.

The details can be found in the following posts:

🍧 Takeaways

Here are the takeaways from the brief talk:

  • Begin with deployment in mind as the end goal.
  • The gap is deeper on the deployment side.
  • Many open-source tools make it easy to deploy models.
  • MLOps is a hot topic worth exploring.

📽 Video & Presentation Deck

Recorded video 👇

My presentation deck 👇