Deploy a machine learning model in seconds

Don’t want to deal with Docker, Flask, and web hosting? Try this instead

Abubakar Abid
4 min read · Feb 26, 2021

Several years ago, when I built my first machine learning model to classify handwritten digits, I immediately wanted to show it off to my friends and my siblings so that they could see what I had built. I remember searching “how to deploy a machine learning model” and becoming frustrated with all of the steps required, from containerization with Docker to purchasing an Amazon web server to host the model. What I thought would be a fast process took me several days of development and a lot of debugging.

Now, many years into my PhD, I’ve trained hundreds of machine learning models to do all sorts of things, like classifying ultrasound images and figuring out which genes are responsible for cancer. But I still dread deploying my machine learning models because, as this xkcd comic suggests, it’s time-consuming and often requires ongoing maintenance:

Credit: https://xkcd.com/1319/

But in this article, I want to share a simple trick for deploying a machine learning model that requires only Python and 3 lines of code, and is completely free!

Enter… the Gradio library

The trick is to use the open-source Gradio Python library, which was developed to create web demos from machine learning models. If you visit the Gradio website, you’ll see examples of demos that you can create, like this one below for a model that counts the number of people in a crowd.

But we won’t be focusing on the demos in this article. What I’ll instead be talking about is a (relatively) secret feature of the library called share.

This feature allows you to deploy a model for a certain time period, allowing anyone to use it and try it out. And you can do it directly from a Python terminal or notebook running locally or on a server. Here’s what it looks like:

Let me break down what’s happening in the gif into 3 steps…

Step 1: Load the model & define your inference function

Gradio is not restricted to any specific machine learning framework. Instead, it works with general Python functions. So the first step is to create a function that wraps your model’s inference. The function should take an input and return the prediction (the prediction can be a number, some text, an image, a dictionary of labels/confidences, or something else, depending on the type of your model).

The example in this article uses an image classification model in TensorFlow, though you can also use PyTorch, scikit-learn, or any other framework.
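A minimal sketch of such a wrapper might look like the following. It assumes a pretrained Keras MobileNetV2 with ImageNet weights, a stand-in picked for illustration rather than the model from the demo, and the helper names load_model and to_label_confidences are hypothetical. TensorFlow and NumPy are imported inside the functions so the heavy dependencies are only needed once a prediction is actually requested.

```python
import functools


@functools.lru_cache(maxsize=1)
def load_model():
    """Load the model once and reuse it (assumption: Keras MobileNetV2)."""
    import tensorflow as tf  # imported lazily; requires `pip install tensorflow`

    return tf.keras.applications.MobileNetV2(weights="imagenet")


def to_label_confidences(decoded):
    """Turn (class_id, label, score) triples into a {label: confidence} dict."""
    return {label: float(score) for _, label, score in decoded}


def classify_image(image):
    """Take an image as an array of shape (height, width, 3), return confidences."""
    import numpy as np
    import tensorflow as tf

    mobilenet = tf.keras.applications.mobilenet_v2
    # Resize to the 224x224 input this model expects, then add a batch axis.
    resized = tf.image.resize(np.asarray(image, dtype="float32"), (224, 224))
    batch = mobilenet.preprocess_input(resized.numpy()[np.newaxis, ...])
    probabilities = load_model().predict(batch)
    # decode_predictions returns, per image, a list of (class_id, label, score).
    top = mobilenet.decode_predictions(probabilities, top=3)[0]
    return to_label_confidences(top)
```

Returning a dictionary of labels and confidences is a deliberate choice here, since that is one of the prediction formats described above.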


Notice that we’ve loaded the model and created a wrapper function called classify_image() which takes in an image (in the form of an array) and passes it through the model to get a prediction.

Step 2: Create a Gradio interface object

To use Gradio, you have to create a simple GUI around your model, consisting of an input component and an output component. There are many options (https://gradio.app/docs), so you can choose whichever one makes sense for your model. In this case, we’ll use the Image input and Label output. Then, we pass the function we defined and the input and output components we chose into the Interface() class.
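As a rough sketch, building the interface could look like this. It assumes a recent Gradio release in which components are exposed as gr.Image and gr.Label (older releases exposed them under gr.inputs and gr.outputs), and it uses a placeholder classify_image with fixed confidences so the block stands on its own. The build_interface helper is a hypothetical name, not part of Gradio.

```python
def classify_image(image):
    """Placeholder inference function (hypothetical): returns fixed confidences."""
    return {"cat": 0.7, "dog": 0.3}


def build_interface(fn):
    """Wrap `fn` in a GUI with an Image input and a Label output."""
    import gradio as gr  # imported lazily; requires `pip install gradio`

    return gr.Interface(
        fn=fn,                                # the inference function to call
        inputs=gr.Image(),                    # passes the image to fn as an array
        outputs=gr.Label(num_top_classes=3),  # shows the top 3 labels
    )
```

Calling build_interface(classify_image) returns the interface object used in the next step. Swapping the placeholder for a real model wrapper requires no other changes.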


Step 3: Call the launch() method with share=True

The final step is to call launch(share=True) on the interface object you just created.
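To make the call concrete, here is a small self-contained sketch. It uses a trivial text function instead of an image classifier, which also illustrates that any Python function can be shared this way. The names greet and deploy are hypothetical, and the "text" strings are Gradio's shorthand for a text box component.

```python
def greet(name):
    """Stand-in inference function (hypothetical): any Python function works."""
    return f"Hello, {name}!"


def deploy(fn):
    """Wrap `fn` in a text-to-text interface and launch it with a public link."""
    import gradio as gr  # imported lazily; requires `pip install gradio`

    demo = gr.Interface(fn=fn, inputs="text", outputs="text")
    # share=True asks Gradio for a temporary public URL in addition to the
    # local one.
    demo.launch(share=True)
```

Running deploy(greet) from a terminal or notebook starts the server and prints both the local and the public URL.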

…And that’s it! When you run this, you’ll see that a public link to your model has been magically created.

Awesome! You can now send the link to your friends, collaborators, or colleagues for them to use and run your model directly from their browsers.

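To summarize: wrap your model in a Python function, create a Gradio Interface around it with an input and an output component, and call launch(share=True) to get a public link.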