Predicting bicycle traffic with a REST API using real-time weather data

This post extends the project on modelling bicycle traffic in Cologne. We will build a REST API with Flask to serve our trained models and use weather forecasts from the tomorrow.io API as input, so that the models can make predictions in a real-time setting.

This post showcases my ability to:
- Think end-to-end about a machine learning project
- Use the tomorrow.io API to get real-time weather forecasts
- Use Flask to serve the models through a REST API

The tomorrow.io forecast API

In order to use our models in a real-time production setting, we need to think about which features are available at inference time. As the bicycle counter data is quantized in time, we cannot use it to predict, e.g., the bicycle traffic of the next hour. This is clearly a limitation of the original project idea and data set. Moreover, our prediction pipeline relies on weather data, so at inference time we need a weather forecast for the time point we want to predict.

Here is an example of how we can get the weather data from the tomorrow.io API. You can get an API key with a free account. We will store the key in a .env file and use the python-dotenv package to load it. Don’t forget to add .env to your .gitignore file so that the key is never committed. Let’s look at one time point from the forecast API.
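A minimal sketch of such a query could look as follows. It uses only the standard library for the HTTP call, and python-dotenv only inside the key loader. The endpoint URL, the query parameter names, the shape of the response and the environment variable name TOMORROW_API_KEY are assumptions here and should be checked against the tomorrow.io documentation.

```python
import json
import os
import urllib.parse
import urllib.request

# Assumed endpoint of the tomorrow.io v4 forecast API; verify in the docs.
FORECAST_URL = "https://api.tomorrow.io/v4/weather/forecast"


def load_api_key(env_var="TOMORROW_API_KEY"):
    """Load the key from .env; the variable name is a hypothetical choice."""
    from dotenv import load_dotenv  # third-party: pip install python-dotenv

    load_dotenv()  # copies KEY=value pairs from .env into os.environ
    return os.environ[env_var]


def build_forecast_url(api_key, location):
    """Build the request URL for a "lat,lon" location string."""
    query = urllib.parse.urlencode({"location": location, "apikey": api_key})
    return f"{FORECAST_URL}?{query}"


def fetch_forecast(api_key, location):
    """Query the API and return the decoded JSON payload."""
    url = build_forecast_url(api_key, location)
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def first_hourly_point(payload):
    """Return (timestamp, values) of the first hourly forecast entry.

    Assumes a payload of the shape
    {"timelines": {"hourly": [{"time": ..., "values": {...}}, ...]}}.
    """
    point = payload["timelines"]["hourly"][0]
    return point["time"], point["values"]
```

With a valid key, first_hourly_point(fetch_forecast(load_api_key(), "50.94,6.96")) would return the timestamp and the weather values of the first forecast hour for a location roughly in Cologne.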

Processing the forecast data

Since the tomorrow.io API uses different feature names, we have to translate them to be compatible with our training data. However, a few columns will be missing. We can impute them with the mean of the training data. In practice, this will lead to worse predictions, but we will accept that for now to go through the whole process.
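The translation and imputation step can be sketched with plain dictionaries. All feature names and mean values below are hypothetical placeholders, since the actual column names of the training data are not listed here. The real pipeline may well do this with pandas instead.

```python
def translate_forecast(values, name_map, training_means):
    """Map forecast fields to training feature names and impute gaps.

    values:         the "values" dict of one forecast time point
    name_map:       forecast field name -> training feature name
    training_means: training feature name -> mean over the training data
    Returns the complete feature dict and the list of imputed features.
    """
    # Rename only the fields that have a counterpart in the training data.
    renamed = {
        name_map[field]: value
        for field, value in values.items()
        if field in name_map and value is not None
    }
    # Start from the training means so every expected feature is present,
    # then overwrite with whatever the forecast actually provides.
    features = {**training_means, **renamed}
    imputed = sorted(set(training_means) - set(renamed))
    return features, imputed


# Hypothetical names and made-up means, for illustration only.
NAME_MAP = {"temperature": "air_temp", "windSpeed": "wind_speed"}
TRAINING_MEANS = {"air_temp": 10.5, "wind_speed": 3.2, "sunshine_duration": 18.0}

features, imputed = translate_forecast(
    {"temperature": 4.2, "windSpeed": 5.0, "humidity": 80},
    NAME_MAP,
    TRAINING_MEANS,
)
print(features)  # forecast values where available, training means otherwise
print(imputed)   # features that had to be filled with the training mean
```

Returning the list of imputed features alongside the values makes it easy to log how much of each prediction rests on training means rather than on the forecast.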

Using the models in a REST API

Let’s copy the files to another directory, where we use the models in a REST API. Flask is a great choice for building a REST API around our trained models with very little code. We can follow the example provided by Muhammad Bilal Shinwari’s article on Medium (Code).

We will write two new files: app.py and client.py. The former runs the Flask app, the latter posts requests. Let’s assume that the features are part of the request. This way, clients can request predictions for time points of their choosing. Another possible solution would be to assume that clients always want to predict the next possible time frame. Then we could move the tomorrow.io query into the Flask app. Here is what we need inside of app.py:
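A minimal sketch of such an app.py, under several assumptions: the endpoint names /predict and /predict-mapie and the response keys Prediction and PI are taken from the console output further down, while the helper names, the pickle-based loading, the 90% interval (alpha=0.1) and the MAPIE return convention are assumptions. The prediction logic is kept in plain functions so it can be tried without a running server.

```python
import pickle


def load_pickle(path):
    """Load a pickled model; the storage format is an assumption."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def features_to_row(payload, feature_names):
    """Order the JSON features the way the model expects them.

    Returns a single-row nested list. A pipeline that selects columns
    by name would need a pandas DataFrame here instead.
    """
    missing = [name for name in feature_names if name not in payload]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [[float(payload[name]) for name in feature_names]]


def point_prediction(model, row):
    """Response body for /predict."""
    return {"Prediction": int(round(float(model.predict(row)[0])))}


def interval_prediction(mapie_model, row, alpha=0.1):
    """Response body for /predict-mapie.

    Assumes the MAPIE (pre-1.0) convention that predict(X, alpha=...)
    returns (y_pred, y_pis) with y_pis indexed as [sample][bound][alpha].
    """
    y_pred, y_pis = mapie_model.predict(row, alpha=alpha)
    lower, upper = float(y_pis[0][0][0]), float(y_pis[0][1][0])
    return {
        "Prediction": int(round(float(y_pred[0]))),
        "PI": [int(round(lower)), int(round(upper))],
    }


def create_app(model, mapie_model, feature_names):
    """Wire the two prediction functions to Flask routes."""
    from flask import Flask, jsonify, request  # third-party: pip install flask

    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        row = features_to_row(request.get_json(), feature_names)
        return jsonify(point_prediction(model, row))

    @app.route("/predict-mapie", methods=["POST"])
    def predict_mapie():
        row = features_to_row(request.get_json(), feature_names)
        return jsonify(interval_prediction(mapie_model, row))

    return app
```

To run it as a script, one would load the two models (for example with load_pickle), pass them to create_app together with the list of feature names, and call run(debug=True) on the returned app.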

And then client.py can look like this:
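A minimal client sketch, assuming the server accepts the features as a flat JSON object. It uses the standard library to stay dependency-free; requests.post(url, json=features) would do the same job. The address is the one printed by the Flask development server, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # address printed by the Flask dev server


def build_request(endpoint, features):
    """Create a POST request carrying the features as a JSON body."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_features(endpoint, features):
    """Send the features and return the decoded JSON response."""
    request = build_request(endpoint, features)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With the app running, print(post_features("/predict", features)) and print(post_features("/predict-mapie", features)) would print the two response dictionaries, where features is the dictionary of model inputs for the requested time point.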

Running the Flask app and the client

Now you can run the Flask app with python app.py and then, in a second terminal, run the client with python client.py. You should see the predictions in the client’s console. The terminal will show something like this on the Flask side:

* Serving Flask app 'app'
 * Debug mode: on
INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 248-803-927
INFO:werkzeug:127.0.0.1 - - [10/Dec/2024 11:39:17] "POST /predict HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [10/Dec/2024 11:39:50] "POST /predict-mapie HTTP/1.1" 200 -

And the client will show the predictions:

{'Prediction': 91762}
{'PI': [75288, 171766], 'Prediction': 124490}

So this is an extremely easy and convenient way to use our custom models in a real-time setting, where the client doesn’t have to know how the model is implemented.