You shouldn’t do it like this.

Model prediction is about as heavy as backend work gets. With this setup you hand responsibility for queuing requests to the operating system or to the front HTTP server (e.g. nginx), and under load most of that queue will simply be dropped.

To make this production ready you need to:

  1. Run a background worker that listens on a request queue.

This way you can see the load in your queue, add more workers when it grows, and serve every request instead of dropping them (see the sketch below).
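As a minimal sketch of such a worker, assuming Redis as the broker (any queue like RabbitMQ or SQS works the same way); the queue names, request format, and `load_model` helper here are hypothetical placeholders:

```python
import json

import redis  # assumed broker client (pip install redis)

from my_project.model import load_model  # hypothetical model loader

QUEUE_KEY = "prediction:requests"        # assumed queue name
RESULT_KEY = "prediction:results:{}"     # assumed per-request result key


def main():
    r = redis.Redis(host="localhost", port=6379)
    model = load_model()  # load the model once, outside the request loop

    while True:
        # BLPOP blocks until a request arrives, so the worker
        # sits idle instead of busy-polling the queue.
        _, raw = r.blpop(QUEUE_KEY)
        request = json.loads(raw)

        prediction = model.predict(request["input"])

        # Store the result where the web frontend can pick it up;
        # expire it so abandoned results don't pile up.
        r.set(RESULT_KEY.format(request["id"]), json.dumps(prediction), ex=300)


if __name__ == "__main__":
    main()
```

The web handler then only pushes `{"id": ..., "input": ...}` onto `prediction:requests` and returns immediately; to handle more load you just start more worker processes against the same queue, and the queue length itself tells you how far behind you are.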
