Thursday, December 26, 2019

Timeouts occur when serving deep learning models with Django, Gunicorn, and Nginx

I am serving deep learning models on four 2080 Ti GPUs using Django, Gunicorn, and Nginx. Most requests complete in around 200 ms, but a few take more than 2 s to finish. This happens only occasionally and is hard to reproduce under any specific setting. How can I fix this problem?

BTW, the QPS is only 1~2, so this is not the result of high GPU/CPU load.

Here is the Nginx log: nginx log
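
To narrow down where the extra time goes, one option is to time the model call separately from the rest of the request. The following is a minimal sketch using only the Python standard library; `timed`, `slow_stages`, `SLOW_THRESHOLD_S`, and `fake_predict` are hypothetical names made up for illustration and are not part of the setup described above.

```python
import time
from contextlib import contextmanager

SLOW_THRESHOLD_S = 1.0  # assumed cutoff; typical latency here is ~0.2 s

durations = {}  # stage name -> list of elapsed seconds


@contextmanager
def timed(stage):
    """Record wall-clock time spent inside the `with` block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        durations.setdefault(stage, []).append(time.perf_counter() - start)


def slow_stages(threshold=SLOW_THRESHOLD_S):
    """Return {stage: max_elapsed} for stages that ever exceeded `threshold`."""
    return {s: max(v) for s, v in durations.items() if max(v) > threshold}


def fake_predict(delay):
    """Hypothetical stand-in for the real model inference call."""
    time.sleep(delay)
    return "ok"


if __name__ == "__main__":
    with timed("inference"):
        fake_predict(0.01)
    print(slow_stages())
```

Wrapping the real inference call inside a view with `with timed("inference"):` and comparing the recorded values against Nginx's `$request_time` and `$upstream_response_time` would indicate whether the slow requests are spent in the model or elsewhere in the stack.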
