Django Requests Canceled by Client

Andreas

2023-12-21 15:30

We kind of plan to proxy a REST API call through a Django view. This is not the ideal solution, but currently the best option. The result of the upstream REST API call is processed and saved in the database of the proxying Django project. The view returns the same value that is saved to the database. The upstream request is normally fast enough to stay within the client's timeout, but what happens when it is not?

In this post I reproduce this scenario with Django 5.0 and a current psycopg 3. We run sync views and models in Django, so I will try that first. Because the view needs the result of the REST API call before it can respond, async would not speed this up, so this post uses sync for everything. Our setup uses PostgreSQL, and I may want to try this with async later, so this reproduction uses PostgreSQL too.

Steps in this post:

Running PostgreSQL in Docker.
Setting up Django.
Add model, url-route and view.
The actual experiment.

Running PostgreSQL

Run PostgreSQL in a Docker container:

docker run -d --name mydb -e POSTGRES_PASSWORD=someSecretPasswordHere -p 5432:5432 postgres:latest

No volume needed because we don't need to persist the database after removing the container.

We need to create the database (here "mysite"). This can be done with psql, which I don't have on my system, so I used the one from the postgres Docker container like this:

docker run -it --rm postgres psql -h 172.17.0.2 -U postgres
# then paste the password
# and create the db:
CREATE DATABASE mysite;
exit

The IP address is needed because psql runs inside its own Docker container, and from there localhost does not reach port 5432 of another container. To get the IP address: docker inspect mydb | jq -r '.[0].NetworkSettings.IPAddress'.
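If jq is not installed either, the same lookup can be sketched in Python. This is only a minimal sketch: the helper names extract_ip and container_ip are made up for this post, and it assumes the container sits on Docker's default bridge network, so that the IPAddress field used in the jq filter above is populated.

```python
import json
import subprocess


def extract_ip(inspect_output: str) -> str:
    """Pick .[0].NetworkSettings.IPAddress out of `docker inspect` JSON."""
    containers = json.loads(inspect_output)
    return containers[0]["NetworkSettings"]["IPAddress"]


def container_ip(name: str = "mydb") -> str:
    """Run `docker inspect <name>` and return the container's IP address."""
    result = subprocess.run(
        ["docker", "inspect", name],
        capture_output=True,
        text=True,
        check=True,  # raise if docker is missing or the container is unknown
    )
    return extract_ip(result.stdout)
```

Calling container_ip("mydb") should then return the address to pass to psql via -h.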

Setting up Django

Install the required packages and set up a new Django project:

# in a virtualenv
pip install "django==5.0" "psycopg[binary]==3.1.16"
# init django project
django-admin startproject mysite
cd mysite
# create a new app
python manage.py startapp myapp

Changes needed in mysite/settings.py:

Add "myapp" to INSTALLED_APPS
And change database to postgresql (this is not the proposed way from the Django documentation!):

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "HOST": "localhost",
        "NAME": "mysite",
        "USER": "postgres",
        "PASSWORD": "someSecretPasswordHere",
    }
}
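Hardcoding the password is fine for a throwaway experiment. For anything longer-lived, one common alternative is to read the connection details from environment variables. The following is a minimal sketch, not the approach from the Django documentation: the function name build_databases and the DB_* variable names are assumptions made up for this example.

```python
import os


def build_databases(env=os.environ):
    """Build the DATABASES setting from environment variables (names are hypothetical)."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "HOST": env.get("DB_HOST", "localhost"),
            "NAME": env.get("DB_NAME", "mysite"),
            "USER": env.get("DB_USER", "postgres"),
            # No fallback on purpose: a missing password raises KeyError at startup.
            "PASSWORD": env["DB_PASSWORD"],
        }
    }


# In mysite/settings.py this would be used as:
# DATABASES = build_databases()
```

With DB_PASSWORD exported in the shell that starts runserver, the rest of this post works unchanged.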

Model, Route and View

Because we want to test whether the write to the database still happens when the request is interrupted, we need a model, a view and a route to the view.

First we add a simple model to myapp/models.py:

from django.db import models

class SomeData(models.Model):
    key = models.CharField(max_length=42, unique=True)
    is_running = models.BooleanField(default=True)
    payload = models.JSONField(default=dict, blank=True)
    modified = models.DateTimeField(auto_now=True)  # refreshed on every save

A route to our view in myapp/urls.py:

from django.urls import path
from . import views

urlpatterns = [
    path("", views.index, name="index"),
]

Change the urlpatterns and add an import for "include" in mysite/urls.py:

from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path("admin/", admin.site.urls),
    # "" (not "/") mounts myapp at the site root
    path("", include("myapp.urls")),
]

The most important one: the view (saved to myapp/views.py):

import json
import time

from django.http import HttpResponse
from .models import SomeData

def index(request):
    key = request.GET.get("key")
    if key:
        obj, _ = SomeData.objects.update_or_create(
            key=key,
            defaults={"key": key, "is_running": True},
        )

        # here some rest api call is happening
        time.sleep(5)

        # some results get processed and written to the database
        obj.payload = {"some": key}
        obj.is_running = False
        obj.save()
        return HttpResponse(json.dumps(obj.payload))

    return HttpResponse("needs ?key=")

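Reusing the obj after waiting for 5 seconds could cause problems: if another request changes the same row in the meantime, obj is stale and obj.save() overwrites those changes. On the other hand, reprocessing should result in the same payload being written to the database. So we don't care for now and will re-fetch the object before saving when that becomes necessary.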

The Experiment

We need to run the migrations and start the Django server:

python manage.py makemigrations myapp
python manage.py migrate
python manage.py runserver 8080

Now trigger a call, but instead of waiting the full 5 seconds, cancel the request after 2 seconds:

$ curl -m 2 http://localhost:8080/\?key\=unique_key
curl: (28) Operation timed out after 2001 milliseconds with 0 bytes received

Now check what the data in the database looks like by running python manage.py shell:

>>> from myapp.models import SomeData
>>> SomeData.objects.get(key="unique_key").__dict__
{'_state': <django.db.models.base.ModelState object at 0x7f1bdbf79f90>, 'id': 3, 'key': 'unique_key', 'is_running': False, 'payload': {'some': 'unique_key'}, 'modified': datetime.datetime(2023, 12, 21, 14, 20, 45, 754889, tzinfo=datetime.timezone.utc)}

Result

It seems the data is still written to the database even though the response never reaches the client. I verified this by adding some prints to the view and increasing the sleep time to 20 seconds. Then I ran the project with gunicorn (gunicorn mysite.wsgi -b 0.0.0.0:8080) to verify that this is not a side effect of the devserver. The same thing still happens. 🎉
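The likely explanation is that a synchronous handler is not told that the client has gone away. It only runs into the closed socket when it finally tries to send the response, which is after the write. The sketch below reproduces that mechanism without Django or PostgreSQL, using only the Python standard library. It is an illustration under assumptions, not what Django or gunicorn do internally: SlowHandler, run_experiment and the in-memory list standing in for the database are all made up for this example, and the delays are shortened to keep it quick.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(1.0)  # stands in for the slow upstream REST call
        self.server.writes.append(self.path)  # stands in for obj.save()
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"done")
        except OSError:
            pass  # the client is already gone; nothing left to do

    def log_message(self, format, *args):
        pass  # keep the output quiet


def run_experiment():
    server = HTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: pick a free port
    server.writes = []
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()

    timed_out = False
    try:
        # Like `curl -m`: give up long before the handler has finished.
        urllib.request.urlopen(f"http://127.0.0.1:{port}/?key=unique_key", timeout=0.3)
    except OSError:  # timeouts and URLError are both OSError subclasses
        timed_out = True

    time.sleep(1.5)  # give the handler time to finish its work
    server.shutdown()
    server.server_close()
    return timed_out, list(server.writes)


if __name__ == "__main__":
    print(run_experiment())
```

Running the script should report that the client timed out and that the handler still recorded the request path, which matches what the Django experiment showed.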