Six Five Six Four

2024 07Jul 28Sun

Filed under Technical

How to disable Nginx buffering

https://serverfault.com/questions/768693/nginx-how-to-completely-disable-request-body-buffering

# Reverse proxy to terminate TLS for Synapse
server {
    listen 8448 ssl default_server;
    
    location / {
        # https://serverfault.com/questions/768693/nginx-how-to-completely-disable-request-body-buffering
        # 
        # I disabled request and response body buffering here because
        # it might be messing up the sync in Synapse. Also nobody needs it,
        # are you kidding me? It's 2024, clients and servers are all streaming.
        # Don't buffer the responses. 2024 07Jul 28Sun
        
        client_max_body_size 0;
        proxy_buffering off;
        proxy_pass http://127.0.0.1:8008;
        proxy_redirect off;
        proxy_request_buffering off;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

What happened?

I got someone new onto my self-hosted Matrix chat stack. Then I got someone else to federate with it.

Then we noticed it was barely functioning at all.

Chat messages weren't getting delivered on time even though the sender and receiver phones were both on good Internet connections and had the app running in the foreground.

Also, there was a green loading bar that wouldn't go away.

Why weren't messages getting delivered?

Nginx was buffering.

[Image: the character Alastor from the adult animated show Hazbin Hotel, right as he sings "Uh-oh, the TV is buffering!"]

"Uh oh, the proxy is buffering!"

Luckily, I had run into this problem before while running PTTH behind Nginx.

I copied the above directives from the Nginx config I use for PTTH, pasted them into the Nginx config for Synapse, restarted Nginx and Synapse, and everything has been fine for the last few hours.

Why had I seen it before?

Because PTTH and Matrix both use long polling on some of their HTTP connections.

Nginx buffers by default, so I had similar bugs when deploying PTTH. If Nginx buffers in front of the PTTH relay, the servers may not send files to the client when they're supposed to, even though everything is online and healthy.

In fact, I copied the idea of long polling from Matrix's implementation, used it on a couple small projects, and then made it the central piece of PTTH.

Why use long polling?

Because you can't build a good chat app without it. [1]

Without long polling, a normal HTTP connection is a single request followed by a single response.

There's no concept of timing, and the server can't start a request; only the client can.
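
As a sketch (Python standard library; the URL and port are made-up placeholders):

import urllib.request

# One request, one response, and the exchange is over.
# The server can't say anything until the client asks.
with urllib.request.urlopen("http://127.0.0.1:8008/messages") as resp:
    print(resp.read())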

A chat app needs to immediately update when the server tells the client that someone else messaged us.

You can approximate this with normal polling, but it doesn't work well.

With polling, the client asks "anything new?" on a schedule, so we're constantly talking to the server even though nothing is happening. This is a waste of bandwidth, battery life, energy, and CPU time.

And despite the waste, we still don't hear about new messages when they reach the server. We only hear about them on the next poll cycle.
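
Here's a sketch of that polling loop, again with a made-up endpoint:

import time
import urllib.request

# Hypothetical endpoint that answers immediately with any new messages.
URL = "http://127.0.0.1:8008/messages"

while True:
    with urllib.request.urlopen(URL) as resp:
        body = resp.read()
    if body:
        print("new messages:", body)
    # A message that arrives right after a poll sits unseen until the next one.
    time.sleep(10)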

Long polling is when the server puts the connection on hold, like putting a phone call on hold.

We can't change the request-response architecture, but we can use a delay before the response to hijack it and use it for cheap server push.

"What's the most important part of comedy, it's timing."

Why do long polling and buffering conflict?

Because they fight each other to control the timing of the connection.

Buffering assumes that it's okay to delay the first part of the request or response, in order to send the whole thing at once instead of as small chunks.

Long polling assumes that any part of a response might be useful all by itself.

So delaying a response from a long-poll server confuses the heck out of the clients.

The client is waiting for a "Roger" from Nginx, but Nginx is waiting for the server to "finish" its response, and the server may not finish that response for a long time if the chat is quiet.

That explains the infinite loading bar. Then it gets worse.

When the server finally does send something, Nginx buffers it and doesn't tell the client. So the client is still waiting, the server doesn't know anything is wrong, Nginx is doing its best, and I'm scratching my head at a 10-second delay between sending and receiving a text.

So if time is part of the message, buffering causes the server to accidentally lie.
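
One way to see this from the outside is to time when the bytes of a long-poll response actually arrive. This is a sketch with a placeholder URL: behind a buffering proxy the bytes land in one burst at the end, while unbuffered they trickle in as the upstream writes them.

import time
import urllib.request

# Placeholder long-poll endpoint behind the proxy; a real Matrix /sync
# request would also need an access token.
URL = "http://127.0.0.1:8080/sync"

start = time.monotonic()
with urllib.request.urlopen(URL) as resp:
    while True:
        chunk = resp.read(1)
        if not chunk:
            break
        print(f"{time.monotonic() - start:7.3f}s  {chunk!r}")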

Why does Nginx do this by default?

I'm guessing it's for backwards compatibility. Nginx will be 20 years old this year, and it serves a double-digit percentage of all Web traffic, depending on how you measure it.

You can't be that big for that long and change your defaults. Someone out there has depended on buffering for 20 years, and they need to feel safe that upgrading Nginx won't break their website.

Why was it the default back then?

The Web was built different 20 years ago. Rust, Go, and Node.js didn't exist. Chromium didn't exist.

Many servers handled requests by forking child processes or spawning worker threads. Nginx was designed to solve the C10K problem by using an event-based architecture to serve many requests on a single OS thread.

In this world, long-polling was rare, and Nginx would typically be connected to an app server that did need an OS thread for each connection. So a slow client, or a client with slow Internet, could overload the app servers by taking up a bunch of threads for a long time.

Nginx probably set its defaults to protect app servers like that by buffering requests and responses, so that the app server's threads would be freed up as quickly as possible, and Nginx would handle the difficult work of talking to slow clients.

Nowadays this isn't needed. I use Nginx for TLS termination, but the PTTH relay uses an event-based architecture similar to Nginx's, and buffering adds nothing. PTTH can handle 1,000 concurrent connections on a $5 / month DigitalOcean droplet, because 1,000 idle connections are a non-issue for tokio and hyper, which PTTH is built on.

Why did I forget about this when deploying my chat server?

Because I don't take good care of my home server setup, and I treat everything like kittens instead of cattle, so the two Nginx configs for PTTH and for chat are totally separate one-off things. Oops.

Footnotes

[1] - You can use WebSockets, but long polling is much simpler to implement.

Discuss on Mastodon