Why I hate custom protocols over HTTP.

One recent trend is to use HTTP to send data between a client and server. Between protocols built on top of SOAP and XML-RPC (and yes, I’ve built code on top of XML-RPC, and have a Java XML-RPC library), it’s not all that uncommon to send text commands over HTTP.

And it makes sense: HTTP generally is not blocked by various internet providers while other ports are firewalled, and it is well supported across the ‘net.

As a rule, however, I’m generally opposed to overriding an existing protocol for private use. My instinct is that if I can open a port from the client to the server that is not in use by an existing protocol, I should use that port instead.

With HTTP, there are a number of downsides. HTTP is essentially a polling (request/response) protocol: ask a question, wait for an answer, get an answer. A lot of plumbing has gone into HTTP to work around the resulting performance issues, but because it is essentially a polling protocol, there is little you can do to bypass a resource that takes a long time to download besides opening a second connection. (Protocols like LDAP allow multiple outstanding requests over the same physical TCP socket.)

HTTP has also become somewhat more complicated over the years, with things like optional keep-alive settings and an array of possible return codes. All of this makes sense if you’re building a web browser (though some of it is a bit over-engineered: I don’t know if 418: “I’m a teapot” is a joke or a sarcastic response to things like 449: “Retry With”), but for a simple RPC protocol, we really don’t need more than “success”/“failure”/“exception.”
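To make the three-outcome idea concrete, here is a minimal sketch in Java of what such a reply could look like on the wire. The RpcFrame class, its Status enum, and the layout (one status byte, a four-byte big-endian length, a UTF-8 payload) are all illustrative assumptions, not taken from any real system.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical reply frame for a private RPC protocol: status + length + payload.
final class RpcFrame {
    // The only three outcomes the protocol distinguishes.
    enum Status { SUCCESS, FAILURE, EXCEPTION }

    final Status status;
    final String payload;

    RpcFrame(Status status, String payload) {
        this.status = status;
        this.payload = payload;
    }

    // Layout: 1 status byte, 4-byte big-endian payload length, UTF-8 payload.
    byte[] encode() {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + body.length);
        buf.put((byte) status.ordinal());
        buf.putInt(body.length);
        buf.put(body);
        return buf.array();
    }

    // Rejects unknown status codes and lengths that overrun the buffer.
    static RpcFrame decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int code = buf.get();
        Status[] all = Status.values();
        if (code < 0 || code >= all.length) {
            throw new IllegalArgumentException("Unknown status code " + code);
        }
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("Bad payload length " + length);
        }
        byte[] body = new byte[length];
        buf.get(body);
        return new RpcFrame(all[code], new String(body, StandardCharsets.UTF_8));
    }
}
```

A real protocol would also need request IDs, versioning, and a size limit, but the point stands: nothing here is open to reinterpretation by an intermediary that believes it is looking at web traffic.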

And today I learned another thing that just confirms my “don’t override someone else’s protocol; just build your own” instinct.

As designed, the client I’m working on initialized a connection by requesting information on static resources that may have changed. So I’d do an “init” call, and wait for a response. As part of the request call, the server team specified that I should send “if-modified-since” with the date of the last response, so I could tell whether I should update the cached response. (This was modified from the original idea, which was to simply use an integer version.) This client runs on Android both over WiFi and over the cell network.

You can guess what happened next.

Yes, T-Mobile rolled out a new proxy server to reduce 3G network traffic by automatically detecting and caching server responses, and answering the init call with ‘304 Not Modified’ responses. Well, if you send ‘if-modified-since’, you had better handle a 304, right?

My client didn’t.

And so the client software that 130,000 people were running died. Hard.

The first time you ran the application it would sync up just fine. But the next time you connected, T-Mobile would detect that your request hadn’t changed and would send a 304 response, which the client did not understand; it would eventually shut down, claiming it could not connect to the server.

And we never tested this. Of course we never tested this. Our server never sent a 304 response, and so we never had a way to test this. In retrospect, of course “everyone knows” that if you send an if-modified-since, you should handle the 304 response.
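In hindsight, one way to exercise this path without a misbehaving proxy is a stub server that always answers 304. The sketch below assumes a desktop JVM (the JDK’s built-in com.sun.net.httpserver package is not part of Android), and the class name, method name, and /init path are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;

// Hypothetical test helper: a local server that imitates the caching proxy.
final class NotModifiedStub {
    // Starts the stub, makes one conditional request, and returns the status seen.
    static int statusFromStub() {
        HttpServer server;
        try {
            // Port 0 asks the OS for any free port on the loopback interface.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/init", exchange -> {
            // Always claim the resource is unchanged; -1 means "no response body".
            exchange.sendResponseHeaders(HttpURLConnection.HTTP_NOT_MODIFIED, -1);
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/init").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            // Sends the If-Modified-Since header, as the real client did.
            conn.setIfModifiedSince(System.currentTimeMillis());
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            server.stop(0);
        }
    }
}
```

Pointing the client’s init code at a stub like this would have surfaced the unhandled 304 long before a carrier’s proxy did.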

The fix was simple, as all such things tend to be once they are discovered and understood.
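The actual fix is not shown here, but a minimal sketch of handling the 304 might look like the following, assuming the client uses java.net.HttpURLConnection and keeps the last init payload cached as a string. InitClient, InitResult, resolve, and fetchInit are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical init-call client that treats 304 as "use the cached copy".
final class InitClient {
    static final class InitResult {
        final String payload;
        final boolean fromCache;

        InitResult(String payload, boolean fromCache) {
            this.payload = payload;
            this.fromCache = fromCache;
        }
    }

    // Pure decision logic, kept apart from the network code so it can be
    // tested without a server or a proxy.
    static InitResult resolve(int status, String body, String cached) {
        if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
            if (cached == null) {
                throw new IllegalStateException("Got 304 but nothing is cached");
            }
            return new InitResult(cached, true);
        }
        if (status == HttpURLConnection.HTTP_OK) {
            return new InitResult(body, false);
        }
        throw new IllegalStateException("Unexpected HTTP status " + status);
    }

    static InitResult fetchInit(URL url, long lastModifiedMillis, String cached) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            // Only make the request conditional when there is a cached copy to fall back on.
            if (cached != null) {
                conn.setIfModifiedSince(lastModifiedMillis);
            }
            int status = conn.getResponseCode();
            String body = null;
            if (status == HttpURLConnection.HTTP_OK) {
                try (InputStream in = conn.getInputStream()) {
                    body = readUtf8(in);
                }
            }
            return resolve(status, body, cached);
        } finally {
            conn.disconnect();
        }
    }

    // Reads the whole stream; written as a loop to avoid newer-JDK-only helpers.
    private static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The important part is the first branch of resolve: a 304 is a normal answer to a conditional request, not a connection failure.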

But it would never have happened if we had never overridden HTTP, where there are layers we don’t fully understand (until they break) running on a network which can insert any ol’ proxy (some with bugs we may never understand) between the client and the server.

2 thoughts on “Why I hate custom protocols over HTTP.”

  1. Do you have to send a “Cache-Control: no-cache” header? Do all proxies obey this header? I wonder how well different HTTP-based web services frameworks handle stuff like this. Were you using any kind of framework on Android? (Are there any?)

    Ken B.

  2. I don’t know if I have to; I know I wasn’t. And for the call I was making, since I have the cached information it was easy enough to modify the code to handle a 304 error.

    I suspect most well-worn HTTP-based web services frameworks handle this correctly–if only because someone somewhere out there has been bitten by this bug. But we’re not using any of these frameworks; the irony of our whole “let’s reuse an existing protocol” thing is the hesitancy by our team to also reuse an existing framework that handles the protocol.
