HTTPWTF — Looking at some of the HTTP quirks

HTTP is fundamental to modern development, from frontend to backend to mobile. But like any widespread mature standard, it’s got some funky skeletons in the closet.

Some of these skeletons are little-known but genuinely useful features, some of them are legacy oddities relied on by billions of connections daily, and some of them really shouldn’t exist at all. Let’s look behind the curtain.

Did know about that Expect: 100-continue header — and even found a bug in curl handling an intermediary HTTP 417 response — but most of the stuff — such as the HTTP Trailers — were new to me.

HTTPWTF →

Via Frederick

Fastify, Fast and low overhead web framework, for Node.js

It’s been a while since I’ve set up a server with Node, but turns out Fastify is preferred over Express nowadays.

Fastify is a web framework highly focused on providing the best developer experience with the least overhead and a powerful plugin architecture. It is inspired by Hapi and Express and as far as we know, it is one of the fastest web frameworks in town.

import Fastify from 'fastify';
const fastify = Fastify({ logger: true });

fastify.get('/', async (request, reply) => {
  return { hello: 'world' };
});

const start = async () => {
  try {
    await fastify.listen(3000);
  } catch (err) {
    fastify.log.error(err);
    process.exit(1);
  }
}
start();

Apart from the core there are lots of plugins for authentication, cors, forms, cookies, jwt tokens, etc.

Fastify, Fast and low overhead web framework, for Node.js →

httpstat – curl statistics made simple

httpstat visualizes curl(1) statistics in a way of beauty and clarity.

It is a single file🌟 Python script that has no dependency👏 and is compatible with Python 3🍻.

Installation through PiP or HomeBrew:

pip install httpstat
brew install httpstat

Once installed through one of those, you can directly call httpstat:

httpstat https://www.bram.us/

httpstat – curl statistics made simple →

About the HTTP Expect: 100-continue header …

TL;DR HTTP clients may send a Expect: 100-continue header along with POST requests to warn the server that they’re about to send a large(ish) payload.

At that point the server can:

  1. Decline, by sending back 401/405 and successively closing the connection.
  2. Accept, by sending back 100 Continue, after which the client will send over the payload.
  3. Ask the client to re-send the original request in an unaltered way (original headers + payload), by replying with a 417 Expectation Failed

The almighty curl is quite a courteous HTTP client and automatically sends the Expect: 100-continue header when it’s about to send large payloads.

Unfortunately, as I discovered and reported, it doesn’t didn’t play nice when it got sent back a 417

~

In a recent project I was fiddling around with implementing an HTTP Server. One of my curl-based tests included a POST request that had a largeish payload.

curl -X POST -d "<somelargepayload>" http://localhost:8080/

Upon executing that command I saw this arrive on the server:

POST / HTTP/1.1
Host: localhost:8080
User-Agent: curl/7.64.1
Accept: */*
Content-Length: 4096
Expect: 100-continue

As you can see here – and what I didn’t expect – is that the POST body itself isn’t present with that request. Additionally a Expect: 100-continue header got injected somewhere along the road. So where did the POST body go? And what about that new header?

~

👀 Digging into the HTTP specification

Turning to the HTTP spec to see what the Expect header is all about, I found that it is quite a courteous addition to the spec which clients can implement:

The “Expect” header field in a request indicates a certain set of behaviors (expectations) that need to be supported by the server in order to properly handle this request. The only such expectation defined by this specification is 100-continue.

A 100-continue expectation informs recipients that the client is about to send a (presumably large) message body […] This allows the client to wait for an indication [from the server] that it is worthwhile to send the message body before actually doing so […].

[A 100-continue expectation] allows the origin server to immediately respond with an error message, such as 401 (Unauthorized) or 405 (Method Not Allowed), before the client starts filling the pipes with an unnecessary data transfer.

And that’s exactly what happened: curl injected the Expect: 100-continue header and waited for an interim response from the server. This gives the server the time to evaluate whether is should abort the request or not (most likely by evaluating the value of the Content-Length header). After one second of waiting for such an interim response, curl sent over the data in the POST body. Attaboy, curl! 🐶

⏱ Sidenote: That one second of waiting is the default timeout curl will wait. You can configure it using the --expect100-timeout SECONDS option:

curl -X POST -d "<somelargepayload>" --expect100-timeout 5 http://localhost:8080/

📭 Sidenote: To prevent curl from injecting the Expect: 100-continue header – and have it immediately send the POST body without warning the server first – you can override the Expect header:

curl -X POST -d "<somelargepayload>" -H "Expect:" http://localhost:8080/

So, mystery solved right? The POST body is just sent a little later, after curl gives the server the change to prematurely abort the request. Yes, but there’s more to it …

~

🤓 More spec details

Reading further down in the spec I found a few more interesting tidbits:

A client that will wait for a 100 (Continue) response before sending the request message body MUST send an Expect header field containing a 100-continue expectation.

A client that sends a 100-continue expectation is not required to wait for any specific length of time; such a client MAY proceed to send the message body even if it has not yet received a response. Furthermore, since 100 (Continue) responses cannot be sent through an HTTP/1.0 intermediary, such a client SHOULD NOT wait for an indefinite period before sending the message body.

This specifies that a client MAY wait for the server to send a 100 Continue interim response, but that the client may also just wait for an arbitrary amount of time and send over the body without awaiting that interim response. This last part is for compatibility reasons with HTTP/1.0, which does not speak Expect: 100-continue. Again curl did as told: it waited for 1 second and sent over the body.

~

♻️ HTTP 417

Apart from ways of aborting a Expect: 100-continue request (by sending back a 401 or 405) and allowing such a request (by sending back a 100), the spec also provides the server with a way of forcing the client to send everything over in one go:

A client that receives a 417 (Expectation Failed) status code in response to a request containing a 100-continue expectation SHOULD repeat that request without a 100-continue expectation, since the 417 response merely indicates that the response chain does not support expectations (e.g., it passes through an HTTP/1.0 server).

To not have my server keep track of both individual message parts, I decided to implement the 417 behavior. That way I could simply discard the first part of the request (the part with 100-continue expectation), send back the 417, and then read out the entire original HTTP message in one go.

This approach also played nice with the parse_request helper function from Guzzle’s PSR-7 Library which I am using: it takes an entire HTTP message (e.g. headers + body) as its only argument.

~

🐛 curl vs. HTTP 417.

As said, as done. Time to test things out!

curl -X POST -d "" http://localhost:8080/

Again curl sent out a request with a Expect: 100-continue header, and this time I responded with HTTP/1.1 417 Expectation Failed so that curl would do as the spec tells it to do.

Curl however did not behave as … expected (🥁): After waiting for one second it sent over the POST body, just like it did before, even though it had received the 417.

With all the gathered knowledge and this behavior combined I turned to curl’s GitHub repository and opened an issue. Pretty soon after I got a response from curl’s lead developer Daniel Stenberg:

This is indeed a bug. 417 is just a very unusual response code so we’ve never had reason to implement this properly before.

What surprised me even more is that this bug got fixed in no time: two days after reporting it a fix landed and was set out to ship with the 7.69.0 release (which was released early March). Daniel even wrote an extensive blogpost about the 417 header on his blog.

~

Let this be a reminder to everyone to report bugs, no matter how small or rare they are. I think that, as a user of other people’s code, it’s our duty to report so. Being a developer of libraries/stuff myself, it’s one of those thing that I really appreciate, especially when they’re reduced to a minimal test-case and described in detail. Extra bonus points for referring to papers and specifications. And oh, always keep it friendly, that also helps 🙂

Fire and forget HTTP requests in PHP

Chris White, on creating really fast HTTP requests in PHP, by manually building an HTTP request and sending a payload:

Hand-crafting HTTP requests seemed like an unreliable method at first, but after some pretty extensive testing I can vouch for it reliably sending the requests and the remote server receiving them in full. It can be tricky to craft the request properly (respecting the line feeds that the HTTP specification expects, etc), but that’s why you have tests, right? 🙂

Like so:

$endpoint = 'https://logs.loglia.app';
$postData = '{"foo": "bar"}';

$endpointParts = parse_url($endpoint);
$endpointParts['path'] = $endpointParts['path'] ?? '/';
$endpointParts['port'] = $endpointParts['port'] ?? $endpointParts['scheme'] === 'https' ? 443 : 80;

$contentLength = strlen($postData);

$request = "POST {$endpointParts['path']} HTTP/1.1\r\n";
$request .= "Host: {$endpointParts['host']}\r\n";
$request .= "User-Agent: Loglia Laravel Client v2.2.0\r\n";
$request .= "Authorization: Bearer api_key\r\n";
$request .= "Content-Length: {$contentLength}\r\n";
$request .= "Content-Type: application/json\r\n\r\n";
$request .= $postData;

$prefix = substr($endpoint, 0, 8) === 'https://' ? 'tls://' : '';

$socket = fsockopen($prefix.$endpointParts['host'], $endpointParts['port']);
fwrite($socket, $request);
fclose($socket);

This way of requesting performs about 5 times faster than making requests using Guzzle. No surprise there really: the closer you are to the core language, the faster it will work. C is faster than PHP. PHP’s built-in functions are faster than libraries built on top.

Fire and forget HTTP requests in PHP →

💁‍♂️ Back when I was a lecturer Web we started off with looking at the HTTP protocol before we even wrote one single line of PHP. Mandatory knowledge for everyone working in web imho.

Vapor – Server Side Swift

Interesting to see that Swift can also be used as a serverside language.

One can clearly see parallels with other languages and frameworks. For example Vapor comes with an HTTP Package, which – amongst other things – contains a Request class.

// http://vapor.codes/example?query=hi#fragments-too
let scheme = request.uri.scheme // http
let host = request.uri.host // vapor.codes

let path = request.uri.path // /example
let query = request.uri.query // query=hi
let fragment = request.uri.fragment // fragments-too

// Route “hello/:name/age/:age”
let name = request.parameters["name"] // String?
let age = request.parameters["age"]?.int // Int?

// Headers
let contentType = request.headers["Content-Type"]

// …

Vapor →

Zttp, a developer friendly wrapper for Guzzle

If you’re not familiar with the evolution of Guzzle, the library has basically gotten more professional and less usable with each new version. New layers upon layers of specification-respecting abstractions and rules made Guzzle incredibly difficult to get started with.

Zttp solves just that, by keeping things simple:

Zttp is a simple Guzzle wrapper designed to provide a really pleasant development experience for most common use cases.

$response = Zttp::withHeaders(["Fancy" => "Pants"])->post($url, [
    "foo" => "bar",
    "baz" => "qux",
]);

var_dump($response->json());

As per usual: it depends. Zttp might float your boat, you might need Guzzle itself if you want to do some more advanced things.

Zttp (GitHub) →

(via)

Using Immutable Caching To Speed Up The Web

Firefox shipped with support for Cache-Control: Immutable:

The benefits of immutable mean that when a page is refreshed, which is an extremely common social media scenario, elements that were previously marked immutable with an HTTP response header do not have to be revalidated with the server.

No more 304‘s for those resources, because the browser won’t even re-request them 🙂

Using Immutable Caching To Speed Up The Web →
Cache-Control: immutable introductory post →

Guzzle — PHP HTTP Client

Guzzle is a framework that includes the tools needed to create a robust web service client, including: Service descriptions for defining the inputs and outputs of an API, resource iterators for traversing paginated resources, batching for sending a large number of requests as efficiently as possible.

<?php
require_once 'vendor/autoload.php';
use Guzzle\Http\Client;

// Create a client to work with the Twitter API
$client = new Client('https://api.twitter.com/{version}', array(
    'version' => '1.1'
));

// Sign all requests with the OauthPlugin
$client->addSubscriber(new Guzzle\Plugin\Oauth\OauthPlugin(array(
    'consumer_key'    => '***',
    'consumer_secret' => '***',
    'token'           => '***',
    'token_secret'    => '***'
)));

echo $client->get('statuses/user_timeline.json')->send()->getBody();
// >>> {"public_gists":6,"type":"User" ...

// Create a tweet using POST
$request = $client->post('statuses/update.json', null, array(
    'status' => 'Tweeted with Guzzle, http://guzzlephp.org'
));

// Send the request and parse the JSON response into an array
$data = $request->send()->json();
echo $data['text'];
// >>> Tweeted with Guzzle, http://t.co/kngJMfRk

Guzzle →