Pagination and cursors
Whenever you retrieve a list of things in an API I’ve managed to influence, unless the size of the list is internally capped at an incredibly small number, you’ll see that the API supports limit
and cursor
query parameters to allow you to page through the results. This post will explore what those parameters do, how they work, and some of the limitations of the approach.
Why pagination?
Before we get into the details of how the solution works, we should properly understand the problem. Most APIs start out very simple, with a trivial GET
operation on a container resource that returns a list of objects, and everyone is happy:
GET /widgets HTTP/1.1
Accept: application/json
...
HTTP/1.1 200 OK
...
Content-Type: application/json
[ { widget 1 }, { widget 2 }, ... ]
The happiness starts to go away when the server starts handling hundreds or thousands of widgets. API response times start getting long, payloads are huge, and memory usage on the server gets really spiky as it tries to serialize the entire contents of the database in response to every request.
Most people come to the quick realization that it’s better to only ask for a smaller amount of data to start, and only load additional data on demand:
GET /widgets?limit=10 HTTP/1.1
Accept: application/json
...
HTTP/1.1 200 OK
...
Content-Type: application/json
[ { widget 1, { widget 2 }, ... ]
But wait! With this data structure, how is the client supposed to know whether there’s more data to load?
Some APIs simply say “Give me the start point and page size, and I’ll figure out the rest.” OK, no problem. But what’s the start point? Primary key? What if you’re sorting on something else? Things get complicated quickly. Some folks try to use an offset
parameter, but that can also get complicated.
Cursors can help
Inspired by a very detailed post by the kind folks at Slack, I tried implementing cursors in an API framework I was building. Briefly, in this context a cursor is a token that can be used to resume the query when more than one page of results is available.
When someone makes a request to list resources, our API returns a structure that contains one attribute with the list of requested resources, and if there is more than one page of results, it will also include a next
attribute containing an encoded token that can be used in a subsequent request to get the next page of results.
GET /widgets?limit=10 HTTP/1.1
Accept: application/json
...
HTTP/1.1 200 OK
Content-Type: application/json
...
{ "widgets": [...], "next": "bmljZSB0cnkK" }
If the next
attribute isn’t there, there is no more data to be had.
The next
attribute is a base64-encoded token that the server knows how to decode into a database query that corresponds to the next page of data. It includes information about the sort order, starting point, and anything else that the server needs in order to do its job. The API consumer doesn’t need to know anything other than “as long as there’s a next
field, take the value out of the result and paste it into the cursor
query parameter in the next request”, and I’m free to make changes to the internal logic at any time without affecting consumers, which is pretty amazing from a service development perspective. Different resources can use different logic internally without exposing the details to the API user.
Even better, the next
attribute is self-contained. The server doesn’t have to maintain any state about the query or the cursor, so no resources are consumed if the caller never comes back.
Other ways to communicate the cursor
Some folks use a Link
header to carry the full URL for the next page; I think that is a fantastic idea and recommend implementing that as well:
Link: </api/widgets?cursor=aGFwcHkgZWFzdGVyIQo>; rel="next"
Security considerations
If someone weren’t careful, they could easily take the idea above and say “I’ll just base64-encode a JSON blob with a bunch of parameters, call it a cursor, and take the rest of the day off!”
Of course, we want to keep our attack surface to a minimum. To help with this, we encrypt the cursor with an algorithm that verifies the integrity of the protected content, and we salt the input to make it harder to attack with known plaintext. For added protection, we also rotate the encryption key regularly, so that if someone does manage to break the key they can only attempt further attacks for a limited period of time.
Limitations
The main limitation in this scheme is that it’s pretty focused on unidirectional scrolling – the cursor only goes forward (or back, depending on the sort order). There are ways to deal with this, but they’re far more complicated than I wanted to get into, and more importantly they’re not relevant to our current use cases.