You Are Doing SQL Pagination Wrong

Royer Robin on 2022-03-30

Almost all applications using SQL fall into this common performance trap. Avoid it now!

Google bottom Pagination widget

Every application faces these issues in its early stages: you have to limit the quantity of data sent to a client for its requests to prevent your quality of service to degrade. We find this pattern everywhere:

The delicate part becomes “how we implement this solution?”.

We are going to look here at a simple use case, for pagination with a table object of 1000000 rows. We are considerating 2 different pagination cases: one following the unique primary key id and the other one by tenant name (varchar field with an index).

objects table description — using mariadb 10.5.8

Naive implementation

The easiest solution for pagination, found everywhere on the internet is the usage of LIMIT and OFFSET.

OFFSET Pagination query example

Let’s try this query for the 2 indexed columns (id and tenant), for different values of OFFSET:

Pagination response time for different OFFSET values

When we inspect the database optimizer query plan for our pagination query: we notice that the database is using a filesort operation for ordering on the non-indexed column and an index for the tenant and id column. Based on that we can conclude that:

  1. Using offset on a non-indexed sorted column is a heavy task for the database and must be avoided at all costs.

MySQL must do an extra pass to find out how to retrieve the rows in sorted order. The sort is done by going through all rows according to the join type and storing the sort key and pointer to the row for all rows that match the WHERE clause.

2. Using OFFSET, even on the primary key, becomes slower over time for high database volume, as the number of rows that need to be loaded in order to be skipped becomes higher and higher.

3. Using offset on an indexed field with a big key_length is slower than on a small key_length.

As we can see, OFFSET pagination has some drawbacks:

As it is convenient to implement the OFFSET pagination at the beginning of a project (you don’t have to anticipate query plan, UI sorting use case and index creation), it is common to see an application performance degrade slowly, becoming unusable over time. However, due to the way the OFFSET is specified, the database has to iterate over all the OFFSET rows, making this solution really inefficient.

Seek method pagination

As seen previously, even if we try to use an index, OFFSET make it inefficient as all concerned rows must be loaded, we will have to construct a query that uses the index in order to retrieve our targeted pages.

Seek Pagination query example

Let’s find you out how long is this query taking when iterating on the last seen id.

Seek Pagination Query response time

The result shows us that we won many orders of magnitude in our use case, making the response query time less dependent on the database size. However, when moving to the Seek pagination method, some changes are notable:

Multi-columns index for Seek Pagination
Seek Multi columns index pagination query

This solution makes pagination really efficient but increase the development complexity: we need to think about the pagination and the allowed sorting columns in our database schema.

As usual in software engineering performance is a tradeoff.

We have seen here how to banish using the OFFSET pagination in our SQL database when performance is critical.

In exchange for a bit of complexity and a loss of feature, the seek method provides one of the fastest SQL-vendor agnostic ways of paginating your data.