EF Core Pagination

Photo by Kevin Ku on Unsplash

EF Core Pagination

What is Pagination?

Pagination is the process of retrieving results in pages, rather than all at once.

This process is extremely useful for large datasets because it improves the overall performance of the system, by fetching a limited number of items rather than all of them.

There are two main ways to implement this process into your application: Offset Pagination and Keyset Pagination.

In both cases, it's important to keep a fully unique ordering, to avoid skipping results.

For example, if results are ordered only by date, but there can be multiple results with the same date, then results could be skipped when paginating as they're ordered differently across two paginating queries.

The solution to this issue is to unique the ordering by using both the date and the id (or any other unique property or combination of properties).

As a general rule of thumb, apply unique ordering criteria before using the pagination in your application, and don't expect databases to consider any ordering by default.

Also, define indexes for the properties that are used for ordering. If there are multiple properties for defining the ordering criteria, then create a composite index.

Offset Pagination

This is arguably the easiest way to implement pagination in your application. It uses two easy-to-understand methods for manipulating the data that's going to be returned from the server side.

These two methods are Skip(OFFSET in SQL), which specifies the number of items to skip from the dataset, and Take(LIMIT in SQL), which specifies the maximum number of items per page to be returned.

In summary, Skip is responsible for going to the previous/next page, and Take represents the maximum number of items per page.

Even though it's the easiest way to implement pagination, some drawbacks are coming with it.

Let's consider the following example:

int toSkip = 100;
int toTake = 100;
var students = ctx.Students
    .OrderBy(student => student.Id)
    .Skip(toSkip)
    .Take(toTake)
    .ToList();

In this example, we're fetching the page results by skipping the first 100 items from the dataset and taking at most 30 items from the remaining ones.

Observe also that before using the Skip and Take methods, we applied unique ordering criteria (ordering by student's ID), to avoid unintentionally skipping results.

Even though it's an easy-to-follow example, it also has some severe shortcomings:

  1. The first 100 items from the database still need to be processed, even though they aren't considered for the current page results. Skipping the first 100 students, doesn't appear as an expensive operation at first sight, but what would happen if we need to skip 100k rows or even 1 million rows? (This is a highly possible case in big enterprise projects). In conclusion, the performance will suffer as you progress through the results and approach a distant page, as the Skip value will increase, and the database will have more and more items to process. Also, we could conclude that as the number of students increases, there will be more and more items to skip by the database, and the performance will be degraded.

  2. Another pitfall of offset pagination is that it can mistakenly return duplicate entries or skip entries if data is being simultaneously added or deleted from the database. Consider having two pages with the same number of results. As you're on the first page, somebody is deleting the last item of the page. That means when you move to the second page, the first item that normally should be on the second page will be skipped as it will become the last item of the previous page. Unless you go back and look at the previous page again, you will miss that page result.

Keyset Pagination

To address the shortcomings of the offset pagination, there exists an alternative pagination strategy that's called keyset pagination.

This pagination strategy (sometimes called s_eek-based pagination_ or cursor pagination) uses a WHERE clause to skip rows, instead of an OFFSET.

Typically, the last returned result from the previous page is utilized to specify the criteria for retrieving the results from the current page.

The pages aren't being divided up in relation to the general location of entries, thus when new entries are added or deleted, the content on each page doesn't move.

Scalability is also less of a problem because the server is not required to do the counting and iterative looping through of data necessary for offset pagination.

Consider the following example:

int lastStudentId = 30;
int toTake = 30;
var page = ctx.Students
    .OrderBy(student => student.Id)
    .Where(student => student.Id > lastStudentId)
    .Take(toTake)
    .ToList();

In this example, we know the last row returned by the previous page, and we get the results for the current page (the ones after the last row) based on that.

In addition, to limit the number of returned results, we establish a maximum page size by calling the Take method.

Keyset pagination is suited for applications where the user navigates forward and backward and does not require random access to a specific page, as provided by the previous pagination strategy.

We might also consider it the best approach if you want to create an endless scroll content application because it is efficient and addresses the problems of offset pagination.

Conclusion

Pagination is more complicated than it appears, so pay special attention to what's needed for your application.

Before deciding on one of the two techniques, consider the advantages and disadvantages of each.

As a general guideline, only use offset pagination if you need random access to a certain page; otherwise, try keyset pagination, which is usually faster.