The Hidden Cost of 'Helpful' API Fields

Jolly Anteater

Written mid-December 2025.

List endpoints are the most important endpoints for user-facing APIs. They're the fundamental mechanism for discovering what data is stored in an API before trying to drill down into particular entries. Building a list endpoint appears trivial, but it can have long-lasting consequences if implemented poorly, eating away at weeks' worth of engineering time for months to come.

APIs are contracts. Once they're created, you'll have a hard time modifying them or removing them. The worst mistake you can make with a list endpoint is not including pagination. The second is implementing pagination poorly.

Move Fast and Break Things (and Then Pay For It)

Move fast and break things. It's an important motto for startups and lean software companies, but the saying masks the consequences of everyday decision-making. Some decisions are easily reversible. Some are not, and undoing or repairing a poorly planned decision can cost hundreds of thousands (or millions) of dollars.

Develop Your API with Pagination in Mind

Pagination refers to query parameters on list endpoints that are used to fetch "pages". Each page is defined by a limit and an offset. A page number and an offset are interchangeable here, since one can be derived from the other once the limit is known. Users are required to set a limit for how many items to fetch, and an offset for where in the list to start fetching from.
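As a rough illustration of the limit and offset mechanics described above, here is a minimal sketch in Python. The function name fetch_page and the in-memory list are assumptions made for the example; a real endpoint would translate the same arithmetic into a database query.

```python
from typing import List, TypeVar

T = TypeVar("T")


def fetch_page(items: List[T], page: int, limit: int) -> List[T]:
    """Return one page of `items`. Pages are 1-indexed (an assumption of this sketch)."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    # A page number is just another way of writing an offset.
    offset = (page - 1) * limit
    return items[offset : offset + limit]


# Stand-in for rows in a datastore: 250 hypothetical items.
pokemon = [f"pokemon-{i}" for i in range(1, 251)]

first_page = fetch_page(pokemon, page=1, limit=100)
last_page = fetch_page(pokemon, page=3, limit=100)
```

The last page is simply shorter than the limit; a page past the end of the list comes back empty.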

Good API reviews will make the novice developer add maximum limits, preventing users from fetching over a certain amount. The inexperienced developer may ask themselves: why enforce limits at all?

API users are greedy and will use the API in a way that is convenient for them. That isn't weird or unexpected. If you don't want someone to use an API in a particular way, then block them!
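One way to "block them" is to reject oversized limits outright instead of hoping callers behave. The sketch below is only an illustration: MAX_LIMIT, DEFAULT_LIMIT, LimitTooLarge, and validate_limit are hypothetical names, and the values are placeholders rather than recommendations.

```python
from typing import Optional

MAX_LIMIT = 100      # hypothetical ceiling for this sketch
DEFAULT_LIMIT = 25   # hypothetical default when the caller sends no limit


class LimitTooLarge(ValueError):
    """Raised when a caller asks for more items than the API allows."""


def validate_limit(requested: Optional[int]) -> int:
    """Return a safe limit, or raise so the API can answer with a client error."""
    if requested is None:
        return DEFAULT_LIMIT
    if requested < 1:
        raise ValueError("limit must be a positive integer")
    if requested > MAX_LIMIT:
        raise LimitTooLarge(f"limit {requested} exceeds the maximum of {MAX_LIMIT}")
    return requested
```

Rejecting the request is a design choice. Silently clamping to the maximum is the other common option, but it can hide the problem from callers who assume they received everything they asked for.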

When our company built out our external APIs, we had low volume and small lists. We had neither pagination nor limit enforcement. Customers would fetch all of their items and that was fine.

The problems began when customers had thousands of items and were fetching thousands at a time, some of them several thousand per call. Each of those items kicked off a backend call to a database, often hitting the 10-second timeout limit and retrying again, and again, and again. Cue customers complaining about our "poor performance."

Eventually, we introduced "soft" pagination. We included page, limit, next page, and count as fields on the returned object. Here's an example:

{
  message: "Succeeded in fetching Pokemon",
  data: [...],
  pagination: {
    page: 1,
    limit: 100,
    nextPage: 2,
    count: 2000,
  },
}

This change let customers limit the number of objects they fetched, which in turn reduced strain on our backend and cut down on timeouts. It was "soft" in that we didn't actually enforce a limit: users could still set the limit to 3000 (which some customers still do to this day!) and then get timed out.
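To make the "soft" part concrete, here is a hedged sketch of how such an envelope could be assembled. The function build_list_response and the in-memory list are assumptions for illustration, not our actual implementation. Note that the requested limit is passed through untouched.

```python
from typing import Any, Dict, List


def build_list_response(items: List[Any], page: int, limit: int) -> Dict[str, Any]:
    """Wrap one page of results in the pagination envelope shown above."""
    count = len(items)  # the total is computed on every call
    offset = (page - 1) * limit
    data = items[offset : offset + limit]
    # No ceiling is applied to `limit`: this is what makes the pagination "soft".
    next_page = page + 1 if offset + limit < count else None
    return {
        "message": "Succeeded in fetching Pokemon",
        "data": data,
        "pagination": {"page": page, "limit": limit, "nextPage": next_page, "count": count},
    }


pokemon = [f"pokemon-{i}" for i in range(1, 2001)]  # 2000 hypothetical items

polite = build_list_response(pokemon, page=1, limit=100)
greedy = build_list_response(pokemon, page=1, limit=3000)
```

The polite call gets 100 items and a nextPage of 2. The greedy call gets all 2000 items in one response, which is exactly the load pagination was meant to prevent.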

We did eventually enforce the use of pagination limits, but we never brought the maximum down to 100. I'm not sure why, since enforcing limits at all was already a breaking change. Still, as software engineers we must choose to move forward rather than complain about legacy systems. Otherwise we'd just be miserable all the time, and no one likes working with someone who's miserable.

Our customers are quite sensitive to API changes, and our customer base grew over time, which made it hard to enforce a lower limit on these calls. Luckily, my tech lead and I built out a small framework for collecting and enforcing API deprecations safely. More on that in a separate blog post.

The Dreaded count Field

Some list endpoints may contain hundreds of thousands to millions of entries. Think audit logs, change logs, or any kind of logging for actions on a platform. We have one such endpoint that tracks logs, and it has reached several million entries for one large customer. This customer is a heavy API user and also enjoys using the log list endpoint. This seemingly benign combination is secretly quite devilish.

We noticed one day that this customer was seeing timeouts despite using small limits on this endpoint. Surely only fetching 25 items isn't that bad for performance?

Our list endpoints include some special fields: has_next as one, count as another. You don't strictly need both, but they seemed useful to add.

Wrong. Do not add something just because it seems useful. If it isn't necessary to an API, it shouldn't be included.

Why? Because you cannot safely remove that field later. There will always be some customer who depends on that field for some random flow, and if you remove it they will complain.

This xkcd captures the problem perfectly:

Comic: xkcd 1172, "Workflow" by Randall Munroe (licensed under CC BY-NC 2.5).

The count field is the ultimate foot-gun. We use Mongo for our primary database and have a custom ORM built on top of it. To fetch the total count for a given list call, we end up running an expensive query across every stored record of that particular entity. That's a small cost for the majority of endpoints, but insanely painful for the log endpoint, where entries number in the millions (and could eventually reach billions).

Pagination was supposed to help API performance, but it introduced something that made it worse. Now we cannot remove the count field safely. It's possible some customer uses it for an important flow. To fully deprecate it, we have to notify customers and be prepared to have multiple conversations with them before finally removing it.

Hope at the End of the Tunnel

Fortunately, there is something we can do to mitigate this issue, at least partially. The tech lead for the team identified start date and end date as reliable query parameters for reducing the total number of items to count. This reduces the number of logs that are scanned, and it puts the performance of the endpoint in the user's hands.
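The effect of a date window can be sketched with a sorted list of timestamps. This is only an analogy: it assumes the log entries are kept in timestamp order (roughly what an index on the timestamp would give you), and count_in_range is a hypothetical helper, not part of our API.

```python
from bisect import bisect_left
from datetime import datetime, timedelta
from typing import List


def count_in_range(sorted_timestamps: List[datetime], start: datetime, end: datetime) -> int:
    """Count entries with start <= timestamp < end using two binary searches."""
    lo = bisect_left(sorted_timestamps, start)
    hi = bisect_left(sorted_timestamps, end)
    return hi - lo


# 100,000 hypothetical log entries, one per minute, starting on 1 January 2025.
base = datetime(2025, 1, 1)
logs = [base + timedelta(minutes=i) for i in range(100_000)]

one_day = count_in_range(logs, datetime(2025, 1, 2), datetime(2025, 1, 3))
```

With a one-day window only 1,440 of the 100,000 entries fall inside the range, so the count no longer grows with the customer's entire log history.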

Unfortunately, we cannot enforce the usage of these parameters, but we can encourage them!

A better end state would be to remove page/offset and count entirely. Instead of page/offset, use cursor-based pagination, where you include an ID and fetch the next X items in sorted order after that ID. This way the backend doesn't need to walk through the first X pages or items before continuing.
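A minimal sketch of the cursor idea, assuming items are addressed by sortable integer IDs. The function fetch_after and the in-memory list are stand-ins made up for this example; against a real database the same thing would be an indexed "ID greater than cursor" query with a limit.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def fetch_after(
    sorted_ids: List[int], after_id: Optional[int], limit: int
) -> Tuple[List[int], Optional[int]]:
    """Return up to `limit` IDs that come after `after_id`, plus the next cursor."""
    # Jump straight to the cursor position instead of skipping earlier pages.
    start = 0 if after_id is None else bisect_right(sorted_ids, after_id)
    page = sorted_ids[start : start + limit]
    next_cursor = page[-1] if page else None
    return page, next_cursor


ids = list(range(10, 110, 10))  # hypothetical IDs: 10, 20, ..., 100

first, cursor = fetch_after(ids, None, 3)
second, cursor2 = fetch_after(ids, cursor, 3)
```

Because the cursor is a position in sorted order rather than an index, it keeps working even if the item it points at has been deleted in the meantime.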

Instead of count, you should use has_next as a way to know whether there are more results to fetch. Finally, if a customer really needs an endpoint for knowing the total number of items, then just make a special endpoint for that.
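One common way to compute has_next without counting anything is to fetch one item more than the caller asked for. The sketch below assumes sortable integer IDs and uses a hypothetical list_items function over an in-memory list.

```python
from bisect import bisect_right
from typing import Any, Dict, List, Optional


def list_items(sorted_ids: List[int], after_id: Optional[int], limit: int) -> Dict[str, Any]:
    """Return one page plus a has_next flag, without computing a total count."""
    start = 0 if after_id is None else bisect_right(sorted_ids, after_id)
    # Fetch limit + 1 items: the extra one only tells us whether more exist.
    window = sorted_ids[start : start + limit + 1]
    return {"data": window[:limit], "has_next": len(window) > limit}


ids = list(range(1, 8))  # hypothetical IDs 1 through 7

page_one = list_items(ids, None, 3)
page_three = list_items(ids, 6, 3)
```

The cost of the extra item is constant no matter how large the collection grows, whereas a total count has to account for every matching entry.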

Do not make a commonly used endpoint worse for the sake of a couple of people. Make something they can opt into rather than opting everyone in.

Think of API Scaling From Day One

In the modern programming era, there isn't a good reason not to have cursor-based pagination. It takes at most a day to write and will save you days, weeks, and months in the future. If your startup really needs an API out as soon as possible, at the very least put limits in place and be ready to break customers in the future.

But really ask yourself: "What happens if the number of items a user can fetch with this API scales to the thousands? Could we handle that?" Your future selves and future junior developers will thank you.