API Rate-limiting going into effect on March 1st, 2022

Since its launch in 2014, info-beamer has offered a REST API as an essential part of its service. The API allows you to automate almost every aspect of your signage devices, from probing their status to integrating with other software. Various customers operate in-house frontends that talk to the info-beamer API to give their users simplified workflows. Thanks to the OAuth app integration, it’s even possible to build HTML/JS-only frontends providing additional features and custom UIs, like the fleet updater (source) or a yet unofficial slot-based content player.

Today the API handles more than one million requests per day. While this is absolutely manageable and the majority of calls are justified, the absence of any kind of rate limiting has resulted in some cases of rather unoptimized API usage. Examples include querying the list of assets 10 times within a few seconds or fetching the device details of all devices individually instead of using a single list call. These calls not only waste resources but in all cases expose an underlying flaw in how the API is used, one that can easily be improved.
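
As a rough illustration of avoiding repeated identical calls, the sketch below caches the result of a list call for a few seconds. The function fetch_asset_list is a hypothetical stand-in for whatever code performs the real request, and the 5 second lifetime is an arbitrary assumption, not a recommendation from the API documentation.

```python
import time


class CachedCall:
    """Cache the result of a zero-argument callable for `ttl` seconds."""

    def __init__(self, fetch, ttl=5.0, clock=time.monotonic):
        self._fetch = fetch      # hypothetical function doing the real API request
        self._ttl = ttl          # assumed cache lifetime in seconds
        self._clock = clock      # injectable clock, handy for testing
        self._value = None
        self._expires = None     # None means nothing has been cached yet

    def __call__(self):
        now = self._clock()
        if self._expires is None or now >= self._expires:
            # Cache is empty or stale: perform the real call once.
            self._value = self._fetch()
            self._expires = now + self._ttl
        return self._value


if __name__ == "__main__":
    calls = []

    def fetch_asset_list():
        # Stand-in for a real API request; records each invocation.
        calls.append(1)
        return ["asset-a", "asset-b"]

    cached = CachedCall(fetch_asset_list, ttl=5.0)
    for _ in range(10):
        cached()
    print(len(calls))  # only the first of the ten calls reaches the "API"
```

Whether a cache is appropriate depends on how fresh the data needs to be. For the second example above, replacing many per-device calls with the single list call is usually the simpler fix.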

Starting on March 1st, 2022, the info-beamer API will be subject to rate limits. Rate limits (like quotas) are a way to protect the API from intentional abuse or unintentional misuse. The intention of this change is to expose unintentional misuse of the API more quickly and to ensure the API remains fast and reliable for all users going forward.

There are two types of limits: One is the global account limit. This limiter caps each account at a sustained rate of 300 calls per minute. Additionally, individual endpoints can have their own limits, which are specified in the documentation of each call.

The limits currently suggested are explicitly set in a way that 99% of all API users won’t notice any difference as they won’t run into any of the limits. The remaining 1% of users that are affected by the new rate limits have two options:

  • Optimizing their usage of the API to avoid running into rate limits.
  • Gracefully slowing down API requests in case a limit is hit.

While the first option is generally recommended and, as in the examples given above, most likely easy to implement, sometimes running into limits might happen regardless. The API now includes additional information to make managing rate limited calls easier: All successful API responses for calls within the allowed rate limit return additional HTTP headers:

X-Rate-Limit-Action: package:detail
X-Rate-Limit-Remaining: 119

The X-Rate-Limit-Remaining value is the number of additional requests you can most likely make without running into rate limits. Once this value reaches 0, the next call might result in a rate limit response. Rate-limited responses are signaled using the standard HTTP 429 status code and the following new headers:

X-Rate-Limited: true
X-Rate-Limit-Action: package:detail
Retry-After: 2

The value in Retry-After is the suggested number of seconds to wait before retrying the API call. Using this mechanism you can transparently wrap your API calls in retry logic like the following:

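import time

class APIError(Exception): pass

def rate_limit_wrapper(api_call):
    while True:
        response = api_call()
        if response.status_code == 200:
            return response
        elif response.status_code == 429:
            # Rate limited: wait the suggested number of seconds, then retry
            time.sleep(int(response.headers['Retry-After']))
        else:
            raise APIError(response)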

The code calls the API and returns on success or raises on errors, but retries the call if the request was rate limited. It uses the suggested number of seconds provided in the Retry-After header to delay the next attempt. Using similar code will ensure rate limits don’t affect your code at all, other than slowing it down automatically in case you run into a limit.
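
A slightly more defensive variant might also cap the number of attempts so a client never loops forever. The sketch below makes a few assumptions: responses expose status_code and headers the way the requests library does, the max_attempts and sleep parameters are additions for illustration, and FakeResponse exists only to demonstrate the behavior without network access.

```python
import time


class APIError(Exception):
    """Raised for non-retryable responses or when retries are exhausted."""


def call_with_retry(api_call, max_attempts=5, sleep=time.sleep):
    """Call `api_call` until it succeeds, honoring Retry-After on 429."""
    for attempt in range(1, max_attempts + 1):
        response = api_call()
        if response.status_code == 200:
            return response
        if response.status_code != 429:
            raise APIError(response)
        if attempt < max_attempts:
            # Rate limited: wait as long as the server suggests. Falling
            # back to 1 second for a missing header is an assumption.
            sleep(int(response.headers.get("Retry-After", 1)))
    raise APIError("still rate limited after %d attempts" % max_attempts)


class FakeResponse:
    """Minimal stand-in for an HTTP response, for demonstration only."""

    def __init__(self, status_code, headers=None):
        self.status_code = status_code
        self.headers = headers or {}


if __name__ == "__main__":
    queue = [
        FakeResponse(429, {"X-Rate-Limited": "true", "Retry-After": "2"}),
        FakeResponse(200, {"X-Rate-Limit-Remaining": "119"}),
    ]
    waits = []
    # Record the requested delays instead of actually sleeping.
    result = call_with_retry(lambda: queue.pop(0), sleep=waits.append)
    print(result.status_code, waits)
```

Passing the sleep function as a parameter keeps the wrapper easy to test. In production code the default time.sleep would be used.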


Why rate limits?

Some API usage seen today is highly unoptimized and might cause issues going forward. Limits ensure that the API remains fast and reliable for all users.

How are the limits implemented?

The rate limiter uses a leaky bucket implementation with an allowed burst of 200%. As an example: If the API endpoint allows 60 requests/minute, the limiter will have a “virtual bucket” with space for 120 “drops” (200% of 60). Each time the API is called, a drop is added to the bucket. If the bucket overflows, which means it already has 120 drops in it, the rate limit is hit and a 429 response is returned to the caller. For the given example, the bucket leaks the drops in it at a constant rate of one drop per second (so 60 drops per minute), slowly making space for more requests.
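
The bucket described above can be modeled in a few lines. This is only a sketch of the general leaky bucket idea using the numbers from the example. The class and parameter names are made up for illustration, and the actual server-side implementation may differ in its details.

```python
class LeakyBucket:
    """Toy model of a leaky bucket rate limiter."""

    def __init__(self, limit_per_minute, burst=2.0):
        self.capacity = limit_per_minute * burst      # 60/min -> room for 120 drops
        self.leak_per_second = limit_per_minute / 60.0
        self.level = 0.0                              # drops currently in the bucket
        self.last = 0.0                               # time of the previous update

    def allow(self, now):
        """Return True if a call at time `now` (in seconds) is within the limit."""
        # Drain the drops that leaked out since the previous call.
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.leak_per_second)
        self.last = now
        if self.level + 1 > self.capacity:
            return False          # bucket would overflow: caller gets a 429
        self.level += 1
        return True


if __name__ == "__main__":
    bucket = LeakyBucket(limit_per_minute=60)
    # 130 calls at the same instant: the first 120 fit, the rest overflow.
    results = [bucket.allow(now=0.0) for _ in range(130)]
    print(results.count(True), results.count(False))
    # Ten seconds later ten drops have leaked, making room for ten more calls.
    later = [bucket.allow(now=10.0) for _ in range(12)]
    print(later.count(True), later.count(False))
```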

Where can I see the new limits?

The documentation shows the limit for API calls with rate limits. As an example, check out the list devices call. At the top is a new “Rate limited” section specifying the limit. You can see your current API usage and how often you would run into limits.

Can I test out the new limits now?

Yes. You can enable the limiter in your account settings by enabling the “Preview rate limiting going into effect at March 1st, 2022” checkbox. This will enable the limits now and the API will behave as it will in March. Before March you’ll be able to disable the limiter again at any moment using the same checkbox.

Do I need to do anything?

Depends. The new limits shouldn’t affect most users. The burst allowance of 200% means that you most likely won’t hit the limit unless you have a sustained API call rate exceeding the specified limits.
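
To get a feel for the burst allowance, the following back-of-the-envelope helper estimates how long a constant call rate can be sustained before a limit is hit. It assumes a leaky bucket that starts empty and treats calls as a continuous flow, so it is an approximation rather than a statement about exact server behavior. The function name is made up for this example.

```python
def seconds_until_limited(limit_per_minute, calls_per_minute, burst=2.0):
    """Estimate the seconds until an initially empty bucket overflows.

    Returns None if the call rate does not exceed the limit, in which
    case the bucket never fills up.
    """
    if calls_per_minute <= limit_per_minute:
        return None
    capacity = limit_per_minute * burst
    # The bucket fills at the call rate and drains at the limit rate.
    net_fill_per_second = (calls_per_minute - limit_per_minute) / 60.0
    return capacity / net_fill_per_second


if __name__ == "__main__":
    # Calling a 60/minute endpoint at twice the allowed rate.
    print(seconds_until_limited(60, 120))
    # Slightly exceeding the 300/minute account limit.
    print(seconds_until_limited(300, 330))
    # Staying at or below the limit never fills the bucket.
    print(seconds_until_limited(60, 60))
```

Under these assumptions, a client calling a 60 requests/minute endpoint at double the allowed rate would run for about two minutes before seeing the first 429 response.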

You should check out your current API usage on the new API usage page. It shows the number of calls made in the last 8 days to actions subject to the upcoming rate limits. If a limit is hit, you’ll see which action was responsible. Click on the action to jump directly to the documentation to see which API call was responsible.

What change is recommended?

If you can, you should wrap all your API calls in automated retry logic similar to the pseudo code given above. This ensures that your API client transparently handles rate limiting.

Happy to answer any other question you might have.
