GitHub’s search API is a powerful tool, but it is heavily rate limited compared to the rest of GitHub’s API. The general API rate limit is 5,000 requests per hour (roughly 83 requests per minute), while the rate limit for search requests is documented at only 30 requests per minute. This can be restrictive in some use cases, but with just five lines of code, we can increase this limit to over 40 requests per minute!
(At this point, some readers may be concerned that “over 40” divided by “30” is not, in fact, an increase of 27%. Read on to find out the source of this discrepancy!)
To begin, let’s clarify those rate limits – they apply to requests associated with an access token connected to our GitHub account, also known as authenticated requests. We can also query the GitHub API using unauthenticated requests (i.e. without an access token), but at a much lower rate limit – GitHub only allows 10 unauthenticated search requests per minute.
However, GitHub tracks these authenticated and unauthenticated rate limits separately! This is by design, which I confirmed with GitHub via HackerOne prior to posting. To increase our effective rate limit, we can write our application code to combine authenticated and unauthenticated API requests: make an authenticated request first, and if it fails due to rate limiting, retry it without authentication. This effectively increases our rate limit by 10 requests per minute.
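We can see these separate buckets for ourselves by asking the API directly. Here’s a quick sketch of my own (not one of the snippets discussed below): it queries the /rate_limit endpoint, which reports a distinct search bucket, and it assumes a personal access token is available in the GITHUB_TOKEN environment variable.

require "octokit"

# An authenticated client (token assumed in GITHUB_TOKEN) alongside an
# unauthenticated client created with no token at all.
authenticated = Octokit::Client.new(access_token: ENV["GITHUB_TOKEN"])
unauthenticated = Octokit::Client.new

[authenticated, unauthenticated].each do |client|
  # GET /rate_limit reports a separate "search" bucket alongside "core",
  # and the call itself does not count against either limit.
  search = client.get("/rate_limit").resources.search
  puts "search limit: #{search.limit}, remaining: #{search.remaining}"
end

The authenticated client should report a search limit of 30 and the unauthenticated client a limit of 10, each with its own remaining count.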
Let’s illustrate with two separate code snippets – the first using only authenticated requests, and the second using both authenticated and unauthenticated requests. In both of these snippets, we try to make 50 requests in parallel to the GitHub search API via Octokit’s search_repositories method.
In this first snippet, we expect to see 30 requests succeed (returning a Sawyer::Resource) and 20 fail (returning an Octokit error), given the documented rate limit.
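Here’s a minimal sketch of what that first snippet might look like – treat the token location (GITHUB_TOKEN) and the query ("tetris") as placeholder assumptions of my own:

# authenticated_only.rb
require "octokit"

client = Octokit::Client.new(access_token: ENV["GITHUB_TOKEN"])

# Run one search, returning either the Sawyer::Resource result or the
# Octokit error raised when the request is rate limited.
def search(client, query)
  client.search_repositories(query)
rescue Octokit::Error => e
  e
end

# Fire all 50 searches in parallel and collect their results.
threads = 50.times.map do
  Thread.new { search(client, "tetris") }
end

results = threads.map(&:value)
succeeded = results.count { |r| r.is_a?(Sawyer::Resource) }
puts "#{succeeded} requests succeeded, #{results.size - succeeded} requests failed"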
Run it, and we see this output:
$ ruby authenticated_only.rb
36 requests succeeded, 14 requests failed
Oddly enough, GitHub does not appear to strictly adhere to its documented rate limit of 30 requests per minute, but our premise still holds – we can’t make all 50 requests due to GitHub’s rate limiting.
Now, let’s run the second snippet, which is five lines of code longer than our previous snippet. In this snippet, if a request using our authenticated client fails, we retry the same request using an unauthenticated client.
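A matching sketch of the second snippet, under the same assumptions as before – the extra lines are the unauthenticated client and the retry on rate limiting:

# authenticated_and_unauthenticated.rb
require "octokit"

client = Octokit::Client.new(access_token: ENV["GITHUB_TOKEN"])
unauthenticated_client = Octokit::Client.new # no token

# Run one search, returning either the Sawyer::Resource result or the
# Octokit error raised when the request is rate limited.
def search(client, query)
  client.search_repositories(query)
rescue Octokit::Error => e
  e
end

threads = 50.times.map do
  Thread.new do
    result = search(client, "tetris")
    # If the authenticated request was rate limited, retry the same
    # search without authentication – a separate rate limit bucket.
    if result.is_a?(Octokit::TooManyRequests)
      result = search(unauthenticated_client, "tetris")
    end
    result
  end
end

results = threads.map(&:value)
succeeded = results.count { |r| r.is_a?(Sawyer::Resource) }
puts "#{succeeded} requests succeeded, #{results.size - succeeded} requests failed"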
We see the following output:
$ ruby authenticated_and_unauthenticated.rb
46 requests succeeded, 4 requests failed
As predicted, we’ve successfully increased our rate limit from 36 to 46 requests per minute, a 27% increase over what we could achieve previously.
I really did expect to put the number 33% in this blog post’s title, not 27% – ten extra requests on top of the documented limit of 30 would be a 33% boost, but on top of the 36 successful requests I actually observed, it works out to about 27%. It’s unclear to me why my authenticated client can make 36 successful requests when the search API limit is documented at 30, and I observed some variation in the output of this script too, ranging from 40 to 46 successful requests.
Going back to our performance gains – is this method effective for every application using the GitHub search API? No, probably not – 10 additional requests per minute is inconsequential for a large production application at scale. In that case, there are other techniques available to avoid hitting the GitHub search API rate limit, such as caching your search results from the GitHub API, or rotating GitHub credentials to multiply your effective rate limit (sketched below).
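As a rough illustration of the credential rotation idea (my own sketch, assuming a GITHUB_TOKENS environment variable holding a comma-separated list of tokens):

require "octokit"

# One client per token. Note that rate limits are tracked per account,
# not per token, so the tokens would need to belong to distinct
# accounts or GitHub Apps for this to multiply the effective limit.
clients = ENV.fetch("GITHUB_TOKENS").split(",").map do |token|
  Octokit::Client.new(access_token: token)
end

# Round-robin over the pool so no single credential bears all the load.
pool = clients.cycle
results = 5.times.map { pool.next.search_repositories("tetris") }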
However, what if you’re using GitHub’s search API at a small scale? For example, you may be using it in a script that runs in your local development environment, or in some sort of internal tooling. In such a scenario, you may only occasionally hit the authenticated request limit, and haven’t yet reached a point where you need a more scalable solution. In that case, these five lines of code may give you a good “bang for your buck” in solving rate limiting issues.
Start your journey towards writing better software, and watch this space for new content.