Rate Limits, Retries, and Data Ingress at Scale

While developing apps with the Dropbox API v2, you may occasionally encounter an HTTP error code 429 in response to requests your app makes. Status code 429 indicates a transient error, which can potentially be resolved by retrying the request. This document describes the two types of errors that can lead to a 429 response and provides advice on how to handle each type of error.

In API v2, the body of a 429 response may be JSON or plain text. Your app should check the value of the Content-Type response header to determine which. If it’s JSON, it will include a reason field with one of two values: too_many_requests or too_many_write_operations. (You can assume that a plain text response is equivalent to too_many_requests.) These tags correspond to an underlying rate limit being applied or a namespace lock contention occurring, respectively, and represent the two types of errors that lead to 429 responses. While both errors can potentially be remedied by retrying the request, it is important to understand how they differ and the appropriate ways for handling each.
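
As an illustration of the Content-Type check described above, here is a minimal sketch in Python. The function names are hypothetical, and the exact position of the reason field inside the JSON body is an assumption: the sketch searches nested objects for the first "reason" key and accepts either a bare string or an object carrying a ".tag" key. Treating anything unrecognized as too_many_requests is also a choice made for this sketch.

```python
import json


def _find_reason(node):
    """Depth-first search for the first 'reason' key in nested dicts."""
    if isinstance(node, dict):
        if "reason" in node:
            return node["reason"]
        for value in node.values():
            found = _find_reason(value)
            if found is not None:
                return found
    return None


def classify_429(content_type, body):
    """Return 'too_many_requests' or 'too_many_write_operations' for a 429 body."""
    # A plain-text body is treated as too_many_requests, as described above.
    if "application/json" not in (content_type or "").lower():
        return "too_many_requests"
    try:
        payload = json.loads(body)
    except ValueError:
        # Assumption: an unparseable body falls back to the rate limit case.
        return "too_many_requests"
    reason = _find_reason(payload)
    # Assumption: the value is a bare string or an object with a ".tag" key.
    if isinstance(reason, dict):
        reason = reason.get(".tag")
    if reason == "too_many_write_operations":
        return reason
    return "too_many_requests"
```

The caller would pass the Content-Type header value and the raw response body, then branch on the returned tag to choose between the two handling strategies discussed in the rest of this document.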

Rate Limits

Rate limit errors occur when your app makes more requests in a specific window of time than is allowed. There are many technical and non-technical reasons why these rate limits exist, but they are set high enough that legitimate uses of the API should rarely encounter them.

Rate limits are applied to the OAuth access token used to make the request. For user-linked apps this means rate limits apply per user who has linked your app. Multiple apps that a user may have linked don’t contribute to each other’s rate limiting.

For team-linked apps, rate limits apply per team whose admin has linked your app when calling Business Endpoints. If your app has the team member file access permission and is calling User Endpoints, the rate limits apply per team member for these requests to ensure scalability for apps that cater to larger teams. This means that, for team-linked apps, requests made by multiple apps that a team may have linked don’t contribute to each other’s rate limiting, and requests made on behalf of multiple team members by a single app don’t contribute to each other’s rate limiting either.

Rate limited responses include a Retry-After header that provides the duration, in seconds, your app should wait before retrying the request to avoid a subsequent rate limit response. Note that this duration is an upper cap on how long an app would wait to guarantee the next request will not be rate limited. A request made sooner may not be rate limited. If you’re finding that waiting the duration provided by the Retry-After header is causing a bottleneck in your app’s execution, you may want to implement your own back-off mechanism wherein you start by waiting a fraction of the Retry-After value before retrying and wait a little longer after each consecutive request that is being rate limited. With experimentation, you can find the right balance between ensuring your app will not be rate limited on its next request and ensuring it won’t wait any longer than necessary to avoid rate limits. However, keep in mind that rate limited requests themselves also contribute to rate limits so you may want to avoid checking for rate limit expiry too frequently.
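
One possible shape for such a back-off mechanism is sketched below in Python. The function name and the default values for start_fraction, growth, and max_attempts are illustrative assumptions, not recommended settings; the right values can only be found by experimenting with your own workload.

```python
def backoff_delays(retry_after, start_fraction=0.25, growth=2.0, max_attempts=5):
    """Yield wait times in seconds for consecutive rate limited attempts.

    The first wait is a fraction of the Retry-After value; each later wait
    is longer, and no wait exceeds the Retry-After value itself.
    """
    ceiling = float(retry_after)
    delay = ceiling * start_fraction
    for _ in range(max_attempts):
        yield min(delay, ceiling)
        delay *= growth
```

With a Retry-After value of 8, the defaults yield waits of 2, 4, 8, 8, and 8 seconds. A real retry loop would sleep for each yielded value between attempts and would probably restart the sequence from the Retry-After value of the most recent response.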

One last note on rate limits: We are often asked whether we can share what these rate limits actually are. Dropbox doesn’t publish the rate limits for its API. This is to discourage developers from relying on client-side rate limit caps hard-coded into their apps in order to avoid rate limit errors altogether. While it may seem advantageous, this approach becomes problematic when we change the rate limits being applied on Dropbox servers. Given that we continue to experiment with and adjust our rate limits, we strongly advise against using this approach, and instead recommend you try your own back-off mechanism as discussed above.

Namespace Lock Contentions

To understand why namespace lock contentions occur, it is important to first understand what namespaces are and how they are used by the Dropbox service.

Namespaces

You can think of a namespace as a collection of files and folders in the internal representation of the Dropbox file system. Each user’s private Dropbox folder maps to a root namespace. Shared folders and team folders in Dropbox Business and Dropbox Enterprise are each also mapped to namespaces. When a user joins a shared folder, it is mounted in their root namespace. That is, a special folder entry is added to the user’s root namespace that points to the shared namespace. (Team folders are auto-mounted for all their members.) These details are largely hidden from the end user, and the shared namespaces are synced to their desktop just like any other folder. Multiple users can mount the same shared namespace, so any changes to files in that namespace are synced to all its members.

Importantly, namespaces are also where we set ownership and access permissions. A user has the same access type for all of the files and folders in a namespace. For example, a user with view permissions to a shared folder can mount the namespace backing that shared folder and sync its contents, but is not allowed to write any changes. In the product, we call this a “View-only Shared Folder”.

Namespace Locks

To ensure consistency, a write operation on a file within a namespace first acquires a lock on that namespace and releases the lock when the operation has completed. A namespace lock contention occurs when a process attempts to acquire a lock on a particular namespace and fails because another process holds the lock. Note that namespace lock contention is a Dropbox-wide limitation and is not API-specific. Attempting to upload many files on the website in parallel to the same shared folder will cause the same error. Namespace lock contention errors also include a Retry-After header in the response. Its value will always be 0, indicating that the client can immediately retry the request, since namespace locks are acquired and released in quick succession. If you can ascertain that your app itself is not responsible for the lock contention (for example, if you’re only executing uploads serially), then the best way to handle these errors programmatically is to retry the request until it succeeds.
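
A retry loop for this case could look like the following Python sketch. Here send is a hypothetical callable, supplied by your app, that performs the request and returns a (status, reason, retry_after) tuple. The max_attempts cap is an assumption added so that the sketch cannot loop forever; the guidance above simply says to retry until the request succeeds.

```python
import time


def retry_on_lock_contention(send, max_attempts=10, sleep=time.sleep):
    """Call send() until it stops reporting namespace lock contention.

    Returns (status, attempts_used) for the first response that is not a
    too_many_write_operations error.
    """
    for attempt in range(1, max_attempts + 1):
        status, reason, retry_after = send()
        if not (status == 429 and reason == "too_many_write_operations"):
            return status, attempt
        # Retry-After is expected to be 0 here, so this usually does not pause.
        sleep(retry_after)
    raise RuntimeError("still contended after %d attempts" % max_attempts)
```

Passing sleep as a parameter keeps the loop easy to exercise in tests without real delays.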

However, in certain use cases such as data migration, your application may be performing parallel uploads to the same namespace and may, therefore, be directly responsible for the lock contention, especially in situations where there is no concurrent write activity from users or other apps. Dropbox API v2 offers batch upload functionality to mitigate the effects of namespace lock contention in these scenarios.

Batch Upload

Each file upload to Dropbox consists of two stages: appending the byte contents of the file to an upload buffer on the Dropbox server, and committing those bytes as a file into a target namespace. In API v2, the traditional /files/upload endpoint combines the two stages together while the upload session mechanism decouples uploading byte contents of a file from committing those bytes to the target namespace. Batch upload is an extension of this mechanism that takes advantage of the fact that a namespace lock is acquired when uploaded bytes are being committed, and not when they are being uploaded. This allows developers to minimize namespace lock contention and improve performance of their apps. The idea is to group concurrent file uploads into batches, where files in each batch are uploaded in parallel via multiple API requests to maximize throughput, but the whole batch is committed in a single, asynchronous API call to allow Dropbox to coordinate the acquisition and release of namespace locks for all files in the batch as efficiently as possible. Following this approach, which we outline in detail below, you can achieve workload parallelization without encountering namespace lock contention or having to worry about tracking namespace boundaries. If, however, your goal is to optimize throughput as much as possible, you should follow a two-tier approach, where you restrict each batch to target at most one namespace, and in addition to parallelizing file uploads within each batch, also parallelize batch uploads that target distinct namespaces.
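
The first tier of the two-tier approach, restricting each group of uploads to a single namespace, can be sketched in Python as follows. How an app determines the namespace of a target path is not covered here, so namespace_of is a hypothetical caller-supplied function, and the path-prefix rule in the usage example is purely a toy stand-in.

```python
from collections import defaultdict


def partition_by_namespace(uploads, namespace_of):
    """Group upload targets so that each group touches exactly one namespace."""
    groups = defaultdict(list)
    for upload in uploads:
        groups[namespace_of(upload)].append(upload)
    return dict(groups)


# Toy usage: pretend the top-level folder name identifies the namespace.
paths = ["/team/a.txt", "/team/b.txt", "/private/c.txt"]
groups = partition_by_namespace(paths, lambda p: p.split("/")[1])
```

Groups for distinct namespaces can then be processed in parallel, while the batches within one group are committed one after another.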

The general algorithm is as follows:

  1. Group the files being uploaded into one or more batches. A batch can contain up to 1,000 entries. Entries in a batch don’t all have to target the same namespace, but the fewer namespaces involved in a batch, the more efficient committing the batch will be.
  2. For each file in a batch, in parallel, call /files/upload_session/start once and /files/upload_session/append_v2 zero or more times as needed to upload the full contents of the file over multiple requests. Note that the last call to /files/upload_session/append_v2 when uploading a particular file (or the call to /files/upload_session/start if the full contents are to be uploaded in a single request) should specify 'close': true in its argument to ensure the upload session is closed and ready to be committed.
  3. Call /files/upload_session/finish_batch once for the entire batch to commit the uploaded files. If you’re uploading multiple batches to the same namespace in parallel, you should call /files/upload_session/finish_batch serially to avoid namespace lock contention as discussed above. Note that finish_batch is an asynchronous call: it will return immediately with an async_job_id and continue processing the commit in the background. Call /files/upload_session/finish_batch/check with the async_job_id to check whether the batch commit has completed.
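
The bookkeeping behind steps 1 and 2 can be sketched in Python without making any network calls. Both function names are hypothetical, and chunk_size is an assumed parameter chosen by the app. The sketch only plans which endpoint each request would target and which request carries 'close': true; it does not send anything.

```python
def chunk_batches(files, max_entries=1000):
    """Split a list of files into batches of at most max_entries (step 1)."""
    return [files[i:i + max_entries] for i in range(0, len(files), max_entries)]


def plan_session_calls(file_size, chunk_size):
    """Return (endpoint, offset, length, close) tuples for one file (step 2).

    The first request goes to start, later ones to append_v2, and only the
    final request has close set to True.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    calls = []
    offset = 0
    endpoint = "/files/upload_session/start"
    while True:
        length = min(chunk_size, file_size - offset)
        last = offset + length >= file_size
        calls.append((endpoint, offset, length, last))
        if last:
            return calls
        offset += length
        endpoint = "/files/upload_session/append_v2"
```

For a 10-byte file and a 4-byte chunk size, the plan is one start request followed by two append_v2 requests, with close set only on the last. Once every file in a batch has a closed session, the batch is ready for the single finish_batch call described in step 3.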