Using Dropbox's Delta API: Lessons Learned From Site44

Posted on October 10, 2012

This guest post is written by Steve Marx, one of the founders of Site44, a static web hosting service built on top of Dropbox.

Dropbox has a rich API that allows developers to query and manipulate data in Dropbox. One of the newest additions to the API is /delta, an efficient way to keep track of changes to a user's Dropbox. This API has been available in production since March of this year, and we at Site44 have been developing with it since its beta in February. Along the way, we developed some best practices for making effective use of this valuable API.

Where Site44 uses the delta API

Site44 provides static website hosting to Dropbox users. We essentially act as a web server in front of Dropbox, taking incoming requests and determining which file from Dropbox should be sent as a response. To maximize website performance and to minimize the load on Dropbox's servers, we avoid fetching content from Dropbox except when absolutely necessary. Instead, we maintain our own copy of all our users' content. The delta API is what lets us keep that copy up-to-date even as users are making changes to their websites in Dropbox.

How to use the delta API

Each call you make to the delta API returns a list of "delta entries." Each delta entry consists of a path and metadata. If the metadata is null, it means the path was deleted. If the metadata is present, it's the current metadata for that path.

On your first call to the delta API, you'll receive delta entries that include every file in the user's Dropbox to which your app has access (ideally just an app folder). On subsequent calls, you'll only see delta entries for paths that have been created, modified, or deleted since the last call. I like to think of the delta entries as a set of instructions for how to update an app's local state to match Dropbox's state. They aren't necessarily the exact changes that a user made, so don't think of them as a log or an activity feed. The only guarantee the delta API makes is that if you process each of the delta entries, your state will match Dropbox's at the time of the call.
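To make the "set of instructions" idea concrete, here's a minimal sketch in Python. It assumes the local state is a plain dictionary mapping paths to metadata and that each delta entry is a `[path, metadata]` pair. The `apply_delta` helper, the sample paths, and the `rev` field are illustrative, not Site44's actual code. Treating the deletion of a path as also removing everything beneath it is likewise an assumption.

```python
def apply_delta(state, entries):
    """Update a local {path: metadata} mirror from a list of delta entries."""
    for path, metadata in entries:
        if metadata is None:
            # Null metadata means the path was deleted. Assumption: a deleted
            # folder takes everything beneath it along with it.
            prefix = path.rstrip("/") + "/"
            doomed = [p for p in state if p == path or p.startswith(prefix)]
            for p in doomed:
                del state[p]
        else:
            # Otherwise this is the current metadata for the path.
            state[path] = metadata
    return state


# Illustrative data only; real metadata carries many more fields.
state = {
    "/index.html": {"rev": "1"},
    "/css/site.css": {"rev": "1"},
    "/css/old.css": {"rev": "1"},
}
entries = [
    ["/index.html", {"rev": "2"}],   # modified
    ["/css", None],                  # folder deleted
    ["/about.html", {"rev": "1"}],   # created
]
apply_delta(state, entries)
print(sorted(state))
```

Notice that the function never asks what the user did. It only moves the local copy toward Dropbox's state, one entry at a time.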

For Dropbox to return to you the right set of changes, it has to know what changes you've already seen. To that end, every response from the delta API returns a "cursor," which you then pass as a parameter on your next call. When the cursor is present in the call, it means you're asking Dropbox "What's changed since you gave me this cursor?"
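The cursor round trip can be sketched as follows. `FakeDropbox` is a made-up stand-in so the example runs without network access, and its numeric cursors are purely an artifact of the fake; real cursors should be treated as opaque strings. The `poll` helper and the `store` dictionary are hypothetical names as well.

```python
class FakeDropbox:
    """Stand-in for the real service so the sketch is runnable offline."""

    def __init__(self):
        self.changes = []  # [path, metadata] entries, oldest first

    def delta(self, cursor=None):
        # The fake encodes "how much you have seen" as a number. A real
        # cursor is opaque: store it and send it back unmodified.
        start = int(cursor) if cursor is not None else 0
        return {"entries": self.changes[start:], "cursor": str(len(self.changes))}


def poll(client, store, handle=lambda entries: None):
    """Ask what changed since the stored cursor, then remember the new one."""
    response = client.delta(store.get("cursor"))
    handle(response["entries"])
    # Save the cursor only after the entries were handled, so a crash in
    # between means seeing the entries again rather than missing them.
    store["cursor"] = response["cursor"]
    return response["entries"]


dropbox = FakeDropbox()
store = {}  # in a real app this would be durable, per-user storage
dropbox.changes.append(["/index.html", {"rev": "1"}])
first = poll(dropbox, store)
second = poll(dropbox, store)
print(len(first), len(second))
```

Saving the cursor last is a design choice of this sketch, not something the API requires. It trades occasional duplicate processing for never skipping a change.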

There's one last field returned by the delta API: "reset." If this field is set to true, it means you should discard all your local state before processing the delta entries. This happens the first time you call the API (when you have no cursor). Otherwise it should be rare, but make sure your code handles it properly.
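Handling "reset" takes only a couple of lines, which makes it easy to forget. Here's a small sketch, assuming the response is a dictionary with `reset` and `entries` keys and the local state is a dictionary keyed by path. The `process_response` name and the sample data are hypothetical.

```python
def process_response(state, response):
    """Apply one delta response to the local state, honoring the reset flag."""
    if response.get("reset"):
        # Throw away everything we thought we knew before applying entries.
        state.clear()
    for path, metadata in response["entries"]:
        if metadata is None:
            state.pop(path, None)
        else:
            state[path] = metadata
    return state


# A stale local copy that still remembers a file Dropbox no longer has.
state = {"/stale.html": {"rev": "7"}, "/index.html": {"rev": "3"}}
response = {
    "reset": True,
    "entries": [["/index.html", {"rev": "4"}]],
    "cursor": "opaque-cursor",
}
process_response(state, response)
print(state)
```

Without the `clear()` call, `/stale.html` would linger forever, because a reset response describes what exists rather than what was removed.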

Achieving low latency

A big part of Site44's value is how quickly we can pick up changes that users make to files in their Dropbox. Because the delta API is based on polling, we see those changes only as quickly as we poll the API. We'd like to see changes within a second of when they're made, and there's no way we can poll at that rate. Even if we could, Dropbox might not be happy with the number of requests per second we were making to their API.

To solve this problem, we use a hybrid approach. We poll as fast as we reasonably can, but we also poll on-demand when someone refreshes in the browser. This means that if a user makes a change to a file in Dropbox and then refreshes their website in the browser, they will see the change immediately. This approach has worked very well for us, and we find that our users are impressed with how quickly we're able to pick up changes to their sites.
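One way to picture the hybrid approach is a poller with two triggers: a timer and an incoming web request. The sketch below is an illustration rather than Site44's implementation. The `HybridPoller` class, the 60-second interval, and the one-second minimum gap between on-demand polls are all assumptions chosen for the example.

```python
import time


class HybridPoller:
    """Poll on a timer, and also when a visitor's request arrives."""

    def __init__(self, poll, interval=60.0, min_gap=1.0, clock=time.monotonic):
        self.poll = poll          # callable that performs one delta call
        self.interval = interval  # assumed background polling period, seconds
        self.min_gap = min_gap    # assumed throttle for on-demand polls, seconds
        self.clock = clock
        self.last_poll = None

    def _poll_if_older_than(self, age):
        now = self.clock()
        if self.last_poll is None or now - self.last_poll >= age:
            self.last_poll = now
            self.poll()
            return True
        return False

    def on_timer(self):
        # Background path: only poll once the full interval has elapsed.
        return self._poll_if_older_than(self.interval)

    def on_request(self):
        # On-demand path: poll right away unless we polled a moment ago.
        return self._poll_if_older_than(self.min_gap)


# Drive the poller with a fake clock so the behavior is easy to follow.
now = [0.0]
calls = []
poller = HybridPoller(lambda: calls.append(now[0]), clock=lambda: now[0])
poller.on_timer()      # first poll happens immediately
now[0] = 5.0
poller.on_timer()      # skipped: the interval has not elapsed
poller.on_request()    # a visitor refreshed, so poll now
print(calls)
```

The minimum gap keeps a burst of page loads from turning into a burst of API calls, while a single refresh still triggers a fresh poll.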

Simplifying the rest of your API use

The delta API, though slightly more complicated than the rest of Dropbox's API, lets you simplify the rest of your code. If we didn't have the delta API, we'd have to periodically walk the contents of a user's Dropbox, checking metadata to figure out what's changed. Because we do have the delta API, the rest of Site44's interaction with Dropbox is simple. When we need to retrieve a file that's changed, we just use the /files API to retrieve the new file.
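Deciding which files to retrieve can be sketched in a few lines. This assumes each metadata dictionary carries a `rev` field that changes whenever the file's content changes and an `is_dir` flag for folders. The `files_to_fetch` helper and the `local_revs` dictionary are hypothetical, and the actual download call is left out.

```python
def files_to_fetch(local_revs, entries):
    """Return the paths whose content we need to download again."""
    wanted = []
    for path, metadata in entries:
        if metadata is None or metadata.get("is_dir"):
            # Deletions and folders have no content to download.
            continue
        if local_revs.get(path) != metadata.get("rev"):
            wanted.append(path)
    return wanted


# Revisions of the copies we already hold (illustrative values).
local_revs = {"/index.html": "a1", "/logo.png": "b7"}
entries = [
    ["/index.html", {"rev": "a2", "is_dir": False}],  # changed
    ["/logo.png", {"rev": "b7", "is_dir": False}],    # same content
    ["/images", {"rev": "c1", "is_dir": True}],       # folder
    ["/old.html", None],                              # deleted
]
print(files_to_fetch(local_revs, entries))
```

Only the paths in the returned list need a download, so a quiet site costs nothing beyond the delta call itself.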

The delta API ensures that we can make a small number of calls to Dropbox and still maintain a perfect copy of our users' data. Thanks to our hybrid approach of timed and on-demand polling, we can do all of this with very low latency.

A unique benefit

Site44 leverages Dropbox's ease of use and its API to provide our customers with an absurdly simple web hosting experience. Without the delta API, keeping customer content in sync would have been a pain; with it, staying in sync was straightforward. As far as we know, Dropbox is the only cloud storage provider to offer first-class support for synchronization. Give it a try!