To let API apps verify uploaded contents or compare remote files to local files without downloading them, the FileMetadata object contains a hash of the file contents in the content_hash property.
To verify that the server’s copy of the file is identical to yours, check that the server-generated content_hash matches the content hash you compute locally for the file.
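To calculate the content_hash of a file:
1. Split the file into blocks of 4 MB (4,194,304 bytes). The last block, if any, may be smaller than 4 MB.
2. Compute the SHA-256 hash of each block.
3. Concatenate the block hashes, in order, into a single binary string.
4. Compute the SHA-256 hash of the concatenated string.
5. Format the result as a hexadecimal string.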
Note that an empty (zero-length) file has no blocks. In that case step 3 above produces an empty string, and the content hash is the SHA-256 of that empty string.
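A minimal Python sketch of this procedure, using only the standard library. It follows the parameters visible in the worked example in this section (4,194,304-byte blocks, SHA-256 per block, SHA-256 over the concatenated digests, hex output). The names `content_hash` and `BLOCK_SIZE` are illustrative and not part of any official SDK, and the sketch assumes a buffered binary stream whose `read(n)` returns fewer than `n` bytes only at end of file.

```python
import hashlib
import io

# Block size taken from the worked example: 4 MB = 4,194,304 bytes.
BLOCK_SIZE = 4 * 1024 * 1024


def content_hash(stream, block_size=BLOCK_SIZE):
    """Return the hex content hash of a binary stream (illustrative helper)."""
    overall = hashlib.sha256()
    while True:
        block = stream.read(block_size)
        if not block:
            break  # end of stream; an empty file contributes no blocks
        # Feeding each 32-byte block digest into the outer hasher is
        # equivalent to concatenating all digests and hashing them once.
        overall.update(hashlib.sha256(block).digest())
    return overall.hexdigest()


# An empty input yields the SHA-256 of the empty string.
print(content_hash(io.BytesIO(b"")))
```

For a file on disk, the same helper can be called as `content_hash(open(path, "rb"))`, ideally inside a `with` block. Reading block by block keeps memory use bounded regardless of file size.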
Here is an example of running the above algorithm on this image of the Milky Way from NASA.
The file is 9.7 MB (9,711,423 bytes) in size. We divide it into three blocks and run SHA-256 on each block.
Block | Size | SHA-256 (32-byte value, shown in hex) |
1 | 4194304 | 2a846fa617c3361fc117e1c5c1e1838c336b6a5cef982c1a2d9bdf68f2f1992a |
2 | 4194304 | c68469027410ea393eba6551b9fa1e26db775f00eae70a0c3c129a0011a39cf9 |
3 | 1322815 | 7376192de020925ce6c5ef5a8a0405e931b0a9a8c75517aacd9ca24a8a56818b |
Concatenate the three 32-byte hashes to get a single 96-byte value.
Run SHA-256 on the concatenated value, then hex-encode the result, yielding 485291fa0ee50c016982abbfa943957bcd231aae0492ccbaa22c58e3997b35e0.
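The last two steps of this example can be reproduced without the image itself, starting from the three block hashes in the table. The following Python sketch decodes the hex values, concatenates them, and hashes the result. The variable names are illustrative.

```python
import hashlib

# Per-block SHA-256 values copied from the table above.
block_hashes_hex = [
    "2a846fa617c3361fc117e1c5c1e1838c336b6a5cef982c1a2d9bdf68f2f1992a",
    "c68469027410ea393eba6551b9fa1e26db775f00eae70a0c3c129a0011a39cf9",
    "7376192de020925ce6c5ef5a8a0405e931b0a9a8c75517aacd9ca24a8a56818b",
]

# Decode each hex string to its 32 raw bytes and join them in block order.
concatenated = b"".join(bytes.fromhex(h) for h in block_hashes_hex)
print(len(concatenated))  # 96

# Hash the 96-byte value and hex-encode it; compare the printed value
# with the final hash reported in the text above.
final_hash = hashlib.sha256(concatenated).hexdigest()
print(final_hash)
```

Note that the hashes are concatenated as raw bytes, not as hex text. Hashing the 192-character hex string instead would give a different result.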
Example code in some popular languages is available in our GitHub repo.
You can assume that the content_hash field will always be available and that we will not change the way it is generated.
However, in the unlikely case that we decide to change it in the future, we have declared the field as optional to keep the transition as smooth as possible.
We may provide the new representation in another field and stop providing the old representation after a certain time.
We will give developers advance notice with the details of the transition process.
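Because the field is declared optional, client code may want to handle its absence explicitly rather than treat a missing value as a mismatch. The sketch below assumes the file metadata is available as a plain Python dict with a `content_hash` key. The function name `verify_against_server` and the dict shape are hypothetical and not taken from an official SDK.

```python
from typing import Optional


def verify_against_server(metadata: dict, local_hash: str) -> Optional[bool]:
    """Compare a locally computed content hash with the server's value.

    Returns True or False when the server supplied a content_hash,
    and None when the field is absent and no comparison is possible.
    """
    server_hash = metadata.get("content_hash")  # assumed key name
    if server_hash is None:
        return None  # field missing: cannot verify either way
    return server_hash == local_hash


print(verify_against_server({"content_hash": "abc123"}, "abc123"))  # True
print(verify_against_server({}, "abc123"))  # None
```

Returning `None` for the missing case lets the caller choose a fallback, such as downloading the file for a direct comparison.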