
I am using the Dropbox API (Python version) and want to replicate one feature of the Dropbox client software.

In the Dropbox API, I can call a function such as put_file() to upload a file to my Dropbox account.

Dropbox implements a per-user deduplication mechanism: the client transmits the hash of a chunk or file to the server before transmitting the data itself. If you uploaded a file F before and the server now finds a hash match, the client does not need to transmit that chunk or file again.

put_file() seems to upload the whole file every time and does not do any chunking.

I also found upload_chunk(), which looked promising, but it does not seem to help with this.

How can I do chunk-based deduplication with the Dropbox API?

(For example, I would upload the hash of a particular chunk, and the server would reply whether it already has a chunk with that hash.)
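To make the exchange I have in mind concrete, here is a minimal sketch of the hash-first flow. Everything in it is hypothetical: FakeDedupServer is an in-memory stand-in, and has_chunk(), store_chunk(), the SHA-256 hash and the chunk size are my own assumptions, not calls or values from the Dropbox API.

```python
import hashlib

# Hypothetical chunk size; not a documented Dropbox value.
CHUNK_SIZE = 4 * 1024 * 1024


class FakeDedupServer:
    """In-memory stand-in for a server that stores chunks keyed by hash."""

    def __init__(self):
        self.chunks = {}

    def has_chunk(self, digest):
        # The "is there a hash match?" query described above.
        return digest in self.chunks

    def store_chunk(self, digest, data):
        self.chunks[digest] = data


def upload_with_dedup(server, data, chunk_size=CHUNK_SIZE):
    """Send only the chunks the server does not hold; return bytes sent."""
    sent = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        # Ask about the hash first; transmit the chunk only on a miss.
        if not server.has_chunk(digest):
            server.store_chunk(digest, chunk)
            sent += len(chunk)
    return sent
```

With this stand-in, uploading the same bytes a second time transmits nothing, and a file that shares some chunks with an earlier one transmits only the new chunks.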


1 Answer


According to this announcement, the purpose of chunked upload is to cope with spotty connections by letting you upload a large file in chunks. It is not about deduplication.
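As a rough illustration of what chunking buys you on an unreliable connection, here is a small sketch that retries only the chunk that failed instead of restarting the whole file. It does not use the Dropbox SDK: FlakyTransport, its send() method and upload_in_chunks() are hypothetical names that simulate a connection dropping on chosen calls.

```python
class FlakyTransport:
    """Hypothetical stand-in for a network call that fails on chosen attempts."""

    def __init__(self, fail_on):
        self.fail_on = set(fail_on)  # 1-based call numbers that should fail
        self.calls = 0
        self.received = bytearray()

    def send(self, offset, chunk):
        self.calls += 1
        if self.calls in self.fail_on:
            raise ConnectionError("simulated dropped connection")
        if offset != len(self.received):
            raise ValueError("unexpected offset")
        self.received.extend(chunk)
        # Report how many bytes the receiver holds so far.
        return len(self.received)


def upload_in_chunks(transport, data, chunk_size, max_retries=3):
    """Upload data chunk by chunk, retrying only the chunk that failed."""
    offset = 0
    while offset < len(data):
        chunk = data[offset:offset + chunk_size]
        for attempt in range(max_retries + 1):
            try:
                offset = transport.send(offset, chunk)
                break
            except ConnectionError:
                # Give up once the retry budget for this chunk is spent.
                if attempt == max_retries:
                    raise
    return offset
```

Note that every byte is still sent at least once here; nothing in this flow asks the server whether it already has the data.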

If you take a look through the Core API documentation (there is not that much to read, really), there is no mention anywhere of deduplication being offered through the API. Whether you use Python or any other language or library, if the published API does not support deduplication, there is no way to access this functionality.