B2 cloud
9/15/2023

For context, I was involved in the early days of Google Cloud Storage.

It is surprising that they didn't make it compatible with the S3 API - at least for common object/bucket create/delete. This will require more code to be written and it will be harder to adapt client libraries.

* The lack of scalable front-end load balancing is shown by the fact that they require users to first make an API call to get an upload URL, followed by doing the actual upload.

* They require a SHA1 hash when uploading objects. This is probably overkill over a cheaper CRC. In addition, it means that users have to make 2 passes to upload - first to compute the hash and then another to upload. This can slow uploads of large objects dramatically. A better method is to allow users to omit the hash and return it in the upload response. Then compare that response with a hash computed while uploading. In the rare case that the object was corrupted in transit, delete/retry. (A sketch of this scheme follows below.)
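As an illustration of the single-pass scheme in the last bullet, here is a minimal sketch in Python. The upload endpoint's behavior and the `contentSha1` response field are assumptions for the example, not the actual B2 API; `requests` is a third-party HTTP library.

```python
import hashlib

import requests  # third-party: pip install requests


def upload_and_verify(upload_url, auth_token, path, chunk_size=1 << 20):
    """Upload in one pass, hashing the bytes as they stream out."""
    sha1 = hashlib.sha1()

    def body():
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                sha1.update(chunk)   # hash computed while uploading
                yield chunk          # requests sends this chunked

    resp = requests.post(upload_url, data=body(),
                         headers={"Authorization": auth_token}, timeout=60)
    resp.raise_for_status()

    # Assumed response field: the hash the service computed for the object
    # as stored. Compare it with the one computed client-side on the way up.
    stored = resp.json().get("contentSha1")
    if stored != sha1.hexdigest():
        # Rare in-transit corruption: delete the object and retry.
        raise IOError(f"hash mismatch: sent {sha1.hexdigest()}, got {stored}")
    return stored
```

The client still verifies integrity end to end, but it no longer has to read the object twice; only the rare mismatch costs a retry.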
---

> They require a SHA1 hash when uploading objects. This is probably overkill over a cheaper CRC.

Having been on the receiving end of entirely too many corrupted files in my life, I strongly approve of their use of a hash that's been standardized and fast for decades and remains cryptographically strong. "But fast" if you fail to store it isn't very helpful. It's time to accept that cheap CRCs aren't a good place to get stuck; we're wallpapering over it with better ones, and everyone serious has been for years.

Improving the API to avoid the 2-pass problem is spot-on, though.

---

Another possible solution is to require either a subsequent API call, or format the first message as a multipart, and use that route to have the caller submit the hash that's used to confirm and commit the file to storage after the body upload. This would solve the 2-pass problem while still ensuring the client is actually doing the integrity check - and since Backblaze is more than likely to take the heat on any corruption issues, it's probably a good policy for them to make sure lazy client implementations aren't going to cause problems that their storage then gets the publicity smear for.
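To make the confirm-and-commit idea concrete, here is a sketch of what such a protocol could look like from the client side. The `/upload` and `/commit` endpoints, the `fileId` and `contentSha1` fields, and the staging behavior are all hypothetical, invented for this example.

```python
import hashlib

import requests  # third-party: pip install requests


def upload_then_commit(base_url, auth_token, path, chunk_size=1 << 20):
    sha1 = hashlib.sha1()

    def body():
        # Hash while streaming, so the client still makes only one pass.
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                sha1.update(chunk)
                yield chunk

    headers = {"Authorization": auth_token}

    # Step 1: upload the raw body; the server stages it and returns an id.
    staged = requests.post(f"{base_url}/upload", data=body(),
                           headers=headers, timeout=60)
    staged.raise_for_status()
    file_id = staged.json()["fileId"]  # hypothetical response field

    # Step 2: submit the client-computed hash to confirm and commit.
    # The server keeps the file only if its own digest of the staged
    # bytes matches, so a lazy client cannot skip the integrity check.
    commit = requests.post(f"{base_url}/commit",
                           json={"fileId": file_id,
                                 "contentSha1": sha1.hexdigest()},
                           headers=headers, timeout=60)
    commit.raise_for_status()
    return file_id
```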
---

As jbeda mentions, hardware errors are one big reason: with the scale S3/Azure/GCS/Backblaze operate at, it's a matter of when and not if you're going to run into problems.

Also: TLS may guarantee the bits your client sends are the ones their server receives, but that's just one cause of errors. There's the write path from when B2 receives your bits to when they're stored on disk, for one. You could have unforeseen bugs in the code sitting on the other end of their upload URL (it's probably not all theirs, and even if it was, it was written by human developers). Or B2's internal network path (if they have any) between that and the disk. Ideally that would provide integrity too, but maybe not. They offer a low price point and call out other compromises they make to achieve it (e.g. limited load balancing) - so while I really doubt it, it's remotely plausible they deem the internal overhead of SSL too high.

But then there's the potential for mismatch between "what the customer thinks they uploaded" and "what the customer actually uploaded" too! Less of an issue for now because their API only appears to support uploading files all at once, but eventually I'm sure they'll support a multipart upload scheme like the other platforms do. At which point uploads become more complicated, since clients need to retain state and potentially resume. What if a client screws it up and there's some off-by-one error (or whatever)? If you can provide instant feedback, at upload time, that your clients provided bogus data, that's a good thing.

You can argue it's a painful requirement to force on users, since it means they have to track/compute it themselves (might be nontrivial for streaming applications), which is fair. But there are enough points of failure, and the numbers so large, that errors happening is a fact and you really need to insure against it.
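The "nontrivial for streaming applications" point deserves a concrete illustration: if the data is generated on the fly, a hash that must be sent ahead of the body forces the client to spool the stream somewhere first, because the digest isn't known until the last byte exists. A minimal sketch of that spooling pass (the helper name is ours, not any real client library's):

```python
import hashlib
import tempfile


def spool_and_hash(stream, chunk_size=1 << 20):
    """First pass: drain a one-shot stream to a temp file, hashing as we go.

    Returns (spool_file, hex_sha1). The caller then re-reads the spool
    file for the actual upload - the second pass the API forces on you.
    """
    sha1 = hashlib.sha1()
    spool = tempfile.TemporaryFile()
    while chunk := stream.read(chunk_size):
        sha1.update(chunk)
        spool.write(chunk)
    spool.seek(0)  # rewind so the upload pass can re-read the bytes
    return spool, sha1.hexdigest()
```

With a multipart scheme the bookkeeping only grows: a digest per part, plus enough resume state to know which parts the server has acknowledged.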