Since 5fc2bbcc (2013-05-20 23:31:50 -0500), the md5 value we store in
the s3cmd-attrs header is that of the plaintext, not the encrypted,
instance of the file. But after download we're (incorrectly) checking
it against the md5 of the encrypted file. This patch fixes that by
ignoring the stored md5 when the x-amz-meta-s3tools-gpgenc header is
present.

We started storing the md5 value in the s3cmd-attrs header in 1703df7009
(Fri Jun 15 23:43:00 2012), so this has likely been broken for a couple
of years, and we'll have to deal with it (check both before and after
decryption, I suppose, in case it matches either). That'll be another
patch.

With the new Content-MD5 branch too, calculating the md5 before
encrypting is a really bad idea; that's just broken design. Encryption
should be done before we calculate the MD5 of the thing being
uploaded. It was the filename swizzle that caught me off guard.

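The follow-up idea mentioned above (accept the stored md5 if it matches either before or after decryption) could look roughly like the sketch below. It is not part of this patch; the function names are hypothetical, and it assumes both the encrypted and the decrypted bytes are available in memory.

```python
import hashlib


def md5_hex(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def md5_matches_either(stored_md5, encrypted_bytes, decrypted_bytes):
    """Hypothetical helper: report which form of the file the stored md5 matches.

    Returns "encrypted", "decrypted", or None when neither form matches.
    """
    if md5_hex(encrypted_bytes) == stored_md5:
        return "encrypted"
    if md5_hex(decrypted_bytes) == stored_md5:
        return "decrypted"
    return None
```

A real implementation would presumably hash the files in chunks rather than load them whole.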
@@ -1143,10 +1143,14 @@ class S3(object):
         response["md5"] = response["headers"]["etag"]
 
         md5_hash = response["headers"]["etag"]
-        try:
-            md5_hash = response["s3cmd-attrs"]["md5"]
-        except KeyError:
-            pass
+        if not 'x-amz-meta-s3tools-gpgenc' in response["headers"]:
+            # we can't trust our stored md5 because we
+            # encrypted the file after calculating it but before
+            # uploading it.
+            try:
+                md5_hash = response["s3cmd-attrs"]["md5"]
+            except KeyError:
+                pass
 
         response["md5match"] = md5_hash.find(response["md5"]) >= 0
         response["elapsed"] = timestamp_end - timestamp_start
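
For reading the hunk in isolation, here is a minimal standalone sketch of the selection logic it introduces. The functions `expected_md5` and `md5_matches` are hypothetical names, not part of s3cmd, and `response` is assumed to be a plain dict shaped like the one in the patch.

```python
def expected_md5(response):
    """Pick the md5 to compare the download against (hypothetical helper)."""
    headers = response["headers"]
    md5_hash = headers["etag"]
    # The stored md5 describes the plaintext, so it is only usable
    # when the object was not gpg-encrypted before upload.
    if "x-amz-meta-s3tools-gpgenc" not in headers:
        try:
            md5_hash = response["s3cmd-attrs"]["md5"]
        except KeyError:
            pass
    return md5_hash


def md5_matches(response, computed_md5):
    """Mirror the patch's substring test between expected and computed md5."""
    return expected_md5(response).find(computed_md5) >= 0
```

For an encrypted object the comparison falls back to the ETag, exactly as it did before the md5 was stored in s3cmd-attrs.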