Note: a short memo on why this is a nightmare for performance:
** For AWS, 2 cases:
- COPY copies the metadata of the source to the destination, but you
can't modify it. Any additional header is ignored anyway.
- REPLACE sets the additional metadata headers that are provided
but does not copy any of the source headers.
So, to add to the existing metadata during a copy, you have to do an
object_info to get the original source headers, modify them, then use
REPLACE for the copy operation.
** For Minio and maybe other implementations:
- If additional headers are sent, they are set on the destination
on top of the original source metadata in all cases, COPY and
REPLACE.
It is a nice behavior, except that it differs from the AWS one.
As if that were still too easy, there is another catch:
In all cases, for multipart copies, metadata is never copied
from the source.
This should normally be handled automatically now, at the cost of an
extra object_info request in the worst cases.
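
The header handling described in this memo can be sketched roughly as follows. This is only an illustration, not the code from this commit: `build_copy_headers` is a hypothetical helper, and `source_headers` is assumed to already hold only the metadata headers worth preserving (as fetched by something like object_info and cleaned up). The `x-amz-metadata-directive` header with the values COPY and REPLACE is the standard S3 one.

```python
def build_copy_headers(source_headers, extra_headers=None, multipart=False):
    """Hypothetical helper: headers to send so that the destination keeps the
    source metadata and gets the extra headers on top of it."""
    # Header names are case-insensitive, so normalize them before merging.
    extra = {k.lower(): v for k, v in (extra_headers or {}).items()}

    if not extra and not multipart:
        # Nothing to add on a direct copy: plain COPY lets the server carry
        # the source metadata over, so no object_info request is needed.
        return {"x-amz-metadata-directive": "COPY"}

    # Assumption: source_headers comes from an object_info-like request.
    # Start from it, then let the extra headers override.
    merged = {k.lower(): v for k, v in source_headers.items()}
    merged.update(extra)

    if not multipart:
        # With REPLACE, AWS uses exactly the metadata sent with the request,
        # which is why the source metadata has to be sent again.
        merged["x-amz-metadata-directive"] = "REPLACE"
    # Multipart copies never inherit metadata, so the merged headers are
    # returned as-is for the request that starts the multipart copy.
    return merged
```

Because the merged set is sent explicitly, the result should be the same on AWS and on implementations such as Minio that merge on their own. The price is the extra metadata lookup whenever extra headers are present or the copy is multipart.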
@@ -822,6 +822,27 @@ class S3(object):
 
     def object_copy(self, src_uri, dst_uri, extra_headers=None,
                     src_size=None, extra_label="", replace_meta=False):
+        """Remote copy an object and eventually set metadata
+
+        Note: A little memo description of the nightmare for performance here:
+        ** FOR AWS, 2 cases:
+        - COPY will copy the metadata of the source to dest, but you can't
+        modify them. Any additional header will be ignored anyway.
+        - REPLACE will set the additional metadata headers that are provided
+        but will not copy any of the source headers.
+        So, to add to existing meta during copy, you have to do an object_info
+        to get original source headers, then modify, then use REPLACE for the
+        copy operation.
+
+        ** For Minio and maybe other implementations:
+        - if additional headers are sent, they will be set to the destination
+        on top of source original meta in all cases COPY and REPLACE.
+        It is a nice behavior except that it is different of the aws one.
+
+        As it was still too easy, there is another catch:
+        In all cases, for multipart copies, metadata data are never copied
+        from the source.
+        """
         if src_uri.type != "s3":
             raise ValueError("Expected URI type 's3', got '%s'" % src_uri.type)
         if dst_uri.type != "s3":
@@ -837,8 +858,12 @@ class S3(object):
                 acl = None
 
         multipart = False
-
         headers = None
+
+        if extra_headers or self.config.mime_type:
+            # Force replace, that will force getting meta with object_info()
+            replace_meta = True
+
         if replace_meta:
             src_info = self.object_info(src_uri)
             headers = src_info['headers']
@@ -865,9 +890,8 @@ class S3(object):
                 threshold = self.config.multipart_copy_chunk_size_mb * SIZE_1MB
 
             if src_size > threshold:
-                # Sadly, s3 is badly done as metadata will not be copied in
-                # multipart copy unlike what is done in the case of direct
-                # copy.
+                # Sadly, s3 has a bad logic as metadata will not be copied for
+                # multipart copy unlike what is done for direct copies.
                 # TODO: Optimize by re-using the object_info request done
                 # earlier earlier at fetch remote stage, and preserve headers.
                 if src_headers is None:
@@ -883,6 +907,7 @@ class S3(object):
         else:
             headers = SortedDict(ignore_case=True)
 
+        # Following meta data are updated even in COPY by aws
         if self.config.acl_public:
             headers["x-amz-acl"] = "public-read"
 
@@ -898,6 +923,7 @@ class S3(object):
             headers['x-amz-server-side-encryption-aws-kms-key-id'] = \
                 self.config.kms_key
 
+        # Following meta data are not updated in simple COPY by aws.
         if extra_headers:
             headers.update(extra_headers)
 