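The S3 protocol is typically used to upload to and download from an object storage service. In some scenarios, though, a program has no credentials of its own, or its permissions have been deliberately narrowed for security, and it still needs to upload or download without authenticating. Presigned URLs, part of the S3 protocol, make such unauthenticated reads and writes possible.
PUT is recommended over POST for uploads because it is simpler to use.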
Uploading with PUT
import boto3
from botocore.config import Config

# S3_ENDPOINT, S3_ACCESS_KEY, S3_SECRET_KEY and S3_REGION are defined elsewhere.


def gen_s3_presigned_put(bucket: str, path: str) -> str:
    s3r = boto3.resource(
        's3',
        endpoint_url=S3_ENDPOINT,
        aws_access_key_id=S3_ACCESS_KEY,
        aws_secret_access_key=S3_SECRET_KEY,
        region_name=S3_REGION,
        config=Config(signature_version='s3v4'),
    )
    # Create the bucket if it does not exist yet (handy for debugging).
    if not s3r.Bucket(bucket).creation_date:
        s3r.create_bucket(Bucket=bucket)
    return s3r.meta.client.generate_presigned_url(
        ClientMethod='put_object',
        Params={
            'Bucket': bucket,
            'Key': path,
        },
        ExpiresIn=3600,  # seconds, i.e. one hour
        HttpMethod='PUT',
    )


url = gen_s3_presigned_put('bucket', 'remote/path/of/file')
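The source of generate_presigned_url can be found at botocore/signers.py#L245.
The return value is a URL string that accepts a PUT upload for one hour. Pass that URL by some means to a service or client that holds no credentials, and it can upload.
The S3_* values are the credentials: generating the URL requires them, using it does not. The code above also contains an if that checks whether the bucket exists and creates it when it does not. That is of little use in production but convenient for debugging, so drop it as you see fit.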
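To see why the URL needs no credentials at use time, it can help to look inside one: under signature_version='s3v4', the signature and its lifetime travel in the query string. The sketch below uses only the standard library to read X-Amz-Date and X-Amz-Expires and compute when a URL expires. The names presigned_url_expiry and example_url are hypothetical, and the URL itself is made up for illustration (its signature is not valid).

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_url_expiry(url: str) -> datetime:
    """Return the UTC time at which a SigV4 presigned URL stops working."""
    query = parse_qs(urlsplit(url).query)
    # X-Amz-Date is the signing time, e.g. 20200801T120000Z.
    signed_at = datetime.strptime(query['X-Amz-Date'][0], '%Y%m%dT%H%M%SZ')
    # X-Amz-Expires is the lifetime in seconds (the ExpiresIn argument).
    lifetime = timedelta(seconds=int(query['X-Amz-Expires'][0]))
    return signed_at.replace(tzinfo=timezone.utc) + lifetime


# Hypothetical URL for illustration only; the signature is not real.
example_url = (
    'https://s3.example.com/bucket/remote/path/of/file'
    '?X-Amz-Algorithm=AWS4-HMAC-SHA256'
    '&X-Amz-Credential=AKIDEXAMPLE%2F20200801%2Fus-east-1%2Fs3%2Faws4_request'
    '&X-Amz-Date=20200801T120000Z'
    '&X-Amz-Expires=3600'
    '&X-Amz-SignedHeaders=host'
    '&X-Amz-Signature=0123456789abcdef'
)
print(presigned_url_expiry(example_url))
```

Anyone who obtains the URL can use it until that time, so it is worth keeping ExpiresIn as short as the workflow allows.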
An upload example:
import requests


def upload_with_put(url):
    # Stream the local file as the request body.
    with open('local/path/of/file', 'rb') as file:
        response = requests.put(url, data=file)
    response.raise_for_status()
Downloading with GET
import boto3
from botocore.config import Config


def gen_s3_presigned_get(bucket: str, path: str) -> str:
    s3r = boto3.resource(
        's3',
        endpoint_url=S3_ENDPOINT,
        aws_access_key_id=S3_ACCESS_KEY,
        aws_secret_access_key=S3_SECRET_KEY,
        region_name=S3_REGION,
        config=Config(signature_version='s3v4'),
    )
    # Create the bucket if it does not exist yet (handy for debugging).
    if not s3r.Bucket(bucket).creation_date:
        s3r.create_bucket(Bucket=bucket)
    return s3r.meta.client.generate_presigned_url(
        ClientMethod='get_object',
        Params={
            'Bucket': bucket,
            'Key': path,
        },
        ExpiresIn=3600,  # seconds, i.e. one hour
        HttpMethod='GET',
    )


url = gen_s3_presigned_get('bucket', 'remote/path/of/file')
The implementation is almost identical to the PUT version; only ClientMethod and HttpMethod change.
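Since the two functions differ only in those two arguments, they could be folded into one helper. The following is a minimal sketch, not the original author's code: presign and CLIENT_METHODS are hypothetical names, and client is assumed to be a boto3 S3 client such as s3r.meta.client from the functions above.

```python
# Map each HTTP verb to the S3 client method it presigns.
CLIENT_METHODS = {'PUT': 'put_object', 'GET': 'get_object'}


def presign(client, http_method: str, bucket: str, path: str,
            expires: int = 3600) -> str:
    """Presign `path` in `bucket` for the given HTTP verb.

    `client` is assumed to be a boto3 S3 client, e.g. `s3r.meta.client`.
    """
    return client.generate_presigned_url(
        ClientMethod=CLIENT_METHODS[http_method],
        Params={'Bucket': bucket, 'Key': path},
        ExpiresIn=expires,
        HttpMethod=http_method,
    )
```

Passing the client in, rather than building it inside the helper, also keeps the bucket-creation check out of the download path, where it serves no purpose.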
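A download example:
import requests

response = requests.get(url, stream=True)  # stream instead of loading the whole body
response.raise_for_status()
size = 2**12  # Use a 4 KB memory buffer
with open(path, 'wb') as file:  # path is the local destination file
    for chunk in response.iter_content(chunk_size=size):
        file.write(chunk)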
Uploading with POST
import boto3
from botocore.config import Config


def gen_s3_presigned_post(bucket: str, path: str) -> tuple[str, dict]:
    s3r = boto3.resource(
        's3',
        endpoint_url=S3_ENDPOINT,
        aws_access_key_id=S3_ACCESS_KEY,
        aws_secret_access_key=S3_SECRET_KEY,
        region_name=S3_REGION,
        config=Config(signature_version='s3v4'),
    )
    # Create the bucket if it does not exist yet (handy for debugging).
    if not s3r.Bucket(bucket).creation_date:
        s3r.create_bucket(Bucket=bucket)
    dict_ = s3r.meta.client.generate_presigned_post(
        Bucket=bucket,
        Key=path,
        ExpiresIn=3600,  # seconds, i.e. one hour
    )
    # The form fields must be sent along with the URL.
    return dict_['url'], dict_['fields']


url, fields = gen_s3_presigned_post('bucket', 'remote/path/of/file')
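The source of generate_presigned_post can be found at botocore/signers.py#L605. Besides the URL, fields, a dict, must also be passed along. It also seems that path has to be passed or agreed on in advance, although a generic file may work too.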
import requests


def upload_with_post(url, fields):
    with open('local/path/of/file', 'rb') as file:
        # The file part is sent under the form field name "file".
        files = {'file': ('remote/path/of/file', file)}
        response = requests.post(url, data=fields, files=files)
    response.raise_for_status()
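An HTML page example follows; it also shows what fields actually contains. The field names shown are those of the older signature version; with signature_version='s3v4', as configured above, expect x-amz-algorithm, x-amz-credential, x-amz-date and x-amz-signature in place of AWSAccessKeyId and signature: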
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<!-- Copy the 'url' value returned by S3Client.generate_presigned_post() -->
<form action="URL_VALUE" method="post" enctype="multipart/form-data">
<!-- Copy the 'fields' key:values returned by S3Client.generate_presigned_post() -->
<input type="hidden" name="key" value="VALUE" />
<input type="hidden" name="AWSAccessKeyId" value="VALUE" />
<input type="hidden" name="policy" value="VALUE" />
<input type="hidden" name="signature" value="VALUE" />
File:
<input type="file" name="file" /> <br />
<input type="submit" name="submit" value="Upload to Amazon S3" />
</form>
</body>
</html>
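Rather than copying values into the form by hand, the form could be generated from the url and fields returned by generate_presigned_post. The sketch below is one hypothetical way to do that with the standard library; render_upload_form is not part of boto3, and the exact field names depend on the signature version in use.

```python
from html import escape


def render_upload_form(url: str, fields: dict) -> str:
    """Build an HTML upload form from generate_presigned_post() output."""
    # One hidden input per presigned field, with values HTML-escaped.
    hidden = '\n'.join(
        f'<input type="hidden" name="{escape(name)}" value="{escape(value)}" />'
        for name, value in fields.items()
    )
    return (
        f'<form action="{escape(url)}" method="post" enctype="multipart/form-data">\n'
        f'{hidden}\n'
        # S3 ignores form fields that come after "file", so it goes last.
        '<input type="file" name="file" />\n'
        '<input type="submit" value="Upload" />\n'
        '</form>'
    )
```

Keeping the file input last matters: S3 ignores any form field that follows it.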
Summary
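For the same upload task, POST is more cumbersome, so PUT is recommended. The difference between the two may lie in idempotency: a POST cannot be repeated, whereas a PUT can. (This is inferred from the semantics only and has not been verified in practice.)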
Multipart upload does not seem to be supported through a single presigned URL.
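A partial workaround does exist on AWS S3, though it has not been checked here against other S3-compatible services: the side holding credentials starts the multipart upload and presigns one upload_part URL per part. The sketch below assumes client is a boto3 S3 client such as s3r.meta.client; presign_upload_parts is a hypothetical helper name.

```python
def presign_upload_parts(client, bucket: str, path: str, part_count: int,
                         expires: int = 3600):
    """Start a multipart upload and presign one PUT URL per part.

    `client` is assumed to be a boto3 S3 client, e.g. `s3r.meta.client`.
    """
    upload_id = client.create_multipart_upload(Bucket=bucket, Key=path)['UploadId']
    urls = [
        client.generate_presigned_url(
            ClientMethod='upload_part',
            Params={
                'Bucket': bucket,
                'Key': path,
                'UploadId': upload_id,
                'PartNumber': number,
            },
            ExpiresIn=expires,
            HttpMethod='PUT',
        )
        for number in range(1, part_count + 1)  # part numbers start at 1
    ]
    return upload_id, urls
```

The uploader PUTs each part to its URL and reports back the ETag response headers; the credentialed side then still has to call complete_multipart_upload itself, so this is not a fully unauthenticated flow.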
Original source: https://note.qidong.name/2020/08/python-s3-presigned-url/
Author: 匿蟒