An elegant way to download large files with requests


python

#!/usr/bin/python
# -*- coding: UTF-8 -*-
import requests
import json
from contextlib import closing


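# Fetch one page of photo metadata for the collection (a JSON array).
# verify=False disables TLS certificate verification; avoid it outside of testing.
chapters = requests.get(url='https://unsplash.com/napi/collections/'
                            '1065976/photos?page=1&per_page=20&order_by=latest', verify=False)

# print(chapters.text)
json_res = json.loads(chapters.text)
print('     ……')
for photo in json_res:
    src_id = photo['id']
    download_url = photo['links']['download']+'?force=true'
    # stream=True defers the body download; closing() guarantees res.close() is called.
    with closing(requests.get(url=download_url, verify=False, stream=True)) as res:
        with open('%s.jpg' % src_id, 'wb') as fd:
            print('    ……')
            # Write the body in 1 KiB chunks so the whole file is never held in memory.
            for chunk in res.iter_content(chunk_size=1024):
                if chunk:  # skip empty chunks
                    fd.write(chunk)
print('    ……')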

Notes

Pass stream=True to requests.get.

with closing(requests.get(url=download_url, verify=False, stream=True)) as res:

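When stream is set to True on a request, Requests cannot release the connection back to the pool until you either consume all of the data or call Response.close. Connections that are never released reduce connection efficiency, so wrap the request in with closing to make sure the response is always closed.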
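To illustrate why with closing helps, here is a minimal sketch that runs without any network access. FakeResponse is a hypothetical stand-in invented for this example and is not part of Requests; it only records whether close() was called. The point is that closing() calls close() on exit from the with block even when the body of the block raises.

```python
from contextlib import closing


class FakeResponse:
    """Hypothetical stand-in for a streamed response; not part of Requests."""

    def __init__(self):
        self.closed = False

    def close(self):
        # With a real Response, close() is what releases the connection.
        self.closed = True


def consume(response):
    """Simulate a download that fails partway through."""
    with closing(response):
        raise IOError("disk full")


res = FakeResponse()
try:
    consume(res)
except IOError:
    pass

print(res.closed)  # True: close() ran despite the exception
```

Recent versions of Requests also let the Response object itself be used in a with statement, which should have the same effect as wrapping it in closing().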

        with open('%s.jpg' % src_id, 'wb') as fd:
            print('    ……')
            for chunk in res.iter_content(chunk_size=1024):
                if chunk:
                    fd.write(chunk)

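Response.iter_content handles much of what you would otherwise have to handle yourself when using Response.raw directly. When streaming a download, the loop above is the preferred and recommended way to retrieve the content.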
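The chunked write loop can be pulled out into a small helper, sketched below under a few assumptions: save_chunks is a hypothetical name, and the sample list stands in for what res.iter_content(chunk_size=1024) would yield, so the example runs without a network connection. It also returns the number of bytes written, which the original loop does not track.

```python
import os
import tempfile


def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the bytes written."""
    written = 0
    with open(path, 'wb') as fd:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fd.write(chunk)
                written += len(chunk)
    return written


# Stand-in data; with Requests this would be res.iter_content(chunk_size=1024).
sample = [b'abc', b'', b'defg']
target = os.path.join(tempfile.mkdtemp(), 'sample.bin')
total = save_chunks(sample, target)
print(total)  # 7
```

Because the helper accepts any iterable of bytes, the same function could be handed the iterator from a streamed response inside the with closing block.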
