Fetching exchange rates periodically on Heroku and uploading the logs to Amazon S3


Set up the development environment

Create a virtualenv

$ mkdir cronapp && cd cronapp
$ mkvirtualenv venv

Log in to Heroku

(venv)$ heroku login

Add the Python libraries

(venv)$ cat >> requirements.txt << EOF
APScheduler==3.0.4
awscli==1.9.11
boto3==1.2.2
botocore==1.3.11
colorama==0.3.3
docutils==0.12
futures==3.0.3
httplib2==0.9
jmespath==0.9.0
pyasn1==0.1.9
python-dateutil==2.4.2
pytz==2015.7
requests==2.8.1
rsa==3.2.3
six==1.10.0
tzlocal==1.2
wheel==0.26.0
EOF
(venv)$ pip install -r requirements.txt

Create a script to check that scheduling works

(venv)$ vi cron.py
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

@sched.scheduled_job('interval', minutes=3)
def job_3min():
    print('[cron.py:job_3min] Start.')

sched.start()

Add a Procfile that starts the script, which then runs the job periodically

(venv)$ echo "bot: python cron.py" > Procfile

Add a .gitignore

(venv)$ cat >> .gitignore << EOF
venv
*.pyc
.idea
EOF

Create a local repository

(venv)$ git init && git add . && git commit -m "initial commit"

Deploy to Heroku

Create the app and its Git remote on Heroku

(venv)$ heroku create

Deploy the application

(venv)$ git push heroku master

Assign a dyno to the process

(venv)$ heroku ps:scale bot=1

Check that it works

(venv)$ heroku logs
2015-12-07T01:36:20.343967+00:00 app[bot.1]: [cron.py:job_3min] Start.
2015-12-07T01:39:20.346373+00:00 app[bot.1]: [cron.py:job_3min] Start.
2015-12-07T01:42:20.344067+00:00 app[bot.1]: [cron.py:job_3min] Start.

Fetch exchange rate data from openexchangerates.org and upload it to S3

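Change cron.py to the following

  • Create an AWS IAM user and obtain an Open Exchange Rates API key beforehand.
  • Open Exchange Rates data is updated at around 1 to 2 minutes past each hour, so the job is scheduled for 10 minutes past every hour to leave some margin.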
import requests, json, datetime, pytz, logging
import boto3, botocore
from apscheduler.schedulers.blocking import BlockingScheduler

logging.basicConfig()
sched = BlockingScheduler()

@sched.scheduled_job('cron', minute='10', hour='*/1')
def job_crawl():
    print('[cron.py:job_crawl] Start.')

    ####################################
    # API Keys
    ####################################

    OPEN_EXCHANGE_API_URL = 'https://openexchangerates.org/api/latest.json?app_id='
    OPEN_EXCHANGE_APP_ID = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

    AWS_ACCESS_KEY_ID = 'xxxxxxxxxxxxxxxx'
    AWS_SECRET_ACCESS_KEY = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    AWS_REGION_NAME = 'xx-xxxxx-x'
    AWS_S3_BUCKET_NAME = 'xxxxxxxxxxx'

    ####################################
    # Retrieve JSON data from openexchangerates.org
    ####################################

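    res = requests.get(OPEN_EXCHANGE_API_URL + OPEN_EXCHANGE_APP_ID)
    # Fail early on HTTP errors instead of parsing an error page
    res.raise_for_status()
    # Let requests decode the body; str has no decode() on Python 3
    json_data = res.json()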
    del json_data['disclaimer']
    del json_data['license']
    json_text = json.dumps(json_data)

    timestamp = json_data['timestamp']
    exchange_date = datetime.datetime.fromtimestamp(timestamp, tz=pytz.utc)

    ####################################
    # Upload json data to S3 bucket
    ####################################

    if json_text:

        #
        # AWS Session
        #
        session = boto3.session.Session(aws_access_key_id=AWS_ACCESS_KEY_ID,
                                        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                                        region_name=AWS_REGION_NAME)
        s3 = session.resource('s3')
        bucket = s3.Bucket(AWS_S3_BUCKET_NAME)

        #
        # Upload Latest
        #
        bucket_latest_key_name = 'exchange/latest.json'
        obj = bucket.Object(bucket_latest_key_name)
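        # Declare the charset in ContentType; Content-Encoding is meant for
        # compression schemes such as gzip, not character sets
        response = obj.put(
            Body=json_text.encode('utf-8'),
            ContentType='application/json; charset=utf-8'
        )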

        #
        # Upload Daily Data
        #
        bucket_prefix_daily = "{0:%Y-%m-%d}".format(exchange_date)
        bucket_daily_key_name = 'exchange/' + bucket_prefix_daily + '/' + bucket_prefix_daily + '.json'
        obj = bucket.Object(bucket_daily_key_name)
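        # Declare the charset in ContentType; Content-Encoding is meant for
        # compression schemes such as gzip, not character sets
        response = obj.put(
            Body=json_text.encode('utf-8'),
            ContentType='application/json; charset=utf-8'
        )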

        #
        # Upload Hourly Data
        #
        bucket_hourly_prefix = "{0:%Y-%m-%d-%H}".format(exchange_date)
        bucket_hourly_key_name = 'exchange/' + bucket_prefix_daily + '/' + bucket_hourly_prefix + '.json'
        try:
            # If the hourly file already exists, leave it untouched
            s3.Object(AWS_S3_BUCKET_NAME, bucket_hourly_key_name).load()
        except botocore.exceptions.ClientError as e:
            # Upload only when the object is missing (404); re-raise other errors
            if e.response['Error']['Code'] != '404':
                raise
            obj = bucket.Object(bucket_hourly_key_name)
            response = obj.put(
                Body=json_text.encode('utf-8'),
                ContentType='application/json; charset=utf-8'
            )

    print('[cron.py:job_crawl] Done.')


sched.start()
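
The 'cron' trigger with minute='10', hour='*/1' fires at ten minutes past every hour. As a rough illustration of that schedule only, the next fire time can be worked out with the standard library. This is a sketch, not APScheduler's own implementation, and next_run is a hypothetical helper that does not appear in cron.py.

```python
import datetime


def next_run(now, minute=10):
    """Return the first time strictly after `now` whose minute equals `minute`."""
    # Snap to the requested minute within the current hour
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    # If that moment has already passed (or is exactly now), use the next hour
    if candidate <= now:
        candidate += datetime.timedelta(hours=1)
    return candidate


if __name__ == '__main__':
    print(next_run(datetime.datetime(2015, 12, 7, 3, 0, 30)))
```

With the data refreshed at about 1 to 2 minutes past the hour, a run at minute 10 leaves several minutes of slack before the fetch.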

Commit the changes and deploy

(venv)$ git add . && git commit -m "changed cron job"
(venv)$ git push heroku master
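
The cron.py listing keeps the API key and AWS credentials in the source, so they end up in the Git repository that is pushed. One possible alternative is to set them as Heroku config vars (for example with heroku config:set) and read them from the environment at startup. The following is a minimal sketch under that assumption. load_settings and REQUIRED_VARS are hypothetical names, and the variable names simply reuse the constant names from cron.py.

```python
import os

# Environment variable names assumed for this sketch; they mirror the
# constants in cron.py and are not mandated by Heroku or Open Exchange Rates.
REQUIRED_VARS = (
    'OPEN_EXCHANGE_APP_ID',
    'AWS_ACCESS_KEY_ID',
    'AWS_SECRET_ACCESS_KEY',
    'AWS_REGION_NAME',
    'AWS_S3_BUCKET_NAME',
)


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any are unset."""
    environ = os.environ if environ is None else environ
    # Treat empty strings the same as missing variables
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError('Missing environment variables: ' + ', '.join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Inside job_crawl, the hardcoded constants could then be replaced by lookups such as settings['AWS_S3_BUCKET_NAME'].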

Check that it works

(venv)$ heroku logs
2015-12-07T03:10:00.003862+00:00 app[bot.1]: [cron.py:job_crawl] Start.
2015-12-07T03:10:01.856428+00:00 app[bot.1]: [cron.py:job_crawl] Done.
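
Besides the log lines, the upload can be checked by looking for the expected objects in the bucket. The sketch below reproduces how cron.py derives the three object keys from the timestamp field of the response. s3_keys is a hypothetical helper, it uses datetime.timezone.utc (Python 3) in place of pytz.utc, and the timestamp in the example is an assumed value, not one taken from a real response.

```python
import datetime


def s3_keys(timestamp):
    """Return the (latest, daily, hourly) object keys for a Unix timestamp, in UTC."""
    dt = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
    day = '{0:%Y-%m-%d}'.format(dt)
    hour = '{0:%Y-%m-%d-%H}'.format(dt)
    return (
        'exchange/latest.json',
        'exchange/' + day + '/' + day + '.json',
        'exchange/' + day + '/' + hour + '.json',
    )


if __name__ == '__main__':
    # 1449457200 is 2015-12-07 03:00:00 UTC (assumed example value)
    for key in s3_keys(1449457200):
        print(key)
```

The latest and daily keys are overwritten on every run, while the hourly key is written only if it does not exist yet.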

Extras

Rename the app

(venv)$ heroku apps:rename cronapp

Point the Git remote at the renamed app

(venv)$ git remote rm heroku
(venv)$ heroku git:remote -a cronapp

References