First Steps in Web Scraping with Python

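This is a basic example of web scraping in Python.
It retrieves today's four prices (open, high, low, close) from the "Kabutan" (株探) web page.
Note: BeautifulSoup must be installed separately. urllib is part of the Python 3 standard library and needs no installation.
Note: The code targets Python 3.x.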

# %reset
# get html ----------------------------------
from urllib import request

code = '4188'  # stock code to look up
url = 'https://kabutan.jp/stock/?code=' + code
html = request.urlopen(url)  # HTTP response object; BeautifulSoup can read from it directly

# Beautiful soup ----------------------------------
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

# extract rows -------------------------------------
table = soup.find_all('table', {'class': 'stock_st_table'})[0]  # first table with this class
rows = table.find_all('td')  # every cell in that table, in document order

# get parameters ------------------------------------------
# date
date = table.find("td",{'class':'stock_st_table_top'}).get_text()
# prices and times
price = {}
time = {}
# the labels on the page mean: open, high, low, close
for i, row in enumerate(rows):
    buf = row.get_text()
    if buf in ['始値', '高値', '安値', '終値']:
        key = buf
        price[key] = rows[i+1].get_text()       # the cell right after the label
        time[key] = rows[i+3].get_text()[1:-1]  # drop the first and last characters (presumably surrounding brackets)
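
The values stored in `price` are strings exactly as they appear on the page. The following is a minimal sketch for converting them to numbers, assuming the page formats prices with comma thousands separators (for example `1,012.5`). The helper name `parse_price` and the sample values are hypothetical and do not come from the original article.

```python
from typing import Optional


def parse_price(text: str) -> Optional[float]:
    """Convert a scraped price string such as '1,012.5' to a float.

    Returns None when the cell does not hold a number (for example a
    placeholder shown before trading starts). The comma-separated
    format is an assumption about the page, not a confirmed fact.
    """
    cleaned = text.strip().replace(',', '')
    try:
        return float(cleaned)
    except ValueError:
        return None


# Hypothetical values standing in for the scraped `price` dictionary.
sample_price = {'始値': '1,012.5', '高値': '1,020', '安値': '998.1', '終値': '－'}
numeric_price = {key: parse_price(value) for key, value in sample_price.items()}
print(numeric_price)
```

Returning `None` instead of raising keeps the loop simple when a cell holds something other than a number. Whether that is the right choice depends on how the results are used afterwards.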

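Note: Check the page's HTML source as needed. In Chrome, right-click and choose "View page source".
Note: This code does not follow any particular coding convention. Apologies to readers who care about style guides.
Note: Scraping puts load on the site being scraped, so take care not to overdo it.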
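
One way to keep the load on the site low when fetching several stock codes is to pause between requests. The sketch below is only an illustration under stated assumptions: the names `fetch_html` and `fetch_politely` and the three-second default delay are hypothetical choices, not part of the original article. The fetch function is passed in as a parameter so the pacing logic can be tried without touching the network.

```python
import time
from typing import Callable, Dict, Iterable
from urllib import request


def fetch_html(code: str) -> bytes:
    """Download the Kabutan page for one stock code (same URL pattern as the article)."""
    url = 'https://kabutan.jp/stock/?code=' + code
    with request.urlopen(url) as response:
        return response.read()


def fetch_politely(
    codes: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_html,
    delay_seconds: float = 3.0,  # hypothetical default; pick a value suited to the site
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, bytes]:
    """Fetch each code in turn, sleeping between consecutive requests."""
    pages: Dict[str, bytes] = {}
    for index, code in enumerate(codes):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        pages[code] = fetch(code)
    return pages
```

A call such as `fetch_politely(['4188'])` would return a dictionary mapping each code to the raw page bytes, which can then be passed to `BeautifulSoup(..., 'html.parser')` as in the code above.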

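Author and Source

More information on this topic (Python スクレイピングはじめの一歩) can be found at https://qiita.com/yoinhu/items/4d1c31874cf4c97c4c37

Attribution: information about the original author is available at the original URL. Copyright belongs to the original author.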
