Scraping the IDWR preliminary data: influenza reports per sentinel site, by prefecture


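The National Institute of Infectious Diseases (NIID, 国立感染症研究所) publishes the same data as a CSV file, so this post finds that CSV link by scraping the NIID data page and loads the file with pandas.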

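from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# NIID page that links to the published data files
url = "https://www.niid.go.jp/niid/ja/data.html"

# Send a browser-like User-Agent with the request
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
}

# A timeout keeps the script from hanging if the server does not respond
r = requests.get(url, headers=headers, timeout=30)
r.raise_for_status()

soup = BeautifulSoup(r.content, "html.parser")

# Pick the first link whose href ends with "-teiten.csv"
tag = soup.select_one(
    'div.leading-0 > table > tbody > tr > td > p.body1 > a[href$="-teiten.csv"]'
)

# select_one returns None when nothing matches, e.g. after a page redesign
if tag is None:
    raise RuntimeError("CSV link not found; the page layout may have changed")

# Resolve the (possibly relative) href against the page URL
link = urljoin(url, tag.get("href"))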
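The link lookup can be tried offline before pointing it at the live site. The following is a minimal sketch under stated assumptions: `SAMPLE_HTML` is a made-up fragment that only imitates the nesting the selector expects, the `/example/...` paths are hypothetical, and `find_csv_link` is a helper name introduced here for illustration. It is not taken from the original post.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.niid.go.jp/niid/ja/data.html"

# Hypothetical HTML fragment (NOT copied from the real page); it only
# reproduces the element nesting that the selector below relies on.
SAMPLE_HTML = """
<div class="leading-0">
  <table><tbody><tr><td>
    <p class="body1"><a href="/example/sample-other.csv">other</a></p>
    <p class="body1"><a href="/example/sample-teiten.csv">teiten</a></p>
  </td></tr></tbody></table>
</div>
"""

SELECTOR = 'div.leading-0 > table > tbody > tr > td > p.body1 > a[href$="-teiten.csv"]'


def find_csv_link(html, base_url):
    """Return the absolute URL of the first '-teiten.csv' link, or None."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.select_one(SELECTOR)
    if tag is None:
        # No matching <a> element was found
        return None
    # urljoin turns a relative or root-relative href into an absolute URL
    return urljoin(base_url, tag["href"])


found = find_csv_link(SAMPLE_HTML, BASE_URL)
missing = find_csv_link("<p>no links here</p>", BASE_URL)
print(found)
print(missing)
```

Returning `None` instead of raising keeps the helper easy to test; the caller can decide whether a missing link is an error.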

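import pandas as pd

# Read the CSV directly from the URL found above
df = pd.read_csv(
    link,
    encoding="cp932",   # decode the file as cp932 (Windows Shift_JIS)
    skiprows=3,         # skip the three lines above the header row
    index_col=0,        # use the first column as the index
    header=0,           # first remaining line holds the column names
    usecols=[0, 1, 2],  # keep only the first three columns
    na_values="-",      # treat "-" as a missing value
)

# Drop rows whose index label is missing
df1 = df[df.index.notna()]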
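To see what each `read_csv` option does without downloading anything, the same call can be run against an in-memory file. This is a sketch only: the CSV text below is invented for illustration (three preamble lines, a header row, two data rows, and one empty row) and does not reproduce the real file's columns or values.

```python
import io

import pandas as pd

# Made-up CSV text that imitates the assumed layout; the column names
# and numbers are hypothetical and are NOT real IDWR figures.
SAMPLE_CSV = (
    "title line\n"
    "period line\n"
    "note line\n"
    "prefecture,reports,per_sentinel,extra\n"
    "総数,100,2.5,x\n"
    "北海道,10,-,y\n"
    ",,,\n"
)

# Encode as cp932 so that the encoding= argument is actually exercised
buffer = io.BytesIO(SAMPLE_CSV.encode("cp932"))

df = pd.read_csv(
    buffer,
    encoding="cp932",
    skiprows=3,         # drops the three preamble lines
    index_col=0,        # "prefecture" becomes the index
    header=0,
    usecols=[0, 1, 2],  # the "extra" column is not loaded
    na_values="-",      # "-" is read as NaN
)

# The all-empty row has a missing index label, so it is filtered out here
df1 = df[df.index.notna()]
print(df1)
```

Because one cell in each numeric column is missing, pandas stores both columns as floating point; that is expected behavior rather than a parsing error.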

Author And Source

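This article (Scraping the IDWR preliminary data: influenza reports per sentinel site, by prefecture) was originally published at https://qiita.com/barobaro/items/4178da6daba1732df307

Author attribution: information about the original author is available at the original URL. Copyright belongs to the original author.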
