Python (detecting the encoding of a given URL)
import urllib.request

import chardet


def main():
    url = input('URL:')
    response = urllib.request.urlopen(url)
    html = response.read()
    # Detect the encoding of the raw page bytes
    encode = chardet.detect(html)['encoding']
    # GBK is a superset of GB2312, so report the broader codec
    if encode == 'GB2312':
        encode = 'GBK'
    print('Encoding: %s' % encode)


if __name__ == '__main__':
    main()
Output
>>> URL:https://www.baidu.com/
Encoding: ascii
>>> URL:https://www.python.org/
Encoding: utf-8
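
Once an encoding name has been detected, the usual next step is to decode the page bytes with it. The following is a minimal sketch of that step, not part of the original script: the helper names `normalize_encoding` and `decode_page` and the UTF-8 fallback are assumptions made for illustration. It applies the same GB2312 to GBK substitution as the script and also covers the case where detection reports no encoding at all.

```python
from typing import Optional


def normalize_encoding(name: Optional[str], default: str = "utf-8") -> str:
    """Map a detected encoding name to the codec used for decoding.

    Hypothetical helper: the fallback to `default` is an assumption,
    not behavior taken from the original script.
    """
    if not name:
        # Detection can fail and report no encoding; use the fallback.
        return default
    if name.upper() == "GB2312":
        # GBK is a superset of GB2312, so it is the safer codec to decode with.
        return "GBK"
    return name


def decode_page(data: bytes, detected: Optional[str]) -> str:
    """Decode raw page bytes, replacing bytes the codec cannot decode."""
    return data.decode(normalize_encoding(detected), errors="replace")
```

In the script above, this could be called as `decode_page(html, chardet.detect(html)['encoding'])` to get the page text as a `str`.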