(Python) How to extract an HTML element with XPath (the tag together with its content)

Question: (Python) How can XPath extract an HTML element, meaning the tag together with its content? Details:

<div>
   <table>
      <tr>
         <td>Row value 1</td>
         <td>Row value 2</td>
      </tr>
      <tr>
         <td>Row value 3</td>
         <td>Row value 4</td>
      </tr>
      <tr>
         <td>Row value 1</td>
         <td>Row value 1</td>
      </tr>
   </table>
</div>

I want to extract the table element, markup included, so that the result looks like this:

<table>
  <tr>
     <td>Row value 1</td>
     <td>Row value 2</td>
  </tr>
  <tr>
     <td>Row value 3</td>
     <td>Row value 4</td>
  </tr>
  <tr>
     <td>Row value 1</td>
     <td>Row value 1</td>
  </tr>
</table>

The code is as follows:

from lxml import etree

selector = etree.HTML(html)
content = selector.xpath('//div/table')[0]
print(content)
# Output: <Element table at 0x...>
# Problem: content is an Element object, not a str

Solution 1:

Use BeautifulSoup's find.
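
The answer gives no code for this approach, so here is a minimal sketch of what it might look like. It assumes the bs4 package is installed and uses the standard-library "html.parser" backend. The html string is a shortened stand-in for the markup in the question, and the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (assumption).
html = """
<div>
   <table>
      <tr><td>Row value 1</td><td>Row value 2</td></tr>
      <tr><td>Row value 3</td><td>Row value 4</td></tr>
   </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag, or None when nothing matches.
table = soup.find("table")

# str() on a Tag serializes the tag itself plus everything inside it.
table_html = str(table) if table is not None else ""
print(table_html)
```

Because find returns None when there is no match, checking for None before serializing avoids turning a missing table into the literal string "None".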

Solution 2:

from lxml import etree
from lxml.html import fromstring, tostring
# fromstring parses the markup and returns an HtmlElement
# selector = fromstring(html)

selector = etree.HTML(html)
content = selector.xpath('//div/table')[0]
print(content)
# tostring serializes the element back to HTML markup (bytes by default)
original_html = tostring(content)

Solution 3:

[div/table]

Solution 4:

from lxml import etree
div = etree.HTML(html)
table = div.xpath('//div/table')[0]
content = etree.tostring(table, pretty_print=True, method='html')  # serialized markup, returned as bytes

