Python web scraping - BeautifulSoup (2): child and descendant nodes (.children, .descendants) and parent nodes (.parents)


Study notes

3.1 Child nodes and descendant nodes

soup.body.h1  # selects the first h1 tag under body; that h1 is a descendant of body

By the same logic, div.find_all('img') finds every img tag inside that div, at any depth.
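As a small offline illustration of these two lookups, the sketch below parses a made-up HTML string with the standard-library html.parser backend. The markup, the id "gallery" and the image file names are assumptions for the example and are not taken from the page used later in these notes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration
html = """
<html><body>
<h1>Gifts</h1>
<div id="gallery">
  <p><img src="a.jpg"></p>
  <img src="b.jpg">
</div>
<img src="outside.jpg">
</body></html>
"""

soup = BeautifulSoup(html, 'html.parser')

# Attribute-style navigation returns the first matching tag below body
heading = soup.body.h1
print(heading.get_text())

# find_all on a tag searches all of its descendants, not just direct children
div = soup.find('div', {'id': 'gallery'})
img_sources = [img['src'] for img in div.find_all('img')]
print(img_sources)
```

The image outside the div is not returned, while the one nested inside a p tag is, because find_all searches the whole subtree of the tag it is called on.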
The following code compares .children with .descendants.

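from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen('http://www.pythonscraping.com/pages/page3.html')
soup = BeautifulSoup(html, 'lxml')
table = soup.find('table', {'id': 'giftList'})

# Count the direct children of the table
child_count = 0
for child in table.children:
    print(child)
    child_count += 1
print(child_count)

# Count every node nested under the table, at any depth
descendant_count = 0
for descendant in table.descendants:
    print(descendant)
    descendant_count += 1
print(descendant_count)

The run prints 13 for the children and 86 for the descendants. The first part of the descendants output looks like this: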

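#===== first child of soup.find('table',{'id':'giftList'}) =====
<tr><th>
Item Title
</th><th>
Description
</th><th>
Cost
</th><th>
Image
</th></tr>
#===== end of the first child of soup.find('table',{'id':'giftList'}) =====
        #===== child of that child, i.e. a grandchild of ('table',{'id':'giftList'}) =====
<th>
Item Title
</th>
        #===== end of the grandchild of ('table',{'id':'giftList'}) =====

Item Title  #===== child of the grandchild, i.e. the text node =====

#===== the same pattern repeats =====
<th>
Description
</th>

Description

<th>
Cost
</th>

Cost
....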
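As the output shows, .children lists only the direct child nodes. .descendants lists every descendant node, including the tags nested inside each child and the text nodes inside those tags.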
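The same difference can be checked offline on a table small enough to count by hand. This is a minimal sketch: the table, its id "t" and its contents are hypothetical, and html.parser is used so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical table, written without whitespace between tags so that the
# counts are easy to verify by hand
html = ('<table id="t">'
        '<tr><th>Item</th><th>Cost</th></tr>'
        '<tr><td>Basket</td><td>$15.00</td></tr>'
        '</table>')

soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', {'id': 't'})

# .children is an iterator over direct children only: the two <tr> tags
children = list(table.children)

# .descendants walks the whole subtree depth-first: 2 <tr>, 2 <th>, 2 <td>
# and the 4 text nodes inside the cells
descendants = list(table.descendants)

print(len(children))     # 2
print(len(descendants))  # 10
```

Both attributes are iterators, so they are wrapped in list() here to be counted. When the markup contains line breaks between tags, each of those line breaks is also a text node and is counted too.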
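3.2 Parent nodes

Most of the time we look for the children or descendants of a tag, but occasionally we need to go the other way and find a tag's parent, using .parents or .parent.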
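To make the difference between the two attributes concrete, here is a minimal sketch on a made-up document (the markup and the file name x.jpg are assumptions for the example). .parent returns the single enclosing tag, while .parents iterates over all ancestors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration
html = ('<html><body><table><tr><td>'
        '<img src="x.jpg">'
        '</td></tr></table></body></html>')

soup = BeautifulSoup(html, 'html.parser')
img = soup.find('img')

# .parent is the single enclosing tag
print(img.parent.name)

# .parents iterates over every ancestor, from the nearest outward;
# the last one is the BeautifulSoup object itself, named '[document]'
ancestor_names = [ancestor.name for ancestor in img.parents]
print(ancestor_names)
```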
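from urllib.request import urlopen
from bs4 import BeautifulSoup

# page3.html contains the gift table whose image is searched for below
html = urlopen('http://www.pythonscraping.com/pages/page3.html')
soup = BeautifulSoup(html, 'lxml')
# img -> parent td -> previous sibling td (the price cell) -> its text
print(soup.find('img', {'src':'../img/gifts/img1.jpg'}).parent.previous_sibling.get_text())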
  
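Here is how the code works:

<tr>
--<td>
--<td>(3)
    --"$15.00"(4)
--<td>(2)
    --<img src="../img/gifts/img1.jpg">(1)

1. The img tag with src="../img/gifts/img1.jpg" is selected first.
2. The parent of that img tag is selected (here, a td tag).
3. The previous_sibling of that td tag is selected (here, the td tag that holds the price).
4. The text inside that tag, "$15.00", is extracted.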
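The four steps can also be tried offline. The sketch below uses a hypothetical two-cell row modelled on the tree above, with a line break deliberately placed between the cells. In that situation .previous_sibling returns the whitespace text node and not a tag, so the sketch uses find_previous_sibling('td'), which skips text nodes. This is one possible way to make the lookup less dependent on how the markup is laid out, not the only one.

```python
from bs4 import BeautifulSoup

# Hypothetical row modelled on the tree above; note the line break between
# the two cells, which the parser keeps as a text node
html = """<table><tr><td>$15.00</td>
<td><img src="../img/gifts/img1.jpg"></td></tr></table>"""

soup = BeautifulSoup(html, 'html.parser')
img = soup.find('img', {'src': '../img/gifts/img1.jpg'})   # step 1
cell = img.parent                                          # step 2

# Here previous_sibling is the whitespace text node, not the price cell
print(repr(cell.previous_sibling))

# find_previous_sibling('td') skips text nodes and returns the nearest td
price_cell = cell.find_previous_sibling('td')              # step 3
price = price_cell.get_text().strip()                      # step 4
print(price)
```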