PostgreSQL 11-クエリー新規データ-1つのテーブルが存在し、別のテーブルが存在しない方法の比較

8552 ワード

10月と8月の新しいデータをクエリーしてパフォーマンスをテストします.すなわち、あるテーブルに存在し、別のテーブルに存在しないデータをどのようにクエリーするかということです.
 ( )
select * from  table1 A 
where A.ID not in (select ID from table2)

 ( )
select * from  table1 A 
where not exists (select ID from table2 B where A.ID=B.ID)


  ( )
select * from  table1 A 
 left join table2 B on A.ID=B.ID where B.ID is null
 
  ( )

 explain select * from table1 A where (select count(1) as num from table2 B where A.ID = B.ID) = 0 

結論(億級データ)


方法2の効率が最も高い


オープンパラメータ比較前後変化enable_partitionwise_join = on enable_partitionwise_aggregate = on
Gather (cost=10572265.02…34584444.75 rows=79739540 width=818) Gather (cost=650912.31…34047685.92 rows=79739540 width=818)
 , , 
 
------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=528898.82..28729667.59 rows=79739540 width=409)
   Workers Planned: 6
   ->  Parallel Append  (cost=527898.82..20754713.59 rows=3353954 width=409)
         ->  Parallel Hash Anti Join  (cost=528458.45..1297565.86 rows=209884 width=409)
               Hash Cond: (a_3.ID = b_3.ID)
               ->  Parallel Seq Scan on table1_p4 a_3  (cost=0.00..571000.04 rows=1663104 width=409)
               ->  Parallel Hash  (cost=498939.20..498939.20 rows=1453220 width=33)
                     ->  Parallel Seq Scan on table2_p4 b_3  (cost=0.00..498939.20 rows=1453220 width=33)
         ->  Parallel Hash Anti Join  (cost=528067.24..1296744.17 rows=210030 width=409)
               Hash Cond: (a_9.ID = b_9.ID)
               ->  Parallel Seq Scan on table1_p10 a_9  (cost=0.00..570680.75 rows=1662175 width=409)
               ->  Parallel Hash  (cost=498570.44..498570.44 rows=1452144 width=33)
                     ->  Parallel Seq Scan on table2_p10 b_9  (cost=0.00..498570.44 rows=1452144 width=33)
         ->  Parallel Hash Anti Join  (cost=527991.54..1296353.87 rows=209557 width=409)
               Hash Cond: (a_11.ID = b_11.ID)
               ->  Parallel Seq Scan on table1_p12 a_11  (cost=0.00..570445.92 rows=1661492 width=409)
               ->  Parallel Hash  (cost=498498.35..498498.35 rows=1451935 width=33)
                     ->  Parallel Seq Scan on table2_p12 b_11  (cost=0.00..498498.35 rows=1451935 width=33)


 :
 Gather  (cost=595528.82..30861735.59 rows=79739540 width=818)
   Workers Planned: 6
   ->  Parallel Append  (cost=594528.82..22886781.59 rows=3353954 width=818)
         ->  Parallel Hash Anti Join  (cost=595158.45..1430965.86 rows=209884 width=818)
               Hash Cond: (a_3.ID = b_3.ID)
               ->  Parallel Seq Scan on table1_p4 a_3  (cost=0.00..571000.04 rows=1663104 width=409)
               ->  Parallel Hash  (cost=498939.20..498939.20 rows=1453220 width=409)
                     ->  Parallel Seq Scan on table2_p4 b_3  (cost=0.00..498939.20 rows=1453220 width=409)
       。。。。。。。。。。
         ->  Parallel Hash Anti Join  (cost=594133.04..1428563.89 rows=209659 width=818)
               Hash Cond: (a_7.ID = b_7.ID)
               ->  Parallel Seq Scan on table1_p8 a_7  (cost=0.00..570061.72 rows=1660372 width=409)
               ->  Parallel Hash  (cost=498079.13..498079.13 rows=1450713 width=409)
                     ->  Parallel Seq Scan on table2_p8 b_7  (cost=0.00..498079.13 rows=1450713 width=409)



 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..21918498515.70 rows=797396 width=409)
   ->  Seq Scan on table1_p1 a  (cost=0.00..1369544385.59 rows=49824 width=409)
         Filter: ((SubPlan 1) = 0)
         SubPlan 1
           ->  Aggregate  (cost=137.36..137.37 rows=1 width=8)
                 ->  Append  (cost=0.56..137.32 rows=16 width=0)
                       ->  Index Only Scan using table2_p1_ID_idx on table2_p1 b  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p2_ID_idx on table2_p2 b_1  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p3_ID_idx on table2_p3 b_2  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p4_ID_idx on table2_p4 b_3  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p5_ID_idx on table2_p5 b_4  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p6_ID_idx on table2_p6 b_5  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p7_ID_idx on table2_p7 b_6  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p8_ID_idx on table2_p8 b_7  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p9_ID_idx on table2_p9 b_8  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p10_ID_idx on table2_p10 b_9  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p11_ID_idx on table2_p11 b_10  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p12_ID_idx on table2_p12 b_11  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p13_ID_idx on table2_p13 b_12  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p14_ID_idx on table2_p14 b_13  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p15_ID_idx on table2_p15 b_14  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
                       ->  Index Only Scan using table2_p16_ID_idx on table2_p16 b_15  (cost=0.56..8.58 rows=1 width=0)
                             Index Cond: (ID = a.ID)
   ->  Seq Scan on table1_p2 a_1  (cost=0.00..1369703814.29 rows=49830 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p3 a_2  (cost=0.00..1369605271.04 rows=49826 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p4 a_3  (cost=0.00..1371442955.44 rows=49893 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p5 a_4  (cost=0.00..1369648013.99 rows=49828 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p6 a_5  (cost=0.00..1369593313.76 rows=49826 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p7 a_6  (cost=0.00..1369871350.56 rows=49836 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p8 a_7  (cost=0.00..1369190070.50 rows=49811 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p9 a_8  (cost=0.00..1369577783.54 rows=49825 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p10 a_9  (cost=0.00..1370676463.24 rows=49865 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p11 a_10  (cost=0.00..1370035452.26 rows=49842 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p12 a_11  (cost=0.00..1370113241.76 rows=49845 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p13 a_12  (cost=0.00..1369706288.17 rows=49830 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p14 a_13  (cost=0.00..1370079156.90 rows=49844 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p15 a_14  (cost=0.00..1369817062.47 rows=49834 width=409)
         Filter: ((SubPlan 1) = 0)
   ->  Seq Scan on table1_p16 a_15  (cost=0.00..1369889905.19 rows=49837 width=409)
         Filter: ((SubPlan 1) = 0)
(68 rows)