PostgreSQL 11-クエリー新規データ-1つのテーブルが存在し、別のテーブルが存在しない方法の比較
8552 ワード
10月と8月の新しいデータをクエリーしてパフォーマンスをテストします.すなわち、あるテーブルに存在し、別のテーブルに存在しないデータをどのようにクエリーするかということです.
結論(億級データ)
( )
select * from table1 A
where A.ID not in (select ID from table2)
( )
select * from table1 A
where not exists (select ID from table2 B where A.ID=B.ID)
( )
select * from table1 A
left join table2 B on A.ID=B.ID where B.ID is null
( )
explain select * from table1 A where (select count(1) as num from table2 B where A.ID = B.ID) = 0
結論(億級データ)
方法2の効率が最も高い
オープンパラメータ比較前後変化enable_partitionwise_join = on enable_partitionwise_aggregate = on
Gather (cost=10572265.02…34584444.75 rows=79739540 width=818) Gather (cost=650912.31…34047685.92 rows=79739540 width=818) , ,
------------------------------------------------------------------------------------------------------------------------------
Gather (cost=528898.82..28729667.59 rows=79739540 width=409)
Workers Planned: 6
-> Parallel Append (cost=527898.82..20754713.59 rows=3353954 width=409)
-> Parallel Hash Anti Join (cost=528458.45..1297565.86 rows=209884 width=409)
Hash Cond: (a_3.ID = b_3.ID)
-> Parallel Seq Scan on table1_p4 a_3 (cost=0.00..571000.04 rows=1663104 width=409)
-> Parallel Hash (cost=498939.20..498939.20 rows=1453220 width=33)
-> Parallel Seq Scan on table2_p4 b_3 (cost=0.00..498939.20 rows=1453220 width=33)
-> Parallel Hash Anti Join (cost=528067.24..1296744.17 rows=210030 width=409)
Hash Cond: (a_9.ID = b_9.ID)
-> Parallel Seq Scan on table1_p10 a_9 (cost=0.00..570680.75 rows=1662175 width=409)
-> Parallel Hash (cost=498570.44..498570.44 rows=1452144 width=33)
-> Parallel Seq Scan on table2_p10 b_9 (cost=0.00..498570.44 rows=1452144 width=33)
-> Parallel Hash Anti Join (cost=527991.54..1296353.87 rows=209557 width=409)
Hash Cond: (a_11.ID = b_11.ID)
-> Parallel Seq Scan on table1_p12 a_11 (cost=0.00..570445.92 rows=1661492 width=409)
-> Parallel Hash (cost=498498.35..498498.35 rows=1451935 width=33)
-> Parallel Seq Scan on table2_p12 b_11 (cost=0.00..498498.35 rows=1451935 width=33)
:
Gather (cost=595528.82..30861735.59 rows=79739540 width=818)
Workers Planned: 6
-> Parallel Append (cost=594528.82..22886781.59 rows=3353954 width=818)
-> Parallel Hash Anti Join (cost=595158.45..1430965.86 rows=209884 width=818)
Hash Cond: (a_3.ID = b_3.ID)
-> Parallel Seq Scan on table1_p4 a_3 (cost=0.00..571000.04 rows=1663104 width=409)
-> Parallel Hash (cost=498939.20..498939.20 rows=1453220 width=409)
-> Parallel Seq Scan on table2_p4 b_3 (cost=0.00..498939.20 rows=1453220 width=409)
。。。。。。。。。。
-> Parallel Hash Anti Join (cost=594133.04..1428563.89 rows=209659 width=818)
Hash Cond: (a_7.ID = b_7.ID)
-> Parallel Seq Scan on table1_p8 a_7 (cost=0.00..570061.72 rows=1660372 width=409)
-> Parallel Hash (cost=498079.13..498079.13 rows=1450713 width=409)
-> Parallel Seq Scan on table2_p8 b_7 (cost=0.00..498079.13 rows=1450713 width=409)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..21918498515.70 rows=797396 width=409)
-> Seq Scan on table1_p1 a (cost=0.00..1369544385.59 rows=49824 width=409)
Filter: ((SubPlan 1) = 0)
SubPlan 1
-> Aggregate (cost=137.36..137.37 rows=1 width=8)
-> Append (cost=0.56..137.32 rows=16 width=0)
-> Index Only Scan using table2_p1_ID_idx on table2_p1 b (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p2_ID_idx on table2_p2 b_1 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p3_ID_idx on table2_p3 b_2 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p4_ID_idx on table2_p4 b_3 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p5_ID_idx on table2_p5 b_4 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p6_ID_idx on table2_p6 b_5 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p7_ID_idx on table2_p7 b_6 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p8_ID_idx on table2_p8 b_7 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p9_ID_idx on table2_p9 b_8 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p10_ID_idx on table2_p10 b_9 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p11_ID_idx on table2_p11 b_10 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p12_ID_idx on table2_p12 b_11 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p13_ID_idx on table2_p13 b_12 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p14_ID_idx on table2_p14 b_13 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p15_ID_idx on table2_p15 b_14 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p16_ID_idx on table2_p16 b_15 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Seq Scan on table1_p2 a_1 (cost=0.00..1369703814.29 rows=49830 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p3 a_2 (cost=0.00..1369605271.04 rows=49826 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p4 a_3 (cost=0.00..1371442955.44 rows=49893 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p5 a_4 (cost=0.00..1369648013.99 rows=49828 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p6 a_5 (cost=0.00..1369593313.76 rows=49826 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p7 a_6 (cost=0.00..1369871350.56 rows=49836 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p8 a_7 (cost=0.00..1369190070.50 rows=49811 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p9 a_8 (cost=0.00..1369577783.54 rows=49825 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p10 a_9 (cost=0.00..1370676463.24 rows=49865 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p11 a_10 (cost=0.00..1370035452.26 rows=49842 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p12 a_11 (cost=0.00..1370113241.76 rows=49845 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p13 a_12 (cost=0.00..1369706288.17 rows=49830 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p14 a_13 (cost=0.00..1370079156.90 rows=49844 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p15 a_14 (cost=0.00..1369817062.47 rows=49834 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p16 a_15 (cost=0.00..1369889905.19 rows=49837 width=409)
Filter: ((SubPlan 1) = 0)
(68 rows)
, ,
------------------------------------------------------------------------------------------------------------------------------
Gather (cost=528898.82..28729667.59 rows=79739540 width=409)
Workers Planned: 6
-> Parallel Append (cost=527898.82..20754713.59 rows=3353954 width=409)
-> Parallel Hash Anti Join (cost=528458.45..1297565.86 rows=209884 width=409)
Hash Cond: (a_3.ID = b_3.ID)
-> Parallel Seq Scan on table1_p4 a_3 (cost=0.00..571000.04 rows=1663104 width=409)
-> Parallel Hash (cost=498939.20..498939.20 rows=1453220 width=33)
-> Parallel Seq Scan on table2_p4 b_3 (cost=0.00..498939.20 rows=1453220 width=33)
-> Parallel Hash Anti Join (cost=528067.24..1296744.17 rows=210030 width=409)
Hash Cond: (a_9.ID = b_9.ID)
-> Parallel Seq Scan on table1_p10 a_9 (cost=0.00..570680.75 rows=1662175 width=409)
-> Parallel Hash (cost=498570.44..498570.44 rows=1452144 width=33)
-> Parallel Seq Scan on table2_p10 b_9 (cost=0.00..498570.44 rows=1452144 width=33)
-> Parallel Hash Anti Join (cost=527991.54..1296353.87 rows=209557 width=409)
Hash Cond: (a_11.ID = b_11.ID)
-> Parallel Seq Scan on table1_p12 a_11 (cost=0.00..570445.92 rows=1661492 width=409)
-> Parallel Hash (cost=498498.35..498498.35 rows=1451935 width=33)
-> Parallel Seq Scan on table2_p12 b_11 (cost=0.00..498498.35 rows=1451935 width=33)
:
Gather (cost=595528.82..30861735.59 rows=79739540 width=818)
Workers Planned: 6
-> Parallel Append (cost=594528.82..22886781.59 rows=3353954 width=818)
-> Parallel Hash Anti Join (cost=595158.45..1430965.86 rows=209884 width=818)
Hash Cond: (a_3.ID = b_3.ID)
-> Parallel Seq Scan on table1_p4 a_3 (cost=0.00..571000.04 rows=1663104 width=409)
-> Parallel Hash (cost=498939.20..498939.20 rows=1453220 width=409)
-> Parallel Seq Scan on table2_p4 b_3 (cost=0.00..498939.20 rows=1453220 width=409)
。。。。。。。。。。
-> Parallel Hash Anti Join (cost=594133.04..1428563.89 rows=209659 width=818)
Hash Cond: (a_7.ID = b_7.ID)
-> Parallel Seq Scan on table1_p8 a_7 (cost=0.00..570061.72 rows=1660372 width=409)
-> Parallel Hash (cost=498079.13..498079.13 rows=1450713 width=409)
-> Parallel Seq Scan on table2_p8 b_7 (cost=0.00..498079.13 rows=1450713 width=409)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..21918498515.70 rows=797396 width=409)
-> Seq Scan on table1_p1 a (cost=0.00..1369544385.59 rows=49824 width=409)
Filter: ((SubPlan 1) = 0)
SubPlan 1
-> Aggregate (cost=137.36..137.37 rows=1 width=8)
-> Append (cost=0.56..137.32 rows=16 width=0)
-> Index Only Scan using table2_p1_ID_idx on table2_p1 b (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p2_ID_idx on table2_p2 b_1 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p3_ID_idx on table2_p3 b_2 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p4_ID_idx on table2_p4 b_3 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p5_ID_idx on table2_p5 b_4 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p6_ID_idx on table2_p6 b_5 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p7_ID_idx on table2_p7 b_6 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p8_ID_idx on table2_p8 b_7 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p9_ID_idx on table2_p9 b_8 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p10_ID_idx on table2_p10 b_9 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p11_ID_idx on table2_p11 b_10 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p12_ID_idx on table2_p12 b_11 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p13_ID_idx on table2_p13 b_12 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p14_ID_idx on table2_p14 b_13 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p15_ID_idx on table2_p15 b_14 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Index Only Scan using table2_p16_ID_idx on table2_p16 b_15 (cost=0.56..8.58 rows=1 width=0)
Index Cond: (ID = a.ID)
-> Seq Scan on table1_p2 a_1 (cost=0.00..1369703814.29 rows=49830 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p3 a_2 (cost=0.00..1369605271.04 rows=49826 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p4 a_3 (cost=0.00..1371442955.44 rows=49893 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p5 a_4 (cost=0.00..1369648013.99 rows=49828 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p6 a_5 (cost=0.00..1369593313.76 rows=49826 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p7 a_6 (cost=0.00..1369871350.56 rows=49836 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p8 a_7 (cost=0.00..1369190070.50 rows=49811 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p9 a_8 (cost=0.00..1369577783.54 rows=49825 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p10 a_9 (cost=0.00..1370676463.24 rows=49865 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p11 a_10 (cost=0.00..1370035452.26 rows=49842 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p12 a_11 (cost=0.00..1370113241.76 rows=49845 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p13 a_12 (cost=0.00..1369706288.17 rows=49830 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p14 a_13 (cost=0.00..1370079156.90 rows=49844 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p15 a_14 (cost=0.00..1369817062.47 rows=49834 width=409)
Filter: ((SubPlan 1) = 0)
-> Seq Scan on table1_p16 a_15 (cost=0.00..1369889905.19 rows=49837 width=409)
Filter: ((SubPlan 1) = 0)
(68 rows)