phoenixクエリー最適化40秒から0.02秒
24073 ワード
phoenix 5.0
データ量は多くなく、100万で、テストは一般的に十分です.
0: jdbc:phoenix:192.168.199.154> select count(1) from T_EXTENSION_ALL_DATAS_SHOW;
+-----------+
| COUNT(1) |
+-----------+
| 999999 |
+-----------+
1 row selected (8.714 seconds)
ここでrowkey、メインキーは、CONTRAINT PK PRIMARY KEY(SHOW_DATE,SEQ_ID,EMAIL)
0: jdbc:phoenix:192.168.199.154> select * from T_EXTENSION_ALL_DATAS_SHOW limit 2;
+-------------+---------+------------+-------------+-----------+---------------+----------+------------+-------+----------------------+
| SHOW_DATE | SEQ_ID | EMAIL | TIME_SPEND | CAM_SITE | TOKEN_EARNED | REVENUE | TIPS_SENT | TOY | CREATED_DATE |
+-------------+---------+------------+-------------+-----------+---------------+----------+------------+-------+----------------------+
| 2018-11-24 | 1 | [email protected] | 65 | cam4 | 20.5 | 200.5 | 21 | ambi | 2018-11-24 15:22:40 |
| 2018-11-24 | 2 | [email protected] | 65 | cam4 | 20.5 | 200.5 | 21 | ambi | 2018-11-24 15:22:40 |
+-------------+---------+------------+-------------+-----------+---------------+----------+------------+-------+----------------------+
2 rows selected (0.273 seconds)
--クエリー条件は日付のみで、最大IDをクエリーします.まだスピードが速いのが見えます.
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' order by seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 999999 |
+---------+
1 row selected (0.032 seconds)
0: jdbc:phoenix:192.168.199.154> explain select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' order by seq_id desc limit 1;
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK 1 ROWS 715 BYTES SERIAL 1-WAY REVERSE RANGE SCAN OVER T_EXTENSION_ALL_DATAS_SHOW ['2018-11-24'] | 715 | 1 | 0 |
| SERVER FILTER BY FIRST KEY ONLY | 715 | 1 | 0 |
| SERVER 1 ROW LIMIT | 715 | 1 | 0 |
| CLIENT 1 ROW LIMIT | 715 | 1 | 0 |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.064 seconds)
--条件をクエリーし、emailを追加します.ダイレクトタイムアウト、エラー.
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]' order by seq_id desc limit 1;
Error: org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=16, exceptions:
Sat Nov 24 15:43:47 CST 2018, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60121
--emailにグローバル2次インデックスを追加し、クエリーします.2次インデックスを歩くと、速度が速くなります.0.059秒
0: jdbc:phoenix:192.168.199.154> CREATE INDEX IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL ON T_EXTENSION_ALL_DATAS_SHOW(EMAIL);
999,999 rows affected (56.13 seconds)
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]' order by seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 45555 |
+---------+
1 row selected (0.059 seconds)
0: jdbc:phoenix:192.168.199.154>
--ウェブサイトの検索条件に変更するには57秒かかります!
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' order by seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 499999 |
+---------+
1 row selected (57.825 seconds)
0: jdbc:phoenix:192.168.199.154> explain select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' order by seq_id desc limit 1;
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 2-CHUNK 639317 ROWS 314572822 BYTES PARALLEL 1-WAY REVERSE RANGE SCAN OVER T_EXTENSION_ALL_DATAS_SHOW ['2018-11-24'] | 314572822 | 639317 | 1543044967313 |
| SERVER FILTER BY CAM_SITE = 'cam4' | 314572822 | 639317 | 1543044967313 |
| SERVER 1 ROW LIMIT | 314572822 | 639317 | 1543044967313 |
| CLIENT 1 ROW LIMIT | 314572822 | 639317 | 1543044967313 |
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
4 rows selected (0.069 seconds)
--いつものように、Webサイトのフィールドを作成します.二次インデックス、57秒から0.024秒に!
0: jdbc:phoenix:192.168.199.154> CREATE INDEX IDX_T_EXTENSION_ALL_DATAS_SHOW_CAM_SITE ON T_EXTENSION_ALL_DATAS_SHOW(CAM_SITE);
999,999 rows affected (59.081 seconds)
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' order by seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 499999 |
+---------+
1 row selected (0.024 seconds)
--今回、私たちはウェブサイトに加えて、おもちゃの条件を増やして、どうなりますか?
--日付がRANGE SCANになっているのが見えますが、サイトもおもちゃもFILTERなので特に遅いです.
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' and toy='ambi' order by seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 499999 |
+---------+
1 row selected (56.94 seconds)
0: jdbc:phoenix:192.168.199.154> explain select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' and toy='ambi' order by seq_id desc limit 1;
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 2-CHUNK 639317 ROWS 314572822 BYTES PARALLEL 1-WAY REVERSE RANGE SCAN OVER T_EXTENSION_ALL_DATAS_SHOW ['2018-11-24'] | 314572822 | 639317 | 1543044967313 |
| SERVER FILTER BY (CAM_SITE = 'cam4' AND TOY = 'ambi') | 314572822 | 639317 | 1543044967313 |
| SERVER 1 ROW LIMIT | 314572822 | 639317 | 1543044967313 |
| CLIENT 1 ROW LIMIT | 314572822 | 639317 | 1543044967313 |
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
4 rows selected (0.029 seconds)
0: jdbc:phoenix:192.168.199.154>
--クエリの要件を満たすには、2次インデックスの組み合わせを作成する必要があります.速度はまた0.039秒に上昇した.ははは
0: jdbc:phoenix:192.168.199.154> CREATE INDEX IDX_T_EXTENSION_ALL_DATAS_SHOW_CAM_SITE_TOY ON T_EXTENSION_ALL_DATAS_SHOW(CAM_SITE,TOY);
999,999 rows affected (56.776 seconds)
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' and toy='ambi' order by seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 499999 |
+---------+
1 row selected (0.039 seconds)
0: jdbc:phoenix:192.168.199.154> explain select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' and toy='ambi' order by seq_id desc limit 1;
+-----------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+-----------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK 1 ROWS 73 BYTES SERIAL 1-WAY REVERSE RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_CAM_SITE_TOY ['cam4','ambi','2018-11-24'] | 73 | 1 | 0 |
| SERVER FILTER BY FIRST KEY ONLY | 73 | 1 | 0 |
| SERVER 1 ROW LIMIT | 73 | 1 | 0 |
| CLIENT 1 ROW LIMIT | 73 | 1 | 0 |
+-----------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.033 seconds)
この时、私は自分に闻いて、机能のページは1つのクエリーの条件を多くして、多くいくつかの2级のインデックスを创立します.速度の要件を満たすことができます.
例えばcam_siteもtoyも、必須項目ではありません.
ユーザー
cam_のみ入力可能site、2次インデックスが1つ必要です.
toyのみを入力する場合は、2次インデックスが1つ必要です.
2つを同時に入力する場合は、2次インデックスが1つ必要です.
簡単な2つの条件で、3つのインデックスが必要です.
例えばページには6つのクエリー条件があります!!どのくらいの2級のインデックスが必要です!!!何倍成長?気が狂った
同じテーブルでは、インデックス数は10を超えてはいけません.インデックステーブルが多ければ多いほど、データの挿入が遅くなります.
この時、実は技術より考え方が重要です.
1、需要の面から(ビッグデータクエリー、多すぎる条件のクエリーに合わない).無意味な検索条件で、すべてPKを落とす.
最も一般的な:時間(yyyy-MM-dd)..
2、テーブル設計の面からrow keyは連合プライマリキーであってもよい.これを利用して、2次インデックスの数を減らすことができます.例えばshow_dateはプライマリ・キーです.
3、ロジックの面から、例えば私はこのいくつかのクエリー条件が必要で、いくつかの2級インデックスが必要ですか??
時間、メールボックス、ウェブサイト、おもちゃ
実際、私は2級インデックスを4つだけで十分です.
時間が経つとrowkeyは言わない.
ユーザーが時間、メールボックス、おもちゃを入力したら、どう処理しますか?時間も41秒かかりました.
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]' and toy='ambi' order by seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 45555 |
+---------+
1 row selected (41.395 seconds)
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]';
+---------+
| SEQ_ID |
+---------+
| 45555 |
+---------+
1 row selected (0.035 seconds)
0: jdbc:phoenix:192.168.199.154> select t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 45555 |
+---------+
1 row selected (40.367 seconds)
0: jdbc:phoenix:192.168.199.154> explain select t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY REVERSE RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_TOY ['ambi','2018-11-24'] | null | null | null |
| SERVER FILTER BY FIRST KEY ONLY AND "EMAIL" = '[email protected]' | null | null | null |
| SERVER 1 ROW LIMIT | null | null | null |
| CLIENT 1 ROW LIMIT | null | null | null |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.04 seconds)
実行計画から見ると、email 2級インデックスを改造する必要があると思います.
0: jdbc:phoenix:192.168.199.154> drop index IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL on T_EXTENSION_ALL_DATAS_SHOW;
No rows affected (2.275 seconds)
0: jdbc:phoenix:192.168.199.154> CREATE INDEX IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL ON T_EXTENSION_ALL_DATAS_SHOW(EMAIL) INCLUDE(CAM_SITE,TOY);
999,999 rows affected (76.893 seconds)
0: jdbc:phoenix:192.168.199.154> select t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 45555 |
+---------+
1 row selected (40.06 seconds)
0: jdbc:phoenix:192.168.199.154> explain select t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY REVERSE RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_TOY ['ambi','2018-11-24'] | null | null | null |
| SERVER FILTER BY FIRST KEY ONLY AND "EMAIL" = '[email protected]' | null | null | null |
| SERVER 1 ROW LIMIT | null | null | null |
| CLIENT 1 ROW LIMIT | null | null | null |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.034 seconds)
0: jdbc:phoenix:192.168.199.154> select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]';
+---------+-------+
| SEQ_ID | TOY |
+---------+-------+
| 45555 | ambi |
+---------+-------+
1 row selected (0.07 seconds)
実行の結果は、まだ理想的ではありません.完全に40秒かかり、実行計画にも何の変化もありません.~~~~(>_
しかし、サブクエリは本当にとても速くて、どうして1阶を包んで40秒遅くなりました???
仕方がない:Hint
0: jdbc:phoenix:192.168.199.154> select /*+ INDEX(T_EXTENSION_ALL_DATAS_SHOW IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL) */ t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------+
| SEQ_ID |
+---------+
| 45555 |
+---------+
1 row selected (0.017 seconds)
0: jdbc:phoenix:192.168.199.154> explain select /*+ INDEX(T_EXTENSION_ALL_DATAS_SHOW IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL) */ t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 1-CHUNK 1182604 ROWS 314572800 BYTES PARALLEL 1-WAY REVERSE RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL ['[email protected]','2018-11-24'] | 314572800 | 1182604 | 1543050902166 |
| SERVER FILTER BY "TOY" = 'ambi' | 314572800 | 1182604 | 1543050902166 |
| SERVER 1 ROW LIMIT | 314572800 | 1182604 | 1543050902166 |
| CLIENT 1 ROW LIMIT | 314572800 | 1182604 | 1543050902166 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
4 rows selected (0.048 seconds)
ははは、よかった、とても完璧です.
phoenixは、通常、最適な2次インデックスを自動的に選択しますが、賢くない場合があります.
このときは、どの2級インデックスを使うべきかを教えてあげる必要があります!!
方法は簡単で、論理的な角度からメールボックスが最も細粒度です.クエリー条件の場合、最も細かい粒度があります.
まず、最細粒度フィールドを2次インデックスクエリの結果に移動します.(最細粒度条件の濾過を行ったことがありますが、この結果セットはかなり小さくなりました!)
結果セットをおもちゃ名でフィルタします.
2回調べると、同じ速度で速いという意味です.ははは
注意:2級インデックスを歩くと、フィルタよりも速くなるとは限らない.(結果セットが小さいときはfilterが優勢!)
もう一つの方法は、1秒以上もかかります.この方式では、メールボックスや時間が2級インデックスになり、おもちゃも2級インデックスになります.まるで大きな曲がり角を許したようで、ははは.
0: jdbc:phoenix:192.168.199.154> select t1.seq_id from T_EXTENSION_ALL_DATAS_SHOW t1 inner join (select show_date,email,seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t2 on(t1.show_date=t2.show_date and t1.seq_id= t2.seq_id) where t1.toy='ambi' order by t1.seq_id desc limit 1;
+------------+
| T1.SEQ_ID |
+------------+
| 45555 |
+------------+
1 row selected (1.171 seconds)
0: jdbc:phoenix:192.168.199.154> explain select t1.seq_id from T_EXTENSION_ALL_DATAS_SHOW t1 inner join (select show_date,email,seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t2 on(t1.show_date=t2.show_date and t1.seq_id= t2.seq_id) where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| PLAN | EST_BYTES_READ | EST_ROWS_READ | EST_INFO_TS |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_TOY ['ambi'] | 314572800 | 1182604 | 1543050902166 |
| SERVER FILTER BY FIRST KEY ONLY | 314572800 | 1182604 | 1543050902166 |
| SERVER TOP 1 ROW SORTED BY ["T1.:SEQ_ID" DESC] | 314572800 | 1182604 | 1543050902166 |
| CLIENT MERGE SORT | 314572800 | 1182604 | 1543050902166 |
| CLIENT LIMIT 1 | 314572800 | 1182604 | 1543050902166 |
| PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE) | 314572800 | 1182604 | 1543050902166 |
| CLIENT 1-CHUNK 1182604 ROWS 314572800 BYTES PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL ['[email protected]','2018-11-24'] | 314572800 | 1182604 | 1543050902166 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
7 rows selected (0.051 seconds)
phoenixクエリの最適化については、今日ここまでです.END