phoenixクエリー最適化40秒から0.02秒

24073 ワード


phoenix 5.0
 
データ量は多くなく、100万で、テストは一般的に十分です.
0: jdbc:phoenix:192.168.199.154> select count(1) from T_EXTENSION_ALL_DATAS_SHOW;
+-----------+
| COUNT(1)  |
+-----------+
| 999999    |
+-----------+
1 row selected (8.714 seconds)

 
ここでrowkey、メインキーは、CONTRAINT PK PRIMARY KEY(SHOW_DATE,SEQ_ID,EMAIL)
0: jdbc:phoenix:192.168.199.154> select * from T_EXTENSION_ALL_DATAS_SHOW limit 2;
+-------------+---------+------------+-------------+-----------+---------------+----------+------------+-------+----------------------+
|  SHOW_DATE  | SEQ_ID  |   EMAIL    | TIME_SPEND  | CAM_SITE  | TOKEN_EARNED  | REVENUE  | TIPS_SENT  |  TOY  |     CREATED_DATE     |
+-------------+---------+------------+-------------+-----------+---------------+----------+------------+-------+----------------------+
| 2018-11-24  | 1       | [email protected]  | 65          | cam4      | 20.5          | 200.5    | 21         | ambi  | 2018-11-24 15:22:40  |
| 2018-11-24  | 2       | [email protected]  | 65          | cam4      | 20.5          | 200.5    | 21         | ambi  | 2018-11-24 15:22:40  |
+-------------+---------+------------+-------------+-----------+---------------+----------+------------+-------+----------------------+
2 rows selected (0.273 seconds)

 
--クエリー条件は日付のみで、最大IDをクエリーします.まだスピードが速いのが見えます.
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' order by seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 999999  |
+---------+
1 row selected (0.032 seconds)


0: jdbc:phoenix:192.168.199.154> explain select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' order by seq_id desc limit 1;
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                                      PLAN                                                       | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK 1 ROWS 715 BYTES SERIAL 1-WAY REVERSE RANGE SCAN OVER T_EXTENSION_ALL_DATAS_SHOW ['2018-11-24']  | 715             | 1              | 0            |
|     SERVER FILTER BY FIRST KEY ONLY                                                                             | 715             | 1              | 0            |
|     SERVER 1 ROW LIMIT                                                                                          | 715             | 1              | 0            |
| CLIENT 1 ROW LIMIT                                                                                              | 715             | 1              | 0            |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.064 seconds)

--条件をクエリーし、emailを追加します.ダイレクトタイムアウト、エラー.
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]' order by seq_id desc limit 1;
Error: org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=16, exceptions:
Sat Nov 24 15:43:47 CST 2018, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60121

 
--emailにグローバル2次インデックスを追加し、クエリーします.2次インデックスを歩くと、速度が速くなります.0.059秒
0: jdbc:phoenix:192.168.199.154> CREATE INDEX IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL ON T_EXTENSION_ALL_DATAS_SHOW(EMAIL);
999,999 rows affected (56.13 seconds)
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]' order by seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 45555   |
+---------+
1 row selected (0.059 seconds)
0: jdbc:phoenix:192.168.199.154> 

 
--ウェブサイトの検索条件に変更するには57秒かかります!
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' order by seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 499999  |
+---------+
1 row selected (57.825 seconds)

0: jdbc:phoenix:192.168.199.154> explain select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' order by seq_id desc limit 1;
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                             PLAN                                                             | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 2-CHUNK 639317 ROWS 314572822 BYTES PARALLEL 1-WAY REVERSE RANGE SCAN OVER T_EXTENSION_ALL_DATAS_SHOW ['2018-11-24']  | 314572822       | 639317         | 1543044967313  |
|     SERVER FILTER BY CAM_SITE = 'cam4'                                                                                       | 314572822       | 639317         | 1543044967313  |
|     SERVER 1 ROW LIMIT                                                                                                       | 314572822       | 639317         | 1543044967313  |
| CLIENT 1 ROW LIMIT                                                                                                           | 314572822       | 639317         | 1543044967313  |
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
4 rows selected (0.069 seconds)


 
--いつものように、Webサイトのフィールドを作成します.二次インデックス、57秒から0.024秒に!
0: jdbc:phoenix:192.168.199.154> CREATE INDEX IDX_T_EXTENSION_ALL_DATAS_SHOW_CAM_SITE ON T_EXTENSION_ALL_DATAS_SHOW(CAM_SITE);
999,999 rows affected (59.081 seconds)

0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' order by seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 499999  |
+---------+
1 row selected (0.024 seconds)

 
--今回、私たちはウェブサイトに加えて、おもちゃの条件を増やして、どうなりますか?
--日付がRANGE SCANになっているのが見えますが、サイトもおもちゃもFILTERなので特に遅いです.
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' and toy='ambi' order by seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 499999  |
+---------+
1 row selected (56.94 seconds)

0: jdbc:phoenix:192.168.199.154> explain select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' and toy='ambi' order by seq_id desc limit 1;
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                             PLAN                                                             | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 2-CHUNK 639317 ROWS 314572822 BYTES PARALLEL 1-WAY REVERSE RANGE SCAN OVER T_EXTENSION_ALL_DATAS_SHOW ['2018-11-24']  | 314572822       | 639317         | 1543044967313  |
|     SERVER FILTER BY (CAM_SITE = 'cam4' AND TOY = 'ambi')                                                                    | 314572822       | 639317         | 1543044967313  |
|     SERVER 1 ROW LIMIT                                                                                                       | 314572822       | 639317         | 1543044967313  |
| CLIENT 1 ROW LIMIT                                                                                                           | 314572822       | 639317         | 1543044967313  |
+------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
4 rows selected (0.029 seconds)
0: jdbc:phoenix:192.168.199.154> 

 
--クエリの要件を満たすには、2次インデックスの組み合わせを作成する必要があります.速度はまた0.039秒に上昇した.ははは
0: jdbc:phoenix:192.168.199.154> CREATE INDEX IDX_T_EXTENSION_ALL_DATAS_SHOW_CAM_SITE_TOY ON T_EXTENSION_ALL_DATAS_SHOW(CAM_SITE,TOY);
999,999 rows affected (56.776 seconds)

0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' and toy='ambi' order by seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 499999  |
+---------+
1 row selected (0.039 seconds)


0: jdbc:phoenix:192.168.199.154> explain select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and cam_site='cam4' and toy='ambi' order by seq_id desc limit 1;
+-----------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                                                     PLAN                                                                      | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK 1 ROWS 73 BYTES SERIAL 1-WAY REVERSE RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_CAM_SITE_TOY ['cam4','ambi','2018-11-24']  | 73              | 1              | 0            |
|     SERVER FILTER BY FIRST KEY ONLY                                                                                                           | 73              | 1              | 0            |
|     SERVER 1 ROW LIMIT                                                                                                                        | 73              | 1              | 0            |
| CLIENT 1 ROW LIMIT                                                                                                                            | 73              | 1              | 0            |
+-----------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.033 seconds)

 
この时、私は自分に闻いて、机能のページは1つのクエリーの条件を多くして、多くいくつかの2级のインデックスを创立します.速度の要件を満たすことができます.
例えばcam_siteもtoyも、必須項目ではありません.
ユーザー
cam_のみ入力可能site、2次インデックスが1つ必要です.
toyのみを入力する場合は、2次インデックスが1つ必要です.
2つを同時に入力する場合は、2次インデックスが1つ必要です.
簡単な2つの条件で、3つのインデックスが必要です.
 
例えばページには6つのクエリー条件があります!!どのくらいの2級のインデックスが必要です!!!何倍成長?気が狂った
同じテーブルでは、インデックス数は10を超えてはいけません.インデックステーブルが多ければ多いほど、データの挿入が遅くなります.
 
この時、実は技術より考え方が重要です.
1、需要の面から(ビッグデータクエリー、多すぎる条件のクエリーに合わない).無意味な検索条件で、すべてPKを落とす.
最も一般的な:時間(yyyy-MM-dd)..
2、テーブル設計の面からrow keyは連合プライマリキーであってもよい.これを利用して、2次インデックスの数を減らすことができます.例えばshow_dateはプライマリ・キーです.
3、ロジックの面から、例えば私はこのいくつかのクエリー条件が必要で、いくつかの2級インデックスが必要ですか??
時間、メールボックス、ウェブサイト、おもちゃ
実際、私は2級インデックスを4つだけで十分です.
時間が経つとrowkeyは言わない.
ユーザーが時間、メールボックス、おもちゃを入力したら、どう処理しますか?時間も41秒かかりました.
0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]' and toy='ambi' order by seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 45555   |
+---------+
1 row selected (41.395 seconds)


0: jdbc:phoenix:192.168.199.154> select seq_id from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]';
+---------+
| SEQ_ID  |
+---------+
| 45555   |
+---------+
1 row selected (0.035 seconds)


0: jdbc:phoenix:192.168.199.154> select t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 45555   |
+---------+
1 row selected (40.367 seconds)


0: jdbc:phoenix:192.168.199.154> explain select t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                                      PLAN                                                       | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY REVERSE RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_TOY ['ambi','2018-11-24']  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY AND "EMAIL" = '[email protected]'                                               | null            | null           | null         |
|     SERVER 1 ROW LIMIT                                                                                          | null            | null           | null         |
| CLIENT 1 ROW LIMIT                                                                                              | null            | null           | null         |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.04 seconds)

実行計画から見ると、email 2級インデックスを改造する必要があると思います.
0: jdbc:phoenix:192.168.199.154> drop index IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL on T_EXTENSION_ALL_DATAS_SHOW;
No rows affected (2.275 seconds)


0: jdbc:phoenix:192.168.199.154> CREATE INDEX IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL ON T_EXTENSION_ALL_DATAS_SHOW(EMAIL) INCLUDE(CAM_SITE,TOY);
999,999 rows affected (76.893 seconds)


0: jdbc:phoenix:192.168.199.154> select t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 45555   |
+---------+
1 row selected (40.06 seconds)


0: jdbc:phoenix:192.168.199.154> explain select t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
|                                                      PLAN                                                       | EST_BYTES_READ  | EST_ROWS_READ  | EST_INFO_TS  |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY REVERSE RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_TOY ['ambi','2018-11-24']  | null            | null           | null         |
|     SERVER FILTER BY FIRST KEY ONLY AND "EMAIL" = '[email protected]'                                               | null            | null           | null         |
|     SERVER 1 ROW LIMIT                                                                                          | null            | null           | null         |
| CLIENT 1 ROW LIMIT                                                                                              | null            | null           | null         |
+-----------------------------------------------------------------------------------------------------------------+-----------------+----------------+--------------+
4 rows selected (0.034 seconds)


0: jdbc:phoenix:192.168.199.154> select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]';
+---------+-------+
| SEQ_ID  |  TOY  |
+---------+-------+
| 45555   | ambi  |
+---------+-------+
1 row selected (0.07 seconds)




実行の結果は、まだ理想的ではありません.完全に40秒かかり、実行計画にも何の変化もありません.~~~~(>_
しかし、サブクエリは本当にとても速くて、どうして1阶を包んで40秒遅くなりました???
 
仕方がない:Hint
0: jdbc:phoenix:192.168.199.154> select /*+ INDEX(T_EXTENSION_ALL_DATAS_SHOW IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL) */ t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------+
| SEQ_ID  |
+---------+
| 45555   |
+---------+
1 row selected (0.017 seconds)


0: jdbc:phoenix:192.168.199.154> explain select /*+ INDEX(T_EXTENSION_ALL_DATAS_SHOW IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL) */ t1.seq_id as seq_id from (select seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t1 where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                                          PLAN                                                                           | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 1-CHUNK 1182604 ROWS 314572800 BYTES PARALLEL 1-WAY REVERSE RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL ['[email protected]','2018-11-24']  | 314572800       | 1182604        | 1543050902166  |
|     SERVER FILTER BY "TOY" = 'ambi'                                                                                                                     | 314572800       | 1182604        | 1543050902166  |
|     SERVER 1 ROW LIMIT                                                                                                                                  | 314572800       | 1182604        | 1543050902166  |
| CLIENT 1 ROW LIMIT                                                                                                                                      | 314572800       | 1182604        | 1543050902166  |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
4 rows selected (0.048 seconds)


ははは、よかった、とても完璧です.
phoenixは、通常、最適な2次インデックスを自動的に選択しますが、賢くない場合があります.
このときは、どの2級インデックスを使うべきかを教えてあげる必要があります!!
 
方法は簡単で、論理的な角度からメールボックスが最も細粒度です.クエリー条件の場合、最も細かい粒度があります.
まず、最細粒度フィールドを2次インデックスクエリの結果に移動します.(最細粒度条件の濾過を行ったことがありますが、この結果セットはかなり小さくなりました!)
結果セットをおもちゃ名でフィルタします.
2回調べると、同じ速度で速いという意味です.ははは
 
 
注意:2級インデックスを歩くと、フィルタよりも速くなるとは限らない.(結果セットが小さいときはfilterが優勢!)
もう一つの方法は、1秒以上もかかります.この方式では、メールボックスや時間が2級インデックスになり、おもちゃも2級インデックスになります.まるで大きな曲がり角を許したようで、ははは.
0: jdbc:phoenix:192.168.199.154> select t1.seq_id from T_EXTENSION_ALL_DATAS_SHOW t1 inner join (select show_date,email,seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t2 on(t1.show_date=t2.show_date and t1.seq_id= t2.seq_id) where t1.toy='ambi' order by t1.seq_id desc limit 1;
+------------+
| T1.SEQ_ID  |
+------------+
| 45555      |
+------------+
1 row selected (1.171 seconds)


0: jdbc:phoenix:192.168.199.154> explain select t1.seq_id from T_EXTENSION_ALL_DATAS_SHOW t1 inner join (select show_date,email,seq_id,toy from T_EXTENSION_ALL_DATAS_SHOW where show_date='2018-11-24' and email='[email protected]') t2 on(t1.show_date=t2.show_date and t1.seq_id= t2.seq_id) where t1.toy='ambi' order by t1.seq_id desc limit 1;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
|                                                                                PLAN                                                                                 | EST_BYTES_READ  | EST_ROWS_READ  |  EST_INFO_TS   |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
| CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_TOY ['ambi']                                                                           | 314572800       | 1182604        | 1543050902166  |
|     SERVER FILTER BY FIRST KEY ONLY                                                                                                                                 | 314572800       | 1182604        | 1543050902166  |
|     SERVER TOP 1 ROW SORTED BY ["T1.:SEQ_ID" DESC]                                                                                                                  | 314572800       | 1182604        | 1543050902166  |
| CLIENT MERGE SORT                                                                                                                                                   | 314572800       | 1182604        | 1543050902166  |
| CLIENT LIMIT 1                                                                                                                                                      | 314572800       | 1182604        | 1543050902166  |
|     PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)                                                                                                                        | 314572800       | 1182604        | 1543050902166  |
|         CLIENT 1-CHUNK 1182604 ROWS 314572800 BYTES PARALLEL 1-WAY ROUND ROBIN RANGE SCAN OVER IDX_T_EXTENSION_ALL_DATAS_SHOW_EMAIL ['[email protected]','2018-11-24']  | 314572800       | 1182604        | 1543050902166  |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+----------------+----------------+
7 rows selected (0.051 seconds)

 
phoenixクエリの最適化については、今日ここまでです.END