mysql in文のクエリ効率の遅い最適化テクニックの例

10020 ワード

表の構造は以下の通りで、文章は690編しかありません.

   article(id,title,content)
   tag(tid,tag_name)
       article_tag(id,tag_id,article_id)

ラベルのtidは135であり、クエリーラベルtidは135の文章リストである.
690の文章、以下の文で検索して、とても遅いです:

select id,title from article where id in(
select article_id from article_tag where tag_id=135
)

この中でこの速度は速いです.

select article_id from article_tag where tag_id=135

クエリの結果は5つの記事で、idは428429430431432です.
次のsqlで文章を調べるのも速いです.

select id,title from article where id in(
428,429,430,431,432
)

解決方法:

select id,title from article where id in(
select article_id from (select article_id from article_tag where tag_id=135) as tbt
)

その他の解決方法:(例)

mysql> select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');

紙幅を節約するため、出力内容を省略した.
67 rows in set (12.00 sec)
67行のデータしか返されませんが、12秒もかかりました.システムにはこのようなクエリーが同時にたくさんあるかもしれません.システムは耐えられません.descで見てみます(注:explainでもいいです)

mysql> desc select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
| 1 | PRIMARY | abc_number_prop | ALL | NULL | NULL | NULL | NULL | 2679838 | Using where |
| 2 | DEPENDENT SUBQUERY | abc_number_phone | eq_ref | phone,number_id | phone | 70 | const,func | 1 | Using where; Using index |
+----+--------------------+------------------+--------+-----------------+-------+---------+------------+---------+--------------------------+
2 rows in set (0.00 sec)

このクエリーを実行すると200万行以上スキャンされることがわかりますが、インデックスが作成されていないのではないでしょうか.見てみましょう.

mysql>show index from abc_number_phone;
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abc_number_phone | 0 | PRIMARY | 1 | number_phone_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 0 | phone | 1 | phone | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 0 | phone | 2 | number_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | number_id | 1 | number_id | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | created_by | 1 | created_by | A | 36879 | NULL | NULL | | BTREE | | |
| abc_number_phone | 1 | modified_by | 1 | modified_by | A | 36879 | NULL | NULL | YES | BTREE | | |
+------------------+------------+-------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.06 sec)
mysql>show index from abc_number_prop;
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abc_number_prop | 0 | PRIMARY | 1 | number_prop_id | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | number_id | 1 | number_id | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | created_by | 1 | created_by | A | 311268 | NULL | NULL | | BTREE | | |
| abc_number_prop | 1 | modified_by | 1 | modified_by | A | 311268 | NULL | NULL | YES | BTREE | | |
+-----------------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.15 sec)

上の出力から分かるように、この2つのテーブルはnumber_idフィールドにインデックスが作成されました.サブクエリ自体に問題があるかどうかを確認します.

mysql> desc select number_id from abc_number_phone where phone = '82306839';
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
| 1 | SIMPLE | abc_number_phone | ref | phone | phone | 66 | const | 6 | Using where; Using index |
+----+-------------+------------------+------+---------------+-------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)


問題ありません.数行のデータをスキャンするだけで、インデックスが機能します.
検索してみます.

mysql> select number_id from abc_number_phone where phone = '82306839';
+-----------+
| number_id |
+-----------+
| 8585 |
| 10720 |
| 148644 |
| 151307 |
| 170691 |
| 221897 |
+-----------+
6 rows in set (0.00 sec)


サブクエリで得られたデータを上のクエリに直接配置

mysql> select * from abc_number_prop where number_id in (8585, 10720, 148644, 151307, 170691, 221897);
67 rows in set (0.03 sec)


速度も速く、MySQLはサブクエリを処理するときに十分ではないようです.私はMySQL 5.1.42とMySQL 5.5.5.19で試しましたが、この問題があります.
ネットを検索すると、多くの人がこの問題に遭遇したことがあります.
参考資料1:MySQL最適化の使用接続(join)サブクエリの代わりに
参考資料2:MYSQLサブクエリとネストクエリの最適化インスタンス解析
ネット上のこれらの資料の提案に基づいて、joinを変えて試してみます.変更前:

select * from abc_number_prop where number_id in (select number_id from abc_number_phone where phone = '82306839');

変更後:

select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';
mysql> select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';
67 rows in set (0.00 sec)

効果は悪くなく、クエリーにかかる時間はほぼ0です.MySQLがこのクエリをどのように実行しているか見てみましょう

mysql>desc select a.* from abc_number_prop a inner join abc_number_phone b on a.number_id = b.number_id where phone = '82306839';
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
| 1 | SIMPLE | b | ref | phone,number_id | phone | 66 | const | 6 | Using where; Using index |
| 1 | SIMPLE | a | ref | number_id | number_id | 4 | eap.b.number_id | 3 | |
+----+-------------+-------+------+-----------------+-----------+---------+-----------------+------+--------------------------+
2 rows in set (0.00 sec)

小結:サブクエリの速度が遅い場合は、JOINでクエリを書き換えて最適化できます.
ネット上では、JOIN文を使ったクエリが必ずしもサブクエリを使ったものより速くないという記事もあります.
mysqlマニュアルにも言及されていますが、具体的な原文はmysqlドキュメントのこの章:I.3.Restrictions on Subqueries 13.2.8. Subquery Syntax
抜粋:
1)INを使用するサブクエリについて:
Subquery optimization for IN is not as effective as for the = operator or for IN(value_list) constructs.
A typical case for poor IN subquery performance is when the subquery returns a small number of rows but the outer query returns a large number of rows to be compared to the subquery result.
The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery. Consider the following statement that uses an uncorrelated subquery:
SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);
The optimizer rewrites the statement to a correlated subquery:
SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);
If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M×N), rather than O(M+N) as it would be for an uncorrelated subquery.
An implication is that an IN subquery can be much slower than a query written using an IN(value_list) construct that lists the same values that the subquery would return.
2)サブクエリをjoinに変換することについて:
The optimizer is more mature for joins than for subqueries, so in many cases a statement that uses a subquery can be executed more efficiently if you rewrite it as a join.
An exception occurs for the case where an IN subquery can be rewritten as a SELECT DISTINCT join. Example:
SELECT col FROM t1 WHERE id_col IN (SELECT id_col2 FROM t2 WHERE condition);
That statement can be rewritten as follows:
SELECT DISTINCT col FROM t1, t2 WHERE t1.id_col = t2.id_col AND condition;
But in this case, the join requires an extra DISTINCT operation and is not more efficient than the subquery
まとめ
以上、mysql in文の検索効率の遅い最適化テクニックの例について、興味のある方は、「mysqlのサブクエリーの連携とinの効率、企業生産MySQLの最適化の紹介などを参照してください.
本稿で述べたことが皆さんに役に立つことを願っています.