Hive 0.11 multi-table join count(distinct) bug

While testing Hive 0.11, I ran into the following bug:
 
select count(distinct t2.user_id),t1.app_id,t2.from_id
 from t1 
 join t2 on t1.app_id=t2.app_id
 join t3 on t2.from_id=t3.flag
 group by t1.app_id,t2.from_id

The query fails with the following error: FAILED: NullPointerException null
2013-09-16 20:20:59,611 ERROR ql.Driver (SessionState.java:printError(386)) - FAILED: NullPointerException null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer$MetadataOnlyTaskDispatcher.dispatch(MetadataOnlyOptimizer.java:308)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:124)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)

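The root cause is still being investigated.
As an interim workaround, stage the join result in a temporary table, wrap the join in a subquery, or set the parameter set hive.map.aggr=false; before running the query. The subquery form: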
select count(distinct tmp.user_id), tmp.app_id, tmp.from_id
from (select t2.user_id, t1.app_id, t2.from_id
      from t1
      join t2 on t1.app_id=t2.app_id
      join t3 on t2.from_id=t3.flag
 ) tmp
group by tmp.app_id, tmp.from_id
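
The other two workarounds could look roughly like the sketch below. It has not been verified against Hive 0.11. The table name tmp_join is a hypothetical name chosen for illustration, and an ordinary table created with create table ... as select stands in for the "temporary table".

```sql
-- Workaround A: stage the three-way join in an intermediate table first.
-- tmp_join is a hypothetical table name used only for this sketch.
create table tmp_join as
select t2.user_id, t1.app_id, t2.from_id
from t1
join t2 on t1.app_id = t2.app_id
join t3 on t2.from_id = t3.flag;

-- Aggregate over the staged rows, so count(distinct) no longer sits
-- directly on top of the multi-table join.
select count(distinct user_id), app_id, from_id
from tmp_join
group by app_id, from_id;

drop table tmp_join;

-- Workaround B: disable map-side aggregation for the session,
-- then run the original query unchanged.
set hive.map.aggr=false;

select count(distinct t2.user_id), t1.app_id, t2.from_id
from t1
join t2 on t1.app_id = t2.app_id
join t3 on t2.from_id = t3.flag
group by t1.app_id, t2.from_id;
```

hive.map.aggr defaults to true, so turning it off affects every later aggregation in the same session. It is probably worth setting it back to true once the affected query has run.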

Hive mailing list thread describing this problem: http://mail-archives.apache.org/mod_mbox/hive-user/201309.mbox/%3CCA+FBdFQYHm9WvpWYSwaFGs8Vo=crNuSD=zv-Wf7tE8S4=X7AJg@mail.gmail.com%3E
Official Hive issue, HIVE-5129: https://issues.apache.org/jira/browse/HIVE-5129
Official Hive Review Board entry: https://reviews.apache.org/r/13697/diff/#index_header
Official Hive JIRA issue history: https://issues.apache.org/jira/issues/?jql=project%20%3D%20HIVE