Getting started with MRUnit, the unit-testing tool for Hadoop MapReduce
Hadoop version:
$ hadoop version
Hadoop 0.20.2-cdh3u4
Subversion git://ubuntu-slave01/var/lib/jenkins/workspace/CDH3u4-Full-RC/build/cdh3/hadoop20/0.20.2-cdh3u4/source -r 214dd731e3bdb687cb55988d3f47dd9e248c5690
Compiled by jenkins on Mon May 7 13:01:39 PDT 2012
From source with checksum a60c9795e41a3248b212344fb131c12c
The way tests are written differs slightly between MRUnit versions. The version used here is:
<dependency>
    <groupId>org.apache.mrunit</groupId>
    <artifactId>mrunit</artifactId>
    <version>1.0.0</version>
    <classifier>hadoop1</classifier>
</dependency>
The most commonly used classes are:
org.apache.hadoop.mrunit.mapreduce.MapDriver;
org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
The mapper, combiner and reducer implementations do the following:
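CompMapper: for an input key such as 222-333##id1##id2, emits key id1##id2 with value 1L.
CompCombiner: sums the values for each key and emits the total as a long.
CompReducer: for a key such as id1##id2, sums the long values and emits the result as a double (in the tests, the sum divided by the number of hash functions).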
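The three implementations are not listed in this post, so the following is only a plain-Java sketch of the logic they appear to implement, inferred from the test inputs and expected outputs. The class name `CompLogicSketch`, its method names, and the rule "drop everything up to the first `##`" are assumptions made for illustration, not the original code.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the logic the three classes appear to implement.
// All names and the key-splitting rule are assumptions for illustration.
final class CompLogicSketch {

    // Mapper step: drop everything up to and including the first "##",
    // so "222-333##id1##id2" becomes "id1##id2".
    static String extractKey(String inputKey) {
        int pos = inputKey.indexOf("##");
        if (pos < 0) {
            throw new IllegalArgumentException("no ## separator in: " + inputKey);
        }
        return inputKey.substring(pos + 2);
    }

    // Combiner step: sum the partial counts collected for one key.
    static long sum(List<Long> values) {
        long total = 0L;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // Reducer step: sum the counts and divide by the number of hash functions.
    static double ratio(List<Long> values, int numHash) {
        return (double) sum(values) / numHash;
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(0L, 1L, 2L, 3L, 4L);
        System.out.println(extractKey("222-333##id1##id2")); // id1##id2
        System.out.println(sum(values));                      // 10
        System.out.println(ratio(values, 10));                // 1.0
    }
}
```

The values 0 to 4 and the divisor 10 are the same ones the combiner and reducer tests use, which is why those tests expect 10 and 1.0.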
The code that tests the mapper, combiner and reducer is as follows:
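import java.io.IOException;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;
// CompMapper, CompCombiner, CompReducer and MinhashOptionCreator are project classes.

private MapDriver<Text, LongWritable, Text, LongWritable> mapDriver;
private ReduceDriver<Text, LongWritable, Text, DoubleWritable> reduceDriver;
private ReduceDriver<Text, LongWritable, Text, LongWritable> combinerDriver;
private MapReduceDriver<Text, LongWritable, Text, LongWritable, Text, LongWritable> mapCombinerDriver;
private MapReduceDriver<Text, LongWritable, Text, LongWritable, Text, DoubleWritable> mapReducerDriver;

@Before
public void setUp() {
    CompMapper mapper = new CompMapper();
    CompCombiner combiner = new CompCombiner();
    CompReducer reducer = new CompReducer();
    mapDriver = new MapDriver<Text, LongWritable, Text, LongWritable>(mapper);
    reduceDriver = new ReduceDriver<Text, LongWritable, Text, DoubleWritable>(reducer);
    // A combiner is a Reducer, so it is tested with a ReduceDriver.
    combinerDriver = new ReduceDriver<Text, LongWritable, Text, LongWritable>(combiner);
    mapCombinerDriver = new MapReduceDriver<Text, LongWritable, Text, LongWritable, Text, LongWritable>(
            mapper, combiner);
    mapReducerDriver = new MapReduceDriver<Text, LongWritable, Text, LongWritable, Text, DoubleWritable>(
            mapper, reducer);
}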
@Test
public void testMapper() throws IOException {
    // One input record; the mapper should strip the prefix and keep the count.
    mapDriver.withInput(new Text("222-333##id1##id2"), new LongWritable(1L));
    mapDriver.withOutput(new Text("id1##id2"), new LongWritable(1L));
    mapDriver.runTest();
}
@Test
public void testCombiner() throws IOException {
    // Values 0..4 for a single key; the combiner should emit their sum, 10.
    List<LongWritable> values = new ArrayList<LongWritable>();
    for (int i = 0; i < 5; i++) {
        values.add(new LongWritable(i));
    }
    combinerDriver.addInput(new Text("id1##id2"), values);
    combinerDriver.withOutput(new Text("id1##id2"), new LongWritable(10L));
    combinerDriver.runTest();
}
@Test
public void testReducer() throws IOException {
    // Values 0..4 for a single key; count accumulates their sum (10).
    List<LongWritable> values = new ArrayList<LongWritable>();
    long count = 0;
    for (int i = 0; i < 5; i++) {
        count += i;
        values.add(new LongWritable(i));
    }
    reduceDriver.addInput(new Text("id1##id2"), values);
    // The key is not set in the test configuration, so the default of 10 is used.
    int numHash = reduceDriver.getConfiguration().getInt(
            MinhashOptionCreator.NUM_HASH_FUNCTIONS, 10);
    // Expected output: sum / numHash. The exact divide() is safe here because 10 / 10 terminates.
    DoubleWritable dw = new DoubleWritable();
    BigDecimal b1 = new BigDecimal(count);
    BigDecimal b2 = new BigDecimal(numHash);
    dw.set(b1.divide(b2).doubleValue());
    reduceDriver.withOutput(new Text("id1##id2"), dw);
    reduceDriver.runTest();
}
@Test
public void testMapCombiner() throws IOException {
    // Two records that map to the same key; the combiner should add them up to 2.
    mapCombinerDriver.addInput(new Text("222-333##id1##id2"), new LongWritable(1L));
    mapCombinerDriver.addInput(new Text("111-333##id1##id2"), new LongWritable(1L));
    mapCombinerDriver.withOutput(new Text("id1##id2"), new LongWritable(2L));
    mapCombinerDriver.runTest();
}
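@Test
public void testMapReducer() throws IOException {
    // Two records that map to the same key, run through mapper and reducer together.
    mapReducerDriver.addInput(new Text("222-333##id1##id2"), new LongWritable(1L));
    mapReducerDriver.addInput(new Text("111-333##id1##id2"), new LongWritable(1L));
    // Read the setting from the driver under test, with the same key as in testReducer.
    int numHash = mapReducerDriver.getConfiguration().getInt(
            MinhashOptionCreator.NUM_HASH_FUNCTIONS, 10);
    // Expected output: 2 / numHash, which is 0.2 with the default of 10.
    DoubleWritable dw = new DoubleWritable();
    BigDecimal b1 = new BigDecimal(2L);
    BigDecimal b2 = new BigDecimal(numHash);
    dw.set(b1.divide(b2).doubleValue());
    mapReducerDriver.withOutput(new Text("id1##id2"), dw);
    mapReducerDriver.runTest();
}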
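One detail of the expected-value calculation in the reducer tests is worth spelling out. `BigDecimal.divide(BigDecimal)` with no rounding argument only works when the quotient has a terminating decimal expansion. With the default divisor of 10 that always holds, but a setting such as 3 would make it throw `ArithmeticException`. The sketch below shows both behaviours; the class name `ExpectedRatio` and its method names are made up for illustration.

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Illustrative helper, not part of the original test class.
final class ExpectedRatio {

    // Exact division: fine for 2 / 10, but throws ArithmeticException
    // when the quotient has no terminating decimal expansion (e.g. 2 / 3).
    static double exact(long count, int numHash) {
        return new BigDecimal(count).divide(new BigDecimal(numHash)).doubleValue();
    }

    // Rounded division: always returns, at the cost of rounding to 16 digits.
    static double rounded(long count, int numHash) {
        return new BigDecimal(count)
                .divide(new BigDecimal(numHash), MathContext.DECIMAL64)
                .doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(exact(2L, 10)); // 0.2
        System.out.println(rounded(2L, 3));
        try {
            exact(2L, 3);
        } catch (ArithmeticException e) {
            System.out.println("exact(2, 3) failed: " + e.getMessage());
        }
    }
}
```

Whichever form is used, the test should compute the expected value the same way the reducer does, since the expected and actual outputs are compared for equality and a differently rounded double would not match.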
Notes:
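1. The MRUnit build (classifier) must match the Hadoop version. Here that is hadoop1 for Hadoop 0.20.2-cdh3u4.
2. If they do not match, the tests fail with java.lang.IncompatibleClassChangeError.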
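The error in note 2 comes from types such as `org.apache.hadoop.mapreduce.TaskAttemptContext`, which is a class in Hadoop 1.x and an interface in Hadoop 2.x. A quick way to see which flavour is on the test classpath is a small reflection check like the one below. This is only a diagnostic sketch; the class name `HadoopApiCheck` and the method `kindOf` are made up for illustration.

```java
// Illustrative diagnostic, not part of MRUnit or Hadoop.
final class HadoopApiCheck {

    // Reports whether the named type is a class or an interface on the current classpath.
    static String kindOf(String className) {
        try {
            // initialize=false: only load the type, do not run its static initializers.
            Class<?> type = Class.forName(className, false, HadoopApiCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // "class" suggests a Hadoop 1.x API (MRUnit classifier hadoop1),
        // "interface" suggests a Hadoop 2.x API (MRUnit classifier hadoop2).
        System.out.println(kindOf("org.apache.hadoop.mapreduce.TaskAttemptContext"));
    }
}
```

Run it with the same classpath as the tests. If the result does not agree with the classifier in the MRUnit dependency, that mismatch is the likely cause of the error.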