Monday, April 28, 2014

How to use bzip2, which supports splittable compression, with the hadoop-streaming package.

Sample code for testing

1. Compression

# Have each mapper write one compressed file per block.
hadoop jar hadoop-streaming-2.2.0.2.1.0.0-92.jar \
    -D mapreduce.output.fileoutputformat.compress=TRUE \
    -D mapreduce.output.fileoutputformat.compress.type=RECORD \
    -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec...
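As a local sanity check that needs no cluster, the sketch below compresses two pieces of text independently, the way two separate tasks would, and then decompresses them in a single pass. It relies on the bzip2 command-line tool accepting concatenated compressed streams. The part-* file names only imitate job output and are hypothetical, as is the temporary working directory.

```shell
#!/bin/sh
# Local sketch only: the part-* names imitate job output files and are hypothetical.
set -e
workdir=$(mktemp -d)

# Compress two pieces independently, as two separate tasks would.
printf 'line one\n' | bzip2 -c > "$workdir/part-00000.bz2"
printf 'line two\n' | bzip2 -c > "$workdir/part-00001.bz2"

# bzip2 accepts concatenated streams, so both pieces decompress in one pass.
restored=$(cat "$workdir/part-00000.bz2" "$workdir/part-00001.bz2" | bzip2 -dc)
printf '%s\n' "$restored"

# Clean up the temporary files.
rm -rf "$workdir"
```

The same idea should carry over to real job output: piping the part files of an output directory through `hadoop fs -cat` into `bzip2 -dc` is one plausible way to inspect the result, assuming the files were written with BZip2Codec.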