The default OutputFormat (TextOutputFormat). Each record is written as one line: key.toString() + "\t" + value.toString(). The output is a plain text file.
Each record is written as file path + file content. The output is a binary file, typically used as the input of a subsequent MapReduce job.
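The line layout of the default text format can be sketched with the JDK alone. This is only an illustration of the key.toString() + "\t" + value.toString() rule, not Hadoop's implementation; the class name TextLineSketch and the method formatLine are hypothetical, and the sample keys and values are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// JDK-only illustration. TextLineSketch and formatLine are hypothetical names,
// not part of the Hadoop API.
class TextLineSketch {

    // Builds one output line following the rule described for the default
    // text format: key.toString() + "\t" + value.toString().
    static String formatLine(Object key, Object value) {
        return key.toString() + "\t" + value.toString();
    }

    public static void main(String[] args) {
        // Made-up sample records, kept in insertion order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("hadoop", 3);
        counts.put("hdfs", 1);

        // One line per record, key and value separated by a tab.
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(formatLine(e.getKey(), e.getValue()));
        }
    }
}
```

Because both sides go through toString(), whatever the key and value types print is exactly what ends up in the text file.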
1) The MyOutputFormat class
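// The key type is T1 and the value type is T2 (placeholders for the concrete types)
import java.io.IOException;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyOutputFormat extends FileOutputFormat<T1, T2> {
    // Returns an instance of a RecordWriter implementation
    @Override
    public RecordWriter<T1, T2> getRecordWriter(TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException {
        return new MyRecordWriter(taskAttemptContext);
    }
}

2) The MyRecordWriter class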
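import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyRecordWriter extends RecordWriter<T1, T2> {
    private FSDataOutputStream fSDataOutputStream; // HDFS output stream

    // Open the stream in the constructor
    public MyRecordWriter(TaskAttemptContext taskAttemptContext) throws IOException {
        Configuration configuration = taskAttemptContext.getConfiguration(); // current job configuration
        FileSystem fileSystem = FileSystem.get(configuration); // file system named by the configuration (HDFS on a cluster)
        String outputDir = configuration.get(FileOutputFormat.OUTDIR); // output directory configured for the job
        // Custom output file. Path(parent, child) adds the separator that
        // plain string concatenation (outputDir + "myOutputFile") would omit.
        fSDataOutputStream = fileSystem.create(new Path(outputDir, "myOutputFile"));
    }

    // Custom output logic goes here
    @Override
    public void write(T1 k, T2 v) throws IOException, InterruptedException {
    }

    // Close the stream
    @Override
    public void close(TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException {
        IOUtils.closeStream(fSDataOutputStream);
    }
}

3) Set the OutputFormat in the Driver class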
job.setOutputFormatClass(MyOutputFormat.class);
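
The lifecycle that MyRecordWriter follows (open the stream once in the constructor, write one record per call, close the stream at the end) can be tried without a cluster. The sketch below is a JDK-only stand-in under stated assumptions: LocalRecordWriter and demo are hypothetical names, the local temporary directory replaces the HDFS output directory, and the tab-separated record layout is just one possible body for the empty write method above.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// JDK-only stand-in for the RecordWriter lifecycle. LocalRecordWriter is a
// hypothetical class and does not use any Hadoop API.
class LocalRecordWriter<K, V> implements Closeable {
    private final BufferedWriter out; // local stand-in for the HDFS output stream

    // Open the stream once, in the constructor.
    LocalRecordWriter(Path outputDir, String fileName) throws IOException {
        Files.createDirectories(outputDir);
        // resolve() joins directory and file name with the proper separator.
        out = Files.newBufferedWriter(outputDir.resolve(fileName), StandardCharsets.UTF_8);
    }

    // Called once per record. Assumption: each record becomes "key<TAB>value\n".
    void write(K key, V value) throws IOException {
        out.write(key + "\t" + value);
        out.write('\n');
    }

    // Close the stream after the last record.
    @Override
    public void close() throws IOException {
        out.close();
    }

    // Writes two made-up records to a temporary directory and reads them back.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("record-writer-demo");
            try (LocalRecordWriter<String, Integer> writer =
                     new LocalRecordWriter<>(dir, "myOutputFile")) {
                writer.write("hadoop", 3);
                writer.write("hdfs", 1);
            }
            return Files.readAllLines(dir.resolve("myOutputFile"), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In the Hadoop version, the same three steps map onto the MyRecordWriter constructor, write(T1 k, T2 v), and close(TaskAttemptContext), with the bytes going to fSDataOutputStream instead of a local file.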