com.datasalt.pangool.tuplemr.mapred.lib.input
Class PangoolMultipleInputs

java.lang.Object
  extended by com.datasalt.pangool.tuplemr.mapred.lib.input.PangoolMultipleInputs

public class PangoolMultipleInputs
extends Object

This class supports MapReduce jobs that have multiple input paths with a different InputFormat and Mapper for each path.

This class is inspired by org.apache.hadoop.mapred.lib.MultipleInputs.


Field Summary
static String PANGOOL_INPUT_DIR_FORMATS_PREFIX_CONF
           
static String PANGOOL_INPUT_DIR_MAPPERS_PREFIX_CONF
           
 
Constructor Summary
PangoolMultipleInputs()
           
 
Method Summary
static void addInputContext(org.apache.hadoop.mapreduce.Job job, String inputName, String key, String value)
          Adds a (key, value) configuration entry specific to the given input.
static Set<String> addInputPath(org.apache.hadoop.mapreduce.Job job, org.apache.hadoop.fs.Path path, org.apache.hadoop.mapreduce.InputFormat inputFormat, org.apache.hadoop.mapreduce.Mapper mapperInstance, Map<String,String> specificContext)
          Add a Path with a custom InputFormat and Mapper to the list of inputs for the map-reduce job.
static void setSpecificInputContext(org.apache.hadoop.conf.Configuration conf, String inputName)
          Iterates over the Configuration and sets the specific context found for the input in the Job instance.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PANGOOL_INPUT_DIR_FORMATS_PREFIX_CONF

public static final String PANGOOL_INPUT_DIR_FORMATS_PREFIX_CONF
See Also:
Constant Field Values

PANGOOL_INPUT_DIR_MAPPERS_PREFIX_CONF

public static final String PANGOOL_INPUT_DIR_MAPPERS_PREFIX_CONF
See Also:
Constant Field Values
Constructor Detail

PangoolMultipleInputs

public PangoolMultipleInputs()
Method Detail

addInputPath

public static Set<String> addInputPath(org.apache.hadoop.mapreduce.Job job,
                                       org.apache.hadoop.fs.Path path,
                                       org.apache.hadoop.mapreduce.InputFormat inputFormat,
                                       org.apache.hadoop.mapreduce.Mapper mapperInstance,
                                       Map<String,String> specificContext)
                                throws FileNotFoundException,
                                       IOException
Add a Path with a custom InputFormat and Mapper to the list of inputs for the map-reduce job. Returns the instance files created.

Parameters:
job - The Job
path - Path to be added to the list of inputs for the job
inputFormat - InputFormat instance to use for this path
mapperInstance - Mapper instance to use
specificContext - (key, value) configuration entries specific to this input
Throws:
IOException
FileNotFoundException

addInputContext

public static void addInputContext(org.apache.hadoop.mapreduce.Job job,
                                   String inputName,
                                   String key,
                                   String value)
Adds a (key, value) configuration entry specific to the given input. Some InputFormats read specific configuration values and act on them.
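As a rough illustration of per-input configuration, the sketch below stores each (key, value) pair under a key namespaced by the input name, so two inputs can hold different values for the same key. A plain Map stands in for the Hadoop Configuration. The prefix string, the "." separator, and the class and method names are assumptions made for this example; they are not taken from Pangool's source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for per-input configuration; not Pangool's implementation.
final class InputContextSketch {
    // Assumed prefix, chosen for illustration only.
    static final String PREFIX = "example.input.context.";

    // Builds the namespaced key under which an input-specific entry is stored.
    static String contextKey(String inputName, String key) {
        return PREFIX + inputName + "." + key;
    }

    // Records a (key, value) pair that applies only to the named input.
    static void addInputContext(Map<String, String> conf, String inputName,
                                String key, String value) {
        conf.put(contextKey(inputName, key), value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        addInputContext(conf, "logs", "delimiter", "\t");
        addInputContext(conf, "users", "delimiter", ",");
        // Each input keeps its own value for the same logical key.
        System.out.println(conf);
    }
}
```

Namespacing by input name keeps the entries in one flat configuration without letting one input's settings overwrite another's.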


setSpecificInputContext

public static void setSpecificInputContext(org.apache.hadoop.conf.Configuration conf,
                                           String inputName)
Iterates over the Configuration and sets the specific context found for the input in the Job instance. Package-access so it can be unit tested. The specific context is configured by addInputContext(Job, String, String, String).
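The iteration described above can be pictured with the following minimal sketch. It assumes input-specific entries were stored under keys of the form prefix + inputName + "." + key, and it copies the matching entries back under their plain keys so that code reading the configuration sees the values for the current input. The Map stand-in for Configuration, the prefix, and all names here are hypothetical and do not reflect Pangool's actual storage scheme.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of applying an input's specific context; not Pangool's implementation.
final class SpecificContextSketch {
    // Assumed prefix, chosen for illustration only.
    static final String PREFIX = "example.input.context.";

    // Copies every entry namespaced for inputName back under its plain key.
    static void setSpecificInputContext(Map<String, String> conf, String inputName) {
        String inputPrefix = PREFIX + inputName + ".";
        // Collect matches first so the map is not modified while it is iterated.
        Map<String, String> promoted = new HashMap<>();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            if (entry.getKey().startsWith(inputPrefix)) {
                String plainKey = entry.getKey().substring(inputPrefix.length());
                promoted.put(plainKey, entry.getValue());
            }
        }
        conf.putAll(promoted);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("delimiter", ";"); // job-wide default
        conf.put(PREFIX + "users.delimiter", ",");
        conf.put(PREFIX + "logs.delimiter", "\t");
        setSpecificInputContext(conf, "users");
        // The plain key now carries the value registered for the "users" input.
        System.out.println(conf.get("delimiter"));
    }
}
```

In this sketch the namespaced entries are left in place, so the same configuration can later be applied for a different input.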



Copyright © –2014 Datasalt Systems S.L. All rights reserved.