com.datasalt.pangool.tuplemr
Class TupleMRConfig

java.lang.Object
  extended by com.datasalt.pangool.tuplemr.TupleMRConfig

public class TupleMRConfig
extends Object

TupleMRConfig contains all the configuration parameters of a Tuple-based job. The main information it contains is available through the accessor methods summarized below.


Field Summary
static String CONF_COMPARATOR_INSTANCES
           
static String CONF_COMPARATOR_REFERENCES
           
 
Constructor Summary
protected TupleMRConfig()
           
 
Method Summary
 List<String> calculateRollupBaseFields()
          Returns the fields that are a subset of the groupBy fields and will be used when rollup is needed.
 boolean equals(Object a)
           
static TupleMRConfig get(org.apache.hadoop.conf.Configuration conf)
           
 Criteria getCommonCriteria()
          Returns the criteria used to sort fields that are common among the intermediate schemas.
 List<String> getCustomPartitionFields()
          Returns the custom fields used to partition tuples.
 Map<String,String> getFieldAliases(String schemaName)
           
 List<String> getGroupByFields()
          Returns the fields, common to all the intermediate schemas, that will be used to group the tuples emitted from the TupleMapper.
 Schema getIntermediateSchema(int schemaId)
          Returns a defined intermediate schema with the specified schemaId.
The schemaId follows the order of schema definition in addIntermediateSchema(Schema)
 Schema getIntermediateSchema(String schemaName)
          Returns a defined intermediate schema with the specified name.
 List<String> getIntermediateSchemaNames()
          Returns a list with the names of all the intermediate schemas.
 List<Schema> getIntermediateSchemas()
          Returns all the intermediate schemas defined.
 int getNumIntermediateSchemas()
          Returns the number of intermediate schemas defined.
 String getRollupFrom()
          Returns the field from which the rollup will be performed.
 Map<String,Map<String,String>> getSchemaFieldAliases()
          Returns a map that contains for every schema a list of field aliases.
 Integer getSchemaIdByName(String name)
          Returns the schemaId from the schema's name.
 Criteria.Order getSchemasOrder()
          Returns the order that will be used to sort tuples with different schemas after being compared by commonOrder.
 SerializationInfo getSerializationInfo()
          Returns the SerializationInfo instance related to this configuration.
 List<Criteria> getSpecificOrderBys()
          Returns the order that will be used to sort tuples with different schemas after being compared by commonOrder and schemaOrder.
 int hashCode()
           
static TupleMRConfig parse(String s)
          Parses a TupleMRConfig from the provided string.
static Set<String> set(TupleMRConfig mrConfig, org.apache.hadoop.conf.Configuration conf)
          Returns the instance files generated.
protected  String toJson(boolean pretty)
           
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

CONF_COMPARATOR_REFERENCES

public static final String CONF_COMPARATOR_REFERENCES
See Also:
Constant Field Values

CONF_COMPARATOR_INSTANCES

public static final String CONF_COMPARATOR_INSTANCES
See Also:
Constant Field Values
Constructor Detail

TupleMRConfig

protected TupleMRConfig()
Method Detail

getIntermediateSchema

public Schema getIntermediateSchema(String schemaName)
Returns a defined intermediate schema with the specified name.


getIntermediateSchema

public Schema getIntermediateSchema(int schemaId)
Returns a defined intermediate schema with the specified schemaId.
The schemaId follows the order of schema definition in addIntermediateSchema(Schema)


getSchemaIdByName

public Integer getSchemaIdByName(String name)
Returns the schemaId from the schema's name.
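The relationship between schema names and schemaIds described above can be pictured with a small stand-alone sketch. It is not Pangool code: the class SchemaRegistrySketch and its methods are hypothetical, schemas are reduced to their names, and it assumes that schemaIds start at 0 and that an unknown name yields null (suggested by the Integer return type, but not confirmed by this page).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the name/id bookkeeping; not part of Pangool.
class SchemaRegistrySketch {
    private final List<String> namesInOrder = new ArrayList<String>();
    private final Map<String, Integer> idByName = new HashMap<String, Integer>();

    // Registers a schema name. Its id is its position in definition order
    // (assumed to be 0-based).
    void addIntermediateSchema(String name) {
        if (idByName.containsKey(name)) {
            throw new IllegalArgumentException("Schema already defined: " + name);
        }
        idByName.put(name, namesInOrder.size());
        namesInOrder.add(name);
    }

    // Looks a schema up by id, i.e. by definition order.
    String getIntermediateSchemaName(int schemaId) {
        return namesInOrder.get(schemaId);
    }

    // Returns the id for a name, or null when the name is unknown (assumption).
    Integer getSchemaIdByName(String name) {
        return idByName.get(name);
    }

    int getNumIntermediateSchemas() {
        return namesInOrder.size();
    }
}
```

Under these assumptions, defining "users" and then "visits" gives "users" the id 0 and "visits" the id 1, and looking either up by id returns the same schema as looking it up by name.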


getIntermediateSchemaNames

public List<String> getIntermediateSchemaNames()
Returns a list with the names of all the intermediate schemas.


getIntermediateSchemas

public List<Schema> getIntermediateSchemas()
Returns all the intermediate schemas defined.


getSchemasOrder

public Criteria.Order getSchemasOrder()
Returns the order that will be used to sort tuples with different schemas after being compared by commonOrder.


getSpecificOrderBys

public List<Criteria> getSpecificOrderBys()
Returns the order that will be used to sort tuples with different schemas after being compared by commonOrder and schemaOrder.
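The layering described above (commonOrder first, then schemaOrder, then the specific order-bys) can be illustrated with plain java.util.Comparator chaining. This is only a sketch under assumptions, not the comparator Pangool uses: the Row class is hypothetical, the common criteria are reduced to one string key, the schema order to an ascending or descending comparison of schemaIds, and the specific criteria to one integer value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of a three-level sort; not part of Pangool.
class LayeredSortSketch {
    static final class Row {
        final String commonKey;   // stands in for the common fields
        final int schemaId;       // which intermediate schema the row belongs to
        final int specificValue;  // stands in for the schema-specific fields

        Row(String commonKey, int schemaId, int specificValue) {
            this.commonKey = commonKey;
            this.schemaId = schemaId;
            this.specificValue = specificValue;
        }

        String label() {
            return commonKey + "/" + schemaId + "/" + specificValue;
        }
    }

    // Builds the comparator: common key, then schema id, then specific value.
    static Comparator<Row> layered(boolean schemasAscending) {
        Comparator<Row> byCommon = Comparator.comparing((Row r) -> r.commonKey);
        Comparator<Row> bySchema = Comparator.comparingInt((Row r) -> r.schemaId);
        if (!schemasAscending) {
            bySchema = bySchema.reversed();
        }
        Comparator<Row> bySpecific = Comparator.comparingInt((Row r) -> r.specificValue);
        return byCommon.thenComparing(bySchema).thenComparing(bySpecific);
    }

    // Sorts a copy of the rows and returns their labels in sorted order.
    static List<String> sortedLabels(List<Row> rows, boolean schemasAscending) {
        List<Row> copy = new ArrayList<Row>(rows);
        Collections.sort(copy, layered(schemasAscending));
        List<String> labels = new ArrayList<String>();
        for (Row r : copy) {
            labels.add(r.label());
        }
        return labels;
    }
}
```

Each later level only breaks ties left by the earlier ones, so rows sharing a common key stay adjacent regardless of the schema order chosen.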


getCustomPartitionFields

public List<String> getCustomPartitionFields()
Returns the custom fields used to partition tuples. If this list is null, the partition criteria default to the groupByFields. When rollup is used, the partition fields are instead a subset of the groupByFields up to the rollupFrom field.
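The fallback rule above can be sketched in a few lines of stand-alone Java. The class and method names are hypothetical, and the sketch assumes that "up to the rollupFrom field" means the prefix of the groupByFields that includes rollupFrom itself; this page does not say whether the boundary is inclusive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the partition-field fallback; not part of Pangool.
class PartitionFieldsSketch {

    // Prefix of the group-by fields up to and including rollupFrom (assumed inclusive).
    static List<String> rollupBaseFields(List<String> groupByFields, String rollupFrom) {
        if (rollupFrom == null) {
            return groupByFields; // no rollup: all group-by fields apply
        }
        int index = groupByFields.indexOf(rollupFrom);
        if (index < 0) {
            throw new IllegalArgumentException(
                "rollupFrom '" + rollupFrom + "' is not a group-by field");
        }
        return new ArrayList<String>(groupByFields.subList(0, index + 1));
    }

    // Custom fields win when set; otherwise fall back to the (rollup-aware) group-by fields.
    static List<String> effectivePartitionFields(List<String> customPartitionFields,
                                                 List<String> groupByFields,
                                                 String rollupFrom) {
        if (customPartitionFields != null) {
            return customPartitionFields;
        }
        return rollupBaseFields(groupByFields, rollupFrom);
    }
}
```

With group-by fields country, city, day and rollupFrom set to city, this sketch partitions by country and city, so all tuples that take part in the same rollup reach the same reducer.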


getSerializationInfo

public SerializationInfo getSerializationInfo()
Returns the SerializationInfo instance related to this configuration.


getNumIntermediateSchemas

public int getNumIntermediateSchemas()
Returns the number of intermediate schemas defined.


getCommonCriteria

public Criteria getCommonCriteria()
Returns the criteria used to sort fields that are common among the intermediate schemas. These criteria are applied first in both SortComparator and GroupComparator.


getGroupByFields

public List<String> getGroupByFields()
Returns the fields, common to all the intermediate schemas, that will be used to group the tuples emitted from the TupleMapper.


getSchemaFieldAliases

public Map<String,Map<String,String>> getSchemaFieldAliases()
Returns a map that contains, for every schema, its field aliases. Field aliases are needed to declare common fields to be used in setGroupByFields(List) and setGroupByFields(List)
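The nested map shape returned here (schema name to alias map) can be illustrated with a minimal stand-alone sketch. The class FieldAliasSketch is hypothetical, and the direction of the inner map (alias to the real field name in that schema) is an assumption; this page does not state it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of per-schema field aliases; not part of Pangool.
class FieldAliasSketch {
    // schema name -> (alias -> real field name in that schema); direction is assumed
    private final Map<String, Map<String, String>> aliasesBySchema =
        new HashMap<String, Map<String, String>>();

    void addAlias(String schemaName, String alias, String fieldName) {
        Map<String, String> aliases = aliasesBySchema.get(schemaName);
        if (aliases == null) {
            aliases = new HashMap<String, String>();
            aliasesBySchema.put(schemaName, aliases);
        }
        aliases.put(alias, fieldName);
    }

    // Resolves a common field name to the schema's own field name.
    // Names without an alias are returned unchanged.
    String resolve(String schemaName, String name) {
        Map<String, String> aliases = aliasesBySchema.get(schemaName);
        if (aliases == null || !aliases.containsKey(name)) {
            return name;
        }
        return aliases.get(name);
    }
}
```

In this model, two schemas that store the same value under different field names can still share one common field: a "visits" schema could alias userId to its own visitor_id field, while a "users" schema that already has a userId field needs no alias.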


getFieldAliases

public Map<String,String> getFieldAliases(String schemaName)

calculateRollupBaseFields

public List<String> calculateRollupBaseFields()
Returns the fields that are a subset of the groupBy fields and will be used when rollup is needed.

See Also:
RollupReducer

getRollupFrom

public String getRollupFrom()
Returns the field from which the rollup will be performed.


get

public static TupleMRConfig get(org.apache.hadoop.conf.Configuration conf)
                         throws TupleMRException
Throws:
TupleMRException

set

public static Set<String> set(TupleMRConfig mrConfig,
                              org.apache.hadoop.conf.Configuration conf)
                       throws TupleMRException
Returns the instance files generated.

Throws:
TupleMRException

toString

public String toString()
Overrides:
toString in class Object

toJson

protected String toJson(boolean pretty)

parse

public static TupleMRConfig parse(String s)
                           throws IOException
Parses a TupleMRConfig from the provided string. If named, the schema is added to the names known to this parser.

Throws:
IOException

equals

public boolean equals(Object a)
Overrides:
equals in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object


Copyright © –2014 Datasalt Systems S.L. All rights reserved.