com.datasalt.pangool.tuplemr.serialization
Class TupleSerialization

java.lang.Object
  extended by com.datasalt.pangool.tuplemr.serialization.TupleSerialization
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.io.serializer.Serialization<DatumWrapper<ITuple>>

public class TupleSerialization
extends Object
implements org.apache.hadoop.io.serializer.Serialization<DatumWrapper<ITuple>>, org.apache.hadoop.conf.Configurable

A Serialization for DatumWrapper<ITuple>.

To use this serialization with Hadoop, call enableSerialization(Configuration) on the Hadoop configuration.
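
As a rough illustration of what enabling a serialization in Hadoop involves: Hadoop looks up serialization classes through the comma-separated io.serializations configuration property. The sketch below mimics adding and removing a class name in that list, using a plain Map as a stand-in for org.apache.hadoop.conf.Configuration so that it runs without Hadoop on the classpath. Whether enableSerialization(Configuration) and disableSerialization(Configuration) behave exactly this way is an assumption; SerializationRegistrySketch, enable and disable are hypothetical names. In a real job the call would simply be TupleSerialization.enableSerialization(conf).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a Map plays the role of the Hadoop Configuration.
class SerializationRegistrySketch {
    // Hadoop property that lists serialization classes, comma-separated.
    static final String KEY = "io.serializations";
    static final String TUPLE_SER =
        "com.datasalt.pangool.tuplemr.serialization.TupleSerialization";

    // Parse the current comma-separated list, ignoring blanks.
    private static List<String> current(Map<String, String> conf) {
        List<String> out = new ArrayList<>();
        String value = conf.get(KEY);
        if (value != null) {
            for (String s : value.split(",")) {
                if (!s.trim().isEmpty()) {
                    out.add(s.trim());
                }
            }
        }
        return out;
    }

    // Assumed behaviour: append the class name unless it is already listed.
    static void enable(Map<String, String> conf, String className) {
        List<String> sers = current(conf);
        if (!sers.contains(className)) {
            sers.add(className);
        }
        conf.put(KEY, String.join(",", sers));
    }

    // Assumed behaviour: drop the class name and keep the other entries.
    static void disable(Map<String, String> conf, String className) {
        List<String> sers = current(conf);
        sers.remove(className);
        conf.put(KEY, String.join(",", sers));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(KEY, "org.apache.hadoop.io.serializer.WritableSerialization");
        enable(conf, TUPLE_SER);
        enable(conf, TUPLE_SER); // a second call leaves the list unchanged
        System.out.println(conf.get(KEY));
        disable(conf, TUPLE_SER);
        System.out.println(conf.get(KEY));
    }
}
```

Keeping the operation idempotent means the serialization can be enabled from several places in job setup code without duplicating the entry.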


Field Summary
static String CONF_SCHEMA_VALIDATION
          Configuration parameter to enable strict schema validation.
When schema validation is set, the schema of the tuples emitted through the TupleMapper collector or TupleOutputFormat is validated, i.e. it must strictly match the expected schema set in those outputs; otherwise an exception is raised.
 
Constructor Summary
TupleSerialization()
           
TupleSerialization(HadoopSerialization ser, TupleMRConfig tupleMRConf)
           
 
Method Summary
 boolean accept(Class<?> c)
           
static void disableSchemaValidation(org.apache.hadoop.conf.Configuration conf)
          Disables strict schema validation; see CONF_SCHEMA_VALIDATION.
static void disableSerialization(org.apache.hadoop.conf.Configuration conf)
          Use this method to disable this serialization in Hadoop.
static void enableSchemaValidation(org.apache.hadoop.conf.Configuration conf)
          Enables strict schema validation; see CONF_SCHEMA_VALIDATION.
static void enableSerialization(org.apache.hadoop.conf.Configuration conf)
          Use this method to enable this serialization in Hadoop.
 org.apache.hadoop.conf.Configuration getConf()
           
 org.apache.hadoop.io.serializer.Deserializer<DatumWrapper<ITuple>> getDeserializer(Class<DatumWrapper<ITuple>> c)
           
static boolean getSchemaValidation(org.apache.hadoop.conf.Configuration conf)
          Returns whether strict schema validation is set; see CONF_SCHEMA_VALIDATION.
 org.apache.hadoop.io.serializer.Serializer<DatumWrapper<ITuple>> getSerializer(Class<DatumWrapper<ITuple>> c)
           
 void setConf(org.apache.hadoop.conf.Configuration thatConf)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CONF_SCHEMA_VALIDATION

public static final String CONF_SCHEMA_VALIDATION
Configuration parameter to enable strict schema validation.
When schema validation is set, the schema of the tuples emitted through the TupleMapper collector or TupleOutputFormat is validated: it must strictly match the expected schema set in those outputs, otherwise an exception is raised.

Strict matching is safer, but it is not recommended in production environments because its overhead may be significant. If schema validation is not set, duck typing is applied to the tuples: a tuple is accepted as long as its schema contains all the fields from the expected schema in the same order. It may contain additional fields, but these won't be serialized.

Important: Strict schema validation is unset by default and is only recommended in testing environments.
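
The difference between strict validation and duck typing can be sketched in plain Java. This is only an illustration under stated assumptions: schemas are modelled as lists of field names (real schemas also carry types), "in the same order" is read as "in the same relative order", which may be looser than what the library actually requires, and SchemaMatchSketch, strictMatch and duckMatch are hypothetical names that are not part of this API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: schemas are reduced to ordered lists of field names.
class SchemaMatchSketch {
    // Strict validation: the emitted schema must equal the expected one.
    static boolean strictMatch(List<String> expected, List<String> actual) {
        return expected.equals(actual);
    }

    // Duck typing (assumed reading): every expected field appears in the
    // emitted schema, in the same relative order; extra fields are allowed.
    static boolean duckMatch(List<String> expected, List<String> actual) {
        int next = 0; // index of the next expected field still to be found
        for (String field : actual) {
            if (next < expected.size() && expected.get(next).equals(field)) {
                next++;
            }
        }
        return next == expected.size();
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("user", "count");
        List<String> withExtra = Arrays.asList("user", "count", "debug");
        List<String> reordered = Arrays.asList("count", "user");

        System.out.println(strictMatch(expected, withExtra)); // extra field fails strict matching
        System.out.println(duckMatch(expected, withExtra));   // extra field is tolerated
        System.out.println(duckMatch(expected, reordered));   // order violation is rejected
    }
}
```

Under duck typing the extra "debug" field would be tolerated but, as described above, not serialized.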

See Also:
Constant Field Values
Constructor Detail

TupleSerialization

public TupleSerialization()

TupleSerialization

public TupleSerialization(HadoopSerialization ser,
                          TupleMRConfig tupleMRConf)
Method Detail

enableSchemaValidation

public static void enableSchemaValidation(org.apache.hadoop.conf.Configuration conf)
Enables strict schema validation; see CONF_SCHEMA_VALIDATION.


disableSchemaValidation

public static void disableSchemaValidation(org.apache.hadoop.conf.Configuration conf)
Disables strict schema validation; see CONF_SCHEMA_VALIDATION.


getSchemaValidation

public static boolean getSchemaValidation(org.apache.hadoop.conf.Configuration conf)
Returns whether strict schema validation is set; see CONF_SCHEMA_VALIDATION.


accept

public boolean accept(Class<?> c)
Specified by:
accept in interface org.apache.hadoop.io.serializer.Serialization<DatumWrapper<ITuple>>

getConf

public org.apache.hadoop.conf.Configuration getConf()
Specified by:
getConf in interface org.apache.hadoop.conf.Configurable

setConf

public void setConf(org.apache.hadoop.conf.Configuration thatConf)
Specified by:
setConf in interface org.apache.hadoop.conf.Configurable

getSerializer

public org.apache.hadoop.io.serializer.Serializer<DatumWrapper<ITuple>> getSerializer(Class<DatumWrapper<ITuple>> c)
Specified by:
getSerializer in interface org.apache.hadoop.io.serializer.Serialization<DatumWrapper<ITuple>>

getDeserializer

public org.apache.hadoop.io.serializer.Deserializer<DatumWrapper<ITuple>> getDeserializer(Class<DatumWrapper<ITuple>> c)
Specified by:
getDeserializer in interface org.apache.hadoop.io.serializer.Serialization<DatumWrapper<ITuple>>

enableSerialization

public static void enableSerialization(org.apache.hadoop.conf.Configuration conf)
Use this method to enable this serialization in Hadoop.


disableSerialization

public static void disableSerialization(org.apache.hadoop.conf.Configuration conf)
Use this method to disable this serialization in Hadoop.



Copyright © –2014 Datasalt Systems S.L. All rights reserved.