com.datasalt.pangool.io
Class TupleFile.Writer

java.lang.Object
  extended by com.datasalt.pangool.io.TupleFile.Writer
All Implemented Interfaces:
Closeable
Enclosing class:
TupleFile

public static class TupleFile.Writer
extends Object
implements Closeable

Class for writing files containing ITuples. Typical usage would be:

    TupleFile.Writer writer = new TupleFile.Writer(fs, conf, file, schema);
    Tuple tuple = new Tuple(schema);
    for (...) {
      fillTuple(tuple);
      writer.append(tuple);
    }
    writer.close();
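
Because the writer implements Closeable, it can also be managed with try-with-resources so that the file is closed even when append throws. The following is a minimal sketch of that pattern only. StandInWriter and TryWithResourcesSketch are hypothetical names, not part of Pangool; the stand-in exists so the example compiles without Hadoop on the classpath. With the real class, the resource would be created with new TupleFile.Writer(fs, conf, file, schema).

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for TupleFile.Writer: same append/close shape, no Hadoop dependency.
class StandInWriter implements Closeable {
    final List<String> written = new ArrayList<>();
    boolean closed = false;

    void append(String record) throws IOException {
        if (record == null) {
            throw new IOException("cannot append a null record");
        }
        written.add(record);
    }

    @Override
    public void close() {
        closed = true;
    }
}

class TryWithResourcesSketch {
    // Appends every record; the writer is closed on both the normal and the exceptional path.
    static StandInWriter writeAll(List<String> records) {
        StandInWriter handle = new StandInWriter();
        try (StandInWriter writer = handle) {
            for (String record : records) {
                writer.append(record);
            }
        } catch (IOException e) {
            // A real caller would log or rethrow; the writer has already been closed here.
        }
        return handle;
    }

    public static void main(String[] args) {
        StandInWriter result = writeAll(Arrays.asList("a", "b"));
        System.out.println(result.written + " closed=" + result.closed);
    }
}
```

The catch block runs after the resource has been closed, so no explicit finally block is needed to release the file.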


Constructor Summary
TupleFile.Writer(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FSDataOutputStream out, Schema schema, org.apache.hadoop.io.SequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.io.SequenceFile.Metadata metadata)
          Creates a TupleFile Writer.
TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Schema schema)
          Creates the named file for storing ITuples with the given schema.
TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Schema schema, int bufferSize, short replication, long blockSize, org.apache.hadoop.util.Progressable progress, org.apache.hadoop.io.SequenceFile.Metadata metadata)
          Creates the named file, with a write-progress reporter, for storing ITuples with the given schema.
TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Schema schema, int bufferSize, short replication, long blockSize, org.apache.hadoop.io.SequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress, org.apache.hadoop.io.SequenceFile.Metadata metadata)
          Creates a TupleFile Writer.
TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Schema schema, org.apache.hadoop.util.Progressable progress, org.apache.hadoop.io.SequenceFile.Metadata metadata)
          Creates the named file, with a write-progress reporter, for storing ITuples with the given schema.
TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Schema schema, org.apache.hadoop.io.SequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress)
          Creates a TupleFile Writer.
TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Schema schema, org.apache.hadoop.io.SequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress, org.apache.hadoop.io.SequenceFile.Metadata metadata)
          Creates a TupleFile Writer.
 
Method Summary
 void append(ITuple tuple)
          Appends an ITuple to the file.
 void close()
          Close the file.
 org.apache.hadoop.io.compress.CompressionCodec getCompressionCodec()
          Returns the compression codec of data in this file.
 long getLength()
          Returns the current length of the output file.
 void sync()
          Creates a sync point.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TupleFile.Writer

public TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs,
                        org.apache.hadoop.conf.Configuration conf,
                        org.apache.hadoop.fs.Path name,
                        Schema schema)
                 throws IOException
Creates the named file for storing ITuples with the given schema.

Throws:
IOException

TupleFile.Writer

public TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs,
                        org.apache.hadoop.conf.Configuration conf,
                        org.apache.hadoop.fs.Path name,
                        Schema schema,
                        org.apache.hadoop.util.Progressable progress,
                        org.apache.hadoop.io.SequenceFile.Metadata metadata)
                 throws IOException
Creates the named file, with a write-progress reporter, for storing ITuples with the given schema.

Throws:
IOException

TupleFile.Writer

public TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs,
                        org.apache.hadoop.conf.Configuration conf,
                        org.apache.hadoop.fs.Path name,
                        Schema schema,
                        int bufferSize,
                        short replication,
                        long blockSize,
                        org.apache.hadoop.util.Progressable progress,
                        org.apache.hadoop.io.SequenceFile.Metadata metadata)
                 throws IOException
Creates the named file, with a write-progress reporter, for storing ITuples with the given schema.

Throws:
IOException

TupleFile.Writer

public TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs,
                        org.apache.hadoop.conf.Configuration conf,
                        org.apache.hadoop.fs.Path name,
                        Schema schema,
                        int bufferSize,
                        short replication,
                        long blockSize,
                        org.apache.hadoop.io.SequenceFile.CompressionType compressionType,
                        org.apache.hadoop.io.compress.CompressionCodec codec,
                        org.apache.hadoop.util.Progressable progress,
                        org.apache.hadoop.io.SequenceFile.Metadata metadata)
                 throws IOException
Creates a TupleFile Writer.

Parameters:
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
schema - The schema of the tuples to be written.
bufferSize - The buffer size for the underlying output stream.
replication - The replication factor for the file.
blockSize - The block size for the file.
compressionType - The compression type.
codec - The compression codec.
progress - The Progressable object to track progress.
metadata - The metadata of the file.
Throws:
IOException

TupleFile.Writer

public TupleFile.Writer(org.apache.hadoop.conf.Configuration conf,
                        org.apache.hadoop.fs.FSDataOutputStream out,
                        Schema schema,
                        org.apache.hadoop.io.SequenceFile.CompressionType compressionType,
                        org.apache.hadoop.io.compress.CompressionCodec codec,
                        org.apache.hadoop.io.SequenceFile.Metadata metadata)
                 throws IOException
Creates a TupleFile Writer.

Parameters:
conf - The configuration.
out - The stream on top of which the writer is to be constructed.
schema - The schema of the tuples to be written.
compressionType - The compression type.
codec - The compression codec.
metadata - The metadata of the file.
Throws:
IOException

TupleFile.Writer

public TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs,
                        org.apache.hadoop.conf.Configuration conf,
                        org.apache.hadoop.fs.Path name,
                        Schema schema,
                        org.apache.hadoop.io.SequenceFile.CompressionType compressionType,
                        org.apache.hadoop.io.compress.CompressionCodec codec,
                        org.apache.hadoop.util.Progressable progress)
                 throws IOException
Creates a TupleFile Writer.

Parameters:
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
schema - The schema of the tuples to be written.
compressionType - The compression type.
codec - The compression codec.
progress - The Progressable object to track progress.
Throws:
IOException

TupleFile.Writer

public TupleFile.Writer(org.apache.hadoop.fs.FileSystem fs,
                        org.apache.hadoop.conf.Configuration conf,
                        org.apache.hadoop.fs.Path name,
                        Schema schema,
                        org.apache.hadoop.io.SequenceFile.CompressionType compressionType,
                        org.apache.hadoop.io.compress.CompressionCodec codec,
                        org.apache.hadoop.util.Progressable progress,
                        org.apache.hadoop.io.SequenceFile.Metadata metadata)
                 throws IOException
Creates a TupleFile Writer.

Parameters:
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
schema - The schema of the tuples to be written.
compressionType - The compression type.
codec - The compression codec.
progress - The Progressable object to track progress.
metadata - The metadata of the file.
Throws:
IOException

Method Detail

getCompressionCodec

public org.apache.hadoop.io.compress.CompressionCodec getCompressionCodec()
Returns the compression codec of data in this file.


sync

public void sync()
          throws IOException
Creates a sync point.

Throws:
IOException

close

public void close()
           throws IOException
Close the file.

Specified by:
close in interface Closeable
Throws:
IOException

append

public void append(ITuple tuple)
            throws IOException
Appends an ITuple to the file.

Throws:
IOException

getLength

public long getLength()
               throws IOException
Returns the current length of the output file.

This always returns a synchronized position. In other words, immediately after calling TupleFile.Reader.seek(long) with a position returned by this method, TupleFile.Reader.next(ITuple) may be called. However, the tuple read may be earlier in the file than the tuple last written when this method was called (e.g., with block compression, it may be the first tuple in the block that was being written when this method was called).

Throws:
IOException
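
The block-compression caveat above can be illustrated with a toy model. This is a sketch under stated assumptions, not Pangool's or Hadoop's on-disk format: BlockModelWriter, BlockModelReader and SyncPositionSketch are hypothetical names, records are plain strings, a block holds a fixed number of records, and a "position" is simply the index of the first record of a block. The point it shows is that a position taken while a block is still being filled refers to the start of that block, so the first record read after seeking there can be older than the last record written.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: records are grouped into fixed-size blocks, and a position
// is the index of the first record of a block.
class BlockModelWriter {
    private final int blockSize;
    final List<String> records = new ArrayList<>();

    BlockModelWriter(int blockSize) {
        this.blockSize = blockSize;
    }

    void append(String record) {
        records.add(record);
    }

    // Start of the block currently being filled: always a block boundary.
    long getLength() {
        return (long) (records.size() / blockSize) * blockSize;
    }
}

// Toy reader over the same record list, positioned by record index.
class BlockModelReader {
    private final List<String> records;
    private int cursor = 0;

    BlockModelReader(List<String> records) {
        this.records = records;
    }

    void seek(long position) {
        cursor = (int) position;
    }

    // Returns the next record, or null when there are no more records.
    String next() {
        return cursor < records.size() ? records.get(cursor++) : null;
    }
}

class SyncPositionSketch {
    public static void main(String[] args) {
        BlockModelWriter writer = new BlockModelWriter(3);
        for (int i = 0; i < 5; i++) {
            writer.append("r" + i);
        }
        long position = writer.getLength();

        BlockModelReader reader = new BlockModelReader(writer.records);
        reader.seek(position);
        System.out.println("position=" + position + " first=" + reader.next());
    }
}
```

In this model, five records with a block size of three put the returned position at the start of the second block, so the first record read back is r3 even though r4 was the last one written.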


Copyright © –2014 Datasalt Systems S.L. All rights reserved.