MarkLogic Connector for Hadoop 1.1-3

com.marklogic.mapreduce
Class MarkLogicRecordReader<KEYIN,VALUEIN>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.RecordReader<KEYIN,VALUEIN>
      extended by com.marklogic.mapreduce.MarkLogicRecordReader<KEYIN,VALUEIN>
All Implemented Interfaces:
MarkLogicConstants, Closeable
Direct Known Subclasses:
DocumentReader, KeyValueReader, NodeReader, ValueReader

public abstract class MarkLogicRecordReader<KEYIN,VALUEIN>
extends org.apache.hadoop.mapreduce.RecordReader<KEYIN,VALUEIN>
implements MarkLogicConstants

A RecordReader that fetches data from MarkLogic Server and generates key-value pairs.
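
The following is a minimal sketch of how the abstract hooks listed under Method Summary (nextResult, endOfResult) could cooperate with nextKeyValue. It is not the connector's source. SketchRecordReader and IndexedTextReader are hypothetical stand-ins, a plain Iterator<String> stands in for com.marklogic.xcc.ResultSequence, and the control flow inside nextKeyValue is an assumption made for illustration.

```java
import java.util.Iterator;

// Hypothetical stand-in for the base class: it owns iteration and counting
// and leaves item conversion to the subclass hooks.
abstract class SketchRecordReader<K, V> {
    private final Iterator<String> result; // stands in for the ResultSequence
    protected long count;                  // records fetched so far

    SketchRecordReader(Iterator<String> result) {
        this.result = result;
    }

    // Assumed control flow: take the next item, count it, pass it to the hook.
    public boolean nextKeyValue() {
        if (result.hasNext()) {
            String item = result.next();
            count++;
            return nextResult(item);
        }
        endOfResult();
        return false;
    }

    public long getCount() {
        return count;
    }

    public abstract K getCurrentKey();

    public abstract V getCurrentValue();

    // Convert one fetched item into the current key and value.
    protected abstract boolean nextResult(String item);

    // Called once the results are exhausted.
    protected abstract void endOfResult();
}

// Hypothetical subclass: the key is the 1-based position, the value is the item text.
class IndexedTextReader extends SketchRecordReader<Long, String> {
    private Long key;
    private String value;

    IndexedTextReader(Iterator<String> result) {
        super(result);
    }

    @Override
    protected boolean nextResult(String item) {
        key = count;
        value = item;
        return true;
    }

    @Override
    protected void endOfResult() {
        key = null;
        value = null;
    }

    @Override
    public Long getCurrentKey() {
        return key;
    }

    @Override
    public String getCurrentValue() {
        return value;
    }
}
```

A caller would typically loop on nextKeyValue() and read getCurrentKey() and getCurrentValue() after each call that returns true. A real subclass would extend MarkLogicRecordReader, accept a com.marklogic.xcc.ResultItem in nextResult, and also implement getDefaultRatio.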


Field Summary
protected  org.apache.hadoop.conf.Configuration conf
          Job configuration.
protected  long count
          Count of records fetched.
protected  float length
          Total expected count of the records in a split.
static org.apache.commons.logging.Log LOG
           
protected  MarkLogicInputSplit mlSplit
          Input split for this record reader.
protected  com.marklogic.xcc.ResultSequence result
          ResultSequence from MarkLogic Server.
protected  com.marklogic.xcc.Session session
          Session to MarkLogic Server.
 
Fields inherited from interface com.marklogic.mapreduce.MarkLogicConstants
ADVANCED_MODE, BASIC_MODE, BATCH_SIZE, BIND_SPLIT_RANGE, CONTENT_TYPE, DEFAULT_BATCH_SIZE, DEFAULT_CONTENT_TYPE, DEFAULT_MAX_SPLIT_SIZE, DEFAULT_OUTPUT_CONTENT_ENCODING, DEFAULT_OUTPUT_XML_REPAIR_LEVEL, DEFAULT_PROPERTY_OPERATION_TYPE, DOCUMENT_SELECTOR, INDENTED, INPUT_DATABASE_NAME, INPUT_HOST, INPUT_KEY_CLASS, INPUT_LEXICON_FUNCTION_CLASS, INPUT_MODE, INPUT_PASSWORD, INPUT_PORT, INPUT_QUERY, INPUT_SSL_OPTIONS_CLASS, INPUT_USE_SSL, INPUT_USERNAME, INPUT_VALUE_CLASS, MAX_SPLIT_SIZE, MR_NAMESPACE, NODE_OPERATION_TYPE, OUTPUT_CLEAN_DIR, OUTPUT_COLLECTION, OUTPUT_CONTENT_ENCODING, OUTPUT_CONTENT_LANGUAGE, OUTPUT_CONTENT_NAMESPACE, OUTPUT_DIRECTORY, OUTPUT_FAST_LOAD, OUTPUT_FOREST_HOST, OUTPUT_HOST, OUTPUT_KEY_TYPE, OUTPUT_KEY_VARNAME, OUTPUT_NAMESPACE, OUTPUT_PASSWORD, OUTPUT_PERMISSION, OUTPUT_PORT, OUTPUT_PROPERTY_ALWAYS_CREATE, OUTPUT_QUALITY, OUTPUT_QUERY, OUTPUT_SSL_OPTIONS_CLASS, OUTPUT_STREAMING, OUTPUT_TOLERATE_ERRORS, OUTPUT_USE_SSL, OUTPUT_USERNAME, OUTPUT_VALUE_TYPE, OUTPUT_VALUE_VARNAME, OUTPUT_XML_REPAIR_LEVEL, PATH_NAMESPACE, PROPERTY_OPERATION_TYPE, RECORD_TO_FRAGMENT_RATIO, SPLIT_END_VARNAME, SPLIT_QUERY, SPLIT_START_VARNAME, SUBDOCUMENT_EXPRESSION, TXN_SIZE
 
Constructor Summary
MarkLogicRecordReader(org.apache.hadoop.conf.Configuration conf)
           
 
Method Summary
 void close()
           
protected abstract  void endOfResult()
           
 org.apache.hadoop.conf.Configuration getConf()
           
 long getCount()
           
protected abstract  float getDefaultRatio()
           
 float getProgress()
           
 void initialize(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
           
 boolean nextKeyValue()
           
protected abstract  boolean nextResult(com.marklogic.xcc.ResultItem result)
           
 
Methods inherited from class org.apache.hadoop.mapreduce.RecordReader
getCurrentKey, getCurrentValue
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

mlSplit

protected MarkLogicInputSplit mlSplit
Input split for this record reader.


count

protected long count
Count of records fetched.


session

protected com.marklogic.xcc.Session session
Session to MarkLogic Server.


result

protected com.marklogic.xcc.ResultSequence result
ResultSequence from MarkLogic Server.


conf

protected org.apache.hadoop.conf.Configuration conf
Job configuration.


length

protected float length
Total expected count of the records in a split.

Constructor Detail

MarkLogicRecordReader

public MarkLogicRecordReader(org.apache.hadoop.conf.Configuration conf)
Method Detail

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Specified by:
close in class org.apache.hadoop.mapreduce.RecordReader<KEYIN,VALUEIN>
Throws:
IOException

getConf

public org.apache.hadoop.conf.Configuration getConf()

getProgress

public float getProgress()
                  throws IOException,
                         InterruptedException
Specified by:
getProgress in class org.apache.hadoop.mapreduce.RecordReader<KEYIN,VALUEIN>
Throws:
IOException
InterruptedException
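
This page does not state how getProgress derives its value. The sketch below shows one plausible relationship between the count and length fields and a record-to-fragment ratio such as the one getDefaultRatio supplies. ProgressSketch and both of its methods are hypothetical, and the formulas are assumptions rather than the connector's documented behavior.

```java
// Hypothetical helper, not part of the connector: models how count and length
// could combine into a progress value.
final class ProgressSketch {
    private ProgressSketch() {
    }

    // Assumption: expected records = fragments in the split * records per fragment.
    static float expectedLength(long fragmentsInSplit, float recordToFragmentRatio) {
        return fragmentsInSplit * recordToFragmentRatio;
    }

    // Fraction of the expected records fetched so far, clamped to [0, 1].
    static float progress(long count, float length) {
        if (length <= 0f) {
            return 0f; // nothing expected; avoid dividing by zero
        }
        return Math.min(1f, count / length);
    }
}
```

Clamping matters because length is only an estimate: if a split yields more records than expected, the unclamped ratio would exceed 1.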

initialize

public void initialize(org.apache.hadoop.mapreduce.InputSplit split,
                       org.apache.hadoop.mapreduce.TaskAttemptContext context)
                throws IOException,
                       InterruptedException
Specified by:
initialize in class org.apache.hadoop.mapreduce.RecordReader<KEYIN,VALUEIN>
Throws:
IOException
InterruptedException

nextKeyValue

public boolean nextKeyValue()
                     throws IOException,
                            InterruptedException
Specified by:
nextKeyValue in class org.apache.hadoop.mapreduce.RecordReader<KEYIN,VALUEIN>
Throws:
IOException
InterruptedException

endOfResult

protected abstract void endOfResult()

nextResult

protected abstract boolean nextResult(com.marklogic.xcc.ResultItem result)

getDefaultRatio

protected abstract float getDefaultRatio()

getCount

public long getCount()

Copyright © 2013 MarkLogic Corporation. All Rights Reserved.

Complete online documentation for MarkLogic Server, XQuery and related components may be found at developer.marklogic.com