MarkLogic Connector for Hadoop 1.1-3

com.marklogic.mapreduce
Class DocumentInputFormat<VALUEIN>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<KEYIN,VALUEIN>
      extended by com.marklogic.mapreduce.MarkLogicInputFormat<DocumentURI,VALUEIN>
          extended by com.marklogic.mapreduce.DocumentInputFormat<VALUEIN>
All Implemented Interfaces:
MarkLogicConstants

public class DocumentInputFormat<VALUEIN>
extends MarkLogicInputFormat<DocumentURI,VALUEIN>

A MarkLogicInputFormat that reads documents.

Use this class to read documents from a MarkLogic database as input to a MapReduce job. It produces key-value pairs in which the key is the DocumentURI of a document and the value is the document at that URI, represented as VALUEIN.
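
The description above implies that a job needs connection and selection settings before documents can be read. The following is a minimal sketch, not taken from the connector's own samples, that only assembles such settings as name/value pairs using the JDK, so it runs without Hadoop or the connector on the classpath. The property names are assumed to be the string values behind the INPUT_HOST, INPUT_PORT, INPUT_USERNAME, INPUT_PASSWORD, INPUT_MODE, DOCUMENT_SELECTOR and INPUT_VALUE_CLASS constants listed under MarkLogicConstants; verify them against your connector version. The class name, method name, host, port, credentials and collection name are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the input settings a basic-mode job
// reading whole documents would typically need.
class DocumentInputSettings {

    // Returns the settings in insertion order. The keys are ASSUMED to match
    // the values of the corresponding MarkLogicConstants fields.
    static Map<String, String> basicModeInputSettings(
            String host, int port, String user, String password, String selector) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("mapreduce.marklogic.input.host", host);          // INPUT_HOST
        settings.put("mapreduce.marklogic.input.port", Integer.toString(port)); // INPUT_PORT
        settings.put("mapreduce.marklogic.input.username", user);      // INPUT_USERNAME
        settings.put("mapreduce.marklogic.input.password", password);  // INPUT_PASSWORD
        settings.put("mapreduce.marklogic.input.mode", "basic");       // INPUT_MODE
        // Path expression choosing which documents become input records.
        settings.put("mapreduce.marklogic.input.documentselector", selector); // DOCUMENT_SELECTOR
        // Class used for VALUEIN; Text is an assumption for illustration.
        settings.put("mapreduce.marklogic.input.valueclass",
                "org.apache.hadoop.io.Text");                          // INPUT_VALUE_CLASS
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> settings = basicModeInputSettings(
                "localhost", 8000, "reader", "changeme", "fn:collection(\"articles\")");
        for (Map.Entry<String, String> e : settings.entrySet()) {
            // Mask the password when printing.
            String shown = e.getKey().endsWith(".password") ? "****" : e.getValue();
            System.out.println(e.getKey() + " = " + shown);
        }
    }
}
```

In a real driver, each pair would be applied with Hadoop's Configuration.set(name, value), preferably through the MarkLogicConstants fields rather than string literals, or supplied in the job's configuration file. The job would then select this format with job.setInputFormatClass(DocumentInputFormat.class) and use a mapper whose input key type is DocumentURI.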

See Also:
ContentReader

Field Summary
 
Fields inherited from class com.marklogic.mapreduce.MarkLogicInputFormat
LOG
 
Fields inherited from interface com.marklogic.mapreduce.MarkLogicConstants
ADVANCED_MODE, BASIC_MODE, BATCH_SIZE, BIND_SPLIT_RANGE, CONTENT_TYPE, DEFAULT_BATCH_SIZE, DEFAULT_CONTENT_TYPE, DEFAULT_MAX_SPLIT_SIZE, DEFAULT_OUTPUT_CONTENT_ENCODING, DEFAULT_OUTPUT_XML_REPAIR_LEVEL, DEFAULT_PROPERTY_OPERATION_TYPE, DOCUMENT_SELECTOR, INDENTED, INPUT_DATABASE_NAME, INPUT_HOST, INPUT_KEY_CLASS, INPUT_LEXICON_FUNCTION_CLASS, INPUT_MODE, INPUT_PASSWORD, INPUT_PORT, INPUT_QUERY, INPUT_SSL_OPTIONS_CLASS, INPUT_USE_SSL, INPUT_USERNAME, INPUT_VALUE_CLASS, MAX_SPLIT_SIZE, MR_NAMESPACE, NODE_OPERATION_TYPE, OUTPUT_CLEAN_DIR, OUTPUT_COLLECTION, OUTPUT_CONTENT_ENCODING, OUTPUT_CONTENT_LANGUAGE, OUTPUT_CONTENT_NAMESPACE, OUTPUT_DIRECTORY, OUTPUT_FAST_LOAD, OUTPUT_FOREST_HOST, OUTPUT_HOST, OUTPUT_KEY_TYPE, OUTPUT_KEY_VARNAME, OUTPUT_NAMESPACE, OUTPUT_PASSWORD, OUTPUT_PERMISSION, OUTPUT_PORT, OUTPUT_PROPERTY_ALWAYS_CREATE, OUTPUT_QUALITY, OUTPUT_QUERY, OUTPUT_SSL_OPTIONS_CLASS, OUTPUT_STREAMING, OUTPUT_TOLERATE_ERRORS, OUTPUT_USE_SSL, OUTPUT_USERNAME, OUTPUT_VALUE_TYPE, OUTPUT_VALUE_VARNAME, OUTPUT_XML_REPAIR_LEVEL, PATH_NAMESPACE, PROPERTY_OPERATION_TYPE, RECORD_TO_FRAGMENT_RATIO, SPLIT_END_VARNAME, SPLIT_QUERY, SPLIT_START_VARNAME, SUBDOCUMENT_EXPRESSION, TXN_SIZE
 
Constructor Summary
DocumentInputFormat()
           
 
Method Summary
 org.apache.hadoop.mapreduce.RecordReader<DocumentURI,VALUEIN> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
           
 
Methods inherited from class com.marklogic.mapreduce.MarkLogicInputFormat
getSplits
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DocumentInputFormat

public DocumentInputFormat()

Method Detail

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<DocumentURI,VALUEIN> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
        org.apache.hadoop.mapreduce.TaskAttemptContext context)
    throws IOException, InterruptedException
Specified by:
createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<DocumentURI,VALUEIN>
Throws:
IOException
InterruptedException

Copyright © 2013 MarkLogic Corporation. All Rights Reserved.

Complete online documentation for MarkLogic Server, XQuery and related components may be found at developer.marklogic.com