MarkLogic Connector for Hadoop 1.1-3

com.marklogic.mapreduce
Class ContentOutputFormat<VALUEOUT>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.OutputFormat<KEYOUT,VALUEOUT>
      extended by com.marklogic.mapreduce.MarkLogicOutputFormat<DocumentURI,VALUEOUT>
          extended by com.marklogic.mapreduce.ContentOutputFormat<VALUEOUT>
Type Parameters:
VALUEOUT - the type of the output value, that is, the content to insert (for example, Text or MarkLogicNode)
All Implemented Interfaces:
MarkLogicConstants, org.apache.hadoop.conf.Configurable

public class ContentOutputFormat<VALUEOUT>
extends MarkLogicOutputFormat<DocumentURI,VALUEOUT>

MarkLogicOutputFormat for Content.

Use this class to store results as content in a MarkLogic Server database. The text, XML, or binary content is inserted into the database at the given DocumentURI.

When using this MarkLogicOutputFormat, your key should be the URI of the document to insert into the database. The value should be the content to insert, in the form of Text or MarkLogicNode.

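Configuration properties control how the content is inserted, including permissions, collections, quality, directory, and content type. See the OUTPUT_PERMISSION, OUTPUT_COLLECTION, OUTPUT_QUALITY, OUTPUT_DIRECTORY, and CONTENT_TYPE constants in MarkLogicConstants.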
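The insertion properties can be gathered in one place before they are applied to a job. The following is a minimal sketch, not part of the connector: the class ContentOutputSettings and all example values are hypothetical, and the key strings are assumed to be the values of the corresponding MarkLogicConstants fields, which this page does not state. Check them against the constant values for your connector version. The comma-separated format for collections and the alternating role,capability format for permissions are likewise assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that collects settings for a ContentOutputFormat job. */
final class ContentOutputSettings {

    // Assumed string values of the MarkLogicConstants fields with the same
    // names; verify them against your connector version before relying on them.
    static final String CONTENT_TYPE = "mapreduce.marklogic.output.content.type";
    static final String OUTPUT_DIRECTORY = "mapreduce.marklogic.output.content.directory";
    static final String OUTPUT_COLLECTION = "mapreduce.marklogic.output.content.collection";
    static final String OUTPUT_PERMISSION = "mapreduce.marklogic.output.content.permission";
    static final String OUTPUT_QUALITY = "mapreduce.marklogic.output.content.quality";

    private ContentOutputSettings() {}

    /** Builds example settings; every value here is illustrative only. */
    static Map<String, String> example() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put(CONTENT_TYPE, "XML");
        settings.put(OUTPUT_DIRECTORY, "/reports/");
        // Assumed format: comma-separated collection names.
        settings.put(OUTPUT_COLLECTION, String.join(",", "reports", "hadoop-output"));
        // Assumed format: alternating role name and capability.
        settings.put(OUTPUT_PERMISSION,
                String.join(",", "report-reader", "read", "report-writer", "update"));
        settings.put(OUTPUT_QUALITY, Integer.toString(1));
        return settings;
    }

    public static void main(String[] args) {
        // Print each property as "key = value" in insertion order.
        example().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Each entry can then be copied into the job's org.apache.hadoop.conf.Configuration with set(key, value), or written into a Hadoop configuration file. In real code, prefer the MarkLogicConstants fields over string literals so that the keys track the connector version in use.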

See Also:
MarkLogicConstants, ContentLoader, ZipContentLoader

Field Summary
static org.apache.commons.logging.Log LOG
           
 
Fields inherited from class com.marklogic.mapreduce.MarkLogicOutputFormat
conf
 
Fields inherited from interface com.marklogic.mapreduce.MarkLogicConstants
ADVANCED_MODE, BASIC_MODE, BATCH_SIZE, BIND_SPLIT_RANGE, CONTENT_TYPE, DEFAULT_BATCH_SIZE, DEFAULT_CONTENT_TYPE, DEFAULT_MAX_SPLIT_SIZE, DEFAULT_OUTPUT_CONTENT_ENCODING, DEFAULT_OUTPUT_XML_REPAIR_LEVEL, DEFAULT_PROPERTY_OPERATION_TYPE, DOCUMENT_SELECTOR, INDENTED, INPUT_DATABASE_NAME, INPUT_HOST, INPUT_KEY_CLASS, INPUT_LEXICON_FUNCTION_CLASS, INPUT_MODE, INPUT_PASSWORD, INPUT_PORT, INPUT_QUERY, INPUT_SSL_OPTIONS_CLASS, INPUT_USE_SSL, INPUT_USERNAME, INPUT_VALUE_CLASS, MAX_SPLIT_SIZE, MR_NAMESPACE, NODE_OPERATION_TYPE, OUTPUT_CLEAN_DIR, OUTPUT_COLLECTION, OUTPUT_CONTENT_ENCODING, OUTPUT_CONTENT_LANGUAGE, OUTPUT_CONTENT_NAMESPACE, OUTPUT_DIRECTORY, OUTPUT_FAST_LOAD, OUTPUT_FOREST_HOST, OUTPUT_HOST, OUTPUT_KEY_TYPE, OUTPUT_KEY_VARNAME, OUTPUT_NAMESPACE, OUTPUT_PASSWORD, OUTPUT_PERMISSION, OUTPUT_PORT, OUTPUT_PROPERTY_ALWAYS_CREATE, OUTPUT_QUALITY, OUTPUT_QUERY, OUTPUT_SSL_OPTIONS_CLASS, OUTPUT_STREAMING, OUTPUT_TOLERATE_ERRORS, OUTPUT_USE_SSL, OUTPUT_USERNAME, OUTPUT_VALUE_TYPE, OUTPUT_VALUE_VARNAME, OUTPUT_XML_REPAIR_LEVEL, PATH_NAMESPACE, PROPERTY_OPERATION_TYPE, RECORD_TO_FRAGMENT_RATIO, SPLIT_END_VARNAME, SPLIT_QUERY, SPLIT_START_VARNAME, SUBDOCUMENT_EXPRESSION, TXN_SIZE
 
Constructor Summary
ContentOutputFormat()
           
 
Method Summary
 void checkOutputSpecs(org.apache.hadoop.conf.Configuration conf, com.marklogic.xcc.ContentSource cs)
           
 org.apache.hadoop.mapreduce.RecordWriter<DocumentURI,VALUEOUT> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
           
 
Methods inherited from class com.marklogic.mapreduce.MarkLogicOutputFormat
checkOutputSpecs, getConf, getForestHostMap, getOutputCommitter, queryForestHostMap, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

Constructor Detail

ContentOutputFormat

public ContentOutputFormat()

Method Detail

checkOutputSpecs

public void checkOutputSpecs(org.apache.hadoop.conf.Configuration conf,
                             com.marklogic.xcc.ContentSource cs)
                      throws IOException
Specified by:
checkOutputSpecs in class MarkLogicOutputFormat<DocumentURI,VALUEOUT>
Throws:
IOException

getRecordWriter

public org.apache.hadoop.mapreduce.RecordWriter<DocumentURI,VALUEOUT> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                               throws IOException,
                                                                                      InterruptedException
Specified by:
getRecordWriter in class org.apache.hadoop.mapreduce.OutputFormat<DocumentURI,VALUEOUT>
Throws:
IOException
InterruptedException


Copyright © 2013 MarkLogic Corporation. All Rights Reserved.

Complete online documentation for MarkLogic Server, XQuery and related components may be found at developer.marklogic.com