com.marklogic.mapreduce
Class ContentOutputFormat<VALUEOUT>
java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<KEYOUT,VALUEOUT>
com.marklogic.mapreduce.MarkLogicOutputFormat<DocumentURI,VALUEOUT>
com.marklogic.mapreduce.ContentOutputFormat<VALUEOUT>
- Type Parameters:
VALUEOUT - the type of the content value written for each document, for example Text or MarkLogicNode
- All Implemented Interfaces:
- MarkLogicConstants, org.apache.hadoop.conf.Configurable
public class ContentOutputFormat<VALUEOUT>
extends MarkLogicOutputFormat<DocumentURI,VALUEOUT>
MarkLogicOutputFormat for Content.
Use this class to store results as content in a MarkLogic Server database.
The text, XML, or binary content is inserted into the database at the
given DocumentURI.
When using this MarkLogicOutputFormat, your key should be the URI of
the document to insert into the database. The value should be the content to
insert, in the form of Text or
MarkLogicNode.
Several configuration properties control how content is inserted,
including permissions, collections, quality, directory, and content type.
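The insertion options listed above can be gathered in one place before a job driver copies them into its Hadoop Configuration with conf.set(key, value). The following is a minimal sketch using only the JDK, not the connector's own API: the class ContentOutputSettings, its build method, the host, port, role, and collection values are all hypothetical, and the property key strings are assumptions about the values behind the MarkLogicConstants fields named in the comments. In a real job, prefer the constants themselves and check the expected value formats against the connector documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper that collects content-insertion options into a map.
 * The key strings are assumed, not confirmed by this page; in a real job,
 * use the MarkLogicConstants fields noted beside each entry instead.
 */
public class ContentOutputSettings {

    // Assumed prefix shared by the connector's output properties.
    static final String PREFIX = "mapreduce.marklogic.output.";

    /** Builds the settings map; every argument value is illustrative. */
    public static Map<String, String> build(String host, int port,
            String contentType, String collections, String permissions,
            int quality, String directory) {
        // LinkedHashMap keeps insertion order, so printed output is stable.
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put(PREFIX + "host", host);                       // OUTPUT_HOST
        settings.put(PREFIX + "port", Integer.toString(port));     // OUTPUT_PORT
        settings.put(PREFIX + "content.type", contentType);        // CONTENT_TYPE
        settings.put(PREFIX + "content.collection", collections);  // OUTPUT_COLLECTION
        settings.put(PREFIX + "content.permission", permissions);  // OUTPUT_PERMISSION
        settings.put(PREFIX + "content.quality",
                Integer.toString(quality));                        // OUTPUT_QUALITY
        settings.put(PREFIX + "content.directory", directory);     // OUTPUT_DIRECTORY
        return settings;
    }

    public static void main(String[] args) {
        // Hypothetical values; the permission string format is an assumption.
        Map<String, String> settings = build("localhost", 8006, "XML",
                "loaded", "app-user,read", 0, "/loaded/");
        // A job driver would call conf.set(k, v) here instead of printing.
        settings.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Keeping the options in a plain map makes them easy to log or validate before the job is submitted. The output key and value classes (DocumentURI and Text or MarkLogicNode) are set separately on the job.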
- See Also:
MarkLogicConstants,
ContentLoader,
ZipContentLoader
Field Summary
static org.apache.commons.logging.Log LOG
Fields inherited from interface com.marklogic.mapreduce.MarkLogicConstants
ADVANCED_MODE, BASIC_MODE, BATCH_SIZE, BIND_SPLIT_RANGE, CONTENT_TYPE, DEFAULT_BATCH_SIZE, DEFAULT_CONTENT_TYPE, DEFAULT_MAX_SPLIT_SIZE, DEFAULT_OUTPUT_CONTENT_ENCODING, DEFAULT_OUTPUT_XML_REPAIR_LEVEL, DEFAULT_PROPERTY_OPERATION_TYPE, DOCUMENT_SELECTOR, INDENTED, INPUT_DATABASE_NAME, INPUT_HOST, INPUT_KEY_CLASS, INPUT_LEXICON_FUNCTION_CLASS, INPUT_MODE, INPUT_PASSWORD, INPUT_PORT, INPUT_QUERY, INPUT_SSL_OPTIONS_CLASS, INPUT_USE_SSL, INPUT_USERNAME, INPUT_VALUE_CLASS, MAX_SPLIT_SIZE, MR_NAMESPACE, NODE_OPERATION_TYPE, OUTPUT_CLEAN_DIR, OUTPUT_COLLECTION, OUTPUT_CONTENT_ENCODING, OUTPUT_CONTENT_LANGUAGE, OUTPUT_CONTENT_NAMESPACE, OUTPUT_DIRECTORY, OUTPUT_FAST_LOAD, OUTPUT_FOREST_HOST, OUTPUT_HOST, OUTPUT_KEY_TYPE, OUTPUT_KEY_VARNAME, OUTPUT_NAMESPACE, OUTPUT_PASSWORD, OUTPUT_PERMISSION, OUTPUT_PORT, OUTPUT_PROPERTY_ALWAYS_CREATE, OUTPUT_QUALITY, OUTPUT_QUERY, OUTPUT_SSL_OPTIONS_CLASS, OUTPUT_STREAMING, OUTPUT_TOLERATE_ERRORS, OUTPUT_USE_SSL, OUTPUT_USERNAME, OUTPUT_VALUE_TYPE, OUTPUT_VALUE_VARNAME, OUTPUT_XML_REPAIR_LEVEL, PATH_NAMESPACE, PROPERTY_OPERATION_TYPE, RECORD_TO_FRAGMENT_RATIO, SPLIT_END_VARNAME, SPLIT_QUERY, SPLIT_START_VARNAME, SUBDOCUMENT_EXPRESSION, TXN_SIZE
Method Summary
void checkOutputSpecs(org.apache.hadoop.conf.Configuration conf, com.marklogic.xcc.ContentSource cs)
org.apache.hadoop.mapreduce.RecordWriter<DocumentURI,VALUEOUT> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
LOG
public static final org.apache.commons.logging.Log LOG
Constructor Detail
ContentOutputFormat
public ContentOutputFormat()
Method Detail
checkOutputSpecs
public void checkOutputSpecs(org.apache.hadoop.conf.Configuration conf,
com.marklogic.xcc.ContentSource cs)
throws IOException
- Specified by:
checkOutputSpecs in class MarkLogicOutputFormat<DocumentURI,VALUEOUT>
- Throws:
IOException
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<DocumentURI,VALUEOUT> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
throws IOException,
InterruptedException
- Specified by:
getRecordWriter in class org.apache.hadoop.mapreduce.OutputFormat<DocumentURI,VALUEOUT>
- Throws:
IOException
InterruptedException
Copyright © 2013 MarkLogic Corporation. All Rights Reserved.
Complete online documentation for MarkLogic Server,
XQuery and related components may be found at
developer.marklogic.com