public interface MarkLogicConstants
Use the mapreduce.marklogic.input.*
properties when
using MarkLogic Server as an input source. Use the
mapreduce.marklogic.output.*
properties when using
MarkLogic Server to store your results.
Modifier and Type | Field and Description |
---|---|
static String |
ADVANCED_MODE
Value string of advanced mode for
input.mode . |
static String |
ASSIGNMENT_POLICY
The config property name (
"mapreduce.marklogic.output.assignmentpolicy" )
which, if set, indicates assignment policy for output documents. |
static String |
BASIC_MODE
Value string of basic mode for
input.mode . |
static String |
BATCH_SIZE
The config property name (
"mapreduce.marklogic.output.batchsize" )
which, if set, indicates the number of records in one request. |
static String |
BIND_SPLIT_RANGE
The config property name (
"mapreduce.marklogic.input.bindsplitrange" )
which, if set to true, specifies that the input query declares and
references external variables "splitstart"
and "splitend" under the
namespace "http://marklogic.com/hadoop". |
static String |
COLLECTION_FILTER
The config property name (
"mapreduce.marklogic.input.filter.collection" )
which, if set, indicates to only include documents with one or more of
the specified collection URIs when using ForestInputFormat. |
static String |
CONTENT_TYPE
The config property name (
"mapreduce.marklogic.output.content.type" )
which, if set, indicates type of content to be inserted when using
ContentOutputFormat. |
static String |
COPY_COLLECTIONS
The config property name (
"mapreduce.marklogic.copycollections" )
which, if set, specifies whether to copy document collections from
source to destination. |
static String |
COPY_METADATA
The config property name (
"mapreduce.marklogic.copymetadata" )
which, if set, specifies whether to copy document metadata from
source to destination. |
static String |
COPY_QUALITY
The config property name (
"mapreduce.marklogic.copyquality" )
which, if set, specifies whether to copy document quality from
source to destination. |
static int |
DEFAULT_BATCH_SIZE
Default batch size.
|
static String |
DEFAULT_CONTENT_TYPE
Default content type.
|
static long |
DEFAULT_LOCAL_MAX_SPLIT_SIZE
The default maximum split size for input splits, used if
input.maxsplitsize is not specified
and running in local mode. |
static long |
DEFAULT_MAX_SPLIT_SIZE
The default maximum split size for input splits, used if
input.maxsplitsize is not specified. |
static String |
DEFAULT_OUTPUT_CONTENT_ENCODING
Default output content encoding
|
static String |
DEFAULT_OUTPUT_XML_REPAIR_LEVEL
Default output XML repair level
|
static String |
DEFAULT_PROPERTY_OPERATION_TYPE
Default property operation type.
|
static int |
DEFAULT_TXN_SIZE
Default transaction size.
|
static String |
DIRECTORY_FILTER
The config property name (
"mapreduce.marklogic.input.filter.directory" )
which, if set, indicates to only include documents with one of
the specified directory URIs when using ForestInputFormat. |
static String |
DOCUMENT_SELECTOR
The config property name (
"mapreduce.marklogic.input.documentselector" )
which, if set, specifies the document selection portion of the
path expression used to retrieve data from the server. |
static String |
EXECUTION_MODE
The config property name (
"mapreduce.marklogic.mode" )
which, if set, indicates whether the job is running in local or
distributed mode. |
static String |
EXTRACT_URI |
static String |
INDENTED
The config property name (
"mapreduce.marklogic.input.indented" ) which, if
set, specifies whether to format data retrieved from MarkLogic
with indentation. |
static String |
INPUT_DATABASE_NAME
Not yet Implemented.
|
static String |
INPUT_HOST
The config property name (
"mapreduce.marklogic.input.host" )
which, if set, specifies the MarkLogic Server host to use for
input operations. |
static String |
INPUT_KEY_CLASS
The config property name (
"mapreduce.marklogic.input.keyclass" )
which, if set, specifies the name of the class of the map
input keys for KeyValueInputFormat . |
static String |
INPUT_KEYSTORE_PASSWD
The config property name (
"mapreduce.marklogic.input.keystorepassword" )
which, if set, specifies the Keystore password which will be used if
input.usessl is set to true. |
static String |
INPUT_KEYSTORE_PATH
The config property name (
"mapreduce.marklogic.input.keystorepath" )
which, if set, specifies the Keystore which will be used if
input.usessl is set to true. |
static String |
INPUT_LEXICON_FUNCTION_CLASS
The config property name (
"mapreduce.marklogic.input.lexiconfunctionclass" )
which, if set, specifies the name of the class implementing
LexiconFunction
which will be used to generate input. |
static String |
INPUT_MODE
The config property name (
"mapreduce.marklogic.input.mode" )
which, if set, specifies whether to use basic or advanced
input query mode. |
static String |
INPUT_PASSWORD
The config property name (
"mapreduce.marklogic.input.password" )
which, if set, specifies the cleartext password to use for
authentication with input.username . |
static String |
INPUT_PORT
The config property name (
"mapreduce.marklogic.input.port" )
which, if set, specifies the port number of the input XDBC
server on the MarkLogic Server host specified by the
input.host property. |
static String |
INPUT_QUERY
The config property name (
"mapreduce.marklogic.input.query" )
which, if set, specifies the query used to retrieve input
records from MarkLogic Server. |
static String |
INPUT_QUERY_LANGUAGE
The config property name (
"mapreduce.marklogic.input.querylanguage" )
which, if set, specifies the query language to use for the input query and split query. |
static String |
INPUT_QUERY_TIMESTAMP
The config property name (
"mapreduce.marklogic.input.querytimestamp" )
which, if set, specifies data retrieval from MarkLogic Server at the
specified timestamp. |
static String |
INPUT_RESTRICT_HOSTS
The config property name (
"mapreduce.marklogic.input.restricthosts" )
which, if set, specifies whether to restrict input hosts that
mlcp will connect to. |
static String |
INPUT_SSL_OPTIONS_CLASS
The config property name (
"mapreduce.marklogic.input.ssloptionsclass" )
which, if set, specifies the name of the class implementing
SslConfigOptions which will be used if
input.usessl is set to true. |
static String |
INPUT_SSL_PROTOCOL
The config property name (
"mapreduce.marklogic.input.sslprotocol" )
which, if set, specifies the SSL protocol which will be used if
input.usessl is set to true. |
static String |
INPUT_TRUSTSTORE_PASSWD
The config property name (
"mapreduce.marklogic.input.truststorepassword" )
which, if set, specifies the truststore password which will be used if
input.usessl is set to true. |
static String |
INPUT_TRUSTSTORE_PATH
The config property name (
"mapreduce.marklogic.input.truststorepath" )
which, if set, specifies the truststore which will be used if
input.usessl is set to true. |
static String |
INPUT_USE_SSL
The config property name (
"mapreduce.marklogic.input.usessl" )
which, if set, specifies whether the connection to the input server is
SSL enabled; false is assumed if not set. |
static String |
INPUT_USERNAME
The config property name (
"mapreduce.marklogic.input.username" )
which, if set, specifies the MarkLogic Server user name
under which input queries and operations run. |
static String |
INPUT_VALUE_CLASS
The config property name (
"mapreduce.marklogic.input.valueclass" )
which, if set, specifies the name of the class of the map
input value for KeyValueInputFormat , ValueInputFormat
and DocumentInputFormat . |
static String |
MAX_SPLIT_SIZE
The config property name (
"mapreduce.marklogic.input.maxsplitsize" )
which, if set, specifies the maximum number of fragments per
input split. |
static long |
MIN_NODEUPDATE_VERSION
Minimum MarkLogic version to accept node-update permissions.
|
static String |
MODE_DISTRIBUTED |
static String |
MODE_LOCAL |
static String |
MR_NAMESPACE
The namespace ("http://marklogic.com/hadoop") in which the split range external variables
are defined.
|
static String |
NODE_OPERATION_TYPE
The config property name (
"mapreduce.marklogic.output.node.optype" )
which, if set, indicates what node operation to perform
during output. |
static String |
OUTPUT_CLEAN_DIR
The config property name (
"mapreduce.marklogic.output.content.cleandir" )
which, if set, indicates whether or not to remove the output
directory. |
static String |
OUTPUT_COLLECTION
The config property name (
"mapreduce.marklogic.output.content.collection" )
which, if set, specifies a comma-separated list of collections
to which generated output documents are added. |
static String |
OUTPUT_CONTENT_ENCODING
The config property name (
"mapreduce.marklogic.output.content.encoding" ) which, if set,
specifies the charset encoding to be used by the server when loading
this document. |
static String |
OUTPUT_CONTENT_LANGUAGE
The config property name (
"mapreduce.marklogic.output.content.language" ) which, if set,
specifies the language name to associate with inserted documents. |
static String |
OUTPUT_CONTENT_NAMESPACE
The config property name (
"mapreduce.marklogic.output.content.namespace" ) which, if set,
specifies the namespace to associate with inserted documents. |
static String |
OUTPUT_DATABASE_NAME
The config property name (
"mapreduce.marklogic.output.databasename" )
which, if set, specifies the MarkLogic Server database to use for
output operations. |
static String |
OUTPUT_DIRECTORY
The config property name (
"mapreduce.marklogic.output.content.directory" )
which, if set, specifies the MarkLogic Server database directory
where output documents are created. |
static String |
OUTPUT_FAST_LOAD
The config property name (
"mapreduce.marklogic.output.content.fastload" )
which, if set, indicates whether or not to use the fast load mode
to load content into MarkLogic. |
static String |
OUTPUT_FOREST_HOST
Internal use only.
|
static String |
OUTPUT_GRAPH
Default graph for RDF.
|
static String |
OUTPUT_HOST
The config property name (
"mapreduce.marklogic.output.host" )
which, if set, specifies the MarkLogic Server host to use for
output operations. |
static String |
OUTPUT_KEY_TYPE
The config property name (
"mapreduce.marklogic.output.keytype" )
which, if set, specifies the data type of the
output keys for KeyValueOutputFormat . |
static String |
OUTPUT_KEY_VARNAME
Value string of the output key external variable name.
|
static String |
OUTPUT_KEYSTORE_PASSWD
The config property name (
"mapreduce.marklogic.output.keystorepassword" )
which, if set, specifies the Keystore password which will be used if
output.usessl is set to true. |
static String |
OUTPUT_KEYSTORE_PATH
The config property name (
"mapreduce.marklogic.output.keystorepath" )
which, if set, specifies the Keystore which will be used if
output.usessl is set to true. |
static String |
OUTPUT_NAMESPACE
The config property name (
"mapreduce.marklogic.output.node.namespace" )
which, if set, indicates the namespace used for output. |
static String |
OUTPUT_OVERRIDE_GRAPH
Override graph for RDF.
|
static String |
OUTPUT_PARTITION
The config property name (
"mapreduce.marklogic.output.partition" )
which, if set, specifies the partition
where output documents are created. |
static String |
OUTPUT_PASSWORD
The config property name (
"mapreduce.marklogic.output.password" )
which, if set, specifies the cleartext password to use for
authentication with output.username . |
static String |
OUTPUT_PERMISSION
The config property name (
"mapreduce.marklogic.output.content.permission" )
which, if set, specifies a comma-separated list of role-capability
pairs to associate with created output documents. |
static String |
OUTPUT_PORT
The config property name (
"mapreduce.marklogic.output.port" )
which, if set, specifies the port number of the output MarkLogic
Server specified by the output.host property. |
static String |
OUTPUT_PROPERTY_ALWAYS_CREATE
The config property name (
"mapreduce.marklogic.output.property.alwayscreate" )
which, if set to true, causes PropertyOutputFormat
to create document properties for reduce output
key-value pairs even when no document exists with
the target URI. |
static String |
OUTPUT_QUALITY
The config property name (
"mapreduce.marklogic.output.content.quality" )
which, if set, specifies the document quality for created
output documents. |
static String |
OUTPUT_QUERY
The config property name (
"mapreduce.marklogic.output.query" )
which, if set, specifies the statement to execute against MarkLogic
Server. |
static String |
OUTPUT_QUERY_LANGUAGE
The config property name (
"mapreduce.marklogic.output.querylanguage" )
which, if set, specifies the query language to use for the output query. |
static String |
OUTPUT_RESTRICT_HOSTS
The config property name (
"mapreduce.marklogic.output.restricthosts" )
which, if set, specifies whether to restrict output hosts that
mlcp will connect to. |
static String |
OUTPUT_SSL_OPTIONS_CLASS
The config property name (
"mapreduce.marklogic.output.ssloptionsclass" )
which, if set, specifies the name of the class implementing
SslConfigOptions which will be used if
output.usesslprotocol is set to SSLv3. |
static String |
OUTPUT_SSL_PROTOCOL
The config property name (
"mapreduce.marklogic.output.sslprotocol" )
which, if set, specifies SSL protocol which will be used if
output.usessl is set to true. |
static String |
OUTPUT_STREAMING
The config property name (
"mapreduce.marklogic.output.content.streaming" )
which, if set, specifies whether to use streaming to insert
content. |
static String |
OUTPUT_TRUSTSTORE_PASSWD
The config property name (
"mapreduce.marklogic.output.truststorepassword" )
which, if set, specifies the truststore password which will be used if
output.usessl is set to true. |
static String |
OUTPUT_TRUSTSTORE_PATH
The config property name (
"mapreduce.marklogic.output.truststorepath" )
which, if set, specifies the truststore which will be used if
output.usessl is set to true. |
static String |
OUTPUT_URI_PREFIX
The config property name (
"mapreduce.marklogic.output_uriprefix" )
which, if set, specifies a string to prepend to all document URIs. |
static String |
OUTPUT_URI_REPLACE
The config property name (
"mapreduce.marklogic.output.urireplace" )
which, if set, specifies a comma-separated list of regex pattern and
replacement string pairs: the first matches a URI segment, and the second,
enclosed in single quotes, is the string to replace it with. |
static String |
OUTPUT_URI_SUFFIX
The config property name (
"mapreduce.marklogic.output_urisuffix" )
which, if set, specifies a string to append to all document URIs. |
static String |
OUTPUT_USE_SSL
The config property name (
"mapreduce.marklogic.output.usessl" )
which, if set, specifies whether the connection to the output server is
SSL enabled; false is assumed if not set. |
static String |
OUTPUT_USERNAME
The config property name (
"mapreduce.marklogic.output.username" )
which, if set, specifies the MarkLogic Server user name
under which output operations run. |
static String |
OUTPUT_VALUE_TYPE
The config property name (
"mapreduce.marklogic.output.valuetype" )
which, if set, specifies the data type of the map
output value for KeyValueOutputFormat . |
static String |
OUTPUT_VALUE_VARNAME
Value string of the output value external variable name.
|
static String |
OUTPUT_XML_REPAIR_LEVEL
The config property name (
"mapreduce.marklogic.output.content.repairlevel" )
which, if set, specifies the document repair level for this options object. |
static String |
PATH_NAMESPACE
The config property name (
"mapreduce.marklogic.input.namespace" )
which, if set, specifies a list of namespaces to use when
evaluating the path expression constructed from the
input.documentselector and
input.subdocumentexpr properties. |
static String |
PROPERTY_OPERATION_TYPE
The config property name (
"mapreduce.marklogic.output.property.optype" )
which, if set, indicates what property operation to perform
during output when using PropertyOutputFormat . |
static String |
QUERY_FILTER
The config property name (
"mapreduce.marklogic.input.filter.query" )
which, if set, indicates to only include documents matching the cts
query MarkLogicInputFormat . |
static String |
RECORD_TO_FRAGMENT_RATIO
The config property name (
"mapreduce.marklogic.input.recordtofragmentratio" ) which, if
set, specifies the ratio of the number of retrieved
records to the number of accessed fragments. |
static String |
REDACTION_RULE_COLLECTION
The config property name (
"mapreduce.marklogic.input.redaction.rules" )
which, if set, specifies a comma-separated list of
redaction rule collection URIs. |
static String |
SERVER_THREAD_COUNT
For adjusting MLCP concurrency, internal use only
|
static String |
SPLIT_END_VARNAME
Use this external variable name (
"splitend" ) in your advanced
mode input query to access the end value of the record range in an input
split when "mapreduce.marklogic.input.bindsplitrange" is true. |
static String |
SPLIT_QUERY
The config property name (
"mapreduce.marklogic.input.splitquery" )
which, if set, specifies the query MarkLogic Server uses
to generate input splits. |
static String |
SPLIT_START_VARNAME
Use this external variable name (
"splitstart" ) in your advanced
mode input query to access the start value of the record range in an
input split when "mapreduce.marklogic.input.bindsplitrange" is true. |
static String |
SUBDOCUMENT_EXPRESSION
The config property name (
"mapreduce.marklogic.input.subdocumentexpr" )
which, if set, specifies the path expression used to retrieve
sub-document records from the server. |
static String |
TEMPORAL_COLLECTION
The config property name (
"mapreduce.marklogic.output.temporalcollection" )
which, if set, indicates temporal collection for documents. |
static double |
THREAD_MULTIPLIER |
static String |
TXN_SIZE
The config property name (
"mapreduce.marklogic.output.transactionsize" )
which, if set, indicates the number of requests in one transaction. |
static String |
TYPE_FILTER
The config property name (
"mapreduce.marklogic.input.filter.type" )
which, if set, indicates to only include documents with one of
the specified types when using ForestInputFormat. |
static final String INPUT_USERNAME
The config property name ("mapreduce.marklogic.input.username") which, if set, specifies the MarkLogic Server user name under which input queries and operations run. Required if using MarkLogic Server for input.
static final String INPUT_PASSWORD
The config property name ("mapreduce.marklogic.input.password") which, if set, specifies the cleartext password to use for authentication with input.username. Required if using MarkLogic Server for input.
static final String INPUT_HOST
The config property name ("mapreduce.marklogic.input.host") which, if set, specifies the MarkLogic Server host to use for input operations. Required if using MarkLogic Server for input.
static final String INPUT_PORT
The config property name ("mapreduce.marklogic.input.port") which, if set, specifies the port number of the input XDBC server on the MarkLogic Server host specified by the input.host property. Required if using MarkLogic Server for input.
NOTE: Within a cluster, all nodes supplying MapReduce input data must use the same XDBC server port number.
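The four required input connection properties can be gathered in one place before they are handed to a job. The following is a minimal sketch using only the JDK; the class name InputConnectionSketch, the helper inputConnection, and the example host, user, and port are hypothetical, and the constants are repeated locally rather than taken from MarkLogicConstants so that the sketch compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the connector API.
final class InputConnectionSketch {
    // Property names as documented for the corresponding constants.
    static final String INPUT_USERNAME = "mapreduce.marklogic.input.username";
    static final String INPUT_PASSWORD = "mapreduce.marklogic.input.password";
    static final String INPUT_HOST = "mapreduce.marklogic.input.host";
    static final String INPUT_PORT = "mapreduce.marklogic.input.port";

    /** Collects the four required input connection properties, rejecting missing values. */
    static Map<String, String> inputConnection(String user, String password, String host, int port) {
        if (user == null || user.isEmpty() || password == null || host == null || host.isEmpty()) {
            throw new IllegalArgumentException("username, password and host are all required");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put(INPUT_USERNAME, user);
        props.put(INPUT_PASSWORD, password);
        props.put(INPUT_HOST, host);
        props.put(INPUT_PORT, Integer.toString(port));
        return props;
    }

    public static void main(String[] args) {
        // Example values are placeholders; only the property names are printed.
        Map<String, String> props =
            inputConnection("hadoop-user", "change-me", "ml-host.example.com", 8006);
        props.keySet().forEach(System.out::println);
    }
}
```

In a real job, each entry would typically be copied into the Hadoop Configuration with conf.set(name, value). Because the same port applies to every node supplying input, a single port value is taken here.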
static final String INPUT_USE_SSL
The config property name ("mapreduce.marklogic.input.usessl") which, if set, specifies whether the connection to the input server is SSL enabled; false is assumed if not set.
static final String INPUT_SSL_PROTOCOL
The config property name ("mapreduce.marklogic.input.sslprotocol") which, if set, specifies the SSL protocol which will be used if input.usessl is set to true.
static final String INPUT_KEYSTORE_PATH
The config property name ("mapreduce.marklogic.input.keystorepath") which, if set, specifies the Keystore which will be used if input.usessl is set to true.
static final String INPUT_KEYSTORE_PASSWD
The config property name ("mapreduce.marklogic.input.keystorepassword") which, if set, specifies the Keystore password which will be used if input.usessl is set to true.
static final String INPUT_TRUSTSTORE_PATH
The config property name ("mapreduce.marklogic.input.truststorepath") which, if set, specifies the truststore which will be used if input.usessl is set to true.
static final String INPUT_TRUSTSTORE_PASSWD
The config property name ("mapreduce.marklogic.input.truststorepassword") which, if set, specifies the truststore password which will be used if input.usessl is set to true.
static final String INPUT_SSL_OPTIONS_CLASS
The config property name ("mapreduce.marklogic.input.ssloptionsclass") which, if set, specifies the name of the class implementing SslConfigOptions which will be used if input.usessl is set to true.
static final String DOCUMENT_SELECTOR
The config property name ("mapreduce.marklogic.input.documentselector") which, if set, specifies the document selection portion of the path expression used to retrieve data from the server. Only used if using MarkLogic Server for input in basic mode.
The XQuery path expression step given in this property must select a sequence of document nodes. To further refine the input selection to nodes or values within the documents, use input.subdocumentexpr. If this property is not set, fn:collection() is used. For more information, see the overview.
This property is only usable when basic mode is specified with the input.mode property. If more powerful input customization is needed, use advanced mode and specify a complete input query with the input.query property.
The path expression step given in this property must be searchable. A searchable expression is one which can be optimized using indexes. See the Query and Performance Tuning Guide for more information on searchable path expressions.
The following selects all documents:
<property> <name>mapreduce.marklogic.input.documentselector</name> <value>fn:collection()</value> </property>
static final String SUBDOCUMENT_EXPRESSION
The config property name ("mapreduce.marklogic.input.subdocumentexpr") which, if set, specifies the path expression used to retrieve sub-document records from the server. Used only if using MarkLogic Server for input in basic mode. If not set, the document nodes selected by the document selector are used.
The XQuery path expression step given in this property should select a sequence of nodes or atomic values from the set of documents selected by the path step given in the input.documentselector property. For more information, see the overview.
This property is only usable when basic mode is specified with the input.mode property. If more powerful input customization is needed, use advanced mode and specify a complete input query with the input.query property.
The following selects, from all documents, every wp:a element that has an href attribute:
<property> <name>mapreduce.marklogic.input.documentselector</name> <value>fn:collection()</value> </property> <property> <name>mapreduce.marklogic.input.subdocumentexpr</name> <value>//wp:a[@href]</value> </property>
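To make the relationship between the two basic-mode properties concrete, the sketch below joins a document selector and a subdocument expression into one path expression. It assumes simple concatenation, which is an illustration only; how the connector actually builds its query is not stated here. The class and method names are hypothetical, and the fn:collection() fallback follows the documented default.

```java
// Hypothetical illustration; the connector's real query construction may differ.
final class BasicModePathSketch {
    // Documented default when input.documentselector is not set.
    static final String DEFAULT_SELECTOR = "fn:collection()";

    /** Joins the document selector and the optional subdocument expression. */
    static String pathExpression(String documentSelector, String subdocumentExpr) {
        String selector = (documentSelector == null || documentSelector.isEmpty())
            ? DEFAULT_SELECTOR
            : documentSelector;
        // With no subdocument expression, the selected document nodes are used as-is.
        String sub = (subdocumentExpr == null) ? "" : subdocumentExpr;
        return selector + sub;
    }

    public static void main(String[] args) {
        System.out.println(pathExpression(null, "//wp:a[@href]"));
    }
}
```

The wp prefix in the example expression still has to be bound to a namespace URI through the input.namespace property.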
static final String INPUT_LEXICON_FUNCTION_CLASS
The config property name ("mapreduce.marklogic.input.lexiconfunctionclass") which, if set, specifies the name of the class implementing LexiconFunction which will be used to generate input.
static final String PATH_NAMESPACE
The config property name ("mapreduce.marklogic.input.namespace") which, if set, specifies a list of namespaces to use when evaluating the path expression constructed from the input.documentselector and input.subdocumentexpr properties.
Specify the namespaces as comma-separated alias-URI pairs. For example:
<property> <name>mapreduce.marklogic.input.namespace</name> <value>wp, "http://www.mediawiki.org.xml/export-0.4/"</value> </property>
If a namespace URI includes a comma, you must set this property programmatically, rather than in a config file.
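The alias-URI list format can be produced from a map of prefixes. The sketch below is a hypothetical helper (NamespaceValueSketch is not part of the connector) that mirrors the quoting used in the example above and refuses any alias or URI containing a comma, since such values cannot be expressed in the comma-separated form.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; the output format mirrors the documented example value.
final class NamespaceValueSketch {
    /** Formats alias-to-URI bindings as comma-separated alias, "URI" pairs. */
    static String namespaceValue(Map<String, String> aliasToUri) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Map.Entry<String, String> e : aliasToUri.entrySet()) {
            if (e.getKey().contains(",") || e.getValue().contains(",")) {
                // A comma would be indistinguishable from the pair separator.
                throw new IllegalArgumentException("comma not allowed in: " + e);
            }
            joiner.add(e.getKey());
            joiner.add("\"" + e.getValue() + "\"");
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> ns = new LinkedHashMap<>();
        ns.put("wp", "http://www.mediawiki.org.xml/export-0.4/");
        System.out.println(namespaceValue(ns));
    }
}
```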
static final String SPLIT_QUERY
The config property name ("mapreduce.marklogic.input.splitquery") which, if set, specifies the query MarkLogic Server uses to generate input splits. This property is required (and only usable) in advanced mode; see the input.mode property for details.
The split query must return a sequence of (forest id, record count, hostname) tuples. The host name and forest id identify the forest associated with the split. The count is an estimate of the number of key-value pairs in the split.
The default split query used in basic input mode computes a rough estimate based on the number of documents in the database.
static final String MAX_SPLIT_SIZE
The config property name ("mapreduce.marklogic.input.maxsplitsize") which, if set, specifies the maximum number of fragments per input split. Optional. Default: 50000L.
The default should be suitable for most applications.
static final String INPUT_DATABASE_NAME
The config property name ("mapreduce.marklogic.input.databasename") which, if set, specifies the name of the MarkLogic Server database from which to create input splits.
static final String INPUT_KEY_CLASS
The config property name ("mapreduce.marklogic.input.keyclass") which, if set, specifies the name of the class of the map input keys for KeyValueInputFormat. Optional. Default: Text.
static final String INPUT_VALUE_CLASS
The config property name ("mapreduce.marklogic.input.valueclass") which, if set, specifies the name of the class of the map input value for KeyValueInputFormat, ValueInputFormat and DocumentInputFormat. Optional. Default: Text for KeyValueInputFormat and ValueInputFormat, DatabaseDocument for DocumentInputFormat.
static final String INPUT_MODE
The config property name ("mapreduce.marklogic.input.mode") which, if set, specifies whether to use basic or advanced input query mode. Allowable values are basic and advanced. Optional. Default: basic.
Only basic mode is supported at this time.
Basic mode enables use of the input.documentselector, input.subdocumentexpr, and input.namespace properties. Advanced mode enables use of the input.query and input.splitquery properties.
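Because each mode enables a different set of properties, a job configuration can be checked for properties that do not belong to the selected mode. The following sketch works on a plain Map rather than a Hadoop Configuration so that it runs standalone; the class InputModeCheckSketch and its method are hypothetical, and the connector itself may handle misplaced properties differently (for example, by ignoring them).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical pre-flight check, not part of the connector API.
final class InputModeCheckSketch {
    static final String PREFIX = "mapreduce.marklogic.input.";

    // Properties the documentation associates with each mode.
    static final Set<String> BASIC_ONLY = new TreeSet<>(Arrays.asList(
        PREFIX + "documentselector", PREFIX + "subdocumentexpr", PREFIX + "namespace"));
    static final Set<String> ADVANCED_ONLY = new TreeSet<>(Arrays.asList(
        PREFIX + "query", PREFIX + "splitquery"));

    /** Returns, in sorted order, the properties set for the mode that is not selected. */
    static List<String> misplacedProperties(Map<String, String> conf) {
        String mode = conf.getOrDefault(PREFIX + "mode", "basic"); // documented default
        Set<String> disallowed;
        if ("basic".equals(mode)) {
            disallowed = ADVANCED_ONLY;
        } else if ("advanced".equals(mode)) {
            disallowed = BASIC_ONLY;
        } else {
            throw new IllegalArgumentException("unknown input mode: " + mode);
        }
        List<String> misplaced = new ArrayList<>();
        for (String name : disallowed) {
            if (conf.containsKey(name)) {
                misplaced.add(name);
            }
        }
        return misplaced;
    }
}
```

An empty result means every mode-specific property that is set matches the selected mode; it does not verify that required properties such as input.query are present.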
static final String BASIC_MODE
Value string of basic mode for input.mode.
static final String ADVANCED_MODE
Value string of advanced mode for input.mode.
static final String INPUT_QUERY
The config property name ("mapreduce.marklogic.input.query") which, if set, specifies the query used to retrieve input records from MarkLogic Server. This property is required when advanced is specified in the input.mode property.
The value of this property must be a fully formed query, suitable for evaluation by xdmp:eval, and must return a sequence. The items in the sequence depend on the InputFormat subclass configured for the job. For details, see "Advanced Input Mode" in the Hadoop MapReduce Connector Developer's Guide.
static final String INPUT_QUERY_TIMESTAMP
The config property name ("mapreduce.marklogic.input.querytimestamp") which, if set, specifies data retrieval from MarkLogic Server at the specified timestamp.
static final String BIND_SPLIT_RANGE
The config property name ("mapreduce.marklogic.input.bindsplitrange") which, if set to true, specifies that the input query declares and references external variables "splitstart" and "splitend" under the namespace "http://marklogic.com/hadoop". The connector binds these variables to the start and end of an input split instead of constraining the query with the split range.
For details, see "Optimizing Your Input Query" in the Hadoop MapReduce Connector Developer's Guide.
static final String MR_NAMESPACE
The split range variables "splitstart" and "splitend" are in this namespace when using advanced input mode and "mapreduce.marklogic.input.bindsplitrange" is true. Declare a namespace prefix for this namespace in your input query and qualify references to "splitstart" and "splitend" by the prefix. For details, see "Optimizing Your Input Query" in the Hadoop MapReduce Connector Developer's Guide.
static final String SPLIT_START_VARNAME
Use this external variable name ("splitstart") in your advanced mode input query to access the start value of the record range in an input split when "mapreduce.marklogic.input.bindsplitrange" is true.
The variable must be declared and referenced in the namespace "http://marklogic.com/hadoop". For details, see "Optimizing Your Input Query" in the Hadoop MapReduce Connector Developer's Guide.
static final String SPLIT_END_VARNAME
Use this external variable name ("splitend") in your advanced mode input query to access the end value of the record range in an input split when "mapreduce.marklogic.input.bindsplitrange" is true.
The variable must be declared and referenced in the namespace "http://marklogic.com/hadoop". For details, see "Optimizing Your Input Query" in the Hadoop MapReduce Connector Developer's Guide.
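As an illustration of declaring the split range variables, the sketch below assembles an XQuery prolog from the namespace and variable names documented above and prepends it to a caller-supplied query body. The class name, the mlhadoop prefix, the xs:integer type annotation, and the example body (including treating the end value as inclusive) are assumptions made for this sketch; consult the Developer's Guide for the authoritative form.

```java
// Hypothetical helper; only the namespace URI and variable names come from the documentation.
final class SplitRangeQuerySketch {
    static final String MR_NAMESPACE = "http://marklogic.com/hadoop";
    static final String SPLIT_START_VARNAME = "splitstart";
    static final String SPLIT_END_VARNAME = "splitend";

    /** Builds the prolog declaring the namespace prefix and both external variables. */
    static String splitRangeProlog(String prefix) {
        return "declare namespace " + prefix + " = \"" + MR_NAMESPACE + "\";\n"
            // The xs:integer type is an assumption of this sketch.
            + "declare variable $" + prefix + ":" + SPLIT_START_VARNAME + " as xs:integer external;\n"
            + "declare variable $" + prefix + ":" + SPLIT_END_VARNAME + " as xs:integer external;\n";
    }

    /** Prepends the prolog to a query body that references the prefixed variables. */
    static String inputQuery(String prefix, String body) {
        return splitRangeProlog(prefix) + body;
    }

    public static void main(String[] args) {
        // Illustrative body only: selects the records in the split's range.
        String body = "fn:subsequence(fn:collection(), $mlhadoop:splitstart, "
            + "$mlhadoop:splitend - $mlhadoop:splitstart + 1)";
        System.out.println(inputQuery("mlhadoop", body));
    }
}
```

The resulting string would be supplied as the value of mapreduce.marklogic.input.query, with mapreduce.marklogic.input.bindsplitrange set to true.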
static final String RECORD_TO_FRAGMENT_RATIO
The config property name ("mapreduce.marklogic.input.recordtofragmentratio") which, if set, specifies the ratio of the number of retrieved records to the number of accessed fragments. Optional. Default: 1.0 (one record per fragment) for documents, 100 for nodes and values.
The record-to-fragment ratio is used to estimate progress.
static final String INDENTED
The config property name ("mapreduce.marklogic.input.indented") which, if set, specifies whether to format data retrieved from MarkLogic with indentation. Optional. Valid values: TRUE, FALSE, SERVERDEFAULT. Default: false.
static final String COLLECTION_FILTER
The config property name ("mapreduce.marklogic.input.filter.collection") which, if set, indicates to only include documents with one or more of the specified collection URIs when using ForestInputFormat.
static final String DIRECTORY_FILTER
The config property name ("mapreduce.marklogic.input.filter.directory") which, if set, indicates to only include documents with one of the specified directory URIs when using ForestInputFormat.
static final String QUERY_FILTER
The config property name ("mapreduce.marklogic.input.filter.query") which, if set, indicates to only include documents matching the cts query MarkLogicInputFormat.
static final String TYPE_FILTER
The config property name ("mapreduce.marklogic.input.filter.type") which, if set, indicates to only include documents with one of the specified types when using ForestInputFormat.
static final String EXTRACT_URI
static final String OUTPUT_USERNAME
The config property name ("mapreduce.marklogic.output.username") which, if set, specifies the MarkLogic Server user name under which output operations run. Required if using MarkLogic Server for output.
static final String OUTPUT_PASSWORD
The config property name ("mapreduce.marklogic.output.password") which, if set, specifies the cleartext password to use for authentication with output.username. Required if using MarkLogic Server for output.
static final String OUTPUT_HOST
The config property name ("mapreduce.marklogic.output.host") which, if set, specifies the MarkLogic Server host to use for output operations. Required if using MarkLogic Server for output.
static final String OUTPUT_FOREST_HOST
static final String OUTPUT_PORT
The config property name ("mapreduce.marklogic.output.port") which, if set, specifies the port number of the output MarkLogic Server specified by the output.host property. Required if using MarkLogic Server for output.
static final String OUTPUT_DATABASE_NAME
The config property name ("mapreduce.marklogic.output.databasename") which, if set, specifies the MarkLogic Server database to use for output operations. The default value is the target database assigned to the AppServer.
static final String OUTPUT_USE_SSL
The config property name ("mapreduce.marklogic.output.usessl") which, if set, specifies whether the connection to the output server is SSL enabled; false is assumed if not set.
static final String OUTPUT_SSL_PROTOCOL
The config property name ("mapreduce.marklogic.output.sslprotocol") which, if set, specifies the SSL protocol which will be used if output.usessl is set to true.
static final String OUTPUT_KEYSTORE_PATH
The config property name ("mapreduce.marklogic.output.keystorepath") which, if set, specifies the Keystore which will be used if output.usessl is set to true.
static final String OUTPUT_KEYSTORE_PASSWD
The config property name ("mapreduce.marklogic.output.keystorepassword") which, if set, specifies the Keystore password which will be used if output.usessl is set to true.
static final String OUTPUT_TRUSTSTORE_PATH
The config property name ("mapreduce.marklogic.output.truststorepath") which, if set, specifies the truststore which will be used if output.usessl is set to true.
static final String OUTPUT_TRUSTSTORE_PASSWD
The config property name ("mapreduce.marklogic.output.truststorepassword") which, if set, specifies the truststore password which will be used if output.usessl is set to true.
static final String OUTPUT_SSL_OPTIONS_CLASS
The config property name ("mapreduce.marklogic.output.ssloptionsclass") which, if set, specifies the name of the class implementing SslConfigOptions which will be used if output.usesslprotocol is set to SSLv3.
static final String OUTPUT_DIRECTORY
"mapreduce.marklogic.output.content.directory"
)
which, if set, specifies the MarkLogic Server database directory
where output documents are created.
If output.cleandir
is false (the default)
then an error occurs if the directory already exists. If output.cleandir
is true, then the directory
is removed as part of the job submission process.
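The interaction between output.cleandir and an existing output directory amounts to a small decision table. The Java sketch below only illustrates the rules stated above. The class, enum, and method names are hypothetical and are not part of the connector API, and the "directory exists" flag stands in for whatever check the connector actually performs.

```java
/** Illustrative sketch of the output.cleandir rules; not the connector's own code. */
final class CleanDirSketch {

    /** Hypothetical outcomes for the output directory at job submission. */
    enum Action { CREATE, REMOVE_THEN_CREATE, FAIL }

    /**
     * @param cleanDir        value of output.cleandir (false when the property is unset)
     * @param directoryExists whether the configured output directory already exists
     */
    static Action decide(boolean cleanDir, boolean directoryExists) {
        if (cleanDir) {
            // cleandir=true: an existing directory is removed before output is written.
            return directoryExists ? Action.REMOVE_THEN_CREATE : Action.CREATE;
        }
        // cleandir=false (the default): an existing directory is an error.
        return directoryExists ? Action.FAIL : Action.CREATE;
    }

    public static void main(String[] args) {
        System.out.println(decide(false, true));
        System.out.println(decide(true, true));
        System.out.println(decide(false, false));
    }
}
```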
static final String OUTPUT_CONTENT_ENCODING
"mapreduce.marklogic.output.content.encoding"
) which, if set,
specifies the charset encoding to be used by the server when loading
this document. The encoding provided will be passed to the server at
document load time and must be a name that it recognizes. The document
byte stream will be transcoded to UTF-8 for storage.
static final String DEFAULT_OUTPUT_CONTENT_ENCODING
static final String OUTPUT_COLLECTION
"mapreduce.marklogic.output.content.collection"
)
which, if set, specifies a comma-separated list of collections
to which generated output documents are added. Optional. Relevant
only when using MarkLogic Server for output with
ContentOutputFormat.
Example:
<property>
  <name>mapreduce.marklogic.output.content.collection</name>
  <value>latest,top10</value>
</property>
static final String OUTPUT_GRAPH
static final String OUTPUT_OVERRIDE_GRAPH
static final String OUTPUT_PERMISSION
"mapreduce.marklogic.output.content.permission"
)
which, if set, specifies a comma-separated list of role-capability
pairs to associate with created output documents. Optional. If
not set, the default permissions for
output.username
are used. Relevant
only when using MarkLogic Server for output with
ContentOutputFormat.
Example:
<property>
  <name>mapreduce.marklogic.output.content.permission</name>
  <value>dls-user,update,dls-user,read</value>
</property>
See "URI Privileges and Permissions on Documents" in the Understanding and Using Security Guide for more information about roles and capabilities.
If a role name contains an embedded comma, you must set this property in your code rather than in a configuration file.
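To make the pair format concrete, the following Java sketch splits a permission value into role-capability pairs. It is an illustration only, not the connector's parser: the class and method names are hypothetical, and trimming whitespace around each token is an assumption. It also shows why a role name containing a comma cannot be expressed this way, since the extra token breaks the pairing.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative parser for a comma-separated role,capability list; hypothetical helper. */
final class PermissionListSketch {

    /**
     * Splits "role1,cap1,role2,cap2" into [role, capability] pairs.
     * Throws if the number of tokens is odd, which is what an embedded
     * comma in a role name would produce.
     */
    static List<String[]> parse(String value) {
        String[] tokens = value.split(",");
        if (tokens.length % 2 != 0) {
            throw new IllegalArgumentException(
                "Expected role,capability pairs but got " + tokens.length + " tokens");
        }
        List<String[]> pairs = new ArrayList<>();
        for (int i = 0; i < tokens.length; i += 2) {
            // Assumption: surrounding whitespace is not significant.
            pairs.add(new String[] { tokens[i].trim(), tokens[i + 1].trim() });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] pair : parse("dls-user,update,dls-user,read")) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```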
static final String OUTPUT_QUALITY
"mapreduce.marklogic.output.content.quality"
)
which, if set, specifies the document quality for created
output documents. Optional. Relevant only when using MarkLogic
Server for output with ContentOutputFormat.
Quality affects the search relevance of a document. The value must be a positive or negative integer. For more information about document quality, see "Relevance Scores: Understanding and Customizing" in the Search Developer's Guide.
static final String OUTPUT_STREAMING
"mapreduce.marklogic.output.content.streaming"
)
which, if set, specifies whether to use streaming to insert
content. When streaming is set to true, the content will
not be fully buffered in memory, hence will consume less
memory but will disable auto-retry if there is a problem
inserting the content.
static final String OUTPUT_CLEAN_DIR
"mapreduce.marklogic.output.content.cleandir"
)
which, if set, indicates whether or not to remove the output
directory. Only applicable to ContentOutputFormat.
Default: false.
When set to true, the output directory specified by the
output.content.directory
property
is removed. When set to false, an exception is thrown if
the output content directory already exists.
static final String OUTPUT_FAST_LOAD
"mapreduce.marklogic.output.content.fastload"
)
which, if set, indicates whether or not to use the fast load mode
to load content into MarkLogic. Default: false.
Setting it to true when the documents to be loaded already exist may cause an XDMP-DBDUPURI error if the original documents were inserted when the database had a different forest count. The fast load mode is always used if "mapreduce.marklogic.output.content.directory" is set.
static final String NODE_OPERATION_TYPE
"mapreduce.marklogic.output.node.optype"
)
which, if set, indicates what node operation to perform
during output. Required if using MarkLogic Server for output
with NodeOutputFormat. Valid choices: INSERT_BEFORE, INSERT_AFTER,
INSERT_CHILD, REPLACE.
See Also: NodeOpType, NodeOutputFormat, Constant Field Values
static final String OUTPUT_PROPERTY_ALWAYS_CREATE
"mapreduce.marklogic.output.property.alwayscreate"
)
which, if set to true, causes PropertyOutputFormat
to create document properties for reduce output
key-value pairs even when no document exists with
the target URI. Default: false.
By default, PropertyOutputFormat
does not create a
property for a document URI unless the document already
exists.
static final String OUTPUT_NAMESPACE
"mapreduce.marklogic.output.node.namespace"
)
which, if set, indicates the namespace used for output.
This is used only in NodeOutputFormat, and is used for
resolving element names in the node path.
static final String EXECUTION_MODE
"mapreduce.marklogic.mode"
)
which, if set, indicates whether the job is running in local or
distributed mode.
static final String MODE_DISTRIBUTED
static final String MODE_LOCAL
static final long DEFAULT_MAX_SPLIT_SIZE
Default maximum split size, used if
input.maxsplitsize
is not specified.
static final long DEFAULT_LOCAL_MAX_SPLIT_SIZE
Default maximum split size, used if
input.maxsplitsize
is not specified
and running in local mode.
static final String PROPERTY_OPERATION_TYPE
"mapreduce.marklogic.output.property.optype"
)
which, if set, indicates what property operation to perform
during output when using PropertyOutputFormat. Ignored
if not using PropertyOutputFormat. Optional. Valid choices:
SET_PROPERTY, ADD_PROPERTY. Default: SET_PROPERTY.
static final String DEFAULT_PROPERTY_OPERATION_TYPE
static final String CONTENT_TYPE
"mapreduce.marklogic.output.content.type"
)
which, if set, indicates type of content to be inserted when using
ContentOutputFormat. Optional. Valid choices: XML, JSON, TEXT, BINARY,
MIXED, UNKNOWN.
Default: XML.
static final String OUTPUT_KEY_TYPE
"mapreduce.marklogic.output.keytype"
)
which, if set, specifies the data type of the
output keys for KeyValueOutputFormat. Optional.
Default: xs:string.
static final String OUTPUT_VALUE_TYPE
"mapreduce.marklogic.output.valuetype"
)
which, if set, specifies the data type of the map
output value for KeyValueOutputFormat.
Optional. Default: xs:string.
static final String OUTPUT_QUERY
"mapreduce.marklogic.output.query"
)
which, if set, specifies the statement to execute against MarkLogic
Server. This property is required for KeyValueOutputFormat.
The statement may declare and reference two external variables, "key" and "value", under the namespace "http://marklogic.com/hadoop". The connector binds them to the output key and value, using the user-specified data types.
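As a rough illustration of how these pieces fit together, the Java sketch below assembles the three KeyValueOutputFormat properties into a plain map. Everything here is an assumption made for the example: the class name, the "mlmr" namespace prefix, and the choice of statement (adding the document named by the key to the collection named by the value) are not taken from the connector. In a real job the same name-value pairs would be set on the Hadoop job configuration instead of a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: builds output.* settings for a key-value output statement. */
final class OutputQuerySketch {

    /** Namespace under which the external variables "key" and "value" are declared. */
    static final String HADOOP_NS = "http://marklogic.com/hadoop";

    static Map<String, String> keyValueOutputProperties() {
        // Hypothetical XQuery statement: the "mlmr" prefix and the operation are
        // example choices. Both variables are bound as xs:string here.
        String query =
              "declare namespace mlmr = \"" + HADOOP_NS + "\";\n"
            + "declare variable $mlmr:key external;\n"
            + "declare variable $mlmr:value external;\n"
            + "xdmp:document-add-collections($mlmr:key, $mlmr:value)";

        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapreduce.marklogic.output.keytype", "xs:string");
        props.put("mapreduce.marklogic.output.valuetype", "xs:string");
        props.put("mapreduce.marklogic.output.query", query);
        return props;
    }

    public static void main(String[] args) {
        keyValueOutputProperties().forEach(
            (name, value) -> System.out.println(name + " =\n" + value + "\n"));
    }
}
```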
static final String OUTPUT_KEY_VARNAME
static final String OUTPUT_CONTENT_LANGUAGE
"mapreduce.marklogic.output.content.language"
) which, if set,
specifies the language name to associate with inserted documents. A
value of en
indicates that the document is in English. The
default is null, which indicates to use the server default.
static final String OUTPUT_CONTENT_NAMESPACE
"mapreduce.marklogic.output.content.namespace"
) which, if set,
specifies the namespace to associate with inserted documents. The
default is null, which indicates that the default namespace should
be used.
static final String OUTPUT_VALUE_VARNAME
static final String OUTPUT_XML_REPAIR_LEVEL
"mapreduce.marklogic.output.content.repairlevel"
)
which, if set, specifies the document repair level for this options object.
static final String OUTPUT_PARTITION
"mapreduce.marklogic.output.partition"
)
which, if set, specifies the partition
where output documents are created.
static final String OUTPUT_URI_REPLACE
"mapreduce.marklogic.output.urireplace"
)
which, if set, specifies a comma-separated list of regex pattern and
string pairs. In each pair, the first item is a pattern that matches a URI
segment and the second, enclosed in single quotes (''), is the replacement string.
static final String OUTPUT_URI_PREFIX
"mapreduce.marklogic.output_uriprefix"
)
which, if set, specifies a string to prepend to all document URIs.
static final String OUTPUT_URI_SUFFIX
"mapreduce.marklogic.output_urisuffix"
)
which, if set, specifies a string to append to all document URIs.
static final String DEFAULT_OUTPUT_XML_REPAIR_LEVEL
static final String DEFAULT_CONTENT_TYPE
static final String BATCH_SIZE
"mapreduce.marklogic.output.batchsize"
)
which, if set, indicates the number of records in one request.
Optional. Currently only applies to ContentOutputFormat.
static final int DEFAULT_BATCH_SIZE
static final int DEFAULT_TXN_SIZE
static final String TXN_SIZE
"mapreduce.marklogic.output.transactionsize"
)
which, if set, indicates the number of requests in one transaction.
Optional.
static final String ASSIGNMENT_POLICY
"mapreduce.marklogic.output.assignmentpolicy"
)
which, if set, indicates assignment policy for output documents.
Optional.
static final String TEMPORAL_COLLECTION
"mapreduce.marklogic.output.temporalcollection"
)
which, if set, indicates temporal collection for documents.
Optional.
static final String INPUT_QUERY_LANGUAGE
"mapreduce.marklogic.input.querylanguage"
)
which, if set, specifies the query language used for the input query and split query.
Optional. Valid values: XQuery, Javascript.
Default: XQuery.
static final String OUTPUT_QUERY_LANGUAGE
"mapreduce.marklogic.output.querylanguage"
)
which, if set, specifies the query language used for the output query.
Optional. Valid values: XQuery, Javascript.
Default: XQuery.
static final String REDACTION_RULE_COLLECTION
"mapreduce.marklogic.input.redaction.rules"
)
which, if set, specifies a comma-separated list of
redaction rule collection URIs.
Optional. If not set, no data will be redacted.
static final String COPY_COLLECTIONS
"mapreduce.marklogic.copycollections"
)
which, if set, specifies whether to copy document collections from
source to destination.
static final String COPY_QUALITY
"mapreduce.marklogic.copyquality"
)
which, if set, specifies whether to copy document quality from
source to destination.
static final String COPY_METADATA
"mapreduce.marklogic.copymetadata"
)
which, if set, specifies whether to copy document metadata from
source to destination.
static final String INPUT_RESTRICT_HOSTS
"mapreduce.marklogic.input.restricthosts"
)
which, if set, specifies whether to restrict input hosts that
mlcp will connect to.
static final String OUTPUT_RESTRICT_HOSTS
"mapreduce.marklogic.output.restricthosts"
)
which, if set, specifies whether to restrict output hosts that
mlcp will connect to.
static final long MIN_NODEUPDATE_VERSION
static final String SERVER_THREAD_COUNT
static final double THREAD_MULTIPLIER
Copyright © 2020 MarkLogic Corporation
Complete online documentation for MarkLogic Server, XQuery and related components may be found at developer.marklogic.com