Products: Hadoop Connector - MarkLogic Community

Hadoop is an open-source framework for distributed processing of large data sets across clusters of computers using simple programming models. When used with MarkLogic, Hadoop provides cost-effective batch computation and distributed storage.

Downloads

Connector 2.3.4 zip — 3.5 MB (SHA1)
Connector 2.3.4 source zip — 263 KB (SHA1)
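
Each download above is listed with a SHA1 checksum. The following is a minimal sketch of checking a downloaded archive against that value using only the Java standard library. The class name Sha1Check is hypothetical and is not part of the connector; the file path and the expected digest are whatever you pass on the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not shipped with the connector.
final class Sha1Check {

    /** Returns the lowercase hex SHA-1 digest of the given bytes. */
    static String sha1Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        return toHex(md.digest(data));
    }

    /** Streams a file through SHA-1 so a large archive need not fit in memory. */
    static String sha1Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return toHex(md.digest());
    }

    /** Converts raw digest bytes to two hex characters per byte. */
    private static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java Sha1Check <file> <expected-sha1>");
            System.exit(2);
        }
        String actual = sha1Hex(Paths.get(args[0]));
        boolean ok = actual.equalsIgnoreCase(args[1].trim());
        System.out.println((ok ? "OK " : "MISMATCH ") + actual);
        System.exit(ok ? 0 : 1);
    }
}
```

Run it with the path to the downloaded zip and the checksum published next to the download link. A non-zero exit status means the digests differ and the file should be downloaded again.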

Major Features

Stage raw data in HDFS and prepare, reformat, extract, join, or filter for use in interactive applications in MarkLogic
Enrich or transform data in situ in MarkLogic using Java and MapReduce, taking advantage of MarkLogic’s fast indexes and security model
Age data out of a MarkLogic database into archival storage on HDFS or transfer it in parallel to another system
Leverage existing MapReduce and Java libraries to process MarkLogic data
Operate on data as documents, nodes, or values
Access MarkLogic text, geospatial, value, and document structure indexes to send only the most relevant data to Hadoop for processing
Send Hadoop reduce results to multiple MarkLogic forests in parallel
Rely on the connector to optimize data access (for both locality and streaming IO) across MarkLogic forests
Support for secure HDFS

Getting Started

Requirements

The Connector for Hadoop is supported against the Hortonworks Data Platform (HDP) version 2.6, the Cloudera Distribution of Hadoop (CDH) version 5.8, and MapR 5.1. The source is licensed under the commercial-friendly Apache 2.0 license and is freely available for inspection or modification.

Maven

Dependencies

<dependency>
  <groupId>com.marklogic</groupId>
  <artifactId>marklogic-mapreduce2</artifactId>
  <version>2.3.4</version>
</dependency>
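
With the dependency in place, a job is pointed at MarkLogic through Hadoop configuration properties. The sketch below only renders name/value pairs in the standard Hadoop configuration file layout, using the Java standard library. The class name ConnectorConfigSketch is hypothetical, the host, port, and credentials are placeholders, and the property names shown are assumptions that should be checked against the connector documentation for your version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the connector.
final class ConnectorConfigSketch {

    /** Escapes the characters that are significant in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Renders the pairs as a Hadoop <configuration> document, in insertion order. */
    static String toHadoopXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("  <property>\n")
              .append("    <name>").append(escape(e.getKey())).append("</name>\n")
              .append("    <value>").append(escape(e.getValue())).append("</value>\n")
              .append("  </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        // Assumed property names and placeholder values; verify before use.
        props.put("mapreduce.marklogic.input.host", "localhost");
        props.put("mapreduce.marklogic.input.port", "8000");
        props.put("mapreduce.marklogic.input.username", "my-user");
        props.put("mapreduce.marklogic.input.password", "my-password");
        System.out.print(toHadoopXml(props));
    }
}
```

The printed document can be saved to a file and supplied to a job as a configuration resource. Keeping credentials in a separate file rather than in job code makes them easier to rotate.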

HDFS Client Bundles

You can download pre-packaged Hadoop HDFS client bundles and install them on your MarkLogic hosts. A bundle is available for each supported Hadoop distribution. You must use one of these bundles if you use HDFS for forest storage.

HDFS Download Options

Downloads for MarkLogic 10.0-4:

Client Bundle for CDH 5.8

Downloads for MarkLogic 9.0-12:

Client Bundle for CDH 5.8

Downloads for MarkLogic 8.0-9:

Downloads for MarkLogic 9:

Connector 2.2.12 zip — 3.5 MB
Connector 2.2.12 source zip — 311 KB

Downloads for MarkLogic 8:

Connector 2.1.9 zip — 1.6 MB
Connector 2.1.9 source zip — 250 KB

Get started with the MarkLogic Connector for Hadoop by learning how to deploy the Connector with a MarkLogic Server cluster and how to make a secure connection to MarkLogic Server with SSL.

Read the Documentation

Review the procedures for installing and configuring Apache Hadoop MapReduce and the MarkLogic Connector for Hadoop.

This module provides helper functions for creating advanced mode input and split queries for the MarkLogic Connector for Hadoop.
