What is secure HDFS?

Secure HDFS is a component of the Hadoop Secure Mode feature set. In secure mode, each user and service must authenticate before accessing HDFS.

Why is secure HDFS important?

By default, Hadoop runs in non-secure mode, where no user authentication is required. Hadoop Secure Mode requires every user and service to authenticate through Kerberos before using Hadoop services.
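As a rough illustration of the difference between the two modes, the sketch below checks whether a Hadoop configuration file turns on Kerberos authentication. It assumes the standard `hadoop.security.authentication` property from `core-site.xml`, whose default value is `simple` (non-secure). The helper names `read_properties` and `is_secure_mode` and the sample document `SAMPLE_CORE_SITE` are hypothetical and used only for this example. A real cluster needs further setup (principals, keytabs, per-service settings) that this check does not cover.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a core-site.xml body, for illustration only.
# The two property names are standard Hadoop settings.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a dict of name -> value from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="") or ""
        if name:
            props[name.strip()] = value.strip()
    return props


def is_secure_mode(xml_text):
    """True when authentication is 'kerberos'; a missing property means 'simple'."""
    props = read_properties(xml_text)
    auth = props.get("hadoop.security.authentication", "simple")
    return auth.lower() == "kerberos"


if __name__ == "__main__":
    print(is_secure_mode(SAMPLE_CORE_SITE))
```

To check an actual cluster, the same function could be given the text of that cluster's `core-site.xml`; the file's location varies by distribution.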

What MarkLogic integration components support secure mode?

All three of MarkLogic’s Hadoop integration points support Kerberos authentication:

  1. MarkLogic Content Pump (mlcp)
  2. MarkLogic Connector for Hadoop
  3. MarkLogic Server (i.e. forests on HDFS)

Which Hadoop distributions does MarkLogic 9 support?

MarkLogic 9 supports Cloudera 5.8, Hortonworks 2.4, and MapR 5.1 (MapR-FS only).

What about Spark?

MarkLogic does not need a native Spark connector. MarkLogic integrates directly with HDFS, and the MarkLogic Connector for Hadoop can read and write Hadoop-compatible datasets. As part of the Apache ecosystem, Spark can also read and write Hadoop-compatible datasets. This common language can be exploited without the need for a native connector.

Learn More

Hadoop Integration

Learn how Hadoop helps you effectively manage massive amounts of structured and unstructured data, and why, unlike traditional databases, MarkLogic is the best database for Hadoop.

Hadoop Documentation

This documentation explains the process of installing and configuring Apache Hadoop MapReduce and the MarkLogic Connector for Hadoop, and walks through running sample applications.

Connector for Hadoop

You can now download a pre-packaged Hadoop HDFS client bundle and install this bundle on your MarkLogic hosts. A bundle is available for each supported Hadoop distribution.
