Hadoop FAQ

Anthony Roach
Last updated August 24, 2017

What is secure HDFS?

Secure HDFS is a component of the Hadoop Secure Mode feature set. Every user and service must authenticate before accessing the HDFS filesystem.

Why is secure HDFS important?

By default, Hadoop runs in non-secure mode, where no user authentication is required. Hadoop Secure Mode uses Kerberos to authenticate every user and service before it can use Hadoop services.
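
As a rough illustration of the difference between the two modes: a Hadoop cluster normally records its authentication mode in core-site.xml under the hadoop.security.authentication property, which defaults to "simple" (non-secure) and is set to "kerberos" for secure mode. The Python sketch below checks that property in a sample configuration. The sample XML and the helper names read_properties and is_secure_mode are assumptions made for this illustration; they are not part of Hadoop or MarkLogic, and a real cluster needs more than this one setting (principals, keytabs, and per-service options).

```python
import xml.etree.ElementTree as ET

# Sample core-site.xml content, made up for this illustration.
# A real file lives in the cluster's Hadoop configuration directory.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return the <property> name/value pairs of a Hadoop config file as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def is_secure_mode(xml_text):
    """Return True when authentication is 'kerberos'; a missing property means 'simple'."""
    props = read_properties(xml_text)
    mode = props.get("hadoop.security.authentication", "simple")
    return mode.lower() == "kerberos"


if __name__ == "__main__":
    print("Secure mode enabled:", is_secure_mode(SAMPLE_CORE_SITE))
```

A check like this can be a quick sanity test before pointing a client at a cluster, since a client that expects simple authentication will be rejected by a cluster that requires Kerberos.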

What MarkLogic integration components support secure mode?

All three of MarkLogic’s Hadoop integration points support Kerberos authentication:

  1. MarkLogic Content Pump (mlcp)
  2. MarkLogic Connector for Hadoop
  3. MarkLogic Server (i.e. forests on HDFS)

Which Hadoop distributions does MarkLogic 9 support?

MarkLogic 9 supports Cloudera 5.8, Hortonworks 2.4, and MapR 5.1 (only MapR-FS is supported).

What about Spark?

MarkLogic does not need a native Spark connector. MarkLogic integrates directly with HDFS, and the MarkLogic Connector for Hadoop can read and write Hadoop-compatible datasets. As part of the Apache ecosystem, Spark can also read and write Hadoop-compatible datasets. This common ground can be exploited without the need for a native connector.
