Learning the MarkLogic Java API

Evan Lenz
Last updated September 14, 2012

MarkLogic is an enterprise-class NoSQL database built on search engine technology. You can use it to store, search, and query massive amounts of data, represented as documents in various formats. MarkLogic exposes its core functionality through a Java API, allowing you to write applications in pure Java. Under the hood, the Java API communicates with MarkLogic Server through a powerful REST API. This tutorial walks you through a series of HOWTOs for working with MarkLogic exclusively through its Java API, using sample apps that illustrate each use case.

MarkLogic basics

The basic unit of organization in MarkLogic is the document. Documents can occur in one of four formats:

  • XML
  • JSON
  • text
  • binary

Each document is identified by a URI, such as "/example/foo.json", which is unique within the database.

As with files on a filesystem, documents can be grouped into directories. This is done implicitly via the URI. For example, the document with the URI "/docs/plays/hamlet.xml" resides in the "/docs/plays/" directory.
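To make the implicit grouping concrete, here is a minimal sketch in plain Java that derives the directory from a URI by keeping everything up to and including the last slash. The class and method names (UriDirectory, directoryOf) are made up for illustration and are not part of the MarkLogic Java API; the treatment of a URI with no slash is likewise an assumption of this sketch.

```java
// Illustrative only: plain Java string handling, not part of the MarkLogic Java API.
class UriDirectory {

    // Returns the implied directory of a document URI: everything up to and
    // including the last '/'. Returns "" when the URI contains no '/'
    // (an assumption made for this sketch).
    static String directoryOf(String uri) {
        int lastSlash = uri.lastIndexOf('/');
        return lastSlash < 0 ? "" : uri.substring(0, lastSlash + 1);
    }

    public static void main(String[] args) {
        System.out.println(directoryOf("/docs/plays/hamlet.xml")); // prints /docs/plays/
        System.out.println(directoryOf("/example/foo.json"));      // prints /example/
    }
}
```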

Documents can also be grouped (independently of their URI) into collections. A collection is essentially a tag (string) associated with the document. A document can have any number of collection tags associated with it.
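One way to picture collections is as a set of string tags attached to each URI. The sketch below models that idea in memory with plain Java collections. It is not the MarkLogic Java API, and the names in it (CollectionTags, tag, urisIn, the "tragedies" and "shakespeare" collections, the macbeth URI) are hypothetical. Note that the two documents tagged "tragedies" live in different directories, which is the sense in which collections are independent of the URI.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: an in-memory model of collection tags, not the MarkLogic Java API.
class CollectionTags {

    // Maps each document URI to the set of collection names tagged on it.
    private final Map<String, Set<String>> tagsByUri = new LinkedHashMap<>();

    // Tags the document at the given URI with a collection name.
    // A document may carry any number of tags; duplicates are ignored.
    void tag(String uri, String collection) {
        tagsByUri.computeIfAbsent(uri, key -> new TreeSet<>()).add(collection);
    }

    // Returns the URIs of all documents tagged with the given collection, sorted.
    Set<String> urisIn(String collection) {
        Set<String> uris = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : tagsByUri.entrySet()) {
            if (entry.getValue().contains(collection)) {
                uris.add(entry.getKey());
            }
        }
        return uris;
    }

    public static void main(String[] args) {
        CollectionTags tags = new CollectionTags();
        // Hypothetical URIs and collection names, chosen for illustration.
        tags.tag("/docs/plays/hamlet.xml", "tragedies");
        tags.tag("/docs/plays/hamlet.xml", "shakespeare");
        tags.tag("/archive/2012/macbeth.xml", "tragedies");
        System.out.println(tags.urisIn("tragedies"));
        System.out.println(tags.urisIn("shakespeare"));
    }
}
```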

MarkLogic is agnostic with regard to what document structures you use. For example, it is not necessary to provide a document schema of any sort. The one general guideline to keep in mind is that, in comparison to an RDBMS, documents are like rows. In other words, since documents are the basic unit of retrieval, given the choice, it's better to have many small documents than a few large ones.

The Java API provides CRUD capabilities (Create, Read, Update, Delete) on documents. It also lets you perform tasks relating to search, query, and analytics. Search and query are about finding documents. Analytics is about retrieving values from across many documents and optionally performing aggregate calculations on those values. Where MarkLogic really shines is in the combination of search and analytics, providing such things as faceted navigation across your data.
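As a rough picture of what analytics means here, the sketch below takes values drawn from many documents and computes per-value counts (the numbers a faceted navigation UI would display) along with a simple aggregate. It runs entirely in memory with plain Java and does not use the MarkLogic Java API. The class name FacetSketch, its methods, and the sample data are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: in-memory facet counting, not the MarkLogic Java API.
class FacetSketch {

    // Counts how many documents carry each value (one value per document).
    static Map<String, Integer> facetCounts(List<String> valuePerDocument) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : valuePerDocument) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    // An aggregate over values drawn from many documents: the arithmetic mean.
    // Returns 0.0 for an empty list (a choice made for this sketch).
    static double average(List<Integer> values) {
        if (values.isEmpty()) {
            return 0.0;
        }
        long sum = 0;
        for (int value : values) {
            sum += value;
        }
        return (double) sum / values.size();
    }

    public static void main(String[] args) {
        // Hypothetical data: the format of each of five documents.
        List<String> formats = Arrays.asList("json", "xml", "json", "text", "xml");
        System.out.println(facetCounts(formats)); // prints {json=2, text=1, xml=2}

        // Hypothetical data: a numeric value taken from each of three documents.
        System.out.println(average(Arrays.asList(2, 4, 6))); // prints 4.0
    }
}
```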

We'll look at examples of each of these. But first, let's get everything set up. While you're certainly free to peruse this tutorial without running the examples, I highly recommend taking the time to install MarkLogic, download the tutorial project, and directly interact with the sample programs. Instructions for doing all of that are on the next page.

