The basic unit of organization in MarkLogic is a Document, encoded in JSON or XML. The set of JSON keys, objects, and arrays, or XML elements and attributes, that you use in your documents is up to you. MarkLogic does not require adherence to any schema.
MarkLogic also supports documents encoded as binary or plain text. We refer to this encoding (JSON, XML, text, or binary) as the document's Format.
A document's URI is a unique key that you must choose when you insert the document into the database. You use this URI to retrieve or refer to the document later. Typically, document URIs begin with a slash.
(NB: You will also see use of the term Fragment. Unless you've specifically enabled a feature called "fragmenting", a fragment is the same thing as a document, the basic unit of storage in MarkLogic. See this blog post for some additional explanation).
How does MarkLogic organize documents in the database? Logically, MarkLogic provides two concepts: Collections and Directories. You can think of collections as unordered sets. If you are familiar with tags, that analogy may help too. A collection can hold multiple documents, and a document can belong to multiple collections.
Directories are similar in concept to the notion of directories or folders in file systems. They are hierarchical and membership is implicit based on the path syntax of URIs.
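To make the difference concrete, here is a minimal in-memory sketch in Python. It is not a MarkLogic API: the class ToyDatabase, its methods, and the sample URIs are all hypothetical, and the directory lookup simply treats every URI under a path prefix as a member. It only illustrates that collection membership is assigned explicitly, while directory membership follows from the URI.

```python
from collections import defaultdict


class ToyDatabase:
    """Hypothetical in-memory stand-in for a database (not a MarkLogic API)."""

    def __init__(self):
        self.documents = {}                  # URI -> content; the URI is the unique key
        self.collections = defaultdict(set)  # collection name -> set of URIs

    def insert(self, uri, content, collections=()):
        # Inserting at an existing URI replaces the stored content.
        self.documents[uri] = content
        # Collection membership is explicit: it is assigned per document.
        for name in collections:
            self.collections[name].add(uri)

    def in_collection(self, name):
        return sorted(self.collections.get(name, set()))

    def in_directory(self, directory):
        # Directory membership is implicit: it is derived from the URI path.
        if not directory.endswith("/"):
            directory += "/"
        return sorted(u for u in self.documents if u.startswith(directory))


db = ToyDatabase()
db.insert("/articles/2020/a.json", {"title": "A"}, collections=["news", "featured"])
db.insert("/articles/2021/b.json", {"title": "B"}, collections=["news"])
db.insert("/recipes/c.xml", "<recipe/>", collections=["featured"])

print(db.in_collection("featured"))  # documents from two different directories
print(db.in_directory("/articles/"))  # documents from one directory subtree
```

Note that the "featured" collection cuts across directories, while no call was needed to place a document in "/articles/": the URI alone determined that.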
Beyond directories and collections, MarkLogic also provides role-based security and document-level permissions to help you organize your documents securely.
MarkLogic stores documents (and associated directories and collections) in logical structures called Databases. The on-disk storage for a database is organized in physical pieces called Forests, and forests are, in turn, broken up into smaller pieces called Stands.
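The containment hierarchy described above can be pictured as nested records. The following Python sketch is only a mental model, under the assumption that each stand holds some subset of the documents; the class names, the forest and stand names, and the all_uris helper are hypothetical and do not correspond to any MarkLogic API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stand:
    name: str
    document_uris: List[str] = field(default_factory=list)


@dataclass
class Forest:
    name: str
    stands: List[Stand] = field(default_factory=list)


@dataclass
class Database:
    name: str
    forests: List[Forest] = field(default_factory=list)

    def all_uris(self):
        # A database is the logical view over every stand in every forest.
        return sorted(
            uri
            for forest in self.forests
            for stand in forest.stands
            for uri in stand.document_uris
        )


content_db = Database(
    name="Documents",
    forests=[
        Forest("forest-1", [Stand("stand-a", ["/a.json"]), Stand("stand-b", ["/b.json"])]),
        Forest("forest-2", [Stand("stand-c", ["/c.xml"])]),
    ],
)
print(content_db.all_uris())
```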
MarkLogic provides access through a variety of App Servers, each of which implements a specific networking protocol on a TCP port. MarkLogic provides a few App Servers out of the box, but you will usually configure one or more for use with your application.
HTTP App Servers
The most common of these are HTTP App Servers, which map incoming HTTP requests to file paths on the server filesystem (or document URIs in an associated database). An HTTP App Server listens for requests, executes any XQuery code in the corresponding file (or document), and then sends a response.
This is much like what other application servers do. Java Servlet containers do this for Java, Apache HTTPD (with mod_php, for example) does it for PHP, and so on. In MarkLogic, the app server functionality is inside MarkLogic itself and the programming language is XQuery.
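The request-to-path mapping can be sketched in a few lines of Python. This is an illustration of the general idea only, not how MarkLogic implements it: the function resolve_module, the root "/opt/app", and the file name search.xqy are hypothetical. The sketch drops the query string, normalizes the request path, and joins it to a configured root.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_module(root, request_target):
    """Map an HTTP request target to a module path under `root` (illustrative only)."""
    # Keep only the path component; the query string is input to the code, not part of the path.
    path = urlsplit(request_target).path
    # Normalize "." and ".." segments while staying anchored at "/",
    # so the result can never climb above the root.
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    return posixpath.join(root.rstrip("/") or "/", normalized.lstrip("/"))


print(resolve_module("/opt/app", "/search.xqy?q=cat"))
print(resolve_module("/opt/app", "/../etc/passwd"))
```

The same mapping applies whether the root names a filesystem directory or a URI prefix in a database; only the place where the resolved path is looked up differs.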
NB: Those familiar with Oracle and PL/SQL may find it helpful to think of XQuery as MarkLogic's stored procedure language. It is natively understood and, although it is not the only way, it is the lowest-level, most efficient way to write code against MarkLogic.
XQuery is a W3C standard functional programming language, designed to query and transform collections of structured and unstructured data, usually in the form of XML, text and other data formats. It's a great language for a database and inside MarkLogic you'll find a highly optimized and tuned XQuery interpreter.
Beyond the standard functions in the W3C spec, MarkLogic provides a large number of its own, covering:
- HTTP requests and responses
- Database create, read, update, delete (CRUD)
- Full-text search, spelling, thesaurus
- File system access
- String manipulation
- JSON, XML, Binary formats
- Math and cryptography functions
- Configuration, monitoring, and administration
- HTTP and SMTP clients
- And much more...
If you choose to, you can script an entire application in XQuery the same way you might in PHP or Python. But you don't need to. You can connect to MarkLogic in ways that fit into your environment without using XQuery yourself at all.
Historically, XQuery has been the main (and for some time, only) programmatic interface to MarkLogic. As you'd expect, a lot of MarkLogic documentation and examples use XQuery. While we are working to genericize these, to gain the best understanding today, you may find it helpful to pick up a little XQuery, even if you do not plan to use it.
For more details, see the API reference.
REST API Instances
MarkLogic provides a rich, extensible REST API. To use it, you configure a REST API instance, which is a specialized HTTP App Server. When you configure a REST API instance for your database, MarkLogic:
- Creates a separate, small database,
- Installs a copy of the REST API XQuery implementation into that separate database, and
- Configures an HTTP App Server to point to this copy for code execution against your database.
You then point your client at the port of the HTTP app server. (See here for a video).
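As a rough illustration of pointing a client at that port, the Python sketch below builds the URL for a single document. It assumes the REST API's /v1/documents endpoint with its uri parameter; the host, the port 8011, the document URI, and the helper name document_url are hypothetical. The sketch only constructs the URL and sends no request, so it runs without a server.

```python
from urllib.parse import urlencode


def document_url(host, port, uri):
    """Build the URL that addresses one document through a REST API instance."""
    # The document URI travels as a query parameter, so it must be percent-encoded.
    query = urlencode({"uri": uri})
    return f"http://{host}:{port}/v1/documents?{query}"


url = document_url("localhost", 8011, "/articles/a.json")
print(url)
```

Any HTTP client can then issue a GET against this URL to read the document, or a PUT with a request body to write it, subject to the authentication configured on the App Server.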
In addition to HTTP App Servers, MarkLogic provides the following App Server types:
- XDBC servers are similar in concept to an RDBMS JDBC server. They provide database access and ad hoc XQuery code execution. MarkLogic provides Java and .NET XCC clients for XDBC.
- WebDAV servers support the WebDAV protocol to allow WebDAV clients to have read and write access (depending on the security configuration) to a database. A WebDAV server only accesses documents and directories in a database.
- ODBC servers provide a standard ODBC interface for use with the MarkLogic-provided ODBC driver. With one, you can issue SQL queries over relational-style data resident in MarkLogic.
For more detail, see the official Concepts Guide.