In part 1, we highlighted some of the basic features of the new docs.marklogic.com. Now let's take a quick look under the hood. Powering this application is—you'll never guess it—MarkLogic! In almost all cases, each page corresponds to one document in the database. This is a good guideline in general for search applications. Although the Search API can return results at sub-document levels, the most common and natural approach is for there to be a 1:1 correspondence between documents (fragments) and search results.
This is all well and good, but what if your content isn't already in the format and structure you want? You have a couple of choices (touched on in "Navigating a jungle of data"):
- write the brute-force XQuery & XSLT to adapt the content to the output you want, at run time
- write pre-processing code so that, as much as possible, the content loaded into the database suits the needs of the application.
I took the latter approach for a couple of reasons:
- The application code was much easier to write, since the data was in a natural structure for building a search application.
- The application code ran much faster since all the heavy lifting had already been completed during the "build" phase.
If you're curious, you can take a look at the master script that kicks off the complete build. This includes a lot of heavy-duty XSLT for converting content from the format it was originally authored in.
Once the build has completed and the content is in place, the application code does its work at run time. When you make a request to a page on docs.marklogic.com (or on developer.marklogic.com), the URL is mapped to the underlying document in the database. That document is then transformed by an XSLT stylesheet (page.xsl, which imports many other XSLT scripts used by the developer site). In fact, although the look and feel has changed, the basic architecture of the site hasn't changed for two years. (See A peek inside RunDMC, part 1 and part 2.) One advantage of using XSLT here is that we can re-use all the existing template rules for rendering the rest of the developer site, overriding only the specific ones that need a different effect on docs.marklogic.com.
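To make the URL-to-document idea concrete, here is a minimal sketch of that kind of mapping. The real site does this on the server in MarkLogic; the function name `docUriForPath`, the `.xml` suffix, and the `index.xml` rule are all assumptions made up for illustration, not the site's actual convention.

```javascript
// Hypothetical mapping from a request path to a document URI.
// The ".xml" suffix and the "index.xml" rule are illustrative assumptions,
// not the convention docs.marklogic.com actually uses.
function docUriForPath(requestPath) {
  // Drop any query string or fragment; only the path identifies the page.
  const path = requestPath.split(/[?#]/)[0];
  // Treat a directory-style path as a request for that directory's index document.
  if (path.endsWith("/")) {
    return path + "index.xml";
  }
  return path + ".xml";
}
```

The point is only that one clean URL resolves to exactly one document, which is what keeps the 1:1 page-to-document design simple.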
So far we've talked about:
- getting the content in place (build process)
- transforming it via XSLT (at run time)
Now let's briefly touch on what the build- and run-time code outputs. For the table of contents on the left, we're using the jQuery Treeview plugin.
The HTML of the TOC itself is pre-generated at build time, since it doesn't need to change at run time. An additional optimization is that parts of the TOC are lazily loaded as pre-rendered static HTML files (also generated at build time).
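The lazy-loading idea can be sketched roughly as follows. This is not the Treeview plugin's own code; `createTocLoader` and the `fetchHtml` parameter are hypothetical names, and the sketch only assumes that each TOC subtree lives at its own URL as a static HTML file.

```javascript
// Builds a loader that fetches a TOC subtree's pre-rendered HTML at most once.
// `fetchHtml` is any function that takes a URL and returns a Promise of an
// HTML string (in a browser: url => fetch(url).then(r => r.text())).
function createTocLoader(fetchHtml) {
  const cache = new Map(); // url -> Promise of the subtree's HTML

  return function loadSubtree(url) {
    if (!cache.has(url)) {
      const pending = fetchHtml(url).catch((err) => {
        // Forget failed requests so a later click can retry.
        cache.delete(url);
        throw err;
      });
      cache.set(url, pending);
    }
    return cache.get(url);
  };
}
```

Because the files are generated at build time, the server has nothing to compute when a branch is expanded, and the cache means each branch is downloaded only the first time it is opened.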
The tabs for switching between TOCs are implemented using the jQuery UI Tabs plugin.
An early (and still common) approach to developing web applications was to load one container page, including the header, footer, TOC, navigation, etc., and then use Ajax calls to grab the content each time the user clicks a link. The advantage of this approach is that each page loads very quickly, since the browser doesn't have to download the whole template for each page. In this scenario, the base URL in the browser doesn't change; only the fragment identifier does, using what's called the "hash-bang" technique, e.g., #!mypage. The disadvantage is that this approach breaks the Web.
Ideally, you get the best of both worlds:
- clean URLs that change when you go from page to page (no hash-bangs)
- fast page loads via Ajax
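One common way to combine clean URLs with Ajax page loads is the browser's History API (`history.pushState`). This post doesn't say which mechanism the site actually uses, so treat the following as a sketch under that assumption. The names `pageFromHashBang` and `createNavigator` are hypothetical, and the dependencies are passed in so the code also runs outside a browser.

```javascript
// Hash-bang style: the page identity lives in the fragment, e.g. "#!mypage".
// Returns the page name, or null if the fragment is not a hash-bang.
function pageFromHashBang(hash) {
  return hash.startsWith("#!") ? hash.slice(2) : null;
}

// Clean-URL style: fetch only the new content via Ajax, render it, then
// update the address bar with the History API so the URL really changes.
function createNavigator({ history, fetchContent, render }) {
  return async function navigate(url) {
    const html = await fetchContent(url); // Ajax request for the page content
    render(html);                         // swap it into the existing template
    history.pushState({ url }, "", url);  // real URL in the address bar, no "#!"
  };
}
```

In a browser you might wire it up with `window.history`, a `fetchContent` built on `fetch`, and a `render` that replaces the inner HTML of the content container. A fuller version would also listen for the `popstate` event so the back and forward buttons reload the right content.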
I hope you enjoyed this look under the hood! Let me know if you have any questions.
Note: not only is the application code open-source, but our issues list is too. If you notice anything wrong with the behavior of our online docs, or want to suggest an improvement, we'd love it if you could report it by submitting a new issue on GitHub.