Building a faceted search application

As referenced in A Tale of Two Facets, this write-up demonstrates how simple it is to develop and deploy a non-trivial faceted search and discovery application using MarkLogic's Application Builder. In less than 30 minutes, you can ingest data into an ACID-compliant NoSQL database with government-grade security. No up-front schema design or application-limiting assumptions are required. Readers can compare this approach to what’s described in “Faceted Search with MongoDB”.

We’ll build our faceted application using a database of “top songs” instead of books. Our songs are in XML format; however, MarkLogic can just as easily identify facets in JSON, delimited text, and a variety of other formats, including RDBMS exports. MarkLogic is schema-agnostic and performs indexing on ingestion.

A quick scan of one of the XML data files allows us to pick facets such as “artist,” “week (released)” and “genre.” Developers can easily add facets for “writers,” “producers,” “song length” or any facet-able business object, even those that become known at a later date.
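To make the idea of a facet concrete, here is a minimal Python sketch of what facet counting amounts to: for each chosen element name, tally how many documents carry each value. It is an illustration only, not how MarkLogic computes facets (MarkLogic resolves them from its indexes). The document shape, the sample values, and the names SONGS, FACETS and facet_counts are all assumptions made for this example; the real top-songs files are richer.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, simplified song documents with made-up values.
SONGS = [
    "<song><title>Sample Song 1</title><artist>The Beatles</artist>"
    "<week>1964-02-08</week><genre>Rock</genre><genre>Pop</genre></song>",
    "<song><title>Sample Song 2</title><artist>The Beatles</artist>"
    "<week>1968-09-28</week><genre>Rock</genre></song>",
    "<song><title>Sample Song 3</title><artist>Mariah Carey</artist>"
    "<week>1993-12-25</week><genre>Pop</genre></song>",
]

# Element names chosen as facets after scanning the data.
FACETS = ("artist", "week", "genre")


def facet_counts(xml_docs, facets=FACETS):
    """Return {facet name: Counter(value -> number of documents)}."""
    counts = {name: Counter() for name in facets}
    for doc in xml_docs:
        root = ET.fromstring(doc)
        for name in facets:
            # A document may hold several values for one facet (two genres,
            # say); the set ensures each value is counted once per document.
            values = {
                el.text.strip()
                for el in root.iter(name)
                if el.text and el.text.strip()
            }
            counts[name].update(values)
    return counts


if __name__ == "__main__":
    for name, counter in facet_counts(SONGS).items():
        print(name, dict(counter))
```

Adding a facet later, such as “writers,” would only mean adding another element name to FACETS in this sketch; nothing about the documents themselves has to change.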

Let’s begin by setting up the target database in MarkLogic. This step wasn’t covered in the MongoDB article and is illustrated here for completeness. After navigating to localhost:8000, the developer will:

  1. Select the “Information Studio” tab
  2. Click the “+New Database” button
  3. Enter a database name such as mysongs in the text field
  4. Click the “Create Database” button
  5. Click the “+New Application” button

After clicking the “+New Application” button, the developer is taken to the screen below. After ensuring that mysongs appears in the “Target database” dropdown, the developer provides an “Application Name,” say MyFavoriteSongs, and clicks the “Create Application” button.

app builder

At this point, the developer has access to a series of tabs for customizing and deploying the faceted search application. For now, we’ll select the “Deploy” tab, accept the defaults and deploy an application shell to port 7778 on localhost. The “Deploy” screen and the resulting application shell are shown below.

image2

 

image3

As you can see, with just a few clicks, we’ve deployed key elements for a faceted search and discovery application. A facet container appears on the left, a search bar on top, widget containers below the search bar, and a results panel below the widget containers.

 

Ingesting the song catalog data using the open source MarkLogic Content Pump (mlcp) command line tool is a simple process. The command can be issued from any Linux, Mac OS X or Windows prompt. Linux will be used in this example:

 

$ mlcp.sh -options_file options.txt

 

Following are the contents of the options.txt file:

 

IMPORT
-host
localhost
-port
8041
-username
admin
-password
admin
-input_file_path
/Users/mmalgeri/Documents/workspace/demos/top-songs/songs
-input_file_type
XML
-output_uri_prefix
"/topsongs/"
-output_uri_suffix
".xml"
-output_collections
"songs"
-filename_as_collection
true

 

Note that the port specified is 8041. An instance-wide MarkLogic service listens on this port for data-loading requests from programs like mlcp, and it can load into ANY database hosted on the MarkLogic server.
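For readers new to options files, the following Python sketch shows how the layout of options.txt can be read: the command on the first line, then alternating option and value lines. It is only an illustration of the file's structure under that assumption, not mlcp's own parser, and the helper names parse_options and as_command_line are hypothetical. The sample text is an abbreviated copy of the file above with values kept verbatim.

```python
# Abbreviated copy of the options.txt shown above (values kept verbatim).
OPTIONS_TEXT = """\
IMPORT
-host
localhost
-port
8041
-input_file_type
XML
-output_collections
"songs"
"""


def parse_options(text):
    """Split an options file into (command, {option: value}).

    Assumes the layout used in this article: the command on the first
    non-blank line, then alternating option and value lines.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty options file")
    command, rest = lines[0], lines[1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected each option to be followed by one value")
    return command, dict(zip(rest[0::2], rest[1::2]))


def as_command_line(command, options, script="mlcp.sh"):
    """Render the parsed options as a single command-line string."""
    parts = [script, command]
    for name, value in options.items():
        parts.extend([name, value])
    return " ".join(parts)


if __name__ == "__main__":
    cmd, opts = parse_options(OPTIONS_TEXT)
    print(as_command_line(cmd, opts))
```

Printing the flattened form is a quick way to check what an options file would roughly look like if typed as a single command. Keeping the options in a file, as the article does, keeps the actual command short and repeatable.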

 

After running the mlcp.sh command and refreshing the browser, we see that 1,155 songs have been loaded into our application, with basic URI links appearing in the results pane. It’s important to note that mlcp supports ingestion of billions of documents using its distributed loading features.

image4

A quick scan of headers in any of the data files allows a developer or business analyst to select objects on which to facet. In this example, using the admin interface for our mysongs database, we set up indexes for the week, genre, title, and artist facets. Clicking the “ok” button allows us to return to the MarkLogic Application Builder to complete our work.

image5

In the “Assemble” screen, we select a pie chart widget to view artists and a horizontal bar graph to display genres.

image6

In the “Results” screen, we accept elements picked up by Application Builder to configure the “Title,” “Snippet,” and “Metadata” sections.

image7

Finally, in the “Appearance” screen, we select a text logo and change the “Skin” to “Dawn,” keeping things simple, although extensive customization is relatively easy.

image8

When the “Deploy” button is clicked, our faceted search and discovery application is displayed.

image9

If a user clicks on “Mariah Carey” in the pie chart widget, note how the facets, widgets and result set are updated in the next screenshot.

image10
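The behavior in this screenshot, where one click narrows the result set and every facet is recounted against it, can be sketched in a few lines of Python. This is a conceptual illustration with made-up records, not the code Application Builder generates; the names SONGS, FACET_NAMES and search are assumptions for this example.

```python
from collections import Counter

# Made-up records standing in for the loaded song documents.
SONGS = [
    {"title": "Sample Song 1", "artist": "Mariah Carey", "genre": "Pop"},
    {"title": "Sample Song 2", "artist": "Mariah Carey", "genre": "R&B"},
    {"title": "Sample Song 3", "artist": "The Beatles", "genre": "Rock"},
]

FACET_NAMES = ("artist", "genre")


def search(songs, selected=None):
    """Filter by the selected facet values, then recount every facet.

    `selected` maps a facet name to the value the user clicked,
    for example {"artist": "Mariah Carey"}.
    """
    selected = selected or {}
    results = [
        song for song in songs
        if all(song.get(name) == value for name, value in selected.items())
    ]
    # Facet counts are computed over the narrowed result set, which is
    # why the facet lists and widgets change after a click.
    facets = {name: Counter(song[name] for song in results) for name in FACET_NAMES}
    return results, facets


if __name__ == "__main__":
    results, facets = search(SONGS, {"artist": "Mariah Carey"})
    print([song["title"] for song in results])
    print({name: dict(counter) for name, counter in facets.items()})
```

With no selection, every song is returned and the facets describe the whole catalog; each additional selection narrows both the results and the counts.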

 



Also note in the next screenshot how facet values such as artist: "The Beatles" and week: 1964-02-08 can be typed into the search box. What’s not shown is how the facets and their respective values are displayed as search suggestions as the user types.

 

image11
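As a rough picture of what typing facet values into the search box involves, the Python sketch below splits a query string into facet constraints and free-text terms. It is a deliberately simplified illustration, not MarkLogic's search grammar, which is considerably richer; the pattern and the function name parse_query are assumptions for this example.

```python
import re

# Matches facet:"quoted value" or facet:bare-value (optional space after
# the colon); any other run of non-space characters is a free-text term.
TOKEN = re.compile(r'(\w+):\s*(?:"([^"]*)"|(\S+))|(\S+)')


def parse_query(query):
    """Return (constraints, terms) for a search-box string.

    constraints is a list of (facet, value) pairs; terms is a list of
    plain words. Quoted phrases without a facet prefix are not handled.
    """
    constraints, terms = [], []
    for match in TOKEN.finditer(query):
        name, quoted, bare, word = match.groups()
        if name:
            constraints.append((name, quoted if quoted is not None else bare))
        else:
            terms.append(word)
    return constraints, terms


if __name__ == "__main__":
    print(parse_query('artist:"The Beatles" week:1964-02-08 love'))
```

The constraints produced here play the same role as a click on a facet value: each one narrows the result set to documents carrying that value.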

 

Conclusion

 

The My Favorite Songs application took less than 30 minutes to build, which included iterations to correct typos in index creation and time to play with widgets.

 

Once the application is deployed, developers are not confined to working within MarkLogic’s Application Builder tool. Every HTML, CSS, and JavaScript file and query module is available for editing in a developer’s IDE of choice, and changes can then be deployed to the MarkLogic server.

Comments

  • What are p, i, and descr in the snippet section? Where can I get the source data for this example?
    • p, i, and descr refer to XML elements in the data set, which App Builder should rely on to build snippets. The source data isn't publicly available, but there is one example document from the Top Songs set on GitHub: https://github.com/marklogic/top-songs/blob/master/java/TopSongs/src/data/song.xml. MarkLogic comes with a sample application using Oscars data. You could use that to get a similar experience working with Application Builder. See the Application Builder Developer's Guide: http://docs.marklogic.com/guide/app-builder/intro#id_21750.
      • Thanks @David for the valuable information. Apologies for the delayed reply.
  • I am trying to find the directory where MarkLogic stores the application files when you deploy an application, but I am unable to find it. Can you please help? In the context of the application above, if I build it through Application Builder and my MarkLogic server is located in c:\Marklogic, in which directory can I find this application?
    • I replied over on Stack Overflow: http://stackoverflow.com/questions/27825637/which-directory-does-marklogic-deploys-the-application-created-with-appbuilder.