[MarkLogic Dev General] MarkLogic vs SQL Server search performance
GLindley at ABC-CLIO.com
Tue Dec 30 08:59:09 PST 2008
Thanks for your suggestions, Mike. See below.
> I strongly recommend pagination in your query: see
This greatly improves performance, but there is a hitch. In my
case, there is a special requirement for the search results page that
all of the categories that have at least one matching record are to be
displayed. (Categories are things like map, image, biography, etc.)
I think this means that I have to loop through all matching records in
order to grab all of the matched categories... unless there is a way to
craft a fast search that only pulls out the categories. Then I could
combine the fast category search with the fast paginated search. I'll
explore that some more.
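
A rough sketch of what that combination might look like. Nothing here has been run against this data, and it assumes things the thread does not state: an int element-attribute range index on category/@categoryId, documents rooted at <entry> with no namespace, and a page size of 10. It also assumes cts:element-attribute-values accepts a cts:query as its fifth argument in the installed version, which is worth checking in the API docs.

```xquery
xquery version "1.0-ml";

(: Assumptions, none of which come from the thread:
   - an int element-attribute range index exists on category/@categoryId
   - each record is its own document rooted at <entry>, no namespace
   - a page size of 10 :)
let $query := cts:field-word-query("FullText", "president")
let $page-start := 1
let $page-size := 10

(: Distinct category ids among the matching fragments, read from the
   range index rather than from the documents themselves. :)
let $category-ids :=
  cts:element-attribute-values(
    xs:QName("category"), xs:QName("categoryId"), (), (), $query)

(: Only the requested page of entries is fetched from disk. :)
let $page :=
  cts:search(/entry, $query, "unfiltered")
    [$page-start to $page-start + $page-size - 1]

return
  <results total-estimate="{xdmp:estimate(cts:search(/entry, $query))}">
    <categories>{
      for $id in $category-ids
      return <category categoryId="{$id}"/>
    }</categories>
    <page>{ $page/title }</page>
  </results>
```

The category list is resolved from the index alone, so it is unfiltered in the same sense as the search itself, and the display titles for each categoryId would still need to be looked up separately.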
> As well as xdmp:query-meters(), you should consult
> xdmp:query-trace() - see
Here's the output from query-meters() and query-trace(). I didn't see
anything unusual, though I'm not sure what the value of the
<qm:elapsed-time> element means. (The search took approximately 5
seconds to return.)
/eval line 1: Analyzing path for search: doc()
/eval line 1: Step 1 is searchable: doc()
/eval line 1: Path is fully searchable.
/eval line 1: Gathering constraints.
/eval line 1: Search query contributed 1 constraint:
cts:field-word-query("FullText", "president", ("lang=en"), 1)
/eval line 1: Executing search.
/eval line 1: Selected 4090 fragments
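
On the elapsed-time question: as far as I can tell the value is an xs:dayTimeDuration (so PT0.05S would mean 0.05 seconds) covering the time spent in the query on the server up to that point. One way to check how much of the 5 seconds is spent inside MarkLogic rather than in returning 4090 fragments to the .NET page is to time the search on the server. This is only a sketch: it assumes documents rooted at <entry> and a first page of 10 results, and the timing is approximate.

```xquery
xquery version "1.0-ml";

(: Assumptions: documents rooted at <entry>, first page of 10 results. :)
let $query := cts:field-word-query("FullText", "president")
let $before := xdmp:elapsed-time()
(: fn:count forces the first page of results to be evaluated. :)
let $hits := fn:count(cts:search(/entry, $query, "unfiltered")[1 to 10])
let $after := xdmp:elapsed-time()
return
  <timing>
    <first-page-hits>{ $hits }</first-page-hits>
    <server-time>{ $after - $before }</server-time>
  </timing>
```

If the server-time value comes back far below 5 seconds, the remaining time is more likely spent fetching and transferring the full result set than in index resolution.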
> Looking at the structure of your documents, I'd try storing
> each entry as a separate document. So your search would
> become /entry rather than /content/entry.
I removed the content element and created and loaded a separate document
for each record. This didn't change the performance, however.
> On 2008-12-29 11:58, Grant Lindley wrote:
> > I'm comparing full-text search performance between MarkLogic 4.0
> > and SQL Server 2005 from a C# .NET web page.
> > So far searches take about twice as long in MarkLogic compared to
> > SQL Server, and I'm looking for suggestions to improve performance
> > in ML.
> > The test data consists of 14,035 searchable records that take up
> > 52 MB in an XML text file.
> > Here's a sample record:
> > <content>
> > <entry entryId="121866">
> > <title>Alvar Aalto</title>
> > <sortTitle>Aalto, Alvar</sortTitle>
> > <searchTitle>Aalto, Alvar</searchTitle>
> > <synopsis>Finland's most distinguished designer, Alvar Aalto is
> > renowned for his building designs as well as for his unique
> > furniture designs that are the archetype of Finnish furniture.
> > </synopsis>
> > <mainText> Finland's most distinguished architect and designer, ...
> > [long text removed]</mainText>
> > <entryDate></entryDate>
> > <searchExclude>False</searchExclude>
> > <hyperlink>False</hyperlink>
> > <furtherReading>Alvar Aalto Museum Web Site
> > (http://www.alvaraalto.fi)</furtherReading>
> > <siteCredits>ABC-CLIO</siteCredits>
> > <citationCredits></citationCredits>
> > <citationCredits2></citationCredits2>
> > <accentUpdated>True</accentUpdated>
> > <category categoryId="22">
> > <displayTitle>Individuals</displayTitle>
> > <formOrder>30</formOrder>
> > <filterable>True</filterable>
> > <categoryTypeId>5</categoryTypeId>
> > <longDescription>Individuals</longDescription>
> > </category>
> > <subTopic subTopicId="62" topicId="3">
> > <displayTitle>Finland</displayTitle>
> > <description>Finland</description>
> > <sortOrder>-1</sortOrder>
> > </subTopic>
> > <topic topicId="3">
> > <description>Europe</description>
> > </topic>
> > </entry>
> > </content>
> > The elements that are included in the search are title, sortTitle,
> > mainText, and siteCredits.
> > For the MarkLogic index settings, I have selected only basic
> > stemmed searches and fast phrase searches.
> > The best results so far have been obtained when the entry element
> > has been added as a fragment root.
> > Here's the code currently being used to execute the search:
> > cts:search(fn:doc()//content/entry,
> > cts:field-word-query("FullText", "president"), "unfiltered" )
> > where "FullText" is a field that has been set up with the four
> > searchable elements above.
> > I tried running with xdmp:query-meters() and didn't find any cache
> > misses.
> > I'm experienced with SQL Server, but brand new to MarkLogic, so any
> > suggestions would be appreciated.
> > -Grant
> > --
> > _______________________________________________
> > General mailing list
> > General at developer.marklogic.com
> > http://xqzone.com/mailman/listinfo/general