[MarkLogic Dev General] Re: General Digest, Vol 58, Issue 28
Michael Blakeley
michael.blakeley at marklogic.com
Tue Apr 28 16:31:15 PDT 2009
Dave,
What's the typical size of a DESC element? It sounds like they must be
fairly large to have a noticeable effect on performance: 100 kB? 500 kB?
Do you anticipate needing to write queries that AND something in the
main document with terms in the DESC element? Perhaps an and-query of
the patent title with words in the body, or something like that?
I'm asking because the best solution might change according to the
features you need. If you need to write queries that join something in
the main document with something in the DESC element, then you may be
better off leaving the document un-fragmented. The index entries point
to fragments, so it's tricky to query terms that live in two different
fragments.
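To illustrate the point about index entries, here is a rough toy model in Python. It is only a sketch of the general idea (an inverted index whose postings are fragment ids), not how the server actually stores or resolves its indexes, and the document text, fragment names, and helper functions are all made up for the example.

```python
from collections import defaultdict


def build_index(fragments):
    """Map each lowercased word to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, text in fragments.items():
        for word in text.lower().split():
            index[word].add(frag_id)
    return index


def and_query(index, *terms):
    """Return the fragment ids that contain every term (set intersection)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Un-fragmented: the whole (hypothetical) patent is a single fragment.
whole = {
    "patent-1": "widget title inventor smith description rotating gear assembly",
}

# Fragmented: DESC is split off into its own fragment.
split = {
    "patent-1/biblio": "widget title inventor smith",
    "patent-1/DESC": "description rotating gear assembly",
}

print(and_query(build_index(whole), "widget", "gear"))  # {'patent-1'}
print(and_query(build_index(split), "widget", "gear"))  # set()
```

In the fragmented case no single fragment holds both terms, so the intersection is empty even though the document as a whole matches. That is the sense in which a query that ANDs a title term with a DESC term gets tricky once DESC is its own fragment.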
If you do not need to write such queries, then you should probably
contact support and see if we can come up with an appropriate solution.
-- Mike
On 2009-04-28 10:41, Dave Feldmeier wrote:
> Mike,
>
> You are correct - I meant to say that DESC is a fragment root, not a
> fragment parent.
>
> The reason that I fragment is for performance. In my application, I
> display a set of lines of bibliographic info (title, inventor, various
> dates, etc.), one line per document. I split the document into two
> pieces: bibliographic information and the main text (which is larger
> than the bibliographic info).
>
> After implementing fragmenting, the search performance is better because
> I need only fetch the bibliographic info and not the entire document. I
> also allow the user to sort on various columns, so if the user clicks on
> a column name, I repeat the most recent search terms but request the
> appropriate sort order for the returned results. I only return the first
> page of results and a page tends to be in the range of 20 to 30
> documents, depending on the user. It's this latter operation for which
> I'm trying to improve the performance.
>
> -Dave
>> Message: 2
>> Date: Mon, 27 Apr 2009 15:47:50 -0700
>> From: Michael Blakeley <michael.blakeley at marklogic.com>
>> Subject: Re: [MarkLogic Dev General] Questions about results ordering
>> with element range indices
>> To: General Mark Logic Developer Discussion
>> <general at developer.marklogic.com>
>> Message-ID: <49F63616.8080906 at marklogic.com>
>> Content-Type: text/plain; charset=UTF-8; format=flowed
>>
>> Dave,
>>
>> One correction: the fragment root behaves the way you've described. A
>> fragment parent on DESC creates a new sub-fragment for every child of
>> DESC. That could create many more fragments than you want.
>>
>> But I wonder why you decided to fragment these documents at all?
>>
>> -- Mike
>>
>> On 2009-04-27 13:30, Dave Feldmeier wrote:
>>
>>> Mike,
>>>
>>> Indexing is complete. The search element and sort element should be in the same fragment. An abbreviated form of my XML structure is:
>>> <PATENT>
>>> <PATNUM>
>>> <ASSS>
>>> several layers down <ASSS_AESNC>
>>> <DESC>
>>> other stuff
>>> All tags are unique at all levels of the XML hierarchy (e.g., PATNUM appears only at the top level and not within <DESC>).
>>>
>>> I have set <DESC> as a fragment parent. My understanding is that <DESC> and below will be one fragment and everything else will be in a second fragment. In this case, ASSS_AESNC and PATNUM should be in the same fragment, correct? Do I also need to set <PATENT> as a fragment root?
>>>
>>> In some cases, <DESC> does not exist, and my guess is that it is in these cases, where the document has a single fragment, that the ordering constraint helps (only two documents in the example that I gave).
>>>
>>> Also, I have the default namespace. However, all documents in the system have the same XML structure, so I didn't think that there would be a problem.
>>>
>>> What am I missing here? Thanks.
>>>
>>> -Dave
>>>