[MarkLogic Dev General] fields where the contents is a URL
Grobstein, Spike
Spike.Grobstein at wmg.com
Fri Nov 6 13:32:57 PST 2009
My goal is to run a query and get back a sequence of all the values
that occur in that field. I have a lot of documents that contain:
<param type="profile_url">http://www.blah.com/profile/path</param>
I was running a query that requested all documents within a date range
(we've got elements that contain a datestamp) for a specific site
(e.g. facebook), then pulling the unique values from the above field.
That ran into speed and memory usage problems: I kept getting the
cachefull exception.
I really need to be able to get a list of values that are in that field
so I can create an index page and also do faster queries from that.
Fields were working great until I tried to use them on URLs.
Any other suggestions?
...spike
________________________________
From: general-bounces at developer.marklogic.com
[mailto:general-bounces at developer.marklogic.com] On Behalf Of Frank Rubino
Sent: Thursday, November 05, 2009 2:01 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] fields where the contents is a URL
Spike-
I think you should look at a different way to index the URL. For
instance, can you set up a range index with scalar type anyURI?
Frank
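Frank's suggestion amounts to indexing each URL as one whole value rather than as separate words. As a rough illustration only (plain Python, not MarkLogic's API or its actual implementation; the ValueIndex class, the document ids, and the second URL are hypothetical), an index keyed on whole values can list the distinct URLs without opening any documents:

```python
from collections import defaultdict


class ValueIndex:
    """Toy stand-in for a value index: each whole string is one key."""

    def __init__(self):
        # Maps a complete URL string to the set of documents containing it.
        self._docs_by_value = defaultdict(set)

    def add(self, doc_id, value):
        # The value is stored as-is; no splitting on punctuation.
        self._docs_by_value[value].add(doc_id)

    def values(self):
        # Distinct values come straight from the keys, in sorted order.
        return sorted(self._docs_by_value)

    def docs_for(self, value):
        # Exact-match lookup on the full URL.
        return sorted(self._docs_by_value.get(value, ()))


index = ValueIndex()
index.add("doc1", "http://www.blah.com/profile/path")
index.add("doc2", "http://www.blah.com/profile/path")
index.add("doc3", "http://www.blah.com/profile/other")  # hypothetical URL

distinct_urls = index.values()
matching_docs = index.docs_for("http://www.blah.com/profile/path")
```

Here distinct_urls holds each URL once, however many documents repeat it, which is the kind of list an index page needs.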
On Nov 5, 2009, at 11:56 AM, Grobstein, Spike wrote:
I've got a Field configured in my database that I want to run
field-words() queries against, but the content of the element is a URL.
When I search, it seems that the field holds the URL broken up at every
symbol. For example:
http://www.facebook.com/Seal?sid=01cfb667e33bd4a46d3460853fbf3fe7&ref=search
is translated into the following field words:
* http
* www
* facebook
* com
* Seal
* sid
* 01cfb667e33bd4a46d3460853fbf3fe7
* ref
* search
Is there a way around this? Should I not be using Fields?
I need to be able to do queries based on the full URL.
Thanks!
...spike
Spike Grobstein
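The breakdown described in this message can be approximated with a short sketch. This is plain Python and only an assumption about the behavior (splitting on every character that is not a letter or digit), not MarkLogic's actual tokenizer; the word_tokens helper is hypothetical:

```python
import re


def word_tokens(text):
    # Split on any run of characters that is not a letter or digit,
    # dropping the empty strings produced by adjacent separators.
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]


url = "http://www.facebook.com/Seal?sid=01cfb667e33bd4a46d3460853fbf3fe7&ref=search"
tokens = word_tokens(url)

# A word-level match finds the pieces but never the URL as a whole.
has_word = "facebook" in tokens
has_full_url = url in tokens
```

Under that assumption the full URL never appears as a single token, which is why a word-based field cannot answer a query for the complete URL.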