[MarkLogic Dev General] element-word-match

Danny Sokolsky Danny.Sokolsky at marklogic.com
Tue Jul 14 11:50:08 PDT 2015


Thanks Andreas, and thanks Geert.

Another way to get around this is to pass an empty and-query to the word lexicon call.  Something like:

cts:element-words(xs:QName("title"), "", (), cts:and-query( () ) )

That is not needed for range index lexicon calls, but appears to be needed for word lexicon calls in order to filter out deleted fragments when running as a user with the admin role.

I filed a bug on this end for this behavior as I don't think you should need to do that.  At any rate, pass in the empty and-query (which matches everything) and that will solve your issue.
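
For reference, the same workaround applied to cts:element-word-match, the call from the original question, might look like the sketch below. It assumes the element word lexicon on title from this thread and passes the empty and-query in the fourth ($query) position; the "*" pattern is only an example.

```xquery
xquery version "1.0-ml";

(: Sketch only: assumes an element word lexicon exists for <title>. :)
(: cts:and-query(()) matches every fragment. Per the explanation in this
   thread, supplying a query is what makes the lexicon call look at
   fragments, so words from deleted fragments should drop out. :)
cts:element-word-match(
  xs:QName("title"),   (: element that has the word lexicon :)
  "*",                 (: wildcard pattern, i.e. all words :)
  (),                  (: no options :)
  cts:and-query(())    (: empty and-query, matches everything :)
)
```

With the document from the test case inserted and then deleted, this should return the empty sequence even when run as a user with the admin role.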

Thanks,
-Danny 

From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Andreas Hubmer
Sent: Tuesday, July 14, 2015 12:27 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] element-word-match

Hi,

In my applications I never use the admin role. In the unit tests, however, it is acceptable to me, because switching users between setup/tearDown and the actual tests would often mean more work (the application user does not have all the rights needed for the test setup). My security model is tested in separate integration tests with the application user.

I've boiled the problem down to the following steps:
1) create an element word lexicon for <title>
2) Insert and afterwards delete a document:
xquery version "1.0-ml";
xdmp:document-insert("/tmp.xml",
    <document>
        <title>puzzling</title>
    </document>,
    xdmp:permission("some-non-admin-role", "read"))
;
xdmp:document-delete("/tmp.xml")
3) Run a lexicon query (cts:element-words or cts:element-word-match for "*"):
cts:element-words(xs:QName("title"))

Geert, as you said, the problem concerns only the admin user. Running the lexicon query with "some-non-admin-role" returned an empty sequence.

In my unit tests I'll try to use xdmp:merge with the current request timestamp as a workaround. This should help at least on the Linux cluster, but on my local Windows machine merging did not help.

I guess the reason for this anomaly is performance for the admin role. But for me it is unexpected behaviour. Performance should not take precedence over correctness.

That said, thanks a lot for your support!

Best regards,
Andreas


2015-07-14 1:07 GMT+02:00 Danny Sokolsky <Danny.Sokolsky at marklogic.com>:
I don’t think that it should make a difference running as admin.  I think it used to make a difference, but it has not for quite a while.
 
Andreas, do you have a simple test case you can share that shows this?
 
Thanks,
-Danny
 
From: general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] On Behalf Of Geert Josten
Sent: Monday, July 13, 2015 11:26 AM
To: MarkLogic Developer Discussion

Subject: Re: [MarkLogic Dev General] element-word-match
 
I can’t tell exactly either, but your observation could be right that it only occurs with word and value lexicon lookups. Those start with values, and only look at fragments later (if you provide a cts:query, for instance). A cts:search starts with fragments, so it is easier to ignore deleted fragments from the start.
 
Again, this is a rare anomaly, only observed with the admin user. Normal applications should never run as the admin user. It is good practice to always aim for a customized security model that hands out as few privileges as possible to users directly.
 
Cheers,
Geert
 
From: <general-bounces at developer.marklogic.com> on behalf of Andreas Hubmer <andreas.hubmer at ebcont.com>
Reply-To: MarkLogic Developer Discussion <general at developer.marklogic.com>
Date: Monday, July 13, 2015 at 5:26 PM
To: MarkLogic Developer Discussion <general at developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] element-word-match
 
Can you elaborate on "Deleted fragments are only visible to admin"? In what case are deleted fragments visible to the admin?
So far I haven't seen deleted fragments in query results, except for the lexicon lookup with cts:element-word-match. In two additional tests I've just seen that 
* words of deleted fragments are returned by cts:element-words
* words of deleted fragments are not considered in estimations of search result sizes: xdmp:estimate(cts:search(doc(), cts:element-word-query(xs:QName("title"), "Test"), "unfiltered"))
 
So, in the case of an unfiltered query deleted fragments are not visible any more, but in the case of lexicon lookups the deleted fragments seem to be visible. This is still puzzling to me.
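
The two observations can be put side by side in one query. This is only a sketch: it assumes the element word lexicon on title from this thread and reuses the unfiltered estimate shown above with default query options, so the numbers depend on the index settings of the database.

```xquery
xquery version "1.0-ml";

(: Sketch only: assumes an element word lexicon exists for <title>. :)
let $title := xs:QName("title")
(: Every word the lexicon reports, including any from deleted fragments. :)
for $word in cts:element-words($title)
return
  <word text="{$word}"
        estimate="{xdmp:estimate(
                     cts:search(fn:doc(),
                       cts:element-word-query($title, $word),
                       "unfiltered"))}"/>
```

A word that the lexicon returns but whose estimate is 0 would be a candidate for a word that survives only in deleted fragments.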
 
Cheers,
Andreas
 
2015-07-13 16:14 GMT+02:00 Geert Josten <Geert.Josten at marklogic.com>:
Deleted fragments are only visible to admin, as security is bypassed for that user. That also gives you a slight performance gain. As a general rule of thumb, never run tests as admin.
 
Cheers,
Geert
 
From: <general-bounces at developer.marklogic.com> on behalf of Andreas Hubmer <andreas.hubmer at ebcont.com>
Reply-To: MarkLogic Developer Discussion <general at developer.marklogic.com>
Date: Monday, July 13, 2015 at 4:05 PM
To: MarkLogic Developer Discussion <general at developer.marklogic.com>
Subject: Re: [MarkLogic Dev General] element-word-match
 
xdmp:merge with <merge-timestamp>{xdmp:request-timestamp()}</merge-timestamp> does not make a difference on my local Windows machine. Afterwards there are still deleted fragments in the database.
 
I've also tested it on a Linux cluster and there xdmp:merge with <merge-timestamp>{xdmp:request-timestamp()}</merge-timestamp> does indeed remove the deleted fragments. Afterwards cts:element-word-match returns the expected empty list.
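
For completeness, the merge call I am describing looks roughly like the sketch below. The options element lives in the "xdmp:merge" namespace; no forests are listed, on the assumption that the merge should cover the forests of the current database.

```xquery
xquery version "1.0-ml";

(: Sketch only: requests a merge that may discard fragments deleted
   before the timestamp of this request. :)
(: Run this as a query statement: xdmp:request-timestamp() returns the
   empty sequence inside an update statement. :)
xdmp:merge(
  <options xmlns="xdmp:merge">
    <merge-timestamp>{xdmp:request-timestamp()}</merge-timestamp>
  </options>)
```

As reported above, this removed the deleted fragments on the Linux cluster but not on the Windows machine, so it should not be relied on as a guaranteed cleanup.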
 
Nevertheless I am surprised that cts:element-word-match returns values from deleted fragments.
 
Cheers,
Andreas
 
2015-07-13 15:42 GMT+02:00 Christopher Hamlin <cbhamlin at gmail.com>:
On Mon, Jul 13, 2015 at 9:30 AM, Andreas Hubmer
<andreas.hubmer at ebcont.com> wrote:
> Hi,
>
>> I think in recent versions of ML it's the case that calling merge won't
>> necessarily merge right down to 0 deleted fragments?
> Yes, that seems to be the case. Even long after calling xdmp:merge the
> deleted fragments exist.

It won't normally discard deleted fragments right away:

https://help.marklogic.com/Knowledgebase/Article/View/193/0/unable-to-merge-all-deleted-fragments-on-forest-with-32gb-max-merge-size
_______________________________________________
General mailing list
General at developer.marklogic.com
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general



 
-- 
Andreas Hubmer
IT Consultant
 
EBCONT enterprise technologies GmbH 
Millennium Tower
Handelskai 94-96
A-1200 Vienna
 
Mobile: +43 664 60651861
Fax: +43 2772 512 69-9
Email: andreas.hubmer at ebcont.com
Web: http://www.ebcont.com
 
OUR TEAM IS YOUR SUCCESS
 
UID-Nr. ATU68135644
HG St.Pölten - FN 399978 d
