[MarkLogic Dev General] Multiple (sequential) CPF pipelines

Geert Josten geert.josten at dayon.nl
Wed Aug 22 12:28:14 PDT 2012


Hi David,



This looks to me like it should work. The state changes at the end of each
pipeline, at which point all updates are committed to the MarkLogic database.
CPF then notices another update, checks the state, looks up the appropriate
pipeline, and calls it.
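
As a rough sketch of how an action module usually hands control back to CPF: I have not seen q1.xqy or q2.xqy, so the body below is only a placeholder and the logging call stands in for the real per-item work. It assumes the standard action pattern, in which the module itself calls cpf:success (or cpf:failure) to move the document to the on-success (or on-failure) state of the transition. If q2.xqy never reaches cpf:success, that would be one possible explanation for a state that never becomes 'updated'.

```xquery
xquery version "1.0-ml";

import module namespace cpf = "http://marklogic.com/cpf"
  at "/MarkLogic/cpf/cpf.xqy";

(: Both variables are supplied by CPF when it invokes the action. :)
declare variable $cpf:document-uri as xs:string external;
declare variable $cpf:transition as node() external;

(: Only proceed if the document is still in the state this transition expects. :)
if (cpf:check-transition($cpf:document-uri, $cpf:transition)) then
  try {
    (: Placeholder work: the real per-item logic is not shown in this thread. :)
    let $items := fn:doc($cpf:document-uri)/items/item
    let $_ := xdmp:log(fn:concat("Processing ", fn:count($items), " items"))
    (: An empty override state means "use the transition's on-success state". :)
    return cpf:success($cpf:document-uri, $cpf:transition, ())
  }
  catch ($e) {
    (: Moves the document to the transition's on-failure state and records $e. :)
    cpf:failure($cpf:document-uri, $cpf:transition, $e, ())
  }
else ()
```

Because the cpf:success call is part of the same update as the action's own work, the new state should only become visible once the action has finished and committed.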



I would expect the state to remain at whatever the last pipeline to run
assigns to it. Have you tried checking the document properties of the ID list
file? You may also want to check ErrorLog.txt for anomalies, and you can
enable CPF trace events for more extensive logging. As I recall, you can
add ‘CPF’ to the list of trace events for the Group on the Diagnostics page
in the Admin interface.
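
A minimal sketch of such a check, to be run in a query console against the content database. The URI "/lists/ids.xml" is a made-up placeholder for the ID list file; substitute the real one.

```xquery
xquery version "1.0-ml";

import module namespace cpf = "http://marklogic.com/cpf"
  at "/MarkLogic/cpf/cpf.xqy";

(: Hypothetical URI of the ID list document; replace with the real one. :)
let $uri := "/lists/ids.xml"
return (
  (: Current CPF state, e.g. .../states/intermediate or .../states/updated :)
  cpf:document-get-state($uri),
  (: Processing status that CPF recorded for the document :)
  cpf:document-get-processing-status($uri),
  (: Error recorded by a failed action, if any :)
  cpf:document-get-error($uri),
  (: Raw properties fragment, for comparison with the values above :)
  xdmp:document-properties($uri)
)
```

If the state is still 'intermediate' while an error is recorded, that would suggest the second action is failing rather than the state being reset afterwards.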



Kind regards,

Geert



*From:* general-bounces at developer.marklogic.com [mailto:general-bounces at developer.marklogic.com] *On Behalf Of* Fox, David
*Sent:* Wednesday, 22 August 2012 21:14
*To:* general at developer.marklogic.com
*Subject:* [MarkLogic Dev General] Multiple (sequential) CPF pipelines



Hello,



I have two pipelines that I want to execute on a single file. This file
contains a list of document IDs, something like:



<items>
    <item>doc1.xml</item>
    <item>doc2.xml</item>
    ...
</items>



The following pipeline executes 'q1.xqy' which iterates through these
documents and does something for each file:



<?xml-stylesheet href="/cpf/pipelines.css" type="text/css"?>
<pipeline xmlns="http://marklogic.com/cpf/pipelines"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://marklogic.com/cpf/pipelines pipelines.xsd">
  <pipeline-name>Query1</pipeline-name>
  <success-action>
    <module>/MarkLogic/cpf/actions/success-action.xqy</module>
  </success-action>
  <failure-action>
    <module>/MarkLogic/cpf/actions/failure-action.xqy</module>
  </failure-action>
  <state-transition>
    <state>http://marklogic.com/states/initial</state>
    <on-success>http://marklogic.com/states/intermediate</on-success>
    <on-failure>http://marklogic.com/states/error</on-failure>
    <execute>
      <condition>
        <module>/MarkLogic/cpf/actions/namespace-condition.xqy</module>
        <options xmlns="/MarkLogic/cpf/actions/namespace-condition.xqy">
          <root-element>items</root-element>
          <namespace/>
        </options>
      </condition>
      <action>
        <module>q1.xqy</module>
      </action>
    </execute>
  </state-transition>
</pipeline>



This part works fine. I now have a second pipeline, which runs q2.xqy and
which I want to execute only after q1.xqy has completed. It looks like:



<?xml-stylesheet href="/cpf/pipelines.css" type="text/css"?>
<pipeline xmlns="http://marklogic.com/cpf/pipelines"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://marklogic.com/cpf/pipelines pipelines.xsd">
    <pipeline-name>Query 2</pipeline-name>
    <success-action>
        <module>/MarkLogic/cpf/actions/success-action.xqy</module>
    </success-action>
    <failure-action>
        <module>/MarkLogic/cpf/actions/failure-action.xqy</module>
    </failure-action>
    <state-transition>
        <state>http://marklogic.com/states/intermediate</state>
        <on-success>http://marklogic.com/states/updated</on-success>
        <on-failure>http://marklogic.com/states/error</on-failure>
        <execute>
            <condition>
                <module>/MarkLogic/cpf/actions/namespace-condition.xqy</module>
                <options xmlns="/MarkLogic/cpf/actions/namespace-condition.xqy">
                    <root-element>items</root-element>
                    <namespace/>
                </options>
            </condition>
            <action>
                <module>q2.xqy</module>
            </action>
        </execute>
    </state-transition>
</pipeline>



Note that it has the same condition as the first pipeline, since the file it
acts on has not changed. I am trying to use state transitions to manage the
order in which the pipelines are executed. The first pipeline should set
the state to ‘intermediate’, at which point the second pipeline should be
executed and finally set the state to ‘updated’. Currently, the second
pipeline does run, but the state never moves to ‘updated’.



I’m not sure this is even what I want, since I need the second pipeline to
wait for the first one to complete. Does the state change immediately or
wait until the query completes? If the former, then perhaps I should be
using processing status rather than state?



How do I create a second pipeline that only runs after the first one
completes?







David Fox

Vocabulary Developer

The Associated Press

dfox at ap.org

212.621.5491


