[MarkLogic Dev General] Can node libraries be installed server-side?

Will Lawrence will.lawrence at gmail.com
Mon Jul 13 20:48:58 PDT 2015


Thanks, Erik.

It helped me get in the right frame of mind when thinking critically about
where certain ingestion logic should reside. And thanks for digging into
the example of node-xlsx and pointing out that it's an async wrapper built
on an underlying sync library. I definitely looked at the binary extract
for xlsx and the Open Office pipeline, but these seem to allow only
coarse-grained text searches. I need to be able to create indexes and run
fine-grained queries on the data. Plus, xlsx has the nasty behavior of
putting any repeated strings into a separate sharedStrings.xml file, and
there didn't seem to be any MarkLogic server-side solution to remedy this.
I also need to automate, or at least control, the shredding process from an
external tier as much as possible because there will be a lot of different
sets of xlsx files. I'm thinking of massaging the xlsx into JSON, sending
it to MarkLogic, and using CPF to split each "row" into a document, since
the transform function can't do an xdmp.documentInsert().
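
To make the "one document per row" shape concrete, here is a minimal
sketch in plain JavaScript. It assumes the spreadsheet has already been
parsed into an array of row arrays with the header row first. The helper
name rowsToDocuments and the /sheets/... URI scheme are hypothetical and
only for illustration; where the split actually happens (external tier or
CPF) is a separate decision.

```javascript
// Hypothetical helper: turn one parsed sheet (header row + data rows)
// into one JSON document descriptor per data row.
function rowsToDocuments(sheetName, rows) {
  const [header, ...dataRows] = rows;
  return dataRows.map((row, index) => {
    const content = {};
    header.forEach((column, i) => {
      // Missing trailing cells become null so every document has the same keys.
      content[column] = i < row.length ? row[i] : null;
    });
    return {
      uri: `/sheets/${sheetName}/row-${index + 1}.json`, // assumed URI scheme
      contentType: 'application/json',
      content: content
    };
  });
}

const docs = rowsToDocuments('inventory', [
  ['sku', 'name', 'qty'],
  ['A-1', 'widget', 4],
  ['B-2', 'gadget']
]);
console.log(docs.map(d => d.uri));
```

Each descriptor has the uri/contentType/content shape, so the rows can be
indexed and queried as separate documents once they are written.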

Ok, back to the node/npm/JavaScript libraries. Here's a knowledgebase page
<https://help.marklogic.com/knowledgebase/article/View/222/0/server-side-javascript-implementation-and-module-reuse>
I just came across that offers additional explanation of what you pretty
much nailed. I've also included my troubleshooting steps for requiring a
library server side, using 'lodash.js' as the example.

I tried to send lodash.js to the modules database and then use it in a
transform with a `require("lodash.js")` statement, but it failed with:

> "message": "JS-JAVASCRIPT: var _ = require('lodash.js'); -- Error running
> JavaScript request: XDMP-NOEXECUTE: Document is not of executable mimetype.
> URI: lodash.js
>
So, I needed to write it as lodash.sjs and use require("lodash.sjs"). But
then this failed with:

> "message": "JS-JAVASCRIPT: var _ = require('lodash.sjs'); -- Error running
> JavaScript request: XDMP-MODNOTFOUND: Module lodash.sjs not found

To fix this, I sent it with the uri "/lodash.sjs" (note the leading slash)
and used require("/lodash.sjs").

Note: I used contentType: "application/vnd.marklogic-javascript" when
sending lodash.sjs to the server, and I used the Node.js client API call
modulesDb.documents.write instead of the more specialized
db.config.extlibs.write because I couldn't get the transform's require
statement to work with the latter. Plus, the former feels like it gives
more flexibility without having to learn a special set of write and read
calls. Maybe my perspective will change on this with time.
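
The steps above can be sketched roughly as follows. The helper names
(toServerModuleUri, buildModuleDescriptor, installModule) are hypothetical.
modulesDb is assumed to be a Node.js client created with
marklogic.createDatabaseClient(...) and pointed at the modules database;
connection details are omitted.

```javascript
// Map a library file name to a URI the server will resolve and execute:
// an absolute path (leading slash) with the .sjs extension.
function toServerModuleUri(fileName) {
  const withSjs = fileName.replace(/\.js$/, '.sjs');
  return withSjs.startsWith('/') ? withSjs : '/' + withSjs;
}

// Build the document descriptor for the library source.
function buildModuleDescriptor(fileName, source) {
  return {
    uri: toServerModuleUri(fileName),
    contentType: 'application/vnd.marklogic-javascript',
    content: source
  };
}

// Write the library into the modules database.
// `modulesDb` is an assumption: a client connected to the modules database.
function installModule(modulesDb, fileName, source) {
  return modulesDb.documents
    .write(buildModuleDescriptor(fileName, source))
    .result();
}

console.log(toServerModuleUri('lodash.js'));
```

A transform would then load the library with require("/lodash.sjs"),
matching the URI the descriptor was written under.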


Regards,

Will

------------------------------

Message: 2
Date: Mon, 13 Jul 2015 02:55:54 +0000
From: Erik Hennum <Erik.Hennum at marklogic.com>
Subject: Re: [MarkLogic Dev General] Can node libraries be installed
        server-side?
To: MarkLogic Developer Discussion <general at developer.marklogic.com>
Message-ID:
        <DFDF2FD50BF5AA42ADAF93FF2E3CA185070EAF9D at EXCHG10-BE01.marklogic.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi, Will:

There are some significant differences between Node.js and MarkLogic as a
JavaScript runtime environment (even though both make use of v8).

First and foremost, Node.js emphasizes asynchronous IO.  As a transactional
database, MarkLogic emphasizes synchronous IO.  You can execute
asynchronous actions in MarkLogic (via the task server), but when you do an
xdmp.documentInsert(), the operation blocks until the operation succeeds or
fails.

Stepping back, the tier where you implement an action is not arbitrary.  In
the database, it's best to write short actions (similar to stored
procedures) for query expansion, query composition, inbound or outbound data
transformation, and so on.  The middle tier is great for information bus
operations, business logic, and so on.

With that perspective, the libraries that make sense to use as dependencies
for server-side JavaScript actions are those that finish synchronous
actions quickly.

For that reason, in the particular case, my guess would be that js-xlsx
(the core library wrapped by node-xlsx) might be a better fit for
server-side processing than node-xlsx (which adds asynchronous IO
conveniences that would not work in the server).
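
A minimal sketch of why the distinction matters, written for Node.js. A
trivial CSV parser stands in for spreadsheet parsing, and all function
names are hypothetical; nothing here is taken from js-xlsx or node-xlsx.
The point is only that a function which must hand back its result when it
returns can use a synchronous library but not a callback-based wrapper.

```javascript
// Synchronous core: the result is available as soon as the call returns.
function parseCsvSync(text) {
  return text.trim().split('\n').map(line => line.split(','));
}

// Asynchronous wrapper in the Node.js callback style (illustrative only).
function parseCsvAsync(text, callback) {
  setImmediate(() => callback(null, parseCsvSync(text)));
}

// A transform-like function built on the synchronous core.
function transformWithSync(text) {
  return parseCsvSync(text).length;
}

// The same function built on the asynchronous wrapper: the callback has
// not run yet when the function returns, so there is nothing to return.
function transformWithAsync(text) {
  let rowCount;
  parseCsvAsync(text, (err, rows) => { rowCount = rows.length; });
  return rowCount;
}

console.log(transformWithSync('a,b\n1,2'));  // 2
console.log(transformWithAsync('a,b\n1,2')); // undefined
```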

At present, you would need to either modify the mimetypes configuration to
identify *.js as an extension for server-side JavaScript (so the server
knows that it's not static JavaScript to send to the client) or rename the
library extension to sjs.

You could put the library in the modules database as described in:

    http://docs.marklogic.com/guide/rest-dev/extensions#id_55309

Then, require the library in your transform or main module.

The speculations about package management for such dependencies are very
interesting.

By the way, the server can extract metadata from spreadsheets without
installing an external library:


http://docs.marklogic.com/guide/search-dev/binary-document-metadata#id_74790


Hoping that helps,



Erik Hennum

------------------------------

Message: 1
Date: Sun, 12 Jul 2015 22:19:41 -0400
From: Will Lawrence <will.lawrence at gmail.com>
Subject: [MarkLogic Dev General] Can node libraries be installed
        server-side?
To: general at developer.marklogic.com
Message-ID:
        <CAGEHXqseoL3dqoGoBk-T6FZe6fx-M8DhyLbnLw3LC6T1c0mBpQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I tried but couldn't find any examples or guidance for using node libraries
within .sjs files on the MarkLogic server. How could we use, for example,
the npm module 'node-xlsx' in a transform?

It would be great to be able to leverage the power of the npm and node
micro-library ecosystem within .sjs files.

Perhaps there could be an .npmrc file controlled via the MarkLogic admin to
specify if the server is allowed to talk to registry.npmjs.com or an
enterprise npm registry or none at all. Then, a REST API could be exposed to
write dependencies to MarkLogic's package.json that would automatically
do an 'npm install' so that when an .sjs file is installed, it can execute
the line:

    spreadsheetShredder = require('node-xlsx');

Regards,
Will


-- 
William Lawrence
703-873-7035

