Community blogging update: Big Data apps, XQuery tricks, code performance, etc.

by Evan Lenz

There has been a fair bit of MarkLogic blogging since the last community blogging update. We start with a couple of posts on the nature of Big Data problems. Philip Fennell reflects on the essential role that links play in garnering value from Big Data:

And Micah Dubinko observes that, as fast as it is to get up and running in MarkLogic, it's sometimes essential to do a bit of data modeling:

As mentioned in Micah's post, Alex Milowski has written a series of posts on using MarkLogic to experiment with big weather data. I'm looking forward to the next post in this series:

While we're talking hands-on fun with MarkLogic, check out Dave Erickson's brand-new tutorial/challenge using data from NPR. I really like the approach he's taking here. Rather than walking you through every granular step, he gives you a slightly higher-level description of the task and lets you figure out how to do each step using your own ingenuity and ability to search the MarkLogic documentation (hint: use the search bar on this site). I'm looking forward to working through these tutorials. Keep them coming, Dave!

Are you new to XQuery? Dave has some pointers for you too:

In the handy-tips-and-tricks department, check out Paxton Hare's post on gzip compression:

And see Jon Cook's cookbook of common XQuery tasks (for alternative, faster formulations of some of these, be sure to read the comment/reply on the post by yours truly):

Speaking of useful tidbits, my absolute favorite post this time around (because it was so enlightening to me) was Michael Blakeley's "Directory Assistance." I had wrongly assumed that you needed to have directory fragments to make use of functions like cts:directory-query() and xdmp:directory(). Read the post to understand what I'm talking about. This should be required reading:

As usual, we have a few insightful posts in the XQuery performance department:

Speaking of performance, Ryan Dew provides an update on his progress in creating a fast-performing in-memory-update XQuery module for MarkLogic:

Ryan has also been playing around with implementing XQuery 3.0 functions:

And, most recently, he has commended the use of XSLT for the view layer in MarkLogic applications, an approach I can't help but be partial to. I was a little skeptical of his point #3 (metaprogramming) until I realized that I'm doing exactly what he describes regarding JavaScript optimization in the code that runs the Community site. And I'm reusing it too, which goes to his point #2 about "Partials." What can I say? XSLT is cool.

Speaking of project updates, Demian Hess provides a progress report on his fast-advancing "ML.NET" project:

A community blogging update would be incomplete without at least one installment from Dave Cassel—this time about a library for converting CSV to XML:

Last but not least, we have both a preview and a review of MarkLogic World 2012:

Please let me know if you're blogging or want to start blogging about MarkLogic, and I'll be sure to keep an eye out for the new, awesome content you write!