XQuery Coding Guidelines

by Micah Dubinko

Code gets read more often than it gets written, so how it gets laid out on the screen is a critical component of maintainability. When a project involves more than one person checking in code, it gets even more important.

So here's a peek inside the MarkLogic Engineering team's process -- our set of agreed-upon guidelines for formatting XQuery code. It's arranged in a laws-of-robotics sequence with the cardinal rule first. Later rules fill in the details, but can't override earlier ones.

It's my hope that this is a useful resource for XQuery programmers everywhere. The guide aims to provide a few key rules to be applied sensibly, rather than an exhaustive list.


Rule 0

Thou shalt not mix tabs and spaces. The entire project must use spaces exclusively (4 per level) for indentation, except for third-party libraries (e.g. XSLTforms, YUI).


Rule 1

Structure function signatures using the following example:
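
The example itself is not reproduced in this copy. As a stand-in, here is a minimal sketch of one layout that is consistent with the other rules (4-space indentation, one parameter per line). The exact placement of parentheses and braces is an assumption, not necessarily the team's canonical form, and the ex prefix, its namespace URI and the function name are hypothetical.

```xquery
xquery version "1.0";

(: Hypothetical namespace and function, used only to illustrate layout. :)
declare namespace ex = "http://example.com/ex";

declare function ex:full-name(
    $first as xs:string,
    $last as xs:string
) as xs:string
{
    concat($first, " ", $last)
};

ex:full-name("Ada", "Lovelace")
```

Putting each parameter on its own line keeps the signature readable as parameters are added, and keeps diffs small when one of them changes.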

Rule 2

Use common sense and make code readable on the screen unless it would conflict with rule 0 or 1.


Rule 3

If editing existing code, adopt a style to fit in with what's already there, unless it would conflict with rules 0, 1 or 2.


Rule 4

If/Then/Else or For/Let/Where/Order by/Return expressions should either fit on a single line, or else have the aforementioned keywords left-aligned, unless it would conflict with rules 0, 1, 2 or 3.

Examples:
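
The original examples are not reproduced in this copy. The sketch below shows one reading of the rule, with made-up values: a short expression kept on a single line, followed by a longer one whose for/let/where/order by/return keywords share a column, as do the nested if/then/else keywords.

```xquery
xquery version "1.0";

(
    (: Short enough to fit on a single line. :)
    for $x in 1 to 3 return $x * 2,

    (: Too long for one line, so each keyword starts at the same column. :)
    for $n in 1 to 6
    let $sq := $n * $n
    where $sq gt 10
    order by $sq descending
    return
        if ($sq mod 2 eq 0)
        then concat($n, " squared is even")
        else concat($n, " squared is odd")
)
```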

 

Rule 5

Use the same amount of indentation as the rest of the project (4 spaces), unless it would conflict with rules 0, 1, 2, 3 or 4.

 

Rule 6

Use comments sparingly to indicate the intent of blocks of code; if the project uses xqDoc-style comments, write those as well, unless it would conflict with rules 0, 1, 2, 3, 4 or 5.
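
As a rough illustration, the sketch below pairs an xqDoc-style comment (opened with `(:~` and using `@param` and `@return` tags) with a single ordinary comment that explains intent rather than restating the code. The ex prefix, its namespace URI and the function are hypothetical.

```xquery
xquery version "1.0";

(: Hypothetical namespace, for illustration only. :)
declare namespace ex = "http://example.com/ex";

(:~
 : Returns the initials of the given names, in order.
 :
 : @param $names the names to abbreviate
 : @return one string holding the first letter of each name
 :)
declare function ex:initials(
    $names as xs:string*
) as xs:string
{
    (: Empty names contribute nothing, so no special case is needed. :)
    string-join(for $n in $names return substring($n, 1, 1), "")
};

ex:initials(("Ada", "Lovelace"))
```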


Rule 7

Use short but meaningful variable and function names. Rely on the default function namespace instead of writing the fn: prefix, and always use a prefix for functions you define, unless it would conflict with rules 0, 1, 2, 3, 4, 5 or 6.
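
A small sketch of how this might look in practice: the built-in sum() is called without fn: because it resolves through the default function namespace, while the project's own function carries a prefix. The inv prefix, its namespace URI and the function name are hypothetical.

```xquery
xquery version "1.0";

(: Hypothetical project namespace, for illustration only. :)
declare namespace inv = "http://example.com/inventory";

(: Prefixed, so a reader can tell at a glance that this is project code. :)
declare function inv:total(
    $prices as xs:decimal*
) as xs:decimal
{
    (: sum(), not fn:sum(): built-ins resolve via the default function namespace. :)
    sum($prices)
};

inv:total((1.50, 2.25))
```

With this split, an unprefixed call always reads as a built-in and a prefixed call always reads as project or library code.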

Comments

  • Blowhards talking common sense as if no one else has ever thought of this....
  • Worth mentioning xdmp:pretty-print(). xdmp:pretty-print("for $x in 1 to 10 return $x") => "for $x in 1 to 10 return $x"  
    • Good you mention it, I was about to.. :-) Does pretty-print conform to the rules? ;-)
  • Good initiative! Personally, I think indentation helps identify code blocks. So +1 for a line-feed+indent after := when followed by a multi-line value definition. I practically always write the if/then/else as a multi-line, though. If you insist on writing the then on the next line (I always put it after the if (..) though), add an extra line-feed+indent after it. Ditto for the else, unless it is really short, as in else (). For the same reason I tend to add a line-feed + indent after a return as well. Or a double line-feed and no indent if I expect a lot of returns after each other. I also like the guideline on xqdoc.org, but it seems inaccessible at the moment: http://xqdoc.org/xquery-style.html Kind regards, Geert
    • I prefer the else/if blocks to line up with the preceding if. The most common use case (for me) of if/else/if/else/if is to simulate a "switch" where all the logic is at the same level. Indenting for every if makes this intention unclear. Like:
          if( $type eq 'a') then
              foo()
          else if( $type eq 'b' ) then
              bar()
          else
              spam()
      • I'm of a similar mind here for that repetitive case, although I might put the "else" at the beginning of the line rather than on its own line. Since I'm not on a big engineering team, I tend to tailor my indentation choices to maximize the communication of my intentions in each case (always experimenting here). Whitespace is one of the paints on my palette. ;-)
        • Interesting point you make here. The above rules should not concern personal taste, but serve higher purposes, like making the code easier to understand and thus easier for others to maintain. A lot of words are being spent on when and where to apply whitespace. And not for no reason: I personally think that proper indentation can help to understand the flow of an entire block of code even from the very first glance. The 'left-margin' plays an important role in that. I also think that more line-feed+indent can help to visually distinguish flow-control from what is being controlled, or the left and right operands of an operator, particularly one like the assignment. Unless very short, I find a let statement easier to read like this:
              let $myvar as element(myelem)? :=
                  (parentelem/myelem[empty(somecondition)])[1]
          Cheers..
        • Yes, looking at my 'historic' code I tend to actually use:
              if( $type eq 'a') then
                  foo()
              else if( $type eq 'b' ) then
                  bar()
              else
                  spam()
          which is closer to what I might do in Java or C or sh (sh has an elif, which is an even closer match to the intention). OTOH if it is TRULY a nested else ... then I could agree with the parent that it should be indented. But if the intent of "else if" is to represent an "elseif" then I favor keeping the same indentation.
  • Rule -0.5:  Thou shalt spell "guidelines" correctly. No, but seriously -- this is nice, thanks for it.  I've railed against tabs my whole career. 
    • Thanks to both of you.  Serves me right for posting this up quickly before a holiday.  The blame should go to me and not Micah - my bad fingers here
  • For someone insisting on consistent 4 space indentation, you're using a lot of 5 space indentation ;-)