CQL Grammar: The Next Generation

10th October 2002

THIS IS OUT OF DATE. DO NOT READ IT.
This is a from-scratch attempt to write a grammar for CQL, the Common Query Language being defined for the SRW protocol as a part of the ZING initiative.

I'm writing this without reference to the current grammar at lcweb.loc.gov/z3950/agency/zing/srwu/cql.html in the hope that I'll accidentally fix some bugs in that grammar, and knowing that I can fix the bugs in mine by reference to that one after the event.

Terminal symbols in this grammar are the quoted literals (such as "and" and "prox") and the upper-case token classes SYMBOL, STRING and NUMBER.

Here it is:

query = expr
      | expr op expr
 
expr = prox-expr
     | "(" query ")"
op = "and" | "or"
prox-expr = term [ prox-op prox-expr ]
 
prox-op = "prox" [ "[" prox-modifiers "]" ]
prox-modifiers = [ prox-unit ]
                 [ "/" [ order-relation ]
                   [ "/" [ prox-distance ]
                     [ "/" prox-order ] ] ]
prox-unit = "word" | "sentence" | "paragraph" | "element"
prox-order = "ordered" | "unordered"
prox-distance = NUMBER
 
term = "(" query ")"
     | [ qualifier relation ] value
 
qualifier = [ qualset "." ] index
 
relation = simple-relation [ ":" modifier ]*
simple-relation = word-relation
                | order-relation
word-relation = SYMBOL (e.g. "any", "all", "adjacent", "exact")
order-relation = "=" | "<" | "<=" | ">=" | ">" | "<>"
 
qualset = SYMBOL
index = SYMBOL
value = SYMBOL | STRING
modifier = SYMBOL (e.g. "fuzzy", "stem", "relevant")
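
The grammar names three token classes (SYMBOL, STRING and NUMBER) but does not say what they look like. The Python sketch below shows one way a tokenizer for them might work. Everything lexical in it is an assumption, not part of the grammar: that a STRING is double-quoted with backslash escapes, that a bare word is made of letters, digits, "_" and "-", that an all-digit word counts as a NUMBER, and the names TOKEN_RE and tokenize themselves.

```python
import re

# Assumed lexical rules; the grammar itself does not define them.
TOKEN_RE = re.compile(
    r'\s*(?:'
    r'(?P<STRING>"(?:[^"\\]|\\.)*")'         # double-quoted, backslash escapes
    r'|(?P<PUNCT><=|>=|<>|[=<>()/:.\[\]])'   # relations and punctuation
    r'|(?P<WORD>[A-Za-z0-9_-]+)'             # bare word: SYMBOL or NUMBER
    r')'
)


def tokenize(text):
    """Split CQL text into (kind, text) pairs.

    kind is one of STRING, NUMBER, SYMBOL or PUNCT.
    """
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"cannot tokenize at offset {pos}: {text[pos:]!r}")
        kind = match.lastgroup
        value = match.group(kind)
        if kind == "WORD":
            # An all-digit word is treated as a NUMBER (e.g. a proximity distance).
            kind = "NUMBER" if value.isdigit() else "SYMBOL"
        tokens.append((kind, value))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize('dc.title any:stem "fish food"'):
        print(token)
```

Run on dc.title any:stem "fish food", this yields the SYMBOLs dc, title, any and stem, the punctuation "." and ":", and one STRING. That is the raw material for the qualifier, relation and value rules.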

Changes

The CQL defined by this grammar differs from the official one in that it allows so-called "complex terms" such as title=(foo and bar).
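
To make the idea of a complex term concrete, here is a small recursive-descent sketch in Python. It is a deliberate simplification under stated assumptions, not a full implementation of the grammar. It handles only "and" and "or" (chained left to right) and the symbolic order-relations, and it omits word-relations, modifiers, qualifier sets and prox. It also assumes that a qualifier and relation may be followed by either a plain value or a parenthesised sub-query. The names Parser and parse are illustrative only.

```python
import re

# Assumed token shapes: quoted strings, relation symbols, parentheses, bare words.
TOKEN = re.compile(r'"[^"]*"|<=|>=|<>|[=<>()]|[^\s=<>()"]+')
RELATIONS = {"=", "<", "<=", ">=", ">", "<>"}
BOOLEANS = {"and", "or"}
PUNCTUATION = RELATIONS | {"(", ")"}


class Parser:
    def __init__(self, text):
        self.tokens = TOKEN.findall(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    def word(self):
        # A qualifier or value: any token that is not punctuation or a boolean.
        token = self.take()
        if token in PUNCTUATION or token in BOOLEANS:
            raise SyntaxError(f"expected a word, got {token!r}")
        return token

    def query(self):
        # term { ("and" | "or") term }, folded left to right.
        node = self.term()
        while self.peek() in BOOLEANS:
            op = self.take()
            node = (op, node, self.term())
        return node

    def term(self):
        if self.peek() == "(":
            return self.group()
        first = self.word()
        if self.peek() in RELATIONS:
            relation = self.take()
            # The complex-term case: the operand may itself be a sub-query.
            operand = self.group() if self.peek() == "(" else self.word()
            return (relation, first, operand)
        return first

    def group(self):
        self.take("(")
        node = self.query()
        self.take(")")
        return node


def parse(text):
    parser = Parser(text)
    tree = parser.query()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree


print(parse('title=(foo and bar)'))  # ('=', 'title', ('and', 'foo', 'bar'))
```

The whole boolean sub-query becomes the operand of the relation, which is what sets a complex term apart from an ordinary qualifier=value term.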

Feedback to <mike@indexdata.com> is welcome!