The GEO Z39.50 profile (http://www.blueangeltech.com/standards/GeoProfile/geo22.htm) is unusually complex and powerful, dealing as it does with many concepts not normally encountered by more typical, library-oriented applications of Z39.50. Its attribute set (http://www.blueangeltech.com/standards/GeoProfile/annex_a.htm) includes support for complex non-bibliographic queries.
Some months ago, the GEO community expressed interest in using SRW as its web-services protocol. The only significant barrier to its adoption was a perceived lack of expressive power in SRW's query language, CQL. In particular, CQL as currently defined (www.loc.gov/zing/cql/cql-syntax.html) does not provide a way to express GEO relations such as "Overlaps", "Enclosed" and "Before or During", nor term structures such as "Coordinate string" and "Date string".
The GEO community's first proposal to remedy this deficiency was to make the Type-1 query a part of SRW. The idea was to introduce some way to encode a Z39.50 Type-1 query so that it could be transmitted as part of an SRW search request - either by base64-encoding the query's BER code, or by translating the tree into equivalent XML. Under this arrangement, the GEO community could continue to use its existing attribute set with the new protocol.
There was a feeling among the CQL developers that, while this would solve GEO's immediate problems, it would constitute a wasted opportunity to broaden the scope and applicability of CQL. Since CQL is intended to be a fully general-purpose and expressive query language, its developers felt that it should be extended to cater for the GEO requirements, rendering the use of the Type-1 query unnecessary.
The CQL developers accordingly developed an informal proposal to add a set of new relations and relation modifiers to CQL, supporting specific GEO requirements. This approach involved several new keywords (within, overlaps, ISOdate, etc.).
It was hard to detect much enthusiasm for this approach from the GEO community. After some discussion, it became apparent that part of the problem was the inflexibility of this approach: it relies on the CQL developers correctly anticipating and interpreting all of the GEO community's needs.
A further weakness of this approach is that the same process would have to be repeated every time another new community needed CQL extensions in order to express its queries. In contrast, Z39.50's notion of an attribute set allows independent communities to unilaterally invent new kinds of relation and term-structure as required, without reference to a central governing body (the ZIG, in this case). This flexibility is seen as highly desirable.
The new proposal is not GEO-specific, but instead allows communities more flexibility in defining their own CQL semantics. It turns out to be surprisingly simple both to state and implement:
Under these rules, all previously valid CQL queries are still valid and have the same interpretation as they previously had.
Note: CQL does not have an explicit concept of a term-structure specifier, but relation modifiers fulfil this role neatly - just as, in programming languages like Perl, related but differing operators like = and eq indicate the type of their operands for the purposes of comparison. So, for example, the relation-and-modifier =/x.ISOdate might be defined to mean "compare for equality, treating the term as a date in ISO format".
Assuming that the xyz context-set is in force:
dc.date = 2000-01-08                        OK (has old meaning)
dc.date any 2000-01-08                      OK, though strange (has old meaning)
dc.date foobar 2000-01-08                   error ("foobar" is not in the CQL context-set)
dc.date xyz.foobar 2000-01-08               OK
dc.date xyz.foobar/ISOdate 2000-01-08       error ("ISOdate" is not in the CQL context-set)
dc.date xyz.foobar/std.ISOdate 2000-01-08   OK
dc.date and 2000-01-08                      "and" is interpreted as a boolean, not a relation
dc.date xyz.and 2000-01-08                  OK ("xyz.and" is interpreted as a relation)
dc.date paragraph 2000-01-08                syntax error: "paragraph" is a keyword
dc.date xyz.paragraph 2000-01-08            OK
foo or/xyz.exclusive bar                    either foo or bar but not both
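The pattern in the examples above can be sketched in a few lines of Python. This is only an illustration of the behaviour the examples show, not code from any of the CQL implementations: the function names are hypothetical, the core relation and modifier lists are an assumed, incomplete subset of the CQL context-set, and both the xyz and std prefixes are assumed to be known to the server. The sketch does not model keywords or booleans such as "and" and "paragraph".

```python
# Assumed, illustrative subset of the core CQL context-set (not the full lists).
CORE_RELATIONS = {"=", "<", ">", "<=", ">=", "<>", "any", "all", "exact"}
CORE_MODIFIERS = {"relevant", "fuzzy", "stem", "phonetic"}


def split_name(token):
    """Split 'xyz.foobar' into ('xyz', 'foobar'); unprefixed names get prefix None."""
    prefix, dot, name = token.partition(".")
    if dot:
        return prefix, name
    return None, token


def check_relation(token, known_prefixes):
    """Return a list of problems found in a relation such as 'xyz.foobar/std.ISOdate'.

    Unprefixed names must belong to the core sets above; prefixed names are
    accepted whenever their prefix is one of known_prefixes (hypothetical rule,
    inferred from the examples).
    """
    base, *modifiers = token.split("/")
    parts = [("relation", base, CORE_RELATIONS)]
    parts += [("modifier", m, CORE_MODIFIERS) for m in modifiers]

    problems = []
    for kind, item, core in parts:
        prefix, name = split_name(item)
        if prefix is None:
            if name.lower() not in core:
                problems.append(f'{kind} "{name}" is not in the CQL context-set')
        elif prefix not in known_prefixes:
            problems.append(f'{kind} "{item}" uses unknown context-set "{prefix}"')
    return problems


if __name__ == "__main__":
    known = {"xyz", "std"}  # assumption: both context-sets are in force
    for rel in ["=", "foobar", "xyz.foobar", "xyz.foobar/ISOdate",
                "xyz.foobar/std.ISOdate", "xyz.and"]:
        print(rel, "->", check_relation(rel, known) or "OK")
```

An empty list means the relation and all of its modifiers were accepted; each string in a non-empty list names one unprefixed or unknown item.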
Assuming nothing about the prevailing context-set:
>geo="http://www.blueangeltech.com/standards/GeoProfile/cql/"
>dc="http://www.loc.gov/z3950/agency/zing/cql/dc-indexes.html"
dc.subject all "dinosaurs tracks" and geo.location geo.within texas
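A receiving server has to separate the leading prefix assignments in a query like the one above from the query proper before it can resolve names such as geo.within. The following Python sketch shows one way that might be done. The function name and the regular expression are assumptions made for illustration; it handles only assignments of the form >name="uri" at the very start of the query, as in this example, and is not a full CQL parser.

```python
import re

# Matches one leading assignment of the form  >name="uri"  (assumed shape,
# taken from the example query; whitespace around the parts is tolerated).
PREFIX_RE = re.compile(r'\s*>\s*([A-Za-z_]\w*)\s*=\s*"([^"]*)"')


def split_prefix_maps(query):
    """Return (prefixes, rest): a dict of prefix -> URI and the remaining query text."""
    prefixes = {}
    pos = 0
    while True:
        match = PREFIX_RE.match(query, pos)
        if match is None:
            break
        prefixes[match.group(1)] = match.group(2)
        pos = match.end()
    return prefixes, query[pos:].strip()


if __name__ == "__main__":
    query = (
        '>geo="http://www.blueangeltech.com/standards/GeoProfile/cql/" '
        '>dc="http://www.loc.gov/z3950/agency/zing/cql/dc-indexes.html" '
        'dc.subject all "dinosaurs tracks" and geo.location geo.within texas'
    )
    prefixes, rest = split_prefix_maps(query)
    for name, uri in prefixes.items():
        print(name, "=", uri)
    print(rest)
```

The returned dictionary can then be used to decide which context-set each prefixed index, relation or modifier in the rest of the query belongs to.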
An automatically generated example from Eliot Christian's SRU Geospatial Search Demo page at www.gils.net/sru-geo.html:
>geoIndex="http://www.blueangeltech.com/Standards/GeoProfile/annex_a.htm#Use%20Attributes"
>geoRelation="http://www.blueangeltech.com/Standards/GeoProfile/annex_a.htm#Relation%20Attributes"
>geoStructure="http://www.blueangeltech.com/Standards/GeoProfile/annex_a.htm#Structure%20Attributes"
(geoIndex.title geoRelation.match/geoStructure.phrase "sickle claw")
and (geoIndex.timeperiod =/geo.date "19680312,19980318")
or (geoIndex.northbc < "78")
and (geoIndex.coordinates geoRelation.overlap/geoStructure.coordinate "-106.7,25.8,-93.5,36.5")
This proposal is accepted. Having been approved by all five authors, it was submitted to the SRW working group for consideration. It was approved for inclusion in CQL 1.1 (part of SRW 1.1) at the meeting of 25th-26th September 2003.
We know of three CQL parsing implementations (all of them free software):
As proof of concept, support for the extension described in this document has been added to all three of these implementations. It is available for download in release 0.7 of CQL-Java and release 2.0.4 of YAZ.
Rob's sample GEO-profiled CQL-to-Z39.50 gateway is at http://srw.o-r-g.org:8080/metar/docs.html