Notes from meeting of 25th September 2003

List of issues is at http://lcweb.loc.gov/z3950/agency/zing/srw/srw1-1-proposals.html

0. Versioning
        0.1. Diagnostics (extra agenda item)
1. SRU: new operations on the same base record
2. Non escaped XML records
3. Client to request TTL
4. Complex content/qualified dublin core
5. Scan
        5.1. Echoed query in SRU (extra agenda item)
        5.2. Fetching only the record in SRU
6. Don't nill mandatory elements in response
7. SRU: URL to an xsl stylesheet
8. Explain diagnostics
9. Short names
10. Embedded XML for SRU
11. CQL Range Search
12. XPath Parameter
13. Record Identifier
14. handling structured data as an (X)CQL term
15. associating URL with prefix
16. Eliot's additional issues

0. Versioning

We can't use namespaces to indicate protocol version because (A) SRU doesn't use them in its request, and (B) the various server toolkits will just refuse to process the new namespaces if their WSDL only knows about the old namespace.

So we need to keep using the old namespace and send a new, mandatory, version parameter v instead, in both SRW and SRU.

By historical accident, the old SRW namespace has 1.0 in its URI: accordingly, we choose to change it just once, now, to a URI that has no version-number embedded, and keep using that URI for all future versions. v1.0 is therefore officially abandoned.

Client and server negotiate to a best mutually-supported version of the protocol. For example, if a client sends ``I speak version 1.5'', then:

The server includes in its response an element specifying the version it's using (i.e. the minimum of the client's and server's version).

Version numbers are of the form major.minor, so for example version 1.10 is higher than 1.9.
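
As a rough sketch of the comparison and negotiation rules above (the function names are ours, purely for illustration, and we assume that ``the server's version'' means the highest version the server supports):

```python
def parse_version(text):
    """Split 'major.minor' into a tuple of ints so that 1.10 sorts above 1.9."""
    major, minor = text.split(".")
    return int(major), int(minor)


def negotiate(client_version, server_version):
    """Return the lower of the two versions, as the server would report it."""
    return min(client_version, server_version, key=parse_version)


# The client speaks 1.5 and the server 1.10, so the server answers in 1.5.
print(negotiate("1.5", "1.10"))
# Comparing as numbers per component, not as decimals or strings.
print(parse_version("1.10") > parse_version("1.9"))
```

Comparing the two components as integers is what makes 1.10 higher than 1.9; a plain string or floating-point comparison would get this wrong.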

Auto-configuring generic SOAP clients, which fetch WSDL from the server to figure out what syntax to send, must also fetch the ZeeRex record in order to know the semantics. That ZeeRex record must also specify what version-number the client should send. (The version element v is now mandatory in ZeeRex records.)

A new diagnostic is required, ``Server does not support requested version'', with addInfo being the best supported version number.

0.1. Diagnostics (extra agenda item)

The diagnostic schema currently includes code (numeric) and details (analogous to Z39.50 addInfo). The schema needs to have a human-readable message element as well as these, mostly for SRU's benefit; and the schema documentation needs to have an extra column, ``interpretation of details''.

We need a new diagnostic 4, ``other error'' (analogous to BIB-1 diagnostic 100), to be used when we don't know what the problem is but have some human-readable text to pass on.

1. SRU: new operations on the same base record

In the best of all possible worlds, we want the ``I am a Scan request'' indication to come after the ? in SRU URLs, so that relative URLs can easily be generated to related services: scan to search, search to update, etc.

Accordingly, we say that Scan is not a new SOAP service, but a new operation on the existing service. (This is nice not only because we can use relative URLs but also because we don't have to list lots of different endpoint URLs for a single database: one each for search, scan, update, etc.)

For SRW, it's easy to specify the operation: SOAP says how to do it. For SRU, SOAP 1.1 doesn't give any guidance. But apparently SOAP 1.2 has a specification for how to use GET, so we should find out how it does it and use that if it's useful to us. However, if it's not useful for SRU (e.g. because SOAP uses an HTTP header), then we'll have to roll our own way of specifying the operation as a parameter, as in:

http://where.ever.com/magic/srw?op=scan&term=...
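
To illustrate why putting the operation after the ? helps with relative URLs, here is a small sketch. It assumes the roll-our-own op parameter shown above; the ``search'' operation value and the query parameter are assumptions for the sake of the example, not agreed names.

```python
from urllib.parse import urlencode, urljoin

BASE = "http://where.ever.com/magic/srw"


def sru_url(base, op, **params):
    """Build an SRU URL with the operation named after the '?'."""
    return base + "?" + urlencode({"op": op, **params})


scan_url = sru_url(BASE, "scan", term="fish")
# A query-only relative reference keeps the path and swaps the query string,
# so a scan response can link to a search on the same endpoint.
search_url = urljoin(scan_url, "?" + urlencode({"op": "search", "query": "fish"}))
print(scan_url)
print(search_url)
```

Both URLs share the single endpoint, which is the point of treating Scan as an operation rather than as a separate service.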

2. Non escaped XML records

Search requests to include a new recordPacking parameter, indicating how to embed records in responses. The default for SRW is string, and for SRU is xml. The server must provide the requested packing, or fail with a diagnostic.

We are not yet sure whether both forms can share the same recordData container, or whether the non-encoded, embedded XML record should go in a different container element. We mildly prefer the former, but don't know if it's easy in XML schemas. Matthew will tell us.
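
The difference between the two packings can be sketched as follows. This assumes the option we mildly prefer, a single shared recordData container; the sample record and the function name are made up, and a real server would return a diagnostic rather than raise an exception.

```python
import xml.etree.ElementTree as ET

RECORD = "<dc><title>Fish &amp; chips</title></dc>"


def pack_record(record_xml, record_packing):
    """Wrap one record in a recordData element using the requested packing."""
    container = ET.Element("recordData")
    if record_packing == "string":
        # The record travels as text; the serializer escapes <, > and &.
        container.text = record_xml
    elif record_packing == "xml":
        # The record is embedded as a real child element.
        container.append(ET.fromstring(record_xml))
    else:
        # Stand-in for "fail with a diagnostic".
        raise ValueError("unsupported recordPacking: " + record_packing)
    return ET.tostring(container, encoding="unicode")


print(pack_record(RECORD, "string"))
print(pack_record(RECORD, "xml"))
```

With string packing the client has to parse the record a second time after unescaping it; with xml packing the record is already part of the response tree, which is why it suits SRU clients such as XSLT in browsers.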

3. Client to request TTL

It is clearly right that clients should be able to request a particular TTL. Servers include the actual TTL they plan to use in their response. Servers may:

When the server's response gives TTL as zero, it should not include a result-set name.
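
A minimal sketch of the zero-TTL rule, with hypothetical field names (resultSetTTL and resultSetName are placeholders, not schema element names) and assuming the TTL is a number of seconds:

```python
def result_set_fields(actual_ttl, result_set_name):
    """Return the result-set part of a response as a dict.

    actual_ttl is whatever TTL the server has decided to use; how it
    relates to the TTL the client asked for is up to the server.
    """
    fields = {"resultSetTTL": actual_ttl}
    if actual_ttl > 0:
        # Only name the result set if it will actually be kept.
        fields["resultSetName"] = result_set_name
    return fields


print(result_set_fields(300, "rs1"))
print(result_set_fields(0, "rs1"))
```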

4. Complex content/qualified dublin core

We are all happy with this, though we notice that it's really not an SRW point, but a new schema. Ray will add it to the table of LoC-maintained schemas.

5. Scan

Parameters:

We also need a way to specify other, more esoteric, things such as:

For these, it's tempting to use a CQL clause to represent this combination of index, term and extras, like this:

Rather than allowing a CQL fragment to be used, which introduces the possibility of horrible things like foo or bar, I would have preferred to pick the relevant parts of the query apart into the following separate parameters:

But on this, I have been outvoted: instead, we'll have a single element, scanClause, which is a CQL fragment.

We may also want to be able to specify special seed-terms such as ``beginning of index'' and ``end of index''. These would be alternatives to the usual term parameter. It would also be nice to be able to include these as pseudo-terms in the Scan response, perhaps as surrogate diagnostics?

Scan response consists of an array of terms, each of which is:

5.1. Echoed query in SRU (extra agenda item)

Rob proposes that when SRU servers echo the query back to the client, they should do so as XCQL rather than CQL, since this is easier for typical dumb-client platforms such as XSLT in browsers to handle. No-one objects, but we don't want to nail it down until it's been shown to people like Theo.

5.2. Fetching only the record in SRU

Ralph wants to have a flag in the SRU (only) search request to ask that only the located record be returned, not the whole SRW/U response structure.

6. Don't nill mandatory elements in response

We all agree.

7. SRU: URL to an xsl stylesheet

An SRU client may include an XSLT stylesheet URL in a search request; in this case, the server doesn't do anything with it - it merely echoes it back in the response, embedded in the XML in the standard way so that the client translates the SRU response for display using that stylesheet.

Neat.
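
We take ``the standard way'' to mean an xml-stylesheet processing instruction at the top of the response; that reading is an assumption, as are the function name and the example URLs. Under that assumption, the server side might look roughly like this:

```python
from xml.sax.saxutils import quoteattr


def add_stylesheet(response_xml, stylesheet_url):
    """Prefix an XML response with an xml-stylesheet processing instruction.

    Assumes response_xml has no XML declaration of its own; if it had one,
    the processing instruction would have to go after it.
    """
    if not stylesheet_url:
        return response_xml
    # quoteattr escapes the URL and supplies the surrounding quotes.
    pi = '<?xml-stylesheet type="text/xsl" href=%s?>' % quoteattr(stylesheet_url)
    return pi + "\n" + response_xml


print(add_stylesheet("<searchRetrieveResponse/>", "http://example.org/style.xsl"))
```

The server never fetches or applies the stylesheet; it is the browser that follows the href and does the transformation.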

8. Explain diagnostics

Explain records should be returned wrapped in a record element, as in a search-response, so that if something goes wrong there's a place to put the diagnostics. Record-packing is handled as for search.

9. Short names

OK.

10. Embedded XML for SRU

This is the same issue as #2, so we've already handled it.

11. CQL Range Search

Covered by profilability issue, see below.

12. XPath Parameter

This seems nice. We like it. It gives us all the power of both eSpec-1, eSpec-q and more, in return for a single line of specification pointing at the XPath standard :-)

(### Why not also XPath access points?)
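
As an illustration of what an XPath parameter buys us, here is a sketch that pulls selected elements out of a record. The record is invented, and Python's ElementTree only implements a limited subset of XPath, so this shows the idea rather than the full power of the standard.

```python
import xml.etree.ElementTree as ET

RECORD = """
<record>
  <title>Fish</title>
  <creator>Smith</creator>
  <creator>Jones</creator>
</record>
"""


def select(record_xml, xpath):
    """Return the text of every element matched by xpath within the record."""
    root = ET.fromstring(record_xml)
    return [element.text for element in root.findall(xpath)]


# Ask for just the creators instead of the whole record.
print(select(RECORD, "./creator"))
```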

13. Record Identifier

Rather than a unique identifier, which may not be supported in all cases, we plan to return as addInfo a CQL query that uniquely identifies the record that couldn't be presented.

14. handling structured data as an (X)CQL term

This is a non-issue.

15. associating URL with prefix

### Covered by profilability.

16. Eliot's additional issues

Context sets.

The "CQL" context-set.

Road-map for converting any given Z39.50 profile to SRW/CQL.

Want to avoid needing complex mappings with DC, etc.

We have multiple ``title'', ``author'' etc. in DC and Bath sets. Should the Bath SRW profile use the DC indexes? What is the difference? Seb's pragmatic question is: under what circumstances would a server actually interpret dc.title and bath.title differently?

Feedback to <mike@indexdata.com> is welcome!