Cheshire3 Information Retrieval Framework
Cheshire3 is a fast XML search engine, written in Python for extensibility and using C libraries for speed. It is feature-rich, including support for XML namespaces, Unicode, a distributable object-oriented model and all the features expected of a digital library system. Standards are foremost, including SRW/U and CQL, as well as Z39.50 and OAI. It is highly modular and configurable, enabling very specific needs to be addressed with a minimum of effort. The API is stable and fully documented, allowing easy third-party development of components.
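As an illustration of what SRU access with a CQL query looks like from the client side, here is a minimal sketch that builds a searchRetrieve request URL. The parameter names (operation, version, query, maximumRecords) come from the SRU 1.1 standard; the host name, database path and the build_sru_url helper are hypothetical and are not part of Cheshire3.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, cql_query, maximum_records=10):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query.

    base_url is a hypothetical SRU endpoint; substitute a real server.
    """
    params = {
        "operation": "searchRetrieve",   # standard SRU operation name
        "version": "1.1",                # SRU protocol version
        "query": cql_query,              # the CQL query, URL-encoded below
        "maximumRecords": str(maximum_records),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used only for illustration.
url = build_sru_url("http://sru.example.org/services/demo",
                    'dc.title any "fish"')
print(url)
```

The response to such a request is an XML document, so any XML-aware client can consume the results without knowing how the server is implemented.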
Given a set of records, Cheshire3 can extract data into one or more indexes, applying configurable workflows for additional normalisation and processing along the way. Once the indexes have been built, it supports operations such as search, retrieve, browse and sort. Using Apache handlers, interfaces ranging from a shop front to Z39.50 and OAI can be provided (all included by default), and the abstract protocolHandler allows integration into any environment that supports Python.
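The extract, normalise, index and search steps described above can be pictured with a small toy model. This is not the Cheshire3 API: every function name here (extract, normalise, build_index, search, browse) is invented for the sketch, and a real installation configures these stages as objects in its own configuration files rather than as plain functions.

```python
import re
from collections import defaultdict


def extract(text):
    """Extract raw word tokens from a record's text."""
    return re.findall(r"\w+", text)


def normalise(tokens):
    """A one-step normalisation workflow: case folding."""
    return [token.lower() for token in tokens]


def build_index(records):
    """Map each normalised term to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for term in normalise(extract(text)):
            index[term].add(record_id)
    return index


def search(index, term):
    """Return the sorted ids of records containing the term."""
    return sorted(index.get(term.lower(), set()))


def browse(index, prefix):
    """Return the index terms that start with the prefix, in sorted order."""
    return sorted(t for t in index if t.startswith(prefix.lower()))


records = {
    1: "Cheshire3 is an XML search engine",
    2: "Search and retrieve XML records",
    3: "Browse the index",
}
index = build_index(records)

print(search(index, "XML"))   # ids of the records containing "xml"
print(browse(index, "se"))    # index terms beginning with "se"
```

Keeping extraction and normalisation as separate stages is what makes the workflow configurable: a different normaliser (stemming, for example) can be swapped in without touching the index or the search code.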
You can get the current Cheshire3 beta code from SourceForge, or from our FTP site ftp://ftp.cheshire3.org/pub/cheshire3/ along with all the required software packages.
Please note that at this time Cheshire3 is under development and may not be suitable for a production system. That said, it is currently in use in production systems including SRW servers and web store front systems.
We have Documentation! Although not yet 100% complete, it is getting there and should be sufficient to get started.
This is the third generation of the Cheshire system, which was started more than 10 years ago at UC Berkeley and has more recently been developed in a partnership between Berkeley and the University of Liverpool. Cheshire2 is used by several national services in both the UK and Europe, as well as by several services and projects in the US. We gratefully acknowledge the ongoing support of the JISC.