To-dos:

technical:

The rendering of the PDF documents is, as you have seen, often unsatisfactory. This is a purely technical (read: financial) problem.
Rewriting parts of the code so that the retrieval engine works under Windows.

research issues:

Recognition of other numerics (coördinates, measures, weights).
In fact, this is already solved, but we have as yet no sensible interface.
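
The existing recognizer is not shown here. Purely as an illustration of what this kind of recognition can look like, the following is a minimal sketch using regular expressions. The unit lists, the coordinate notation ("52.37 N, 4.89 E") and the name find_numerics are assumptions made for the example, not part of the actual engine.

```python
import re

# Assumed unit tables (illustrative only, not the project's real lists).
LENGTH_UNITS = {"mm", "cm", "m", "km"}
WEIGHT_UNITS = {"mg", "g", "kg"}

# Longest unit names first, so that "km" is preferred over "m".
_units = sorted(LENGTH_UNITS | WEIGHT_UNITS, key=len, reverse=True)

# A number (decimal point or decimal comma) followed by a known unit.
QUANTITY_RE = re.compile(
    r"(?<![\w.,])(\d+(?:[.,]\d+)?)\s?(" + "|".join(_units) + r")\b"
)

# A decimal-degree pair such as "52.37 N, 4.89 E" (assumed notation).
COORD_RE = re.compile(
    r"(\d{1,3}(?:\.\d+)?)\s?°?\s?([NS])\s*,\s*(\d{1,3}(?:\.\d+)?)\s?°?\s?([EW])"
)


def find_numerics(text):
    """Return (kind, value, unit) tuples for coordinates, measures and weights."""
    found = []
    for m in COORD_RE.finditer(text):
        lat = float(m.group(1)) * (1 if m.group(2) == "N" else -1)
        lon = float(m.group(3)) * (1 if m.group(4) == "E" else -1)
        found.append(("coordinate", (lat, lon), "deg"))
    for m in QUANTITY_RE.finditer(text):
        value = float(m.group(1).replace(",", "."))
        unit = m.group(2)
        kind = "weight" if unit in WEIGHT_UNITS else "measure"
        found.append((kind, value, unit))
    return found


if __name__ == "__main__":
    sample = "The stone weighs 12,5 kg and is 3 m long, found at 52.37 N, 4.89 E."
    for item in find_numerics(sample):
        print(item)
```

A real interface would still have to decide how to present such tuples to the user, which is the open part of this item.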

Disambiguation of geographical locations.
Before we can tie in with all the beautiful GIS stuff, we have to recognize the data that can be linked to that information.
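
One possible heuristic, sketched here only to make the problem concrete: resolve an ambiguous place name by choosing the candidate that lies closest to the unambiguous places mentioned in the same text. The toy gazetteer, its approximate coordinates and the function names are hypothetical; this is not a description of how the engine does or will do it.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical toy gazetteer: name -> list of (label, lat, lon) candidates.
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "Paris": [("Paris, France", 48.86, 2.35), ("Paris, Texas", 33.66, -95.56)],
    "Lyon": [("Lyon, France", 45.76, 4.84)],
    "Dallas": [("Dallas, Texas", 32.78, -96.80)],
}


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))


def disambiguate(names):
    """Map each known name to one gazetteer label.

    Names with a single candidate act as anchors; an ambiguous name gets the
    candidate with the smallest total distance to all anchors. Without any
    anchor, the first candidate is used as a fallback.
    """
    anchors = [GAZETTEER[n][0][1:] for n in names if len(GAZETTEER.get(n, [])) == 1]
    resolved = {}
    for n in names:
        candidates = GAZETTEER.get(n, [])
        if not candidates:
            continue  # unknown name: leave unresolved
        if len(candidates) == 1 or not anchors:
            resolved[n] = candidates[0][0]
        else:
            best = min(
                candidates,
                key=lambda c: sum(haversine_km(c[1:], a) for a in anchors),
            )
            resolved[n] = best[0]
    return resolved


if __name__ == "__main__":
    print(disambiguate(["Paris", "Lyon"]))
    print(disambiguate(["Paris", "Dallas"]))
```

The labels produced this way are the kind of data that could then be linked to GIS information.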

An engine to recognize and collect data that is relevant to objects mentioned in the text. Hmmm.
The weighting of relevance over different categories.
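
To make the weighting question concrete, here is a minimal sketch of one simple option, a weighted average of per-category relevance scores. The category names, the weights and the function combined_relevance are assumptions chosen for the example; which weighting scheme is actually appropriate is exactly the open research issue.

```python
def combined_relevance(scores, weights):
    """Weighted average of per-category relevance scores.

    scores  -- dict: category -> relevance score for one document
    weights -- dict: category -> non-negative weight (assumed values)
    Categories that have no weight are ignored.
    """
    used = [c for c in scores if c in weights]
    total_weight = sum(weights[c] for c in used)
    if total_weight == 0:
        return 0.0
    return sum(scores[c] * weights[c] for c in used) / total_weight


if __name__ == "__main__":
    # Hypothetical categories and weights.
    weights = {"person": 3.0, "place": 1.0}
    print(combined_relevance({"person": 1.0, "place": 0.0}, weights))
```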