The goals of our project:

To collect data relevant to images from their context.
Initial experiments with full text, and even with formatted text, showed that this goal was as yet impractical. There are many relatively small files with OCR-ed versions of (reasonably) formatted texts, but the quantities were too small for ML techniques, and rule-based solutions were expensive and not very successful.
Plan B

Concentrate on full text, as most data was, in one way or another, contained in the reports and papers in the library of the institute. Here we had enough material to apply ML techniques.
Proceed incrementally: do not try to offer an all-encompassing solution, but solve each problem on its own terms.
Offer archeologists immediate solutions for retrieval in digitized archeological texts.