This wiki space contains archival documentation of Project Bamboo, April 2008 - March 2013.
Enhancing Scanned Texts with Context
A demonstration of how scanned texts can be enhanced with links to contextually relevant resources. Using the output of an optical character recognition (OCR) process, line and word locations can be determined, allowing interactive selection and highlighting of references to people and places. These references can be detected automatically or manually added as annotations. Part of the Contexts and Relationships: Ireland and Irish Studies project.
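As a rough illustration of the idea above, the following minimal sketch shows how word locations from OCR output might be used to compute a highlight region for a reference to a person or place. It is not the demonstrator's code. The XML layout (`page`, `line`, and `word` elements with `x`, `y`, `w`, `h` attributes), the sample text, and all function names are assumptions made for this example, and matching against a simple list of names stands in for whatever automatic detection or manual annotation the demonstrator actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical OCR output. The element and attribute names are invented for
# this sketch; real OCR XML formats differ.
SAMPLE_OCR = """
<page width="1200" height="1800">
  <line>
    <word x="100" y="200" w="90" h="30">Daniel</word>
    <word x="200" y="200" w="130" h="30">O'Connell</word>
    <word x="340" y="200" w="80" h="30">spoke</word>
    <word x="430" y="200" w="30" h="30">in</word>
    <word x="470" y="200" w="100" h="30">Dublin.</word>
  </line>
</page>
"""


def load_words(xml_text):
    """Return (text, (x, y, w, h)) tuples for every word, in document order."""
    root = ET.fromstring(xml_text.strip())
    words = []
    for el in root.iter("word"):
        box = tuple(int(el.get(key)) for key in ("x", "y", "w", "h"))
        words.append((el.text or "", box))
    return words


def normalize(token):
    """Lowercase a token and drop surrounding punctuation for matching."""
    return token.strip(".,;:!?\"()").lower()


def union_box(boxes):
    """Merge several (x, y, w, h) boxes into one box that encloses them all."""
    left = min(x for x, y, w, h in boxes)
    top = min(y for x, y, w, h in boxes)
    right = max(x + w for x, y, w, h in boxes)
    bottom = max(y + h for x, y, w, h in boxes)
    return (left, top, right - left, bottom - top)


def find_references(words, names):
    """Match each name against consecutive words; return (name, box) pairs."""
    tokens = [normalize(text) for text, _ in words]
    hits = []
    for name in names:
        target = [normalize(part) for part in name.split()]
        n = len(target)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                boxes = [box for _, box in words[i:i + n]]
                hits.append((name, union_box(boxes)))
    return hits


if __name__ == "__main__":
    words = load_words(SAMPLE_OCR)
    for name, box in find_references(words, ["Daniel O'Connell", "Dublin"]):
        print(name, box)
```

Each returned box is in page-image pixel coordinates, so an interface could draw it over the scanned page and attach a link to a related resource. A name that wraps across two lines would need one box per line rather than a single merged box; the sketch ignores that case.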
This demonstrator shows how NLP techniques, in conjunction with search technology, can help scholars automatically identify contextual content for the texts they are studying. The process starts with scanning the texts (in the example used in the demonstrator, the scanning was done by our collaborators at Queen's University Belfast, and the scanned material is being included in JSTOR). We transformed the TIFF page images and XML OCR output we received
This page contains a link to a video demonstration of the demonstrator's features (click "see the demo video" at the bottom of the page).
The live demonstrator is also available for exploration.
This demonstrator was created by Ryan Shaw as part of the work of two projects funded by the Institute of Museum and Library Services (IMLS): Contexts and Relationships: Ireland and Irish Studies and Bringing Lives to Light: Biography in Context.