Shared Cataloging of Early Printed Images
Role: Project Manager, Natural Language Processing Lead (2022–present)
Website (project announcement): datalab.ucdavis.edu/collaborations
With support from the Getty Foundation, the Shared Cataloging of Early Printed Images project is developing an environment in which catalogers and researchers can search across collections and institutions for copies of the same or similar images. We use a suite of computer vision software to identify similarities between woodcuts, line drawings, and other such materials from the early modern period. My work with the project involves supplementing this software with data from language models, which I have built from both archival sources and the metadata archivists have used to describe images. Using the combined output of these text and vision models, our environment links together similar images and provides a means of retrieving cataloging information for each image or uploading new metadata about it. In essence, once this environment is complete, it will allow users to ask, "Has anyone else described an image like this?" and, if so, "How was it described?"
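To illustrate the general idea of combining text and vision signals, here is a minimal sketch in Python. It is not the project's actual pipeline: the record layout, the `text_weight` parameter, the 0.8 threshold, and the toy vectors are all assumptions made for illustration. It scores each pair of images by a weighted average of the cosine similarity of their vision embeddings and of their text (metadata) embeddings, then links pairs that score above the threshold.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(rec_a, rec_b, text_weight=0.5):
    """Weighted average of vision and text similarity (the weighting is an assumption)."""
    vision = cosine(rec_a["vision"], rec_b["vision"])
    text = cosine(rec_a["text"], rec_b["text"])
    return (1 - text_weight) * vision + text_weight * text


def link_similar(records, threshold=0.8, text_weight=0.5):
    """Return (id_a, id_b, score) for every pair scoring at or above the threshold."""
    links = []
    ids = sorted(records)
    for i, id_a in enumerate(ids):
        for id_b in ids[i + 1:]:
            score = combined_similarity(records[id_a], records[id_b], text_weight)
            if score >= threshold:
                links.append((id_a, id_b, round(score, 3)))
    return links


# Hypothetical toy records: real embeddings would come from the vision and language models.
records = {
    "A": {"vision": [1.0, 0.0], "text": [1.0, 0.0]},
    "B": {"vision": [1.0, 0.0], "text": [1.0, 0.0]},
    "C": {"vision": [0.0, 1.0], "text": [0.0, 1.0]},
}

links = link_similar(records)
print(links)
```

In this toy data, images A and B share identical embeddings and are linked, while C is unrelated to both. A linked pair is what would let a cataloger look up how another institution described a matching image.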