In this post, Chris Stanton, Digital Projects and Metadata Librarian of the Metropolitan New York Library Council, describes his work helping institutions contribute to the Digital Public Library of America (via New York’s hub, the Empire State Digital Network) from their CollectiveAccess-based digital collections websites.
Since beginning work here at METRO in 2014, the Empire State Digital Network (ESDN) team has contributed 320,000+ records from 200+ New York State cultural heritage institutions to DPLA, with more content and institutions added every month! As the New York State service hub for DPLA, we’ve become very familiar with the range of systems and software our partners use to store metadata, and we help them share that metadata with projects like DPLA.
Here in New York, we’ve encountered CONTENTdm, Islandora, Omeka, Vital, PastPerfect and other systems. We’ve also worked with a number of partners using CollectiveAccess, and we hope to work with more CollectiveAccess users in the future!
The New York State Archives was our initial CollectiveAccess partner in late 2014. They helped us to figure out our workflows and requirements for contributing metadata from CA to DPLA. We learned a lot about CollectiveAccess working alongside the NYSA during that period and we were able to use that experience when we began work with The New School Archives and Special Collections in summer 2015. Both are now contributing thousands of records from CollectiveAccess to DPLA (available for browsing here and here).
As with most ESDN partners, regardless of system, the primary area of work required for CollectiveAccess users looking to contribute to DPLA concerns the configuration of an OAI-PMH feed. OAI-PMH, the Open Archives Initiative-Protocol for Metadata Harvesting, is the primary means by which ESDN receives metadata from partners, and is also the mechanism we use to provide metadata from our partners to DPLA.
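For readers who have not worked with OAI-PMH before, here is a minimal sketch of what a harvest request looks like. The `ListRecords` verb and the `metadataPrefix`, `set` and `resumptionToken` arguments come from the OAI-PMH specification; the base URL and the helper name `build_list_records_url` are placeholders of my own, not an actual ESDN or CollectiveAccess endpoint.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is hypothetical here; use the address of your own provider.
    """
    if resumption_token:
        # Per the OAI-PMH spec, resumptionToken is an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical provider address, for illustration only.
first_page = build_list_records_url("https://example.org/oai")
```

A harvester requests the first page this way, then keeps requesting with the `resumptionToken` returned in each response until no token comes back.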
CollectiveAccess includes an OAI-PMH Provider that can be configured to share metadata in the format of your choice (probably Dublin Core). In order to share their metadata, both the NYSA and The New School created data exporter mappings that enabled them to get their Dublin Core metadata to DPLA.
ESDN and DPLA have specific metadata requirements (more info on those here) that NYSA and The New School needed to account for in their XML export mapping. If you have already created an exporter mapping (maybe for the WorldCat Digital Collection Gateway), or if you are creating one for the first time, I recommend reviewing ESDN and DPLA requirements while doing so. In particular, regardless of whether you’re an ESDN partner or working with another DPLA service hub, start by comparing and working through the ESDN mapping worksheet alongside your data exporter mapping.
The New York State Archives data exporter mapping is available to view and download here (thank you Heather Bolander-Smith!). The NYSA mapping represents a simple Dublin Core mapping that aligns very well with ESDN and DPLA requirements. Important fields from that mapping export include:
- Mapping to dc:identifier from source ca_object_representations.media.small (with prefix option: http://digitalcollections.archives.nysed.gov/media/collectiveaccess), which works to create a link to a thumbnail Preview such as: http://digitalcollections.archives.nysed.gov/media/collectiveaccess/images/3/0/5/34155_ca_object_representations_media_30509_small.jpg
- Mapping to dc:identifier with template option http://digitalcollections.archives.nysed.gov/index.php/Detail/Object/Show/object_id/^ca_objects.object_id which creates a persistent link to each item: http://digitalcollections.archives.nysed.gov/index.php/Detail/Object/Show/object_id/25969
- Mapping to dc:rights with “constant” variable for the Rights statement that NYSA chose to initially assign to all contributed records. While we are working on improving accuracy in object-level rights statements and would no longer recommend the use of blanket rights statements across collections (look at rightsstatements.org instead!), I wanted to highlight the ability to provide “constant” values in contributed records where/if required.
- Multiple mappings to dc:title for the object Title (from ca_objects.preferred_labels) and the Collection Title (from ca_collections.preferred_labels). The CA exporter and ESDN workflow allow ESDN to correctly align values with their corresponding DPLA field even if the same Dublin Core element is used for multiple data values.
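To make the prefix and template options above more concrete, here is a small Python illustration of the string handling they imply. This is not CollectiveAccess code: the function names `apply_prefix` and `apply_template` are hypothetical, and I am assuming that the media source value is a path beginning with `/images/...`, which is what the example thumbnail link suggests.

```python
def apply_prefix(prefix, value):
    """Prepend a fixed prefix to an exported value (illustrative only)."""
    return prefix + value


def apply_template(template, values):
    """Replace ^placeholder tokens in a template with record values."""
    result = template
    # Replace longer names first so a short name cannot clobber a longer
    # placeholder that happens to start with the same characters.
    for name in sorted(values, key=len, reverse=True):
        result = result.replace("^" + name, str(values[name]))
    return result


# Assumed media path, inferred from the NYSA thumbnail link above.
media_path = "/images/3/0/5/34155_ca_object_representations_media_30509_small.jpg"
thumbnail_url = apply_prefix(
    "http://digitalcollections.archives.nysed.gov/media/collectiveaccess",
    media_path,
)

item_url = apply_template(
    "http://digitalcollections.archives.nysed.gov/index.php/Detail/Object/Show/object_id/^ca_objects.object_id",
    {"ca_objects.object_id": 25969},
)
```

Under those assumptions, `thumbnail_url` and `item_url` come out as the two example links quoted in the list.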
The CollectiveAccess data exporter provides the ability (and flexibility) to create comprehensive exports of your data to share with DPLA. It is also a great opportunity to learn more about how data is stored and used in CollectiveAccess.
In addition to configuring the OAI-PMH provider and completing the data exporter mapping, CollectiveAccess users in New York work through a number of quality review steps before their metadata is shared with DPLA. ESDN partners review their data feeds and individual field values directly with us, and are able to adjust their data exporter mapping based on that review. We have found that the CollectiveAccess data exporter mechanism allows CA users to easily make changes in their mappings as needed.
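One simple way to start that kind of review on your own is to count which Dublin Core elements actually appear in a feed. The sketch below uses the standard OAI-PMH and Dublin Core namespaces, but the sample record and the helper name `count_dc_elements` are invented for illustration; it is not part of the ESDN review workflow itself.

```python
import xml.etree.ElementTree as ET
from collections import Counter

DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample response; a real feed would come from your OAI-PMH provider.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example object title</dc:title>
          <dc:title>Example collection title</dc:title>
          <dc:identifier>http://example.org/object/1</dc:identifier>
          <dc:rights>Example rights statement</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def count_dc_elements(xml_text):
    """Count how often each Dublin Core element occurs in a feed."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for elem in root.iter():
        # ElementTree writes qualified tags as "{namespace}localname".
        if elem.tag.startswith("{" + DC_NS + "}"):
            counts[elem.tag.split("}", 1)[1]] += 1
    return counts


counts = count_dc_elements(SAMPLE)
```

A count of zero for an element you expected, or an unexpectedly high count for one like dc:title, is a useful prompt to revisit the corresponding row in the data exporter mapping.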
If you are not a CollectiveAccess user in New York State, please see the DPLA service hub info page to find the appropriate state or regional hub to contact to begin the contribution process. If you don’t see your state listed, note that DPLA is working on providing hub services for institutions in all 50 states (see blog post here), and you can also get involved by contacting and working with DPLA Community Reps in your area.
Whether you’re a CollectiveAccess user in New York State (if you are, we hope to hear from you soon!) or elsewhere, I hope the information and resources here give you a useful roadmap for sharing your data with DPLA.