Posted by: Wade | December 30, 2010

Opening a Gateway

Back in the fall, McMaster started using OCLC’s WorldCat Digital Gateway (WCDG) to harvest metadata from our institutional repository (IR). OCLC first developed WCDG to work with CONTENTdm, but has since expanded the service to harvest from any OAI-PMH-compliant repository.

The Gateway process converts Dublin Core metadata from the harvested repository to MARC and provides some control over the mappings and how frequently harvests occur for various collections in the IR. For items new to the repository, a WorldCat record is created; for older items, the existing WorldCat record is updated to reflect any changes in the metadata. This was exactly what I was looking for: a way to integrate our IR content with other print and electronic resources, both for our own users and for researchers beyond McMaster, without a lot of manual record creation or editing. (It’s also free, so something to take full advantage of.)
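To make the conversion concrete, here is a minimal Python sketch of a Dublin Core to MARC crosswalk. The mapping table is an assumption based on common crosswalk practice, not the Gateway’s actual defaults, and the `crosswalk` function and its record shape are hypothetical names used only for illustration.

```python
# Illustrative crosswalk only: these targets follow common Dublin Core to MARC
# practice and are NOT the Gateway's actual default mappings.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
    "identifier": ("856", "u"),
}


def crosswalk(dc_record):
    """Turn {element: [values]} into a sorted list of (tag, subfield, value)."""
    fields = []
    for element, values in dc_record.items():
        target = DC_TO_MARC.get(element)
        if target is None:
            continue  # unmapped elements are simply dropped in this sketch
        tag, code = target
        for value in values:
            fields.append((tag, code, value))
    return sorted(fields)  # order by MARC tag, then subfield, then value
```

Changing a mapping in this picture is just editing one entry in the table, which is roughly what the Gateway’s mapping controls let you do without touching MARC directly.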

After a quick look through the numbers and our Google Analytics on the IR, it appears to be having the desired impact. Our Digital Strategies Librarian and I hope to do a more detailed analysis of the results in the new year. In the meantime, a few thoughts.

Involve a cataloguer

Conversion from a less granular to a more granular metadata format is always a little dicey. The ability to change the mapping on the fly and see the results in preview before running a full sync is handy. Although the interface is designed so that no knowledge of MARC is required, I recommend having someone who understands MARC (and OCLC MARC in particular) look at the mappings in the MARC view. It’s not always obvious from the labeled display where things are going, and some of the coding (is this a text document that happens to be online, or is it really a computer file?) is only readily apparent to a cataloguer in the MARC view. That may seem like an irrelevant distinction, but in the age of faceted browsing and discovery layers, coding is what gets resources into the correct bucket. It took us a couple of iterations to get the mappings nailed down.

Editor wanted

While the MARC view is useful for seeing where Dublin Core elements are ending up, it would be even more useful as a true editor. It would have been quick and easy to edit the MARC tags to change a mapping rather than going through the process of turning a mapping off in one place and turning it on somewhere else. The ability to add constant notes or update coding directly in the MARC display would also be really nice. Again, it took several tries to get a constant data note to appear in the right place when I could have very quickly plunked it into the appropriate MARC field. I’d also like to see a way to split long fields after a set number of characters. Many of the summary fields for our digital theses get truncated, but there isn’t a specific character that we can use as a delimiter to chunk them up.
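The kind of splitting I have in mind could look something like the Python sketch below, which breaks a long summary at word boundaries so that no chunk exceeds a set length. This is not something the Gateway offers; `split_summary` and the 1,000-character default are hypothetical, and each chunk would presumably go into its own repeated summary field.

```python
import textwrap


def split_summary(text, max_chars=1000):
    """Split text into chunks of at most max_chars, breaking on whitespace.

    max_chars is an assumed limit for illustration, not a documented
    Gateway or MARC constraint.
    """
    # textwrap.wrap breaks on whitespace and returns a list of lines,
    # each no longer than the requested width.
    return textwrap.wrap(text, width=max_chars)
```

Because the break falls on whitespace rather than on a particular delimiter character, it would work for abstracts that have no punctuation we can rely on.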

Make a date

Publication dates are imported from the repository metadata to the MARC publication field (260 $c) but are not reflected in the Date1 element of the OCLC-MARC fixed field. We’re pulling the records down for loading to our local catalogue, and while they load just fine without a date, the missing Date1 meant that our Endeca-based OPAC couldn’t sort them into the proper year for browses or limits. Since we’re batching the records out of WorldCat using the Connexion cataloguing software anyway, I put together a Connexion macro that grabs the year from 260 $c and plugs it into Date1. It works well, but adds a step. Populating the date field as part of the initial conversion would be better.
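The logic of the macro is simple enough to sketch in a few lines of Python. This is not the Connexion macro itself, just an illustration of the same idea; the function name `date1_from_260c` and the year pattern (any standalone four-digit year from 1000 to 2099) are my assumptions here.

```python
import re

# Assumed pattern: a standalone four-digit year between 1000 and 2099.
YEAR = re.compile(r"(?<!\d)(1\d{3}|20\d{2})(?!\d)")


def date1_from_260c(subfield_c):
    """Return the first four-digit year found in a 260 $c value, or None."""
    match = YEAR.search(subfield_c)
    return match.group(1) if match else None
```

A value with no recognizable year returns `None`, so a record like that would be left alone rather than given a guessed date.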

Reporting back

Each time a collection in the repository is synced to WorldCat, WCDG auto-generates a report that is available in an online view and for download in Excel and XML. This is great for keeping an eye on the process, making sure that scheduled syncs are happening when they should, and getting OCLC numbers for records created or updated. Additional reports containing only the OCLC numbers in a plain-text format for easy batch-extracting or, better yet, a file of the MARC records themselves would be really handy.

Stay tuned

More details to come! If you’d like to have a look at the records created through the Gateway process, you can see them in our catalogue.
