Metadata creation: human, machine, or both?
I found this interesting post on the Radical Cataloging list (RADCAT). In "Messing Around With Metadata," Jacob Harris describes the metadata created at The New York Times as a combination of machine extraction and human intervention:
... people are ultimately controlling the process. In the beginning, rules for the automatic extraction and tagging are set by an Information Architect. In the end, final approval and correction of suggested metadata is done by various Web producers before publication. Web producers also do the important job of accurately summarizing the story. So, while we have machines to help out the process, it’s still ultimately a human endeavor, largely because automated summarization and classification has its problems. [emphasis mine]
Hmm. Will our library metadata also be a machine/human joint effort? Is it already? Any thoughts?
Of course library metadata will be a joint effort. Even if publishers miraculously start producing consistent and accurate metadata (or even just the untagged data) in a uniform format across the industry so that libraries can harvest it, it will just be because publishers are using humans to input and verify it. I'm much more confident in a computer's ability to determine appropriate subjects by analyzing text than I am in a computer's ability to differentiate between a title and a subtitle or between the three possible titles on the front page of a lot of conference proceedings. I think there has to be a human involved for quality control, at least. Either that, or we have to be willing to accept less accurate and complete metadata.
Posted by: Alex | Thursday, November 29, 2007 at 02:41 PM
Alex, thanks for this comment. I agree with you. The picture that's developing for the future work of catalogers and metadata librarians includes quality control of harvested metadata. It's actually a role many of us already have as OCLC members. The quality and extensiveness of records in WorldCat really varies these days (especially for foreign language publications).
So, we will continue to have a quality control role. And, I suppose, many more of us will be working for publishers or book vendors.
Posted by: Chris Schwartz | Monday, December 03, 2007 at 06:40 AM