Wednesday, November 28, 2007

Metadata creation: human, machine, or both?

I found this interesting post on the Radical Cataloging list (RADCAT). In "Messing Around With Metadata," Jacob Harris describes the metadata created at The New York Times as a combination of machine extraction and human intervention:

... people are ultimately controlling the process. In the beginning, rules for the automatic extraction and tagging are set by an Information Architect. In the end, final approval and correction of suggested metadata is done by various Web producers before publication. Web producers also do the important job of accurately summarizing the story. So, while we have machines to help out the process, it’s still ultimately a human endeavor, largely because automated summarization and classification has its problems. [emphasis mine]

Hmm. Will our library metadata also be a machine/human joint effort? Is it already? Any thoughts?

Comments

Of course library metadata will be a joint effort. Even if publishers miraculously start producing consistent and accurate metadata (or even just the untagged data) in a uniform format across the industry so that libraries can harvest it, it will just be because publishers are using humans to input and verify it. I'm much more confident in a computer's ability to determine appropriate subjects by analyzing text than I am in a computer's ability to differentiate between a title and a subtitle, or between the three possible titles on the front page of a lot of conference proceedings. I think there has to be a human involved for quality control, at least. Either that, or we have to be willing to accept less accurate and less complete metadata.

Alex, thanks for this comment. I agree with you. The picture that's developing for the future work of catalogers and metadata librarians includes quality control of harvested metadata. It's actually a role many of us already have as OCLC members. The quality and extensiveness of records in WorldCat really varies these days (especially for foreign language publications).

So, we will continue to have a quality control role. And I suppose many more of us will be working for publishers or book vendors.
