
Wednesday, September 15, 2010




When you implement a discovery layer that has facets based on information in the fixed fields, you really notice when the fixed fields are wrong (or not coded at all).


Jonathan Rochkind

"Fill in fields correctly or not at all, but not carelessly – data has to be as reliable as possible for machine processing and for the display generated from it."

This is an excellent, excellent point, which I think may be a change from the pre-computer era.

I get the feeling (looking at my data) that traditionally, cataloging departments said "We need to fill out this field for every record because, oh, it's in some standard. But it's not really a field that matters that much, our ILS doesn't really do much with it (or it doesn't print on the card! Or isn't that useful on the card), so we won't spend much time on it, or worry if we've done it accurately or not."

This really comes back to haunt you in the computer era. If a given data element has a lot of carelessly entered data, it means the entire data element is basically useless, because there's no way for software to know which records are accurate and which are not. So why spend any time on it at all, if the result is basically useless?

Enter it right, or don't enter it at all. If I have a corpus of a million records, and 400k of them have a blank data element, I know what the score is, and can decide if there's something I can do anyway with the 600k that do have a value. But if 400k of them have an incorrect value in a data element -- I'm out of luck, the whole field might as well not be there, because there's no way (or in some cases only a resource-intensive, approximate, and difficult way) to know _which_ 600k are right!
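To make the blank-versus-wrong distinction concrete, here is a minimal Python sketch. The record layout, the `language` element, and the `split_by_presence` helper are all made up for illustration; they are not taken from any particular ILS or MARC library.

```python
from collections import Counter

# Hypothetical records, each a dict with an optional "language" element.
records = [
    {"id": 1, "language": "eng"},
    {"id": 2, "language": ""},     # left blank
    {"id": 3, "language": "fre"},
    {"id": 4, "language": None},   # not coded at all
    {"id": 5, "language": "eng"},  # suppose the item is really in German
]


def split_by_presence(records, element):
    """Partition records into those with a value for `element` and those without."""
    populated, blank = [], []
    for record in records:
        value = record.get(element)
        if value is None or not str(value).strip():
            blank.append(record)
        else:
            populated.append(record)
    return populated, blank


populated, blank = split_by_presence(records, "language")

# Blank values are trivially detectable, so they can be left out of a facet.
facet_counts = Counter(r["language"] for r in populated)

print("blank ids:", [r["id"] for r in blank])
print("facet counts:", dict(facet_counts))
```

Software can set records 2 and 4 aside because they announce that they are empty. Record 5 looks exactly like record 1 to the code, so the miscoded value ends up in the facet with nothing to flag it.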

Jonathan Rochkind

And I'm not sure there are that many people in library land de-emphasizing the importance of cataloging. Maybe outside of library land. Maybe inside of library land there are some (I'll say it) clueless administrators who are not software engineers.

Very few software engineers in library land think this, because we know our software is built on cataloging data. What software engineers in library land get frustrated about is when lots of human cataloging effort is spent creating metadata that can't in fact be effectively used by software, because it wasn't designed the right way.

I tried to dispel this distrustful myth in some of the first posts on my blog, over three years ago. I still believe it:



Christine Schwartz

@Alison, @Jonathan, thanks for the thoughtful comments!

Good catalogers have always correctly coded fixed fields knowing that the information might well be used in the future.

I've always been against coding metadata for the current system. It's bad practice. I guess people find it tempting to code MARC records for their current ILS. It may seem like the easy way out, but in the long run will only cause problems.

I've always thought that an accurate bibliographic record was better than a long, extensive record coded badly and not following national cataloging standards. I was influenced early in my career by an article on this topic written by Peter Graham. Will track down the title.

