In my alphabetical list of possible duplicate records, I've reached titles beginning with the word "long". The titles begin specifically with "long": not "a long", which were listed at the beginning after the numbers, and not "the long", which will come much later. The sorting program does not recognize initial articles, so it files those titles under "a" and "the" instead of skipping the article.
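To show the difference, here is a small Python sketch comparing a plain A-Z sort with one that skips a leading article. It is only an illustration: the sample titles and the function names are made up, and I don't know how the catalog software actually sorts internally.

```python
# Hypothetical sample titles, not taken from any real catalog.
titles = ["The Long Goodbye", "Long Way Home", "A Long Walk", "Long Division"]

# Assumption: only English articles are considered.
ARTICLES = ("a ", "an ", "the ")

def literal_key(title):
    # Plain case-insensitive sort: an article is treated like any other word.
    return title.lower()

def article_skipping_key(title):
    # Drop one leading article before comparing, so "The Long ..." files under "long".
    lowered = title.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered

literal_order = sorted(titles, key=literal_key)
filing_order = sorted(titles, key=article_skipping_key)

print(literal_order)
print(filing_order)
```

With the plain sort, "A Long Walk" lands at the top and "The Long Goodbye" at the bottom, which is the behavior I see in my list. With the article-skipping key, all four titles sit together under "long".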
In the process of finding and merging duplicates in the MassCat catalog, I'm also finding and reporting duplicates in other catalogs. As I've written before, I key in a word, sort by title A-Z and then begin looking for problems.
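For anyone curious what that routine might look like if a program did the first pass, here is a rough Python sketch. The record ids, titles, and function names are all invented for the example, and a matching title only makes two records candidates; someone still has to compare them before merging or reporting anything.

```python
from itertools import groupby

# Hypothetical records as (record id, title) pairs. Not real catalog data.
records = [
    (101, "Long Way Home"),
    (102, "The Long Goodbye"),
    (103, "Long way home."),
    (104, "Long Division"),
]

def normalize(title):
    # Lowercase and trim trailing punctuation so trivial differences don't hide a match.
    return title.lower().strip().rstrip(" ./:;,")

def candidate_duplicates(records, word):
    # Keep titles containing the search word, sort them A-Z,
    # then collect runs of neighbors whose normalized titles are equal.
    hits = [r for r in records if word in normalize(r[1])]
    hits.sort(key=lambda r: normalize(r[1]))
    groups = []
    for _, group in groupby(hits, key=lambda r: normalize(r[1])):
        ids = [rec_id for rec_id, _ in group]
        if len(ids) > 1:
            groups.append(ids)
    return groups

dupes = candidate_duplicates(records, "long")
print(dupes)
```

Sorting first matters here because `groupby` only groups items that sit next to each other, which is the same reason I sort by title before I start reading down the list.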
One of those problems is CIP (Cataloging in Publication) records. Generally, they are pretty good, but because these records were created while the book was still in the process of publication, they lack information such as the number of pages, the height, and whether or not there are illustrations.
When I find such a record, I go to the C/W MARS catalog and look at their bib record to find the missing information. Sometimes I find two identical records, which I then report to the cataloging staff at C/W MARS. Given the size of their catalog, they have very few duplicates. I find maybe one or two in a month. Compare that with MassCat, where I find 20 in a day. Well, that's what I'm there for.
When C/W MARS started, it had one catalog for all of its members. Then, it moved to different software that couldn't handle the size, so it was split between CMARS and WMARS. When they moved to yet different software, the catalogs were once again combined. But not all the duplicates were automatically merged. Human intervention is needed for those last few stragglers.
The other place I find duplicate records is in OCLC. That is such a HUGE catalog that there are bound to be "issues". And because so much of the information is loaded automatically, there are bound to be even more "issues". I think I find a duplicate record nearly every time I search the catalog. It is not unusual to find 2 or 3 of the exact same thing. Those I report to the cataloging staff at OCLC.
Sometimes I get the feeling my purpose in this world is to find and merge (or report) the duplicate bib records in every catalog on Earth.