Tuesday, March 22, 2016

Hang Fire

When I keyed "hang fire" into the Google.com search box, what appeared was "delay or be delayed in taking action or progressing". That's a perfect definition for my situation - especially the progressing part.

As I've said before on this blog, I have an alphabetical list of potential duplicate records. I search the MassCat catalog to determine if, indeed, they are duplicates. Sometimes they are. Sometimes not; the title and author are the same, but one is perhaps large print, another a mass market paperback and yet another the audio version. Between the less-than-perfect bib records and the idiosyncratic way that Koha sorts those records, potential duplicates don't always end up one after the other. They may be several records apart and not always easy to find. Sometimes they'll be on different "pages". Each bit of punctuation affects the sorting: Fire and Fire! do not end up near each other.
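For anyone curious how punctuation can pull near-identical titles apart, here is a rough sketch in Python. It is not how Koha sorts or indexes records; the helper names `sort_key` and `group_candidates` are made up for illustration, and the only assumption is a plain list of title strings. It strips punctuation and case so that Fire and Fire! land in the same group.

```python
import string
from collections import defaultdict

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def sort_key(title: str) -> str:
    """Hypothetical normalizer: lowercase, drop punctuation, collapse spaces."""
    stripped = title.lower().translate(_STRIP_PUNCT)
    return " ".join(stripped.split())


def group_candidates(titles):
    """Group titles by normalized key; keep only keys with 2+ titles."""
    groups = defaultdict(list)
    for title in titles:
        groups[sort_key(title)].append(title)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    sample = [
        "Fire",
        "Fire!",
        "Fire bell in the night",
        "Fire-ball in the night",
        "Firefighter",
    ]
    for key, found in group_candidates(sample).items():
        print(key, found)
```

A key like this only catches punctuation and capitalization differences. A misspelling such as Fire-ball for Fire bell still needs a human eye, and a matching title says nothing about whether one record is large print and the other an audiobook.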

The way I search for duplicates (and other problems) in the catalog is to use my alphabetical list as a guideline. Instead of searching the entire title, I do a keyword search on one word of the title and the author's first or last name, depending on which is faster and easier to type.

This can be a pretty boring job at times. The way to keep myself interested is to make a game of it - which is how "hang fire" came about. There were several titles beginning with the word Fire. I decided to key in that one word and see what happened.

The first thing is that I got a list of over 3200 titles matching that search. This was a keyword search, remember. The word "Fire" can be anywhere in the record. Unfortunately, Koha only searches the first 1000 titles, so I knew I couldn't get to that middle 1000 easily. I sorted by title A-Z and began to look for duplicates, typos, funky characters that should be accent marks, incomplete records, and any other bib record in need of a cataloger's attention. I found plenty - hence "hang fire". I've been working on variations of this search for over a week! 

Here is one of the things I found: 3 records for Oscar Handlin's Fire bell in the night, although one was spelled Fire-ball in the night. There is now one record, spelled correctly.

When I finally reached record #1000, I re-sorted by title Z-A and worked backwards through another 1000. After merging all of the duplicate records (now there are only about 3100 or so), I re-sorted A-Z, went to record #940, looked through that last batch and found a few more things to take care of. I re-sorted Z-A and did the same from the other end.

At that point, I went back to my duplicate list and did slightly more detailed searches (usually 2 title words and one of the author's names) on each title listed to make sure I hadn't missed anything in that massive scanning. While I never did get to that middle 1000 records, I can definitely say there are now fewer records and they are cleaner and more complete. 

Isn't that what it's all about? Now on to firefighter, firefighting, etc. It will be a while before I get to the letter "G".
