Showing posts with label catalogs. Show all posts

Wednesday, October 11, 2017

The (In)Accurate Catalog

I've been working on the MassCat catalog for about 5 years now. First 10 hours/week, then 15, and now 18. Admittedly, I don't spend all of those hours on clean-up. I import vendor-provided records, search for and import records from OCLC, and create new records. I so very much want this catalog to correctly reflect the holdings of the MassCat libraries.

Yesterday, I had a very discouraging day. I'm still on the letter "L" in my alphabetical list of possible duplicates, specifically the word "Library". For some reason, I found record after record of the electronic version of a book and the corresponding holdings appeared to be for a print version. I know from past experience that many of these libraries do not have e-books.

Most of the time, just to be sure, I send an email to the director and explain the situation. I then ask exactly what the library owns and, if necessary (which it usually is), I find the bib record for the print book and overlay the e-book record.

Yesterday, I found over 30 e-book records that I suspected were really print books. I stopped emailing after the first 10. I figure I'll find the others again. Or maybe the books will have been weeded (dream on!) and I can just delete the record.

Nothing makes me happier than finding some skimpy or weird record with no holdings. ZAP! It's gone! Never to sully my catalog again.

Other things I encountered recently: a book by Edgar Allen Poe, now correctly by Edgar Allan Poe and another by Willliam somebody. He now has only two els in his first name.

I know I'm making progress because I keep statistics. Every month I merge hundreds of duplicate records, replace hundreds of skimpy records, and edit hundreds of other records. That third category consists of filling in pages on CIP records, correcting funky characters that should be accent marks, and correcting spellings.

If only I could get library staff to actually LOOK at the record before they import it and make sure it actually matches what they have in hand, I'd be a VERY HAPPY CATALOGER.

The one positive of this situation is that I have lots of war stories to tell when I'm teaching cataloging workshops.

Saturday, November 26, 2016

Why I Do What I Do

Three days each week, 6 hours on each of those days, I sit in front of a computer at the Massachusetts Library System and work with (on?) the MassCat online catalog. This may seem like a boring and tedious job to some people (and sometimes it is), but it suits me and actually brings me much satisfaction.

I do several different things: I import bib records that libraries receive from their vendors when they buy new books and other materials; I find bib records via paid sources to which I have access; I create bib records when they don't already exist. All of this keeps a library's catalog accurate so patrons and staff know what the library owns and where the item is.

Most of my time is spent searching for and merging duplicate records. If you've spent any time reading this blog, you know I'm now on the letter "K". I started on titles beginning with the word "Kissinger" yesterday. (There were over 800 hits using that one word in a keyword search.)

As I find duplicates, I look at the records side by side to make sure they are exactly the same and then hit the "Merge" button. All libraries with holdings on either record are now all listed as owners of this one version of a title.
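The compare-then-merge step can be sketched in a few lines of Python. This is only an illustration, not how Koha implements its "Merge" button: the dictionary layout, the field names, and the `merge_records` function are all hypothetical.

```python
# Hypothetical record layout: a dict with title, author, format, and a list
# of owning libraries. Real bib records carry far more fields than this.
COMPARE_FIELDS = ("title", "author", "format")


def merge_records(keep: dict, duplicate: dict) -> dict:
    """Merge `duplicate` into `keep` only if the compared fields match."""
    for field in COMPARE_FIELDS:
        if keep[field] != duplicate[field]:
            # Different formats (large print vs. paperback, say) are not duplicates.
            raise ValueError(f"records differ on {field!r}; not merging")
    merged = dict(keep)
    # Every library with holdings on either record becomes an owner of the one record.
    merged["holdings"] = sorted(set(keep["holdings"]) | set(duplicate["holdings"]))
    return merged


record_a = {"title": "Private Paris", "author": "Patterson, James",
            "format": "large print", "holdings": ["Library A", "Library B"]}
record_b = {"title": "Private Paris", "author": "Patterson, James",
            "format": "large print", "holdings": ["Library B", "Library C"]}
print(merge_records(record_a, record_b)["holdings"])
```

The check before the merge stands in for the side-by-side look: if anything in the compared fields differs, the sketch refuses to merge rather than guess.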

Okay, that's what I do; now for why I do it.

There are lots of benefits from merging duplicate records, but I'm sure most people - even most librarians - don't even think about it, so here they are.

Benefits to the Patrons: Whether in a public, school, or special library, if one enters a keyword search with one or two words (as in the Kissinger example above), the result is a long list of books, audio books, videos, and other materials containing that word. Sorting through that list can be difficult. Many books, especially popular titles like James Patterson's Private Paris, come in a variety of formats: regular print, large print, book on CD. There might also be a mass market paperback, a DVD, Blu-ray and/or an e-book.

Each of these formats needs its own bib record for Interlibrary Loan purposes. If I can only comfortably read large print, I don't want to get a mass market paperback; I don't want a Blu-ray disc if I don't have a player on which to view it.

It can be confusing enough navigating the myriad versions of a title, but having two or more bib records of the exact same thing only adds to the confusion. Hence, merging duplicate records.

Another benefit for the patron is having all owning libraries listed on one record, rather than each owning library having a separate record. Some libraries, particularly historical societies and other special libraries, do not allow their materials to circulate; a person can use them on site, but not take them out of the building. But if another library also owns the same title and does allow their collection to circulate, I can borrow that copy via Interlibrary Loan.

And if two different historical societies own the same item, which is often the case, and I can only use that item on site, I can travel to whichever is closer.

There are benefits to the library staff as well, especially for those involved in collection development: If a particular title in my collection has been lost but several other libraries in the area own it (which I can easily tell by looking at the list of owning libraries listed on said title), I might not bother to replace that title since my patrons can borrow it elsewhere. That helps stretch my materials budget. Similarly, if a specific book is looking shabby, I can weed it knowing my patrons have access to it at another library. That keeps my library looking fresher and more inviting.

As I stand and stretch and take a quick walk because too much sitting can stiffen my joints, I reflect on what I've accomplished that day and think of my contribution - small but important - to MassCat libraries.


Tuesday, March 22, 2016

Hang Fire

When I keyed "hang fire" into the Google.com search box, what appeared was "delay or be delayed in taking action or progressing". That's a perfect definition for my situation - especially the progressing part.

As I've said before on this blog, I have an alphabetical list of potential duplicate records. I search the MassCat catalog to determine if, indeed, they are duplicates. Sometimes they are. Sometimes not; the title and author are the same, but one is perhaps large print, another a mass market paperback and yet another the audio version. Between the less-than-perfect bib records and the idiosyncratic way that Koha sorts those records, potential duplicates don't always end up one after the other. They may be several records apart and not always easy to find. Sometimes they'll be on different "pages". Each bit of punctuation affects the sorting: Fire and Fire! do not end up near each other.
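One way to picture the punctuation problem is to build a "match key" that ignores case, punctuation, and accent marks, and then group records by that key plus format. The sketch below is an assumption-laden illustration, not Koha's sorting logic: the `(id, title, format)` tuples and both function names are made up for the example.

```python
import string
import unicodedata
from collections import defaultdict

# Translation table that turns every ASCII punctuation mark into a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def match_key(title: str) -> str:
    """Lowercase the title, drop accent marks and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.translate(_PUNCT_TO_SPACE).lower().split())


def group_candidates(records):
    """Group (id, title, format) tuples; keep only groups with 2+ records."""
    groups = defaultdict(list)
    for rec_id, title, fmt in records:
        # Format is part of the key: large print and audio are not duplicates.
        groups[(match_key(title), fmt)].append(rec_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


sample = [
    (1, "Fire", "print"),
    (2, "Fire!", "print"),
    (3, "Fire", "large print"),
]
print(group_candidates(sample))
```

With a key like this, Fire and Fire! land in the same group, while the large print edition stays separate. A grouping like this only suggests candidates; someone still has to look at the records side by side.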

The way I search for duplicates (and other problems) in the catalog is to use my alphabetical list as a guideline. Instead of searching the entire title, I do a keyword search on one word of the title and the author's first or last name, depending which is faster and easier to type.

This can be a pretty boring job at times. The way to keep myself interested is to make a game of it - which is how "hang fire" came about. There were several titles beginning with the word Fire. I decided to key in that one word and see what happened.

The first thing is that I got a list of over 3200 titles matching that search. This was a keyword search, remember. The word "Fire" can be anywhere in the record. Unfortunately, Koha only searches the first 1000 titles, so I knew I couldn't get to that middle 1000 easily. I sorted by title A-Z and began to look for duplicates, typos, funky characters that should be accent marks, incomplete records, and any other bib record in need of a cataloger's attention. I found plenty - hence "hang fire". I've been working on variations of this search for over a week! 

Here is one of the things I found: 3 records for Oscar Handlin's Fire bell in the night, although one was spelled Fire-ball in the night. There is now one record, spelled correctly.

When I finally reached record #1000, I re-sorted by title Z-A and worked backwards through another 1000. After merging all of the duplicate records (now there are only about 3100 or so), I re-sorted A-Z, went to record #940, looked through that last batch and found a few more things to take care of. I re-sorted Z-A and did the same from the other end.

At that point, I went back to my duplicate list and did slightly more detailed searches (usually 2 title words and one of the author's names) on each title listed to make sure I hadn't missed anything in that massive scanning. While I never did get to that middle 1000 records, I can definitely say there are now fewer records and they are cleaner and more complete. 
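The arithmetic behind the A-Z and Z-A passes can be worked out with a small Python sketch. It assumes the limit behaves as a fixed window of 1000 records from whichever end the list is sorted; the function name and that model of the limit are assumptions for illustration only.

```python
def reachable_positions(total_hits: int, window: int = 1000):
    """Return (front_end, back_start, unreachable).

    Positions 1..front_end are visible when sorted A-Z, positions
    back_start+1..total_hits are visible when sorted Z-A, and
    `unreachable` counts the records in between that neither pass shows.
    """
    front_end = min(window, total_hits)
    back_start = max(total_hits - window, front_end)
    return front_end, back_start, back_start - front_end


for hits in (3200, 3100, 1500):
    print(hits, reachable_positions(hits))
```

Under that assumption, roughly 1200 of the 3200 hits sit in the unreachable middle, and each merge shrinks that middle a little, which is why the narrower follow-up searches are still needed.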

Isn't that what it's all about? Now on to firefighter, firefighting, etc. It will be a while before I get to the letter "G".

Tuesday, March 8, 2016

Funky letters

The English language doesn't have any accent marks except on the words that have been borrowed (or should that be usurped since I don't think we're going to return them) like facade (which should have a cedilla since it is French) or Noel (which should have an umlaut, but you rarely see it with one). In the library world, accent marks are called diacritics.

Now that we have word processing programs, those diacritics are often automatically added by the computer software (but not in Blogger, I guess, since they did not pop in).

Since libraries often have books (and other materials, of course) in languages other than English, and usually have books (and other materials) written by or acted by people with names that require diacritics, you'll see many cedillas and umlauts as well as acute or grave accents, and rings (which I thought was called an angstrom but, according to Wikipedia, that is a unit of measure). Depending on the software a particular library uses for its catalog, these accent marks may or may not display correctly.

When bib records are transferred from one program to another (such as Koha, which is used by MassCat), the correct coding for those accent marks may or may not be transferred correctly.

One of the things I do at work is to look for funky characters (that's letters, not people). As I scan the list of brief records looking for duplicates and misspellings (remember the Portuguese Picket Dictionary?) I look for misplaced symbols like ? or @ in the middle of words. That usually signals there should be some sort of accent mark over or under or through a letter. Often I can tell what it should be, but since all of these programs translate the codes differently, I'm not always sure.

Fortunately, it's easy for me to correct these - what should I call them? - mis-translations when I find them. I actually have a Word document with all of the different possibilities of letters with accent marks. I simply copy and paste over the offending symbol. Or if it's someone's name and I'm not certain what it should be, I go to WorldCat.org, search the part I do know, and copy and paste the correction.
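The eyeball scan for stray symbols could be approximated with a short Python check. This is a rough sketch that only flags a ? or @ sandwiched between letters; the function name is hypothetical, and it does not try to guess which accented letter belongs there, since that depends on how each program translated the codes.

```python
import re

# A "?" or "@" with a letter or digit on both sides, i.e. in the middle of a word.
SUSPECT = re.compile(r"(?<=\w)[?@](?=\w)")


def find_suspect_words(title: str) -> list:
    """Return the words in a title that contain a mid-word ? or @."""
    return [word for word in title.split() if SUSPECT.search(word)]


for title in ("Fran?ois Truffaut", "Who is it?", "Dvo@rak : a life"):
    print(title, "->", find_suspect_words(title))
```

A question mark at the end of a title is left alone because nothing follows it. Email addresses would be flagged too, so anything this turns up still needs a person to look at it.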

If only all of life's problems could be solved so easily.

Wednesday, December 2, 2015

Anniversaries and the Letter "F"

Last week was Thanksgiving. It marks a milestone in my life different from the traditional giving thanks and turkey dinner. On the Monday after Thanksgiving in 1967, I began my first full-time library job. It was at the UMass/Amherst library - then Goodell - typing the headings on the tops of catalog cards and then filing them into the card catalog. I'm so glad that those labor-intensive cards are gone.

Who would have thought that 48 years later I'd still be "wearing the sensible shoes" as they sometimes say on the AUTOCAT email list.

As I consider retirement, perhaps I should keep this date in mind. In two years it will be an even 50. Now that's something to celebrate.

On the MassCat front, I've finally cracked the letter "F" in my alphabetical list of possible duplicates. I've been working on F for a couple of weeks now. To put this in perspective, I briefly blogged about arriving at "E" in early May and "D" in June of 2014. Even though everything is automated, it's still pretty labor-intensive.

Monday, May 4, 2015

The Portuguese Picket Dictionary

I wonder if anyone has ever searched for a Portuguese Pocket Dictionary in the MassCat catalog. If so, they would not have found it. Whoever entered the data for that particular resource did so incorrectly.

This is the sort of situation I face every Monday, Wednesday and Friday as I pore over the catalog looking for problems like typos, incomplete words, and bibliographic records too skimpy to know what the item really is.

My first priority is hunting for duplicate (and triplicate and quadruplicate) records. Sometimes it's hard enough to navigate the catalog without being confronted with two of the exact same thing and several more variations. Which is which?

I have a printout, in alphabetical order, of potential duplicate records. A few weeks ago I reached the letter E. On June 25, 2014, I posted that I had arrived at the letter D. It's going slowly, but I'm doing a lot of other things, too.

Since Friday was May 1, I totaled my statistics for the month of April. While I don't have the exact figures in front of me, I merged over 800 bibliographic records. That's a pretty typical month. I also replaced several hundred incomplete records with ones with more detailed information. And several hundred additional records received what I call "minor edits" like correcting typos and adding page numbers when I can find them.

As I look at all the work that needs to be done to this catalog, I know I'll have a job for a very long time. At least now anyone looking for a Portuguese Pocket Dictionary will be able to find one.