We’re working on a major catalogue update at work, which has me thinking a lot about how people use catalogues, databases, and other collections of information.
In talking about our new catalogue, I’ve also been reminded that most people don’t know how these things work, or what might be useful to them – so it seems like a great time for a short series of posts about that.
So, the first question to start with: what’s a catalogue?
For libraries, a catalogue is a highly specialised database that holds information about books in the collection. Often the catalogue is part of an Integrated Library System, or ILS, which tracks a whole bunch of things. Sometimes the catalogue only does some of those pieces.
Things that are commonly included:
- Information about works in the collection (such as title, author, publisher, publication information, call number, subject headings). This is sometimes referred to as the bibliographic record.
- Information about particular items in the collection, i.e. each actual thing that’s on the shelf (or however it’s stored or accessed). This is sometimes called a ‘holding’ record (because it describes the holdings of the library).
- Loan information about specific items in the collection and who has them.
- Information about electronic resources (sometimes this might be a link to them, sometimes systems pull in all the things in a database so you can search for them all in one place.)
- Additional resources the library has chosen to add (documents, files, etc.)
These records may have public notes (things to help library users) or staff-only notes (to help staff manage resources and answer questions.)
Again, not every library will have all these things.
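If it helps to see the shape of this, here’s a minimal sketch in Python of how the pieces in the list above might relate to each other. The class and field names (`BibRecord`, `Item`, `on_loan_to` and so on) are ones I’ve made up for illustration; a real ILS stores all of this differently and in far more detail.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """One actual copy on the shelf: roughly a holdings record."""
    barcode: str
    location: str
    on_loan_to: Optional[str] = None  # loan information; None means it is not checked out
    staff_note: str = ""              # staff-only note, not shown to library users


@dataclass
class BibRecord:
    """The work itself: roughly a bibliographic record."""
    title: str
    author: str
    call_number: str
    subjects: list[str] = field(default_factory=list)
    public_note: str = ""
    items: list[Item] = field(default_factory=list)

    def available_copies(self) -> int:
        """Count the copies that are not currently on loan."""
        return sum(1 for item in self.items if item.on_loan_to is None)


# Entirely invented example data.
record = BibRecord(
    title="An Example Title",
    author="Author, Example",
    call_number="000.0 EXA",
    subjects=["Libraries"],
    items=[
        Item(barcode="0001", location="Main shelves"),
        Item(barcode="0002", location="Main shelves", on_loan_to="patron-17"),
    ],
)

print(record.available_copies())  # 1
```

One bibliographic record can have several items hanging off it, which is why many systems keep the two kinds of record separate.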
Our collection at work has bibliographic records, but doesn’t have separate holdings records (all the information about all our copies is in a single record: this is sometimes a bit clunky, but it works okay for us because we don’t check a lot of items out.)
Likewise, we don’t have a separate circulation (or loan) module – all the loan information is in the record. Library users can’t see it, because it doesn’t display to them, just to the tools staff uses.
(In some libraries, this would be a problem, but in our library, there’s just one and a half staff members, and we both need to have access to it. The library assistant usually deals with loans and circulation, but if she’s on vacation or something comes up, I need to be able to see what’s going on and make changes too.)
When you put information into a catalogue, you are collecting metadata – that’s the term for ‘information about a thing’.
My favourite explanation of metadata comes from a Scientific American piece from 2012 that used Santa Claus and Christmas lists as examples. Go read it if you’re not sure how metadata works; I’ll wait for you.
So, metadata about books includes the title, author, and publication information. It might also include things like whether a book is considered a particularly good resource, or is on a recommendation list. It might include whether it was donated (and if so, by whom). All kinds of things can be metadata.
Libraries have some commonly used systems for formatting it. A lot of libraries still use MARC (which stands for MAchine-Readable Cataloging). Here’s a longish explanation from the Library of Congress about the details. MARC provides the structure for the data.
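To give a rough feel for what “structure” means here, this is a heavily simplified sketch in Python. The tags 100, 245 and 650 are real MARC 21 bibliographic tags (main personal name, title statement, and topical subject), but the list-of-tuples layout, the `get_subfields` helper and the sample data are my own assumptions for illustration. A real MARC record also has a leader, indicators and control fields, all of which I’ve left out.

```python
# A simplified stand-in for one MARC record: a list of (tag, subfields) pairs.
# Tags can repeat (a book can have several subject headings), so a list is
# used here rather than a dictionary keyed by tag.
record = [
    ("100", {"a": "Author, Example"}),                            # main entry: personal name
    ("245", {"a": "An example title :", "b": "with a subtitle"}),  # title statement
    ("650", {"a": "Libraries"}),                                   # subject: topical term
    ("650", {"a": "Cataloguing"}),                                 # a second subject
]


def get_subfields(record, tag, code):
    """Return every value stored under the given tag and subfield code."""
    return [
        subfields[code]
        for field_tag, subfields in record
        if field_tag == tag and code in subfields
    ]


print(get_subfields(record, "245", "a"))  # ['An example title :']
print(get_subfields(record, "650", "a"))  # ['Libraries', 'Cataloguing']
```

The point is just that every piece of information has a labelled slot, so software knows where to look for a title or a subject without guessing.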
Besides the structure, there needs to be consistency in how you write things down. For a long time, libraries used the Anglo-American Cataloguing Rules (AACR, or AACR2 for the 2nd edition, etc.) but now a lot of libraries use RDA (Resource Description and Access).
Why all the rules?
Computers are still fairly stupid – they’re really quick at matching up things we tell them with things that they have stored, but they need a lot of help to match up things like typos or alternate ways of phrasing things.
(Google, Amazon, and the other tech giant companies have huge amounts of resources and lots of cutting edge design capabilities to make that work. Your average library just doesn’t. Your average library is probably pretty excited if their staff computers are less than four years old.)
So, in order for the computer to match things up, the library needs to be using consistent words (what’s called a “controlled vocabulary”, for things like subject headings and formats) and an underlying structure.
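Here’s a small Python sketch of why that matters. The format terms, the `VARIANTS` table and the `normalise_format` function are all hypothetical, made up for this example; the idea is only that an exact match misses anything typed slightly differently, while mapping each entry onto one agreed term makes filtering work.

```python
# Hypothetical lookup table: every variant we know about, mapped to the one
# controlled term we actually want to use.
VARIANTS = {
    "book": "Book",
    "books": "Book",
    "bk": "Book",
    "dvd": "DVD",
    "dvds": "DVD",
    "audiobook": "Audiobook",
    "audio book": "Audiobook",
}


def normalise_format(raw):
    """Return the controlled term for a raw value, or None if it isn't recognised."""
    # Lower-case the value and collapse stray spaces before looking it up.
    key = " ".join(raw.lower().split())
    return VARIANTS.get(key)


# What people actually typed into the format field over the years.
entered = ["Book", "books", " DVD ", "Audio  Book", "Bokk"]

# A plain exact match only finds one of the two books.
exact = [value for value in entered if value == "Book"]
print(exact)  # ['Book']

# After normalising, both books match, and the typo is flagged as None
# so a person can look at it.
normalised = [normalise_format(value) for value in entered]
print(normalised)  # ['Book', 'Book', 'DVD', 'Audiobook', None]
```

Note that the typo doesn’t get guessed at: it comes back as unrecognised, which is usually what you want when a human is going to review the leftovers.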
What does this mean in practice?
A lot of what I’m doing in our new catalogue right now is setting up that structure and arranging the different screens so they do what we want.
For example, we have a lot of options on the screen to add things to the catalogue, but when we edit things, we’re usually only editing a couple of specific pieces. So I set it up so those are at the top of the screen, and then we can get to everything else if we need to, but don’t have to scroll to get to it.
(You have no idea how exciting this is, when you’ve been spending years having to scroll down a very similar-appearing form to look for one specific field.)
But another big part of what I’m working on is fixing things so we’re using a smaller list of terms for things like format and location. That means people will be able to filter usefully by them, which will be amazing.
(This is going to take months and months. Fixing the formats and locations is pretty quick, but we have 14,000 subject headings, and a lot of them are tiny variants or typos of the ones we actually want.)
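For the curious, this is roughly the kind of thing that can help with that clean-up, sketched in Python with the standard library’s `difflib` module. It is not how our catalogue software does it; the preferred headings, the `suggest_heading` function and the 0.85 cutoff are all assumptions I’ve picked for the example, and anything it suggests would still need a person to check it.

```python
import difflib

# Hypothetical short list of the headings we actually want to keep.
PREFERRED = ["Gardening", "Cookery", "Local history"]


def suggest_heading(raw, preferred=PREFERRED, cutoff=0.85):
    """Suggest the preferred heading closest to a raw one, or None if nothing is close."""
    # Compare in lower case, but hand back the properly capitalised term.
    by_key = {term.lower(): term for term in preferred}
    key = " ".join(raw.lower().split())
    matches = difflib.get_close_matches(key, list(by_key), n=1, cutoff=cutoff)
    return by_key[matches[0]] if matches else None


for raw in ["Gardenning", "Cookery.", "Local  history", "Astronomy"]:
    print(f"{raw!r} -> {suggest_heading(raw)!r}")
```

Headings that are only a typo or a stray full stop away from a preferred term get a suggestion, and anything genuinely different (like the last one) comes back as `None` rather than being forced into the wrong heading.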