Next Generation Library Catalogs

Eric Lease Morgan, Head of the Digital Access and Information Architecture Department at the University Libraries of Notre Dame

An outline of his idea for next generation library catalogs, originally composed in 2006 and updated in 2007, is on the ND library website. A more recent and shorter version is available on his website, Infomotions.

He began by asking what people wanted to learn and addressed whether he would be able to cover it.

Initial Questions

  • What is the Catalog?
  • What does it Contain?
  • What functionality do you expect from it?
  • What problem is it expected to solve?

What is the Catalog?

Concept of Index

  • A list of words, each acting as a pointer
  • He argues that the catalog is a type of index
  • An index is a finding tool; a database is an organization of information
    • Google is an index; URLs are the pointers

1995: collecting electronic journals

  • Created an 856 field with subfield u, and people said you can’t do that.
  • The catalog expanded from owned material to licensed material, and to where to find other items.
  • The catalog became more of a finding aid.

What does it contain?

  • Pointers (other databases)
  • Images
  • Laptops
  • “What doesn’t it contain” is a good question.
    • Live animals 🙂
    • Each library is different, so community standards guide what is not included.
      • constrained by limited resources
      • assumes what users already have (focus on support materials)
  • Why couldn’t it contain articles?
    • H.W. Wilson came along and made the Readers’ Guide to Periodical Literature
    • an early example of library outsourcing

What problem is it expected to solve?

  • Large collections are hard to browse; the catalog helps users get to those items
  • Depends on who you ask
    • Cataloging, circulation – inventory
    • Reference – need to find items in every available way
    • User – helps them find items
  • Issue of interlibrary loan
  • The catalog is not the Integrated Library System (ILS); the two terms mean different things
  • What about classifying by time frame or method of access (e.g., must wear gloves, available in 3 days)?

Articulate what you need first so that you can provide an appropriate solution rather than the other way around.

New Models



New products on the market and how they are the same (from a computer science perspective); here he referred to his diagram.

Relational database is at the center.

  • Write reports against the database and feed them to an indexer so that items can be found. (Many accept MARC.)
  • Should be able to change one record from Samuel Clemens to Mark Twain
    • then run a script to re-create the index (all are using Lucene, with Solr as the interface to it)
      • Primo, eXtensible Catalog, VuFind, Endeca, Aquabrowser (OLE [Open Library Environment] doesn’t count yet)
    • Find the word “river”
      • in document 1 at position 5
      • also “Mississippi” in document 1 at position 4
      • on a phrase search for “Mississippi river”, document 1 would be retrieved because the two words sit at adjacent positions (4 and 5)
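
The word-and-position idea above can be sketched in a few lines of Python. This is only an illustration of a positional index, not how Lucene is actually implemented; the function names `build_index` and `phrase_search` and the two sample documents are made up for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to {doc_id: [positions]}, counting positions from 1."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, word in enumerate(text.lower().split(), start=1):
            index[word][doc_id].append(position)
    return index


def phrase_search(index, phrase):
    """Return ids of documents in which the words of `phrase` occur at consecutive positions."""
    words = phrase.lower().split()
    if not words:
        return []
    hits = []
    for doc_id, positions in index.get(words[0], {}).items():
        for start in positions:
            # Every following word must sit exactly one position further along.
            if all(
                start + offset in index.get(word, {}).get(doc_id, [])
                for offset, word in enumerate(words[1:], start=1)
            ):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical sample documents, chosen so the positions match the notes above.
docs = {
    1: "Life on the Mississippi river by Mark Twain",
    2: "The river and the Mississippi delta",
}
index = build_index(docs)
print(index["river"][1])        # positions of "river" in document 1
print(phrase_search(index, "Mississippi river"))
```

Both documents contain both words, but only document 1 has them next to each other, so only document 1 matches the phrase. Changing a record (Samuel Clemens to Mark Twain) would mean editing the source data and calling `build_index` again.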

This approach comes from the information retrieval community.

Basically, we are paying someone to run Lucene against a MySQL database and package it for us.

Services against the index.

  • Against an item
    • Tagging (one-click addition of subject)
    • Text the call number
    • Review
  • Against search results/index
    • Sort
    • Find more like this
    • Did you mean
    • Facet
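
As a rough sketch of what services against a result set look like, the Python below implements sorting, facet counts, and a naive "did you mean" over a handful of records. The records, field names, and function names are all invented for illustration; real products do this inside the indexer, and the spelling suggestion here simply uses the standard library's `difflib` rather than anything Lucene-specific.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical search results; the field names are assumptions for this sketch.
results = [
    {"title": "Life on the Mississippi", "format": "Book", "year": 1883},
    {"title": "Adventures of Huckleberry Finn", "format": "Book", "year": 1884},
    {"title": "Mississippi River Atlas", "format": "Map", "year": 1990},
]


def facet_counts(records, field):
    """Count how many records fall under each value of `field`."""
    return Counter(record[field] for record in records)


def sort_by(records, field):
    """Return the records ordered by `field`, lowest value first."""
    return sorted(records, key=lambda record: record[field])


def did_you_mean(term, vocabulary):
    """Suggest the closest indexed word to `term`, or None if nothing is close."""
    matches = get_close_matches(term.lower(), vocabulary, n=1)
    return matches[0] if matches else None


vocabulary = ["mississippi", "river", "twain"]
print(facet_counts(results, "format"))
print([record["title"] for record in sort_by(results, "year")])
print(did_you_mean("missisippi", vocabulary))
```

The point is that each service is a small computation over data the index already holds, which is why the products end up offering much the same list.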

One of the things we can do that Google can’t do is provide relevancy.

  • We know our communities
    • We have student information from the registrar; embed it, so that part of the reference interview is already inside the system
      • e.g., an English student searching for physics (no dissertations)
  • Dumb search engine – just (“Mississippi river”) or (mississippi AND river)
  • Smart search engine adds context-specific conditions to this gross query – large print, or an automatically added discipline subject
    • How do we get this information on the front end?
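
A minimal sketch of the dumb-versus-smart distinction, assuming the system already holds a small patron profile. The profile keys (`discipline`, `large_print`), the `topic_discipline` argument, and the field names `format` and `type` in the generated query are all hypothetical; the output only imitates Lucene-style query syntax.

```python
def dumb_query(terms):
    """Join the search terms with AND and nothing else."""
    return "(" + " AND ".join(terms) + ")"


def smart_query(terms, patron, topic_discipline=None):
    """Add conditions drawn from what the library already knows about the patron."""
    clauses = [dumb_query(terms)]
    if patron.get("large_print"):
        # Assumed profile flag: this patron prefers large print editions.
        clauses.append('format:"large print"')
    if topic_discipline and topic_discipline != patron.get("discipline"):
        # Searching outside one's own discipline: leave out dissertations.
        clauses.append("NOT type:dissertation")
    return " AND ".join(clauses)


english_student = {"discipline": "english", "large_print": False}
print(dumb_query(["mississippi", "river"]))
print(smart_query(["quantum", "mechanics"], english_student, topic_discipline="physics"))
```

The open question from the session remains: the sketch takes the profile as given, and says nothing about how that information reaches the front end.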

Other ideas

  • Tools that make slide shows out of image sets
  • Articles – go to the vendor and refuse its interface; ask it to send only titles, subjects, abstracts, and a static URL.
  • American literature – get 100 articles
    • make a tool that compares and contrasts them with a tag cloud (subject terms generated on the fly)
  • Data set – make a graph or chart
    • books – graph of when the books were published
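
The tag cloud and publication graph ideas both come down to counting. Here is a small sketch, with made-up article records and function names, of the numbers such tools would draw from; rendering the cloud or chart is left out.

```python
from collections import Counter


def subject_cloud(articles, top=5):
    """Return the most frequent subject terms with their counts (the tag cloud weights)."""
    counts = Counter(
        subject.lower() for article in articles for subject in article["subjects"]
    )
    return counts.most_common(top)


def books_per_decade(years):
    """Group publication years into decades, ready to plot as a bar chart."""
    counts = Counter((year // 10) * 10 for year in years)
    return dict(sorted(counts.items()))


# Hypothetical records standing in for a set of retrieved articles.
articles = [
    {"subjects": ["Realism", "Satire"]},
    {"subjects": ["Satire", "Slavery"]},
    {"subjects": ["Satire", "Realism"]},
]
print(subject_cloud(articles, top=2))
print(books_per_decade([1876, 1883, 1884, 1889]))
```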

Lucene is at the core of the current service options.

We know our community and can go further (Value added)

  • The library makes my life easier

Differences between products

  • VuFind doesn’t use the database to find stuff.
    • free as in a free kitten
  • Primo converts items into a proprietary XML format (PNX) and builds the index from that
    • designed to harvest from the net (OAI)

Q: What is open source?

A: You can see the code of the software. Closed source is like a car that I sell to you with the hood welded shut. Open source is a community process of software development.

Zebra is a good indexer but not as elegant as Lucene

Trend: proprietary databases are starting to allow their data to be indexed

  • This is happening in Europe. Publishers will allow metadata to be indexed, but not full text.

Evergreen, Koha, etc.: open source ILS

  • Is it realistic to expect this market to grow?
  • He is leaning towards that
  • What if more than half of implementations were open source based in five years (paying for support)?
  • Marshall Breeding keeps statistics on these things.



