originally: http://www.oclc.org/research/projects/frbr/algorithm.htm
FRBR (Functional Requirements for Bibliographic Records) is a 1998 recommendation of the International Federation of Library Associations and Institutions (IFLA) to restructure catalog databases to reflect the conceptual structure of information resources. This project is one of several FRBR-related OCLC Research projects.
More information about FRBR and related OCLC Research projects is available on the main FRBR Project page.
OCLC Research FRBR projects examine issues associated with converting a set of bibliographic records to conform to FRBR requirements (a process referred to as "FRBRization").
Chief scientist Thom Hickey led the development of computer algorithms to explore automating FRBR conversions. Hickey and his associates also clustered the entire 48-million-record WorldCat database at the 'work' level and created a number of subsets, including records representing works of fiction.
We encourage you to download the algorithm for use in converting your own bibliographic databases. Use of the algorithm is governed by the OCLC Research Public License, an Open Source Initiative-approved license.
Researchers made a copy of WorldCat that included holdings data and NACO authorities, and created a personal author file.
Records were processed in MARC Communications format after being converted to Unicode. Much of the early research investigated how best to divide a particular 'work' into its component 'expressions'. Unfortunately, this and other FRBR research has shown that the information in existing bibliographic records is, in general, insufficient to reliably divide a work into expressions, so this line of investigation has been abandoned for now.
Our research then focused on the seemingly simpler problem of collecting bibliographic records into groups corresponding to different works (such as Shakespeare's Hamlet). An algorithm was developed, based primarily on the authors and titles found in bibliographic records, to find works in the WorldCat database with a high degree of reliability. One major finding is that looking up authors and author/title combinations in the authority file has a significant positive impact on the matching of works.
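The grouping idea described above can be illustrated with a minimal sketch. It is not the OCLC algorithm itself: the record fields (`id`, `author`, `title`, `uniform_title`), the tiny `AUTHORITY` table, and the lowercase-and-collapse-blanks normalization are all assumptions made for the example. It only shows why resolving a name through an authority lookup before building the author/title key brings variant forms of the same work together.

```python
from collections import defaultdict

# Hypothetical authority lookup: variant name form -> established heading.
# A real authority file is far larger and also covers author/title entries.
AUTHORITY = {
    "shakespeare, w.": "shakespeare, william, 1564-1616",
    "shakespeare, william": "shakespeare, william, 1564-1616",
}


def norm(text):
    """Stand-in normalization: lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def work_key(record):
    """Build an author/title key, preferring authority and uniform-title forms."""
    author = norm(record.get("author", ""))
    # Replace the name with its established heading when the lookup succeeds.
    author = AUTHORITY.get(author, author)
    # Prefer a uniform title if the record carries one.
    title = norm(record.get("uniform_title") or record.get("title", ""))
    return (author, title)


def cluster(records):
    """Group record ids by work key."""
    works = defaultdict(list)
    for rec in records:
        works[work_key(rec)].append(rec["id"])
    return dict(works)


records = [
    {"id": 1, "author": "Shakespeare, W.", "title": "Hamlet"},
    {"id": 2, "author": "Shakespeare, William", "title": "hamlet"},
    {"id": 3, "author": "Shakespeare, William", "title": "Macbeth"},
]
clusters = cluster(records)
```

In this sketch records 1 and 2 land in the same group only because both name forms resolve to the same heading; without the lookup they would form two separate "works".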
Since the NACO authority normalization rules were used to simplify names and titles before matching, researchers investigated existing implementations of the rules. Discrepancies found between implementations led to the establishment of a public NACO normalization test-bed, which makes it possible for others to compare and verify their normalization routines against the one developed in this project.
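To give a feel for what this kind of normalization does, here is a rough sketch in the spirit of the NACO rules. It is deliberately simplified and should not be read as a conforming implementation: the function name `simple_normalize` is hypothetical, and the real rules include special cases (for particular characters, subfields, and punctuation) that are not handled here.

```python
import unicodedata


def simple_normalize(text: str) -> str:
    """Rough, NACO-inspired normalization (assumption: NOT the full rule set)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case so that comparison is case-insensitive.
    folded = stripped.upper()
    # Delete apostrophes outright so "O'Neill" stays one token.
    folded = folded.replace("'", "").replace("\u2019", "")
    # Turn every remaining non-alphanumeric character into a blank.
    folded = "".join(ch if ch.isalnum() else " " for ch in folded)
    # Collapse runs of blanks and trim the ends.
    return " ".join(folded.split())


examples = [
    simple_normalize("Hamlet, Prince of Denmark"),
    simple_normalize("Brontë, Charlotte"),
    simple_normalize("  O'Neill,   Eugene "),
]
```

Small differences in choices like these (which characters are deleted and which become blanks, for instance) are exactly the sort of thing that can make two implementations disagree, which is why comparing output against a shared test-bed is useful.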
Some of the more difficult records to group properly into works are those without authors or uniform titles. Many of these records match on title alone but actually represent different works. Work is continuing on exploring and exploiting information in bibliographic records to help establish reliable matches without bringing together unrelated records.