I would like to put to Conference a number of points on authority control. At its simplest and least controversial, the hypothesis is that authority control works better in theory than in practice. The extreme hypothesis is that authority control simply does not work. I suspect that the truth lies somewhere between the two. However, I am sure that we cannot continue to rubbish search engines and highlight the chaos of the web unless we can show that the claims that we make for the library catalogue and for authority control, with its controlled vocabularies for subjects, names and titles, can be substantiated. The alternative is to put in hand the changes that will make authority control what we claim it to be. Information management on the Internet needs authority control, but only if it works consistently and exhaustively. We spend an immense amount of money on setting up a structure to ensure that every contingency, however unlikely, is catered for by our authority control tools. We appear to spend very little on assessing whether the end user is getting the correct answer to his or her query.
We have in bibliographic control a wealth of experience to contribute. Although we have convinced ourselves that the methods we use are producing effective and efficient catalogues, we have not convinced our users. They continue to find the catalogue a complex tool to use but, what is worse, as I hope to demonstrate, they miss material that they should find, probably in much the same way that librarians say happens with search engines. The danger for librarianship is that our users are beginning to turn their backs on the library catalogue in favour of search engines.
Whether it has been forced on libraries by the vendors or by the cost of comprehensive links, the root of the problem is that the cross reference has been given a very low priority and in many important cases is missing altogether. We must ensure that the cross reference links headings which are synonymous and that only one of them is used as the main heading. For example, there are forty-one variations in the way Dostoyevsky's name can be spelt. It is rare for more than one or two cross references to be used. What is far worse, the user is likely to miss material because it is very common for more than one variation to be used as a main heading. With the computing power that is available today, catalogue users should expect every known variation of a catalogue access point to have a cross reference to the chosen authority control access point. The claim that authority control improves the precision of searches and provides collocation is only true when it is applied exhaustively. Dostoyevsky is a good example. There are many cases in which users who find entries under one variation of the name assume that these constitute the whole of the library's holdings of the author's works. See Case Studies 1 and 4.
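The principle of exhaustive cross referencing can be sketched in a few lines of code. What follows is a minimal illustration only, not a description of any existing system. It assumes a hypothetical in-memory authority file; the names AUTHORITY_FILE, normalise, build_cross_references and resolve are invented for the example, the choice of main heading is arbitrary, and the handful of spellings shown merely stands in for the full set of variations.

```python
import unicodedata

# Hypothetical authority file: one chosen main heading per author,
# followed by known variant spellings (an illustrative subset only).
AUTHORITY_FILE = {
    "Dostoyevsky, Fyodor": [
        "Dostoevsky, Fyodor",
        "Dostoevskii, Fedor",
        "Dostojewski, Fjodor",
        "Dostoïevski, Fédor",
    ],
}


def normalise(heading):
    """Fold case, accents, commas and spacing so trivial differences do not break a link."""
    decomposed = unicodedata.normalize("NFKD", heading)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(unaccented.casefold().replace(",", " ").split())


def build_cross_references(authority_file):
    """Map every known form, including the main heading itself, to the main heading."""
    index = {}
    for main_heading, variants in authority_file.items():
        for form in [main_heading, *variants]:
            index[normalise(form)] = main_heading
    return index


def resolve(heading, index):
    """Return the main heading for any known variant, or None if no link exists."""
    return index.get(normalise(heading))


index = build_cross_references(AUTHORITY_FILE)
print(resolve("DOSTOÏEVSKI, Fédor", index))
print(resolve("Tolstoy, Leo", index))
```

In this sketch every variant leads to the single chosen heading, so a search under any known spelling would collocate the same set of records. A form that is absent from the authority file returns nothing, which is exactly the broken link that the catalogue user never sees.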
It is probably true to say that the Internet heralds the second revolution in information. The first information revolution took place half a millennium ago with the invention of printing. This dramatically speeded up the way in which information was disseminated. However, it took mankind over five centuries to capitalise fully on the ability to transfer information and to devise methods of information retrieval. This is perhaps a misnomer, as we never retrieved information. What we retrieved were the containers of information. The Internet, which is responsible for the second information revolution, will have an impact as great as the invention of printing but will differ from it because the effects will not take half a millennium to be fully felt: the second information revolution is taking place almost while we watch. In the area of bibliographic control it has opened a number of doors. It is now possible to access nearly every important catalogue through the Internet, and many of them belong to clumps or clusters which can be accessed as a group because they are linked by the Z39.50 protocol. What is not clear is how bibliographic control can influence information management in an Internet environment. The library catalogue is only one of a number of information sources, along with search engines, Internet directories and online databases. It is important that a means of integrating them all is given high priority.
I am fortunate in having worked on and helped to develop a new type of tool which makes it possible to test whether authority control is working. This is BOPAC2. It is not my purpose to use this Forum to advertise the advantages of BOPAC2, but I need to describe briefly the tool that I am using. BOPAC2 is not in itself an OPAC but it adds value to existing OPACs. It is able to download quite large retrievals: up to five hundred records, for example, can easily be downloaded within a reasonable time. Once downloaded, the retrieval can be displayed and examined in a number of ways, as I will demonstrate in the Case Studies that I give. Movement from one display to another is very fast.
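One way of examining a downloaded retrieval can be sketched as follows. This is not BOPAC2's own code, which is not described here, but a minimal illustration under stated assumptions: the records are invented, each is assumed to carry a "heading" field, the whole retrieval is assumed to relate to a single author, and the function names heading_counts and is_split are hypothetical.

```python
from collections import Counter

# Invented sample records standing in for a downloaded retrieval
# that is assumed to relate to one author only.
retrieval = [
    {"heading": "Dostoyevsky, Fyodor", "title": "Crime and Punishment"},
    {"heading": "Dostoevsky, Fyodor", "title": "The Idiot"},
    {"heading": "Dostoyevsky, Fyodor", "title": "The Brothers Karamazov"},
]


def heading_counts(records):
    """Count how many records are filed under each form of the heading."""
    return Counter(record["heading"] for record in records)


def is_split(records):
    """True when more than one form of the name is in use as a main heading."""
    return len(heading_counts(records)) > 1


for heading, count in heading_counts(retrieval).most_common():
    print(heading, count)
print("Split across headings:", is_split(retrieval))
```

Where a retrieval for one author is filed under more than one heading, a user who searches under only one of them will see only part of the holdings.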
I am sure that what I have written is going to be controversial, as it attempts to question the viability of authority control, which is one of the cornerstones of modern cataloguing. I do this not because I believe that authority control is not important but because I believe that it is not working effectively. I believe that Conference, as a matter of priority, should be looking at ways in which it could be made to do the things that we claim for it. It is a long time since I catalogued a book and I am not pretending that I approach this subject with a cataloguer's viewpoint or expertise, but I think that I am looking at it from the catalogue user's point of view and with twenty years of research into various aspects of bibliographic control.
It seems that somewhere along the line the cross reference and the see also reference have been sidetracked, and OPACs on the Internet rarely, if at all, provide links via the cross reference and the see also reference. This is in spite of the fact that our complex standards provide for their use. Why is this? I suspect that there are two reasons. The first is that the standards themselves are so complex. Cataloguers tend to be perfectionists, and rules for descriptive cataloguing legislate for every possibility even when our users are not even aware that the possibility exists. The second is economic. It costs money to ensure that a cross reference is in place. No one disputes that cross references are useful. There seems to be little awareness that they are essential if we are to ensure that authority control works properly. All the work that we put into authority control is wasted without a comprehensive set of cross references to ensure that our users are not misled into believing that the heading that they have chosen produces all the material that they are looking for. It is a pity that cataloguers have not appreciated that the main heading is more important than the main entry.
The card catalogue usually provided the cross references and the see also references that enabled users to move around the library catalogue in much the same way that links enable users to surf the Internet. However, having provided the structure that creates these references, the attitude seems to be "there are too many of them" and "where do you draw the line?". This attitude must be wrong when at least one of the search engines is indexing over a billion web pages. Both the search engine and the library catalogue need authority control to ensure that there are no breaks in the links which join like to like and related to alternative search headings. It is no use the library profession claiming that it has powerful tools in the see and see also references which the search engines lack and then neglecting to ensure that its own catalogues always use them. We cannot claim, as many librarians often do, that the Internet is a chaotic muddle of information compared with the discipline of the catalogue. Priority must be given to putting this right. My case studies show that catalogue users are misled into thinking that the entry point that they have chosen has produced all the material held by the library. The power and the potential of the cross reference is that, combined with authority control, it could eliminate this. Perhaps more important, if this could be done for the library catalogue it could also be done for the search engine. A first step would be to produce an OPAC powered by a search engine, which is what we propose to examine in further research with BOPAC2.
Many cataloguers, if pressed, will probably say that authority control works better in theory than in practice. I believe that these case studies have shown that in a proportion of cases it simply does not work at all. There is a mass of literature on authority control, most of it saying how valuable it is, but very little of it demonstrates how effective it is and how much it costs. As far as costs are concerned, there is ample evidence that it is one of the most expensive components of the cataloguing operation. Librarians are often quoted as saying that the Web is chaotic. This is in spite of the fact that its relevance ranking often brings the required information to the top of the retrieval, that it searches a large amount of material very quickly and that users find it much easier to manipulate than the library catalogue. However, it lacks authority control, which is perhaps the most important contribution that the library catalogue has to offer to information management. But it has to be made to work properly. The links in the form of cross references and see also references must be comprehensively applied. As a matter of urgency, Conference should give priority to ways in which the cross reference and see also reference can be brought back fully into our automated catalogues.
All of my case studies can be tested fully by anyone with access to the Internet on BOPAC2 using the methods that I have described. The evidence is there for anyone to see that authority control does not always work. My case studies, I believe, do provide valid evidence.
This is not meant to be a negative or destructive exercise. My object is to highlight a problem in the systems that we use to construct our catalogues. Authority control is too important to abandon. We need to put in place the changes that will make it do the things that we claim that it does. There are a number of steps that should be taken and taken as a matter of urgency.