9 Conclusions & recommendations

9.1 Introduction

This chapter starts with a brief discussion of users' current expectations from interactive computer systems. Sections 9.2 - 9.4 bring together some of the results of the evaluation described in Chapter 6, for each of the three devices which were under test. We try to give answers to the evaluation questions listed in Section 8.1.

It is impossible to do meaningful research in the design of online catalogues, or other interactive computer systems for untrained users, without a testbed system which is finished to quite a high standard. The system cannot be used for the collection of realistic and representative data unless users perceive it as being a proper tool for the job - the job, in our case, being the location of books on given subjects in a library. Four or five years ago this would not have been true, but by now the great majority of users have had previous experience of interactive computer programs - computer games on home micros, fruit machines, cash dispensing machines, viewdata systems and online catalogues. Users expect these programs to reach certain standards of acceptability, suitability and performance. A few years ago almost any online catalogue in a library was greeted with enthusiasm, and users tended to blame themselves for failures (see, for example, [1, Appendix 5]). This is no longer true. None of our interviewed users said anything which suggested they might think their failure was connected with the way they searched rather than the way the computer processed their search (admittedly we did not ask them this). Nor was there any comment to this effect in the suggestion books by the terminals.

The easiest way to evaluate our devices with respect to fairly crude, but "objective", measures of recall and precision would have been to implement them in a retrieval system specifically designed for the repetition by experimenters of searches collected from the use of a real system.
This would have avoided all the complications of presenting the devices acceptably in an online catalogue, which has to be extremely simple to use. Before the proposal was submitted we knew, from repetition of searches from Okapi '84 logs, that some degree of automatic stemming and the use of cross-reference tables would be beneficial. We also knew the extent of the problems caused by miskeyings and misspellings (although we did not know whether attempted automatic correction would be worth the overheads). What we did not know was how the devices should be presented. The investigation of presentation, of methods of incorporating the devices in an online catalogue, was the most important part of the research.

About two person-years was spent on program design and programming during this project, although some of this was work towards a relevance feedback system (the subject of another project). We did not have as much time for the collection and analysis of data as we would have liked. We do have a wealth of transaction log data - from some 5000 sessions at the time of writing - mostly from use of the EXP system. There are several research projects using our data and systems which would make suitable topics for Masters' dissertations in library or information science. These include linguistic analyses of search statements, the investigation of different matching rules in spelling correction, and a study of weighting schemes and cut-off rules in ranked output searching. Further details will be given on request.

9.2 Stemming
9.2.1 Weak stemming

Table 8.6 shows that weak stemming caused the CTL system to find more records than the 0STEM system in 74 (48%) of 155 initial searches. In four of the 74 the extra records were all or mostly false drops, and in six the results were mixed. An example of erroneous conflation at the weak stem level is the case of "skiing" and "sky", which are both transformed to "ski".

We have not had time to attempt an analysis of the types of search which are improved by weak stemming. The proportion of the above searches which fail completely on 0STEM is, however, small. In other words, weak stemming rarely turned a search from a complete failure into a success. We know there are such searches (ACCOUNTS DICTIONARY), but they do not seem to be very common. There appear to be two reasons for this - the combinatorial search, and the fact that both subject headings and titles are indexed, with many of the PCL records having both British (PRECIS) and American (LCSH) subject descriptors. The extensive indexing means that many records contain, say, the singular form of a word in the title and its plural form in a heading. There are also many records with both British and American spellings.

Because the combinatorial search often results in the retrieval of records which do not contain all the words of the search, it is difficult to estimate the proportion where a real user would not have found any relevant records. This particularly applies to two-term searches where, if neither word is very common, all records containing either of the words will be retrieved. A good example is ABORTION ACTS. Because of the weak stemming this finds three records under "Abortion Act" on CTL. These are, of course, displayed first (followed by the remaining records under "abortion"). 0STEM finds the 158 records indexed by one of "abortion" and "acts". The 16 records under "abortion" come out first, because "abortion" is less common than "acts", so the user might still come across the three "Abortion Act" records. (In a previous experiment [2] we used the rule, for comparison purposes, that a search is to be counted a failure if no relevant record appears in the first ten which are displayed.)

It is clear, though, that with weak stemming a high proportion of searches find more relevant records than they do without stemming. Of the 64 cases where CTL found more than 0STEM, 33 (2 of which were all false drops) contained more records of maximum possible weight (i.e. records which would have been retrieved if the words were combined using AND).
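The display order described for ABORTION ACTS can be illustrated with a small sketch. It assumes, purely for illustration, that a term's weight is log(N/n), where N is the number of records in the file and n the number of postings for the term, and that a record's score is the sum of the weights of the search terms it contains. The weighting formula, the toy index and the function names are assumptions and are not taken from Okapi.

```python
import math
from collections import defaultdict

def term_weight(n_postings, n_records):
    # Assumed weighting: rarer terms score higher (inverse document frequency).
    return math.log(n_records / n_postings)

def combinatorial_search(index, terms, n_records):
    """Rank records by the summed weights of the search terms they contain.

    index maps a (stemmed) term to the set of record ids posted under it.
    Records need not contain every term, unlike an implicit AND search.
    """
    scores = defaultdict(float)
    for term in terms:
        postings = index.get(term, set())
        if not postings:
            continue
        w = term_weight(len(postings), n_records)
        for rec in postings:
            scores[rec] += w
    # Highest score first; ties broken by record id for a stable display order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Toy index: "abortion" is rarer than "act", so it carries more weight.
toy_index = {
    "abortion": {1, 2, 3},
    "act": {1, 4, 5, 6, 7},
}
ranked = combinatorial_search(toy_index, ["abortion", "act"], n_records=100)
```

In this toy file the record containing both stems is displayed first, then the records posted only under the rarer stem, then the rest, which is the pattern described above.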
9.2.2 Spelling standardisation

In the searches which behaved better on CTL than on 0STEM there is only one word ("advertising") which is affected by spelling standardisation. A random sample of 68 words extracted from the (much) larger set of logged searches collected up to mid-February 1987 contained three examples: "behaviour", "advertising" and "color". This suggests that there is an appreciable proportion of such words in real searches. The words "(re)organis(z)ation(s)" occurred 48 times in 7700 searches. The bibliographic file contains 675 books under "organisation" and 648 under "organization"; in 50 of them both forms occur.

Despite the examples in 6.2.11 (shoe --> she etc) we found no real search where the effect of spelling standardisation might have been detrimental. The mapping of "oe" to "e" should be conditional on the weak stem being at least five letters long (to avoid poet --> pet). "Poetry" (which occurred five times in 7700 searches) should be treated as exceptional if there is strong stemming which might conflate it with "petri(fied)". Amme --> am is occasionally contentious - "program" is a homograph in American English, and in only one of its meanings is it synonymous with the British or French "programme". This mapping might better be done with strong stemming than with weak. Ism --> ist, which is really stemming not spelling standardisation, should be incorporated with weak stemming, as almost all "ism/ist" pairs are very closely related in meaning, but there are a few words (like "organism/ist") which must be treated as exceptional.

9.2.3 Strong stemming

Table 8.6 shows that the EXP system found more records than CTL in 53 (34%) of 155 initial searches. In nine of these cases the extra records were false drops, and in six the results were mixed. In 17 of the searches (13 good), EXP showed an "absolute" improvement over 0STEM - i.e. CTL obtained the same result as 0STEM, but EXP retrieved more records. These results are partly due to the effect of the go/see list (9.4). Strong stemming affected the result in 37 of the 53 searches, beneficially in 22 and with mixed or bad results in the other 15. Summarising, strong stemming led to better results than weak stemming alone in 14% of the 155 initial searches, and to mixed or worse results in 10%. Table 8.7 shows a few of these searches.

Clearly, strong stemming is not always safe. Strong stemming alone, applied to all searches, would be disastrous. On the other hand, it behaves well more often than it behaves badly. We would guess that it is rarely detrimental when applied to searches of three or more terms. When combined with weak stemming we would tentatively recommend its use in a combinatorial system like Okapi, provided strong stems are always given lower weight than corresponding weak stems. This would ensure that records retrieved on strong stems would usually be displayed after records retrieved on weak stems. Such records would certainly not be offered as "matching your search exactly", as can happen in the present system. (Okapi '86 is supposed to ensure that strong stems have lower weight than weak stems (see 6.5.3), but the procedure which assigns weights was never properly finished.)

The question of improved stemming procedures must be considered. The most ambitious of the schemes mentioned in Chapter 3 is the MARS project (3.3.8), but we have not seen detailed enough material to make an assessment of its suitability for this type of application. Practically all of the schemes referred to in Chapter 3 would, for example, conflate "organisation" and "organism". Even "industry" and "industrialisation" should not be blindly conflated. One possibility would be to limit possibly contentious stemming to words which produce stems which occur only rarely in the bibliographic file. Although there would still be false drops the user could be warned that not all the records match very well, and there will not be many books indexed under the contentious stems. Such a system would also need a transparent way of showing the user why it found the records. Highlighting of the relevant stem is the obvious answer, but, as pointed out in 7.6.1, this is not particularly easy to achieve.

To make an improved system for strong stemming it would be necessary to use a considerable number of conditional rules and a dictionary of words (not stems) to which the rules apply (cf UNITED in our weak stemming procedure). The dictionary would be consulted before applying "blind" stemming. Two examples of such rules are "If word is ORGANISATION or ORGANISER or ORGANISABILITY [etc] stem it to ORGANIS" and "If word is ORGANISM or ORGANIST or ORGANIC leave it unchanged".

9.2.4 Answers to the questions on stemming

1

Does it [stemming] significantly increase recall? If so, for what types of search? In particular, how often do stemmed searches succeed where they would fail without stemming?

It does significantly increase recall; we have made no investigation of the second question, and the answer to the third is "not very often".

2

Does stemming significantly decrease precision or lead to false drops?

Weak stemming does not lead to a marked decrease in precision, but strong stemming does.

3

How does the use of both strong and weak stemming (EXP system) compare with weak stemming only (CTL system)? For example one might find that there are, on average, fewer rephrasings of searches on EXP than on CTL.

There was no significant difference in the mean number (just under two) of searches per session between EXP and CTL.

4

Does the EXP system's two-level merge (6.5) make any difference (except to decrease search speed)?

The two-level merge avoided the need for the construction of search sequencing rules (of the form "repeat the search using strong stemming if it fails with weak stemming"). It enables the user interaction to appear pleasantly simple. More generally, it is a technique which allows the use of implicit OR relationships in ranked-output searching. An informal description of the merge procedure can be read between the lines of Section 6.5. The actual merge algorithm will be published elsewhere. It is available on request.

5

Is there a case for using strong stemming only? If so, should this apply to all searches, or only to those containing more than a certain number (two, say) of terms?

Strong stemming only would be almost insupportable in a general catalogue. It may be acceptable for systems which access small collections of specialised material, but we are not concerned with such systems here.
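As an illustration of the word-specific rules proposed in 9.2.3, the following sketch consults a dictionary of whole words before falling back on "blind" stemming. The dictionary entries are the ones named in the two example rules; the suffix list, the minimum stem length and all function names are hypothetical stand-ins, not Okapi's strong stemming procedure.

```python
# Hypothetical exception dictionary, keyed on whole words (not stems).
# Entries follow the two example rules; the rest of the sketch is assumed.
STEM_EXCEPTIONS = {
    "organisation": "organis",
    "organiser": "organis",
    "organisability": "organis",
    "organism": "organism",   # leave unchanged
    "organist": "organist",   # leave unchanged
    "organic": "organic",     # leave unchanged
}

# Stand-in for a real "blind" strong stemmer: strip the first matching suffix.
BLIND_SUFFIXES = ("isation", "ation", "ism", "ist", "ic", "er", "s")

def blind_stem(word):
    for suffix in BLIND_SUFFIXES:
        # Keep at least three letters of stem (an arbitrary assumed limit).
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def strong_stem(word):
    word = word.lower()
    # Consult the dictionary first; fall back to blind stemming.
    if word in STEM_EXCEPTIONS:
        return STEM_EXCEPTIONS[word]
    return blind_stem(word)
```

With this stand-in stemmer, blind stemming reduces both "organisation" and "organism" to the same stem, while the dictionary lookup keeps them apart.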
9.2.5 Recommendations on stemming

Weak stemming is undoubtedly beneficial. In fact, it is inexcusable for it not to be provided in a keyword catalogue. Even weak stemming procedures should be improved by using a rather small dictionary of exceptions (ours consists of the single word "united"). Alternatively, searches could be processed using two levels - no stemming and weak stemming, with the weak stems given lower weights than the "raw" words. The former makes lighter demands on computing resources, but someone has to invest a good deal of intellectual effort in constructing an exception table (which might need contextual information).

Although spelling standardisation only affects a small proportion of searches (in our subject areas) it costs almost nothing to incorporate it with weak stemming, and its effect should be almost entirely beneficial. With the possible exception of "amme" it should be used at any level of search.

It is doubtful whether really good results can be obtained with strong stemming unless it does use a fairly large set of word-specific rules. However, it is on balance better than nothing, until we have better indexing (9.7) and better linguistic processing.

9.3 Spelling correction

6

How effective is EXP's semi-automatic correction procedure? How does it compare with users' response to CTL's 'CAN'T FIND' message? (Figs 7.5 and 7.6).

This was answered in 8.7.4, where Table 8.9 shows that users' treatment of words which are not known to the system is almost certainly better if spelling correction is applied than if it is not. On the EXP system 78% were handled "well", against 64% on the CTL system where the user has to type a replacement. Nevertheless there is scope for improving the correction procedure, which appears to be able to correct only about half the misspellings.

There does not seem to be a serious rival, using current hardware, for a two-stage process comprising soundex or n-gram similarity matching followed by a string similarity check of the user's word against the list selected at the first stage. For systems like online catalogues where it is undesirable to present the user with a choice of replacements, soundex is probably preferable to an n-gram technique (5.4). (We did not have time to experiment with n-grams, but the research done by Willett and others, and the SPEEDCOP team (5.4.2) probably renders further experiments unnecessary.)
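A minimal sketch of such a two-stage process follows. The first stage uses a simplified form of the classic Soundex consonant classes (without the special treatment of "h" and "w", and without zero padding); the second stage uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure. Neither stage reproduces the procedure of 6.4 or the "anagram" measure of Appendix 2, and the cutoff value and function names are assumptions.

```python
import difflib

# Classic Soundex consonant classes (simplified: no special h/w handling).
_CLASSES = {c: d for d, letters in {
    "1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r",
}.items() for c in letters}

def soundex_key(word, length=4):
    """First letter plus the class codes of later consonants, truncated."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    key = word[0]
    last = _CLASSES.get(word[0], "")
    for ch in word[1:]:
        code = _CLASSES.get(ch, "")   # vowels and h, w, y have no code
        if code and code != last:
            key += code
        last = code
    return key[:length]

def suggest(word, dictionary, cutoff=0.7):
    """Stage 1: select words sharing the key. Stage 2: keep the most similar."""
    key = soundex_key(word)
    candidates = [w for w in dictionary if soundex_key(w) == key]
    best, best_score = None, cutoff
    for cand in candidates:
        score = difflib.SequenceMatcher(None, word.lower(), cand).ratio()
        if score > best_score:
            best, best_score = cand, score
    return best   # None means no replacement is offered
```

Returning a single word or nothing reflects the constraint that the user should not be presented with a choice of replacements.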
9.3.1 Recommendations and discussion

Semi-automatic spelling correction should be used in online catalogues. The procedure described in 6.4 is not unsatisfactory. It should be improved by

1 Weakening the selection of candidate replacements so that the correct replacement is more likely to appear in the output from this stage. We suggest a procedure very similar to the original Soundex: truncate at four or five characters and ignore vowels (other than initial vowels). If this gives rise to some very long lists, then it can be tightened by retaining the first two letters unchanged (see Section 8.7.1 for some evidence that this would not markedly decrease recall). The treatment of misspellings (as opposed to miskeyings) would be somewhat improved if some attention were given, in coding the consonant structure, to the treatment of consonant groups such as "ng" (treat it as belonging to the same class as "n") and "dg" (treat it like "g").

2 Ensuring that the dictionary contains as many as possible of the words which are actually used, by incorporating words from a very large number of real searches. This means that the dictionary would contain words which the system will recognise but which do not occur in the bibliographic indexes. The catalogue must be able to report 'No books under "brimstone"' and give the user options of starting a new search, ignoring the word or entering a replacement (cf Fig 7.5). It must not offer "brainstem" (Table 8.8). This would show the user that the system recognises the word, but has nothing indexed under it. (Okapi '86 can do this, but only for go/see terms which have no postings.)

The preceding paragraph leads naturally to the suggestion that all the user words should be looked up in the dictionary. Since more than 80% of users' words are likely to occur in the dictionary (Table 6.1 shows 15.8% of a large sample were misspellings or "rubbish"), it is not efficient processing to do this if the dictionary is separate from the index. But an ordinary inverted index designed for the retrieval of postings lists cannot be searched in such a way as to retrieve lists of candidate corrections for a misspelt word. Hence the dictionary should be partially duplicated in the index: the index would contain all recognisable words even if they do not occur in the bibliographic source data. This would not seriously increase the size of an inverted index, because most of the indexing storage is occupied by postings lists rather than by terms.

Finally, the system should be augmented by the inclusion of a small set of common and unambiguous misspellings, which should be directly mapped to their corrected versions. This is better implemented by putting such words into the cross-reference table (9.4) than into the spelling subsystem.

3 Our procedure for measuring similarity (Appendix 2) should be improved. Much work has been done on this, under such headings as "string similarity measures", but we could not find any procedure which looked outstandingly good without being computationally complex. It looks as if increasing sophistication leads to diminishing returns. We chose the "anagram" method because it is easy to implement, but such cases as the thacher --> teacher example show that it is not good enough (8.7.1). We have not given any further consideration to this. It is one of the minor research topics suggested in 9.1.
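The second recommendation can be sketched as an index in which a recognised word may have an empty postings list. The data, the status labels and the function name are hypothetical. For brevity the sketch keeps the direct misspelling mappings in a small separate table, although the text above suggests holding them in the cross-reference table instead.

```python
# Hypothetical index: an entry with an empty postings list marks a word
# which the system recognises but under which no books are indexed.
INDEX = {
    "government": [12, 58],
    "brainstem": [377],
    "brimstone": [],          # recognised word, no postings
}

# Small table of common, unambiguous misspellings mapped straight to
# their corrections (illustrative entry only).
DIRECT_CORRECTIONS = {"goverment": "government"}

def look_up(word):
    """Return (status, word, postings) for one user word."""
    word = word.lower()
    word = DIRECT_CORRECTIONS.get(word, word)
    if word not in INDEX:
        return ("unknown", word, [])      # candidate for spelling correction
    postings = INDEX[word]
    if not postings:
        return ("no_books", word, [])     # report: No books under "word"
    return ("found", word, postings)
```

Because "brimstone" is present in the index, the lookup reports that there are no books under it and never reaches the correction stage, so "brainstem" is not offered.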

9.4 Cross-reference tables - the go/see list

This contains some 230 sets of terms. Some of the sets consist of a single phrase which is to be treated as if it were a word. Others contain more than one item, and have the effect of causing a search for any one of the items to retrieve records indexed under any of them. There is an extended discussion of the types of term in the list in 6.3. The list itself, designed for our particular user population, is given in Appendix 5. A summary of results combined with answers to the questions of 8.1 is given in the next section.
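The two kinds of set can be sketched as follows. Each set is a group of equivalent terms; a single-member set marks a phrase to be handled as one unit. The example entries, the longest-match rule and the function names are assumptions for illustration and do not describe how Okapi embodies the list in its index.

```python
# Hypothetical go/see list: each set holds terms treated as equivalent.
# A single-member set marks a phrase to be handled as one unit.
GO_SEE_SETS = [
    {"uk", "united kingdom", "great britain"},
    {"information technology"},
]

def build_lookup(sets):
    """Map every member term to the whole set it belongs to."""
    lookup = {}
    for members in sets:
        for term in members:
            lookup[term] = members
    return lookup

def expand_query(query, sets=GO_SEE_SETS, max_phrase_len=3):
    """Split a query into units, preferring the longest go/see phrase at each
    position, and return for each unit the set of terms to be ORed together."""
    lookup = build_lookup(sets)
    words = query.lower().split()
    units, i = [], 0
    while i < len(words):
        for n in range(min(max_phrase_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in lookup:
                units.append(set(lookup[phrase]))
                i += n
                break
        else:
            # No go/see entry starts here: the word stands alone.
            units.append({words[i]})
            i += 1
    return units
```

Each position in the query costs at most `max_phrase_len` table lookups, so the total grows in proportion to the number of words in the query.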
9.4.1 Answers to the questions on cross-reference tables

7

How often does it [the table] make any difference? Does our list contain appropriate entries? How should one compile such a list for a given environment?

About a quarter of 1087 searches contained a member of the list (8.8). The terms which were actually used are given in Table 8.10. An examination of searches suggests a number of additional entries, such as contract law = law of contract, because "contract" AND "law" gives about 100 postings, a considerable proportion of which are false drops due to false coordination. We drew up the list after a study of past searches by users of the same library. This appears to be a good way of doing it. It could be expanded greatly by the use of search data from other disciplines.


8

Does the list lead to false drops ('us' [pronoun] = 'United States')?

We have found no example of a false drop arising from the use of a go/see term. One of the largest groups of go/see entries is that consisting of abbreviations and acronyms linked to the spelt out forms in which they are more likely to be given by cataloguers. People like to make acronyms which are homographic and suggestive, such as Okapi (elusive, long gestation period). This is all right in ordinary written language, because of context and (decreasingly) the use of upper case. It is a serious problem for information retrieval systems. We can put "US" in the list because the pronoun "us" is rare in bibliographic (subject) data and very unlikely in search statements. But we cannot put "AIDS" in the list. "IT" might be considered for inclusion. The pronoun "it" occurs some 200 times in titles, but this can be stopped. In 7,700 searches of Okapi '86 there were seven occurrences of "it" or "IT", and three of "information technology". All the occurrences of "it/IT" in searches intend the pronoun (e.g. INDUSTRIAL REVOLUTION WAS IT A REVOLUTION). To cope with words like AIDS and IT, there has to be a new type of object in the list (see below).

9

Should there be more than one type of object in the list (e.g. see alsos as well as sees)?

Clearly a catalogue should have ways of offering see also references. This is rather outside the scope of the present project. The AIDS and IT examples show that there should be a third type of object - sets of homographs. These are like multi-valued see references: aids - see aids (role 1) or aids (role 2).

9.4.2 Recommendations on cross-reference tables

Our application has proved successful across a fairly wide range of subject areas. We suspect that compiling entries for the hard sciences would be relatively easy as there is, on the whole, less ambiguity. Existing thesauri are a rich source of material. On the other hand, the problem is certainly greater in the humanities. Any list requires constant maintenance to reflect language changes. Since lists of our type are far smaller than, say, subject authority files, such maintenance would not lead to the problem of scale which is one of the reasons why indexing and classification languages tend to lack currency. We recommend that an extended cross-reference list should be compiled, and that this be maintained by merging entries from contributing libraries.


The use of tables increases complexity and computational demands both when indexing and when searching. It is trivial to extract either individual words or entire "headings" from source text or from users' input. But procedures for automatically checking all embedded subphrases as candidate index or search terms are much more complicated. Attention needs to be given to the design of efficient algorithms. The number of lookups is proportional to the number of words in the text being processed. It is certainly necessary to hold the table in quick access memory. (We avoided this problem at search time - the list is in effect embodied in the index (6.3). Okapi '86 has only restricted knowledge of the list when it is performing searches. It would not, for example, be able to explain to the user that when it is looking up "UK" it is also looking up "United Kingdom", "Great Britain", etc. It may be thought desirable that the system should be capable of explaining itself.)

HOMOGRAPHS

MORPHS (4.2.1) handles some homographs, albeit in a fairly narrow subject area. The searcher for AIDS could be asked

    Please explain "aids"

    Do you mean
      1  "aids" - devices for helping, or
      2  Acquired Immune Deficiency Syndrome

    Type 1 or 2

The problem here (apart from that of compiling the list) is one of identifying the meaning of the word in the bibliographic record. In most cases, e.g. "China", this has to be done manually. Programs would have to be written to enable indexers to run a MARC file against a homograph list. It would pick out candidate words and show them in their context, and prompt the indexer to select the appropriate role.
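A program of the kind envisaged might, as a first step, pick out each occurrence of a listed homograph and show it in context. The sketch below does only that. It takes plain (record id, text) pairs rather than parsing MARC, and the homograph list, the role descriptions for "china" and the function name are all hypothetical.

```python
import re

# Hypothetical homograph list: word -> the roles an indexer can choose from.
HOMOGRAPHS = {
    "aids": ["devices for helping", "Acquired Immune Deficiency Syndrome"],
    "china": ["the country", "porcelain"],
}

def homograph_candidates(records, homographs=HOMOGRAPHS, width=3):
    """Yield (record_id, word, context, roles) for each homograph occurrence.

    records is an iterable of (record_id, text) pairs; real MARC parsing is
    deliberately left out of this sketch.
    """
    for record_id, text in records:
        words = re.findall(r"[A-Za-z']+", text)
        for i, word in enumerate(words):
            key = word.lower()
            if key in homographs:
                # Show up to `width` words on either side of the candidate.
                context = " ".join(words[max(0, i - width): i + width + 1])
                yield record_id, key, context, homographs[key]
```

An interactive front end would display each context line with the numbered roles and record the indexer's choice against the record.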

9.5 Users' perception of and behaviour with the system

10

What sort of conceptual models do users have of the system? How do they think it works? Is it comfortable to use? Is it exciting or boring or silly?

A study of user behaviour is outside the scope of the present project. However, as pointed out in 9.1, it is essential to do research of this type on a catalogue which does not behave in such a way as seriously to confuse, surprise or irritate users.

From the comments given in 8.4.3 - particularly from the fact that most users did not offer any comments - and the (remarkably few) entries in the suggestion books, it is fairly obvious that Okapi '86 behaves in such a way that it is taken for granted by most users. Most users regard it as being neutral. A substantial proportion seemed to think it rather or very good compared with other manual or computer catalogues which they had used.

Some users certainly notice that it "does words separately", and there was even one favourable comment about this. So long as a search succeeds, word searching is doubtless acceptable - catalogue users do not mind how the system works if it seems to find the right books. When searches are unsuccessful, the system is criticised as being "unintelligent" ("it only looks for keywords - doesn't analyse the search"). There were at least three interview comments, and several more in the suggestion books, to this effect.

Most users do not expect a catalogue to be exciting, or even interesting. Catalogues are taken for granted and regarded purely as tools which are to be used without the necessity of applying much in the way of forethought or initiative. We think that most users see Okapi '86 as a tool which is at least as effective as other catalogues. However, we believe that online catalogues may come to be regarded as multi-function power tools rather than as spades.

11

Does it give a dangerous impression of cleverness or of infallibility?

A significant minority of searches were of the type (which we classified as Q - see 8.3.6) exemplified by the search BY WHAT MEANS ARE WE EDUCATED FOR SEXUAL INEQUALITY IN WORK. It is unlikely that users would try to look for such phrases as headings in a conventional catalogue. Okapi invites the user to "Type a word or phrase which describes the books you want". Many Q-type searches do satisfy this description. They are descriptive of the books the user wants.
But they do not describe the books in a way which concords with the way books are described in bibliographic records. It is very difficult to think of any concise prompt which would inhibit people from entering this type of search.

There are two ways of tackling this problem. By far the simplest is to use a stop list which includes a wide range of function words and pronouns. Some Q searches then work quite well. But many will still fail because they tend to be far too specific (neither SEXUAL INEQUALITY nor INEQUALITY AT WORK finds more than a handful of books in the catalogue, and SEXUAL INEQUALITY AT WORK, surprisingly, finds nothing but false drops). A better approach may be to use some simple linguistic analysis to try to identify "inappropriate" searches, and suggest to the user that he or she might try something rather less specific.

9.6 Applicability of our findings

All the Okapi research has been aimed at investigating techniques which are possible now, using existing resources. That is, they could appear in commercial applications within five years or so. Some have already appeared - not necessarily as a result of the Okapi research (notably the use in keyword searching of combinatorial merging instead of implicit AND; this was first implemented - in a catalogue - in CITE). The outcome of the present project is no exception. All the devices could be implemented now in a commercial system without demanding more in the way of hardware resources than what is normally available for integrated library systems.

However, although catalogue access is the most computationally demanding facet of an integrated system, from the design and programming point of view it is only a small portion of the whole. The design of catalogue access facilities has to be done so that it is compatible with the demands of cataloguing and acquisitions (which need rapid updating of files and indexes) and of circulation control. This makes catalogue access very much more difficult for the commercial designer than it is for us, who do not have to link to circulation status, and who update files only occasionally, and offline.

Another very important point is that, to avoid extended development times, commercial designers nearly always have to work within the constraints of languages, operating systems and database management systems which were designed long before the days of interactive computing for casual users. We do not use any existing system software. Okapi depends only on the pre-existence of four primitive input and output functions. Much system software offers very tempting easy-to-program facilities (sorting and merging, the automatic extraction of words from text, automatic maintenance of indexed-sequential files). Of the system software which we have come across none is quite good enough to provide more than a just-acceptable compromise. Some system software will do the job, but will not do it efficiently enough. An example is index lookup. In Okapi we can be fairly prodigal with this. A search of four words may involve eight or ten lookup operations, including weak and strong stems and perhaps an attempt to match a possibly misspelt word. On Okapi, lookup rarely takes more than two disk accesses, because we can optimise the file structure to suit lookup rather than updating. A typical lookup in a commercial database system takes three to five or more accesses.

Of the devices treated in this report, it should be fairly easy to graft a single level of stemming onto most keyword systems. A few systems already allow the use of limited automatic cross-referencing. We have shown that this facility is worth using. Spelling correction systems are a little more ambitious, but they are not very demanding on storage, and they have the advantage that the dictionary, once constructed, does not need much maintenance.

We hope libraries will demand systems in which the search ACCOUNTS DICTIONARY does not fail when the library holds books with titles like "A dictionary of accounting" and subject headings like "Accountancy - dictionaries". If they do not make these demands on suppliers they are not meeting their responsibilities to users.

9.7 Concluding remarks

Although Okapi '86 is a relatively good subject search system given the content of bibliographic records, it is, by absolute standards, rather poor. Fourteen of 122 sessions reported in Table 6.1 and Section 6.4.2 failed although the library held probably-relevant material. Seven failed because, although the searches were quite comprehensible, the language did not match that of the catalogue well enough to be picked up by any of our devices. Four failed because they were too specific. Only two (STERLING when the user wanted "sterling shares and gold" and BRITAIN AS A DEVELOPING COUNTRY for "Economic development of Britain in the 16th century") failed because the search did not describe the user's needs.
Almost all these searches would have succeeded, and many more searches which did not completely fail would have given better recall and probably better precision, if our records had proper analytical indexing using contents pages and added free language descriptors. The efficacy of such enhanced indexing was demonstrated long ago [3]. The time is long overdue for a large scale test of analytical indexing in a ranked-output system.

Much research effort has been put into the investigation of ways of making inadequately described records accessible. Is it not better to attack the problem by improving the quality and richness of the access points to bibliographic files?


References

1 MITEV N N, VENNER G M and WALKER S. Designing an online public access catalogue : Okapi, a catalogue on a local area network. (Library and Information Research Report 39). London : British Library, 1985.

2 JONES R M. Improving Okapi : transaction log analysis of failed searches in an online catalogue. VINE 62, May 1986, 3-13.

3 SYRACUSE UNIVERSITY. INFORMATION STUDIES DEPARTMENT. SUBJECT ACCESS PROJECT. Books are for use : final report of the Subject Access Project to the Council on Library Resources. Pauline Atherton, Director. Syracuse University, 1978.