3 SYSTEM DESCRIPTION

3.1 Introduction

This chapter describes the retrieval systems which were used for the evaluation. In a previous report [WALK87b] we separated "Design and implementation" from "System description". One reviewer [HANC88] suggested that it would have been easier to read if the two had been combined. In the present report we have done this, although this makes rather a long chapter.

Three systems were used in the evaluation experiments. These are referred to as "dumb" or "D", "qe" or "Q" (for "query expansion") or "intermediate", and "full" or "F". The dumb system is similar to the "exp" Okapi system described in [WALK87b, Ch 7], performing a "best match" search on words, phrases and stems extracted and derived from users' input. It has the added capability of allowing the user to compile a list of selected books to be printed out. The qe system has all the features of the dumb system together with a query expansion option in which it offers to "Look for books similar to" those already chosen. The full system has all the features of the qe system and an additional "shelf browsing" option. This enables the user to browse through books in Dewey class number order - roughly the order in which they would appear on the shelves - using the book most recently selected as a starting point.

The three experimental systems are very similar to each other in the sense that it is not always immediately obvious which system one is using. They differ mainly in the browsing facilities provided. We also occasionally refer to a hypothetical "production" system - one that might be used with real users in a real library. All the systems are embedded in one program which can be run with different parameters to produce various search systems.

3.2 The bibliographic database

Each of the systems searches an index of keys derived from title-like and subject heading (Library of Congress and PRECIS verbal feature heading) fields of PCL's MARC monograph catalogue.
This index also contains corporate and conference names from author fields. There is an index of Dewey numbers which is used by the qe and full systems.

Okapi bibliographic files are derived from UK MARC tapes. Much information is discarded during this process, including physical description and statements of responsibility. However, because they are intended for subject searching they contain far more information than the minimum recommended by the Centre for Catalogue Research report [SEAL82]. In particular they contain all names, most of the title information and all subject headings. Some punctuation is added to enhance display, and there are a few special characters which indicate the type of indexing required or the presence of non-ASCII characters which may not be displayable. The bibliographic and index files are described in more detail in Appendix 1.

3.3 Hardware and software

The search and indexing programs are written in C, and run on Sun 3/50 processors under Sun Microsystems' Release 3.4 of their version of the 4.2 BSD UNIX operating system. Many of the functions and data structures are similar to previous versions of Okapi (which were written in Z80 assembly language and ran on 8-bit machines). Each Sun processor would probably handle upwards of eight simultaneous users, although no serious testing has been done. The terminals are of a monochrome VT220-compatible type using 25 lines of 80 character spaces, with some use of a line-drawing character set to make windows and boxes.

3.4 Dialogue style

All commands are single keystrokes. The user only has to type the initial letter of the appropriate option and this is always highlighted and underlined for emphasis. An effort has been made to avoid side effects resulting from typing more than the initial letter, but this does not always work very well when the terminal is connected over a network.
A large proportion of computer users are used to screen editors and other programs where commands do not have to be terminated, and we have found that most other people quickly become accustomed to the single keystroke style. The system is actually looking for a minimum unambiguous abbreviation, but the command names have all been chosen so that they differ in the first character whenever they are valid at the same time. "D" is used both for "display" and "down", but these are never simultaneously valid, and anyway have closely related functions, the "down" command resulting in a new display screen containing the "next" records.

If an invalid command is entered it is echoed and remains on the screen for a few seconds or until another key is pressed. Occasionally an invalid command produces a warning message: a "down" command can produce a little box or "snippet" containing "You're already at the end - try something else".

3.5 Obtaining and processing user input

3.5.1 Obtaining input

As the three systems are designed for subject searching only, there is no choice of search types such as author, title, subject. The first screen the user sees is the search input screen (Fig 3.1). This is clearly headed "SUBJECT SEARCH" to deter people from trying to look for specific items by author and/or title. A single sentence tries to suggest what the system does. This description is simplified but, in stating that the system looks for books including "all (or most) of your words in their titles or subject descriptions", it should give enough information to be useful in deciding how to phrase a search. The term "subject description" is doubtless unfamiliar to most users but it has a degree of transparency. "Subject heading" might be preferable in the United States. It is important to convey that the system searches on information other than the title alone.
The catalogue will accept a search statement consisting of a single word, a list of related words, a "natural language" phrase or even a Dewey number. However, it simply asks for "a word or a phrase which describes the books you want" as most people seem to find it easier and more natural to type a fairly coherent phrase than a string of unconnected terms. The input area is delineated by a rectangular box which will take up to 76 characters of input; experience has shown that this is almost always enough.

To start the search process it is necessary to press the Return key. People familiar with multi-user computer systems may do this without being reminded, so there is no permanent prompt. If after a few seconds the user appears to have finished typing and has not pressed Return, a message (or "snippet") appears just beneath the input box (Fig 3.1). Messages which appear suddenly and unpredictably are more likely to be noticed than ones which are part of the scenery.

3.5.2 Input preprocessing

The search is now put into a form in which it can be parsed into constituent words and subphrases to be looked up in the indexes. Punctuation and spacing are tidied up, letters are put into lower case and most non-alphanumeric characters removed. This produces a pre-processed search statement which is comprehensible but not always exactly what the user typed. For example "Post-war theatre & cinema actors organisations in the U.S.A" becomes "postwar theatre and cinema actors organisations in the usa", and "statistics,regression,correlation" becomes "statistics regression correlation".

The pre-processed search is then disassembled into words. Regular English plurals and "ing" endings are removed and spellings are adjusted to remove many of the differences between American and British English, and the search is reassembled into a phrase. The example now becomes "postwar theater and cinema actor organization in the usa". This is purely an internal representation of the search.
It is not shown to the user, who would not normally need to know that these minor adjustments have been made. There is a fuller description of this pre-processing in Appendix 2. (Note that a similar process is applied to the text of the bibliographic records during indexing.)

3.5.3 Parsing and lookup

Then the look-up screen appears (top of Figs 3.2 - 3.4). Constituents of the pre-processed search statement are looked up in a database which contains the systems' rather meagre linguistic knowledge. This database is known as the "go/see list", or gsl. The gsl contains five classes of entries. These are (1) stop words like "the" and "a", (2) common prefixes, (3) classes of terms which are to be treated as synonymous ("usa", "united states", "united states of america"; "child", "children"), (4) phrases which behave like single words ("soap opera") and (5) "dubious" words and phrases which although they cannot be stopped are not to be given much weight in comparison with other terms. Examples of terms in class 5 include "introduction", "lecture notes", "system", "theory". There are fuller details of the construction and use of such a gsl in an earlier version of Okapi in [WALK87b, Chapter 6].

All the components of the pre-processed search statement are then looked up in the gsl to identify stop words, words and phrases which have synonyms, phrases to be treated as words and "dubious" terms. For example, the search "TV commercials and soap opera" will have been pre-processed to "tv commercial and soap opera". "Tv" finds a match in the gsl (it is equivalenced to "television"), so the first constituent term is "television" and its synonyms. "Commercial" is not in the gsl, so this is the second term. "And" is in the gsl as a stop word. Finally "soap opera" is matched as a phrase.

As soon as a term has been identified and looked up in the indexes the result is shown to the user.
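The pre-processing and gsl look-up steps described in 3.5.2 and 3.5.3 can be illustrated with a small sketch. This is not the Okapi code (which is written in C); it is a minimal Python approximation, and the function names, the tiny tables and the exact stemming rules are all assumptions made for illustration. Common prefixes (gsl class 2) and Dewey numbers are not handled.

```python
import re

# Hypothetical miniature go/see list, populated only from examples in the text.
STOP_WORDS = {"the", "a", "and", "in", "of"}
SYNONYMS = {"tv": "television"}              # term -> head of its synonym class
PHRASES = {("soap", "opera")}                # phrases that behave like single words
DUBIOUS = {"introduction", "system", "theory"}
SPELLING = {"theatre": "theater", "organisation": "organization"}

def preprocess(text):
    """Lower-case the input and tidy punctuation and spacing."""
    text = text.lower().replace("&", " and ")
    # Drop hyphens, full stops and apostrophes inside words: "post-war" -> "postwar".
    text = re.sub(r"(?<=[a-z])[-.'](?=[a-z])", "", text)
    # Any other non-alphanumeric character becomes a space.
    text = re.sub(r"[^a-z0-9 ]", " ", text)
    return " ".join(text.split())

def stem(word):
    """Remove regular plurals and "ing" endings, then normalise spelling (assumed rules)."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return SPELLING.get(word, word)

def normalise(text):
    """Return the internal representation of a search statement."""
    return " ".join(stem(w) for w in preprocess(text).split())

def parse(text):
    """Split a search into (term, is_dubious) pairs using the gsl tables."""
    words = normalise(text).split()
    terms, i = [], 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in PHRASES:                  # phrase treated as one term
            terms.append((" ".join(pair), False))
            i += 2
            continue
        word = words[i]
        i += 1
        if word in STOP_WORDS:               # stop words are dropped
            continue
        term = SYNONYMS.get(word, word)
        terms.append((term, term in DUBIOUS))
    return terms
```

With these assumed tables, `parse("TV commercials and soap opera")` yields the three constituent terms "television", "commercial" and "soap opera" discussed above.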
If the term is found the system displays a line like '343 books under "tv"' (Figs 3.2 - 3.4). If the term cannot be found in either the gsl or the indexes there is a message like 'can't find "televsion"' (Figs 3.2 and 3.3). If it is in the gsl but not in the indexes (the system can "know" terms which do not appear in any bibliographic record) there is a message like 'No books under "haitian"'. In the two last cases the user is forced either to replace the word or to instruct the system to ignore it; when the user has taken the appropriate action the search restarts with the message at the top of the screen showing the revised query.

Few other reference retrieval systems force the user to make a decision about terms which the system does not know. In the case of keyword systems with implicit AND this usually leads to "zero hits" (although there are systems which recover to the last "non-zero" result). In other "best match" systems such as LIBERTAS there may be a non-zero result but it will often be a misleading one. Users often do not notice that a term has not been found. Misspellings and miskeyings are frequent. Our way of dealing with this problem is simple to implement and makes a significant contribution to efficiency and usability.

(A previous Okapi system would sometimes try to suggest a replacement. We concluded that this was worth doing, even if it did suggest "teacher" for "Thacher". The rather tight matching procedure was less profligate with absurdities than are some of the commercially available spelling checkers. It managed to suggest a replacement rather more than half the time, and more than three-quarters of the suggestions were correct. We did not have time to incorporate this feature in the systems described here.)

3.6 Search processing

3.6.1 Term weighting and the merge

Each term is assigned a weight which depends inversely on the number of records which contain it - a rare term is worth more than a common one.
Following Croft and Harper (2.2) and many others we use one of the formulas

    term weight = log(N / n)

or

    term weight = log((N - n) / n)

where n is the number of postings for the term and N is a constant which must be larger than the number of postings for the most highly posted term in the search. For the experiments we used N = 32768 for all searches. The most highly posted term (the class containing "great britain") had about 12,000 postings. Terms of "dubious" subject content (see 3.5 above) then have their weights (arbitrarily) reduced by 50%.

The weight of a retrieved record is the sum of the weights of those of the query terms by which it is indexed. A "maximum possible" weight is calculated. This is the weight of a record containing all the terms of the query, a record which would be retrieved by a system performing an implicit Boolean AND on the search terms. Two threshold weights are then assigned - a weight above which a record will be deemed to match "quite well", and a weight below which a record will not be retrieved at all. In general these thresholds are set at two-thirds and half the maximum possible weight. These values are arbitrary but well tried and seem to be reasonable. They could be lowered somewhat for searches containing more than three or four terms.

Searches containing only one or two terms are treated differently. Previous experience has shown that up to two-thirds of all searches contain one or two terms. In a single term search the weight of the term is of course irrelevant: a record either matches or not. Two term searches are treated as follows. Terms are either "rare" or "common". A term is rare if it would not be too tedious for the user to look at all records containing the term. For the experiments the rarity threshold was set to 64. Any other term is common. For all two term searches both terms have to be present for a record to match "well".
The setting of the lower threshold depends on whether the terms are (1) both rare, (2) one rare and one common or (3) both common.

TWO RARE TERMS
Example: "vebe consistometer". All records containing either term are retrieved.

ONE RARE TERM AND ONE COMMON
Example: "the psychology of biscuits". All records containing the rare term are retrieved.

TWO COMMON TERMS
Example: "economic history". Records must contain both terms.

When counting the number of terms a "dubious" term is not supposed to be counted. There are some errors in the program which is supposed to deal with this, so some searches containing dubious terms do not quite behave as expected.

When term weights have been assigned the postings lists for all the terms are merged. The output from the merge consists of all the postings for records which have a weight of at least the lower threshold (minimum acceptable weight). These are counted and sorted in order of decreasing weight. This is the order in which records will be shown, if the user chooses to display records.

3.6.2 Initial search results and options

On completion of the merge, the system summarises the results of the search (Fig 3.4, bottom half). The messages give some indication of how well the records may be expected to match the search. There are never more than two lines in the results message, which can indicate how many match the search "well" or "fairly well" and how many have been found altogether. If there are no records of "good" weight the message may say "n books found, but they don't match your search very well".

Options are displayed beneath the results message (Fig 3.4). "Display" is always at the head of the list (unless no items have been found). It is set apart from the other three options, encouraging the user to look at the books the system has found. Display is not discouraged even when the system has found a large number of records of good weight.
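The weighting, threshold and merge rules of 3.6.1 can be summarised in a short sketch. It is an illustration in Python rather than the Okapi implementation; the function names are hypothetical, the first of the two weight formulas is used with natural logarithms, a term is assumed to be "rare" when it has at most 64 postings, and the 50% reduction for "dubious" terms is left out. Every term is assumed to have at least one posting, since the system makes the user replace or drop unknown terms before searching.

```python
import math
from collections import defaultdict

N = 32768         # constant used for all searches in the experiments
RARE_LIMIT = 64   # rarity threshold; "at most 64 postings" is an assumption

def term_weight(n):
    """Inverse-frequency weight log(N / n) for a term with n postings."""
    return math.log(N / n)

def thresholds(weights, counts):
    """Return (lower, upper): minimum acceptable weight and "matches well" weight."""
    total = sum(weights.values())              # the "maximum possible" weight
    terms = list(weights)
    if len(terms) == 1:
        return total, total                    # a record either matches or not
    if len(terms) == 2:
        rare = [t for t in terms if counts[t] <= RARE_LIMIT]
        if len(rare) == 2:
            lower = min(weights.values())      # either term is enough
        elif len(rare) == 1:
            lower = weights[rare[0]]           # the rare term must be present
        else:
            lower = total                      # both common: both must be present
        return lower, total                    # both terms needed to match "well"
    return total / 2, 2 * total / 3            # general case

def merge(postings):
    """postings: dict term -> set of record ids. Returns (hits, number matching well)."""
    counts = {t: len(recs) for t, recs in postings.items()}
    weights = {t: term_weight(n) for t, n in counts.items()}
    lower, upper = thresholds(weights, counts)
    scores = defaultdict(float)
    for term, recs in postings.items():
        for rec in recs:
            scores[rec] += weights[term]       # record weight = sum of its term weights
    eps = 1e-9                                 # guard against floating-point rounding
    hits = [(w, rec) for rec, w in scores.items() if w >= lower - eps]
    hits.sort(key=lambda h: -h[0])             # decreasing weight
    good = sum(1 for w, _ in hits if w >= upper - eps)
    return hits, good
```

For a made-up two-term search with one rare term (47 postings) and one common term (1204 postings) sharing three records, `merge` returns the 47 records containing the rare term, with the three records containing both terms ranked first.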
With the dumb system it might be sensible to permute the order of the options according to results, and make suggestive prompts about trying "a more specific search" (many records) or "a different way of expressing your search" (very few records). The qe and the full systems are intended to help the user alter the breadth and focus of a search as it progresses. Even so, there was no significant difference between the systems in the number of searches per session.

The other options are presented in a separate group. "New" goes to the screen of Fig 3.1b, which displays up to three previous search statements in a session, to remind users of what they have already tried. "Edit" goes to a screen which is the same as Fig 3.1b except that the input box contains the previous search with the cursor positioned two spaces after the end of it, so that the search can conveniently be repeated, added to or truncated back. "Quit" takes the user to an empty initial input screen (Fig 3.1).

3.7 Record displays

3.7.1 Brief record display

When given the "display" command the system shows a screenful of brief bibliographic records, each one taking up a single line on the screen (Fig 3.5). The display is headed "LIST OF BOOKS". Up to nine records are displayed on the screen. Each record contains title, author, Dewey number and publication date, displayed in fixed length fields under column headings. Any field except date may be truncated. The title field is 33 characters long so most titles are truncated. In some cases the truncated title could mean that a user fails to identify a possibly useful item. However, a truncated field is always transparently indicated by the two dots immediately following it.

This type of fixed field display probably has more advantages than disadvantages. It is almost certainly quicker to scan than the two-line layout used by one or two of the commercial systems, and it carries more information per screen without being difficult to read.
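The fixed-field brief record line can be sketched as follows. Only the 33-character title width and the two-dot truncation marker come from the description above; the other field widths, the field names and the function names are assumptions chosen so that a line fits an 80-column screen.

```python
# Field widths: the title width is from the text, the others are assumed.
WIDTHS = {"title": 33, "author": 18, "dewey": 10}

def fit(value, width):
    """Pad a field to its width, or truncate it and mark the cut with two dots."""
    if len(value) <= width:
        return value.ljust(width + 2)      # leave room where the dots would go
    return value[:width] + ".."

def brief_line(number, rec):
    """Format one record (a dict of strings) as a single numbered display line."""
    fields = [fit(rec[name], width) for name, width in WIDTHS.items()]
    return f"{number} " + " ".join(fields) + " " + rec["date"]
```

The date field is never truncated, matching the description; every other field takes up a constant width whether or not it was cut, so the columns stay aligned.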
Single-line displays with variable length fields separated by slashes have been used in some systems. They carry more information, but are unappealing and difficult to read.

RECORD NUMBERING

Brief records are numbered with single digits 1 to 9. Ten or eleven records would fit quite comfortably on a 25 line screen, but this is not compatible with the single keystroke command style. (A few systems have used letters for line identification instead of numbers. Most people are much quicker at finding numbers than letters on the keyboard. This problem can be alleviated - as can many others - by using a mouse or tracker ball for command selection.) If a set of records extends over more than one screen the numbering starts again from 1. There are arguments for continuous numbering, but this again is not compatible with single keystroke commands. In any case, unbounded browsing displays such as the shelf browsing display in the full system (3.10, Fig 3.13) cannot be numbered continuously. Whenever the user is looking at a "finite" set of records the systems display a message of the form "Books 10 to 18 of 33" in the top right hand corner of the screen.

ORDER OF DISPLAY

Records are displayed in descending weight order. Records with equal weights are displayed in reverse order of publication date (most recent first) and within this by author and title. There is no sorting of records (apart from by weight) when the system is operating. The reverse date order is achieved by keeping the bibliographic file in what is in effect pre-sorted order. When the weight of displayed records crosses one of the threshold weights there are messages "the rest of the books may match your search less well" or "the rest of the books may not match your search very well".
One of these messages can be seen in Fig 3.5, where the first three records contain both the words of the search but all the rest contain only the word "insecticide" (records under "environment" alone have not been retrieved because it is a common word, with more than 1200 postings).

CONTEXT INFORMATION AND OPTIONS

Other important information is displayed at the top and bottom of the screen (Fig 3.5). Context information is at the top. In the top lefthand corner is the heading which tells the user that this is a "LIST OF BOOKS". The search statement is displayed immediately underneath this. In the top righthand corner is the message indicating the current position in this list of books.

The information at the bottom of the screen tells the user what can be done next. Immediately beneath the record display window is the message which encourages users to look at further details for particular items: "Type its number to see if a book is relevant". The word "number" is highlighted and refers to the reference number from 1 to 9 which is assigned to each record. In some previous systems we have used "Type its number for fuller details of one book", and in others there was no brief display at all. For the experiments described here it was essential that users should indicate which records they considered relevant. Brief records are unreliable indicators of relevance. We did consider using full records only, as in the systems described in [WALK87b].

The list of valid commands is given at the foot of the screen. If there are more than nine records the user can go "down" to see the other books which have been retrieved. Having gone down a screen the user can come back up again and can move freely up and down the list. Other options, including the umbrella "Restart/new search/quit", are discussed in later sections.
3.7.2 Full record display

On responding to the prompt to "Type its number to see if a book is relevant", the user obtains a "full" bibliographic record for that book (Figs 3.6, 3.7). While the record is being formatted for display it is "indexed" - that is, terms are extracted from subject-rich fields, including Dewey number, just as they were when the file was being indexed. These terms are, to start with, associated only with the current record. They are used to enable the systems to highlight those terms in the record which caused it to be retrieved. (In systems like Okapi where records may be retrieved by terms which are not literally in the search this highlighting is not very straightforward. It cannot be done, as it is in some systems, by string-searching the text of the record. Again, there are some faults in the program, and highlighting does not work very reliably in the experimental systems.)

In the dumb system these extracted terms have no further use, but in the qe and full systems they may be used to expand the current search. If the user answers "yes" to the relevance question (see below) the extracted terms are added to a list which contains the terms extracted from all the chosen records. This list is used as the source of search terms if the user asks the system to look for "More books like the ones you have chosen".

RECORD CONTENT

The display shows the author and other associated names, main, series and part titles, publisher, date of publication, and subject headings. Each field or field group is on a separate line and is clearly labelled. Information regarding number and location of the copies held, along with the shelfmark, is given at the foot of the record.
This location information was redundant (except perhaps for the Dewey number) during the experiments, because subjects were not asked to consider availability, merely to choose a list of books. It would be essential in a production system.

As in the brief display, context information is given at the head of the screen. This comprises the display heading, search statement and the position occupied in the list of retrieved items by this particular book as "Book 1 of 26" (Fig 3.6). Terms in the record which are included in the user's search statement are highlighted so that the user may see why the item has been retrieved.

3.8 Obtaining relevance judgments

At the foot of the full record in all three systems users are presented with a question which they must answer before proceeding. In the dumb system the user is asked "Would you like the computer to remember this book in case you want to look at it again?". After two or three records have been seen this rather ponderous question is shortened to "Remember this book?". If the user answers "y" the system adds the item to a list of chosen books which can be reviewed (3.11) and which is printed out at the end of the search.

In the qe and full systems the question is different: "Is this at all the sort of book you are looking for?" (wording suggested by Professor S E Robertson of City University). As with the dumb system the item is added to a list of chosen books if the user's response to the question is "y". However, the qe and full systems also use the subject-rich fields from chosen records in their query expansion facilities and the full system can use a chosen record as a starting point for shelf-order browsing.
The phrasing of the question in these two systems reflects this attempt to build up a profile of the user's area of interest.

In all three systems, if the user answers "no" to the question at the foot of the full record the system returns to the original display of brief records. A rejected record is marked by a single star in this and all subsequent displays. A record chosen as relevant is marked with two stars. The stars are not explained, and some of the experimental subjects did not grasp their meaning.

3.9 Query expansion

The primary object of this research was to take some steps towards testing the efficacy of query expansion. The general idea is that terms extracted from records chosen by the user as relevant are added to the terms derived directly from the user's query, and a new search carried out.

3.9.1 Term extraction and weighting

Figs 3.6 and 3.7 show the user judging two books relevant to the search for "insecticides and the environment". The system adds the terms it has extracted from title, subject and class number fields to the list of terms which it will use if the user requests query expansion. This list already contains the query terms "insecticide" and "environment". It now also contains potentially useful terms such as "insect", "control", "organochlorine", "persistent", "pollutant", "pesticide", "environmental" and two Dewey numbers, as well as a few unhelpful terms like "economy", "natural" and "selection". Against each term is the number of relevant records in which it has occurred. This number is referred to as r, and is used in re-calculating term weights before an expanded search is done.

As soon as at least one record has been chosen as relevant on the Q or F systems an additional prompt, "Type More to look for books similar to the one(s) you chose", appears on the brief record display screens (Fig 3.8).
If the user selects this option, weights are assigned to the terms in the term list and the "best" terms are selected and used for a new search. The formula used for weighting is based on the F4 formula of Robertson and Sparck Jones given in 2.2. The formula is

    log( ((r + 0.5) / (R - r + 0.5)) / ((n - r + 0.5) / (N - n - R + r + 0.5)) )

where N and n are as given above in 3.6.1 for the original search,
R = number of records chosen as relevant, and
r = number of chosen records which contain the term.

When R and r are zero - that is, before any records have been judged relevant - the formula gives substantially the same weights as the initial weighting rule given above. Its general effect is to increase the weight of terms which occur in most of the relevant records and decrease the weight of those which tend not to occur in relevant records.

Example:

Original search term weights (N = 32768):

    Term          postings (n)   weight
    insecticide   47             103
    environment   1204           26

The most highly weighted terms after the records shown in Figs 3.6 and 3.7 have been judged relevant (N = 32768, R = 2) are listed below in descending weight order (the row alignment of the numeric columns is uncertain):

    Terms:                   organochlorine, 668.651, chlorine, insecticide, persistent, 632.35042, pollutant, pesticide, biochemical, insect, compound, organic, environment, selection
    Postings (n):            2 2 3 26 7 1 1 28 S3 67 118 138 204 1204
    Rel recs with term (r):  2 331 2 1
    Weight:                  144 144 136 127 123 116 102 90 83 81 78 73 70 66

After further records have been retrieved and chosen, the Dewey number 632.35042 eventually reaches the top of the weight table, followed by "pesticide". This Dewey number represents undesired effects of pesticides and their control, and it is used to classify nine of the thirteen books eventually chosen in this search (Fig 3.15).

The system selects the most highly weighted terms from the term list and performs a new search. A parameter determines the maximum number (up to 32) of terms used for query expansion.
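The relevance weighting formula of 3.9.1 is easy to check numerically. The sketch below is a plain Python transcription of the formula as given, using natural logarithms; the function name is hypothetical, and no attempt is made to reproduce the scaling of the integer weights shown in the example tables.

```python
import math

def relevance_weight(n, r, R, N=32768):
    """Weight of a term with n postings, found in r of the R chosen records.

    N is the collection-size constant used in the experiments.
    """
    relevant_odds = (r + 0.5) / (R - r + 0.5)
    other_odds = (n - r + 0.5) / (N - n - R + r + 0.5)
    return math.log(relevant_odds / other_odds)
```

With R = r = 0 the expression reduces to log((N - n + 0.5) / (n + 0.5)), which is very close to the initial rule log((N - n) / n). A term with a single posting that appears in one of two chosen records still outweighs a term with 1204 postings that appears in both, which illustrates the bias towards rare terms mentioned in 2.2.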
Some informal experiments were done to determine a reasonable range of values for this parameter. If set too low, at 4 for example, expansion searches tended to retrieve just a handful of records indexed by a few rare terms (one of the criticisms of the Robertson/Sparck Jones formula is that it gives comparatively high weight to rare terms even if they have appeared in only one relevant record - see 2.2). At the other extreme, using a large number of terms seemed to give generally good results, but with decreasing returns, and searches using dozens of terms need computing resources which might be unacceptably high in a live situation. There was a noticeable but not very significant difference between the results obtained with 16 and with 24 terms. The parameter was (fairly arbitrarily) set to 16 for the experiment. The term selection process is described in the following subsection.

3.9.2 Selection of terms for query expansion

The term list is sorted by descending term weight. Starting at the top of the list terms are selected provided (1) the user has not already seen all the records indexed by this term, (2) the term is not in a list of "dubious" words ("introductory", "fiction", etc) (see 3.6), (3) the term weight is positive and (4) fewer than 16 terms have been selected.

Each of the selected terms is then looked up, and a merge performed just as in the original search. This process may take some seconds. As in the original search, a threshold weight is applied below which a record will not enter the output set.
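The four selection conditions of 3.9.2 can be sketched as a single pass over the weighted term list. This is an illustrative Python sketch, not the Okapi code; the data layout (a dict mapping each term to its weight and the set of record ids it indexes) and all names are assumptions.

```python
def select_terms(term_list, seen, dubious, max_terms=16):
    """Pick expansion terms from term_list: dict term -> (weight, record ids).

    seen is the set of record ids the user has already looked at;
    dubious is the set of low-value terms that are never selected.
    """
    chosen = []
    ranked = sorted(term_list.items(), key=lambda item: -item[1][0])
    for term, (weight, records) in ranked:
        if len(chosen) >= max_terms or weight <= 0:
            break                  # list is sorted, so nothing further qualifies
        if term in dubious:
            continue               # condition (2)
        if records <= seen:
            continue               # condition (1): every such record already seen
        chosen.append(term)
    return chosen
```

Because the list is sorted by descending weight, the loop can stop at the first non-positive weight rather than testing every remaining term.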
The threshold for an expansion search is based on the weights which would be achieved by the records already chosen by the user, if they were retrieved now. In other words, the new records must have a substantial amount of indexing in common with those previously judged relevant. Normally, some or all of these records will come out at or near the top of the list in the new set of records. Since they have already been chosen they are removed from the new set. Any records which have been previously rejected by the user are also removed. However, the new set often contains records which have already been seen in brief, but neither chosen nor rejected. Sometimes there will now be no records remaining in the new set, particularly when the expansion has been based on just one richly indexed record.

3.9.3 Query expansion search - screen display and options

A window opens overlaying the record display from which "More" was selected (Fig 3.9). If there are no records of high enough weight in the new set there is a message "No more books found". Unlike in the original search the system cannot realistically guess how close any of the new records are to what the user wants, so it merely reports that the most similar books should appear first (unless there are very few, in which case it is silent). If there are some new records but the user chooses not to see them the "More" option will still be available, and simply retrieves the same set again.

Fig 3.10 shows the two records found by query expansion. The display is similar to other brief record screens, but is headed "LIST OF BOOKS similar to the one(s) you chose", beneath which is a reminder of the original search.

The first book is on the analysis of waters for insecticides. Although it is certainly relevant it is perhaps rather specialized. This record has no less than eleven subject headings. Its retrieval, at the head of the list, illustrates the tendency which almost all keyword systems have to retrieve the most extensively indexed records.
Sparsely described records, which may be recognizable as relevant to a knowledgeable user, are rarely retrieved. Browsing titles in classification order should help in the retrieval of some of these records (3.10).

Fig 3.11 shows the user selecting the second record. The terms by which this record has been retrieved are highlighted (apart from the Dewey number). Selection of this record will add the useful term "pollution" to the list, and will increase the weight of a number of other good terms.

3.10 Shelf order browsing

In the full system, as soon as a user has judged a record relevant (from the original set of records or from a set resulting from query expansion) the system asks a second question: "Would you like to see books shelved near this one?". If the user replies "n" the system returns to the screen of brief records from which the full record was chosen. If the user replies "y" the system displays a screenful of brief records in classmark order. An example is shown in Fig 3.13. The book from which the display was chosen is in the centre of the display window (record 5) and is of course marked with two stars to indicate that it has already been chosen as relevant. Using the "Down" and "Up" options the user can browse indefinitely forwards and backwards in the classmark list. Records already seen in full are marked with one star if the record was rejected or two stars if the record has been chosen. Records may be selected from this shelf-order display in exactly the same way as from the original brief display.

Records in the classmark display contain classmark, title, author and publication date in that order. As with other brief displays, the first three fields may be truncated. Apart from the fact that the classmark is the leading element, the structure of the displayed record is the same as in the initial display of brief records. Records are numbered 1 to 9 on each screen.
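The windowing behaviour described above (nine records per screen, the chosen book in the centre as record 5, paging with "Up" and "Down") can be sketched as follows. This is only an illustration under stated assumptions: the shelf is taken to be an in-memory list already sorted in classmark order, the function names are hypothetical, and the sketch stops at either end of the list, which the text does not describe.

```python
from typing import List, Sequence, Tuple

SCREEN_SIZE = 9  # records per screen, as in Fig 3.13

# Hypothetical minimal record: (classmark, title)
Record = Tuple[str, str]


def shelf_window(shelf: Sequence[Record], start: int,
                 size: int = SCREEN_SIZE) -> Tuple[int, List[Record]]:
    """Return (start, records) for one screen, keeping the window inside the list."""
    last_start = max(0, len(shelf) - size)
    start = max(0, min(start, last_start))
    return start, list(shelf[start:start + size])


def open_at(shelf: Sequence[Record], chosen_index: int,
            size: int = SCREEN_SIZE) -> Tuple[int, List[Record]]:
    """First screen: put the chosen book in the centre of the window if possible."""
    return shelf_window(shelf, chosen_index - size // 2, size)


def page(shelf: Sequence[Record], start: int, direction: int,
         size: int = SCREEN_SIZE) -> Tuple[int, List[Record]]:
    """Move one screen down (direction=+1) or up (direction=-1)."""
    return shelf_window(shelf, start + direction * size, size)


if __name__ == "__main__":
    shelf = [(str(600 + i), "Book %d" % i) for i in range(30)]
    start, screen = open_at(shelf, 14)
    for number, (classmark, title) in enumerate(screen, start=1):
        print(number, classmark, title)
```

With nine records per screen the chosen book lands on record 5 whenever there are at least four records before it on the shelf; near the start of the list it appears higher up the screen.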
The options below the display window include "Type Back to return to your previous list of books". "Back" returns to the display from which the classmark option was chosen.

Context information is given at the head of the screen. As the number of records the user can look at is not limited it is not possible to give an exact position in the list as with other brief record displays. The display is headed "LIST OF BOOKS classified near" the classmark of the book from which the display was entered, with the original search statement shown below this message, as in the display of the results of a query expansion search.

3.11 Other options

Other facilities which are always available from a brief display are "quit", "new search" and "edit search". There are two further options which are only available if at least one record has been judged relevant. These are "Print" and "View". "Print" does what it says - sends the chosen records to the system printer. It did not appear on the experimental systems as they were set up to print automatically on completion of a search. The "View" option was introduced at the suggestion of some of our advisers. They felt that users would forget what they had chosen and hence there should be an easy way of reviewing choices (Fig 3.15).

It was originally intended that these options would be displayed on and directly available from the bottom line of the brief record display screens along with "Up" and "Down". This was tried, but the designers felt that the options line was becoming cluttered and difficult to read. It was therefore decided that these important but less frequently used commands should be offered from a subsidiary menu. This is shown in Fig 3.14. We were unable to think of a good name for this choice. "Other commands" commits the sin of not making it obvious to the user that there is a way of finishing or starting again, so the option was labelled "Restart/new search/quit".
Unfortunately this does not make evident the existence of the "View" option. During the evaluation experiment subjects were shown the "Restart" menu during the experimenter's introductory spiel. This went against our principle that retrieval systems should be operable at sight without training. Some redesign of the interaction and options layouts would be necessary in a production system.

Fig 3.16 shows the second and subsequent input screen which appears if the user selects "New" from the subsidiary menu. The functions of "New", "Edit" and "Quit" have already been described (3.B.2).


Fig 3.1 Subject search input screen - new search

    SUBJECT SEARCH  **  OKAPI

    The computer will look for books which include all (or most) of your
    words in their titles or subject descriptions

    Type a word or a phrase which describes the books you want :

    ! insecticides and the environment                               !

    ! Press Return when you have   !
    ! finished typing your search  !


Fig 3.2 Display while looking up - word not found

    SUBJECT SEARCH for 'insecticides and the envirnment'

        26 books under 'insecticides'
        CAN'T FIND 'envirnment'

    Press the Return key if you want to change this word
    Press the space bar to continue without this word


Fig 3.3 User replacing a misspelt word

    SUBJECT SEARCH for 'insecticides and the envirnment'

        26 books under 'insecticides'
        CAN'T FIND 'envirnment'

    Type your replacement : environments


Fig 3.4 Lookup and search results

    SUBJECT SEARCH for 'insecticides and the environment'

          26 books under 'insecticides'
        1204 books under 'environment'

    3 books match your search well
    (26 books found altogether)

    Type Display to look at the books found
    Type New if you want to do a different search
    Type Edit to change or add to your search
    Type Quit if you have finished


Fig 3.5 Brief record display - original search

    LIST OF BOOKS                                          Books 1 to 9 of 26
    Search: 'insecticides and the environment'

    No.  Title                                      Author           Classmark  Date
    1    Biochemical insect control : its impact..  QURAISHI M S     668.651    1977
    2    Advances in environmental science and ..   (METCALF R L)    628.5      1976
    3    Organochlorine insecticides : persisten..  (MORIARTY F)     632.95042  1975
    THE REST OF THE BOOKS MAY NOT MATCH YOUR SEARCH VERY WELL
    4    Insects, experts and the insecticide cr..  PERKINS J H      632.7      1982
    5    Proceedings of the 1979 British Crop Pr..  British Crop ..  668.65     1980
    6    Proceedings of the 1979 British Crop Pr..  British Crop ..  668.65     1980
    7    Proceedings of the 1979 British Crop Pr..  British Crop ..  668.65     1980
    8    The insecticide, herbicide, fungicide ..   PAGE B G         632.95     1979
    9    Organochlorine insecticides and polychl..  Standing Commi.. 628.168    1979

    Type its number to see if a book is relevant   1
    or type Down (next), Restart/new search/quit


Fig 3.6 Full record display

    FULL DISPLAY                                           Book 1 of 26
    Search: 'insecticides and the environment'

    AUTHOR(S):    QURAISHI M S
    TITLE(S):     Biochemical insect control : its impact on economy,
                  environment, and natural selection.
    PUBLICATION:  Wiley, 1977.
    SUBJECT(S):   Insecticides.

    Not in this branch
    No. of copies in other PCL libraries : E&S (1)
    Shelved at : 668.651 QUR

    Is this at all the sort of book you are looking for? (y/n)  YES


Fig 3.7 Full record display

    FULL DISPLAY                                           Book 3 of 26
    Search: 'insecticides and the environment'

    AUTHOR(S):    MORIARTY F
    TITLE(S):     Organochlorine insecticides : persistent organic pollutants.
    PUBLICATION:  Academic Press, 1975.
    SUBJECT(S):   Pesticides - Environmental aspects.
                  Chlorine organic compounds.
                  Environment. Pollution by pesticides: Organic chlorine compounds.

    Not in this branch
    No. of copies in other PCL libraries : E&S (1)
    Shelved at : 632.95042 ORG

    Is this at all the sort of book you are looking for? (y/n)  YES


Fig 3.8 Brief display after some records have been chosen

    LIST OF BOOKS                                          Books 1 to 9 of 26
    Search: 'insecticides and the environment'

    No.  Title                                      Author           Classmark  Date
    1**  Biochemical insect control : its impact..  QURAISHI M S     668.651    1977
    2*   Advances in environmental science and ..   (METCALF R L)    628.5      1976
    3**  Organochlorine insecticides : persisten..  (MORIARTY F)     632.95042  1975
    THE REST OF THE BOOKS MAY NOT MATCH YOUR SEARCH VERY WELL
    4    Insects, experts and the insecticide cr..  PERKINS J H      632.7      1982
    5    Proceedings of the 1979 British Crop Pr..  British Crop ..  668.65     1980
    6    Proceedings of the 1979 British Crop Pr..  British Crop ..  668.65     1980
    7    Proceedings of the 1979 British Crop Pr..  British Crop ..  668.65     1980
    8    The insecticide, herbicide, fungicide ..   PAGE B G         632.95     1979
    9    Organochlorine insecticides and polychl..  Standing Commi.. 628.168    1979

    Type its number to see if a book is relevant
    Type More to look for books similar to the ones you chose
    or type Down (next), Restart/new search/quit


Fig 3.9 Display during a query expansion search

    LIST OF BOOKS                                          Books 1 to 9 of 26
    Search: 'insecticides and the environment'

    No.  Title                                      Author           Classmark  Date
    1**  Biochemical insect control : its impact..  QURAISHI M S     668.651    1977
    2    Advances in environmental science and ..   (METCALF R L)    628.5      1976
    3**  Organochlorine insectici                                                 75
    THE REST OF THE BOOKS MAY NO  ! Looking for more books similar to the 2      !
    4    Insects, experts and the ! you have chosen...                           !82
    5    Proceedings of the 1979  !                                              !80
    6    Proceedings of the 1979  ! Found 2 more books                           !80
    7    Proceedings of the 1979  !                                              !80
    8    The insecticide, herbici ! Type Disp to see the books                   !79
    9    Organochlorine insectici ! or type Back to go back where you were before!79

    Type its number to see if a
    Type More to look for books
    or type Down (next), Restart/new search/quit   MORE


Fig 3.10 Brief display of records found by query expansion

    LIST OF BOOKS similar to the ones you chose            Books 1 to 2 of 2
    (Original search: 'insecticides and the environment')

    No.  Title                                      Author           Classmark  Date
    1    Organochlorine insecticides and polychl..  Standing Commi.. 628.168    1979
    2    Environmental pollution by pesticides.     EDWARDS C A      632.95042  1973
    ** END OF LIST **

    Type its number to see if a book is relevant
    Type Back to return to the books you originally found
    or type Restart/new search/quit


Fig 3.11 Full display from query expansion

    FULL DISPLAY of books similar to the ones you chose    Book 2 of 2
    (Original search: 'insecticides and the environment')

    AUTHOR(S):    EDWARDS C A
    TITLE(S):     Environmental pollution by pesticides.
                  Environmental science research.
    PUBLICATION:  Plenum Press, 1973.
    SUBJECT(S):   Pesticides - Environmental aspects.
                  Chlorine organic compounds.
                  Environment. Pollution by pesticides: Organic chlorine compounds.

    Not in this branch
    No. of copies in other PCL libraries : E&S (1)
    Shelved at : 632.95042 EDW

    Is this at all the sort of book you are looking for? (y/n)  YES


Fig 3.12 Choosing classification browsing

    FULL DISPLAY                                           Book 1 of 26
    Search: 'insecticides and the environment'

    AUTHOR(S):    QURAISHI M S
    TITLE(S):     Biochemical insect control : its impact on economy,
                  environment, and natural selection.
    PUBLICATION:  Wiley, 1977.
    SUBJECT(S):   Insecticides.

    Not in this branch
    No. of copies in other PCL libraries : E&S (1)
    Shelved at : 668.651 QUR                               [Chosen]

    ! Would you like to see books shelved  !
    ! near this one? (y/n)  YES            !


Fig 3.13 Brief display in classified sequence

    LIST OF BOOKS classified near 668.651
    (Original search: 'insecticides and the environment')

    No.  Classmark   Title                                    Author             Date
    1    668.65      Pesticide chemistry : proceedings of ..  International ..   1971
    2    668.65077   Pesticides : an auto-tutorial approach.  WARE G W           1975
    3    668.650941  The non-agricultural use of pesticides.  Great Britain....  1974
    4    668.650973  Manual of chemical methods for pestici.  United States ...  1976
    5**  668.651     Biochemical insect control : its impac.  QURAISHI M S       1977
    6    668.651     A critical review of the techniques fo.  BUSVINE J R        1971
    7    668.9       Engineering with polymers.               POWELL P C         1983
    8    669         Metals in the service of man.            STREET A           1979
    9    669         An introduction to metallurgy.           COTTRELL A H       1975

    Type its number to see if a book is relevant
    Type More to look for books similar to the one you chose
    Type Back to return to your previous list of books
    or type Down (next), Up (prev), Restart/new search/quit


Fig 3.14 Other options menu

    LIST OF BOOKS similar to the ones you chose            Books 1 to 5 of 5
    (Original search: 'insecticides and the environment')

    No.  Title                    ! RESTART REQUESTED                            !te
    1    Since 'Silent spring'.   ! You can do                                   !72
    2    The mutagenicity of pest !                                              !71
    3    Pesticide microbiology : !   New (start a new search)                   !78
    4    Ecological effects of pe !   View your list of chosen books             !77
    5    Environmental dynamics o !   Edit or repeat your last search            !75
                                  !     (your selections will be lost)           !
    ** END OF LIST **             ! Please choose one   VIEW                     !
                                  ! or type Quit if you have finished            !
                                  ! or Back to go back where you were before     !
    Type its number to see if a
    Type Back to return to the b
    or type Restart/new search/quit   RESTART


Fig 3.15 Reviewing the list of chosen books

    LIST OF BOOKS YOU HAVE CHOSEN                          Books 1 to 9 of 13
    (Original search: 'insecticides and the environment')

    No.  Title                                      Author           Classmark  Date
    1**  Biochemical insect control : its impact..  QURAISHI M S     668.651    1977
    2**  Organochlorine insecticides : persisten..  (MORIARTY F)     632.95042  1975
    3**  Environmental pollution by pesticides.     EDWARDS C A      632.95042  1973
    4**  Pesticides in aquatic environments.        (KHAN M A Q)     632.95042  1977
    5**  Environmental toxicology of pesticides.    (MATSUMURA F)    632.95042  1972
    6**  Since 'Silent spring'.                     GRAHAM F         632.95042  1972
    7**  Persistent pesticides in the environment.  EDWARDS C A      632.95042  1975
    8**  A critical review of the techniques for..  BUSVINE J R      668.651    1971
    9**  Ecological effects of pesticides.          (PERRING F H)    632.95042  1977

    Type its number to see further details of one book
    Type Back to go back to where you were before viewing the 'chosen' list
    or type Down (next), Restart/new search/quit


Fig 3.16 Input screen for second or subsequent search

    SUBJECT SEARCH  **  OKAPI

    !     Previous search(es)            Books found !
    !  1  'safety in the workplace'           31     !
    !  2  'factories act'                      7     !
    !  3  'workshop practice'                137     !

    Type a word or a phrase which describes the books you want :