PART B USES

As mentioned in Part A, we were asked as part of the Design Study work to collect information about possible uses of the 'ideal' collection, were it to be built, for research and teaching. This would provide some basis for an assessment by or on behalf of BLR&D as to whether the 'ideal' collection is really needed, that is, is stated to be needed by research workers and teachers, is in fact needed, and is needed for worthwhile projects and activities. The questionnaires used to collect the information, one for research and two for teaching, are reproduced in Appendices 8 and 9. An estimate of the potential utility of an 'ideal' collection, to complement the specification given in Part A, can be obtained from the responses to the questionnaires, which are summarised below.

1. Research projects

The table below gives those to whom the questionnaire of Appendix 8 was sent. The 48 persons listed were chosen because they had been involved in previous discussions of the 'ideal' collection and/or have been active in research in recent years. The table also gives the distribution of responses by five categories: thus 27 out of the 48 responded, 23 of them favourably with 28 projects altogether.

Name             Place                   Name               Place
T. Aitchison     Inspec                  B. Kostrewski      City U
D. Austin        BL/BSD                  R. Lea             Brunel U
K. Bakewell      Liverpool P             J. Leigh           BL/SRL
E. Barraclough   Newcastle U             M. Lynch           Sheffield U
*G. Bates        Cambridge U             J. Martyn          Aslib
N. Belkin        City U                  J. Mills           North London P
R. Bottle        City U                  A. Negus           Inspec
M. Brittain      Loughborough U          B. Niblett         U.C. Swansea
C. Cleverdon     Cranfield I             R. Oddy            Aston U
A. Cooper        Loughborough U          R. Richens         C.A.B.
*B. Croft        Cambridge U             C. van Rijsbergen  Cambridge U
S. Datta         North London P          S. Robertson       University C
J. Digger        L.A.                    A. Robson          Ukcis
L. Evans         Inspec                  J. Smith           Belfast U
R. Field         Inspec                  K. Sparck Jones    Cambridge U
A. Flowerdew     Centre for Env. Stud.   D. Swift           Open U
A. Gray          Cardiff U               J. Terry           A.E.R.E.
*J. Griffiths    University C            D. Veal            Ukcis
A. Harley        BL/LD                   B. Vickery         University C
*D. Harper       Cambridge U             R. Wall            Loughborough U
M. Heine         Newcastle P             *A. Wheatley       C. Libr. Wales
V. Horsnell      City U                  P. Williams        UMIST
A. Hindle        Lancaster U             E. Wilson          Kent U
M. Keen          C. Libr. Wales          P. Yates-Mercer    City U

* = research students/assistants treated as a random sample of possible future research workers

Total names                                                          = 48
Total responses                                                      = 27
  Y     = positive response, form(s) returned with specific
          project(s) details                                         = 17 (for which projects = 23)
  (Y)   = positive response, but in letter without organised
          detail e.g. of scale                                       = 5
  ?     = interested but unable for indicated (usually good)
          reasons to provide detail now                              = 2
  N     = negative response, indicating not interested or unable
          to envisage project in current environment                 = 2
  mixed                                                              = 1
Total non-responses ( )                                              = 21

The research projects supplied are summarised, and salient features relevant to the 'ideal' collection as specified in Part A are noted, in the table below. It must be emphasised that these brief project descriptions are our own summaries of typically paragraph length statements, and that the projects should not be judged in detail on them.

Summary of possible projects in 28 responses Y and (Y)

Project (with our comments re IC specification)

 1. Investigate specialist/generalist differences among indexers and users of indexes.
    Comment: IC varied enough in subject?
 2. Comparison of PRECIS and other indexing methods.
    Comment: Social sciences preferred; document texts needed; IC enough subjects, especially for monographs?
 3. Investigation of proposition that for a given collection there is an optimum size of dictionary.
 4. Work on bibliographic data base management techniques.
    Comment: Citations of interest.
 5. Test of an information retrieval system based on an 'anomalous state of knowledge' hypothesis.
    Comment: Multidisciplinary document set needed and exhaustive relevance judgements.
 6. Research on nature, use and value of document citation clusters.
    Comment: Citations needed.
 7. Study of file organisation techniques for large document sets.
    Comment: IC large enough?
 8. Study of various automatic/manual indexing mixtures.
 9. Analysis of statistical characteristics of profiles and relevance assessments.
    Comment: Range of alternative request sets (inc. SDI) needed. IC large enough?
10. Application of fuzzy set theory to information retrieval.
11. Application of graph-theoretic information theory.
12. Study of library circulation data and their uses.
13. Identification of search techniques genuinely suited to on-line retrieval.
14. Psycholinguistic study of acceptability of presentation of search output.
15. Assessment of text compression techniques in on-line retrieval, in relation to special hardware.
16. Hardware evaluation for on-line retrieval.
17. Study of effects of different key sets in retrieval.
18. Research on use of long, highly structured queries for text searching.
19. Exploitation of term associations in interactive retrieval.
20. Study of efficient document clustering techniques and of the relations between document and term clustering.
21. Research on term dependency.
22. Identification of good index terms through a study of their predictors.
23. Development of retrieval system testing methodology.
24. Analysis of real value of Boolean request structures.
    Comments on projects 8 to 24: IC large enough? IC no information of this kind. Would like data soon. Would like data soon. Full texts needed. IC enough requests? Much actual search data is needed.
25. Research on automatic feedback techniques involving query expansion.
26. Investigation of 'aboutness' and the search process.
27. Development of procedures for exploiting the IC in teaching subject retrieval.
28. Work on indexing languages specially designed for automated searching.
    Comment: Social sciences preferred.
In addition, 1 mixed response lists possible projects on the intermediate lexicon, bibliographic coupling versus subject document classing, and term relations; but these are very tentative and are therefore not considered further.

We have categorised the projects in various ways so as to give some idea of the scale and type of the research effort involved, in relation to the 'ideal' collection specification. The table below lists the results. We again emphasise that this categorisation should not be regarded as wholly accurate, since it in some cases depends on inference from the actual information supplied. In section A of the table projects are grouped by scale, and by whether their requirements in terms of quantity and kind of data could be met by the 'ideal' collection as specified. In B they are characterised according to whether they would use the collection as it stands or extend it. In C by size in terms of staff and time. In D by status, and in E by the type of computing environment required. Finally, in section F, the projects are grouped according to whether they are, in our view, basic or general information retrieval research projects, or are applied or specific.

Categorisation of possible projects (by KSJ, with some inference)

                                                                 No. projects
A. Data requirements (subject to the comments above)
   1. Extensive scale (i.e. size rather than richness)                 25
   2. Moderate scale                                                    3
                                                                      (28)
   Requirements in scale and content:
   Could be met by IC as specified                                     14
   Could probably be sufficiently met by the IC                        10
   Doubtful if could be met                                             3
   Could not be met                                                     1
                                                                      (28)
B. Use
   1. Would use IC primarily as it stands                              11
   2. Would extend IC through new indexing or new requests             16
   n/a                                                                  1
                                                                      (28)
C. Size of possible projects
   Time
   1. Short, 1 year                                                     1
   2. Medium, 2 years                                                  12
   3. Long, 3 years                                                     5
   ?                                                                    3
   not indicated                                                        6
   n/a                                                                  1
                                                                      (28)
   Staff
   1. Small, 1-2                                                       16
   2. Medium, 3-4                                                       4
   3. Large                                                             1
   not indicated                                                        6
   n/a                                                                  1
                                                                      (28)
D. Status
   1. Fairly definite (irrespective of whether,
      administratively, could start now)                               18
   2. Preliminary work or pilot study needed, or tentative              9
   n/a                                                                  1
                                                                      (28)
E. Machine environment
   1. Would need IC on-line                                             9
   2. Would not need on-line (though might be useful)                  18
   n/a                                                                  1
                                                                      (28)
F. Character (KSJ's interpretation)
   1. More basic, or general, research                                 13
   2. More applied, or specific research                               13
   n/a                                                                  2
                                                                      (28)

From the information summarised in these tables we obtain the following picture of the 28 projects.

a) Nearly all of them require large scale data. The scale and content of the data required by 14 of the 28 could be met by the specified 'ideal' collection, and the requirements of another 10 could probably be satisfied in practice; only a few could probably not, and only 1 certainly not.
b) 16 of the projects would develop the collection further in indexing or searching, largely for comparison with the information already provided. It is worth noticing that many of these projects would include searching for new requests, for which the collection, contrary to some views of it as unacceptably static, is deemed acceptable.
c) Many of the projects are middling in size: 2 years and 2 staff respectively are the favourite choices.
d) More than half of the projects, 18, are fairly definite; some of the remainder are tentative and some depend explicitly on preliminary or pilot studies.
e) 9 of the projects depend on using the 'ideal' collection on-line, and some of the others would probably find such use convenient.
f) The projects divide evenly by their research character, basic or applied. Many of them are of a quite concrete character relevant to operational systems in some degree, and there are very few which might be described as quite remote from such systems.

Overall, the picture presented by the research project replies is a fairly solid one.
The response rate from active research workers was quite high (some of those to whom questionnaires were sent being no longer active), and the number of projects involved would represent a substantial proportion of the total research effort being conducted in information science in this country. The projects are on the whole fairly concrete and fairly moderate ones which if successfully conducted would bear on operational systems, and in particular on modern on-line systems. At the same time they generally call for data on a scale and of a kind not currently available for test purposes. Whether much of the research is of a desirable type, and whether the data required could be provided other than through the 'ideal' collection, are points considered in Part C of this Report.

It is quite improper for us to attempt a detailed evaluation of the quality of the projects submitted: indeed it must be emphasised that the covering letter accompanying the questionnaire stated explicitly that there could be no commitment to any projects outlined either from those supplying them or from BLR&DD as a grant-giving body. The questionnaire did not call for great detail, and the projects outlined should not be evaluated for merit as if they were regular proposals; it should also be borne in mind that the projects could not be conducted for, say, two years, since if the 'ideal' collection were to be built it would take some time to set up.

2. Teaching activities

a) general

The questionnaire for teaching establishments was very simple (see Appendix 9), and was designed only to discover whether these have any interest in the 'ideal' collection, and if so what specific requirements would have to be met. The table below lists the establishments approached, and their responses.
Place                 Establishment                              Reply
Aberdeen              Robert Gordon's Institute of Technology    -
Aberystwyth           College of Librarianship Wales             Y
Belfast               Queen's University                         Y
Birmingham            City of Birmingham Polytechnic             -
Brighton              Brighton Polytechnic                       -
Glasgow               University of Strathclyde                  Y
Leeds                 Leeds Polytechnic                          Y
Liverpool             Liverpool Polytechnic                      -
London                Ealing Technical                           -
London                Polytechnic of North London                -
London                University College London                  -
Loughborough          Loughborough Technical College             -
Loughborough          Loughborough University                    Y
Manchester            Manchester Polytechnic                     Y
Newcastle upon Tyne   Newcastle upon Tyne University             Y
Sheffield             University of Sheffield                    Y

(Y = response, - = no response)
total Y = 8, - = 9
Notes to replies: (but teaching interests in part covered by research project); (plus University College London)

For those establishments responding positively the table below indicates expressed needs, and comments on them in relation to the 'ideal' collection specification. The numbers of students involved are added for information.

Requirements of teaching establishments in responses Y

Establishments: Aberystwyth, Belfast, Glasgow, Leeds, Loughborough U., Manchester, Newcastle, Sheffield

Need                                                     Comments re IC Specification
1 large collection as good as ON-TAP                     enough relevance information?
1 large collection e.g. CAB                              OK
1 large collection MARC                                  not in IC
non-specialised subject (soft)                           OK
small collections including e.g. UDC, also non-biblio.   mostly OK but UDC?, non-bibliographic?
small collections                                        OK
several collections                                      OK
non-specialised subject                                  OK
1 large file on-line                                     OK
2 large files                                            would need extended social science other set
1 large file on-line                                     OK

Numbers of students

Large = 100+      undergraduate   Aberystwyth, Manchester
                  postgraduate    Aberystwyth
Medium = 50-100   undergraduate   Glasgow, Leeds, Loughborough U, Newcastle
                  postgraduate    Glasgow, Leeds, Loughborough U, Manchester, Sheffield
Small = -50       undergraduate   Belfast
                  postgraduate    Belfast

It is to be regretted that the response rate for this questionnaire was not better.
The main point to be made about the replies is that in general the 'ideal' collection as specified would meet expressed needs, the main reservations being on the numbers of subjects covered and the provision of MARC records. The establishments responding process significant numbers of students, so the collection could make a valuable contribution to teaching, particularly if it could be set up on-line.

b) on-line education

The point just made was investigated through a specific questionnaire, the second in Appendix 9. This was sent to participants in a British Library-sponsored Workshop. Those from educational establishments were grouped by establishment, though a separate questionnaire was sent to the organiser, Dr. Keenan, and at British Library's suggestion to Professor Vickery. The total number of educational questionnaires sent out was thus 12 (non-educational participants being sent the questionnaire mainly as a matter of courtesy); 10 responses were obtained. The table below lists establishments and responses, and analyses the positive responses by requirements and computing facilities.
On-line education questionnaire

Place                       Reply
Aberystwyth - CLW           Y
Belfast - QU                Y
Birmingham P                Y
Brighton P                  -
London - PNL                Y
London - UCL                Y
London - CU (= Vickery)     Y
Loughborough U              Y
Keenan                      -
Manchester P                Y
Newcastle P                 Y
Sheffield U                 Y

For responses Y

Interested in the IC
  yes   Aberystwyth, Belfast, Birmingham, London UCL, London CU, Loughborough U, Manchester, Newcastle, Sheffield
  no    London PNL

Special facilities required for on-line teaching
  yes   Aberystwyth: comprehensive index and search options
        Belfast: several subjects
        Loughborough U: several collections
        Newcastle: several subjects
        Sheffield: several subjects, command language compatibility
        London PNL: experience of operational systems
  no    Birmingham, London UCL, London CU, Manchester

Would/do use or prefer remote computer
        Aberystwyth, Sheffield, London PNL, Loughborough U

Would/do use or prefer local computer
        Belfast, Birmingham, London UCL, London CU, Manchester, Newcastle

The response rate here was good. It was very generally thought that the 'ideal' collection would be of use, the main reservation being the range of subjects. It is worth noticing that there is a bias towards the use of local computers, suggesting a need for a portable on-line search package, which might well be of use for some research projects as well.