CHAPTER 2
THEORETICAL BASIS OF THE ASK IR SYSTEM

As explained in Section 1.2, the basis of this project is a combination of what we have rather loosely termed the 'ASK hypothesis' with the principles underlying Oddy's THOMAS system (Oddy, 1975). In this Chapter we expand somewhat upon the theory underlying these two approaches to IR, and on how they can be combined into a framework suitable for what we think of as a 'second generation' IR system.

Our basic premise arises from what we consider to be one of the central difficulties of IR: that people who use IR systems typically do so because they have recognized an anomaly in their state of knowledge of some topic, but they are unable to specify precisely what is necessary to resolve that anomaly. This premise can be seen as a restatement and perhaps extension of ideas proposed by Taylor (1968). Thus, we presume that it is unrealistic (in general) to ask the user of an IR system to say exactly what it is that s/he needs to know, since it is just the lack of that knowledge which has brought her/him to the system in the first place. This premise leads us to conclude that IR systems should be designed with the non-specifiability of information need as a major parameter. What sort of an IR system could this be?

We consider that IR systems, in general, consist of: a mechanism for representing information need; a text store; a mechanism for representing and organising texts; a mechanism for retrieving texts appropriate to particular information needs; and, usually, a mechanism for evaluating the effectiveness of the retrieval. Figure 2 (from Robertson, 1979) indicates these, and some other features, and the relationships among them. From Figure 2, and experience, one can see that the starting points for IR system design are at either text or need representation, and that which of these one chooses, and the chosen method of representation, will strongly influence all of the other elements of the IR system.
Most previous systems have begun from the text representation end, with not very much influence from the need end. We note that need representation appears to be the fundamental problem of IR, and so we suggest that a good IR system should be one which begins with need representation and designs the rest of the system about a mechanism and formalism specifically designed for that purpose.