EVALUATION OF EVALUATION IN INFORMATION RETRIEVAL
Tefko Saracevic, PhD
School of Communication, Information and Library Studies
Rutgers University
4 Huntington St.
New Brunswick, NJ 08903
saracevic@zodiac.rutgers.edu

ABSTRACT

Evaluation is a major force in research, development and applications related to information retrieval (IR). This paper is a critical and historical analysis of evaluations of IR systems and processes. Strengths and shortcomings of evaluation efforts and approaches are discussed, together with major challenges and questions. A limited comparison is made with evaluation in expert systems and Online Public Access Catalogs (OPACs). Evaluation is further analyzed in relation to the broad context and specific problems addressed. Levels of evaluation are identified and contrasted; most IR evaluations were concerned with the processing level, but others were conducted at the output, users and use, and social levels. A major problem is the isolation of evaluations at a given level. Issues related to systems under evaluation, and evaluation criteria, measures, measuring instruments, and methodologies are examined. A general point is also considered: IR is increasingly imbedded into many other applications, such as the Internet or digital libraries. Little evaluation in the traditional IR sense is undertaken in relation to these applications. The challenges are to integrate IR evaluations from different levels and to incorporate evaluation in new applications.

Introduction

From the inception of IR systems a half century ago, evaluation was a major force in research and development (R&D). In this paper I wish to analyze the basic aspects related to evaluation of IR systems and processes, assess the accomplishments and shortcomings of IR evaluation, raise critical issues and questions, and in a limited way compare evaluation in IR with evaluation of related information systems, most notably expert systems and OPACs.

Context: the broad Problem

In a most influential article Vannevar Bush, the head of U.S. scientific efforts during the Second World War, addressed the problem of the ‘information explosion,’ related to “the massive task of making more accessible the bewildering array of knowledge” and suggested the application of the emerging technology to the task as a solution (Bush, 1945). Bush was concerned with the accelerated and exponential growth of records in science and technology. Many others in science and government shared the concern and were enthusiastic about the suggested technological approach for solution. As a result, by the 1950s IR emerged, fueled by significant progress in information technology, as the major approach in addressing the problem of the information explosion, first in science and technology, and later in all the areas of human endeavor.

Information Evaluation process
means assessing (technique,
such, evaluation technology, applications. a difficult much
methods,
procedure
is accepted
and
many
What
should
and vexing
research
is
and related
done
. . . ), product,
other form
of a system, or policy.
necessity
areas,
including
Thus,
At times
general itself.
This
namely
remained
addressed.
time,
social
changed,
but the basic
constant.
Thus, the ultimate
How
resolving
to lean
applied,
evaluative
significantly,
of the information
is often
be:
to this day the context IR is still being
IR changed
our understanding
measures,
it is necessary
questions,
problem
of it. Over
As
in any area of evaluation criteria,
explosion
the major
in science,
the basis of evaluation
on evaluation
aspects.
about evaluation
or value
as a critical
problem.
back and ask even more questions
performance
successful
problem was
the problem Clearly,
orientation
are confronted
this begs many
Permission to make digital/hard copies of all or part of this material without fee is granted provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the Association for Computing Machinery, Inc. (ACM). To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee. SIGIR'95 Seattle CA USA © 1995 ACM 0-89791-714-6/95/07.$3.50
related
of seeking,
finding,
from
mass
information
and myriad of choices
These are very hard questions
the
available
is
a complex problem,
social,
at its base IR is an activity
with
the
same
(social,
Ultimately,
After
IR exists
us In