Modern Information Retrieval
Review of Baeza-Yates and Ribeiro-Neto's book on modern information retrieval systems, models, languages and architectures
Dr. E. Garcia
Mi Islita.com
Last Update: 09/03/05
Modern Information Retrieval
By Ricardo Baeza-Yates and Berthier Ribeiro-Neto. Addison Wesley; first edition, 544p., illus., biblio., index. ISBN 0-201-39829-X (PB) $42.81.
Audience
Modern Information Retrieval is a textbook for computer science undergraduate and graduate courses and a reference book for IR practitioners.
Structure
The book consists of a Preface, Acknowledgements, Biographies, fifteen chapters, an Appendix, a Glossary, References and an Index. The first part of the book is authored or co-authored by Baeza-Yates and Ribeiro-Neto and addresses query and text operations, retrieval and indexing. The second part consists of special topics authored by leading researchers in their fields. This part focuses on architecture, multimedia IR, bibliographical systems and digital libraries.
Subjects: Computer Science, Information Retrieval, Multimedia IR, Digital Libraries
Table of Contents
Preface v
Acknowledgements vii
Biographies xvii
Introduction 1
Modeling 19
Retrieval Evaluation 73
Query Languages 99
Query Operations 117
Text and Multimedia Languages and Properties 141
Text Operations 163
Indexing and Searching 191
Parallel and Distributed IR 229
User Interfaces and Visualization 257
Multimedia IR: Models and Languages 325
Multimedia IR: Indexing and Searching 345
Searching the Web 367
Libraries and Bibliographical Systems 397
Digital Libraries 415
Appendix: Porter's Algorithm 433
Glossary 437
References 455
Index 501
Highlights
In Chapter 1 the authors provide teaching suggestions on how to use the book with different computer science courses at the undergraduate and graduate level.
Chapter 2 covers different information retrieval models: (a) from vector space to Boolean models, (b) from algebraic to probabilistic models and (c) from text retrieval to browsing models.
Chapter 3 discusses performance evaluation in terms of precision and recall measures and alternate measures.
Chapter 4 is dedicated exclusively to queries. Word, context, Boolean, natural, and structured queries are discussed.
Chapter 5 covers query operations such as relevance feedback, automatic local analysis (through clustering and context analysis) and automatic global analysis (through similarity and statistical thesauruses).
Chapter 6 covers metadata and markup languages (SGML, HTML and XML).
Chapter 7 presents a comprehensive discussion of text operations such as preprocessing (lexical analysis, stopword removal, stemming and index term selection).
Chapter 8 is dedicated to indexing and searching, inverted files, sequential searching and pattern matching.
Chapter 9 discusses parallel and distributed IR architectures, partitioning and processing.
Chapter 10 is the longest chapter of the book and covers human-computer interaction (HCI), information access, starting points, query specifications, context, relevance judgements and interfaces.
Chapter 11 covers models and languages for multimedia IR (modeling and query languages).
Chapter 12 covers indexing and searching for multimedia IR (time series and imaging).
Chapter 13 is dedicated to Web search engines, browsing, metasearches and hyperlink-based searches.
Chapter 14 discusses libraries and bibliographical systems.
Chapter 15 is dedicated to digital libraries, projects and standards.
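For readers unfamiliar with the evaluation measures named under Chapter 3, the short Python sketch below illustrates the standard set-based definitions of precision and recall for a single query. It is only an illustration and is not taken from the book; the function name `precision_recall` and the document identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents retrieved, three known to be relevant.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5"]
precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case two of the four retrieved documents are relevant (precision 0.50) and two of the three relevant documents were found (recall about 0.67).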
Comments
The Glossary section of this book consists of 18 pages and defines important technical terms used in IR. The References section consists of 45 pages of significant IR work. The Index section consists of 13 pages, which enhances the book's usability. Each chapter ends with a Trends and Research Issues section and a Bibliographic Discussion section.
The best features of the book are its cohesive presentation and organization. The use of a common nomenclature and notation helps students and readers assimilate key concepts and "connect the dots" across chapters. The text is reinforced by a mirrored Web site with several resources, an errata page and teaching material (1 - 4). In Chapter 1, the authors even "go the extra mile" for teachers and suggest how chapters could be used with different undergraduate and graduate courses. All this makes the book a great educational resource for students and teachers.
However, while the companion site has some exercises for students to practice, the book itself lacks how-to calculations and exercise sections. In future editions, it may be a good idea to include an Exercises and an Answers to Exercises section with each chapter. This would encourage students to put the theory into practice, use their own judgement and assimilate what they have learned.
Recommendations
This book is recommended for computer science courses at the undergraduate and graduate level. It is also recommended for technical libraries and as a primary reference for IR practitioners.
References
1. Modern Information Retrieval - Companion Web Site; R. Baeza-Yates and B. Ribeiro-Neto.
2. Modern Information Retrieval - Glossary; R. Baeza-Yates and B. Ribeiro-Neto.
3. Modern Information Retrieval - Errata; R. Baeza-Yates and B. Ribeiro-Neto.
4. Modern Information Retrieval - Teaching Material; R. Baeza-Yates and B. Ribeiro-Neto.

