Information Retrieval Tutorials
A Collection of Information Retrieval Tutorials for IR Students and Search Engine Marketers
Here is a list of IR tutorials. Some include examples, fast tracks, reader feedback, reviews, or exercises.
Fast Tracks
Fast tracks are meant to be quick references. For detailed explanations please read the corresponding tutorials.
- LSI Keyword Research
- Singular Value Decomposition (SVD)
- A Linear Algebra Approach to Term Vectors
- Term Vector Calculations
Robertson-Sparck Jones Probabilistic Model Tutorial
This is a tutorial on the Robertson-Sparck Jones Probabilistic Model for Information Retrieval. Includes:
- Independence Assumptions and Ordering Principles.
- Derivation of RSJ Weighting Functions.
- Working Example.
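As a quick illustration of the kind of weighting function the tutorial derives, here is a minimal sketch of the commonly cited RSJ relevance weight with 0.5 smoothing. The function and parameter names are assumptions made for this example, and the tutorial's own notation may differ.

```python
import math

def rsj_weight(N, n, R, r):
    """Robertson-Sparck Jones term weight with 0.5 smoothing.

    N: documents in the collection
    n: documents containing the term
    R: known relevant documents
    r: relevant documents containing the term
    (Parameter names are illustrative, not taken from the tutorial.)
    """
    # Odds of the term appearing in a relevant document.
    relevant_odds = (r + 0.5) / (R - r + 0.5)
    # Odds of the term appearing in a non-relevant document.
    nonrelevant_odds = (n - r + 0.5) / (N - n - R + r + 0.5)
    return math.log(relevant_odds / nonrelevant_odds)

# A term found in 4 of 5 relevant documents but only 10 of 100 overall
# receives a positive weight.
weight = rsj_weight(N=100, n=10, R=5, r=4)
```

With no relevance information (R = 0, r = 0) the expression reduces to an IDF-like quantity, log((N - n + 0.5) / (n + 0.5)).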
SVD and LSI Tutorial 5: LSI Keyword Research and Co-Occurrence Theory
In this LSI tutorial you will learn how to cluster keywords in a k-dimensional reduced space. Includes:
- Clustering of terms in a k-dimensional space.
- Debunking the SEO "Synonymity Myth".
- How Co-Occurrence Affects LSI Scores.
SVD and LSI Tutorial 4: Latent Semantic Indexing (LSI) How-to Calculations
Learn how to calculate LSI scores for documents and queries. Includes:
- Exposing "LSI based" Snakeoil Marketers.
- Common SEO Myths and Misconceptions.
- Computing LSI Matrices and Ranking Documents.
SVD and LSI Tutorial 3: Computing the Full SVD of a Matrix
This tutorial shows you how to compute the full SVD of a matrix. Includes:
- Computing "right" Eigenvectors.
- Computing "left" Eigenvectors.
- A handy shortcut.
SVD and LSI Tutorial 2: Computing Singular Values
This tutorial shows you how to compute singular values. Includes:
- Matrix Transposition.
- The Frobenius Norm.
- Computing A^T, A^T A, A A^T, and S.
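The steps listed above can be sketched for a small matrix in plain Python. This is only an illustration: the helper names and the example matrix are arbitrary choices for this sketch, and the closed-form eigenvalue step applies only when A^T A is 2x2.

```python
import math

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def frobenius_norm(m):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in m for x in row))

def singular_values_2x2(a):
    """Singular values of a matrix with two columns, largest first.

    They are the square roots of the eigenvalues of A^T A, found here
    from the characteristic equation of that 2x2 matrix.
    """
    ata = matmul(transpose(a), a)
    trace = ata[0][0] + ata[1][1]
    det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
    disc = math.sqrt(max(trace * trace - 4 * det, 0.0))
    eigenvalues = [(trace + disc) / 2, (trace - disc) / 2]
    return [math.sqrt(max(ev, 0.0)) for ev in eigenvalues]

A = [[4, 0], [3, -5]]              # arbitrary example matrix
s = singular_values_2x2(A)         # square roots of 40 and 10
```

A useful sanity check is that the squared Frobenius norm of A equals the sum of the squared singular values (here 50 = 40 + 10).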
SVD and LSI Tutorial 1: Understanding SVD and LSI
This tutorial introduces you to SVD and LSI. Includes:
- Search Engine Marketers and their LSI Myths.
- SVD/LSI Applications and Limitations.
- A Geometrical Visualization of SVD.
Association and Scalar Clusters Tutorial - Part 1: Back Mapping Term Clusters to Documents
Covers advanced cluster analysis, including:
- association and scalar clusters.
- similarity matrix theory.
- back mapping term clusters to documents.
Row Pruning Algorithm Tutorial
Covers association rules pruning, including:
- association rules theory.
- RPA algorithm.
- URPA algorithm.
Document Indexing
Covers basic document indexing techniques, including:
- document linearization.
- tokenization.
- stop word filtration.
- stemming.
- weighting.
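The five stages listed above can be chained into a toy pipeline. The sketch below is a rough illustration only: the stop list is a tiny made-up sample, the stemmer is a naive suffix stripper (not the Porter algorithm), and raw term frequency stands in for the weighting step.

```python
import re
from collections import Counter

# Tiny illustrative stop list; real systems use much longer ones.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are"}

def linearize(markup):
    """Strip tags so only the text content remains."""
    return re.sub(r"<[^>]+>", " ", markup)

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop list."""
    return [t for t in tokens if t not in STOP_WORDS]

def naive_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token

def index_document(markup):
    """Return raw term frequencies as the simplest possible weights."""
    tokens = remove_stop_words(tokenize(linearize(markup)))
    return Counter(naive_stem(t) for t in tokens)

weights = index_document("<p>The cats are chasing the cat</p>")
```

Here "cats" and "cat" collapse into the same stem, so the stem "cat" ends up with a frequency of 2.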
Cosine Similarity
Covers how-to calculations used in vector space theory, including:
- Dot Products.
- Euclidean Distances.
- representing documents and queries as vectors.
- cosine similarity calculations.
- document normalization.
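For readers who want to try these calculations right away, here is a minimal sketch in plain Python. Documents and queries are assumed to be equal-length lists of term weights, and the function names are chosen for this example only.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))

def euclidean_distance(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def normalize(v):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    length = norm(v)
    return [x / length for x in v] if length else list(v)

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    denominator = norm(u) * norm(v)
    return dot(u, v) / denominator if denominator else 0.0

query = [1, 0, 1]          # hypothetical term weights
document = [2, 1, 2]
score = cosine_similarity(query, document)
```

Because cosine similarity divides by both vector lengths, a document and a scaled copy of it receive the same score against a given query, which is the point of document normalization.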
EF-Ratios
An introduction to EF-Ratios and how these can be used in keyword research studies. Includes:
- keyword research misconceptions.
- EXACT and FINDALL modes.
- Local and Global EF-Ratios.
- identification of candidate sequences.
- applications to search engine marketing.
Matrix Tutorial 3: Eigenvalues and Eigenvectors
This tutorial covers eigenvalues and eigenvectors. Includes:
- calculating largest eigenvalues and eigenvectors.
- vector iteration through the Power Method and Deflation Method.
- references to resources on link model calculations, including PageRank.
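The idea behind vector iteration can be sketched in a few lines. The example below assumes a symmetric matrix and uses Hotelling deflation to expose the second eigenvalue; the deflation procedure in the tutorial may differ, and the function names and starting vector are choices made for this illustration.

```python
import math

def matvec(a, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def power_method(a, iterations=100):
    """Approximate the dominant eigenvalue and a unit eigenvector.

    Assumes the start vector [1, 0, ..., 0] is not orthogonal to the
    dominant eigenvector.
    """
    x = [1.0] + [0.0] * (len(a) - 1)
    for _ in range(iterations):
        y = matvec(a, x)
        length = math.sqrt(sum(yi * yi for yi in y))
        if length == 0:
            break
        x = [yi / length for yi in y]
    # Rayleigh quotient; x has unit length, so no division is needed.
    eigenvalue = sum(xi * yi for xi, yi in zip(x, matvec(a, x)))
    return eigenvalue, x

def deflate(a, eigenvalue, v):
    """Hotelling deflation for a symmetric matrix: A - lambda * v v^T."""
    n = len(a)
    return [[a[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
            for i in range(n)]

A = [[2.0, 1.0], [1.0, 2.0]]                  # eigenvalues are 3 and 1
lam1, v1 = power_method(A)
lam2, v2 = power_method(deflate(A, lam1, v1))
```

Each iteration multiplies by the matrix and rescales, so the component along the dominant eigenvector grows fastest. Deflation then removes that component so the next largest eigenvalue can be found the same way.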
Matrix Tutorial 2: Matrix Operations
This tutorial covers basic matrix operations such as:
- addition, subtraction, and multiplication of matrices.
- multiplication and division of matrices by a scalar.
- solution of square matrices by determinants.
Matrix Tutorial 1: Stochastic Matrices
This tutorial introduces the concept of matrices, eigenvalues, and eigenvectors to IR students and search engine marketers. In Part 1 we go through some definitions and familiarize readers with different types of matrices, with emphasis on stochastic matrices. Covers:
- square, triangular, scalar, and transpose matrices.
- rank of a matrix; digraphs, indegrees, and outdegrees.
- stochastic matrices and Markov Chains.
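To make the link between digraphs, stochastic matrices, and Markov chains concrete, here is a small sketch. It assumes a digraph stored as a 0/1 adjacency matrix in which entry [i][j] marks a link from node i to node j; the function names and the three-node graph are made up for this example.

```python
def outdegrees(adj):
    """Number of outgoing links per node (row sums)."""
    return [sum(row) for row in adj]

def indegrees(adj):
    """Number of incoming links per node (column sums)."""
    return [sum(col) for col in zip(*adj)]

def to_row_stochastic(adj):
    """Divide each row by its sum so every row adds up to 1."""
    result = []
    for row in adj:
        total = sum(row)
        if total == 0:
            raise ValueError("node has no outgoing links")
        result.append([x / total for x in row])
    return result

def is_row_stochastic(m, tol=1e-12):
    """True if all entries are non-negative and each row sums to 1."""
    return all(all(x >= 0 for x in row) and abs(sum(row) - 1.0) <= tol
               for row in m)

def step(distribution, p):
    """One Markov chain step: multiply a row vector by the matrix."""
    return [sum(distribution[i] * p[i][j] for i in range(len(p)))
            for j in range(len(p[0]))]

adjacency = [[0, 1, 1],
             [1, 0, 0],
             [0, 1, 0]]
P = to_row_stochastic(adjacency)
after_one_step = step([1.0, 0.0, 0.0], P)
```

Starting at the first node, one step splits the probability evenly between its two outgoing links.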

