Matrix Tutorial 1: Stochastic Matrices
A matrix tutorial. Includes, square, triangular, scalar, transpose, and stochastic matrices. Also covers rank of a matrix, digraphs and Markov chains.
Dr. E. Garcia
Mi Islita.com
Email | Last Update: 07/11/06
Topics
About this Math Tutorial
Principal and Trace of a Square Matrix
Row Vectors, Column Vectors, Scalar and Transpose Matrices
The Rank of a Matrix
Demystifying Stochastic Matrices
Digraphs, Indegrees and Outdegrees
Markov Chains and Link Models
SEO Blogonomies: The Search Engine Markov Chain
What's Next?
Tutorial Review
References
About this Math Tutorial
This tutorial introduces matrices, eigenvalues, and eigenvectors to IR students and search engine marketers. In Part 1 we go through some definitions and familiarize readers with different type of matrices. Emphasis is given to stochastic matrices. In Part 2 we stop momentarily to explain some basic matrix operations. Part 3 demystifies eigenvalues and eigenvectors, showing how to calculate these.
We hope that presenting the material in this order, i.e., visualization of matrices first, followed by matrix operations, might help students to associate math operations with what they have visualized already. Currently, many matrix tutorials intermingle execution with visualization, forcing students to stop and do a one-by-one mapping between text and graphics, before processing new material. In our opinion that approach injects to the discourse an unnecessary level of difficulty.
By separating visualization from execution, by the end of this tutorial the reader will be able to discriminate between different type of matrices. Students will be able to identify key concepts such as the rank of a matrix, digraphs, markov chains, and other key concepts without resourcing to math operations.
We do not pretend to make a comprehensive review out of this tutorial. Rather the material is limited to what we think might be relevant to link models and cluster structures. Applications and examples are provided.
Most of the material and examples are taken from two great books (1, 2) I read way back while in grad school and before the inception of commercial search engines (Google, Yahoo, MSN, etc) in the Web scene:
- Graphical Exploratory Data Analysis; S.H.C du Toit, A.G.W. Steyn and R.H. Stumpf, Springer-Verlag (1986).
- Handbook of Applied Mathematics for Engineers and Scientists; Max Kurtz, McGraw Hill (1991).
Why we wrote this tutorial for an audience consisting of IR students and search marketers? Well, there are plenty of reasons. Consider this:
- matrices simplify the handling of rutinary business model calculations.
- eigenvectors can be used to understand link models and networks.
- eigenvalues, eigenvectors, stochastic matrices, Markov Chains, etc are used to understand dissimilar random processes.
- often students and search engine marketers find these concepts too abstracts or complex to understand.
- research articles about these topics are often misquoted in SEM discussion forums/SEO blogs and key concepts become "blogonomies".
Thus, one of the goals of this tutorial is help our audience to grasp these concepts, while we dispel some myths. In this way next time a reader has an encounter with these topics he/she can grasp the gist of the discourse --or at least a good portion of it.
Principal and Trace of a Square Matrix
Let us first define what is a matrix and go through some basic definitions.
A matrix is just a rectangular array of rows (m) and columns (n); that is, a table. Thus, tabular data entered into an Excel spreadsheet can be viewed as a matrix. If you run a mom-n-pop business and for some reason you have arranged numbers or letters in rows and columns, you have handled matrices already.
If a matrix has the same number of rows (m) and columns (n) is termed a square matrix; i.e., m = n. The matrix is said to be of the nth order or of order n. Thus, an array consisting of two rows and two columns is a square matrix of order m = n = 2 and an array consisting of three rows and three columns is a square matrix of order m = n = 3.
Elements of a matrix are identified by assigning subscripts to rows and columns. Thus, for matrix A its elements are aij. For instance, a32 means element in row 3 column 2.
The diagonal extending from the upper-left corner to the lower-right corner of a square matrix is termed the principal. The elements of the principal are termed the principal elements or diagonal elements. The sum of the principal is the trace of the matrix. The trace is an important concept, as we will see in Part 1 and Part 2 of this tutorial. These concepts are illustrated in Figure 1.

Figure 1. The principal and trace of a square matrix.
Row Vectors, Column Vectors, Scalar and Transpose Matrices
A one-row matrix is a called a row vector. Similarly, a one-column matrix is termed a column vector. A null matrix is one with all elements being zero.
A matrix in which all nondiagonal elements have zero value is a diagonal matrix. If all elements of a diagonal matrix are equal, we call this a scalar matrix. If all elements of a scalar matrix are 1 this is termed a unit matrix or an identity matrix, I.
A transpose matrix AT is obtained by converting rows into columns and columns into rows. Some of these definitions are illustrated in Figure 2.

Figure 2. Several matrices
The Rank of a Matrix
A matrix in which all elements above or below the principal have zero value is a triangular matrix. Moreover, a triangular matrix is classified as lower-triangular or upper-triangular, respectively, according to whether the zero elements lie above or below the principal.
The rank of a matrix is equal to the number of linearly independent rows or linearly independent columns it contains, whichever of these two number is smaller. Accordingly, the rank of a square matrix is equal to the number of nonzero rows in its upper-triangular matrix or the number of nonzero columns in its equivalent lower-triangular matrix, whichever of these two number is smaller.
Figure 3 shows a square matrix and its equivalent triangular matrix. This was obtained by subjecting the matrix to elementary column operations. Don't worry for now about tranforming a square matrix into a triangular matrix. What is important is the following: since B contains 3 nonzero columns, A is of rank 3.

Figure 3. Rank of a square matrix.
Another way of computing the rank of a matrix involves the use of singular values. This will be discussed in an upcoming tutorial on Singular Value Decomposition (SVD).
Some times search engine marketers and those that sell links quote papers about link models, not knowing that the term "rank" of a link graph is used in those articles in reference to the rank of a matrix and not in reference to any web page ranks (i.e., positioning of search results). Next thing one reads from these marketers is what we call a bunches of blogonomies. We call a "blogonomy" the dissemination of false knowledge through blogs or public forums and "blogorrhea" when a false concept is promoted for a profit.
Demystifying Stochastic Matrices
If all elements of a matrix are non negative, we can normalize rows by adding row elements together and dividing each element by the corresponding row totals. Obviously, adding together normalized row elements equals 1. In general, a matrix whose sum of all row elements (or column elements) equals 1 is called a stochastic matrix. Elements of a stochastic matrix can be zero if their row totals (or column totals) equal 1. See Figure 4.

Figure 4. Stochastic matrices.
Since a stochastic matrix can be obtained also by normalizing columns, authors often use the expressions row-stochastic matrix and column-stochastic matrix, in order to distiguish between the two cases. The expression doubly stochastic matrix is reserved for square matrices whose sums of both rows and columns equal 1. This is the case if both matrix A and its transpose AT are stochastics.
Digraphs, Indegrees and Outdegrees
A directed graph or digraph consists of a number of points (nodes) linked together by arrows or lines also called edges. Arrows indicates the direction of the relationship between two nodes. The number of arrows ending at a specific node is called the outdegree of the node and the number of arrows leading from it, is called the indegree.
To illustrate these concepts, let me use the example presented by the authors of Graphical Exploratory Data Analysis (1) from 1986.

Figure 5. Digraph for the friendship between six persons.
Here they represented the friendship between six individuals as a digraph (any similarity with link graphs flying around?). The direction of the arrows says it all. 1, 3 and 6 consider 2 a friend, but 2 is friendly with 3, only.
The following array describes how the nodes are related. Note that row totals give outdegrees and column totals indegrees.

Figure 6. Indegrees and outdegrees for the friendship between six persons.
When these type of relationships are represented in matrix notation the resultant array is called an adjacency matrix. Dividing each row element of the adjacency matrix by the correponding outdegree, yields a row-stochastic matrix,

Figure 7. Row-stochastic matrix obtained from a digraph.
Markov Chains and Link Models
Now that we have the basic ideas clarified, let's move forward and talk about random processes.
A random process is a process or series of events that occur by chance. If the process evolve in time is called a Markov chain. Looking at some of the stochastic matrices we have derived, if instead of mere numbers the elements represent probabilities pij these are called transition probabilities. The corresponding matrix is termed a transition matrix
Therefore, it can be said that a Markov chain is just a random process evolving in time according with the transition probabilities of the Markov chain.
SEO Blogonomies: The Search Engine Markov Chain
The spreading of incorrect knowledge or at best innaccurate representation of concepts is prevalent in circles associated to search engine optimization (SEO). This is a social phenomenon more notorious in the blogosphere and through public forums (sites and discussion forums). Because of this, we call the phenomenon "blogonomies". We are currently compiling a list of the most notorious blogonomies spreaded over the search engine marketing world.
Many blogonomies are promoted by well known SEO and SEM specialists. These folks are called "experts" by their followers and pose as such in their SEM conferences. They often quote each other or call each other "experts". Many of these folks like to write the fine line of fallacies, producing material where false concepts are decorated with scientific terms and "fat" words. They are also experts in damage control and in saving face.
We are not interested in investigating what actually motivates the phenomenon of blogonomies since that is self-evident. What we want is make the reader aware of the phenomenon. As a sample of what you could expect to see listed in our SEO Blogonomies here is one: The Search Engine Markov Chain Blogonomy.
Some SEOs have written -giving the impression to readers- that search engines use a mythical Markov Chain to find patterns in search engine search results or sites, like if such chain is a special kind of detection instrument, tool or technique that is applied to find keyword patterns in a web page or to detect how the document was optimized. This is pure non sense.
There is no such thing as a mythical Search Engine Markov Chain, which only exists in the mind of these folks and followers, who often misquote research articles. A markov chain is simply a random process that occurs over time according to some transition probabilities.
Suppose we run an experiment that has N possible results (states). Suppose that we keep repeating the experiment and that the probability of each of the results or states occurring on the (N+1)th repetition depends only on the result of the Nth repetition of the experiment. This is called a markov chain.
Thus a markov chain is not an instrument, technique, tool or the like that allegedly is used by search engines to rank web pages or to find word patterns in documents. True that there is a lot of research in which things have been modeled as markov processes in an attempt at understanding better behaviours and link graphs, but the analogy stops there.
True that there is something called an absorbing markov chain, but this is a specific case involving random walks with absorbing states. Perhaps it might be a good idea to write a tutorial on regular markov chains and absorbing markov chains or, better, recommend readers to take a look at the book of James T. Sandefur, Discrete Dynamical Systems, Theory and Applications (Oxford University Press; Chapter 6 Absorbing Markov Chains) (3). If you like fractals, chaos and iterations, this book is for you.
Meanwhile, if while drunk you walked randomly from one point to another, chances are that you might have "markov-chained yourself", already.
What's Next?
What all this discourse has to do with web links (linked web pages)? Well, consider a random walk over a set of linked pages. This can be defined by a transition matrix, which in this case is the links matrix. The largest eigenvector of the transition matrix tell us the probabilities of the walk ending on the candidate pages. To understand the significance of this statement we first need to define what we mean by the largest eigenvector and how these are computed. This and the calculations involved, will be explained, step by step in Part 2 and Part 3 of this tutorial.
Next: Matrix Tutorial 2: Matrix Operations
Tutorial Review
- What is the difference between a diagonal, scalar and identity matrix?
- Which of the following is a square matrix: an array of 3 rows and 2 columns or an array of 10 rows and 10 columns?
- Look at the business or sport section of a newspaper and try to find tabular data that could be represented as a square matrix. Calculate its trace and transpose.
- Derive a column-stochastic matrix from Figure 6.
- Look at your site map. Try to derive a digraph from your site map or link structure. For this exercise, consider only your pages, ignoring third-party links (this of course will represent an ideal scenario). Compute a row-stochastic matrix or a column-stochastic matrix. Have fun.
- What is a markov chain?
References
- Graphical Exploratory Data Analysis; S.H.C du Toit, A.G.W. Steyn and R.H. Stumpf, Springer-Verlag (1986).
- Handbook of Applied Mathematics for Engineers and Scientists; Max Kurtz, McGraw Hill (1991).
- Discrete Dynamical Systems, Theory and Applications; James T. Sandefur, Oxford University Press; Chapter 6 Absorbing Markov Chains (1990).

