It doesn't require linear algebra to understand the paper or how the algorithm works, but it does require linear algebra to understand why the algorithm works. In general, since the induced 1-norm of a stochastic matrix S is exactly equal to 1 but not smaller than 1, the mapping x↦Sx is NOT a contraction. Neither convergence of the power method nor uniqueness of fixed point are guaranteed. (If there are multiple fixed points, there are multiple inconsistent rankings.)
In the paper, the significance of the so-called "damping factor" is not clear. However, with linear algebra, we know that the damping factor makes the stochastic matrix positive rather than merely nonnegative. Hence the Perron eigenvalue is "simple" (i.e. of multiplicity one), the Perron vector is unique and the power iteration must converge to it.
That's easier explained using fixed-point theory: The damping factor makes the mapping x↦Sx into an actual contraction (on the space of probability distributions). Not to mention that it has a simple common-sense justification (you don't want to get stuck in a subnetwork that only links to itself, or drown out a subnetwork that has more outlinks than inlinks).
There is probably some gain from understanding the algorithm specifically as a Markov chain iteration (if nothing else, it provides a great example for Markov chain iteration), but I think it's perfectly possible -- and easier -- to understand it as a fixed-point iteration on a compact space. And I am someone who does algebra for a living and normally explains everything algebraically if ever possible...
In the paper, the significance of the so-called "damping factor" is not clear. However, with linear algebra, we know that the damping factor makes the stochastic matrix positive rather than merely nonnegative. Hence the Perron eigenvalue is "simple" (i.e. of multiplicity one), the Perron vector is unique and the power iteration must converge to it.