
In this post, I want to present a very elegant proof of the Cayley-Hamilton Theorem which works over all commutative unitary rings by reducing to the case over the complex numbers, where a topological argument is used to reduce to the case of diagonalizable matrices. First of all, let us state the definitions and the theorem itself.

Definition.

Let R be a commutative unitary ring and A \in R^{n \times n} an n \times n matrix over R. The characteristic polynomial of A is the polynomial \chi_A := \det(x E_n - A) \in R[x].
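
For instance, for a 2 \times 2 matrix this reads as follows: for A = \Matrix{ a & b \\ c & d } \in R^{2 \times 2} one computes

\chi_A = \det \Matrix{ x - a & -b \\ -c & x - d } = (x - a)(x - d) - bc = x^2 - (a + d) x + (ad - bc) \in R[x].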

Then the theorem says:

Theorem (Cayley-Hamilton).

Let R be a commutative unitary ring and A \in R^{n \times n}. Then \chi_A(A) = 0.

We begin with a fascinating reduction argument, which I first saw in a lecture by Paul Balmer at ETH Zürich:

Lemma.

The Theorem of Cayley-Hamilton holds over any commutative unitary ring if, and only if, it holds over the complex numbers.

Proof.

Clearly, if the theorem holds over all commutative unitary rings, then in particular it holds over R = \C. So assume conversely that it holds over \C.

Let R be any commutative unitary ring and A \in R^{n \times n}, A = (a_{ij})_{ij}. Set S := \Z[x_{ij} \mid 1 \le i, j \le n] and consider the ring homomorphism \varphi : S \to R, f \mapsto f(a_{11}, a_{12}, \dots, a_{nn}). Over S, consider the matrix B := (x_{ij})_{ij}. Now \varphi induces S-algebra homomorphisms \varphi^* : S^{n \times n} \to R^{n \times n} and \varphi' : S[x] \to R[x] with \varphi^*(B) = A. Clearly, they satisfy \varphi'(\chi_B) = \chi_A and \varphi^*(\chi_B(B)) = \chi_A(A). Therefore, it suffices to prove \chi_B(B) = 0.
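
For n = 2, for instance, we have B = \Matrix{ x_{11} & x_{12} \\ x_{21} & x_{22} } with \chi_B = x^2 - (x_{11} + x_{22}) x + (x_{11} x_{22} - x_{12} x_{21}), and the claim \chi_B(B) = 0 amounts to the matrix identity

B^2 - (x_{11} + x_{22}) B + (x_{11} x_{22} - x_{12} x_{21}) E_2 = 0

in S^{2 \times 2}, which can be checked by a direct computation.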

Now \C has infinite transcendence degree over \Q (otherwise, \C would be countable), whence there exists an embedding \psi : S \to \C; simply choose n^2 algebraically independent elements in \C and map the x_{ij} to them. Again, we get injective maps \psi^* : S^{n \times n} \to \C^{n \times n} and \psi' : S[x] \to \C[x] which satisfy \psi'(\chi_B) = \chi_{\psi^*(B)} and \chi_{\psi^*(B)}(\psi^*(B)) = \psi'(\chi_B)(\psi^*(B)) = \psi^*(\chi_B(B)). But by assumption, Cayley-Hamilton holds over \C, whence \chi_{\psi^*(B)}(\psi^*(B)) = 0. Since \psi^* is injective, \chi_B(B) = 0, which implies \chi_A(A) = 0 as explained above.

Now we can concentrate on showing the Theorem of Cayley-Hamilton for the complex numbers. We begin with a special case, namely the diagonalizable matrices.

Definition.

A matrix A \in R^{n \times n} is said to be diagonalizable if there exists an invertible matrix T \in GL_n(R) such that

T^{-1} A T = \Matrix{ \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n } =: diag(\lambda_1, \dots, \lambda_n)

for \lambda_1, \dots, \lambda_n \in R.
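
For example, over R = \C the matrix A = \Matrix{ 0 & 1 \\ 1 & 0 } is diagonalizable: with T = \Matrix{ 1 & 1 \\ 1 & -1 } \in GL_2(\C) one checks that T^{-1} A T = diag(1, -1).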

We then have:

Lemma.

The Theorem of Cayley-Hamilton holds for diagonalizable matrices.

Proof.

We first assume that A = diag(\lambda_1, \dots, \lambda_n). Then one gets \chi_A = \prod_{i=1}^n (x - \lambda_i), and since

(A - \lambda_i E_n) = diag(\lambda_1 - \lambda_i, \dots, \lambda_{i-1} - \lambda_i, 0, \lambda_{i+1} - \lambda_i, \dots, \lambda_n - \lambda_i)

one gets \chi_A(A) = \prod_{i=1}^n (A - \lambda_i E_n) = 0: products of diagonal matrices are formed entrywise on the diagonal, and the i-th factor contributes a zero in the i-th diagonal entry.
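
As a concrete instance, for A = diag(1, 2) one has \chi_A = (x - 1)(x - 2) and

\chi_A(A) = (A - E_2)(A - 2 E_2) = diag(0, 1) \cdot diag(-1, 0) = 0.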

Now assume that A is diagonalizable, and let T \in GL_n(R) such that T^{-1} A T = diag(\lambda_1, \dots, \lambda_n). Clearly, \det T^{-1} = (\det T)^{-1} and, therefore,

\chi_A = \det(x E_n - A) = \det T^{-1} \cdot \det(x E_n - A) \cdot \det T = \det(T^{-1} (x E_n - A) T) = \det(x E_n - T^{-1} A T) = \chi_{T^{-1} A T}.

Now write \chi_A = \sum_{i=0}^n a_i x^i with a_i \in R. Then

T^{-1} \chi_A(A) T = \sum_{i=0}^n a_i T^{-1} A^i T = \sum_{i=0}^n a_i (T^{-1} A T)^i = \chi_A(T^{-1} A T),

whence, using \chi_A = \chi_{T^{-1} A T}, we obtain T^{-1} \chi_A(A) T = \chi_{T^{-1} A T}(T^{-1} A T). But T^{-1} A T = diag(\lambda_1, \dots, \lambda_n), so the diagonal case treated above gives T^{-1} \chi_A(A) T = 0 and, hence, \chi_A(A) = 0.

We now get to the main piece of proving Cayley-Hamilton over \C:

Lemma.

Endow \C^{n \times n} with the Euclidean topology and consider the set

D := \{ A \in \C^{n \times n} \mid A \text{ diagonalizable } \}.

Then D is dense in \C^{n \times n}.

For this proof, we need two facts from linear algebra:

  • Every matrix over \C is similar to an upper triangular matrix; a matrix can be triangularized in this way if, and only if, its characteristic polynomial splits into linear factors, and by the Fundamental Theorem of Algebra this is always the case over \C.
  • An n \times n-matrix with n distinct eigenvalues is diagonalizable.
Proof.

Let A \in \C^{n \times n} be an arbitrary matrix. Then there exists a matrix T \in GL_n(\C) such that

T^{-1} A T = \Matrix{ \lambda_1 & * & \cdots & * \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & * \\ 0 & \cdots & 0 & \lambda_n }

with \lambda_1, \dots, \lambda_n \in \C. As the transcendence degree of \C over \Q is infinite, there exist elements \mu_1, \dots, \mu_n \in \C such that for every j \in \N_{>0}, the set \{ \lambda_i + \frac{1}{j} \mu_i \mid 1 \le i \le n \} has exactly n elements; for instance, choose \mu_1, \dots, \mu_n algebraically independent over \Q(\lambda_1, \dots, \lambda_n). Define A_j := T \left( T^{-1} A T + \frac{1}{j} diag(\mu_1, \dots, \mu_n) \right) T^{-1} for j \in \N_{>0}. Then A_j \to A for j \to \infty, and T^{-1} A_j T is upper triangular with the diagonal entries \lambda_1 + \frac{1}{j} \mu_1, \dots, \lambda_n + \frac{1}{j} \mu_n, so A_j has n distinct eigenvalues for every j. By the second fact above, this implies A_j \in D, whence we have found a sequence in D converging to A.
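
As a concrete illustration, consider A = \Matrix{ 0 & 1 \\ 0 & 0 } \in \C^{2 \times 2}, which is not diagonalizable (it is nilpotent but nonzero, and the only diagonalizable nilpotent matrix is 0). It is already upper triangular with \lambda_1 = \lambda_2 = 0, so we may take T = E_2, and choosing for instance \mu_1 = 1, \mu_2 = 2 yields

A_j = \Matrix{ \frac{1}{j} & 1 \\ 0 & \frac{2}{j} },

which has the two distinct eigenvalues \frac{1}{j} and \frac{2}{j}, is therefore diagonalizable, and converges to A for j \to \infty.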

Now, we are able to conclude:

Theorem (Cayley-Hamilton over the complex numbers).

Let A \in \C^{n \times n}. Then \chi_A(A) = 0.

Proof.

Set S := \{ A \in \C^{n \times n} \mid \chi_A(A) = 0 \}. Clearly, D \subseteq S and D is dense in \C^{n \times n} by the previous lemma. Hence, it suffices to show that S is closed.

But note that the map \Phi : \C^{n \times n} \to \C^{n \times n}, A \mapsto \chi_A(A) has the property that every entry of \Phi(A) is a polynomial in the entries of A; hence, \Phi is continuous. Now S = \Phi^{-1}(\{ 0 \}) is the preimage of a closed set under a continuous map, whence S is closed itself.
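
For n = 2, for instance, \Phi is given explicitly by

\Phi(A) = A^2 - (a + d) A + (ad - bc) E_2 \quad \text{for } A = \Matrix{ a & b \\ c & d },

so every entry of \Phi(A) is a polynomial expression in the four entries a, b, c, d of A.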

Putting everything together, we finally obtain the proof of the theorem in general:

Proof (Cayley-Hamilton over commutative unitary rings).

By the first lemma, it suffices to show the theorem over \C. But this is accomplished by the previous theorem.
