The Group Algebra (III)

As alluded to at the end of the previous article, we shall consider the case where K is algebraically closed, i.e. every polynomial with coefficients in K factors as a product of linear polynomials. E.g. K = C is a common choice.

Having assumed this, we see that any division ring D ⊇ K which is finite-dimensional as a K-vector space must be K itself; indeed, if x ∈ D, then K(x) is a commutative division ring (i.e. a field) over K which is of finite dimension, so it must be K itself, and x ∈ K. Thus,

K[G] \cong \prod_{i=1}^k M_{n_i}(K)

and for any simple K[G]-module V, we have EndK[G](V) = K. This also gives:

|G| = \dim_K K[G] = \sum_{i=1}^k \dim(V_i)^2,

where Vi runs through all simple K[G]-modules up to isomorphism.
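As a quick sanity check, this sum-of-squares identity can be verified numerically; the sketch below uses the standard dimensions of the simple modules of S4 and S5 (assumed here; they can also be read off character tables).

```python
from math import factorial

# Sanity check of |G| = sum of dim(V_i)^2 for S4 and S5, using the
# standard irreducible dimensions (assumed known here).
dims_s4 = [1, 1, 2, 3, 3]
dims_s5 = [1, 1, 4, 4, 5, 5, 6]

check_s4 = sum(d * d for d in dims_s4) == factorial(4)   # 1+1+4+9+9 = 24
check_s5 = sum(d * d for d in dims_s5) == factorial(5)   # sums to 120
```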

In particular, this means \left< \chi_V, \chi_V\right> = 1 for any simple K[G]-module V. Thus, their characters actually form an orthonormal set over K. The next theorem we want to show is:

Theorem. The set of \chi_V, where V runs through all simple K[G]-modules, forms an orthonormal basis of the space of class functions.

Proof

We need to show the characters span the space of class functions. Let f : G → K be any class function; subtracting a linear combination of characters, we may assume f is orthogonal to all \chi_V. Let

\alpha = \sum_{g\in G} f(g^{-1})\cdot g \in K[G].

Since f is a class function, substituting h \mapsto g^{-1}hg in the sum shows g\alpha g^{-1} = \alpha for any g in G. Thus α commutes with every element of K[G]. If V is a simple K[G]-module, let m : V → V be the multiplication-by-α map; since α commutes with every element of K[G], m is a K[G]-linear map. From above, we know that EndK[G](V) = K, so m is multiplication by some scalar λ ∈ K.

To find this λ, we take the trace of m :

\begin{aligned}\lambda\cdot \dim V &= \text{tr}(\lambda\cdot 1_V) = \text{tr}(m : V\to V) =\sum_{g\in G} f(g^{-1})\text{tr}(g:V\to V)\\ &= \sum_{g\in G} f(g^{-1})\chi_V(g) = |G|\left<f,\chi_V\right>\end{aligned}

which is zero since f was picked to be orthogonal to all characters. Hence multiplication-by-α is the zero map on every simple K[G]-module V; since K[G] is semisimple, multiplication-by-α is also zero on K[G] itself. But that map takes e to α, so α = 0, and hence f = 0. ♦

Summary. This is what we know so far:

  • The number k of isomorphism classes of simple K[G]-modules is precisely the number of conjugacy classes of G (by the above theorem).
  • Their dimensions satisfy (\dim V_1)^2 + \ldots + (\dim V_k)^2 = |G|.
  • Their characters form an orthonormal basis for the space of class functions G → K.

As \chi_V runs through all characters of simple modules and g runs through conjugacy classes, we obtain a k × k table of the values \chi_V(g). Here are the character tables for S4 and S5:

[ Character table of S4 ]

The number [n] in square brackets refers to the number of elements in each conjugacy class. Let us check that \chi_2 and \chi_1 are orthogonal:

\left<\chi_2, \chi_1\right> = \frac 1 {24} (1\times 2\times 3 + 6\times 0 \times (-1) + 8\times (-1) \times 0 + 6\times 0 \times (-1) + 3\times 2 \times (-1)) = 0.
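Checks like this are easy to automate; here is a small sketch in exact rational arithmetic, with the class sizes and character values read off the five terms of the computation above.

```python
from fractions import Fraction

# Class sizes of S4 and the values of chi_1, chi_2, in the same order
# as the five terms of the orthogonality computation above.
sizes = [1, 6, 8, 6, 3]
chi1 = [3, -1, 0, -1, -1]
chi2 = [2, 0, -1, 0, 2]

def inner(chi, psi):
    # <chi, psi> = (1/|G|) sum_g chi(g^{-1}) psi(g); in S4, g and g^{-1}
    # are conjugate, so chi(g^{-1}) = chi(g).
    return sum(Fraction(n * a * b, 24) for n, a, b in zip(sizes, chi, psi))

orthogonal = inner(chi2, chi1) == 0
normalized = inner(chi1, chi1) == 1 and inner(chi2, chi2) == 1
```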

[ Character table of S5 ]

For the details of the computations, refer to an earlier article. Since both character tables consist solely of integers, one suspects that the modules are in fact defined over Q, i.e. that the characters of simple Q[G]-modules are already orthonormal. This is, in fact, true for any symmetric group Sn – but we will have to revisit this another day.

Finally, we end this article with an example of a simple K[G]-module V where End(V) is a non-commutative division ring.

Division Ring Example

Let K = R, and consider the division ring of quaternions H. Let G = {±1, ±i, ±j, ±k} ⊂ H, which is a non-abelian group of order 8. Let’s compute its character table. First note that G has 5 conjugacy classes: {+1}, {-1}, {±i}, {±j}, {±k}. Next:

  • There’s the trivial representation, where all g → 1.
  • Let N = {±1, ±i}, which is of index 2 in G and thus normal. The non-trivial representation of G/N ≅ C2 → R* then pulls back to a representation of G which is irreducible since it has dimension 1.
  • Do the same for {±1, ±j} and {±1, ±k}.

This gives us 4 irreducible representations of dimension 1. Let us compute the character table over C. Since \sum_i (\dim V_i)^2 = 8, the last character has dimension 2. The other entries are easy to compute:

[ Character table of Q8 ]
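A quick orthonormality check for this table; the rows below are the four sign characters described in the bullet points together with χ = (2, −2, 0, 0, 0), in the class order {1}, {−1}, {±i}, {±j}, {±k}.

```python
from fractions import Fraction
from itertools import combinations

# Conjugacy class sizes of G = {±1, ±i, ±j, ±k}.
sizes = [1, 1, 2, 2, 2]
table = [
    [1,  1,  1,  1,  1],   # trivial
    [1,  1,  1, -1, -1],   # pulled back from G/{±1, ±i}
    [1,  1, -1,  1, -1],   # pulled back from G/{±1, ±j}
    [1,  1, -1, -1,  1],   # pulled back from G/{±1, ±k}
    [2, -2,  0,  0,  0],   # the 2-dimensional character
]

def inner(chi, psi):
    return sum(Fraction(n * a * b, 8) for n, a, b in zip(sizes, chi, psi))

sum_of_squares = sum(row[0] ** 2 for row in table)        # = 8 = |G|
norms_ok = all(inner(r, r) == 1 for r in table)
orthogonal_ok = all(inner(r, s) == 0 for r, s in combinations(table, 2))
```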

So this is for C, but what about over R? Note from our description above that the 4 characters of dimension 1 are defined over R. Write

\mathbf{R}[G] \cong V_{\text{triv}} \oplus V_i \oplus V_j \oplus V_k \oplus W

where the character τ of W is given by \tau = \chi_{\mathbf{R}[G]} - \chi_{\text{triv}} - \chi_i - \chi_j - \chi_k = 2\chi. Is W simple?

Note that upon extension to C, we have W\otimes_{\mathbf{R}} \mathbf{C} \cong V^2, where V is the simple C[G]-module with character χ = (2, -2, 0, 0, 0) above. Thus if W is not simple, it must contain a 2-dimensional R[G]-submodule whose complexification is V, i.e. a 2-dimensional real representation with character χ.

We shall prove that such a V does not exist; let ρ : G → GL(2, R) be the corresponding group homomorphism.

  • We have ρ(+1) = I and ρ(-1) = -I (the latter because it has trace -2 from the character table).
  • The other 6 elements all satisfy g^2 = -1 so \rho(g)^2 = -I and so ρ(g) has eigenvalues ±√-1. From the character table, its trace is zero so the eigenvalues are +√-1 and -√-1.
  • Thus ρ(g) has det = 1 and trace = 0, corresponding to the product and sum of its eigenvalues.
  • It remains to show that if A, B are 2×2 real matrices with det = 1 and trace = 0, then the same cannot hold for AB.

For that, first note b, d ≠ 0 (if b = 0 then det A = -a² ≤ 0); it is then easy to show A and B are of the form:

A = \begin{pmatrix} a & b \\ -\frac{1+a^2}b & -a\end{pmatrix},\quad B=\begin{pmatrix} c & d\\ -\frac{1+c^2}d & -c\end{pmatrix}.

 Upon expanding tr(AB) we obtain:

2ac -(1+c^2)\frac b d - (1+a^2)\frac d b = 0\implies b^2 + d^2+ (bc-ad)^2 = 0.

Over the reals, this forces b = d = 0, which is absurd since b, d ≠ 0. Thus, V does not exist, so W is already simple as an R[G]-module. Hence the structure of the group ring R[G] is:

\mathbf{R}[G] \cong \mathbf{R} \times \mathbf{R} \times \mathbf{R} \times\mathbf{R} \times D

where D is a division ring of dimension 4 over R. [ In fact, it is the ring of quaternions. ]
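The polynomial identity behind the trace argument can be sanity-checked with exact arithmetic:

```python
from fractions import Fraction as F

# Exact check of the identity used above: for A, B in the displayed
# form (trace 0, det 1, b, d nonzero),
#   tr(AB) * b * d = -(b^2 + d^2 + (bc - ad)^2),
# so tr(AB) = 0 would force b = d = 0 over the reals.

def make(a, b):
    # trace-0, determinant-1 matrix with upper-right entry b != 0
    return [[a, b], [-(1 + a * a) / b, -a]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def identity_holds(a, b, c, d):
    M = mul(make(a, b), make(c, d))
    return (M[0][0] + M[1][1]) * b * d == -(b**2 + d**2 + (b*c - a*d)**2)

samples = [(F(1), F(2), F(3), F(5)), (F(-1, 2), F(1, 3), F(7), F(-2)),
           (F(0), F(1), F(0), F(1))]
all_hold = all(identity_holds(*s) for s in samples)
```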


The Group Algebra (II)

We continue our discussion of the group algebra.

Constructing K[G]-modules

Recall that such a module V is also called a representation of G over K, and corresponds to a group homomorphism \rho : G \to GL_K(V).

(i) Given a K[G]-module V, a submodule W of V is precisely a vector subspace such that g(W) ⊆ W for all g in G.

(ii) Given K[G]-modules V and W we have the direct sum V ⊕ W.

[ Image: block-diagonal matrix of g acting on V ⊕ W ]

(iii) Given K[G]-modules V and W we have the tensor product V ⊗K W. This is a K[G]-module since for each g ∈ G, the map V \times W \to V\otimes_K W taking (v, w) \mapsto gv\otimes gw is K-bilinear, so it induces a linear map V\otimes_K W \to V\otimes_K W taking v\otimes w \mapsto gv \otimes gw.

[ Image: Kronecker-product matrix of g acting on V ⊗ W ]

[ Note: if you’re not familiar with tensor product, we will discuss this in a more general setting later. ]

(iv) Given a K[G]-module V, its dual (as a K-vector space) is given by V^* := \text{Hom}_K(V, K). It is given the structure of a K[G]-module as follows:

f : V\to K, g\in G \ \mapsto \ (g\cdot f) : V \to K, v \mapsto f(g^{-1}v).

Again, the inverse is required to ensure group action occurs in the right order. The dual gives us a representation with the same dimension.

(v) More generally, given K[G]-modules V, W, let \text{Hom}_K(V, W) be the space of K-linear maps V → W. It is given the structure of a K[G]-module as follows:

f:V \to W, g\in G \ \mapsto \ (g\cdot f): V\to W, v\mapsto g\cdot f(g^{-1}v).

This clearly generalises (iv), if we assume G acts trivially on K.

[ Note: from basic linear algebra, \text{Hom}_K(V, W) \cong V^* \otimes_K W when V and W are finite-dimensional. The LHS has a K[G]-module structure via (v) while the RHS has a K[G]-module structure via (iii) and (iv). It is easy to check that they are consistent. ]


Character Theory

The character of a K[G]-module V is defined to be:

\chi_V : G\to K, \quad \chi_V(g) := \text{tr}(g : V\to V),

where tr is the trace as a K-linear map.

Note that \chi_V(hgh^{-1}) = \chi_V(g) since \text{tr}(B\cdot AB^{-1}) = \text{tr}(AB^{-1}\cdot B) = \text{tr}(A) for any square matrices A and B with B invertible. Generally, a function χ : G → K is said to be a class function if \chi(hgh^{-1}) = \chi(g) for any g, h in G. Thus characters are class functions.

Note : the set of class functions is a vector space over K, whose dimension is the number of conjugacy classes of G.

Elementary linear algebra lets us compute the characters of the direct sum, tensor product, and dual from the characters of the individual modules:

\begin{aligned}\chi_{V\oplus W}(g) &= \chi_V(g) + \chi_W(g), \\ \chi_{V\otimes W}(g) &= \chi_V(g) \times \chi_W(g),\\ \chi_{V^*}(g) &= \chi_V(g^{-1}).\end{aligned}

 As a result, we have

\chi_{\text{Hom}(V,W)}(g) = \chi_{V^* \otimes W}(g) = \chi_{V^*}(g) \chi_W(g) = \chi_V(g^{-1})\chi_W(g).
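The first two formulas are really trace identities, tr(A ⊕ B) = tr A + tr B and tr(A ⊗ B) = (tr A)(tr B); here is an illustrative check on explicit rational matrices (a sketch, not a proof).

```python
from fractions import Fraction as F

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def direct_sum(A, B):
    # block-diagonal matrix with blocks A and B
    n, m = len(A), len(B)
    Z = [[F(0)] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            Z[i][j] = A[i][j]
    for i in range(m):
        for j in range(m):
            Z[n + i][n + j] = B[i][j]
    return Z

def kron(A, B):
    # Kronecker product of square matrices A (n x n) and B (m x m)
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

A = [[F(1), F(2)], [F(3), F(4)]]
B = [[F(0), F(1)], [F(1), F(1)]]
sum_ok = trace(direct_sum(A, B)) == trace(A) + trace(B)
prod_ok = trace(kron(A, B)) == trace(A) * trace(B)
```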

The next lemma is simple yet critical.

Lemma. If V is a K[G]-module, let V^G := \{v\in V: gv = v \forall g\in G\} be the space of G-invariant vectors. Then:

\dim_K V^G = \frac 1 {|G|} \sum_{g\in G} \chi_V(g).

Proof

Let p : V → V be the K-linear map p(v) := \frac 1{|G|}\sum_{g\in G} g\cdot v. We have:

  1. Image of p lies in VG : indeed for any h in G, h\cdot p(v) = \frac 1{|G|}\sum_{g\in G} hg\cdot v \stackrel{x=hg}{=} \frac 1{|G|}\sum_{x\in G} x\cdot v = p(v).
  2. If v ∈ VG, then p(v) = v : this follows from g·v = v for all g.
  3. p² = p : for any v in V, property 1 says p(v) ∈ VG; then property 2 says p(p(v)) = p(v).

Thus, p is a projection map onto VG, and so its trace is precisely dim(VG). But that trace is precisely \frac 1{|G|} \sum_{g\in G} \text{tr}(g:V\to V) which proves our lemma. ♦
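For a concrete instance of the lemma, take the regular representation of the cyclic group C3; the invariants are spanned by e + g + g², so the averaged operator should be a projection of trace 1.

```python
from fractions import Fraction as F

# p = (1/|G|) sum_g rho(g) for the regular representation of C3.

def rho(k):
    # matrix of g^k on K[C3] with basis (e, g, g^2): a cyclic shift
    return [[F(1) if (i - j) % 3 == k else F(0) for j in range(3)]
            for i in range(3)]

def matmul(X, Y):
    return [[sum(X[i][l] * Y[l][j] for l in range(3)) for j in range(3)]
            for i in range(3)]

p = [[sum(rho(k)[i][j] for k in range(3)) / 3 for j in range(3)]
     for i in range(3)]

is_projection = matmul(p, p) == p
trace_p = sum(p[i][i] for i in range(3))   # should equal dim V^G = 1
```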


Orthogonality of Irreducible Characters

Let V, W be K[G]-modules; we now apply the above lemma to the K[G]-module U:=\text{Hom}_K(V, W). The lemma says that the dimension of the space UG is:

\frac 1{|G|} \sum_{g\in G} \chi_U(g) = \frac 1{|G|}\sum_{g\in G} \chi_V(g^{-1})\chi_W(g).

On the other hand, what is UG ? An element f of U gives a linear map V → W, and g acts on it via (g\cdot f):v \mapsto g\cdot f(g^{-1} v); thus f is G-invariant if and only if f(g\cdot v) = g\cdot f(v) for all g in G, v in V. In short, the space of G-invariant elements of U is precisely the space of intertwining operators V → W, or equivalently, K[G]-module homomorphisms V → W.

But we recall Schur’s lemma: if V and W are simple K[G]-modules, then:

  • V non-isomorphic to W ⇒ HomK[G](V, W) = 0.
  • V isomorphic to W ⇒ HomK[G](V, W) is a division ring containing K, so its dimension ≥ 1.

Let us summarise everything here.

Summary. Define an inner product on the space of class functions as follows: given class functions χ and ψ, let:

\left<\chi, \psi\right> := \frac 1 {|G|}\sum_{g\in G} \chi(g^{-1}) \psi(g).

This is a K-bilinear map on the space of class functions. If χ and ψ are characters of simple K[G]-modules, then they are orthogonal if the modules are not isomorphic. [ We call such characters irreducible characters. ]

So the collection of irreducible characters forms an orthogonal set in the space of class functions. In particular, this means they are linearly independent! In general, however, they do not span the full space of class functions.

Example

Let G = {e, g, g2} be the finite group of order 3 and K = Q. The group ring Q[G] is easily seen to be isomorphic to Q[T]/(T3-1) (via mapping g to T). This is isomorphic to Q × Q(√-3), so it has only two simple modules up to isomorphism. Yet it has 3 conjugacy classes, so the characters do not span the full space of class functions.

Note that if we had picked K = C instead, there would be no problem since T3-1 factors completely into linear factors over C. This hints that the case where K is algebraically closed is nice.


The Group Algebra (I)

[ Note: the contents of this article overlap with a previous series on character theory. ]

Let K be a field and G a finite group. The group algebra K[G] is defined to be a vector space over K with basis \{g : g\in G\}, where “g” here is an abstract symbol for each element g of G. Thus, K[G] has dimension |G| over K. Now K[G] has a ring structure obtained from group multiplication: for basis elements σ, τ ∈ G, the product σ⋅τ is precisely the group product στ ∈ G, extended bilinearly to all of K[G]. For example, multiplication in Q[S3] gives

(1∙(1,2) – 3∙ (2,3)) ∙ (2∙ (1,3) + 4e) = 2∙(1,3,2) + 4∙(1,2) – 6∙(1,2,3) – 12∙(2,3).
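This multiplication is easy to implement; here is a minimal sketch that reproduces the computation above, with permutations in one-line notation (p[i] is the image of i+1) and s·t meaning "apply t, then s".

```python
# Elements of Q[S3] are stored as {permutation: coefficient}.

e    = (1, 2, 3)
t12  = (2, 1, 3)   # (1,2)
t13  = (3, 2, 1)   # (1,3)
t23  = (1, 3, 2)   # (2,3)
c123 = (2, 3, 1)   # (1,2,3)
c132 = (3, 1, 2)   # (1,3,2)

def compose(s, t):
    # (s*t)(x) = s(t(x))
    return tuple(s[t[x] - 1] for x in range(3))

def mul(u, v):
    out = {}
    for s, a in u.items():
        for t, b in v.items():
            st = compose(s, t)
            out[st] = out.get(st, 0) + a * b
    return {k: c for k, c in out.items() if c != 0}

left = {t12: 1, t23: -3}       # 1*(1,2) - 3*(2,3)
right = {t13: 2, e: 4}         # 2*(1,3) + 4*e
product = mul(left, right)
```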

Note

The most important aspect of the group algebra is that its modules M correspond to group homomorphisms \rho : G \to GL_K(M), of G to the group of invertible K-linear maps M → M; such a ρ is called a representation of G over K. Indeed, if M is a K[G]-module, then considering how the basis elements g ∈ K[G] act on M (for g ∈ G), we obtain a homomorphism G \to GL_K(M).

Conversely, if we have a group homomorphism \rho : G\to GL_K(M), then this gives M the structure of a K[G]-module by letting \sum_{g\in G} c_g\cdot g (c_g \in K) take m ∈ M to \sum_{g\in G} c_g (\rho(g))(m).

Note that a homomorphism of K[G]-modules f : V → W is a linear map of vector spaces such that f(g⋅v) = g⋅f(v) for any g in G and v in V. Such a map is also called an intertwining map (since it commutes with every element of G).

Example 1

Every group has a trivial representation, corresponding to G → K* which maps all g to 1. As a K[G]-module, this is V := K, where \sum_{g\in G} c_g\cdot g takes every v in V to \sum_{g\in G} c_g v.

Example 2

V := K[G] is a module over the ring K[G]; this corresponds to the regular representation. E.g. if G = {e, g, g2} is the cyclic group of order 3, then the group homomorphism \rho : G \to GL_K(K^3) is given by:

\rho(e) = \begin{pmatrix} 1 &0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{pmatrix}, \quad \rho(g) = \begin{pmatrix} 0 & 0 & 1\\ 1&0&0 \\ 0&1&0\end{pmatrix}, \quad \rho(g^2) =\begin{pmatrix} 0&1&0 \\ 0&0&1 \\ 1&0&0\end{pmatrix}.
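One can check the homomorphism property of these matrices directly:

```python
# The matrices of the regular representation of C3, copied from above;
# check that rho(g)^2 = rho(g^2) and rho(g)^3 = rho(e).

I_3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Rg  = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
Rg2 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

square_ok = matmul(Rg, Rg) == Rg2
cube_ok = matmul(Rg, Rg2) == I_3
```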

Example 3

Let K = R, G = S3, M = R3 and let \rho : G \to GL(M) be the homomorphism taking g \in S_3 to the linear map

\rho_g : M\to M, \qquad (x_1, x_2, x_3) \mapsto (x_{g^{-1}(1)}, x_{g^{-1}(2)}, x_{g^{-1}(3)}).

E.g. g = (1,3,2) takes (x, y, z) to (y, z, x) and h = (1, 2) takes (x, y, z) to (y, x, z). The inverse of g in the subscripts is necessary to ensure that \rho_g \rho_h = \rho_{gh} for any g, h\in S_3. Now M is a module over R[S3], where, e.g. a(1, 2) + b(1, 3, 2) takes the point (x, y, z) to (ay+by, ax+bz, az+bx), so its corresponding matrix is \begin{pmatrix} 0 & a+b & 0 \\ a & 0 & b \\ b & 0 & a\end{pmatrix}, assuming elements of M are written as column vectors.


Main Result

Theorem. If the characteristic of K does not divide |G|, then K[G] is semisimple.

Proof

We shall prove: for each left ideal I of K[G], there is a left ideal J such that I ⊕ J = K[G] (i.e. I ∩ J = 0 and I + J = K[G]). Assuming this holds, since K[G] is of finite dimension, it must have a minimal left ideal I1. By our assumption, there exists J1 such that I1 ⊕ J1 = K[G]. Again, J1 must have a simple submodule I2, etc. Eventually, we obtain K[G] as a direct sum of minimal left ideals.

Now suppose I is a left ideal of K[G]. This is a vector subspace, so we can write K[G] = I ⊕ W for a subspace W. Let p : K[G] → K[G] be the projection I ⊕ W → I, so that p² = p and p has image I (see # later). Let us now define the K-linear map:

q : K[G] \to K[G], \quad q(\alpha) := \frac 1 {|G|} \sum_{g\in G} g\cdot p(g^{-1} \alpha) for \alpha \in K[G].

[ Note that this is well-defined since |G| is invertible in K. ] We have the following properties of q:

  1. The image of q is in I : indeed image of p lies in I and g⋅I ⊆ I for any g in G.
  2. q(α) = α for all α in I : indeed, g^{-1}\alpha \in I and since p(α) = α for all α in I, the result follows. Together with property 1, this means im(q) = I.
  3. q² = q : for any α in K[G], we have q(α) in I, so by property 2, q(q(α)) = q(α).
  4. q is a K[G]-module homomorphism: clearly q is K-linear; furthermore for any h in G, we have

h\cdot q(h^{-1}\alpha) = \frac 1 {|G|} \sum_{g\in G} hg\cdot p(g^{-1}h^{-1}\alpha)\stackrel{x=hg}{=} \frac 1{|G|} \sum_{x\in G} x\cdot p(x^{-1}\alpha) = q(\alpha) \implies q(h\alpha) = h\cdot q(\alpha)

Since q² = q, q is a projection and we have K[G] = ker(q) ⊕ im(q) = ker(q) ⊕ I (see # later). And since q is a K[G]-module homomorphism, its kernel is a submodule, and we have proven our assumption. ♦

[ # Note: along the way, we used the fact that a K-linear map p : V → V such that p² = p gives us V = ker(p) ⊕ im(p). The proof of this is an easy exercise. ]

From our classification of semisimple rings, we obtain the

Corollary. If char(K) does not divide |G|, then K[G] is isomorphic to a finite product of matrix rings over division rings. Writing K[G] \cong \prod_{i=1}^k M_{n_i}(D_i) and letting m_i = [D_i : K] gives us

|G| = \dim_K K[G] = \sum_{i=1}^k n_i^2 m_i,

where k is the number of isomorphism classes of simple K[G]-modules. [ Note: the simple K[G]-modules are also called irreducible representations of G. ]


Structure of Semisimple Rings

It turns out there is a nice classification for semisimple rings.

Theorem. Any semisimple ring R is a finite product:

R \cong \prod_{i=1}^k M_{n_i}(D_i),

where each D_i is a division ring and M_n(D) is the ring of n × n matrices with entries in D. Furthermore, the list of (n_i, D_i) is unique up to permutation and up to isomorphism of the D_i.

We saw that a semisimple ring R is a finite direct sum of simple submodules (left ideals):

R = M_1^{n_1} \oplus M_2^{n_2} \oplus \ldots \oplus M_k^{n_k},

where the simple modules M_1, \ldots, M_k are pairwise non-isomorphic. Schur’s lemma says that for simple modules M and M’, \text{Hom}_R(M, M') is zero if M and M’ are not isomorphic, and is a division ring otherwise.

More generally, for a simple module M, we have:

\text{Hom}(M^m, M^n) \cong \text{Hom}(M, M)^{mn} \cong D^{mn}

where D is a division ring.

[ Note: in general, we have \text{Hom}(\oplus_i M_i, N) \cong \prod_i \text{Hom}(M_i, N) and \text{Hom}(M, \prod_i N_i) \cong \prod_i \text{Hom}(M, N_i) by the universal properties of direct sum and product. When we have finitely many terms, the direct sum is the direct product. ]

Next we have the isomorphism \text{Hom}_R(R,R) \cong R^{op}, where

x\in R^{op} \quad \leftrightarrow \quad f_x : R \to R, z \mapsto zx.

[ We need to take the opposite ring since f_y\circ f_x = f_{xy}. ]

Piecing all these together, we have:

R^{op} \cong \text{Hom}(M_1^{n_1}\oplus \ldots \oplus M_k^{n_k}, M_1^{n_1} \oplus \ldots \oplus M_k^{n_k})\cong \oplus_{i,j} \text{Hom}(M_i^{n_i}, M_j^{n_j}).

By the above discussion, each term \text{Hom}(M_i^{n_i}, M_j^{n_j}) is either 0 (if i ≠ j) or isomorphic to M_{n_i}(D_i), where D_i := \text{Hom}(M_i, M_i) is a division ring. Taking opposite rings, and noting that M_n(D)^{op} \cong M_n(D^{op}) via transpose and that D^{op} is again a division ring, we conclude (after relabelling the D_i):

R\cong \prod_{i=1}^k M_{n_i}(D_i) for division rings D_i.

[ We leave it to the reader to prove that matrix product in M_n(\text{Hom}(M, M)) corresponds to composition of endomorphisms \text{Hom}(M^n, M^n). ]

To show that the set of (n_i, D_i) is unique up to isomorphism, note:

Proposition. The ring R = M_n(D), for division ring D, has a unique simple module up to isomorphism, namely D^n.

Proof

Indeed, R is a direct sum of column vectors. E.g. for n = 2, we have:

M_2(D) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix}\right\} = \left\{ \begin{pmatrix} a & 0 \\ b & 0\end{pmatrix}\right\} \oplus \left\{ \begin{pmatrix} 0 & c\\ 0 & d\end{pmatrix}\right\}.

It is easy to see that the space of column vectors D^n is simple as an R-module. On the other hand, since every simple R-module occurs as a simple left ideal of R, D^n is the only possible simple R-module. ♦
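The key point — that each column is a left ideal — can be brute-force checked for M2(Q):

```python
from fractions import Fraction as F
from itertools import product

# Matrices supported on the first column form a left ideal of M_2(Q):
# left multiplication by any matrix keeps the second column zero.
# Brute-force check over a small grid of rational values.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

vals = [F(-1), F(0), F(2)]
closed = all(
    all(matmul([[r, s], [t, u]], [[a, F(0)], [c, F(0)]])[i][1] == 0
        for i in range(2))
    for r, s, t, u, a, c in product(vals, repeat=6)
)
```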

Hence, for a ring R = \prod_{i=1}^k M_{n_i}(D_i), there are exactly k simple modules up to isomorphism; the i-th simple module occurs n_i times in R and its endomorphism ring is isomorphic to D_i^{op}. This shows that we can recover the (n_i, D_i) from the ring R itself, so the list must be unique.

The theorem also shows:

Corollary. R is semisimple iff its opposite ring R^{op} is. Another way of saying this is: R is “left semisimple” iff it is “right semisimple”.

 Coming up next, the most interesting example of semisimple rings: group rings.


Semisimple Rings and Modules

After discussing simple modules, the next best thing is to look at semisimple modules, which are just direct sums of simple modules. Here’s a summary of the results we’ll prove:

  • A module is semisimple iff it is a sum of simple submodules.
  • Quotients, sums, direct sums and submodules of semisimple modules are also semisimple.


Semisimple Modules

Again, we fix a base ring R, possibly non-commutative. All modules are left modules.

Definition. An R-module M is said to be semisimple if it is a sum of simple submodules.

The key theorem we wish to prove is the following.

Theorem. Let M be a semisimple R-module and N \subseteq M a submodule. Then we can find simple submodules M_i \subseteq M (indexed by i\in I) such that

M = N \oplus \left( \oplus_{i\in I} M_i \right).

The “direct sum” ⊕ means that every element m of M is uniquely writable as a sum n + \sum_i m_i, where n\in N, m_i\in M_i and only finitely many terms are non-zero.

Proof

This will be by Zorn’s lemma. Consider collections ∑ of simple submodules S of M such that:

M_\Sigma := N \oplus \left( \oplus_{S\in \Sigma} S\right)

is a direct sum. Note that at least one ∑ exists, i.e. ∑ = ∅ is valid (in which case we get M_\emptyset = N). [ For those who worry about set-theoretic validity, note that the collection of all such ∑ forms a bona fide set. ]

To apply Zorn’s, we need to prove that every chain of ∑’s has an upper bound.

Suppose \{\Sigma_\alpha\}_\alpha is a chain: i.e. for any \Sigma_\alpha, \Sigma_\beta, either \Sigma_\alpha\subseteq \Sigma_\beta or \Sigma_\beta \subseteq \Sigma_\alpha. Let \Sigma = \cup_\alpha \Sigma_\alpha; let us show that M_\Sigma = N \oplus \left(\oplus_{S\in\Sigma} S\right) is a direct sum.

  • If not, then n + \sum_{S\in \Sigma} m_S = 0 for some n\in N, m_S \in S, not all zero. But only finitely many terms are non-zero, so all the S which occur already lie in some single \Sigma_\alpha (since the \Sigma_\alpha‘s form a chain), contradicting the directness of M_{\Sigma_\alpha}.

Thus, the chain \{\Sigma_\alpha\}_\alpha has an upper bound. Zorn’s lemma tells us there is a maximal ∑. If M_\Sigma \ne M, pick m \in M-M_\Sigma. Since M is a sum of simple submodules, write

m = m_1 + m_2 + \ldots + m_r, \qquad m_k \in M_k, where each Mk is simple.

Since m\not\in M_\Sigma we have M_k \not\subseteq M_\Sigma for some k. But this means M_k \cap M_\Sigma is a proper submodule of Mk, and must be zero (since Mk is simple). Hence M_k \oplus M_\Sigma is a direct sum, so we could have added the simple module Mk to the collection ∑, contradicting its maximality. Thus, M_\Sigma = M and we’re done. ♦

Now we’re ready to prove all the necessary properties of semisimple modules.

Corollary 1. Every semisimple module M is a direct sum of simple modules.

Proof. Pick N = 0 in the theorem. ♦

Corollary 2. If each N_i\subseteq M is a semisimple submodule of a module M, then so is N :=\sum N_i.

Proof. Each N_i is a sum of simple modules; by definition so is N. ♦

Corollary 3. If N is a submodule of a semisimple M, then there is a submodule P of M such that M = N\oplus P.

Proof. Apply the theorem and let P := \oplus_{i\in I} M_i. ♦

Corollary 4. A submodule and quotient of a semisimple module M is semisimple.

Proof. Quotient follows from the theorem, since M/N \cong \oplus_{i\in I} M_i is a direct sum of simple modules. Submodule then follows: writing M = N \oplus P as in Corollary 3, N \cong M/P is a quotient of M, hence semisimple. ♦


Semisimple Rings

Definition. The ring R is semisimple if it is a semisimple module over itself.

The main result we want to show is:

Theorem. Any module over a semisimple ring R is semisimple.

Proof

Let M be a module. If m is a non-zero element of M, take the homomorphism f : R → M which takes r → rm. Then Rm is a submodule of M isomorphic to R/ker(f), which is a semisimple R-module since R is. Thus Rm is semisimple. Since M = \sum_m Rm is a sum of semisimple submodules, M is also semisimple. ♦

Let us look at some ways to create semisimple rings.

Proposition. (i) If I is a (two-sided) ideal of semisimple ring R, then R/I is a semisimple ring.

(ii) If R and S are semisimple rings, so is R × S.

Proof

(i) Write R as a sum of simple left ideals J. The image of each J in R/I, i.e. (J + I)/I \cong J/(J ∩ I), is either 0 or simple (being a quotient of the simple module J). Hence R/I is a sum of simple left ideals, so it is semisimple.

(ii) Any left ideal M of R × S is of the form I × J, for left ideal I of R and J of S. [ To see why, multiply elements of M by (1, 0) and (0, 1). ] Since I and J are both sums of simple submodules, so is I × J. ♦

Finally, decomposing R gives us a complete list of simple R-modules.

Proposition. Let R be a semisimple ring; write R = \oplus_i N_i as a direct sum of simple left ideals. Then any simple module M is isomorphic to some N_i. In particular, there are only finitely many simple R-modules up to isomorphism.

[ Note: the N_i which occur may repeat; in the extreme case, we can even have R \cong N^k for a single simple module N; this just means N is the only simple R-module up to isomorphism. ]

Proof

We know that any simple module M is isomorphic to a quotient R/I for a maximal left ideal I of R. For each i, consider

f_i : N_i \to \oplus_i N_i \to (\oplus_i N_i)/I = M.

Since N_i, M are both simple, f_i = 0 or an isomorphism. If all f_i=0, then so is \sum_i f_i : R = \oplus_i N_i \to M which is absurd. Hence some f_i is an isomorphism, which proves the first statement.

The second statement follows from the following lemma. ♦

Lemma. Writing the base ring as a direct sum of submodules R = \oplus_i N_i, only finitely many of the modules are non-zero. 

Proof

Indeed, write 1 as a finite sum x_1 + x_2 + \ldots + x_k where x_i \in N_i. For an N_i not in this list, any y\in N_i gives:

y = y\cdot 1 = y x_1 + y x_2 + \ldots + y x_k \in N_1 + N_2 + \ldots + N_k.

So y \in N_i \cap (N_1 + N_2 + \ldots + N_k) = 0, since the sum of the N_i is direct and N_i is not among N_1, \ldots, N_k. Hence y = 0 and N_i = 0. ♦


Examples

1. Every field K is semisimple, since K itself is a simple K-module.

2. The ring Z is not semisimple. [ Why? ]

3. Every finite abelian group M is a product of cyclic groups. Thus M is semisimple as a Z-module if and only if it is a product of prime cyclic groups. Indeed, each prime cyclic group is clearly simple, so a product of such groups is semisimple. Conversely, if M contains a subgroup N isomorphic to Z/pr for r>1, then N has a unique simple subgroup pr-1Z/pr, so N is not semisimple; since submodules of semisimple modules are semisimple, neither is M.

4. Is the ring R = R[x]/(x2) semisimple? What about S = R[x]/(x2 – 1)? [ Answer: no to the first question, because its only simple left ideal is Rx, so 1 does not lie in the sum of all simple left ideals. Yes to the second, since it is isomorphic to R × R. ]

5. Let R be the ring of upper triangular real 2 × 2 matrices \begin{pmatrix} a & b \\ 0 & d\end{pmatrix}. Is this semisimple? [ Answer: no, let R act on the space of column vectors R2. Call this R-module M. The space N := R·(1, 0)t is a simple submodule of M, but we cannot find a submodule P for which M = N ⊕ P. Thus we get a non-semisimple R-module. ]

6. Let R be the ring of real 2 × 2 matrices. Prove that R is semisimple. [ Hint: write R as a direct sum of two simple modules, which are spaces of column vectors. ]
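For example 4, the isomorphism S ≅ R × R comes from the orthogonal idempotents (1 ± x)/2; a quick check in exact arithmetic:

```python
from fractions import Fraction as F

# Elements of R[x]/(x^2 - 1) are stored as pairs (p, q) for p + q*x,
# multiplied using x^2 = 1.

def mul(u, v):
    p1, q1 = u
    p2, q2 = v
    return (p1 * p2 + q1 * q2, p1 * q2 + q1 * p2)

e_plus = (F(1, 2), F(1, 2))    # (1 + x)/2
e_minus = (F(1, 2), F(-1, 2))  # (1 - x)/2

idempotents = (mul(e_plus, e_plus) == e_plus
               and mul(e_minus, e_minus) == e_minus)
orthogonal = mul(e_plus, e_minus) == (F(0), F(0))
sum_to_one = (e_plus[0] + e_minus[0], e_plus[1] + e_minus[1]) == (F(1), F(0))
```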


Simple Modules

We briefly talked about modules over a (possibly non-commutative) ring R. An important aspect of modules is that unlike vector spaces, modules are usually not free, i.e. they don’t have a basis. For example, take the Z-module given by Z/2Z.

[ Recall: a Z-module is just the same as an abelian group, and a Z-module homomorphism corresponds precisely to a homomorphism of abelian groups. ]

Another annoying aspect is that for a submodule N\subseteq M, we often cannot find a P\subseteq M such that M = N\oplus P (i.e. N\cap P = 0 and N + P = M, which just means that every element of M is uniquely writable as x+y, with x in N and y in P). A good example is given by N = 2Z ⊆ M = Z; for any submodule P of Z, we either get P = 0 or a non-zero submodule which must intersect N non-trivially.

Another problem is that a submodule of a free module is not necessarily free. This is less intuitive, since any subgroup of a free abelian group is actually free (even the non-finitely generated case!), so a beginning student may have problems grasping it.

Exercise : let R be the ring Z × Z, where multiplication and addition are both component-wise (e.g. (2, 1) × (-5, 3) = (-10, 3)). Prove that the R-module R has a submodule which is not free. [ Answer: take the submodule Z × {0}. ]

In this article, we will look at a simple case where modules are rather well-behaved. This is probably as good as it gets, next to vector spaces over division rings.

[ Convention: all modules are left modules, i.e. for r\in R and m\in M we have the multiplication rm\in M. This ensures that multiplication-by-r followed by multiplication-by-s is simply multiplication-by-sr, since s(rm) = (sr)m. In the case of right modules, this would have been multiplication-by-rs. ]


Simple Modules

First we have:

Definition. An R-module M is said to be simple if it is non-zero, and has no submodules except 0 and itself.

Simple modules are akin to prime numbers, and just like we don’t accept 1 as a prime, nor do we accept 0 as a simple module (it messes up the factorization terms). In the case where R is a field (or even a division ring), a simple module is the same as a vector space of dimension 1. However, in the general case not all simple modules are isomorphic:

Exercise: let R = Z. Find all simple Z-modules. [ Answer: they’re modules of the form Z/pZ for prime p. ]

Even in the case where R = Z it’s clear not every module has a simple submodule. E.g. take M = Z. So the following theory only works in rather limited cases, specifically when the ring is not too “complicated”.

[ Aside: what constitutes a “complicated” ring? Roughly speaking, these are rings which are “hard to understand”. We’ll give some examples here: fields are the easiest types of rings, and division rings aren’t too bad, although division ring extensions are much harder to describe than field extensions. Commutative rings can be classified by a value of “dimension”, which is roughly the number of parameters required to describe the ring, but these will come later. ]

A trivial but important observation:

Let N be a simple submodule of M. For any submodule N’ of M, either N ∩ N’ = 0 or N’ contains N.

If N’ is also simple, then N ∩ N’ = 0 or N = N’.

Since N ∩ N’ is a submodule of N, a simple module, it is either 0 or N itself; this proves the first statement. The second statement follows easily from the first. ♦

Another easy observation:

Any simple module is isomorphic to R/I, for a maximal left ideal I of R.

[ Note: a left ideal I of R is simply a left submodule. In other words, a left ideal is a subset I of R such that (i) (I, +) is an abelian subgroup, and (ii) for any r\in R, x \in I, we have rx \in I. ]

[ A maximal submodule of M is a submodule N of M such that N ≠ M and, whenever N \subseteq P \subseteq M, we must have PN or PM. From the correspondence between (a) submodules of M/N and (b) submodules of M containing N, we conclude that N is maximal iff M/N is simple. ]

Proof

If I is a maximal left ideal (and thus maximal submodule) of R, then as noted above, R/I is a simple module. Conversely, let M be a simple R-module; pick a non-zero element m of M. Take the R-module homomorphism f:R \to M, r \mapsto rm. Its image is a non-zero submodule of M so must be the whole of M. Thus f is surjective and M \cong R/\text{ker} f. ♦

Finally, the following result is extremely important!

Schur’s Lemma. If f:M \to N is a homomorphism of simple modules, then either f=0 or f is an isomorphism. In particular, the ring \text{End}_R(M) is a division ring for a simple module M.

Proof

The kernel of f is a submodule of M, so it is either 0 or the whole of M. Likewise, the image of f must be 0 or the whole of N. By considering the various cases, we see that either f = 0 (in which case ker(f) = M and im(f) = 0) or f is an isomorphism (in which case ker(f) = 0 and im(f) = N). The second statement follows from the first, since the product in End(M) is just composition of endomorphisms. ♦


Some Examples

Let’s consider some concrete cases.

  • Take R = Z and M = Z/5Z, which is a simple R-module. To find the endomorphism ring End(M), consider an f : M → M. This is wholly determined by f(1), so we obtain a map End(M) → M which takes f to f(1). You can check that this is surjective, so End(M) = Z/5, a field.
  • Take R = R × R, where R is the field of real numbers, and M = R × {0}, an R-module which is clearly simple. Again an endomorphism f : M → M is uniquely determined by f((1,0)), so we have End(M) = R, a field.
  • Let R = R × R again, and M = R × {0}, N = {0} × R be simple R-modules. Are they isomorphic as R-modules? What about M' = \{(x, x) : x\in\mathbf{R}\} and N' = \{(x, 2x): x\in\mathbf{R}\}?  [ Answer: no for the first question, since (0, 1)·M = 0 but (0, 1)·N ≠ 0. Yes to the second question. ]
  • Let R be the ring of upper-triangular 2 × 2 matrices with real entries, i.e. \begin{pmatrix} a & b \\ 0 & d\end{pmatrix}, a,b,d\in\mathbf{R}. Let M be the space R^2, on which R acts by matrix-vector multiplication. Find a simple submodule of M. [ Answer: take the subspace spanned by (1, 0)^T. ]
  • So far all the End(M) we’ve seen are commutative. For a cheap way to get a non-commutative example, let R be a division ring and M = R. Any R-linear f : M → M is of the form f(m) = mr for some r. [ Question: why mr and not rm? ] Composing f(m) = mr with g(m) = ms gives g(f(m)) = mrs, and we have End(M) = R^op (the opposite ring!). Not every division ring is isomorphic to its opposite ring, but it’s hard to construct an example.

Coming up next…

This blog has been dormant for a while, as I’ve been doing quite a bit of self-reading and ruminating over the stuff I’ve read. I’d really like to post some of my thoughts, but there’s always the risk of misleading some of my readers.

So all subsequent posts come with a disclaimer: please proceed carefully. There may be mistakes here, or worse, misconceptions. But I’d rather run the risk of appearing silly than avoid posting and remain so.

Another warning: many of the subsequent topics may be rather hard, so here goes …


From Euler Characteristics to Cohomology (II)

Boundary Maps

Here’s a brief recap of the previous article: we learnt that in refining a cell decomposition of an object M, we can, at each step, pick an i-dimensional cell and divide it in two. In this way, we introduce an additional i-dimensional cell and an (i-1)-dimensional cell.

fpart

Hence even after successive steps, the Euler characteristic \chi(M) := \sum_i (-1)^i n_i, where ni denotes the number of i-dimensional cells, is constant. So χ(M) is independent of the cell decomposition we pick for M and can be used to distinguish between topological spaces, i.e. if χ(M)≠χ(N) then M and N are not homeomorphic as topological spaces.

In fact, it is possible to obtain even finer distinguishing characteristics for the topology of M, known as Betti numbers. For starters, consider the boundary function ∂ which takes a cell to its boundary ∂C, while taking into account its orientation. If C is i-dimensional, then ∂C is a union of (i-1)-dimensional cells. For example, in the following diagram, ∂C can be expressed as a sum of 5 edges:

boundary1

The critical observation is that when we partition a cell into two, written as C \mapsto C_1 + C_2, then the boundary map is additive: \partial C = \partial C_1 + \partial C_2 as can be seen in the following diagram:

boundarypart

since the two edges which are equal but in opposite directions cancel each other out. The next property we note is that ∂(∂C) = 0, as illustrated by the following diagram.

doubleboundary

Betti Numbers

This suggests the following: let Si denote the set of i-dimensional cells in the decomposition. An i-chain is defined as a formal sum D = \alpha_1 T_1 + \alpha_2 T_2 + \dots + \alpha_k T_k, where T_1, \ldots, T_k\in S_i are i-dimensional cells and \alpha_1, \ldots, \alpha_k\in \mathbb{R}. These sums are formal in the sense that we merely treat them symbolically; the set of such sums forms a real vector space \mathbb{R}^{S_i} via the following operations

  • addition: \left(\sum_i \alpha_i T_i\right) +\left(\sum_i \beta_i T_i\right) = \sum_i (\alpha_i + \beta_i) T_i;
  • scalar multiplication: c\cdot \left(\sum_i \alpha_i T_i\right) = \sum_i (c\alpha_i) T_i.

The boundary map then gives a linear map:

\partial_i : \mathbb{R}^{S_i} \to \mathbb{R}^{S_{i-1}},

such that the composition \partial_{i}\circ\partial_{i+1} : \mathbb{R}^{S_{i+1}} \to \mathbb{R}^{S_{i-1}} is the zero map. It follows from basic linear algebra that we have \text{im}(\partial_{i+1}) \subseteq \text{ker}(\partial_i) as subspaces of \mathbb{R}^{S_i}.

Definition. The i-th Betti number, denoted bi, is the dimension of the quotient \text{ker}(\partial_i)/\text{im}(\partial_{i+1}).

We’ll briefly justify why this definition is independent of our choice of cell decomposition. It suffices to reduce to the case where we partition an i-dimensional cell C \mapsto C_1 + C_2 and thus add an (i-1)-dimensional cell D.

fpart_2

The boundary maps are then modified by adding the following relations:

\begin{aligned} \partial'_{i+1} &= s\circ\partial_{i+1},\ \mbox{ where } s(C) = C_1 + C_2\\ \partial_i'(C_1) &= S + D, \ \partial_i'(C_2) = T - D, \ \mbox{ where } S+T = \partial_i(C)\\ \partial_{i-1}'(D) &= \ldots\end{aligned}

from which we can show that \text{ker}(\partial'_j)/\text{im}(\partial'_{j+1}) is isomorphic to \text{ker}(\partial_j)/\text{im}(\partial_{j+1}) for j = i+1, i, i-1, i-2.

Sample Computation: Torus

Consider the torus, which is obtained by gluing the opposite sides of a square together in the same direction:

torushomology

The above cellular decomposition comprises the following cells:

  • 2-cells : {A, B}, oriented clockwise;
  • 1-cells : {a, b, c}, with orientations as above;
  • 0-cells : {v}, since all four vertices of the square are glued together.

Then we have the following relations:

\begin{aligned} \partial_2(A) &= a - b + c,\ \partial_2(B) = b - c - a; \\ \partial_1(a) &= \partial_1(b) = \partial_1(c) = v - v = 0;\\ \partial_0(v)&= 0.\end{aligned}

Now we can easily compute the Betti numbers of the torus.

  • First, \text{ker}(\partial_2) is spanned by a single element A+B so it has dimension 1. Thus b_2 = 1-0 = 1.
  • Next, \text{im}(\partial_2) is spanned by a - b + c and has dimension 1. And \text{ker}(\partial_1) is the full space spanned by {a, b, c}, so b_1 = 3-1 = 2.
  • Finally, \text{im}(\partial_1) =0 and \text{ker}(\partial_0) is spanned by v and has dimension 1, so b_0 = 1-0 = 1.
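Since each b_i is just dim ker(∂_i) minus rank(∂_{i+1}), computations like the one above can be automated with ordinary linear algebra. Here is a minimal sketch using numpy’s matrix rank (the helper `betti_numbers` is our own, not a standard routine):

```python
import numpy as np

def betti_numbers(boundaries, dims):
    # boundaries[i] is the matrix of d_i : R^{S_i} -> R^{S_{i-1}}; dims[i] = |S_i|.
    # b_i = dim ker(d_i) - rank(d_{i+1}).
    ranks = [int(np.linalg.matrix_rank(B)) if B.size else 0 for B in boundaries]
    betti = []
    for i, n in enumerate(dims):
        kernel_dim = n - ranks[i]
        image_dim = ranks[i + 1] if i + 1 < len(ranks) else 0
        betti.append(kernel_dim - image_dim)
    return betti

# Torus: one 0-cell v, three 1-cells a, b, c, two 2-cells A, B.
d0 = np.zeros((0, 1))                  # boundary of v is empty
d1 = np.zeros((1, 3))                  # d(a) = d(b) = d(c) = 0
d2 = np.array([[1, -1],                # d(A) = a - b + c
               [-1, 1],                # d(B) = -a + b - c
               [1, -1]])
print(betti_numbers([d0, d1, d2], [1, 3, 2]))   # [1, 2, 1]
```

Ranks suffice because the coefficients are real; over the integers one needs to be more careful about torsion.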

Exercise

Verify the Betti numbers of the following surfaces:

many_betti_numbers1

blue-lin

Homology Groups

In fact, by considering the abelian groups \mathbb{Z}^{S_i} instead of vector spaces \mathbb{R}^{S_i} and taking group quotients instead of vector space quotients, one can obtain an even finer distinguisher of topological spaces. The resulting groups are called homology groups and denoted by H_i(M), where M is the underlying object and i=0, 1, 2, … .

For example, consider the projective plane and the square which have the same Betti numbers.

two_shapes

If we let M be the projective plane on the left, we get:

\mathbb{Z}^2 \stackrel{\partial_2}\longrightarrow\mathbb{Z}^3 \stackrel{\partial_1}\longrightarrow\mathbb{Z}^2 \stackrel{\partial_0}\longrightarrow 0,

\partial_2:\begin{matrix}A\mapsto a+b+c,\\B\mapsto a-c+b\end{matrix}, \qquad \partial_1: \begin{matrix} a\mapsto x-y,\\ b\mapsto y-x, \\ c\mapsto x-x=0\end{matrix}, \qquad \partial_0: \begin{matrix} x\mapsto 0, \\ y\mapsto 0.\end{matrix}

We still have \text{ker} \partial_2 = 0 so the second homology group is H_2(M) = 0. Next, \text{im} \partial_2 has Z-basis a+b+c and a-c+b, or equivalently, a+b+c and 2c. On the other hand, \text{ker}\partial_1 has Z-basis a+b and c. Thus, the resulting group quotient is H_1(M) = \mathbb{Z}/2. Finally, \text{im}\partial_1 has basis x - y while \text{ker} \partial_0 has basis x, y, so the homology group is H_0(M) = \mathbb{Z}.

On the other hand, we leave it to the reader to check that if N is the square, then H_2(N) = H_1(N) = 0 and H_0(N) = \mathbb{Z}. Thus, the homology groups may differ even if the Betti numbers are the same, but computing the homology groups is a little harder than finding the Betti numbers. This form of homology is known as cellular homology.
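The Z/2 torsion found above can be recovered mechanically via a Smith normal form computation. A sketch assuming sympy is available, writing the generators of im ∂_2 in the Z-basis {a+b, c} of ker ∂_1 as in the text:

```python
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

# Columns: d(A) = a+b+c -> (1, 1) and d(B) = a+b-c -> (1, -1),
# expressed in the Z-basis {a+b, c} of ker(d_1).
R = Matrix([[1, 1],
            [1, -1]])
S = smith_normal_form(R, domain=ZZ)
print(S)   # diagonal entries 1 and 2, so the quotient is Z/1 x Z/2 = Z/2
```

The invariant factors on the diagonal read off the quotient group directly.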

Exercise

Compute the homology groups for each of the nine shapes in the previous exercise.

Simplicial Homology

At higher dimensions, it will be a hassle to illustrate the various cells and their boundaries. Thankfully, in the case of simplicial complexes, this can be calculated combinatorially without any need for diagrams. Here, each m-dimensional cell is written as an (m+1)-tuple of points: \Delta = [ v_0, v_1, \ldots, v_m ] with its boundary given by:

\partial_m([v_0, v_1, \ldots, v_m]) := \sum_{i=0}^m (-1)^i [v_0, \ldots, \hat{v_i}, \ldots, v_m],

where the i-th term has all vertices except v_i. E.g. \partial_2([a, b, c]) = [b,c] - [a,c] + [a,b]. As an exercise, check that under this definition, we have \partial_{m-1}(\partial_m(\Delta)) = 0.
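This combinatorial boundary rule is easy to implement. The following sketch represents a chain as a dict from vertex tuples to integer coefficients (an ad hoc encoding of ours) and checks ∂∘∂ = 0 on a 2-simplex:

```python
def boundary(chain):
    # chain: {simplex (tuple of vertices): integer coefficient}
    out = {}
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]        # omit v_i
            out[face] = out.get(face, 0) + (-1) ** i * coeff
    return {s: c for s, c in out.items() if c != 0}     # drop cancelled terms

d = boundary({('a', 'b', 'c'): 1})
print(d)             # {('b', 'c'): 1, ('a', 'c'): -1, ('a', 'b'): 1}
print(boundary(d))   # {} -- the boundary of a boundary vanishes
```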

Example

Suppose we have the following simplicial complex which is the union of a tetrahedron’s surface and a triangle.

simplicial_complex_eg

This comprises four 2-simplices, eight 1-simplices and five 0-simplices. The boundary maps are given as above, e.g. \partial_2([a, c, d]) = [c,d] - [a,d] + [a,c], \partial_1([a,e]) = [e] - [a] and \partial_0([d]) = 0. Now it suffices to obtain the homology groups from the sequence \mathbb{Z}^4 \to \mathbb{Z}^8 \to \mathbb{Z}^5 \to 0, which is left as an exercise for the reader.


In general, expressing a topological object as a simplicial complex can be a rather tedious affair, for the vertices of each simplex must be distinct, and two distinct simplices cannot be represented by the same tuple of vertices. E.g. in the case of a torus, one possibility is the following:

simplicial_complex_torus

which has 18 faces, 27 edges and 9 vertices. Hence, simplicial homology can be quite a hassle to compute by hand, but on the plus side one can easily write a program to compute it. Also, note that simplicial homology is a special case of cellular homology (one where all cells are simplices).

Exercise. What’s wrong with the following simplicial complex for the torus?

simplicial_complex_torus_not

Answer (highlight to read): the 1-simplices [a, c] and [a, b] are duplicated. E.g. [a, b] appears twice for the two edges at the bottom of the square.


Betti Numbers and Euler Characteristic

One can obtain the Euler characteristic of M from its Betti numbers:

Theorem. For any object M, we have:

\chi(M) = \sum_{i\ge 0} (-1)^i b_i(M).

Proof

We know from linear algebra that for any linear map T : V → W, we have dim(ker(T)) + dim(im(T)) = dim(V). Hence for each i, we have

\dim(\text{ker}(\partial_i)) + \dim(\text{im}(\partial_i)) = \dim (\mathbb{R}^{S_i})=|S_i|.

And since b_i(M) = \dim(\text{ker}(\partial_i)) - \dim(\text{im}(\partial_{i+1})), we get

\begin{aligned}\sum_i (-1)^i b_i(M) &= \sum_i (-1)^i\dim(\text{ker}(\partial_i)) - \sum_i (-1)^i \dim(\text{im}(\partial_{i+1}))\\ &=\sum_i (-1)^i \dim(\text{ker} (\partial_i)) + \sum_i (-1)^{i+1}\dim( \text{im}(\partial_{i+1})) \\ &=\sum_i (-1)^i [\dim(\text{ker} (\partial_i)) + \dim(\text{im}(\partial_i))] \\ &= \sum_i (-1)^i |S_i| =\chi(M)\end{aligned}

as desired. ♦

Mayer-Vietoris Sequence

Suppose X = M\cup N. Recall that the principle of inclusion and exclusion allows us to compute the Euler characteristic of X from those of M, N and M ∩ N. The question is: what about Betti numbers and homology groups? We would very much like to say b_m(X) = b_m(M) + b_m(N) - b_m(M\cap N) but we would be lying if we did. Instead, we have the following long exact sequence:

\begin{aligned}\ldots &\longrightarrow H_m(M\cap N) \longrightarrow H_m(M) \oplus H_m(N) \longrightarrow H_m(M\cup N) \\ &\longrightarrow H_{m-1}(M\cap N) \longrightarrow H_{m-1}(M) \oplus H_{m-1}(N) \longrightarrow H_{m-1}(M\cup N) \ldots \end{aligned}

[ Note: an exact sequence of abelian groups is a sequence of maps as above such that for any two consecutive maps A \stackrel {f}\to B \stackrel{g}\to C, we have \text{ker}(g) = \text{im}(f). Also, the direct sum \oplus of two groups is simply their product. ]

Let’s explain the various maps. First, pick a cellular decomposition of M and of N; then find a decomposition of X which is a common refinement of both. The inclusion M\hookrightarrow M\cup N then maps chains of M to those of M\cup N. Clearly, if an m-chain D of M satisfies ∂D = 0, so does its image in M\cup N. Furthermore, if the chain D is of the form ∂E for some (m+1)-chain E, then the image D’ of D in M\cup N is of the form ∂E’, where E’ is the image of E in M\cup N. Hence the inclusion i_M :M\hookrightarrow M\cup N induces a map of homology groups:

{i_M}_* :H_m(M) \to H_m(M\cup N) and similarly {i_N}_* : H_m(N) \to H_m(M\cup N).

[ Note: even though i_M and i_N are injective, in general {i_M}_* and {i_N}_* are not. ] Likewise, the inclusions j_M : M\cap N \hookrightarrow M and j_N : M\cap N\hookrightarrow N induce {j_M}_* : H_m(M\cap N) \to H_m(M) and {j_N}_* : H_m(M\cap N) \to H_m(N). Now we define:

  • H_m(M\cap N) \to H_m(M) \oplus H_m(N) as the map [D] \mapsto ({j_M}_*([D]), -{j_N}_*([D])),
  • H_m(M) \oplus H_m(N) \to H_m(M\cup N) as the map [D] \mapsto {i_M}_*([D]) + {i_N}_*([D]).

The remaining map of interest is the boundary map:

H_m(M\cup N) \to H_{m-1}(M\cap N).

For that, suppose we have an m-chain D of M\cup N such that ∂D=0. Write it as a sum of m-chains D’+D”, where D’, D” are chains of M, N respectively. Now since ∂D’ = -∂D” is an (m-1)-chain in M as well as in N, it must be an (m-1)-chain in M ∩ N; this gives rise to the map H_m(M\cup N) \to H_{m-1}(M\cap N). The boundary map is also denoted ∂, which may cause some confusion since the same symbol was earlier used for the map taking m-chains of M to (m-1)-chains of M.

[ For the conscientious reader, the map is well-defined because

  1. ∂(∂D’) = 0 so we can take the class of ∂D’ in M ∩ N;
  2. if we replace D by D+∂E, we can write E as a sum of (m+1)-chains E’+E”, where E’, E” are chains of M, N respectively; then D’ is replaced by D’+∂E’, which has the same boundary since ∂(D’+∂E’) = ∂D’;
  3. if we pick a different decomposition D = D_1 + D_2, then D_1 - D’ = D” - D_2 is a chain in M ∩ N, and thus ∂D_1 = ∂(D_1 - D’) + ∂D’ has the same image as ∂D’ modulo im(∂) in M ∩ N. ]

Example

Let’s consider the 3-sphere S^3 = \{(x,y,z,t) \in \mathbb{R}^4 : x^2+y^2 + z^2 + t^2 = 1\}. This is the union of

M = \{(x,y,z,t) \in S^3 : t\ge 0\} and N = \{(x,y,z,t) \in S^3 : t\le 0\}

with intersection M\cap N = \{(x,y,z,0) \in \mathbb{R}^4 : x^2 + y^2 + z^2 = 1\}. The long exact sequence above thus gives (together with H_4 = 0 since we’re dealing with two or three-dimensional objects here):

\begin{aligned} 0 &\to H_3(M\cap N) \to H_3(M)\oplus H_3(N) \to H_3(S^3)\\ &\to H_2(M\cap N) \to H_2(M)\oplus H_2(N) \to H_2(S^3) \\ &\to H_1(M\cap N) \to H_1(M)\oplus H_1(N) \to H_1(S^3) \\ &\to H_0(M\cap N) \to H_0(M)\oplus H_0(N) \to H_0(S^3) \to 0.\end{aligned}

Since M and N are homeomorphic to the solid ball \{(x,y,z) : x^2+y^2+z^2 \le 1\} and hence the full 3-simplex on [0, 1, 2, 3], it is easy to check that they have trivial homology in dimensions ≥ 1. On the other hand, M ∩ N is homeomorphic to the surface of a cube so we get:

H_3(S^3) \cong H_2(M\cap N) \cong \mathbb{Z}, \ H_2(S^3) \cong H_1(M\cap N) = 0

and H_1(S^3) =0 since it’s easy to check that H_0(M\cap N)\to H_0(M) \oplus H_0(N) is injective.


From Euler Characteristics to Cohomology (I)

[ Warning: this is primarily an expository article, so the proofs are not airtight, but they should be sufficiently convincing. ]

The five platonic solids were well-known among the ancient Greeks (V, E, F denote the number of vertices, edges and faces respectively):


[ Images edited from wikipedia.org. ]

In all cases, we have V-E+F=2. In fact, the same equality holds for any polyhedron which can be “deformed” into the surface of a ball. For example, if we have a pyramid with an n-gon as a base, then V=n+1, E=2n, F=n+1, which gives V-E+F=2. Or, we can glue a pyramid with a square base to a cube (assuming the side lengths match), and obtain V=9, E=16, F=9. Let’s state and prove this result.

Theorem 1. A convex polyhedron whose surface comprises V vertices, E edges and F faces satisfies V-E+F=2.

Proof

Being convex, the polyhedron can be enclosed in a sphere and its surface projected bijectively to the surface of the sphere. This gives a partition of the sphere’s surface into polygons (called a cellular decomposition). For any two cellular decompositions P and P’, we say that P’ is a refinement of P if all vertices and edges of P are also in P’.

Since any two cellular decompositions have a common refinement, it suffices to show that if P’ is a refinement of P, then they have the same V-E+F. Now any refinement may be obtained from a sequence of steps, each of which is one of the two following types:

polysteps

The left step changes (V, E, F) to (V+1, E+1, F) since it adds a vertex and an edge without introducing any new face. The right step changes (V, E, F) to (V, E+1, F+1). In both cases, there is no change to V-E+F. Hence this value is constant across all decompositions of the sphere, and picking any single example gives V-E+F=2. ♦


The above proof, while simple, has a few hidden pitfalls for the unwary. For more details, the interested reader may refer to Imre Lakatos’ book “Proofs and Refutations” for a rather in-depth look at the underlying issues as well as their historical relevance.

The simple V-E+F formula has some interesting applications.

Example 1

Let’s prove that there are at most 5 platonic solids. Suppose we have a polyhedron with F faces, each of which is an n-gon, with m edges meeting at each vertex. Then E = Fn/2 and V = 2E/m. The formula V-E+F = 2 gives

\frac {2E} m - E + \frac {2E} n = 2 \implies \frac 1 m + \frac 1 n = \frac 1 2 + \frac 1 E > \frac 1 2.

Since m, n ≥ 3, it’s easy to see that this inequality only holds for (m, n) = (3, 3), (3, 4), (3, 5), (4, 3), (5, 3), after which one can show that each case has at most one platonic solid.
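The case check can also be done by brute force: if m ≥ 6 (or n ≥ 6), then 1/m + 1/n ≤ 1/6 + 1/3 = 1/2, so a search over 3 ≤ m, n ≤ 6 is exhaustive. A quick sketch using exact rational arithmetic:

```python
from fractions import Fraction

# Enumerate all (m, n) with 3 <= m, n <= 6 satisfying 1/m + 1/n > 1/2;
# larger m or n cannot satisfy the inequality, as noted above.
solutions = [(m, n) for m in range(3, 7) for n in range(3, 7)
             if Fraction(1, m) + Fraction(1, n) > Fraction(1, 2)]
print(solutions)   # [(3, 3), (3, 4), (3, 5), (4, 3), (5, 3)]
```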

Example 2

The utility graph cannot be drawn on the sphere (and hence on the plane) without intersecting edges. Indeed, if it could, we would have V=6, E=9, and so F = 2 - V + E = 5. But since the graph has no triangles, each face has at least 4 edges, and this gives at least 5×4/2 = 10 edges, which is a contradiction.

Exercise. Prove that the complete graph with 5 vertices cannot be drawn on the sphere without intersecting edges. Read up on planar graphs.


Other Surfaces

A closer look at the proof of theorem 1 indicates that V-E+F is constant for any cellular decomposition of a suitably nice surface. For example, on the surface of a torus (doughnut), the following decomposition gives – after unfolding – four rectangular faces.

torus

Hence V=4, E=8, F=4, and V-E+F=0. This suggests that V-E+F is an indicator of a global topological property of the shape. We will call this the Euler characteristic of the surface. If we partition the surface of a two-holed doughnut, we find that its Euler characteristic is -2, and so on. This suggests the following.

Proposition 2.

The Euler characteristic of the surface of a g-holed torus is 2-2g.

multitorus

We will leave the proof to the reader since it is a matter of straightforward computation.

In fact, we don’t have to restrict ourselves to smooth surfaces (the formal term is closed manifolds); let’s look at some surfaces with edges.

Example 3

For a square, we have V=4, E=4, F=1, so the Euler characteristic = 1. Or, we can partition it into two triangles and get V=4, E=5, F=2.

Example 4

Consider the Möbius strip, which is obtained by gluing a pair of opposite sides of a square after flipping one edge. By dividing the square into two triangles, we get V=2 (since A = D, B = C), E=4 (since AC = DB) and F=2, so the Euler characteristic is 0.

mobius_square

Example 5

The Klein bottle is obtained by gluing the two opposite sides of the Möbius strip together, i.e. AB to CD (preserving the direction). This gives V=1 (since A=B=C=D), E=3 and F=2 so the Euler characteristic is 0.

Example 6

The projective plane is obtained by gluing the opposite edges of a square in opposite directions, as follows.

projplane

This gives V=2 (since the opposite vertices of the square are glued together), E=2, F=1, and the Euler characteristic is 1.

Example 7

In fact, you don’t even need the surface to be connected. Indeed, it’s clear that if X is the topological disjoint union of M and N, and χ(X) denotes the Euler characteristic of X, then

\chi(X) = \chi(M\coprod N) = \chi(M) + \chi(N).

More generally, we have the following result:

Principle of Inclusion and Exclusion. Suppose X is the union of M and N. Then \chi(M\cup N) = \chi(M)+\chi(N) - \chi(M\cap N).

Proof

Find a cellular decomposition of M and of N, then form a decomposition of X which is a refinement of both. The result then follows easily by counting the number of times each vertex/edge/face appears in M and N. ♦

Exercise

  • Draw the utility graph on the surface T of a torus such that the edges do not intersect. [Solution.]
  • Do the same with the complete graph on 5 vertices.
  • Find the largest n for which we can embed the complete graph on n vertices on T.
  • Read up on toroidal graphs, or more generally, topological graph theory.


Other Dimensions

The Euler characteristic can be generalised to objects of arbitrary dimensions. One-dimensional objects can be represented by graphs. E.g. the nuclear disarmament symbol can be represented by a graph with V=5 vertices and E=8 edges, so the Euler characteristic is now χ = V-E = -3.

nsymbol

Graphically, χ = 1-c, where c is the number of “holes” in the diagram. It’s easy to show that the Euler characteristic χ = V-E for 1-dimensional objects is well-defined, i.e. independent of the graph representation we pick. Indeed, the proof is identical to that of theorem 1, except that in this case we can only add a vertex to an edge, since connecting two vertices would change the object.

Thus, for a general geometric object, we can add a vertex to break an edge into two, or an edge to break a face into two, or a face to divide a block into two, … , as illustrated by the following (B denotes the number of 3-dimensional blocks):

partitioning1

This shows that for a 3-dimensional object, its Euler characteristic χ = V-E+F-B is constant. Clearly we can generalise this to arbitrarily many dimensions: if a cellular decomposition of M comprises n_i cells of dimension i, then its Euler characteristic

\chi(M) := \sum_i (-1)^i n_i

is independent of the decomposition.

Example 8

Consider a solid cube. We already know that the surface of a cube satisfies V-E+F=2. Hence, the solid cube has χ = 2-1 = 1.

Example 9

Now consider a solid cube with a small cubical hole at the centre. We could partition this solid as a union of 6 pieces of the form:

solid_shape

which gives us B=6, F=24, E=32, V=16 (after some careful counting!) and so χ=2.
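As a sanity check on the careful counting, the alternating sum is a one-liner:

```python
# Cell counts from the decomposition above: V, E, F, B in dimensions 0..3.
counts = {0: 16, 1: 32, 2: 24, 3: 6}
chi = sum((-1) ** i * n for i, n in counts.items())
print(chi)   # 2
```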

Or, we could also use the principle of inclusion and exclusion (whose proof clearly generalises to any geometric shapes). If M is the above cube with a hole, and N is the small cube at the centre, then their union is the large cube and their intersection is the surface of a cube, so we have:

\chi(M\cup N) = \chi(M) + \chi(N) - \chi(M\cap N) \implies \chi(M) = \chi(M\cap N)

which we already know is 2.

Example 10

Consider the case of dimension 0, where we have a discrete set of points. The Euler characteristic is then the cardinality of this set (i.e. number of elements in it).

Simplicial Complexes

Let us now define in a more precise manner what objects we’re looking at. Heuristically, we start with a set S of points. A simplicial complex is then represented by a collection C of subsets of S such that if T\in C, then every subset of T also lies in C.

For example, if S = {1, 2, 3, 4} and C = { Ø, {1}, {2}, {3}, {4}, {1,2}, {2,3}, {1,3}, {2,4}, {1,2,3} }, then the corresponding simplicial complex is:

cellcomplex
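The defining condition — every subset of a member of C again lies in C — is easy to verify by machine. A small sketch, checking the example above (the helper `is_simplicial_complex` is our own name):

```python
from itertools import combinations

def is_simplicial_complex(C):
    # C: a collection of frozensets of vertices; check downward closure.
    C = set(C)
    return all(frozenset(face) in C
               for T in C
               for k in range(len(T))
               for face in combinations(T, k))

C = [frozenset(s) for s in
     [(), (1,), (2,), (3,), (4,), (1, 2), (2, 3), (1, 3), (2, 4), (1, 2, 3)]]
print(is_simplicial_complex(C))                      # True
print(is_simplicial_complex([frozenset({1, 2})]))    # False: faces of {1,2} missing
```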

The formal topological definition is as follows: the n-dimensional simplex is the set

\Delta^n := \{ (x_0, x_1, \ldots, x_n) \in \mathbb{R}^{n+1} : \sum_{i=0}^n x_i = 1, x_0, x_1, \ldots, x_n \ge 0\},

or its affine transform. For example, here’s a 3-dimensional simplex.

simplex_3

The boundary of \Delta^n comprises (n+1) simplices of dimension (n-1), one for each 0 ≤ i ≤ n, obtained by taking the set of (x_0, x_1, \ldots, x_n) \in \Delta^n satisfying x_i =0. Now take each element T\in C, and form the corresponding simplex

\Delta_T := \{ (x_i) \in \mathbb{R}^S : x_i = 0 \text{ for } i\not\in T, \sum_{i\in T} x_i = 1\}.

Now the simplicial complex for C is the topological space which is the subspace \cup_{T\in C} \Delta_T \subset \mathbf{R}^S. In other words, the objects we’re looking at are topological spaces which are homeomorphic to a simplicial complex. Note that even though the construction of a simplicial complex only uses line segments, triangles, tetrahedra, and their higher-dimensional counterparts, its cellular decomposition may use more general shapes such as quadrilaterals, pentagons, cubes etc.

Note: simplicial complexes are somewhat restrictive. A more general theory involves CW-complexes. The differences are subtle, and while CW-complexes may define some structures which are not homeomorphic to any simplicial complex (see here for some anomalous examples), the two theories are equivalent at the level of homotopy equivalence, which we will define in the next article.

Multiplicativity of χ

Finally, we have the following result.

Theorem. If M and N are geometric objects, we have χ(M × N) = χ(M) × χ(N).

Proof

First, note that each cellular decomposition of a geometric object gives a disjoint union of the underlying set of points:

partition2

Write M = \coprod_i M_i and N=\coprod_j N_j as disjoint unions in this way (some of the Mi‘s or Nj‘s may have the same dimension). Then their Euler characteristics are given by:

\chi(M) = \sum_i (-1)^{\dim (M_i)},\quad \chi(N) = \sum_j (-1)^{\dim(N_j)}.

Now M\times N = \coprod_{i, j} M_i \times N_j and \dim(M_i \times N_j) = \dim M_i + \dim N_j so the Euler characteristic of M × N is:

\sum_{i, j} (-1)^{\dim (M_i \times N_j)} = \sum_{i, j} (-1)^{\dim(M_i)} (-1)^{\dim(N_j)} = \sum_i (-1)^{\dim(M_i)} \sum_j (-1)^{\dim(N_j)}

which is χ(M)χ(N). ♦

Here’s a graphical representation of the above proof.

productpartition


Elementary Module Theory (IV): Linear Algebra

Throughout this article, a general ring is denoted R while a division ring is denoted D.

Dimension of a Vector Space

First, let’s consider the dimension of a vector space V over D, denoted dim(V). If W is a subspace of V, we proved earlier that any basis of W can be extended to give a basis of V, thus dim(W) ≤ dim(V).

Furthermore, we claim that if \{v_i + W\} is a basis of the quotient space V/W, then the vi‘s, together with a basis \{w_j\} of W, form a basis of V:

  • If \sum_i r_i v_i + \sum_j r_j' w_j = 0 for some r_i, r_j' \in D, its image in V/W gives \sum_i r_i (v_i + W) = 0 and thus each r_i is zero. This gives \sum_j r_j' w_j = 0; since \{w_j\} forms a basis of W, each r_j' = 0. This proves that \{v_i\} \cup \{w_j\} is linearly independent.
  • Let v\in V. Its image v+W in V/W can be written as a linear combination \sum_i r_i (v_i + W) = v+W for some r_i \in D. Hence v - \sum_i r_i v_i \in W and can be written as a linear combination of \{w_j\}. So v can be written as a linear combination of \{v_i\} \cup \{w_j\}.

Conclusion: dim(W) + dim(V/W) = dim(V). Now if fV → W is any homomorphism of vector spaces, the first isomorphism theorem tells us that V/ker(f) is isomorphic to im(f). Hence, dim(V) = dim(ker(f)) + dim(im(f)).

If V is finite-dimensional and dim(V) = dim(W), then:

  • (f is injective) iff (ker(f) = 0) iff  (dim(ker(f)) = 0) iff (dim(im(f)) = dim(V)) iff (dim(im(f)) = dim(W)) iff (im(f) = W) iff (f is surjective).

Thus, (f is injective) iff (f is surjective) iff (f is an isomorphism).


For infinite-dimensional V and W, take the free vector spaces V = W = D^{(\mathbf{N})} and let f : V → W take the tuple (r_1, r_2, \ldots) \mapsto (0, r_1, r_2, \ldots). Then f is injective but not surjective.

Over a general ring, even if M and N are free modules, the kernel and image of f : M → N may not be free. This follows from the fact that a submodule of a free module is not free in general, as we saw earlier. Hence it doesn’t make sense to talk about dim(ker(f)) and dim(im(f)) in such cases.

In a Nutshell. The main results are:

  • for a D-linear map f : V → W, dim(V) = dim(ker(f)) + dim(im(f));
  • if dim(V) = dim(W), then f is injective iff it is surjective.


Matrix Algebra

Recall that an R-module M is free if and only if it has a basis \{m_i\}_{i\in I}, in which case we can identify R^{(I)} \cong M via (r_i)_{i\in I}\mapsto \sum_{i\in I} r_i m_i. Let’s restrict ourselves to the case of finite free modules, i.e. modules with finite bases. If M\cong R^a and N\cong R^b, the group of homomorphisms is identified with \text{Hom}(M, N)\cong R^{ab} in terms of b × a matrices in R.

Let’s make this identification a bit more explicit. Pick a basis \{m_1, \ldots, m_a\} of M and \{n_1, \ldots, n_b\} of N. We have:

R^a \cong M, \ (r_1, \ldots, r_a) \mapsto \sum_{i=1}^a r_i m_i\ and \ R^b \cong N, (r_1, \ldots, r_b)\mapsto \sum_{j=1}^b r_j n_j.

A module homomorphism fM → N is expressed as a matrix as follows:

matrix_id

Example 1

Take R = \mathbf{R}, the field of real numbers, and M = \{a + bx + cx^2 : a, b, c\in\mathbf{R}\} and N = \{a + bx : a, b\in \mathbf{R}\}, where x is an indeterminate. The map f : M → N given by f(p(x)) = dp/dx is easily checked to be R-linear.

Pick basis {1, x, x^2} of M and {1, x} of N. Since f(1) = 0, f(x) = 1 and f(x^2) = 2x, the map f takes m_1 \mapsto 0, m_2\mapsto n_1, m_3 \mapsto 2n_2. Hence, the matrix corresponding to these bases is \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2\end{pmatrix}.

On the other hand, if we pick basis {1+x, –x, 1+x2} of M and basis {1+x, 1+2x} of N, then

  • f(m_1) = f(1+x) = 1 = 2n_1 - n_2;
  • f(m_2) = -1 = -2n_1 + n_2;
  • f(m_3) = 2x = -2n_1 + 2n_2

which gives the matrix representation \begin{pmatrix} 2 & -2 & -2 \\ -1 & 1 & 2\end{pmatrix}.
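The change of basis here is mechanical: apply f to each new M-basis vector (in monomial coordinates), then solve against the new N-basis. A numpy sketch of exactly this computation:

```python
import numpy as np

# Coordinates are taken in the monomial bases {1, x, x^2} of M and {1, x} of N.
M_basis = np.array([[1, 1, 0],     # 1 + x
                    [0, -1, 0],    # -x
                    [1, 0, 1]]).T  # 1 + x^2   (columns, after transposing)
N_basis = np.array([[1, 1],        # 1 + x
                    [1, 2]]).T     # 1 + 2x
D = np.array([[0, 1, 0],           # d/dx: 1 -> 0, x -> 1, x^2 -> 2x
              [0, 0, 2]])

# Solve N_basis @ A = D @ M_basis for the matrix A of f in the new bases.
A = np.linalg.solve(N_basis, D @ M_basis)
print(A)   # [[ 2. -2. -2.]
           #  [-1.  1.  2.]]
```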

Example 2

Let M = {a + b√2 : a, b integers}, which is a Z-module. Take f : M → M which takes z to (3-√2)z. It’s clear that f is a homomorphism of additive groups and hence Z-linear. Since the domain and codomain modules are identical (M), let’s pick a single basis.

If we pick {1, √2}, then

  • f(m_1) = f(1) = 3-\sqrt 2 = 3m_1 - m_2;
  • f(m_2) = f(\sqrt 2) = -2 + 3\sqrt 2 = -2m_1 + 3m_2

thus giving the matrix representation \begin{pmatrix} 3 & -2 \\ -1 & 3\end{pmatrix}. Replacing the basis by {-1, 1+√2} would give us: \begin{pmatrix} 4 & 1 \\ 1 & 2\end{pmatrix}.
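One can sanity-check the second matrix by conjugating the first with the change-of-basis matrix P whose columns express {-1, 1+√2} in the basis {1, √2}. A sketch assuming sympy (note det = 7, the norm of 3-√2, in either basis):

```python
from sympy import Matrix

A = Matrix([[3, -2], [-1, 3]])   # multiplication by 3 - sqrt(2) in basis {1, sqrt(2)}
P = Matrix([[-1, 1],             # columns: -1 and 1 + sqrt(2)
            [0, 1]])
B = P.inv() * A * P              # the same map in the basis {-1, 1 + sqrt(2)}
print(B)   # Matrix([[4, 1], [1, 2]])
```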

Thus, the matrix representation for f : V → W depends on our choice of bases for V and W. If V = W, then it’s often convenient to pick the same basis for both.


Dual Module

We saw earlier that \text{Hom}(R, M) \cong M as R-modules. What about Hom(M, R) then?

Definition. The dual module of left-module M is defined to be M^* := \text{Hom}(M, R). This is a right R-module, via the following right action:

  • if r\in R and f:M\to R, then the resulting f\cdot r takes m\mapsto f(m)r.

From the universal property of direct sums and products, we see that:

(\oplus_{i\in I} M_i)^* \cong \prod_{i\in I} M_i^*.

Let’s check that we get a right-module structure on M*: indeed, (f\cdot r_1)\cdot r_2 takes m to (f\cdot r_1)(m)r_2 = (f(m)r_1)r_2 which is the image of f\cdot (r_1 r_2) acting on m.

The module M^* is called the dual because it’s a right module instead of a left one. Note that if N were a right module, the resulting space Hom(N, R) of all right-module homomorphisms would give us a left module N^*. It’s not true in general that M^{**} \cong M, but it holds for finite-dimensional vector spaces over a division ring.

Theorem. If V is a finite-dimensional vector space over division ring D, then V^{**} \cong V.

Proof.

Consider the map V^* \times V\to D which takes (f, v) to f(v). Fixing f, we get a map v\mapsto f(v) which is a left-module homomorphism. Fixing v, we get a right-module homomorphism f\mapsto f(v), since (f·r) corresponds to the map v\mapsto f(v)r by definition. This gives a left-module homomorphism \phi:V\to V^{**}.

Since V is finite-dimensional, \dim V^{**} = \dim V^* = \dim V, so it suffices to show \text{ker}\phi = 0. But if v\in V-\{0\}, we can extend {v} to a basis of V. Define a linear map f : V → D which takes v to 1 and all other basis elements to 0. Then (\phi(v))(f) = f(v) \ne 0, so \phi(v) \ne 0. This shows that \phi is injective and thus an isomorphism. ♦

One way to visualise the duality is via this diagram:

[Image: dual_and_module]

Exercise

It’s tempting to define a left-module structure on Hom(M, R) via (f\cdot r)(m) = f(rm). What’s wrong with this definition? [ Answer: the resulting f·r : M → R is not a left-module homomorphism in general: (f\cdot r)(sm) = f(rsm) = rs\cdot f(m), whereas s\cdot (f\cdot r)(m) = sr\cdot f(m), and these differ when R is non-commutative. ]

Dual Basis

Suppose \{ v_1, v_2, \ldots, v_n\} is a basis of V. Let f_i : V\to D (i = 1, …, n) be linear maps defined as follows:

f_i(v_j) = \begin{cases} 1, \quad &\text{ if } j = i, \\ 0, \quad &\text{ if } j\ne i.\end{cases}

Each f_i is well-defined by the universal property of the free module V. Using the Kronecker delta function, we can just write f_i(v_j) = \delta_{ij}. This is called the dual basis for \{v_1, \ldots, v_n\}.

[ Why is this a basis, you might ask? We know that dim(V*) = dim(V) = n, so it suffices to check that f_1, \ldots, f_n is linearly independent. For that, we write \sum_i f_i\cdot r_i=0 for some r_1, \ldots, r_n \in D (recall that V* is a right module). Then for each j = 1, …, n, we have

0 = \sum_i (f_i\cdot r_i)(v_j) = \sum_i f_i(v_j)r_i = \sum_i \delta_{ij}r_i = r_j

and we’re done. ]

Now if f\in V^* and v\in V, we can write f = \sum_{i=1}^n f_i c_i and v = \sum_{j=1}^n d_j v_j for some c_i, d_j \in D. Then

\begin{aligned}f(v) &= \left(\sum_{i=1}^n f_i c_i\right)\left(\sum_{j=1}^n d_j v_j\right) = \sum_{i=1}^n f_i\left(\sum_{j=1}^n d_j v_j\right)c_i \\ &= \sum_{i=1}^n \sum_{j=1}^n d_j\delta_{ij}c_i = \sum_{i=1}^n d_i c_i\end{aligned}

which is the product between a row vector & a column vector.
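In other words, once coordinates are fixed, evaluating a functional reduces to a dot product. A tiny Python sketch (names are mine):

```python
def pair(d, c):
    # d: coordinates of v in the basis {v_j};
    # c: coordinates of f in the dual basis {f_i}.
    # Then f(v) = sum_i d_i c_i, a row vector times a column vector.
    return sum(di * ci for di, ci in zip(d, c))

print(pair([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```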

One thus gets a natural pairing between a vector space and its dual. Recall that in a Euclidean vector space V = \mathbf{R}^3, there’s a natural inner product given by the usual dot product, which is inherent in the geometry of the space. However, for generic vector spaces it’s hard to find a natural inner product. E.g. what would one be for the space of all polynomials of degree at most 2? Thus, the dual space provides a “cheap” and natural way to get one.

Example

Consider the space V = \{a + bx + cx^2 : a, b, c\in \mathbf{R}\} over the reals R = \mathbf{R}. Examples of elements of V* are:

  • f\mapsto f(1) which takes (a+bx+cx^2)\mapsto a+b+c;
  • f\mapsto \left.\frac {df}{dx}\right|_{x=-1} which takes (a+bx+cx^2) \mapsto b-2c;
  • f\mapsto \int_0^1 (a+bx+cx^2) dx which takes (a+bx+cx^2) \mapsto a + \frac b 2 + \frac c 3.

It’s easy to check that these three elements of V* are linearly independent and hence form a basis. Note: in this case, the base ring is a field, so right modules are also left modules; i.e. V* and V are isomorphic as abstract vector spaces! However, there’s no “natural” isomorphism between them, since in order to establish one, we need to pick a basis of V and a basis of V* and map the corresponding elements to each other. On the other hand, the isomorphism between V** and V is completely natural.
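The independence check can be made concrete: record the values of the three functionals on the basis {1, x, x^2} as the rows of a 3×3 matrix and verify its determinant is nonzero. A Python sketch with exact rational arithmetic (my own code, not the article’s; it uses p'(-1) = b - 2c for the second functional):

```python
from fractions import Fraction

# Row i = values of the i-th functional on the basis {1, x, x^2} of V.
F = [
    [Fraction(1), Fraction(1),    Fraction(1)],     # p -> p(1)
    [Fraction(0), Fraction(1),    Fraction(-2)],    # p -> p'(-1) = b - 2c
    [Fraction(1), Fraction(1, 2), Fraction(1, 3)],  # p -> integral of p on [0, 1]
]

def det3(M):
    # Determinant by cofactor expansion along the first row.
    a, b, c = M[0]
    return (a * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - b * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + c * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# det3(F) = -5/3 != 0, so the three functionals are linearly independent.
```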

Exercise.

[ All vector spaces in this exercise are of finite dimension. ]

Let \{v_1, \ldots, v_n\} be a basis of V and \{f_1, \ldots, f_n\} be its dual basis for V*. Denote the dual basis of \{f_1, \ldots, f_n\} by \{\alpha_1, \ldots, \alpha_n\} in V**. Prove that under the isomorphism V\cong V^{**}, we have v_i = \alpha_i.

Let \{v_i\} be a basis of V and \{w_j\} be a basis of W. If T : V → W is a linear map, then the matrix representation of T with respect to bases \{v_i\}, \{w_j\} is denoted M.

  • Prove that the map T* : W* → V* which takes g : W → D to the composition g ∘ T : V → D is a linear map of right modules.
  • Let \{f_i\} be the dual basis of \{v_i\} for V* and \{g_j\} be the dual basis of \{w_j\} for W*. Prove that the matrix representation of T* with respect to bases \{f_i\}, \{g_j\} is the transpose of M.
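For a concrete instance of the second part, take the derivative map of Example 1 with standard bases; computing the matrix of T* straight from the definition (a Python sketch with my own naming) yields the transpose:

```python
# Matrix of T = d/dx from {1, x, x^2} to {1, x} (standard bases): 2 x 3.
M = [[0, 1, 0],
     [0, 0, 2]]

def T(v):
    # v = coordinates in {1, x, x^2}; returns coordinates of T(v) in {1, x}.
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(2)]

def e(n, i):
    # i-th standard basis vector of length n.
    return [1 if k == i else 0 for k in range(n)]

# Entry (i, j) of the matrix of T* is (T* g_j)(v_i) = g_j(T v_i),
# i.e. the j-th coordinate of T(v_i).
Mstar = [[T(e(3, i))[j] for j in range(2)] for i in range(3)]
# Mstar = [[0, 0], [1, 0], [0, 2]], the transpose of M.
```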


More on Duality

Let V be a finite-dimensional vector space over D and V* be its dual. We claim that there’s a 1-1 correspondence between subspaces of V and those of V*, which is inclusion-reversing. Let’s describe this:

  • if W\subseteq V is a subspace, define W^\perp := \{ f\in V^* : f(w) = 0 \text{ for all } w\in W\};
  • if X\subseteq V^* is a subspace, define X^\perp := \{v\in V : f(v) = 0 \text{ for all } f\in X\}.

The following preliminary results are easy to prove.

Proposition.

  • W^\perp is a subspace of V*;
  • X^\perp is a subspace of V;
  • if W_1\subseteq W_2\subseteq V, then W_1^\perp \supseteq W_2^\perp;
  • if X_1\subseteq X_2 \subseteq V^*, then X_1^\perp \supseteq X_2^\perp;
  • W\subseteq W^{\perp\perp} and X\subseteq X^{\perp\perp}.

We’ll skip the proof, though we’ll note that the above result in fact holds for any subsets W\subseteq V and X\subseteq V^*. This observation also helps us to remember the direction of inclusion for W\subseteq W^{\perp\perp} since in this general case, W^{\perp\perp} is the subspace of V generated by W.

The main thing we want to prove is the following:

Theorem. If W\subseteq V is a subspace, then W^{\perp\perp} = W. Likewise if X\subseteq V^* is a subspace, then X^{\perp\perp} = X.

Proof.

Pick a basis \{v_1, \ldots, v_k\} of W and extend it to a basis \{v_1, \ldots, v_n\} of V, where dim(W) = k and dim(V) = n. Let \{f_1, \ldots, f_n\} \subset V^* be the dual basis.

If v\in V-W, write v = \sum_{i=1}^n r_i v_i where each r_i\in D. Since v is outside W, r_j\ne 0 for some j>k. This gives f_j(v) = f_j(\sum_i r_i v_i) = r_j \ne 0 and f_j\in W^\perp since j>k. Hence v\not\in W^{\perp\perp} and we have W^{\perp\perp} \subseteq W.

The case for X is obtained by replacing V with V* and identifying V^{**} \cong V.  ♦

Thus we get the following correspondence:

[Image: dual_corr]

Furthermore, the dimensions “match”. E.g. suppose dim(V) = n, so dim(V*) = n. Then we claim that for any subspace W of V of dimension k,

  • \dim(W^{\perp}) = n-k;
  • V^* / W^\perp \cong W^* naturally.

Since dim(W*) = dim(W) = k, the first statement follows from the second. From the exercise above, the inclusion map W → V induces a map of the dual spaces V* → W*, namely restriction of functionals to W. This map is surjective: any functional on W extends to one on V, by extending a basis of W to a basis of V and sending the new basis vectors to 0. The kernel of this map is precisely the set of all f\in V^* such that f(w) = 0 for all w in W, which is exactly W^\perp. This proves our claim. ♦
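As a sanity check of the dimension count: identify V* with coordinate rows via a dual basis, so f ∈ W^⊥ iff f·w = 0 for each basis vector w of W, and hence dim(W^⊥) = n − rank. A Python sketch with a hypothetical W ⊆ Q⁴ (code and example are mine, not the article’s):

```python
from fractions import Fraction

def rank(rows):
    # Row rank via Gaussian elimination over the rationals.
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][col] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                M[i] = [a - M[i][col] * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# W = span of two vectors in Q^4; a functional f (written in dual-basis
# coordinates) lies in W-perp iff f . w = 0 for both basis vectors of W,
# so dim(W-perp) = n - dim(W) = 4 - 2 = 2.
W = [(1, 0, 1, 0), (0, 1, 1, 1)]
n = 4
print(n - rank(W))  # 2
```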
