Elementary Module Theory (III): Approaching Linear Algebra

The Hom Group

Continuing from the previous installment, here's another way of writing the universal properties for direct sums and products. Let Hom(M, N) be the set of all module homomorphisms M → N; then:

\text{Hom}\left(N, \prod_i M_i\right) \cong \prod_i \text{Hom}(N, M_i), \quad \text{Hom}\left(\oplus_i M_i, N\right) \cong \prod_i \text{Hom}(M_i, N) (*)

for any R-module N.

In the case where there are finitely many M_i's, the direct product and direct sum coincide, so we get:

\begin{aligned}\text{Hom}\left(\prod_{i=1}^r M_i, \prod_{j=1}^s N_j\right) \cong \prod_{i=1}^r \prod_{j=1}^s \text{Hom}(M_i, N_j).\end{aligned} (**)

This correspondence is extremely important. One can write it in matrix form: f:\prod_i M_i \to \prod_j N_j can be broken up as follows:

f(m) \leftrightarrow\begin{pmatrix} f_{11} & f_{12} & \ldots & f_{1r} \\ f_{21} & f_{22} & \ldots & f_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ f_{s1} & f_{s2} & \ldots & f_{sr} \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_r\end{pmatrix}, where f_{ji} : M_i \to N_j and m_i \in M_i.
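For intuition, here is a minimal Python sketch of this block decomposition in the simplest case R = Z, where each component f_{ji} : Z → Z is multiplication by an integer a[j][i]. The helper name `hom_from_matrix` is ours, purely illustrative:

```python
# Illustrative sketch (R = Z): a homomorphism f : Z^3 -> Z^2 decomposed
# into components f_ji : M_i -> N_j, each being multiplication by an
# integer a[j][i].

def hom_from_matrix(a):
    """Turn a matrix of components into the map (m_i) |-> (sum_i f_ji(m_i))_j."""
    return lambda m: tuple(sum(row[i] * m[i] for i in range(len(m))) for row in a)

f = hom_from_matrix([[1, 2, 0],
                     [0, 1, 3]])

assert f((1, 1, 1)) == (3, 4)
# f is additive, as a module homomorphism should be:
m, n = (1, 2, 3), (4, 5, 6)
s = tuple(x + y for x, y in zip(m, n))
assert f(s) == tuple(x + y for x, y in zip(f(m), f(n)))
```

Here the matrix acts on column vectors exactly as in the displayed formula.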

In fact, there’s more to the correspondence (**) than a mere bijection of sets:

Proposition. The set Hom(M, N) forms an abelian group: for module homomorphisms f, g:M\to N, we define:

(f+g) : M\to N, \ m\mapsto f(m) + g(m).

The identity is given by f(m) = 0 for all m; it is also denoted by 0\in \text{Hom}(M, N).

Since the proof is straightforward, we’ll leave it as a simple exercise. The bijections in (*) and (**) are thus isomorphisms of abelian groups.


Free Modules

First we define:

Definition. Let I be an index set. The free module on I is the direct sum of copies of R, indexed by elements of I:

R^{(I)} := \oplus_{i\in I} R.

In contrast, the direct product of copies of R is given by R^I := \prod_{i\in I} R.

You might wonder why we’re interested in direct sum and not the product; it’s because of its universal property.

From the correspondence in (*), we obtain:

\text{Hom}(R^{(I)}, M) \cong \prod_{i\in I} \text{Hom}(R, M).

On the other hand, it's easy to see that \text{Hom}(R, M) can be identified with M itself. Indeed, if f : R → M is a module homomorphism, then it's determined uniquely by the image f(1), since f(r) = f(r·1) = r·f(1) for any r\in R. Conversely, any element m\in M corresponds to the homomorphism f_m(r) := rm, which satisfies f_m(1) = m. Thus we get a natural group isomorphism between Hom(R, M) and M.

In fact, we can say this is an isomorphism of R-modules, if we define an R-module structure on Hom(R, M) by letting r\in R act on f via (r\cdot f) : r'\mapsto f(r'r). To check that this makes sense: r_1\cdot(r_2 \cdot f) takes r' to (r_2\cdot f)(r'r_1) = f(r'r_1 r_2), which is also the image of r' under (r_1 r_2)\cdot f.
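This compatibility check can also be run mechanically. Below is an illustrative Python sketch with R the (noncommutative) ring of 2×2 integer matrices and M = R, f = f_m; the helper names `matmul` and `act` are ours:

```python
# Sanity check of the R-action (r.f)(r') := f(r'r) on Hom(R, M), with R
# the noncommutative ring of 2x2 integer matrices and M = R, f_m(r) = r*m.
# Illustrative sketch only.

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def act(r, f):
    """(r.f)(r') = f(r' r)."""
    return lambda rp: f(matmul(rp, r))

m  = ((1, 2), (3, 4))
f  = lambda r: matmul(r, m)          # f_m in Hom(R, R)
r1 = ((0, 1), (1, 0))
r2 = ((2, 0), (1, 1))
rp = ((1, 1), (0, 1))                # a sample point r'

lhs = act(r1, act(r2, f))(rp)        # r1.(r2.f) evaluated at r'
rhs = act(matmul(r1, r2), f)(rp)     # (r1 r2).f evaluated at r'
assert lhs == rhs
```

Unwinding the definitions, both sides equal f(r' r1 r2), which is exactly the computation in the text.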

Thus, we get:

\text{Hom}(R^{(I)}, M) \cong \prod_{i\in I} \text{Hom}(R, M) \cong \prod_{i\in I} M = M^I,

where the RHS is a direct product. Thus, the free module satisfies the following universal property:

Universal Property of Free Modules. There is a 1-1 correspondence between module homomorphisms f:R^{(I)} \to M and elements of the direct product M^I.


Rank of Free Modules

Finite free modules (i.e. free modules R^{(I)} where I is finite) are particularly nice since the linear maps between them are represented by matrices: from (**),

\text{Hom}(R^n, R^m) \cong \text{Hom}(R, R)^{mn} \cong R^{mn}, i.e. each such map corresponds to an m × n matrix over R.

Composition of linear maps f:R^n \to R^m and g:R^m \to R^k then corresponds to the product of a k × m matrix and an m × n matrix. Before we proceed though, we'd like to ask a more fundamental question:

Question. If M \cong R^{(I)}, then we call #I the rank of M. Is the rank well-defined? E.g. is it possible for R \cong R\times R to occur as R-modules?

This turns out to be a rather difficult problem, since the answer is (no, yes) when phrased in the most general setting. In other words, there exist rings R for which R\cong R\times R (and hence R\cong R^n for all n>0) as R-modules! Generally, rings for which R^m \cong R^n \implies m=n are said to satisfy the Invariant Basis Number (IBN) property. There has been much study done on this, but for now we'll content ourselves with the following special cases:

  • Division rings (and hence fields) satisfy IBN.
  • Non-trivial commutative rings satisfy IBN.

The second case follows from the first if we're allowed to use a standard result in commutative algebra: every non-trivial commutative ring R (i.e. one in which 1 ≠ 0) has a maximal ideal I. Assuming this, if M \cong R^m \cong R^n, then taking the quotient module M/IM gives the isomorphism M/IM \cong (R/I)^m \cong (R/I)^n, where (R/I)^m \cong (R/I)^n is in fact an isomorphism of (R/I)-modules. Since R/I is a field, the first case tells us m=n.
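For R = Z with maximal ideal I = 2Z, the quotient trick becomes a counting argument one can carry out directly: Z^m / 2Z^m has exactly 2^m elements, so Z^m ≅ Z^n forces m = n. An illustrative Python sketch:

```python
# Counting cosets of Z^m / 2Z^m: representatives are the 0/1 vectors of
# length m, so there are 2^m of them.  Since 2^m = 2^n forces m = n,
# this recovers IBN for Z.  (Illustrative sketch.)
from itertools import product

def cosets_mod_2(m):
    """Coset representatives of Z^m / 2Z^m."""
    return list(product([0, 1], repeat=m))

assert len(cosets_mod_2(3)) == 8    # 2^3
assert len(cosets_mod_2(4)) == 16   # 2^4 != 2^3, so Z^3 is not Z^4
```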

The case of division rings will be proven later.


Basis of Free Module

Free modules are probably the most well-behaved types of modules since many of the results in standard linear algebra carry over, e.g. the presence of a basis.

Definition. A subset S of module M is said to be linearly independent if whenever r_1, \ldots, r_n\in R and m_1, \ldots, m_n\in S satisfy:

r_1 m_1 + \ldots + r_n m_n = 0,

we have r_1 = \ldots = r_n = 0.

Definition. A subset S of M is said to be a basis if it is linearly independent and generates the module M.

Clearly, a subset of a linearly independent set is linearly independent as well. On the other hand, a superset of a generating set also generates the module. Thus, a basis strikes a fine balance between the two.

A free module R^{(I)} has a standard basis \{e_i\} which is indexed by i\in I. Let e_i be the element:

(e_i)_j = \begin{cases}1, \quad &\text{ if } j=i\\ 0,\quad &\text{ if }j\ne i.\end{cases}

For example, if I = {1, 2, 3}, the standard basis is given by

e_1 = (1, 0, 0), e_2 = (0, 1, 0), e_3 = (0, 0, 1)

which is hardly surprising if you’ve done linear algebra before.

Conversely, any module with a basis is free.

Proposition. If S\subset M is a basis, with elements indexed by \{m_i\}_{i\in I}, then there’s an isomorphism:

\phi:R^{(I)} \to M, which takes (r_i)_{i\in I} \mapsto \sum_{i\in I} r_i m_i.

Sketch of Proof.

First note that the RHS sum is well-defined since only finitely many of the r_i are non-zero. The map is also clearly R-linear. The fact that it's surjective is precisely the condition that the m_i's generate M. Also, it's injective if and only if its kernel is 0, which is exactly the condition that S is linearly independent. ♦

In conclusion:

Corollary. An R-module is free if and only if it has a basis.
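As a concrete illustration for R = Z and M = Z^2: a pair of vectors is a basis exactly when the matrix having them as columns is invertible over Z, i.e. has determinant ±1 (a standard fact, used here without proof; the function name is ours):

```python
# For R = Z, M = Z^2: {v, w} is a basis iff det(v w) = ±1, i.e. the
# matrix is invertible over Z.  (Standard fact, for illustration only.)

def is_basis_of_Z2(v, w):
    return abs(v[0] * w[1] - v[1] * w[0]) == 1

assert is_basis_of_Z2((2, 1), (1, 1))      # det = 1: a basis
# det = 2: {(2,0), (0,1)} is linearly independent but cannot generate
# (1, 0), so it is not a basis:
assert not is_basis_of_Z2((2, 0), (0, 1))
```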

Clearly, not all R-modules have a basis. This is amply clear even for the case R = Z, since the finite Z-module (i.e. abelian group) Z/2 has no basis. On the other hand, for R = Z, every submodule of a free module is free. This does not hold for a general ring: e.g. for R = \mathbf{R}[x, y], the ring of polynomials in x, y with real coefficients, the ideal <x, y> is a submodule which is not free, since any two of its elements are linearly dependent.

Finally, the astute reader who had any exposure to linear algebra would not be surprised to see the following.

Theorem. Every module over a division ring has a basis, and the cardinality of the basis (i.e. the rank of the module) is well-defined.

Question.

Let \{e_i\} be the standard basis for the free module R^{(I)}. Why is it not a basis for the direct product R^I?


Linear Algebra Over Division Rings.

Let R be a division ring D for the remainder of this article. A D-module will henceforth be known as a vector space over D, in accordance with linear algebra.

Theorem (Existence of Basis). Every vector space M over D has a basis. Specifically, if S\subseteq T\subseteq M are subsets such that S is linearly independent and T is a generating set, then there's a basis B such that S\subseteq B\subseteq T.

[ In particular, any linearly independent subset can be extended to a basis and any generating set has a subset which is a basis. ]

The gist of the proof is to keep adding elements to S while keeping it linearly independent, until one can't add any more. This will result in a basis.

Proof.

First, establish the groundwork for Zorn’s lemma.

  • Let Σ be the class of all linearly independent sets U where S\subseteq U\subseteq T. Now Σ is not empty since S\in\Sigma at least.
  • Partially order Σ by inclusion.
  • If \{U_a\}\subseteq \Sigma is a chain (i.e. for any a, b we have U_a\subseteq U_b or U_b\subseteq U_a), then the union U:= \cup_a U_a is also linearly independent and satisfies S\subseteq U\subseteq T.
    • Indeed, U is linearly independent because any linear dependency \sum_i r_i m_i = 0 would involve only finitely many terms m_i \in U, so all these terms would come from a single Ua, thus violating its linear independence.

Hence, Zorn’s lemma tells us there’s a maximal linearly independent U among all S\subseteq U\subseteq T. We claim U generates M: if not, then <U> doesn’t contain T, for if it did, it would also contain <T> = M. Thus, we can pick m\in T-\left<U\right> and let U' := U\cup \{m\}. We claim U’ is linearly independent: indeed, if rm + r_1 m_1 + \ldots + r_n m_n = 0 for r, r_1, \ldots, r_n\in R and m_1, \ldots, m_n \in U, then r ≠ 0 since U is linearly independent and thus we can write:

rm = -\sum_{i=1}^n r_i m_i \implies m = \sum_{i=1}^n (-r^{-1} r_i)m_i \in \left<U\right>

which is a contradiction. Hence, U’ is a linearly independent set strictly containing U and contained in T, which violates the maximality of U. Conclusion: U is a basis. ♦
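In finite dimensions over D = Q, the Zorn's-lemma argument becomes a terminating greedy procedure: repeatedly adjoin an element of T lying outside ⟨U⟩, starting from U = S. A Python sketch (helper names are ours; exact arithmetic via `fractions`):

```python
# Greedy basis extension over Q, mirroring the proof: adjoin elements of
# T outside the span of U until none remain.  (Illustrative sketch.)
from fractions import Fraction

def in_span(v, vecs):
    """Is v in the Q-span of vecs?  Gaussian elimination with exact fractions."""
    v = list(map(Fraction, v))
    pivots = []                       # list of (pivot column, reduced row)
    for u in vecs:
        r = list(map(Fraction, u))
        for p, pr in pivots:
            if r[p]:
                c = r[p] / pr[p]
                r = [a - c * b for a, b in zip(r, pr)]
        nz = next((i for i, a in enumerate(r) if a), None)
        if nz is not None:
            pivots.append((nz, r))
    for p, pr in pivots:              # now reduce v against the pivot rows
        if v[p]:
            c = v[p] / pr[p]
            v = [a - c * b for a, b in zip(v, pr)]
    return not any(v)

def extend_to_basis(S, T):
    """Grow the independent set S inside the generating set T to a basis."""
    U = list(S)
    for t in T:
        if not in_span(t, U):
            U.append(t)
    return U

S = [(1, 1, 0)]
T = [(1, 1, 0), (1, 0, 0), (2, 1, 0), (0, 0, 1)]
B = extend_to_basis(S, T)
assert len(B) == 3 and B[0] == (1, 1, 0)
```

The loop terminates because the dimension is finite; in the infinite-dimensional case Zorn's lemma replaces this induction.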

Theorem (Uniqueness of Rank). If M \cong D^{(I)} \cong D^{(J)}, then I and J have the same cardinality.

The gist of the proof is to replace elements of one basis with another, and show that there’s an injection I → J.

Proof.

Let \{e_i\}_{i\in I} and \{f_j\}_{j\in J} be bases of M, corresponding to the above isomorphisms.

  • Take the class Σ of all injections \phi:S \to J, where S\subseteq I, such that \{e_i\}_{i\in S}\cup \{f_j\}_{j\in J-\phi(S)} is a basis of M.
  • Now Σ is not empty since it contains the empty map \emptyset \to J.
  • Partially order Σ as follows: (φ: S → J) ≤ (φ’: S’ → J) if and only if S\subseteq S' and \phi'|_S = \phi.
  • Show that if (\phi_a : S_a \to J) is a chain in Σ, then one can take the “union” \phi:\cup_a S_a \to J where \phi(s) = \phi_a(s) for any a such that s\in S_a.

Hence, Zorn's lemma applies and there's a maximal \phi : S \to J. We claim S = I. If not, pick k\in I-S. Since \{e_i\}_{i\in S}\cup \{f_j\}_{j\in J-\phi(S)} is a basis of M, write:

\begin{aligned}e_k = \sum_{i\in S} r_i e_i + \sum_{j\in J-\phi(S)} r_j f_j\end{aligned} for some r_i, r_j\in R.

Since the e_i's are linearly independent, the second sum cannot vanish (otherwise e_k would lie in the span of \{e_i\}_{i\in S}), so pick any j' for which r_{j'} \ne 0. We get:

\begin{aligned}f_{j'} = r_{j'}^{-1} e_k + \sum_{i\in S} (-r_{j'}^{-1} r_i) e_i + \sum_{j\in J-\phi(S)-\{j'\}}(-r_{j'}^{-1}r_j) f_j.\end{aligned}

Hence, we can extend \phi:S\to J to \phi':S\cup \{k\} \to J by taking k to j', contradicting the maximality of φ. Conclusion: there's an injection I → J, and by symmetry, there's an injection J → I as well. By the Cantor–Bernstein–Schroeder theorem, I and J have the same cardinality. ♦

The dimension of a vector space over a division ring is thus defined to be the cardinality of any basis. It is a well-defined value.


Elementary Module Theory (II)

Having defined submodules, let’s proceed to quotient modules. Unlike the case of groups and rings, any submodule can give a quotient module without any additional condition imposed.

Definition. Let N be a submodule of M. By definition, it’s an additive subgroup which is normal since (M, +) is abelian. So take the group quotient M/N and define a module structure on it via the scalar multiplication:

R\times M/N \to M/N,\quad (r, m+N) \mapsto (rm)+N.

This is called the module quotient.

The only worrying aspect is whether scalar multiplication is well-defined: i.e. suppose m+N = m'+N; is it true that (rm)+N = (rm')+N? Well:

\begin{aligned}m+N = m'+N &\implies m-m'\in N \implies r(m-m')\in N\\ &\implies rm-rm'\in N \implies (rm)+N = (rm')+N.\end{aligned}

The remaining axioms of a module are then easily verified.
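A toy Python model of the quotient for M = Z², N = 2Z², with each coset represented by its canonical representative mod 2 (illustrative sketch; names ours):

```python
# The quotient M/N for M = Z^2, N = 2Z^2, cosets represented canonically
# mod 2.  Scalar multiplication descends exactly as in the computation above.

def coset(m):
    return (m[0] % 2, m[1] % 2)

def scalar_mult(r, c):
    """r . (m + N) := (r m) + N, computed on a representative."""
    return coset((r * c[0], r * c[1]))

# Two representatives of the same coset give the same answer:
m, m2 = (1, 3), (5, -1)              # m - m2 = (-4, 4) lies in N = 2Z^2
assert coset(m) == coset(m2)
assert scalar_mult(7, coset(m)) == scalar_mult(7, coset(m2))
```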

Examples

  1. Clearly, M/{0} = M and M/M = {0}.
  2. Let R = M_n(\mathbf{Z}) be the ring of n × n matrices with integer entries. Then M:=\mathbf{Z}^n is a module and N :=2M = \{2\mathbf{v} : \mathbf{v} \in M\} is a submodule. The resulting quotient is isomorphic to (\mathbf{Z}/2)^n.
  3. Recall that if R = Z, the class of R-modules is precisely the class of abelian groups. The submodules are precisely the subgroups, and quotient modules are the quotient groups.
  4. Let I\subseteq R be an ideal of R and M be an R-module. The module quotient M/IM is not only an R-module, but an (R/I)-module as well. Indeed, if r+I\in R/I, then multiplying with m+IM \in M/IM gives rm+IM\in M/IM. This is well-defined: if r+I = r'+I and m+IM = m'+IM, then we have r-r'\in I and m-m'\in IM, which gives us rm - r'm' = (r-r')m +r'(m-m')\in IM and so rm + IM = r'm' + IM. Thus scalar multiplication by R/I is well-defined and it's easy to show that M/IM is an (R/I)-module.


Homomorphisms

The definition is straightforward.

Definition. Let M and N be R-modules. A module homomorphism is a function f : M → N such that:

f(m+m') = f(m) + f(m') and f(rm) = r\cdot f(m)

for any m, m'\in M and r\in R.

Notice that since f is a homomorphism of the underlying additive groups, we automatically have f(0) = 0 and f(-m) = –f(m). As in the case of groups, an injective (resp. surjective / bijective) homomorphism is called a monomorphism (resp. epimorphism / isomorphism).

The following properties are hardly surprising:

Proposition. Let f : M\to N be a homomorphism of R-modules.

  • If M'\subseteq M is a submodule, then f(M') is a submodule of N.
  • If N'\subseteq N is a submodule, then f^{-1}(N') is a submodule of M.

In other words, the “push-forward” or “pull-back” of a submodule is also a submodule.

Proof.

The proof is easy: we know that f(M') is an additive subgroup of N. Next, if n\in f(M') and r\in R, then we can write n = f(m) for some m in M'. This gives rn = r·f(m) = f(rm), which is in f(M').

Likewise, f^{-1}(N') is an additive subgroup of M. If m\in f^{-1}(N') and r\in R, then f(m) lies in N', so f(rm) = r·f(m) also lies in N'. Hence rm\in f^{-1}(N') and we're done. ♦

In particular, we have the following special cases.

Definition. For a module homomorphism f:M\to N, the pullback f^{-1}(0) is called the kernel of f; this is a submodule of M, denoted ker(f).

Likewise, f(M) is called the image of f; this is a submodule of N, denoted im(f).

Finally the cokernel of f, denoted coker(f), is the quotient N/\text{im}(f).

[ Note: the cokernel is kind of unusual since we didn’t encounter it in the case of groups and rings. Its primary purpose is as a “dual” to the kernel. Roughly, the kernel satisfies some universal property; if we reverse the arrows around the resulting universal property describes the cokernel. ]

As before, we have the standard three isomorphism theorems.

First Isomorphism Theorem. Let f:M\to N be a homomorphism of R-modules. Then M/\text{ker}(f) \cong \text{im}(f).

Proof.

Construct a map g:M/\text{ker}(f) \to \text{im}(f) which takes m + \text{ker}(f) to f(m). Let's show that g is well-defined and injective. Indeed:

m + \text{ker}(f) = m'+\text{ker}(f) \iff m-m'\in \text{ker}(f) \iff f(m-m')=0 \iff f(m) = f(m').

Finally, g is surjective by the definition of im(f), so we’re done. ♦
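The theorem can be checked exhaustively on a small example, say the Z-module map f(x) = 3x on Z/12 (illustrative sketch):

```python
# Finite check of M/ker(f) ≅ im(f): M = N = Z/12 as Z-modules, f(x) = 3x.

M = range(12)
f = lambda x: (3 * x) % 12

ker = {x for x in M if f(x) == 0}
im  = {f(x) for x in M}
cosets = {frozenset((x + k) % 12 for k in ker) for x in M}

assert ker == {0, 4, 8}
assert im == {0, 3, 6, 9}
assert len(cosets) == len(im)   # the bijection m + ker(f) <-> f(m)
```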

Armed with the first isomorphism theorem, the remaining two are a piece of cake.

Second Isomorphism Theorem. Let N, N’ be submodules of M. Then N/(N\cap N') \cong (N+N')/N'.

Third Isomorphism Theorem. Let P\subseteq N\subseteq M be submodules. Then N/P is a submodule of M/P and (M/P)/(N/P) \cong M/N.

Proof.

For the first result, map N → (N+N’) → (N+N’)/N’ by composing the inclusion N\subseteq N+N' with projection N+N' \to (N+N')/N'. Apply the first isomorphism theorem. The kernel of this map is \{n\in N : n+N'=0\} = N\cap N' while the image is the whole of (N+N’)/N’ since any element (n+n’)+N’ (for n\in N and n'\in N') is basically just n+N’.

For the second result, map M/P → M/N by taking m+P to m+N. The map is well-defined since if m+P = m'+P, then m-m' lies in P and hence in N, so m+N = m'+N also. The map is obviously surjective. Finally, the kernel is

\{m+P\in M/P : m+N = 0\} = \{m+P : m\in N\} = N/P.

Now apply the first isomorphism theorem. ♦

Let N\subseteq M. As in the case of groups and rings, there’s a correspondence between submodules of M containing N, as well as submodules of M/N.

Theorem. Let N be a submodule of M. We have a bijection:

\{ \text{ submodule } P \subseteq M : P \supseteq N\} \leftrightarrow \{\text{ submodule } Q \subseteq M/N\}.

The correspondence preserves inclusion.

[diagram: the inclusion-preserving correspondence between submodules of M containing N and submodules of M/N]

Sketch of Proof.

Let f : M → M/N be the projection map.

  • Any submodule P of M gives a submodule f(P) of M/N.
  • On the other hand, any submodule Q of M/N gives a submodule f^{-1}(Q) of M containing N.

One then checks that f(f^{-1}(Q)) = Q always holds, and that if P contains N, then we have f^{-1}(f(P)) = P. ♦
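For M = Z and N = 12Z the correspondence can be enumerated: submodules of Z containing 12Z are the dZ with d | 12, and these match the submodules (subgroups) of Z/12. A small Python check (illustrative):

```python
# Submodules (= subgroups) of Z/12 are cyclic, one per divisor of 12;
# they correspond to the submodules dZ ⊇ 12Z of Z.

def subgroups_mod(n):
    """All subgroups of Z/n, each as a frozenset of residues."""
    return {frozenset((g * k) % n for k in range(n)) for g in range(n)}

divisors_of_12 = [d for d in range(1, 13) if 12 % d == 0]
assert len(subgroups_mod(12)) == len(divisors_of_12)   # both equal 6
assert frozenset(range(12)) in subgroups_mod(12)       # Z/12 itself (d = 1)
```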

Exercise.

Prove that in the above correspondence, the intersection (resp. sum) of submodules on one side corresponds to the intersection (resp. sum) of submodules on another.

[ Hint for exercise: avoid “picking elements” from the submodules if possible; instead argue that the intersection or sum satisfies some unique property pertaining to inclusion of submodules.]


Direct Sum and Direct Product

It’s clear that given any two R-modules M and N, we can form the product M × N and let R act on it component-wise. On the other hand, with infinitely many modules, there’re two common ways to construct this “product”.

Definition. Consider any class of R-modules \{M_i\}.

  • The direct product is the set-theoretic product \prod_i M_i. It’s given the structure of an R-module, by letting r\in R act on it component-wise.
  • The direct sum \oplus_i M_i is the subset of those (m_i) \in \prod_i M_i for which only finitely many m_i's are non-zero. This is clearly a submodule of the direct product.

For example, suppose R = Z and M_n = Z for n = 1, 2, 3, …. The direct product of the M_n's is the set of all sequences of integers, while the direct sum is the set of sequences of integers in which only finitely many terms are non-zero.

The difference may seem subtle, but these two modules turn out to be quite different in their behaviour. In fact, one can even say they’re dual to each other, thanks to their universal properties.

Universal Property of Direct Product. For each index j, let \pi_j : \prod_i M_i \to M_j be the projection map. If N is a module and f_j : N\to M_j is any collection of module homomorphisms, then there’s a unique \phi : N\to \prod_i M_i such that:

\pi_j\circ \phi = f_j : N\to M_j for all j.

[commutative diagram for the universal property of the direct product]

Universal Property of Direct Sum. For each index j, let \iota_j : M_j \to \oplus_i M_i be the natural inclusion map which takes m\in M_j to the tuple (m_i) consisting of all zeros except m_j = m. If N is a module and g_j : M_j \to N is any collection of module homomorphisms, then there's a unique \psi : \oplus_i M_i \to N such that:

\psi\circ \iota_j = g_j : M_j\to N for all j.

[commutative diagram for the universal property of the direct sum]

Sketch of Proof

For direct product, the collection of maps f_j : N\to M_j gives rise to the map \phi : N\to \prod_i M_i by taking:

\phi(n) := (f_i(n)) for all n\in N.

One easily checks that this is a module homomorphism and that it is the unique map satisfying the desired requirement.

For direct sum, the maps g_j : M_j \to N give \psi : \oplus_i M_i \to N by taking

\psi((m_i)) := \sum_i g_i(m_i) for any (m_i) \in \oplus_i M_i.

This is well-defined since there are only finitely many non-zero mi‘s so the RHS is a finite sum. ♦
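The finite-support condition is exactly what makes ψ computable even over an infinite index set. A Python sketch modelling elements of the direct sum as dicts that omit zero entries (names ours):

```python
# Elements of ⊕_i M_i modelled as dicts {i: m_i} omitting zeros, so that
# ψ((m_i)) = Σ g_i(m_i) is always a finite sum.  (Illustrative sketch.)

def iota(j, m):
    """ι_j : M_j -> ⊕_i M_i."""
    return {j: m} if m != 0 else {}

def psi(g, x):
    """ψ((m_i)) := Σ_i g_i(m_i) for a family g = {i: g_i}."""
    return sum(g[i](m) for i, m in x.items())

# R = Z, M_i = Z for all i, g_i = multiplication by i:
g = {i: (lambda m, i=i: i * m) for i in range(100)}
x = {2: 5, 7: 1}                   # the tuple (…,0,5,0,…,0,1,0,…)
assert psi(g, x) == 2 * 5 + 7 * 1  # = 17
assert psi(g, iota(3, 4)) == 12    # ψ ∘ ι_3 = g_3
```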

Exercise

Consider the case where R=Z and M_1 = M_2 = M_3 = \ldots = \mathbf{Z}. Explain why \prod_i M_i doesn’t satisfy the universal property for direct sum and why \oplus_i M_i doesn’t satisfy that of direct product. [ Hint: in one case, the induced map fails to exist; in the other, the induced map is not unique. ]


Elementary Module Theory (I)

Modules can be likened to “vector spaces for rings”. To be specific, we shall see later that a vector space is precisely a module over a field (or in some cases, a division ring). This set of notes assumes the reader is reasonably well-acquainted with rings and groups, and has some prior experience with linear algebra, at least the computational aspects.

Throughout this article, we’ll let R denote a ring, possibly non-commutative.

Definition. A (left)-module over R is an abelian group (M, +) together with a binary operation R × M → M (denoted as r·m or just rm, for r\in R, m\in M) such that for any r,s\in R and m,n\in M, we have:

  1. (r+s)m = rm + sm;
  2. r(m+n) = rm+rn;
  3. (rs)m = r(sm);
  4. 1\cdot m = m.

The map R\times M\to M is often called scalar multiplication, since one should think of elements of M as “vectors” and those of R as “scalars”, following the terminology of linear algebra.

Exercise.

Find a structure which satisfies conditions 1-3 but not 4.

Let’s begin with the elementary properties of modules. For any r\in R, m\in M, we have:

  • 0_R\cdot m = 0_M = r\cdot 0_M, where 0_R\in R and 0_M\in M;
  • (-r)m = -(rm) = r(-m).

The proof is pretty straightforward:

  • For the first equality, 0_R\cdot m + 0_R\cdot m =(0_R+0_R)\cdot m = 0_R\cdot m. Adding -(0_R\cdot m) to both sides gives 0_R\cdot m = 0_M. The other equality is left as an easy exercise.
  • For (-r)m = -(rm), note that (-r)m + (rm) = ((-r) + r)m = 0_R\cdot m = 0_M. Hence adding -(rm) to both sides gives us the desired equality. The other is an exercise.

Right Modules and Opposite Rings

Instead of defining a left module, we could have applied the action of r\in R on the right. Thus, a right module M over R is an abelian group with a binary operation M × R → M such that for all r, s\in R and m,n\in M,

  1. m(r+s) = mr + ms;
  2. (m+n)r = mr + nr;
  3. m(rs) = (mr)s;
  4. m·1 = m.

Note that if we turn the order of operation around to give R × M → M, then conditions 1, 2, 4 are exactly as above, while condition 3 becomes "(rs)m = s(rm)". Hence, if R is commutative, then a left module over R can easily be turned into a right module by flipping the action as above; but if R is non-commutative, a right R-module cannot in general be turned into a left module this way.

However, we can define the opposite ring:

Definition. If R is a ring, then its opposite ring R^{op} is defined by taking the abelian group (R, +) with the product operation *, where r*s := sr, the product on the RHS being taken in R.

Then a right R-module can also be described as a left module over its opposite ring Rop. Obviously, if R is commutative, then its opposite ring is identical to itself. On the other hand, even if R is non-commutative, it’s possible for R\cong R^{op}, in which case one can turn a right R-module into a left one via this isomorphism. For example, if H denotes the division ring of all quaternions with real coefficients, then:

\phi:\mathbf{H}\to\mathbf{H},\quad a+ b\mathbf{i}+ c\mathbf{j}+ d\mathbf{k} \mapsto a-b\mathbf{i}-c \mathbf{j}-d\mathbf{k}

gives \phi(\alpha\beta) = \phi(\beta)\phi(\alpha) = \phi(\alpha) * \phi(\beta) where * is the product operation for the opposite ring.
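This identity is easy to verify numerically; here's a Python sketch with quaternions as 4-tuples (a, b, c, d) = a + bi + cj + dk (helper names ours):

```python
# Checking that conjugation φ satisfies φ(αβ) = φ(β)φ(α) on quaternions,
# represented as tuples (a, b, c, d) = a + bi + cj + dk.

def qmul(p, q):
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

alpha, beta = (1, 2, 3, 4), (5, -1, 0, 2)
assert conj(qmul(alpha, beta)) == qmul(conj(beta), conj(alpha))
```

So conjugation is an isomorphism H → H^{op}, exactly as claimed.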

Exercise.

Prove that the ring R of n × n real matrices is isomorphic to its opposite.

Finally, one notes that there are ample examples of rings which are not isomorphic to their opposite.

Examples of Modules

  1. Every ring R is a module over itself.
  2. Every vector space over a field K (e.g. R) is a K-module.
  3. A Z-module is precisely an abelian group (clearly, the underlying additive structure of a module gives us an abelian group; in the other direction, every abelian group A has a multiplication-by-n map A\to A, x\mapsto nx, which gives it the structure of a Z-module).
  4. Every ideal I \subseteq R is an R-module, where scalar multiplication is given by ring product in R.
  5. The n-dimensional free R-module is R^n = R\times R\times\ldots\times R, the Cartesian product of n copies of R. Scalar multiplication is given by r(x_1, \ldots, x_n) := (rx_1, \ldots, rx_n).
  6. Consider the ring of Gaussian integers Z[i]. Now C is a Z[i]-module, where scalar multiplication is by the usual ring multiplication map in C.
  7. Let R := M_n(\mathbf{Z}) be the ring of matrices with integer entries. Then M :=\mathbf{Z}^n is an R-module, via multiplying a matrix by vector.
  8. In representation theory, a representation of a group G over field K is precisely a K[G]-module.

Exercise

Let H be the ring of quaternions. Is R an H-module, if we define scalar multiplication as follows?

\mathbf{H}\times\mathbf{R} \to \mathbf{R}, \ (a+bi + cj + dk, r) \mapsto (a+b+c+d)r.

Submodules

From now on, unless explicitly stated, all modules are assumed to be left modules.

Now submodules are simply subsets of a module M which can inherit the module structure from M.

Definition. A submodule of M is a subset N\subseteq M satisfying the following:

  • 0\in N;
  • if m, n\in N then m+n \in N;
  • if r\in R, n\in N, then rn\in N.

Immediately, the second and third conditions give us: for any m,n\in N, m - n = m+(-1)n also lies in N. Thus, we can replace the first condition with N\ne\emptyset, since we can then pick n\in N and infer that 0 = n-n\in N from the second and third conditions.

Given a collection of submodules of M, there’re at least two ways to create another submodule:

Proposition. If \{N_i\} is a collection of submodules of M, then N = \cap_i N_i is a submodule of M; and

N' = \sum_i N_i := \{n_1 + n_2 + \ldots + n_k : n_j \in N_{i_j} \text{ for } j =1,\ldots,k\}

is also a submodule of M. Thus N'=\sum_i N_i is the set of all sums of finitely many terms from \cup_i N_i.

Proof.

  • For N, clearly 0\in N. Next, if m,n\in N, then m,n\in N_i for every i. Hence, m-n\in N_i for every i, which gives m-n\in \cap_i N_i. Finally, suppose n\in N and r\in R. Then n\in N_i for each i so rn\in N_i for each i. Hence rn\in \cap_i N_i.
  • For N’, again 0\in N' and clearly N’ is closed under addition since if x,y\in N' are sums of finitely many terms from \cup_i N_i, then x+y is obtained by concatenating the two sums. Finally r(n_1 + n_2 + \ldots + n_k) = (rn_1) + (rn_2) + \ldots + (rn_k) where each rn_j \in N_{i_j}. ♦

Generated Submodules

Let S\subseteq M be any subset and consider:

\Sigma := \{ N : N\supseteq S, N \text{ is a submodule of } M\},

i.e. the collection of all submodules of M which contain S. This collection is non-empty since it at least contains M. Let’s take the intersection of all these modules:

\left<S\right> := \cap_{N\in\Sigma} N.

Being an intersection of submodules of M, <S> is also a submodule.

Definition. The resulting <S> is called the submodule generated by S.

Observe that <S> satisfies the following two critical properties:

  • \left<S\right> \supseteq S: indeed, it’s an intersection of N‘s, each of which contains S.
  • if N' is a submodule of M such that N'\supseteq S, then N'\supseteq \left<S\right>: indeed, since N' contains S, it's found in the collection Σ. Since <S> is defined to be the intersection of N' and some other submodules, we have N'\supseteq \left<S\right>.

It is for this reason that we often say:

The generated submodule <S> is the smallest submodule containing S.

The case where S = {m} is a singleton set is of special interest: we claim that <S> is the product Rm = \{rm : r\in R\}. Indeed, since m\in \left<S\right> we clearly have rm\in\left<S\right> for any scalar r\in R thus giving us the inclusion Rm\subseteq \left<S\right>. On the other hand, it’s easy to check that Rm is itself a submodule of M containing m. Hence, Rm\supseteq \left<S\right> and the two sets are equal.

Examples

  1. Every module M has two obvious submodules: {0} and M. Any other submodule is called a proper submodule.
  2. Consider ring R as a module over itself. Its submodules are called left ideals (compare this with the normal, two-sided ideals). Unwinding the definition, a left ideal of R is a subset I\subseteq R such that
    1. 0\in I;
    2. if r,s\in I, then r+s\in I;
    3. if r\in R, s\in I, then rs\in I.
  3. Consider C as a Q-module. Then the set S = {1+i, 3-2i, √2} generates the submodule comprising all a + bi + c√2, for a,b,c\in \mathbf{Q}.
  4. If I is a left ideal of R and M is a module, then IM is a submodule of M. Here IM is defined as the set of all finite sums x_1 y_1 + x_2 y_2 + \ldots + x_k y_k, where x_i\in I and y_i \in M.
  5. Consider the ring R = M_n(\mathbf{Z}) of all n × n integer matrices, with module M = \mathbf{Z}^n. Then N = 2\mathbf{Z}^n = \{2\mathbf{v} : \mathbf{v}\in\mathbf{Z}^n\} is a proper submodule of M.
  6. On the other hand, for the ring R = M_n(\mathbf{R}) of n × n real matrices and module M =\mathbf{R}^n, one sees that there’s no proper submodule. In other words, any single non-zero \mathbf{v}\in M generates the whole module M. Such modules are called simple modules and we’ll encounter them again later.
  7. Let R = R_1 \times R_2 be a product of two rings and consider M = R as a module over itself. Then N := \{0\} \times R_2 is a submodule of M, since N is an ideal of R, and in particular a left ideal.


Quick Guide to Character Theory (III): Examples and Further Topics

G10(a). Character Table of S4

Let’s construct the character table for G = S_4. First, we have the trivial and alternating representations (see examples 1 and 2 in G1), both of which are clearly irreducible.

Next, the action of G on {1, 2, 3, 4} induces a linear action of G on a space of dimension 4, by permuting the coordinates. The character for this last action is easy: since each g\in G maps to a permutation matrix, the trace is precisely the number of fixed points. This gives:

[character table so far: on the classes e, (12), (123), (1234), (12)(34), of sizes 1, 6, 8, 6, 3, we have \chi_{\text{triv}} = (1,1,1,1,1), \chi_{\text{alt}} = (1,-1,1,-1,1), and the permutation character \chi_0 = (4,2,1,0,0)]

Note that \left<\chi_0, \chi_{\text{triv}}\right> = \frac 1 {24}(4+12+8) = 1 so \chi_0 contains 1 copy of the trivial representation. And since \left<\chi_0, \chi_{\text{alt}}\right> = 0 it doesn’t include the alternating representation. Subtracting gives: \chi_1 := \chi_0 - \chi_{\text{triv}} = (3, 1, 0, -1, -1).

It’s easy to check that \left<\chi_1, \chi_1\right> = 1 so we’ve found another irreducible representation. Tensor product gives \chi_1\chi_{\text{alt}} = (3, -1, 0, 1, -1) which is also easily checked to be irreducible. Since we’ve found 4 out of 5 irreducible representations, the remaining one is easy. First, the degree must be \sqrt{24 - 1^2 - 1^2 - 3^2 - 3^2} = 2 so the regular representation must contain two copies of it. This gives:

\chi_2 := \frac 1 2 (\chi_{\text{reg}} - \chi_{\text{triv}} - \chi_{\text{alt}} - 3\chi_1 - 3\chi_1\chi_{\text{alt}}) = (2, 0, -1, 0, 2).

We thus obtain the full character table:

[full character table of S_4, on the classes e, (12), (123), (1234), (12)(34):
\chi_{\text{triv}} = (1, 1, 1, 1, 1)
\chi_{\text{alt}} = (1, -1, 1, -1, 1)
\chi_1 = (3, 1, 0, -1, -1)
\chi_1\chi_{\text{alt}} = (3, -1, 0, 1, -1)
\chi_2 = (2, 0, -1, 0, 2)]
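All the inner products used above can be verified at once in Python, using the class sizes 1, 6, 8, 6, 3 of e, (12), (123), (1234), (12)(34) (illustrative check; the characters here are all real, so no conjugation is needed):

```python
# Numerical check of the S4 character table: the five rows are orthonormal
# under <x, y> = (1/24) * sum over classes of size * x(g) * y(g).

sizes = [1, 6, 8, 6, 3]
chars = [
    (1,  1,  1,  1,  1),   # trivial
    (1, -1,  1, -1,  1),   # alternating
    (3,  1,  0, -1, -1),   # chi_1
    (3, -1,  0,  1, -1),   # chi_1 * alt
    (2,  0, -1,  0,  2),   # chi_2
]

def inner(x, y):
    return sum(s * a * b for s, a, b in zip(sizes, x, y)) / 24

for i, x in enumerate(chars):
    for j, y in enumerate(chars):
        assert inner(x, y) == (i == j)
```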


G10(b). Character Table of S5

With a bit more effort, we can do G = S_5 as well. As before, we have the trivial and alternating representations. Again, the natural action of G on {1, 2, 3, 4, 5} gives:

[character table so far: on the classes e, (12), (123), (1234), (12345), (12)(34), (12)(345), of sizes 1, 10, 20, 30, 24, 15, 20, we have \chi_{\text{triv}}, \chi_{\text{alt}} and the permutation character \chi_0 = (5,3,2,1,0,1,0)]

Now, \left<\chi_0, \chi_{\text{triv}}\right> = \frac 1 {120}(5 + 30 + 40 + 30 +15)=1 and \left<\chi_0, \chi_{\text{alt}}\right> = 0 so we take the difference \chi_1 := \chi_0 - \chi_{\text{triv}} = (4, 2, 1, 0, -1, 0, -1). One easily checks that \left<\chi_1, \chi_1\right> = 1 so we’ve found an irreducible representation. With it comes \chi_1\chi_{\text{alt}}, which is easily checked to be irreducible as well.

Thus we’ve found four. For the remaining three, let’s consider tensor products. Now \chi_1^2 is a rather huge representation (of degree 16), so let’s take a subspace instead.

Interlude : Symmetric and Alternating Tensors

Let V be a vector space and consider V\otimes V. The swapping map V\times V\to V\otimes V which takes (x,y)\mapsto y\otimes x is clearly bilinear, so it induces a linear map f:V\otimes V\to V\otimes V which takes x\otimes y\mapsto y\otimes x. Clearly f^2 is the identity, since it maps all elements of the form x\otimes y back to themselves and the set of all such elements spans V\otimes V. Since a power of f is the identity, f is diagonalisable. Its eigenvalues must satisfy \lambda^2 = 1, so:

V\otimes V = \{w\in V\otimes V: f(w)=w\} \oplus \{w\in V\otimes V: f(w)=-w\}.

We’ll denote the two spaces by \text{Sym}^2 V and \text{Alt}^2 V respectively. Clearly, if \{e_1, \ldots, e_n\} is a basis of V, then a basis of \text{Sym}^2 V (resp. \text{Alt}^2 V) is given by:

\{e_i \otimes e_j + e_j\otimes e_i\}_{i\le j}\ (resp. \ \{e_i \otimes e_j - e_j\otimes e_i\}_{i<j}).

Thus, the two spaces are of dimensions \frac {n(n+1)}2 and \frac{n(n-1)}2 respectively.

Computing Characters for Symmetric/Alternating Tensors

Now suppose V is a C[G]-module. We claim that the subspace \text{Sym}^2 V\subseteq V\otimes V is invariant under every g\in G. To prove this, it suffices to show that g commutes with the swapping map f:V\otimes V\to V\otimes V defined above, which takes x\otimes y\mapsto y\otimes x. This isn't hard to prove:

f(g(x\otimes y)) = f(gx\otimes gy) = gy\otimes gx = g(y\otimes x) = g(f(x\otimes y))

for any x,y\in V. Since f commutes with g, a standard result from linear algebra tells us the eigenspaces of f are invariant under g. Thus g(\text{Sym}^2 V)\subseteq \text{Sym}^2 V and g(\text{Alt}^2 V)\subseteq \text{Alt}^2 V.

Our next task is to express \chi_{\text{Sym}^2 V}(g) in terms of \chi_V.

If \{e_1, \ldots, e_n\} is a basis of eigenvectors of g, with eigenvalues \{\lambda_1, \ldots, \lambda_n\}, then we have:

g(e_i \otimes e_j + e_j\otimes e_i) = \lambda_i \lambda_j (e_i\otimes e_j + e_j\otimes e_i),

so the trace of g on \text{Sym}^2 V is \sum_{i=1}^n \lambda_i^2 + \sum_{i<j} \lambda_i \lambda_j. Since \chi_V(g) = \sum_i \lambda_i and \chi_V(g^2) = \sum_i \lambda_i^2, it follows that:

\chi_{\text{Sym}^2 V}(g) = \frac 1 2 (\chi_V(g)^2 + \chi_V(g^2)) \implies \chi_{\text{Alt}^2 V}(g) = \frac 1 2 (\chi_V(g)^2 - \chi_V(g^2)).

If we take \chi = \chi_1 = (4, 2, 1, 0, -1, 0, -1) above, then:

\chi_2 := \chi_{\text{Sym}^2 V} = (10, 4, 1, 0, 0, 2, 1) and \chi_3 :=\chi_{\text{Alt}^2 V} = (6, 0, 0, 0, 1, -2, 0).
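These values follow mechanically from the formula once we record, for each class, the class containing g². A quick sketch (same class ordering as before):

```python
# For each S5 class (order: e, (12), (123), (1234), (12345), (12)(34), (123)(45)),
# the index of the class containing g^2:
# e^2 = e, (12)^2 = e, (123)^2 ~ (123), (1234)^2 = (13)(24) ~ (12)(34),
# (12345)^2 ~ (12345), ((12)(34))^2 = e, ((123)(45))^2 ~ (123).
square_class = [0, 0, 2, 5, 4, 0, 2]

chi1 = [4, 2, 1, 0, -1, 0, -1]

sym = [(chi1[i] ** 2 + chi1[square_class[i]]) // 2 for i in range(7)]
alt = [(chi1[i] ** 2 - chi1[square_class[i]]) // 2 for i in range(7)]

assert sym == [10, 4, 1, 0, 0, 2, 1]   # chi_2
assert alt == [6, 0, 0, 0, 1, -2, 0]   # chi_3
```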

Now \chi_3 is irreducible. On the other hand \left<\chi_2, \chi_{\text{triv}}\right> = 1 and \left<\chi_2, \chi_1\right> = 1 so taking the difference gives the sixth irreducible character:

\chi_4 = \chi_2 -\chi_{\text{triv}} - \chi_1 = (5, 1, -1, -1, 0, 1, 1).

The last one is obviously \chi_4 \chi_{\text{alt}}. Conclusion:

chartable_s5_2

blue-lin

G10(c). Character Table of A5

Now we take the alternating group G = A_5. Since this is a subgroup of S5, let’s take the above 7 characters and restrict them to A_5 \subset S_5. However, we’re now left with only 4 since 3 of the characters are paired via \{\chi, \chi'\} such that \chi' = \chi\cdot\chi_{\text{alt}} (we say that the two characters are twists of each other).

A bit of effort tells us A5 has 5 conjugacy classes: e, (1, 2, 3), (1, 2, 3, 4, 5), (1, 2, 3, 4, 5)^2, (1, 2)(3, 4). Restriction then gives:

chartable_a5_1

The first three are irreducible, while the last one satisfies \left<\chi_3, \chi_3\right> = 2 so it’s a direct sum of two non-isomorphic irreducible representations. Since \chi_3 is orthogonal to the other three characters, it is the direct sum of the two remaining irreducible characters. Their dimensions satisfy d_1 + d_2 = 6 and d_1^2 + d_2^2 = 18 so d_1 = d_2 = 3.

Critical observation: every element of G is conjugate to its inverse so any character satisfies:

\chi(g) = \chi(g^{-1}) = \overline{\chi(g)}

where the last equality follows from G7. Thus we can think of the characters of A5 as elements of a real vector space and obtain the last two irreducible characters via linear algebra. Indeed, the set \{\chi_{\text{triv}}, \chi_1, \chi_4\} is already orthonormal, so let's extend it via Gram-Schmidt to an orthonormal basis:

\{\chi_{\text{triv}}, \chi_1, \chi_4, \phi_1, \phi_2\}\ , where \ \phi_1 = \frac 1 {\sqrt 3}(3, 0, 3, -2, -1) and \phi_2 = \sqrt{\frac 5 3}(3, 0, 0, 1, -1).

If \chi is one of the last two irreducible characters, then we can write \chi = a\phi_1 + b\phi_2 for real values a, b satisfying a^2 + b^2 = 1. Since \chi(1) = 3 we also have a\sqrt 3 + b\sqrt {15} = 3 which gives:

(a,b) = \left(\frac{\sqrt 3 \pm \sqrt{15}}6, \frac{5 \mp \sqrt 5}{2\sqrt {15}}\right) .

Thus, \chi = (3, 0, \frac{1+\sqrt 5}2, \frac{1-\sqrt 5}2, -1) or (3, 0, \frac{1-\sqrt 5}2, \frac{1+\sqrt 5}2, -1) and our work is done.
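A final numerical sanity check (a sketch; the class ordering and sizes for A5 are e, (1,2,3), (1,2,3,4,5), (1,2,3,4,5)^2, (1,2)(3,4) with sizes 1, 20, 12, 12, 15):

```python
from math import sqrt

sizes = [1, 20, 12, 12, 15]     # A5 class sizes, summing to 60
phi = (1 + sqrt(5)) / 2         # golden ratio; note (1 - sqrt(5))/2 = 1 - phi
chi_a = [3, 0, phi, 1 - phi, -1]
chi_b = [3, 0, 1 - phi, phi, -1]

def inner(f, g):
    return sum(s * x * y for s, x, y in zip(sizes, f, g)) / 60

assert abs(inner(chi_a, chi_a) - 1) < 1e-12   # both are irreducible...
assert abs(inner(chi_b, chi_b) - 1) < 1e-12
assert abs(inner(chi_a, chi_b)) < 1e-12       # ...and orthogonal to each other
```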

chartable_a5_2

blue-lin

G11. Restricted and Induced Representations

If H ≤ G is a subgroup, then we can restrict any representation \rho:G\to GL(V) to that of H. We’ll denote this restricted representation by \text{Res}_G^H\rho and the corresponding character \text{Res}_G^H\chi.


As we saw in G10(c), even if \chi:G\to \mathbf{C} is irreducible, its restriction to H may not be. Even worse, even if \chi_1 and \chi_2 are orthogonal (i.e. \left<\chi_1, \chi_2\right>=0), their restrictions to H may not be. We'll see this in an example later on.

Conversely, given a representation \phi:H\to GL(V), we'll define an induced representation (denoted \text{Ind}_H^G\phi, with character \text{Ind}_H^G\chi). Abstractly, if V is a C[H]-module, we take:

W := \mathbf{C}[G]\otimes_{\mathbf{C}[H]} V

which is now a C[G]-module. Clearly, both Ind and Res are functorial, in the sense that if V and W are C[H]-modules with a C[H]-linear map f : V\to W, then this induces a C[G]-linear map \text{Ind}_H^G V \to \text{Ind}_H^G W – this follows from the more general fact that if R\subseteq S are rings, then S\otimes_R - is functorial. The case for Res is obvious. Furthermore, we have:

Proposition. We have natural isomorphisms:

\text{Res}_H^K (\text{Res}_G^H V) \cong \text{Res}_G^K V and \text{Ind}_H^G (\text{Ind}_K^H W)\cong \text{Ind}_K^G W

for subgroups K\le H\le G, any C[G]-module V and C[K]-module W.

Proof.

The case for Res is obvious. For Ind, this follows from the more general fact that if R\subseteq S\subseteq T are rings, then there’s a natural isomorphism T\otimes_S (S\otimes_R M) \cong T\otimes_R M for any R-module M. ♦

Frobenius Reciprocity Law

Next, we have the following general result for modules.

Proposition. Let R\subseteq S be rings, M be an R-module and N be an S-module. Then there’s a canonical isomorphism:

\text{Hom}_S(S\otimes_R M, N) \cong \text{Hom}_R(M, N).

Let’s consider the case where R = \mathbf{C}[H] and S = \mathbf{C}[G] and let the characters for M and N be denoted by \psi:H\to \mathbf{C} and \chi:G\to \mathbf{C} respectively. We get an isomorphism of complex vector spaces in the above proposition, with dimensions:

\begin{aligned}LHS:& \dim_\mathbf{C} \text{Hom}_{\mathbf{C}[G]}(\mathbf{C}[G]\otimes_{\mathbf{C}[H]} M, N) = \left<\text{Ind}_H^G \psi, \chi\right>_G,\\ RHS:& \dim_\mathbf{C} \text{Hom}_{\mathbf{C}[H]} (M, N)= \left<\psi, \text{Res}_G^H \chi\right>_H.\end{aligned}

Conclusion:

Frobenius Reciprocity Theorem. If \psi:H\to \mathbf{C} and \chi:G\to\mathbf{C} are characters of representations, then:

\left<\text{Ind}_H^G \psi, \chi\right>_G = \left<\psi, \text{Res}_G^H \chi\right>_H.

Since the set of such characters spans the space of class functions, equality actually holds for any class functions \psi and \chi.

[ Abstractly, one says that the Ind and Res maps are adjoint to each other. Roughly, this means that their underlying matrices are transposes of each other, as we’ll see below. ]

Example

Let’s consider groups S_3 \le S_4. Their character tables are given below:

chartable_two_groups

This gives us the following restrictions:

  • \text{Res}\chi_{\text{triv}} = \psi_{\text{triv}};
  • \text{Res}\chi_{\text{alt}} = \psi_{\text{alt}};
  • \text{Res}\chi_2 = \psi_1;
  • \text{Res}\chi_1 = \psi_{\text{triv}}+\psi_1;
  • \text{Res}\chi_1\chi_{\text{alt}} = \psi_{\text{alt}}+\psi_1.

The corresponding matrix for Res is then:

\begin{pmatrix} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1\\ 0 & 0 & 1 & 1 & 1\end{pmatrix}.

To compute Ind ψ’s, we could use the Frobenius reciprocity theorem, but let’s do it the hard way via explicit computation since we’d like to show the formula for Ind ψ.

Theorem. We have:

(\text{Ind}_H^G \psi)(g) = \frac 1 {|H|}\sum_{x\in G} \psi(xgx^{-1}),

where ψ is extended to the whole of G by taking zero outside H.

Proof.

The ring C[G] is free as a C[H]-module, with basis g_1, \ldots, g_k where the g_i's form a set of left coset representatives for G/H. Hence if M is a C[H]-module, then M' :=\mathbf{C}[G]\otimes_{\mathbf{C}[H]} M is the direct sum of the g_i M, where M is identified with 1\otimes M \subseteq M'.

Hence g\in G permutes the blocks g_i M around; if gg_i H \ne g_i H then there’s no contribution from g_i M to the trace. Otherwise, g_i^{-1} gg_i \in H so the contribution to the trace is exactly \psi(g_i^{-1}g g_i), i.e.

\begin{aligned}(\text{Ind}\psi)(g) = \sum_{i=1}^k \psi(g_i^{-1}g g_i) = \sum_{i=1}^k \frac 1 {|H|}\sum_{h\in H} \psi(h^{-1} g_i^{-1} g g_i h) \stackrel{x=g_i h}{=} \frac 1 {|H|}\sum_{x\in G}\psi(x^{-1}g x),\end{aligned}

where the second equality follows from the fact that ψ is a class function on H. ♦

Going back to the example, let's compute \chi := \text{Ind} \psi; it turns out to be much more convenient to apply the first equality above: \chi(g) = \sum_i \psi(g_i^{-1} g g_i). For coset representatives of S_3 \le S_4, let us take g_i = (1, 2, 3, 4)^i for i = 0, 1, 2, 3:

  • Let g=e. Then g_i^{-1} e g_i = e for each i. Thus \chi(e) = 4\psi(e).
  • Let g=(1, 2). Then g_i^{-1} g g_i \in S_3 only for i=0, 3. Thus \chi((1, 2)) = \psi((1, 2))+\psi((2,3)).
  • Let g=(1, 2, 3). Then g_i^{-1} g g_i \in S_3 only for i=0. So \chi((1, 2, 3)) = \psi((1, 2, 3)).
  • Let g=(1, 2, 3, 4). Then g_i^{-1}g g_i\not\in S_3 so \chi((1, 2, 3, 4)) = 0.
  • Similarly if g=(1, 2)(3, 4), then \chi(g) = 0.

This gives the following characters:

\text{Ind}\psi_{\text{triv}} = (4, 2, 1, 0, 0) = \chi_{\text{triv}} + \chi_1, \\ \text{Ind}\psi_{\text{alt}} = (4, -2, 1, 0, 0) = \chi_{\text{alt}} + \chi_1\chi_{\text{alt}}, \\ \text{Ind}\psi_1 = (8, 0, -1, 0, 0) = \chi_2 + \chi_1 + \chi_1 \chi_{\text{alt}}.
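These values can be double-checked by brute force. Here's a small sketch in Python (my own illustration, not part of the original computation): permutations of {0, 1, 2, 3} stand for S4, with S3 the subgroup fixing the last point, and we evaluate (Ind ψ)(g) = (1/|H|) Σ_{x∈G} ψ(xgx⁻¹) directly.

```python
from itertools import permutations

def comp(a, b):                 # (a ∘ b)(i) = a[b[i]], i.e. apply b first, then a
    return tuple(a[i] for i in b)

def inv(a):
    out = [0] * len(a)
    for i, x in enumerate(a):
        out[x] = i
    return tuple(out)

S4 = list(permutations(range(4)))
in_S3 = lambda p: p[3] == 3     # H = S3 = permutations fixing the last point

def psi_triv(p):                # trivial character, extended by zero outside H
    return 1 if in_S3(p) else 0

def psi1(p):                    # character of the 2-dim irreducible of S3
    if not in_S3(p):
        return 0                # extended by zero outside H
    return sum(p[i] == i for i in range(3)) - 1   # fixed points minus one

def induced(psi, g):            # (Ind psi)(g) = (1/|H|) sum_{x in G} psi(x g x^{-1})
    total = sum(psi(comp(comp(x, g), inv(x))) for x in S4)
    return total // 6           # |H| = 6; the total is always a multiple of 6

# Class representatives of S4: e, (1,2), (1,2,3), (1,2,3,4), (1,2)(3,4)
reps = [(0, 1, 2, 3), (1, 0, 2, 3), (1, 2, 0, 3), (1, 2, 3, 0), (1, 0, 3, 2)]
assert [induced(psi_triv, g) for g in reps] == [4, 2, 1, 0, 0]
assert [induced(psi1, g) for g in reps] == [8, 0, -1, 0, 0]
```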

So the matrix for Ind is:

\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 1\\ 0& 1 & 1\end{pmatrix}

which is the transpose of the matrix for Res, exactly as we'd expect!

Exercise.

Repeat the above computations for S_4\subset S_5 and A_4\subset S_5. [ Hint: to compute the character table of A4, pick the normal subgroup N = \{e, (1,2)(3,4), (1,3)(2,4), (1,4)(2,3)\} and consider the character table of A4/N. ]

blue-lin

G12. Omake

We have one final result to present:

Proposition. The degree of an irreducible representation of G divides |G|.

Proof.

The proof requires some elementary theory of algebraic integers. First, suppose we have a class function f : G → C where every f(g) is an algebraic integer. Let \alpha = \sum_{g\in G} f(g)g\in \mathbf{C}[G].

  • For each conjugacy class C\subset G, take the element e_C := \sum_{g\in C} g\in \mathbf{C}[G].
  • Check that e_C commutes with anything in C[G].
  • Since e_C\in \mathbf{Z}[G] and Z[G] is a ring which is a finite Z-module, e_C is integral over Z.
  • Now, \alpha = \sum_g f(g)g is a linear combination of \{e_C\} with coefficients which are algebraic integers. Thus, α is a sum of commuting elements, each of which is integral over Z. Conclusion: α satisfies a monic polynomial with integer coefficients.

Since f is a class function, \alpha g = g\alpha\in \mathbf{C}[G] for any g\in G. Hence, if V is a simple C[G]-module, then multiplication-by-α on V is a C[G]-linear map. Schur’s lemma says this map is a scalar multiple λ of the identity, which must be an algebraic integer. To compute λ, we take the trace of α on V:

\begin{aligned}\lambda\cdot\dim V = \sum_{g\in G} f(g)\text{tr}(g|_V) =\sum_{g\in G} f(g)\chi_V(g).\end{aligned}

In particular, let f(g) = \overline{\chi_V(g)}; each f(g) is an algebraic integer since the eigenvalues of g are roots of unity and thus algebraic integers. Now \lambda\cdot\dim V = \sum_{g\in G} |f(g)|^2 = |G| by orthonormality of irreducible characters. Thus \lambda = |G|/\dim V is an algebraic integer which lies in Q, so it’s an integer. ♦
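As a quick check against the degrees computed in G10 (a sketch; the degree lists are read off the character tables above):

```python
# Irreducible degrees found earlier: S4 has (1,1,2,3,3), S5 has (1,1,4,4,5,5,6).
for order, degrees in [(24, [1, 1, 2, 3, 3]), (120, [1, 1, 4, 4, 5, 5, 6])]:
    assert sum(d * d for d in degrees) == order    # |G| = sum of squared degrees
    assert all(order % d == 0 for d in degrees)    # each degree divides |G|
```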

In fact, one can even prove that the degree divides [G : Z(G)] where Z(G) is the centre of G. Interested readers may refer to Serre's book "Linear Representations of Finite Groups" (Graduate Texts in Mathematics vol. 42).


Quick Guide to Character Theory (II): Main Theory

Reminder: throughout this series, G is a finite group and K is a field. All K-vector spaces are assumed to be finite-dimensional over K.

G4. Maschke’s Theorem

If W\subseteq V is a K[G]-submodule, it turns out V is isomorphic to the direct sum of W and some other submodule W’.

Maschke’s Theorem. Suppose char(K) does not divide #G (e.g. char(K) = 0). If W\subseteq V is a K[G]-submodule, then there’s a submodule W'\subseteq V such that:

W + W' = V, \quad W\cap W' = 0.

This gives V\cong W\oplus W'.

As a consequence, if W\subseteq V is a nontrivial submodule, then one can decompose V = W\oplus W'; repeating this on each summand, the process must terminate since V is finite-dimensional.

Definition. A K[G]-module V≠0 is said to be simple if it has no submodule other than 0 and itself. The corresponding representation is then said to be irreducible.

Thanks to Maschke’s theorem, every K[G]-module is a direct sum of simple modules.

Proof of Maschke’s Theorem.

Let X be a K-subspace of V such that W+X = V and W\cap X = 0, which is always possible by linear algebra. Take the projection map p:V = W\oplus X \to W. Now define the linear map:

\begin{aligned}q: V\to V, \quad q(v) = \frac 1 {|G|} \sum_{g\in G} g(p(g^{-1}v)).\end{aligned}

We claim: q is K[G]-linear; it suffices to prove that q(hv) = h·q(v) for all h\in G, v\in V. But

\begin{aligned} h\cdot q(v) = \frac 1 {|G|} \sum_{g\in G} hg(p(g^{-1}v)) \stackrel{y=hg}{=} \frac 1 {|G|}\sum_{y\in G} y(p(y^{-1}hv)) = q(hv).\end{aligned}

Next claim: if w\in W then q(w)=w. Indeed since W is a K[G]-submodule of V, g^{-1}w\in W so p(g^{-1}w) = g^{-1}w for every g\in G. Hence q(w) is 1/|G| times a sum of |G| copies of w.

To complete the proof, we’ll show that W’ = ker(q) satisfies the conditions of the theorem:

  • Since q is K[G]-linear, its kernel W’ is a K[G]-submodule of V.
  • If v\in W'\cap W, then q(v) = 0 since v\in\text{ker}(q); on the other hand, since v\in W we have q(v)=v so v=0. Thus, W'\cap W = 0.
  • For any v\in V, write v = (v-q(v)) + q(v). Since p(V)\subseteq W, we have q(V)\subseteq W also. So q(v)\in W and we get q(q(v)) = q(v) \implies v-q(v)\in \text{ker}(q). Thus v\in W + W'. ♦

Example

Consider G=S_3 acting on V = K^3 by permuting the coordinates (see example 3 in G1). The 1-dimensional subspace W spanned by (1, 1, 1) is clearly a G-invariant subspace. Applying the construction in the above proof gives us the complementary submodule W' = \{(x,y,z)\in K^3: x+y+z=0\}.
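Here's this example carried out numerically (a sketch in Python; p is chosen to project onto W along the complement X spanned by e_2, e_3):

```python
import numpy as np
from itertools import permutations

# Permutation matrices for S3 acting on K^3: e_i -> e_{g(i)}, so M[g(i), i] = 1
mats = []
for g in permutations(range(3)):
    M = np.zeros((3, 3))
    for i in range(3):
        M[g[i], i] = 1
    mats.append(M)

# p projects V = W + X onto W = span{(1,1,1)} along X = span{e_2, e_3}:
# p(x, y, z) = x * (1, 1, 1)
P = np.array([[1.0, 0, 0], [1, 0, 0], [1, 0, 0]])

# Average over the group: q = (1/6) * sum_g g p g^{-1}
Q = sum(M @ P @ M.T for M in mats) / 6   # M.T = M^{-1} for permutation matrices

assert np.allclose(Q @ np.array([1.0, 1, 1]), [1, 1, 1])  # q fixes W pointwise
assert np.allclose(Q @ np.array([1.0, -1, 0]), 0)         # x+y+z = 0 is in ker(q)
```

The kernel of q is exactly the claimed complement W' = {(x, y, z) : x + y + z = 0}.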

blue-lin

G5. Schur’s Lemma

Assume char(K) = 0. By Maschke’s theorem, every K[G]-module is a direct sum of simple modules. The following result is handy:

Schur’s Lemma. Let V and W be simple K[G]-modules.

  • If f : V → W is a K[G]-linear map, then f=0 or f is an isomorphism.
  • Suppose K = C. If f : V → V is a K[G]-linear map, then it’s a scalar multiple of the identity.

Proof.

Suppose f : V → W is non-zero. Its kernel is a submodule of V so it's either 0 or V; since f≠0, we have ker(f) = 0. Likewise, its image is a submodule of W so it's either 0 or W; since f≠0, we have im(f) = W. Thus f is bijective.

Let λ be an eigenvalue of f with eigenvector v, i.e. f(v) = λv. Then \text{ker}(f - \lambda\cdot 1_V) is non-zero so it must be the whole of V, i.e. f = λ·1V. ♦

Another way of looking at Schur's lemma: for K = C and simple C[G]-modules V, W,

\dim_{\mathbf{C}} \text{Hom}_{\mathbf{C}[G]}(V, W) = \begin{cases} 0, \quad &\text{if } V\not\cong W,\\ 1,\quad &\text{if } V\cong W.\end{cases}

From now onwards, we let K = C.

blue-lin

G6. The Space of Fixed Vectors.

For any C[G]-module V, consider the space:

V^G := \{v\in V: gv = v \text{ for every } g\in G\},

the space of all vectors fixed by G. The following result is critical.

Lemma. We have:

\begin{aligned}\dim(V^G) = \frac 1 {|G|}\sum_{g\in G} \text{tr}(g|_V),\end{aligned}

the average of the traces of g\in G acting on V.

Proof.

Take the averaging map (which is C-linear):

\begin{aligned}p:V\to V, \quad p(v) = \frac 1{|G|} \sum_{g\in G} gv.\end{aligned}

We claim: for any v\in V, we have p(v)\in V^G. Indeed, for any h\in G, we have

\begin{aligned}h\cdot p(v) = \frac 1 {|G|}\sum_{g\in G} hg(v)\stackrel{y=hg}{=} \frac 1 {|G|}\sum_{y\in G} y(v) = p(v).\end{aligned}

On the other hand, if w\in V^G then gw = w for every g\in G, which gives us p(w) = w. Just as in the proof of Maschke's theorem, this shows \text{ker}(p)\cap V^G = 0 and \text{ker}(p) + V^G = V. If we pick a basis of ker(p) and of V^G, then the matrix for p is:

invariant_matrix

So \dim(V^G) = \text{tr}(p) = \frac 1 {|G|}\sum_{g\in G}\text{tr}(g|_V). ♦
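As a concrete check (a sketch in Python), take S3 permuting the coordinates of C^3: the fixed space is spanned by (1, 1, 1), so the average trace should come out to 1.

```python
import numpy as np
from itertools import permutations

# S3 permuting coordinates of C^3; the fixed space is spanned by (1,1,1)
traces = []
for g in permutations(range(3)):
    M = np.zeros((3, 3))
    for i in range(3):
        M[g[i], i] = 1        # permutation matrix: e_i -> e_{g(i)}
    traces.append(np.trace(M))  # trace = number of fixed points of g

assert sum(traces) / 6 == 1     # average trace equals dim(V^G)
```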

blue-lin

G7. Introducing Characters

Since the trace of g‘s action is important, let’s define:

Definition. Let \rho :G\to GL(V) be a representation of G. Its character is the function

\chi:G\to\mathbf{C}, \quad \chi(g) = \text{tr}(\rho(g)).

Looking at V as a C[G]-module, we have \chi_V(g) = \text{tr}(g|_V).

From G6, we see that \dim(V^G) = \frac 1 {|G|}\sum_{g\in G} \chi_V(g). Suppose V and W are C[G]-modules. Let’s consider the various constructions in G3.

  • Direct sum : \chi_{V\oplus W}(g) = \chi_V(g) + \chi_W(g).
  • Tensor product : \chi_{V\otimes W}(g) =\chi_V(g) \chi_W(g).
  • Dual : \chi_{V^*}(g) = \overline{\chi_V(g)}.

The first two equalities are easy. For the third, let's pick a basis B=\{e_1, \ldots, e_n\} of V and consider the dual basis B^*=\{e_1^*, \ldots, e_n^*\}. If M is the matrix for g : V → V with respect to B, the corresponding matrix for g : V* → V* with respect to B* is given by (M^{-1})^t. Hence:

\chi_{V^*}(g) = \chi_V(g^{-1}).

Since each g : V → V has finite order, it is diagonalisable (consider the Jordan canonical form of g: a nontrivial Jordan block never has finite order) and every eigenvalue is a root of unity. With respect to a basis of eigenvectors, g is diagonal with entries (\lambda_i : i=1,\ldots, n). Thus g^{-1} is diagonal with entries (\lambda_i^{-1} = \overline\lambda_i: i=1,\ldots, n) and \text{tr}(g^{-1}) = \overline{\text{tr}(g)} as desired.

This gives:

  • Hom : \chi_{\text{Hom}_\mathbf{C}(V,W)}(g) = \chi_{V^*\otimes W}(g) = \chi_{V^*}(g)\chi_W(g) = \overline{\chi_V(g)}\chi_W(g).

Now consider \text{Hom}_{\mathbf{C}}(V, W)^G. A C-linear map f : V → W is fixed by G if and only if:

g\circ f \circ g^{-1} = f \iff \forall v\in V, g(f(g^{-1}v)) = f(v) \iff \forall v\in V, g(f(v)) = f(g(v))

if and only if f is C[G]-linear. Thus \text{Hom}_{\mathbf{C}}(V, W)^G = \text{Hom}_{\mathbf{C}[G]}(V, W) and:

\begin{aligned}\dim_\mathbf{C} \text{Hom}_{\mathbf{C}[G]}(V, W) = \frac 1 {|G|} \sum_{g\in G} \overline{\chi_V(g)}\chi_W(g).\end{aligned}

Together with Schur’s lemma, we have:

Corollary (Orthogonality of Characters).

If V and W are simple C[G]-modules, then:

\begin{aligned}\frac 1{|G|}\sum_{g\in G} \overline{\chi_V(g)}\chi_W(g) = \dim_{\mathbf{C}} \text{Hom}_{\mathbf{C}[G]}(V,W) =\begin{cases} 1, \ \text{if } V\cong W,\\ 0,\ \text{if } V\not\cong W.\end{cases}.\end{aligned}
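For a concrete instance of this orthogonality (a small exact-arithmetic check; the three irreducible characters of S3 are the trivial, alternating and 2-dimensional standard ones):

```python
from fractions import Fraction

# Character table of S3: classes e, (12), (123) with sizes 1, 3, 2
sizes = [1, 3, 2]
table = [
    [1, 1, 1],    # trivial
    [1, -1, 1],   # alternating
    [2, 0, -1],   # standard 2-dimensional
]

def inner(f, g):
    # all values here are real, so no conjugation is needed
    return sum(Fraction(s) * x * y for s, x, y in zip(sizes, f, g)) / 6

for i, f in enumerate(table):
    for j, g in enumerate(table):
        assert inner(f, g) == (1 if i == j else 0)
```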

blue-lin

G8. Class Functions

Since tr(AB) = tr(BA) for square matrices of the same size (in fact, it’s even true for rectangular matrices, as long as AB and BA are both square),  we have

\chi_V(h^{-1}gh) = \text{tr}(h^{-1}\cdot gh|_V) = \text{tr}(gh\cdot h^{-1}|_V) = \text{tr}(g|_V) = \chi_V(g).

Definition. A function \psi : G\to \mathbf{C} is called a class function if \psi(h^{-1}gh) = \psi(g) for any g,h\in G.

Thus, the character of a representation is a class function. Define an inner product between class functions \psi, \phi:G\to \mathbf{C} as:

\begin{aligned}\left<\psi, \phi\right> := \frac 1 {|G|} \sum_{g\in G} \psi(g)\overline{\phi(g)}.\end{aligned}

From G7, the set of characters of simple C[G]-modules (or irreducible representations) forms an orthonormal set. In fact, we can say more:

Theorem. The set of irreducible characters forms an orthonormal basis for the space of all class functions.

Proof.

It suffices to show that the irreducible characters span the space of class functions. Otherwise, there’s a class function fG → C which is orthogonal to all irreducible characters. Take the element \sigma := \frac 1 {|G|}\sum_{g\in G} f(g)g \in \mathbf{C}[G]. For any simple C[G]-module V, σ acts on V as a C-linear map. Now,

\begin{aligned}h\cdot\sigma(v) = \frac 1 {|G|}\sum_{g\in G} f(g)hg(v) = \frac 1 {|G|} \sum_{g\in G} f(g) hgh^{-1}(hv) \stackrel{y=hgh^{-1}}{=}\frac 1 {|G|}\sum_{y\in G} f(y)yhv = \sigma(hv)\end{aligned}

so it is in fact C[G]-linear and must be a scalar multiple c of the identity. To compute c, we take the trace:

\begin{aligned} c = \frac 1 {\dim V}\text{tr}\left(\frac 1 {|G|}\sum_{g\in G} f(g) g\right) = \frac 1 {|G|\dim V} \sum_{g\in G} f(g)\chi_V(g) = \frac 1 {\dim V}\left<f, \overline\chi_V\right>.\end{aligned}

But f is orthogonal to each \overline\chi_V = \chi_{V^*} so c=0. Thus σ=0 on an irreducible representation V and hence on any representation as well. In particular, σ=0 on C[G] itself so 0 = σ·1 = σ. ♦

Corollary. The number of irreducible representations of G is the number m of its conjugacy classes.

The m × m table comprising the values \chi_V(g) is called a character table.

Note.

On an intuitive level, characters thus provide an excellent way to analyse non-abelian groups. Indeed, if G is "highly non-abelian" (e.g. simple), it has fewer conjugacy classes and thus a smaller character table. On the other hand, abelian groups have the largest possible character table since every conjugacy class has size 1.

blue-lin

G9. Degrees of Irreducible Characters

If V is a C[G]-module, its dimension as a C-space is called its degree.

Let’s take the obvious C[G]-module, i.e. C[G] itself (on the representation side, this gives rise to the regular representation we saw in example 3, G1). Writing C[G] as a direct sum of irreducible representations V_1^{d_1}\oplus V_2^{d_2} \oplus \ldots \oplus V_m^{d_m}, the corresponding character gives:

\chi_{reg} = d_1\chi_1 + d_2\chi_2 + \ldots + d_m \chi_m, where each d_i \ge 0.

To compute each d_i, we take the inner product d_i = \left<\chi_{reg}, \chi_i\right>. But \chi_{reg} is easy to compute: it takes e to |G| and all other g\in G to zero (since if g≠e, the action of g on G via left-multiplication has no fixed point). Thus, d_i = \frac 1 {|G|} |G|\cdot \chi_i(1)=\chi_i(1) =\dim_{\mathbf{C}} V_i.

We have thus proven:

Theorem. If V is an irreducible representation of G, the number of times it occurs in the regular representation is dim(V).

Corollary. |G| = \sum_{i=1}^m d_i^2. [ Follows from \chi_{reg}(1) = \sum_{i=1}^m d_i \chi_i(1). ]

Corollary. If G is abelian, then each d_i = 1. [ This follows since the character table is |G| × |G|: there are |G| irreducible characters, so \sum_i d_i^2 = |G| forces each d_i = 1. ]

Example of a Character Table

Here’s the character table of the symmetric group S4. It’s 5 × 5 since the number of conjugacy classes of S4 is p(4) = 5. The size of each conjugacy class is written in square brackets [s].

character_table_s4

In the next installation, we’ll see how this table is obtained.


Quick Guide to Character Theory (I): Foundation

Character theory is one of the most beautiful topics in undergraduate mathematics; the objective is to study the structure of a finite group G by letting it act on vector spaces. Earlier, we had already seen some interesting results (e.g. proof of the Sylow theorems) by letting G act on finite sets. Since linear algebra has much more structure, one might expect an even deeper theory.

Some prerequisites for understanding this set of notes:

  • basic group theory, up to group quotients and homomorphisms;
  • linear algebra, including tensor product of vector spaces;
  • elementary theory of (left) modules over non-commutative rings, including up to module quotients and homomorphisms.

[ We’ve yet to cover module theory and linear algebra; hopefully this will be rectified in the future. ]

At one point, one also needs to take the tensor product S\otimes_R M, where M is an R-module and R\subseteq S are (non-commutative) rings. But this is a rather minor aspect, and we’ll also describe the explicit construction so the reader can just accept some of the results at face value for now.

Throughout this document, G denotes a finite group and all linear algebra is performed over a field K. As time passes by, we’ll restrict ourselves to fields of characteristic 0, and then finally to the complex field (or any of your favourite algebraically closed fields of characteristic 0). Also, all vector spaces over K are assumed to be of finite dimension.

Let’s begin.

blue-lin

G1. Group Representations and Examples

We define:

Definition. A representation of a group G is a group homomorphism \rho : G \to GL(V), where V is a finite-dimensional vector space over field K.

If we fix a basis for V, then this is tantamount to giving a group homomorphism G \to GL_n(K), where n = dim(V). Thus, each element of G now corresponds to an n × n matrix with entries in K such that product in G corresponds to product of matrices.

[ Note: throughout all notes on this site, matrix representation for a linear map is obtained via M\cdot v, where M is a matrix and v is a column vector. Thus, if dim(V)=m, dim(W)=n and T : V → W, then the underlying matrix has m columns and n rows, i.e. n × m. ]

Just like the case of group actions, we can think of a group representation as providing a map:

G\times V\to V, \quad (g, v) \mapsto (\rho(g))(v)

which is conveniently denoted g·v instead. This satisfies e\cdot v = v and (g_1 g_2)\cdot v = g_1\cdot (g_2\cdot v) for all group elements g_1, g_2\in G and v\in V. Under this notation, one also says G acts on V.

Examples

  1. Let dim(V)=1 and G act trivially on it. Thus \rho:G \to GL(V)\cong K^* takes every g to 1. We call this the trivial representation.
  2. Suppose G = S_n is the full symmetric group. Let dim(V)=1 and let G act on it via \rho(g) = \text{sgn}(g), where sgn(g) = +1 if g is an even permutation and -1 if it’s odd. This is called the alternating representation. Note that it’s only available for S_n and not for any old group.
  3. Let G = S_n again, and let V be of dimension n, spanned by the basis e_1, \ldots, e_n. Now g\in S_n acts on V by taking e_i \mapsto e_{g(i)}. E.g. if n = 3, the representation is:

e \mapsto \begin{pmatrix} 1&0&0\\ 0&1&0 \\ 0&0&1\end{pmatrix},\ (1, 2) \mapsto \begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1\end{pmatrix},\ (1,3) \mapsto \begin{pmatrix} 0&0&1 \\ 0&1&0\\ 1&0&0\end{pmatrix},

(2, 3)\mapsto \begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0\end{pmatrix}, \ (1,2,3) \mapsto \begin{pmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0\end{pmatrix},\ (1,3,2)\mapsto \begin{pmatrix} 0&1&0 \\ 0&0&1 \\ 1&0&0\end{pmatrix}.

  4. Let G = \{e, a, a^2\} be a cyclic group of order 3 and dim(V)=2. A representation of G is given by: \rho(a) = \begin{pmatrix} -1 & 1\\ -1 & 0\end{pmatrix}. Since this matrix is of order 3, the map is well-defined.
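These examples are easy to verify by machine. Here's a quick sketch in Python (my own check, with permutations of {0, 1, 2} standing in for S3): the matrices of example 3 multiply the way the group elements do, and the matrix of example 4 really has order 3.

```python
import numpy as np
from itertools import permutations

def perm_matrix(g):              # e_i -> e_{g(i)}, so column i has a 1 in row g(i)
    n = len(g)
    M = np.zeros((n, n), dtype=int)
    for i in range(n):
        M[g[i], i] = 1
    return M

def comp(g, h):                  # (g ∘ h)(i) = g[h[i]]
    return tuple(g[i] for i in h)

# rho(g) rho(h) = rho(gh) for all g, h in S3: rho is a group homomorphism
for g in permutations(range(3)):
    for h in permutations(range(3)):
        assert np.array_equal(perm_matrix(g) @ perm_matrix(h),
                              perm_matrix(comp(g, h)))

# The matrix in example 4 has order 3, as required for it to define rho(a)
A = np.array([[-1, 1], [-1, 0]])
assert np.array_equal(np.linalg.matrix_power(A, 3), np.eye(2, dtype=int))
```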

Regular Representation

Example 3 above is clearly generalisable: if G acts on a finite set X, then let V be a vector space with abstract basis \{e_x : x\in X\}. Thus, dim(V) = #X. Now g\in G acts on V by taking e_x \mapsto e_{g\cdot x}.

In particular, any group G acts on itself by left multiplication, so this gives a representation of dimension #G. Explicitly, V is given an abstract basis \{e_g : g\in G\} and the action of h\in G is given by:

\rho_{reg}(h) : V\to V, \quad e_g \mapsto e_{hg}.

This is called the regular representation of group G. Note that example 3 is not the regular representation since in the regular representation of S3, dim(V) = 3! = 6.

blue-lin

G2. The Group Algebra

We define:

Definition. Given field K and finite group G, the group algebra K[G] is a K-vector space with an abstract basis given by:

\{e_g : g\in G\}

and multiplication K[G]\times K[G]\to K[G] given by e_g \cdot e_h = e_{gh} and extended linearly.

Some concrete computations will make it much clearer. Suppose G=S_3 and K=C. Then a typical product of elements of C[G] looks like:

\begin{aligned}&(\frac 1 2 e_{(1,2)} + \sqrt 2 e_{(1, 3,2)}) (-3 e_{(1,2)} + \sqrt 3 e_{(1,2,3)})\\ = &(-\frac 3 2 e_1 + \frac {\sqrt 3}2 e_{(2, 3)}) + (-3\sqrt 2 e_{(2,3)} +\sqrt 6 e_1)\\ = &(\sqrt 6 - \frac 3 2)e_1 + (\frac{\sqrt 3} 2 - 3\sqrt 2)e_{(2,3)}.\end{aligned}
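This kind of bookkeeping is easy to mechanise. Here's a small dict-based sketch of C[S3] in Python (my own illustration, using 0-indexed permutation tuples, e.g. (1, 2, 0) for the cycle (1, 2, 3)):

```python
from itertools import product
from math import sqrt, isclose

def comp(g, h):                           # e_g * e_h = e_{gh}, where (gh)(i) = g[h[i]]
    return tuple(g[i] for i in h)

def multiply(u, v):
    """Multiply two elements of C[S3], each a dict {permutation tuple: coefficient}."""
    out = {}
    for (g, cg), (h, ch) in product(u.items(), v.items()):
        k = comp(g, h)
        out[k] = out.get(k, 0) + cg * ch  # collect coefficients on e_{gh}
    return out

# 0-indexed tuples for e, (1,2), (2,3), (1,2,3), (1,3,2):
e, t12, t23 = (0, 1, 2), (1, 0, 2), (0, 2, 1)
c123, c132 = (1, 2, 0), (2, 0, 1)

u = {t12: 0.5, c132: sqrt(2)}             # (1/2) e_{(1,2)} + sqrt(2) e_{(1,3,2)}
v = {t12: -3.0, c123: sqrt(3)}            # -3 e_{(1,2)} + sqrt(3) e_{(1,2,3)}
w = multiply(u, v)

assert isclose(w[e], sqrt(6) - 1.5)                  # coefficient of e_1
assert isclose(w[t23], sqrt(3) / 2 - 3 * sqrt(2))    # coefficient of e_{(2,3)}
```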

The following should now be clear.

Theorem. The group algebra K[G] is a ring which contains K as a subring. It is commutative if and only if G is abelian.

As a ring, we can talk about left modules over K[G]. These turn out to correspond precisely to representations of G.

Let’s do the easy direction first: suppose we’re given a left K[G]-module V. Then V is naturally a K-vector space and we obtain an action of G on V by restricting the left-module action K[G] \times V \to V to the basis \{e_g : g\in G\} \subset K[G]. Since e_g \cdot e_h = e_{gh} for any g, h\in G, we get a representation of G on V.

Conversely, suppose G acts on V via K-linear maps, i.e. every g\in G gives rise to a linear map \rho(g) : V\to V. We’ll define a K[G]-module structure on V, by first decreeing that e_g\in K[G] act on V via ρ(g), then extending linearly to the whole K[G]. Explicitly:

\begin{aligned}K[G] \times V\to V\end{aligned} takes \begin{aligned}\left(\sum_{g\in G} c_g e_g, v\right) \mapsto \sum_{g\in G} c_g \rho(g)(v) \in V.\end{aligned}

Concrete Example

Consider example 4 from section G1, where G = \{e, a, a^2\} is cyclic of order 3 and the representation takes a to \begin{pmatrix} -1 & 1 \\ -1 & 0\end{pmatrix}. Now a typical element of K[G] is of the form:

c_0 e + c_1 a + c_2 a^2 where c_0, c_1, c_2 \in K.

The corresponding matrix is then:

c_0 \rho(e) + c_1 \rho(a) + c_2 \rho(a)^2 = c_0\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} + c_1 \begin{pmatrix} -1 & 1 \\ -1 & 0\end{pmatrix} + c_2 \begin{pmatrix} 0 & -1 \\ 1 & -1\end{pmatrix},

or \begin{pmatrix} c_0 - c_1 & c_1-c_2 \\ -c_1+c_2 & c_0-c_2 \end{pmatrix}. This represents the action of c_0 + c_1 a + c_2 a^2 \in K[G] on V as a K[G]-module.
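One can sanity-check this closed form numerically (a sketch with an arbitrary choice of coefficients):

```python
import numpy as np

A = np.array([[-1, 1], [-1, 0]])          # rho(a) from example 4, of order 3
c0, c1, c2 = 2, 3, 5                      # an arbitrary element c0 + c1*a + c2*a^2

acted = c0 * np.eye(2, dtype=int) + c1 * A + c2 * (A @ A)
formula = np.array([[c0 - c1, c1 - c2], [-c1 + c2, c0 - c2]])
assert np.array_equal(acted, formula)
```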

blue-lin

G3. Creating New Representations

We’ll look at ways to create new representations of G from existing ones.

A. Direct Sum

If R is a ring, then the direct sum of two R-modules is another one. In particular, this holds for RK[G] as well. Specifically, if \rho_1 : G \to GL(V_1) and \rho_2 : G\to GL(V_2) are both representations, then the direct sum V := V_1\oplus V_2 gives:

\rho : G \to GL(V_1 \oplus V_2), \quad g\cdot (x, y) := (g\cdot x, g\cdot y).

If we pick bases of V1 and V2, then the resulting basis of V=V_1 \oplus V_2 gives the matrix of g : V → V as

matrix_direct_sum

B. Submodules and Quotients

Generally, if M is a left R-module and N\subseteq M a submodule, we get a quotient module M/N. When V is a left K[G]-module, a submodule W\subseteq V is also called a G-invariant subspace: it is a vector subspace satisfying g(W) \subseteq W for each g\in G. Conversely, if W is a vector subspace of V which is invariant under all g\in G, then it is a K[G]-submodule.

If we pick a basis of W and extend it to V, then the matrix representation of g\in G is:

matrix_module_quotient

C. Tensor Product

If V and W are K-vector spaces, we can take their tensor product over K: X = V\otimes W. Explicitly, if \{e_i\}_{i\in I} is a basis of V and \{f_j\}_{j \in J} a basis of W, then \{e_i\otimes f_j\}_{(i,j)\in I\times J} gives a basis of the tensor product X.

Given g\in G, since the action is linear on both V and W, this induces a linear map

\phi_g : V\otimes W\to V\otimes W, \quad v\otimes w \mapsto (gv)\otimes (gw).

Note that \phi_g \circ \phi_{g'} = \phi_{gg'}; indeed, on elements v\otimes w this is easily seen to be true:

\phi_g(\phi_{g'}(v\otimes w)) = \phi_g((g'v)\otimes (g'w)) = g(g'v)\otimes g(g'w) = (gg')v\otimes (gg')w = \phi_{gg'}(v\otimes w).

Since the set of all such elements spans V\otimes W, the result follows. In terms of matrix representation, we get:

matrix_tensor_product

D. Space of Linear Functions

Suppose V and W are K[G]-modules. The space of all K-linear maps X := \text{Hom}_K(V, W) is also a K[G]-module. To define the action of G on X, let’s imagine a K-linear map f : V → W written in the form of a huge lookup table of pairs (v, f(v)), such that each v occurs exactly once on the left. Now let G act on the entire table by replacing (v, f(v)) with the pair (g·v, g·f(v)). Unwinding the definition, we see that G acts on X via:

g : \text{Hom}_K(V,W) \to \text{Hom}_K(V,W), \quad f \mapsto (g\circ f \circ g^{-1} : V\to W).

Note that in the composition g\circ f\circ g^{-1}, the left g acts on W while the right g^{-1} acts on V.
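We can sanity-check that f \mapsto g\circ f\circ g^{-1} really is a left action, i.e. that acting by g' and then g agrees with acting by gg'. A numpy sketch, with made-up invertible matrices standing in for the actions of g and g' on V and W:

```python
import numpy as np

rng = np.random.default_rng(1)

def act(gV, gW, F):
    # g sends f to g o f o g^{-1}: the left g acts on W, the right g^{-1} on V
    return gW @ F @ np.linalg.inv(gV)

# Hypothetical invertible matrices for g, g' on V (dim 2) and W (dim 3);
# adding 3*I keeps them comfortably invertible
gV, hV = rng.standard_normal((2, 2)) + 3*np.eye(2), rng.standard_normal((2, 2)) + 3*np.eye(2)
gW, hW = rng.standard_normal((3, 3)) + 3*np.eye(3), rng.standard_normal((3, 3)) + 3*np.eye(3)
F = rng.standard_normal((3, 2))      # a K-linear map V -> W

# g.(g'.F) = (g g').F, so this is a left action on Hom_K(V, W)
assert np.allclose(act(gV, gW, act(hV, hW, F)),
                   act(gV @ hV, gW @ hW, F))
```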

E. Dual Space.

A special case of the above is when W = K with the trivial representation. The resulting Hom_K(V, K) is known in linear algebra as the dual space V*. The above definition then gives us an action of G on V* via: f \mapsto f\circ g^{-1}\in V^*.

Let’s do a sanity check here. From linear algebra, there’s a canonical isomorphism:

V^*\otimes W \cong \text{Hom}_K(V, W), \quad (f\otimes w) \mapsto (v \mapsto f(v)w).

If both V and W are K[G]-modules, then there appear to be two different ways to define a G-action on Hom_K(V, W). Fortunately, both ways are identical; this can be checked by letting g\in G act on the element f\otimes w on the left and the map v\mapsto f(v)w on the right.

  • On the left, we get (f\circ g^{-1})\otimes (gw).
  • On the right, we get the composition v\overset{g^{-1}}\longrightarrow g^{-1}v \rightarrow f(g^{-1}v)w \overset{g}\longrightarrow f(g^{-1}v)\cdot gw, which is the image of (f\circ g^{-1})\otimes (gw).
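This computation can also be verified numerically: under f\otimes w \mapsto (v \mapsto f(v)w), whose matrix is the outer product of w with f, the two actions agree. A numpy sketch, with random data standing in for f, w and the action of g:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical invertible matrices for g on V (dim 2) and W (dim 3)
gV = rng.standard_normal((2, 2)) + 3*np.eye(2)
gW = rng.standard_normal((3, 3)) + 3*np.eye(3)
f = rng.standard_normal(2)          # a functional f in V*, as a row vector
w = rng.standard_normal(3)          # a vector in W

# f (x) w corresponds to the rank-one map v |-> f(v) w, with matrix outer(w, f)
F = np.outer(w, f)

gV_inv = np.linalg.inv(gV)
left  = np.outer(gW @ w, f @ gV_inv)     # (f o g^{-1}) (x) (g w)
right = gW @ F @ gV_inv                  # g o F o g^{-1} in Hom_K(V, W)
assert np.allclose(left, right)
```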

blue-lin

In a Nutshell

Given a finite group G and a field K, we’ve defined the group algebra K[G], which is a ring containing K. This is done by using an abstract basis \{e_g : g\in G\}, so that the dimension of K[G] is precisely the order of G. The product is defined via e_g \cdot e_{g'} = e_{gg'} and extended linearly.
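The multiplication rule is easy to implement directly. Here is a minimal sketch for G = Z/3 over the rationals, with elements of K[G] stored as coefficient dictionaries (this storage format is just one convenient choice, not anything canonical):

```python
from collections import defaultdict
from fractions import Fraction

N = 3  # order of the cyclic group G = Z/3

def mul(a, b):
    """Product in K[G]: e_g * e_h = e_{g+h mod N}, extended bilinearly."""
    out = defaultdict(Fraction)
    for g, cg in a.items():
        for h, ch in b.items():
            out[(g + h) % N] += cg * ch
    return dict(out)

x = {0: Fraction(1), 1: Fraction(2)}        # e_0 + 2 e_1
y = {1: Fraction(1), 2: Fraction(-1)}       # e_1 - e_2

# (e_0 + 2e_1)(e_1 - e_2) = e_1 - e_2 + 2e_2 - 2e_0 = -2e_0 + e_1 + e_2
assert mul(x, y) == {0: Fraction(-2), 1: Fraction(1), 2: Fraction(1)}
```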

There’s a one-to-one correspondence between (1) K[G]-modules, and (2) linear representations of G on K-vector spaces.

The usual operations to construct new K[G]-modules are (A) direct sums, (B) submodules and quotients, (C) tensor products, (D) Hom_K(V, W) and (E) duals.

Everything presented so far is rather generic; in fact, one could even take K as any commutative ring and there’d be no effect on the theory thus far. In the next installation, we’ll explore the structure of K[G]-modules in greater detail.


Topology: More on Algebra and Topology

We’ve arrived at the domain where topology meets algebra. Thus we have to proceed carefully to ensure that the topologies of our algebraic constructions are well-behaved.

Let’s look at topological groups again. Our first task is to show that the topologies of subgroups and quotient groups commute.

Proposition 1. Suppose N is a normal subgroup of G. If H is a subgroup of G containing N then there’re two ways to obtain the topology on H/N:

  • apply quotient topology to G/N, then subspace topology to H/N;
  • apply subspace topology to H, then quotient to H/N.

The two topologies are identical.

This follows from the more general fact that if p : X → Y is an open quotient map, then for any subspace Z\subseteq Y, the restriction q = p|_{p^{-1}(Z)} : p^{-1}(Z) \to Z is also a quotient map.

Once again, the fact that p is open is critical.

[ To see why, suppose V is a subset of Z such that q^{-1}(V) is open. Hence q^{-1}(V) = p^{-1}(Z) \cap U for some open subset U of X. Then p(U) is open. We claim p(U) \cap Z = V. Indeed, if x\in U and p(x)\in Z, then x \in U\cap p^{-1}(Z) = q^{-1}(V), so p(x) is in V. Conversely, if y\in V \subseteq Z, then since q is surjective, pick x\in q^{-1}(V) \subseteq U such that q(x)=y. Hence y lies in Z as well as p(U). ]

Isomorphism Theorems

For a group homomorphism f : G → H, we saw that this induces a continuous bijective group homomorphism G/ker(f) → im(f). What about the remaining two isomorphism theorems of group theory?

Proposition 2. Suppose H is a subgroup and N a normal subgroup of the topological group G. Then HN = \{xy : x\in H, y\in N\} is a subgroup of G and H/(H ∩ N) → (HN)/N is a bijective continuous homomorphism of topological groups.

Proof.

The composition H → HN → HN/N is surjective and has kernel H ∩ N. Thus the resulting map H/(H ∩ N) → HN/N is a continuous bijective homomorphism. ♦

warning As the reader may suspect, the resulting map is not a homeomorphism in general. For example, consider G = R × R with subgroups N = Z × Z and H = set of all real multiples of (1, √2). Then H/(H ∩ N) = H is isomorphic to the real line. On the other hand, (H+N)/N is not homeomorphic to H. Indeed, by proposition 1, (H+N)/N inherits the subspace topology from G/N \cong S^1 \times S^1 and is a dense subset:

dense_line_in_torus

Next, we have:

Proposition 3. Suppose N\subseteq H are both normal subgroups of G. Then we have an isomorphism of topological groups (G/N)/(H/N) → G/H.

Proof.

The canonical map G/N → G/H is continuous and surjective, with kernel = H/N, so we do get a bijective continuous homomorphism of groups (G/N)/(H/N) → G/H.

To prove that the reverse G/H → (G/N)/(H/N) is continuous, it suffices by universal property to show: composing with the quotient map G → G/H gives a continuous map. But this is obvious since it’s the result of composing G → G/N → (G/N)/(H/N). ♦

Summary.

When topology is taken into account, the first and second isomorphism theorems give continuous bijective homomorphisms of the underlying objects, but the third isomorphism theorem gives an actual isomorphism.

blue-lin

Other Algebraic Objects

Let’s look at some other algebraic objects with topology added.

Example 1: Topological Ring.

Definition. A topological ring is a ring R equipped with a topology such that the addition and product maps are continuous.

What about subtraction? Fortunately, a ring has -1, so subtraction a - b = a + (-1)×b is continuous. Examples of topological rings include Z (discrete topology), \mathbf{Z}[1/2] = \{a/2^n : a\in\mathbf{Z}, n\ge 0\}, and the p-adic integers Zp, which will be covered later.

Since we said nothing about the inverse map (x → 1/x) being continuous, the group of units R* may not be a topological group if we let it take the subspace topology from R. The usual trick is to embed R* → R × R via x → (x, 1/x) instead and give R* the subspace topology from R × R. Now inverse is merely swapping of two coordinates so it’s continuous.

Consistency of Inverse

If inverse on R* were already continuous as a subspace of R, then we get precisely the same topology. Specifically, let T (resp. T’) be the subspace topology from R (resp. R × R) and let f : (R*, T) → (R*, T’) be the identity map.

  • From the universal property of subspaces (see exercise after prop. 5 here), f is continuous iff i'\circ f: (R^*, T) \to R\times R is continuous. But this map takes x to (x, 1/x) which is continuous since inverse is continuous on (R*, T). Hence, f is continuous.
  • Conversely, the inverse of f is continuous iff i\circ f^{-1}:(R^*, T')\to R is continuous; the latter map takes (x, 1/x) to x, which is clearly continuous.

Example 2: Topological Field.

A topological division ring / field is one equipped with a topology such that the addition, product and reciprocal (x → 1/x) maps are continuous. [ The final map has to be restricted to the subspace of non-zero elements. ]

The division map (x, y) → x/y is also continuous since it’s a composition of continuous maps. The most common topological fields are R, C and the extensions of p-adic fields.

Example 3: Topological Vector Space.

A topological vector space over a topological field K is a vector space V equipped with a topology such that vector addition (V × V → V) and scalar multiplication (K × V → V) are continuous. Topological vector spaces are a huge topic in functional analysis, and they’re a generalisation of normed vector spaces.

One can show that if V and W are n-dimensional Hausdorff topological vector spaces over R (n finite), then V and W are isomorphic. However, things are far more complicated for infinite-dimensional vector spaces. In particular, two vector spaces with the same dimension can have different topological properties.

Example 4: Continuous Action of a Group

A topological group G is said to act continuously on a topological space X if the underlying group action G × X → X is continuous. Thus for each g\in G, the group action l_g:X\to X, x\mapsto gx is a homeomorphism. Indeed, the inverse is given by l_{g^{-1}} which is continuous.

Thus, a group action gives rise to a group homomorphism G → Homeo(X), although the converse isn’t true.

blue-lin

The typical constructions on algebraic objects can be extended to algebraic objects with topology.

Products

For instance, if V and W are topological vector spaces over K, then so is V × W. Let’s show, as an example, that scalar multiplication m : K × (V × W) → V × W is continuous. The proof merely relies on the universal property of products, i.e. it suffices to show that composing with projections

\pi_V\circ m : K\times V\times W \to V and \pi_W\circ m:K\times V\times W \to W

give continuous maps. But these maps take (cvw) to cv and cw respectively, and the result follows from continuity of scalar multiplication on V and W.

Likewise, one can show that if R and S are topological rings, then so is R × S.

Quotients

Quotients are fine thanks to the fact that for algebraic objects, the quotient map p : A → A/B is usually open, which allows us to use the algebraic quotient lemma. E.g. since the canonical map p : G → G/H is open for any subgroup H of G, so is R → R/I for any ideal I of R, as is V → V/W for any vector subspace W of V. We’ll look at two examples to show that the quotients induce continuous maps as well.

Example 1.

Let I be an ideal of R. We claim that the induced multiplication map:

m' : (R/I)\times (R/I) \to R/I

is continuous. To that end, we use the algebraic quotient lemma to show that the canonical map q : R × R → (R/I) × (R/I) is a quotient map. Hence m’ is continuous if and only if m’∘q : R × R → R/I is continuous. But m’∘q is identical to composing the ring product R × R → R with the canonical R → R/I, which is clearly continuous.

Example 2.

Consider a group action G × X → X such that the action of the normal subgroup N\triangleleft G is trivial. This induces an action m’ : (G/N) × X → X which we claim is continuous. Since q : G × X → (G/N) × X is a quotient map, it suffices to show m’∘q is continuous. But m’∘q is precisely the original group action G × X → X, so we’re done.

Isomorphism Theorems

The three isomorphism theorems generalise in a similar manner. E.g. for a ring R, if I\subseteq J are ideals of R, then (R/I)/(J/I) is isomorphic to R/J as topological rings. Indeed, from classical ring theory, we have an isomorphism for the underlying ring structure. For the topology, we look at the underlying additive groups; by proposition 3, (R/I)/(J/I) and R/J are isomorphic topological groups. Hence, the two structures are isomorphic topological rings.

Likewise, for any subspaces W\subseteq W' of a topological vector space V, (V/W)/(W’/W) and V/W’ are isomorphic topological vector spaces.

blue-lin

Optional Case Study: Connectedness of SO(n).

Here’s a well-known example from manifold theory: the proof that SO(n) is connected.

First we define O(n) to be the set of all n × n matrices M with real entries such that M^t M = I. This is a group under matrix multiplication since:

  • M, N \in O(n)\implies (MN)^t (MN) = N^t M^t M N = N^t N = I and
  • if M\in O(n) then M^t is its inverse and M^t \in O(n).

O(n) becomes a topological group if we provide it the subspace topology from \mathbf{R}^{n^2} since the product and inverse maps are all given by polynomials in the matrix entries (inverse is particularly easy: it’s just the transpose).

Summary. O(n) is a topological group, called the orthogonal group.

Next, we look at the geometry behind O(n). Denote the column vectors of M\in O(n) by:

M = (\mathbf{v}_1 | \mathbf{v}_2|\ldots |\mathbf{v}_n);

then M^t M = I just means

\mathbf{v}_i \cdot \mathbf{v}_j = \mathbf{v}_i^t \mathbf{v}_j =\begin{cases} 1, &\quad \mbox{if } i =j,\\ 0, &\quad\mbox{if }i\ne j.\end{cases}

We also write this as \mathbf{v}_i\cdot \mathbf{v}_j = \delta_{ij}, where δij is the Kronecker delta function which returns 1 if i=j and 0 otherwise. In other words, M maps the standard basis to the orthonormal basis \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}. Hence a matrix in O(n) preserves the geometry of the Euclidean space by leaving distances and angles invariant.
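As a quick numerical illustration of these two facts, using a rotation matrix in O(2) as the example:

```python
import numpy as np

theta = 0.7
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a rotation, hence in O(2)

# Columns form an orthonormal basis: M^t M = I
assert np.allclose(M.T @ M, np.eye(2))

# Distances and angles are preserved
u, v = np.array([1.0, 2.0]), np.array([-3.0, 0.5])
assert np.isclose(np.dot(M @ u, M @ v), np.dot(u, v))
assert np.isclose(np.linalg.norm(M @ u), np.linalg.norm(u))
```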

Property 1. O(n) is compact.

Proof.

Indeed, O(n)\subset \mathbf{R}^{n^2} is closed since it’s defined by explicit polynomial equations in the matrix entries p_{11} = 1, p_{12} = 0, \ldots. On the other hand, O(n) is bounded since each column vector of M\in O(n) has unit length and thus each entry of M is in [-1, +1]. ♦

Next, consider the determinant map det: O(n) → R* which is a continuous group homomorphism. Since:

1 = \det(I) = \det(M^t M) = \det(M^t) \det(M) =\det(M)^2

the image of det is {+1, -1}. The kernel is denoted by SO(n), called the special orthogonal group.

Note. On an intuitive level, the special orthogonal group SO(n) comprises the rotations in n-space, while the orthogonal group O(n) also includes the reflections, which have det -1.

Note that since det(O(n)) is not connected, neither is O(n). However, we wish to prove that SO(n) is connected, via the following steps.

Step 1: SO(1) is connected

Obvious since SO(1) = {1}.

Step 2: Let SO(n) act on the Euclidean space.

Thus, SO(n)\times \mathbf{R}^n \to \mathbf{R}^n takes (M, \mathbf{x})\mapsto M\mathbf{x}, which is a continuous action. Let’s compute the isotropy group of x = e_n, the last standard basis vector. This is the set of all M\in SO(n) whose last column is e_n. From M^t M=I we see that M must be of the form:

M = \begin{pmatrix} N & \mathbf{0}\\ \mathbf{0} & 1\end{pmatrix}, where N\in SO(n-1).

Hence, this gives a bijective continuous map from SO(n)/SO(n-1) to the orbit of e_n. What might this orbit be? First, M\mathbf{e}_n is the last column of M, so it’s a unit vector. Conversely, any unit vector x can be extended to an orthonormal basis via the Gram-Schmidt process. Thus, the orbit is precisely:

S^{n-1} = \{\mathbf{x} \in \mathbf{R}^n : ||x|| = 1\}.

[ The superscript n-1 comes from the fact that the space is (n-1)-dimensional, even though the ambient space is Rn. ]
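The Gram-Schmidt step can be carried out numerically via a QR decomposition: given any unit vector x, we produce a matrix in SO(n) whose last column is x, exhibiting x in the orbit of e_n. A numpy sketch (the random vector x is just a test input):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

x = rng.standard_normal(n)
x /= np.linalg.norm(x)               # a unit vector on S^{n-1}

# Extend x to an orthonormal basis: QR orthonormalises columns left to right
A = np.column_stack([x, rng.standard_normal((n, n - 1))])
Q, R = np.linalg.qr(A)
Q = Q * np.sign(R[0, 0])             # make the first column +x rather than -x

M = np.column_stack([Q[:, 1:], Q[:, 0]])   # move x into the last column
if np.linalg.det(M) < 0:                   # flip one other column to land in SO(n)
    M[:, 0] = -M[:, 0]

e_n = np.zeros(n); e_n[-1] = 1.0
assert np.allclose(M.T @ M, np.eye(n))     # M is orthogonal
assert np.isclose(np.linalg.det(M), 1.0)   # ... and special
assert np.allclose(M @ e_n, x)             # M moves e_n to x
```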

In short, we have a continuous bijective map:

\phi:SO(n) / SO(n-1) \to S^{n-1}.

Step 3: The map is a homeomorphism.

Indeed, since O(n) is compact, so is SO(n) – being a closed subset. Hence, SO(n)/SO(n-1) is also a compact space. Now a continuous bijection from a compact space to a Hausdorff space must be a homeomorphism.

Step 4: The (n-1)-sphere Sn-1 is connected.

Consider the two hemispheres:

S^+ = \{(x_1, \ldots, x_n)\in S^{n-1} : x_1 \ge 0\} and S^- = \{(x_1, \ldots, x_n)\in S^{n-1} : x_1 \le 0\}.

The projection map S^+ \to \mathbf{R}^{n-1}, (x_1, \ldots, x_n) \mapsto (x_2, \ldots, x_n) which drops the first coordinate is injective and continuous. The image is precisely the unit disc D:=\{(x_2, \ldots, x_n) : x_2^2 + x_3^2 + \ldots + x_n^2 \le 1\}. Since S+ is compact, the projection map is thus a homeomorphism onto D and so S+ is connected.

By the same token, S^- is connected too. Since S^+ \cap S^- \ne \emptyset, their union S^{n-1} is also connected.

Step 5: Assuming SO(n-1) and SO(n)/SO(n-1) are connected, so is SO(n).

More generally, if H is a subgroup of topological group G such that H and G/H are connected, so is G.

Indeed, suppose G = U\cup V for disjoint open subsets U, V. If U contains a point x, it must contain the entire coset xH; otherwise, xH = (xH \cap U)\cup (xH \cap V) would be a partition of xH into disjoint open subsets, contradicting connectedness of xH.

Hence, U and V are disjoint unions of cosets. So if p : G → G/H denotes the canonical map, we have p(U)\cap p(V) = \emptyset. Since p is open, G/H = p(U)\cup p(V) for disjoint open subsets p(U), p(V) of G/H. This must mean p(U)=\emptyset or p(V)=\emptyset, so U or V is empty.

This completes the induction, and we’ve shown that SO(n) is connected for each n.


Topology: Quotients of Topological Groups

Topology for Coset Space

This is really a continuation from the previous article. Let G be a topological group and H a subgroup of G. The collection of left cosets G/H is then given the quotient topology. This quotient space, however, satisfies an additional property.

Definition. A map f : X → Y between two topological spaces is said to be open if, for any open subset U of X, f(U) is open in Y.

Proposition 1. The map p : G → G/H is open.

Proof. Let U be an open subset of G. From proposition 2 here, UH is open in G. Since p^{-1}(p(U)) = UH is open, so is p(U) by definition of the quotient topology. ♦

warning In general, not every quotient map p : X → X/~ is open. For example, glue the endpoints of I = [0, 1] together and form the quotient map p:I\to S^1. Then U = (1/2, 1] is open in I but p(U) is not open in S1.

We should say something about open maps since this is our first encounter with them. Don’t worry too much for now – they don’t appear all that often.

Properties of Open Maps

  • The identity map on any X is open.
  • If f : X → Y and g : Y → Z are open, then so is g∘f : X → Z.
  • If f : X → Y is open and f(X)\subseteq Z\subseteq Y, then restricting the codomain to f : X → Z still gives an open map, since if f(U) is open in Y, then f(U) ∩ Z = f(U) is open in Z.
  • Restricting the domain doesn’t give an open map in general: e.g. if we restrict id : R → R to the closed interval [0, 1], the image of [0, 1] is not open. However, restricting f : X → Y to an open subset U of X gives an open map U → Y.
  • The projection map \prod_i X_i \to X_i is open whether we use the product or box topology. To prove that, use the fact that for any map f:A\to B, we have f(\cup_i A_i) = \cup_i f(A_i).
  • If each f_i : X_i \to Y_i is open, then the resulting f:\prod_i X_i \to \prod_i Y_i which takes (x_i) \mapsto (f_i(x_i)) is also open. Again, its proof requires f(\cup_i A_i) = \cup_i f(A_i).

Exercises

Generally, open continuous maps preserve local properties. For example, prove the following for an open continuous f : X → Y.

  • If X is locally connected, then so is f(X).
  • If X is locally path-connected, then so is f(X).
  • Let X and Y be Hausdorff. If X is locally compact, then so is f(X).

Answers (Highlight to Read)

  • If y is in f(X), pick x in X such that f(x)=y. Then x is contained in some connected open subset U of X. So f(x) lies in f(U) which is an open and connected subset of Y and hence f(X).
  • Same as connected.
  • If y is in f(X), pick x in X such that f(x)=y. Then x is contained in some open subset U of X whose closure cl(U) is compact. So f(x) is contained in f(U), which is open in Y and hence f(X). Also f(cl(U)) is compact and hence closed in f(X), so the closure of f(U) in f(X) is contained in f(cl(U)) and thus compact also. 

blue-lin

Properties of Coset Space

Let H be a subgroup of the topological group G. The first result is straightforward.

Theorem 2. G/H is discrete if and only if H is open in G.

Proof.

G/H is discrete iff every singleton subset is open; by the definition of quotient topology, this holds iff every left coset gH is open in G, which holds iff H is open in G. ♦

The next result is trickier.

Theorem 3. The following are equivalent.

  1. G/H is T2.
  2. G/H is T1.
  3. H is closed in G.

Proof.

Let p : G → G/H be the projection map, which is open by proposition 1. Now (1→2) is obvious. (2→3) follows from H = p^{-1}(\{e\}); if G/H is T1, then {e} is closed in G/H, hence H is closed in G.

Finally, suppose H is closed in G; to show that G/H is Hausdorff, recall that it suffices to show that the image of the diagonal map \Delta :G/H \to G/H \times G/H is closed in G/H × G/H. Now its complement is:

U := (G/H \times G/H) - \text{im}(\Delta) = \{(p(g), p(g')) : g^{-1}g' \not\in H\}.

Let q = (p, p) : G × G → G/H × G/H, which is open because p is. Then q^{-1}(U) = \{(g, g')\in G\times G : g^{-1}g'\not \in H\} is open since the continuous map f: G\times G\to G, (x, y)\mapsto x^{-1}y gives us q^{-1}(U) = f^{-1}(G-H).

Since q is surjective, we have U = q(q^{-1}(U)) which is open since q is open. ♦

blue-lin

Group Quotient

Now suppose N is a normal subgroup of the topological group G. It turns out the group operations for G/N are continuous with respect to the quotient topology, i.e. G/N is also a topological group.

Theorem 4. The product map m':G/N \times G/N \to G/N and inverse map i':G/N \to G/N are continuous.

Proof.

First, we prove a general result:

Algebraic Quotient Lemma. Suppose p_i : X_i\to Y_i is a collection of quotient maps which are open. Let q = (p_i) : \prod_i X_i \to \prod_i Y_i be the product map which takes (x_i) \mapsto (p_i(x_i)).

Then q is a quotient map. [ Compare this result with the warning after example 2 here. ]

Proof of AQL.

We need to show that V\subseteq \prod_i Y_i is open if and only if q^{-1}(V) is open. (→) is obvious since q is continuous. For (←), suppose q^{-1}(V) is open. Since each p_i is open, so is q; and since q is surjective, V = q(q^{-1}(V)), which must therefore be open. ♦

We resume our proof of theorem 4 to show m’ is continuous (i’ is easy). Applying the lemma to p : G → G/N, we see that q : G × G → (G/N) × (G/N) is a quotient map too. Thus, showing m’ is continuous is equivalent to showing m’∘q : G × G → G/N is continuous. But m’∘q is obtained by composing the product map G × G → G with the projection G → G/N. Thus, we’re done. ♦

Theorem 5. The quotient group G/N is discrete if and only if N is open in G. Also, G/N is Hausdorff if and only if N is closed in G.

Proof.

Follows immediately from theorem 3 above. ♦

Corresponding to the first isomorphism theorem for groups, we have:

Theorem 6. If f:G\to H is a continuous homomorphism of topological groups, this induces a continuous injective homomorphism g:G/\text{ker}(f)\to H.

Proof. The only non-trivial part is continuity. If p : G → G/ker(f) is the projection map, then g\circ p = f is continuous, so by the universal property of the quotient topology, g is continuous. ♦

warning In general, G/ker(f) doesn’t have the subspace topology from H. For example, f can be an injective continuous map which does not provide G with the subspace topology from H.

E.g. take \mathbf{Z}\to S^1, where m \mapsto (\cos(m\sqrt 2), \sin(m\sqrt 2)).

Next, recall (proposition 7 here) that the connected component of e\in G is a closed normal subgroup N of G.

Theorem 7. If N is the connected component of e, then G/N is totally disconnected.

Proof.

For each g in G, left-multiplication l_g:G\to G, x \mapsto gx is a homeomorphism. Thus, the connected components of G are the cosets gN, for various g. More generally, we’ll prove:

Lemma. In a topological space X, denote x ~ y if they belong to the same connected component. If the projection map p : X → X/~ is open, then X/~ is totally disconnected.

Proof of Lemma.

Suppose there’s a connected subset Y of X/~ containing more than one point. Let Z := p^{-1}(Y), which contains points from more than one connected component, so it is disconnected, i.e. Z=U\cup U' for some non-empty disjoint open subsets U, U’ of Z. Now, if Z contains x, then it contains the entire connected component <x> of x. We claim this holds for U as well. Indeed, we have \left<x\right> = (U\cap \left<x\right>) \cup (U'\cap \left<x\right>) as a disjoint union; since <x> is connected and U\cap \left<x\right>\ne\emptyset we must have U\cap \left<x\right> = \left<x\right> \implies U\supseteq \left<x\right>. Thus U and U’ are both unions of connected components of X.

Write U = Z ∩ V and U’ = Z ∩ V’ for open subsets V, V’ of X. Since p is open, p(V) and p(V’) are open subsets of X/~ satisfying:

  • p(V) \cup p(V') = p(V\cup V') \supseteq p(U\cup U') = p(p^{-1}(Y)) = Y;
  • p(V) \cap p(V')\cap Y=\emptyset: indeed, if it contains y, then pick x\in V, x'\in V' such that p(x) = p(x’) = y. Then x, x'\in p^{-1}(Y) = Z, so they’re in U, U’ respectively. Since p(x) = p(x’), x and x’ belong to the same connected component, which contradicts the fact that U and U’ are disjoint unions of connected components.

Hence, p(V) ∩ Y and p(V’) ∩ Y are disjoint non-empty open subsets of Y with union Y, which contradicts our assumption that Y is connected. ♦

Examples of Group Quotients

  1. Consider Q as a topological subgroup of R. The quotient R/Q has the indiscrete (coarsest) topology: any non-empty open subset of R/Q pulls back to a non-empty open union of cosets of Q, which contains an interval (a, b) and hence contains (a, b) + Q = R.
  2. Take \mathbf{Z}\subset \mathbf{R}. The map \exp: \mathbf{R}\to S^1 which takes t\mapsto (\cos(2\pi t), \sin(2\pi t)) has kernel equal to Z. By theorem 6 above, this gives a bijective continuous homomorphism R/Z → S1. Since the image is Hausdorff, if we could prove R/Z is compact, then we’d have shown that R/Z and S1 are isomorphic topological groups. But R/Z is the continuous image of the compact set [0, 1] under the composition [0, 1] → R → R/Z. Case closed.
  3. If G and H are topological groups, then so is G × H (with the product topology). The continuous bijective group homomorphism (G × H)/H → G is a homeomorphism since it’s open. Thus (G × H)/H and G are isomorphic topological groups.
  4. Take \mathbf{Z}^2\subset \mathbf{R}^2. The same reasoning holds as before and one can show that R2/Z2 is isomorphic to S1 × S1.
  5. Finally, take \text{SL}_n(\mathbf{R}) \subset \text{GL}_n(\mathbf{R}). The determinant map GLn(R) → R* induces a bijective continuous homomorphism GLn(R)/SLn(R) → R*. Now we can’t use the same trick since the groups aren’t compact. Instead, we compose: R* → GLn(R) → GLn(R)/SLn(R) → R*, where the first map takes c to the diagonal matrix with entries (c, 1, …, 1). This composition is the identity map on R*; since the first two maps are continuous, their composition provides a continuous inverse to the last map, which is therefore a homeomorphism.

Topology: Quotient Topology and Gluing

In topology, there’s the concept of gluing points or subspaces together. For example, take the closed interval X = [0, 1] and glue the endpoints 0 and 1 together. Pictorially, we get:

glue_interval

That looks like a circle, but to prove it’s homeomorphic to one, we need to define what’s meant by “gluing”. Let X be a topological space and ~ be an equivalence relation on X, which partitions X as a disjoint union of equivalence classes. Let Y := X/~ be the set of equivalence classes and we get a surjective map p : X → Y.

[ Conversely, given a surjective map pX → Y of any two sets, we get a corresponding equivalence relation via x_1 \sim x_2 \iff p(x_1) = p(x_2). ]

Defining the set Y is easy; the subtle issue is its underlying topology. For that, we need to explore the kind of properties we want for Y. Intuitively if f : [0, 1] → R is a continuous map such that f(0) = f(1), then gluing together 0 and 1 should induce a map [0, 1]/~ → R, which we expect to be continuous.

glue_map

Universal Property of Quotient. For a surjective map p:X\to Y, we would like the topology on Y to satisfy:

  • for any topological space Z and map f:Y\to Z, f is continuous if and only if f\circ p:X \to Z is continuous.

This inspires the following definition.

Definition. Let p:X\to Y be a surjective map from a topological space X to set Y. The quotient topology on Y is defined as follows.

  • A subset V\subseteq Y is open if and only if p^{-1}(V) is open in X.

It’s easy to show that this definition gives a topology on Y; the resulting topological space Y is called the quotient space.
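On a finite example one can compute the quotient topology directly from the definition. Here we glue the points 1 and 2 of a three-point space; the topology chosen on X is just an illustrative example:

```python
from itertools import combinations

# X = {0, 1, 2} with a (made-up) topology; glue points 1 and 2 together.
X = [0, 1, 2]
topology_X = [set(), {0}, {0, 1}, {0, 1, 2}]          # the open sets of X

p = {0: 'a', 1: 'b', 2: 'b'}                          # the surjection X -> Y
Y = ['a', 'b']

def preimage(V):
    return {x for x in X if p[x] in V}

def subsets(s):
    return [set(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# V is open in Y iff p^{-1}(V) is open in X
quotient_topology = {frozenset(V) for V in subsets(Y) if preimage(V) in topology_X}

# Here {'b'} is NOT open, since its preimage {1, 2} is not open in X
assert quotient_topology == {frozenset(), frozenset({'a'}), frozenset({'a', 'b'})}
```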

Theorem 1. The quotient topology satisfies the above universal property.

Proof.

(→) Under the quotient topology, p is continuous so if f is continuous, then so is f\circ p:X \to Z. (←) Conversely, suppose f\circ p is continuous. For each open subset W of Z, the set p^{-1}f^{-1}(W) = (f\circ p)^{-1}(W) is open in X. By definition of the quotient topology, f^{-1}(W) is open in Y. ♦

Exercise.

Prove that if p : X → Y is a surjective map from a topological space X to a set Y, then any topology on Y satisfying the universal property must be defined as above. Thus the universal property uniquely characterises the quotient topology.

Proof (Highlight to Read).

Suppose Y has two topologies T and T’ satisfying the universal property, with projection maps p : X → (Y, T) and p’ : X → (Y, T’). Consider the identity map f : (Y, T) → (Y, T’), which satisfies f∘p = p’. Since f∘p is continuous, the universal property of p tells us f is continuous. Likewise, the inverse of f is continuous, so it is a homeomorphism. ♦

Exercises.

  1. Prove that if p : X → Y and q : Y → Z are surjective maps such that Y has the quotient topology from X and Z has the quotient topology from Y, then Z has the quotient topology from X via q∘p : X → Z.
  2. Prove that if p : X → Y is bijective and Y has the quotient topology from X, then p is a homeomorphism.
  3. Prove that if p : X → Y is surjective, then the quotient topology can also be defined as follows: a subset C of Y is closed if and only if p^{-1}(C) is closed in X.
  4. Suppose i : X → Y is injective and gives X the subspace topology from Y, and p : Y → Z is surjective and gives Z the quotient topology from Y. If p\circ i:X\to Z is surjective, does this give Z the quotient topology from X? [ Answer: no. Take X = [0, 1) as a subspace of Y = [0, 1], and let Z = [0, 1]/~ glue 0 and 1 together. Then Z is compact (see next problem) while [0, 1) is not, so the continuous bijection [0, 1) → [0, 1]/~ is not a homeomorphism. By problem 2, Z does not have the quotient topology from [0, 1). ♦ ]
  5. Prove that if p : X → Y gives Y the quotient topology and X is compact (resp. connected), then so is Y. Again we see that compact and connected often go hand-in-hand in their properties.

blue-lin

Proving Homeomorphisms

We saw that gluing 0 and 1 in [0, 1] gives a space that looks like a circle. Now that we’ve defined the quotient topology, let’s prove that in fact it is homeomorphic to a circle. Recall the useful result (see lemma here) that if X is compact and Y is Hausdorff, then a bijective continuous map X→Y is a homeomorphism.

[ Question to ponder: homeomorphism is intuitively a “local” property. So can we replace “compact” with “locally compact” above? I.e. is it true that if X is locally compact and Y is Hausdorff, then a bijective continuous X→Y is a homeomorphism? To get a hint of the answer, see the answer to exercise 4 above. ]

Let Y = [0, 1]/~, where ~ identifies 0 and 1, and p : [0, 1] → Y be the projection map. We wish to show Y\cong S^1, where S^1=\{(x,y)\in\mathbf{R}^2 : x^2 + y^2 = 1\}. Now:

f:[0, 1] \to S^1, \quad t \mapsto (\cos(2\pi t), \sin(2\pi t))

is a continuous map satisfying f(0) = f(1) so it induces a map g:Y\to S^1. By the universal property above, g is continuous since g\circ p = f :[0, 1]\to S^1 is continuous. Now, we invoke the lemma and conclude that g is a homeomorphism (together with the above exercise 5: since [0, 1] is compact, so is the quotient Y).
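A quick numerical sanity check that f glues exactly the endpoints and nothing else, by sampling points of [0, 1):

```python
import numpy as np

def f(t):
    return (np.cos(2 * np.pi * t), np.sin(2 * np.pi * t))

# f glues the endpoints of [0, 1]: f(0) = f(1) ...
assert np.allclose(f(0.0), f(1.0))

# ... and is injective on [0, 1): sampled points are pairwise distinct
ts = np.linspace(0.0, 1.0, 100, endpoint=False)
points = {tuple(np.round(f(t), 9)) for t in ts}
assert len(points) == len(ts)
```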

Now the astute reader may prove the following using the same technique.

Example 1: Double Circle

Let Y be the quotient space of X=[0, 1] obtained by identifying 0, 1/2 and 1. Prove that Y is homeomorphic to the union of two circles:

\{(x, y) : (x+1)^2 + y^2 = 1\} \cup \{(x,y) : (x-1)^2 + y^2 = 1\}

as a subspace of R2.

quotient_double_circ

[ Note: topologically, one calls this a wedge sum of two circles. Intuitively a wedge sum of a collection of topological spaces is obtained by fixing a point in each space and gluing all points together. This is a rather useful concept in algebraic topology, specifically homotopy, but that’s another story for another day. ]

Example 2: Cylinder

Let X = [0, 1] × [0, 1] and glue points (0, y) and (1, y) together, for all 0 ≤ y ≤ 1. Prove that we get the cylinder S1 × [0, 1].

quotient_cylinder

warning

Now one’s very tempted to believe: if p:X\to Y gives Y the quotient topology from X, and q:X'\to Y' gives Y’ the quotient topology from X’, then r:=(p,q) : X\times X' \to Y\times Y' gives Y × Y’ the quotient topology from X × X’. This is wrong, but a counterexample is fiendishly hard to find; for one, refer to Munkres, “Topology” (2nd ed.), section 22, page 145, exercise 6.

Example 3: Torus

Let X = [0, 1] × [0, 1] and glue points (0, y) and (1, y) together, as well as points (x, 0) and (x, 1) together. In particular, all four corner points collapse to one. The result is a torus S^1 \times S^1.
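Explicitly, one candidate map (again following the circle example) is:

```latex
f : [0,1] \times [0,1] \to S^1 \times S^1, \qquad
f(x, y) = \big( (\cos(2\pi x),\ \sin(2\pi x)),\ (\cos(2\pi y),\ \sin(2\pi y)) \big).
```

This is constant on each glued equivalence class, so it induces a continuous bijection from the quotient to the torus; the square is compact and S^1 \times S^1 is Hausdorff, so the induced map is a homeomorphism.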

[Figure: the torus as a quotient of the square]

More Examples

Let X = [0, 1] × [0, 1] and glue points (0, y) and (1, 1-y) together. We get a Möbius strip: the space obtained by taking a rectangular strip of paper and gluing two opposite edges together after twisting the strip 180 degrees.

Let X = [0, 1] × [0, 1] and glue points (0, y) and (1, y) together, as well as (x, 0) and (1-x, 1) together. This gives a Klein bottle, which is an interesting example of a two-dimensional object which cannot be embedded in 3-space as a subspace. The proof for this is hard though.

Let X = D^2 := \{(x, y)\in \mathbf{R}^2 : x^2 + y^2 \le 1\} be the unit disc in the plane. Collapse the boundary circle, consisting of the points satisfying x^2 + y^2 = 1, into a single point. The resulting topological space is homeomorphic to a 2-sphere: S^2 := \{(x, y, z) \in \mathbf{R}^3 : x^2 + y^2 + z^2 = 1\}.
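One concrete map realising this, written in polar coordinates (r, θ) on the disc (the choice of latitude function \pi r is our own, for illustration):

```latex
f : D^2 \to S^2, \qquad
f(r\cos\theta,\ r\sin\theta) = \big(\sin(\pi r)\cos\theta,\ \sin(\pi r)\sin\theta,\ \cos(\pi r)\big).
```

At r = 0 the formula gives (0, 0, 1) regardless of θ, so f is well-defined at the centre; at r = 1 the entire boundary circle is sent to (0, 0, −1), so f factors through the quotient. The induced map is a continuous bijection from a compact space to a Hausdorff one, hence a homeomorphism.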

Let X be the space R × {0, 1}, where {0, 1} is given the discrete topology. Identify the points (x, 0) with (x, 1) for all x ≠ 0. We get the “real line with double origin”:

[Figure: the real line with doubled origin]

Denoting the two origins by p and p’, we see that any open subsets U, V which contain p, p’ respectively must intersect: each contains the image of a punctured interval (−ε, ε) − {0} for some ε > 0, and any two such images overlap. Thus the space X is Hausdorff but its quotient isn’t.

In the next article, we’ll be looking at topology of group quotients. The situation is not as straightforward as one might imagine.


Topology: Topological Groups

This article assumes you know some basic group theory. The motivation here is to consider groups whose underlying operations are continuous with respect to its topology.

Definition. A topological group G is a group with an underlying topology such that:

  • the product map m: G × G → G is continuous;
  • the inverse map i: G → G is continuous.

Examples

  1. Any group can be a topological group by endowing it with the discrete topology.
  2. The complex plane C, real line R, and their subspaces Z, Q are all topological groups under addition.
  3. C* = C – {0} and R* = R – {0} are also topological groups under multiplication.
  4. More interesting topological groups include GLn(R) and SLn(R): the group of non-singular n × n real matrices, and the group of n × n real matrices with determinant 1, respectively. The topology on these groups is given by the subspace topology upon embedding them in \mathbf{R}^{n^2}.
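To see why these are topological groups, note that multiplication and inversion are given entrywise by polynomial, respectively rational, functions of the matrix entries; for instance, by the standard adjugate (Cramer's rule) formula:

```latex
(AB)_{ik} = \sum_{j=1}^n A_{ij} B_{jk}, \qquad
(A^{-1})_{ij} = \frac{(-1)^{i+j}\, M_{ji}(A)}{\det A},
```

where M_{ji}(A) denotes the (j, i) minor of A. Since \det is a polynomial which never vanishes on GLn(R), both operations are continuous in the subspace topology from \mathbf{R}^{n^2}.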

The first property we want to prove is:

Proposition. On a topological group G, the following maps are homeomorphisms.

  • Left-multiplication: for g in G, l_g:G \to G, x\mapsto gx.
  • Right-multiplication: for g in G, r_g:G \to G, x\mapsto xg.
  • Inverse: i:G\to G above.
  • Conjugation: for g in G, c_g(x) = gxg^{-1}.

Proof.

Since multiplication and inverse maps are continuous, the above maps l_g, r_g, i, c_g are all continuous too. Plus, they’re bijective and their inverses are given by l_{g^{-1}}, r_{g^{-1}}, i, c_{g^{-1}} respectively, which are all continuous. So the maps are homeomorphisms. ♦

In particular, for any x, y\in G there’s a homeomorphism of G which maps x to y; indeed, l_{yx^{-1}}(x) = y.

On an intuitive level, this means the topology of G is completely uniform: e.g. to check what happens near any point g, it suffices to check near the identity e. [ This should not be confused with uniform topological spaces, which is an entirely different thing. ]

Next, the following result is useful.

Proposition 2. Let A, B be subsets of a topological group G. If A is open in G, then so are AB and BA, where AB=\{xy : x\in A, y\in B\}.

Proof.

This follows from AB = \cup_{b\in B} Ab which is a union of open subsets of G since right-multiplication is a homeomorphism. Same goes for BA. ♦

The case for closed subsets is not so nice, but at least we have:

Proposition 3. Let C be a closed subset of G and K be a compact subspace of G. Then CK and KC are closed in G.

Proof.

We’ll show that G-CK is open in G; the case for KC is similar.

Fix x in G-CK and take the map f:G\times G\to G which takes (a,b)\mapsto ab^{-1}. Clearly f is continuous, so f^{-1}(G-C)\subseteq G\times G is open. Now for any y\in K, the ordered pair (x,y)\in f^{-1}(G-C) because if f(x,y) = xy^{-1} \in C, we would have x = (xy^{-1})y \in CK.

So (x,y)\in U_y\times V_y for some open subsets x\in U_y\subseteq G and y\in V_y\subseteq G such that f(U_y, V_y)\subseteq G-C. By compactness of K, it can be covered by finitely many V_{y_1}, \ldots, V_{y_n}:

let V := V_{y_1} \cup \ldots \cup V_{y_n} \supseteq K\ and \ U:= U_{y_1} \cap \ldots \cap U_{y_n}\ni x.

Then f(U, K) \subseteq f(U,V) \subseteq \cup_k f(U_{y_k}, V_{y_k}) \subseteq G-C; in other words, no u\in U can be written as cy with c\in C, y\in K. Thus x\in U\subseteq G-CK, so indeed G-CK is open. ♦
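Compactness of K is essential here. A standard example (not from the text above, but easy to verify) takes G = (\mathbf{R}, +) with both C and K closed subgroups:

```latex
C = \mathbf{Z}, \qquad K = \sqrt{2}\,\mathbf{Z}
\quad\Longrightarrow\quad
C + K = \{\, a + b\sqrt{2} : a, b \in \mathbf{Z} \,\},
```

which is a proper dense subgroup of \mathbf{R} (it is not cyclic since \sqrt{2} is irrational, and a subgroup of \mathbf{R} is either cyclic or dense), hence not closed.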

blue-lin

Subgroups

First, we look at the separation axioms on a topological group.

Proposition 4. The following are equivalent for a topological group G.

  1. G satisfies T3.
  2. G satisfies T2.
  3. G satisfies T1.
  4. The trivial subgroup {e} is closed.

Proof

The only non-trivial direction is (4→1). In fact, we’ll prove that any topological group G is regular, from which (3→1) follows. The remaining step (4→3) holds because each singleton \{g\} = l_g(\{e\}) is closed, left-multiplication by g being a homeomorphism.

Consider the continuous map f:G\times G\to G which takes (a,b)\mapsto ab^{-1}. If x is contained in an open subset U of G, then f^{-1}(U) contains (x, e), so we have (x,e)\in V\times W\subseteq f^{-1}(U) for some open subsets V, W of G containing x, e respectively.

Now x\in V. Furthermore, V\cap (G-U)W =\emptyset, since any element in the intersection corresponds to a\in V, b\in W such that ab^{-1}\not\in U, which is a contradiction. By proposition 2, (G-U)W is an open subset containing G-U, so any topological group is regular. ♦

It’s ok if {e} is not closed, for we can take the closure which will turn out to be a subgroup, in fact, a normal subgroup of G. More generally:

Proposition 5. If H is a subgroup of G, then so is cl(H). If N is a normal subgroup of G, then so is cl(N).

Note.

The term “normal” unfortunately has a slight ambiguity here. It can refer to normal subgroups, or normal topological spaces (as in, any two disjoint closed subsets can be separated by open subsets). We’ll only refer to the former in this article.

Proof.

Consider the continuous map f:G\times G\to G which takes (a,b)\mapsto ab^{-1}; since H is a subgroup, f(H, H)\subseteq H. Since cl(H) is a closed subset of G containing H, f^{-1}(\text{cl}(H)) is a closed subset of G × G containing H × H. Thus f^{-1}(\text{cl}(H)) contains cl(H × H) = cl(H) × cl(H), and we have f(\text{cl}(H), \text{cl}(H))\subseteq \text{cl}(H) which, together with e\in \text{cl}(H), proves that cl(H) is a subgroup of G.

For the second statement, each conjugation map c_g:G\to G, x\mapsto gxg^{-1} is a homeomorphism and maps N to N. Thus it must map cl(N) to cl(N) also. ♦

The closure can be nicely classified as follows:

Proposition 6. If H is a subgroup of G, then cl(H) is the intersection of all HU, where U is an open subset containing e.

Proof.

Now x\in \text{cl}(H) iff every open subset V containing x intersects H, which holds iff for every open subset U containing e, the set xU intersects H, i.e. x\in HU^{-1}. Since the inverse map is a homeomorphism, U is open iff U^{-1} is, so the intersection of all HU^{-1} equals the intersection of all HU. ♦
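A quick sanity check of the proposition in G = (\mathbf{R}, +) with H = \mathbf{Q}: any open U containing 0 contains some interval (-\varepsilon, \varepsilon), so

```latex
H + U \supseteq \mathbf{Q} + (-\varepsilon, \varepsilon) = \mathbf{R}
\qquad\text{for every } \varepsilon > 0,
```

and the intersection of all such H + U is \mathbf{R} = \text{cl}(\mathbf{Q}), as the proposition predicts.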

Proposition 7. The connected component Y containing e is a closed normal subgroup of G.

Proof.

Y is closed in G since every connected component is closed.

Now suppose x,y\in Y. Since left-multiplication by x is a homeomorphism of G, it must map the connected component of e to that of x. But x and e both have Y as their connected component, so xY = Y and thus xy\in Y.

Next, inverse is a homeomorphism so it must take the connected component of e to the connected component of i(e)=e, i.e. i(Y) = Y. This proves Y is a subgroup. Normality follows from the fact that the conjugation map is a homeomorphism taking e to itself. ♦

Note.

Proposition 7 still holds if we replace “connected components” with “path-connected components”.

So far we’ve been looking at closed subgroups. What about open ones?

Proposition 8. An open subgroup H of G is also closed. Furthermore, a closed subgroup H of finite index in G is open.

Proof.

The complement G-H is a union of (left) cosets of H. For the first statement, each coset is open since left-multiplication is a homeomorphism. Thus G-H is open and H must be closed. For the second statement, each coset is closed and G-H is a union of finitely many closed subsets, so it’s closed too, and H is open. ♦

Corollary. If G is a connected group, then any open neighbourhood U of e generates the whole group (in the algebraic sense).

Proof.

Replace U by U ∩ i(U) and we may assume i(U) = U. Now consider V:=\cup_{n=1}^\infty U^n, where U^n is the set of products of n elements from U. It’s easy to see that V is an open subgroup of G. By the above proposition, V is a clopen subset, so V=G. It’s clear that the group <U> generated by U must contain V; thus <U>=G. ♦
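For instance, in the connected group G = (\mathbf{R}, +), take U = (-\varepsilon, \varepsilon) (already symmetric, i.e. i(U) = U):

```latex
U^n = \underbrace{U + \cdots + U}_{n\ \text{times}} = (-n\varepsilon,\ n\varepsilon),
\qquad \bigcup_{n \ge 1} U^n = \mathbf{R},
```

so U generates all of \mathbf{R}, matching the corollary.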

blue-lin

Continuous Homomorphisms

Between topological groups, naturally we’ll be looking at homomorphisms which preserve the topology, i.e. continuous homomorphisms. For example, the following is obvious.

Property. If f:G\to H is a continuous homomorphism of topological groups, then for any closed (resp. open) subgroup K of H, f^{-1}(K) is a closed (resp. open) subgroup of G.

If K is normal in H, then f^{-1}(K) is normal in G.

In particular, if H is Hausdorff, then the kernel of f is a closed normal subgroup of G.

warning

The corresponding result doesn’t hold for im(f), though we do know it’s a subgroup of H. E.g. consider f:\mathbf{Z}^2 \to \mathbf{R} which takes (a, b) to a+b√2. The homomorphism is clearly continuous if \mathbf{Z}^2 is given the discrete topology. However, the image of f is not closed.

But since the continuous image of a connected (resp. compact) set is also connected (resp. compact), at least we have:

Property. If f:G\to H is a continuous homomorphism of topological groups and G is connected (resp. compact), then im(f) is a connected (resp. compact) group.

Next, we’ll talk about isomorphisms.

Definition. An isomorphism of topological groups is a group isomorphism f:G→H which is also a homeomorphism.

warning

Note that it’s not enough to say f is a bijective and continuous homomorphism of groups, since its inverse may not be continuous (even though it’s guaranteed to be a group homomorphism).

For example, take the above map f:\mathbf{Z}^2\to \mathbf{R} which takes (a, b) to a+b√2. This induces a group isomorphism from \mathbf{Z}^2 to im(f) which is continuous but not a homeomorphism.
