Polynomial Multiplication, Karatsuba and Fast Fourier Transform

Let’s say you want to write a short program to multiply two linear functions f(x) = ax+b and g(x) = cx+d and compute the coefficients of the resulting product:

(fg)(x) = (ac)x^2 + (ad+bc)x + bd.

You might think it’ll take 4 multiplications (for acadbc and bd) and 1 addition (for ad+bc), but there’s a better way: calculate ac, bd, (a+b)(c+d)-acbd which results in 3 multiplications and 4 additions/subtractions. It’s better since additions and subtractions are generally much faster than multiplications, so the tradeoff is usually worth it.

The above method is known as Karatsuba algorithm, named after Anatolii Alexeevitch Karatsuba, a Russian mathematician who found it when he was only an undergraduate. These days, one is likely to be underwhelmed by Karatsuba’s method, but it caused quite a stir in the early 1960s when it disproved a conjecture by Andrey Kolmogorov.

The power of Karatsuba’s method is that the coefficients can belong to any algebraic structure as long as multiplication is distributive over addition, i.e. we don’t even need multiplication to be associative. In particular, a, b, c, d can be polynomials themselves. E.g. suppose we wish to multiply two cubic polynomials:

p(x) = p_0 + p_1x + p_2x^2 + p_3x^3, \quad q(x) = q_0 + q_1x + q_2x^2 + q_3x^3.

Instead of the naive method, which costs 16 multiplications, recursive applications of Karatsuba’s method gives us:

followed by three applications of Karatsuba method to the three multiplications. This gives 9 multiplications instead of 16. Let’s do a more detailed count of the number of operations. Suppose f(n) (resp. g(n)) is the number of multiplications (resp. additions or subtractions) for using Karatsuba’s method to multiply two polynomials of degree n-1. Then:

  • f(2n) = 3f(n), which gives us f(2m) = 3m in general.
  • g(2n) ≤ n + 3g(n) + 2n, since n additions are required to compute a+b and c+d, and a further 2n subtractions are used for (a+b)(c+d) – ac – bd. This gives:

g(2^m) \le 3\cdot 2^{m-1} + 3g(2^{m-1}) \le 3\cdot 2^{m-1} + 9\cdot 2^{m-2}+ 9g(2^{m-2}) \le \ldots

\implies g(2^m) \le \sum_{j=1}^m 3^j 2^{m-j} \le m \cdot 3^m.

Hence, even for the number of additions, Karatsuba’s method will eventually overtake the naive method.

Fourier Transform

There’s a nice philosophical ideal behind Karatsuba’s method. Back to the linear polynomials f(x) = ax+b and g(x) = cx+d. Essentially Karatsuba’s method evaluates the two functions at various points, then multiply the results:

  • f(0) = bg(0) = d which gives (fg)(0) = bd;
  • f(1) = a+bg(1) = c+d, which gives (fg)(1) = (a+b)(c+d);
  • f(∞) = a∞, g(∞) = b∞, which gives (fg)(∞) = (ab)∞2.

If the reader should balk at our wanton use of infinity, feel free to replace x with 1/x and f(x) with xf(1/x) and evaluate at x=0. The big idea is as follows:

Big Picture. To compute the polynomial product fg, evaluate f(x) and g(x) at various points xm, multiply f(xm)g(xm), then find the polynomial which passes through these points.

This is much harder than it sounds. For suppose deg(fg) ≤ d. If we naively evaluate f(xm) independently for each m, then we need d evaluations, each of which requires about d/2 multiplications since the deg(f), deg(g) ≈ d/2. That’s bad news since it gives rise to d2/2 multiplications, and g(xm) will give another d2/2, and the total number of multiplications stands at d2 which is not any better than simple expansion.

And we haven’t even started finding the polynomial through a set of points – the standard method, Lagrange interpolation, is way too slow here. ]

Thus, we need a better plan.

That’s where discrete Fourier transform comes in. Suppose we wish to multiply two polynomials whose product has degree at most 7. Thus, only 8 values of xi are needed. The values we’ll pick are the 8-th roots of unity:

\omega = \exp(2\pi i/8), \qquad x_m = \omega^m, \ 0\le m\le 7.

Thus x_4 = \omega^4 = \exp(\pi i) = -1 and \omega^8 = 1, which allows us to evaluate all powers of ω using only the first 8 powers via \omega^{8k+m} = \omega^{m}. Now break up the polynomial evaluation as follows: for f(x) = a_0 + a_1 x + \ldots + a_7 x^7, evaluate f(\omega^m), for m = 0, …, 7 as follows:

Collect all the even columns (red arrows) together and the odd columns (blue columns) together, we divide the matrix into four blocks:

The blocks labeled ‘A’ are simply the discrete Fourier transforms of f(x) = a_0 + a_2 x + a_4 x^2 + a_6 x^3 since \omega^2 = \exp(2\pi i/4). The block labeled ‘B’ is the discrete Fourier transform of f(x) = a_1 + a_3 x + a_5 x^2 + a_7 x^3 with the j-th term multiplied by ωj, for j = 0, 1, 2, 3. Finally, the ‘C’ block is simply the negative of the ‘B’ block since ω4 = -1.

Summary. To perform FFT for the case of 2n parameters, it suffices to perform FFT for two cases of n, with an additional 2n complex additions and n complex multiplications. This gives an overall complexity of O(n log n).

Inverse Fast Fourier Transform

Having found a method to perform FFT in time complexity O(n log(n)) it remains to find a method to invert FFT in comparable time complexity. But this turns out to be rather nice, for the inverse of the transformation from (aj) to fj) is similar to the forward transformation:

The reader can easily verify for himself that the above relations do indeed recover the aj‘s. Just use the relations.

  • If j is a multiple of 8, then \omega^j = 1 so 1 + \omega^j + \omega^{2j} + \ldots + \omega^{7j} = 8.
  • If j is not a multiple of 8, 1 + \omega^j + \ldots + \omega^{7j} = (\omega^{8j} - 1)/(\omega^j - 1) = 0 since the numerator is 0 but the denominator isn’t.

Notice that this is almost identical to the forward FFT process, except that we have to reverse the order of the resulting aj‘s, from j = 1, …, 7. Thus, the inverse FFT process also has complexity O(n log n).

In short, to multiply two polynomials whose product has degree at most n-1, we do:

  • perform FFT on the first polynomial to obtain (a_0, a_1, \ldots, a_{n-1});
  • perform FFT on the second polynomial to obtain (b_0, b_1, \ldots, b_{n-1});
  • multiply the two sequences termwise to give (c_j) = (a_j b_j);
  • perform inverse FFT on the resulting (c_0, c_1, \ldots c_{n-1}) to get the answer.

All steps other than the third require a time complexity of O(n log n), and the third step has a time complexity of O(n). Thus, the overall time complexity is O(n log n).

Applications of FFT

There are multiple uses for the fast Fourier transform algorithm.

  • Signal Processing : Fourier transform is the process of breaking a signal into a sum of various harmonics. Since digital data is collected in discrete packets, FFT is a natural way to do that, and it makes it tractable to perform real-time Fourier transform on millions of data points. [ The naive method, which has complexity O(n2), would make it impossible. ]
  • Image Compression : the bitmap standard JPEG is a lossy compression based on Discrete Cosine Transform (DCT), which one can think of as taking the real components of the Fourier transform. Roughly speaking, after FFT, the high-frequency components have to be down-sampled to attain a certain level of compression (this process is irreversible, hence the term lossy). Special allowances are often made for areas of the colour spectrum to which the human eye is particularly sensitive.
  • Multiprecision Computation : within a computer, a multi-precision integer (e.g. 1000-digit) is often stored in base 232 or 264 since computers typically have 32- or 64-bit buses and registers. FFT provides a way of multi-precision multiplication: to multiply ab, write a and b as polynomials with coefficients in [0, 232-1] (say). Then use FFT to multiply the two polynomials quickly and substitute x=232 to get the product.

The last case is of interest: although FFT is asymptotically faster than naive multiplication, it’s only used for extremely huge integers (e.g. millions of digits). The reason is that the calculations involved in FFT require computations with complex numbers, to a certain level of precision (certainly not to thousands, or even hundreds of significant figures, but still slow). Hence, the constant factor in the O(n log n) notation is rather large.

For example, the C library GMP uses a nice combination of Karatsuba and FFT in order to calculate multi-precision multiplication quickly.

Posted in Notes | Tagged , , , , , | Leave a comment

Intermediate Group Theory (5)

Free Groups

To motivate the concept of free groups, let’s consider some typical group G and elements ab of G. Recall that H=\left<a,b\right>, the subgroup generated by {ab}, is defined to be the intersection of all subgroups of G containing a and b. Immediately, we see that H satisfies:

  • it is a subgroup of G;
  • any subgroup of G containing both a and b must contain H also.

So in a very real sense, it’s the smallest subgroup of G containing {ab}. What does H look like? It certainly contains all expressions in ab and their inverses:

e, a^{-1}, b^2, ab^{-1}, a^3 b^2 a^{-1}, b^6 a b^{-2}a^2, \ldots

although not all these elements are distinct. On the other hand, the set of all these elements forms a group so H is precisely the set of all “words” in a and b. The idea of a free group is to consider these generic words and form them into a group, with product given by concatenation and simplification.

Definition. Let S be a set. The free group generated by S, denoted F(S), is the set of all words in the set of abstract symbols \{s, s^{-1} : s\in S\}, together with “e”. Group product is given by concatenation and simplification, via s * s^{-1} = e.

Hence, the free group generated by a singleton set is isomorphic to Z. The free group generated by {ab} is the list of all words described above. [ Conceptual quiz: what is the free group generated by the empty set? ]

The intuition behind this is that F(S) is the “largest” possible group generated by |S| elements. The most important feature of F(S) is its universal property.

Let S be a function and F(S) be the free group generated by S with injection i:S \to F(S) which sends s\in S to a single-character word “s” in F(S).

Universal Property. For any function f:S \to G to a group G, there is a unique group homomorphism g:F(S) \to G such that g\circ i=f.

To have an idea of what g looks like, let’s consider the concrete example of {ab} and take the sample words from above:

e, a^{-1}, b^2, ab^{-1}, a^3 b^2 a^{-1}, b^6 a b^{-2}a^2, \ldots

Then since g is a group homomorphism, it has no choice but to map these to:

e, f(a)^{-1}, f(b)^2, f(a)f(b)^{-1}, f(a)^3 f(b)^2 f(a)^{-1}, f(b)^6 f(a) f(b)^{-2} f(a)^2, \ldots

The correspondence between f and g can be written as follows:

\text{Fun}(S, G) \cong \text{Hom}(F(S), G),\quad g\circ i\leftrightarrow g,

where Fun(XY) is the set of all functions X → Y and Hom(GH) is the set of group homomorphisms G → H. From our description of g, it’s clear that g is surjective if and only if f(S) generates G.

Generators and Relations

Consider the group S3 which is generated by x = (1, 2, 3) and y = (1, 2). Thus, corresponding to the map {ab} → S3 which takes ab to (1, 2, 3), (1, 2) respectively, there is a surjective group homomorphism:

g : F(\{a,b\}) \to S_3, \quad a \mapsto x=(1, 2, 3), b \mapsto y=(1, 2).

The kernel N then gives an isomorphism F({ab})/N ≡ S3. What kind of elements lie in N? These comprise of all strings in ab, such that if we substitute them with xy respectively, the result is the identity element of S3. For example, since x^3 = y^2 = e, we must have a^3, b^2 \in N. Likewise, since yxy = (1, 3, 2) = x^2 we also have baba^{-2} \in N.

Thus, the normal subgroup H of F({ab}) generated by these three elements is contained in N. The question is whether H=N. The answer is yes: we won’t prove it in detail but merely sketch the outline:

  • show that every element of F({ab})/H can be represented by aibj for some integers i and j;
  • since a3 and b2 are in H, we can take i = 0, 1, 2 and j = 0, 1;
  • so F({ab})/H has at most 6 elements, and the surjective map F({ab})/H → F({ab})/N is bijective since its image has 6 elements exactly. ♦

Now we consider the general case.

Definition Let S be a set and F(S) be the free group on S. Consider a subset T \subseteq F(S). The group generated by S with relations in T is the quotient F(S)/N(T), where N(T) is the normal subgroup of F(S) generated by T.

For example, we just saw that the group generated by {ab} with relations a^3 = b^2 = baba^{-2} = e is isomorphic to S3. A common way to write this group is:

G = \left<a, b : a^3 = b^2 = baba^{-2} = e\right>  or  G = \left<a, b : a^3 = b^2 = e, ba = a^2 b\right>.

Example 1. Prove that the relations a^3 = b^2 = e are not sufficient to define S3.

Proof. In short, we need to show a3 and b2 do not generate the normal subgroup N in the epimorphism F({ab}) → S3 which maps ab to (1, 2, 3), (1, 2) respectively. We’ll do this in an indirect manner: suppose, on the contrary, N is the normal subgroup generated by a3 and b2.

Now, in the group Z/6, elements 2, 3 of also satisfy the relations (3·2 = 2·3 = 0 (mod 6)) and generate Z/6. So we get an epimorphism:

F({ab})/N → Z/6

which sends ab to 2, 3 respectively. But we already know F({ab})/N is isomorphic to S3. So this gives an epimorphism S3 → Z/6 which is clearly absurd. ♦

Example 2. Consider the group G of all bijections of Z of the form x → x+c and x → cx. Prove that this group is isomorphic to the group generated by ab with relations a2b2e.

Proof. The first order of the day is to figure out where to map a and b. Let’s take a, b to the maps f(x) = –x and g(x) = 1-x respectively. Clearly f(f(x)) = g(g(x)) = x so we do have a homomorphism

F(\{a,b\})/N(a^2, b^2) \to G, \quad a \mapsto f, b\mapsto g.

It suffices to show the map is bijective.

  • surjectivity: we need to show G is generated by f and g. Indeed, gf(x) = 1+x so we get all translations x+c. Compose with f to get the reflections cx.
  • injectivity: on the LHS, since a2 and b2 represent the identity, every element can be represented by an alternating sequence …abab…, where each side ends in either a or b. This maps to …fgfg… in RHS.
    • Suppose this result is the identity.
    • Since f and g have (coeff. of x) = -1, there must be an even number of terms, i.e. we have (fg)n or (gf)n.
    • But these two functions are respectively x → xn and x → x+n so they cannot be the identity except when n=0, which means …abab… is also the identity on the LHS. ♦

Example 3. Let K=Z act on H=Z, by mapping m in K to the map x → x+m. We’ll leave it to the reader to prove that the semidirect product G = H \rtimes K is isomorphic to the group generated by {ab} with relation baa-1b.

Word Problem

From the above examples, it’s natural to ask for a general algorithm to simplify an element of a group, with given generators and relations. Specifically, we have:

Word Problem. Let G be a group which is generated by finitely many generators, under finitely many relations. Given an arbitrary word (in the generators), determine whether it simplifies to the identity.

In particular, one hopes for a general (Turing) program which takes in a finite set of generators and relations, then determines whether a given word is the identity.

The question was answered negatively by Pyotr Novikov in 1955. In fact, he proved something much stronger: there is a particular finite set of generators and relations, such that no algorithm in the world can possibly take in a given word and determine within finite time whether the said word simplifies to the identity.

But that doesn’t mean the word problem can’t be solved for some particular special classes of groups. For example, we have:

Definition. Let m(i, j), 1 \le i, j\le n be a collection of positive integers such that m(i, i) = 1 and m(j, i) = m(i, j) > 1 for i ≠ j.

A reflection group is the group generated by elements s_1, s_2, \dots, s_n and relations (s_i s_j)^{m(i,j)} = e for any i, j.

In particular, we have s_i^2 = e for all i. For m(ij) = 2, this just means s_i s_j s_i s_j = e \implies s_i s_j = s_j s_i since s_i^2 = s_j^2 = e. It turns out the word problem for a reflection group is effectively solved. Furthermore, given any string of the si‘s, one can determine its shortest representation effectively.

Exercise : prove that the reflection group is not trivial. [ Hint (highlight to read): define a homomorphism which takes each generator s to -1; this defines a homomorphism G → {+1, -1} which takes each generator to -1. The map is surjective and hence G is non-trivial. ]

Posted in Notes | Tagged , , , , | Leave a comment

Intermediate Group Theory (4)

Applications

We’ll use the results that we obtained in the previous two posts to obtain some very nice results about finite groups.

Example 1. A finite group G of order p2 is isomorphic to either Z/p2 or (Z/p) × (Z/p). In particular, it is abelian.

Proof. Since G is a p-group, it has non-trivial centre, so Z(G) has order p or p2.

  • If it’s the 2nd case, G is abelian.
    • If there’s an element of G of order p2, then G is cyclic.
    • Else, pick x \in G-\{e\}. So x has order p. Pick y \in G-\left<x\right>, which again has order p. Since x and y commute, we can prove that x^i y^j, for 0 ≤ i, j ≤ p-1, are all distinct. This shows that G is isomorphic to (Z/p) × (Z/p).
  • Suppose it’s the first case. So Z(G) is cyclic of order p, generated by (say) x. Now we pick y\in G-Z(G). Again, we can prove that x^i y^j, for 0 ≤ i, j ≤ p-1, are all distinct. So every element of G can be written in this form, but this shows that G is abelian since x commutes with everything (including y):

(x^i y^j) (x^a y^b) = (x^i x^a)(y^j y^b) = x^{i+a}y^{j+b} = (x^a y^b)(x^i y^j)

which is a contradiction. ♦

Example 2. Suppose G has order pq, where pq are primes with p not dividing (q-1). Then G is cyclic.

Proof. By Cauchy’s theorem, G has a subgroup N of order q. This is a Sylow q-subgroup and all such subgroups are conjugate. The number of such subgroups divides pq and is 1 mod q – thus it divides p and can only be 1 or p. Since pq, it is 1. Thus, N is the unique Sylow q-subgroup of G, which means N is normal in G (since conjugation takes N to a Sylow q-subgroup, which has to be N).

By Cauchy’s theorem again, G has a subgroup H of order p. As before, the number of such subgroups divides q and is 1 mod p. Thus it equals 1 or q. Since q is not 1 mod p, it must be 1, so H is normal in G.

Thus, we have normal subgroups HN with orders qp respectively and trivial intersection. We must have HN ≡ H × N. Comparing order, HN must be the whole G. ♦

Example 3. If pq are primes and p divides q-1, then there is a non-abelian group G of order pq.

Proof. The construction is via semidirect product. Let HZ/q and KZ/p. Now the automorphism group Aut(H) has order q-1 which is divisible by p, so there is an element \phi\in \text{Aut}(H) of order p. This means we can let K act on H by mapping 1 mod p to \phi. Now construct the semidirect product G = H \rtimes K. It’s not abelian since:

(0,1)*(m,0) = (\phi(m), 1) and (m,0) * (0,1) = (m,1) are not equal. ♦

Note: when p=2, we obtain the dihedral group. It’s the group of symmetries (including reflections) of the regular q-gon, and can be generalised to the case when q is any integer > 1. ]

Example 4. If p is a prime, then there is a non-abelian group G of order p3.

Proof. Again use semidirect product. Let HZ/p2 and KZ/p. Now Aut(H) has order p(p-1) so it has an element \phi\in\text{Aut}(H) of order p. Let K act on H by mapping 1 mod p to \phi and construct the semidirect product G = H\rtimes K as before. ♦

[ From the above two examples, we see that the semidirect product is a useful method for constructing unusual groups. ]

Exercise: construct a non-abelian group G with normal subgroup N such that N and G/N are both isomorphic to Z.

Recall that A5 is a simple group. It turns out it’s the smallest non-abelian simple group since any group of order 1-59 is either abelian or has a non-trivial normal subgroup.

Let’s consider some examples. Subsequently, we’ll denote np for the number of Sylow p-subgroups.

Example 5. Prove that a group of order 12 has a non-trivial normal subgroup.

Proof. We have n2 | 3 and is 1 mod 2. If n2 = 1 then the Sylow 2-subgroup is normal, so we’ll assume n2 = 3. Similarly, n3 | 4 and is 1 mod 3, so let’s assume n3 = 4. But since a Sylow 2-subgroup and Sylow 3-subgroup intersect trivially, upon counting we get > 12 elements, which is impossible. ♦

Example 6. Prove that a group of order 24 has a non-trivial normal subgroup.

Proof. Again, n2 | 3 and is 1 mod 2, so let’s assume n2 = 3 (or the Sylow 2-subgroup would be normal). Consider the conjugation action of G on itself: this acts on the set of Sylow 2-subgroups, i.e. we get a homomorphism G → S3.

If we assume G is simple, the kernel (being a normal subgroup) is trivial or the entire group. It cannot be trivial since the map is not injective. So the homomorphism is trivial, i.e. conjugation leaves the three Sylow 2-subgroups invariant and all three are normal subgroups, which contradicts the fact that the three Sylow 2-subgroups are conjugate. ♦

Example 7 (Tricky!). Prove that a group of order 120 has a non-trivial normal subgroup.

Proof. Now n5 | 24 and is 1 mod 5. If we assume G is simple, then we must have n2 = 6. Now, the conjugation action of G on itself also acts on the set of the Sylow 5-subgroups transitively. Hence, we get a transitive action G → S6. Once again, since G is simple, we reason that the kernel is either trivial or the whole group. For transitive action, the only possibility is trival kernel, so we get an injective transitive action G → S6 so we might as well assume G is a subgroup of S6.

Now, A6 is a normal subgroup of S6 and thus G ∩ Ais a normal subgroup of G. This can only happen if G ∩ AG, i.e. G is in fact a subgroup of A6. This subgroup has index 360/120 = 3. Now G acts on the set of left-cosets A6/G by left multiplication and we get a homomorphism G → S3. Since G is simple, we’re forced to conclude the kernel is the entire group G, which is absurd since G acts transitively on the left-cosets. ♦

Exercise.

Prove that for groups of order up to 167, the only non-abelian simple group is A5 (of order 60). [ The hardest case of 120 has already been done. ]

Posted in Notes | Tagged , , , , , | Leave a comment

Intermediate Group Theory (3)

Automorphisms and Conjugations of G

We’ve seen how groups can act on sets via bijections. If the underlying set were endowed with a group structure, we can restrict our attention to bijections which preserve the group operation.

Definition. An automorphism of a group G is an isomorphism G → G. The set of all automorphisms of G forms a group Aut(G) under composition. An action of a group G on H is a group homomorphism G → Aut(H).

One useful automorphism of a group is the conjugation map we saw in the previous post: given g\in G, the conjugation map \phi_g : G\to G, x \mapsto gxg^{-1}. In the case of abelian groups, the conjugation map is trivial since gxg^{-1} = x.

Definition. The conjugation map \phi_g is called an inner automorphism. Let the group of \phi_g  be given by Inn(G).

Now let’s focus our attention on the conjugation action of G on itself.

Proposition. The homomorphism G \to Aut(G), g \mapsto \phi_g corresponding to the conjugation action has kernel:

Z(G) = \{g\in G: xg=gx\ \forall x\in G\}.

This is called the centre of G.

The proof is straightforward. It implies that the group of inner automorphisms is (canonically) isomorphic to G/Z(G).

Definition. Two elements x, y\in G of a group are said to be conjugates if they belong to the same orbit under the conjugancy map. Explicitly, this means y = gxg^{-1} for some element g of G.

In the case where G = S_n, we’ve already seen that two permutations are conjugate iff they have the same cycle structure. For example, S5 has 7 conjugancy classes, represented by e, (1, 2), (1, 2, 3), (1, 2, 3, 4), (1, 2, 3, 4, 5), (1, 2)(3, 4) and (1, 2)(3, 4, 5). Check that these are of sizes 1, 10, 20, 30, 24, 15 and 20 respectively.

If we consider the alternating group An, then this is no longer true. E.g. check that the group A5 has 5 conjugancy classes, represented by e, (1, 2, 3), (1, 2)(3, 4), (1, 2, 3, 4, 5) and (1, 2, 3, 5, 4). These have sizes 1, 20, 15, 12 and 12 respectively. Since a normal subgroup is closed under conjugation, it is a union of conjugancy classes. However, no non-trivial union of conjugancy classes of A5 is a factor of 60, so we’ve just proven the following.

Theorem. The only normal subgroups of A5 are {e} and itself, i.e. A5 is simple.

Now, we bring in results from the previous two posts regarding group actions on sets. First, we partition the set G into orbits under conjugancy to obtain:

|G| = |Z(G)| + \sum_x |G|/|G_x|,

since the centre Z(G) is precisely the set of fixed points under conjugation, and x runs over the conjugancy classes with >1 element. Now G_x =\{g\in G: gx = xg\} is the isotropy group of an element x.

If G is a p-group, then since |G|/|G_x| > 1, each summand must be divisible by p. In particular |Z(G)| is divisible by p, and so is non-trivial.

In summary: a p-group has non-trivial centre. 

Outer Automorphisms

First, we claim that the group of inner automorphisms (i.e. conjugate maps) forms a normal subgroup of Aut(G).

Proof. Take automorphisms \pi:G \to G and \phi_g : G\to G. Then:

(\pi \phi_g \pi^{-1})(x) = \pi(g\cdot \pi^{-1}(x)\cdot g^{-1}) = \pi(g)\cdot \pi(\pi^{-1}(x))\cdot \pi(g^{-1}) = \pi(g)x\pi(g)^{-1}.

Thus \pi\phi_g\pi^{-1} = \phi_{\pi(g)} is also an inner automorphism, so the first assertion is done. ♦

Definition. The quotient group Out(G) := Aut(G)/Inn(G) is called the group of outer automorphisms of G.

[ Beware that an element of Out(G) is not an automorphism, but an entire class of automorphisms. ]

The group Out(G) is critical for solving the extension problem for groups. Recall that this is the problem where you’re given groups N and H and must classify all G containing N for which G/N\cong H. We’ll illustrate one example by answering an earlier question regarding products of subgroups.

To summarise: let HK ≤ G be subgroups such that H is normal in G. Also, H ∩ K = {e}. Then HK is a subgroup and the product map H × K → HK which takes (hk) to hk is bijective but not a homomorphism as elements of H do not commute with those of K in general. However, we’ll use the following expression:

Note: upon swapping k and h’, we have replaced h’ by its conjugate with k, while all other terms remain unchanged.

The “conjugation-by-k” map thus gives rise to an automorphism \phi_k : H\to H. Note that since k doesn’t belong to H in general, the map \phi_k is not an inner automorphism. The mapping k\mapsto \phi_k thus gives an action of K on H.

Now we reverse this construction.

Definition. Suppose H and K are abstract groups, and K acts on H via k \mapsto (\phi_k:H \to H). The semidirect product H\rtimes_\phi K is the set H\times K, together with the product:

If the action is assumed to be known, we just write H\rtimes K.

Having motivated the definition of the semidirect product, it’s quite trivial to verify that the above construction indeed gives a group. Note that the identity of H\rtimes K is just (e, e) and the inverse of (hk) is given by (\phi_k^{-1}(h), k^{-1}). Also,

  • the subset of H\rtimes K comprising of elements (ek) is a subgroup;
  • the subset comprising of (he) is a normal subgroup since the second component of the resulting product of (h, k)^{-1} * (h', e) * (h,k) is the identity;
  • these two subgroups intersect trivially.

Example

The symmetric group Sn has normal subgroup An as well as subgroup H = {e, (1,2)}. Since H \cap A_n = \{e\}, we see that the symmetric group is the semidirect product of the groups An and H.

Exercise 1 : consider the cyclic group Z/pq for primes p and q. If pq, we already know that this group is isomorphic to Z/p × Z/q and is hence a non-trivial direct product. In particular it is a semidirect product (the direct product is a special case where the action is trivial). Now suppose p=q. Is Z/p2 a non-trivial semidirect product of two subgroups?

Exercise 2 : let K act on H in two ways via \phi, \psi : K\to \text{Aut}(H). This gives us semidirect products H\rtimes_\phi K and H\rtimes_\psi K. Prove that if projecting to Out(H) gives identical maps \overline\phi = \overline\psi : K\to \text{Out}(H), then the two semidirect products are isomorphic.

Answers (highlight to read).

  1. No because the group has a unique subgroup of order p.
  2. (Sketch) Suppose ψ(x) = y·φ(xy-1 for some y in H. The element (hk) in semidirect product for φ then corresponds to the element (y-1hyk) in semidirect product for ψ. This gives an isomorphism between the two semidirect products.
Posted in Notes | Tagged , , , , , | Leave a comment

Intermediate Group Theory (2)

This is a continuation from the previous post. Let G act on set X, but now we assume that both G and X are finite. Since X is a disjoint union of transitive G-sets, and each transitive G-set is isomorphic to G/H for some subgroup H ≤ G, it follows that we have:

|X| = \sum_i |G/H_i| = \sum_i \frac{|G|}{|H_i|}.

Let’s consider the case where HiG. The corresponding G-sets are merely singleton sets since |G/Hi| = 1. These are the elements x of X for which gxx for all g in G. We call such elements fixed points of X and denote them XG.

Now we can modify our equation to obtain:

|X| = |X_G| + \sum_i \frac{|G|}{|H_i|},

where each Hi is a proper subgroup of G. Let’s specialise even further.

Definition. Let p be prime. A finite group G≠{e} is called a p-group if its order is a power of p. A subgroup which is a p-group is also called a p-subgroup.

Plugging this into the above equation, if G is a p-group, then each |G|/|Hi| is a positive power of p and so is divisible by p. We thus obtain:

|X| \equiv |X_G| \pmod p.

This will be of great use to us.

Sylow’s First Theorem

The three Sylow theorems are extremely powerful results regarding the structure of finite groups. Let’s start with the preliminary result.

Cauchy’s Theorem.  If |G| is divisible by prime p, then G has a subgroup of order p.

Proof. Use the following steps.

  1. Let X = \{(x_0, \dots, x_{p-1}) : x_i \in G, x_0 x_1 \ldots x_{p-1} = e\}.
  2. Let Z/p act on it by left-rotation: m\cdot (x_0, x_1,\ldots, x_{p-1}) = (x_m, x_{m+1}, \ldots, x_{p-1}, x_0, \ldots, x_{m-1})where m = 0, 1, …, p-1. [ Check that x_m x_{m+1} \ldots x_{p-1} x_0 \ldots x_{m-1} = e even if G is not abelian. ]
  3. Easy to count X : the first p-1 elements determine the final one uniquely, so |X| = |G|^{p-1}.
  4. The set of fixed points XZ/p is precisely the set of (x, …, x) (x in G) such that xpe.
  5. XZ/p is not empty since it contains (e, …, e).
  6. Apply the above congruence: since Z/p is a p-group, |XZ/p| ≡ |X| ≡ 0 (mod p) so there’s a non-trivial element x of G such that xp = e. ♦

The first Sylow theorem is a more general version of Cauchy’s theorem, but its proof requires Cauchy’s theorem to kickstart it.

First Sylow Theorem. If |G| has order divisible by p^{m+1} and H<G is a subgroup of order p^m then H is contained in a subgroup K≤G of order p^{m+1}.

In short, by Cauchy’s theorem G has a subgroup H1 of order p. First Sylow theorem says this subgroup must be contained in another subgroup H1 < G of order p2. Applying this theorem repeatedly gives a sequence H_1 < H_2 < H_3 \ldots \le G where each |H_i| = p^i.

Proof.

Case 1. If H is a normal subgroup of G, then G/H is a group of order divisible by p. Apply Cauchy’s theorem to obtain a subgroup G/H of order p, which corresponds to a group HK ≤ G of order × |H|. Done.

Case 2. If not, let’s try to “make it normal”. Hence, let N(H) = \{g\in G: gHg^{-1}=H\}. This is called the normaliser of H since it is the largest subgroup of G for which H is normal in. Thus, H \triangleleft N(H) and it remains to show [N(H):H] is a multiple of p, after which we can replace G by N(H) and apply case 1. To show this,

  1. consider the action of H on the set of left-cosets XG/H by left multiplication (h, gH) \mapsto hgH;
  2. |G/H| is divisible by p by assumption, so |XH| is also divisible by p;
  3. but XH is the set of gH for which hgHgH for all h in H; i.e. the set of gH for which gHg-1H; in short, XN(H)/H. ♦

Second and Third Sylow Theorems

Let G be of order prm, where m is not divisible by prime p. Thanks to the first Sylow theorem, we know that there’s at least one subgroup of order pr. We shall call this a Sylow p-subgroup.

Second Sylow Theorem. Any two Sylow p-subgroups of G are conjugates, i.e. if |H|=|P|=p^r, then there is g\in G such that P = gHg^{-1}.

More generally, if P is a Sylow p-subgroup of G, and H is a p-subgroup of G, then we can find g\in G, gHg^{-1} \subseteq P.

Proof.

Let H act on the set XG/P of left-cosets via left multiplication. Since P is Sylow p-subgroup, |X| is not a multiple of p. Thus, neither is |XH|. In particular, XH is not empty so contains some gP. Since this is fixed under action by H, we have hgPgP for all h in H, and so g^{-1}Hg \subseteq P in particular. ♦

Third Sylow Theorem. Let n_p be the number of Sylow p-subgroups of G. Then:

  • n_p divides |G| and
  • n_p \equiv 1 \pmod p.

Proof.

First assertion: let P be any Sylow p-subgroup of G. Then all Sylow p-subgroups are of the form gPg^{-1}, g\in G. Repetition occurs iff

xPx^{-1} = yPy^{-1} \iff y^{-1}xP(y^{-1}x)^{-1} = P \iff y^{-1}x \in N(P) \iff y\cdot N(P) = x\cdot N(P).

So the distinct Sylow p-subgroups correspond to the left cosets G/N(P), i.e. n_p = |G|/|N(P)|, so it divides |G|.

Second assertion: let P act on the set XG/P of left cosets (by left multiplication again). By the above congruence, we have:

|X| \equiv |X_P| \pmod p.

  • Clearly LHS =  [GP].
  • On the other hand, the set XP comprises of all gP such that xgPgP for all x in P. This holds iff g^{-1}Pg = P. But this precisely means that g\in N(P). Hence X_P = N(P)/P so RHS = [N(P): P].
  • Since LHS is not divisible by p, dividing by RHS gives [G:N(P)] \equiv 1 \pmod p which completes the proof. ♦
Posted in Notes | Tagged , , , , , | Leave a comment

Intermediate Group Theory (1)

Given a group G, we wish to find out more about its properties. Questions include: what subgroups does it have? And normal subgroups? How many elements of order m does it have (where m must divide the order of G if the latter is finite)? It turns out the most fruitful way of looking at this problem is by concretely representing the group as the set of symmetries of some object, then looking at the structure of the object.

For finite groups, some possibilities include:

  • set of permutations on a finite set;
  • set of automorphisms of some group H (i.e. isomorphisms H → H);
  • set of invertible linear maps on a vector space.

We’ll be looking at the first two cases for now. Though the third case (representation theory) is actually much more interesting, it requires significantly more background to fully comprehend.

Group Actions

Let G be a group.

Definition. An action of G on a set X is a group homomorphism \phi : G \to S_X, where S_X is the group of all bijections X → X. The set X, together with the underlying action of G, is called a G-set.

Symbolically, the following notation is convenient. If g\in G and x\in X, then write “g·x” or “gx” for the element \phi(g)(x) \in X. Notice that the following conditions are satisfied.

  • ex = \phi(e)(x) = id_X (x) = x for any x in X.
  • g_1(g_2 x) = g_1(\phi(g_2)(x)) = \phi(g_1)(\phi(g_2)(x)) = \phi(g_1 g_2)(x) = (g_1 g_2)x for any g_1, g_2 \in G, x\in X.

[ The above notation does get a bit hairy (and will get worse), so if you’re new to this, do make sure everything works out. Or better yet, see if you can derive these results on your own. Diagrammatically, we have the following. ]

It turns out the above two conditions suffice to determine the original homomorphism. To put it explicitly, suppose we have a map G × X → X, denoted (gx) → gx, such that: (i) exx and (ii) g1(g2x) = (g1g2)x. Then there is a group homomorphism \phi : G \to S_X such that gx = \phi(g)x. The proof is as follows.

  • For each g \in G, the map X \to X, x \mapsto gx is bijective:
    • injectivity follows from gx = gy\implies g^{-1}(gx) = g^{-1}(gy) \implies (g^{-1}g)x = (g^{-1}g) y \implies ex = ey \implies x=y;
    • surjectivity follows from g(g^{-1}x) = (gg^{-1})x = x.
  • It follows that we have a map G \to S_X, where g\mapsto (x \mapsto gx). Condition (ii) then implies that this is a homomorphism. ♦

Symbolically, this means that associativity holds for a word like:

g_1 g_2 \dots g_n x, where g_1, g_2, \dots, g_n \in G and x\in X,

so the brackets do not matter as long as the last “character” in the word belongs to X.

Examples of Group Actions

  1. The symmetric group Sn acts on the set {1, 2, …, n} naturally.
  2. Any group G acts on the underlying set of G itself by left multiplication, i.e. underlying map × G → G is exactly group product, or g is taken to the permutation taking x\in G \mapsto gx\in G.
  3. Related  to (2), G also acts on itself by right multiplication of g-1, i.e. g is taken to the permutation x\in G\mapsto xg^{-1}\in G.
  4. One can compose (2) and (3) since they commute: this gives the conjugancy map \phi_g : G \to G, \phi_g(x) = gxg^{-1}. We’ll have ample chance to use this later since \phi_g is in fact an automorphism of G.
  5. Let H ≤ G be a subgroup and take the set G/H of left cosets gH. Then G acts on this set G/H via (g, g'H) \mapsto gg'H.
  6. Let X be the set of all subgroups of G. Each g\in G and subgroup H\le G give a conjugate gHg^{-1} \le G. This gives an action of G on X.

Here’s an example of an application: we shall prove that the symmetric group A5 of order 60 has no subgroup of order 20. To show this, use the classical result that A5 is a simple group (i.e. it has no normal subgroups except {e} and itself). Now if H\le A_5 is a subgroup of index 3, then A5 acts on the set of left cosets A5/H, which has 3 elements. This gives a homomorphism A_5 \to S_3. The kernel is a normal subgroup of A5 (which is simple) and so must be the whole group A5. So A5 acts trivially on the set of left cosets, which is impossible (why?).

Properties of Group Actions

Let’s look at a G-set X (i.e. a set equipped with an action of G). One can look at either the group or the set.

Definition. The stabiliser group (or isotropy group) is defined by G_x = \{g\in G : gx = x\}.

For x, y\in X, we write x ~ y if there is a g\in G such that gx = y. If this holds for all x, y, we say the group is transitive.

Some important properties need to be proven.

  1. The stabiliser group is a group (as implied by the name) :
    • by definition it contains the identity and is closed and multiplication;
    • for inverse, we have x = ex = g^{-1}(gx) = g^{-1}x.
  2. The relation x ~ y is an equivalence relation.
    • reflexive: xx since exx;
    • symmetric: if xy then gxy so g^{-1}y = g^{-1}gx = ex = x;
    • transitive: if xy and yz then gxy and g’yz, so z = (g’g)x.

By property 2, each G-set X can be partitioned into a disjoint union of equivalence classes (known as orbits), each of which is a transitive G-set. For example, in example 6 above, since the conjugation map G\to G, x \mapsto gxg^{-1} is bijective, the resulting subgroups H and gHg-1 have the same cardinality. So we could consider the action of G on the set Xm of all subgroups of order m (for various m), although this action is usually not transitive.

In short, it suffices to look at transitive G-sets and it turns out they can be nicely classified!

Theorem. Every transitive G-set is isomorphic to G/H for some subgroup H of G.

First, we explain what is meant by “isomorphism” of G-sets.

In set theory, we have sets and functions f : S → T from one set to another. In group theory, we’ve already seen that we should be looking at group homomorphisms fG → H, i.e. functions which preserve the underlying algebraic structure (symbolically, we write f(x)f(y) = f(xy)). What about G-sets?

Likewise, when we consider functions f : X → Y between G-sets, we should focus on those functions which respect the action of G. Thus, a homomorphism of G-sets is just a function f : X → Y such that

f(gx) = g·f(x) for any g in Gx in X.

An isomorphism is then a bijective homomorphism as before, and intuitively it just means the two G-sets have identical structures up to relabelling.

Proof of above theorem. Pick any element x in X and let H = G_x be its stabiliser group. We’ll map f : G/H \to X via gH \mapsto gx. Let’s prove it’s an isomorphism of G-sets:

  • well-defined and injective: gH = g'H \iff g^{-1}g' \in H=G_x \iff g^{-1}g'x = x \iff gx = g'x;
  • surjective: obvious since action of G on X is transitive;
  • homomorphism: f(g'(gH)) = f((g'g)H) = g'gx = g'\cdot f(gH). ♦

Thus, any G-set X can be pictorially represented as a union of orbits:

Posted in Notes | Tagged , , , , , , | Leave a comment

Intermediate Group Theory (0)

Let’s take stock of what we know about group theory so far in the first series.

  • We defined a group, which is a set endowed with a binary operation satisfying 3 properties.
  • For each group, we considered subsets which could inherit the group structure. These were called subgroups.
  • Each subgroup neatly partitioned the ambient group into (left/right) cosets, which gives rise to Lagrange’s theorem (among other things).
  • If the subgroup is normal, then the collection of cosets forms a group, called the quotient group.
  • Homomorphisms are functions between groups which respect the underlying operation. The kernel (set of elements mapping to the identity) is a normal subgroup, while the image is a subgroup. By the first isomorphism theorem, every homomorphism is effectively a composition of projection G → G/N, composed with injective G/N → H.

These are basic properties of a group which seem rather natural to consider. What other interesting questions of groups can we pose?

Question 1 : is the converse to Lagrange’s theorem true?

The theorem states that a subgroup H of G satisfies |H| | |G|. Conversely, if |G| = n has a factor m, one asks if G must have a subgroup of order m.

The answer is no but finding a counter-example is no easy affair: the smallest example I can think of is G = A_5 of order 60, and m=30. This follows from the two observations:

  • A_5 has no normal subgroup except {e} and itself;
  • any subgroup H ≤ G of index 2 is normal (because if g\in G - H, then G is the disjoint union of H and gH, and also the disjoint union of H and Hg; thus gHHg).

Thus, our naive conjecture turned out to be false. But instead of giving up, let’s consider circumstances where this is true. It turns out, if m is a prime-power which divides n = |G|, then G has a subgroup of order m. This is part of Sylow’s theorems, which offers great insight into the structure of finite groups and their subgroups.

Question 2 : how many groups, up to isomorphism, are there of order n for n = 1, 2, … ?

When n is prime, we already know the answer: the group is cyclic. With a bit of work, one can prove later that if np2 (p prime), then the only possibilities are Z/p2 and (Z/p) × (Z/p). Things quickly get worse as we consider higher powers of prime.

An interesting phenomenon is that we get the largest number of groups for 2k. To be specific, for n close to 2k, there are far more groups of order 2k than not. In fact, among all groups of order at most 2000more than 99% of them have order 1024. And if we exclude these, more than 96% of the remaining have order 768.

Question 3 : can we consider the structure of a group by “factoring” it into its constituents?

If N is a normal subgroup of G, then we obtain subgroup N and quotient group G/N. Conversely, one hopes to reconstruct G by looking at the smaller two constituents N and G/N. The question then becomes the following.

Question 4. Given N and H, how do we construct all groups G containing N such that G/N \cong H?

The extension problem is theoretically solved, but actually computing all the extensions can be a real hassle. We hope to go through some explicit examples by constructing extensions of groups of order 2n.

But even if the extension problem is solved, there remains the problem of classifying groups with no nontrivial subgroups.

Question 5. A group of order > 1 is said to be simple if its only normal subgroups are {e} and itself.

[ We exclude the trivial group since we need simple groups as building blocks for more complicated groups. In this regard, the trivial group is useless. Just like 1 is not a prime number. ]

If G is abelian, then G is simple if and only if it has prime order (Exercise: prove it, using the fact that all subgroups of abelian groups are automatically normal.)  For non-abelian groups, things get much stickier. The classification of finite simple groups was a massive endeavour undertaken by numerous mathematicians, which lasted until the mid 1980s. It was only twenty years later that a gap was found and had to be filled in. Here’s a more detailed overview of the history.

Speaking of breaking a group into constituents, one naturally asks if the set of constituents is unique. Explicitly:

Question 6. Let G be a finite group. Write:

\{e\} = H_0 \triangleleft H_1 \triangleleft H_2 \triangleleft \dots \triangleleft H_r = G,

where each H_i \triangleleft H_{i+1} and the quotient H_{i+1}/H_i is simple. Is the set of H_{i+1}/H_i unique up to isomorphism?

The answer is yes: this is the Jordan-Hölder theorem, which holds in far greater generality. Its proof is rather technical so we won’t be going through it.

Posted in Notes | Tagged , , , , , | 1 Comment

Casual Introduction to Group Theory (6)

Homomorphisms

[ This post roughly corresponds to Chapter VI of the old blog. ]

For sets, one considers functions fS → T between them. For groups, one would like to consider only actions which respect the group operation.

Definition.  Let G and H be groups. A function f : G → H is a homomorphism if f(xy) = f(x)f(y) for all x, y in G. Note that the product on LHS is in group G, while the product on RHS is in group H.

A homomorphism is called a monomorphism (resp. epimorphismisomorphism) if it is injective (resp. surjective, bijective).

Clearly, if fG → H and gH → K are both homomorphisms of groups, then so is gfG → K. Furthermore, f satisfies the following.

  • f(e_G) = e_H : we have f(e_G) = f(e_G * e_G) = f(e_G) * f(e_G), so cancel f(e_G) on both sides;
  • f(x)^{-1} = f(x^{-1}): we have f(x)f(x^{-1}) = f(x*x^{-1}) = f(e_G) = e_H, so multiply both sides by f(x)^{-1}.

These are the least we should expect out of homomorphisms. It also turns out a group homomorphism preserves subgroups, in the following sense:

Theorem.  Let f : G → H be a group homomorphism.

  • If K < G is a subgroup then f(K) < H is a subgroup.
  • If L < H is a subgroup, then f-1(L) < G is a subgroup.
  • If L is a normal subgroup of H, then f-1(L) is a normal subgroup of G.

Definition. In particular, the image im(f) := f(G) is a subgroup of H, and the kernel ker(f) := f-1({e}) is a normal subgroup of G.

The proof is easy so we’ll leave it to the reader. Notice one missing case: i.e. if K is a normal subgroup of G, is f(K) normal in H? A moment’s thought tells us this is obviously not true: pick a subgroup H of G which is not normal; then the inclusion H → G has image precisely H, which is not a normal subgroup of G by construction.

This will be a recurring theme throughout algebra: pullbacks f^{-1}(Y) are generally more well-behaved than images f(X).

Examples

  1. If N is a normal subgroup of G, then the projection map fG → G/N which takes g to gN is an epimorphism. The kernel is precisely N. In fact, this characterises all epimorphisms from G as we shall see from the isomorphism theorems.
  2. If GSn (n>1), the set of n-degree permutations, then the sign map f: G \to \{+1, -1\} which takes even (resp. odd) permutations to +1 (resp. -1) is an epimorphism. The kernel is An.
  3. If G = GLn(R) denotes the group of n × n invertible matrices with real entries, then the determinant map det : G → R* is a group homomorphism. The kernel is SLn(R), the set of matrices with determinant = 1.

First Isomorphism Theorem

Although there’re three, it’s really the first that’s most important. The remaining two can be derived from this first one.

First Theorem of Isomorphism. Let f : G → H be a group homomorphism with kernel N = ker(f). This gives an isomorphism φ : G/N → im(f), where φ(gN) = f(g).

Proof. There’re several things to check for the map φ.

  • Well-defined & injective: xN = yN \iff y^{-1}x \in N \iff f(y^{-1}x) = e \iff f(y)^{-1}f(x) = e \iff f(x) = f(y).
  • Surjective: follows from definition of im(f).
  • Homomorphism: \phi(xN * yN) = \phi(xyN) = f(xy) = f(x)f(y) = \phi(xN)\phi(yN).

Proof complete. ♦

Pictorially, we get:

Here’s one application of the theorem. Suppose HG is a subgroup and N is a normal subgroup of G. This gives homomorphisms H → G → G/N so the composed map is also a homomorphism. The kernel of the resulting map is \{x\in H: xN = N\} = H\cap N. So by the first isomorphism theorem, we get an injective map

H/(N\cap H) \hookrightarrow G/N.

Here’s another example: let’s prove that

  1. GL_n(\mathbf{R})/SL_n(\mathbf{R}) is isomorphic to the group R*;
  2. if G denotes the set of complex numbers z with |z|=1, prove that G is isomorphic to R/Z, where Z=group of integers, R=group of real numbers, both under addition.

The method in both cases is identical. Construct a map from one group to the other; find the kernel, then apply the above first isomorphism theorem.

  1. Take the determinant map GL_n(\mathbf{R}) \to \mathbf{R}^*. The map is surjective (e.g. the diagonal matrix with one entry x and remaining entries 1 has determinant x). The kernel is SL_n(\mathbf{R}). So the result follows from the first theorem of isomorphism.
  2. Take the exponential map R → C*, where x goes to exp(2πix). The kernel is exactly the set of integers while the image is the set of complex numbers z with |z|=1. The result follows from the theorem again.

“Factoring Through”

Let fG → H be a homomorphism. Suppose we know of elements xy for which f(x) = f(y) = e. Then since the kernel of f contains both x and y, it must contain N(xy), the normal subgroup generated by {xy}. So we get a series of group homomorphisms:

G\to G/N(x,y) \to G/\text{ker}(f) \to \text{im}(f) \hookrightarrow H,

where the first two maps are projections (since N(x, y) ≤ ker(f)), the third is due to the first isomorphism theorem and the fourth is just good old inclusion. We thus conclude that f gives rise to G/N(xy) → H.

This often happens when one doesn’t quite know the kernel,  but can obtain a few explicit elements in it. The question is whether these elements generate the full kernel. One can, at the very least, say that the map f factors through G/N(xy) → H.

Second and Third Theorems of Isomorphism

Let’s look at the remaining two theorems of isomorphisms.

Second Theorem of Isomorphism. Let H, K ≤ G be subgroups with H normal in G. Then we already know KH is a group. We get the isomorphism

K/(K\cap H) \cong KH/H.

Proof.

Map K → KH by inclusion and KH → KH/H by projection (this is well-defined since H is normal in KH). The composition is surjective because every element of KH/H is of the form (kh)HkH. The kernel is the set of k in K such that kHH, i.e. kernel = K ∩ H. Thus the result follows from the first theorem of isomorphism. ♦

Third Theorem of Isomorphism. Let N ≤ H ≤ G be subgroups such that N and H are normal in G (this also means N is normal in H). Then we have the isomorphism

(G/N)/(H/N) \cong G/H.

Proof.

Construct a map fG/N → G/H via f(gN) = gH. This is well-defined since

gN = g'N \iff g^{-1}g' \in N \implies g^{-1}g' \in H \iff gH = g'H.

It is clearly surjective. The kernel is the set of xN such that xHH, i.e. the kernel is exactly H/N. Now apply the first theorem of isomorphism. ♦

Correspondence between G and G/N

Let N \triangleleft G. The key result here is the interplay between subgroups of G and subgroups of G/N.

Theorem. There is a one-to-one correspondence :

\{H : N \le H \le G \text{ subgroup}\} \leftrightarrow \{K : K \le G/N \text{ subgroup}\}

which preserves:

  • inclusion : H_1 \subseteq H_2 \iff K_1 \subseteq K_2;
  • normality : H_1 \triangleleft H_2 \iff K_1 \triangleleft K_2;
  • indices : [H_2 : H_1] = [K_2 : K_1] whenever H_1\subseteq H_2.

Here, H_1 and H_2 are subgroups on the LHS corresponding to subgroups K_1 and K_2 on the RHS.

Sketch of Proof.

Let \pi : G \to G/N be the projection map. Let’s construct maps between the two sides.

  • RHS → LHS : for subgroup K ≤ G/N, take H = \pi^{-1}(K) which is a subgroup of G containing N = ker(π);
  • LHS → RHS : for N ≤ H ≤ G, take K = \pi(H) \le G/N.

We’ll leave it to the reader to show that the composition of the two maps – in either order – is the identity map (on either the LHS or RHS). This shows the bijection.

  • Inclusion follows immediately from the definition of the two maps.
  • For normality, prove that K is normal in G/N iff \pi^{-1}(K) is normal in G.
  • For indices, prove that a complete set of coset representatives of [\pi^{-1}(K_2) : \pi^{-1}(K_1)] is exactly a complete set of coset representatives of [K_2 : K_1]. ♦

Example of a “Universal” Proof

Of course there’re more properties we haven’t mentioned: e.g.

If H_i \leftrightarrow K_i is a collection of correspondences, then the intersection also corresponds: \cap_i H_i \leftrightarrow \cap_i K_i.

This can be proven in a straightforward way, but it’s “better” to use the existing properties. Now if H = \cap_i H_i, then H satisfies the property:

  • H \subseteq H_i for each i;
  • if there is an H’ such that H' \subseteq H_i for each i, then H' \subseteq H.

The corresponding K must then satisfy the same properties, albeit with Ki‘s:

  • K \subseteq K_i for each i;
  • if there is a K’ such that K' \subseteq K_i for each i, then K'\subseteq K.

Thus the only possibility for K must be the intersection of the Ki‘s.

[ Note: this proof is “better” in the sense that the approach is more high-level (technical term: it is more universal). While the universal method doesn’t seem any shorter now, it lends itself to replication for many different types of algebraic structures, some of whose definitions are extremely intricate and involved. ]

Posted in Notes | Tagged , , , , , , | Leave a comment

Casual Introduction to Group Theory (5)

Normal Subgroups and Group Quotients

[ This corresponds to approximately chapter V of the old blog. ]

We’ve already seen that if H ≤ G is a subgroup, then G is a disjoint union of (left) cosets of H in G. We’d like to use the set of these cosets to form a group. Naturally, we define:

xHyH := (xy)H.

The only problem which occurs is whether this is well-defined (note: this will be a recurring theme in algebra). Explicitly, if xHx’H and yHy’H, then is it true that (xy)H = (x’y’)H? We won’t repeat the argument here, but the end result is that the definition makes sense if and only if H is normal in G:

Definition. A subgroup H of G is said to be a normal subgroup  (written H \triangleleft G) if for any g\in G and h\in H, we have ghg^{-1} \in H.

If N \triangleleft G, then the set of cosets G/N can be made into a group via (gN)*(g’N) := (gg’)N.

Equivalently, we can say gHg^{-1} \subseteq H for any g in G. In fact, this implies gHg^{-1} = H for any g in G. [ To see the reverse inclusion, replace g by its inverse to obtain g^{-1}Hg \subseteq H \implies H \subseteq gHg^{-1}. ] So in fact, saying H is a normal subgroup is equivalent to saying the left cosets gH are identical to the right cosets Hg.

Be careful: for general subgroup H of G, it’s possible to find an element g such that gHg^{-1} \subset H strictly. E.g. consider G = GL_2(\mathbf{R}) and the subgroup H = \left\{\begin{pmatrix} 1 & n \\ 0 & 1\end{pmatrix} : n \in \mathbf{Z}\right\}. The element g = \begin{pmatrix} 1 & 0 \\ 0 & 1/2\end{pmatrix} then gives gHg^{-1} = \left\{\begin{pmatrix} 1 & 2n \\ 0 & 1\end{pmatrix} : n\in \mathbf{Z}\right\} which is properly contained in H. If you think this contradicts the previous paragraph, mull over it a bit more.

Examples

Let’s consider the examples in the previous posts. In the abelian case, all subgroups are normal since ghg^{-1} = gg^{-1}h = h. But let’s see what their quotient groups look like.

  1. For mZZ, where m>0, the quotient is precisely Z/m. In fact, this is precisely the definition of Z/m.
  2. For Q>0Q*, it’s quite easy to prove that \mathbf{Q}^* \cong \mathbf{Q}^{>0} \times {\pm 1}. So the quotient is (isomorphic to) Z/2. For Q*2 < Q>0, the resulting quotient is a product of infinitely many copies of Z/2: one for each prime p.
  3. The quotient of a cyclic group is still cyclic, so for k·(Z/m) < Z/m (where k|m) the resulting quotient is Z/k
  4. For (Z/21)*2 < (Z/21)*, the resulting group quotient has order 4. Since the square of every element lies in the subgroup, every element of the quotient has order 2. By the previous post, we see that the group is Z/2 × Z/2.

For the non-abelian cases:

  1. The image of Sm in Sn is not normal if 1 < mn. This is because although (1,2) lies in the subgroup, (2,n)(1,2)(2,n)^{-1} = (1,n) does not.
  2. The subgroup VS4, where V = {e, (1,2)(3,4), (1,3)(2,4), (1,4)(2,3)}, is a normal subgroup. To see why, our first post on permutations mentioned that if g and h are permutations, then ghg^{-1} has the same cycle structure as h. But V already contains all permutations with cycle structure 2+2. The quotient has order 6. Check that it’s isomorphic to S3. [ Worst come to worst, just draw the 6 × 6 multiplication table. ]
  3. For SLn(R) < GLn(R), the subgroup is normal since if det(h)=1, then \det(ghg^{-1}) = \det(g)\det(h)\det(g)^{-1} = \det(g)\det(g)^{-1} = 1.

Properties of Normal Subgroups

Some basic properties include:

  1. The intersection of normal subgroups of G is also normal.
  2. If N is a normal subgroup of G, and H is an arbitrary subgroup of G, then N ∩ H is normal in H.
  3. Normality is not transitive, i.e. we can find H normal in G, N normal in H, but N is not normal in G.

The proofs of (1) & (2) are easy and left as exercises. For (3), pick \{e, (1,2)\} \triangleleft V \triangleleft S_4. But clearly {e, (1,2)} is not normal in S4. Property (1) has an interesting implication: if S is some subset of G, we let N(S) be the intersection of all normal subgroups of G containing S. As beforeN(S) is called the normal subgroup of G generated by S and is the “smallest” normal subgroup of G containing S. This will come in handy later.

Symbolic Visualisation

I find the following visualisation helpful. For normal subgroup N \triangleleft G, since gNg-1N, we have gNNg and thus:

(gN)(g'N) = g(Ng')N = g(g'N)N = (gg')(NN) \subseteq (gg')N

where for any subsets ST of G, we define ST = \{xy : x\in S, y\in T\}. The inclusion NN \subseteq N is equivalent to saying N is closed under group product *. Actually we even have NNN, but we don’t need that to show that * is well-defined on the set of cosets.

The above symbolic visualisation will be helpful in the following.

Proposition. Let H, K ≤ G be subgroups. In general, HK is not a subgroup of G. However:

  1. if H (or K) is normal in G, then HK is a subgroup;
  2. if H and K are both normal in G, then HK is normal in G.

Sketch of Proof.

To show that HK is not a subgroup in general, let H = {e, (1,2)} and K = {e, (2,3)} in S3. Now, for (1), if H is normal in G, then for any elements kk’ of K, we have:

(Hk)(Hk') = H(kH)k' = H(Hk)k' \subseteq H(kk') \subseteq HK.

For inverse, (Hk)^{-1} = k^{-1}H = Hk^{-1} \subseteq HK.

To prove (2), if H and K are both normal, then for any element g of G, we have:

g(HK) = (gH)K = (Hg)K = H(gK) =H(Kg) = (HK)g

which completes the proof rather neatly and you can easily recall it any time. ♦

Group Products

Consider normal subgroups H, K\triangleleft G as above. If we assume they intersect trivially (H\cap K = \{e\}) then elements of H must commute with elements of K since:

(hk)(kh)^{-1} = hkh^{-1}k^{-1} = \overbrace{(hkh^{-1})}^{\in K}k^{-1} = h\overbrace{(kh^{-1}k^{-1})}^{\in H}

must lie in both H and K and so must be the identity. Furthermore, the multiplication map fH × K → HK is an isomorphism. Why? Because

  • f maps (hk) * (h’k’) = (hh’, kk’) to hh’kk’ = hkh’k’, which is the product of f(hk) and f(h’k’). [ A more proper notation would be f((hk)) but you know what we mean here. ]
  • The map is surjective by definition of HK.
  • The map is injective since hk = h'k' \implies h'^{-1}h = k'k^{-1} \in H\cap K. So it must be the identity and h=h’k=k’.

Conclusion. If H, K are normal subgroups of G with trivial intersection, then HK is isomorphic to H × K via the multiplication map (h, k) → hk.

Conversely, if you have a group product G = H × K, then clearly H × {e} and {e} × K are two normal subgroups of G which intersect trivially. Thus we have found a criterion for identifying group products: given G, can we express it as a product of two groups?

Now let’s loosen the condition a little and still say H \cap K = \{e\}, but now H is normal in G though K may not be. What happens to the multiplication map fH × K → HK? It’s still surjective by definition. Injectivity follows since H and K intersect trivially. The only problem is that f may not respect the group product on both sides since elements of H don’t commute with elements of K now (the earlier reasoning falls apart).

So we have a bijective map between two groups which doesn’t respect the algebra. But all is not lost, for it turns out we’ll get what’s called the semi-direct product. This will be covered in a later post.

Posted in Notes | Tagged , , , , , | Leave a comment

Casual Introduction to Group Theory (4)

Cosets and Lagrange’s Theorem

[ This post approximately corresponds to chapter IV from the old group theory blog. ]

The main theorem in this post is Lagrange’s theorem: if H ≤ G is a subgroup then |H| divides |G|.

But first, let’s consider cosets. Now, when H is a subgroup of G, it turns out G can be “sliced up” into subsets of equal sizes, one of which is H itself. To put it explicitly, let’s define a (leftcoset to be a set of the form:

gH = \{ gh : h\in H\}, where g\in G.

Then it turns out two cosets are either equal or disjoint:

Proposition. Let x, y \in G be any elements.

  • If x^{-1}y \in H, then xH = yH.
  • If x^{-1}y \not\in H, then xH \cap yH = \emptyset.

The proof is quite easy, but it’s here if you want it. Diagrammatically this gives:

Furthermore, we have |gH| = |H| for any g since the left-multiplication map which takes x to gx is a bijection between H and gH. Immediately, we obtain:

Lagrange’s Theorem. If H is a subgroup of finite group G, then |H| divides |G|. Also, |G|/|H| = [G:H], the number of (left) cosets of H in G. We call this the index of H in G.

In particular, if g is an element of G, then \left<g\right> is a subgroup of order o(g), so o(g) must divide |G|. Thus, g^{|G|} = e.

The theorem is rather surprising at first glance, since in the case of subset of a set, one can only say whether an element is in or not in the subset. In the case of subgroups, we get a nice partitioning.

Note: the following picture may help. If G=\mathbf{R}^3 and H is a plane in G passing through the origin, then each coset is just a translate of the plane; so \mathbf{R}^3 is the union of disjoint planes parallel to H.

Examples

Here’re the examples of subgroups from the last post. Let’s check that they satisfy Lagrange’s theorem.

  1. For the subgroup mZZ, the index [ZmZ] = m if m>0. In the case m=0, the index is infinite.
  2. For Q*2 < Q>0 < Q*, the index [Q* : Q>0] = 2 since we can just pick {+1, -1} as coset representatives, i.e. +1 and -1 belong to different cosets and the two form a disjoint union. On the other hand [Q>0 : Q*2] is infinite: can you find a good set of coset representatives? Answer (highlight to read): the set of primes.
  3. For the subgroup k·(Z/m) < Z/m, where k|m, the index equals k since the subgroup has order m/k.
  4. For (Z/21)*2 < (Z/21)*, the RHS group has elements {1, 2, 4, 5, 8, 10, 11, 13, 16, 17, 19, 20} while LHS group has elements {1, 4, 16}. Thus, the index = 12/3 = 4, with coset representatives {1, 2, 5, 10}.

Now for the non-abelian cases.

  1. Since |Sm| = m! and |Sn| = n!, the embedding S_m \to S_n for m ≤ n has index P^n_{n-m} = n!/m!.
  2. Since |V| = 4 and |A4| = 4!/2 = 12, the index = 3. For coset representatives, we can pick  e, (1,2,3) and (1,3,2).
  3. For A< Sn, the index = n!/(n!/2) = 2. Any odd permutation is a coset representative.

One particularly useful consequence is the following theorem in classical number theory.

Euler’s Theorem. Let a be coprime to m, where m>0. Then a^{\phi(m)} \equiv 1 \pmod m, where \phi(m) denotes the Euler-totient function.

In particular, if p is prime and a is not a multiple of p, then a^{p-1} \equiv 1 \pmod p.

Indeed, the group (Z/m)* has order φ(m) so every element a in the group must satisfy a^{\phi(m)}\equiv 1 \pmod m. Thus, going into an abstract theory can sometimes cast a new light on classical results.

Another immediate result is the following.

Any group of prime order must be cyclic.

Indeed, let G be of prime order p and g be an element which is not the identity. By Lagrange’s theorem, o(g) divides p so it equals 1 or p. Since g ≠ e, it can’t be 1. So the order of g is exactly p and we must have G = \left<g\right> since both have order p.

Thus, groups of orders 1, 2, 3, 5 are all cyclic.

Exercise : prove that if G has order 4, then G is isomorphic to either Z/4 or (Z/2) × (Z/2).

Answer (highlight to read): if x is an non-identity element, then its order divides 4 so must be 2 or 4. If it’s 4, G is cyclic. Otherwise, x*x = e. Pick another element y other than e and x. Again, y*y = e. Now xy is different from ex, y, so G = {exyxy}. Same goes for yx, so xyyx. QED.

More generally, any group of order p2 (p prime) is either Z/p2 or (Z/p) × (Z/p) but it’s hard to prove it with our current set of tools. Inspired by this, consider a rather ambitious goal: can we classify all finite groups of a certain order? Although this question is too general and immense, it’s nice to keep it in mind when we look at later concepts, in particular the Sylow theorems.

Next, it’s possible to define right cosets: these are sets of the form:

Hg = \{hg : h\in H\}

for a subgroup H ≤ G. Again, any two right cosets are either disjoint or identical so we could have proven Lagrange’s theorem with right cosets too. The left and right cosets don’t coincide in general. For example, consider H = {e, (1,2)} < S3. If we use the coset representative g = (1,3), then:

gH = {(1,3), (1,2,3)}   and   Hg = {(1,3), (1,3,2)}.

When left and right cosets do coincide, something interesting happens, but we’ll leave this to the next post.

Finally, we mention briefly about double cosets: if HK ≤ G are both subgroups, then we define HgK = \{hgk : h\in H, k\in K\}. Again, two cosets are either identical or disjoint, but now their sizes can vary. This is not really an important topic, but interested readers can read about it here.

Posted in Notes | Tagged , , , , , | Leave a comment