Consider the following simple problem.

Prove that the shape on the left cannot be completely tiled by 20 polygons of the types shown on the right.

The solution is rather simple: colour the shape in the following manner.

This gives 18 green hexagons, 21 orange hexagons and 21 blue hexagons. On the other hand, each triangular piece must cover three hexagons of distinct colours. Hence, if we could tile the figure completely with 20 pieces, we would have 20 hexagons of each colour, which is a contradiction. Thus such a tiling is impossible.

Now suppose we replace the above shape with the following.

It is still impossible to tile it with the two types of pieces as above. However, the colouring method fails to prove it since we would obtain an equal number of hexagons of each colour. Thus we need a better method.

The following technique can be generalized to prove that a similar shape of size *N* is tileable if and only if $N \equiv 0, 2, 9, 11 \pmod{12}$.

Here, we will present the solution by John H. Conway and Jeffrey C. Lagarias in their paper “*Tiling with polyominoes and combinatorial group theory*” (1990). First we convert the problem to an equivalent one with polyominoes. We need to show that it is impossible to tile the left polyomino with the two triangular ones on the right (no rotation allowed).

Let *F* be the free group on the two-element set {*x*, *y*}. To describe it concretely, we consider the set of all words in $x, y, x^{-1}, y^{-1}$. **Reduction** in a word occurs when we remove two neighbouring occurrences of $x$ and $x^{-1}$ (or of $y$ and $y^{-1}$), in either order. E.g. $x y y^{-1} x \to x x$.

A word is said to be **reduced** if no further reduction is possible. Now we can write *F* as the set of all reduced words in $x, y, x^{-1}, y^{-1}$.

- Group composition in *F* corresponds to concatenation of words with reduction. For instance, $(x y^{-1}) \cdot (y x) = x x$.
- The group identity is the empty word.
- To obtain the inverse of a word, we reverse the order and replace each $x$ (resp. $y$) by $x^{-1}$ (resp. $y^{-1}$) and vice versa. E.g. $(x y^{-1} x)^{-1} = x^{-1} y x^{-1}$.

For convenience, we write $x^m$ for *m* neighbouring copies of *x* and $x^{-m}$ for *m* neighbouring copies of $x^{-1}$, and similarly for $y$. Thus $x^3 y^{-2}$ is shorthand for $x x x y^{-1} y^{-1}$.
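The word operations of a free group on two generators are easy to experiment with. The following is a minimal Python sketch, assuming a word is stored as a string over `x`, `y`, `X`, `Y`, where the uppercase letters stand for the inverse generators. This encoding and the function names are chosen here for illustration only.

```python
def reduce_word(word):
    """Cancel neighbouring inverse pairs (xX, Xx, yY, Yy) until none remain."""
    stack = []
    for letter in word:
        # A letter cancels with the previous one if they differ only in case.
        if stack and stack[-1] == letter.swapcase():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)


def inverse(word):
    """Reverse the word and invert every letter."""
    return word[::-1].swapcase()


def multiply(u, v):
    """Group composition: concatenate, then reduce."""
    return reduce_word(u + v)
```

A single left-to-right pass with a stack suffices, because every cancellation can only expose a new pair at the top of the stack.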

Now for each polyomino above, we obtain an element of *F* as follows. Start with a base point on the perimeter. We trace around the perimeter in a counter-clockwise direction, write $x, x^{-1}, y, y^{-1}$ for the directions of right, left, up, down respectively, and record the loop as a word upon returning to the base point.

For the large triangular figure, we pick the bottom-left corner as the base point and obtain the following word:

Now our definition of depends on our choice of base points. Let’s see what happens to when we switch to a different point on the perimeter. Let correspond to the path from the old base point to the new along the perimeter, and let be the new loop. Then from the diagram:

we have in the group *F*.

In other words, if we change the base point, the new loop is a conjugate of the old one.

The process of tiling can be converted to a statement about group product. Suppose we place the two figures together as follows:

The perimeter for *A* is represented by (or a conjugate) while that for *B* is represented by Hence the resulting diagram has perimeter which is the product of the words for *A* and *B* in the group *F*.

Summary.Suppose polygons are represented by words . Upon tiling , the resulting perimeter is represented by a word which can be obtained from via conjugation and composition.

Thus, if our large triangular polyomino can be tiled by the smaller pieces, then lies in the normal subgroup *N* generated by *a* and *b*. This will be exploited in our proof.

We will construct a normal subgroup *K* of *F* such that .

Suppose we have a robot which starts by facing a certain direction (e.g. north). For each element in *F*, we turn it into a computer program by reading the word from left to right:

- $x$: turn right 60°, walk forward 1m, then turn right 60° again.
- $x^{-1}$: turn left 60°, walk backward 1m, then turn left 60° again.
- $y$: turn left 60°, walk forward 1m, then turn left 60° again.
- $y^{-1}$: turn right 60°, walk backward 1m, then turn right 60° again.

Let *K* be the set of all words in *F* which bring the robot back to its original position and orientation. Note that *K* is closed under composition and inverse so it is a subgroup of *F*. Furthermore, it is a normal subgroup since if $g \in F$ and $h \in K$, then $g h g^{-1}$ is the program which does *g*, then *h* (which has no net effect on the robot), then the reverse of *g*.
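Membership in *K* can be checked by simply running the robot. The sketch below is an illustration under stated assumptions: words are strings over `x`, `y`, `X`, `Y` (uppercase meaning inverse), the first instruction in the list above is taken to be *x* and the third to be *y*, and positions are stored in coordinates of the triangular lattice. None of these names or encodings come from the paper.

```python
# Unit steps in the six lattice directions; each entry is the previous one
# rotated by 60 degrees counter-clockwise (coordinates in a 60-degree basis).
DIRS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def run(word):
    """Execute the program and return (position, heading)."""
    px, py, heading = 0, 0, 0
    for letter in word:
        turn = -1 if letter in "xY" else 1   # x and y^-1 turn right, x^-1 and y turn left
        step = 1 if letter in "xy" else -1   # inverses walk backward
        heading = (heading + turn) % 6
        px += step * DIRS[heading][0]
        py += step * DIRS[heading][1]
        heading = (heading + turn) % 6
    return (px, py), heading


def in_K(word):
    """True if the program returns the robot to its start position and heading."""
    return run(word) == ((0, 0), 0)
```

For instance, `in_K("xxx")` holds because three *x* instructions walk once around a small triangle, and conjugating such a word by any other word keeps it in *K*.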

Indeed, for *a* and *b* we have:

Likewise it is easy to verify that

For *c*, we let be the canonical homomorphism. Since we have:

In particular, we have .

One easily sees that the robot must travel along the following lines:

The robot’s direction is indicated by the arrow in each disc; thus, it can only face in at most one direction at any spot. If the next instruction is *x*, the robot travels on the perimeter of a yellow triangle, along the given *x* orientation. If it is *x*^{-1}, the robot travels along the perimeter of a yellow triangle, *counter* to the given *x* orientation. The same holds for *y* and *y*^{-1}, but with the blue triangle.

Since each is a closed path, we may define to be the number of hexagons enclosed by the path. Here, the counting is oriented counter-clockwise – a loop which encloses a hexagon is counted +1 if the loop is counter-clockwise and -1 otherwise. This gives the desired homomorphism We have

and thus we have

Now if the original tiling were possible, we would require pieces. Hence *c* is obtained from the product of 12 conjugates of *a* and *b* (in *F*). Since *K* is normal in *F*, each conjugate *a’* of *a* lies in *K* and furthermore Likewise for any conjugate *b’* of *b* in *F*. From this, it follows that is even, which contradicts .

Thus our proof is complete.

In the previous article, we described the Schreier-Sims algorithm. Given a small subset which generates the permutation group *G*, the algorithm constructs a sequence such that for:

we have a small generating set for each Specifically, via the Sims filter, we can restrict for each *i*.

In this article, we will answer the following question.

Factorization Problem. How do we represent an arbitrary element $g \in G$ as a product of elements of $A$ and their inverses?

If we can solve this problem in general, we would be able to solve a rather large class of puzzles based on permutations, e.g. Rubik’s cube and the Topspin puzzle. First, let’s look at the straightforward approach from the Schreier-Sims method.

Recall that the Schreier-Sims method starts with . At each step it picks a base element , computes a generating set for from then pares down this set with the Sims filter so that Thus one can, in theory, keep track of the elements of by expressing them as words in *A*.

The problem with this approach is that since is obtained from , the length of words for would be a constant factor of those for . Thus by the time we reach , their lengths would be exponential in *m*.

As far as we know, the first paper to solve the factorization problem is by T. Minkwitz in “*An Algorithm for Solving the Factorization Problem in Permutation Groups*” (1998). The idea is elegant. First, we replace the generating sets with the following tables.

Main Goal.For each , let be the orbit We wish to obtain a set , indexed by , such that

- takes the base element

In other words, the element is any element of *G* satisfying:

Minkwitz’s method replaces the sequence of sets with ; furthermore, the elements of the latter sets are stored as *words* in instead of mere permutations.

To begin, we initialize to be the empty word for *i* = 0, …, *m*-1.

Next we run through words in , starting from those of length 1, then those of length 2, etc. For each word *w*, compute the corresponding element and do the following:

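1. Start with *i* = 0.
2. Let *x* be the image of the *i*-th base element under the current group element. Do we have an entry for *x* in row *i* of the table?
   - If not, we let the entry be the word *w* and quit.
   - If yes, and *w* is shorter than the current word in the entry, we replace it with *w* and quit.
   - Otherwise, let *w*_{1} be the current entry. Replace *w* with *w*_{1}^{-1}*w*, increment *i*, then repeat step 2.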

Let us take the example from the previous article, with generated by:

Applying the Schreier-Sims algorithm, we obtain the following base and orbits:

From this, we deduce that the order of *G* is 5 × 4 × 3 × 3 × 2 = 360. Applying Minkwitz’s algorithm, we first initialize the table as follows:

Now consider the word ‘a’, which corresponds to *g* = (1, 5, 7)(2, 6, 8). This has no effect on On the other hand, it takes Thus, we write the word ‘a’ into the last entry of the second row:

After running through all words of length 1, we arrive at the following table:

The first word of length 2 is ‘aa’, which corresponds to (1, 7, 5)(2, 8, 6), but this is just a^{-1}, and we have already processed it.

The next word of length 2 is ‘ab’, which gives . This takes However, the corresponding entry is already filled with ‘b’. Since this new word does not improve upon the old one, we replace the word with b^{-1}ab. Now we have

and proceed on to This element takes so we fill in the corresponding entry on the second row:

The above method is quite effective for most random groups. However, the last few entries of the table may take a really long time to fill up. E.g. on the set:

the shortest word which takes 1 to *n* is given by Intuitively, the table fills up slowly because the elements {1, …, *n*} diffuse slowly via the generators.

One possible way out of this rut is to fill in entries of the table by looking at the group elements below the current row. To be specific, suppose the row for has quite a few empty entries. We look at the filled entries on that row, say and consider all words *w* found below the row. Their group elements are guaranteed to lie in If the entry for is currently empty, we fill it in with the concatenation of *w* and the corresponding word for [Of course we need to perform reduction after concatenating the two words.]

To further optimize, note that sometimes a table entry gets replaced with a nice shorter word – it would be desirable to use this short word to update the other table entries.

Hence, we perform the following. For each row *i*, consider any two words *w’*, *w”* on that row. We take their concatenation *w* := *w’w”* and use it to update the entire table from row *i* onwards (by update, we mean: compute and check if *w* is shorter than the table entry at *x*. If it is, we update; otherwise we let *w*_{1} be this entry and replace *w* with and repeat the whole process).

As this process is rather slow, we only update the table if either *w’* or *w”* are of shorter length, compared to the last time we did this optimization.

Minkwitz’s paper suggests the following iterative procedure.

- Set a maximum word length *l*.
- For each word of short length, update the table as described earlier.
  - If, at a certain row, the word length exceeds *l*, we bail out.
- For every *s* steps, do the following.
  - Perform the above two enhancements: filling up and optimizing.
  - Increase *l*, say, to (5/4)*l*.

Setting the word length is optional, but can speed things up quite a bit in practice. It prevents the initial *s* steps from filling up the table with overly long words, otherwise optimizing them will be costly. For further details, the diligent reader may refer to Minkwitz’s original paper, available freely on the web.

Outcome.We applied this to the study of the Rubik’s cube group, and obtained a representation with maximum word length of 184 over all elements of the group. This is slightly worse than the result of 144-165 quoted in Minkwitz’s paper.

The Topspin puzzle is a toy which consists of a loop with 20 labeled counters in it. In the middle of the loop lies a turntable which reverses the order of any four consecutive counters. This is what the puzzle looks like.

[ Photo from product page on Amazon. ]

Thus the group of permutations of the counters is generated by

*a* = (1, 2, 3, …, 20), *b* = (1, 4)(2, 3)

From the Schreier-Sims algorithm, we see that these two elements generate the full symmetric group so Topspin achieves all possible permutations of the 20 counters. Minkwitz’s algorithm gives us a practical way to solve the puzzle given any initial configuration. Our program gave us a full table with a maximum word length of 824.

To search for short words representing a given $g \in G$,

- let *w* be a short word in $A \cup A^{-1}$;
- let $h$ be the permutation for *w*;
- find the word *w’* for $h^{-1} g$;
- thus *ww’* is a word representing *g*.

We iterate over about 1000 instances of *w* and pick the shortest representation among all choices.

**Example**

Consider the transposition (1, 2). We obtain the following word:

a^{-1}a^{-1}b^{-1}a^{-1}b^{-1}ab^{-1}a^{-1}b^{-1}abab^{-1}a^{-1}a^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b^{-1}a^{-1}b

of length 43 (to be applied *from right to left*). This is a remarkably short word – for most random configurations we tried, the length of the word is >200.

Throughout this article, we let *G* be a subgroup of $S_n$ generated by a subset $A \subseteq S_n$. We wish to consider the following questions.

- Given *A*, how do we compute the order of *G*?
- How do we determine if an element $g \in S_n$ lies in *G*?
- Assuming $g \in G$, how do we represent *g* as a product of elements of *A* and their inverses?

In general, the order of *G* is comparable to $|S_n| = n!$ even for moderately-sized *A* so brute force is not a good solution.

We will answer the first two questions in this article. The third is trickier, but there is a nice algorithm by Minkwitz which works for most practical instances.

We can represent the group of transformations of the Rubik’s cube as a subgroup of $S_{48}$ generated by a set *S* of 6 permutations. The idea is to label each *non-central* unit square of each face by a number; a transformation of the Rubik’s cube then corresponds to a permutation of these 6 × 8 = 48 unit squares.

[Image from official Rubik’s cube website.]

Our link above gives the exact order of the Rubik’s cube group (43252003274489856000), as computed by the open source algebra package GAP 4. How does it do that, without enumerating all the elements of the group? The answer will be given below.

To describe the Schreier-Sims algorithm, we use the following notations:

- $A \subseteq S_n$ is some subset represented in the computer’s memory;
- $G = \langle A \rangle$ is a subgroup of $S_n$;
- *G* acts on the set $\{1, 2, \ldots, n\}$.

Let us pick some random element $k \in \{1, 2, \ldots, n\}$ and consider its orbit $O_k$ under *G*. From the theory of group actions, we have

$$|G| = |O_k| \cdot |G_k|,$$

where $G_k = \{g \in G : g(k) = k\}$ is the isotropy group of *k*. Now it is easy to compute the orbit of *k*: we start by setting the orbit to be the singleton {*k*}, then expand it by letting elements of *A* act on elements of this set. The process stops if we can’t add any more elements via this iteration. A more detailed algorithm will be provided later.

Thus, if we could effectively obtain a set of generators for $G_k$, our task would be complete since we could recursively apply the process to $G_k$. [Or so it seems: there’ll be a slight complication.]

For that, we pick a set *U* of representatives for the left cosets $G/G_k$ as follows. For each $j \in O_k$, we pick some element $g \in G$ which maps $k \mapsto j$, and for *j* = *k* we pick the identity. To facilitate this process, we use a data structure called a **Schreier vector**.

**Warning**: our description of the Schreier vector differs slightly from the usual implementation, since we admit the inverse of a generator.

Let us label the elements of the generating set $A = \{g_1, g_2, \ldots, g_m\}$.

- Initialize an array (*v*[1], *v*[2], …, *v*[*n*]) of integers to -1.
- Set *v*[*k*] := 0.
- For each *i* = 1, 2, …, *n*:
  - If *v*[*i*] = -1 we ignore the next step.
  - For each *r* = 1, 2, …, *m*, we set *g* = *g*_{r}:
    - set *j* := *g*(*i*); if *v*[*j*] = -1, then set *v*[*j*] = 2*r*-1;
    - set *j* := *g*^{-1}(*i*); if *v*[*j*] = -1, then set *v*[*j*] = 2*r*.
- Repeat the previous step until no more changes were made to *v*.

The idea is that *v* contains a “pointer” to an element of *A* (or its inverse) which brings it one step closer to *k*.
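The procedure can be sketched in Python as follows. This is an illustrative version under assumptions made here: a permutation is a list `p` of length *n*+1 with `p[i]` the image of *i* (index 0 unused), and the helper names are hypothetical. The second function follows the pointers back from a point *j* to the base point *k*.

```python
def schreier_vector(gens, n, k):
    """Return v with v[k] = 0, v[j] = 2r-1 or 2r for other orbit points, else -1."""
    inverses = []
    for g in gens:
        inv = [0] * (n + 1)
        for i in range(1, n + 1):
            inv[g[i]] = i
        inverses.append(inv)
    v = [-1] * (n + 1)
    v[k] = 0
    changed = True
    while changed:                      # repeat sweeps until nothing changes
        changed = False
        for i in range(1, n + 1):
            if v[i] == -1:
                continue
            for r, (g, inv) in enumerate(zip(gens, inverses), start=1):
                j = g[i]
                if v[j] == -1:          # j was reached from i by g_r
                    v[j], changed = 2 * r - 1, True
                j = inv[i]
                if v[j] == -1:          # j was reached from i by the inverse of g_r
                    v[j], changed = 2 * r, True
    return v, inverses


def path_to_base(v, gens, inverses, j, k):
    """Follow the pointers from j back to k, returning the points visited."""
    path = [j]
    while j != k:
        r = (v[j] + 1) // 2
        # Odd code: j = g_r(i), so step back with the inverse; even code: use g_r.
        j = inverses[r - 1][j] if v[j] % 2 == 1 else gens[r - 1][j]
        path.append(j)
    return path
```

The orbit of *k* is then the set of indices *i* ≥ 1 with `v[i] != -1`.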

Suppose we have elements

Labelling and , we obtain the following Schreier vector for *k*=1:

Thus the orbit of 1 is {1, 2, 3, 4, 5, 6, 7, 9}. To compute an element which maps 1 to, say, 9, the vector gives us Hence we can pick the following for *U*

Next, we define the map which takes to the unique such that (i.e. *u* is the unique element of *U* satisfying *u*(*k*) = *g*(*k*)). Our main result is:

Schreier’s Lemma.The subgroup is generated by the following set

**Proof**

First note that takes *k* to itself: indeed by definition takes *k* to Thus we see that each as desired so .

Next, observe that *B* is precisely the set of all for and which lies in Indeed for such an element, *u’* must be the unique element of *U* which maps *k* to and so

Now suppose ; we write it as a product of elements of *A* and their inverses:

We will write

where are elements to be recursively chosen. Specifically we start with , and for each we set . Note that each term in parentheses is an element of Thus, the expression lies in

So we have . Since , this gives as well so we have obtained *h* as a product of elements of *B* and their inverses.

Consider the subgroup generated by

If we pick *k* = 1, its orbit is For the coset representatives *U*, we take:

Now the subgroup is generated by the following 6 elements:

Let be the subgroup generated by these 6 elements; after removing the identity elements we are left with 4. Now if we pick *k* = 2 next, we obtain 5 representatives for the next *U* and thus, we obtain up to 20 generators for the stabilizer of {1, 2} in *G*.

The number of generators for the stabilizer groups seems to be ballooning: we started off with 2 generators, then expanded to 6 (but trimmed down to 4), then blew up to 20 after the second iteration.

Indeed, if we naively pick over all , then the number of generator increases times, while the order of the group decreases times as well. Thus, at worst, the number of generators is comparable to the order of the group, which is unmanageable.

Thankfully, we have a way to pare down the number of generators.

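The **Sims filter** achieves the following.

Task. Given a set $A \subseteq S_n$, there is an effective algorithm to replace $A$ by some $B \subseteq S_n$ satisfying $\langle A \rangle = \langle B \rangle$ and $|B| \le \frac{n(n-1)}{2}$.

Let us explain the filter now. For any non-identity permutation $g \in S_n$, let $J(g)$ be the pair (*i*, *j*) with $i < j$ such that $g(a) = a$ for all $a < i$ and $g(i) = j$.

Now we will construct a set *B* such that $\langle A \rangle = \langle B \rangle$ and the elements $b \in B$ all have distinct $J(b)$. It thus follows that $|B| \le \frac{n(n-1)}{2}$.

1. Label $A = \{g_1, g_2, \ldots, g_m\}$.
2. Prepare a table indexed by (*i*, *j*) for all $1 \le i < j \le n$. Initially this table is empty.
3. For each $g = g_r$, if $g$ is the identity we drop it.
4. Otherwise, consider $J(g) = (i, j)$. If the table entry at (*i*, *j*) is empty, we fill $g$ in. Otherwise, if the entry is $h$, we replace $g$ with $h^{-1} g$ and repeat step 3.
   - Note that this new group element takes *i* to *i* so if it is non-identity, we have $J(h^{-1} g) = (i', j')$ with $i' > i$.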

After the whole process, the entries in the table give us the new set *B*. Clearly, we have
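A minimal Python sketch of the filter is given below. It assumes permutations are tuples on {0, …, *n*-1}, and that an occupied table entry *h* is used to replace *g* by $h^{-1}g$ (one of the two natural choices). The brute-force `closure` helper is only there to check tiny examples; all names are illustrative.

```python
def compose(p, q):
    """Return the permutation 'q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def J(g):
    """Smallest moved point i together with its image j, or None for the identity."""
    for i, j in enumerate(g):
        if i != j:
            return (i, j)
    return None


def sims_filter(gens):
    table = {}
    for g in gens:
        g = tuple(g)
        while True:
            key = J(g)
            if key is None:             # identity: drop it
                break
            if key not in table:        # empty slot: store g here
                table[key] = g
                break
            # Occupied slot: h^{-1} g fixes i, so its J has a larger first index.
            g = compose(inverse(table[key]), g)
    return list(table.values())


def closure(gens, n):
    """All elements generated by gens (brute force, tiny groups only)."""
    identity = tuple(range(n))
    seen, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = compose(g, p)
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return seen
```

Feeding all 24 elements of $S_4$ into `sims_filter` leaves at most 6 permutations, which still generate the whole group.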

As above, let us take *A* = {*a*, *b*} with

First step: since *J*(*a*) = (1, 5) we fill the element *a* in the table at (1, 5).

Second step: we also have *J*(*b*) = (1, 5), so now we have to replace *b* with

Now we have *J*(*b’*) = (2, 6) so this is allowed.

In summary, we denote and

- For
*i*= 1, 2, …- Pick a point not picked earlier.
- Let be the stabilizer group for under the action of From , we use Schreier’s lemma to obtain a generating set for
- Use Sims filter to reduce this set to obtain of at most elements.
- If is empty, quit.

Thus are distinct such that each of the groups

has an explicit set of generators of at most elements. Here are some applications of having this data.

It suffices to compute for each *i*. Since is the stabilizer group for the action of on we need to compute the size of the orbit for each *i*. Since we have a generating set for each , this is easy.
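The order computation can be sketched as follows. For brevity, this version recurses directly on the generators given by Schreier’s lemma and only removes duplicates. It omits the Sims filter, so it is meant for small examples rather than something like the Rubik’s cube group. Permutations are assumed to be tuples on {0, …, *n*-1}, and the function names are illustrative.

```python
def compose(p, q):
    """Return the permutation 'q first, then p'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def group_order(gens):
    """Order of the group generated by gens, via orbit size times stabilizer order."""
    gens = [tuple(g) for g in gens]
    if not gens:
        return 1
    identity = tuple(range(len(gens[0])))
    gens = [g for g in gens if g != identity]
    if not gens:
        return 1
    # Base point: the smallest point moved by some generator.
    k = min(i for g in gens for i in range(len(g)) if g[i] != i)
    # Orbit of k, with a coset representative reps[j] mapping k to j.
    reps = {k: identity}
    queue = [k]
    for point in queue:
        for g in gens:
            image = g[point]
            if image not in reps:
                reps[image] = compose(g, reps[point])
                queue.append(image)
    # Schreier generators of the stabilizer of k.
    stabilizer_gens = set()
    for u in reps.values():
        for g in gens:
            gu = compose(g, u)
            stabilizer_gens.add(compose(inverse(reps[gu[k]]), gu))
    return len(reps) * group_order(stabilizer_gens)
```

For example, a 4-cycle together with a transposition of adjacent points gives order 24, while two overlapping 3-cycles on four points give order 12.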

To check if we first check whether lies in the orbit If it weren’t, Otherwise we pick some such that Replacing *g* with , we may thus assume that , and it follows that if and only if Thus we can solve the problem inductively.

Generalizing, we can determine if is a subgroup of , when *H* and *G* are given by explicit sets of generators.

Writing and , we claim that *H* is normal in *G* if and only if:

First we fix Since and , we see that But both groups are of the same *finite* cardinality so equality holds. Thus as well. It follows that for all

This is an enhancement of the Euler test. Be forewarned that it is in fact weaker than the Rabin-Miller test so it may not be of much practical interest. Nevertheless, it’s included here for completeness.

Recall that to perform the Euler test on an odd *n*, we pick a base *a* and check if $a^{(n-1)/2} \equiv \pm 1 \pmod n$.

To enhance this, we note that if *p* > 2 is prime, the value $a^{(p-1)/2}$ modulo *p* tells us whether *a* is a square modulo *p*. We had written extensively on this before. In short, the Legendre symbol satisfies:

Observe that the second, fourth and fifth properties follow easily from the definition of the Legendre symbol. The last property, known as the *quadratic reciprocity law*, is highly non-trivial and is the seed of an extremely deep area of algebraic number theory, called class field theory. But that’s a story for another day.

Now, if we had another way to compute the Legendre symbol and extend its definition to , we could compute and compare the two.

Let us extend the Legendre symbol to the Jacobi symbol.

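Definition. Let $a$ be any integer and $n$ be an odd positive integer. Write

$$n = p_1^{r_1} p_2^{r_2} \cdots p_d^{r_d}$$

for its prime factorization. The **Jacobi symbol** is defined via:

$$\left(\frac{a}{n}\right) := \left(\frac{a}{p_1}\right)^{r_1} \left(\frac{a}{p_2}\right)^{r_2} \cdots \left(\frac{a}{p_d}\right)^{r_d},$$

where each term $\left(\frac{a}{p_i}\right)$ is the Legendre symbol.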

The Jacobi symbol inherits most properties of the Legendre symbol.

**Exercise**

Prove the following properties of the Jacobi symbol.

**Proof**

We will only prove the last property as an example. First note that when *m* and *n* are odd primes, the result is just the quadratic reciprocity theorem. Since we have and it suffices to define, for positive odd *m* and *n*,

and prove and The two claims are identical, so let’s just show the first, which is equivalent to:

But this is obvious: the difference between the two sides is which is clearly even.

We thus have a computationally feasible way to compute the Jacobi symbol recursively for extremely large integers. E.g. to compute we repeatedly apply the above properties to obtain:

In Python code, we have:

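```python
def jacobi(m, n):
    """Jacobi symbol (m/n) for an odd positive integer n."""
    if m == 0:
        return 0
    if m == 1:
        return 1
    if m < 0:
        # (-1/n) = -1 exactly when n = 3 (mod 4)
        if n % 4 == 3:
            return -jacobi(-m, n)
        else:
            return +jacobi(-m, n)
    if m >= n:
        return jacobi(m % n, n)
    if m % 2 == 0:
        # (2/n) = -1 exactly when n = 3 or 5 (mod 8)
        if n % 8 == 3 or n % 8 == 5:
            return -jacobi(m // 2, n)   # integer division (Python 3)
        else:
            return +jacobi(m // 2, n)
    # Reciprocity for odd 1 < m < n
    if m % 4 == 3 and n % 4 == 3:
        return -jacobi(n, m)
    else:
        return jacobi(n, m)
```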

Now we can describe our main objective.

Solovay-Strassen Test. Given an odd $n$ and base $a$, compute the Jacobi symbol $\left(\frac{a}{n}\right)$. Now check that

$$a^{(n-1)/2} \equiv \left(\frac{a}{n}\right) \pmod n$$

and that both sides are non-zero.
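A self-contained sketch of the test in Python is shown below. It uses a standard iterative formulation of the Jacobi symbol rather than a recursive one, and the function names are chosen here for illustration.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n, computed iteratively."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:               # pull out factors of 2
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                     # reciprocity: swap the two arguments
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def solovay_strassen(n, a):
    """True if odd n > 2 passes the Solovay-Strassen test to base a."""
    j = jacobi(a, n)
    if j == 0:
        return False
    return pow(a, (n - 1) // 2, n) == j % n
```

With *n* = 10261 and *a* = 2, `pow(2, 5130, 10261)` is 1 while `jacobi(2, 10261)` is -1, so `solovay_strassen(10261, 2)` is `False`.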

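Let *n* = 10261 and *a* = 2. Immediately the Jacobi symbol gives $\left(\frac{2}{n}\right) = -1$, since $n \equiv 5 \pmod 8$. On the other hand, $2^{(n-1)/2} = 2^{5130} \equiv 1 \pmod n$. Thus even though *n* passes the Euler test for base 2, it fails the Solovay-Strassen test for the same base.

Notice that this also fails the Rabin-Miller test for the same base since

$$2^{5130} \equiv 1 \pmod n$$

but

$$2^{2565} \equiv 1985 \not\equiv \pm 1 \pmod n.$$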

In fact, the following is true in general.

Theorem. If odd $n$ passes the Rabin-Miller test to base $a$, then it also passes the Solovay-Strassen test to the same base.

For a proof of the result, see theorem 3 of this paper. This is why we said there’s not much practical application for this test.


The basic enhancement is as follows.

Lemma. If $p$ is prime and $m^2 \equiv 1 \pmod p$, then $m \equiv \pm 1 \pmod p$.

The lemma is quite easy to prove. By the given condition $m^2 - 1 = (m-1)(m+1)$ is a multiple of *p* and since *p* is prime either (*m* – 1) or (*m* + 1) is a multiple of *p* and the result follows.

In particular, if *p* is an odd prime and *a* is not divisible by *p*, then by Fermat’s theorem we have:

$$\left(a^{(p-1)/2}\right)^2 = a^{p-1} \equiv 1 \pmod p \implies a^{(p-1)/2} \equiv \pm 1 \pmod p.$$

In other words, to test if an odd *n* is prime, we pick a base *a* < *n* and check if $a^{(n-1)/2} \equiv \pm 1 \pmod n$. If it is not, then *n* is not prime. This is called the **Euler test** with **base** *a*.

Note that the Euler test is necessarily stronger than the Fermat test since if $a^{(n-1)/2} \equiv \pm 1 \pmod n$, then we must have $a^{n-1} \equiv 1 \pmod n$.

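**Example 1**

Consider *n* = 11305. We pick the smallest base *a* = 2. This gives us:

$$2^{5652} \equiv 10641 \not\equiv \pm 1 \pmod{11305}.$$

Notice that the Fermat test with the same base would give $2^{11304} \equiv 1 \pmod{11305}$. Hence a pseudoprime can be picked up by the Euler test.

**Example 2**

Suppose we pick *n* = 10585 as before. We obtain:

$$2^{5292} \equiv 1 \pmod{10585}, \qquad 3^{5292} \equiv 1 \pmod{10585},$$

so the Euler test is not more effective than the Fermat test in this case.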

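**Example 3**

Let’s pick the smallest Carmichael number *n* = 561. Base 2 is of no help here since $2^{280} \equiv 1 \pmod{561}$, but with base 5 we have:

$$5^{280} \equiv 67 \not\equiv \pm 1 \pmod{561},$$

so there are Carmichael numbers which succumb to the Euler test.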

In conclusion, the Euler test is strictly stronger than the Fermat test. Furthermore, it’s a little faster since we only need to compute $a^{(n-1)/2}$ instead of $a^{n-1}$.
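The Euler test is a one-liner on top of modular exponentiation. The sketch below uses Python’s built-in three-argument `pow`; the function name `euler_test` is chosen here for illustration.

```python
def euler_test(n, a):
    """True if odd n passes the Euler test to base a, i.e. a^((n-1)/2) = +-1 (mod n)."""
    r = pow(a, (n - 1) // 2, n)
    return r == 1 or r == n - 1
```

For instance, `euler_test(11305, 2)` is `False` although `pow(2, 11304, 11305)` equals 1, whereas `euler_test(10585, 2)` is `True`.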

As we saw above, even composite *n* can pass the Euler test for some bases.

Definition. If $n$ is odd composite and $a$ is such that $a^{(n-1)/2} \equiv \pm 1 \pmod n$, then we say $n$ is an **Euler pseudoprime** to base $a$.

Thus from our above example, 10585 is an Euler pseudoprime to bases 2 and 3.

Clearly, if *n* is an Euler pseudoprime to base *a*, then it is also a pseudoprime to base *a*. However, as example 1 above shows, there are pseudoprimes which are not Euler pseudoprimes (for a fixed given base). As example 3 shows, some Carmichael numbers can fail the Euler test, which spells good news for us.

Unfortunately, not all Carmichael numbers can be identified as composite by the Euler test. Specifically, there are odd composite *n* which are Euler pseudoprimes for all *a* coprime to *n*. Such numbers are known as **absolute Euler pseudoprimes**. It turns out Chernick’s construction for Carmichael numbers also gives us absolute Euler pseudoprimes.

**Exercise**

Suppose *k* is a positive integer such that 6*k* + 1, 12*k* + 1 and 18*k* + 1 are prime. Prove that *n* := (6*k* + 1)(12*k* + 1)(18*k* + 1) is an absolute Euler pseudoprime.

Thus, we are not quite out of the woods yet.

As we noted above, if *p* is prime then whenever $m^2 \equiv 1 \pmod p$ we have $m \equiv \pm 1 \pmod p$. This observation helps us to identify the pseudoprime *n* = 10585 in example 2.

Indeed, we have $2^{5292} \equiv 1 \pmod n$. However, since 5292 is even, we can go one step further and check if $2^{2646} \equiv \pm 1 \pmod n$. In our case, we obtain $2^{2646} \equiv 10294 \pmod n$ and thus we see that *n* is composite.

This is the gist of the Rabin-Miller test. The idea is that if $a^{e} \equiv 1 \pmod n$ and the exponent $e$ is even, then we check whether

$$a^{e/2} \equiv \pm 1 \pmod n.$$

If it is not, *n* must be composite.

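Rabin-Miller Test. Suppose $a$ is a given base and $n$ is odd. Let $2^k$ be the highest power of 2 dividing $n - 1$ and compute

$$m := a^{(n-1)/2^k} \bmod n.$$

Perform the following for $k$ iterations.

1. If $m \equiv \pm 1 \pmod n$, output PASS.
2. Let $m := m^2 \bmod n$.
3. If $m \equiv 1 \pmod n$, output FAIL.
4. Go back to step 1.

If the iterations finish without any output, *n* fails the test.

**Example**

If we set *n* = 10585 and *a* = 2 as above, then *k* = 3 and $(n-1)/2^k = 1323$. Thus we have

$$m = 2^{1323} \bmod n = 7958.$$

During the first iteration, we skip through step 1, set $m := 7958^2 \bmod n = 10294$. Thus we skip step 3 as well.

During the second iteration, we skip through step 1, set $m := 10294^2 \bmod n = 1$. Thus step 3 says *n* fails the test.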
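For experimentation, here is a short Python sketch of the test in its standard formulation (compute $a^d$ with $n - 1 = 2^k d$, then square repeatedly). The function names and the wrapper with random bases are illustrative choices made here, not a reference implementation.

```python
import random


def rabin_miller(n, a):
    """True if odd n > 2 passes the Rabin-Miller test to base a."""
    d, k = n - 1, 0
    while d % 2 == 0:                   # write n - 1 = 2^k * d with d odd
        d //= 2
        k += 1
    m = pow(a, d, n)
    if m == 1 or m == n - 1:
        return True
    for _ in range(k - 1):
        m = m * m % n
        if m == n - 1:                  # reached -1: consistent with n prime
            return True
        if m == 1:                      # non-trivial square root of 1 found
            return False
    return False


def is_probable_prime(n, trials=20):
    """Run the test with random bases; a False answer proves n composite."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    return all(rabin_miller(n, random.randrange(2, n - 1)) for _ in range(trials))
```

With these definitions, `rabin_miller(10585, 2)` returns `False`, matching the worked example: the sequence of values is 7958, 10294, 1.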

Clearly the Rabin-Miller test is stronger than the Euler test, but just how much stronger is it? There are indeed composite numbers which pass the Rabin-Miller test for specific bases. These are called **strong pseudoprimes**.

Suppose *n* = 8321 and *a* = 2. The highest power of 2 dividing *n*-1 is 2^{7}. Hence we have $2^{65} \equiv 8192 \pmod n$. The next iteration then gives $2^{130} \equiv -1 \pmod n$ so *n* passes the Rabin-Miller test to base 2.

On the other hand, one easily checks that *n* fails the Fermat test for base 3 so it is composite.

Thankfully, we do not have the case of Carmichael numbers here.

Theorem. If $n$ is odd composite, it will fail the Rabin-Miller test for at least 75% of the bases $a$ within the range $1 < a < n$.

Thus, for a given odd *n* we randomly pick $a$ for about 20 trials. If *n* passes all trials, we report that it is most likely prime. Otherwise, if it fails even a single trial, then it is composite. Heuristically, one imagines that the probability of *n* being composite and passing all trials to be about $4^{-20} \approx 10^{-12}$. In practice 75% is a gross underestimate since for most randomly chosen large *n*, the Rabin-Miller test will fail for >99% of the bases.

We have here a **probabilistic primality testing** method, in the sense that if *n* passes the test, it is overwhelmingly likely to be prime; on the other hand, any failure immediately implies *n* is composite.

Our confidence in the Rabin-Miller test is further enhanced by the following result.

Theorem (Miller). Assuming the Generalized Riemann Hypothesis (GRH), if $n$ passes the Rabin-Miller test for all bases $1 < a < 2(\log n)^2$, then it is prime.

Miller noted that the multiplicative group $(\mathbb{Z}/n\mathbb{Z})^*$ is generated by elements bounded by $2(\log n)^2$ and thus it suffices to check for all bases in that bound. The conjecture remains open to this day; if you can prove it, you stand to win a million dollars. We will not delve into this topic since the GRH is an extremely deep conjecture. However, we do note that there is overwhelming computational evidence in support of the conjecture and in practice it is reasonable to run the Rabin-Miller test within the above specified bound.

The Rabin-Miller test is a highly efficient and reliable, albeit probabilistic, primality test. In general, we would like our primality tests to satisfy the following conditions.

- **Deterministic**: it can tell with absolute certainty if *n* is prime.
- **Efficient**: its runtime is polynomial in the length of *n* (i.e. *O*((log *n*)^{d}) for some *d*).
- **Unconditional**: it does not assume any open conjecture.

If we run, say, 20 iterations of Rabin-Miller we obtain a probabilistic efficient test, i.e. the second and third conditions are satisfied but not the first. Assuming the GRH, we could test for all bases less than 2(log *n*)^{2} and obtain a primality test satisfying the first and second conditions but not the third.

In addition there’s a primality test by Adleman, Pomerance and Rumely which is based on Gauss and Jacobi sums in cyclotomic fields. This test is deterministic and unconditional, but its runtime complexity is slightly worse than polynomial time (in log *n*). Thus, it satisfies the first and third properties but not the second. We may describe this test in a later article.

For years, computer scientists wondered if there is a deterministic, polynomial-time and unconditional primality test. This was finally answered in the affirmative in 2002 by Agrawal, Kayal and Saxena who found a test which satisfied all three conditions. Unfortunately, despite being polynomial in the size of *n*, the algorithm can only be used on very modestly sized numbers.

The main problem we wish to discuss is as follows.

Question. Given n, how do we determine if it is prime?

Prime numbers have opened up huge avenues in theoretical research: the renowned Riemann Hypothesis, for example, is really a statement about the distribution of primes. On the other hand, it was only in the late 70s that people found applications for them in computer science. Specifically, the RSA asymmetric encryption system requires the software to quickly generate huge prime numbers of several hundred digits. Hence, primality testing is of huge practical importance as well; in fact, without fast primality testing algorithms, asymmetric encryption systems would be impractical.

[ Current state-of-the-art asymmetric encryption systems include RSA, Diffie-Hellman and elliptic curve cryptography, all of which rely heavily on the ability to generate huge primes quickly, among other things. ]

Curiously, primality testing turns out to be a much simpler problem than the task of factoring, where one needs to factor a given large integer. The current best algorithm for the latter task, general number field sieve, is extremely involved and requires an understanding of algebraic number theory. In contrast, most primality testing algorithms require only elementary number theory (except for the elliptic curve primality proving method, which I’ve yet to decide if I will cover).

One obvious way to test if a number is prime is **trial division**. Thus, given *n*, we iteratively test whether *n* is divisible by *k*, for all $2 \le k \le \sqrt{n}$.

Why $\sqrt{n}$? Because if *n* is divisible by some *k*, it is also divisible by *n*/*k*. Thus we may assume one of the factors to be at most $\sqrt{n}$. For example, given 103, we quickly find that it is prime since it is not divisible by 2, 3, …, 10.

In fact, if *n* is composite, then it is divisible by some *prime* number at most $\sqrt{n}$. However, iterating through prime numbers is computationally difficult so one usually doesn’t bother with that. On the other hand, this gives us a quick way to mentally test if *n* is prime for *n* < 400, since we only need to test for possible factors 2, 3, 5, 7, 11, 13, 17, 19.

Trial division is impractical for large numbers, but don’t knock it! When you need to code a quick function to test if a 32-bit number is prime, trial division is the way to go.
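As a rough illustration of the point above, here is a minimal Python sketch of trial division. The function name `is_prime_trial` is only illustrative; the loop simply tries every *k* from 2 up to the integer square root of *n*.

```python
from math import isqrt

def is_prime_trial(n: int) -> bool:
    """Trial division: n is prime iff no k with 2 <= k <= sqrt(n) divides it."""
    if n < 2:
        return False
    for k in range(2, isqrt(n) + 1):
        if n % k == 0:
            return False  # found a proper factor, so n is composite
    return True

print(is_prime_trial(103))  # the example from the text
```

For a 32-bit input the loop runs at most about 65536 times, which is why this is usually good enough at that size.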

The primary observation comes from the following number-theoretic result:

Fermat’s (Little) Theorem. If $p$ is prime and $a$ is an integer not divisible by $p$, then $a^{p-1} \equiv 1 \pmod p$.

Thus the contrapositive statement says: if *a* is an integer not divisible by *p* and $a^{p-1} \not\equiv 1 \pmod p$, then *p* is not prime. This is called the **Fermat test** with **base** *a*. As example 2 below shows, it has its flaws.

**Example 1**

Consider the integer *n* = 15943. Picking *a* = 2, we have:

$$2^{15942} \not\equiv 1 \pmod{15943}.$$

Hence *n* is composite.

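**Example 2**

Consider the integer *n* = 10585. The Fermat test then gives us

$$2^{10584} \equiv 1, \qquad 3^{10584} \equiv 1, \qquad 5^{10584} \not\equiv 1 \pmod{10585}.$$

Hence we see that *n* is composite. More importantly, this shows that the Fermat test can fail to expose a composite number for some bases.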

**Example 3**

Let *n* = 10891. We see that:

$$2^{10890} \equiv 3^{10890} \equiv 5^{10890} \equiv 1 \pmod{10891}.$$

Hence *n* *appears to be prime*. Indeed it is, as you can easily verify.

In summary, we have the following:

Fermat Test for n. For various positive bases $a < n$, we check if $a^{n-1} \equiv 1 \pmod n$. If this fails for any $a$, we know that $n$ is composite. Otherwise, $n$ is most probably prime.
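The test above is short to express in code. The following is a minimal sketch in Python, relying on the built-in three-argument `pow` for modular exponentiation; the function name `fermat_test` and the choice of bases are illustrative assumptions, not part of the original discussion.

```python
def fermat_test(n, bases):
    """Return False if some base exposes n as composite, True if n passes every base."""
    return all(pow(a, n - 1, n) == 1 for a in bases)

# The three integers from the examples above.
print(fermat_test(15943, [2]))        # base 2 already exposes it
print(fermat_test(10585, [2, 3]))     # passes these two bases despite being composite
print(fermat_test(10585, [2, 3, 5]))  # base 5 exposes it
print(fermat_test(10891, [2, 3, 5]))  # passes: "appears to be prime"
```

A return value of `True` only means that no tested base found a witness; it is not a proof of primality.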

**Exercise**

Explain why, in the Fermat test, if we wish to test for all bases < *A*, it suffices to test for all *prime* bases < *A*.

Example 2 above teaches us an important lesson: there are composite *n* and numbers *a*≠1 for which $a^{n-1} \equiv 1 \pmod n$. Hence we have:

Definition. If $n$ is composite and $a$ is such that $a^{n-1} \equiv 1 \pmod n$, then we say $n$ is a **pseudoprime** with base $a$.

For a fixed base, pseudoprimes are rather rare. For example, the list of pseudoprimes with base 2 can be obtained on https://oeis.org/A001567 and there are only 22 of them below 10000 and 5597 of them below $10^9$ (see http://oeis.org/A055550). Hence given a random huge composite number, chances are high that the first Fermat test will expose it. Put another way, given a huge random number that passes the first Fermat test, chances are high it’s prime.
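The count of 22 quoted above is easy to reproduce. Here is a small Python sketch that lists the composite numbers below a limit passing the Fermat test for a given base; the helper names `is_composite` and `pseudoprimes` are made up for this illustration.

```python
from math import isqrt

def is_composite(n):
    """True if n has a proper divisor (checked by trial division)."""
    return n > 3 and any(n % k == 0 for k in range(2, isqrt(n) + 1))

def pseudoprimes(base, limit):
    """Composite n < limit with base^(n-1) congruent to 1 mod n."""
    return [n for n in range(2, limit)
            if is_composite(n) and pow(base, n - 1, n) == 1]

psp2 = pseudoprimes(2, 10000)
print(len(psp2))   # 22
print(psp2[:3])    # [341, 561, 645]
```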

This seems to imply that if a large integer *n* passes (say) 30 Fermat tests, then it’s guaranteed to be prime.

Not really, for we have the following.

Definition. If $n$ is composite such that for all $a$ coprime to $n$ we have $a^{n-1} \equiv 1 \pmod n$, then we say $n$ is a **Carmichael number**.

The smallest Carmichael number is 561 = 3 × 11 × 17.

While such numbers are rare, there are infinitely many of them. [ *Fun fact: this conjecture remained open for a long time and was finally proven in 1994, by Alford, Granville and Pomerance.* ] Note, however, that among the list of Carmichael numbers, each term seems to have a small prime factor; for instance, 561 is divisible by 3. So maybe we can combine Fermat test with trial division (by small primes) in order to weed out such cases?

No such luck, however. Chernick discovered in 1939 that if *k* is a positive integer such that 6*k*+1, 12*k*+1 and 18*k*+1 are all prime, then

*n* = (6*k *+ 1)(12*k *+ 1)(18*k *+ 1)

is a Carmichael number. Now, it remains an open question whether there are infinitely many Carmichael numbers of this form, but in practice (from what we know of density of primes), it is quite easy to construct extremely large Carmichael numbers of this form. For instance, it takes only a few seconds for a Python script to give us the following 301-digit Carmichael number with three 100-digit prime factors:

3678977776333017616618572124346263612335718264220592681275 4323375966931049703503059541656515082721932203861403966236 1707445079748715571482964640305403211055421233151364221297 3982555078692571752850813633662021955795733617868645037712 1505333087445948159695018394505386803232557678106811464742 96899444481

Such a case would have no hope of getting picked up by a combination of Fermat test and trial division. But don’t give up yet, for the story is not over.
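To see Chernick's construction at a small scale, here is a hedged Python sketch. It searches small *k*, builds (6*k*+1)(12*k*+1)(18*k*+1) whenever all three factors are prime, and then checks the Carmichael property directly from the definition by brute force. The function names are illustrative, and the brute-force check is only feasible for small *n*; it is not how one would handle 100-digit factors.

```python
from math import gcd, isqrt

def is_prime(n):
    """Primality by trial division (fine for the small values used here)."""
    return n >= 2 and all(n % k for k in range(2, isqrt(n) + 1))

def chernick(k):
    """Return (6k+1)(12k+1)(18k+1) if all three factors are prime, else None."""
    factors = (6 * k + 1, 12 * k + 1, 18 * k + 1)
    if all(is_prime(p) for p in factors):
        return factors[0] * factors[1] * factors[2]
    return None

def is_carmichael(n):
    """Definition check: n composite and a^(n-1) = 1 (mod n) for every a coprime to n."""
    if n < 4 or is_prime(n):
        return False
    return all(pow(a, n - 1, n) == 1 for a in range(2, n) if gcd(a, n) == 1)

found = [c for c in map(chernick, range(1, 10)) if c is not None]
print(found)                                 # k = 1 and k = 6 qualify
print(all(is_carmichael(c) for c in found))
```

For *k* = 1 this gives 7 × 13 × 19 = 1729, and for *k* = 6 it gives 37 × 73 × 109 = 294409.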

**Exercises**

Prove Chernick’s result. [This is one of those results which are easy to prove, but hard to obtain.]

Extend Chernick’s result to obtain a Carmichael number of >400 digits which has four prime divisors. What about five?

We continue the previous discussion. Recall that for we have a -equivariant map

which induces an isomorphism between the unique copies of in both spaces. The kernel *Q* of this map is spanned by for various fillings *T* with shape and entries in [*n*].

E.g. suppose and ; then and the map induces:

with kernel *Q*. For the following filling *T*, we have the correspondence:

Lemma. If , then the above map factors through

E.g. in our example above, the map factors through .

**Proof**

Indeed if , then swapping columns and of gives us the same .

Now suppose and ; pick with length *a*. Now comprises *m* copies of *a* so by the above lemma, we have a map:

where and . Taking the direct sum over all *m* we have:

which is a homomorphism of -representations. Furthermore, for any vector space *V* has an algebra structure via The above map clearly preserves multiplication since multiplying and both correspond to concatenation of *T* and *T’*. So it is also a homomorphism of -algebras.

Question. A basis of is given by for Hence , the ring of polynomials in variables. What is the kernel of the induced map:

**Answer**

We have seen that for any and we have:

where we swap with various sets of *k* indices in while preserving the order to give and . Hence, *P* contains the ideal generated by all such quadratic relations.

On the other hand, any relation is a multiple of such a quadratic equation with a polynomial. This is clear by taking the two columns used in swapping; the remaining columns simply multiply the quadratic relation with a polynomial. Hence *P* is the ideal generated by these quadratic equations.

Since the quotient of by is a subring of , we have:

Corollary. The ideal generated by the above quadratic equations is prime.

Recall that takes , which is a left action; here , . We also let act on the right via:

so that becomes a -bimodule. A basic problem in *invariant theory* is to describe the ring consisting of all *f* such that for all .

Theorem. The ring is the image of: where runs over all .

In other words, we have:

**First Fundamental Theorem**: the ring of -invariants in is generated by

**Second Fundamental Theorem**: the relations satisfied by these polynomials are generated by the above quadratic relations.

Note that *g* takes to:

which is if . Hence we have . To prove equality, we show that their dimensions in degree *d* agree. By the previous article, the degree-*d* component of has a basis indexed by SSYT of type and entries in [*n*]; if *d* is not a multiple of *a*, the component is 0.

Next we check the degree-*d* component of . As -representations, we have

where acts on canonically. Taking the degree-*d* component, once again this component is 0 if *d* is not a multiple of *a*. If , it is the direct sum of over all . The of this submodule is where is the sequence Hence

Fix and sum over all ; we see that the number of copies of is the number of SSYT with shape and entries in [*n*]. The key observation is that each is an -irrep.

- Indeed, acts as a constant scalar on the whole of since is homogeneous. Hence any -invariant subspace of is also -invariant.

Hence is either the whole space or 0. From the proposition here, it is the whole space if and only if with *a* terms (which corresponds to ). Hence, the required dimension is the number of SSYT with shape and entries in [*n*].

We will describe another construction for the Schur module.

Introduce variables for . For each sequence we define the following polynomials in :

Now given a filling *T* of shape λ, we define:

where is the sequence of entries from the *i*-th column of *T*. E.g.

Let be the ring of polynomials in with complex coefficients. Since we usually take entries of *T* from [*n*], we only need to consider the subring .

Let Recall from earlier that any non-zero -equivariant map

must induce an isomorphism between the unique copies of in the source and target spaces. Given any filling *T* of shape , we let be the element of obtained by replacing each entry *k* in *T* by , then taking the wedge of elements in each column, followed by the tensor product across columns:

Note that the image of in is precisely as defined in the last article.

Definition. We take the map where belongs to component

E.g. in our example above, is homogeneous in of degree 5, of degree 4 and of degree 3. We let act on via:

Thus if we fix *i* and consider the variables as a row vector, then . From another point of view, if we take as a basis, then the action is represented by matrix *g* since it takes the standard basis to the column vectors of *g*.

Proposition. The map is -equivariant.

**Proof**

The element takes by taking the column vectors of *g*; so

where *T’* is the filling obtained from *T* by replacing its entries with correspondingly.

On the other hand, the determinant gets mapped to:

which is .

Since contains exactly one copy of , it has a unique -submodule *Q* such that the quotient is isomorphic to The resulting quotient is thus identical to the Schur module *F*(*V*), and the above map factors through

Now we can apply results from the last article:

Corollary 1. The polynomials satisfy the following:

- if T has two identical entries in the same column.
- if T’ is obtained from T by swapping two entries in the same column.
- , where S takes the set of all fillings obtained from T by swapping a fixed set of k entries in column j’ with arbitrary sets of k entries in column j (for fixed j < j’) while preserving the order.

**Proof**

Indeed, the above hold when we replace by Now apply the above linear map.

Corollary 2. The set of , for all SSYT with shape λ and entries in [n], is linearly independent over

**Proof**

Indeed, the set of these is linearly independent over and the above map is injective.

Consider any bijective filling *T* for . Writing out the third relation in corollary 1 gives:

More generally, if satisfies and , the corresponding third relation is obtained by multiplying the above by a polynomial on both sides.

Take the SYT by writing in the left column and in the right. Now is the product:

In the sum , each summand is of the form , where matrices *M’*, *N’* are obtained from *M*, *N* respectively by swapping a fixed set of *k* columns in *N* with arbitrary sets of *k* columns in *M* while preserving the column order. E.g. for *n*=3 and *k*=2, picking the first two columns of *N* gives:
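The determinant identity behind this relation (often called Sylvester's lemma) says that det(*M*)·det(*N*) equals the sum of det(*M’*)·det(*N’*) over all the swaps described above. It can be sanity-checked numerically. The following Python sketch does so for random integer matrices; storing each matrix as a list of columns and the helper names `det` and `swapped_sum` are assumptions made for this illustration.

```python
from itertools import combinations, permutations
import random

def det(cols):
    """Determinant of a square matrix given as a list of columns (Leibniz formula)."""
    n = len(cols)
    total = 0
    for perm in permutations(range(n)):
        # Sign of the permutation from its inversion count.
        inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = -1 if inv % 2 else 1
        for col, row in enumerate(perm):
            term *= cols[col][row]
        total += term
    return total

def swapped_sum(M, N, B):
    """Sum of det(M') * det(N') over all order-preserving swaps of the
    columns B of N with |B| columns of M."""
    total = 0
    for A in combinations(range(len(M)), len(B)):
        M2, N2 = list(M), list(N)
        for a, b in zip(A, B):  # smallest index of A pairs with smallest of B, etc.
            M2[a], N2[b] = N[b], M[a]
        total += det(M2) * det(N2)
    return total

random.seed(0)
n = 3
M = [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]
N = [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]
# n = 3, k = 2, swapping the first two columns of N.
print(det(M) * det(N) == swapped_sum(M, N, (0, 1)))
```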

For a partition , one takes its Young diagram consisting of boxes. A *filling* is given by a function for some positive integer *m*. When *m*=*d*, we will require the filling to be bijective, i.e. *T* contains {1,…,*d*} and each element occurs exactly once.

If and is a filling, then is obtained by replacing each *i* in the filling with *w*(*i*). For a filling *T*, the corresponding row (*resp*. column) tabloid is denoted by {*T*} (*resp*. [*T*]).

Recall from an earlier discussion that we can express the -irrep as a quotient of from the surjection:

Here is any fixed bijective filling .

Concretely, a **C**-basis for is given by column tabloids [T] and the quotient is given by relations: where *T’* runs through all column tabloids obtained from *T* as follows:

- fix columns *j* < *j’* and a set *B* of *k* boxes in column *j’* of *T*; then *T’* is obtained by switching *B* with a set of *k* boxes in column *j* of *T*, while preserving the order. E.g.

From the previous article we have , where is the quotient of the space of column tabloids described above. We let be the set of all functions , i.e. the set of all fillings of λ with elements of *V*. We define the map:

for any bijective filling This is independent of the *T* we pick; indeed if we replace *T* by for , the resulting RHS would be:

where the first equality holds since the outer tensor product is over and the second equality follows from our definition . Hence is well-defined. It satisfies the following three properties.

Property C1. is multilinear in each component V.

In other words, if we fix and consider as a function on *V* in component *s* of , then the resulting map is **C**-linear. E.g. if , then:

This is clear.

Property C2. Suppose are identical except and , where are in the same column. Then

**Proof**

Let be the transposition swapping *s* and *t*. Then by the alternating property of the column tabloid and . Thus:

Finally, we have:

Property C3. Let Fix two columns in the Young diagram for λ, and a set B of k boxes in column j’. As A runs through all sets of k boxes in column j, let be obtained by swapping entries in A with entries in B while preserving the order. Then:

E.g. for any we have:

**Proof**

Fix a bijective filling Then:

where swaps the entries in *A* with those in *B* while preserving the order (note that ). But the sum of all such vanishes in Hence

Definition. Let V, W be complex vector spaces. A map is said to be **λ-alternating** if properties C1, C2 and C3 hold.

The **universal λ-alternating space** (or the **Schur module**) for V is a pair where

- is a complex vector space;
- is a λ-alternating map,

satisfying the following universal property: for any λ-alternating map to a complex vector space W, there is a unique linear map such that

*F*(*V*) is not hard to construct: the universal space which satisfies C1 and C2 is the alternating space:

So the desired *F*(*V*) is obtained by taking the quotient of this space with all relations obtained by swapping a fixed set *B* of coordinates in with a set *A* of coordinates in , and letting *A* vary over all |*A*| = |*B*|. E.g. the relation corresponding to our above example for C3 is:

over all

By universality, the λ-alternating map thus induces a linear map:

You can probably guess what’s coming next.

Main Theorem. The above is an isomorphism.

First observe that is surjective by the explicit construction of *F*(*V*) so it remains to show injectivity via dim(LHS) ≤ dim(RHS).

Now , and we saw earlier that its dimension is the number of SSYT with shape λ and entries in [*n*].

On the other hand, let be the standard basis of If *T* is any *filling* with shape λ and entries in [*n*], we let be the element of *F*(*V*) obtained by replacing each *i* in *T* by ; then running through the map

Claim. The set of generates , where T runs through all SSYT with shape λ and entries in [n].

**Proof**

Note that the set of , as *T* runs through all *fillings* with shape λ and entries in [*n*], generates *F*(*V*).

Let us order the set of all fillings as follows: *T’* > *T* if, in the rightmost column *j* where *T’* and *T* differ, at the lowest box in which their entries differ, the entry of *T’* is larger than that of *T*.

This gives a total ordering on the set of fillings. We claim that if *T* is a filling which is not an SSYT, then is a linear combination of for *S* > *T*.

- If two entries in a column of *T* are equal, then by definition.

- If a column *j* and row *i* of *T* satisfy , assume *j* is the rightmost column for which this happens, and in that column, *i* is as large as possible. Swapping entries and of *T* gives us *T’* > *T* and

- Now suppose all the columns are strictly ascending. Assume we have , where *j* is the largest for which this happens, and , for . Swapping the topmost *i* entries of column *j*+1, with various *i* entries of column *j*, all the resulting fillings are strictly greater than *T*. Hence , where each *S* > *T*.

Thus, if *T* is not an SSYT we can replace with a linear combination of where *S* > *T*. Since there are finitely many fillings *T* (with entries in [*n*]), this process must eventually terminate so each can be written as a linear sum of for SSYT *S*.

Thus ≤ number of SSYT with shape λ and entries in [*n*], and the proof for the main theorem is complete. From our proof, we have also obtained:

Lemma. The set of forms a basis for F(V), where T runs through the set of all SSYT with shape λ and entries in [n].
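Since the lemma indexes a basis of *F*(*V*) by SSYT, the dimension can be found by plain enumeration. Below is a minimal Python sketch that generates all SSYT of a given shape with entries in [*n*] (rows weakly increasing, columns strictly increasing). The function name `ssyt` is illustrative; as a sanity check, the one-row and one-column shapes should give the familiar dimensions of the symmetric and exterior powers.

```python
from math import comb

def ssyt(shape, n):
    """Yield every SSYT of the given shape (a partition) with entries in 1..n."""
    cells = [(r, c) for r, length in enumerate(shape) for c in range(length)]
    filling = {}

    def fill(idx):
        if idx == len(cells):
            yield [[filling[(r, c)] for c in range(length)]
                   for r, length in enumerate(shape)]
            return
        r, c = cells[idx]
        lo = 1
        if c > 0:
            lo = max(lo, filling[(r, c - 1)])      # weakly increasing along rows
        if r > 0:
            lo = max(lo, filling[(r - 1, c)] + 1)  # strictly increasing down columns
        for v in range(lo, n + 1):
            filling[(r, c)] = v
            yield from fill(idx + 1)

    yield from fill(0)

print(list(ssyt((2, 1), 2)))               # [[[1, 1], [2]], [[1, 2], [2]]]
print(sum(1 for _ in ssyt((3,), 4)))       # dimension of the symmetric cube for n = 4
print(sum(1 for _ in ssyt((1, 1, 1), 4)))  # dimension of the exterior cube for n = 4
```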

Again, we will denote throughout this article. In the previous article, we saw that the Schur-Weyl duality can be described as a functor:

- given a -module *M*, the corresponding -module is set as

Definition. The construction is functorial in and is called the **Schur functor** when M is fixed.

Here, functoriality means that any linear map induces a linear .

For example, when , the functor is the identity functor. By Schur-Weyl duality, when *M* is irreducible as an -module, the resulting is either 0 or irreducible. We will see the Schur functor cropping up in two other instances.

Following the reasoning as in -modules, we have for partitions and ,

Since the only common irrep between the two representations is , any non-zero *G*-equivariant must induce an isomorphism between those two components. We proceed to construct such a map.

For illustration, take and pick the following filling:

To construct the map, we will take and as subspaces of Thus:

Let us map according to the above filling, i.e. goes into components 1, 4, 2 of while goes into component 3. Similarly, we map by mapping components 1, 3 to , components 4 and 2 to the other two copies of *V*. In diagram, we have:

This construction is clearly functorial in *V*. Hence, if is a linear map of vector spaces, then this induces a linear map

Another means of defining the Schur functor is by the Young symmetrizer. Here we shall let act on on the left and act on it on the right via:

Now given any (left) -module *M*, consider:

a left -module. We shall prove that corresponds to the Schur-Weyl duality, i.e. Once again, by additivity, we only need to consider the case . This gives where *T* is any filling of shape λ and thus:

From here, it is clear that and so is yet another expression of the Schur functor.

Recall that the irreducible -module can be written as where is the Young symmetrizer for a fixed filling of shape λ. Hence, the irrep can be written as:

For *d*=3, and , let us take the Young symmetrizer:

If is the standard basis for , then is spanned by elements of the form:

These satisfy the following:

By the first relation, we only include those with . By the second relation, we may further restrict to the case since if we have and if we replace We claim that the resulting spanning set forms a basis. Indeed the number of such triplets (*i*, *j*, *k*) is:

On the other hand, we know that has one copy of , one copy of and two copies of so

Thus is the cardinality of the set and we are done.

**Note**

Observe that the set corresponds to the set of all SSYT with shape (2, 1) and entries in [n] (by writing *i*, *j* in the first row and *k* below *i*). This is an example of our earlier claim that a basis of can be indexed by SSYT’s with shape and entries in [*n*]. For that, we will explore as a quotient module of in the next article. This corresponds to an earlier article, which expressed -irrep as a quotient of .
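The dimension count above can be double-checked by brute force. The sketch below assumes, as the SSYT description in the note suggests, that the surviving triplets are those with *i* ≤ *j* and *i* < *k*. It also assumes the decomposition of the triple tensor power into one copy each of the symmetric and exterior cubes plus two copies of the irrep in question, so that the expected dimension is (*n*³ − C(*n*+2, 3) − C(*n*, 3))/2. Function names are illustrative.

```python
from math import comb

def count_triples(n):
    """Number of (i, j, k) in [n]^3 with i <= j and i < k, i.e. SSYT of shape (2, 1)."""
    return sum(1
               for i in range(1, n + 1)
               for j in range(i, n + 1)
               for k in range(i + 1, n + 1))

def expected_dim(n):
    """Dimension from n^3 = dim Sym^3 + dim Alt^3 + 2 * dim(irrep for (2, 1))."""
    return (n ** 3 - comb(n + 2, 3) - comb(n, 3)) // 2

for n in range(1, 7):
    print(n, count_triples(n), expected_dim(n))
```

The two columns agree for every *n* tried, which is the equality of cardinality and dimension used in the argument.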