Polynomials and Representations XIII

Skew Diagrams

If we multiply two elementary symmetric polynomials e_\lambda and e_\mu, the result is just e_\nu, where \nu is the concatenation of \lambda and \mu, sorted in decreasing order. The same holds for h_\lambda h_\mu. However, there is no easy way to express s_\lambda s_\mu in terms of the s_\nu, which is unfortunate since the Schur functions are the “preferred” basis, being orthonormal. Hence, we define the following.

Definition. A skew Young diagram is a diagram of the form \lambda/\mu, where \lambda and \mu are partitions with \lambda_i \ge \mu_i for each i; it consists of the boxes of \lambda which are not in \mu.

E.g. if \lambda = (7, 5, 3) and \mu = (5, 2, 1), then

[Figure: skew_young_diagram_example]

Note that the same skew Young diagram can also be represented by \lambda'/\mu' where \lambda' = (8, 6, 4, 1) and \mu' = (6, 3, 2, 1). These two Young diagrams are considered identical.

Definition. A skew semistandard Young tableau (skew SSYT) is a labelling of the skew Young diagram with positive integers such that each row is weakly increasing and each column is strictly increasing. Here \lambda/\mu is called the shape of the tableau, and its type is the vector \alpha where \alpha_i is the number of times i appears.

E.g. the following is a skew SSYT of the above shape. Its type is (4, 1, 1, 1).

[Figure: skew_ssyt_example]


Skew Schur Polynomials

Definition. The skew Schur polynomial corresponding to \lambda/\mu is given by:

\displaystyle s_{\lambda/\mu} := \sum_{\text{shape}(T) = \lambda/\mu} x^T

where x^T is \prod_i x_i^{\text{\# of } i \text{ in } T}. E.g. the above diagram gives x^T = x_1^4 x_2 x_3 x_4.

The proof for the following is identical to the case of Schur polynomials.

Lemma. The skew Schur polynomial s_{\lambda/\mu} is symmetric.

Indeed, one checks easily that the Bender–Knuth involution works just as well for skew Young tableaux.

[Figure: ssyt_bender-knuth_v2]

So the number of skew SSYT of shape \lambda/\mu and type \nu is unchanged when we swap \nu_i and \nu_{i+1}.

Example

For \lambda = (3, 2) and \mu = (1), we have:

[Figure: skew_schur_polynomial_example]
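
The expansion is easy to reproduce by computer. Below is a minimal brute-force sketch in Python (mine, not from the original post) that enumerates skew SSYTs with bounded entries; restricted to two variables it gives s_{(3,2)/(1)} = x_1^3 x_2 + 2x_1^2 x_2^2 + x_1 x_2^3, consistent with the expansion s_{(3,2)/(1)} = s_{(3,1)} + s_{(2,2)}.

```python
from collections import Counter

def skew_ssyt(lam, mu, max_entry):
    """Enumerate skew SSYT of shape lam/mu with entries in {1, ..., max_entry};
    each tableau is a dict mapping a cell (row, col) to its entry."""
    cells = [(i, j) for i in range(len(lam))
             for j in range((mu[i] if i < len(mu) else 0), lam[i])]
    def fill(k, tab):
        if k == len(cells):
            yield dict(tab)
            return
        i, j = cells[k]
        for v in range(1, max_entry + 1):
            if (i, j - 1) in tab and tab[(i, j - 1)] > v:    # rows weakly increase
                continue
            if (i - 1, j) in tab and tab[(i - 1, j)] >= v:   # columns strictly increase
                continue
            tab[(i, j)] = v
            yield from fill(k + 1, tab)
            del tab[(i, j)]
    yield from fill(0, {})

# Expand s_{(3,2)/(1)} in two variables.
expansion = Counter()
for T in skew_ssyt((3, 2), (1,), max_entry=2):
    c = Counter(T.values())
    expansion[(c[1], c[2])] += 1        # exponent vector of x^T
print(dict(expansion))                  # {(3, 1): 1, (2, 2): 2, (1, 3): 1}
```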

The following result explains our interest in studying skew Schur polynomials.

Lemma. The product of two skew Schur polynomials is a skew Schur polynomial.

For example, we have:

[Figure: skew_schur_polynomial_product]

It remains to express s_{\lambda/\mu} as a linear combination of s_\nu, where |\nu| = |\lambda| - |\mu|.


Littlewood-Richardson Coefficients

Recall that we have Pieri’s formula h_r s_\lambda = \sum_\nu s_\nu, where \nu is summed across all diagrams obtained by adding r boxes to \lambda, such that no two are on the same column. Repeatedly applying this gives us:

\displaystyle h_\mu s_\lambda = \sum_{\nu_1} \sum_{\nu_2} \ldots \sum_{\nu_k} s_{\nu_k}

where k = l(\mu), \nu_0 := \lambda, and each \nu_{i+1} is obtained from \nu_i by adding \mu_{i+1} boxes such that no two lie in the same column. Hence, the coefficient of s_\nu in h_\mu s_\lambda is the number of such chains with \nu_k = \nu, i.e. the number of skew SSYTs with shape \nu/\lambda and type \mu (label the boxes added in step i by i).

Example

If \mu = (4, 2, 1) and \lambda = (5, 4, 3), here is one way of appending 4, 2, 1 boxes in succession:

[Figure: skew_diagram_forming]

which corresponds to the following skew SSYT:

[Figure: skew_diagram_forming_2]

This gives us the tool to prove the following.

Theorem. For any f\in \Lambda^{(d)} with d = |\lambda| - |\mu|, we have:

\displaystyle \left< s_{\lambda/\mu}, f\right> = \left< s_\lambda, f s_\mu\right>

so the linear map s_\lambda \mapsto s_{\lambda/\mu} is left adjoint to multiplication by s_\mu.

Proof

It suffices to prove this for all f = h_\nu where \nu\vdash d. Since \{h_\lambda\} is the basis dual to \{m_\lambda\}, the LHS is the coefficient of m_\nu in expressing s_{\lambda/\mu} in terms of monomial symmetric polynomials. By definition of s_{\lambda/\mu}, this is equal to the number of skew SSYTs with shape \lambda/\mu and type \nu; we will denote this by K_{\lambda/\mu, \nu}, the skew Kostka coefficient.

By the reasoning above, when h_\nu s_\mu is expressed as a linear combination of Schur functions, the coefficient of s_\lambda is also K_{\lambda/\mu, \nu}. Since the Schur functions are orthonormal, we are done. ♦

Note

The theorem is still true even when \lambda_i \ge \mu_i fails for some i, provided we set s_{\lambda/\mu} = 0. Indeed, by reasoning with Pieri’s rule again, every Schur polynomial s_\lambda occurring in h_\nu s_\mu must have \lambda_i\ge \mu_i for all i.

Definition. When we express:

\displaystyle s_\mu s_\nu = \sum_{\lambda} c_{\mu\nu}^\lambda s_\lambda,

the values c_{\mu\nu}^\lambda = \left< s_\mu s_\nu, s_\lambda\right> are called the Littlewood-Richardson coefficients.

By the above theorem, this equals \left< s_{\lambda/\mu}, s_\nu\right>. One can calculate this using the Littlewood-Richardson rule, which we will cover later.


Polynomials and Representations XII

Lindström–Gessel–Viennot Lemma

Let us switch gears and describe a beautiful combinatorial result. Suppose G is a directed graph with no cycles, such that there are only finitely many paths from any vertex to any other. Given sets of n vertices:

A =\{a_1, \ldots, a_n\}, \qquad B = \{b_1, \ldots, b_n\}

the lemma computes the “weighted sum” of non-intersecting paths from A to B.

To be specific, suppose each edge e of the graph has a weight w_e which takes values in a fixed commutative ring. Given a directed path P : a\to b from a to b, let w(P) be the product of the weights of the edges along P. Now write e(a, b) for the sum of w(P) over all paths P:a\to b.

[Figure: lgv_lemma_path_sum]

Next given a_1, \ldots, a_n, b_1, \ldots, b_n above, we compute the determinant of:

M = \begin{pmatrix} e(a_1, b_1) & e(a_1, b_2) & \cdots & e(a_1, b_n) \\e(a_2, b_1) & e(a_2, b_2) & \cdots & e(a_2, b_n) \\\vdots & \vdots & \ddots & \vdots \\e(a_n, b_1) & e(a_n, b_2) & \cdots & e(a_n, b_n)\end{pmatrix}

On the other hand, consider a collection of paths P = (P_1, \ldots, P_n) where each P_i : a_i \to b_{\sigma(i)} for some permutation \sigma = \sigma(P) \in S_n. Furthermore we want the paths to be pairwise non-intersecting, i.e. no two paths contain a common vertex (not even at the endpoints). Now we can state the desired result.

Lindström–Gessel–Viennot (LGV) Lemma. We have:

\displaystyle\det(M) = \sum_{P = (P_1, \ldots, P_n)} \text{sgn}(\sigma(P)) \prod_{i=1}^n w(P_i)

over all tuples of paths (P_1, \ldots, P_n) described above.

[Figure: lgv_lemma_path_sum_2]


Proof of the LGV Lemma

First expand the determinant of M; each term is of the form:

\text{sgn}(\sigma) \times e(a_1, b_{\sigma(1)}) \times e(a_2, b_{\sigma(2)}) \times \ldots \times e(a_n, b_{\sigma(n)}), \qquad \sigma \in S_n.

Expanding each e(a_i, b_{\sigma(i)}) = \sum_{P_i : a_i \to b_{\sigma(i)}} w(P_i), the above term becomes

\displaystyle\sum_{P_1, \ldots, P_n}\text{sgn}(\sigma) w(P_1) w(P_2) \ldots w(P_n),

summed over all n-tuples (P_i : a_i \to b_{\sigma(i)})_{i=1}^n. This is close to the desired result, except that here the P_i may intersect. Hence, we wish to show that the sum over all (P_1, \ldots, P_n) for which some P_i and P_j intersect vanishes.

Let \Sigma = \{P = (P_1, \ldots, P_n)\} be the collection of all such paths; for each P\in \Sigma, we:

  • let v be the first vertex (with respect to a fixed total ordering of the vertices) occurring as an intersection of some P_i and P_j (i\ne j);
  • let i be the smallest index such that P_i contains v;
  • let j be the largest index such that P_j contains v (note that j\ne i);

and get a triplet (i, v, j). Now let P' be the same tuple of paths but with the tails of P_i and P_j switched after v.

[Figure: lgv_proof_involution]

This gives a map f : \Sigma \to \Sigma, which is an involution since the corresponding triplet for f(P) is still (i, v, j). However \sigma(f(P)) and \sigma(P) have opposite signs. Denoting w(P_1) \ldots w(P_n) by T(P) we have T(f(P)) = T(P) and so:

\displaystyle\sum_{P\in \Sigma} \text{sgn}(\sigma(P)) T(P) = -\sum_{P\in \Sigma} \text{sgn}(\sigma(f(P))) T(f(P)) = -\sum_{P'\in \Sigma} \text{sgn}(\sigma(P')) T(P').

So the sum is zero. ♦
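
To make the statement concrete, here is a small self-contained Python check (a sketch of mine, with all edge weights set to 1 so that e(a,b) simply counts paths): we compute the 2×2 determinant on a tiny south-east lattice and compare it with a direct enumeration of vertex-disjoint path pairs. In this configuration the a's and b's interleave so that only \sigma = id can contribute, and every surviving summand has sign +1.

```python
def neighbours(v):
    """Unit edges pointing east or south on the 3x3 grid of lattice points."""
    x, y = v
    out = []
    if x < 2:
        out.append((x + 1, y))
    if y > 0:
        out.append((x, y - 1))
    return out

def paths(a, b):
    """All directed paths from a to b, each as a tuple of vertices."""
    if a == b:
        return [(a,)]
    return [(a,) + p for w in neighbours(a) for p in paths(w, b)]

a1, a2 = (0, 2), (1, 2)
b1, b2 = (1, 0), (2, 0)

e = lambda a, b: len(paths(a, b))   # all weights are 1, so e(a,b) counts paths
det = e(a1, b1) * e(a2, b2) - e(a1, b2) * e(a2, b1)

# Any path a1 -> b2 must cross the vertical line through a2 and b1, so only
# sigma = identity admits vertex-disjoint paths, with sign +1 throughout.
disjoint = sum(1 for p in paths(a1, b1) for q in paths(a2, b2)
               if not set(p) & set(q))
print(det, disjoint)   # both are 3
```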


Jacobi-Trudi Identities

Here is our first application of the LGV lemma; all computations are in the formal ring \Lambda.

Theorem. Let \mu = \overline \lambda be the transpose of \lambda. Then s_\lambda is the determinant of either of these matrices.

\left( h_{\lambda_i + j - i} \right)_{1 \le i, j\le d} =\begin{pmatrix} h_{\lambda_1} & h_{\lambda_1 + 1} & \ldots & h_{\lambda_1 + d - 1} \\ h_{\lambda_2 - 1} & h_{\lambda_2} & \ldots & h_{\lambda_2 + d - 2} \\ \vdots & \vdots & \ddots & \vdots \\ h_{\lambda_d - d + 1} & h_{\lambda_d - d + 2} & \ldots & h_{\lambda_d}\end{pmatrix}

\left( e_{\mu_i + j - i} \right)_{1 \le i, j\le d} =\begin{pmatrix} e_{\mu_1} & e_{\mu_1 + 1} & \ldots & e_{\mu_1 + d - 1} \\ e_{\mu_2 - 1} & e_{\mu_2} & \ldots & e_{\mu_2 + d - 2} \\ \vdots & \vdots & \ddots & \vdots \\ e_{\mu_d - d + 1} & e_{\mu_d - d + 2} & \ldots & e_{\mu_d}\end{pmatrix}

where d is chosen to be \ge l(\lambda) and \ge l(\mu) in the respective cases.

Note

One remembers the structure of the matrices by writing the parts of \lambda as the subscripts of the diagonal entries, then incrementing to the right. We take h_i = e_i = 0 for all i <0. Observe that since h_0 = e_0 = 1, there is no harm in increasing d by appending zeros to the partition.
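
As a sanity check on the first identity, the following Python sketch (helper names are mine) evaluates the Jacobi-Trudi determinant for \lambda = (3,1) at a sample integer point and compares it against s_\lambda computed directly as a sum over SSYTs with entries \le n.

```python
from itertools import combinations_with_replacement, permutations

x = [2, 3, 5]          # a sample point (x1, x2, x3), so n = 3
lam = [3, 1]           # the partition lambda
n, d = len(x), len(lam)

def h(k):
    """Complete symmetric polynomial h_k evaluated at the point x."""
    if k < 0:
        return 0
    total = 0
    for comb in combinations_with_replacement(range(n), k):
        term = 1
        for i in comb:
            term *= x[i]
        total += term
    return total

def det(m):
    """Determinant by permutation expansion (fine for small matrices)."""
    size, total = len(m), 0
    for p in permutations(range(size)):
        sgn = (-1) ** sum(p[a] > p[b] for a in range(size) for b in range(a + 1, size))
        term = 1
        for a in range(size):
            term *= m[a][p[a]]
        total += sgn * term
    return total

# The Jacobi-Trudi determinant det(h_{lambda_i + j - i}).
jt = det([[h(lam[i] + j - i) for j in range(d)] for i in range(d)])

def schur(shape):
    """Sum of x^T over all SSYT of the given shape with entries <= n."""
    cells = [(i, j) for i in range(len(shape)) for j in range(shape[i])]
    def fill(k, tab):
        if k == len(cells):
            term = 1
            for v in tab.values():
                term *= x[v - 1]
            return term
        i, j = cells[k]
        lo = max(tab.get((i, j - 1), 1),          # rows weakly increase
                 tab.get((i - 1, j), 0) + 1)      # columns strictly increase
        total = 0
        for v in range(lo, n + 1):
            tab[(i, j)] = v
            total += fill(k + 1, tab)
            del tab[(i, j)]
        return total
    return fill(0, {})

print(jt, schur(lam))   # the two values agree
```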

Proof

We can obtain the second identity from the first by applying \omega, so let us show the first. Before we begin, observe that the Schur polynomial can be written as:

\displaystyle s_\lambda = \sum_{\text{SSYT } T} x^T,

where x^T = \prod_{i\ge 1} x_i^{\text{\# of } i \text{ in } T}. Indeed, this follows from the proposition here.

We will use the LGV lemma: let N be large; we take the following graph.

  • The vertices are lattice points in \mathbb{Z}^2.
  • The starting and ending vertices are given by:

\begin{aligned}a_d &= (0, N), a_{d-1} = (1, N), \ldots, a_1 = (d-1, N), \\ b_d &= (\lambda_d, 1), b_{d-1} = (\lambda_{d-1} + 1, 1), \ldots, b_1 = (\lambda_1 + d-1, 1).\end{aligned}

  • The edges are of unit length, directed either eastwards or southwards. The southward edges are weighted 1, while the eastward edges on the line y=i are weighted x_i for i\ge 1.

For example if \lambda = (4, 1, 1), here is an example of paths connecting a\to b.

[Figure: jacobi-trudi_path]

Let us compute the LHS of the LGV lemma. The set of paths from a_i = (d-i, N) to b_j = (\lambda_j+ d-j, 1) sums up to the complete symmetric polynomial h_{\lambda_j + i - j} (in the variables x_1, \ldots, x_N). Thus the LHS is the desired determinant expression.

On the other hand, let us consider collections of non-intersecting paths from a_1, \ldots, a_d to b_{\sigma(1)}, \ldots, b_{\sigma(d)}. For such a collection to exist, \sigma must be the identity. Each collection of non-intersecting paths corresponds to a reverse SSYT of shape \lambda (i.e. the rows are weakly decreasing and the columns are strictly decreasing). E.g. the above diagram gives:

[Figure: jacobi-trudi_path_corr]

Hence, the RHS in the LGV lemma equals the sum of x^T over all reverse SSYT T of shape \lambda. This is equal to the sum over all SSYT T of shape \lambda: restricting to entries in \{1, \ldots, N\}, the map replacing each entry u with N+1-u is a bijection between reverse SSYTs and SSYTs, and it transforms the former sum into s_\lambda(x_N, \ldots, x_1), which equals s_\lambda by symmetry. Thus the sum is s_\lambda. ♦



Polynomials and Representations XI

Here, we will give a different interpretation of the Schur polynomial; however, this definition only makes sense in the ring \Lambda_n.

For a given vector \alpha = (\alpha_1, \ldots, \alpha_n) of non-negative integers, define the following determinant, a polynomial in x_1, \ldots, x_n:

\Delta_\alpha := \det\left( x_j^{\alpha_i} \right)_{1\le i, j \le n} =\det \begin{pmatrix}x_1^{\alpha_1 } & x_2^{\alpha_1 } & \ldots & x_n^{\alpha_1}\\x_1^{\alpha_2 } & x_2^{\alpha_2 } & \ldots & x_n^{\alpha_2}\\\vdots & \vdots & \ddots & \vdots \\x_1^{\alpha_n} & x_2^{\alpha_n} & \ldots & x_n^{\alpha_n}\end{pmatrix}.

For \delta := (n-1, n-2, \ldots, 0), we also denote \Delta := \Delta_\delta = \Delta_{n-1, n-2, \ldots, 0}. The following result is classical.

Lemma. We have  \Delta = \prod_{1\le i < j \le n} (x_i - x_j). Also, for any \alpha, the polynomial \Delta_\alpha is divisible by \Delta.

Proof

In the above matrix, swapping any distinct x_i, x_j results in an exchange of two columns, which flips the sign of the determinant. Thus when x_i = x_j, \Delta_\alpha vanishes. Hence \Delta_\alpha is divisible by x_i - x_j for all i<j. It remains to prove the first statement.

For that, note that \Delta and \prod_{i<j} (x_i - x_j) are both homogeneous of degree \frac {n(n-1)} 2 so they are constant multiples of each other. Since the largest term (in lexicographical order) is x_1^{n-1} x_2^{n-2} \ldots x_{n-1} on both sides, equality holds. ♦

Definition. Suppose \lambda is a partition with l(\lambda) \le n; append zeros to it so that \lambda has n elements. Now define:

\displaystyle c_\lambda := \frac{\Delta_{\lambda + \delta}}{\Delta_\delta}.

Being a quotient of two alternating polynomials, c_\lambda(x_1, \ldots, x_n) is symmetric; note that it is homogeneous of degree |\lambda|.

Example

Suppose \lambda = (2, 1). We have:

\begin{aligned} n=2 &\implies c_\lambda(x_1, x_2) = x_1^2 x_2 + x_1 x_2^2,\\ n=3 &\implies c_\lambda(x_1, x_2, x_3) = (x_1^2 x_2 + x_1 x_2^2 + x_2^2 x_3 + x_2 x_3^2 + x_3^2 x_1 + x_3 x_1^2) + 2x_1 x_2 x_3.\end{aligned}

It turns out c_\lambda is precisely the Schur polynomial s_\lambda \in \Lambda_n; this will be proven through the course of the article.
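
The identity c_\lambda = s_\lambda can already be spot-checked numerically; the sketch below evaluates \Delta_{\lambda+\delta}/\Delta_\delta for \lambda = (2,1), n = 2 at a sample point and compares it with x_1^2 x_2 + x_1 x_2^2 from the example above. (By the lemma, the division is exact.)

```python
from itertools import permutations

def delta(alpha, x):
    """The determinant det(x_j^{alpha_i}) via permutation expansion."""
    n, total = len(x), 0
    for p in permutations(range(n)):
        sgn = (-1) ** sum(p[a] > p[b] for a in range(n) for b in range(a + 1, n))
        term = 1
        for i in range(n):
            term *= x[p[i]] ** alpha[i]
        total += sgn * term
    return total

x = (3, 7)                   # a sample point, n = 2
lam, dlt = (2, 1), (1, 0)    # lambda and delta = (n-1, ..., 0)
num = delta(tuple(l + e for l, e in zip(lam, dlt)), x)
quotient = num // delta(dlt, x)                     # exact division, by the lemma
print(quotient, x[0]**2 * x[1] + x[0] * x[1]**2)    # both are 210
```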


Pieri’s Formula

Theorem. Take a partition \mu with l(\mu) \le n and let r \ge 0; we have:

\displaystyle c_\mu e_r = \sum_\lambda c_\lambda,

where \lambda is taken over all partitions with l(\lambda) \le n obtained by adding r squares to \mu such that no two of them lie in the same row.

Proof

We need to show

\displaystyle\Delta_{\mu + \delta} e_r = \sum_\lambda \Delta_{\lambda + \delta}, \qquad \delta := (n-1, n-2, \ldots, 0).

Note that both sides are homogeneous of degree |\mu| + \frac {n(n-1)} 2 + r; hence let us compare their coefficients of x^\alpha where x^\alpha is short for \prod_i x_i^{\alpha_i}. Since both sides are alternating polynomials, we may assume \alpha_1 > \alpha_2 > \dots. Now expand the LHS:

\displaystyle \Delta_{\mu+\delta} e_r = \sum_{i_1 < \ldots < i_r} \Delta_{\mu + \delta} x_{i_1} \ldots x_{i_r}.

In \Delta_{\mu + \delta}, each term is of the form \pm x^{ \sigma (\mu+\delta)} so the exponents of x_1, \ldots, x_n are all distinct. Now, multiplying by x_{i_1} \ldots x_{i_r} increases each exponent by at most 1. Thus to obtain x^\alpha with strictly decreasing exponents, we must begin with strictly decreasing exponents in the first place (i.e. x^{\mu + \delta}), then pick x_{i_1} \ldots x_{i_r} such that the exponents remain strictly decreasing.

Thus each (i_1, \ldots, i_r) corresponds to a binary n-vector \gamma of weight r, such that \mu + \gamma + \delta is strictly decreasing. And the latter holds if and only if \mu + \gamma is a partition.

E.g. suppose \mu = (5, 3, 3, 1, 0), n=5, r=3; we get:

\mu + \delta = (9, 6, 5, 2, 0) \implies \gamma =\begin{cases}(1, 1, 1, 0, 0),\ (1, 1, 0, 1, 0), \ (1, 1, 0, 0, 1),\ (1, 0, 0, 1, 1), \\(0, 1, 1, 1, 0),\ (0, 1, 1, 0, 1),\ (0, 1, 0, 1, 1).\end{cases}

In diagrams, this corresponds to:

[Figure: pieri_add_squares]

i.e. partitions obtained by adding r boxes to \mu such that no two lie in the same row. ♦


Main Result

Theorem. c_\lambda is the Schur polynomial s_\lambda in \Lambda_n.

Proof

We wish to show \mathbf c = \mathbf K\mathbf m. Successively applying Pieri’s formula gives:

e_\lambda = 1\cdot e_{\lambda_1} \ldots e_{\lambda_l} = \sum_{\mu_1} \sum_{\mu_2} \ldots \sum_{\mu_l} c_{\mu_l}

where \mu_0 = \emptyset and the Young diagram for \mu_{i+1} is obtained from \mu_i by attaching \lambda_{i+1} boxes such that no two lie in the same row; the sum is over the set of all such chains (\mu_0, \ldots, \mu_l). Label the additional boxes in \mu_i \to \mu_{i+1} by i+1 and take the transpose; we obtain an SSYT of shape \overline{\mu_l} and type \lambda.

For example, suppose \lambda = (5, 4, 3, 1); here is one way we can successively add squares:

[Figure: pieri_successive]

which gives us:

[Figure: ssyt_obtained]

Hence for each \mu, the number of occurrences of c_\mu in the above nested sum is K_{\overline \mu\lambda}. This gives:

\displaystyle e_\lambda = \sum_\mu K_{\overline \mu \lambda} c_\mu \implies \mathbf e = \mathbf K^t\mathbf J \cdot \mathbf c.

From \mathbf e = \mathbf K^t \mathbf J \mathbf K \mathbf m, we get \mathbf c = \mathbf K\mathbf m= \mathbf s as desired. ♦

Corollary

Pieri’s Formulae. Take a partition \mu and let r \ge 0; we have, in the formal ring \Lambda:

\displaystyle s_\mu e_r = \sum_\lambda s_\lambda,\quad s_\mu h_r = \sum_{\lambda'} s_{\lambda'},

where \lambda (resp. \lambda') is taken over all partitions obtained by adding r squares to \mu, such that no two of them lie in the same row (resp. column).

Proof

By the earlier Pieri’s formula, the first formula holds in all \Lambda_n. Hence it holds in \Lambda too. Applying \omega to it and using \omega(s_\lambda) = s_{\overline\lambda}, we get the second formula. ♦


Polynomials and Representations X

Cauchy’s Identity

In this article, our primary focus is the ring \Lambda_n of symmetric polynomials in x_1, \ldots, x_n.

Theorem (Cauchy’s Identity). Consider polynomials h_\lambda, m_\lambda \in \Lambda_n over all partitions \lambda. [Recall that m_\lambda = 0 if l(\lambda) > n.] We have an equality of formal power series:

\displaystyle \sum_{\lambda} h_\lambda(x_1, \ldots, x_n) m_\lambda(y_1, \ldots, y_n) = \prod_{1\le i,j\le n} \frac 1 {1 - x_i y_j}.

Note.

For convenience, we will use x and y to denote x_1, \ldots, x_n and y_1, \ldots, y_n respectively. Also x^\lambda means x_1^{\lambda_1} x_2^{\lambda_2} \ldots.

Note that the LHS sum is well-defined since for each d\ge 0, there are only finitely many partitions of d, so we get a finite sum in \Lambda_n^{(d)}. The eventual sum lies in \prod_{d\ge 0} \Lambda_n^{(d)}, a formal power series in x_1, \ldots, x_n, y_1, \ldots, y_n.

Proof

If d = |\lambda|, we have h_\lambda = \sum_{\mu\vdash d} M_{\lambda\mu} m_\mu. Thus we need to show:

\displaystyle \sum_{d\ge 0}\sum_{\lambda,\mu \vdash d} M_{\lambda\mu} m_\mu(x) m_\lambda(y) = \prod_{1\le i,j\le n} \frac 1 {1 - x_i y_j}.

The proof is combinatorial: each factor on the RHS expands as

(1 - x_i y_j)^{-1} = 1 + (x_i y_j) + (x_i y_j)^2 + \ldots.

Now we compute the coefficient of x^\lambda y^\mu in the expanded product. E.g. for \lambda = (3,2,2) and \mu = (4,3), to obtain the term x_1^3 x_2^2 x_3^2 \cdot y_1^4 y_2^3 we can multiply:

\displaystyle (x_1 y_1)^2 \cdot (x_3 y_1)^2 \cdot (x_1 y_2)^1 \cdot (x_2 y_2)^2 \implies \begin{array}{|c|ccc|}\hline  & 3 & 2 & 2 \\ \hline 4 & 2 & 0 & 2 \\ 3 & 1 & 2 & 0 \\ \hline\end{array}.

Hence each occurrence of x^\lambda y^\mu in the expansion corresponds to a matrix of non-negative integers with row sums \mu_i and column sums \lambda_j, so the coefficient of x^\lambda y^\mu on the RHS is exactly M_{\lambda\mu}; this matches the coefficient M_{\mu\lambda} = M_{\lambda\mu} on the LHS. This means RHS = LHS. ♦
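
The counting step is easy to replicate in code; this small sketch (mine) enumerates all non-negative integer matrices with prescribed row and column sums, so its output is precisely the coefficient just described.

```python
def count_matrices(row_sums, col_sums):
    """Count non-negative integer matrices with prescribed row and column sums."""
    def rows_with_sum(s, bounds):
        # all tuples (r_1, ..., r_k) with r_j <= bounds[j] and sum s
        if not bounds:
            return [()] if s == 0 else []
        out = []
        for v in range(min(s, bounds[0]) + 1):
            for rest in rows_with_sum(s - v, bounds[1:]):
                out.append((v,) + rest)
        return out
    def build(i, remaining):
        # remaining[j] = how much of column j's sum is still unused
        if i == len(row_sums):
            return 1 if all(r == 0 for r in remaining) else 0
        total = 0
        for row in rows_with_sum(row_sums[i], remaining):
            total += build(i + 1, tuple(r - v for r, v in zip(remaining, row)))
        return total
    return build(0, tuple(col_sums))

# The coefficient of x^lambda y^mu for lambda = (3,2,2), mu = (4,3):
print(count_matrices((4, 3), (3, 2, 2)))   # prints M_{lambda mu} = 8
```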

Exercise

Simplify the following sum:

\displaystyle \sum_\lambda e_\lambda(x_1, \ldots, x_n) m_\lambda(y_1, \ldots, y_n).


Hall Inner Product for \Lambda^{(d)}_n

Now \{h_\lambda\} and \{m_\lambda\} are dual bases under the Hall inner product and they satisfy Cauchy’s identity. We would like to prove the converse: that Cauchy’s identity implies dual bases. However, the problem is that our results for the Hall inner product were proven for \Lambda and not \Lambda_n. To obtain similar results for \Lambda_n, we need to be a bit more careful.

We already know that

\{h_\lambda : \lambda \vdash d, \lambda_1 \le n\}

form a basis of \Lambda_n^{(d)}. It turns out we also have:

Proposition. Let Z_d := \{\lambda : \lambda \vdash d, l(\lambda) \le n\}. Then the sets

\{ h_\lambda : \lambda \in Z_d\}, \quad \{ s_\lambda : \lambda \in Z_d\}

are \mathbb{Z}-bases of \Lambda_n^{(d)}.

Proof

Consider the surjective map \pi:\Lambda^{(d)} \to \Lambda_n^{(d)} which takes x_i \mapsto 0 for all i>n. Its kernel has a basis given by the monomial symmetric polynomials \{ m_\lambda : \lambda \vdash d, l(\lambda) > n\}. Now if l(\lambda) > n and \mu \trianglelefteq \lambda, we have l(\mu) \ge l(\lambda) > n as well. Thus:

l(\lambda) > n \implies s_\lambda = \sum_{\mu\trianglelefteq \lambda} K_{\lambda\mu} m_\mu \in \text{ker}(\pi).

Since \{s_\lambda : \lambda \vdash d\} is a basis of \Lambda^{(d)}, the set \{s_\lambda : \lambda\vdash d, l(\lambda)\le n\} thus forms a basis of \Lambda_n^{(d)} as desired.

Finally since h_\lambda = \sum_{\mu\trianglerighteq\lambda} K_{\mu\lambda} s_\mu and the matrix (K_{\mu\lambda})_{\mu, \lambda \in Z_d} is invertible, we see that \{h_\lambda : \lambda\in Z_d\} also forms a basis of \Lambda_n^{(d)}. ♦

Summary. The following are \mathbb{Z}-bases of \Lambda_n^{(d)}.

\begin{aligned} \{ m_\lambda : \lambda \vdash d, l(\lambda) \le n\}, &\qquad \{ e_\lambda : \lambda \vdash d, \lambda_1 \le n\}, \\ \{h_\lambda : \lambda \vdash d, l(\lambda) \le n\}, &\qquad \{h_\lambda : \lambda\vdash d, \lambda_1 \le n\}, \\ \{s_\lambda : \lambda \vdash d, l(\lambda) \le n\}.&\end{aligned}

For example, when d=4 and n=3, we obtain the following, where each row gives a \mathbb{Z}-basis of the abelian group.

[Figure: basis_of_formal_and_polynomial]

Note: from this, it is clear that \omega(s_{\lambda})\ne s_{\overline\lambda} in general when we are confined to \Lambda_n^{(d)}. E.g. in the above diagram \omega(s_4) \ne s_{1111} = 0.

Now we define the Hall inner product for \Lambda_n^{(d)} by letting \{h_\lambda\}_{\lambda \in Z_d} and \{m_\lambda\}_{\lambda \in Z_d} to be dual bases. As before, we have the following.

Proposition.

  • \{s_\lambda\}_{\lambda \in Z_d} is an orthonormal basis.
  • The Hall inner product on \Lambda_n^{(d)} is symmetric.

Proof. Same as before (check carefully).

Warning. However, the involution \omega : \Lambda_n^{(d)} \to \Lambda_n^{(d)} is no longer unitary. This can be checked for n=2, d=3: here, we get s_3 = m_3 + m_{21} and s_{21} = m_{21} = e_{21}. A bit of computation then shows that \omega(s_{21}) = h_{21} = s_3 + s_{21}, which is not a unit vector. The reader should examine the proof to see why it does not carry over.


Criterion for Orthonormal Basis

Now let us prove the desired result.

Theorem. Suppose we have a collection f_\lambda, g_\lambda \in \Lambda^{(d)}_n indexed by \lambda \in Z_d, for all d\ge 0. If Cauchy’s identity holds for them:

\displaystyle \sum_{d\ge 0,\lambda \in Z_d} f_\lambda(x) g_\lambda(y) = \prod_{1\le i,j\le n} (1 - x_i y_j)^{-1},

then f_\lambda and g_\lambda are dual bases with respect to the Hall inner product.

Proof

We know that \{h_\lambda\}_{\lambda\in Z_d} and \{m_\lambda\}_{\lambda \in Z_d} both satisfy Cauchy’s identity and are dual. Taking the degree-d component in x or y:

\displaystyle \sum_{\lambda \in Z_d} f_\lambda(x) g_\lambda(y) = \sum_{\lambda\in Z_d} m_\lambda(x) h_\lambda(y).

Now, let us first show that \{f_\lambda\}_{\lambda\in Z_d} is a basis of \Lambda^{(d)}_n \otimes_\mathbb{Z} \mathbb{Q}; comparing dimensions, it suffices to show that the set spans this space.

  • Suppose not; then there exists h \in \Lambda_n^{(d)} - \{0\} such that \left< f_\lambda, h\right> =0 for all \lambda \in Z_d.
  • Write h(x) = \sum_{\lambda\in Z_d} c_\lambda h_\lambda(x), where the c_\lambda are not all zero.
  • The two Cauchy identities give us (note: \left< a(x,y), b(x)\right> is computed term-wise by taking a(x,y) as a power series in x with coefficients in \mathbb{Z}[\![ y ]\!]):

\displaystyle 0 = \left< \sum_{\lambda\in Z_d} f_\lambda(x) g_\lambda(y), h(x)\right> =\left< \sum_{\lambda\in Z_d} m_\lambda(x) h_\lambda(y), \sum_{\mu\in Z_d} c_\mu h_\mu(x)\right> = \sum_{\lambda\in Z_d} c_\lambda h_\lambda(y).

  • Hence all c_\lambda vanish, i.e. h = 0, a contradiction.

Now it remains to show that f_\lambda^\vee = g_\lambda, where the dual is taken over \mathbb{Q}. We have:

\displaystyle \left< m_\lambda(x), \sum_{\mu\in Z_d} h_\mu(x) m_\mu(y) \right> = m_\lambda(y)

for any \lambda \in Z_d; by linearity we can replace m_\lambda with any f\in \Lambda_n^{(d)}. In particular

\displaystyle f_\lambda^\vee(y) = \left< f_\lambda^\vee(x), \sum_{\mu\in Z_d} h_\mu(x) m_\mu(y) \right> =\left< f_\lambda^\vee(x), \sum_{\mu\in Z_d} f_\mu(x) g_\mu(y)\right> = g_\lambda(y)

which is what we wanted. ♦


Polynomials and Representations IX

Hall Inner Product

Let us resume our discussion of symmetric polynomials. First we define an inner product on the d-th component \Lambda^{(d)} of the formal ring. Recall that the sets

\displaystyle \{h_\lambda : \lambda \vdash d\},\quad \{m_\lambda: \lambda \vdash d\}

are both \mathbb{Z}-bases of \Lambda^{(d)}.

Definition. The Hall inner product

\left<-, -\right> : \Lambda^{(d)} \times \Lambda^{(d)} \to \mathbb{Z},

is defined by setting \{h_\lambda\}_{\lambda \vdash d} and \{m_\lambda\}_{\lambda\vdash d} to be dual, i.e. \left< h_\lambda, m_\mu\right> := \delta_{\lambda\mu} where \delta_{\lambda\mu} is 1 if \lambda=\mu and 0 otherwise.

The introduction of the Hall inner product may seem random and uninspired, but it has implications in representation theory, which we will (hopefully) see much later. The following properties of the inner product are easy to prove.

Proposition.

  • The inner product is symmetric, i.e. \left< a,b\right> = \left< b,a\right> for any a,b\in \Lambda^{(d)}.
  • The involution \omega is unitary with respect to the inner product, i.e. \left<\omega(a), \omega(b)\right> = \left<a, b\right>.

Proof

For \lambda\vdash d, we have h_\lambda = \sum_{\mu\vdash d} M_{\lambda\mu} m_\mu, so \displaystyle\left< h_\lambda, h_\mu\right> = M_{\mu\lambda}. By definition M_{\mu\lambda} = M_{\lambda\mu} for any partitions \lambda and \mu. Thus the Hall inner product is symmetric.

Next, \Lambda^{(d)} has bases given by \{h_\lambda\} and \{e_\lambda\} over all \lambda \vdash d. We get:

\displaystyle \left< \omega(h_\lambda), \omega(e_\mu)\right> = \left< e_\lambda, h_\mu\right> = N_{\lambda\mu}.

which is equal to \left< h_\lambda, e_\mu\right> since the inner product is symmetric and N_{\lambda\mu} = N_{\mu\lambda}. By linearity, \left<\omega(a), \omega(b)\right> = \left<a, b\right> for any a,b \in \Lambda^{(d)}.  ♦

In the next section, we will see that the inner product is positive-definite. Even better, we will explicitly describe an orthonormal basis which lies in \Lambda.


Schur Polynomials – An Orthonormal Basis

Recall that we have, in \Lambda,

\mathbf h = \mathbf M \mathbf m, \qquad \mathbf M = \mathbf K^t \mathbf K,

where \mathbf K = (K_{\lambda\mu}) is the matrix of Kostka numbers, i.e. K_{\lambda\mu} is the number of SSYT with shape \lambda and type \mu.

Definition. For each partition \lambda, the Schur polynomial is defined as:

s_\lambda := \sum_{\mu} K_{\lambda\mu} m_\mu.

Written vectorially, this gives \mathbf s = \mathbf K\mathbf m and thus \mathbf h = \mathbf K^t \mathbf s. Note that s_\lambda \in \Lambda^{(d)} where d:=|\lambda|.

For each n>0, the image of s_\lambda in \Lambda_n^{(d)} is also called the Schur polynomial; we will take care to avoid any confusion.

Example.

Consider the case of d=3. We have:

\mathbf K = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 1\end{pmatrix}\implies \begin{aligned} s_3 &= m_3 + m_{21} + m_{111},\\ s_{21} &= m_{21} + 2m_{111},\\ s_{111} &= m_{111}.\end{aligned}

We then have:

Proposition. The polynomials \{s_\lambda : \lambda \vdash d\} form an orthonormal basis of \Lambda^{(d)}, i.e. \left<s_\lambda, s_\mu\right> = \delta_{\lambda\mu}, the Kronecker delta function (which takes 1 when \lambda = \mu and 0 otherwise).

Proof

From \mathbf s =\mathbf K\mathbf m and \mathbf h = \mathbf K^t \mathbf s we have:

\left< s_\lambda, h_\mu\right> = \left< \sum_\nu K_{\lambda\nu} m_\nu, h_\mu\right> = \sum_\nu K_{\lambda\nu}\delta_{\nu\mu} = K_{\lambda\mu},

\left< s_\lambda, h_\mu\right> = \left< s_\lambda, \sum_\nu K_{\nu\mu} s_\nu \right> = \sum_\nu K_{\nu\mu}\left< s_\lambda, s_\nu\right>.

Treating \left< s_\lambda, s_\nu\right> as a matrix \mathbf A, we get \mathbf A \mathbf K= \mathbf K; since \mathbf K is invertible, \mathbf A = \mathbf I so the s_\lambda are orthonormal. ♦

Corollary. The Hall inner product on \Lambda^{(d)} is positive-definite.


Further Results on Schur Polynomials

Since \{s_\lambda\}_{\lambda \vdash d} form an orthonormal basis of \Lambda^{(d)}, so do \{\omega(s_\lambda)\}_{\lambda \vdash d}. Now apply the following.

Lemma. Suppose A is a free abelian group of finite rank with orthonormal basis (v_i)_{i=1}^n, i.e. \left< v_i, v_j\right> = \delta_{ij}. If (w_i)_{i=1}^n is another orthonormal basis of A, then there is a permutation \sigma of \{1,\ldots, n\} such that:

w_i = \pm v_{\sigma(i)}, \quad \text{ for } i=1,\ldots, n.

Thus, unlike the case of vector spaces over a field, an orthonormal basis of a free abelian group of finite rank is unique up to permutation and sign.

Proof

Fix i and write w_i = \sum_{j=1}^n c_{ij} v_j for some c_{ij} \in \mathbb{Z}. We get:

1 = \left< w_i, w_i\right> = \sum_{j=1}^n c_{ij}^2.

Thus c_{ij_0} = \pm 1 for some j_0 depending on i, and c_{ij} = 0 for all j \ne j_0. The map i\mapsto j_0 gives a function \sigma : \{1,\ldots,n\} \to \{1, \ldots, n\}. Since the \{w_i\} form a basis, \sigma is a bijection. ♦

Thus \omega(s_\lambda) = \pm s_{\sigma(\lambda)} for some permutation \sigma of \{\lambda : \lambda \vdash d\}. We write this as \omega(\mathbf s) = \mathbf W \cdot \mathbf s, where \mathbf W has exactly one non-zero entry in each row and each column, and such an entry is \pm 1. Thus

\begin{aligned}\mathbf h = \mathbf K^t \mathbf s &\overset {\omega} \implies \mathbf e = \mathbf K^t \omega(\mathbf s)= \mathbf K^t \mathbf W\mathbf s = \mathbf K^t \mathbf W \mathbf K \mathbf m\\ &\implies \mathbf N = \mathbf K^t \mathbf W\mathbf K.\end{aligned}

Now we saw earlier that \mathbf J\mathbf N is upper-triangular with all 1’s on the main diagonal where \mathbf J is a permutation matrix swapping \lambda with \overline\lambda (so \mathbf J^2 = \mathbf I). Hence:

\mathbf J \mathbf K^t \mathbf W \mathbf K = \begin{pmatrix}1 & \ldots & {\small \ge 0} \\0 & \ddots & \vdots \\0 & 0 & 1 \end{pmatrix} \implies \mathbf J \mathbf K^t =\begin{pmatrix}1 & \ldots & {\small \in \mathbb{Z}} \\0 & \ddots & \vdots \\0 & 0 & 1 \end{pmatrix} \mathbf W^{-1}.

Since all entries of \mathbf J \mathbf K^t are non-negative, it follows that \mathbf W is a permutation matrix. And since \mathbf K is also upper-triangular with 1’s on the main diagonal, we have \mathbf W = \mathbf J.


Summary

Thus we have shown:

\omega(s_\lambda) = s_{\overline\lambda} for all \lambda \vdash d.

\mathbf M = \mathbf K^t \mathbf K \implies \mathbf h = \mathbf K^t \mathbf K \mathbf m.

\mathbf N = \mathbf K^t \mathbf J \mathbf K \implies \mathbf e = \mathbf K^t \mathbf J \mathbf K \mathbf m.

The second and third relations can also be written as:

\displaystyle M_{\lambda\mu} = \sum_{\nu \trianglerighteq \lambda,\, \nu\trianglerighteq \mu } K_{\nu\lambda} K_{\nu\mu},\qquad N_{\lambda\mu} = \sum_{\lambda\trianglelefteq \nu\trianglelefteq \overline\mu} K_{\nu\lambda} K_{\overline\nu \mu}.

Also since \det\mathbf K = 1 and J is a permutation matrix, we have \det \mathbf M = 1 and \det \mathbf N = \pm 1.

Example

Consider the case d=3. Check that \mathbf N = \mathbf K^t \mathbf J \mathbf K:

\mathbf J = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0\\ 1 & 0 & 0\end{pmatrix}, \quad \mathbf K = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 1\end{pmatrix}, \quad \mathbf N = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 3 \\ 1 & 3 & 6\end{pmatrix}.
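
This check takes only a few lines of Python; the sketch below multiplies out \mathbf K^t \mathbf J \mathbf K for the matrices above.

```python
# The d = 3 matrices from the example, as lists of rows.
J = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
K = [[1, 1, 1], [0, 1, 2], [0, 0, 1]]
N = [[0, 0, 1], [0, 1, 3], [1, 3, 6]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

Kt = [list(row) for row in zip(*K)]        # transpose of K
print(matmul(matmul(Kt, J), K) == N)       # True
```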


Polynomials and Representations VIII

Matrix Balls

Given a matrix A of non-negative integers, the standard RSK construction masks the symmetry between P and Q, but in fact we have:

Symmetry Theorem. If A corresponds to (P, Q), then the transpose of A corresponds to (Q, P). In particular, if A is a permutation matrix, then swapping P and Q corresponds to taking the inverse of A.

Here, we will give an alternative treatment for the RSK correspondence, by means of a construction known as matrix balls. We will demonstrate it with the following matrix for \lambda = (5, 3, 3) and \mu = (5, 3, 3):

A = \begin{pmatrix} 2 & 1 & 2 \\ 2 & 1 & 0 \\ 1 & 1 & 1\end{pmatrix}

with the RSK correspondence producing:

P = \boxed{\begin{matrix}1 & 1 & 1 & 1 & 1 & 2 & 3 \\2 & 2 & 3 \\3 \end{matrix}} \qquad Q = \boxed{\begin{matrix}1 & 1 & 1 & 1 & 1 & 3 & 3 \\2 & 2 & 2 \\3\end{matrix}}


Description of Algorithm

As the name implies, we create a matrix and place A_{ij} balls in each entry (i, j). Next, we number each ball as follows:

  • start from the top left entry (i,j)=(1,1); label the balls 1, 2, … in order;
  • we proceed left to right, top to bottom; at each entry (i,j), we look at all balls occurring to its top-left (i.e. at entries (i',j') with i'\le i, j'\le j and (i',j') \ne (i,j)); let k be the smallest positive integer which has not been used among these entries; now label the balls in (i,j) from k onwards.

Our matrix above thus gives:

[Figure: matrix_ball_example]

From this, we obtain the first rows of P and Q: for each k, the k-th entry of the first row of P is given by the first column in which k appears among the balls; likewise, the k-th entry of the first row of Q is the first row in which k appears (mnemonic: P stores the j's, which label the columns, while Q stores the i's, which label the rows). This gives us

P_1 = (1, 1, 1, 1, 1, 2, 3), \qquad Q_1 = (1, 1, 1, 1, 1, 3, 3).

which is consistent with our RSK computation.

Next Step

Next, construct a new matrix of balls. For each number k=1, 2, \ldots, look at all the balls labelled k in the current matrix; these form a diagonal zig-zag, running from top-right to bottom-left. Between consecutive balls labelled k, say at (u,v) and (u',v') with u > u', v < v', we place a ball at (u, v') in the new matrix, without any label. Working with our matrix of balls above, we obtain:

[Figure: matrix_ball_iteration]

Let A' = (A'_{ij}) be the matrix of non-negative integer entries such that A'_{ij} is the number of balls in this new matrix. Now we label the balls in this new matrix as before, by starting from the top-left corner and proceeding left to right, top to bottom. In our working example, in the new matrix, we place 1 ball for k=3, 1 ball for k=4 and 2 balls for k=5. This gives the new matrix of balls:

[Figure: matrix_ball_example_iter2]

Now for the second row of P (resp. Q), we again let the k-th entry denote the first column (resp. row) in which k appears among this new set of balls. In our example, this gives P_2 = (2, 2, 3) and Q_2 = (2, 2, 2), which is consistent with our RSK computation.

Finally, note that there are two balls labelled 2 in this matrix; this gives rise to a single ball in location (3, 3) of the third matrix of balls, and we get P_3 = (3) = Q_3.


Proof of Equivalence with RSK

We now show that this construction recovers the desired P, Q. A priori, it is not even clear that P and Q are tableaux but this will be proven along the way. We use induction on |\lambda| = |\mu|, the number of balls in the first matrix. Clearly there is nothing to show when |\lambda| \le 1.

Suppose d := |\lambda| > 1; let (u, v) be the last pair (i_d, j_d) occurring in the two-row array corresponding to the matrix A_{ij}. Thus when we traverse the matrix A left-to-right, top-to-bottom, the last non-zero entry is at (u, v). Consider the following.

  • Let B := (B_{ij}) be the matrix equal to A_{ij} except that B_{uv} = A_{uv} - 1.
  • The matrix of balls for (B_{ij}) is the same as that of (A_{ij}), but with one ball removed from (u, v), i.e. the one with the largest label k at (u,v).
  • By induction hypothesis on (B_{ij}), the (P', Q') obtained from its matrix balls is the same as that obtained by RSK.
  • Thus we need to show, if (P, Q) is obtained from matrix balls of (A_{ij}), then P = P' \leftarrow v and Q is obtained from Q' by attaching u to the corresponding extra square.

[Figure: matrix_ball_last_entry]

First Row of (P, Q)

First suppose k only occurs once among all ball labels; it must then be maximum among them (since any ball with a larger label must lie to its right and/or below, and no such ball exists). By matrix ball construction, P (resp. Q) is obtained from P' (resp. Q') by attaching v (resp. u) to the end of first row as the k-th entry. Since k is the unique maximal value, we have P = P' \leftarrow v indeed. This settles the case.

Suppose k occurs more than once, i.e. it occurs next at (u',v'), where u' < u and v' > v and no other k occurs between rows u' and u or columns v and v':

[Figure: matrix_ball_last_entry_multiple]

By matrix ball construction, in the first row of P', the k-th entry equals v' and the 1st to (k-1)-th entries are all \le v. On the other hand, the k-th entry of P equals v. Hence the first row of P is obtained by letting v bump v' off the k-th entry of first row of P'. Also, note that the first row of Q is identical to that of Q'. Thus the first rows of P and Q are consistent with the RSK construction.

Subsequent Rows of (P, Q)

The next matrix of balls A'_{ij} thus has one more ball than B'_{ij}, namely one at (u, v'). Note that (u, v') is the last positive entry in A'_{ij}, hence we may repeat our earlier reasoning.

Once again, if its label k' is unique, then the second row of P is obtained from P' by appending v' to the end, and the second row of Q from Q' by appending u to the end. In this case, we still get P = P'\leftarrow v, and Q is obtained from Q' by appending u to the 2nd row (where the extra square is). Otherwise, we iterate the process for the remaining rows. This completes the proof. ♦

From the matrix ball interpretation of RSK, the symmetry theorem is obvious.

  • Indeed if B_{ij} is the transpose of A_{ij}, the matrix of balls obtained from (B_{ij}) is also the transpose of that obtained from (A_{ij}). All the remaining constructions are similarly transposed, with the rows and columns swapped.



Polynomials and Representations VII

Our next task is as follows:

  • Given a partition \lambda and a vector \mu = (\mu_1, \mu_2, \ldots), count the number of semistandard Young tableaux with shape \lambda and type \mu (i.e. k occurs \mu_k times).

Proposition. The number of SSYT with shape \lambda and type \mu remains invariant when we permute the \mu_i‘s around.

Proof

Since each permutation is a product of transpositions of the form (i, i+1), it suffices to prove invariance under (i, i+1). This is via the Bender-Knuth involution.

Given an SSYT with \mu_i entries labelled i and \mu_{i+1} entries labelled i+1, we first pair up each i with the i+1 directly below it (when present) and leave these vertical pairs untouched. Among the remaining free entries, each row has t\ge 0 copies of i followed by u \ge 0 copies of i+1; replace them by u copies of i followed by t copies of i+1.

[Figure: ssyt_bender-knuth_v2]

The rows remain weakly increasing, but are the columns still strictly increasing? The only problem which may occur is when an i is replaced by i+1 (or vice versa) and this is equal to the entry below (or above). But we have already excluded all such cases by pairing i and i+1 earlier. ♦
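
For concreteness, here is a short Python sketch of the Bender-Knuth involution just described, with a tableau stored as a list of weakly increasing rows (the function name and data format are mine, not the article's).

```python
def bender_knuth(T, i):
    """Bender-Knuth involution swapping the number of i's and (i+1)'s in an
    SSYT T, given as a list of weakly increasing rows."""
    T = [list(row) for row in T]                 # work on a copy
    def paired(r, c):
        v = T[r][c]
        if v == i:      # an i is paired when i+1 sits directly below it
            return r + 1 < len(T) and c < len(T[r + 1]) and T[r + 1][c] == i + 1
        else:           # an i+1 is paired when i sits directly above it
            return r > 0 and T[r - 1][c] == i
    for r in range(len(T)):
        free = [c for c in range(len(T[r]))
                if T[r][c] in (i, i + 1) and not paired(r, c)]
        u = sum(1 for c in free if T[r][c] == i + 1)
        for k, c in enumerate(free):             # u copies of i, then t of i+1
            T[r][c] = i if k < u else i + 1
    return T

T = [[1, 1, 2, 2, 2], [2, 3]]
print(bender_knuth(T, 2))   # [[1, 1, 3, 3, 3], [2, 3]]: type (2,4,1) -> (2,1,4)
```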

Hence we might as well sort the entries of \mu and assume it is a partition.

Definition. Given partitions \lambda, \mu, the Kostka number K_{\lambda\mu} is the number of SSYT of shape \lambda and type \mu.

Clearly K_{\lambda\mu} = 0 if |\lambda| \ne |\mu|.


More on Kostka Numbers

As an example, take \lambda = (2, 1) and \mu = (1, 1, 1). Since \mu_1 = \mu_2 = \mu_3 = 1, this is in fact an SYT and there are two possibilities.

[Figure: young_tableau_2_examples]

On the other hand, if \lambda = (1, 1, 1) and \mu = (2, 1), no such tableau exists. Hence

K_{21, 111} = 2, \qquad K_{111, 21} = 0.
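
Kostka numbers for small shapes can be computed by brute force; the sketch below (mine) fills the diagram cell by cell, enforcing the row, column and type constraints, and reproduces the two values above.

```python
def kostka(lam, mu):
    """Count SSYT of shape lam and type mu by brute-force filling."""
    cells = [(i, j) for i in range(len(lam)) for j in range(lam[i])]
    def fill(k, tab, left):
        # left[v-1] = number of labels v we may still place
        if k == len(cells):
            return 1
        i, j = cells[k]
        lo = max(tab.get((i, j - 1), 1),          # rows weakly increase
                 tab.get((i - 1, j), 0) + 1)      # columns strictly increase
        total = 0
        for v in range(lo, len(mu) + 1):
            if left[v - 1] > 0:
                tab[(i, j)] = v
                left[v - 1] -= 1
                total += fill(k + 1, tab, left)
                left[v - 1] += 1
                del tab[(i, j)]
        return total
    return fill(0, {}, list(mu))

print(kostka((2, 1), (1, 1, 1)), kostka((1, 1, 1), (2, 1)))   # 2 0
```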

Lemma 1. If K_{\lambda\mu} > 0, then \lambda \trianglerighteq \mu, i.e. for each r we have:

\lambda_1 + \lambda_2 + \ldots + \lambda_r \ge \mu_1 + \mu_2 + \ldots + \mu_r.

Also K_{\lambda\lambda} = 1.

Proof

Consider a positive integer r; since each column is strictly increasing, each of 1, 2, \ldots, r must occur in the first r rows, so the number of boxes \lambda_1 + \ldots + \lambda_r is at least the number of these integers, i.e. \mu_1 + \ldots + \mu_r. This proves the inequality. If equality holds for all r, this constraint forces us to have all i‘s on the i-th row. ♦

The rest of the article is dedicated to proving:

Main Theorem. For any partitions \lambda, \mu, we have:

\displaystyle M_{\lambda\mu}= \sum_\nu K_{\nu\lambda} K_{\nu\mu} = \sum_{\nu\trianglerighteq \lambda,\ \nu\trianglerighteq\mu} K_{\nu\lambda} K_{\nu\mu}.

Written in matrix form, this gives \mathbf M = \mathbf K^t \mathbf K.


RSK Correspondence

The Robinson–Schensted–Knuth (RSK) correspondence works as follows: suppose we have a matrix A=(A_{ij}) of non-negative integers, with column sums \sum_i A_{ij} = \lambda_j and row sums \sum_j A_{ij} = \mu_i. We do not assume that \lambda and \mu are partitions. Now take

w_A := \begin{pmatrix} i_1 & i_2 & i_3 & \ldots \\ j_1 & j_2 & j_3 & \ldots\end{pmatrix}

satisfying the following:

  • i_1 \le i_2\le \ldots;
  • if i_r = i_{r+1}, then j_r \le j_{r+1} (together with the first condition, this means the sequence of (i_r, j_r) is in lexicographical order);
  • A_{ij} is exactly the number of r for which (i_r, j_r) = (i, j).

For example,

A=\begin{pmatrix} 1 & 0 & 2 \\ 0 & 2 & 0 \\ 1 & 1 & 0\end{pmatrix}\ \implies\ w_A = \begin{pmatrix} 1 & 1 & 1 & 2 & 2 & 3 & 3\\ 1 & 3 & 3 & 2 & 2 & 1 & 2\end{pmatrix}.

Now we construct a pair of SSYT (P, Q) as follows:

  • Start with empty SSYT P(0), Q(0).
  • For each r=0, 1, \ldots, let P(r+1) := P(r) \leftarrow j_{r+1}, and let Q(r+1) be the filling with the same shape as P(r+1), obtained from Q(r) by writing i_{r+1} in the extra box.

The above example for A gives us the following P(r), Q(r); the coloured cells on the left refer to the bumped cells, while those on the right are the additional cells attached to maintain the same shape.

[Figure: rsk_two_tableaux]
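
The whole construction is compact in code. The sketch below (function names are mine) implements row insertion and the map A \mapsto (P, Q); running it on the matrix A above yields the tableaux printed in the comments.

```python
import bisect

def row_insert(P, k):
    """P <- k: insert k into SSYT P (a list of rows); return the box that grew."""
    r = 0
    while r < len(P):
        row = P[r]
        c = bisect.bisect_right(row, k)   # leftmost entry strictly greater than k
        if c == len(row):                 # nothing to bump: extend this row
            row.append(k)
            return (r, c)
        row[c], k = k, row[c]             # k bumps out the old entry
        r += 1
    P.append([k])                         # bumped out of the bottom: new row
    return (len(P) - 1, 0)

def rsk(A):
    """The map A -> (P, Q): insert the j's, record the i's in the new boxes."""
    P, Q = [], []
    for i, row in enumerate(A, start=1):
        for j, a in enumerate(row, start=1):
            for _ in range(a):            # (i, j) occurs A_ij times in w_A
                r, c = row_insert(P, j)
                if r == len(Q):
                    Q.append([])
                Q[r].append(i)            # the new box is at the end of row r
    return P, Q

A = [[1, 0, 2], [0, 2, 0], [1, 1, 0]]
P, Q = rsk(A)
print(P)   # [[1, 1, 2, 2], [2, 3], [3]]
print(Q)   # [[1, 1, 1, 3], [2, 2], [3]]
```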

Lemma 2. The resulting Q(r) is an SSYT for each r.

Proof

Indeed if i_{r+1} > i_r, then i_{r+1} is larger than any other number in Q(r), so we still get an SSYT after placing it in the extra box. On the other hand if i_{r+1} = i_r = \ldots = i_{r-j} > i_{r-j-1}, then j_{r+1} \ge j_r \ge\ldots\ge j_{r-j} so by lemma 3 here, the additional cell for i_{r+1} lies strictly to the right of the bumped cells of I(P(r-1), j_r), \ldots, I(P(r-j-1), j_{r-j}). Since these include all occurrences of i_{r} in Q(r), it is safe to write i_r in the new cell for Q(r+1). ♦

Taking the final pair (P(d), Q(d)), we have obtained a pair (P, Q) of the same shape such that the type of P is \lambda and that of Q is \mu.

Theorem. The correspondence A\mapsto (P, Q) gives a bijection:

\displaystyle\left\{\begin{array}{c}\text{non-negative integer matrix } A_{ij}\\ \sum_i A_{ij} = \lambda_j,\ \sum_j A_{ij} = \mu_i\end{array}\right\} \leftrightarrow \left\{ (P, Q): \begin{array}{c}\text{type}(P) = \lambda,\ \text{type}(Q) = \mu \\ \text{shape}(P) = \text{shape}(Q)\end{array}\right\}

Proof

It remains to show how we can recover A from (P, Q). Find the largest entry k of Q; if there are several of them, pick the rightmost one. Note that such a square must be a corner square (i.e. no square to its right or below). Now remove it from Q to obtain i_d, and perform reverse row-insertion from the corresponding square of P to obtain j_d. This gives us (P(d-1), Q(d-1)) and a pair (i_d, j_d).

By construction we have i_d \ge i_{d-1} \ge \ldots. Now suppose i_r = i_{r-1} along the way. The box removed for i_r in Q is strictly to the right of that for i_{r-1}. Hence in P, the bumped entries for j_r are strictly to the right of those for j_{r-1} (by lemma 3 here and the exercise after it). In particular j_r \ge j_{r-1}. ♦

Taking the cardinality of both sets, we obtain the desired equality:

\displaystyle M_{\lambda\mu}= \sum_\nu K_{\nu\lambda} K_{\nu\mu}.

Example

Suppose d=3. Labelling the columns and rows by the partitions 3, 2+1 and 1+1+1, we get the following matrices. One easily checks that \mathbf M = \mathbf K^t \mathbf K.

\mathbf M = \begin{pmatrix} 1&1&1 \\ 1&2&3 \\ 1&3&6 \end{pmatrix}, \qquad \mathbf K = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 1\end{pmatrix}.

Note

A is a permutation matrix if and only if P and Q are both standard Young tableaux.

Exercise

Find the matrix corresponding to the following tableaux:

[Figure: rsk_exercise]

Exercise

Let f^\lambda be the number of SYT of shape \lambda. Prove that we have d! = \sum_{\lambda \vdash d} (f^\lambda)^2.


Polynomials and Representations VI

For now, we will switch gears and study the combinatorics of the matrices \mathbf M = (M_{\lambda\mu}) and \mathbf N = (N_{\lambda\mu}), where \lambda, \mu run over all partitions of d>0. Eventually, we will show that there is a matrix \mathbf K such that:

\mathbf M = \mathbf K^t \mathbf K, \qquad \mathbf N = \mathbf K^t \mathbf J \mathbf K

where \mathbf J is the permutation matrix swapping \lambda and its transpose. To achieve that, we will need to digress a little.


Young Tableaux

Let \lambda be any partition; we will use a Young diagram for \lambda comprising boxes. A filling of \lambda is obtained by writing a positive integer in each box. The filling is called a semistandard Young tableau (SSYT) if the rows are weakly increasing and the columns are strictly increasing. Finally, it is called a standard Young tableau (SYT) if the entries are a permutation of 1 to d = |\lambda|.

If \lambda = (4,3,1) the following are examples:

[Figure: filling_ssyt_syt]

\lambda is called the shape of a filling, while the tuple \mu = (\mu_1, \mu_2, \ldots) where \mu_k is the number of occurrences of k, is called the type of the filling. Thus, the above 3 fillings all have shape (4, 3, 1); their types are (1, 3, 2, 1, 1), (2,2,2,0,1,1) and (1,1,1,1,1,1,1,1) respectively.

We will now describe an important operation to create new SSYTs.

Row Insertions

Given an SSYT P and positive integer k, we write P\leftarrow k for the filling which has k added to it in the following manner.

  • First we add k to the first row. If k ≥ the rightmost entry, we add another square to the tableau and write k in it; this ends the process.
  • Otherwise, starting from the rightmost entry and proceeding leftwards, we find the first square whose contents we can replace with k, while ensuring that the row remains weakly increasing. In other words, we find the leftmost element which is greater than k, then swap it with k. If that square had content j, we say that k bumps out j.
  • Now we proceed to add j to the next row, via the above process.
  • If we finish all rows with a number i left over, we add a row with a single square and write i in it.

As the reader may suspect, P\leftarrow k is always an SSYT (to be proved later).

Example

Suppose we have an SSYT with three rows: (1,1,2,4,5,6,6), (2,3,5,5,7) and (3,4,6); to add an element 3 to it, we proceed as follows, ending with a tableau with four rows. Thus, 3 bumps out 4 in the first row, which bumps out 5 in the second row, etc.

[Figure: young_tableau_updating]

For convenience, we let I(P, k) denote the set of squares whose contents were replaced (or added). Each square is denoted by (i, j), where i is the row number and j is the column number. In the above example, we obtain: I(P, k) = \{(1, 4), (2, 3), (3, 3), (4, 1)\}.


Basic Properties

Lemma 1. The replaced squares I(P, k) move to the left (weakly) as we proceed down the rows, i.e. if (i, j), (i+1, j') \in I(P, k), then j \ge j'.

Proof

Suppose (i, j) \in I(P, k); consider the square directly below it. Either it is > P_{ij} or it does not exist. In the latter case, we do not worry about (i+1, j') \in I(P, k). In the former case, since P_{i+1, j} is larger than the bumping number P_{ij}, no square after (i+1, j) can be bumped. ♦

Lemma 2. The resulting P \leftarrow k is a semistandard Young tableau.

Proof

By construction, the rows of P\leftarrow k are weakly increasing so we need to show that its columns are strictly increasing. Suppose k' bumps out P_{ij} = k''; since a number can only bump out something bigger, the number directly beneath (i,j) (if it exists) is >k'' > k'.

What about the number above (i, j)? Suppose the bumped number in row i-1 was P_{i-1, j'} = k'. By lemma 1, we have j' \ge j. As k' was bumped out, the new entry at (i-1, j') is < k'. Since j'\ge j, the new number at (i-1, j) is also < k'. ♦

Lemma 3. If j\le k, then the replaced squares I(P,j) lie strictly to the left of I(P\leftarrow j, k).

Proof

The proof is by induction on row. Say, on the first row, j bumps out j' in P, and k bumps out k' in P\leftarrow j. Since j \le k and k can only bump out something bigger, k bumps out something strictly to the right of j. Since j' \le k', the induction may proceed onto the next row. ♦

Exercise

Prove that if j>k, then I(P, j) lie weakly to the right of I(P\leftarrow j, k).

Lemma 4. Given an SSYT P and a corner box b (i.e. a box with no boxes to its right or below), there is a unique SSYT Q and positive integer k such that Q\leftarrow k = P and the additional box is on b.

Proof

Suppose the rightmost square of row r in P is k'. Remove it; since there are no boxes to its right or below, the subfilling from row r onwards is an SSYT. If we are on the top row, stop.

Otherwise, on the row above, find the rightmost square which is < k' and replace it with k'. Note that such a square must exist since the square directly above is already <k'. Repeat until we reach the top row. ♦

Exercise

For each row in SSYT P below, remove the rightmost square and find the corresponding (P', k) such that P' \leftarrow k = P.

[Figure: ssyt_exercise_sample]

Exercise

Write a program in Python to perform row insertion and removal.
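
As a possible starting point for this exercise, here is a sketch (mine) assuming a tableau is stored as a list of weakly increasing rows; row_remove implements the reverse process of Lemma 4, starting from a corner box at the end of row r.

```python
import bisect

def row_insert(P, k):
    """P <- k: bump through successive rows; return the corner box added."""
    r = 0
    while r < len(P):
        row = P[r]
        c = bisect.bisect_right(row, k)   # leftmost entry strictly greater than k
        if c == len(row):                 # k >= everything: extend this row
            row.append(k)
            return (r, c)
        row[c], k = k, row[c]             # k bumps out the old entry
        r += 1
    P.append([k])                         # bumped all the way out the bottom
    return (len(P) - 1, 0)

def row_remove(P, r):
    """Inverse of row_insert (Lemma 4): remove the corner box at the end of
    row r, reverse-bump upwards, and return the ejected number."""
    k = P[r].pop()
    if not P[r]:
        P.pop()
    for q in range(r - 1, -1, -1):
        row = P[q]
        c = bisect.bisect_left(row, k) - 1   # rightmost entry strictly below k
        row[c], k = k, row[c]                # replace it with k, carry the old entry up
    return k

P = [[1, 1, 2, 4, 5, 6, 6], [2, 3, 5, 5, 7], [3, 4, 6]]
box = row_insert(P, 3)             # the worked example from the text
print(P, box)                      # four rows; new box at (3, 0)
print(row_remove(P, box[0]), P)    # returns 3 and restores the original P
```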



Polynomials and Representations V

It was clear from the earlier articles that n (number of variables x_1, \ldots, x_n) plays a minimal role in the combinatorics of the symmetric polynomials. Hence, removing the parameter n turns out to be quite convenient; the process gives us the formal ring of symmetric functions.

Concrete Definition of Ring Λ

Consider the set of all monomials x^a := x_1^{a_1} x_2^{a_2} \ldots, where a = (a_1, a_2, \ldots) is a vector of non-negative integers such that only finitely many terms are non-zero (so each monomial involves only finitely many variables). Thus x_1 x_2 and x_1 x_2^2 x_3^3 are monomials but x_1 x_2 x_3 \ldots is not. Now for d = 0,1,2,\ldots, consider the additive group C_d of all sums

\displaystyle\sum_{\sum a_i = d} c_a x^a, \qquad c_a \in \mathbb{Z} \text{ for each } a.

Though each a has only finitely many non-zero terms, the collection of all c_a‘s may have infinitely many non-zero terms. E.g. x_1 + 2x_3 + 3x_5 + 4x_7 + \ldots is a perfectly valid element of C_1.

Now let \Lambda^{(d)} be the subgroup of all \sum_a c_a x^a \in C_d such that c_a = c_{\sigma(a)} for any permutation \sigma of \{1,2,\ldots\}. Note that each \Lambda^{(d)} is a finite free abelian group with basis elements given by:

\begin{aligned} \Lambda^{(0)} &\longrightarrow \left\{1\right\},\\ \Lambda^{(1)} &\longrightarrow \left\{\sum_{i\ge 1} x_i\right\}, \\ \Lambda^{(2)} &\longrightarrow \left\{\sum_{i\ge 1} x_i^2, \sum_{1\le i<j} x_i x_j\right\},\\ \Lambda^{(3)} &\longrightarrow \left\{\sum_{i\ge 1} x_i^3, \sum_{\substack{i,j\ge 1 \\i\ne j}} x_i^2 x_j, \sum_{1\le i< j< k} x_i x_j x_k\right\}. \end{aligned}

Now let \Lambda := \oplus_{d=0}^\infty \Lambda^{(d)}; this gives us a graded ring, called the formal ring of symmetric functions. E.g. in this ring, we have:

\left(\sum_{i=1}^\infty x_i\right)^2 = \left(\sum_{i=1}^\infty x_i^2\right) + 2\left( \sum_{1\le i < j} x_i x_j\right).

For each n, the map \Lambda^{(d)} \to \Lambda_n^{(d)} taking x_{n+1}, x_{n+2}, \ldots \mapsto 0 gives a graded ring homomorphism \Lambda \to \Lambda_n.

Abstract Definition of Λ

Here is an alternate definition: let \Lambda_n \to \Lambda_{n-1} be the map (graded ring homomorphism) taking x_n \mapsto 0. Thus we get a sequence:

\ldots \longrightarrow \Lambda_{n+1} \longrightarrow \Lambda_n \longrightarrow \ldots \longrightarrow \Lambda_1 \longrightarrow 0.

Now \Lambda is defined to be the inverse limit of this sequence, which gives us a graded ring and a set of graded maps \Lambda \to \Lambda_n, one for each n.

Note that if n\ge d, the maps \ldots \to \Lambda_{n+1}^{(d)} \to \Lambda_{n}^{(d)} are all isomorphisms; hence \Lambda^{(d)} \to \Lambda_n^{(d)} is also an isomorphism.


Symmetric Polynomials in Λ

We define the following in \Lambda, also called the elementary symmetric polynomial, the complete symmetric polynomial and the power sum symmetric polynomial.

\displaystyle\begin{aligned} e_0 = 1,\qquad e_k &= \sum_{1 \le i_1 < \ldots < i_k} x_{i_1} x_{i_2} \ldots x_{i_k},\\ h_0 = 1,\qquad h_k &= \sum_{1 \le i_1 \le \ldots \le i_k} x_{i_1} x_{i_2} \ldots x_{i_k},\\ \qquad p_k &= \sum_{i\ge 1} x_i^k.\end{aligned}

For a partition \lambda, we also define e_\lambda := e_{\lambda_1} \ldots e_{\lambda_l} etc, as before. Finally, the monomial symmetric polynomial m_\lambda is the sum of all x_{i_1}^{a_1} x_{i_2}^{a_2} \ldots x_{i_l}^{a_l}, over all 1 \le i_1 < \ldots < i_l and (a_1, \ldots a_l) such that when sorted in decreasing order, (a_i) becomes (\lambda_i). Note that \deg(m_\lambda) = |\lambda|.

When projected via \Lambda \to \Lambda_n, the above polynomials become their counterparts in \Lambda_n. For example

\displaystyle \begin{aligned} m_{21} = \sum_{i\ne j} x_i^2 x_j \in \Lambda \quad &\mapsto \quad x_1^2(x_2 + x_3) + x_2^2 ( x_1 + x_3)+ x_3^2(x_1 + x_2) \in \Lambda_3,\\ e_3 = \sum_{1\le i < j< k} x_i x_j x_k \in \Lambda \quad&\mapsto\quad x_1 x_2 x_3 \in \Lambda_3.\end{aligned}

Rewriting Earlier Results

We thus have the following:

\displaystyle \begin{aligned} e_\lambda &= \sum_{\mu \vdash d} N_{\lambda\mu} m_\mu,\\ h_\lambda &= \sum_{\mu\vdash d} M_{\lambda\mu} m_\mu, \end{aligned}

where d = |\lambda|. Indeed, the above relations hold in \Lambda_n^{(d)} for any n and d; when n\ge d, the isomorphism \Lambda^{(d)} \cong \Lambda_n^{(d)} shows that the same relations hold in \Lambda. Similarly, the identities from earlier can be copied over verbatim by projecting to \Lambda_n^{(k)} for n\ge k.

\displaystyle\begin{aligned} e_0 h_k - e_1 h_{k-1} + \ldots + (-1)^k e_k h_0 &=0,\\ e_{k-1}p_1 - e_{k-2}p_2 + \ldots + (-1)^{k-1}e_0 p_k &= ke_k, \\ h_{k-1}p_1 + h_{k-2}p_2 + \ldots + h_0 p_k &= kh_k.\end{aligned}.


Duality in Λ

Recall that we have an involution \omega:\Lambda_n \to \Lambda_n for each n; this map does not commute with the projection \Lambda_{n+1} \to \Lambda_n. For an easy example, say n=2: the map \omega : \Lambda_3 \to \Lambda_3 takes e_3 \mapsto h_3; however, upon projecting to \Lambda_2, the element e_3 maps to 0 while h_3 maps to a non-zero element.

This can be overcome by taking n\ge d. Since \omega is graded, we thus have the following:

[Figure: commutative_diagram_formal_ring_d]

The vertical lines are isomorphisms and the diagram commutes.

Definition. For each d\ge 0, we define the involution \omega : \Lambda^{(d)} \to \Lambda^{(d)} to be that induced from \omega: \Lambda_n^{(d)}\to \Lambda_n^{(d)} for any n\ge d.

Note that in Λ, we have:

\omega(e_k) = h_k, \quad\omega(h_k) = e_k, \quad\omega(p_k) = (-1)^{k-1}p_k

for all k\ge 1 since these hold in \Lambda_n^{(k)} for any n\ge k.



Polynomials and Representations IV

Power Sum Polynomials

The power sum polynomial is defined as follows:

p_k := x_1^k + x_2^k + \ldots + x_n^k \in \Lambda_n^{(k)}.

In this case, we do not define p_0, although it seems natural to set p_0 = n. As before, for a partition \lambda, define:

\displaystyle p_\lambda:= p_{\lambda_1} p_{\lambda_2} \ldots p_{\lambda_l}, \qquad l = l(\lambda).

Note that we must have l = l(\lambda) above since we have not defined p_0. For example if \lambda = (2, 1), then p_\lambda = \left(\sum_{i=1}^n x_i^2\right) \left(\sum_{i=1}^n x_i\right).

Their generating function is given by:

\displaystyle P(t) := p_1 + p_2 t + p_3 t^2 + \ldots= \sum_{j=1}^n \frac {x_j}{1 - x_j t}.


Newton’s Identities (I)

Recall that the generating function for e_i is E(t) = \prod_{i=1}^n (1 + x_i t). Thus we have the following relation:

\displaystyle \log E(t) = \sum_{j=1}^n \log (1 + x_j t)\stackrel{\frac d {dt}}{\implies} \frac {E'(t)}{E(t)} = \sum_{j=1}^n \frac {x_j}{1 + x_j t} = P(-t).

From E'(t) = E(t) P(-t) we take the coefficient of t^{k-1} on both sides and obtain Newton’s identities:

k e_k = e_{k-1} p_1 - e_{k-2} p_2 + \ldots + (-1)^{k-1} e_0 p_k.

for all k\ge 1, where e_k = 0 for all k>n. With that, each p_k can be written as a polynomial in e_1, e_2, \ldots, e_n with integer coefficients. E.g.

p_1 = e_1,\quad p_2 = e_1^2 - 2e_2,\quad p_3 = e_1^3 - 3e_1 e_2 + 3e_3, \ldots

Conversely, we can also express each e_k as a polynomial in p_1, p_2, \ldots, p_n with rational coefficients. Thus we have:

\displaystyle \Lambda_n \otimes_{\mathbb{Z}} \mathbb{Q} =\mathbb{Q}[p_1, p_2, \ldots, p_n],

where the RHS is a free \mathbb{Q}-algebra in p_1, \ldots, p_n. In particular, \Lambda_n^{(d)}\otimes_{\mathbb{Z}} \mathbb{Q} has basis given by p_\lambda for all \lambda \vdash d, \lambda_1 \le n.

Newton’s Identities (II)

There is another form of Newton’s identities which relate p_k with the complete symmetric polynomials h_i. Recall that the generating function for h_i is H(t) = \prod_{i=1}^n \frac 1 {(1 - x_i t)}. Hence:

\displaystyle\log H(t) = -\sum_{j=1}^n \log(1 - x_j t) \implies \frac{H'(t)}{H(t)}= \sum_{j=1}^n \frac {x_j}{1 - x_j t} = P(t).

From H'(t) = P(t)H(t), taking the coefficient of t^{k-1} gives us another form of Newton’s identities:

k h_k = h_{k-1} p_1 + h_{k-2} p_2 + \ldots + h_0 p_k.

for all k\ge 1. From this, we can express p_k as a polynomial in h_1, \ldots, h_n with integer coefficients. E.g.

p_1 = h_1, \quad p_2 = 2h_2 - h_1^2,\quad p_3 = 3h_3 - 3h_2 h_1 + h_1^3.

As above, we can also express each h_k as a polynomial in p_1, \ldots, p _n with rational coefficients.
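
Both identities are easy to verify numerically; the following sketch (mine) evaluates e_k, h_k, p_k at a sample point and checks the two recurrences for small k, including k > n where e_k = 0.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

x = [2, 3, 5, 7]    # a sample point, n = 4
e = lambda k: sum(prod(c) for c in combinations(x, k))                    # elementary
h = lambda k: sum(prod(c) for c in combinations_with_replacement(x, k))  # complete
p = lambda k: sum(v ** k for v in x)                                     # power sum

for k in range(1, 7):   # e_k = 0 for k > n is handled automatically
    assert k * e(k) == sum((-1) ** (j - 1) * e(k - j) * p(j) for j in range(1, k + 1))
    assert k * h(k) == sum(h(k - j) * p(j) for j in range(1, k + 1))
print("both identities verified for k = 1, ..., 6")
```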

Exercise

Using one of Newton’s identities, express p_4 = x^4 + y^4 + z^4 as a polynomial in:

\begin{aligned} p_1 &= x+y+z,\\ p_2 &= x^2 + y^2 + z^2, \\ p_3 &= x^3 + y^3 + z^3.\end{aligned}


Self-Duality

Recall that the involution \omega : \Lambda_n \to\Lambda_n swaps e_i \leftrightarrow h_i for 1\le i \le n. Looking at the expressions of p_k in terms of the e_i and h_i, the following is clear.

Proposition. For each 1\le k \le n, we have \omega(p_k) = (-1)^{k-1}p_k.

Proof

Apply induction on k; when k=1 we have p_1 = e_1 = h_1 so this is clear. Suppose 1<k\le n. Apply \omega to the first Newton’s identity; for each i\le n we have \omega(e_i) = h_i and so:

k h_k = h_{k-1} \omega(p_1) - h_{k-2} \omega(p_2) + \ldots + (-1)^{k-1} h_0 \omega(p_k).

By induction hypothesis \omega(p_i) = (-1)^{i-1}p_i for 1 \le i \le k-1, so we get:

k h_k = h_{k-1} p_1 + h_{k-2} p_2 + \ldots + h_1 p_{k-1} + (-1)^{k-1} h_0 \omega(p_k).

Now the second Newton’s identity gives \omega(p_k) = (-1)^{k-1} p_k as desired. ♦

Warning. The above is no longer true for k>n. For example, let n=2. We get:

\begin{aligned} p_1 = e_1\ &\implies\ \omega(p_1) = h_1 = p_1, \\ p_2 = e_1^2 - 2e_2\ &\implies\ \omega(p_2) = h_1^2 - 2h_2 = -p_2,\\ p_3 = e_1^3 - 3e_1 e_2 \ &\implies\ \omega(p_3) = h_1^3 - 3h_1 h_2 \ne \pm p_3.\end{aligned}

Summary

We have seen the following symmetric polynomials:

[Figure: polynomials_relations]

In the above diagram, a → b means that b can be expressed as a polynomial in the a's with integer coefficients; the same holds for the dotted arrows, except that the polynomial may have rational coefficients.

One wonders if there is a combinatorial interpretation of the coefficients of p_\lambda when expressed as a linear sum of m_\mu. It turns out this has implications in the representation theory of the symmetric group S_n. We will (hopefully) get to talk about this beautiful result in a later article.
