Thinking Infinitesimally – Multivariate Calculus (II)

Chain Rule for Multivariate Calculus

We continue our discussion of multivariate calculus. The first item is the analogue of the Chain Rule for the multivariate case. Suppose we have parameters f, u, v, x, y, z. Suppose {u, v} are independent parameters (in particular, the system is at least 2-dimensional), and assume that (i) we can write x = x(u,v), y = y(u,v) and z = z(u,v) as functions of {u, v}, and (ii) we can write f = f(x, y, z) as a function of x, y and z. This also means we can write f as a function of u and v. Upon perturbing the system, we get:

\delta f \approx \delta x\left.\frac{\partial f}{\partial x}\right|_{y,z} + \delta y\left.\frac{\partial f}{\partial y}\right|_{x,z} + \delta z\left.\frac{\partial f}{\partial z}\right|_{x,y}, and

\delta f \approx \delta u\left.\frac{\partial f}{\partial u}\right|_v + \delta v\left.\frac{\partial f}{\partial v}\right|_u.

We wish to find a formula which expresses the second set of partial derivatives in terms of the first. To do that, we divide the first equation by \delta u and obtain:

\frac{\delta f}{\delta u} \approx \frac{\delta x}{\delta u}\left(\left.\frac{\partial f}{\partial x}\right|_{y,z}\right) + \frac{\delta y}{\delta u}\left(\left.\frac{\partial f}{\partial y}\right|_{x,z}\right) + \frac{\delta z}{\delta u}\left(\left.\frac{\partial f}{\partial z}\right|_{x,y}\right).

If we maintain \delta v = 0 and let \delta u \to 0, then the LHS converges to \left.\frac{\partial f}{\partial u}\right|_v by definition. Taking the limit on the RHS also, we obtain:

\left.\frac{\partial f}{\partial u}\right|_v = \left.\frac{\partial x}{\partial u}\right|_v\left.\frac{\partial f}{\partial x}\right|_{y,z} + \left.\frac{\partial y}{\partial u}\right|_v\left.\frac{\partial f}{\partial y}\right|_{x,z} + \left.\frac{\partial z}{\partial u}\right|_v\left.\frac{\partial f}{\partial z}\right|_{x,y}.

This is usually written in books as the simplified form: \frac{\partial f}{\partial u} = \frac{\partial x}{\partial u}\frac{\partial f}{\partial x}+\frac{\partial y}{\partial u}\frac{\partial f}{\partial y} + \frac{\partial z}{\partial u}\frac{\partial f}{\partial z}, which is acceptable since the context is clear: the coordinate x is assumed to occur together with y and z, while u and v are always assumed to occur together. We left all the subscripts in our initial equation because we’re really trying to be careful here.
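The derivation above singles out u, but nothing in it is special to u. As a sketch of the companion formula, holding \delta u = 0 and dividing by \delta v instead should give, in the same simplified notation:

```latex
\frac{\partial f}{\partial v}
  = \frac{\partial x}{\partial v}\frac{\partial f}{\partial x}
  + \frac{\partial y}{\partial v}\frac{\partial f}{\partial y}
  + \frac{\partial z}{\partial v}\frac{\partial f}{\partial z},
\qquad \text{with } u \text{ held constant on the left and in each } \frac{\partial}{\partial v}.
```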

To remember the above formula, picture a diagram with f at the top, joined to each of the intermediate parameters x, y, z, each of which is in turn joined to u and v.

Thus to compute \frac{\partial f}{\partial u} we find all possible paths from f to u through the intermediate parameters {x, y, z} and take the sum of all terms, where each term is the product of the corresponding partial derivatives along the way.

Example 1. Suppose f(x,y,z) = x^2 y^2 + yz^3 - xz and x(u,v) = u^2 + v^2, y(u,v) = 2uv, z(u,v) = uv^3. Then

\frac{\partial f}{\partial x} = 2xy^2 - z,\ \frac{\partial f}{\partial y}=2x^2 y+z^3, \ \frac{\partial f}{\partial z} = 3yz^2 - x.

Together with \frac{\partial x}{\partial u} = 2u, \frac{\partial y}{\partial u} = 2v and \frac{\partial z}{\partial u} = v^3, we get the desired relation for \frac{\partial f}{\partial u} which is convenient if we wish to calculate explicit values.
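To see the formula produce explicit values, here is a minimal Python sketch that evaluates the chain-rule sum for Example 1 and compares it with a central finite difference in u (v held fixed). The function names, the step size h and the test point are illustrative choices, not part of the example itself.

```python
# Numerical sanity check of the chain rule for Example 1 (illustrative sketch).

def f(x, y, z):
    return x**2 * y**2 + y * z**3 - x * z

def xyz(u, v):
    # The substitution from Example 1.
    return u**2 + v**2, 2 * u * v, u * v**3

def df_du_chain(u, v):
    # Chain rule: sum over the paths f -> x -> u, f -> y -> u, f -> z -> u.
    x, y, z = xyz(u, v)
    f_x = 2 * x * y**2 - z
    f_y = 2 * x**2 * y + z**3
    f_z = 3 * y * z**2 - x
    x_u, y_u, z_u = 2 * u, 2 * v, v**3
    return x_u * f_x + y_u * f_y + z_u * f_z

def df_du_numeric(u, v, h=1e-6):
    # Central difference in u with v held fixed (h is an assumed step size).
    return (f(*xyz(u + h, v)) - f(*xyz(u - h, v))) / (2 * h)

u0, v0 = 1.2, 0.7  # arbitrary test point
chain_value = df_du_chain(u0, v0)
numeric_value = df_du_numeric(u0, v0)
print(chain_value, numeric_value)  # the two values should agree closely
```

At (u, v) = (1, 1) we have (x, y, z) = (2, 2, 1), so the sum is 2·15 + 2·17 + 1·4 = 68, which is what df_du_chain(1, 1) returns.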

Example 2. Suppose we have polar coordinates (x,y) = (r\cos \theta, r\sin \theta). Then for any f = f(x, y),

\frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} = \cos\theta \frac{\partial f}{\partial x} + \sin\theta \frac{\partial f}{\partial y}, – (1)

\frac{\partial f}{\partial\theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial\theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial\theta} = -r\sin\theta \frac{\partial f}{\partial x} + r\cos\theta \frac{\partial f}{\partial y} – (2)

But if we wish to express \frac{\partial f}{\partial x} and \frac{\partial f}{\partial y} in terms of \frac{\partial f}{\partial r}, \frac{\partial f}{\partial\theta}, then the equation r\sin\theta \times (1) + \cos\theta\times (2) simplifies to: r\sin\theta \frac{\partial f}{\partial r} + \cos\theta\frac{\partial f}{\partial\theta} = r\frac{\partial f}{\partial y}. A similar computation gives us an expression for \frac{\partial f}{\partial x}. In short:

\frac{\partial f}{\partial x} = \cos\theta \frac{\partial f}{\partial r}-\frac 1 r\sin\theta \frac{\partial f}{\partial\theta},\ \ \ \frac{\partial f}{\partial y} = \sin\theta \frac{\partial f}{\partial r} + \frac 1 r \cos\theta\frac{\partial f}{\partial\theta}.
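For the record, the "similar computation" can be taken to be r\cos\theta \times (1) - \sin\theta \times (2), which simplifies to r\frac{\partial f}{\partial x}. The Python sketch below checks both inversion formulas numerically: it estimates \frac{\partial f}{\partial r} and \frac{\partial f}{\partial\theta} by central differences and compares the recombined values with the exact Cartesian partials. The test function, step size and test point are arbitrary assumptions made for illustration.

```python
import math

def f(x, y):
    # Arbitrary smooth test function (an assumption for illustration).
    return x**3 * y + math.sin(x - 2 * y)

def f_x(x, y):
    return 3 * x**2 * y + math.cos(x - 2 * y)

def f_y(x, y):
    return x**3 - 2 * math.cos(x - 2 * y)

def g(r, theta):
    # The same function written in polar coordinates.
    return f(r * math.cos(theta), r * math.sin(theta))

def polar_partials(r, theta, h=1e-6):
    # Central differences: one coordinate varies, the other is held fixed.
    g_r = (g(r + h, theta) - g(r - h, theta)) / (2 * h)
    g_theta = (g(r, theta + h) - g(r, theta - h)) / (2 * h)
    return g_r, g_theta

r0, theta0 = 1.5, 0.8  # arbitrary test point with r0 > 0
g_r, g_theta = polar_partials(r0, theta0)

# The two formulas displayed above.
fx_from_polar = math.cos(theta0) * g_r - math.sin(theta0) / r0 * g_theta
fy_from_polar = math.sin(theta0) * g_r + math.cos(theta0) / r0 * g_theta

x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)
print(fx_from_polar, f_x(x0, y0))
print(fy_from_polar, f_y(x0, y0))
```

Note the division by r: the formulas only make sense away from the origin, where polar coordinates break down.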

Higher Order Multivariate Derivatives

Recall that in the single-variable case, we can take successive derivatives of the function f(x) to obtain \frac{df}{dx}, \frac{d^2 f}{dx^2}, \frac{d^3 f}{dx^3} etc. Let’s consider the multivariate case here.

Suppose {x, y, z, w} forms a set of coordinates. If we fix the values of y, z and w, then we can differentiate a function f(x,y,z,w) with respect to x as many times as we please. We write this as:

\left.\frac{\partial^n f}{\partial x^n}\right|_{y,z,w} := \frac{d^n f}{dx^n}, keeping y, z, w constant.

For example, if f(x,y,z,w) = x^3 y + xz^2 - w^4, then \left.\frac{\partial^2 f}{\partial x^2}\right|_{y,z,w} = 6xy.

On the other hand, if we fix the values of z and w, then we can differentiate with respect to x first while keeping y constant, then with respect to y while keeping x constant. This is denoted by:

\left.\frac{\partial^2 f}{\partial y \partial x}\right|_{z,w} := \left.\frac{\partial}{\partial y}\left(\left.\frac{\partial f}{\partial x} \right|_{y,z,w}\right)\right|_{x,z,w}.

But one can also switch the order around: differentiate with respect to y first, then with respect to x. It turns out that the order doesn’t matter if the function is nice enough, i.e. we get:

\left.\frac{\partial}{\partial y}\left(\left.\frac{\partial f}{\partial x}\right|_{y,z,w}\right)\right|_{x,z,w} = \left.\frac{\partial}{\partial x}\left(\left.\frac{\partial f}{\partial y}\right|_{x,z,w}\right)\right|_{y,z,w}.

Here’s an intuitive (but non-rigorous) explanation of the reason. Since z, w are fixed throughout, let’s simplify our notation by denoting g(x,y) = f(x,y,z,w). Now consider a small perturbation (x,y)\mapsto (x+\delta x, y+\delta y) and consider the following:

\frac{g(x+\delta x, y+\delta y) - g(x, y+\delta y) - g(x+\delta x, y) + g(x,y)}{\delta x\cdot\delta y} = \frac{1}{\delta y}\left(\frac{g(x+\delta x, y+\delta y) - g(x, y+\delta y)}{\delta x} - \frac{g(x+\delta x, y) - g(x,y)}{\delta x}\right).

If we let \delta x\to 0 with \delta y constant, the two terms on the RHS converge to \left.\frac{\partial g}{\partial x}\right|_y (x, y+\delta y) and \left.\frac{\partial g}{\partial x}\right|_y (x, y) respectively. If we now let \delta y \to 0, the expression converges to \left.\frac{\partial}{\partial y}\left(\left.\frac{\partial g}{\partial x}\right|_y\right)\right|_x. By symmetry, the expression also converges to \left.\frac{\partial}{\partial x}\left(\left.\frac{\partial g}{\partial y}\right|_x\right)\right|_y if we switch the order of the limits. Since it shouldn’t matter whether we let \delta x\to 0 first then \delta y\to 0 or vice versa, the two derivatives are equal.

[ Warning: pathological examples where the two derivatives differ do exist! Such functions are explicitly forbidden in our consideration. ]
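For the curious, one standard textbook example of such a function (not taken from this post) is f(x,y) = xy(x^2 - y^2)/(x^2 + y^2) with f(0,0) = 0. The Python sketch below estimates its two mixed partials at the origin with nested central differences; the step sizes are assumptions, chosen so that the inner step is much smaller than the outer one.

```python
def f(x, y):
    # A standard textbook counterexample; the value at the origin is defined to be 0.
    if x == 0 and y == 0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

def f_x(x, y, h=1e-7):
    # Inner central difference in x, with y held fixed.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def f_y(x, y, h=1e-7):
    # Inner central difference in y, with x held fixed.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3  # outer step, deliberately much larger than the inner step h

# d/dy of (df/dx) at the origin, and d/dx of (df/dy) at the origin.
d_dy_of_fx = (f_x(0.0, k) - f_x(0.0, -k)) / (2 * k)
d_dx_of_fy = (f_y(k, 0.0) - f_y(-k, 0.0)) / (2 * k)

print(d_dy_of_fx, d_dx_of_fy)  # close to -1 and +1 respectively
```

The two orders of differentiation disagree at the origin, because there the limits \delta x\to 0 and \delta y\to 0 really cannot be interchanged.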

Example 3. Consider f(x,y,z) = x^3 y + 3xz^2 - xy^4 z. Then the two derivatives are:

\frac{\partial}{\partial x}\frac{\partial f}{\partial y} = \frac{\partial}{\partial x}(x^3 - 4xy^3 z) = 3x^2 - 4y^3 z,

\frac{\partial}{\partial y}\frac{\partial f}{\partial x} = \frac{\partial}{\partial y}(3x^2 y + 3z^2 - y^4 z) = 3x^2 - 4y^3 z.
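The difference quotient from the intuitive argument can also be evaluated directly. This Python sketch computes it for the function of Example 3 at an arbitrarily chosen point and compares it with the common value 3x^2 - 4y^3 z; the step sizes and the test point are assumptions for illustration.

```python
def f(x, y, z):
    # The function from Example 3.
    return x**3 * y + 3 * x * z**2 - x * y**4 * z

def mixed_quotient(x, y, z, dx=1e-4, dy=1e-4):
    # [f(x+dx, y+dy) - f(x, y+dy) - f(x+dx, y) + f(x, y)] / (dx * dy), z held fixed.
    numerator = (f(x + dx, y + dy, z) - f(x, y + dy, z)
                 - f(x + dx, y, z) + f(x, y, z))
    return numerator / (dx * dy)

def exact_mixed(x, y, z):
    # The value both orders of differentiation give in Example 3.
    return 3 * x**2 - 4 * y**3 * z

x0, y0, z0 = 1.0, 2.0, 0.5  # arbitrary test point
approx = mixed_quotient(x0, y0, z0)
exact = exact_mixed(x0, y0, z0)
print(approx, exact)  # exact is -13.0; approx should be within about 1e-3 of it
```

The quotient is symmetric in the roles of x and y, which is precisely why it can serve as an approximation to both mixed partials at once.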

Example 4. Consider rectilinear coordinates (x, y) and polar coordinates (r, θ), where the two are related via (x, y) = (r\cos\theta, r\sin\theta). We already know from Example 2 that:

\frac{\partial f}{\partial x} = \cos\theta \frac{\partial f}{\partial r}-\frac 1 r\sin\theta \frac{\partial f}{\partial\theta},\ \ \ \frac{\partial f}{\partial y} = \sin\theta \frac{\partial f}{\partial r} + \frac 1 r \cos\theta\frac{\partial f}{\partial\theta}.

Let’s see if we can express the second derivatives with respect to {x, y} in terms of those with respect to {r, θ}. It may look horrid, but the calculations can be simplified by thinking of \frac{\partial}{\partial x} as an operator, i.e. a function which takes functions to other functions! Thus we shall write:

\frac{\partial}{\partial x} = \cos\theta \frac{\partial}{\partial r} - \frac {\sin\theta}r \frac{\partial}{\partial\theta}.

So to get the second derivative with respect to x, we just apply the operator to itself:

\frac{\partial^2}{\partial x^2} = \left(\cos\theta \frac{\partial}{\partial r} - \frac {\sin\theta}r \frac{\partial}{\partial\theta}\right)\left( \cos\theta \frac{\partial}{\partial r} - \frac {\sin\theta}r \frac{\partial}{\partial\theta}\right).

Since the operators are all additive (an operator D is said to be additive if D(f + g) = Df + Dg for all functions f and g), we can use the distributive property to expand the RHS. Beware, though, that operators do not commute in general; for example, by the product rule we get:

\frac{\partial}{\partial r}\left(\frac{\sin\theta}r \frac{\partial}{\partial\theta}\right) = \frac{\partial}{\partial r}\left(\frac{\sin\theta}r\right)\frac{\partial}{\partial\theta} + \frac{\sin\theta}{r} \frac{\partial^2}{\partial r\partial\theta} = -\frac{\sin\theta}{r^2}\frac{\partial}{\partial\theta} + \frac{\sin\theta}{r} \frac{\partial^2}{\partial r\partial \theta}.

Now the reader has enough tools to verify the following:

\frac{\partial^2}{\partial x^2} = \cos^2\theta \frac{\partial^2}{\partial r^2} + \frac{\sin^2\theta}{r^2} \frac{\partial^2}{\partial\theta^2} - \frac{\sin 2\theta}r \frac{\partial^2}{\partial r \partial\theta} + \frac {\sin 2\theta}{r^2} \frac{\partial}{\partial \theta} + \frac{\sin^2\theta}{r} \frac{\partial}{\partial r},

\frac{\partial^2}{\partial y^2} = \sin^2\theta \frac{\partial^2}{\partial r^2} + \frac{\cos^2\theta}{r^2} \frac{\partial^2}{\partial \theta^2} + \frac{\sin 2\theta}{r} \frac{\partial^2}{\partial r\partial\theta} - \frac{\sin 2\theta}{r^2} \frac{\partial}{\partial \theta} + \frac{\cos^2\theta}{r} \frac{\partial}{\partial r}.

The case of \frac{\partial^2}{\partial x\partial y} is left as an exercise for the reader.
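As a quick consistency check on the two displayed formulas: adding them, the mixed terms and the first-order \theta terms cancel, leaving \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} = \frac{\partial^2}{\partial r^2} + \frac 1 r \frac{\partial}{\partial r} + \frac 1{r^2}\frac{\partial^2}{\partial\theta^2}. The Python sketch below tests this sum numerically on an arbitrarily chosen smooth function; the function, step size and test point are assumptions, and the polar derivatives are only finite-difference estimates.

```python
import math

def f(x, y):
    # Arbitrary smooth test function (an assumption for illustration).
    return math.exp(x) * math.sin(2 * y) + x**2 * y

def laplacian_cartesian(x, y):
    # Exact f_xx + f_yy for the test function above.
    f_xx = math.exp(x) * math.sin(2 * y) + 2 * y
    f_yy = -4 * math.exp(x) * math.sin(2 * y)
    return f_xx + f_yy

def g(r, theta):
    # The same function written in polar coordinates.
    return f(r * math.cos(theta), r * math.sin(theta))

def laplacian_polar(r, theta, h=1e-4):
    # Finite-difference estimates of g_rr, g_r and g_theta_theta.
    g_rr = (g(r + h, theta) - 2 * g(r, theta) + g(r - h, theta)) / h**2
    g_r = (g(r + h, theta) - g(r - h, theta)) / (2 * h)
    g_tt = (g(r, theta + h) - 2 * g(r, theta) + g(r, theta - h)) / h**2
    return g_rr + g_r / r + g_tt / r**2

r0, theta0 = 1.5, 0.8  # arbitrary test point with r0 > 0
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)
polar_value = laplacian_polar(r0, theta0)
cartesian_value = laplacian_cartesian(x0, y0)
print(polar_value, cartesian_value)  # the two values should agree closely
```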


Exercises

All hints are ROT-13 encoded to avoid spoilers.

  1. Obligatory mechanical exercises: in each of the following examples, verify that \frac{\partial}{\partial y}\frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\frac{\partial f}{\partial y}, \frac{\partial}{\partial z}\frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\frac{\partial f}{\partial z} etc, via explicit computations.
    1. f(x, y) = \sin(x^2 + y^3)\exp(xy).
    2. f(x,y,z) = \sin(xy) (x^3 z + y^2) \exp(z^3).
  2. If f = f(x, y), and z is a function of x and y, is there any relationship between \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} and \frac{\partial f}{\partial z}? [ Hint: Ner nyy guerr cnegvny qrevingvirf jryy-qrsvarq? ]
  3. In 3-D space, we can define spherical coordinates (r, \theta, \phi) which satisfy x = r\sin\theta\cos\phi, y = r\sin\theta\sin\phi, z = r\cos\theta. For a function f = f(x, y, z), express the partial derivatives \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} in terms of spherical coordinates.
  4. Prove that there does not exist f(x, y) such that \frac{\partial f}{\partial x} = x^2y + xy^2 and \frac{\partial f}{\partial y} = x^3 + x^2 y.
  5. Given (x, y) = (u^2 - v^2, 2uv), for a function f(x, y), express \frac{\partial f}{\partial x} and \frac{\partial f}{\partial y} in terms of u and v, and the partial derivatives of f with respect to u and v.
  6. Find all points on the curve x^3 + 2y^3 - 3xy = 1 where the curve is tangent to a circle centred at the origin (see diagram below for a sample circle). You may use WolframAlpha to numerically obtain the values.
  7. (*) (Legendre transform) Suppose we have a 2-dimensional system with (non-independent) parameters u, v, w. Define the parameter f = \left.\frac{\partial u}{\partial v}\right|_w. Explicitly write down a new parameter g in terms of u, v, w, f such that \left.\frac{\partial g}{\partial f}\right|_w = -v. [ Hint: lbh pna pbzcyrgryl vtaber bar bs gur cnenzrgref. ]
  8. (*) For a set of coordinates {x, t} in the plane, the differential equation \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial t^2} is called the 1-dimensional wave equation. Find all general solutions of this equation. [ Hint: fhofgvghgr gur gjb inevnoyrf ol gur fhz naq gur qvssrerapr. ]

(Sample answer for question 6)
