Basic Analysis: Differentiation (1)

In this article, we’ll look at differentiation more rigorously and carefully. Throughout this article, we suppose f is a real-valued function defined on an open interval (b, c) containing a, i.e. f : (b, c) → R with b < a < c.

Definition. The derivative of f(x) at a is defined by:

$f'(a) := \lim_{x\to a} \frac{f(x) - f(a)}{x-a}.$

If the limit exists, we also say f(x) is differentiable at x=a. If f(x) is differentiable at every point in (b, c), then we get a function $(b,c) \to \mathbf{R}$ , $x\mapsto f'(x)$ . The resulting function is then written as $\frac{df}{dx}$ .

Pictorially, the derivative measures the gradient of the tangent to the graph of f at x=a.

Example 1

Consider f : R → R, given by f(x) = x³. At any point x=a, we have:

$f'(a) = \lim_{x\to a}\frac{x^3 - a^3}{x-a} = \lim_{x\to a}\frac{(x-a)(x^2+xa+a^2)}{x-a}$

The quotient inside the limit is equal to $x^2 + xa + a^2$ when x≠a. Hence, as x→a, the expression tends to $a^2 + aa + a^2 = 3a^2$ . Conclusion: the derivative exists at every point a on the real line and $\frac{df}{dx} = 3x^2$ .
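As a numerical sanity check of Example 1 (a sketch, not part of the proof), one can evaluate the difference quotient of x³ at points approaching a. The helper names `difference_quotient` and `cube` are our own, and exact rational arithmetic is used only to avoid rounding noise.

```python
from fractions import Fraction

def difference_quotient(f, a, x):
    """Return (f(x) - f(a)) / (x - a), the quantity whose limit defines f'(a)."""
    if x == a:
        raise ValueError("the difference quotient is undefined at x = a")
    return (f(x) - f(a)) / (x - a)

def cube(x):
    return x ** 3

a = Fraction(2)
for k in range(1, 6):
    h = Fraction(1, 10 ** k)
    q = difference_quotient(cube, a, a + h)
    # For the cube, the quotient equals x^2 + x*a + a^2 exactly when x != a.
    print(f"h = {float(h):.0e}: quotient = {float(q):.6f}, 3a^2 = {float(3 * a * a):.6f}")
```

The printed quotients should approach 3a² as h shrinks, in line with the limit computed above.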

Example 2

Consider f : R-{0} → R, given by f(x) = 1/x. At any point x=a≠0, note that f(x) is defined on an open interval containing a. Then:

$f'(a) = \lim_{x\to a}\frac{x^{-1} - a^{-1}}{x-a} = \lim_{x\to a} -\frac 1{ax} = -\frac 1 {a^2}$ .

Hence, $\frac{df}{dx} = -x^{-2}$ . Conclusion: the derivative exists for every non-zero a.

Example 3

Take the function f : R → R, given by f(x) = |x|. If x=a>0, then f(x)=x on an open interval containing a so it’s easy to check that f’(a) = 1. Likewise, if x=a<0, then f’(a) = -1. Finally, we shall show that f’(0) does not exist. Indeed, f’(0) would be the limit of g(x) := (|x|-|0|)/(x-0) = |x|/x as x tends to 0. Note that g(x)=1 for x>0, and g(x)=-1 for x<0. Hence, the left limit is -1 while the right limit is +1. Since the two limits don’t match, the limit doesn’t exist. Conclusion: the derivative exists precisely at the non-zero points a.
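The one-sided behaviour in Example 3 can be seen numerically. The following is a minimal sketch that evaluates the quotient g(x) = |x|/x along sequences approaching 0 from each side; the list names `right` and `left` are our own.

```python
def g(x):
    """Difference quotient of |x| at 0, i.e. |x| / x for x != 0."""
    if x == 0:
        raise ValueError("g is undefined at x = 0")
    return abs(x) / x

right = [g(10.0 ** -k) for k in range(1, 8)]    # x -> 0 from the right
left = [g(-(10.0 ** -k)) for k in range(1, 8)]  # x -> 0 from the left
print(right)
print(left)
```

Every entry of `right` is 1 and every entry of `left` is -1, so the two one-sided limits cannot agree.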

Properties of the Derivative

We have the following basic properties.

Theorem. Let f(x) and g(x) be defined on an open interval about a. If they are both differentiable at x=a, then:

f(x) is continuous at x=a;

(addition rule) the derivative of the sum (f+g)(x) at x=a is (f+g)'(a) = f'(a) + g'(a);

(product rule) the derivative of the product (fg)(x) at x=a is (fg)'(a) = f'(a)g(a) + f(a)g'(a).

Proof.

1. For the first property, we have $f'(a) = \lim_{x\to a}\frac{f(x)-f(a)}{x-a}$ , so:

$\lim_{x\to a} (f(x) - f(a)) = \lim_{x\to a}\frac{f(x)-f(a)}{x-a} \cdot \lim_{x\to a}(x-a)$

if both limits on the RHS exist. But they do: the first is f’(a) while the second is 0. So the LHS tends to 0 and f(x)→f(a) as x→a as desired.

2. For the second property, write:

$(f+g)'(a) = \lim_{x\to a}\frac{(f+g)(x)-(f+g)(a)}{x-a} = \lim_{x\to a}\frac{f(x)-f(a)}{x-a} + \lim_{x\to a}\frac{g(x)-g(a)}{x-a}.$

The two limits on the right are f’(a) and g’(a) respectively. Hence, the middle limit equals f’(a)+g’(a) and we’re done.

3. For the final property, write (fg)'(a) as:

$\lim_{x\to a}\frac{f(x)g(x) - f(a)g(a)}{x-a} = \lim_{x\to a} f(x)\frac{g(x)-g(a)}{x-a} + \lim_{x\to a}\frac{f(x)-f(a)}{x-a}\cdot g(a).$

Since f is differentiable at x=a, it’s continuous there also. Thus, the limits on the RHS all exist and the expression approaches f(a)g’(a) + f’(a)g(a), as desired. ♦

The next property is extremely useful computationally.

Theorem (Chain Rule). Suppose f(x) (resp. g(x)) is defined on an open interval containing a (resp. f(a)). If f'(a) and g'(f(a)) both exist, then the composed function h = g°f gives h'(a) = g'(f(a))·f'(a).

Proof.

One’s inclined to write:

$\frac{h(x)-h(a)}{x-a} = \frac{g(f(x))-g(f(a))}{x-a} = \frac{g(f(x))-g(f(a))}{f(x)-f(a)} \cdot \frac{f(x)-f(a)}{x-a}$ ,

if f(x) ≠ f(a), and let x→a. But the problem is that f(x)=f(a) may occur even when x is arbitrarily close to a. This is a slight technical issue, which we’ll counter by defining:

$G(y) = \begin{cases} \frac{g(y) - g(f(a))}{y-f(a)}, &\text{ if } y\ne f(a), \\ g'(f(a)), &\text{ if } y=f(a).\end{cases}$

Then G(y) is defined on an open interval about f(a) and is continuous there. Now we can write the above equality as:

$\frac{g(f(x))-g(f(a))}{x-a} = G(f(x))\cdot\frac{f(x)-f(a)}{x-a}$ ,

if x ≠ a (when f(x) = f(a), both sides are zero). Now since G is continuous at f(a) and f is continuous at a, we have G(f(x)) → G(f(a)) = g’(f(a)) as x→a, so taking the limit of both sides gives (g°f)'(a) = g’(f(a)) f’(a). ♦
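To see why the auxiliary function G is needed, here is a small sketch with a = 0. The particular f and g are our own illustrative choices, not taken from the text: f(x) = x²·dist(1/x, Z) satisfies |f(x)| ≤ x²/2, so f'(0) = 0, yet f(1/k) = 0 = f(0) for every non-zero integer k. The naive factorisation would divide by zero at those points, while the identity using G holds for every x ≠ a.

```python
from fractions import Fraction

A = Fraction(0)  # the point a (illustrative choice)

def f(x):
    """f(x) = x^2 * dist(1/x, nearest integer) for x != 0, and f(0) = 0."""
    if x == 0:
        return Fraction(0)
    t = 1 / x
    return x * x * abs(t - round(t))

def g(y):
    return y * y + 3 * y  # g'(y) = 2y + 3, so g'(f(0)) = g'(0) = 3

G_AT_FA = Fraction(3)  # the value g'(f(a)), supplied by hand

def G(y):
    """The auxiliary function from the proof of the chain rule."""
    fa = f(A)
    if y == fa:
        return G_AT_FA
    return (g(y) - g(fa)) / (y - fa)

def quotient_via_G(x):
    """G(f(x)) * (f(x) - f(a)) / (x - a), defined for every x != a."""
    return G(f(x)) * (f(x) - f(A)) / (x - A)

def direct_quotient(x):
    """(g(f(x)) - g(f(a))) / (x - a), the difference quotient of h(x) = g(f(x))."""
    return (g(f(x)) - g(f(A))) / (x - A)

for x in (Fraction(1, 4), Fraction(2, 7), Fraction(2, 2001)):
    print(x, f(x) == f(A), quotient_via_G(x), direct_quotient(x))
```

At x = 1/4 we have f(x) = f(a), yet both quotients are still defined and agree; as x approaches 0 they shrink towards g'(f(0))·f'(0) = 3·0 = 0.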

Now we can describe the derivative of f(x)/g(x).

Corollary. If f(x) and g(x) are differentiable at x=a, and g(a)≠0, then

1/g is differentiable at x=a and (1/g)'(a) = -g'(a)/g(a)².

f/g is differentiable at x=a and $(f/g)'(a) = \frac{g(a)f'(a) - f(a)g'(a)}{g(a)^2}$ .

Proof. The first property follows from the Chain Rule and Example 2. The second property follows from the first property and the product rule.

Summary. We have derived the standard laws of differentiation in secondary school calculus.

(addition rule) $\frac{d}{dx}(f+g) = \frac{df}{dx}+\frac{dg}{dx}$ ;

(product rule) $\frac{d}{dx}(fg) = \frac{df}{dx}g + f\frac{dg}{dx}$ ;

(quotient rule) $\frac{d}{dx}(f/g) = \frac{g\frac{df}{dx} - f\frac{dg}{dx}}{g^2}$ ;

(chain rule) $\frac{dg}{dx} = \frac{dg}{dy}\cdot\frac{dy}{dx}$ , where y = f(x).

More Examples

The above rules for manipulating derivatives have multiple applications.

Example 4

Let’s differentiate $f_n(x) = x^n$ for an integer n. Clearly, for n=0, the derivative is 0 and for n=1, it’s 1. For a positive integer n>1, the derivative of f_n can be obtained by recursively applying the product rule:

$f_n(x) = x\cdot f_{n-1}(x) \implies f_n'(x) = f_{n-1}(x) +x\cdot f_{n-1}'(x),$

which gives $f_n'(x) = n\cdot x^{n-1}$ for n≥0 by induction. By applying the quotient rule to $x^{n} = 1/x^{-n}$ for negative n and x≠0, we see that in fact this equation holds for all integers n.
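The recursion in Example 4 can be run mechanically. The sketch below represents a polynomial by its list of coefficients from degree 0 upwards (a representation chosen here for illustration) and applies the product rule step f_n' = f_{n-1} + x·f_{n-1}' literally; all function names are our own.

```python
def poly_mul_x(coeffs):
    """Multiply a polynomial (coefficients listed from degree 0 up) by x."""
    return [0] + list(coeffs)

def poly_add(p, q):
    """Add two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [s + t for s, t in zip(p, q)]

def power(n):
    """Coefficients of f_n(x) = x^n."""
    return [0] * n + [1]

def power_derivative(n):
    """Derivative of x^n via f_n' = f_{n-1} + x * f_{n-1}', for n >= 0."""
    if n == 0:
        return [0]  # derivative of the constant 1
    if n == 1:
        return [1]  # derivative of x
    return poly_add(power(n - 1), poly_mul_x(power_derivative(n - 1)))

for n in range(1, 6):
    print(n, power_derivative(n))
```

For each n ≥ 1 the result has the single non-zero coefficient n in degree n-1, matching n·x^(n-1).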

Example 5

Let $f(x) = (x^2 + 1)^{100}$ . We wish to compute $\frac{df}{dx}$ in closed form. Let $u(x) = x^2+1$ and $v(y) = y^{100}$ . Since f(x) = v(u(x)), the chain rule gives us

$\frac{df}{dx} = 100(x^2+1)^{99}\cdot 2x = 200x(x^2+1)^{99}.$
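The closed form in Example 5 can be compared against a difference quotient. This is only a sketch: the point a = 1/2 and step h = 10⁻¹² are arbitrary choices of ours, and exact rationals are used so that the gap reflects the limit process rather than floating-point error.

```python
from fractions import Fraction

def f(x):
    return (x * x + 1) ** 100

def f_prime_closed_form(x):
    """The chain-rule result from Example 5: 200 x (x^2 + 1)^99."""
    return 200 * x * (x * x + 1) ** 99

a = Fraction(1, 2)
h = Fraction(1, 10 ** 12)
quotient = (f(a + h) - f(a)) / h  # difference quotient with x = a + h
exact = f_prime_closed_form(a)
relative_gap = abs(quotient - exact) / exact
print(float(relative_gap))
```

The relative gap is tiny and shrinks further as h decreases, as the definition of the derivative predicts.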

Example 6

For a fixed positive integer n, write the binomial expansion:

$(x+1)^n = C^n_n x^n + C^n_{n-1} x^{n-1} + \ldots + C^n_1 x + C^n_0.$

Differentiating both sides with respect to x, the chain rule applied to the LHS gives $n(1+x)^{n-1}$ while the RHS gives us $\sum_{k=1}^n k\cdot C^n_{k} x^{k-1}$ . Substituting x=1 then gives us the combinatorial equality:

$1\cdot C^n_1 + 2\cdot C^n_2 + \ldots + (n-1) C^n_{n-1} + n\cdot C^n_n = n\cdot 2^{n-1}.$

Many more examples can be found in our articles on generating functions.
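The combinatorial equality of Example 6 is easy to spot-check for small n. A minimal sketch using the standard library's `math.comb`; the function name `weighted_binomial_sum` is our own.

```python
from math import comb

def weighted_binomial_sum(n):
    """Return 1*C(n,1) + 2*C(n,2) + ... + n*C(n,n)."""
    return sum(k * comb(n, k) for k in range(1, n + 1))

for n in range(1, 8):
    print(n, weighted_binomial_sum(n), n * 2 ** (n - 1))
```

The two printed columns agree for each n, as the identity requires.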

Example 7 (with Warning)

We know that for |x|<1, the geometric series converges to:

$1 + x+x^2 +x^3 + \ldots = \frac 1 {1-x}.$

One’s tempted to differentiate both sides with respect to x, which gives $1+2x+3x^2+4x^3+\ldots$ on the LHS and $\frac 1{(1-x)^2}$ on the RHS by using the chain rule with u(x) = 1-x. But there’s no guarantee that the addition rule holds for infinitely many terms, although in this case it does work. The underlying theory will be covered in greater detail in a later article.
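Although term-by-term differentiation is not yet justified, the claim of Example 7 can at least be tested numerically. The sketch below sums the first few terms of 1 + 2x + 3x² + … at x = 0.5 (an arbitrary point with |x| < 1, chosen by us) and compares with 1/(1-x)².

```python
def differentiated_partial_sum(x, terms):
    """Sum of k * x^(k-1) for k = 1..terms, the term-by-term derivative."""
    return sum(k * x ** (k - 1) for k in range(1, terms + 1))

x = 0.5
target = 1 / (1 - x) ** 2
for terms in (5, 10, 20, 40):
    print(terms, differentiated_partial_sum(x, terms), target)
```

The partial sums approach the target value, which is consistent with the formula but is of course not a proof.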

Higher Derivatives

From the above results, we see that we can differentiate a function f(x) = p(x)/q(x), where p(x) and q(x) are both polynomials, at any point x=a which is not a root of q(x). A function of this form (p(x)/q(x) for polynomials p(x) and q(x)) is called a rational function. Furthermore, from the quotient rule the derivative $\frac{df}{dx}$ is also a rational function, which means we can differentiate it as many times as we wish:

$\frac{d^2 f}{dx^2} := \frac{d}{dx}\left(\frac{df}{dx}\right),\ \frac{d^3 f}{dx^3} := \frac{d}{dx}\left(\frac{d^2f}{dx^2}\right),\ \ldots.$

These can also be denoted by f”, f”’, etc, or f⁽²⁾, f⁽³⁾, … .

Now, if f(x) is second-differentiable (i.e. f” exists) at x=a, then by definition f’ = df/dx must exist on an open interval (b, c) about a, and also be differentiable at x=a. In particular, since differentiability implies continuity, df/dx is continuous at x=a:

Definition. A function f(x) is said to be continuously differentiable at x=a if it’s differentiable on an open interval about a, and the derivative is continuous at x=a.

Clearly, we have the following implications:

$\begin{matrix}\text{second diff.}\\ \text{at }x=a\end{matrix} \implies\begin{matrix}\text{continuously}\\ \text{diff. at }x=a\end{matrix}\implies \begin{matrix}\text{differentiable}\\ \text{at }x=a\end{matrix}\implies \begin{matrix}\text{continuous}\\ \text{at }x=a\end{matrix}$

And the implications are strict:

Continuous but not differentiable function : easy, take f(x) = |x|; it’s continuous and example 3 tells us it’s not differentiable at x=0.
Differentiable but not continuously differentiable function : take $f(x)=\begin{cases}x^2 \sin(1/x), &\text{if } x\ne 0,\\ 0, &\text{if } x=0.\end{cases}$ When a≠0, the derivative $f'(a) = 2a \sin(1/a) + a^2\cos(1/a) (-1/a^2) = 2a\sin(1/a) -\cos(1/a)$ , which does not converge as a→0. When a=0, the derivative is $\lim_{x\to 0} \frac{f(x)-f(0)}{x-0} = \lim_{x\to 0} x\sin(1/x) = 0$ by squeezing the function between |x| and -|x|.
Continuously differentiable but not second differentiable function : take f(x) = x|x|. It’s easy to see that the derivative is f’(a) = 2a if a≥0, and f’(a) = -2a if a<0. Thus f’(x) = 2|x| which is continuous but, by example 3, not differentiable at x=0.

More generally, it’s easy to see that the function $f(x) = x^{n-1}|x|$ is (n-1)th order continuously differentiable (i.e. the (n-1)th derivative exists and is continuous) but not n-th order differentiable.
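The failure of continuity of the derivative for x² sin(1/x) can be observed along two sequences tending to 0. This is a floating-point sketch, so the values are approximate; the sequences 1/(2kπ) and 1/((2k+1)π) are our own choice of sample points.

```python
import math

def f_prime(a):
    """Derivative of x^2 sin(1/x) at a != 0, from the product and chain rules."""
    if a == 0:
        raise ValueError("use the limit definition at 0, where f'(0) = 0")
    return 2 * a * math.sin(1 / a) - math.cos(1 / a)

for k in (1, 10, 100, 1000):
    a_even = 1 / (2 * k * math.pi)       # cos(1/a) is close to 1 here
    a_odd = 1 / ((2 * k + 1) * math.pi)  # cos(1/a) is close to -1 here
    print(k, f_prime(a_even), f_prime(a_odd))
```

Along the first sequence the derivative stays near -1 and along the second near +1, so f'(a) has no limit as a→0 even though f'(0) = 0.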

Now, we say f(x) is infinitely differentiable if the n-th order derivative exists for all n. For such a function, one’s inclined to write down its Taylor series about x=a:

$f(x) = f(a) + f'(a)(x-a)+ \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \ldots$

and hope that the Taylor series converges to f(x) in some open interval about a. If so, we say f(x) is analytic at a. Alas, this hope can fail.

Infinitely differentiable but not analytic function : let $f(x) = \begin{cases}\exp(-x^{-2}), &\text{if } x\ne 0,\\ 0, &\text{if }x=0.\end{cases}$ At the points x=a≠0, it’s easy to see the n-th derivative is of the form $p(x)x^{-m} \exp(-x^{-2})$ for some polynomial p(x) and positive integer m. To check its behaviour as x→0, let y=1/x; then $x^{-m} \exp(-x^{-2}) = y^m/ \exp(y^2)$ which approaches zero as |y|→∞ since the exponential increases much faster than any polynomial function. With this, one proves via induction that the n-th derivative at 0 is always 0, so the Taylor series about 0 is identically 0. But f(x) > 0 for every x≠0, so the Taylor series does not converge to f(x) on any open interval about 0.
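A quick numerical look at this function, as a sketch only: `bump` evaluates f, and `scaled` evaluates the expression x^(-m)·exp(-x^(-2)) that governs the higher derivatives. Both names and the choice m = 10 are ours.

```python
import math

def bump(x):
    """f(x) = exp(-1/x^2) for x != 0, and f(0) = 0."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))

def scaled(x, m):
    """x^(-m) * exp(-1/x^2), the shape of the terms in the n-th derivative."""
    return x ** (-m) * math.exp(-1.0 / (x * x))

for x in (0.5, 0.3, 0.2, 0.1):
    print(x, bump(x), scaled(x, 10))
```

The values of `bump` are positive away from 0, while `scaled` collapses towards 0 despite the large factor x^(-10); this is the exponential beating the polynomial.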

It should be noted that this anomaly doesn’t occur in complex analysis. Specifically, a function C → C which is (complex) differentiable on an open set is automatically analytic there and one immediately has all the wonderful results about Taylor series convergence etc. We hope to come around to that at a later date.