📖 Univariate differentiation


Definition#

Consider a function \(f : X \rightarrow \mathbb{R}\), where \(X \subseteq \mathbb{R}\). Let \((x_0, f (x_0))\) and \((x_0 + h, f (x_0 + h))\) be two points that lie on the graph of this function.

Draw a straight line between these two points. The slope of this line is

\[\begin{split} \begin{aligned} \frac{\text{rise}}{\text{run}} &= \frac{y_2 - y_1}{x_2 - x_1} \\ &= \frac{f (x_0 + h) - f (x_0)}{x_0 + h - x_0} \\ &= \frac{f (x_0 + h) - f (x_0)}{h} \end{aligned} \end{split}\]

What happens to this slope as \(h \rightarrow 0\) (that is, as the second point gets very close to the first point)?

Let \(h = 1/n\) and look at the limit as \(n \to \infty\).

_images/016920f4f0dbe391b90b9abea7240a3d9a062831faf768e293126b54194ec2a2.png

_images/661ccc2e895421b4b8349ac188b54688b04ae3fce3a1914ce399df8c8a7a8d9b.png

Definition

The (first) derivative of the function \(f(x)\) at the point \(x=x_{0}\), if it exists, is defined to be

\[ f^{\prime}\left(x_{0}\right) =\frac{d f}{d x}(x_{0}) =\lim _{h \rightarrow 0} \frac{f\left(x_{0}+h\right)-f\left(x_{0}\right)}{h} . \]

This is simply the slope of the straight line that is tangent to the function \(f(x)\) at the point \(x=x_{0}\).
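
The limit in this definition can be explored numerically. The following is a minimal sketch, not part of the original notes: the helper names `difference_quotient` and `square` are chosen here for illustration, and the step \(h = 1/n\) follows the construction above. For \(f(x) = x^2\) at \(x_0 = 1\) the secant slopes should approach the tangent slope \(2\).

```python
def difference_quotient(f, x0, h):
    """Slope of the secant line through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h


def square(x):
    """Illustrative test function f(x) = x^2."""
    return x ** 2


x0 = 1.0
for n in (1, 10, 100, 1000, 10000):
    h = 1 / n  # the second point moves towards the first as n grows
    slope = difference_quotient(square, x0, h)
    print(f"n = {n:>5}  h = {h:.4f}  secant slope = {slope:.6f}")
```

The printed slopes get closer to \(2\) as \(n\) grows, which is the value of the derivative at \(x_0 = 1\).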

Differentiation from first principles#

We will now proceed to use this definition to find the derivative of some simple functions. This is sometimes called “finding the derivative of a function from first principles”.

Example: the derivative of a constant function

Consider the function \(f: X \longrightarrow \mathbb{R}\) defined by \(f(x)=b\), where \(b \in \mathbb{R}\) is a constant. Clearly we have \(f\left(x_{0}\right)=f\left(x_{0}+h\right)=b\) for all choices of \(x_{0}\) and \(h\).

Thus we have

\[ \frac{f\left(x_{0}+h\right)-f\left(x_{0}\right)}{h}=\frac{b-b}{h}=\frac{0}{h}=0 \]

for all choices of \(x_{0}\) and \(h\).

This means that

\[ \lim _{h \rightarrow 0} \frac{f\left(x_{0}+h\right)-f\left(x_{0}\right)}{h}=\lim _{h \rightarrow 0} 0=0 \]

As such, we can conclude that

\[ f^{\prime}(x)=\frac{d f}{d x}=0 \]

Example: the derivative of a linear function

Consider the function \(f: X \longrightarrow \mathbb{R}\) defined by \(f(x)=a x+b\). Note that

\[\begin{split} \begin{aligned} f(x+h) & =a(x+h)+b \\ & =a x+a h+b \\ & =a x+b+a h \\ & =f(x)+a h . \end{aligned} \end{split}\]

Thus we have

\[ \frac{f(x+h)-f(x)}{h}=\frac{f(x)+a h-f(x)}{h}=\frac{a h}{h}=a . \]

This means that \(\lim _{h \rightarrow 0} \frac{f(x+h)-f(x)}{h}=\lim _{h \rightarrow 0} a=a\).

As such, we can conclude that \(f^{\prime}(x)=\frac{d f}{d x}=a\).

Example: the derivative of a quadratic power function

Consider the function \(f: X \longrightarrow \mathbb{R}\) defined by \(f(x)=x^{2}\). Note that

\[\begin{split} \begin{aligned} f(x+h) & =(x+h)^{2} \\ & =x^{2}+2 x h+h^{2} \\ & =f(x)+2 x h+h^{2} \end{aligned} \end{split}\]

Thus we have

\[\begin{split} \begin{aligned} \frac{f(x+h)-f(x)}{h} & =\frac{f(x)+2 x h+h^{2}-f(x)}{h} \\ & =\frac{2 x h+h^{2}}{h} \\ & =2 x+h . \end{aligned} \end{split}\]

This means that \(\lim _{h \rightarrow 0} \frac{f(x+h)-f(x)}{h}=\lim _{h \rightarrow 0}(2 x+h)=2 x\).

As such, we can conclude that \(f^{\prime}(x)=\frac{d f}{d x}=2 x\).

Example: the derivative of a quadratic polynomial function

Consider the function \(f: X \longrightarrow \mathbb{R}\) defined by \(f(x)=a x^{2}+b x+c\). Note that

\[\begin{split} \begin{aligned} f(x+h) & =a(x+h)^{2}+b(x+h)+c \\ & =a\left(x^{2}+2 x h+h^{2}\right)+b x+b h+c \\ & =a x^{2}+2 a x h+a h^{2}+b x+b h+c \\ & =\left(a x^{2}+b x+c\right)+(2 a x+b) h+a h^{2} \\ & =f(x)+(2 a x+b) h+a h^{2} . \end{aligned} \end{split}\]

Thus we have

\[\begin{split} \begin{aligned} \frac{f(x+h)-f(x)}{h} & =\frac{f(x)+(2 a x+b) h+a h^{2}-f(x)}{h} \\ & =\frac{(2 a x+b) h+a h^{2}}{h} \\ & =(2 a x+b)+a h \\ & =2 a x+b+a h . \end{aligned} \end{split}\]

This means that

\[ \lim _{h \rightarrow 0} \frac{f(x+h)-f(x)}{h}=\lim _{h \rightarrow 0}(2 a x+b+a h)=2 a x+b \]

As such, we can conclude that \(f^{\prime}(x)=\frac{d f}{d x}=2 a x+b\).

Fact (Binomial theorem)

Suppose that \(x, y \in \mathbb{R}\) and \(n \in \mathbb{Z}_+ = \mathbb{N} \cup \{0\}\).

We have

\[ (x + y)^n = \sum_{k = 0}^{n} \binom{n}{k} x^{n-k} y^k \]

where \(\binom{n}{k} = C_{n,k} = \frac{n!}{k! (n - k)!}\) is known as a binomial coefficient (read “\(n\) choose \(k\)” because this is the number of ways to choose a subset of \(k\) elements from a set of \(n\) elements). Here \(n! = \prod_{k=1}^n k\) denotes the factorial of \(n\), and \(0! = 1\) by definition.

A useful property, illustrated by Pascal’s triangle, is \(\binom{n}{k} = C_{n,k} = C_{n-1,k-1}+C_{n-1,k}\) for \(1 \leqslant k \leqslant n-1\).
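
Both statements are easy to check for small integers. The sketch below is an illustration only; the function names `binomial_expansion` and `pascal_holds` are made up for this example, and `math.comb` from the Python standard library plays the role of \(\binom{n}{k}\).

```python
from math import comb


def binomial_expansion(x, y, n):
    """Right-hand side of the binomial theorem: sum of C(n, k) x^(n-k) y^k."""
    return sum(comb(n, k) * x ** (n - k) * y ** k for k in range(n + 1))


def pascal_holds(n, k):
    """Check C(n, k) = C(n-1, k-1) + C(n-1, k) for one pair (n, k)."""
    return comb(n, k) == comb(n - 1, k - 1) + comb(n - 1, k)


# Integer inputs keep the comparison exact (no rounding error).
print(binomial_expansion(3, 5, 4) == (3 + 5) ** 4)
print(all(pascal_holds(n, k) for n in range(2, 10) for k in range(1, n)))
```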

Example: the derivative of a positive integer power function

Consider the function \(f: \mathbb{R} \longrightarrow \mathbb{R}\) defined by \(f(x)=x^{n}\), where \(n \in \mathbb{N}=\mathbb{Z}_{++}\). We know from the binomial theorem that

\[\begin{split} \begin{aligned} f(x+h) & =(x+h)^{n} \\ & =\sum_{r=0}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r} \\ & =\left(\begin{array}{l} n \\ 0 \end{array}\right) x^{n} h^{0}+\sum_{r=1}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r} \\ & =1 x^{n}(1)+\sum_{r=1}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r} \\ & =x^{n}+\sum_{r=1}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r} . \end{aligned} \end{split}\]

Thus we have

\[\begin{split} \begin{aligned} \frac{f(x+h)-f(x)}{h} & =\frac{f(x)+\sum_{r=1}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r}-f(x)}{h} \\ & =\frac{\sum_{r=1}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r}}{h} \\ & =\sum_{r=1}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r-1} \\ & =\left(\begin{array}{l} n \\ 1 \end{array}\right) x^{n-1} h^{0}+\sum_{r=2}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r-1} \\ & =n x^{n-1}(1)+\sum_{r=2}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r-1} \end{aligned} \end{split}\]

\[\begin{split} \frac{f(x+h)-f(x)}{h} & =n x^{n-1}+\sum_{r=2}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r-1} \end{split}\]

This means that

\[\begin{split} \lim _{h \rightarrow 0} \frac{f(x+h)-f(x)}{h}=\lim _{h \rightarrow 0}\left(n x^{n-1}+\sum_{r=2}^{n}\left(\begin{array}{l} n \\ r \end{array}\right) x^{n-r} h^{r-1}\right)=n x^{n-1} \end{split}\]

As such, we can conclude that

\[ f^{\prime}(x)=\frac{d f}{d x}=n x^{n-1} \]
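
The derivation can be mirrored numerically. The following sketch, with the hypothetical helper name `power_difference_quotient`, evaluates the difference quotient in the binomial form \(\sum_{r=1}^{n} \binom{n}{r} x^{n-r} h^{r-1}\) obtained above and shows that only the \(r = 1\) term survives as \(h\) shrinks. The values \(x = 2\) and \(n = 5\) are arbitrary choices for illustration.

```python
from math import comb


def power_difference_quotient(x, h, n):
    """((x + h)^n - x^n) / h written as sum_{r=1}^{n} C(n, r) x^(n-r) h^(r-1)."""
    return sum(comb(n, r) * x ** (n - r) * h ** (r - 1) for r in range(1, n + 1))


x, n = 2.0, 5
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"h = {h:.0e}  quotient = {power_difference_quotient(x, h, n):.6f}")

# The limit predicted by the power rule.
print("n * x**(n - 1) =", n * x ** (n - 1))
```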

Example: the derivative of \(e^x\)

Consider the function \(f: \mathbb{R} \longrightarrow \mathbb{R}_{++}\) defined by \(f(x)=e^x\), where \(e\) is Euler’s number. Recall that by definition \(e = \lim_{n \to \infty} (1+\frac{1}{n})^n\).

\[ \lim_{h \to 0} \frac{f(x+h)-f(x)}{h} =\lim_{h \to 0} \frac{e^{x+h}-e^x}{h} =\lim_{h \to 0} \frac{e^x(e^h-1)}{h} \]

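Consider \(\lim_{h \to 0}\frac{e^h-1}{h}\) and substitute \(t=(e^h-1)^{-1}\) \(\Leftrightarrow\) \(h = \ln(t^{-1}+1)\), so that \(t \to \infty\) as \(h \to 0\) from above (the case \(h \to 0\) from below, where \(t \to -\infty\), is analogous):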

\[\begin{split} \begin{array}{l} \lim_{h \to 0}\frac{e^h-1}{h} = \lim_{t \to \infty}\frac{t^{-1}}{\ln(t^{-1}+1)} =\\= \lim_{t \to \infty}\left(t \ln(t^{-1}+1) \right)^{-1} =\\= \Big( \ln \Big[ \lim_{t \to \infty} (1+\frac{1}{t})^t \Big] \Big)^{-1} =\\= ( \ln e )^{-1}= (1)^{-1} = 1 \end{array} \end{split}\]

Here we used the continuity of the logarithm and the fact that \(e = \lim_{n \to \infty} (1+1/n)^n\).

Therefore

\[ \lim_{h \to 0} \frac{f(x+h)-f(x)}{h} =\lim_{h \to 0} \frac{e^x(e^h-1)}{h} = e^x \lim_{h \to 0} \frac{e^h-1}{h} = e^x \]
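
The two limits used in this example can be observed numerically. This is only a sketch with made-up helper names (`e_approx`, `exp_quotient`), and floating-point arithmetic limits how small \(h\) can usefully be taken.

```python
import math


def e_approx(n):
    """(1 + 1/n)^n, which tends to e as n grows."""
    return (1 + 1 / n) ** n


def exp_quotient(h):
    """(e^h - 1) / h, which tends to 1 as h tends to 0."""
    return (math.exp(h) - 1) / h


for n in (10, 1000, 100000):
    print(f"n = {n:>6}  (1 + 1/n)^n = {e_approx(n):.6f}")
print(f"math.e = {math.e:.6f}")

# Approach zero from both sides.
for h in (0.1, 0.001, -0.001, -0.1):
    print(f"h = {h:>6}  (e^h - 1)/h = {exp_quotient(h):.6f}")
```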

Differentiation rules#

Fact: The derivatives of some commonly encountered functions

If \(f(x)=a\), where \(a \in \mathbb{R}\) is a constant, then \(f^{\prime}(x)=0\).
If \(f(x)=a x+b\), then \(f^{\prime}(x)=a\).
If \(f(x)=a x^{2}+b x+c\), then \(f^{\prime}(x)=2 a x+b\).
If \(f(x)=x^{n}\), where \(n \in \mathbb{N}\), then \(f^{\prime}(x)=n x^{n-1}\).
If \(f(x)=\frac{1}{x^{n}}=x^{-n}\), where \(n \in \mathbb{N}\), then \(f^{\prime}(x)=-n x^{-n-1}=-n x^{-(n+1)}=\frac{-n}{x^{n+1}}\). (Note that we need to assume that \(x \neq 0\).)
If \(f(x)=e^{x}=\exp (x)\), then \(f^{\prime}(x)=e^{x}=\exp (x)\).
If \(f(x)=\ln (x)\), then \(f^{\prime}(x)=\frac{1}{x}\). (Recall that \(\ln (x)\) is only defined for \(x>0\), so we need to assume that \(x>0\) here.)

Fact: Scalar Multiplication Rule

If \(f(x)=c g(x)\) where \(c \in \mathbb{R}\) is a constant, then

\[ f^{\prime}(x)=c g^{\prime}(x) \]

Example

Let \(f(x) = a g(x)\) where \(g(x)=x\). From the derivation of the derivative of the linear functions we know that \(g'(x) = 1\) and \(f'(x) = a\). We can verify that \(f'(x) = a g'(x)\).

Fact: Summation Rule

If \(f(x)=g(x)+h(x)\), then

\[ f^{\prime}(x)=g^{\prime}(x)+h^{\prime}(x) \]

Example

Let \(f(x) = a x + b\) and \(g(x)=cx+d\). From the derivation of the derivative of the linear functions we know that \(f'(x) = a\) and \(g'(x) = c\). The sum \(f(x)+g(x) = (a+c)x+b+d\) is also a linear function, therefore \(\frac{d}{dx}\big(f(x)+g(x)\big) = a+c\).

We can thus verify that \(\frac{d}{dx}\big(f(x)+g(x)\big) = f'(x)+ g'(x)\).

Fact: Product Rule

If \(f(x)=g(x) h(x)\), then

\[ f^{\prime}(x)=g^{\prime}(x) h(x)+h^{\prime}(x) g(x) \]

Example

Let \(f(x) = x\) and \(g(x)=x\). From the derivation of the derivative of a linear function we know that \(f'(x) = g'(x) = 1\). The product is \(f(x)g(x) = x^2\), and from the derivation above we know that \(\frac{d}{dx}\big(f(x)g(x)\big) = \frac{d}{dx}\big(x^2\big) = 2x\).

Using the product formula we can verify \(\frac{d}{dx}\big(f(x)g(x)\big) = 1\cdot x + x \cdot 1 = 2x\).

Fact: Quotient Rule

If \(f(x)=\frac{g(x)}{h(x)}\), then

\[ f^{\prime}(x)=\frac{g^{\prime}(x) h(x)-h^{\prime}(x) g(x)}{[h(x)]^{2}} \]

The quotient rule is redundant#

In a sense, the quotient rule is redundant. The reason for this is that it can be obtained from a combination of the product rule and the chain rule.

Suppose that \(f(x)=\frac{g(x)}{h(x)}\). Note that

\[ f(x)=\frac{g(x)}{h(x)}=g(x)[h(x)]^{-1} \]

Let \([h(x)]^{-1}=k(x)\). We know from the chain rule that

\[ k^{\prime}(x)=(-1)[h(x)]^{-2} h^{\prime}(x)=\frac{-h^{\prime}(x)}{[h(x)]^{2}} \]

Note that

\[ f(x)=\frac{g(x)}{h(x)}=g(x)[h(x)]^{-1}=g(x) k(x) \]

We know from the product rule that

\[\begin{split} \begin{aligned} f^{\prime}(x) & =g^{\prime}(x) k(x)+k^{\prime}(x) g(x) \\ & =g^{\prime}(x)[h(x)]^{-1}+\left(\frac{-h^{\prime}(x)}{[h(x)]^{2}}\right) g(x) \\ & =\frac{g^{\prime}(x)}{h(x)}-\frac{h^{\prime}(x) g(x)}{[h(x)]^{2}} \\ & =\frac{g^{\prime}(x) h(x)}{[h(x)]^{2}}-\frac{h^{\prime}(x) g(x)}{[h(x)]^{2}} \\ & =\frac{g^{\prime}(x) h(x)-h^{\prime}(x) g(x)}{[h(x)]^{2}} . \end{aligned} \end{split}\]

This is simply the quotient rule!
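
As a sanity check, the two routes can be compared on a concrete case. The sketch below assumes \(g(x) = x^2\) and \(h(x) = x + 1\) purely for illustration; all function names are hypothetical, and the central difference is only a numerical approximation of the derivative.

```python
def g(x):
    return x ** 2


def g_prime(x):
    return 2 * x


def h(x):
    return x + 1


def h_prime(x):
    return 1.0


def quotient_rule(x):
    """f'(x) for f = g / h via the quotient rule."""
    return (g_prime(x) * h(x) - h_prime(x) * g(x)) / h(x) ** 2


def product_and_chain_rule(x):
    """f'(x) for f = g * k with k = 1 / h, using k' = -h' / h^2."""
    k = 1 / h(x)
    k_prime = -h_prime(x) / h(x) ** 2
    return g_prime(x) * k + k_prime * g(x)


def central_difference(f, x, step=1e-6):
    """Numerical approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2 * step)


x = 2.0
print(quotient_rule(x))
print(product_and_chain_rule(x))
print(central_difference(lambda t: g(t) / h(t), x))
```

All three values should agree up to rounding; at \(x = 2\) the exact derivative is \(8/9\).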

Fact: Chain Rule

If \(f(x)=g(h(x))\), then

\[ f^{\prime}(x)=g^{\prime}(h(x)) h^{\prime}(x) \]

Example: the derivative of an exponential function

Suppose that \(f(x)=a^{x}\), where \(a \in \mathbb{R}_{++}=(0, \infty)\) and \(x \in \mathbb{R}\).

We can write \(f(x) = e^{\ln a^{x}} = e^ {x \ln(a)}\), and using the chain rule

\[ f'(x) = \frac{d}{dx} e^ {x \ln(a)} = e^ {x \ln(a)} \frac{d}{dx} (x \ln(a)) = a^x \ln(a) \]

Fact: The Inverse Function Rule

Suppose that the function \(y=f(x)\) has a well defined inverse function \(x=f^{-1}(y)\). If appropriate regularity conditions hold, then

\[ \frac{dx}{d y} = \frac{d}{d y} f^{-1}(y) =\frac{1}{f'\big( f^{-1}(y)\big)} \]

Example: the derivative of a logarithmic function

Suppose that \(f(x)=\log _{a}(x)\), where \(a \in \mathbb{R}_{++}=(0, \infty)\), \(a \neq 1\), and \(x>0\).

Recall that \(y = \log _{a}(x) \iff a^y = x\). Then evaluating \(\frac{dx}{dy}\) using the derivative of the exponential function we have

\[ \frac{dx}{dy} = a^y \ln(a) \]

On the other hand, using the inverse function rule we have

\[ \frac{dx}{dy} = \frac{1}{f'(a^y)} \]

Combining the two expressions and reinserting \(a^y = x\), we have

\[ f'(x) = \frac{d \log_a(x)}{d x} = \frac{1}{a^y \ln(a)} = \frac{1}{x \ln(a)} \]
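
The result can be checked numerically, together with the inverse function rule that produced it. The following is a sketch under stated assumptions: base \(a = 10\) and point \(x = 4\) are arbitrary, the function names are invented for this example, and `math.log(x, a)` from the standard library is used for \(\log_a(x)\).

```python
import math


def log_base_a_derivative(x, a):
    """d/dx log_a(x) = 1 / (x ln a)."""
    return 1 / (x * math.log(a))


def exp_base_a_derivative(y, a):
    """d/dy a^y = a^y ln a."""
    return a ** y * math.log(a)


def central_difference(f, x, step=1e-6):
    """Numerical approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2 * step)


a, x = 10.0, 4.0
y = math.log(x, a)  # y = log_a(x), so that a**y equals x up to rounding

print(central_difference(lambda t: math.log(t, a), x))
print(log_base_a_derivative(x, a))

# Inverse function rule: (dx/dy) * (dy/dx) should equal 1.
print(exp_base_a_derivative(y, a) * log_base_a_derivative(x, a))
```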

Example: product rule

Consider the function \(f(x)=(a x+b)(c x+d)\).

Differentiation Approach One: Note that

\[\begin{split} \begin{aligned} f(x) & =(a x+b)(c x+d) \\ & =a c x^{2}+a d x+b c x+b d \\ & =a c x^{2}+(a d+b c) x+b d \end{aligned} \end{split}\]

Thus we have

\[\begin{split} \begin{aligned} f^{\prime}(x) & =2 a c x+(a d+b c) \\ & =2 a c x+a d+b c \end{aligned} \end{split}\]

Differentiation Approach Two: Note that \(f(x)=g(x) h(x)\) where \(g(x)=a x+b\) and \(h(x)=c x+d\). This means that \(g^{\prime}(x)=a\) and \(h^{\prime}(x)=c\). Thus we know, from the product rule, that

\[\begin{split} \begin{aligned} f^{\prime}(x) & =g^{\prime}(x) h(x)+h^{\prime}(x) g(x) \\ & =a(c x+d)+c(a x+b) \\ & =a c x+a d+a c x+b c \\ & =2 a c x+a d+b c . \end{aligned} \end{split}\]
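
The agreement of the two approaches can also be confirmed for specific numbers. In this sketch the function names and the sample coefficients are arbitrary choices for illustration; integer inputs are used so that the comparison is exact.

```python
def derivative_by_expansion(x, a, b, c, d):
    """Approach one: differentiate a*c*x^2 + (a*d + b*c)*x + b*d."""
    return 2 * a * c * x + a * d + b * c


def derivative_by_product_rule(x, a, b, c, d):
    """Approach two: g'(x) h(x) + h'(x) g(x) with g = a x + b, h = c x + d."""
    g_value, h_value = a * x + b, c * x + d
    g_prime, h_prime = a, c
    return g_prime * h_value + h_prime * g_value


a, b, c, d = 2, 5, 7, 11
for x in (-3, 0, 3):
    print(x, derivative_by_expansion(x, a, b, c, d), derivative_by_product_rule(x, a, b, c, d))
```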

Differentiability and continuity#

Continuity is a necessary, but not sufficient, condition for differentiability.
- Being a necessary condition means that “not continuous” implies “not differentiable”, which means that differentiable implies continuous.
- Not being a sufficient condition means that continuous does NOT imply differentiable.
Differentiability is a sufficient, but not necessary, condition for continuity.
- Being a sufficient condition means that differentiable implies continuous.
- Not being a necessary condition means that “not differentiable” does NOT imply “not continuous”, which means that continuous does NOT imply differentiable.

Continuity does NOT imply differentiability#

To support this statement all we need is to demonstrate a single example of a function that is continuous at a point but not differentiable at that point.

Proof

Consider the function

\[\begin{split} f(x)=\left\{\begin{array}{cc} 2 x & \text { if } x \leqslant 1 \\ \frac{1}{2} x+\frac{3}{2} & \text { if } x \geq 1 \end{array}\right. \end{split}\]

(There is no problem with this double definition at the point \(x=1\) because the two parts of the function are equal at that point.)

This function is continuous at \(x=1\) because

\[ \lim _{x \rightarrow 1} 2 x=2=\lim _{x \rightarrow 1}\left(\frac{1}{2} x+\frac{3}{2}\right) \]

and

\[ f(1)=2 \]

However, this function is not differentiable at \(x=1\). To show this, it is convenient to apply the Heine (sequential) definition of the limit of a function to the difference quotient.

Consider two sequences converging to zero from two different directions, so that \(1+p_n\) approaches \(x=1\) from the left and \(1+q_n\) approaches it from the right:

\[ \{p_n\}_{n \in \mathbb{N}}: p_n < 0, p_n \to 0, \quad \{q_n\}_{n \in \mathbb{N}}: q_n > 0, q_n \to 0 \]

Then at \(x=1\)

\[ \lim_{n \to \infty} \frac{f(1+p_n)-f(1)}{p_n} = \lim_{n \to \infty} \frac{2+2p_n-2}{p_n} = 2 \]

but

\[ \lim_{n \to \infty} \frac{f(1+q_n)-f(1)}{q_n} = \lim_{n \to \infty} \frac{\frac{1}{2} +\frac{1}{2}q_n+\frac{3}{2} - \frac{1}{2} - \frac{3}{2}}{q_n} = \frac{1}{2} \]

Thus, for two different sequences \(p_n\) and \(q_n\) converging to zero we obtain two different limits of the difference quotient at \(x=1\). We conclude that the limit \(\lim _{h \rightarrow 0} \frac{f(1+h)-f(1)}{h}\) does not exist.

\(\blacksquare\)
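
The two one-sided limits in the proof can be seen numerically. The sketch below assumes the particular sequences \(p_n = -1/n\) and \(q_n = 1/n\), which are one possible choice satisfying the conditions above; the helper name `quotient_at_one` is hypothetical.

```python
def f(x):
    """The piecewise linear function from the proof, with a kink at x = 1."""
    return 2 * x if x <= 1 else 0.5 * x + 1.5


def quotient_at_one(h):
    """Difference quotient (f(1 + h) - f(1)) / h."""
    return (f(1 + h) - f(1)) / h


left_slopes = [quotient_at_one(-1 / n) for n in (10, 100, 1000)]   # p_n = -1/n
right_slopes = [quotient_at_one(1 / n) for n in (10, 100, 1000)]   # q_n = 1/n

print("from the left: ", left_slopes)
print("from the right:", right_slopes)
```

The left-hand slopes stay at \(2\) and the right-hand slopes at \(1/2\) (up to rounding), so no single limit exists.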

Example

An example of a function that is continuous at every point but not differentiable at any point is a sample path of the Wiener process (Brownian motion), which has these properties with probability one.

Differentiability implies continuity#

Proof

Consider a function \(f: X \longrightarrow \mathbb{R}\) where \(X \subseteq \mathbb{R}\). Suppose that

\[ \lim _{h \rightarrow 0}\left(\frac{f(a+h)-f(a)}{h}\right) \]

exists.

We want to show that this implies that \(f(x)\) is continuous at the point \(a \in X\). The following proof of this proposition is drawn from [Ayres Jr and Mendelson, 2013] (Chapter 8, Solved Problem 2).

First, note that

\[\begin{split} \begin{gathered} \lim _{h \rightarrow 0}(f(a+h)-f(a))=\lim _{h \rightarrow 0}\left\{\left(\frac{h}{h}\right)(f(a+h)-f(a))\right\} \\ =\lim _{h \rightarrow 0}\left\{h\left(\frac{f(a+h)-f(a)}{h}\right)\right\} \\ =\lim _{h \rightarrow 0}(h) \lim _{h \rightarrow 0}\left(\frac{f(a+h)-f(a)}{h}\right) \\ =(0)\left(\lim _{h \rightarrow 0}\left(\frac{f(a+h)-f(a)}{h}\right)\right) \\ =0 . \end{gathered} \end{split}\]

Thus we have

\[ \lim _{h \rightarrow 0}(f(a+h)-f(a))=0 . \]

Now note that

\[\begin{split} \begin{gathered} \lim _{h \rightarrow 0}(f(a+h)-f(a))=\lim _{h \rightarrow 0} f(a+h)-\lim _{h \rightarrow 0} f(a) \\ =\left(\lim _{h \rightarrow 0} f(a+h)\right)-f(a) \end{gathered} \end{split}\]

Upon combining these two results, we obtain

\[ \left(\lim _{h \rightarrow 0} f(a+h)\right)-f(a)=0 \Longleftrightarrow \lim _{h \rightarrow 0} f(a+h)=f(a) . \]

Finally, note that

\[ \lim _{x \rightarrow a} f(x)=\lim _{h \rightarrow 0} f(a+h) . \]

Thus we have

\[ \lim _{x \rightarrow a} f(x)=f(a) \]

This means that \(f(x)\) is continuous at the point \(x=a\).

\(\blacksquare\)

Higher-order derivatives#

Suppose that \(f: X \longrightarrow \mathbb{R}\), where \(X \subseteq \mathbb{R}\), is an \(n\)-times continuously differentiable function for some \(n \geqslant 2\).

We can view the first derivative of this function as a function in its own right. This can be seen by letting \(g(x)=f^{\prime}(x)\).

The second derivative of \(f(x)\) with respect to \(x\) is simply the first derivative of \(g(x)\) with respect to \(x\).

In other words,

\[ f^{\prime \prime}(x)=g^{\prime}(x) \]

or, if you prefer,

\[ \frac{d^{2} f}{d x d x}=\frac{d g}{d x} \]

Thus we have

\[ \frac{d^{2} f}{d x d x}=\frac{d}{d x}\left(\frac{d f}{d x}\right) \]

The same approach can be used to define third and higher order derivatives.

Definition

The \(n\)-th order derivative of a function \(f \colon \mathbb{R} \to \mathbb{R}\), if it exists, is the derivative of its \((n-1)\)-th order derivative treated as an independent function.

\[ f^{(k)}(x)=\frac{d^{k} f}{\underbrace{dx \cdots dx}_{k}}= \frac{d}{d x}\left(\frac{d^{k-1} f}{\underbrace{dx \cdots dx}_{k-1}}\right) \]

for all \(k \in\{1,2, \cdots, n\}\), where we define

\[ f^{(0)}(x)=f(x) \]

Example

Let \(f(x)=x^{n}\), where \(n \in \mathbb{N}\).
Then we have:
\(f^{\prime}(x)=\frac{d f(x)}{d x}=n x^{n-1}\),
\(f^{\prime \prime}(x)=\frac{d f^{\prime}(x)}{d x}=n(n-1) x^{n-2}\),
\(f^{\prime \prime \prime}(x)=\frac{d f^{\prime \prime}(x)}{d x}=n(n-1)(n-2) x^{n-3}\),
and so on until
\(f^{(k)}(x)=\frac{d f^{(k-1)}(x)}{d x}=n(n-1)(n-2) \cdots(n-(k-1)) x^{n-k}\),
and so on until
\(f^{(n)}(x)=\frac{d f^{(n-1)}(x)}{d x}=n(n-1)(n-2) \cdots(1) x^{0}\).
Note that \(n(n-1)(n-2) \cdots(1)=n!\) and \(x^{0}=1\) (assuming that \(x \neq 0\)).
This means that \(f^{(n)}(x)=n!\), which is a constant.
As such, we know that \(f^{(n+1)}(x)=\frac{d f^{(n)}(x)}{d x}=0\).
This means that \(f^{(n+j)}(x)=0\) for all \(j \in \mathbb{N}\).
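
The pattern in this example is a falling factorial times a power of \(x\). A minimal sketch, assuming non-negative integer \(k\) and using the made-up name `power_derivative`: `math.perm(n, k)` from the standard library returns \(n(n-1)\cdots(n-k+1)\).

```python
from math import factorial, perm


def power_derivative(n, k, x):
    """k-th derivative of x**n evaluated at x, for integers n >= 1 and k >= 0."""
    if k > n:
        return 0  # every derivative beyond the n-th vanishes
    return perm(n, k) * x ** (n - k)


n, x = 5, 2
for k in range(n + 2):
    print(f"k = {k}  f^({k})({x}) = {power_derivative(n, k, x)}")

# The n-th derivative is the constant n!.
print(power_derivative(n, n, x) == factorial(n))
```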

Taylor series#

Definition

The function \(f: X \to \mathbb{R}\) is said to be of differentiability class \(C^m\) if the derivatives \(f'\), \(f''\), \(\dots\), \(f^{(m)}\) exist and are continuous on \(X\).

Fact

Consider \(f: X \to \mathbb{R}\) and let \(f\) be a \(C^m\) function. Assume also that \(f^{(m+1)}\) exists, although it need not be continuous.

For any \(x,a \in X\) there exists \(z\) between \(a\) and \(x\) such that

\[\begin{split}\begin{array}{rl} f(x) =& f(a) + f'(a)(x-a) + \dots + \frac{f^{(m)}(a)}{m!}(x-a)^m + R_m(x) =\\ =& f(a) + \sum_{k=1}^m \frac{f^{(k)}(a)}{k!}(x-a)^k + R_m(x), \end{array}\end{split}\]

where the remainder is given by

\[R_m(x) = \frac{f^{(m+1)}(z)(x-a)^{m+1}}{(m+1)!} = o\big((x-a)^m\big)\]

Definition

Little-o notation is used to describe functions that approach zero faster than a given function:

\[f(x) = o\big(g(x)\big) \; \text{as} \; x \to a \quad \iff \quad \lim_{x \to a} \frac{f(x)}{g(x)} = 0\]

Loosely speaking, if \(f \colon \mathbb{R} \to \mathbb{R}\) is suitably differentiable at \(a\), then

\[ f(x) \approx f(a) + f'(a)(x-a) \]

for \(x\) very close to \(a\),

\[ f(x) \approx f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 \]

on a slightly wider interval, etc.

These are the 1st and 2nd order Taylor series approximations to \(f\) at \(a\), respectively.

As the order increases, we typically get a better approximation near \(a\).

_images/taylor_4.png — Fig. 27 4th order Taylor series for \(f(x) = \sin(x)/x\) at 0#

_images/taylor_6.png — Fig. 28 6th order Taylor series for \(f(x) = \sin(x)/x\) at 0#

_images/taylor_8.png — Fig. 29 8th order Taylor series for \(f(x) = \sin(x)/x\) at 0#

_images/taylor_10.png — Fig. 30 10th order Taylor series for \(f(x) = \sin(x)/x\) at 0#
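
The improvement with the order can also be checked numerically. The sketch below relies on the standard expansion \(\sin(x)/x = \sum_{k \geqslant 0} (-1)^k x^{2k} / (2k+1)!\) around \(0\) (with the value at \(0\) taken to be \(1\)); the function name `sinc_taylor` and the evaluation point \(x = 2\) are assumptions made for illustration.

```python
import math


def sinc_taylor(x, order):
    """Taylor polynomial of sin(x)/x at 0, keeping powers of x up to `order`."""
    return sum(
        (-1) ** k * x ** (2 * k) / math.factorial(2 * k + 1)
        for k in range(order // 2 + 1)
    )


x = 2.0
exact = math.sin(x) / x
errors = [abs(sinc_taylor(x, order) - exact) for order in (4, 6, 8, 10)]

for order, error in zip((4, 6, 8, 10), errors):
    print(f"order {order:>2}: absolute error = {error:.2e}")
```

At this point the error shrinks with every additional pair of orders, in line with the figures.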

Example

Consider the function \(f(x) = \ln(x)\) and let \(a=1\). Let’s approximate \(f(x)\) with a Taylor series at \(a=1\).

Note that \(f'(x) = 1/x\), \(f''(x) = -1/x^2\), \(f'''(x) = 2/x^3\).

Linear approximation:

\[ f(x) \approx f(a) + f'(a)(x-1) = f(1) + f'(1)(x-1) = 0 + 1(x-1) = x-1 \]

Quadratic approximation:

\[\begin{split} f(x) \approx f(a) + f'(a)(x-1) + \frac{f''(a)}{2}(x-1)^2 = \\ 0 + 1(x-1) - \frac{1}{2}(x-1)^2 = \\ x-1 - \frac{1}{2}(x^2 - 2x + 1) = \\ -\frac{1}{2}x^2 + 2x - \frac{3}{2} \end{split}\]

Cubic approximation:

\[\begin{split} f(x) \approx f(a) + f'(a)(x-1) + \frac{f''(a)}{2}(x-1)^2 + \frac{f'''(a)}{2 \cdot 3}(x-1)^3 = \\ 0 + 1(x-1) - \frac{1}{2}(x-1)^2 + \frac{1}{3}(x-1)^3 = \\ x-1 - \frac{1}{2}(x^2 - 2x + 1) + \frac{1}{3}(x^3 - 3x^2 + 3x - 1) = \\ \frac{1}{3}x^3 -\frac{3}{2}x^2 + 3x - \frac{11}{6} \end{split}\]

_images/c78d148a672a46eace2ccd02d1f7d36fb2a54291c9d2d5c47324daefa74fb782.png
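
The three polynomials from this example can be compared with \(\ln(x)\) directly. This is a small sketch using the expanded forms derived above; the function names and the evaluation points are illustrative choices.

```python
import math


def linear(x):
    return x - 1


def quadratic(x):
    return -0.5 * x ** 2 + 2 * x - 1.5


def cubic(x):
    return x ** 3 / 3 - 1.5 * x ** 2 + 3 * x - 11 / 6


for x in (1.0, 1.2, 1.5):
    exact = math.log(x)
    print(
        f"x = {x}: ln(x) = {exact:.5f}, "
        f"linear = {linear(x):.5f}, "
        f"quadratic = {quadratic(x):.5f}, "
        f"cubic = {cubic(x):.5f}"
    )
```

Close to \(a = 1\) each additional term brings the approximation nearer to \(\ln(x)\); further away the gap widens for all three.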
