📖 Sequences and convergence#

| words

References and additional materials
_images/shsc2016.png

[Sydsæter, Hammond, Strøm, and Carvajal, 2016]

Chapters 2.7, 5.5, 7.11

  • Numberphile video on the Euler’s constant \(e=2.718281828\) YouTube

  • 3Blue1Brown (Grant Sanderson) video on series and inventing new math YouTube

Norm and distance#

First, we have to understand how to measure distance between points in \(\mathbb{R}^n = \times_{i \in \{1,2,··· ,n\}} \mathbb{R} = \mathbb{R} \times \mathbb{R} \times \dots \times \mathbb{R}\)

Definition

The Euclidean norm of \(x \in \mathbb{R}^N\) is defined as

\[\| x \| = \left( \sum_{n=1}^N x_n^2 \right)^{1/2}\]

Interpretation:

  • \(\| x \|\) represents the length of \(x\)

_images/vec.png

Fig. 21 Length of red line \(= \sqrt{x_1^2 + x_2^2} =: \|x\|\)#

\(\| x - y \|\) represents distance between \(x\) and \(y\)

_images/vec_minus.png

Fig. 22 Length of red line \(= \|x - y\|\)#

Fact

For any \(\alpha \in \mathbb{R}\) and any \(x, y \in \mathbb{R}^N\), the following statements are true:

  • \(\| x \| \geq 0\) and \(\| x \| = 0\) if and only if \(x = 0\)

  • \(\| \alpha x \| = |\alpha| \| x \|\)

Triangle inequality

  • \(\| x + y \| \leq \| x \| + \| y \|\)

In fact, any function can be used as a norm, provided that the listed properties are satisfied

Example

More general distance function based on \(\|\cdot\|_\rho\) norm in \(\mathbb{R}^2\).

_images/metric.png

Fig. 23 Circle drawn with different norms#

Absolute value as Euclidean norm in \(\mathbb{R}\)#

Naturally, in \(\mathbb{R}\) Euclidean norm simplifies to

\[\| x \| = \sqrt{x^2} = |x|\]
Hide code cell source
import numpy as np
import matplotlib.pyplot as plt

def subplots():
    "Custom subplots with axes throught the origin"
    fig, ax = plt.subplots()

    # Set the axes through the origin
    for spine in ['left', 'bottom']:
        ax.spines[spine].set_position('zero')
    for spine in ['right', 'top']:
        ax.spines[spine].set_color('none')
    
    ax.grid()
    return fig, ax

fig, ax = subplots()

ax.set_ylim(-3, 3)
ax.set_yticks((-3, -2, -1, 1, 2, 3))
x = np.linspace(-3, 3, 100)
ax.plot(x, np.abs(x), 'g-', lw=2, alpha=0.7, label=r'$f(x) = |x|$')
ax.plot(x, x, 'k--', lw=2, alpha=0.7, label=r'$f(x) = x$')
ax.legend(loc='lower right')

plt.show()
_images/9804fa843f38bc049201a44aff430060de98efadea440963aef3f0107a157236.png

Therefore we can think of norm as a generalization of the absolute value to \(\mathbb{R}\)

Neighborhood of a point#

Fact

For any function \(h(x):\mathbb{R} \to \mathbb{R}\) an inequality of the form

\[|h(x)| \leqslant a\]

is equivalent to the double inequality

\[-a \leqslant h(x) \leqslant a\]

Example

\[|x - 3| < 2\]

This is equivalent to

\[\begin{split}\begin{cases} x-3 < 2 \\ x-3 > -2 \end{cases}\end{split}\]

Solving the first inequality we have \(x<5\) and solving the second we have \(x>1\). Therefore the solution is \((1,5)\).

Definition

The \(\epsilon\)-neighborhood of a point \(x \in \mathbb{R}\) is the set

\[ \{ y \in \mathbb{R} : |x-y| < \epsilon \} \]

\(\epsilon\)-Neighborhood of a point \(x \in \mathbb{R}\) are all the points that are closer to \(x\) than \(\epsilon\).

Sequences#

Definition

A sequence from a set \(X \subset \mathbb{R}\) is a function of the form \(f: \mathbb{N} \rightarrow X\).

It is often denoted by \(\{ x_n \}^\infty_{n=1}\) or \(\{ x_n \}_{n \in \mathbb{N}}\) or \(\{ x_1, x_2, \dots , x_n, \dots \}\), where \(x_n \in X\) for all \(n \in \mathbb{N}\).

In effect, it is indexing of the elements of \(X\) by the natural numbers.

Note that this definition of a sequence does not explicitly allow for finite sequences, but we can think about a finite sequence as being a truncated sequence, where we throw away all terms for which \(n > N\) for some \(N \in \mathbb{N}\). Such a sequence would be written as \(\{ x_n \} ^{N}_{n=1}\) or \(\{ x_1, x_2, \dots , x_{N} \}\).

Example

  • \(\{x_n\} = \{2, 4, 6, \ldots \}\)

  • \(\{x_n\} = \{1, 1/2, 1/4, \ldots \}\)

  • \(\{x_n\} = \{1, -1, 1, -1, \ldots \}\)

  • \(\{x_n\} = \{0, 0, 0, \ldots \}\)

Important notational comment

We have been looking at sequences drawn from some set \(X\). The \(n\)-th element of such a sequence was denoted by \(x_n\). This requires that \(x_n \in X\).

Do not confuse this \(x_n\) with the \(n\)-th coordinate of the point \(x = (x_1, x_2, \dots , x_{(n−1)}, x_n, x_{(n+1)}, \dots , x_L) \in \mathbb{R}^L\).

If we are considering sequences of points (or vectors) in \(\mathbb{R}^L\), then we might want to use superscripts to denote sequence position and subscripts to denote coordinates.

  • In this way, the \(k\)-th element of the sequence \(\{ x_n \}^\infty_{n=1}\) would be the point (or vector) \(x_k = (x^k_1, x^k_2, \dots , x^k_{(n−1)}, x^k_n, x^k_{(n+1)}, \dots , x^k_L)\). Note that the \(k\) in this case is an index, NOT a power.

Convergence and limit#

Let \(a \in \mathbb{R}\) and let \(\{x_n\}\) be a sequence

Suppose, for any \(\epsilon > 0\), we can find an \(N \in \mathbb{N}\) such that

\[| x_n -a | <\epsilon \text{ for all } n \geq N\]

Then \(\{x_n\}\) is said to converge to \(a\)

Convergence to \(a\) in symbols,

\[\forall \, \epsilon > 0, \; \exists \, N \in \mathbb{N} \; \text{ such that } n \geq N \implies | x_n -a | <\epsilon\]
  • expression \(| x_n -a | <\epsilon\) defines an \(\epsilon\)-neighborhood around \(a\)

The sequence \(\{x_n\}\) is eventually in this \(\epsilon\)-neighborhood around \(a\)

Hide code cell source
import matplotlib.pyplot as plt
import numpy as np

# from matplotlib import rc
# rc('font',**{'family':'serif','serif':['Palatino']})
# rc('text', usetex=True)

def fx(n):
    return 1 + 1/(n**(0.7))
def subplots(fs):
    "Custom subplots with axes throught the origin"
    fig, ax = plt.subplots(figsize=fs)
    # Set the axes through the origin
    for spine in ['left', 'bottom']:
        ax.spines[spine].set_position('zero')
    for spine in ['right', 'top']:
        ax.spines[spine].set_color('none')
    return fig, ax
def plot_seq(N,epsilon,a,fn):
    fig, ax = subplots((9, 5))  
    xmin, xmax = 0.5, N+1
    ax.set_xlim(xmin, xmax)
    ax.set_ylim(0, 2.1)
    n = np.arange(1, N+1)
    ax.set_xticks([])
    ax.plot(n, fn(n), 'ko', label=r'$x_n$', alpha=0.8)
    ax.hlines(a, xmin, xmax, color='k', lw=0.5, label='$a$')
    ax.hlines([a - epsilon, a + epsilon], xmin, xmax, color='k', lw=0.5, linestyles='dashed')
    ax.fill_between((xmin, xmax), a - epsilon, a + epsilon, facecolor='blue', alpha=0.1)
    ax.set_yticks((a - epsilon, a, a + epsilon))
    ax.set_yticklabels((r'$a - \epsilon$', r'$a$', r'$a + \epsilon$'))
    ax.legend(loc='upper right', frameon=False, fontsize=14)
    plt.show()

N = 50
a = 1
plot_seq(N,0.30,a,fx)
plot_seq(N,0.20,a,fx)
plot_seq(N,0.10,a,fx)
_images/182beae19cd8d71c4b57ff8e04ea4046ee7f2f4a0ac9608bca4199d1173067aa.png _images/4e03e7574787802ef46b89ce94228755a3371ec930cdee7a1094b9fb433fff48.png _images/0280dbf338a538bb413a939e41ca9b228ef99e8f0875cdcdd28bf87ec56a783d.png

Definition

The point \(a\) is called the limit of the sequence, denoted

\[x_n \to a \text{ as } n \to \infty \quad \text{ or } \quad \lim_{n \to \infty} x_n = a,\]

if

\[\forall \, \epsilon > 0, \; \exists \, N \in \mathbb{N} \; \text{ such that } n \geq N \implies |x_n - a|< \epsilon\]

Example

\(\{x_n\}\) defined by \(x_n = 1 + 1/n\) converges to \(1\):

\[x_n \to 1 \; \mathrm{as} \; n \to \infty \quad\text{or}\quad \lim_{n \to \infty} x_n = 1\]

To prove this we must show that \(\forall \, \epsilon > 0\), there is an \(N \in \mathbb{N}\) such that

\[n \geq N \implies |x_n - 1| < \epsilon\]

To show this formally we need to come up with an “algorithm”

  1. You give me any \(\epsilon > 0\)

  2. I respond with an \(N\) such that equation above holds

In general, as \(\epsilon\) shrinks, \(N\) will have to grow

Proof:

Here’s how to do this for the case \(1 + 1/n\) converges to \(1\)

First pick an arbitrary \(\epsilon > 0\)

Now we have to come up with an \(N\) such that

\[n \geq N \implies |1 + 1/n - 1| < \epsilon\]

Let \(N\) be the first integer greater than \( 1/\epsilon\)

Then

\[n \geq N \implies n > 1/\epsilon \implies 1/n < \epsilon \implies |1 + 1/n - 1| < \epsilon \]

Remark: Any \(N' > N\) would also work

Example

The sequence \(x_n = 2^{-n}\) converges to \(0\) as \(n \to \infty\)

Proof:

Must show that, \(\forall \, \epsilon > 0\), \(\exists \, N \in \mathbb{N}\) such that

\[n \geq N \implies |2^{-n} - 0| < \epsilon\]

So pick any \(\epsilon > 0\), and observe that

\[|2^{-n} - 0| < \epsilon \; \iff \; 2^{-n} < \epsilon \; \iff \; n > - \frac{\ln \epsilon}{\ln 2}\]

Hence we take \(N\) to be the first integer greater than \(- \ln \epsilon / \ln 2\)

Then

\[n \geq N \implies n > -\frac{\ln \epsilon}{\ln 2} \implies |2^{-n} - 0| < \epsilon\]

What if we want to show that \(x_n \to a\) fails?

To show convergence fails we need to show the negation of

\[\forall \,\; \epsilon > 0, \;\; \exists \,\; N \in \mathbb{N} \;\mathrm{such\;that}\; n \geq N \implies |x_n -a|<\epsilon\]

In words, there is an \(\epsilon > 0\) where we can’t find any such \(N\)

That is, for any choice of \(N\) there will be \(n>N\) such that \(x_n\) jumps to the outside \(\epsilon\)-neighborhood around \(a\)

In other words, there exists a \(\epsilon\) such that \(|x_n-a| \geqslant \epsilon\) again and again as \(n \to \infty\).

This is the kind of picture we’re thinking of

Hide code cell source
def fx2(n):
    return 1 + 1/(n**(0.7)) - 0.3 * (n % 8 == 0)

N = 80
a = 1
plot_seq(N,0.15,a,fx2)
_images/b4a39c3a65dfe77b9443f4f12ae1cb7900b98eba3aeb49c4d3f663426fd484a3.png

Example

The sequence \(x_n = (-1)^n\) does not converge to any \(a \in \mathbb{R}\)

Proof:

This is what we want to show

\[\exists \,\; \epsilon > 0 \;\text{ such that } |x_n-a| \geqslant \epsilon \text{ infinitely many times as } n \to \infty\]

Since it’s a “there exists”, we need to come up with such an \(\epsilon\)

Let’s try \(\epsilon = 0.5\), so that the \(\epsilon\)-neighborhood around \(a\) is

\[\{ x \in \mathbb{R} : |x - a| < 0.5 \} = (a-0.5, a+0.5 )\]

We have:

  • If \(n\) is odd then \(x_n = -1\) when \(n > N\) for any \(N\).

  • If \(n\) is even then \(x_n = 1\) when \(n > N\) for any \(N\).

Therefore even if \(a=1\) or \(a=-1\), \(\{x_n\}\) is not in \((a-0.5, a+0.5 )\) infinitely many times as \(n \to \infty\).

It holds for all other values of \(a \in \mathbb{R}\) as well

Cauchy sequences#

Informal definition: Cauchy sequences are those where \(|x_n - x_{n+1}|\) gets smaller and smaller

_images/cauchy.png

Example

Sequences generated by iterative methods for solving nonlinear equations often have this property

See code example

Definition

A sequence \(\{x_n\}\) is called Cauchy if

\[\forall \; \epsilon > 0, \;\; \exists \; N \in \mathbb{N} \; \mathrm{such\;that}\; \forall n, m \geqslant N \implies | x_n - x_m | < \epsilon\]

Alternatively

\[\forall \; \epsilon > 0, \;\; \exists \; N \in \mathbb{N} \; \mathrm{such\;that}\; \forall j \geqslant N \implies | x_n - x_{n+j} | < \epsilon\]

Cauchy sequences allow to establish convergence without finding the limit itself!

Fact

Every convergent sequence is Cauchy, and every Cauchy sequence is convergent.

Example

\(\{x_n\}\) defined by \(x_n = \alpha^n\) where \(\alpha \in (0, 1)\) is Cauchy

Proof:

For any \(n , j\) we have

\[|x_n - x_{n+j}| = |\alpha^n - \alpha^{n+j}| = \alpha^n |1 - \alpha^j| \leq \alpha^n\]

Fix \(\epsilon > 0\)

We can show that \(n > \log(\epsilon) / \log(\alpha) \implies \alpha^n < \epsilon\)

Hence any integer \(N > \log(\epsilon) / \log(\alpha)\) the sequence is Cauchy by definition.

Sub-sequences#

Let \(g: \mathbb{N} \rightarrow \mathbb{N}\) be an increasing mapping of the form \(g(k) = n_k\).

  • This means that \(g(k + 1) = n_{k+1} > n_k = g(k)\) for all \(k \in \mathbb{N}\).

The mapping \(f: \mathbb{N} \rightarrow X\) is a sequence from \(X\), and so is the mapping \(h: \mathbb{N} \rightarrow X\) given by \(h = f \circ g\)!

  • Note that \(h(k) = f(g(k)) = f(n_k)\).

  • \(f\) generates the sequence \(\{x_1, x_2, \dots , x_{n_1 − 1}, x_{n_1} , x_{n_1 + 1}, \dots , x_{n_2 − 1}, x_{n_2}, x_{n_2 + 1}, \dots \}\)

  • \(h\) generates the sequence \(\{x_{n_1} , x_{n_2} , \dots \}\)

Definition

The sequence \(\{x_{n_k} \}_{n_k \in \mathbb{N}}\) where \(k \in \mathbb{N}\) and \(n_{k+1} > n_k\) for all \(k\), is referred to as a sub-sequence of the sequence \(\{ x_n \}_{n \in \mathbb{N}}\)

Note that every term in the sequence \(\{ x_{n_k} \}_{n_k \in \mathbb{N}}\) is found somewhere in the sequence \(\{ x_n \}_{n \in \mathbb{N}}\)

Note also that the (relative) order in which the terms appear is the same for each of the sequences (due to the fact that \(n_{k+1} > n_k\) for all \(k\))

Example

Consider the sequence \(\{ x_n \}_{n \in \mathbb{N}}\) where \(x_n = \frac{1}{n}\).

  • The mapping for this sequence is \(f(n) = \frac{1}{n}\).

  • The sequence looks like \(\left\{1, \frac{1}{2}, \frac{1}{3} , \dots , \frac{1}{n}, \dots \right\}\).

Consider the increasing mapping \(g: \mathbb{N} \rightarrow \mathbb{N}\) defined by \(g(k) = k^2\).

  • This, along with \(f\), generates the sub-sequence mapping

\[ h(k) = f(g(k)) = f(k^2) = \frac{1}{k^2}\]
  • The associated sub-sequence is

\[ \left\{ \frac{1}{k^2} \right\}_{k \in \mathbb{N}} = \left\{ 1, \frac{1}{4}, \frac{1}{9}, \dots , \frac{1}{n^2} , \dots \right\}\]

Consider the increasing map \(g: \mathbb{N} \rightarrow \mathbb{N}\) defined by \(g(k) = 2^k\).

  • This, along with \(f\), generates the sub-sequence map

\[h(k) = f(g(k)) = f(2^k) = \frac{1}{2^k}\]
  • The associated sub-sequence is

\[\left\{ \frac{1}{2^k} \right\}_{k \in \mathbb{N}} = \left\{ \frac{1}{2}, \frac{1}{4}, \frac{1}{8} , \dots , \frac{1}{2^n}, \dots \right\} \]

Fact

If \(\{ x_n \}\) converges to \(x\) in \(\mathbb{R}\), then every subsequence of \(\{x_n\}\) also converges to \(x\)

Fact

If two different subsequences of a sequence converge to different limits, the original sequence is not convergent.

Example

  • The sequence \(\left\{ \frac{1}{n} \right\}_{n \in \mathbb{N}}\) converges to the point \(x = 0\).

  • The sequence \(\left\{ \frac{1}{n^2} \right\}_{n \in \mathbb{N}}\) converges to the point \(x = 0\).

    • Recall that \(\left\{ \frac{1}{n^2} \right\}_{n \in \mathbb{N}}\) is a sub-sequence of \(\left\{ \frac{1}{n} \right\}_{n \in \mathbb{N}}\)

  • The sequence \(\left\{ \frac{1}{2^n} \right\}_{n \in \mathbb{N}}\) converges to the point \(x = 0\).

    • Recall that \(\left\{ \frac{1}{2^n} \right\}_{n \in \mathbb{N}}\) is a sub-sequence of \(\left\{ \frac{1}{n} \right\}_{n \in \mathbb{N}}\)

  • The sequence \(\left\{ \frac{(−1)^n}{n} \right\}_{n \in \mathbb{N}} = \{ −1, \frac{1}{2}, \frac{−1}{3} , \dots \}\) converges to the point \(x = 0\).

  • The sequence \(\left\{ \frac{n}{n + 1} \right\}_{n \in \mathbb{N}} = \{ \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \dots \}\) converges to the point \(x = 1\).

  • The sequence \(\{ (−1)^n \}_{n \in \mathbb{N}} = \{ −1, 1, −1, 1, \dots \}\) does not converge.

  • The sequence \(\{ n \}_{n \in \mathbb{N}}\) does not converge.

  • The sequence \(\left\{ ( \frac{1}{n} , \frac{(n−1)}{n} ) \right\}_{n \in \mathbb{N}}\) converges to \((0, 1)\).

  • The sequence \(\left\{ ( \frac{(−1)^n}{n}, \frac{(−1)^n}{n} ) \right\}_{n \in \mathbb{N}}\) converges to \((0, 0)\).

  • The sequence \(\{ ((−1)^n, (−1)^n) \}_{n \in \mathbb{N}}\) does not converge.

  • The sequence \(\{ (n, n) \}_{n \in \mathbb{N}}\) does not converge.

Properties of limits#

Fact

  1. \(x_n \to a\) in \(\mathbb{R}\) if and only if \(|x_n - a| \to 0\) in \(\mathbb{R}\)

  2. If \(x_n \to x\) and \(y_n \to y\) then \(x_n + y_n \to x + y\)

  3. If \(x_n \to x\) and \(\alpha \in \mathbb{R}\) then \(\alpha x_n \to \alpha x\)

  4. If \(x_n \to x\) and \(y_n \to y\) then \(x_n y_n \to xy\)

  5. If \(x_n \to x\) and \(y_n \to y\) then \(x_n / y_n \to x/y\), provided \(y_n \ne 0\), \(y \ne 0\)

  6. If \(x_n \to x\) then \(x_n^p \to x^p\)

Fact

Each sequence in \(\mathbb{R}\) has at most one limit

Euler’s constant \(e\)#

Euler’s constant, which is usually denoted by \(e\), is an irrational number that is defined as the limit of a particular sequence of rational numbers.

Consider the sequence

\[ \{ x_n \}_{n \in \mathbb{N}} = \left\{ \left(1 + \frac{1}{n} \right)^n \right\}_{n \in \mathbb{N}} \]

This sequence converges to Euler’s constant. Euler’s constant is defined to be

\[ e = \lim_{n \rightarrow \infty} \left(1 + \frac{1}{n} \right)^n \]

Computing Euler’s constant as a limit of a sequence

e0 = 2.71828182845904523536028747135266
f = lambda n: (1+1/n)**n
print('Approximation of Euler constant')
for i in range(0,10100,500):
    e = f(i+1)
    print('iter = {:5d}: e = {:14.8f} err = {:14.10f}'.format(i,e,e0-e))
Approximation of Euler constant
iter =     0: e =     2.00000000 err =   0.7182818285
iter =   500: e =     2.71557393 err =   0.0027079019
iter =  1000: e =     2.71692529 err =   0.0013565409
iter =  1500: e =     2.71737689 err =   0.0009049376
iter =  2000: e =     2.71760291 err =   0.0006789198
iter =  2500: e =     2.71773859 err =   0.0005432399
iter =  3000: e =     2.71782907 err =   0.0004527577
iter =  3500: e =     2.71789372 err =   0.0003881134
iter =  4000: e =     2.71794221 err =   0.0003396225
iter =  4500: e =     2.71797993 err =   0.0003019027
iter =  5000: e =     2.71801010 err =   0.0002717240
iter =  5500: e =     2.71803480 err =   0.0002470304
iter =  6000: e =     2.71805538 err =   0.0002264511
iter =  6500: e =     2.71807279 err =   0.0002090370
iter =  7000: e =     2.71808772 err =   0.0001941098
iter =  7500: e =     2.71810066 err =   0.0001811725
iter =  8000: e =     2.71811198 err =   0.0001698519
iter =  8500: e =     2.71812197 err =   0.0001598629
iter =  9000: e =     2.71813084 err =   0.0001509835
iter =  9500: e =     2.71813879 err =   0.0001430386
iter = 10000: e =     2.71814594 err =   0.0001358880
  • See an excellent video about the Euler constant by Numberphile YouTube