📖 Multivariate differential calculus#


References and additional materials

WARNING

This section of the lecture notes is still under construction. It will be ready before the lecture.

  • directional derivatives

  • gradient of a function

  • total derivative

  • second and higher order derivatives

  • Young theorem

  • Hessian matrix

  • Taylor series for multivariate function

  • tangent plane and linear approximation

  • examples, examples, economic applications

Second-order partial derivatives#

Consider the single-real-valued multivariate function \(f\left(x_1,x_2,\dots,x_n \right)\). Recall that if the first-order partial derivative exists, it is itself a single-real-valued multivariate function. The second-order partial derivative is the partial derivative of the first-order partial derivative

\[ \frac{\partial^{2} f\left(x_{1}, x_{2}, \dots, x_{n}\right)}{\partial x_{j} \partial x_{i}}=\frac{\partial}{\partial x_{j}}\left(\frac{\partial f\left(x_{1}, x_{2}, \dots, x_{n}\right)}{\partial x_{i}}\right), \]

where \(x_i\) is the input \(f\) is differentiated with respect to, and \(x_j\) is the input the first-order partial derivative is differentiated with respect to.

The second-order partial derivative is commonly denoted

\[ \frac{\partial^{2} f\left(x_{1}, x_{2}, \dots, x_{n}\right)}{\partial x_{j} \partial x_{i}}=f_{i j}\left(x_{1}, x_{2}, \dots, x_{n}\right) \]

Note again that if the second-order partial derivative exists, it is also a single-real-valued multivariate function.

Definition

The Hessian is defined as the matrix of second-order partial derivatives of \(f\)

\[\begin{split} \nabla^2 f(\mathbf{x}) = \left(\begin{array}{cccc} \frac{\partial^{2} f(\mathbf{x})}{\partial x_{1}^{2}} & \frac{\partial^{2} f(\mathbf{x})}{\partial x_{2} \partial x_{1}} & \cdots & \frac{\partial^{2} f(\mathbf{x})}{\partial x_{n} \partial x_{1}} \\ \frac{\partial^{2} f(\mathbf{x})}{\partial x_{1} \partial x_{2}} & \frac{\partial^{2} f(\mathbf{x})}{\partial x_{2}^{2}} & \cdots & \frac{\partial^{2} f(\mathbf{x})}{\partial x_{n} \partial x_{2}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^{2} f(\mathbf{x})}{\partial x_{1} \partial x_{n}} & \frac{\partial^{2} f(\mathbf{x})}{\partial x_{2} \partial x_{n}} & \cdots & \frac{\partial^{2} f(\mathbf{x})}{\partial x_{n}^{2}} \end{array}\right) \end{split}\]

The Hessian is commonly denoted as \(H(\mathbf{x})\), \(\nabla^2 f(\mathbf{x})\), \(D_{xx^T}f(\mathbf{x})\), or \(\operatorname{hess} f(\mathbf{x})\).

The elements on the diagonal of the Hessian are referred to as second-order own partial derivatives, and the elements off the diagonal are referred to as second-order cross partial derivatives.
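The Hessian of a concrete function can be computed symbolically. The sketch below uses sympy, with an example function of our own choosing, to build the matrix of second-order partials exactly as in the definition above.

```python
# Build the Hessian of an example function with sympy (a sketch;
# the function f below is chosen purely for illustration).
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 * x2 + sp.exp(x1 * x2)

# sp.hessian returns the matrix of second-order partial derivatives.
H = sp.hessian(f, (x1, x2))

print(H)
```

The diagonal entries are the second-order own partial derivatives, and the off-diagonal entries are the cross partials.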

Definition

Some terminology:

  • If \(f\) is twice-differentiable at every point \(x \in X\), then \(f\) is said to be twice-differentiable on \(X\).

  • If \(f\) is twice-differentiable on \(X\) and \(\frac{\partial^{2} f}{\partial x_{j} \partial x_{i}}\) is a continuous function on \(X\) for all combinations of \(i\) and \(j\), then \(f\) is said to be twice continuously differentiable on \(X\). This is denoted by \(f \in C^{2}\) on \(X\).

It is possible to extend the process of differentiation for multivariate functions to orders higher than the second. Doing so for partial derivatives is relatively straightforward. However, this will not be done in this course.

Hessian of the Cobb-Douglas production function

Suppose that we have a Cobb-Douglas production function:

\[ f(L, K)=A L^{\alpha} K^{\beta} . \]

From earlier we know the partial derivatives of the Cobb-Douglas production function that constitute the gradient

\[\begin{split} \nabla f(\mathbf{x}) = \left(\begin{array}{c} \tfrac{\partial f(L,K)}{\partial L} \\ \tfrac{\partial f(L,K)}{\partial K} \end{array}\right) = \left(\begin{array}{c} \alpha A L^{\alpha-1} K^{\beta} \\ \beta A L^{\alpha} K^{\beta-1} \end{array}\right). \end{split}\]

Take the partial derivatives of the marginal product of labor

\[\begin{split} \tfrac{\partial^2 f(L,K)}{\partial L \partial L} &= \tfrac{\partial}{\partial L} \left( \tfrac{\partial f(L,K)}{\partial L} \right) \\ &= \tfrac{\partial}{\partial L} \left( \alpha A L^{\alpha-1} K^{\beta}\right) \\ &= \alpha (\alpha-1) A L^{\alpha-2} K^{\beta} \\ \tfrac{\partial^2 f(L,K)}{\partial L \partial K} &= \tfrac{\partial}{\partial K} \left( \tfrac{\partial f(L,K)}{\partial L} \right) \\ &= \tfrac{\partial}{\partial K} \left( \alpha A L^{\alpha-1} K^{\beta}\right) \\ &= \alpha \beta A L^{\alpha-1} K^{\beta-1} \end{split}\]

Take the partial derivatives of the marginal product of capital

\[\begin{split} \tfrac{\partial^2 f(L,K)}{\partial K \partial L} &= \tfrac{\partial}{\partial L} \left( \tfrac{\partial f(L,K)}{\partial K} \right) \\ &= \tfrac{\partial}{\partial L} \left( \beta A L^{\alpha} K^{\beta-1}\right) \\ &= \alpha \beta A L^{\alpha-1} K^{\beta-1} \\ \tfrac{\partial^2 f(L,K)}{\partial K \partial K} &= \tfrac{\partial}{\partial K} \left( \tfrac{\partial f(L,K)}{\partial K} \right) \\ &= \tfrac{\partial}{\partial K} \left( \beta A L^{\alpha} K^{\beta-1}\right) \\ &= \beta (\beta - 1) A L^{\alpha} K^{\beta-2} \end{split}\]

We can now set up the Hessian of the Cobb-Douglas production function

\[\begin{split} \nabla^2 f(\mathbf{x}) = \left(\begin{array}{cc} \tfrac{\partial^2 f(L,K)}{\partial L \partial L} & \tfrac{\partial^2 f(L,K)}{\partial L \partial K} \\ \tfrac{\partial^2 f(L,K)}{\partial K \partial L} & \tfrac{\partial^2 f(L,K)}{\partial K \partial K} \end{array} \right) = \left(\begin{array}{cc} \alpha (\alpha-1) A L^{\alpha-2} K^{\beta} & \alpha \beta A L^{\alpha-1} K^{\beta-1} \\ \alpha \beta A L^{\alpha-1} K^{\beta-1} & \beta (\beta-1) A L^{\alpha} K^{\beta-2} \end{array} \right). \end{split}\]

Note that the Hessian is symmetric.

It turns out that this symmetry is no coincidence: the second-order cross partial derivatives are equal whenever \(f\) is sufficiently smooth. This is known as Young’s theorem.
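The derivation above can be double-checked symbolically. The sketch below recomputes the Cobb-Douglas Hessian with sympy, treating \(A\), \(\alpha\) and \(\beta\) as positive parameters.

```python
# Verify the Cobb-Douglas Hessian derived above (a sympy sketch;
# A, alpha, beta are treated as positive symbolic parameters).
import sympy as sp

L, K, A, alpha, beta = sp.symbols('L K A alpha beta', positive=True)
f = A * L**alpha * K**beta

H = sp.hessian(f, (L, K))

# The cross partial matches the hand derivation...
expected_cross = alpha * beta * A * L**(alpha - 1) * K**(beta - 1)
assert sp.simplify(H[0, 1] - expected_cross) == 0
# ...and the Hessian is symmetric.
assert sp.simplify(H[0, 1] - H[1, 0]) == 0
```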

Fact (Young’s theorem)

Suppose that \(f(x_1,x_2,\dots,x_n)\) is \(C^2\) on its domain \(X\). Then, for each pair of indices \(i\), \(j\),

\[ \frac{\partial^2 f (\mathbf{x})}{\partial x_i \partial x_j} = \frac{\partial^2 f (\mathbf{x})}{\partial x_j \partial x_i}. \]

Young’s theorem states that the order of differentiation does not matter for a \(C^2\) function. The result is trivially true when \(i=j\). This is the case of second-order own partial derivatives.

Total differentiation#

When calculating the partial derivative of a function \(f\) with respect to \(x_i\), we only allow \(x_i\) to vary and keep all other variables constant. In contrast, when calculating the total derivative we allow all independent variables to vary

\[ df = \sum_{i=1}^n \frac{\partial f(\mathbf{x})}{\partial x_i}dx_i. \]

\(df\) is the total derivative of \(f\) with respect to the change \(\mathbf{dx}=(dx_1,dx_2,\dots,dx_n)\). Unlike the partial derivative, the total derivative does not restrict the analysis to be local, i.e. it does not require \(dx_i \rightarrow 0\).

Example I

Consider the bivariate function \(y=f(x_1,x_2)=a x_1 + b x_2\). The total derivative is then

\[ df=a dx_1 + b dx_2. \]

Example II

Consider the bivariate function \(y=f(x_1,x_2)=x_1 x_2\). The total derivative is then

\[ df=x_2 dx_1 + x_1 dx_2. \]

The total derivative, \(df\), can be thought of as a linear approximation of the change in \(f\) due to the change \(\mathbf{dx}\).

\[ df \approx f(\mathbf{x} + \mathbf{dx}) - f(\mathbf{x}). \]

Hence, we can use total differentiation to approximate \(f\) around the point \(\mathbf{x}^0\)

\[ f(\mathbf{x}^0 + \mathbf{dx}) \approx f(\mathbf{x}^0) + \sum_{i=1}^n \frac{\partial f(\mathbf{x}^0)}{\partial x_i}dx_i. \]

This is an example of a first-order Taylor approximation.
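The quality of the linear approximation is easy to check numerically. The sketch below uses the function \(f(x_1,x_2)=x_1 x_2\) from Example II, with a point \(\mathbf{x}^0\) and change \(\mathbf{dx}\) of our own choosing.

```python
# Numerical illustration of the first-order Taylor approximation for
# f(x1, x2) = x1 * x2 around x0 = (2, 3) (the point and the change dx
# are chosen purely for this example).
def f(x1, x2):
    return x1 * x2

x0 = (2.0, 3.0)
dx = (0.1, -0.05)

# Partial derivatives at x0: df/dx1 = x2 = 3, df/dx2 = x1 = 2.
df = 3.0 * dx[0] + 2.0 * dx[1]            # total differential
approx = f(*x0) + df                      # linear approximation
exact = f(x0[0] + dx[0], x0[1] + dx[1])   # true value

print(approx, exact)  # close for a small change dx
```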

The chain rule#

Many economic models involve composite functions. These are functions of one or several variables in which the variables are themselves functions of other basic variables.

E.g. many models of economic growth regard production as a function of capital and labor, both of which are functions of time. For these models, we can apply the chain rule to analyze how production changes over time due to changes in labor and capital.

Fact (chain rule I)

When \(z=f(x_1,x_2,\dots,x_n)\) with \(x_i=g_i(t)\) for every \(i\), then

\[ \frac{dz}{dt} = \sum_{i=1}^n \frac{\partial z}{\partial x_i} \frac{dx_i}{dt} \]

As every variable, \(x_i\), depends on the basic variable, \(t\), a small change in \(t\) sets off a chain reaction. The sum of the individual contributions is called the total derivative and is denoted \(dz/dt\).

Calculating the growth rate of the production

The production of the economy is given by the Cobb-Douglas production function \(y=f(L(t),K(t))\) where the labor and capital inputs are both functions of time.

Labor and capital are accumulated at constant growth rates

\[\begin{split} \dot{L} \equiv \frac{dL}{dt} &= g_L L(t), \\ \dot{K} \equiv \frac{dK}{dt} &= g_K K(t). \end{split}\]

Use the chain rule to calculate how production changes over time

\[\begin{split} \frac{dy}{dt} &= \frac{\partial f(L,K)}{\partial L} \frac{dL}{dt} + \frac{\partial f(L,K)}{\partial K} \frac{dK}{dt} \\ &= \alpha A L^{\alpha-1} K^{\beta} g_L L + \beta A L^{\alpha} K^{\beta-1} g_K K \\ &= \alpha A L^{\alpha} K^{\beta} g_L + \beta A L^{\alpha} K^{\beta} g_K \\ &= (\alpha g_L + \beta g_K) y. \end{split}\]

Divide both sides by production, \(y\)

\[ \frac{\dot{y}}{y} = \alpha g_L + \beta g_K. \]

The growth rate of the economy is constant and is given by the dot product of the output elasticities and the growth rates of the production inputs.
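The growth-rate derivation can be verified symbolically. The sketch below specifies labor and capital as exponential paths with constant growth rates (initial levels \(L_0\), \(K_0\) are introduced for the example) and differentiates with respect to time.

```python
# Symbolic check of the growth-rate result (a sketch; the exponential
# paths and initial levels L0, K0 are our own choice, consistent with
# constant growth rates g_L and g_K).
import sympy as sp

t, A, alpha, beta, gL, gK, L0, K0 = sp.symbols(
    't A alpha beta g_L g_K L0 K0', positive=True)

L = L0 * sp.exp(gL * t)   # dL/dt = g_L * L
K = K0 * sp.exp(gK * t)   # dK/dt = g_K * K
y = A * L**alpha * K**beta

growth = sp.simplify(sp.diff(y, t) / y)
assert sp.simplify(growth - (alpha * gL + beta * gK)) == 0
```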

The chain rule can be generalized by allowing the variables, \(x_i\), to be a function of more than one basic variable.

Fact (chain rule II)

When \(z=f(x_1,x_2,\dots,x_n)\) with \(x_i=g_i(t_1,t_2,...,t_m)\) for every \(i\), then

\[ \frac{\partial z}{\partial t_j} = \sum_{i=1}^n \frac{\partial z}{\partial x_i} \frac{\partial x_i}{\partial t_j} \]

for each \(j=1,2,\dots,m\)
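Chain rule II can also be illustrated symbolically. The sketch below uses \(z=x_1 x_2\) with \(x_1=t_1+t_2\) and \(x_2=t_1 t_2\), all chosen purely for this example, and checks the direct derivative against the chain-rule sum.

```python
# A small symbolic check of chain rule II (sketch; the composite
# function below is an example of our own choosing).
import sympy as sp

t1, t2 = sp.symbols('t1 t2')
x1 = t1 + t2
x2 = t1 * t2
z = x1 * x2

# Direct partial derivative of the composite function...
direct = sp.expand(sp.diff(z, t1))

# ...equals the chain-rule sum: (dz/dx1)(dx1/dt1) + (dz/dx2)(dx2/dt1),
# with dz/dx1 = x2, dx1/dt1 = 1, dz/dx2 = x1, dx2/dt1 = t2.
chain = sp.expand(x2 * 1 + x1 * t2)
assert sp.simplify(direct - chain) == 0
```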

Exercise#

Calculate the partial derivatives and elasticities of the CES production function

The CES production function is given by:

\[ Y = A \left( aL^r + bK^r \right)^{\frac{s}{r}}. \]

The first-order partial derivative with respect to labor is:

\[\begin{split} \frac{\partial Y}{\partial L} &= \tfrac{s}{r} A \left( aL^r + bK^r \right)^{\frac{s}{r} - 1} arL^{r-1} \\ &= \frac{saL^{r}}{aL^r + bK^r} \frac{Y}{L} \\ \end{split}\]

The first-order partial derivative with respect to capital is:

\[\begin{split} \frac{\partial Y}{\partial K} &= \tfrac{s}{r} A \left( aL^r + bK^r \right)^{\frac{s}{r} - 1} brK^{r-1} \\ &= \frac{sbK^{r}}{aL^r + bK^r} \frac{Y}{K} \\ \end{split}\]

The elasticities of output with respect to labor and capital are:

\[\begin{split} \varepsilon_L &= \frac{\partial Y}{\partial L} \frac{L}{Y} = \frac{saL^{r}}{aL^r + bK^r} \\ \varepsilon_K &= \frac{\partial Y}{\partial K} \frac{K}{Y} = \frac{sbK^{r}}{aL^r + bK^r} \end{split}\]
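The CES derivatives and elasticities can be checked symbolically as well. The sketch below follows the parameter names in the text and verifies that the two elasticities sum to \(s\), as the displayed expressions imply.

```python
# Check the CES partial derivatives and elasticities (a sympy sketch;
# all parameters are treated as positive symbols, following the text).
import sympy as sp

L, K, A, a, b, r, s = sp.symbols('L K A a b r s', positive=True)
Y = A * (a * L**r + b * K**r)**(s / r)

eps_L = sp.simplify(sp.diff(Y, L) * L / Y)
eps_K = sp.simplify(sp.diff(Y, K) * K / Y)

assert sp.simplify(eps_L - s * a * L**r / (a * L**r + b * K**r)) == 0
assert sp.simplify(eps_K - s * b * K**r / (a * L**r + b * K**r)) == 0
# The elasticities sum to s, the returns-to-scale parameter.
assert sp.simplify(eps_L + eps_K - s) == 0
```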