# Multivariable chain rule, simple version

The chain rule for derivatives can be extended to higher dimensions. Here we see what that looks like in the relatively simple case where the composition is a single-variable function.

## What we're building to

• Given a multivariable function $f\left(x,y\right)$, and two single-variable functions $x\left(t\right)$ and $y\left(t\right)$, here's what the multivariable chain rule says:
$\underbrace{\frac{d}{dt}f\left(x\left(t\right),y\left(t\right)\right)}_{\text{Derivative of composition function}}=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}$
• Written with vector notation, where $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)=\left[\begin{array}{c}x\left(t\right)\\ y\left(t\right)\end{array}\right]$, this rule has a very elegant form in terms of the gradient of $f$ and the vector-derivative of $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)$.
$\underbrace{\frac{d}{dt}f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)}_{\text{Derivative of composition function}}=\overbrace{\mathrm{\nabla }f\cdot {\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left(t\right)}^{\text{Dot product of vectors}}$

## A more general chain rule

As you can probably guess, the multivariable chain rule generalizes the chain rule from single-variable calculus:
$\frac{d}{dt}f\left(g\left(t\right)\right)=\frac{df}{dg}\frac{dg}{dt}={f}^{\prime }\left(g\left(t\right)\right){g}^{\prime }\left(t\right)$
What if instead of taking in a one-dimensional input, $t$, the function $f$ took in a two-dimensional input, $\left(x,y\right)$?
Well, in that case, it wouldn't make sense to compose it with a scalar-valued function $g\left(t\right)$. Instead, let's say there are two separate scalar-valued functions $x\left(t\right)$ and $y\left(t\right)$, and we plug these in as the coordinates of $f$. The overall composition will be a single-variable function, with a single-number input $t$ and a single-number output $f\left(x\left(t\right),y\left(t\right)\right)$.
There is still a chain rule that lets you compute the derivative of this new single-variable function $f\left(x\left(t\right),y\left(t\right)\right)$, and it involves the partial derivatives of $f$:
$\frac{d}{dt}f\left(x\left(t\right),y\left(t\right)\right)=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}$
Keep in mind, an expression like $\frac{\partial f}{\partial x}\frac{dx}{dt}$ is shorthand for
$\frac{\partial f}{\partial x}\left(x\left(t\right),y\left(t\right)\right)\frac{dx}{dt}\left(t\right)$
That is, both are functions of $t$, but $\frac{\partial f}{\partial x}$ is evaluated via the intermediate functions $x\left(t\right)$ and $y\left(t\right)$.
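To see the rule in action, here is a minimal numerical sketch. The functions `f`, `x_of`, and `y_of` are illustrative choices and not part of the original article. The sketch evaluates the partial derivatives at $\left(x\left(t\right),y\left(t\right)\right)$, as the shorthand above requires, and compares the result with a finite-difference estimate of the derivative of the composition.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(x, y):
    return x * x * y + math.sin(y)

def df_dx(x, y):
    return 2 * x * y              # partial derivative of f with respect to x

def df_dy(x, y):
    return x * x + math.cos(y)    # partial derivative of f with respect to y

def x_of(t):
    return t ** 2

def y_of(t):
    return math.exp(-t)

def dx_dt(t):
    return 2 * t

def dy_dt(t):
    return -math.exp(-t)

def chain_rule_derivative(t):
    # Partials are evaluated at the intermediate point (x(t), y(t)).
    x, y = x_of(t), y_of(t)
    return df_dx(x, y) * dx_dt(t) + df_dy(x, y) * dy_dt(t)

def finite_difference_derivative(t, h=1e-6):
    # Central difference applied directly to the composition t -> f(x(t), y(t)).
    forward = f(x_of(t + h), y_of(t + h))
    backward = f(x_of(t - h), y_of(t - h))
    return (forward - backward) / (2 * h)

if __name__ == "__main__":
    t = 0.7
    print(chain_rule_derivative(t), finite_difference_derivative(t))
```

The two printed numbers should agree to several decimal places; the small gap that remains comes from the finite step size `h`, not from the chain rule.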

## Written with vector notation

Rather than thinking of $x\left(t\right)$ and $y\left(t\right)$ as being separate functions, it's common to package them together into a single, vector-valued function:
$\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)=\left[\begin{array}{c}x\left(t\right)\\ y\left(t\right)\end{array}\right]$
Then instead of writing the composition as $f\left(x\left(t\right),y\left(t\right)\right)$, you can write it as $f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)$.
With this notation, the multivariable chain rule can be written more compactly as a dot product between the gradient of $f$ and the vector-derivative of $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)$:
$\begin{array}{rl}\frac{d}{dt}f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)& =\underbrace{\frac{\partial f}{\partial x}\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)\frac{dx}{dt}+\frac{\partial f}{\partial y}\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)\frac{dy}{dt}}_{\text{Rewrite this sum as a dot product}}\\ \\ & =\underbrace{\left[\begin{array}{c}\frac{\partial f}{\partial x}\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)\\ \\ \frac{\partial f}{\partial y}\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)\end{array}\right]}_{\mathrm{\nabla }f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)}\cdot \underbrace{\left[\begin{array}{c}\frac{dx}{dt}\\ \\ \frac{dy}{dt}\end{array}\right]}_{{\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left(t\right)}\\ \\ & =\mathrm{\nabla }f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)\cdot {\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left(t\right)\end{array}$
Written like this, the analogy with the single-variable derivative is clearer.
$\begin{array}{r}\frac{d}{dt}f\left(g\left(t\right)\right)={f}^{\prime }\left(g\left(t\right)\right){g}^{\prime }\left(t\right)=\frac{df}{dg}\cdot \frac{dg}{dt}\end{array}$
The gradient $\mathrm{\nabla }f$ plays the role of the derivative of $f$, and the vector derivative ${\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left(t\right)$ plays the role of the ordinary derivative of $g$.
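The dot-product form translates almost word for word into code. In the sketch below, the function $f\left(x,y\right)=x{y}^{2}$ and the path $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)=\left(\mathrm{cos}\left(t\right),t\right)$ are assumptions made for illustration, and `dot` is a small helper written here rather than a library call.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical f(x, y) = x * y**2.
    return (y ** 2, 2 * x * y)

def v(t):
    # Hypothetical path: x(t) = cos(t), y(t) = t.
    return (math.cos(t), t)

def v_prime(t):
    # Component-wise derivative of v(t).
    return (-math.sin(t), 1.0)

def dot(a, b):
    # Plain dot product of two equal-length tuples.
    return sum(ai * bi for ai, bi in zip(a, b))

def d_dt_f_of_v(t):
    # Chain rule in vector form: gradient at v(t), dotted with v'(t).
    return dot(grad_f(*v(t)), v_prime(t))

if __name__ == "__main__":
    print(d_dt_f_of_v(1.3))
```

For this choice the composition is $\mathrm{cos}\left(t\right){t}^{2}$, whose ordinary derivative $-\mathrm{sin}\left(t\right){t}^{2}+2t\mathrm{cos}\left(t\right)$ matches what `d_dt_f_of_v` computes.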

## Intuition for why the chain rule works

As a warm up, consider the single variable chain rule for a composition like $f\left(g\left(t\right)\right)$. Here's how I like to understand that composition:
• First, $g$ maps a point $t$ on the number line to another point $g\left(t\right)$ on the number line.
• Then $f$ comes in and maps the point $g\left(t\right)$ to yet another point on the number line, $f\left(g\left(t\right)\right)$.
Understanding the derivative of $f\left(g\left(t\right)\right)$ requires understanding how a tiny change in $t$ changes the final output.
So let's dive into what the chain rule is really saying.
$\frac{d}{dt}f\left(g\left(t\right)\right)=\frac{df}{dg}\cdot \frac{dg}{dt}$
• The term $\frac{dg}{dt}$ represents how a tiny change in $t$ influences the intermediate output, $g\left(t\right)$.
• The term $\frac{df}{dg}$ represents how a tiny change in $g$ influences the final output $f\left(g\left(t\right)\right)$.
• The total change in $f$ due to a small change in $t$ is then the product of both these influences.
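For a concrete instance (an illustrative choice, not taken from the original article), let $f\left(g\right)={g}^{3}$ and $g\left(t\right)=\mathrm{sin}\left(t\right)$:
$\frac{d}{dt}{\mathrm{sin}}^{3}\left(t\right)=\underbrace{3{\mathrm{sin}}^{2}\left(t\right)}_{\frac{df}{dg}}\cdot \underbrace{\mathrm{cos}\left(t\right)}_{\frac{dg}{dt}}$
A nudge $dt$ moves $g$ by about $\mathrm{cos}\left(t\right)dt$, and that nudge to $g$ moves $f$ by about $3{g}^{2}$ times as much.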

## Extend this intuition to more dimensions

The intuition is similar for the multivariable chain rule. You can think of $\stackrel{\to }{\mathbf{\text{v}}}$ as mapping a point on the number line to a point on the $xy$-plane, and $f$ as mapping that point back down to some place on the number line. The question is, how does a small change in the initial input $t$ change the total output $f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)$?
Let's break down what the multivariable chain rule is saying, spelling it out in terms of the component functions $x\left(t\right)$ and $y\left(t\right)$:
$\frac{d}{dt}f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)=\frac{d}{dt}f\left(x\left(t\right),y\left(t\right)\right)=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}$
• The term $\frac{dx}{dt}$ represents how a tiny change in $t$ influences the intermediate output $x\left(t\right)$.
• Likewise the term $\frac{dy}{dt}$ represents how a tiny change in $t$ influences the second intermediate output $y\left(t\right)$.
• The term $\frac{\partial f}{\partial x}$ represents how a tiny change to the $x$-component of an input to $f$ influences its output, and similarly the term $\frac{\partial f}{\partial y}$ accounts for how a small change to the $y$-component of the input changes $f$.
• One way a small change to $t$ influences $f\left(x\left(t\right),y\left(t\right)\right)$ is that it first changes $x\left(t\right)$, which in turn changes $f$. This effect is captured in the product $\frac{\partial f}{\partial x}\frac{dx}{dt}$.
• The other way a change to $t$ changes the output of $f\left(x\left(t\right),y\left(t\right)\right)$ is by first changing the second intermediate output $y\left(t\right)$, which in turn affects the output of $f$. This effect is captured in the product $\frac{\partial f}{\partial y}\frac{dy}{dt}$.
• Adding these two products gives the total rate of change of $f$ with respect to $t$.
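The two routes of influence described in the bullets above can be measured separately. This is a rough sketch under stated assumptions: the function $f\left(x,y\right)={x}^{2}y$ and the path $\left(\mathrm{cos}\left(t\right),\mathrm{sin}\left(t\right)\right)$ are sample choices, and `split_change` is a hypothetical helper name. It nudges $t$ by a small `dt`, then records how much $f$ changes when only $x$ moves, when only $y$ moves, and when both move.

```python
import math

def f(x, y):
    return x ** 2 * y

def x_of(t):
    return math.cos(t)

def y_of(t):
    return math.sin(t)

def split_change(t, dt=1e-6):
    """Return (change via x only, change via y only, total change) in f."""
    x0, y0 = x_of(t), y_of(t)
    dx = x_of(t + dt) - x0        # how the nudge in t moves x
    dy = y_of(t + dt) - y0        # how the nudge in t moves y
    base = f(x0, y0)
    via_x = f(x0 + dx, y0) - base         # f responds to the x-nudge alone
    via_y = f(x0, y0 + dy) - base         # f responds to the y-nudge alone
    total = f(x0 + dx, y0 + dy) - base    # f responds to both nudges together
    return via_x, via_y, total

if __name__ == "__main__":
    via_x, via_y, total = split_change(0.8)
    print(via_x + via_y, total)
```

The sum of the two separate contributions matches the total up to a term proportional to `dx * dy`, which becomes negligible as `dt` shrinks. That vanishing cross term is why the chain rule is a plain sum of two products.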

## Connection with directional derivative

You might notice that the dot product expression for the multivariable chain rule looks a lot like a directional derivative:
$\mathrm{\nabla }f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)\cdot {\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left(t\right)$
In fact, that's exactly what it is! The derivative ${\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left({t}_{0}\right)$ at a particular value ${t}_{0}$ gives a vector in the input space of $f$:
${\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left({t}_{0}\right)=\left[\begin{array}{c}{x}^{\prime }\left({t}_{0}\right)\\ {y}^{\prime }\left({t}_{0}\right)\end{array}\right]$
If $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)$ is interpreted as a parametric path inside this space, perhaps thought of as the trajectory of a particle, the derivative at a particular point in time ${t}_{0}$ gives the velocity vector of this particle at that time.
With this interpretation, the chain rule tells us that the derivative of the composition $f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)$ is the directional derivative of $f$ along the derivative of $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)$.
This should make sense, because a tiny change by "$dt$" to $t$ should, by the meaning of the derivative, cause a tiny change $d\stackrel{\to }{\mathbf{\text{v}}}$ to the output of $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)$. And the point of the directional derivative is that a tiny change $d\stackrel{\to }{\mathbf{\text{v}}}$ to the input of $f$ should cause a change $df$ as determined by $\frac{\partial f}{\partial \stackrel{\to }{\mathbf{\text{v}}}}={\mathrm{\nabla }}_{\stackrel{\to }{\mathbf{\text{v}}}}f$.
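The claim can be checked numerically. The sketch below assumes a sample function $f\left(x,y\right)=x{e}^{y}$ and a sample path $\left({t}^{2},\mathrm{sin}\left(t\right)\right)$, neither of which comes from the article. It estimates the directional derivative of $f$ at the point $\stackrel{\to }{\mathbf{\text{v}}}\left({t}_{0}\right)$ along the velocity vector ${\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left({t}_{0}\right)$, and separately estimates the derivative of the composition along the path. The direction vector is deliberately not normalized, matching the convention used in this article.

```python
import math

def f(x, y):
    # Hypothetical scalar field.
    return x * math.exp(y)

def v(t):
    # Hypothetical parametric path in the xy-plane.
    return (t ** 2, math.sin(t))

def v_prime(t):
    # Velocity vector of the path.
    return (2 * t, math.cos(t))

def directional_derivative(p, w, h=1e-6):
    # Central-difference estimate of the derivative of f at p along w.
    (px, py), (wx, wy) = p, w
    ahead = f(px + h * wx, py + h * wy)
    behind = f(px - h * wx, py - h * wy)
    return (ahead - behind) / (2 * h)

def derivative_along_path(t, h=1e-6):
    # Central-difference estimate of d/dt f(v(t)).
    return (f(*v(t + h)) - f(*v(t - h))) / (2 * h)

if __name__ == "__main__":
    t0 = 0.5
    print(directional_derivative(v(t0), v_prime(t0)), derivative_along_path(t0))
```

The first estimate moves along a straight line through $\stackrel{\to }{\mathbf{\text{v}}}\left({t}_{0}\right)$, while the second follows the curved path, yet both agree in the limit because the path and its tangent line differ only at second order.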

## Example 1: With and without the new chain rule

Define $f\left(x,y\right)$ like this:
$\begin{array}{r}f\left(x,y\right)={x}^{2}y\end{array}$
And define $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)$ like this:
$\begin{array}{r}\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)=\left[\begin{array}{c}\mathrm{cos}\left(t\right)\\ \mathrm{sin}\left(t\right)\end{array}\right]\end{array}$
Find the derivative $\frac{d}{dt}f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)$.
Solution without chain rule:
Before throwing our fancy new tool at the problem, it's worth pointing out that this is something we can solve by first writing out the composition as a single variable function of $t$:
$\begin{array}{rl}f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)& =f\left(\mathrm{cos}\left(t\right),\mathrm{sin}\left(t\right)\right)\\ & ={\mathrm{cos}}^{2}\left(t\right)\mathrm{sin}\left(t\right)\end{array}$
Now you can take the ordinary derivative:
$\begin{array}{rl}\frac{d}{dt}\left({\mathrm{cos}}^{2}\left(t\right)\mathrm{sin}\left(t\right)\right)& ={\mathrm{cos}}^{2}\left(t\right)\left(\mathrm{cos}\left(t\right)\right)+2\mathrm{cos}\left(t\right)\left(-\mathrm{sin}\left(t\right)\right)\mathrm{sin}\left(t\right)\\ & =\boxed{{\mathrm{cos}}^{3}\left(t\right)-2\mathrm{cos}\left(t\right){\mathrm{sin}}^{2}\left(t\right)}\end{array}$
But of course, the purpose of this example is to get a feel for how the chain rule works in practice.
Solution using chain rule:
First, let's explicitly state the component functions of $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)$:
$\begin{array}{rl}x\left(t\right)& =\mathrm{cos}\left(t\right)\\ y\left(t\right)& =\mathrm{sin}\left(t\right)\end{array}$
According to the chain rule,
$\begin{array}{rl}\frac{d}{dt}f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)& =\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}\end{array}$
Taking the partial derivatives of $f\left(x,y\right)={x}^{2}y$ and the ordinary derivatives of $x\left(t\right)=\mathrm{cos}\left(t\right)$, $y\left(t\right)=\mathrm{sin}\left(t\right)$, we get
$\begin{array}{rl}& \phantom{\rule{1em}{0ex}}\frac{\partial }{\partial x}\left({x}^{2}y\right)\frac{d}{dt}\left(\mathrm{cos}\left(t\right)\right)+\frac{\partial }{\partial y}\left({x}^{2}y\right)\frac{d}{dt}\left(\mathrm{sin}\left(t\right)\right)\\ \\ & =\left(2xy\right)\left(-\mathrm{sin}\left(t\right)\right)+\left({x}^{2}\right)\left(\mathrm{cos}\left(t\right)\right)\end{array}$
We want everything in terms of $t$, so we plug in $x=\mathrm{cos}\left(t\right)$ and $y=\mathrm{sin}\left(t\right)$.
$\begin{array}{rl}& \left(2xy\right)\left(-\mathrm{sin}\left(t\right)\right)+\left({x}^{2}\right)\left(\mathrm{cos}\left(t\right)\right)\\ \\ =& \left(2\mathrm{cos}\left(t\right)\mathrm{sin}\left(t\right)\right)\left(-\mathrm{sin}\left(t\right)\right)+\left({\mathrm{cos}}^{2}\left(t\right)\right)\mathrm{cos}\left(t\right)\\ \\ =& \boxed{-2\mathrm{cos}\left(t\right){\mathrm{sin}}^{2}\left(t\right)+{\mathrm{cos}}^{3}\left(t\right)}\end{array}$
Reassuringly, this is the same as the answer we got without using the chain rule. You might be thinking that this new chain rule makes things unnecessarily complicated, and the dirty little secret is that for concrete computations like this one, it is often not needed.
However, it is useful for writing equations in terms of an unknown function, as the next example shows.
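As a sanity check on the final answer of Example 1, the short script below compares ${\mathrm{cos}}^{3}\left(t\right)-2\mathrm{cos}\left(t\right){\mathrm{sin}}^{2}\left(t\right)$ with a finite-difference derivative of the composition ${\mathrm{cos}}^{2}\left(t\right)\mathrm{sin}\left(t\right)$ at a few sample values of $t$. The helper names (`composed`, `final_answer`, `central_difference`) are made up for this sketch.

```python
import math

def composed(t):
    # f(v(t)) for f(x, y) = x^2 * y and v(t) = (cos t, sin t).
    return math.cos(t) ** 2 * math.sin(t)

def final_answer(t):
    # The derivative obtained in Example 1.
    return math.cos(t) ** 3 - 2 * math.cos(t) * math.sin(t) ** 2

def central_difference(g, t, h=1e-6):
    # Numerical estimate of g'(t).
    return (g(t + h) - g(t - h)) / (2 * h)

if __name__ == "__main__":
    for t in (0.0, 0.4, 1.0, 2.5):
        print(t, final_answer(t), central_difference(composed, t))
```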

## Example 2: Unknown function

Suppose the temperature across a two-dimensional region varies according to a function $T\left(x,y\right)$, which we do not know. You wander throughout this region, sampling temperatures as you go, and your $x$ and $y$ coordinates as functions of time are
$\begin{array}{rl}\phantom{\rule{1em}{0ex}}x\left(t\right)& =30\mathrm{cos}\left(2t\right)\\ y\left(t\right)& =40\mathrm{sin}\left(3t\right)\end{array}$
In taking your measurements, you notice that the temperature never changes along your path. What can you say about the partial derivatives of $T$?
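One way to approach this question is sketched here, assuming $T$ is differentiable; the steps are a direct application of the chain rule above rather than a solution quoted from the original. A temperature reading that never changes along the path means the derivative of the composition $T\left(x\left(t\right),y\left(t\right)\right)$ is zero:
$\begin{array}{rl}0=\frac{d}{dt}T\left(x\left(t\right),y\left(t\right)\right)& =\frac{\partial T}{\partial x}\frac{dx}{dt}+\frac{\partial T}{\partial y}\frac{dy}{dt}\\ & =\frac{\partial T}{\partial x}\left(-60\mathrm{sin}\left(2t\right)\right)+\frac{\partial T}{\partial y}\left(120\mathrm{cos}\left(3t\right)\right)\end{array}$
Dividing by $60$ and rearranging gives $\mathrm{sin}\left(2t\right)\frac{\partial T}{\partial x}=2\mathrm{cos}\left(3t\right)\frac{\partial T}{\partial y}$, with both partial derivatives evaluated at $\left(x\left(t\right),y\left(t\right)\right)$. This constrains the partial derivatives only at points on your path, not across the whole region.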

## Summary

• Given a multivariable function $f\left(x,y\right)$, and two single-variable functions $x\left(t\right)$ and $y\left(t\right)$, here's what the multivariable chain rule says:
$\underbrace{\frac{d}{dt}f\left(x\left(t\right),y\left(t\right)\right)}_{\text{Derivative of composition function}}=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}$
• Written with vector notation, where $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)=\left[\begin{array}{c}x\left(t\right)\\ y\left(t\right)\end{array}\right]$, this rule has a very elegant form in terms of the gradient of $f$ and the vector-derivative of $\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)$.
$\underbrace{\frac{d}{dt}f\left(\stackrel{\to }{\mathbf{\text{v}}}\left(t\right)\right)}_{\text{Derivative of composition function}}=\overbrace{\mathrm{\nabla }f\cdot {\stackrel{\to }{\mathbf{\text{v}}}}^{\prime }\left(t\right)}^{\text{Dot product of vectors}}$
