Multivariable calculus
A deeper understanding of directional derivatives
A more thorough look at the formula for directional derivatives, along with an explanation for why the gradient gives the slope of steepest ascent.
This article is targeted at those who want a deeper understanding of the directional derivative and its formula.
Formal definition of the directional derivative
There are a couple of reasons you might care about a formal definition. For one thing, really understanding the formal definition of a new concept can make clear what is really going on. More importantly, I think the main benefit is that it gives you the confidence to recognize when such a concept can and cannot be applied.
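As a warm up, let's review the formal definition of the partial derivative, say with respect to $x$:
$$\frac{\partial f}{\partial x}(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}$$
The connection between the informal way to read $\dfrac{\partial f}{\partial x}$ and the formal way to read the right-hand side is as follows: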
| Symbol | Informal understanding | Formal understanding |
|---|---|---|
| $\partial x$ | A tiny nudge in the $x$ direction. | A limiting variable $h$ which goes to $0$, and will be added to the first component of the function's input. |
| $\partial f$ | The resulting change in the output of $f$ after the nudge. | The difference between $f(x_0 + h, y_0)$ and $f(x_0, y_0)$, taken in the same limit as $h \to 0$. |
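To make the limit feel less abstract, here is a minimal numerical sketch of the difference quotient for a fixed, shrinking $h$. The function `f(x, y) = x**2 * y`, the point `(1, 2)`, and the helper name `partial_x_quotient` are all assumptions picked for illustration; none of them come from the article.

```python
def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y


def partial_x_quotient(f, x0, y0, h):
    # Difference quotient from the formal definition, for one fixed nonzero h.
    # Only the first component of the input is nudged.
    return (f(x0 + h, y0) - f(x0, y0)) / h


x0, y0 = 1.0, 2.0
exact = 2 * x0 * y0  # hand-computed partial derivative of x^2 * y with respect to x

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = partial_x_quotient(f, x0, y0, h)
    print(f"h = {h:g}: quotient = {approx:.6f}, error = {abs(approx - exact):.2e}")
```

For this particular function the quotient works out by hand to $4 + 2h$, so the error should shrink roughly in proportion to $h$.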
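We could instead write this in vector notation, viewing the input point $(x_0, y_0)$ as a two-dimensional vector $\mathbf{x}_0$.
Here $\mathbf{x}_0$ is written in bold to emphasize its vectoriness. It's a bit confusing to use a bold $\mathbf{x}$ for the entire input rather than some other letter, since the letter $x$ is already used in an un-bolded form to denote the first component of the input. But hey, that's convention, so we go with it.
Instead of writing the "nudged" input as $(x_0 + h, y_0)$, we write it as $\mathbf{x}_0 + h\hat{\mathbf{i}}$, where $\hat{\mathbf{i}}$ is the unit vector in the $x$-direction:
$$\frac{\partial f}{\partial x}(\mathbf{x}_0) = \lim_{h \to 0} \frac{f(\mathbf{x}_0 + h\hat{\mathbf{i}}) - f(\mathbf{x}_0)}{h}$$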
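In this notation, it's much easier to see how to generalize the partial derivative with respect to $x$ to the directional derivative along any vector $\mathbf{v}$:
$$\nabla_{\mathbf{v}} f(\mathbf{x}_0) = \lim_{h \to 0} \frac{f(\mathbf{x}_0 + h\mathbf{v}) - f(\mathbf{x}_0)}{h}$$
In this case, adding $h\mathbf{v}$ to the input for a limiting variable $h \to 0$ formalizes the idea of a tiny nudge in the direction of $\mathbf{v}$.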
[Figure: showing the directional derivative nudge.]
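The same limit can be sketched numerically: nudge the whole input along a vector and divide the change in output by $h$. Everything named here (the function `f(x, y) = x**2 * y`, the point, the vector `(3, 4)`, and the helper `directional_quotient`) is an assumption made for the sake of the example.

```python
def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y


def directional_quotient(f, point, v, h):
    # Difference quotient from the definition of the directional derivative:
    # nudge the input by h * v and divide the change in output by h.
    x0, y0 = point
    v1, v2 = v
    return (f(x0 + h * v1, y0 + h * v2) - f(x0, y0)) / h


point = (1.0, 2.0)
v = (3.0, 4.0)

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"h = {h:g}: quotient along v = {directional_quotient(f, point, v, h):.6f}")

# Choosing v = (1, 0) nudges only the first component, so the same helper
# reduces to the difference quotient for the partial derivative in x.
print(directional_quotient(f, point, (1.0, 0.0), 1e-4))
```

For this choice of function, point, and vector, expanding by hand gives a quotient of $16 + 42h + 36h^2$, so the printed values should settle toward 16 as $h$ shrinks.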
Seeking connection between the definition and computation
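Computing the directional derivative involves a dot product between the gradient $\nabla f$ and the vector $\mathbf{v}$. For example, in two dimensions, here's what this would look like:
$$\nabla_{\mathbf{v}} f(x, y) = \nabla f \cdot \mathbf{v} = \begin{bmatrix} \dfrac{\partial f}{\partial x} \\ \dfrac{\partial f}{\partial y} \end{bmatrix} \cdot \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = v_1 \frac{\partial f}{\partial x}(x, y) + v_2 \frac{\partial f}{\partial y}(x, y)$$
Here, $v_1$ and $v_2$ are the components of $\mathbf{v}$.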
The central question is, what does this formula have to do with the definition given above?
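Before working through the argument, a quick numerical check can at least suggest that the two agree. This is a sketch under assumptions: the function `f(x, y) = x**2 * y`, its hand-computed gradient `grad_f`, and the helper names are illustrative choices, not anything defined in the article.

```python
def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y


def grad_f(x, y):
    # Gradient of x^2 * y, computed by hand: (2xy, x^2).
    return (2 * x * y, x**2)


def directional_by_formula(point, v):
    # Dot product of the gradient with v.
    gx, gy = grad_f(*point)
    return v[0] * gx + v[1] * gy


def directional_by_definition(point, v, h):
    # Difference quotient from the limit definition, for one fixed small h.
    x0, y0 = point
    return (f(x0 + h * v[0], y0 + h * v[1]) - f(x0, y0)) / h


point = (1.0, 2.0)
v = (3.0, 4.0)

print("formula:   ", directional_by_formula(point, v))
for h in (1e-2, 1e-4, 1e-6):
    print(f"definition (h = {h:g}):", directional_by_definition(point, v, h))
```

Agreement at one point for one function proves nothing in general, of course. It only makes the question concrete.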
Breaking down the nudge
The computation for $\nabla_{\mathbf{v}} f$ can be seen as a way to break down a tiny step in the direction of $\mathbf{v}$ into its $x$ and $y$ components.
[Figure: breaking apart a step along the vector $\mathbf{v}$.]
Specifically, you can imagine the following procedure:
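1. Start at some point $(x_0, y_0)$.
2. Choose a tiny value $h$.
3. Add $h v_1$ to $x_0$, which means stepping to the point $(x_0 + h v_1, y_0)$. From what we know of partial derivatives, this will change the output of the function by about $h v_1 \dfrac{\partial f}{\partial x}(x_0, y_0)$.
4. Now add $h v_2$ to $y_0$ to bring us up/down to the point $(x_0 + h v_1, y_0 + h v_2)$. The resulting change to $f$ is now about $h v_2 \dfrac{\partial f}{\partial y}(x_0 + h v_1, y_0)$.
Adding the results of steps 3 and 4, the total change to the function upon moving from the input $(x_0, y_0)$ to the input $(x_0 + h v_1, y_0 + h v_2)$ has been about
$$h v_1 \frac{\partial f}{\partial x}(x_0, y_0) + h v_2 \frac{\partial f}{\partial y}(x_0 + h v_1, y_0)$$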
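This is very close to the expression for the directional derivative, which says the change in $f$ due to this step $h\mathbf{v}$ should be about
$$h \nabla_{\mathbf{v}} f(x_0, y_0) = h \nabla f(x_0, y_0) \cdot \mathbf{v} = h v_1 \frac{\partial f}{\partial x}(x_0, y_0) + h v_2 \frac{\partial f}{\partial y}(x_0, y_0)$$
However, this differs slightly from the result of our step-by-step argument, in which the partial derivative with respect to $y$ is taken at the point $(x_0 + h v_1, y_0)$, not at the point $(x_0, y_0)$.
Luckily we are considering very, very small values of $h$. In fact, more technically, we should be talking about the limit as $h \to 0$. Therefore evaluating $\dfrac{\partial f}{\partial y}$ at $(x_0 + h v_1, y_0)$ will be almost the same as evaluating it at $(x_0, y_0)$. Moreover, as $h$ approaches $0$, so does the difference between these two, but for that we have to assume that $\dfrac{\partial f}{\partial y}$ is continuous.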
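The gap between the step-by-step estimate and the dot-product formula can be watched shrinking in a small experiment. The sketch below assumes the illustrative function `f(x, y) = x**2 * y` (picked because its partial derivative in $y$ depends on $x$, so the gap is actually visible), with hand-computed partials `f_x` and `f_y`. All names and values are hypothetical.

```python
def f_x(x, y):
    # Hand-computed partial derivative of x^2 * y with respect to x.
    return 2 * x * y


def f_y(x, y):
    # Hand-computed partial derivative of x^2 * y with respect to y.
    return x**2


x0, y0 = 1.0, 2.0
v1, v2 = 3.0, 4.0


def two_step_estimate(h):
    # Step in x first, then step in y starting from the shifted point.
    change_from_x_step = h * v1 * f_x(x0, y0)
    change_from_y_step = h * v2 * f_y(x0 + h * v1, y0)
    return change_from_x_step + change_from_y_step


def formula_estimate(h):
    # h times the dot product of the gradient at (x0, y0) with v.
    return h * v1 * f_x(x0, y0) + h * v2 * f_y(x0, y0)


for h in (1e-1, 1e-2, 1e-3, 1e-4):
    gap = two_step_estimate(h) - formula_estimate(h)
    # Dividing by h mirrors the division by h in the limit definition.
    print(f"h = {h:g}: gap / h = {gap / h:.6f}")
```

For this function the gap divided by $h$ works out by hand to $24h + 36h^2$, which goes to $0$ with $h$, matching the continuity argument above.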
Why does the gradient point in the direction of steepest ascent?
Having learned about directional derivatives, we can now understand why the direction of the gradient is the direction of steepest ascent.
[Figure: the concept of the direction of steepest ascent.]
Specifically, here's the question at hand.
- Let $f$ be some scalar-valued multivariable function, such as $f(x, y) = x^2 + y^2$.
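- Let $(x_0, y_0)$ be a particular input point.
- Consider all possible directions, i.e. all unit vectors $\hat{\mathbf{u}}$ in the input space of $f$.
Question (informal): If we start at $(x_0, y_0)$, which direction should we walk so that the output of $f$ increases most quickly?
Question (formal): Which unit vector $\hat{\mathbf{u}}$ maximizes the directional derivative along $\hat{\mathbf{u}}$?
$$\nabla_{\hat{\mathbf{u}}} f(x_0, y_0) = \hat{\mathbf{u}} \cdot \nabla f(x_0, y_0)$$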
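The famous Cauchy-Schwarz inequality tells us that the dot product $\hat{\mathbf{u}} \cdot \nabla f(x_0, y_0)$ is maximized by the unit vector in the direction of $\nabla f(x_0, y_0)$. Indeed, $\hat{\mathbf{u}} \cdot \nabla f(x_0, y_0) \le \|\hat{\mathbf{u}}\| \, \|\nabla f(x_0, y_0)\| = \|\nabla f(x_0, y_0)\|$, with equality exactly when $\hat{\mathbf{u}}$ points in the same direction as the gradient (assuming the gradient is nonzero).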
[Figure: maximizing the dot product.]
Notice, the fact that the gradient points in the direction of steepest ascent is a consequence of the more fundamental fact that all directional derivatives require taking the dot product with $\nabla f$.
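One way to see the maximization at work is a brute-force scan over unit vectors, comparing the winner with the gradient's own direction. This is only a sketch: the gradient `(2xy, x**2)` belongs to the illustrative function `x**2 * y`, and the point `(1, 2)`, the helper `best_direction`, and the sampling resolution are arbitrary assumptions.

```python
import math


def grad_f(x, y):
    # Gradient of the hypothetical example function x^2 * y, computed by hand.
    return (2 * x * y, x**2)


def best_direction(grad, samples=3600):
    # Try evenly spaced unit vectors (cos t, sin t) around the circle and
    # keep the one with the largest dot product with the gradient.
    best_angle, best_value = 0.0, -math.inf
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        value = math.cos(angle) * grad[0] + math.sin(angle) * grad[1]
        if value > best_value:
            best_angle, best_value = angle, value
    return best_angle, best_value


g = grad_f(1.0, 2.0)
angle, value = best_direction(g)
gradient_angle = math.atan2(g[1], g[0])

print(f"best sampled angle: {angle:.4f} rad, gradient angle: {gradient_angle:.4f} rad")
print(f"largest directional derivative found: {value:.6f}")
print(f"length of the gradient:               {math.hypot(*g):.6f}")
```

Up to the resolution of the sampling, the best angle should match the gradient's angle, and the largest directional derivative should match the gradient's length, which is exactly the equality case of Cauchy-Schwarz.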