Change of variables and necessary conditions for optimality

In this post, we consider how removal and addition of degrees of freedom through change of variables can help in searching for a minimum of a function.

Removing degrees of freedom

Consider a function $f(x,y) = xy$. The necessary condition for optimality says that its differential must vanish,

$$df = y\,dx + x\,dy = 0,$$

which yields the critical point $(x, y) = (0, 0)$.

However, we can introduce a new variable $z = xy$ and consider a function $g(z) = z$, which obviously has no critical point. What happened here is that we lost one degree of freedom. Initially, we could change $x$ and $y$ independently, but after introducing $z$, we can only change their product $xy$.
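
One way to see where the critical point went is to apply the chain rule to $f(x,y) = g(z(x,y))$ with $z = xy$. The following is a short worked expansion under that substitution, not an additional result:

$$df = g_z\,dz = g_z\,(y\,dx + x\,dy), \qquad g_z = 1.$$

The differential $df$ vanishes at $(0,0)$ because $dz = 0$ there, not because $g_z = 0$. Looking at $g$ alone only tests the factor $g_z$, so it cannot detect this point.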

Sometimes, though, we don’t need to use all degrees of freedom to find a critical point of a function. For example, if $f(x,y) = (xy)^2$ and $g(z) = z^2$, then

$$g_z = 2z = 0$$

describes the same set of solutions as the system of equations $(2xy^2, 2x^2y) = (0,0)$ obtained from the partial derivatives of $f$. So, we incur no loss of information by removing some degrees of freedom in this case.
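
As a quick sanity check of this claim, the sketch below compares the two conditions on a small integer grid. The helper names (`grad_f`, `dg`, `same_critical_set`) and the grid itself are illustrative assumptions, and a finite grid is evidence rather than a proof.

```python
from itertools import product

def grad_f(x, y):
    """Gradient of f(x, y) = (x*y)**2."""
    return (2 * x * y**2, 2 * x**2 * y)

def dg(z):
    """Derivative of g(z) = z**2."""
    return 2 * z

def same_critical_set(points):
    """True if grad f = 0 exactly where g'(x*y) = 0 on the given points."""
    return all(
        (grad_f(x, y) == (0, 0)) == (dg(x * y) == 0)
        for x, y in points
    )

# Integer grid, so all arithmetic is exact (hypothetical test points).
grid = list(product(range(-3, 4), repeat=2))
print(same_critical_set(grid))
```

Both conditions reduce to $xy = 0$, which is why they agree at every grid point.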

One should be careful, nevertheless, to ensure that the solution $z^*$ of the equation $g_z = 0$ lies in the range of the function $z$. For example, if $f(x) = (\frac{1}{x})^2$, then we could write $g(z) = z^2$ with $z = \frac{1}{x}$. Since $g_z = 2z$, we might hastily declare $z^* = 0$ a critical point of $f$, even though it lies outside the image of $z$: no $x$ satisfies $\frac{1}{x} = 0$.
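
The range check can be sketched numerically. In the snippet below, `in_image_of_z` is a hypothetical helper encoding the fact that $z = \frac{1}{x}$ takes every real value except zero, and the sample points are arbitrary.

```python
def df(x):
    """Derivative of f(x) = (1/x)**2, i.e. -2 / x**3."""
    return -2.0 / x**3

def dg(z):
    """Derivative of g(z) = z**2."""
    return 2.0 * z

def in_image_of_z(z):
    """Hypothetical helper: z = 1/x attains every real value except 0."""
    return z != 0.0

z_star = 0.0  # the only solution of g_z = 0

# Sample x on both sides of the origin, from 1e-3 to 1e3 in magnitude.
samples = [s * 10.0**k for k in range(-3, 4) for s in (-1.0, 1.0)]
has_critical_point = any(df(x) == 0.0 for x in samples)

print(dg(z_star) == 0.0, in_image_of_z(z_star), has_critical_point)
```

The candidate $z^*$ solves $g_z = 0$ but fails the image test, and $f'$ does not vanish at any sampled $x$.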

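To summarize, removing degrees of freedom can hide critical points of $f$, and a critical point $z^*$ of $g$ yields critical points of $f$ only if $z^*$ lies in the range of $z$.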

Adding degrees of freedom

Adding degrees of freedom can only hurt. Consider the function

$$f(z) = z^2 - 2z + 1$$

that has a minimum at $z = 1$. Assuming we are studying its restriction to $z \geq 0.5$, we can introduce the function $g$ of two variables

$$g(x, y) = x^2 - y^2$$

with $x^2 = z^2$ and $y^2 = 2z - 1$. Equating its differential to zero,

$$dg = 2x\,dx - 2y\,dy = 0,$$

leads to a contradiction, since it implies $z = 0$ and $z = 0.5$ at the same time. We ran into this trouble because two variables give more flexibility than we can actually afford with one. Expanding $dx$ and $dy$ in the differential, we see that

$$dg = (g_x x_z + g_y y_z)\,dz = \left(2x - 2y \cdot \frac{1}{y}\right) dz = (2z - 2)\,dz,$$

and even though the equation $g_y = -2y = 0$ suggests setting $y = 0$, multiplying $g_y = -2y$ by $y_z = \frac{1}{y}$ results in a finite quantity.
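
The expansion can be checked numerically. The sketch below assumes $f(z) = z^2 - 2z + 1$ and $g(x, y) = x^2 - y^2$, a choice consistent with the substitutions $x^2 = z^2$, $y^2 = 2z - 1$ and with the minimum at $z = 1$. It takes the positive roots $x = z$ and $y = \sqrt{2z - 1}$, and all function names are illustrative.

```python
import math

def x_of(z):
    """Positive root of x**2 = z**2 (valid since z >= 0.5)."""
    return z

def y_of(z):
    """Positive root of y**2 = 2*z - 1."""
    return math.sqrt(2.0 * z - 1.0)

def grad_g(x, y):
    """Gradient of the assumed g(x, y) = x**2 - y**2."""
    return (2.0 * x, -2.0 * y)

def df_via_chain_rule(z):
    """df/dz = g_x * x_z + g_y * y_z along the curve (x(z), y(z))."""
    x, y = x_of(z), y_of(z)
    g_x, g_y = grad_g(x, y)
    x_z = 1.0      # derivative of x = z
    y_z = 1.0 / y  # derivative of y = sqrt(2z - 1); requires z > 0.5
    return g_x * x_z + g_y * y_z

# Compare against the direct derivative f'(z) = 2z - 2 at a few points.
for z in (0.75, 1.0, 2.0):
    print(z, df_via_chain_rule(z), 2.0 * z - 2.0)
```

The gradient of $g$ vanishes only at $(x, y) = (0, 0)$, a point the curve $(x(z), y(z))$ never visits, while the chain-rule derivative vanishes at $z = 1$, where neither $g_x$ nor $g_y$ is zero.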

Introducing extra variables just obscures the problem. Most importantly, critical points of $g$ do not tell us anything about critical points of $f$. Therefore, adding degrees of freedom in its pure form is not helpful for optimization.