Constraint aggregation

In this section, the use of constraint aggregation is demonstrated. We focus on the following problem:

\[\min_{x \in [0,1]}\left(\max_{j=1,...,N} f_j(x)\right)\]

where:

\[f_j(x) = jxe^{1-jx} \quad \forall j=1,...,N\]

This can be reformulated as a non-linear optimization problem with non-linear constraints:

\[\min{(y)}\]
\[s.t.\]
\[g_j(x,y) = f_j(x)-y \leq 0 \quad \forall j=1,...,N\]
\[0\leq x \leq 1\]

This problem can be solved using any non-linear optimization solver supporting non-linear inequality constraints. Since each \(f_j\) is non-negative on \([0,1]\) and vanishes at \(x=0\), the solution of this problem is \((x^*,y^*) = (0,0)\quad \forall N\in \mathbb{N}\).
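As an illustration, the reformulated problem can be solved with all \(N\) constraints by a generic solver. The following is a minimal sketch based on SciPy's SLSQP; the value of `N`, the starting point and the helper names (`f`, `g`, `objective`) are illustrative choices.

```python
import numpy as np
from scipy.optimize import NonlinearConstraint, minimize

N = 10  # illustrative number of constraints

def f(x):
    # f_j(x) = j * x * exp(1 - j * x) for j = 1, ..., N
    j = np.arange(1, N + 1)
    return j * x * np.exp(1.0 - j * x)

def objective(z):
    # z = (x, y); the reformulated objective is simply y
    return z[1]

def g(z):
    # g_j(x, y) = f_j(x) - y <= 0 for all j
    x, y = z
    return f(x) - y

res = minimize(
    objective,
    x0=[0.5, 2.0],
    method="SLSQP",
    bounds=[(0.0, 1.0), (None, None)],
    constraints=NonlinearConstraint(g, -np.inf, 0.0),
)
print(res.x)  # expected to approach (x*, y*) = (0, 0)
```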

Some optimization solvers, such as MMA, can become expensive when the number of constraints \(N\) is large. Moreover, since constraint gradients are often needed, aggregating many constraints into a few scalar ones makes gradient computation by the adjoint approach more efficient. Constraint aggregation aims to reduce the number of constraints in an optimization problem by replacing the original constraints with operators that transform vectors into scalars:

\[\min{(y)}\]
\[s.t.\]
\[G_{agg}(g_j(x,y), p) \leq 0\]
\[0\leq x \leq 1\]

where \(G_{agg}(g_j(x,y), p)\) denotes an aggregation operator and \(p\) its parameters (if applicable). In the following sections, some aggregation approaches are reviewed.
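Reusing the helpers from the previous snippet, the aggregated problem can be solved generically once a vector-to-scalar operator is chosen. `solve_aggregated` below is a hypothetical helper written for this page, not a library API:

```python
def solve_aggregated(G_agg, p):
    # Replace the N constraints g_j(x, y) <= 0 with the single
    # scalar constraint G_agg(g(x, y), p) <= 0.
    aggregated = lambda z: G_agg(g(z), p)
    return minimize(
        objective,
        x0=[0.5, 2.0],
        method="SLSQP",
        bounds=[(0.0, 1.0), (None, None)],
        constraints=NonlinearConstraint(aggregated, -np.inf, 0.0),
    )
```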

The maximum

The simplest approach to aggregate the constraint functions is to take their maximum.

\[G_{max}(g_j(x,y), p) = \max_{j=1,...,N} g_j(x,y)\]

Using this formulation preserves the solution of the original problem, but it has the major drawback of not being differentiable. This is not always a problem in practice; however, the performance of gradient-based solvers may be degraded.
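A sketch of this operator, plugged into the hypothetical `solve_aggregated` helper defined above:

```python
def G_max(g_vec, p=None):
    # Exact but non-differentiable aggregation (p is unused).
    return np.max(g_vec)

res = solve_aggregated(G_max, None)
print(res.x)  # same solution as the original problem
```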

The KS function

The Kreisselmeier–Steinhauser (KS) function [KS83] is a continuous and differentiable function that tends to the maximum operator when the aggregation parameter tends to infinity.

\[G_{KS}(g_j(x,y), p) = \frac{1}{p}\log({\sum_{j=1}^{N}{\exp(pg_j(x,y))}})\]

It can be shown that:

\[G_{KS}(g_j(x,y), p)-G_{max}(g_j(x,y), p)\leq \frac{\log(N)}{p}\]

This means that, with KS aggregation, the solution of the optimization problem is an approximation of the solution of the original problem. Since the KS function is always greater than the maximum, this approach always leads to feasible solutions. As \(p\) increases, the KS function becomes closer to the maximum, but the optimization typically requires more iterations to converge.
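A sketch of the KS operator; `scipy.special.logsumexp` evaluates the log-sum-exp in a numerically stable way, which matters for large values of \(p\):

```python
from scipy.special import logsumexp

def G_ks(g_vec, p):
    # KS aggregation: (1/p) * log(sum_j exp(p * g_j))
    return logsumexp(p * g_vec) / p

res = solve_aggregated(G_ks, p=100.0)
gv = g(res.x)
# The aggregation error is bounded by log(N) / p:
assert G_ks(gv, 100.0) - np.max(gv) <= np.log(N) / 100.0
print(res.x)  # conservative (feasible) approximation of (0, 0)
```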

The Induced Exponential

The Induced Exponential function [KH15] is also a continuous and differentiable function that tends to the maximum operator when the aggregation parameter tends to infinity.

\[G_{IKS}(g_j(x,y), p) = \frac{\sum_{j=1}^{N}{g_j(x,y)e^{pg_j(x,y)}}}{\sum_{j=1}^{N}{e^{pg_j(x,y)}}}\]

It can be shown that:

\[G_{max}(g_j(x,y), p)-G_{IKS}(g_j(x,y), p)\leq \frac{\log(N)}{p}\]

This means that, with IKS aggregation, the solution of the optimization problem is an approximation of the solution of the original problem. Since the IKS function is always smaller than the maximum, this approach always leads to infeasible solutions. As \(p\) increases, the IKS function becomes closer to the maximum, but the optimization typically requires more iterations to converge.
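A sketch of the IKS operator; shifting the exponent by its maximum is a standard softmax stabilization, not part of the original definition:

```python
def G_iks(g_vec, p):
    # Induced exponential aggregation: a softmax-weighted
    # average of the g_j values.
    w = np.exp(p * g_vec - np.max(p * g_vec))
    return np.sum(g_vec * w) / np.sum(w)

res = solve_aggregated(G_iks, p=100.0)
print(res.x)  # approximates the solution from the infeasible side
```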

Examples for constraint aggregation

