In this post, we will talk about dual norms: a handy tool in optimization.

Consider the following optimization problem:

\begin{equation} \min_{\mathbf{a}} ||\mathbf{a}||_p \;\;\text{s.t.}\; \mathbf{a}^T\mathbf{b}=c \end{equation}

where $\mathbf{a},\mathbf{b} \in \mathbb{R}^n$ and $c \in \mathbb{R}$; here $\mathbf{b}$ and $c$ are given and we optimize over $\mathbf{a}$.

How do we find $\mathbf{a}^*$? Well, we have two cases.

Case 1. $p=0$. This case is straightforward: if $c\neq 0$, the smallest achievable $||\mathbf{a}||_0$ is $1$, and there is one such 1-sparse optimal solution for each nonzero entry of $\mathbf{b}$. Assuming $b_1 \neq 0$, one optimal solution is:

$\mathbf{a}^*=(c/b_1,0, \ldots, 0)$
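A quick numerical illustration of this case; the specific $\mathbf{b}$ and $c$ below are just example values:

import numpy as np

b = np.array([2.0, -1.0, 3.0])   # example vector with b_1 != 0
c = 5.0
a_star = np.zeros_like(b)
a_star[0] = c / b[0]             # a* = (c/b_1, 0, ..., 0)
print(a_star @ b)                # 5.0 -> the constraint a^T b = c holds
print(np.count_nonzero(a_star))  # 1   -> ||a*||_0 = 1, the minimum possible when c != 0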

Case 2. $p>0$. Here we make use of the concept of dual norms (strictly speaking, the argument below needs $p \geq 1$, so that $||\cdot||_p$ is indeed a norm). Formally, the dual norm $||\cdot||_q$ of the norm $||\cdot||_p$ is defined as

\begin{equation} ||\mathbf{b}||_q = \sup_{\mathbf{a}} \left\{ \mathbf{a}^T\mathbf{b} \;:\; ||\mathbf{a}||_p \leq 1 \right\} \end{equation}

In other words, for arbitrary $\mathbf{a}$ and $\mathbf{b}$, we have

\begin{equation} |\mathbf{a}^T\mathbf{b}| \leq ||\mathbf{a}||_p\,||\mathbf{b}||_q \end{equation}

where $p$ and $q$ are such that $\frac{1}{p}+\frac{1}{q}=1$, or equivalently $p= \frac{q}{q-1}$. This follows from Young's inequality for products, as we briefly show below.

From Young's inequality, we have, for all $i \in [n]$,

\begin{equation} |a_i|\,|b_i| \leq \frac{|a_i|^p}{p} + \frac{|b_i|^q}{q} \end{equation}

where equality holds only if $|a_i|^p=|b_i|^q$. Summing over all $i$, we get

\begin{equation} |\mathbf{a}^T\mathbf{b}| \leq \sum_{i=1}^{n} |a_i|\,|b_i| \leq \frac{||\mathbf{a}||_p^p}{p} + \frac{||\mathbf{b}||_q^q}{q} \end{equation}

Now, let $||\mathbf{a}||_p = ||\mathbf{b}||_q = 1$ and choose $\mathbf{a}$ with $|a_i|^p = |b_i|^q$ and $\operatorname{sign}(a_i) = \operatorname{sign}(b_i)$ for all $i$; then both inequalities are tight and we have

\begin{equation} \mathbf{a}^T\mathbf{b} = \frac{1}{p} + \frac{1}{q} = 1 = ||\mathbf{a}||_p\,||\mathbf{b}||_q \end{equation}

so the bound is attained. For vectors that are not unit-norm, the inequality follows by applying the above to $\mathbf{a}/||\mathbf{a}||_p$ and $\mathbf{b}/||\mathbf{b}||_q$.
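Here is a quick numerical sanity check of this inequality and of its equality condition; the dimension, exponent, and random vectors below are arbitrary choices for illustration:

import numpy as np

rng = np.random.default_rng(0)
p, q = 3.0, 1.5   # conjugate exponents: 1/3 + 1/1.5 = 1
a = rng.standard_normal(10)
b = rng.standard_normal(10)

# Hölder's inequality: |a^T b| <= ||a||_p ||b||_q
print(abs(a @ b) <= np.linalg.norm(a, p) * np.linalg.norm(b, q))   # True

# Equality when |a_i|^p = |b_i|^q (up to scaling) and the signs of a_i and b_i agree
a_eq = np.sign(b) * np.abs(b) ** (q - 1)
print(np.isclose(a_eq @ b, np.linalg.norm(a_eq, p) * np.linalg.norm(b, q)))  # True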

Having described dual norms, let's go back to our case $p>0$ of Eq. (1). From the definition of the dual norm (equivalently, Hölder's inequality), every feasible $\mathbf{a}$ satisfies $|c| = |\mathbf{a}^T\mathbf{b}| \leq ||\mathbf{a}||_p\,||\mathbf{b}||_q$, and since the inequality can be made tight, it is straightforward to see that

\begin{equation} ||\mathbf{a}^*||_p = \frac{|c|}{||\mathbf{b}||_q} \end{equation}

We found the objective function value at the optimal solution, but how about $\mathbf{a}^*$ itself? Well, we know that $\mathbf{a}^{*T}\mathbf{b}=c$ and $||\mathbf{a}^*||_p = \frac{|c|}{||\mathbf{b}||_q}$.

So $\mathbf{a}^*$ must be such that Hölder's inequality holds with equality:

\begin{equation} |\mathbf{a}^{*T}\mathbf{b}| = ||\mathbf{a}^*||_p\,||\mathbf{b}||_q \end{equation}

From the equality condition of Young's inequality and the definition of $||\cdot||_q$, this means $|a_i^*|^p \propto |b_i|^q$, i.e. $|a_i^*| \propto |b_i|^{q-1}$ (recall $q/p = q-1$), with $\operatorname{sign}(a_i^* b_i) = \operatorname{sign}(c)$ for all $i$. Fixing the scale via the constraint $\mathbf{a}^{*T}\mathbf{b}=c$, we get, for all $i \in [n]$,

\begin{equation} a_i^* = \frac{c}{||\mathbf{b}||_q^{q}}\,\operatorname{sign}(b_i)\,|b_i|^{q-1} \end{equation}

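As a quick sanity check, substituting this $\mathbf{a}^*$ back into the constraint and the objective recovers both $c$ and the optimal value derived above (using $(q-1)p = q$ and $q/p = q-1$):

\begin{equation} \mathbf{a}^{*T}\mathbf{b} = \frac{c}{||\mathbf{b}||_q^{q}}\sum_{i=1}^{n} |b_i|^{q-1}\,|b_i| = c, \qquad ||\mathbf{a}^*||_p = \frac{|c|}{||\mathbf{b}||_q^{q}}\Big(\sum_{i=1}^{n} |b_i|^{(q-1)p}\Big)^{1/p} = \frac{|c|}{||\mathbf{b}||_q^{q}}\,||\mathbf{b}||_q^{q-1} = \frac{|c|}{||\mathbf{b}||_q} \end{equation}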
Let’s see this in code.

# problem setup
import numpy as np
import cvxpy as cp  # the CVXOPT solver backend below also requires the cvxopt package

n = 10
b = np.random.randn(n)
c = np.random.randn()
p = 3

def solve_cvxopt(n, b, c, p):
    # Solve Eq. (1) numerically with cvxpy.
    if p == 0:
        print("p = 0 is not supported!")
        return None
    a = cp.Variable(shape=n)
    obj = cp.Minimize(cp.norm(a, p))
    constraint = [cp.sum(cp.multiply(a, b)) == c]
    prob = cp.Problem(obj, constraint)
    # ECOS and SCS solvers fail to converge before
    # the iteration limit. Use CVXOPT instead.
    prob.solve(solver=cp.CVXOPT, verbose=False)
    return a.value

def solve_dual(n, b, c, p):
    # Closed-form solutions derived above.
    if p == 0:
        # Put all the "mass" on the first nonzero entry of b.
        a_opt = np.zeros(n)
        for i, _b in enumerate(b):
            if abs(_b) > 0:
                a_opt[i] = c / b[i]
                return a_opt
    elif p == 1:
        # The dual of the l1 norm is the l-infinity norm: put all the mass
        # on the entry of b with the largest magnitude.
        a_opt = np.zeros(n)
        idx = np.argmax(abs(b))
        a_opt[idx] = c / b[idx]
        return a_opt
    else:
        # General case: a_i = c / ||b||_q^q * sign(b_i) * |b_i|^(q-1),
        # where q is the conjugate exponent of p.
        q = 1 if p == np.inf else p / (p - 1)
        den = np.linalg.norm(b, ord=q) ** q
        a_opt = c / den * np.sign(b) * (abs(b) ** (q - 1))
        return a_opt

x_opt = solve_cvxopt(n, b, c, p)
x_dual = solve_dual(n, b, c, p)
print(np.linalg.norm(x_opt, p))   # 0.06388941220368444
print(np.linalg.norm(x_dual, p))  # 0.06388941212687518
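We can run the same kind of check for the other values of $p$ handled in solve_dual, continuing with the $\mathbf{b}$ and $c$ defined above and comparing against the theoretical optimum $|c|/||\mathbf{b}||_q$:

# sanity check: constraint satisfaction and optimal value for several p
for p_test in [1, 2, 4, np.inf]:
    a = solve_dual(n, b, c, p_test)
    if p_test == 1:
        q_test = np.inf
    elif p_test == np.inf:
        q_test = 1
    else:
        q_test = p_test / (p_test - 1)
    print(p_test,
          np.isclose(a @ b, c),                      # constraint a^T b = c holds
          np.isclose(np.linalg.norm(a, ord=p_test),  # ||a*||_p matches |c| / ||b||_q
                     abs(c) / np.linalg.norm(b, ord=q_test)))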