On Maximally Aligned Subspaces

Given a vector $v\in {-1,+1}^n$, we are interested in the set of $k$ orthogonal vectors $r_1,\ldots, r_k \in {-1,+1}^n$, that solves $\max_{\{r_i\}_k}\min_{i\in [k]}v^Tr_i$

From Tramer et al., 2018, it was shown that if for all $i$, $v^Tr_i \geq \alpha n$ for $\alpha \in (0,1)$, then $\alpha \leq 1/\sqrt{k}$. In other words, $OPT=n/\sqrt{k}$. Let’s denote by $R$ the matrix formed by stacking $r_i$ vertically.

from scipy.linalg import hadamard
import numpy as np
import itertools

# dim
n = 108 # imagenet dim
# adv cone size < n
k = 36
# target vector
v = np.sign(np.random.randn(n))
OPT= n / np.sqrt(k)

Naive Construction

The following naive method (Tramer et al, 2018) can be used to achieve ~$n/k$, a factor of $\sqrt{k}$ worse than OPT, assuming $k$ divides $n$.

def chain_lol(lol):
    return list(itertools.chain(*lol))

def construct_idx(chunk_size, k, n):
    """a method to get 1d idxs for the R matrix used by the naive construction method
    """
    return list(filter(
        lambda x: x < n*k,
        chain_lol(
        [range(i*(chunk_size)+j,i*(chunk_size ) + chunk_size+j) for i,j in enumerate(range(0,n*k,n))])))

def naive_R(n, k):
    chunk_size = (n + k - 1) // k
    R = np.zeros((k, n))
    #print(construct_idx(chunk_size, k, n))
    R.ravel()[construct_idx(chunk_size, k, n)] = v
    return R

R = naive_R(n,k)

R.dot(v), OPT

(array([3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
        3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3., 3.,
        3., 3.]), 18.0)

A factor of 10 worse for Imagenet. Let’s consider another construction from Tramer et al., 2018.

Tight Randomized Construction with Regular Hadamard matrix

Hs = {
    '4': np.load('../reg_hadamard_mats/reg_hadamard_mat_order-4.npy'),
    '16':np.load('../reg_hadamard_mats/reg_hadamard_mat_order-16.npy'),
    '36':np.load('../reg_hadamard_mats/reg_hadamard_mat_order-36.npy'),
    '64':np.load('../reg_hadamard_mats/reg_hadamard_mat_order-64.npy'),
    '100':np.load('../reg_hadamard_mats/reg_hadamard_mat_order-100.npy'),
}

H = Hs['100']

k = H.shape[0]
# target vector
v = np.sign(np.random.randn(n))
OPT= n / np.sqrt(k)

R = np.zeros((k, n))
R[:, :n // k * k ] = np.repeat(H, n // k, axis=1)
R *= v[None, :]

R.dot(v), OPT

(array([10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.,
        10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.,
        10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.,
        10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.,
        10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.,
        10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.,
        10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10., 10.,
        10., 10., 10., 10., 10., 10., 10., 10., 10.]), 10.8)

sum(H)

array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10])

H.dot(H.T)

array([[100,   0,   0, ...,   0,   0,   0],
       [  0, 100,   0, ...,   0,   0,   0],
       [  0,   0, 100, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ..., 100,   0,   0],
       [  0,   0,   0, ...,   0, 100,   0],
       [  0,   0,   0, ...,   0,   0, 100]])

In the above discussion, we considered the case where $|| r_i ||_\infty \leq1$ and $v\in {-1,+1}^n$, how about the case of $||r_i||_2 \leq 1$ and $v\in \mathbb{R}^n$.

A similar result can be shown such that $r_i^Tv \geq k^{-1/2} ||v||$ with $k=\min(\lfloor 1/\alpha^2 \rfloor, d)$ (See Tramer et al., 2017). Let’s define our setup below, before we show one possible construction.

n = 1000

def ei(n, i):
    """return the ith basis vector"""
    ei = np.zeros((n, 1))
    ei[i] = 1
    return ei

v = np.random.randn(n,1)
k = 6
norm_v = np.linalg.norm(v)
OPT = norm_v / np.sqrt(k)

The following is one possible construction

z = np.sum(np.eye(n)[:, :k], axis=1, keepdims=True) / np.sqrt(k)

S = ei(n, 1).dot(v.T) / norm_v
T = ei(n, 1).dot(z.T) / np.linalg.norm(z)**2
U = S.T.dot(T)

R = U.dot(np.eye(n)[:, :k])

v.T.dot(R), OPT

(array([[13.24266379, 13.24266379, 13.24266379, 13.24266379, 13.24266379,
         13.24266379]]), 13.242663787049473)