# What is Linear Programming?

## Or what do they mean by “solve for x”?

We all remember our introduction to basic algebra in grade or middle school. Our teachers would refer to them as “word problems” so we wouldn’t panic at the ambiguity of not simply applying a set of rehearsed instructions, as we had done up to that point with long addition. The first questions only involved a single unknown (often labeled x) that we had to “solve for”. For example, if a teacher brings 33 apples to class and gives each student 3, how many students are in the class? Then we moved on to more advanced problems involving multiple unknowns. We quickly learned that 2 unknowns require as many “facts” to yield a unique solution. And soon enough, without even realizing it, we were studying linear algebra.

Did you ever wonder what happens when we break the rule that `N` unknowns require `N` *independent* equations for a unique solution? There are only two cases to consider:

- more equations than unknowns (overdetermined system)
- fewer equations than unknowns (underdetermined system)

An overdetermined system generally has no exact solution. It’s common practice in that case
to pick the *least bad* approximation. This involves measuring how much our hypothetical solution
“misses” each equation, then searching for the solution that “misses” the least. Specialists
call this approach *minimizing the \(L_2\) norm* and the associated procedure
Ordinary Least Squares. This technique
is very important when trying to extract trends from data. However, OLS is not the focus of
this post. We might return to it in a later post on
statistical modeling.
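
For the curious, here is a minimal sketch of how `numpy` exposes this procedure (the system here is made up purely for illustration):

```
import numpy as np

# Three equations in two unknowns: an overdetermined system.
A = np.array([[1.0, 1.0], [1.0, -1.0], [2.0, 1.0]])
b = np.array([4.0, 0.0, 7.0])

# np.linalg.lstsq returns the x minimizing the L2 norm of A @ x - b.
x, residual, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)
```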

This post will focus on *underdetermined* systems. Underdetermined systems have either no
solution or infinitely many solutions. In the latter case, the solutions form a convex set
in \(\mathbb{R}^n\). That geometric interpretation is helpful, so let us take a moment to understand how it comes about.
First, we remember that \(a x + b y + c z + d = 0\) (where \(a\), \(b\), \(c\), and \(d\) represent known coefficients
whereas \(x\), \(y\), and \(z\) represent unknowns) defines the two-dimensional plane in three dimensions
which is orthogonal to the vector \((a, b, c)\). By extension, each linear equation in \(n\) unknowns defines
an \((n-1)\)-dimensional hyperplane, so we can interpret the solution set
of an underdetermined system of equations as the intersection of these hyperplanes. We struggle to
interpret geometric objects in more than 2 or 3 dimensions. Fortunately, we can *project* them to lower
dimensions. Those projections are equivalent to dropping one of the dimensions (say \(z\)) and turning
the equality into an inequality. The same concept lets us go the other way (turning inequalities into
equalities) by introducing slack variables.
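
For example, the constraint \(x + y \le 10\) that we will meet below becomes an equality once we introduce a non-negative slack variable \(s\):

\[x + y + s = 10, \quad s \ge 0\]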

Recapping: having more unknowns than equations leads to infinitely many solutions. We often choose
one *preferred* solution out of that infinite *feasible* set. This can be achieved by defining an
objective function over the set of solutions. In other
words, *optimization* (which is what we call choosing the best objective value over the *feasible* set)
is complementary to *regression* (where we find the *least bad* approximate solution). The next obstacle
is that linear optimization solvers expect the problem in
matrix form. Take, for example, the
wheat and barley example from the previous link, in
algebraic form:
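
\[\begin{eqnarray} \mathrm{maximize}\ 4x + 3y \\ \mathrm{subject\ to}\ x + y \le 10 \\ 3x + 6y \le 48 \\ 4x + 2y \le 32 \end{eqnarray}\]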

The corresponding matrix form reads

\[\begin{eqnarray} \mathrm{maximize}\ (4\ 3)\, x \\ \mathrm{subject\ to}\ \left( \begin{array}{cc} 1 & 1 \\ 3 & 6 \\ 4 & 2 \end{array} \right) x \le \left( \begin{array}{c} 10 \\ 48 \\ 32 \end{array} \right) \end{eqnarray}\]

Typography highlights the equivalence, but how do we make a computer “see” it? First, we need objects that let us represent symbolic expressions. It’s not hard to roll those by hand if we’re only interested in this specific problem, but that’s hardly the Python way. Much easier to install the standard Python package for symbolic manipulations: `sympy`. Next, we are going to write a simple function that lets us write our sample problem as

```
from sympy.abc import x, y

maximum = maximize(
    4 * x + 3 * y,
    x + y <= 10,
    3 * x + 6 * y <= 48,
    4 * x + 2 * y <= 32,
)
```

Our function is very similar to `sympy.linear_eq_to_matrix`, except it also accepts an objective function and doesn’t ask you to provide `symbols` but infers them for you.
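
To keep the snippets below anchored, here is a rough sketch of the scaffolding they live in (the signature is the only part the sample call above pins down; the body is assembled piece by piece in the rest of this post):

```
def maximize(objective, *constraints):
    # The objective expression comes first; every remaining positional
    # argument is a sympy inequality such as x + y <= 10.
    ...  # build c, A, and b as shown below, then hand them to a solver
```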

We start by collecting matrix coefficients:

```
from operator import attrgetter, methodcaller

# The constraints arrive as the *args of maximize; keep their left-hand sides.
lhs = map(attrgetter("lhs"), constraints)
coefficients = list(map(methodcaller("as_coefficients_dict"), lhs))
```
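
For instance, the left-hand side of the second constraint maps each variable to its coefficient (`sympy` returns a `defaultdict`, shown here approximately):

```
>>> (3 * x + 6 * y).as_coefficients_dict()
defaultdict(<class 'int'>, {x: 3, y: 6})
```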

This representation as a list of dictionaries reminds us of a sparse matrix in
Compressed Sparse Row
format. CSR is flat thanks to an extra array (`indptr`) which tells us where rows begin and end. That’s easy:

```
from itertools import accumulate
import numpy as np

# Row i spans data[indptr[i]:indptr[i + 1]], so indptr is the running total
# of coefficients per constraint, starting at 0.
indptr = np.fromiter(
    accumulate(map(len, coefficients), initial=0),
    dtype=int,
    count=len(constraints) + 1,
)
```
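
For our sample problem, every constraint touches both variables, so `indptr` comes out as `[0, 2, 4, 6]`.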

The `data` vector is equally trivial, we just need to flatten the `.values()` of `coefficients`:

```
from itertools import chain

data = np.fromiter(
    chain.from_iterable(map(methodcaller("values"), coefficients)),
    dtype=float,
    # indptr[-1] is the total number of nonzero coefficients.
    count=indptr[-1],
)
```

(Please forgive me if you’re finding `operator` and `itertools` hard to grok. List comprehensions would have been equally suitable to express this logic.)

`indices` is a bit trickier. It’s precisely what `np.unique` means by `return_inverse`. Unfortunately, `np.unique` requires *comparable* types because it invokes `np.argsort` under the
hood. Fortunately, `pd.factorize` does precisely what we need (and only requires *hashable* types):

```
import pandas as pd
from scipy import sparse

# indices maps each coefficient to its column; variables lists the unique
# symbols in order of first appearance.
indices, variables = pd.factorize(list(chain.from_iterable(coefficients)))
A = sparse.csr_array(
    (data, indices, indptr),
    shape=(len(constraints), len(variables)),
)
# The right-hand sides of the constraints form the b vector.
rhs = map(attrgetter("rhs"), constraints)
b = np.fromiter(
    map(methodcaller("evalf"), rhs), dtype=float, count=len(constraints)
)
c = np.vectorize(objective.coeff, otypes=[float])(variables)
```

`pd.factorize` also returned the unique variables, which we can pass to `sympy.Expr.coeff` to define the `c` vector.
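
For the sample problem (assuming `pd.factorize` meets \(x\) before \(y\)), `A.toarray()` gives `[[1, 1], [3, 6], [4, 2]]`, `b` gives `[10, 48, 32]`, and `c` gives `[4, 3]`: exactly the matrix form we wrote down earlier.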

And there you have it: just learning the absolute minimum about `sympy`’s rich API lets us
convert a linear optimization problem to canonical form.
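
As a closing sketch (this step isn’t covered above, and it assumes the default non-negativity bounds that `scipy.optimize.linprog` places on the variables), the canonical form plugs straight into a solver. `linprog` minimizes, so we negate `c`:

```
from scipy.optimize import linprog

# Negate c because linprog minimizes and we want to maximize.
result = linprog(-c, A_ub=A, b_ub=b, method="highs")
maximum = -result.fun  # for the sample problem: x = 6, y = 4, maximum = 36
```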