Combinatorics

Dr. Phillip M. Feldman

I've created a Python module called `combinatorics`

to supplement
Python's `itertools`

module. `combinatorics`

fills
in gaps in the following areas of basic combinatorics:

- ordered and unordered m-way combinations,
- generalizations of the four basic occupancy problems (aka
*balls in boxes*), and - constrained permutations, otherwise known as the
*off-by-m problem*.

Application areas include reliability engineering, quantum mechanics, and
frequency-based cryptanalysis. (Frequency-based cryptanalysis is useless
against modern encryption, but it is fun to create simple substitution cyphers
and experiment with cracking them).

`prod`

`prod`

function, calculates the product
of a sequence of integers (typically specified via a list or tuple). Because
the NumPy function uses 32-bit integer arithmetic with silent handling of
overflows, results are wrong if the correct answer would exceed the limits of a
signed 32-bit integer. When operating on a sequence of integers, the
`prod`

function defined in this module uses large integer arithmetic
and thus always gives correct results.
In [1]: prod(range(1,11)) Out[1]: 3628800 In [2]: prod(range(1,31)) Out[2]: 1409286144 In [3]: from combinatorics import prod In [4]: prod(range(1,11)) Out[4]: 3628800 In [5]: prod(range(1,31)) Out[5]: 265252859812191058636308480000000L

`n_choose_m`

The function `n_choose_m`

calculates $\left( \begin{array}{c}
n \\ m \end{array} \right)$, defined as the number of ways in which one can
select m of n distinct objects without regard for order. The function uses only
integer operations:

- If $m > n-m$, we replace $m$ by $n-m$. This depends on the fact that $\left( \begin{array}{c} n \\ m \end{array} \right) = \left( \begin{array}{c} n \\ n-m \end{array} \right)$.
We calculate the answer by evaluating

prod(range(n-m+1,n+1)) / prod(range(2,m+1)),

which is equivalent to $n! / m! / (n-m)!$.

`m_choose_n_ln`

instead of
`m_choose_n`

.
`n_choose_m_ln`

This function calculates the natural logarithm of $\left( \begin{array}{c}
n \\ m \end{array} \right)$. The calculation is done using SciPy's
`gammaln`

function. For large $n$, especially for $n > 10000$,
this function is much faster than `n_choose_m`

(computational and
memory requirements are both much lower).

Note: To obtain a value for $\left( \begin{array}{c} n \\ m \end{array}
\right)$, apply the `math.exp`

or `numpy.exp`

function
to the result returned by this function.

`m_way_ordered_combinations`

m_way_ordered_combinations(items, ks)

`ks[i]`

items in the ith group, i= 0, 1, 2, ..., m-1, where
`m= len(ks)`

is the number of groups. `items`

must be (A) a list, tuple, or other iterable, or (B) a
positive integer. If `items`

is an integer, it is replaced by
`range(items)`

.
`ks`

should be either a list or tuple containing non-negative
integers, where the sum of these integers does not exceed the length of
`items`

.
`items=[0,1,2,3,4,5]`

and `ks=[2,2,2]`

. The output
includes a total of 90 combinations. Two of these are the following:
((0, 1), (2, 3), (4, 5)) ((2, 3), (0, 1), (4, 5))These are distinct because the order of the groups, which differs, is significant.

\[ \frac{n!}{k_0! \cdot k_1! \cdot \cdots \cdot k_{m-1}!} \]

where n is the number of items, m is the number of groups, and k_i is the number of items in the ith group.

`m_way_unordered_combinations`

m_way_unordered_combinations(items, ks)

`ks[i]`

items in the ith
group, i= 0, 1, 2, ..., m-1, where `m= len(ks)`

is the number of
groups. `items`

must be (A) a list, tuple, or other iterable, or (B) a
positive integer. If `items`

is an integer, it is replaced by
`range(items)`

.
`ks`

should be either a list or tuple containing non-negative
integers, where the sum of these integers does not exceed the length of
`items`

.
from combinatorics import * list(m_way_unordered_combinations(6, [2,2,2]))The output consists of the 15 combinations listed below:

(0, 1), (2, 3), (4, 5) (0, 1), (2, 4), (3, 5) (0, 1), (2, 5), (3, 4) (0, 2), (1, 3), (4, 5) (0, 2), (1, 4), (3, 5) (0, 2), (1, 5), (3, 4) (0, 3), (1, 2), (4, 5) (0, 3), (1, 4), (2, 5) (0, 3), (1, 5), (2, 4) (0, 4), (1, 2), (3, 5) (0, 4), (1, 3), (2, 5) (0, 4), (1, 5), (2, 3) (0, 5), (1, 2), (3, 4) (0, 5), (1, 3), (2, 4) (0, 5), (1, 4), (2, 3)

`unlabeled_balls_in_labeled_boxes`

unlabeled_balls_in_labeled_boxes(n, box_sizes)

`n`

is the number of balls.

`box_sizes`

is a list of length 1 or greater. The length of the
list corresponds to the number of boxes. `box_sizes[i]`

is a positive
integer that specifies the maximum capacity of the ith box. If
`box_sizes[i]`

equals `n`

(or greater), the ith box can
accommodate all `n`

balls and thus effectively has unlimited
capacity.

`unlabeled_balls_in_unlabeled_boxes`

unlabeled_balls_in_unlabeled_boxes(n, box_sizes)

In [1]: from combinatorics import * In [2]: list(unlabeled_balls_in_unlabeled_boxes(10, [5,4,3,2,1])) Out[2]: [(5, 4, 1, 0, 0), (5, 3, 2, 0, 0), (5, 3, 1, 1, 0), (5, 2, 2, 1, 0), (5, 2, 1, 1, 1), (4, 4, 2, 0, 0), (4, 4, 1, 1, 0), (4, 3, 3, 0, 0), (4, 3, 2, 1, 0), (4, 3, 1, 1, 1), (4, 2, 2, 2, 0), (4, 2, 2, 1, 1), (3, 3, 3, 1, 0), (3, 3, 2, 2, 0), (3, 3, 2, 1, 1), (3, 2, 2, 2, 1)]

`labeled_balls_in_unlabeled_boxes`

labeled_balls_in_unlabeled_boxes(n, box_sizes)

`labeled_balls_in_labeled_boxes(balls, box_sizes)`

: This function
returns a generator that produces all distinct distributions of distinguishable
balls among distinguishable boxes, with specified box sizes (capacities). This
is a generalization of the most common formulation of the problem, where each
box is sufficiently large to accommodate all of the balls.

In [1]: from combinatorics import * In [2]: list(labeled_balls_in_labeled_boxes(3, [2,2])) Out[2]: [((0, 1), (2,)), ((0, 2), (1,)), ((1, 2), (0,)), ((0,), (1, 2)), ((1,), (0, 2)), ((2,), (0, 1))]

`partitions`

partitions(n)

`partitions`

generates partitions of the specified
integer using `unlabeled_balls_in_unlabeled_boxes`

.
Note that this is an inefficient method for
generating partitions. `partitions2`

(see the next section
of this document) is more efficient. The code for `partitions`

,
which is quite small, follows:
def partitions(n): _partitions= unlabeled_balls_in_unlabeled_boxes(n, n*[n]) for _partition in _partitions: yield tuple([p for p in _partition if p])

In [1]: for i in range(1, 16): ...: print('%2d: %d' % (i, len(list(partitions(i))))) ...: 1: 1 2: 2 3: 3 4: 5 5: 7 6: 11 7: 15 8: 22 9: 30 10: 42 11: 56 12: 77 13: 101 14: 135 15: 176Running the above example took more than a minute of CPU time on my computer.

`partitions2`

partitions(n)

`partitions2`

, which was written by David Eppstein,
efficiently generates partitions of the specified integer. Generating a table of
the number of partitions of the first 15 integers requires a small fraction of a
second.
In [1]: from combinatorics import * In [2]: p= partitions2(6) In [3]: for i, partition in enumerate(p): ...: print('%2d: %s' % (i, partition)) ...: 0: [1, 1, 1, 1, 1, 1] 1: [1, 1, 1, 1, 2] 2: [1, 1, 2, 2] 3: [2, 2, 2] 4: [1, 1, 1, 3] 5: [1, 2, 3] 6: [3, 3] 7: [1, 1, 4] 8: [2, 4] 9: [1, 5] 10: [6]

`off_by_one(n)`

: This function returns a generator that enumerates
all possible solutions of the *off-by-one* problem. This problem can be
stated as follows:

A list of n items is provided. Each item is in its correct position, or one before or one after its correct position. Enumerate all possibilities for the correct ordering.

(These are essentially constrained permutations). With two lines of code, one can generate a table showing the number of possible permutations versus n:

for i in range(2,10): print('%d, %d' % (i, len(list(off_by_one(i))))

The output is as follows:

2, 2 3, 3 4, 5 5, 8 6, 13 7, 21 8, 34 9, 55

Note that the number of possible orders follows the Fibonacci sequence.

`off_by_m(n, m)`

: This recursive generator function enumerates all
possible solutions of the 'off-by-m' problem. This problem, which is a
generalization of the off-by-one problem, can be stated as follows:

A list of n items is provided. Each item is in its correct position, or up to m positions before or after its correct position. Enumerate all possibilities for the correct ordering.