Combinatorics

Dr. Phillip M. Feldman

1. Introduction

I've created a Python module called combinatorics to supplement Python's itertools module.  combinatorics fills in gaps in the following areas of basic combinatorics:

  1. ordered and unordered m-way combinations,
  2. generalizations of the four basic occupancy problems (aka balls in boxes), and
  3. constrained permutations, otherwise known as the off-by-m problem.

Application areas include reliability engineering, quantum mechanics, and frequency-based cryptanalysis. (Frequency-based cryptanalysis is useless against modern encryption, but it is fun to create simple substitution cyphers and experiment with cracking them).

2. Elementary Combinatorics

2.1 prod

Overview

This function, like NumPy's prod function, calculates the product of a sequence of integers (typically specified via a list or tuple). Because the NumPy function uses 32-bit integer arithmetic with silent handling of overflows, results are wrong if the correct answer would exceed the limits of a signed 32-bit integer. When operating on a sequence of integers, the prod function defined in this module uses large integer arithmetic and thus always gives correct results.

Example (Interactive ipython I/O)

   In [1]: prod(range(1,11))
   Out[1]: 3628800

   In [2]: prod(range(1,31))
   Out[2]: 1409286144

   In [3]: from combinatorics import prod

   In [4]: prod(range(1,11))
   Out[4]: 3628800

   In [5]: prod(range(1,31))
   Out[5]: 265252859812191058636308480000000L

2.2 n_choose_m

The function n_choose_m calculates $\left( \begin{array}{c} n \\ m \end{array} \right)$, defined as the number of ways in which one can select m of n distinct objects without regard for order. The function uses only integer operations:

  1. If $m > n-m$, we replace $m$ by $n-m$. This depends on the fact that $\left( \begin{array}{c} n \\ m \end{array} \right) = \left( \begin{array}{c} n \\ n-m \end{array} \right)$.
  2. We calculate the answer by evaluating

          prod(range(n-m+1,n+1)) / prod(range(2,m+1)),
    
    which is equivalent to $n! / m! / (n-m)!$.
Note: Python can handle integers of arbitrary size, but the algorithm tends to bog down for very large values of n, partly because of the number of operations being performed and partly because of the memory requirements. For values of n above about 10000, use m_choose_n_ln instead of m_choose_n.

2.3 n_choose_m_ln

This function calculates the natural logarithm of $\left( \begin{array}{c} n \\ m \end{array} \right)$. The calculation is done using SciPy's gammaln function. For large $n$, especially for $n > 10000$, this function is much faster than n_choose_m (computational and memory requirements are both much lower).

Note: To obtain a value for $\left( \begin{array}{c} n \\ m \end{array} \right)$, apply the math.exp or numpy.exp function to the result returned by this function.

3. M-Way Combinations

3.1 m_way_ordered_combinations

Signature

   m_way_ordered_combinations(items, ks)

Overview

This function returns a generator that produces all m-way ordered combinations (multinomial combinations) from the specified collection of items, with ks[i]items in the ith group, i= 0, 1, 2, ..., m-1, where m= len(ks) is the number of groups. Ordered combinations means that the relative order of equal-size groups is important. The order of the items within any group is not important. The total number of combinations generated is given by the multinomial coefficient formula (see below).

Inputs

items must be (A) a list, tuple, or other iterable, or (B) a positive integer. If items is an integer, it is replaced by range(items). ks should be either a list or tuple containing non-negative integers, where the sum of these integers does not exceed the length of items.

Example

Let items=[0,1,2,3,4,5] and ks=[2,2,2]. The output includes a total of 90 combinations. Two of these are the following:
   ((0, 1), (2, 3), (4, 5))
   ((2, 3), (0, 1), (4, 5))
These are distinct because the order of the groups, which differs, is significant.

Notes

The total number of combinations generated is given by the following multinomial coefficient:

\[ \frac{n!}{k_0! \cdot k_1! \cdot \cdots \cdot k_{m-1}!} \]

where n is the number of items, m is the number of groups, and k_i is the number of items in the ith group.

3.2 m_way_unordered_combinations

Signature

   m_way_unordered_combinations(items, ks)

Overview

This function returns a generator that produces all m-way unordered combinations from the specified collection of items, with ks[i] items in the ith group, i= 0, 1, 2, ..., m-1, where m= len(ks) is the number of groups. Unordered combinations means that the relative order of equal-size groups is not important. The order of the items within any group is also unimportant.

Inputs

items must be (A) a list, tuple, or other iterable, or (B) a positive integer. If items is an integer, it is replaced by range(items). ks should be either a list or tuple containing non-negative integers, where the sum of these integers does not exceed the length of items.

Example

Consider the following reliability problem: Six similar but non-identical units must be grouped into three pairs—primary and redundant—except that we don't care which is which. Identify all the ways in which this can be done.
from combinatorics import *
list(m_way_unordered_combinations(6, [2,2,2]))
The output consists of the 15 combinations listed below:
(0, 1), (2, 3), (4, 5)
(0, 1), (2, 4), (3, 5)
(0, 1), (2, 5), (3, 4)
(0, 2), (1, 3), (4, 5)
(0, 2), (1, 4), (3, 5)
(0, 2), (1, 5), (3, 4)
(0, 3), (1, 2), (4, 5)
(0, 3), (1, 4), (2, 5)
(0, 3), (1, 5), (2, 4)
(0, 4), (1, 2), (3, 5)
(0, 4), (1, 3), (2, 5)
(0, 4), (1, 5), (2, 3)
(0, 5), (1, 2), (3, 4)
(0, 5), (1, 3), (2, 4)
(0, 5), (1, 4), (2, 3)

Notes

When all group sizes are unequal, the total number of combinations generated is given by the multinomial coefficient (see above). When two or more groups have equal sizes, the number of combinations is less than the multinomial coefficient because combinations that differ only in the relative order of equal-size groups are excluded.

4. Balls in Boxes

4.1 Introduction

Many combinatorial problems are equivalent to the problem of distributing balls into boxes. The balls may be either identical (unlabeled) or distinguishable (labeled), and the boxes may also be either identical or distinguishable; we thus have four classes of balls-in-boxes problems. Although typically the boxes do not have capacity limits, i.e., one may choose to put all of the balls in any single box, the most general formulation of the problem allows for capacity limits.

4.2 unlabeled_balls_in_labeled_boxes

Signature

   unlabeled_balls_in_labeled_boxes(n, box_sizes)

Overview

This function returns a generator that produces all distinct distributions of indistinguishable balls among labeled boxes with specified box sizes (capacities). This is a generalization of the most common formulation of the problem, where each box is sufficiently large to accommodate all of the balls, and is an important example of a class of combinatorics problems called weak composition problems.

Inputs

n is the number of balls.

box_sizes is a list of length 1 or greater. The length of the list corresponds to the number of boxes. box_sizes[i] is a positive integer that specifies the maximum capacity of the ith box. If box_sizes[i] equals n (or greater), the ith box can accommodate all n balls and thus effectively has unlimited capacity.

Acknowledgment

I'd like to thank Chris Rebert for helping me to convert my prototype class-based code into a generator function.

4.3 unlabeled_balls_in_unlabeled_boxes

Signature

   unlabeled_balls_in_unlabeled_boxes(n, box_sizes)

Overview

This function returns a generator that produces all distinct distributions of indistinguishable balls among indistinguishable boxes, with specified box sizes (capacities). This is a generalization of the most common formulation of the problem, where each box is sufficiently large to accommodate all of the balls. It might be asked, In what sense are the boxes indistinguishable if they have different capacities? The answer is that the box capacities must be considered when distributing the balls, but once the balls have been distributed, the identities of the boxes no longer matter.

Example (Interactive ipython I/O)

   In [1]: from combinatorics import *

   In [2]: list(unlabeled_balls_in_unlabeled_boxes(10, [5,4,3,2,1]))
   Out[2]:
   [(5, 4, 1, 0, 0),
    (5, 3, 2, 0, 0),
    (5, 3, 1, 1, 0),
    (5, 2, 2, 1, 0),
    (5, 2, 1, 1, 1),
    (4, 4, 2, 0, 0),
    (4, 4, 1, 1, 0),
    (4, 3, 3, 0, 0),
    (4, 3, 2, 1, 0),
    (4, 3, 1, 1, 1),
    (4, 2, 2, 2, 0),
    (4, 2, 2, 1, 1),
    (3, 3, 3, 1, 0),
    (3, 3, 2, 2, 0),
    (3, 3, 2, 1, 1),
    (3, 2, 2, 2, 1)]

4.4 labeled_balls_in_unlabeled_boxes

Signature

   labeled_balls_in_unlabeled_boxes(n, box_sizes)

Overview

This function returns a generator that produces all distinct distributions of distinguishable balls among indistinguishable boxes, with specified box sizes (capacities). This is a generalization of the most common formulation of the problem, where each box is sufficiently large to accommodate all of the balls.

labeled_balls_in_labeled_boxes(balls, box_sizes): This function returns a generator that produces all distinct distributions of distinguishable balls among distinguishable boxes, with specified box sizes (capacities). This is a generalization of the most common formulation of the problem, where each box is sufficiently large to accommodate all of the balls.

Example (Interactive ipython I/O)

In [1]: from combinatorics import *

In [2]: list(labeled_balls_in_labeled_boxes(3, [2,2]))
Out[2]:
[((0, 1), (2,)),
 ((0, 2), (1,)),
 ((1, 2), (0,)),
 ((0,), (1, 2)),
 ((1,), (0, 2)),
 ((2,), (0, 1))]

5. Partitions

5.1 Introduction

In number theory and combinatorics, a partition of a positive integer n, also called an integer partition, is a way of writing n as a sum of positive integers. Two sums that differ only in the order of their summands are considered to be the same partition. (The quote is from http://en.wikipedia.org/wiki/Partition_(number_theory).

5.2 partitions

Signature

   partitions(n)

Overview

The function partitions generates partitions of the specified integer using unlabeled_balls_in_unlabeled_boxes. Note that this is an inefficient method for generating partitions. partitions2 (see the next section of this document) is more efficient. The code for partitions, which is quite small, follows:
def partitions(n):
   _partitions= unlabeled_balls_in_unlabeled_boxes(n, n*[n])

   for _partition in _partitions:
      yield tuple([p for p in _partition if p])

Example (Interactive ipython I/O)

In this example, I generate a table listing the number of partitions versus $n$.
   In [1]: for i in range(1, 16):
      ...:     print('%2d: %d' % (i, len(list(partitions(i)))))
      ...:
    1: 1
    2: 2
    3: 3
    4: 5
    5: 7
    6: 11
    7: 15
    8: 22
    9: 30
   10: 42
   11: 56
   12: 77
   13: 101
   14: 135
   15: 176
Running the above example took more than a minute of CPU time on my computer.

5.3 partitions2

Signature

   partitions(n)

Overview

The function partitions2, which was written by David Eppstein, efficiently generates partitions of the specified integer. Generating a table of the number of partitions of the first 15 integers requires a small fraction of a second.

Example (Interactive ipython I/O)

In this example, I generate a table listing the number of partitions of the number 6.
   In [1]: from combinatorics import *

   In [2]: p= partitions2(6)

   In [3]: for i, partition in enumerate(p):
      ...:     print('%2d: %s' % (i, partition))
      ...:
    0: [1, 1, 1, 1, 1, 1]
    1: [1, 1, 1, 1, 2]
    2: [1, 1, 2, 2]
    3: [2, 2, 2]
    4: [1, 1, 1, 3]
    5: [1, 2, 3]
    6: [3, 3]
    7: [1, 1, 4]
    8: [2, 4]
    9: [1, 5]
   10: [6]

6. Constrained Permutations

off_by_one(n): This function returns a generator that enumerates all possible solutions of the off-by-one problem. This problem can be stated as follows:

A list of n items is provided. Each item is in its correct position, or one before or one after its correct position. Enumerate all possibilities for the correct ordering.

(These are essentially constrained permutations). With two lines of code, one can generate a table showing the number of possible permutations versus n:

for i in range(2,10):
   print('%d, %d' % (i, len(list(off_by_one(i))))

The output is as follows:

2, 2
3, 3
4, 5
5, 8
6, 13
7, 21
8, 34
9, 55

Note that the number of possible orders follows the Fibonacci sequence.

off_by_m(n, m): This recursive generator function enumerates all possible solutions of the 'off-by-m' problem. This problem, which is a generalization of the off-by-one problem, can be stated as follows:

A list of n items is provided. Each item is in its correct position, or up to m positions before or after its correct position. Enumerate all possibilities for the correct ordering.

Combinatorics-1.4.5.zip


Last update: 8 April, 2015