This is the mail archive of the gsl-discuss@sourceware.cygnus.com mailing list for the GSL project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: combinations (summary)


I would like to express my appreciation to the people
who took time to help me with combinations problem:

Tim Peters   - for Python code which does EXACTLY what I wanted,
               for advise and for pointers to GNU's GMP package

Mirek Gorski - for pointer to GNU's GMP package
               http://www.math.iastate.edu/cbergman/crypto/gmp/gmp_toc.html

Pat Walters  - for C++ class for generating combinations

Emili Besalu - for pointer to his web page dealing with similar
               problems http://iqc.udg.es/~emili/emili_nc.htm

Thank you all very much! -- Ryszard

ORIGINAL QUESTION:
 
I am looking for a function which could compute
combinations given lexicographical index e.g.

C(1;3,2) -> 1,2
C(2;3,2) -> 1,3
C(3;3,2) -> 2,3

I have found function which does this
( http://math.nist.gov/GAMS.html ) but it is limited
to small numbers since it is using regular 4 bytes
representation for integers and therefore index
range is severly limited ( < 2^32 ).

Any pointers to the software which does this for
integers of arbitrary length would be very much
appreciated.


Ryszard Czerminski   phone: (781)395-1269 x 479
ArQule, Inc.         e-mail: ryszard@arqule.com
200 Boston Avenue    http://www.arqule.com
Medford, MA 02155

P.S.

from Tim Peters:

> I am looking for a function which could compute
> combinations given lexicographical index e.g.
>
> C(1;3,2) -> 1,2
> C(2;3,2) -> 1,3
> C(3;3,2) -> 2,3
>
> I have found function which does this
> ( http://math.nist.gov/GAMS.html ) but it is limited
> to small numbers since it is using regular 4 bytes
> representation for integers and therefore index
> range is severly limited ( < 2^32 ).

As is often the case, relaxing such limits creates other problems; in this
case, the Fortran TOMS 515 algorithm recomputes the number of combinations
from scratch each time thru the inner loop, which is very expensive if
choosing many elements from a very large set.

> Any pointers to the software which does this for
> integers of arbitrary length would be very much
> appreciated.

One is attached.  Note that since this is Python instead of Fortran,
everything is 0-based.  That is, when asking for the i'th combination of m
things taken n at a time, i ranges from 0 up to but not including C(m, n),
and the "canonical set" is taken to be range(m).  The inner loop avoids the
problem above, so this is reasonably fast even for multi-hundred digit
indices.

more-combinations-than-a-reasonable-person-could-want<wink>-ly y'rs  - tim

def _chop(n):
    """n -> int if it fits, else long."""

    try:
        return int(n)
    except OverflowError:
        return n

def comb(m, n):
    """m, n -> number of combinations of m items, n at a time.

    m >= n >= 0 required.
    """

    if not m >= n >= 0:
        raise ValueError("m >= n >= 0 required: " + `m, n`)
    if n > (m >> 1):
        n = m-n
    if n == 0:
        return 1
    result = long(m)
    i = 2
    m, n = m-1, n-1
    while n:
        # assert (result * m) % i == 0
        result = result * m / i
        i = i+1
        n = n-1
        m = m-1
    return _chop(result)

def combatindex(m, n, i):
    """m, n, i -> i'th combination of m items taken n at a time.

    m >= n >= 1 and 0 <= i < comb(m, n) required.

    Return the i'th combination in lexicographic order, as a list
    of n elements taken from range(m).
    The index (i) is 0-based.

    Example:
    >>> for i in range(6):
    ...    print combatindex(4, 2, i)
    [0, 1]
    [0, 2]
    [0, 3]
    [1, 2]
    [1, 3]
    [2, 3]
    """

    if not m >= n >= 1:
        raise ValueError("m >= n >= 1 required: " + `m, n`)
    c = long(comb(m, n))
    if not 0 <= i < c:
        raise ValueError("0 <= i < comb(m,n) required: " + `i, c`)
    result = []
    # have c == comb(m, n), want comb(m-1,n-1)
    c = c * n / m
    # invariant: c == comb(m-1, n-1)
    for element in xrange(m):
        if i < c:
            # take this element, and n-1 from the remaining
            result.append(element)
            n = n-1
            if n == 0:
                break
            # have c == comb(m-1,n), want comb(m-2,n-1)
            c = c * n / (m-1)
        else:
            # skip this element, and take all from the remaining
            i = i-c
            # have c == comb(m-1,n-1), want comb(m-2,n-1)
            c = c * (m-n) / (m-1)
        m = m-1
    assert i == 0
    return result

def _test():
    failures = 0
    m, n = 10, 6
    c = comb(m, n)
    last = [0] * n
    for i in xrange(c):
        this = combatindex(m, n, i)
        if len(this) != n:
            print "*** length error:", m, n, i, this
            failures = failures + 1
        if not last < this:
            print "*** lexicographic error:", m, n, i, last, this
            failures = failures + 1
        last = this
    if this != range(m-n, m):
        print "*** last value wrong:", m, n, c-1, this
        failures = failures + 1
    # c has more than 1400 digits in the next case -- may take a
    # few seconds
    m, n = 5000, 2000
    c = comb(m, n)
    this = combatindex(m, n, 0)
    if this != range(n):
        print "*** first value wrong:", m, n, 0, this
        failures = failures + 1
    this = combatindex(m, n, c-1)
    if this != range(m-n, m):
        print "*** last value wrong:", m, n, c-1, this
        failures = failures + 1
    if failures:
        print "*********** TEST FAILED *************"

if __name__ == "__main__":
    _test()

=========================================================

from Pat Walters:


combo.cpp

combo.h


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]