\begin{funcdesc}{sample}{population, k}
Return a \var{k} length list of unique elements chosen from the
population sequence. Used for random sampling without replacement.
+ \versionadded{2.3}
- Returns a new list containing elements from the population. The
- list itself is in random order so that all sub-slices are also
- random samples. The original sequence is left undisturbed.
-
- If the population has repeated elements, then each occurence is a
- possible selection in the sample.
+ Returns a new list containing elements from the population while
+ leaving the original population unchanged. The resulting list is
+ in selection order so that all sub-slices will also be valid random
+ samples. This allows raffle winners (the sample) to be partitioned
+ into grand prize and second place winners (the sub-slices).
- If indices are needed for a large population, use \function{xrange}
- as an argument: \code{sample(xrange(10000000), 60)}.
+ Members of the population need not be hashable or unique. If the
+ population contains repeats, then each occurrence is a possible
+ selection in the sample.
- Optional argument random is a 0-argument function returning a random
- float in [0.0, 1.0); by default, the standard random.random.
- \versionadded{2.3}
+ To choose a sample from a range of integers, use \function{xrange}
+ as an argument. This is especially fast and space efficient for
+ sampling from a large population: \code{sample(xrange(10000000), 60)}.
\end{funcdesc}
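The behaviour described above can be illustrated with a minimal sketch. The entrant names, the seed value, and the variable names are hypothetical and chosen only for illustration. The sketch uses `range` as the Python 3 counterpart of \function{xrange}, which is an assumption about the interpreter running it, not part of the patch.

```python
import random

# Hypothetical raffle entrants; the names are illustrative only.
entrants = ["ann", "bob", "cy", "dee", "eve", "fay", "gus", "hal"]
original = list(entrants)  # copy kept to show the population is untouched

rng = random.Random(2003)  # seeded only so the sketch is repeatable

# Draw three winners without replacement, in selection order.
winners = rng.sample(entrants, 3)

# Because the result is in selection order, its slices are themselves
# valid random samples and can be used to partition the winners.
grand_prize = winners[:1]
second_place = winners[1:]

# Sample indices from a large range without materialising the range.
# range() here stands in for xrange() on Python 3 (assumption).
indices = rng.sample(range(10000000), 60)
```

Slicing the result rather than calling `sample` twice keeps the two prize groups disjoint, since a second independent call could select a first-round winner again.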
These must be integers in the range [0, 256).
"""
- if not type(x) == type(y) == type(z) == type(0):
+ if not type(x) == type(y) == type(z) == int:
raise TypeError('seeds must be integers')
if not (0 <= x < 256 and 0 <= y < 256 and 0 <= z < 256):
raise ValueError('seeds must be in range(0, 256)')
# Previous selections are stored in dictionaries which provide
# __contains__ for detecting repeat selections. Discarding repeats
# is efficient unless most of the population has already been chosen.
- # So, tracking selections is useful when sample sizes are much
- # smaller than the total population.
+ # So, tracking selections is fast only with small sample sizes.
n = len(population)
if not 0 <= k <= n:
random = self.random
result = [None] * k
if n < 6 * k: # if n len list takes less space than a k len dict
- pool = list(population) # track potential selections
- for i in xrange(k):
- j = int(random() * (n-i)) # non-selected at [0,n-i)
- result[i] = pool[j] # save selected element
- pool[j] = pool[n-i-1] # non-selected to head of list
+ pool = list(population)
+ for i in xrange(k): # invariant: non-selected at [0,n-i)
+ j = int(random() * (n-i))
+ result[i] = pool[j]
+ pool[j] = pool[n-i-1]
else:
- selected = {} # track previous selections
+ selected = {}
for i in xrange(k):
j = int(random() * n)
- while j in selected: # discard and replace repeats
+ while j in selected:
j = int(random() * n)
result[i] = selected[j] = population[j]
- return result # return selections in the order they were picked
+ return result
## -------------------- real-valued distributions -------------------
# Math Software, 3, (1977), pp257-260.
random = self.random
- while 1:
+ while True:
u1 = random()
u2 = random()
z = NV_MAGICCONST*(u1-0.5)/u2
b = (a - _sqrt(2.0 * a))/(2.0 * kappa)
r = (1.0 + b * b)/(2.0 * b)
- while 1:
+ while True:
u1 = random()
z = _cos(_pi * u1)
bbb = alpha - LOG4
ccc = alpha + ainv
- while 1:
+ while True:
u1 = random()
u2 = random()
v = _log(u1/(1.0-u1))/ainv
# Uses ALGORITHM GS of Statistical Computing - Kennedy & Gentle
- while 1:
+ while True:
u = random()
b = (_e + alpha)/_e
p = b*u