Tuesday, August 17, 2010

Sampling With Replacement Using Weights in Python

Here is a Python function corresponding to the sample() call in R. We based it on the code here; we only changed it so that the inputs are separate weight and value vectors instead of a single vector of (weight, value) tuples.
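import random

# Input format of the code we adapted: one list of (weight, value) tuples.
# It is kept for comparison and is not used below.
items = [(10, "low"),
         (100, "mid"),
         (890, "large")]

# The same data as separate weight and value vectors.
w = [10, 100, 890]
v = ["low", "mid", "large"]

def weighted_sample(ws, vs, n):
    """Yield n values from vs, drawn with replacement in proportion to ws."""
    total = float(sum(ws))
    i = 0
    w = ws[0]
    v = vs[0]
    while n:
        # Gap to the next sample point: the smallest of n uniform draws
        # on the remaining range [0, total).
        x = total * (1 - random.random() ** (1.0 / n))
        total -= x
        # Skip ahead until the point falls inside the current item's weight.
        while x > w:
            x -= w
            i += 1
            w = ws[i]
            v = vs[i]
        w -= x
        yield v
        n -= 1

for i in weighted_sample(w, v, 500):
    print(i)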
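
A quick way to convince ourselves the function behaves like a weighted sample is to draw many values and compare the observed shares with the weights. The sketch below is only a sanity check under a few assumptions of ours: the `frequencies` helper, the seed and the sample size of 10000 are made up for illustration, and `random.choices` (standard library, Python 3.6 and later) is used purely as a reference draw, not as part of the function above. The function is restated so the block runs on its own.

```python
import random
from collections import Counter


def weighted_sample(ws, vs, n):
    # Restated from the post so this block is self-contained.
    total = float(sum(ws))
    i = 0
    w = ws[0]
    v = vs[0]
    while n:
        x = total * (1 - random.random() ** (1.0 / n))
        total -= x
        while x > w:
            x -= w
            i += 1
            w = ws[i]
            v = vs[i]
        w -= x
        yield v
        n -= 1


def frequencies(sample):
    # Hypothetical helper: map each value to its share of the sample.
    counts = Counter(sample)
    return {value: counts[value] / float(len(sample)) for value in counts}


weights = [10, 100, 890]
values = ["low", "mid", "large"]
n = 10000  # arbitrary sample size, chosen for illustration

random.seed(0)  # fixed seed only so the run is repeatable
sample = list(weighted_sample(weights, values, n))
freq = frequencies(sample)

# Reference draw from the standard library (Python 3.6+).
reference = random.choices(values, weights=weights, k=n)
ref_freq = frequencies(reference)

# Columns: value, expected share, share from weighted_sample, share from random.choices
for value, weight in zip(values, weights):
    expected = weight / float(sum(weights))
    print(value, expected, freq.get(value, 0.0), ref_freq.get(value, 0.0))

# The generator walks the weight list once and never moves backwards,
# so equal values come out next to each other, in input order.
grouped = sample == sorted(sample, key=values.index)
print("grouped by input order:", grouped)
```

Note the last check: because the index only moves forward, the values are yielded grouped in the order of the input vectors (all the "low" draws first, then "mid", then "large"). If the order of the draws matters, collecting the output in a list and passing it to `random.shuffle` should be enough.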
