Friday, September 10, 2010

GLM and Python, scikits.statsmodels

Python version of the code for ARM 6.2 uses GLM() call under the scikits.statsmodels package. AFAIK there is no equivalent call for R factor() in this package. Also, in order to get the intercept as in R, we need to add an extra column of '1's. Other than that, modified R calls (withouts factors) and Python GLM calls match exactly.
import numpy
import scikits.statsmodels as sm
from scipy import stats

data = numpy.loadtxt("frisk_with_noise.dat", skiprows=7)

X = numpy.zeros((3,len(data[:,0])))
print X.shape

arrests = data[:,2]
arrests[arrests == 0] = 1
arrests = numpy.log(arrests)

stops = data[:,0]
stops[stops==0.0] = .0001

X[0,:] = arrests # arrests
X[1,:] = data[:,4] # eth
X[2,:] = numpy.ones(len(data[:,0])) # eth

glm = sm.GLM(stops, X.T, family=sm.family.Poisson())
res = glm.fit()
Google Groups

No comments:

Post a Comment