Applied psychometric data fit

I) Maximum likelihood: Matlab, SAS

Fitting the logistic Regression with Matlab on the mac

[b, dev, stat] = glmfit(x, [y Ny], 'binomial', 'logit')

where x is the variable manipulated, y is the number of outcome for a given x, Ny is the total number of case for a given x, binomial is the distribution and logit is the link function.

Ignore the warning messages. Always check with another program, but the answer should be correct.

glmfit works better with grouped data (as is the format of the data obtained with the method of constant stimulus). For adaptive data, I sort the data by increasing x, group the data, and fit the model for a group of N data point (N=10 for exemple). I then modify glmfit to use this solution as the initial guess for the parameters, and ran the maximum likelihood fit with the original data (ie N=1). With this two step method, glmfit works well with adaptive single x data.

Fitting the Logistic regression with SAS on a Unix machine

proc logistic data=one descending;

model success = experience;

proc genmod data=one descending;

model success = experience/dist=bin link=logit;

Fitting a "modified" logistic function with Matlab

A theoretical sigmoid function between .25 and 1 can be mapped onto -inf and inf by a simple transformation analogous to the logistic link.

Instead of using the logit function (p/(1-p)) we can use the modified logit function ((p-.25)/(1-p)). The difference is that while the observed probability has to be between 0 and 1 in the logistic case, it does not have between .25 and 1 in the forced choice case. If we map the range, [.25 1] to [-inf inf] what are we to do with observed probabilities in the range [0 .25]? My way around this problem has been to raise the observed probability to .251 or some such probability within the mapped range. This distortion of the data will be reflected in the model fit to some extent, but the effect is somewhat negligeable as long as most of the data is collected around 80% correct.

To modify the logistic link function to vary between .25 and 1 for example, instead of from 0 to 1 add the following lines to the matlab code

fl=inline('log((max(.25+1e-6, min(p, 1-1e-6))-.25) ./ (1-max(.25+1e-6, min(p, 1-1e-6))))')

fd=inline('0.75 ./ ((1-max(.25+1e-6, min(p, 1-1e-6))) .* (max(.25+1e-6, min(p, 1-1e-6))-0.25))')

fi=inline('(.25 + exp(x)) ./ (1+exp(x))')

mylink={fl fd fi};

fl is the link log((p-.25)/(1-p))

fd is the derivative u'/u when fl = log(u) : u'/u = (1-.25)/((1-p)(p-.25))

fi is the inverse function, if x=fl(p), p=fi(x).

then run the maximum likelihood analysis for quest type binomial data

[b, dev, stat] = glmfit(x, [y ones(size(y))],'binomial',mylink);

Fitting a "modified logistic" function with SAS on a Unix machine

proc genmod data=masha;

fwdlink link=log((_MEAN_ - 0.25)/(1-_MEAN_));

invlink ilink=(0.25 + exp(_XBETA_))/(1+exp(_XBETA_));

model correcthv =delay /dist=bin;

II) R and gnlr application

Compile and add Lindsey's gnlm library and rmutil library, load the package in R.

Here is an example using gnlr proposed by K K at

#Weibull parameters for simulated data
b <- log(3.5)
g <- 0.25
d <- 0.05
a <- 0.04
p <- c(a,b,g,d) <- 960
cnt <- 10^seq(-2,-1,length=6) # contrast levels

#simulated observer responses
wb <- function(p) { p[3]+(1-p[3]-p[4])*(1-exp(-((cnt/p[1])^exp(p[2])))) }
ny <- rbinom(length(cnt),,wb(p)) #number of yes responses
nn <- #number of no responses
phat <- ny/(ny+nn)
resp.mat <- matrix(c(ny,nn),ncol=2)

quartz() # put your graphic device driver here
xlab="Contrast",ylab="Probability Correct",las=1)
points(cnt,phat,cex=2,pch=16) <-

#ppred<-fitted( #predicts number of yes responses
cnt <- 10^seq(-2,-1,length=50)
ppred <- wb(coef(
names($coefficients) <-c("alpha","beta","gamma","delta")

Comparing gnlr and modified logistic

gnlr seems to be ideal since it fits the non linear shape and takes into account the binomial distribution of the data. However, in practice, gnlr is more sensitive to initial conditions, the modified logistic regression more robust. This may have to do with the nature of the data used to compare the two analysis: adaptive data around the threshold thought. There is relatively little data around the .25% correct point, and the data is mostly clustered around the threshold.

A paired analysis on 56 conditions showed no difference between the thresholds between the two methods, and a small difference between the likelihoods of negligeable effect size. Here is an example of the compared fits.