10 Analysis
Analysis of fMRI data can take many forms, but at the core of most analyses is a GLM — a general linear model that seeks to understand the data as a linear combination of several different explanatory variables. Typically, each voxel is analyzed separately, so the analysis produces maps of beta weights — the relative contribution of each stimulus to the signal in a voxel — and a corresponding statistical parameter to help the experimenter understand whether the result is significant.
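To make that idea concrete, here is a minimal sketch of a voxel-wise GLM fit in Python. The array names, sizes, and simulated values are purely illustrative assumptions, not code from this chapter's notebooks:

```python
import numpy as np

# Toy example: 200 time points, 3 explanatory variables (regressors),
# and a handful of voxels. All names and sizes are illustrative.
n_timepoints, n_regressors, n_voxels = 200, 3, 5

rng = np.random.default_rng(0)
X = rng.standard_normal((n_timepoints, n_regressors))          # design matrix
true_betas = np.tile([[2.0], [0.5], [-1.0]], (1, n_voxels))    # one beta per regressor, per voxel
Y = X @ true_betas + rng.standard_normal((n_timepoints, n_voxels))  # simulated data = signal + noise

# Fit the GLM by least squares. Because the model is linear in the data,
# every voxel's betas can be estimated in a single call.
betas, residual_ss, rank, _ = np.linalg.lstsq(X, Y, rcond=None)

print(betas.shape)  # (n_regressors, n_voxels): one beta map per regressor
```

The estimated betas (and the residuals, which feed into the statistical tests) come out voxel by voxel, which is why the result of a GLM analysis is a map for each regressor.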
The first section of the chapter lays out the basic approach for a GLM, with a brief digression into linear algebra for those who are curious about how it works or have a background in matrix algebra. That digression is not a substitute for a linear algebra course; it is just a set of pointers toward how you can do the analysis yourself if you already know (or want to learn!) a bit of linear algebra.
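For the curious, the ordinary least squares solution that the digression points toward can be written as beta-hat = (XᵀX)⁻¹Xᵀy. Here is a hedged sketch of that calculation with NumPy; the variable names and toy data are again made up for illustration:

```python
import numpy as np

# X is a design matrix (time x regressors) and y is the timeseries from a
# single voxel. These names and values are illustrative placeholders.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
y = X @ np.array([2.0, 0.5, -1.0]) + rng.standard_normal(200)

# Ordinary least squares, written out explicitly:
#   beta_hat = (X^T X)^{-1} X^T y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y

# In practice, using the pseudoinverse (or a least-squares solver) is more
# numerically stable than forming and inverting X^T X directly.
beta_hat_pinv = np.linalg.pinv(X) @ y
```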
The analysis described above assumes that we know exactly how to convert a neural event into a timeseries that predicts the BOLD response. (By convolving the neural timeseries with a canonical HIRF; go back and look at the convolution section in the previous chapter if you haven’t yet!) In reality, though, different people have different HIRFs. Here’s a brief discussion of approaches to building a model that can flex to capture that variability in HIRF shape across individuals.
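As a sketch of what that looks like in code, the example below convolves a stimulus timeseries with a double-gamma approximation to the canonical HIRF and then adds the HIRF’s temporal derivative as one simple way to let the fit absorb timing differences across people. The HIRF parameters, timing values, and variable names here are illustrative assumptions (the exact canonical shape differs across software packages), not the specific approach this chapter uses:

```python
import numpy as np
from scipy import stats

TR = 1.0                      # repetition time in seconds (illustrative)
t = np.arange(0, 30, TR)      # time axis for the HIRF kernel

# A rough double-gamma canonical HIRF: a peak around 5-6 s plus a small
# later undershoot. Parameter choices vary across packages.
canonical_hirf = stats.gamma.pdf(t, 6) - stats.gamma.pdf(t, 16) / 6.0
canonical_hirf /= canonical_hirf.sum()

# A neural/stimulus timeseries: 1 at the time points when the stimulus is on.
neural = np.zeros(200)
neural[[20, 60, 100, 140]] = 1.0

# Predicted BOLD regressor = neural timeseries convolved with the HIRF.
predicted_bold = np.convolve(neural, canonical_hirf)[: len(neural)]

# One common way to add flexibility: also include the HIRF's temporal
# derivative, which lets the fitted response shift slightly in time.
hirf_derivative = np.gradient(canonical_hirf, TR)
predicted_deriv = np.convolve(neural, hirf_derivative)[: len(neural)]
X_task = np.column_stack([predicted_bold, predicted_deriv])
```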
And while we’re talking about how to deal with the real world … our design matrix also needs some nuisance regressors to account for known sources of noise in the data — baseline drift and motion.
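A rough sketch of what those extra design-matrix columns might look like is below: low-order polynomial terms for slow drift, plus the six realignment parameters from motion correction. The names and placeholder values are illustrative, not the specific nuisance model used in this chapter:

```python
import numpy as np

n_timepoints = 200            # illustrative scan length
rng = np.random.default_rng(2)

# Drift: a constant plus low-order polynomial regressors to soak up slow
# baseline drift over the run.
time = np.linspace(-1, 1, n_timepoints)
drift = np.column_stack([np.ones(n_timepoints),   # constant baseline
                         time,                    # linear drift
                         time ** 2])              # quadratic drift

# Motion: the six realignment parameters (3 translations, 3 rotations)
# estimated during motion correction; random placeholders here.
motion = rng.standard_normal((n_timepoints, 6))

# Full design matrix = task regressors plus nuisance regressors.
# X_task stands in for the convolved task columns from the sketch above.
X_task = rng.standard_normal((n_timepoints, 2))
X_full = np.column_stack([X_task, drift, motion])
```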
Exercises
A Colab notebook to start exploring regression is here.
After looking through it and running the last cell a few times, answer these questions:
- Why do the “linear algebra” and “linear regression” approaches give different estimates for the beta weights associated with the two different stimuli?
- Are they, on average, the same?
- How does their agreement depend on the amount of noise in the simulated data?