BrainStat : A toolbox for statistical analysis of neuroimaging data

https://img.shields.io/badge/License-BSD%203--Clause-blue.svg https://github.com/MICA-MNI/BrainStat/actions/workflows/python_unittests.yml/badge.svg https://github.com/MICA-MNI/BrainStat/actions/workflows/MATLAB_unittests.yml/badge.svg https://readthedocs.org/projects/brainstat/badge/?version=master&style=plastic https://img.shields.io/badge/code%20style-black-000000.svg https://www.mathworks.com/matlabcentral/images/matlab-file-exchange.svg

Welcome to BrainStat’s documentation!

BrainStat is a toolbox for the statistical analysis and context decoding of neuroimaging data. It implements both univariate and multivariate linear models and interfaces with the BigBrain Atlas, Allen Human Brain Atlas and Nimare databases. BrainStat flexibly handles common surface, volume, and parcel level data formats, and provides a series of interactive visualization functions. The toolbox has been implemented in both Python and MATLAB, the two most widely adopted programming languages in the neuroimaging and neuroinformatics communities. It is openly available, and documented here.

https://github.com/MICA-LAB/BrainStat/blob/master/docs/figures/brainstat_logo_bw.png?raw=true

Developers

  • Reinder Vos de Wael - MICA Lab, Montreal Neurological Institute

  • Şeyma Bayrak - Max Planck Institute for Human Cognitive and Brain Sciences

  • Oualid Benkarim - MICA Lab, Montreal Neurological Institute

  • Sara Lariviere - MICA Lab, Montreal Neurological Institute

  • Raul Cruces - MICA Lab, Montreal Neurological Institute

  • Peer Herholz - Montreal Neurological Institute

  • Seok-Jun Hong - Sungkyunkwan University

  • Sofie Valk - Max Planck Institute for Human Cognitive and Brain Sciences

  • Boris Bernhardt - Montreal Neurological Institute

License

The BrainStat source code is available under the BSD (3-Clause) license.

Support

If you have problems installing the software or questions about usage and documentation, or something else related to BrainStat, you can post to the Issues section of our repository.

Installation Guide

BrainStat is available in Python and MATLAB.

Python installation

BrainStat requires Python 3.7+. Assuming you have the correct version of Python installed and aliased to python, you can install BrainStat by running the following

python -m pip install brainstat
Python Dependencies

If you want to use the meta analysis module, you’ll also have to download and install the package pyembree. This package is only available through conda-forge:

conda install -c conda-forge pyembree

MATLAB installation

This toolbox is compatible with MATLAB versions R2019b and newer.

We recommend installing the toolbox through the Mathworks FileExchange. Simply download the file as a toolbox and open the .mltbx file in MATLAB. Alternatively, you can install the same .mltbx file from our GitHub Releases.

If you don’t want to install BrainStat as a MATLAB Toolbox, you can also simply download the repository and run the following in MATLAB:

addpath(genpath('/path/to/BrainStat/brainstat_matlab/'))

If you want to load BrainStat every time you start MATLAB, type edit startup and append the above line to the end of this file.

MATLAB Dependencies

BrainStat relies on functionality included in the BrainSpace toolbox. Please see the BrainSpace installation guide for installation instructions.

If you wish to open gifti files (required for data loader function) you will also need to install the gifti library.

Python Index

Tutorials

Tutorial 01: Linear Models

In this tutorial you will set up your first linear model with BrainStat. To this end, we will first load some sample data from the MICS dataset.

from brainstat.datasets import fetch_mask, fetch_template_surface
from brainstat.tutorial.utils import fetch_mics_data

# Load behavioral markers
thickness, demographics = fetch_mics_data()
pial_left, pial_right = fetch_template_surface("fsaverage5", join=False)
pial_combined = fetch_template_surface("fsaverage5", join=True)
mask = fetch_mask("fsaverage5")

Lets have a look at the cortical thickness data. To do this, we will use the surface plotter included with BrainSpace. As we’ll be plotting data onto these hemispheres quite often in this tutorial we’ll create a simple function for it and plot mean thickness here.

import numpy as np
from brainspace.plotting import plot_hemispheres


def local_plot_hemispheres(values, label_text, color_range, cmap="viridis"):
    # Plot cortical surfaces with values as the data, label_text as
    # the labels, and color_range as the limits of the color bar.
    return plot_hemispheres(
        pial_left,
        pial_right,
        values,
        color_bar=True,
        color_range=color_range,
        label_text=label_text,
        cmap=cmap,
        embed_nb=True,
        size=(1400, 200),
        zoom=1.45,
        nan_color=(0.7, 0.7, 0.7, 1),
        cb__labelTextProperty={"fontSize": 12},
        interactive=False,
    )


local_plot_hemispheres(np.mean(thickness, axis=0), ["Cortical Thickness"], (1.5, 3.5))
plot tutorial 01 basics

Out:

<IPython.core.display.Image object>

Lets also have a look at what’s inside the demographics data.

print(demographics)

Out:

    SUB_ID  VISIT  AGE_AT_SCAN SEX
0   031404      1    -0.579678   F
1   04a144      1    -0.812116   M
2   0b78f1      1     0.117636   M
3   0d26b9      1     0.466293   F
4   1988b8      1    -0.114802   M
..     ...    ...          ...  ..
77  f25714      1    -0.231021   F
78  f25714      2     0.117636   F
79  f615a5      1    -0.695897   F
80  feac6b      1    -0.695897   F
81  feac6b      2    -0.347240   F

[82 rows x 4 columns]

Demographics contains four variables: a subject ID, a visit number (some subjects visited multiple times), their age at the time of scanning and their sex. Lets also print some summary statistics.

# Print demographics summary.
for i in range(1, 3):
    print(
        (
            f"Visit {i}, N={np.sum(demographics.VISIT==i)}, "
            f"{np.sum(demographics.SEX[demographics.VISIT == i] == 'F')} females, "
            f"mean subject age {np.mean(demographics.AGE_AT_SCAN[demographics.VISIT == i]):.2f}, "
            f"standard deviation of age: {np.std(demographics.AGE_AT_SCAN[demographics.VISIT==i]):.2f}."
        )
    )

Out:

Visit 1, N=70, 30 females, mean subject age -0.02, standard deviation of age: 1.02.
Visit 2, N=12, 5 females, mean subject age 0.09, standard deviation of age: 0.84.

Next, we will assess whether a subject’s age is related to their cortical thickness. To this end we can create a linear model with BrainStat. For our first model, we will only consider the effect of age, i.e. we will disregard the effect of sex and that some subjects visit twice. this end we can create a linear model with BrainStat. First we declare the age variable as a FixedEffect. The FixedEffect class can be created in two ways: either we provide the data with pandas, as we do here, or we provide a numpy array and a name for the fixed effect. Lets set up the model Y = intercept + B1 * age. Note that BrainStat includes an intercept by default.

from brainstat.stats.terms import FixedEffect

term_age = FixedEffect(demographics.AGE_AT_SCAN)
model = term_age

As said before, if your data is not in a pandas DataFrame (e.g. numpy), you’ll have to provide the name of the effect as an additional parameter as follows:

term_age_2 = FixedEffect(demographics.AGE_AT_SCAN.to_numpy(), "AGE_AT_SCAN")

Lets have a look at one of these models. As you can see below, the model is stored in a format closely resembling a pandas DataFrame. Note that an intercept is automatically added to the model. This behavior can be disabled in the FixedEffect call, but we recommend leaving it enabled. We can also access the vectors related to each effect by their name i.e. model.intercept and model.AGE_AT_SCAN will return the vectors of the intercept and age, respectively.

print(model)

Out:

    intercept  AGE_AT_SCAN
0           1    -0.579678
1           1    -0.812116
2           1     0.117636
3           1     0.466293
4           1    -0.114802
..        ...          ...
77          1    -0.231021
78          1     0.117636
79          1    -0.695897
80          1    -0.695897
81          1    -0.347240

[82 rows x 2 columns]

Now, imagine we have some cortical marker (e.g. cortical thickness) for each subject, and we want to evaluate whether this marker is different across the the lifespan. To do this, we can use the model we defined before, and a contrast in observations (here: age). Then we simply initialize an SLM model and fit it to the cortical thickness data.

from brainstat.stats.SLM import SLM

contrast_age = demographics.AGE_AT_SCAN
slm_age = SLM(
    model,
    contrast_age,
    surf="fsaverage5",
    mask=mask,
    correction=["fdr", "rft"],
    cluster_threshold=0.01,
)
slm_age.fit(thickness)

The resulting model, slm_age, will contain the t-statistic map, p-values derived with the requested corrections, and a myriad of other properties (see the API for more details). Lets plot the t-values and p-values on the surface. We’ll do this a few times throughout the tutorial so lets define a function to do this.

def plot_slm_results(slm, plot_peak=False, plot_fdr=False):

    handles = [local_plot_hemispheres(slm.t, ["t-values"], (-4, 4), "bwr")]

    plot_pvalues = [np.copy(slm.P["pval"]["C"])]
    labels = ["Cluster p-values"]

    if plot_peak:
        plot_pvalues.append(np.copy(slm.P["pval"]["P"]))
        labels.append("Peak p-vales")

    if plot_fdr:
        plot_pvalues.append(np.copy(slm.Q))
        labels.append("Vertex p-values")

    [np.place(x, np.logical_or(x > 0.05, ~mask), np.nan) for x in plot_pvalues]

    for i in range(len(plot_pvalues)):
        handles.append(
            local_plot_hemispheres(plot_pvalues[i], [labels[i]], (0, 0.05), "plasma_r")
        )

    return handles


plot_slm_results(slm_age, plot_peak=True, plot_fdr=True)
  • plot tutorial 01 basics
  • plot tutorial 01 basics
  • plot tutorial 01 basics
  • plot tutorial 01 basics

Out:

[<IPython.core.display.Image object>, <IPython.core.display.Image object>, <IPython.core.display.Image object>, <IPython.core.display.Image object>]

Only clusters are significant, and not peaks. This suggests that the age effect covers large regions, rather than local foci. Furthermore, at the vertexwise level we only find a small group of significant vertices in the left cingulate cortex. Lets have a closer look at the clusters and their peaks. The data on clusters are stored in tables inside BrainStatModel.P.clus and information on the peaks is stored in BrainStatModel.P.peak. If a two-tailed test is run (BrainStat defaults to two-tailed), a table is returned for each tail. The first table uses the contrast as provided, the second table uses the inverse contrast. If a one-tailed test is performed, then only a single table is returned. Lets print the first 15 rows of the inverted contrast cluster table.

print(slm_age.P["clus"][1])

Out:

    clusid  nverts    resels         P
0        1   141.0  6.283315  0.000033
1        2    82.0  3.994467  0.001858
2        3    69.0  3.871711  0.002362
3        4    61.0  3.670485  0.003517
4        5    82.0  3.652319  0.003648
..     ...     ...       ...       ...
73      74     1.0  0.050811  1.000000
74      75     1.0  0.043958  1.000000
75      76     1.0  0.039022  1.000000
76      77     1.0  0.032002  1.000000
77      78     1.0  0.019503  1.000000

[78 rows x 4 columns]

Here, we see that cluster 1 contains 373 vertices. Clusters are sorted by p-value; later clusters will generally be smaller and have higher p-values. Lets now have a look at the peaks within these clusters.

print(slm_age.P["peak"][1])

Out:

            t  vertid  clusid          P               yeo7
0    5.695420   18720      11   0.001248  Ventral Attention
1    5.164823    5430      12   0.009035             Limbic
2    4.855500   16911       6   0.027242  Ventral Attention
3    4.833974   19629       2   0.029335     Frontoparietal
4    4.628306   12603      14   0.059519       Default mode
..        ...     ...     ...        ...                ...
109  2.403000    2276      62  23.356468  Ventral Attention
110  2.394788    2185      74  23.709038       Default mode
111  2.389922   14687      76  23.918494       Default mode
112  2.382012    6087      64  24.258914       Default mode
113  2.375295    3243      72  24.548027       Default mode

[114 rows x 5 columns]

Within cluster 1, we are able to detect several peaks. The peak with the highest t-statistic (t=4.3972) occurs at vertex 19629, which is inside the frontoparietal network as defined by the Yeo-7 networks. Note that the Yeo network membership is only provided if the surface is specified as a template name as we did here. For custom surfaces, or pre-loaded surfaces (as we will use below) this column is omitted.

Interaction effects models

Similarly to age, we can also test for the effect of sex on cortical thickness.

term_sex = FixedEffect(demographics.SEX)
model_sex = term_sex
contrast_sex = (demographics.SEX == "M").astype(int) - (demographics.SEX == "F").astype(
    int
)

Next we will rerrun the model and see if our results change.

slm_sex = SLM(
    model_sex,
    contrast_sex,
    surf=pial_combined,
    mask=mask,
    correction=["fdr", "rft"],
    two_tailed=False,
    cluster_threshold=0.01,
)
slm_sex.fit(thickness)
plot_slm_results(slm_sex)
  • plot tutorial 01 basics
  • plot tutorial 01 basics

Out:

[<IPython.core.display.Image object>, <IPython.core.display.Image object>]

Here, we find few significant effects of sex on cortical thickness. However, as we’ve already established, age has an effect on cortical thickness. So we may want to correct for this effect before evaluating whether sex has an effect on cortical thickenss. Lets make a new model that includes the effect of age.

model_sexage = term_age + term_sex

Next we will rerrun the model and see if our results change.

slm_sexage = SLM(
    model_sexage,
    contrast_sex,
    surf=pial_combined,
    mask=mask,
    correction=["fdr", "rft"],
    two_tailed=False,
    cluster_threshold=0.01,
)
slm_sexage.fit(thickness)
plot_slm_results(slm_sexage)
  • plot tutorial 01 basics
  • plot tutorial 01 basics

Out:

[<IPython.core.display.Image object>, <IPython.core.display.Image object>]

After accounting for the effect of age, we still find only one significant cluster of effect of sex on cortical thickness. However, it could be that age affects men and women differently. To account for this, we could include an interaction effect into the model. Lets run the model again with an interaction effect.

# Effect of sex on cortical thickness.
model_sexage_int = term_age + term_sex + term_age * term_sex

slm_sexage_int = SLM(
    model_sexage_int,
    contrast_sex,
    surf=pial_combined,
    mask=mask,
    correction=["rft"],
    cluster_threshold=0.01,
)
slm_sexage_int.fit(thickness)
plot_slm_results(slm_sexage_int)
  • plot tutorial 01 basics
  • plot tutorial 01 basics

Out:

[<IPython.core.display.Image object>, <IPython.core.display.Image object>]

After including the interaction effect, we no significant effects of sex on cortical thickness in several clusters.

We could also look at whether the cortex of men and women changes differently with age by comparing their interaction effects.

# Effect of age on cortical thickness for the healthy group.
contrast_sex_int = demographics.AGE_AT_SCAN * (
    demographics.SEX == "M"
) - demographics.AGE_AT_SCAN * (demographics.SEX == "F")

slm_sex_int = SLM(
    model_sexage_int,
    contrast_sex_int,
    surf=pial_combined,
    mask=mask,
    correction=["rft"],
    cluster_threshold=0.01,
)
slm_sex_int.fit(thickness)
plot_slm_results(slm_sex_int)
  • plot tutorial 01 basics
  • plot tutorial 01 basics

Out:

[<IPython.core.display.Image object>, <IPython.core.display.Image object>]

Indeed, it appears that the interaction effect between sex and age is quite different across men and women, with stronger effects occuring in women.

One-tailed Test

Imagine that, based on prior research, we hypothesize that men have higher cortical thickness than women. In that case, we could run this same model with a one-tailed test, rather than a two-tailed test. By default BrainStat uses a two-tailed test. If you want to get a one-tailed test, simply specify it in the SLM model initialization with ‘two_tailed’, false. Note that the one-tailed test will test for the significance of positive t-values. If you want to test for the significance of negative t-values, simply change the sign of the contrast. We may hypothesize based on prior research that cortical thickness decreases with age, so we could specify this as follows. Note the minus in front of contrast_age to test for decreasing thickness with age.

from brainstat.stats.SLM import SLM

slm_onetailed = SLM(
    model_sexage_int,
    -contrast_age,
    surf=pial_combined,
    mask=mask,
    correction=["rft"],
    cluster_threshold=0.01,
    two_tailed=False,
)
slm_onetailed.fit(thickness)
plot_slm_results(slm_onetailed)
  • plot tutorial 01 basics
  • plot tutorial 01 basics

Out:

[<IPython.core.display.Image object>, <IPython.core.display.Image object>]

Notice the additional clusters that we find when using a one-tailed test.

Mixed Effects Models

So far, we’ve considered multiple visits of the same subject as two separate, independent measurements. Clearly, however, such measurements are not independent of each other. To account for this, we could add subject ID as a random effect. Lets do this and test the effect of age on cortical thickness again.

from brainstat.stats.terms import MixedEffect

term_subject = MixedEffect(demographics.SUB_ID)

model_mixed = term_age + term_sex + term_age * term_sex + term_subject

slm_mixed = SLM(
    model_mixed,
    -contrast_age,
    surf=pial_combined,
    mask=mask,
    correction=["fdr", "rft"],
    cluster_threshold=0.01,
    two_tailed=False,
)
slm_mixed.fit(thickness)
plot_slm_results(slm_mixed, True, True)
  • plot tutorial 01 basics
  • plot tutorial 01 basics
  • plot tutorial 01 basics
  • plot tutorial 01 basics

Out:

[<IPython.core.display.Image object>, <IPython.core.display.Image object>, <IPython.core.display.Image object>, <IPython.core.display.Image object>]

Compared to our first age model, we find fewer and smaller clusters, indicating that by not accounting for the repeated measures structure of the data we were overestimating the significance of effects.

That concludes the basic usage of the BrainStat for statistical models. In the next tutorial we’ll show you how to use the context decoding module.

Total running time of the script: ( 0 minutes 13.555 seconds)

Gallery generated by Sphinx-Gallery

Tutorial 02: Context Decoding

In this tutorial you will learn about the context decoding tools included with BrainStat. The context decoding module consists of three parts: genetic decoding, meta-analytic decoding and histological comparisons. Before we start, lets run a linear model testing for the effects of age on cortical thickness as we did in Tutorial 1. We’ll use the results of this model later in this tutorial.

from brainstat.datasets import fetch_mask, fetch_template_surface
from brainstat.stats.SLM import SLM
from brainstat.stats.terms import FixedEffect, MixedEffect
from brainstat.tutorial.utils import fetch_mics_data

thickness, demographics = fetch_mics_data()
mask = fetch_mask("fsaverage5")

term_age = FixedEffect(demographics.AGE_AT_SCAN)
term_sex = FixedEffect(demographics.SEX)
term_subject = MixedEffect(demographics.SUB_ID)
model = term_age + term_sex + term_age * term_sex + term_subject

contrast_age = -model.mean.AGE_AT_SCAN
slm = SLM(
    model,
    contrast_age,
    surf="fsaverage5",
    mask=mask,
    correction=["fdr", "rft"],
    two_tailed=False,
    cluster_threshold=0.01,
)
slm.fit(thickness)
Genetics

For genetic decoding we use the Allen Human Brain Atlas through the abagen toolbox. Note that abagen only accepts parcellated data. Here is a minimal example of how we use abagen to get the genetic expression of the 100 regions of the Schaefer atlas and how to plot this expression to a matrix. Please note that downloading the dataset and running this analysis can take several minutes.

import copy

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from brainspace.utils.parcellation import reduce_by_labels
from matplotlib.cm import get_cmap

from brainstat.context.genetics import surface_genetic_expression
from brainstat.datasets import fetch_parcellation

# Get Schaefer-100 genetic expression.
schaefer_100_fs5 = fetch_parcellation("fsaverage5", "schaefer", 100)
surfaces = fetch_template_surface("fsaverage5", join=False)
expression = surface_genetic_expression(schaefer_100_fs5, surfaces, space="fsaverage")

# Plot Schaefer-100 genetic expression matrix.
colormap = copy.copy(get_cmap())
colormap.set_bad(color="black")
plt.imshow(expression, aspect="auto", cmap=colormap)
plt.colorbar()
plt.xlabel("Genetic Expression")
plt.ylabel("Schaefer 100 Regions")
plt.show()
plot tutorial 02 context

Expression is a pandas DataFrame which shows the genetic expression of genes within each region of the atlas. By default, the values will fall in the range [0, 1] where higher values represent higher expression. However, if you change the normalization function then this may change. Some regions may return NaN values for all genes. This occurs when there are no samples within this region across all donors. We’ve denoted this region with the black color in the matrix. By default, BrainStat uses all the default abagen parameters. If you wish to customize these parameters then the keyword arguments can be passed directly to surface_genetic_expression. For a full list of these arguments and their function please consult the abagen documentation.

Next, lets have a look at the correlation between one gene (WFDC1) and our t-statistic map. Lets also plot the expression of this gene to the surface.

# Plot correlation with WFDC1 gene
t_stat_schaefer_100 = reduce_by_labels(slm.t.flatten(), schaefer_100_fs5)[1:]

df = pd.DataFrame({"x": t_stat_schaefer_100, "y": expression["WFDC1"]})
df.dropna(inplace=True)
plt.scatter(df.x, df.y, s=5, c="k")
plt.xlabel("t-statistic")
plt.ylabel("WFDC1 expression")
plt.plot(np.unique(df.x), np.poly1d(np.polyfit(df.x, df.y, 1))(np.unique(df.x)), "k")
plt.text(-1.0, 0.75, f"r={df.x.corr(df.y):.2f}", fontdict={"size": 14})
plt.show()
plot tutorial 02 context
# Plot WFDC1 gene to the surface.
from brainspace.plotting.surface_plotting import plot_hemispheres
from brainspace.utils.parcellation import map_to_labels

vertexwise_WFDC1 = map_to_labels(
    expression["WFDC1"].to_numpy(),
    schaefer_100_fs5,
    mask=schaefer_100_fs5 != 0,
    fill=np.nan,
)

plot_hemispheres(
    surfaces[0],
    surfaces[1],
    vertexwise_WFDC1,
    color_bar=True,
    embed_nb=True,
    size=(1400, 200),
    zoom=1.45,
    nan_color=(0.7, 0.7, 0.7, 1),
    cb__labelTextProperty={"fontSize": 12},
)
plot tutorial 02 context

Out:

/Users/reinder/opt/miniconda3/envs/python3.8/lib/python3.8/site-packages/brainspace/plotting/base.py:287: UserWarning:

Interactive mode requires 'panel'. Setting 'interactive=False'


<IPython.core.display.Image object>

We find a small correlation. To test for significance we’ll have to do some additional corrections, but more on that later.

Meta-Analytic

To perform meta-analytic decoding, BrainStat uses precomputed Neurosynth maps. Here we test which terms are most associated with a map of cortical thickness. A simple example analysis can be run as follows. The surface decoder interpolates the data from the surface to the voxels in the volume that are in between the two input surfaces. We’ll decode the t-statistics derived with our model earlier. Note that downloading the dataset and running this analysis can take several minutes.

from brainstat.context.meta_analysis import meta_analytic_decoder

meta_analysis = meta_analytic_decoder("fsaverage5", slm.t.flatten())
print(meta_analysis)

Out:

                    Pearson's r
nucleus accumbens      0.207755
accumbens              0.207571
dorsal anterior        0.201299
dacc                   0.197329
ventral striatum       0.194612
...                         ...
selectivity           -0.225116
object recognition    -0.230603
v1                    -0.232651
lateral occipital     -0.232874
sighted               -0.249728

[3228 rows x 1 columns]

meta_analysis now contains a pandas.dataFrame with the correlation values for each requested feature. Next we could create a Wordcloud of the included terms, wherein larger words denote higher correlations.

from wordcloud import WordCloud

wc = WordCloud(background_color="white", random_state=0)
wc.generate_from_frequencies(frequencies=meta_analysis.to_dict()["Pearson's r"])
plt.imshow(wc)
plt.axis("off")
plt.show()
plot tutorial 02 context

If we broadly summarize, we see a lot of words related to language e.g., “language comprehension”, “broca”, “speaking”, “speech production”. Generally you’ll also find several hits related to anatomy or clinical conditions. Depending on your research question, it may be more interesting to select only those terms related to cognition or some other subset.

Histological decoding

For histological decoding we use microstructural profile covariance gradients, as first shown by (Paquola et al, 2019, Plos Biology), computed from the BigBrain dataset. Firstly, lets download the MPC data, compute and plot its gradients, and correlate the first gradient with our t-statistic map.

from brainstat.context.histology import (
    compute_histology_gradients,
    compute_mpc,
    read_histology_profile,
)


# Run the analysis
schaefer_400 = fetch_parcellation("fsaverage5", "schaefer", 400)
histology_profiles = read_histology_profile(template="fsaverage5")
mpc = compute_mpc(histology_profiles, labels=schaefer_400)
gradient_map = compute_histology_gradients(mpc, random_state=0)

# Bring parcellated data to vertex data.
vertexwise_gradient = map_to_labels(
    gradient_map.gradients_[:, 0],
    schaefer_400,
    mask=schaefer_400 != 0,
    fill=np.nan,
)

plot_hemispheres(
    surfaces[0],
    surfaces[1],
    vertexwise_gradient,
    embed_nb=True,
    nan_color=(0.7, 0.7, 0.7, 1),
    size=(1400, 200),
    zoom=1.45,
)
plot tutorial 02 context

Out:

/Users/reinder/GitHub/BrainStat/brainstat/context/histology.py:105: RuntimeWarning:

divide by zero encountered in true_divide

/Users/reinder/GitHub/BrainStat/brainstat/context/histology.py:105: RuntimeWarning:

invalid value encountered in log

/Users/reinder/opt/miniconda3/envs/python3.8/lib/python3.8/site-packages/brainspace/plotting/base.py:287: UserWarning:

Interactive mode requires 'panel'. Setting 'interactive=False'


<IPython.core.display.Image object>
# Plot the correlation between the t-stat
t_stat_schaefer_400 = reduce_by_labels(slm.t.flatten(), schaefer_400)[1:]
df = pd.DataFrame({"x": t_stat_schaefer_400, "y": gradient_map.gradients_[:, 0]})
df.dropna(inplace=True)
plt.scatter(df.x, df.y, s=5, c="k")
plt.xlabel("t-statistic")
plt.ylabel("MPC Gradient 1")
plt.plot(np.unique(df.x), np.poly1d(np.polyfit(df.x, df.y, 1))(np.unique(df.x)), "k")
plt.text(2.3, 0.1, f"r={df.x.corr(df.y):.2f}", fontdict={"size": 14})
plt.show()
plot tutorial 02 context

The variable histology_profiles now contains histological profiles sampled at 50 different depths across the cortex, mpc contains the covariance of these profiles, and gradient_map contains their gradients. We also see that the correlation between our t-statistic map and these gradients is not very high. Depending on your use-case, each of the three variables here could be of interest, but for purposes of this tutorial we’ll plot the gradients to the surface with BrainSpace. For details on what the GradientMaps class (gradient_map) contains please consult the BrainSpace documentation.

from brainspace.utils.parcellation import map_to_labels

surfaces = fetch_template_surface("fsaverage5", join=False)

# Bring parcellated data to vertex data.
vertexwise_data = []
for i in range(0, 2):
    vertexwise_data.append(
        map_to_labels(
            gradient_map.gradients_[:, i],
            schaefer_400,
            mask=schaefer_400 != 0,
            fill=np.nan,
        )
    )

# Plot to surface.
plot_hemispheres(
    surfaces[0],
    surfaces[1],
    vertexwise_data,
    embed_nb=True,
    label_text=["Gradient 1", "Gradient 2"],
    color_bar=True,
    size=(1400, 400),
    zoom=1.45,
    nan_color=(0.7, 0.7, 0.7, 1),
    cb__labelTextProperty={"fontSize": 12},
)
plot tutorial 02 context

Out:

/Users/reinder/opt/miniconda3/envs/python3.8/lib/python3.8/site-packages/brainspace/plotting/base.py:287: UserWarning:

Interactive mode requires 'panel'. Setting 'interactive=False'


<IPython.core.display.Image object>

Note that we no longer use the y-axis regression used in (Paquola et al, 2019, Plos Biology), as such the first gradient becomes an anterior-posterior gradient.

Resting-state contextualization

Lastly, BrainStat provides contextualization using resting-state fMRI markers: specifically, with the Yeo functional networks (Yeo et al., 2011, Journal of Neurophysiology), a clustering of resting-state connectivity, and the functional gradients (Margulies et al., 2016, PNAS), a lower dimensional manifold of resting-state connectivity.

As an example, lets have a look at the the t-statistic map within the Yeo networks. We’ll plot the Yeo networks as well as a barplot showing the mean and standard error of the mean within each network.

from brainstat.datasets import fetch_yeo_networks_metadata
from matplotlib.colors import ListedColormap

yeo_networks = fetch_parcellation("fsaverage5", "yeo", 7)
network_names, yeo_colormap = fetch_yeo_networks_metadata(7)
yeo_colormap_gray = np.concatenate((np.array([[0.7, 0.7, 0.7]]), yeo_colormap))

plot_hemispheres(
    surfaces[0],
    surfaces[1],
    yeo_networks,
    embed_nb=True,
    cmap=ListedColormap(yeo_colormap_gray),
    nan_color=(0.7, 0.7, 0.7, 1),
    size=(1400, 200),
    zoom=1.45,
)
plot tutorial 02 context

Out:

/Users/reinder/opt/miniconda3/envs/python3.8/lib/python3.8/site-packages/brainspace/plotting/base.py:287: UserWarning:

Interactive mode requires 'panel'. Setting 'interactive=False'


<IPython.core.display.Image object>
import matplotlib.pyplot as plt
from scipy.stats import sem

from brainstat.context.resting import yeo_networks_associations

yeo_tstat_mean = yeo_networks_associations(slm.t.flatten(), "fsaverage5")
yeo_tstat_sem = yeo_networks_associations(
    slm.t.flatten(),
    "fsaverage5",
    reduction_operation=lambda x, y: sem(x, nan_policy="omit"),
)

plt.bar(
    np.arange(7), yeo_tstat_mean[:, 0], yerr=yeo_tstat_sem.flatten(), color=yeo_colormap
)
plt.xticks(np.arange(7), network_names, rotation=90)
plt.gcf().subplots_adjust(bottom=0.3)
plt.show()
plot tutorial 02 context

Across all networks, the mean t-statistic appears to be negative, with the most negative values in the dorsal attnetion and visual networks.

Lastly, lets plot the functional gradients and have a look at their correlation with our t-map.

from brainstat.datasets import fetch_gradients

functional_gradients = fetch_gradients("fsaverage5", "margulies2016")


plot_hemispheres(
    surfaces[0],
    surfaces[1],
    functional_gradients[:, 0:3].T,
    color_bar=True,
    label_text=["Gradient 1", "Gradient 2", "Gradient 3"],
    embed_nb=True,
    size=(1400, 600),
    zoom=1.45,
    nan_color=(0.7, 0.7, 0.7, 1),
    cb__labelTextProperty={"fontSize": 12},
)
plot tutorial 02 context

Out:

/Users/reinder/opt/miniconda3/envs/python3.8/lib/python3.8/site-packages/brainspace/plotting/base.py:287: UserWarning:

Interactive mode requires 'panel'. Setting 'interactive=False'


<IPython.core.display.Image object>
df = pd.DataFrame({"x": slm.t.flatten(), "y": functional_gradients[:, 0]})
df.dropna(inplace=True)
plt.scatter(df.x, df.y, s=0.01, c="k")
plt.xlabel("t-statistic")
plt.ylabel("Functional Gradient 1")
plt.plot(np.unique(df.x), np.poly1d(np.polyfit(df.x, df.y, 1))(np.unique(df.x)), "k")
plt.text(-4.0, 6, f"r={df.x.corr(df.y):.2f}", fontdict={"size": 14})
plt.show()
plot tutorial 02 context

It seems the correlations are quite low. However, we’ll need some more complex tests to assess statistical significance. There are many ways to compare these gradients to cortical markers. In general, we recommend using corrections for spatial autocorrelation which are implemented in BrainSpace. We’ll show a correction with spin test in this tutorial; for other methods and further details please consult the BrainSpace tutorials.

In a spin test we compare the empirical correlation between the gradient and the cortical marker to a distribution of correlations derived from data rotated across the cortical surface. The p-value then depends on the percentile of the empirical correlation within the permuted distribution.

from brainspace.null_models import SpinPermutations

sphere_left, sphere_right = fetch_template_surface(
    "fsaverage5", layer="sphere", join=False
)
tstat = slm.t.flatten()
tstat_left = tstat[: slm.t.size // 2]
tstat_right = tstat[slm.t.size // 2 :]

# Run spin test with 1000 permutations.
n_rep = 1000
sp = SpinPermutations(n_rep=n_rep, random_state=2021)
sp.fit(sphere_left, points_rh=sphere_right)
tstat_rotated = np.hstack(sp.randomize(tstat_left, tstat_right))

# Compute correlation for empirical and permuted data.
mask = ~np.isnan(functional_gradients[:, 0]) & ~np.isnan(tstat)
r_empirical = np.corrcoef(functional_gradients[mask, 0], tstat[mask])[0, 1]
r_permuted = np.zeros(n_rep)
for i in range(n_rep):
    mask = ~np.isnan(functional_gradients[:, 0]) & ~np.isnan(tstat_rotated[i, :])
    r_permuted[i] = np.corrcoef(functional_gradients[mask, 0], tstat_rotated[i, mask])[
        1:, 0
    ]

# Significance depends on whether we do a one-tailed or two-tailed test.
# If one-tailed it depends on in which direction the test is.
p_value_right_tailed = np.mean(r_empirical > r_permuted)
p_value_left_tailed = np.mean(r_empirical < r_permuted)
p_value_two_tailed = np.minimum(p_value_right_tailed, p_value_left_tailed) * 2
print(f"Two tailed p-value: {p_value_two_tailed}")

# Plot the permuted distribution of correlations.
plt.hist(r_permuted, bins=20, color="c", edgecolor="k", alpha=0.65)
plt.axvline(r_empirical, color="k", linestyle="dashed", linewidth=1)
plt.show()
plot tutorial 02 context

Out:

Two tailed p-value: 0.094

As we can see from both the p-value as well as the histogram, wherein the dotted line denotes the empirical correlation, this correlation does not reach significance.

That concludes the tutorials of BrainStat. If anything is unclear, or if you think you’ve found a bug, please post it to the Issues page of our Github.

Happy BrainStating!

Total running time of the script: ( 4 minutes 12.472 seconds)

Gallery generated by Sphinx-Gallery

Gallery generated by Sphinx-Gallery

API

Neuroimaging statistics toolbox.

Modules

brainstat.context

BrainStat’s context decoding module.

brainstat.datasets

Data included with BrainStat.

brainstat.mesh

Module for the handling of meshes and mesh data.

brainstat.stats

The statistics tools of BrainStat

brainstat.tests

Unit tests and their data generation.

brainstat.tutorial

Functions required for the BrainStat Tutorials

MATLAB Index

MATLAB Tutorials

For MATLAB tutorials, we recommend viewing these either through the Examples tab on our FileExchange page, or by opening the files in MATLAB (PACKAGE_DIRECTORY/tutorials). Alternatively, we’ve also included copies of these live scripts in ReadTheDocs:

Funding

Our research is kindly supported by:

  • Canadian Institutes of Health Research (CIHR)

  • National Science and Engineering Research Council of Canada (NSERC)

  • Azrieli Center for Autism Research

  • The Montreal Neurological Institute]

  • Canada Research Chairs Program

  • BrainCanada

  • SickKids Foundation

  • Helmholtz Foundation

We would also like to thank these funders for training/salary support

  • Savoy Foundation for Epilepsy (to RV)

  • Richard and Ann Sievers Award (to RV)

  • Healthy Brain and Healthy Lives (to OB)

  • Fonds de la Recherche du Quebec - Sante (to BB)

Credits

Some references that are incorporated into BrainStat

SurfStat references

Worsley KJ et al. (2009) A Matlab toolbox for the statistical analysis of univariate and multivariate surface and volumetric data using linear mixed effects models and random field theory. NeuroImage, Volume 47, Supplement 1, July 2009, Pages S39-S41. https://doi.org/10.1016/S1053-8119(09)70882-1

Chung MK et al. (2010) General Multivariate Linear Modeling of Surface Shapes Using SurfStat Neuroimage. 53(2):491-505. doi: 10.1016/j.neuroimage.2010.06.032

Random field theory references

Adler RJ and Taylor JE (2007). Random fields and geometry. Springer. Hagler DJ Saygin AP and Sereno MI (2006). Smoothing and cluster thresholding for cortical surface-based group analysis of fMRI data. NeuroImage, 33:1093-1103.

Hayasaka S, Phan KL, Liberzon I, Worsley KJ and Nichols TE (2004). Non-Stationary cluster size inference with random field and permutation methods. NeuroImage, 22:676-687.

Taylor JE and Adler RJ (2003), Euler characteristics for Gaussian fields on manifolds. Annals of Probability, 31:533-563.

Taylor JE and Worsley KJ (2007). Detecting sparse signal in random fields, with an application to brain mapping. Journal of the American Statistical Association, 102:913-928.

Worsley KJ, Andermann M, Koulis T, MacDonald D, and Evans AC (1999). Detecting changes in non-isotropic images. Human Brain Mapping, 8:98-101.

Multivariate associative techniques

Contextualization