Longitudinal stability of brain and spinal cord quantitative MRI measures

Mathieu Boudreau1, Agah Karakuzu1, Arnaud Boré2,3, Basile Pinsard2,3, Kiril Zelenkovski4, Eva Alonso-Ortiz1,5, Julie Boyle2,3, Pierre Bellec2,3,6, Julien Cohen-Adad1,3,5,7

  • 1NeuroPoly, Polytechnique Montreal, Montreal, QC, Canada
  • 2Centre de Recherche de l’Institut Universitaire de Gériatrie de Montréal (CRIUGM), Montreal, QC, Canada
  • 3Unité de Neuroimagerie Fonctionnelle (UNF), Centre de Recherche de l’Institut Universitaire de Gériatrie de Montréal (CRIUGM), Montreal, QC, Canada
  • 4Faculty of Computer Science and Engineering (FINKI), Skopje, Macedonia
  • 5Centre de recherche du CHU Sainte-Justine, Université de Montréal, Montreal, QC, Canada
  • 6Psychology Department, Université de Montréal, Montreal, QC, Canada
  • 7Mila - Quebec AI Institute, Montreal, QC, Canada


Quantitative MRI (qMRI) promises better specificity, accuracy, and stability relative to its clinically-used qualitative MRI counterpart. Longitudinal stability is particularly important in qMRI, because the goal is to reliably quantify tissue properties that may be assessed in longitudinal clinical studies throughout disease progression or during treatment. In this work, we present the initial data release of the quantitative MRI portion of the Courtois project on neural modelling (CNeuroMod), where the brain and cervical spinal cord of six participants were scanned at regular intervals over the course of several years. This first release includes three years of data collection and up to ten sessions per participant using quantitative MRI protocols (T1, magnetization transfer (MTR, MTsat), and diffusion). Coefficients of variation (COVs) over this timeframe ranged from 0.6% to 2.3% (intrasubject) and from 0.4% to 3.5% (intersubject) for T1/MTR/MTsat in whole-brain white matter (WM), and from 0.6% to 1.3% (intrasubject) and from 3.0% to 10.3% (intersubject) for diffusion FA/MD/RD in the three corpus callosum regions. In the spine, COVs ranged from 2.3% to 4.5% (intrasubject) and from 5.1% to 9.7% (intersubject) for measured spine WM cross-sectional area (CSA) across the C2 and C3 vertebral levels, and from 3.9% to 9.5% (intrasubject) and from 4.0% to 8.4% (intersubject) in WM across the C2 and C5 vertebral levels for all qMRI metrics (T1, MTR, MTsat, FA, MD, RD). Results from this work show the level of stability that can be expected from qMRI protocols in the brain and spinal cord, and could help in the design of future longitudinal clinical studies.

1     |     INTRODUCTION

Quantitative MRI and the reproducibility crisis

Conventional MRI images used clinically stem from using the MRI machine as a non-invasive medical device and not as a scientific instrument [Cercignani et al., 2018, Tofts, 1998]. Medical images produced from clinical MRI protocols must be interpreted by expert readers to extract useful diagnostic information, as the images alone lack biological specificity and reproducibility, due to underlying changes in biology and the electromagnetic fields the imaging hardware generates. Quantitative MRI (qMRI) techniques [Seiberlich et al., 2020] aim to produce measurements of biological or physical properties through a series of carefully planned conventional MRI images. Quantitative maps are calculated or fit from these measured datasets, and their voxelwise values typically have physical units associated with them, for example, spin-lattice relaxation time (T1 [s]), spin-spin relaxation time (T2 [s]), myelin water fraction (MWF [%]), magnetization transfer ratio (MTR [%]), cerebral blood flow (CBF [ml/g/min]) and diffusion (restricted diffusion coefficients [mm²/s], eg, mean diffusivity (MD) and radial diffusivity (RD)). Some qMRI techniques are highly specific to certain biological changes (eg, myelin loss [Mancini et al., 2020, Schmierer et al., 2007], cerebrovascular diseases and oxygen consumption disorders [Davis et al., 1998, Ma et al., 2016, Mazerolle et al., 2018, Wang et al., 2017], iron deficiency [Lidén et al., 2021, Ropele et al., 2011], etc.). Because these measures either implicitly or explicitly account for effects that typically are unaccounted for in clinical MRI images, in principle they should have improved stability; this is one of the hallmark promises of qMRI. However, in practice, the field has fallen short of living up to this high bar.
Even fundamental quantitative MRI techniques have been shown to vary widely amongst methods and sites; for example, although T1 mapping was the first quantitative MRI technique to be developed, 45 years ago [Pykett and Mansfield, 1978], modern T1 mapping techniques have not consistently shown good accuracy in measuring T1 values in the brain across different sites or techniques [Stikov et al., 2015]. Considerable recent work has aimed to quantify the accuracy and improve the within-vendor stability of quantitative MR measurements, such as the development of quantitative MRI calibration phantoms [Golay and Oliver-Taylor, 2022, Keenan et al., 2018, Stupic et al., 2021] and the increasing integration of quantitative MRI pulse sequences as stock sequences on commercial scanners [Ma et al., 2013, Marques et al., 2010, Seiberlich et al., 2012] or as vendor-neutral implementations [Herz et al., 2021, Karakuzu et al., 2022].

Stability in qMRI: why is it needed?

The stability of a qMRI measurement is an important characteristic to consider when designing longitudinal studies, particularly when clinical features are expected to evolve over time (eg, worsening disease, or improvement through therapeutic intervention [Oh et al., 2021]). It is also important to know the anticipated variability of these metrics in order to find the minimum detectable effect size in a power analysis when designing a study. Same-day test-retest studies have shown that fundamental qMRI metrics (eg, T1, T2) exhibit low intra-scanner variability in vivo (on the order of 1-2%) [Gracien et al., 2020, Lee et al., 2019]. However, test-retest studies are limited in their usefulness as a stability measure because they consist of only two measurements (leading to improper standard deviation calculations) and are performed on the same day (same scanner operator, same scanner conditions), which are not the conditions encountered in longitudinal studies. Longitudinal stability is thus important to quantify, but this can be challenging due to potential confounds from actual changes in the subject’s tissue properties over time, even in healthy volunteers. Quantitative MRI metrics in the brain have been shown to correlate with ageing through adulthood [Erramuzpe et al., 2021, Seiler et al., 2020], although changes appear to happen slowly (over decades), and thus short-term longitudinal studies (eg, 3-5 years) should in principle quantify longitudinal stability reliably.
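As a minimal sketch of how such variability figures might be computed, the snippet below estimates intrasubject and intersubject COVs from a subjects-by-sessions table. The function names, the example T1 values, and the exact COV definitions (per-subject sample standard deviation over mean, averaged across subjects; and sample standard deviation of the subject means over their mean) are assumptions for illustration, not necessarily the definitions used in this study's pipelines.

```python
import numpy as np


def intrasubject_cov(values):
    """Mean of the per-subject COVs, in percent.

    `values` is a 2D array (subjects x sessions); NaN marks a missing session.
    """
    values = np.asarray(values, dtype=float)
    # Sample standard deviation (ddof=1) across sessions, ignoring missing ones.
    per_subject_sd = np.nanstd(values, axis=1, ddof=1)
    per_subject_mean = np.nanmean(values, axis=1)
    return float(np.mean(per_subject_sd / per_subject_mean) * 100.0)


def intersubject_cov(values):
    """COV of the per-subject means, in percent."""
    values = np.asarray(values, dtype=float)
    subject_means = np.nanmean(values, axis=1)
    return float(np.std(subject_means, ddof=1) / np.mean(subject_means) * 100.0)


# Hypothetical white matter T1 values in seconds (3 subjects x 4 sessions).
t1_wm = np.array([
    [0.85, 0.86, 0.84, np.nan],  # one session missing for this subject
    [0.88, 0.87, 0.89, 0.88],
    [0.83, 0.84, 0.83, 0.85],
])

print(f"intrasubject COV: {intrasubject_cov(t1_wm):.2f}%")
print(f"intersubject COV: {intersubject_cov(t1_wm):.2f}%")
```

With only two sessions per subject, as in a test-retest design, each per-subject standard deviation rests on a single degree of freedom, which is the limitation of test-retest studies noted above.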

Stability in (q)MRI: what’s been done

Many studies have investigated the stability of morphometrics and quantitative MRI measures. A recent landmark study investigated the longitudinal stability of clinical and functional MRI metrics of a single subject’s brain measured on multiple vendors at multiple sites over the course of 15 years (73 sessions across 36 scanners) [Duchesne et al., 2019], finding poor reproducibility across MRI manufacturers for key clinical metrics (ie, white/grey matter contrast-to-noise ratio (CNR), FLAIR white matter hyperintensities volume). For qMRI metrics, a few longitudinal studies have probed different aspects of their longitudinal stability. A 7-year scan-rescan brain ageing study explored the evolution of quantitative T1 values in different tissues using the variable flip angle (VFA) technique (which depends on an additional B1 map) [Gracien et al., 2017] and found that T1 values were sensitive to ageing over this timespan. The stability of quantitative brain metrics across MRI software and hardware upgrades was recently explored in a four time-point, seven-year repeatability and reproducibility study [Salluzzi et al., 2022], which reported that the upgrades did not affect the effect size and stability of the tested MRI biomarkers. Stability has also been explored in non-brain anatomy. For the spinal cord, inter-vendor variability was recently probed by a multi-center (19 sites) study using a generic quantitative MRI spinal cord imaging protocol [Cohen-Adad et al., 2021] on a single participant over the span of one year [Cohen-Adad, 2020]. A test-retest quantitative MRI spine study has also been performed in two cohorts (young adult and elderly) over a ten-month period [Lévy et al., 2018], with minimal detectable changes reported for T1, MTR, MTsat, and macromolecular tissue volume (MTV) quantitative MRI measures.

Study Objective and the CNeuroMod Project

The objective of this study was to measure and report the stability of quantitative microstructure MRI measurements across multiple time points in the brain and cervical spinal cord. To do this, two sets of qMRI protocols (brain and spinal cord) were integrated within the Courtois project on neural modelling (CNeuroMod)1, which collects longitudinal data on healthy subjects to train and improve artificial intelligence models of brain behaviour and activity. The qMRI measurements of the brain and spinal cord fall within the “anatomical” imaging branch of the CNeuroMod project; additional branches of acquired data include deep scanning with functional MRI, biosignals (eg, cardiac, respiration, eye tracking), and magnetoencephalography (MEG). In addition, we developed reproducible and reusable analysis pipelines for structural qMRI of the brain and spinal cord. These pipelines are built using state-of-the-art tools for pipeline management (Nextflow [Di Tommaso et al., 2017]) and structural data analyses (FSL [Smith et al., 2004], ANTs [Avants et al., 2009], qMRLab [Cabana et al., 2015, Karakuzu et al., 2020], SCT [De Leener et al., 2017], etc.), together with Jupyter notebooks [Beg et al., 2021] and Plotly [Plotly Technologies Inc., 2015] for presenting curated and interactive results.

2     |     RESULTS

Six participants were repeatedly scanned on a 3T MRI scanner (Prisma Fit, Siemens, Erlangen, Germany) approximately four times a year (up to ten times for this initial 2022 data release, with more scans regularly being acquired). Custom headcases (Caseforge, Berkeley, USA) were used for each participant to minimise movements during the imaging sessions. Two sets of imaging protocols were acquired (Figure 1), one for the brain (T1w, T2w, MP2RAGE, MTsat, B1+, and diffusion) and one for the spinal cord (T1w, T2w, MTsat, and diffusion).


FIGURE 1 Overview of the structural dataset for the Courtois project on neural modelling (CNeuroMod). Six participants were scanned up to ten times over three years; note that this is an initial data release for 2022, and more scans are regularly being acquired. The structural protocol consists of T1w, T2w and T2*w scans to quantify brain and spinal cord (SC) morphometry (including grey matter, GM), and MP2RAGE, magnetization transfer (MTR and MTsat), and diffusion-weighted sequences to compute metrics sensitive to demyelination in the white matter (WM).

2.1     |     Brain

Average quantitative MRI (excluding diffusion) values for the segmented whole-brain white matter (WM) and grey matter (GM) for each subject and session are shown in Figure 2. Missing data points correspond either to sessions that were not acquired or to sessions that were excluded after quality control; more details are given in the “Quality Control” section. Note that MTR is calculated from a subset of the MTsat measurements. B1 is not shown because it is only used as a transmit radiofrequency (RF) field correction factor for the MTsat measurement and does not have biological specificity.
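For readers unfamiliar with MTR, it is commonly defined as the percent signal reduction between an acquisition without the MT saturation pulse and one with it, which is why it can be derived from images already acquired for MTsat. The sketch below uses this standard definition; the function name, argument names, and example values are hypothetical, and it is not the processing code used for this dataset.

```python
import numpy as np


def mtr_percent(mt_off, mt_on):
    """Magnetization transfer ratio in percent: 100 * (S_off - S_on) / S_off.

    `mt_off` is the signal without the MT pulse and `mt_on` the signal with it.
    Voxels where `mt_off` is not positive are returned as NaN.
    """
    mt_off = np.asarray(mt_off, dtype=float)
    mt_on = np.asarray(mt_on, dtype=float)
    mtr = np.full(mt_off.shape, np.nan)
    valid = mt_off > 0  # avoid dividing by zero in background voxels
    mtr[valid] = 100.0 * (mt_off[valid] - mt_on[valid]) / mt_off[valid]
    return mtr


# Hypothetical voxel intensities (arbitrary units); the last voxel is background.
signal_without_mt = np.array([100.0, 200.0, 0.0])
signal_with_mt = np.array([60.0, 100.0, 5.0])
print(mtr_percent(signal_without_mt, signal_with_mt))
```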

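from os import path
import os

# Clone the analysis repository once. The leading `!` runs a shell command
# and only works inside IPython/Jupyter.
if not path.isdir('analysis'):
    !git clone https://github.com/courtois-neuromod/anat-processing-book.git analysis
    dir_name = 'analysis'
    analysis = os.listdir(dir_name)

    # Delete the notebooks and Markdown pages at the top level of the clone.
    for item in analysis:
        if item.endswith((".ipynb", ".md")):
            os.remove(os.path.join(dir_name, item))

cwd = os.getcwd()

# Project helper modules; `Data` and `Plot` used below are assumed to come
# from these star imports.
from tools.data import *
from tools.plot import *
from tools.stats import *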


# Python imports
from IPython.display import clear_output
from pathlib import Path
import numpy as np

# Display pandas tables in full, with one decimal place.
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 1000)
pd.set_option('display.colheader_justify', 'center')
pd.set_option('display.precision', 1)

data_type = 'brain'
release_version = 'latest'

# `data_path` is not defined in this excerpt; it is assumed to be set by an
# earlier cell or by the star imports above.
dataset = Data(data_type)
dataset.data_dir = Path(data_path) / data_type

fig_gm = Plot(dataset, plot_name='brain-1')