harmonization | VALIANT

An evaluation of image-based and statistical techniques for harmonizing brain volume measurements

waddelma — Mon, 25 Aug 2025 20:03:24 +0000

Lu, Yuanchiao, Zuo, Lianrui, Chou, Yiyu, Dewey, Blake E., Remedios, Samuel W., Shinohara, Russell Takeshi, Steele, Sonya Ulrike, Nair, Govind, Reich, Daniel S., & Prince, Jerry L. (2025). “An evaluation of image-based and statistical techniques for harmonizing brain volume measurements.” Imaging Neuroscience, 3, IMAG.a.73.

Measuring brain volumes from MRI scans can be tricky because differences in scanner hardware, software, and settings can create inconsistencies. In recent years, researchers have developed “harmonization” methods to correct for these differences. This study compares three such methods: neuroCombat (a statistical correction tool), DeepHarmony (a supervised deep learning method), and HACA3 (an unsupervised deep learning approach). We tested how well these methods produce consistent brain volume measurements across two types of MRI scans (GRE and MPRAGE) and how accurately they detect simulated brain shrinkage (atrophy).

All three methods improved the consistency of brain volume measurements compared to uncorrected scans. Among them, HACA3 performed the best, showing the smallest measurement differences across all brain regions (<3%) and the highest agreement between GRE and MPRAGE scans. HACA3 also had the highest reliability across regions. In tests simulating atrophy, HACA3 most accurately preserved unchanged brain regions, DeepHarmony improved several regions, and neuroCombat showed more variability. Notably, neuroCombat could detect hippocampal atrophy only when trained on sample data, highlighting a limitation when training data are unavailable.
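As a rough illustration of what a "measurement difference" between two contrasts can mean, the sketch below computes an absolute percent difference between two volume estimates for the same region. The exact definition used in the paper is not stated here, so the formula (difference relative to the mean of the two volumes), the function name, and the example volumes are all assumptions.

```python
def percent_difference(vol_a: float, vol_b: float) -> float:
    """Absolute difference between two volumes as a percentage of their mean.

    Assumption: this symmetric definition is one common choice; the paper
    may normalize differently.
    """
    mean_vol = (vol_a + vol_b) / 2.0
    if mean_vol == 0:
        raise ValueError("volumes must not both be zero")
    return abs(vol_a - vol_b) / mean_vol * 100.0


# Hypothetical volumes (mm^3) for one region, measured on two contrasts.
gre_volume = 4100.0
mprage_volume = 4000.0
print(f"{percent_difference(gre_volume, mprage_volume):.2f}%")
```

Under this definition, a harmonization method that brings the two estimates closer together lowers the percentage for that region.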

Overall, HACA3 was the most effective method for harmonizing MRI scans, followed by DeepHarmony, with neuroCombat showing improvements over uncorrected scans but more variability.

Fig 1 Harmonization procedures using neuroCombat, DeepHarmony, and HACA3 methods

Mitigating Over-Saturated Fluorescence Images Through a Semi-Supervised Generative Adversarial Network

landmaba — Sun, 22 Sep 2024 03:58:25 +0000

Bao, Shunxing, Guo, Junlin, Lee, Ho Hin, Deng, Ruining, Cui, Can, Remedios, Lucas W., Liu, Quan, Yang, Qi, Xu, Kaiwen, Yu, Xin, Li, Jia, & Li, Yike. (2024). Mitigating over-saturated fluorescence images through a semi-supervised generative adversarial network. In Proceedings of the 21st IEEE International Symposium on Biomedical Imaging (ISBI 2024), Athens, Greece, May 27-30, 2024. https://doi.org/10.1109/ISBI56570.2024.10635687

This study addresses a key challenge in multiplex immunofluorescence (MxIF) imaging, a technique used in biomedical research to provide detailed insights into cell structures and spatial organization. MxIF imaging (for example, DAPI staining to identify cell nuclei and CD20 staining for cell membranes) is invaluable for understanding cell composition, but it suffers from saturation artifacts. These artifacts occur when certain areas of the image become overly bright, making it difficult to analyze individual cells accurately. Existing methods for correcting these saturation issues, like gamma correction, often fall short because they assume uniform saturation, which is rarely the case in practice.

The authors propose a novel solution using a hybrid generative adversarial network (GAN) called HD-mixGAN, which combines two different types of neural networks (CycleGAN and Pix2pixHD) to correct saturation artifacts. This approach takes advantage of both small datasets where paired (before and after) images are available and larger datasets that only have unpaired images of over-saturated regions. By generating synthetic data from the unpaired datasets using a CycleGAN and combining it with real data, the model effectively learns to correct saturation artifacts, improving the overall image quality.

The method was tested in a task to detect cell nuclei, where it significantly outperformed traditional methods, improving the accuracy (F1 score) by 6%. This approach represents the first focused effort to address saturation issues in multi-round MxIF imaging, providing a data-driven solution that enhances the accuracy of single-cell analysis. The study also makes its code and implementation freely available, facilitating further research and applications in this area.
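For readers less familiar with the F1 score used to report nuclei detection accuracy, here is a minimal sketch of how it is computed from detection counts. The function name and the counts are hypothetical; the paper's matching criterion for deciding what counts as a true positive is not described here.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall for a detection task."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 nuclei detected correctly, 20 spurious, 20 missed.
print(f1_score(true_positives=80, false_positives=20, false_negatives=20))
```

Because F1 balances precision against recall, it penalizes both spurious detections in saturated regions and nuclei that are missed there.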

Empirical assessment of the assumptions of ComBat with diffusion tensor imaging

landmaba — Thu, 20 Jun 2024 17:40:01 +0000

Michael E. Kim, Chenyu Gao, Leon Y. Cai, Qi Yang, Nancy R. Newlin, Karthik Ramadass, Angela Jefferson, Derek Archer, Niranjana Shashikumar, Kimberly R. Pechman, Katherine A. Gifford, Timothy J. Hohman, Lori L. Beason-Held, Susan M. Resnick, Stefan Winzeck, Kurt G. Schilling, Panpan Zhang, Daniel Moyer, and Bennett A. Landman. “Empirical assessment of the assumptions of ComBat with diffusion tensor imaging.” Journal of Medical Imaging (Bellingham), vol. 11, no. 2, 024011, March 2024. doi:10.1117/1.JMI.11.2.024011.

Diffusion tensor imaging (DTI) is a magnetic resonance imaging technique that provides unique insights into white matter microstructure in the brain. However, it is susceptible to confounding effects introduced by scanner or acquisition differences. ComBat is a leading approach for addressing these site biases. Despite its frequent use for harmonization, ComBat’s robustness towards site dissimilarities and overall cohort size has not yet been evaluated in the context of DTI.

To address this, we matched 358 participants from two sites to create a “silver standard” cohort for multi-site harmonization. We harmonized mean fractional anisotropy (FA) and mean diffusivity (MD) calculated from participant DTI data for regions of interest defined by the JHU EVE-Type III atlas. To quantify the reliability of ComBat, we performed bootstrapping over 10 iterations at 19 levels of total sample size, 10 levels of sample size imbalance between sites, and 6 levels of mean age difference between sites. We measured three key metrics: (i) β_AGE, the linear regression coefficient of the relationship between FA and age; (ii) γ_sf, the ComBat-estimated site-shift; and (iii) δ_sf, the ComBat-estimated site-scaling. We evaluated the reliability of ComBat by calculating the root mean squared error (RMSE) in these metrics and examined the correlation between the reliability of ComBat and the violation of model assumptions.
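The reliability measure described above can be sketched as an RMSE of bootstrap estimates around the silver-standard value. This is a minimal illustration only: the function name and the numbers are hypothetical, and the paper may aggregate over regions or parameters differently.

```python
import math


def rmse(estimates, reference):
    """Root mean squared error of bootstrap estimates against one reference value."""
    if not estimates:
        raise ValueError("need at least one estimate")
    squared_errors = [(estimate - reference) ** 2 for estimate in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical beta_AGE values from 10 bootstrap subsamples at one
# experimental setting, compared with a hypothetical silver-standard value.
silver_standard_beta_age = -0.0010
bootstrap_beta_age = [-0.0012, -0.0009, -0.0011, -0.0008, -0.0010,
                      -0.0013, -0.0009, -0.0010, -0.0011, -0.0007]
print(rmse(bootstrap_beta_age, silver_standard_beta_age))
```

A small RMSE at a given sample size, imbalance, and age difference would indicate that ComBat recovers nearly the same parameter as it does on the full matched cohort.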

Our results indicate that ComBat performs reliably for β_AGE when the total sample size is greater than 162 and the mean age difference between sites is less than 4 years. The assumptions of the ComBat model regarding the normality of residual distributions are not violated even as the model becomes unstable, so checking those assumptions alone does not reveal the instability.

In conclusion, before harmonizing DTI data with ComBat, it is crucial to examine the input cohort for size and covariate distributions at each site. Direct assessment of residual distributions is less informative on stability than bootstrap analysis. We advise caution when using ComBat in situations that do not conform to the identified thresholds.

After registration of the JHU EVE-III Atlas, mean FA values were calculated in all the regions for each participant in the silver standard cohort. A point in the experimental space is “feasible” if the sample size for either site is at least N = 6, the imbalance level does not result in N for either site exceeding the available number of participants for that site, and if sampling of participants yielded a covariate shift within 1 year of the target age difference between sites. For each feasible point in the experimental space, 10 bootstraps were subsampled from the silver standard cohort, and the FA values for the subsamples were harmonized by ComBat. The resulting parameters were then compared to those from the silver standard to determine reliability of ComBat at that location in the experimental space.
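The feasibility rule in the caption above can be sketched as a simple predicate. This is an interpretation, not the authors' code: it assumes the minimum of 6 applies to each site, that "within 1 year" is inclusive, and that per-site sample sizes have already been derived from the total size and imbalance level. All names and example numbers are hypothetical.

```python
MIN_SITE_N = 6              # minimum participants per site (from the caption)
AGE_TOLERANCE_YEARS = 1.0   # allowed deviation from the target age difference


def is_feasible(n_site_a, n_site_b, available_a, available_b,
                achieved_age_diff, target_age_diff):
    """Return True if a point in the experimental space can be sampled."""
    # Assumption: both sites must reach the minimum sample size.
    enough_per_site = n_site_a >= MIN_SITE_N and n_site_b >= MIN_SITE_N
    # Neither site may need more participants than it actually has.
    within_pool = n_site_a <= available_a and n_site_b <= available_b
    # Assumption: the 1-year tolerance is inclusive.
    age_shift_ok = abs(achieved_age_diff - target_age_diff) <= AGE_TOLERANCE_YEARS
    return enough_per_site and within_pool and age_shift_ok


# Hypothetical point: 6 vs 20 participants, pools of 150 each,
# sampled age difference 2.5 years against a 2-year target.
print(is_feasible(6, 20, 150, 150, achieved_age_diff=2.5, target_age_diff=2.0))
```

Only points that pass a check like this would then be bootstrapped and harmonized.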

Evaluation of Mean Shift, ComBat, and CycleGAN for Harmonizing Brain Connectivity Matrices Across Sites

landmaba — Thu, 20 Jun 2024 17:37:07 +0000

Hanliang Xu, Nancy R. Newlin, Michael E. Kim, Chenyu Gao, Praitayini Kanakaraj, Aravind R. Krishnan, Lucas W. Remedios, Nazirah Mohd Khairi, Kimberly Pechman, Derek Archer, Timothy J. Hohman, Angela L. Jefferson, Ivana Išgum, Yuankai Huo, Daniel Moyer, Kurt G. Schilling, and Bennett A. Landman. “Evaluation of Mean Shift, ComBat, and CycleGAN for Harmonizing Brain Connectivity Matrices Across Sites.” Proceedings of SPIE Medical Imaging 2024: Image Processing, vol. 12926, 129261X, 2024, San Diego, California

Connectivity matrices derived from diffusion MRI (dMRI) offer an interpretable and generalizable way to understand the human brain connectome. However, dMRI is subject to inter-site and between-scanner variations, which can hinder cross-dataset analysis and affect the robustness and reproducibility of results. To evaluate different harmonization approaches on connectivity matrices, we compared graph measures derived from these matrices before and after applying three harmonization techniques: mean shift, ComBat, and CycleGAN.

The sample consisted of 168 age-matched, sex-matched normal subjects from two studies: the Vanderbilt Memory and Aging Project (VMAP) and the Biomarkers of Cognitive Decline Among Normal Individuals (BIOCARD). First, we plotted the graph measures and used the coefficient of variation (CoV) and the Mann-Whitney U test to assess the effectiveness of each method in removing site effects on the matrices and the derived graph measures.
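The two statistics named above can be sketched in plain Python as follows. This is an illustrative version under stated assumptions: the CoV uses the sample standard deviation, and the Mann-Whitney p-value uses a normal approximation without tie or continuity correction, which may differ from what the authors' software computed. The graph-measure values are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CoV is undefined for a zero mean")
    return statistics.stdev(values) / abs(mean)


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value via normal approximation)."""
    n, m = len(x), len(y)
    # Count, over all cross-site pairs, how often x exceeds y (ties count one half).
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n * m / 2.0
    sd_u = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value


# Hypothetical global efficiency values for subjects from two sites.
site_1 = [0.41, 0.43, 0.40, 0.44, 0.42]
site_2 = [0.35, 0.36, 0.37, 0.34, 0.38]
print(coefficient_of_variation(site_1 + site_2))
print(mann_whitney_u(site_1, site_2))
```

If harmonization removes the site effect, the pooled CoV would be expected to drop and the test would no longer separate the two sites.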

ComBat effectively eliminated site effects for global efficiency and modularity, outperforming the other two methods. However, all methods exhibited poor performance when harmonizing average betweenness centrality. Second, we examined whether the harmonization methods preserved correlations between age and graph measures. All methods, except for CycleGAN in one direction, improved the correlations between age and global efficiency and between age and modularity, changing them from non-significant to significant (p < 0.05).

These findings suggest that while ComBat is particularly effective for certain graph measures, challenges remain in harmonizing other measures like betweenness centrality. Nonetheless, the ability of these methods to enhance the significance of age-related correlations highlights their potential in improving the robustness of dMRI connectivity analyses across different datasets.

Figure 1. Systematic variability of connectivity matrices is high across sites. The difference matrix indicates that site 1 generates tractograms with generally longer streamlines. Note the substantial differences in the first and third quadrants. Site 2 has fewer, shorter inter-hemispheric streamlines; site 1 has more, longer streamlines across hemispheres.

Inter-vendor harmonization of CT reconstruction kernels using unpaired image translation

landmaba — Thu, 20 Jun 2024 17:33:34 +0000

Aravind R. Krishnan, Kaiwen Xu, Thomas Li, Chenyu Gao, Lucas W. Remedios, Praitayini Kanakaraj, Ho Hin Lee, Shunxing Bao, Kim L. Sandler, Fabien Maldonado, Ivana Išgum, and Bennett A. Landman. “Inter-vendor harmonization of CT reconstruction kernels using unpaired image translation.” Proceedings of SPIE Medical Imaging 2024: Image Processing, vol. 12926, 129261D, 2024, San Diego, California

The reconstruction kernel in computed tomography (CT) generation determines the image texture, and consistency in reconstruction kernels is crucial because the underlying CT texture can affect quantitative image analysis measurements. Harmonization, or kernel conversion, aims to minimize measurement differences caused by inconsistent reconstruction kernels. Existing methods for CT scan harmonization across single or multiple manufacturers require paired scans of hard and soft reconstruction kernels that are spatially and anatomically aligned, necessitating the training of numerous models across different kernel pairs within manufacturers.

In this study, an unpaired image translation approach was adopted to investigate harmonization between and across reconstruction kernels from different manufacturers. A multipath cycle generative adversarial network (GAN) was constructed, utilizing hard and soft reconstruction kernels from Siemens and GE vendors, sourced from the National Lung Screening Trial dataset. Fifty scans from each reconstruction kernel were used to train the multipath cycle GAN. To evaluate the effect of harmonization on the reconstruction kernels, 50 scans each from Siemens hard kernel, GE soft kernel, and GE hard kernel were harmonized to a reference Siemens soft kernel (B30f), and the percent emphysema was assessed.
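Percent emphysema is commonly computed as the share of lung voxels whose attenuation falls below a fixed Hounsfield unit (HU) threshold. The sketch below assumes the widely used -950 HU cutoff and a pre-segmented list of lung voxel values; neither the threshold nor the segmentation step is specified in the summary above, and the function name and example values are hypothetical.

```python
EMPHYSEMA_THRESHOLD_HU = -950  # assumption: a common convention, not stated in the summary


def percent_emphysema(lung_hu, threshold_hu=EMPHYSEMA_THRESHOLD_HU):
    """Percentage of lung voxels with attenuation below the threshold."""
    if not lung_hu:
        raise ValueError("need at least one lung voxel")
    below = sum(1 for hu in lung_hu if hu < threshold_hu)
    return 100.0 * below / len(lung_hu)


# Hypothetical HU values for voxels inside a lung mask.
lung_voxels = [-980, -960, -900, -850]
print(percent_emphysema(lung_voxels))
```

Because a hard kernel produces noisier texture than a soft kernel, more voxels can cross a fixed threshold like this one, which is why the score is sensitive to the reconstruction kernel.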

A linear model was fitted considering age, smoking status, sex, and vendor, followed by an analysis of variance (ANOVA) on the emphysema scores. The approach minimized differences in emphysema measurement and highlighted the impact of age, sex, smoking status, and vendor on emphysema quantification. This study demonstrates the effectiveness of using unpaired image translation with multipath cycle GANs for kernel harmonization across different manufacturers, improving the consistency and reliability of quantitative image analysis.

Figure 1. Differences in reconstruction kernels can be minimized by harmonizing to a reference standard. Harmonizing between paired kernels (left) has been explored due to the presence of one-to-one pixel correspondence between scans. However, unpaired kernels (right) create additional difficulties due to the difference in the anatomical alignment of scans obtained for different subjects from different vendors.

MidRISH: Unbiased harmonization of rotationally invariant harmonics of the diffusion signal

landmaba — Thu, 20 Jun 2024 17:31:00 +0000

Nancy R. Newlin, Michael E. Kim, Praitayini Kanakaraj, Tianyuan Yao, Timothy Hohman, Kimberly R. Pechman, Lori L. Beason-Held, Susan M. Resnick, Derek Archer, Angela Jefferson, Bennett A. Landman, and Daniel Moyer. “MidRISH: Unbiased harmonization of rotationally invariant harmonics of the diffusion signal.” Magnetic Resonance Imaging, 2024, doi:10.1016/j.mri.2024.03.033.

Data harmonization is essential for eliminating confounding effects in multi-site diffusion image analysis. One such harmonization method, LinearRISH, scales rotationally invariant spherical harmonic (RISH) features from a “target” site to match those from a “reference” site, aiming to reduce scanner-related confounding effects. However, the designation of reference and target sites is not arbitrary, and this choice can bias the resulting diffusion metrics such as fractional anisotropy and mean diffusivity. This study introduces MidRISH, a method that projects both sites to a mid-space, thereby avoiding the bias introduced by reference site selection. The MidRISH method was validated through two experiments: harmonizing scanner differences in 37 matched patients free of cognitive impairment, and harmonizing acquisition and study differences in 117 matched patients free of cognitive impairment. The results demonstrate that MidRISH reduces the bias associated with reference site selection while maintaining the harmonization efficacy of LinearRISH. Users should be cautious when using LinearRISH harmonization, as choosing a reference site impacts the effect size of diffusion metrics. The proposed MidRISH method eliminates the bias-inducing site selection step, offering a more robust approach to harmonization.
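The difference between scaling one site to a reference and projecting both sites to a mid-space can be illustrated with a toy scalar example. This is a simplified sketch, not the published implementation: real RISH features are per-voxel, per-order maps, and the mid-space is assumed here to be the average of the two site means. Function names and values are hypothetical.

```python
import statistics


def linear_rish_scale(reference, target):
    """Scale factor that maps the target site's mean feature onto the reference mean."""
    target_mean = statistics.mean(target)
    if target_mean == 0:
        raise ValueError("target mean must be non-zero")
    return statistics.mean(reference) / target_mean


def mid_rish_scales(site_a, site_b):
    """Scale factors that map both sites onto the average of their means.

    Assumption: the mid-space is the simple average of the two site means.
    """
    mean_a = statistics.mean(site_a)
    mean_b = statistics.mean(site_b)
    if mean_a == 0 or mean_b == 0:
        raise ValueError("site means must be non-zero")
    mid = (mean_a + mean_b) / 2.0
    return mid / mean_a, mid / mean_b


# Hypothetical RISH feature values at one voxel for two sites.
site_a = [1.0, 1.0]
site_b = [2.0, 2.0]
print(linear_rish_scale(reference=site_a, target=site_b))  # only site B changes
print(linear_rish_scale(reference=site_b, target=site_a))  # only site A changes
print(mid_rish_scales(site_a, site_b))                      # both sites change
```

In the reference-based version the outcome depends on which site is named the reference, while the mid-space version treats the two sites symmetrically.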

Fig. 1. When site A (blue) is selected as the reference site for LinearRISH harmonization, site B’ (yellow) mean MD shifts up to the site A expected value. On the other hand, selecting site B (pink) as reference causes site A’ (green) to shift down to the site B expected value. It is up to the user to decide, which leads to arbitrary bias. Here we propose a quantitative solution.