statistics | VALIANT /valiant Vanderbilt Advanced Lab for Immersive AI Translation (VALIANT) Thu, 21 Nov 2024 16:41:20 +0000 en-US
MARVEL: Bringing Multi-Agent Reinforcement-Learning Based Variable Speed Limit Controllers Closer to Deployment /valiant/2024/11/21/marvel-bringing-multi-agent-reinforcement-learning-based-variable-speed-limit-controllers-closer-to-deployment/ Thu, 21 Nov 2024 16:41:20 +0000 /valiant/?p=3292 Zhang, Y.; Quinones-Grueiro, M.; Zhang, Z.; Wang, Y.; Barbour, W.; Biswas, G.; Work, D. “.” IEEE Access, 2024, .

Variable Speed Limits (VSL) are used worldwide to help manage traffic flow on highways. Most current systems use fixed rules, which can limit their effectiveness in handling different traffic situations. Recent research has explored using advanced machine learning techniques, specifically multi-agent reinforcement learning (MARL), to improve VSL systems. However, existing MARL approaches don’t meet the real-world requirements set by U.S. traffic agencies.

This study introduces a new MARL framework called MARVEL, designed to control VSL on large highway networks while meeting practical deployment needs. MARVEL only uses data from sensors that are commonly available on highways and learns to manage speed limits based on three key traffic goals to ensure it adapts well to different conditions. It shares learned strategies among multiple VSL control points, allowing it to scale across long stretches of road.

The framework was first tested in a detailed traffic simulation with 8 VSL control points over a 7-mile section. Then, it was applied to a larger 17-mile section of Interstate 24 (I-24) near Nashville, Tennessee, involving 34 control points. MARVEL showed significant improvements, increasing traffic safety by 63.4% compared to no VSL control and improving traffic flow by 58.6% compared to the current system used on I-24. The model was also tested using real-world traffic data from I-24, demonstrating its potential for real-world application.

FIGURE 1. We consider a large-scale VSL control problem with multiple gantries evenly distributed along the freeway, where the posted speed limit is identical across lanes for each gantry. Note that there is a traffic sensor collocated with each gantry to provide state input information. We order the VSL agents starting from the most downstream one, i.e., agent 1 manages the most downstream VSL gantry (controller), and agent n manages the most upstream VSL gantry (controller).

]]>
Wasserstein task embedding for measuring task similarities /valiant/2024/11/21/wasserstein-task-embedding-for-measuring-task-similarities/ Thu, 21 Nov 2024 16:33:54 +0000 /valiant/?p=3283 Liu, X.; Bai, Y.; Lu, Y.; Soltoggio, A.; Kolouri, S. “.” Neural Networks, Volume 181, 2025, Article 106796, .

Measuring the similarity between different tasks is important for various machine learning problems, such as transfer learning, multi-task learning, and meta-learning. Most existing methods for measuring task similarities depend on the architecture of the model being used. These approaches either rely on pre-trained models or use forward transfer as a proxy by training networks on tasks. The method proposed here is different—it is model-agnostic, does not require training, and can handle tasks with partially overlapping label sets. The technique involves embedding task labels using multi-dimensional scaling, then combining dataset samples with their corresponding label embeddings. The similarity between two tasks is then defined as the 2-Wasserstein distance between their updated samples. This method allows tasks to be represented in a vector space, where the distance between tasks can be calculated more efficiently. The results show that this approach significantly speeds up the comparison of tasks compared to other methods like the Optimal Transport Dataset Distance (OTDD). Through various experiments, the authors demonstrate that their method is closely linked to how knowledge transfers between tasks, showing strong correlations between their task similarity measure and actual transfer performance on image recognition datasets.
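The core distance described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy data, the random label embeddings (stand-ins for the MDS output), and the helper names `augment` and `wasserstein2` are all assumptions. It computes the 2-Wasserstein distance directly between two label-augmented point clouds; the paper's additional step of embedding tasks into a vector space is not reproduced here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def augment(samples, labels, label_embeddings):
    """Concatenate each sample with the embedding of its label."""
    return np.hstack([samples, label_embeddings[labels]])


def wasserstein2(points_a, points_b):
    """Exact 2-Wasserstein distance between two equal-size,
    uniformly weighted point clouds (solved as an assignment problem)."""
    cost = cdist(points_a, points_b, metric="sqeuclidean")
    rows, cols = linear_sum_assignment(cost)
    return float(np.sqrt(cost[rows, cols].mean()))


rng = np.random.default_rng(0)
# Hypothetical toy tasks: 50 samples in 4 dimensions, 3 classes each.
x_a = rng.normal(size=(50, 4))
x_b = rng.normal(loc=0.5, size=(50, 4))
y_a = rng.integers(0, 3, size=50)
y_b = rng.integers(0, 3, size=50)
# Hypothetical 2-D label embeddings (stand-ins for the MDS output).
emb = rng.normal(size=(3, 2))

task_a = augment(x_a, y_a, emb)
task_b = augment(x_b, y_b, emb)
dist_ab = wasserstein2(task_a, task_b)  # distance between the two tasks
dist_aa = wasserstein2(task_a, task_a)  # a task compared with itself
```

The assignment-based solver is exact only when both clouds have the same number of equally weighted points; unequal sizes would need a general optimal transport solver.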

 

Fig. 1. Wasserstein Task Embedding (WTE) framework. Given two labeled task distributions over a shared input space, WTE first maps them to probability distributions by embedding their labels via MDS, then applies Wasserstein embedding (WE) to obtain a vector for each task with respect to a fixed reference measure defined on a reference set of fixed size.

]]>
Characterizing patterns of diffusion tensor imaging variance in aging brains /valiant/2024/09/22/characterizing-patterns-of-diffusion-tensor-imaging-variance-in-aging-brains/ Sun, 22 Sep 2024 15:15:20 +0000 /valiant/?p=3018 Gao, Chenyu, Yang, Qi, Kim, Michael E., Khairi, Nazirah Mohd, Cai, Leon Y., Newlin, Nancy R., Kanakaraj, Praitayini, Remedios, Lucas W., Krishnan, Aravind R., Yu, Xin, Yao, Tianyuan, & Zhang, Panpan. (2024). Characterizing patterns of diffusion tensor imaging variance in aging brains. Journal of Medical Imaging, 11(4), 044007.

This study investigates the variability in diffusion tensor imaging (DTI) data, particularly when data are merged from multiple sites, which is crucial for large-scale analyses. DTI measures can be affected by spatially varying and correlated noise, making it important to understand how different factors—like physiology, subject behavior, and scanner interaction—impact the reliability of the results. The researchers focused on characterizing the sources of variance in DTI metrics in different brain regions to improve the accuracy of future analyses.

Using data from 1,035 subjects, aged 22 to 103, in the Baltimore Longitudinal Study of Aging, the study analyzed how DTI variance changes over time and across multiple factors. Each subject had up to 12 longitudinal DTI scans, and the authors examined how factors such as age, scan interval, motion, sex, and session order affected DTI variance in different regions of the brain. They found that the impact of these factors was complex and varied across regions. For example, the time between scans was associated with increased variance in some areas (like the cuneus and occipital gyrus) but decreased variance in others (such as the caudate nucleus). Additionally, males showed higher variability in specific regions, and head motion had a mixed impact on DTI variance across different regions.

The findings highlight the need for researchers to consider the variability in DTI metrics when analyzing data and planning studies. By accounting for these regional variations in variability, researchers can improve the accuracy and reliability of DTI-based analyses, especially in large, multi-site studies. This work also emphasizes the importance of including variance estimates in data sharing to enhance the quality of future research.

We observe that the noise (approximated by the difference between the scan and rescan acquired within the same imaging session) in DTI scalar images, such as FA images, generally increases with age. (Subjects’ ages are grouped into 5-year bins to respect privacy.) But motion is also considered to increase with age [26,27]. We would like to know the following: Which factor is associated with DTI variance? Where and how does this association manifest?
]]>
Time-Series Few Shot Anomaly Detection for HVAC Systems /valiant/2024/09/22/time-series-few-shot-anomaly-detection-for-hvac-systems/ Sun, 22 Sep 2024 15:08:02 +0000 /valiant/?p=3012 Huang, Yuxin, Coursey, Austin, Quinones-Grueiro, Marcos, & Biswas, Gautam. (2024). Time-series few shot anomaly detection for HVAC systems. In Proceedings of the 12th IFAC Symposium on Fault Detection, Supervision and Safety for Technical Processes (SAFEPROCESS 2024), Ferrara, Italy, June 4-7, 2024, Volume 58, Issue 4, Pages 426-431.

This study addresses a common challenge in detecting anomalies in building heating, ventilation, and air conditioning (HVAC) systems: the limited availability of labeled data needed to train deep learning algorithms. Since labeled data is often scarce, the authors focus on developing a method that can effectively reuse existing data across different systems.

They propose a few-shot domain adaptation approach based on a long short-term memory (LSTM) autoencoder (AE) neural network. This method only requires data from normal system operations (nominal instances) from both a source domain (where more data is available) and a target domain (where data is limited). With just a small amount of data from the target system, the model adapts and improves its performance in detecting anomalies.

The results show that this approach performs better than existing unsupervised methods in various fault scenarios, making it a promising solution for detecting HVAC system anomalies in situations where labeled data is minimal. This method provides a more efficient way to detect system faults while overcoming the data limitations that often hinder the deployment of deep learning models in real-world applications.

LSTM autoencoder architecture
]]>
Empirical assessment of the assumptions of ComBat with diffusion tensor imaging /valiant/2024/06/20/empirical-assessment-of-the-assumptions-of-combat-with-diffusion-tensor-imaging/ Thu, 20 Jun 2024 17:40:01 +0000 /valiant/?p=2597 Michael E. Kim, Chenyu Gao, Leon Y. Cai, Qi Yang, Nancy R. Newlin, Karthik Ramadass, Angela Jefferson, Derek Archer, Niranjana Shashikumar, Kimberly R. Pechman, Katherine A. Gifford, Timothy J. Hohman, Lori L. Beason-Held, Susan M. Resnick, Stefan Winzeck, Kurt G. Schilling, Panpan Zhang, Daniel Moyer, and Bennett A. Landman. “.” Journal of Medical Imaging (Bellingham), vol. 11, no. 2, 024011, March 2024. doi:10.1117/1.JMI.11.2.024011.

Diffusion tensor imaging (DTI) is a magnetic resonance imaging technique that provides unique insights into white matter microstructure in the brain. However, it is susceptible to confounding effects introduced by scanner or acquisition differences. ComBat is a leading approach for addressing these site biases. Despite its frequent use for harmonization, ComBat’s robustness towards site dissimilarities and overall cohort size has not yet been evaluated in the context of DTI.

To address this, we matched 358 participants from two sites to create a “silver standard” cohort for multi-site harmonization. We harmonized mean fractional anisotropy (FA) and mean diffusivity (MD) calculated from participant DTI data for regions of interest defined by the JHU EVE-Type III atlas. To quantify the reliability of ComBat, we performed bootstrapping over 10 iterations at 19 levels of total sample size, 10 levels of sample size imbalance between sites, and 6 levels of mean age difference between sites. We measured three key metrics: (i) β_AGE, the linear regression coefficient of the relationship between FA and age; (ii) γ_sf, the ComBat-estimated site-shift; and (iii) δ_sf, the ComBat-estimated site-scaling. We evaluated the reliability of ComBat by calculating the root mean squared error (RMSE) in these metrics and examined the correlation between the reliability of ComBat and the violation of model assumptions.
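The reliability measure for β_AGE can be illustrated with a small sketch. It is not the study's pipeline: the cohort below is synthetic, ComBat itself is omitted, the function names are hypothetical, and 200 subsamples are drawn (rather than the 10 used in the study) only to make the toy comparison stable. It shows how the RMSE of subsample slopes against the full-cohort slope shrinks as the sample size grows.

```python
import numpy as np


def age_slope(age, fa):
    """Slope of the least-squares line fa ~ age."""
    slope, _intercept = np.polyfit(age, fa, deg=1)
    return slope


def bootstrap_rmse(age, fa, sample_size, n_boot, rng):
    """RMSE of subsample slopes against the full-cohort slope."""
    reference = age_slope(age, fa)
    slopes = []
    for _ in range(n_boot):
        # Draw a subsample of participants without replacement.
        idx = rng.choice(len(age), size=sample_size, replace=False)
        slopes.append(age_slope(age[idx], fa[idx]))
    return float(np.sqrt(np.mean((np.array(slopes) - reference) ** 2)))


rng = np.random.default_rng(0)
# Synthetic stand-in cohort (not the study data): FA declines slowly with age.
age = rng.uniform(60, 90, size=358)
fa = 0.5 - 0.002 * age + rng.normal(scale=0.02, size=358)

rmse_small = bootstrap_rmse(age, fa, sample_size=20, n_boot=200, rng=rng)
rmse_large = bootstrap_rmse(age, fa, sample_size=300, n_boot=200, rng=rng)
```

In the study the same comparison is made on ComBat-harmonized values and repeated across levels of sample size, site imbalance, and age difference.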

Our results indicate that ComBat performs reliably for β_AGE when the total sample size is greater than 162 and the mean age difference between sites is less than 4 years. Notably, the ComBat model's assumption of normally distributed residuals is not violated even as the model becomes unstable.

In conclusion, before harmonizing DTI data with ComBat, it is crucial to examine the input cohort for size and covariate distributions at each site. Direct assessment of residual distributions is less informative on stability than bootstrap analysis. We advise caution when using ComBat in situations that do not conform to the identified thresholds.

After registration of the JHU EVE-III Atlas, mean FA values were calculated in all the regions for each participant in the silver standard cohort. A point in the experimental space is “feasible” if the sample size for either site is at least N = 6, the imbalance level does not result in N for either site exceeding the available number of participants for that site, and if sampling of participants yielded a covariate shift within 1 year of the target age difference between sites. For each feasible point in the experimental space, 10 bootstraps were subsampled from the silver standard cohort, and the FA values for the subsamples were harmonized by ComBat. The resulting parameters were then compared to those from the silver standard to determine reliability of ComBat at that location in the experimental space.
]]>
Evaluation of Mean Shift, ComBat, and CycleGAN for Harmonizing Brain Connectivity Matrices Across Sites /valiant/2024/06/20/evaluation-of-mean-shift-combat-and-cyclegan-for-harmonizing-brain-connectivity-matrices-across-sites/ Thu, 20 Jun 2024 17:37:07 +0000 /valiant/?p=2594 Hanliang Xu, Nancy R. Newlin, Michael E. Kim, Chenyu Gao, Praitayini Kanakaraj, Aravind R. Krishnan, Lucas W. Remedios, Nazirah Mohd Khairi, Kimberly Pechman, Derek Archer, Timothy J. Hohman, Angela L. Jefferson, Ivana Išgum, Yuankai Huo, Daniel Moyer, Kurt G. Schilling, and Bennett A. Landman. “.” Proceedings of SPIE Medical Imaging 2024: Image Processing, vol. 12926, 129261X, 2024, San Diego, California

Connectivity matrices derived from diffusion MRI (dMRI) offer an interpretable and generalizable way to understand the human brain connectome. However, dMRI is subject to inter-site and between-scanner variations, which can hinder cross-dataset analysis and affect the robustness and reproducibility of results. To evaluate different harmonization approaches on connectivity matrices, we compared graph measures derived from these matrices before and after applying three harmonization techniques: mean shift, ComBat, and CycleGAN.

The sample consisted of 168 age-matched, sex-matched normal subjects from two studies: the Vanderbilt Memory and Aging Project (VMAP) and the Biomarkers of Cognitive Decline Among Normal Individuals (BIOCARD). First, we plotted the graph measures and used the coefficient of variation (CoV) and the Mann-Whitney U test to assess the effectiveness of each method in removing site effects on the matrices and the derived graph measures.
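The two site-effect checks named here can be sketched as follows. The data are synthetic, the even split of subjects between sites is an assumption, and `mean_shift` implements only one simple reading of mean-shift harmonization (aligning each site's mean to the pooled mean); the study's exact procedure may differ.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return float(np.std(values, ddof=1) / np.mean(values))


def mean_shift(site_a, site_b):
    """Shift each site so that its mean equals the pooled mean (assumed variant)."""
    pooled = np.mean(np.concatenate([site_a, site_b]))
    return site_a - site_a.mean() + pooled, site_b - site_b.mean() + pooled


rng = np.random.default_rng(0)
# Synthetic global-efficiency values for two sites with an offset (not study data).
site_a = rng.normal(loc=0.30, scale=0.02, size=84)
site_b = rng.normal(loc=0.36, scale=0.02, size=84)

# A small p-value indicates a detectable site difference.
p_before = mannwhitneyu(site_a, site_b, alternative="two-sided").pvalue
harm_a, harm_b = mean_shift(site_a, site_b)
p_after = mannwhitneyu(harm_a, harm_b, alternative="two-sided").pvalue

# CoV over the pooled sample drops once the between-site offset is removed.
cov_before = coefficient_of_variation(np.concatenate([site_a, site_b]))
cov_after = coefficient_of_variation(np.concatenate([harm_a, harm_b]))
```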

ComBat effectively eliminated site effects for global efficiency and modularity, outperforming the other two methods. However, all methods exhibited poor performance when harmonizing average betweenness centrality. Second, we examined whether the harmonization methods preserved correlations between age and graph measures. All methods, except for CycleGAN in one direction, improved the correlations between age and global efficiency and between age and modularity, changing them from insignificant to significant with p-values less than 0.05.

These findings suggest that while ComBat is particularly effective for certain graph measures, challenges remain in harmonizing other measures like betweenness centrality. Nonetheless, the ability of these methods to enhance the significance of age-related correlations highlights their potential in improving the robustness of dMRI connectivity analyses across different datasets.

Figure 1. Systematic variability of connectivity matrices is high across sites. The difference matrix indicates that site 1 generates tractograms with generally longer streamlines. Note the substantial differences in the first and third quadrants. Site 2 has fewer, shorter inter-hemispheric streamlines; site 1 has more, longer streamlines across hemispheres.
]]>
MidRISH: Unbiased harmonization of rotationally invariant harmonics of the diffusion signal /valiant/2024/06/20/midrish-unbiased-harmonization-of-rotationally-invariant-harmonics-of-the-diffusion-signal/ Thu, 20 Jun 2024 17:31:00 +0000 /valiant/?p=2588 Nancy R. Newlin, Michael E. Kim, Praitayini Kanakaraj, Tianyuan Yao, Timothy Hohman, Kimberly R. Pechman, Lori L. Beason-Held, Susan M. Resnick, Derek Archer, Angela Jefferson, Bennett A. Landman, and Daniel Moyer. “” Magnetic Resonance Imaging, 2024, doi:10.1016/j.mri.2024.03.033.

Data harmonization is essential for eliminating confounding effects in multi-site diffusion image analysis. One such harmonization method, LinearRISH, scales rotationally invariant spherical harmonic (RISH) features from a “target” site to match those from a “reference” site, aiming to reduce scanner-related confounding effects. However, the designation of reference and target sites is not arbitrary, and this choice can bias the resulting diffusion metrics such as fractional anisotropy and mean diffusivity. This study introduces MidRISH, a method that projects both sites to a mid-space, thereby avoiding the bias introduced by reference site selection. The MidRISH method was validated through two experiments: harmonizing scanner differences in 37 matched patients free of cognitive impairment, and harmonizing acquisition and study differences in 117 matched patients free of cognitive impairment. The results demonstrate that MidRISH reduces the bias associated with reference site selection while maintaining the harmonization efficacy of LinearRISH. Users should be cautious when using LinearRISH harmonization, as choosing a reference site impacts the effect size of diffusion metrics. The proposed MidRISH method eliminates the bias-inducing site selection step, offering a more robust approach to harmonization.
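The reference-choice problem can be illustrated with a toy example. This is a sketch under strong assumptions: the features are synthetic scalars rather than voxel-wise RISH maps, the helper `linear_scale` is hypothetical, and the mid-space is taken to be the arithmetic mean of the two site means, which may not match the definition used by MidRISH.

```python
import numpy as np


def linear_scale(target, target_mean, goal_mean):
    """Scale target-site features so their expected value matches goal_mean."""
    return target * (goal_mean / target_mean)


rng = np.random.default_rng(0)
# Synthetic per-subject RISH-like features for one harmonic order (not study data).
site_a = rng.normal(loc=1.00, scale=0.05, size=37)
site_b = rng.normal(loc=0.80, scale=0.05, size=37)
mean_a, mean_b = site_a.mean(), site_b.mean()

# Reference-based scaling: the outcome depends on which site is the reference.
b_to_a = linear_scale(site_b, mean_b, mean_a)  # site A as reference
a_to_b = linear_scale(site_a, mean_a, mean_b)  # site B as reference

# Mid-space scaling: both sites move to a shared midpoint
# (assumed here to be the average of the two site means).
mid = 0.5 * (mean_a + mean_b)
a_mid = linear_scale(site_a, mean_a, mid)
b_mid = linear_scale(site_b, mean_b, mid)
```

With a reference site, the harmonized values land on either `mean_a` or `mean_b` depending on the user's choice; with a midpoint, both sites are treated symmetrically and no choice is needed.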

Fig. 1. When site A (blue) is selected as the reference site for LinearRISH harmonization, site B’ (yellow) mean MD shifts up to the site A expected value. On the other hand, selecting site B (pink) as reference causes site A’ (green) to shift down to the site B expected value. It is up to the user to decide, which leads to arbitrary bias. Here we propose a quantitative solution.
]]>
Nonlinear Gradient Field Estimation in Diffusion MRI Tensor Simulation /valiant/2024/06/20/nonlinear-gradient-field-estimation-in-diffusion-mri-tensor-simulation/ Thu, 20 Jun 2024 15:49:15 +0000 /valiant/?p=2564 Praitayini Kanakaraj, Tianyuan Yao, Nancy R. Newlin, Leon Y. Cai, Kurt G. Schilling, Baxter P. Rogers, Adam Anderson, Daniel Moyer, and Bennett A. Landman. “.” Proceedings of SPIE Medical Imaging 2024: Physics of Medical Imaging, vol. 12925, 1292549, 2024, San Diego, California

Gradient nonlinearities in magnetic resonance imaging (MRI) not only cause spatial distortions but also create discrepancies between the intended and acquired diffusion sensitization in diffusion-weighted (DW) MRI. With advances in scanner performance, correcting these gradient nonlinearities has become increasingly important. Common methods for estimating gradient nonlinear fields rely on phantom calibration field maps, which are often impractical, especially for retrospective data.

This study presents a new approach to estimate the complete gradient nonlinear field, denoted as L(r), by formulating a quadratic minimization problem. This method begins with the corrupted diffusion signal and estimates L(r) under two scenarios: (1) when the true diffusion tensor is known, and (2) when the true diffusion tensor is unknown and must be estimated. The validity of this mathematical approach is demonstrated both theoretically and through tensor simulation.

The estimated field is evaluated using diffusion tensor metrics: mean diffusivity (MD), fractional anisotropy (FA), and principal eigenvector (V1). Simulations with 300 diffusion tensors indicate that the formulation is stable and not ill-posed. When the true diffusion tensor is known, the change in the determinant of the estimated L(r) field relative to the true field is near zero, and the median difference in corrected diffusion metrics compared to true values is also near zero. The results show that the accuracy of L(r) estimation depends on the level of corruption in L(r).
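The evaluation metrics and the effect of a gradient field on a single tensor can be sketched as below. This does not reproduce the paper's estimation of L(r). It assumes the standard single-tensor model in which a field L turns each intended gradient direction g into L g, uses noise-free synthetic signals, and treats the tensor, field, and function names as hypothetical.

```python
import numpy as np


def tensor_metrics(tensor):
    """Return MD, FA and the principal eigenvector V1 of a 3x3 diffusion tensor."""
    evals, evecs = np.linalg.eigh(tensor)
    md = evals.mean()
    fa = np.sqrt(1.5 * np.sum((evals - md) ** 2) / np.sum(evals ** 2))
    v1 = evecs[:, np.argmax(evals)]
    return float(md), float(fa), v1


def signal(tensor, bval, bvecs, field=None, s0=1.0):
    """Single-tensor signal; `field` (3x3) changes each direction g to field @ g."""
    if field is not None:
        bvecs = bvecs @ field.T
    quad = np.einsum("ij,jk,ik->i", bvecs, tensor, bvecs)  # g^T D g per direction
    return s0 * np.exp(-bval * quad)


def fit_tensor(signals, bval, bvecs, s0=1.0):
    """Log-linear least-squares fit of the six unique tensor elements."""
    x, y, z = bvecs.T
    design = np.column_stack([x * x, y * y, z * z, 2 * x * y, 2 * x * z, 2 * y * z])
    target = -np.log(signals / s0) / bval
    dxx, dyy, dzz, dxy, dxz, dyz = np.linalg.lstsq(design, target, rcond=None)[0]
    return np.array([[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]])


rng = np.random.default_rng(0)
bvecs = rng.normal(size=(30, 3))
bvecs /= np.linalg.norm(bvecs, axis=1, keepdims=True)
true_tensor = np.diag([1.7e-3, 0.3e-3, 0.3e-3])     # hypothetical tensor, mm^2/s
field = np.eye(3) + 0.05 * rng.normal(size=(3, 3))  # hypothetical L(r) at one voxel
bval = 1000.0

measured = signal(true_tensor, bval, bvecs, field=field)
naive = fit_tensor(measured, bval, bvecs)                # ignores the field
corrected = fit_tensor(measured, bval, bvecs @ field.T)  # uses the actual gradients

md_true, fa_true, _ = tensor_metrics(true_tensor)
md_naive, fa_naive, _ = tensor_metrics(naive)
md_corr, fa_corr, _ = tensor_metrics(corrected)
```

In this noise-free setting the naive fit recovers L^T D L instead of D, so its MD and FA are biased, while the fit with the actual gradients returns the true tensor.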

This work introduces a novel mathematical method to estimate the gradient field without requiring additional calibration scans, offering a significant advancement for correcting gradient nonlinearities in DW MRI.

Figure 5. For two L(r) matrices (with determinants of 1.0128 and 1.0832), the true, corrupted, and corrected diffusion tensors are shown for FA values of 0.25, 0.50, and 0.75 at SNR = 30. The corrected diffusion tensors overlaid on the true tensors (columns 2 and 3) appear alike when the determinant is 1.0128, whereas with a determinant of 1.0832 there are slight variations between the estimated and true tensors (columns 5 and 6).
]]>
pyPheWAS Explorer: a visualization tool for exploratory analysis of phenome-disease associations /valiant/2024/06/20/pyphewas-explorer-a-visualization-tool-for-exploratory-analysis-of-phenome-disease-associations/ Thu, 20 Jun 2024 14:32:13 +0000 /valiant/?p=2522 Cailey I. Kerley, Tin Q. Nguyen, Karthik Ramadass, Laurie E. Cutting, Bennett A. Landman, and Matthew Berger. “.” JAMIA Open, vol. 6, no. 1, 2023,

Objective: This study aims to provide an easy-to-use tool for visualizing phenome-wide association studies (PheWAS) using electronic health records (EHR).

Materials and Methods: Current PheWAS tools are complicated, requiring command-line skills and lacking full visualizations. The new tool, pyPheWAS Explorer, offers a graphical interface to help users analyze variables, test assumptions, design models, and view results seamlessly.

Results: The tool was tested with data from individuals with attention deficit hyperactivity disorder (ADHD) and a control group. Using pyPheWAS Explorer, researchers created a model that included sex and socioeconomic status as factors. The tool effectively highlighted known ADHD-related health issues.

Discussion: pyPheWAS Explorer can quickly uncover new EHR associations, making it useful for clinical experts and as an initial exploration tool for institutional EHR databases.

Conclusion: pyPheWAS Explorer simplifies the process of designing, running, and analyzing PheWAS studies, focusing on exploratory data analysis and covariate selection through an intuitive graphical interface.

Figure 2. pyPheWAS Explorer Regression Builder Panel. For demonstration, a cohort of ADHD cases and non-ADHD controls is shown. Group variables in this dataset included minimum/maximum age at visit (MinAgeAtVisit/MaxAgeAtVisit), biological sex, body mass index (BMI), and deprivation index (DEP_INDEX). The right side of this panel shows the variables sex and deprivation index loaded into the variable comparison view, while the model selection view shows the same variables added to a binary PheWAS model. Color encodings for the case and control groups, correlations, and regression coefficients are shown along the top bar.
]]>