Machine Vision | VALIANT /valiant Vanderbilt Advanced Lab for Immersive AI Translation (VALIANT) Thu, 21 Nov 2024 17:46:40 +0000 Dual-tuned floating solenoid balun for multi-nuclear MRI and MRS /valiant/2024/11/21/dual-tuned-floating-solenoid-balun-for-multi-nuclear-mri-and-mrs/ Thu, 21 Nov 2024 17:46:40 +0000 Yang, Y.; Zhang, B.; Lu, M.; Yan, X. "Dual-tuned floating solenoid balun for multi-nuclear MRI and MRS." Magnetic Resonance Imaging, Volume 115, 2025, Article 110268.

This study introduces a new, compact dual-tuned floating balun designed to reduce unwanted currents in MRI systems, which can affect performance and safety. Traditional floating baluns are bulky, especially in multi-nuclear MRI/MRS setups that use two RF systems. The new balun is smaller, fully removable, and does not require direct connections to cables. It uses inductive coupling between two solenoids and a two-layer design for better performance. Bench tests at 7 T showed high common-mode current suppression at both 1H and 23Na frequencies. This innovation improves MRI system efficiency by reducing size, enabling more coil elements, and simplifying cable management.
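The two frequencies the balun must suppress follow directly from the Larmor relation. The sketch below (not from the paper; the 100 nH inductance is an arbitrary illustrative value) computes the 1H and 23Na frequencies at 7 T and the capacitance that would resonate a given solenoid inductance at each one.

```python
import math

# Gyromagnetic ratios (gamma / 2*pi) in MHz/T -- standard physical constants.
GAMMA_MHZ_PER_T = {"1H": 42.577, "23Na": 11.262}

def larmor_mhz(nucleus: str, b0_tesla: float) -> float:
    """Larmor frequency in MHz: f = (gamma / 2*pi) * B0."""
    return GAMMA_MHZ_PER_T[nucleus] * b0_tesla

def trap_capacitance_pf(l_nh: float, f_mhz: float) -> float:
    """Capacitance (pF) that resonates an inductance L (nH) at f (MHz),
    from f = 1 / (2*pi*sqrt(L*C))."""
    return 1e12 / ((2 * math.pi * f_mhz * 1e6) ** 2 * l_nh * 1e-9)

for nucleus in ("1H", "23Na"):
    f = larmor_mhz(nucleus, 7.0)  # 1H: ~298 MHz, 23Na: ~79 MHz at 7 T
    # 100 nH is an example inductance, not a value from the paper.
    print(f"{nucleus}: {f:.1f} MHz -> {trap_capacitance_pf(100.0, f):.2f} pF")
```

The paper's pole-insertion circuit achieves both resonances within a single floating trap; this calculation only shows the two target frequencies a dual-tuned design must hit.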

Fig. 1.

Design and construction of the dual-tuned FSB. (A) Schematic diagram of the dual-tuned FSB. (B) Equivalent circuit schematic for the dual-tuned FSB. The inner solenoid is floating and terminated with CL and a pole-insertion circuit (CH and LH) to generate two resonant frequencies. The coaxial cable (grey) is wound around the inner solenoid to form the cable solenoid. (C) CAD model showing the mechanical structure of the dual-tuned balun. (D-E) Side view and front view of the fabricated balun for RG-174-like cable (Huber + Suhner G_02232_D). (F-G) Side view and front view of the fabricated balun for RG-223 cable.

]]>
Consensus tissue domain detection in spatial omics data using multiplex image labeling with regional morphology (MILWRM) /valiant/2024/11/21/consensus-tissue-domain-detection-in-spatial-omics-data-using-multiplex-image-labeling-with-regional-morphology-milwrm/ Thu, 21 Nov 2024 16:46:39 +0000 Kaur, H.; Heiser, C.N.; McKinley, E.T.; Ventura-Antunes, L.; Harris, C.R.; Roland, J.T.; Farrow, M.A.; Selden, H.J.; Pingry, E.L.; Moore, J.F.; Ehrlich, L.I.R.; Shrubsole, M.J.; Spraggins, J.M.; Coffey, R.J.; Lau, K.S.; Vandekar, S.N. "Consensus tissue domain detection in spatial omics data using multiplex image labeling with regional morphology (MILWRM)." Communications Biology, Volume 7, Issue 1, 2024, Article 1295.

New molecular imaging methods can capture detailed genetic and protein information directly from tissues, allowing scientists to study diseases while keeping the original structure of the tissue intact. By combining this molecular data with traditional tissue images, researchers can learn more about how different parts of tissues are affected by diseases. However, making sense of all this complex data, especially when comparing many samples, is challenging.

To help with this, we created MILWRM, a Python tool that can quickly find and label different areas within tissue samples. MILWRM analyzes images and groups similar parts of the tissue together, making it easier to identify specific regions.

We tested MILWRM on various tissue samples, including human colon polyps, lymph nodes, mouse kidneys, and mouse brain slices. The tool was able to distinguish different types of polyps and identify unique areas in the brain based on their molecular characteristics. MILWRM helps researchers understand the structure and molecular features of tissues, making it a valuable tool for studying diseases.

Fig. 1: The workflow of the MILWRM pipeline.

MILWRM begins by constructing a tissue labeler object from all of the sample slides, which undergo data preprocessing, serialization, and subsampling to create a randomly subsampled dataset used for k-means model construction. This subsampled data is used to find an optimal number of tissue domains (TDs), with k selected using the adjusted inertia method. Finally, a k-means model is constructed and each pixel is assigned a TD. Each TD has a distinct domain profile describing its molecular features. MILWRM also provides quality-control metrics such as confidence scores (created with BioRender.com).
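The subsample-then-cluster core of this workflow can be sketched in a few lines. The code below is a toy stand-in, not MILWRM itself: it fabricates pixel features for three synthetic tissue domains, fits a minimal k-means on a random subsample, assigns every pixel a TD, and computes each TD's mean-intensity profile. Feature values, sizes, and the choice k=3 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated pixel features: 3 "tissue domains" with distinct two-marker
# profiles (purely synthetic stand-ins for preprocessed slide data).
true_centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
pixels = np.vstack([c + rng.normal(scale=0.5, size=(300, 2)) for c in true_centers])

def kmeans(x, k, iters=50, seed=0):
    """Minimal k-means (Lloyd's algorithm), standing in for the pipeline's
    k-means model. Returns (labels, centroids)."""
    r = np.random.default_rng(seed)
    cent = x[r.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        lab = np.linalg.norm(x[:, None] - cent[None], axis=2).argmin(axis=1)
        cent = np.array([x[lab == j].mean(axis=0) if np.any(lab == j) else cent[j]
                         for j in range(k)])
    return lab, cent

# Fit on a random subsample (as the workflow above does), then assign
# every pixel its tissue domain (TD) from the fitted centroids.
subsample = pixels[rng.choice(len(pixels), size=300, replace=False)]
_, centroids = kmeans(subsample, k=3)
labels = np.linalg.norm(pixels[:, None] - centroids[None], axis=2).argmin(axis=1)

# Each TD's "domain profile" = mean marker intensity over its pixels.
profiles = np.array([pixels[labels == j].mean(axis=0) for j in range(3)])
```

MILWRM additionally handles slide preprocessing, k selection via adjusted inertia, and confidence scoring, none of which this sketch reproduces.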

]]>
MARVEL: Bringing Multi-Agent Reinforcement-Learning Based Variable Speed Limit Controllers Closer to Deployment /valiant/2024/11/21/marvel-bringing-multi-agent-reinforcement-learning-based-variable-speed-limit-controllers-closer-to-deployment/ Thu, 21 Nov 2024 16:41:20 +0000 Zhang, Y.; Quinones-Grueiro, M.; Zhang, Z.; Wang, Y.; Barbour, W.; Biswas, G.; Work, D. "MARVEL: Bringing Multi-Agent Reinforcement-Learning Based Variable Speed Limit Controllers Closer to Deployment." IEEE Access, 2024.

Variable Speed Limits (VSL) are used worldwide to help manage traffic flow on highways. Most current systems use fixed rules, which can limit their effectiveness in handling different traffic situations. Recent research has explored using advanced machine learning techniques, specifically multi-agent reinforcement learning (MARL), to improve VSL systems. However, existing MARL approaches don’t meet the real-world requirements set by U.S. traffic agencies.

This study introduces a new MARL framework called MARVEL, designed to control VSL on large highway networks while meeting practical deployment needs. MARVEL only uses data from sensors that are commonly available on highways and learns to manage speed limits based on three key traffic goals to ensure it adapts well to different conditions. It shares learned strategies among multiple VSL control points, allowing it to scale across long stretches of road.

The framework was first tested in a detailed traffic simulation with 8 VSL control points over a 7-mile section. Then, it was applied to a larger 17-mile section of Interstate 24 (I-24) near Nashville, Tennessee, involving 34 control points. MARVEL showed significant improvements, increasing traffic safety by 63.4% compared to no VSL control and improving traffic flow by 58.6% compared to the current system used on I-24. The model was also tested using real-world traffic data from I-24, demonstrating its potential for real-world application.
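Policy sharing across control points is what lets such a controller scale from 8 to 34 gantries without per-gantry retraining. The sketch below illustrates only that structural idea with a random linear policy; the observation layout, the action set of posted speed limits, and all sizes are illustrative assumptions, not MARVEL's actual design.

```python
import numpy as np

SPEED_LIMITS = [30, 40, 50, 60, 70]   # mph; an assumed discrete action set

rng = np.random.default_rng(1)

# One weight matrix shared by every VSL agent (the policy-sharing idea).
# Each agent sees only its local observation, assumed here to be
# [mean speed, occupancy, downstream mean speed] from its collocated sensor.
W = rng.normal(size=(3, len(SPEED_LIMITS)))

def act(obs: np.ndarray) -> int:
    """Greedy action index of the shared (untrained, random) policy."""
    return int((obs @ W).argmax())

# 34 agents along the corridor, ordered from the most downstream gantry,
# all querying the same policy with different local observations.
observations = rng.uniform(size=(34, 3))
posted = [SPEED_LIMITS[act(o)] for o in observations]
print(len(posted), set(posted) <= set(SPEED_LIMITS))   # -> 34 True
```

Because the policy is shared, adding gantries only adds rows of observations; the learned parameters are reused unchanged, which is what makes the 7-mile-to-17-mile transfer possible.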

FIGURE 1. We consider a large-scale VSL control problem with multiple gantries evenly distributed along the freeway, where the posted speed limit is identical across lanes for each gantry. Note that there is a traffic sensor collocated with each gantry to provide state input information. We order the VSL agents starting from the most downstream one, i.e., agent 1 manages the most downstream VSL gantry (controller), and agent n manages the most upstream VSL gantry (controller).

]]>
A Tale of Two Comprehensions? Analyzing Student Programmer Attention during Code Summarization /valiant/2024/11/21/a-tale-of-two-comprehensions-analyzing-student-programmer-attention-during-code-summarization/ Thu, 21 Nov 2024 16:37:21 +0000 Karas, Z.; Bansal, A.; Zhang, Y.; Li, T.; McMillan, C.; Huang, Y. "A Tale of Two Comprehensions? Analyzing Student Programmer Attention during Code Summarization." ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 7, 2024, Article 193.

Code summarization involves creating short, natural language descriptions of source code to help people understand it better. While previous research has looked at how programmers focus on different parts of the code when writing their own summaries, there hasn’t been much study on how they read and understand code with existing summaries. We don’t yet know how these two activities—reading and writing code summaries—compare, or how programmers pay attention to the meaning of the code during these tasks.

To explore this, we conducted an eye-tracking study with 27 participants to see where they focus when reading versus writing code summaries. We analyzed their gaze patterns, finding some differences in attention between the two tasks, as well as similarities in how they read code. We also noticed that factors like experience can influence these patterns. Additionally, we compared their gaze data to a structured representation of the code (Abstract Syntax Tree) and found that their visual focus doesn’t always match up with the actual code structure. These insights can help improve code comprehension in programming education and guide the development of automated tools for summarizing code.
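Comparing gaze data against an Abstract Syntax Tree requires mapping each fixation's source position to a node. Below is one minimal way to do that with Python's built-in ast module; the innermost-node heuristic and the example snippet are my own illustration, not the paper's pipeline (a real study must first convert screen pixels to line/column positions).

```python
import ast

def node_at(source: str, line: int, col: int):
    """Return the innermost AST node whose source span contains the
    position (line, col), e.g. a gaze fixation mapped into the code."""
    tree = ast.parse(source)
    hits = [
        n for n in ast.walk(tree)
        if hasattr(n, "lineno")
        and (n.lineno, n.col_offset) <= (line, col) < (n.end_lineno, n.end_col_offset)
    ]
    # Smallest source span == most deeply nested containing node.
    return min(
        hits,
        key=lambda n: (n.end_lineno - n.lineno, n.end_col_offset - n.col_offset),
        default=None,
    )

code = "def f(x):\n    return x + 1\n"
node = node_at(code, 2, 11)  # a fixation landing on the "x" in "x + 1"
print(type(node).__name__)   # -> Name
```

Aggregating such lookups over all fixations gives per-node-type attention counts, which is the kind of gaze-to-structure alignment the study analyzes.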

Fig. 1. Example stimuli used in the task. In both conditions, the code was displayed on the left, and the summaries, pre-written or participant-generated, were located in the top right. In the Reading condition, Likert-scale questions for assessing summary quality were presented on the right, below the pre-written summary.

]]>
Wasserstein task embedding for measuring task similarities /valiant/2024/11/21/wasserstein-task-embedding-for-measuring-task-similarities/ Thu, 21 Nov 2024 16:33:54 +0000 Liu, X.; Bai, Y.; Lu, Y.; Soltoggio, A.; Kolouri, S. "Wasserstein task embedding for measuring task similarities." Neural Networks, Volume 181, 2025, Article 106796.

Measuring the similarity between different tasks is important for various machine learning problems, such as transfer learning, multi-task learning, and meta-learning. Most existing methods for measuring task similarities depend on the architecture of the model being used. These approaches either rely on pre-trained models or use forward transfer as a proxy by training networks on tasks. The method proposed here is different—it is model-agnostic, does not require training, and can handle tasks with partially overlapping label sets. The technique involves embedding task labels using multi-dimensional scaling, then combining dataset samples with their corresponding label embeddings. The similarity between two tasks is then defined as the 2-Wasserstein distance between their updated samples. This method allows tasks to be represented in a vector space, where the distance between tasks can be calculated more efficiently. The results show that this approach significantly speeds up the comparison of tasks compared to other methods like the Optimal Transport Dataset Distance (OTDD). Through various experiments, the authors demonstrate that their method is closely linked to how knowledge transfers between tasks, showing strong correlations between their task similarity measure and actual transfer performance on image recognition datasets.
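The label-embedding and distance steps described above can be sketched end to end. Everything below is a toy reconstruction under stated assumptions (two classes, unit label distance, tiny point clouds), not the authors' code: classical MDS embeds the labels, each sample is concatenated with its label embedding, and the exact 2-Wasserstein distance between the equal-size clouds is computed by optimal assignment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def classical_mds(d: np.ndarray, dim: int) -> np.ndarray:
    """Embed points from a pairwise-distance matrix (classical MDS)."""
    n = len(d)
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j          # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(b)
    idx = np.argsort(vals)[::-1][:dim]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

def w2(a: np.ndarray, b: np.ndarray) -> float:
    """Exact 2-Wasserstein distance between equal-size point clouds."""
    cost = ((a[:, None] - b[None]) ** 2).sum(-1)
    r, c = linear_sum_assignment(cost)
    return float(np.sqrt(cost[r, c].mean()))

# Toy tasks: identical labels, inputs shifted by 0.1 in each dimension.
rng = np.random.default_rng(0)
label_dist = np.array([[0.0, 1.0], [1.0, 0.0]])    # 2 classes, unit distance
emb = classical_mds(label_dist, dim=1)

x1 = rng.normal(size=(50, 2)); y1 = rng.integers(0, 2, 50)
x2 = x1 + 0.1;                 y2 = y1              # a nearly identical task
z1 = np.hstack([x1, emb[y1]])                       # sample + label embedding
z2 = np.hstack([x2, emb[y2]])
print(round(w2(z1, z2), 3))                         # -> 0.141 (= sqrt(0.02))
```

Note the paper's speedup comes from the subsequent Wasserstein embedding into a fixed vector space, which replaces per-pair assignment problems with plain Euclidean distances; that step is omitted here.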

 

Fig. 1. Wasserstein Task Embedding (WTE) framework. Given two labeled task distributions over a shared input space, WTE first maps them into probability distributions by embedding the labels via MDS, then applies Wasserstein embedding (WE) to obtain fixed-length vectors with respect to a fixed reference measure of a chosen reference-set size.

]]>
Zero-shot prompt-based video encoder for surgical gesture recognition /valiant/2024/11/21/zero-shot-prompt-based-video-encoder-for-surgical-gesture-recognition/ Thu, 21 Nov 2024 16:29:09 +0000 Rao, M.; Qin, Y.; Kolouri, S.; Wu, J.Y.; Moyer, D. "Zero-shot prompt-based video encoder for surgical gesture recognition." International Journal of Computer Assisted Radiology and Surgery, 2024.

This study explores how to build a system that can recognize surgical gestures without needing a large dataset for every possible gesture. Instead of collecting lots of labeled data, the goal is to create a model that can identify new, unseen gestures (known as zero-shot recognition). The researchers used a pre-trained model, CLIP, which understands both images and text, and adapted it for recognizing surgical gestures in videos. Their experiments showed that this method works better than traditional models, especially when the system needs to identify gestures it hasn’t been trained on. They also found that adding text descriptions during the training process improved the model’s performance. The approach, called “bridge-prompt,” shows great potential for surgical robots, as it can recognize a variety of gestures without needing retraining for each new one.
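The zero-shot mechanism itself reduces to similarity scoring in a shared embedding space. The toy sketch below uses random vectors in place of CLIP's image and text encoders, and made-up gesture prompts; it shows only why a new gesture can be supported by writing a new text prompt rather than retraining.

```python
import numpy as np

def zero_shot_classify(image_emb: np.ndarray, text_embs: np.ndarray):
    """CLIP-style zero-shot scoring: cosine similarity between one image
    embedding and one text-prompt embedding per candidate gesture."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    scores = txt @ img
    return int(scores.argmax()), scores

# Made-up gesture prompts and random 16-d embeddings standing in for
# CLIP's encoders (illustrative only).
prompts = ["reaching for the needle", "positioning the needle", "pushing the needle"]
rng = np.random.default_rng(0)
text_embs = rng.normal(size=(3, 16))
image_emb = text_embs[1] + 0.1 * rng.normal(size=16)   # frame resembling class 1
pred, _ = zero_shot_classify(image_emb, text_embs)
print(prompts[pred])   # -> positioning the needle
```

Supporting a fourth gesture means appending one more prompt embedding to text_embs; no model weights change, which is the zero-shot property the paper exploits.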


]]>
Radiofrequency-transparent local B0 shimming coils using float traps /valiant/2024/11/21/radiofrequency-transparent-local-b0-shimming-coils-using-float-traps/ Thu, 21 Nov 2024 16:19:16 +0000 Liu, C.; Liang, H.; Lu, M.; Gore, J.C.; Sengupta, S.; Yan, X. "Radiofrequency-transparent local B0 shimming coils using float traps." Magnetic Resonance in Medicine, 2024.

In high-field MRI, uneven magnetic fields (B0 inhomogeneities) can lead to poor image quality. A technique called multicoil shimming uses small coils to correct this issue, but traditional coils can interfere with the MRI’s radiofrequency (RF) signals, further reducing image quality. To solve this, a new type of coil has been developed that fixes the magnetic field problem without disrupting the RF signals. The design includes special features that prevent interference, allowing the coil to be placed near the MRI’s main coils without causing issues. Tests showed that this new coil improves magnetic field uniformity, particularly near metal implants, and reduces image distortion while maintaining the quality of the MRI signal. This innovation could enhance MRI scans without requiring major changes to existing equipment.
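Multicoil shimming itself reduces to a small least-squares problem: choose the coil currents whose superposed fields best cancel the measured B0 error. The sketch below uses a random coil-to-field matrix as a stand-in for measured or simulated per-coil field maps; all sizes and values are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_coils = 200, 8   # field-map sample points and local shim coils (toy sizes)

# A[i, j]: field at point i per unit current in coil j. Random stand-in;
# in practice this matrix is measured or simulated for each coil.
A = rng.normal(size=(n_points, n_coils))
b0_error = rng.normal(size=n_points)        # measured B0 inhomogeneity map

# Currents minimizing the residual field ||b0_error + A @ currents||_2.
currents, *_ = np.linalg.lstsq(A, -b0_error, rcond=None)
residual = b0_error + A @ currents

print(np.linalg.norm(residual) < np.linalg.norm(b0_error))   # -> True
```

The paper's contribution is orthogonal to this solve: the float RF traps let such DC shim coils sit near the RF coils without degrading the MRI signal, so the optimization can be applied locally (for example near implants).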

 

FIGURE 1

Design and construction of the transparent direct-current (DC) coil. (A) Schematic diagram of the terminated capacitor configuration, showing the shorted and terminated capacitors of the balun. This figure demonstrates a float balun with the terminated capacitors positioned at one end. Note that they could be terminated at both ends or in the middle. (B) Cross-sectional view of the balun illustrating the placement of copper foil and multi-turn wires. (C) Schematic of the complete coil design, including multiturn wires for DC current and float radiofrequency (RF) baluns. (D) Photograph of a single float RF balun. (E) Another view of the float RF balun demonstrating its capacitor soldering. (F) Top view of the normal DC coil in a square shape, showing the coil configuration and copper wire layout. (G) Top view of the transparent DC coil, highlighting the arrangement of the float RF baluns and multiturn wires. ID, inner diameter; OD, outer diameter.

]]>
Multichannel meta-imagers for accelerating machine vision /valiant/2024/04/16/multichannel-meta-imagers-for-accelerating-machine-vision/ Tue, 16 Apr 2024 02:15:25 +0000 Zheng, H.; Liu, Q.; Kravchenko, I.I.; Zhang, X.; Huo, Y.; Valentine, J.G. "Multichannel meta-imagers for accelerating machine vision." Nature Nanotechnology, 2024, doi: 10.1038/s41565-023-01557-2. PMID: 38177276.

The study introduces a novel “meta-imager” that combines high-speed, low-power optical components with a digital backend to enhance machine vision systems, reducing the heavy computational load typically associated with digital neural networks. This innovative device utilizes metasurfaces for angle and polarization multiplexing, allowing it to perform complex convolution operations—essential for tasks like object classification—in a single optical shot. This integration effectively offloads much of the computational burden from the digital components to the optics, greatly reducing energy consumption and improving processing speed. The meta-imager demonstrated impressive performance, with 98.6% accuracy in classifying handwritten digits and 88.8% accuracy with fashion images. Given its compactness, efficiency, and speed, this technology shows great potential for a broad range of applications in artificial intelligence and machine vision fields, particularly in environments where real-time decision-making is crucial and computational resources are limited.
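The operation the optics performs in a single shot is an ordinary multi-channel 2-D convolution. The sketch below computes the same kind of feature maps digitally on a random stand-in image; the 12-channel count echoes the paper's figure, but the 3x3 kernel size and all values are illustrative assumptions.

```python
import numpy as np

def conv2d_valid(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Plain 'valid' 2-D convolution (kernel flipped, true convolution) --
    the per-channel operation the meta-imager performs optically."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel[::-1, ::-1])
    return out

rng = np.random.default_rng(0)
digit = rng.uniform(size=(28, 28))      # stand-in for an MNIST digit
kernels = rng.normal(size=(12, 3, 3))   # 12 channels; kernel size assumed
feature_maps = np.stack([conv2d_valid(digit, k) for k in kernels])
print(feature_maps.shape)               # -> (12, 26, 26)
```

In the meta-imager these feature maps emerge from angle- and polarization-multiplexed metasurfaces in one optical pass, so only the small digital backend that consumes them still runs on a processor.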


Classification of MNIST and Fashion MNIST objects. a, An input image from the MNIST dataset. b, Ideal and experimentally measured feature maps corresponding to the convolution of the data in a with channels 9 and 12. The top-left corner label indicates the channel number during convolution. c, Comparison between the theoretical and measured confusion matrices for MNIST classification. d, An input image from the Fashion MNIST dataset. The top-left corner label indicates the object class number. e, Ideal and experimentally measured feature maps corresponding to the convolution of the data in d with channels 9 and 12. The top-left corner label indicates the channel number during convolution. f, Comparison between the theoretical and measured confusion matrices for Fashion MNIST classification. g, Predicted accuracy curve for the MNIST dataset and the areal density of the basic computing unit as a function of pixel size. The insets depict the kernel profiles and feature maps at different pixel sizes.
]]>