New 'TUSA' method teaches AI to read ultrasound by learning textures

March 20, 2026 | arXiv: 2602.01444v1

Ultrasound pictures look different from ordinary photos. They are made from echoes of sound and show characteristic gray-scale textures. A team of researchers proposes a new way to train large AI models so they learn those ultrasound textures directly. They call the method texture ultrasound semantic analysis, or TUSA.

Instead of treating ultrasound like a normal image, TUSA splits the reconstruction task into two steps. First, a neural network assigns each pixel to one of a small number of texture channels. Second, each channel is convolved with its own learnable filter and the results are added back together to reproduce the original image. The authors implemented this with a recent segmentation backbone (SwinUNETR) and a Sparsemax step that encourages the network to pick a single texture channel per pixel.
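The two steps above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the per-pixel channel scores are written by hand where the paper uses a SwinUNETR network, the filters are fixed numbers rather than learned ones, and every function name and array size here is hypothetical. Sparsemax is implemented as the standard projection onto the probability simplex, which can return exact zeros and so tends to leave one active channel per pixel.

```python
# Toy sketch of "segment into texture channels, filter each, sum" (pure Python).
# All names, sizes, and filter values are illustrative assumptions.

def sparsemax(z):
    """Project a list of scores onto the probability simplex (sparse softmax)."""
    z_sorted = sorted(z, reverse=True)
    cumsum, k_star, cumsum_star = 0.0, 0, 0.0
    for k, zk in enumerate(z_sorted, start=1):
        cumsum += zk
        if 1.0 + k * zk > cumsum:  # score k is still inside the support
            k_star, cumsum_star = k, cumsum
    tau = (cumsum_star - 1.0) / k_star  # threshold subtracted from every score
    return [max(zi - tau, 0.0) for zi in z]

def conv2d_same(img, kernel):
    """Zero-padded 2D cross-correlation; output has the same size as img."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - oy, x + j - ox
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * img[yy][xx]
            out[y][x] = acc
    return out

def reconstruct(scores, kernels):
    """scores[c][y][x] are channel scores; kernels[c] is the filter for channel c."""
    n_ch, h, w = len(scores), len(scores[0]), len(scores[0][0])
    # Step 1: sparsemax across channels at each pixel gives sparse texture maps.
    maps = [[[0.0] * w for _ in range(h)] for _ in range(n_ch)]
    for y in range(h):
        for x in range(w):
            weights = sparsemax([scores[c][y][x] for c in range(n_ch)])
            for c in range(n_ch):
                maps[c][y][x] = weights[c]
    # Step 2: filter each channel with its own kernel and sum the results.
    recon = [[0.0] * w for _ in range(h)]
    for c in range(n_ch):
        filtered = conv2d_same(maps[c], kernels[c])
        for y in range(h):
            for x in range(w):
                recon[y][x] += filtered[y][x]
    return maps, recon

# Hand-written scores for a 2x2 image with two texture channels.
scores = [
    [[3.0, 3.0], [-1.0, -1.0]],  # channel 0 dominates the top row
    [[-1.0, -1.0], [3.0, 3.0]],  # channel 1 dominates the bottom row
]
kernels = [[[0.2]], [[0.9]]]  # 1x1 filters: one gray level per texture
maps, recon = reconstruct(scores, kernels)
print(maps)
print(recon)
```

In training, the reconstruction would be compared with the original image and the error used to update both the network that produces the scores and the filters. That loss is omitted here because the summary does not specify it.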

TUSA is trained without manual labels. The approach draws on contrastive and self-supervised learning, so the model can learn from large numbers of unlabeled B-mode images. The team trained models on a mix of open-source, simulated, and in vivo data. After training, the encoder can be detached to produce texture-focused representations for other tasks, or the texture segmentation itself can be reused or fine-tuned for segmentation tasks.

The authors compared TUSA’s latent-space representations with those of several larger “foundation” models trained on medical images. They report that TUSA generalized better on several downstream tasks. The paper reports detection accuracies of 70% for COVID-related findings, 100% for spinal hematoma, and 97% for vitreous hemorrhage. The representations also correlated with quantitative clinical measures: liver steatosis (r = 0.83), heart ejection fraction (r = 0.63), and blood oxygen saturation (r = 0.38).
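One way to read the correlation figures is to square them. Assuming the reported r values are Pearson coefficients (the summary does not say), r squared is the share of variance in the clinical measure that a straight-line fit to the model's output would account for. The short sketch below does that arithmetic; the helper name is hypothetical.

```python
# Reported correlations, copied from the figures quoted above.
reported_r = {
    "liver steatosis": 0.83,
    "heart ejection fraction": 0.63,
    "blood oxygen saturation": 0.38,
}

def shared_variance(r):
    """Fraction of variance explained by a linear fit, assuming Pearson r."""
    return r * r

for name, r in reported_r.items():
    print(f"{name}: r = {r:.2f}, r^2 = {shared_variance(r):.2f}")
```

On that reading, the liver steatosis result corresponds to roughly 69% of the variance, ejection fraction to about 40%, and oxygen saturation to about 14%, so the three results differ considerably in strength.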

Important caveats apply. The paper excerpt does not give full details about dataset sizes, the exact test sets, or the statistical uncertainty of those numbers, so the reported percentages and correlations need more context before any clinical conclusions can be drawn. The authors also note more general limitations of existing foundation models: many were not designed with ultrasound physics in mind and can fail on images from probes or settings not present in their training data. TUSA aims to reduce that gap, but broader and independent validation will be needed to confirm robustness across scanners, patient populations, and clinical settings.
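To see why test-set size matters for a figure like the reported 100%, the sketch below computes a Wilson score interval for a perfect score at several sample sizes. The sizes are hypothetical, chosen only for illustration, because the excerpt does not report the real ones. The function name is likewise an assumption.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

# Hypothetical test-set sizes; the real ones are not given in the excerpt.
for n in (10, 50, 200):
    lo, hi = wilson_interval(n, n)  # every case classified correctly
    print(f"n = {n}: 95% interval from {lo:.2f} to {hi:.2f}")
```

With only 10 test cases, a perfect score is still compatible with a true accuracy as low as about 72%. With 200 cases the lower bound rises to about 98%. The same headline number can therefore carry very different weight depending on how many cases stand behind it.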