
GDKVM: Echocardiography Video Segmentation via
Spatiotemporal Key-Value Memory with Gated Delta Rule

Rui Wang1, Yimu Sun1, Jingxing Guo1, Huisi Wu1,*, Jing Qin2
1College of Computer Science and Software Engineering, Shenzhen University
2Centre for Smart Health, School of Nursing, The Hong Kong Polytechnic University
2400101058@mails.szu.edu.cn, *Corresponding Author: hswu@szu.edu.cn
Figure 1. An illustration of the GDKVM architecture. Linear Key-Value Association defines frame-to-frame causal relations as the state transition matrix. Gated Delta Rule dynamically manages memory. Key-Pixel Feature Fusion fuses the local and global key features with the pixel feature.

Abstract

Accurate segmentation of cardiac chambers in echocardiography sequences is crucial for the quantitative analysis of cardiac function, aiding in clinical diagnosis and treatment. Imaging noise, artifacts, and the deformation and motion of the heart pose challenges to segmentation algorithms.

While existing methods based on convolutional neural networks, Transformers, and space-time memory networks have improved segmentation accuracy, they often struggle with the trade-off between capturing long-range spatiotemporal dependencies and maintaining computational efficiency with fine-grained feature representation.

In this paper, we introduce GDKVM, a novel architecture for echocardiography video segmentation. The model employs Linear Key-Value Association (LKVA) to effectively model inter-frame correlations, and introduces the Gated Delta Rule (GDR) to efficiently store intermediate memory states. A Key-Pixel Feature Fusion (KPFF) module integrates local and global features at multiple scales, enhancing robustness against boundary blurring and noise interference.

We validated GDKVM on two mainstream echocardiography video datasets (CAMUS and EchoNet-Dynamic) and compared it with various state-of-the-art methods. Experimental results show that GDKVM outperforms existing approaches in terms of segmentation accuracy and robustness, while ensuring real-time performance.
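
As a rough illustration of the memory mechanism named above, the sketch below implements a generic gated delta rule update on a key-value memory matrix, in the form commonly used in the linear-attention literature: S_t = alpha_t * S_{t-1} (I - beta_t k_t k_t^T) + beta_t v_t k_t^T. This is not taken from the GDKVM code. The function names (`gated_delta_step`, `read`), the scalar gates `alpha` and `beta`, and the per-vector (rather than per-pixel) layout are all assumptions made for illustration, and how GDKVM derives its keys, values, and gates from video frames is not specified here.

```python
import numpy as np

def gated_delta_step(S, k, v, alpha, beta):
    """One generic gated delta rule update (illustrative, not the GDKVM code).

    S     : (d_v, d_k) memory matrix mapping keys to values
    k, v  : key (d_k,) and value (d_v,) for the current step
    alpha : decay gate in [0, 1]; 0 erases the old memory, 1 keeps it
    beta  : write strength in [0, 1]
    """
    k = k / (np.linalg.norm(k) + 1e-8)         # unit-norm key keeps the update stable
    S = alpha * S                              # gate: decay the previous memory
    v_old = S @ k                              # value currently stored under this key
    return S + beta * np.outer(v - v_old, k)   # delta rule: move the stored value toward v

def read(S, q):
    """Query the memory with q of shape (d_k,); returns a value of shape (d_v,)."""
    return S @ q

# Hypothetical toy usage: write two associations, then query the first key.
S = np.zeros((2, 3))
S = gated_delta_step(S, np.array([1.0, 0.0, 0.0]), np.array([1.0, 2.0]), alpha=1.0, beta=1.0)
S = gated_delta_step(S, np.array([0.0, 1.0, 0.0]), np.array([3.0, 4.0]), alpha=1.0, beta=1.0)
recalled = read(S, np.array([1.0, 0.0, 0.0]))
```

The memory stays a fixed-size matrix however many steps are written, which is the usual motivation for this family of updates: the cost per step does not grow with sequence length.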


Challenges in Echocardiography Segmentation

Echocardiography segmentation faces several challenges, such as low contrast, speckle noise, and signal dropout.

[Figure 2 panels: (a) Noise, (b) Blur, (c) Shape, (d) Scale, (e) Cycle, (f) Dropout]

Figure 2. Illustrative challenges for echocardiography video segmentation: (a) speckle noise, (b) indistinct or blurred contours, and (c-f) substantial changes in the target's shape and scale throughout the cardiac cycle.

BibTeX

@InProceedings{Wang_ICCV25_GDKVM,
    author    = {Wang, Rui and Sun, Yimu and Guo, Jingxing and Wu, Huisi and Qin, Jing},
    title     = {{GDKVM}: Echocardiography Video Segmentation via Spatiotemporal Key-Value Memory with Gated Delta Rule},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {12191-12200}
}

Additional Resources

To-Do List

  • Problem Formulation Flowchart
  • Model Processing Flowchart