GDKVM: Echocardiography Video Segmentation via Spatiotemporal Key-Value Memory with Gated Delta Rule
If a virtual environment is currently active in your shell (the prompt shows (base) or (env) on the left), deactivate it first.
For Conda environments:
conda deactivate
For a standard venv:
deactivate
Note: uv environments are highly isolated. In most cases, creating a uv virtual environment without first deactivating the Conda environment does not cause dependency conflicts.
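To see which case applies before deactivating, the following sketch inspects the variables that activation scripts usually export. It assumes Conda sets CONDA_DEFAULT_ENV and venv sets VIRTUAL_ENV; the function name active_env_kind is hypothetical and not part of this repository.

```shell
# Report which kind of environment is active.
# Assumption: Conda exports CONDA_DEFAULT_ENV and venv exports VIRTUAL_ENV.
# A venv is checked first because it is usually activated on top of Conda.
active_env_kind() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "venv"
  elif [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
    echo "conda"
  else
    echo "none"
  fi
}

case "$(active_env_kind)" in
  venv)  echo "venv '${VIRTUAL_ENV}' is active: run 'deactivate'" ;;
  conda) echo "Conda environment '${CONDA_DEFAULT_ENV}' is active: run 'conda deactivate'" ;;
  none)  echo "No virtual environment is active" ;;
esac
```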
This project uses uv for dependency management. uv is written in Rust and resolves dependencies efficiently.
If the server cannot reach the external internet directly, install uv with pip:
pip install uv
Or update to the latest version:
pip install --upgrade uv
Verify the installation:
uv -V
Baseline version: 0.9.10 (the latest release at the time of writing).
Clone the repository and specify the local directory name.
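A minimal sketch for comparing the installed uv against that baseline. The helper version_ge is hypothetical and relies on `sort -V` from GNU coreutils. The README does not state that older versions fail, so treat the result as informational.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

baseline="0.9.10"
if command -v uv >/dev/null 2>&1; then
  # `uv -V` prints a line such as "uv 0.9.10"; the second field is the version.
  installed="$(uv -V | awk '{print $2}')"
  if version_ge "$installed" "$baseline"; then
    echo "uv $installed meets the baseline $baseline"
  else
    echo "uv $installed is older than $baseline: consider 'pip install --upgrade uv'"
  fi
else
  echo "uv not found: run 'pip install uv'"
fi
```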
git clone https://github.com/wangrui2025/GDKVM.git gdkvm_20251215
Enter the project directory:
cd gdkvm_20251215
Project structure overview:
.
├── .python-version # Specifies the Python version
├── pyproject.toml # Main project configuration file
├── uv.lock # Lock file (Ensures consistency)
└── ... # Other source code files
uv reads the configuration files, creates a virtual environment, and installs all dependencies.
For a newer environment, you can use the env_02 configuration; see env/env_02/pyproject.toml.
(
  printf "\n==========================================\n🔍 1. GPU Drivers (Hardware Foundation)\n"
  nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | awk '{print "Driver: " $1}' || echo "Driver: Unknown"
  nvidia-smi 2>/dev/null | grep "CUDA Version" | awk '{print "Max CUDA: " $9}' || echo "Max CUDA: Unknown"
  printf "\n==========================================\n🐧 2. OS & GLIBC\n"
  [ -f /etc/os-release ] && . /etc/os-release && echo "OS: ${PRETTY_NAME}"
  ldd --version | head -n 1
  printf "\n==========================================\n🏗️ 3. Compiler (JIT Critical)\n"
  printf "GCC: "
  gcc --version 2>/dev/null | head -n 1 || echo "Not found"
  printf "\n==========================================\n🛠️ 4. CUDA Toolkit\n"
  if command -v nvcc >/dev/null 2>&1; then
    nvcc -V | grep release
  else
    echo "⚠️ nvcc not found"
  fi
  printf -- "------------------------------------------\nCUDA Physical Directories (/usr/local):\n"
  ls -l /usr/local 2>/dev/null | grep cuda
  printf -- "------------------------------------------\nLD_LIBRARY_PATH (Runtime Libs):\n${LD_LIBRARY_PATH:-⚠️ Not set}\n"
  printf "\n==========================================\n🐍 5. Python Environment\n"
  printf "Python: "
  python3 --version 2>&1 || echo "Not found"
  printf "Path: "
  command -v python3 || echo "Not found"
  printf "==========================================\n"
)
env_01 (stable environment with better compatibility)
env_02 (newer environment)
uv sync
Scenario: If uv sync fails (e.g., with SSL errors or connection timeouts), the usual cause is a network policy that prevents uv from automatically downloading the Python interpreter.
Solution: Manually install Python via a mirror source.
Check available versions:
uv python list
Install via the mirror (e.g., 3.12.12):
uv python install 3.12.12 --mirror https://github-proxy.lixxing.top/https://github.com/astral-sh/python-build-standalone/releases/download
After the installation succeeds, run uv sync again.
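As an alternative to passing --mirror on every install, uv documents an environment variable for the same purpose. The sketch below assumes your uv release honors UV_PYTHON_INSTALL_MIRROR and reuses the mirror URL from the command above; confirm with `uv python install --help` and substitute a mirror you trust.

```shell
# Assumption: the installed uv release honors UV_PYTHON_INSTALL_MIRROR
# (confirm with `uv python install --help`).
# The URL is the mirror used in the command above; substitute one you trust.
export UV_PYTHON_INSTALL_MIRROR="https://github-proxy.lixxing.top/https://github.com/astral-sh/python-build-standalone/releases/download"

# With the variable exported, retry the sync from the project root:
# uv sync
```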
<2.12
Issue: The strict schema validation in Pydantic 2.12+ (released 2025-10) conflicts with wandb's field declarations and can cause crashes in multi-process mode.
Solution: The configuration file pins pydantic<2.12.
Issue: wandb>=0.22.2 no longer provides pre-built packages for Ubuntu 18.04 (glibc 2.27), which causes installation failures.
Solution: Make sure a Go compiler is installed so that wandb can be built from source, or upgrade the operating system.
Issue: uv add/sync fails with a Certificate Expired error because the self-signed certificates used by an intranet firewall or proxy are not trusted by uv's default Rust TLS stack.
Solution: Switch to system-native TLS validation.
uv add wandb --native-tls
export UV_NATIVE_TLS=1
* Consider adding this export to ~/.bashrc or ~/.zshrc so it persists across sessions.
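One way to add the export without duplicating it on repeated runs is sketched below. The helper append_once is hypothetical, and the usage line is left commented out so nothing is written until you choose the rc file for your shell.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
  line="$1"
  file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical usage; use ~/.zshrc instead if your shell is zsh:
# append_once 'export UV_NATIVE_TLS=1' "$HOME/.bashrc"
```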
source .venv/bin/activate
Note: Activating the environment explicitly is recommended so that you can debug interactively (e.g., in the Python REPL) and check package status with pip.
Verify the environment (the output should be a path inside the project's .venv directory):
which python
Example output: /data/Anon/Repo/gdkvm_20251215/.venv/bin/python
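The same check can be scripted. This sketch assumes it is run from the project root after activation; the helper is_project_python is hypothetical.

```shell
# Succeed when the interpreter path $1 lives under <project dir $2>/.venv/.
is_project_python() {
  case "$1" in
    "$2"/.venv/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Run from the project root after `source .venv/bin/activate`.
if is_project_python "$(command -v python)" "$(pwd)"; then
  echo "OK: python resolves inside the project .venv"
else
  echo "Warning: python resolves to '$(command -v python)', outside the project .venv"
fi
```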
We use the CAMUS and EchoNet-Dynamic datasets.
1. Configuration
Select the train.sh script that matches your shell (zsh or bash), and set the following environment variables to suit your hardware:
CUDA_VISIBLE_DEVICES: 0,1 # GPU device IDs to use
MASTER_PORT: 29500 # Master port for distributed training; change it to avoid conflicts
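The contents of train.sh are not shown here, so the following is only a sketch of how these two values might be set and sanity-checked in a shell script. The helper count_visible_gpus is hypothetical; it counts the device IDs so the number of processes can be matched to the number of GPUs.

```shell
# Count the comma-separated device IDs in a CUDA_VISIBLE_DEVICES-style value.
count_visible_gpus() {
  if [ -z "$1" ]; then
    echo 0
  else
    printf '%s\n' "$1" | tr ',' '\n' | grep -c .
  fi
}

# Values taken from the example above; adjust them to your hardware.
export CUDA_VISIBLE_DEVICES="0,1"
export MASTER_PORT=29500
echo "Using $(count_visible_gpus "$CUDA_VISIBLE_DEVICES") GPU(s), master port $MASTER_PORT"
```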
2. Hyperparameters
Edit config/config_gdkvm_01.yaml to adjust the key hyperparameters for your experiment:
data_path: /data/Anon/dataset/camus_png256x256_10f_20250709/ # Actual location of the dataset
batch_size: 8 # Number of samples per training step
learning_rate: 1.0e-4 # Learning rate
num_iterations: 3000 # Total number of iterations
eval_stage:
  num_vis: 0 # Number of images to visualize
wandb_mode: "offline" # Set to "offline"
3. Execute Training
Grant execution permission:
chmod +x ./train.sh
Start training:
./train.sh
Training artifacts (model weights, visualizations, etc.) are saved to the directory specified by hydra.run.dir in train.sh.
gdkvm_20251215/outputs
Experiments use Weights & Biases (WandB) for offline logging.
Logs are stored under gdkvm_20251215/wandb, in subfolders named with a timestamp and hash (e.g., offline-run-20251215_123456-abcdef1gh).
Upload Offline Logs
After training, sync the offline data to the WandB cloud with:
wandb sync gdkvm_20251215/wandb/offline-run-20251215_123456-abcdef1gh
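When several offline runs have accumulated, they can be synced in a loop. This sketch assumes the runs sit directly under gdkvm_20251215/wandb as in the example path; the helper list_offline_runs is hypothetical.

```shell
# Print every offline run directory under the given wandb directory, one per line.
list_offline_runs() {
  for run_dir in "$1"/offline-run-*; do
    [ -d "$run_dir" ] && printf '%s\n' "$run_dir"
  done
  return 0
}

# Sync each run in turn; the loop does nothing if no run directories exist.
list_offline_runs gdkvm_20251215/wandb | while IFS= read -r run_dir; do
  wandb sync "$run_dir"
done
```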