Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model
Paper: arXiv:2505.07449
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline
# Switch to "mps" on Apple devices
pipe = DiffusionPipeline.from_pretrained("General-Medical-AI/Ophora", dtype=torch.bfloat16, device_map="cuda")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
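The snippet above hard-codes "cuda" and notes that "mps" should be used on Apple devices. A minimal sketch of choosing that string automatically is shown below. The helper names `pick_device` and `detect_device` are hypothetical and not part of Ophora or diffusers, and the "cpu" fallback is an assumption that may be too slow for video generation or may not be accepted by every loading option.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device string, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"  # assumption: fallback only, likely impractical for this model


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to "cpu" without torch."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in older PyTorch builds, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

The returned string could then be passed wherever the snippet above passes "cuda".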
We present Ophora, a pioneering model that can generate ophthalmic surgical videos following natural language instructions.
@article{li2025ophora,
title={Ophora: A large-scale data-driven text-guided ophthalmic surgical video generation model},
author={Li, Wei and Hu, Ming and Wang, Guoan and Liu, Lihao and Zhou, Kaijin and Ning, Junzhi and Guo, Xin and Ge, Zongyuan and Gu, Lixu and He, Junjun},
journal={arXiv preprint arXiv:2505.07449},
year={2025}
}
# Gated model: log in with an HF token that has gated access permission
hf auth login
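As an alternative to the interactive login, for example in a script or CI job, the token can be read from the environment. The sketch below assumes the token is stored in the `HF_TOKEN` environment variable. `read_hf_token` is a hypothetical helper, not part of any library. The commented usage relies on the `token` argument of `from_pretrained`.

```python
import os
from typing import Mapping, Optional


def read_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the token stored in HF_TOKEN, or None if it is unset or blank."""
    token = env.get("HF_TOKEN", "").strip()
    return token or None


# Usage sketch (not executed here, since it downloads the gated weights):
#
#   import torch
#   from diffusers import DiffusionPipeline
#
#   token = read_hf_token()
#   if token is None:
#       raise RuntimeError("Set HF_TOKEN or run `hf auth login` first")
#   pipe = DiffusionPipeline.from_pretrained(
#       "General-Medical-AI/Ophora", token=token
#   )
```

Access to the gated repository must still be granted to the account that owns the token.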