Pretraining Frame Preservation in Autoregressive Video Memory Compression Paper • 2512.23851 • Published 9 days ago • 21
IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning Paper • 2512.15635 • Published 21 days ago • 19
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published 23 days ago • 72
Article Qwen-Image-i2L: Training Strategies for Image-to-LoRA Generation 22 days ago • 44
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published Dec 1, 2025 • 71
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published Nov 19, 2025 • 227
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives Paper • 2510.20822 • Published Oct 23, 2025 • 40
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset Paper • 2510.15742 • Published Oct 17, 2025 • 50
∇NABLA: Neighborhood Adaptive Block-Level Attention Paper • 2507.13546 • Published Jul 17, 2025 • 124
Alchemist Collection 📊 Dataset and 🏆 checkpoints for paper 📝 "Alchemist: Turning Public Text-to-Image Data into Generative Gold" • 8 items • Updated Oct 16, 2025 • 17
Wan: Open and Advanced Large-Scale Video Generative Models Paper • 2503.20314 • Published Mar 26, 2025 • 56
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 63
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis Paper • 2412.01819 • Published Dec 2, 2024 • 34
CogVLM2: Visual Language Models for Image and Video Understanding Paper • 2408.16500 • Published Aug 29, 2024 • 57
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29, 2024 • 50