Shengqiong Wu

ChocoWu

https://chocowu.github.io/

AI & ML interests

Large Language Model, Multimodal learning, Scene graph Generation

Recent Activity

upvoted a paper about 1 month ago

SemanticGen: Video Generation in Semantic Space

upvoted a paper about 1 month ago

Kling-Omni Technical Report

upvoted a paper about 2 months ago

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Organizations

upvoted 2 papers about 1 month ago

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published Dec 23, 2025 • 92

Kling-Omni Technical Report

Paper • 2512.16776 • Published Dec 18, 2025 • 168

upvoted a paper about 2 months ago

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Paper • 2512.11749 • Published Dec 12, 2025 • 39

upvoted a paper 2 months ago

UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist

Paper • 2511.08521 • Published Nov 11, 2025 • 38

upvoted 2 papers 3 months ago

Latent Diffusion Model without Variational Autoencoder

Paper • 2510.15301 • Published Oct 17, 2025 • 49

Less is More: Improving LLM Reasoning with Minimal Test-Time Intervention

Paper • 2510.13940 • Published Oct 15, 2025 • 7

upvoted 3 papers 4 months ago

AdaViewPlanner: Adapting Video Diffusion Models for Viewpoint Planning in 4D Scenes

Paper • 2510.10670 • Published Oct 12, 2025 • 19

VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning

Paper • 2510.08555 • Published Oct 9, 2025 • 63

UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution

Paper • 2510.08143 • Published Oct 9, 2025 • 20

updated 2 datasets 6 months ago

General-Level/General-Bench-Closeset

Updated Aug 4, 2025 • 11.3k • 2

General-Level/General-Bench-Openset

Updated Aug 4, 2025 • 765 • 4

updated a dataset 7 months ago

General-Level/General-Bench-Closeset-Scoped

Updated Jul 15, 2025 • 504 • 1

authored a paper 9 months ago

On Path to Multimodal Generalist: General-Level and General-Bench

Paper • 2505.04620 • Published May 7, 2025 • 82

upvoted 2 papers 9 months ago

3D Scene Generation: A Survey

Paper • 2505.05474 • Published May 8, 2025 • 21

On Path to Multimodal Generalist: General-Level and General-Bench

Paper • 2505.04620 • Published May 7, 2025 • 82

authored a paper 10 months ago

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Paper • 2504.13122 • Published Apr 17, 2025 • 20

upvoted a paper 10 months ago

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Paper • 2504.13122 • Published Apr 17, 2025 • 20

authored 2 papers 10 months ago

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation

Paper • 2308.05095 • Published Aug 9, 2023

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Paper • 2503.23377 • Published Mar 30, 2025 • 57

upvoted a paper 10 months ago

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published Mar 31, 2025 • 301
