A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models Paper • 2401.00757 • Published Jan 1, 2024
BiasAsker: Measuring the Bias in Conversational AI System Paper • 2305.12434 • Published May 21, 2023
New Job, New Gender? Measuring the Social Bias in Image Generation Models Paper • 2401.00763 • Published Jan 1, 2024
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents Paper • 2507.22827 • Published Jul 30, 2025 • 99
DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation Paper • 2506.06251 • Published Jun 6, 2025 • 2
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published Aug 1, 2025 • 93
Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans? Paper • 2512.13281 • Published 28 days ago • 63