A-Mahla (Amir Mahla)

upvoted an article 20 days ago

Article

cua-bench: A Framework for Benchmarking, Training Data, and RL Environments for Computer-Use Agents

20 days ago • 10

upvoted a paper 2 months ago

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Paper • 2510.24702 • Published Oct 28, 2025 • 28

upvoted an article 2 months ago

Article

Building the Open Agent Ecosystem Together: Introducing OpenEnv

Oct 23, 2025 • 139

upvoted 4 papers 3 months ago

upvoted 3 articles 3 months ago

Article

Gaia2 and ARE: Empowering the community to study agents

Sep 22, 2025 • 125

Article

PrediBench: Testing AI models on prediction markets

Sep 24, 2025 • 5

Article

Smol2Operator: Post-Training GUI Agents for Computer Use

Sep 23, 2025 • 134

upvoted a paper 4 months ago

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2, 2025 • 124

upvoted an article 4 months ago

Article

Exploring Environments Hub: Your Language Model needs better (open) environments to learn

Sep 4, 2025 • 29

upvoted an article 6 months ago

Article

ScreenEnv: Deploy your full stack Desktop Agent

Jul 10, 2025 • 74

upvoted a paper 6 months ago

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

Paper • 2506.21506 • Published Jun 26, 2025 • 51

upvoted an article 6 months ago

Article

🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation

Jun 21, 2025 • 74

upvoted 2 articles 7 months ago

Article

ScreenSuite - The most comprehensive evaluation suite for GUI Agents!

Jun 6, 2025 • 55

Article

Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H

Jun 3, 2025 • 71

upvoted an article 8 months ago

Article

TinyAgents: A Minimal Experiment with Code Agents and MCP Tools

May 16, 2025 • 30

Papers upvoted 3 months ago:

FineVision: Open Data Is All You Need

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

UI-Venus Technical Report: Building High-performance UI Agents with RFT

Robot Learning: A Tutorial
