PPO Agent for CarV1 Self-Driving Car

This is a trained PPO agent for the CarV1 line-following environment.

Training Details

  • Algorithm: PPO (Proximal Policy Optimization)
  • Framework: Stable-Baselines3
  • Training Timesteps: 99,840
  • Mean Reward: 807.10 ± 0.00
  • Training Date: 2026-01-04
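
For reference, the snippet below sketches how a comparable checkpoint could be produced with Stable-Baselines3. The environment ID, hyperparameters, and evaluation settings are assumptions; this card does not document the exact configuration used for the run.

import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import EvalCallback

# Hypothetical: assumes the CarV1 environment is registered under this ID
env = gym.make("CarV1-v0")
eval_env = gym.make("CarV1-v0")

# Default PPO hyperparameters; the actual run's settings are not documented here
model = PPO("MlpPolicy", env, verbose=1)

# Periodically evaluate and keep the best checkpoint, which is how a
# file named best_model.zip is typically produced by SB3
eval_callback = EvalCallback(
    eval_env,
    best_model_save_path="./checkpoints/",
    eval_freq=10_000,
    deterministic=True,
)

model.learn(total_timesteps=99_840, callback=eval_callback)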

Usage

from stable_baselines3 import PPO
from huggingface_hub import hf_hub_download

# Download the model checkpoint from the Hub
model_path = hf_hub_download(
    repo_id="katharsis/carv1-ppo",
    filename="best_model.zip"
)

# Load the trained policy
model = PPO.load(model_path)

# Use for inference; `env` is assumed to be an instance of the CarV1 environment.
# Gymnasium-style environments return an (obs, info) tuple from reset().
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
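
To evaluate the policy over a full episode, step it through the environment in a standard loop. A minimal sketch, again assuming `env` is a CarV1 instance that follows the Gymnasium step API:

# Roll out one episode with the loaded policy
obs, _ = env.reset()
done = False
total_reward = 0.0
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    total_reward += reward
    done = terminated or truncated
print(f"Episode reward: {total_reward:.2f}")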

Environment

The CarV1 environment simulates a line-following robot whose observations are low-dimensional features derived from its camera:

  • Observation Space: [left_offset, right_offset, heading, speed]
  • Action Space: Continuous steering and throttle
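
In Gymnasium terms, these spaces would plausibly be declared as follows. The bounds below are illustrative assumptions; the actual CarV1 limits are not documented in this card.

import numpy as np
from gymnasium import spaces

# Hypothetical bounds: left_offset, right_offset, heading, speed
observation_space = spaces.Box(
    low=np.array([-1.0, -1.0, -np.pi, 0.0], dtype=np.float32),
    high=np.array([1.0, 1.0, np.pi, 1.0], dtype=np.float32),
)

# Hypothetical bounds: steering in [-1, 1], throttle in [0, 1]
action_space = spaces.Box(
    low=np.array([-1.0, 0.0], dtype=np.float32),
    high=np.array([1.0, 1.0], dtype=np.float32),
)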

Training Metrics

Check the associated WandB run for detailed training curves and metrics.
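
To pull the raw metrics programmatically, the WandB public API can export a run's history. A minimal sketch; the run path is a placeholder, since the actual entity/project/run ID is not listed here:

import wandb

api = wandb.Api()
# Placeholder run path; substitute the actual entity/project/run_id
run = api.run("<entity>/<project>/<run_id>")

# SB3 typically logs mean episode reward under "rollout/ep_rew_mean"
history = run.history(keys=["rollout/ep_rew_mean"])
print(history.tail())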
