# PPO Agent for CarV1 Self-Driving Car
This is a trained PPO agent for the CarV1 line-following environment.
## Training Details
- Algorithm: PPO (Proximal Policy Optimization)
- Framework: Stable-Baselines3
- Training Timesteps: 99,840
- Mean Reward: 807.10 ± 0.00 (self-reported)
- Training Date: 2026-01-04
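A training run matching these details might look like the following sketch. The `CarV1Env` import, the callback configuration, and all hyperparameters are assumptions for illustration, not the recorded training setup. The `best_model.zip` filename hosted in this repo matches the default output of SB3's `EvalCallback`, which suggests (but does not confirm) that checkpointing worked this way.

```python
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import EvalCallback

# Assumption: CarV1Env is the project's custom environment class;
# it is not published alongside this model card.
from carv1 import CarV1Env

train_env = CarV1Env()
eval_env = CarV1Env()

# EvalCallback periodically evaluates the policy and saves the
# best-scoring checkpoint as best_model.zip.
eval_callback = EvalCallback(
    eval_env,
    best_model_save_path="./checkpoints",
    eval_freq=10_000,     # assumed evaluation interval
    n_eval_episodes=10,   # assumed episode count per evaluation
    deterministic=True,
)

# Default SB3 PPO hyperparameters; the actual run's settings are unknown.
model = PPO("MlpPolicy", train_env, verbose=1)
model.learn(total_timesteps=100_000, callback=eval_callback)
```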
## Usage
```python
from stable_baselines3 import PPO
from huggingface_hub import hf_hub_download

# Download the model checkpoint from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="katharsis/carv1-ppo",
    filename="best_model.zip",
)

# Load the trained policy
model = PPO.load(model_path)

# Run inference (assumes `env` is an instance of the CarV1 environment;
# the Gymnasium API returns (obs, info) from reset)
obs, info = env.reset()
action, _states = model.predict(obs, deterministic=True)
```
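The snippet above runs a single prediction step. A full episode rollout might look like this sketch, which again assumes `env` is a CarV1 instance following the Gymnasium API (older Gym-style environments unpack `reset()` and `step()` differently):

```python
# Roll out one episode with the loaded policy.
# Gymnasium API: reset -> (obs, info),
# step -> (obs, reward, terminated, truncated, info).
obs, info = env.reset()
episode_reward = 0.0
done = False
while not done:
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    episode_reward += reward
    done = terminated or truncated
print(f"Episode reward: {episode_reward:.2f}")
```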
## Environment
The CarV1 environment simulates a line-following robot with camera-based observations; a hypothetical skeleton of its interface follows the list below.
- Observation Space: 4-dimensional vector `[left_offset, right_offset, heading, speed]`
- Action Space: continuous steering and throttle commands
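The environment code itself is not bundled with this model, so the skeleton below is purely illustrative: the class name, bounds, and dtypes are assumptions that merely match the documented spaces.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class CarV1Env(gym.Env):
    """Hypothetical skeleton matching the documented spaces."""

    def __init__(self):
        super().__init__()
        # [left_offset, right_offset, heading, speed]; bounds are assumed.
        self.observation_space = spaces.Box(
            low=np.array([-1.0, -1.0, -np.pi, 0.0], dtype=np.float32),
            high=np.array([1.0, 1.0, np.pi, 1.0], dtype=np.float32),
        )
        # [steering, throttle], both continuous; ranges are assumed.
        self.action_space = spaces.Box(
            low=np.array([-1.0, 0.0], dtype=np.float32),
            high=np.array([1.0, 1.0], dtype=np.float32),
        )
```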
## Training Metrics
Check the associated WandB run for detailed training curves and metrics.
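To reproduce the reward figure locally, SB3's `evaluate_policy` helper computes the mean and standard deviation over evaluation episodes; constructing `env` is left to the reader since the environment is not published here.

```python
from stable_baselines3.common.evaluation import evaluate_policy

# Assumes `model` is the loaded PPO policy and `env` is a CarV1 instance.
mean_reward, std_reward = evaluate_policy(
    model, env, n_eval_episodes=10, deterministic=True
)
print(f"Mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")
```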