D-RLAIF
openai/summarize_from_feedback • Updated Jan 3, 2023 • 194k • 2.73k • 216
trl-internal-testing/tldr-preference-sft-trl-style • Updated Aug 20, 2024 • 130k • 454 • 3