      Better, Cheaper, Faster LLM Alignment With KTO

Chief Technology Officer and Co-Founder, Contextual AI
Alignment with human feedback is a crucial aspect of large language model deployments. The dominant alignment approaches, reinforcement learning from human feedback (RLHF) and direct preference optimization (DPO), have a major downside: they require paired preference data, making data annotation expensive and slow. In the real world, unpaired data is far more abundant. Can we speed up the feedback loop by removing the requirement for paired data? As I'll explain, we can do exactly that, via a new alignment method called Kahneman-Tversky Optimization (KTO).
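To make the unpaired-data idea concrete, here is a minimal, illustrative sketch of a KTO-style objective in plain Python. It assumes per-example binary labels (thumbs-up/thumbs-down) rather than preference pairs, and uses the batch-mean reward as a stand-in for the KL reference point; the function and variable names are this sketch's own, not Contextual AI's implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(policy_logps, ref_logps, is_desirable,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Simplified KTO-style loss over unpaired examples.

    policy_logps, ref_logps: per-completion log-probabilities under
    the policy being trained and a frozen reference model.
    is_desirable: per-example booleans -- True for thumbs-up data.
    """
    # Implicit reward: log-ratio of policy to reference model.
    rewards = [p - r for p, r in zip(policy_logps, ref_logps)]
    # Reference point z0 (here: batch-mean reward, standing in for
    # the KL reference term of the full method).
    z0 = sum(rewards) / len(rewards)
    # Kahneman-Tversky value function: desirable examples gain value
    # above the reference point, undesirable ones below it.
    values = [
        lambda_d * sigmoid(beta * (rw - z0)) if good
        else lambda_u * sigmoid(beta * (z0 - rw))
        for rw, good in zip(rewards, is_desirable)
    ]
    # Penalize each example's shortfall from its maximum value.
    return sum(1.0 - v for v in values) / len(values)
```

Note that each example contributes to the loss on its own, so thumbs-up and thumbs-down signals can be collected independently; no annotator ever has to rank two completions against each other.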
Event: GTC 24
Date: March 2024
Industry: All Industries
NVIDIA technology: CUDA, Ethernet Networking, NCCL
Level: Intermediate Technical
Topic: Text Generation
Language: English
Location: