InstructGPT vs GPT-3: The battle for better language models

OpenAI's new InstructGPT models are much better at following user intentions, and generate less toxic output, than the popular GPT-3 models. Trained with reinforcement learning from human feedback (RLHF), the InstructGPT models are now the default language models on the OpenAI API.

Human annotators provide demonstrations of the desired model behavior and rank several outputs from the models. Fine-tuning on this feedback yields InstructGPT models that are safer, more helpful, and more aligned with their users. The result shows that fine-tuning language models with humans in the loop is a powerful tool for improving their safety and reliability without compromising capabilities.
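To make the ranking step concrete, here is a minimal sketch of one common way ranked outputs can be turned into a training signal: each ranking is expanded into (preferred, rejected) pairs, and a reward model is penalized with a pairwise logistic loss when it scores the rejected output higher. The summary above does not spell out this procedure, so treat it as an assumption; the function names, the example outputs, and the reward scores are all hypothetical.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_outputs):
    """Expand a best-to-worst ranking into (preferred, rejected) pairs."""
    # combinations() keeps input order, so the first item of each pair
    # is always the one the annotator ranked higher.
    return list(combinations(ranked_outputs, 2))


def pairwise_loss(reward_preferred, reward_rejected):
    """Pairwise logistic loss: -log(sigmoid(r_preferred - r_rejected))."""
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)).
    return math.log1p(math.exp(-margin))


# Hypothetical annotator ranking (best first) and toy reward-model scores.
ranked = ["answer_a", "answer_b", "answer_c"]
toy_rewards = {"answer_a": 2.0, "answer_b": 0.5, "answer_c": -1.0}

pairs = ranking_to_pairs(ranked)
mean_loss = sum(
    pairwise_loss(toy_rewards[win], toy_rewards[lose]) for win, lose in pairs
) / len(pairs)
print(pairs)
print(mean_loss)
```

The loss shrinks as the reward model scores preferred outputs further above rejected ones, which is what lets a ranking from annotators act as a learning signal. In an RLHF setup, a reward model trained this way would then score new outputs during the reinforcement learning stage.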

