Improving instruction hierarchy in frontier LLMs

Quick Summary

"IH-Challenge trains models to prioritize trusted instructions, improving instruction hierarchy, safety steerability, and resistance to prompt injection attacks."

This article was originally published by OpenAI News, where the full, in-depth story is available.