Prizes
Prometheus ELK Competition
For a proposal to identify direct translators by penalizing large changes in output in response to changes in data quality.
Inverse Scaling Prize
This task demonstrates the failure of language models to follow an instruction when a popular continuation conflicts with it. Larger models are hurt more by this, because the larger the model, the more familiar it is with common expressions and quotes.
This task demonstrates that larger LMs are more susceptible to a form of prompt injection attack, where a user’s input to a prompted LM inserts new instructions for the LM to follow. Such attacks let a user override in-context instructions given by the LM’s deployers, e.g. to overcome safety-related instructions the deployers provided.