Science Cast

Pseudointelligence: A Unifying Framework for Language Model Evaluation

Shikhar Murty | October 20, 2023 9:10am

Connected to paper. This paper is a preprint and has not been certified by peer review.

Pseudointelligence: A Unifying Framework for Language Model Evaluation

arXiv | PDF | October 18, 2023 12:00am

Authors

Shikhar Murty, Orr Paradise, Pratyusha Sharma

Abstract

With large language models surpassing human performance on an increasing number of benchmarks, we must take a principled approach to targeted evaluation of model capabilities. Inspired by pseudorandomness, we propose pseudointelligence, which captures the maxim that "(perceived) intelligence lies in the eye of the beholder". That is, claims of intelligence are meaningful only when their evaluator is taken into account. Concretely, we propose a complexity-theoretic framework of model evaluation cast as a dynamic interaction between a model and a learned evaluator. We demonstrate that this framework can be used to reason about two case studies in language model evaluation, as well as to analyze existing evaluation methods.
