[GPT-4] Technical Report

user

What are GPT-4's strengths and weaknesses?

AI: GPT-4 is less capable than humans in many real-world scenarios, but exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. Its strengths include accepting both image and text inputs and producing text outputs. A core component of its development was infrastructure and optimization methods that behave predictably across a wide range of scales. The abstract does not mention specific weaknesses beyond its shortfall relative to humans in many real-world scenarios.
user

What are GPT-4's limitations?

AI: GPT-4 is less capable than humans in many real-world scenarios.
user

Summarize this paper

AI: The paper describes the development of GPT-4, a large-scale, multimodal model that accepts both image and text inputs and produces text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document, and its post-training alignment process improves performance on measures of factuality and adherence to desired behavior. The development process involved creating infrastructure and optimization methods that behave predictably across a wide range of scales, which allowed some aspects of GPT-4's performance to be accurately predicted from models trained with far less compute.

Authors

OpenAI

Abstract

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.

