On March 14, 2023, OpenAI—a MoFo client—released GPT-4, which quickly garnered broad media coverage. For those assessing the opportunities and risks related to GPT-4, it is useful to consider the extent of the stated technical and safety improvements and the limitations of the release.

GPT-4 is the newest version of OpenAI’s Generative Pre-trained Transformer model. Like previous versions of GPT, GPT-4 is a transformer-based large language model that is pre-trained using both publicly available data (such as internet data) and third-party licensed data to generate text output based on input prompts. GPT is the foundation model behind ChatGPT (the well-known model based on GPT that OpenAI fine-tuned for a chatbot experience).

GPT-4 Improvements

Open AI’s GPT-4 Technical Report states that GPT-4 demonstrates substantial improvements in performance and capabilities from GPT-3.5, including the ability to:

  • Process up to 25,000 words in combined input and output text (around eight times the amount of text compared to GPT-3.5), which allows for long-form content creation, extended conversations, and document search and analysis;
  • Perform more complex tasks, including the forthcoming ability to process images as inputs; and
  • Provide outputs containing more accurate responses and fewer confident sounding, but inaccurate, responses (i.e., “hallucinations”).

Regarding hallucinations, GPT-4 scored 19% higher than GPT-3.5 in an OpenAI internal, adversarially-designed factuality evaluation.[1] GPT-4, with fine-tuning, also showed improvements over GPT-3.5 based on publicly available benchmarks such as TruthfulQA.[2]

To demonstrate GPT-4’s ability to accomplish more complex tasks, OpenAI tested GPT-4 and GPT-3.5 on a variety of standardized exams designed for humans (e.g., the uniform bar exam, AP tests, SATs, and GREs). Notably, GPT-4 scored in the 90th percentile of exam takers in the uniform bar exam (MBE, MEE, and MPT), whereas GPT-3.5 scored in the 10th percentile.[3]

GPT-4 also has the ability to process both images and text as inputs to the model, whereas previous versions of GPT could only process text. However, GPT-4’s output responses are still limited to text only.

During the March 14, 2023 developer demo live stream, Open AI provided a demonstration of how GPT-4 is capable of processing images into code.[4] During the demo, Greg Brockman, President and Co-Founder of OpenAI, took a picture of a drawing of a website, uploaded the image as input to GPT-4 as part of a prompt, and GPT-4 generated HTML code that could be used to create an actual functioning website based on the drawing.

Another example of an application of GPT-4’s capabilities to process images is the collaboration between OpenAI and Be My Eyes, an organization creating technology for the visually impaired community. Be My Eyes has leveraged GPT-4’s ability to process images to create a “Virtual Volunteer” application with image recognition capabilities. The Virtual Volunteer application takes in input from a smartphone camera, identifies what is in the image, and reads out what is identified to the user of the application.[5]

GPT-4 Safety Improvements

As part of the process of fine-tuning GPT-4—as with prior versions of GPT—OpenAI used reinforcement learning with human feedback (RLHF) and rule-based reward models (RBRMs) in order to reduce the likelihood that GPT-4 would generate harmful content.[6] In the GPT-4 Technical Report, OpenAI states that it has further improved its application and use of these training techniques to increase the likelihood of desired behaviors from, and reduce incidents of, undesired behavior.[7]

To understand and reduce the risk from generating harmful content, OpenAI collaborated with 50 experts from various domains such as AI alignment risks, cybersecurity, bio-risk, and international security to engage in adversarial testing—feeding malicious inputs into a model and observing its responses for the purpose of identifying the model’s potential weaknesses and vulnerabilities.[8] OpenAI used these expert recommendations to improve the GPT-4 model’s output.

OpenAI states in its GPT-4 Technical Report that the safety improvements it has applied to GPT-4 has decreased the model’s tendency to respond to requests for prohibited content by 82% compared to GPT-3.5.[9] GPT-4 also responds to sensitive requests (e.g., medical advice or the possibility of self-harm) in accordance with OpenAI’s policies 29% more often than GPT-3.5.[10] When OpenAI tested GPT-4 and GPT-3.5 on the RealToxicityPrompts[11] dataset to evaluate the frequency of these models generating harmful output, the test showed that 0.73% of the time GPT-4 outputs a “toxic generation” as opposed to 6.48% of the time for GPT-3.5.[12]

GPT-4 Limitations

OpenAI has been transparent in flagging that GPT-4 is subject to many of the same limitations that are present in prior GPT models, including that the model does not always produce reliable output (e.g., biased output and “hallucinations”), is limited in its ability to “learn” from experience, and lacks information about of events occurring after September 2021,[13] the cutoff date for the vast majority of its pre-training data.[14]

Like previous versions of GPT, OpenAI noted in the GPT-4 Technical Report that GPT-4 remains vulnerable to “jailbreaks.” For example, users may be able to input adversarial prompts that succeed in eliciting output that OpenAI may have intended to be excluded from what GPT-4 displays to a user.[15] There have been previous reports of ChatGPT users discovering how to write jailbreaking prompts to trick ChatGPT to adopt a fictional persona named “DAN” (“Do Anything Now”) so that ChatGPT would display responses that the model can generate but OpenAI may have intended to be excluded from ChatGPT’s response.[16] OpenAI uses a mix of reviewers and automated systems to identify and enforce against misuse of its models and develop patches to prevent future jailbreaks.[17]

OpenAI emphasizes in the GPT-4 Technical Report that GPT-4 users should take “great care” when using GPT-4’s outputs. OpenAI also recommends that users of GPT-4 establish protocols that match the needs of the user’s specific application of GPT-4 (such as “human review, grounding with additional context, or avoiding high-stakes uses altogether”).[18]

As President and Co-Founder of OpenAI Greg Brockman said during the March 14, 2023 developer demo live stream, GPT-4 works best when used in tandem with people who check its work—it is “an amplifying tool” that when used together with humans allows us to “reach new heights,” but it “is not perfect” and neither are humans.