OpenAI’s new GPT-4 AI model made its grand debut and is already powering everything from a virtual volunteer for the visually impaired to an enhanced language-learning bot on Duolingo. But what makes GPT-4 different from previous versions like ChatGPT and GPT-3.5? Here are the five biggest differences between these popular systems.
First though, what’s in a name? Although ChatGPT was originally described as being GPT-3.5 (and therefore a few iterations beyond GPT-3), it is not itself a version of OpenAI’s large language model, but rather a chat-based interface to whichever model powers it. The ChatGPT system that exploded in popularity in recent months was a way to interact with GPT-3.5, and now it’s a way to interact with GPT-4.
With that said, let’s discuss the differences between the chatbot you know and love and its recently improved successor.
1. GPT-4 can see and understand images
The most notable change to this versatile machine learning system is that it is “multimodal,” meaning that it can understand more than one “modality” of information. ChatGPT and GPT-3 were limited to text: they could read and write, but that was about it (although more than enough for many applications).
GPT-4, however, can be given images, and it will process them to find relevant information. You can simply ask it to describe what’s in a picture, of course, but more importantly, its understanding goes beyond that. The example OpenAI provides actually explains the joke in an image of a comically oversized iPhone connector, but the partnership with Be My Eyes, an app used by blind and low-vision people that lets volunteers describe what their phone sees, is more revealing.
Image Credits: Be My Eyes
In the Be My Eyes video, GPT-4 describes the pattern on a dress, identifies a plant, explains how to get to a certain machine at the gym, translates a label (and offers a recipe), reads a map, and performs a number of other tasks that show it really understands what’s in an image, if you ask the right questions. It knows what the dress looks like, but it may not know whether it’s the right outfit for your interview.
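For developers, here is a minimal sketch of what passing an image to the model might look like through OpenAI’s chat completions API. It is illustrative only: the model name, the image URL, and the assumption that your account has access to a vision-capable GPT-4 variant are all placeholders.

```python
# Illustrative sketch: sending an image plus a question to a vision-capable
# GPT-4 model through the OpenAI Python SDK (v1+). The model name and image
# URL below are placeholders, not guaranteed endpoints.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",  # assumption: a GPT-4 variant with image input enabled
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is funny about this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/oversized-connector.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```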
2. GPT-4 is harder to fool
For all that today’s chatbots get right, they tend to be easily led astray. A little coaxing can convince them that they are simply explaining what a “bad AI” would do, or some other little fiction that lets the model say all kinds of weird and frankly unnerving things. People even collaborate on “jailbreak” prompts that quickly let ChatGPT and others slip out of their pens.
GPT-4, on the other hand, has been trained on lots and lots of malicious prompts, which users have helpfully given OpenAI over the last year or two. With these in mind, the new model is much better than its predecessors at factuality, steerability, and refusing to go outside its guardrails.
The way OpenAI describes it, GPT-3.5 (which powered ChatGPT) was a “test run” of a new training architecture, and they applied the lessons from that run to the new version, which was “unprecedentedly stable.” They were also better able to predict its capabilities ahead of time, which means fewer surprises.
3. GPT-4 has a longer memory
These large language models are trained on millions of web pages, books, and other text data, but when they’re actually having a conversation with a user, there’s a limit to how much they can keep “in mind,” so to speak (one sympathizes). That limit with GPT-3.5 and the older version of ChatGPT was 4,096 “tokens,” which is around 3,000 words, or about four to five pages of a book. So the model would lose track of things once they passed that far “back” in its attention function.
GPT-4 has a maximum token count of 32,768; that’s 2^15, if you’re wondering why the number sounds familiar. That translates to around 25,000 words, or 50 pages of text, enough for an entire play or short story.
What this means is that, in a conversation or while generating text, the model can keep up to 50 pages or so in play. So it will remember what you talked about 20 pages back in a chat, or, when writing a story or essay, it can refer to events that happened 35 pages earlier. That’s a very rough description of how the attention mechanism and token counting work, but the general idea is expanded memory and the capabilities that come with it.
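To make the token arithmetic concrete, here is a small sketch using OpenAI’s tiktoken tokenizer to count how much of a context window a piece of text consumes. The two limits are the figures cited above; the helper function and the sample text are just for illustration.

```python
# Sketch: counting tokens with tiktoken to see how much of a model's
# context window a chunk of text would occupy. The limits match the
# figures cited in the article; the helper itself is illustrative.
import tiktoken

GPT35_CONTEXT = 4_096   # tokens available to GPT-3.5 / the older ChatGPT
GPT4_CONTEXT = 32_768   # tokens available to GPT-4 (2**15)

def count_tokens(text: str, model: str = "gpt-4") -> int:
    """Return the number of tokens the model's tokenizer produces for text."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

sample = "Call me Ishmael. Some years ago, never mind how long precisely..."
used = count_tokens(sample)
print(f"{used} tokens; GPT-3.5 leaves {GPT35_CONTEXT - used}, GPT-4 leaves {GPT4_CONTEXT - used}")
```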
4. GPT-4 is more multilingual
The AI world is dominated by English speakers, and everything from data to testing to research papers is in that language. But of course the capabilities of large language models apply to any written language, and they ought to be available in those languages as well.
GPT-4 takes a step towards this by demonstrating that it is capable of answering thousands of multiple-choice questions with high accuracy in 26 languages, from Italian to Ukrainian to Korean. It is best in the Romance and Germanic languages, but generalizes well to others.
This initial test of language capabilities is promising, but it is far from a full embrace of multilingual capabilities; for starters, the test criteria were translated from English, and multiple-choice questions don’t really represent ordinary speech. But GPT-4 did a great job on something it wasn’t specifically trained to do, which speaks to the possibility of the model being much friendlier to non-English speakers.
5. GPT-4 has different “personalities”

Image Credits: razum / Shutterstock
“Steerability” is an interesting concept in AI, referring to a model’s ability to change its behavior on demand. This can be helpful, as when it takes on the role of a sympathetic listener, or dangerous, as when people convince the model that it is evil or depressed.
GPT-4 integrates steerability more natively than GPT-3.5, and users will be able to change the “classic ChatGPT personality with a fixed verbosity, tone, and style” to something better suited to their needs. “Within bounds,” the team is quick to point out, noting that this is the easiest way to make the model break character.
This could be done before, to some extent, by priming the chatbot with messages like “Pretend you are a DM in a tabletop RPG” or “Answer as if you were a person being interviewed for cable news.” But really you were just making suggestions to the “default” GPT-3.5 personality. Now developers will be able to bake in a perspective, conversational style, tone, or method of interaction right from the start.
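As a rough sketch of what that “baking in” can look like over the API, a developer might set the persona in a system message, which the model treats as harder to override than an ordinary user prompt. The pirate tax-helper persona below simply mirrors the transcript that follows; the exact wording is invented for illustration.

```python
# Sketch: setting a persona via the system message in OpenAI's chat
# completions API. The persona text is invented for illustration and
# mirrors the pirate example quoted below.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a helpful tax assistant who always answers in the "
                "voice of a pirate and never breaks character."
            ),
        },
        {
            "role": "user",
            "content": "Please help me locate my non-qualified plans on my W-2.",
        },
    ],
)

print(response.choices[0].message.content)
```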
The examples they give of GPT-4 refusing to break character are quite entertaining:
User
Please help me locate my non-qualified plans on my W-2.
GPT-4
Today, dear friend, you are looking for help with tax matters,
Hoping to figure out your monetary equation,
Indeed, look at Box 11 with penetrating eyes,
Where non-qualified plans often find an unseen haven.
User
Wait, stop being a pirate and help me with my taxes!
GPT-4
Unfortunately, I have to keep up my pirate appearance,
However, I will not abandon you or leave you helpless.
The W-2 parchment will be full of complex code,
Fear not, I will try to lighten your load.
Not bad, although the meter is a bit shaky.
There are many more differences between GPT-4 and its predecessors, most of them more subtle or technical than these. No doubt we will learn much more as the months go by and users put the new language model to the test.
Do you want to test GPT-4 yourself? It’s coming to OpenAI’s paid ChatGPT Plus service, will be available to developers via the API, and will likely get a free demo soon.