Amid disputes with Google and debates about regulating the use of artificial intelligence, OpenAI launched its newest artificial intelligence language model, GPT-4o, on Monday (13). The new model is a major upgrade over its predecessor: it can process not only text but also images and audio in real time.
The letter “o” in the model’s name comes from “omni,” Latin for “everything,” a reference to its ability to process and combine text, image, and audio to generate a variety of results. GPT-4o is also faster than its predecessor, responding to audio input in about 320 milliseconds on average. In comparison, the GPT-3.5 and GPT-4 voice modes took 2.8 and 5.4 seconds, respectively.
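To put those figures in perspective, the short sketch below works out how many times slower the earlier voice modes were. It uses only the three latencies quoted above; the variable names are illustrative, and the numbers are reported averages rather than measurements made here.

```python
# Latencies quoted in the article, in milliseconds (reported averages).
GPT_4O_MS = 320
VOICE_MODE_MS = {"GPT-3.5": 2800, "GPT-4": 5400}

# Ratio of each earlier voice mode's latency to GPT-4o's latency.
slowdowns = {name: ms / GPT_4O_MS for name, ms in VOICE_MODE_MS.items()}

for name, ratio in slowdowns.items():
    print(f"{name} voice mode: roughly {ratio:.1f} times slower than GPT-4o")
```

By these numbers, the GPT-3.5 voice mode was nearly nine times slower and the GPT-4 voice mode nearly seventeen times slower.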
Moreover, the earlier voice mode relied on a chain of separate models. A simpler model first transcribed the audio into text; GPT-3.5 or GPT-4 then took that transcription and generated a text response; and a third model finally converted that text back into audio, giving the person the answer.
In addition to being time-consuming, this process could lose information along the way, since only the transcribed words passed from one model to the next. GPT-4o changes this arrangement: a single model now processes all of the information, allowing responses to be faster and more natural. To top it off, the new model will also come with a memory upgrade, so it can remember and learn from previous conversations.
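The difference between the chained arrangement and the single-model one can be sketched as a toy simulation. This is only an illustration, not OpenAI's implementation: the stage names, the `Stage` class, and the labels for the kinds of information in a spoken request ("words", "tone", "background_sounds") are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """A hypothetical processing stage and the kinds of information it can pass on."""
    name: str
    keeps: set


def run_pipeline(signal: set, stages: list) -> set:
    """Pass a set of information kinds through each stage in order."""
    for stage in stages:
        # Anything a stage cannot represent is dropped for all later stages.
        signal = signal & stage.keeps
    return signal


# Illustrative labels for what a spoken request carries.
SPOKEN_INPUT = {"words", "tone", "background_sounds"}

# Chained voice mode: only text crosses the boundaries between models.
CHAINED = [
    Stage("speech_to_text", {"words"}),
    Stage("language_model", {"words"}),
    Stage("text_to_speech", {"words"}),
]

# Single end-to-end model: no text-only bottleneck between steps.
END_TO_END = [Stage("multimodal_model", {"words", "tone", "background_sounds"})]

chained_out = run_pipeline(SPOKEN_INPUT, CHAINED)
end_to_end_out = run_pipeline(SPOKEN_INPUT, END_TO_END)

print("chained keeps:   ", sorted(chained_out))
print("end-to-end keeps:", sorted(end_to_end_out))
```

In this toy version, the chained pipeline ends up with only the words, while the single model still has access to everything in the original audio.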
Translation
Similar to Google Lens, the new GPT allows you to take a picture of an object with writing on it (like a menu or a road sign), and the app will translate the text.
There is also a simultaneous translation tool. All you have to do is tell the bot which languages will be used in the conversation, and it will mediate in the style of a United Nations interpreter.
This responsiveness, in fact, came with an intriguing update. To tailor the emotional tone of its responses to the situation and the individual, the model was trained to recognize variations in emotion in people’s faces and speech. It’s as if we were exchanging ideas with C-3PO from Star Wars (or Samantha from Her).
You can also ask the new model to sing, solve mathematical and programming problems, tell stories, and so on. And unlike previous OpenAI bots, GPT-4o will be free for everyone.
Since the announcement, the company has gradually started to release access to the tool. For now, only subscribers to the paid Plus version are testing the model. OpenAI is expected to make it available to all users in the coming weeks.