OpenAI has introduced a new model, GPT-4o, which will become available to the public over the coming weeks. This new model incorporates premium features of GPT-4 and includes an updated web user interface. During the launch event, OpenAI’s CTO Mira Murati showcased several capabilities of this advanced model. Let's explore them in detail.
GPT-4o Announcement
GPT-4o is designed to be more efficient, with enhanced abilities to process both auditory and visual inputs. OpenAI describes this as "a step towards much more natural human-computer interaction." The model can now handle text, images, and audio input, offering seamless assistance to its users. The voice mode has been significantly improved, providing quicker responses and better comprehension.
Previously, the voice mode required three separate models for transcription, intelligence, and text-to-speech functions, which often resulted in delays. In contrast, GPT-4o integrates these functions natively, enabling smoother performance. Using your phone's camera, you can easily share information with the model and ask questions using the voice mode. The new model can respond to voice inputs in just 232 milliseconds, closely matching human response times. It also offers responses in various tones to suit user preferences and has better and faster comprehension of non-English languages compared to GPT-4 Turbo. Additionally, GPT-4o can function as an interpreter.
API and Premium Features
GPT-4o will also be accessible via API, allowing developers to build and enhance AI applications using its advanced capabilities. While the new model's features are available for free, premium users will have access to five times the resources compared to the standard offering.
OpenAI has also released a ChatGPT app for macOS-based desktops. This app provides deeper integration into the macOS platform, aiming to simplify user workflows. With a keyboard shortcut (Option + Space), users can quickly access the tool's conversation page.
In summary, GPT-4o brings several improvements and new features, enhancing the efficiency and versatility of human-computer interactions. The new model's capabilities, combined with the new app for macOS, aim to offer a more integrated and seamless user experience.