OpenAI brings video to ChatGPT Advanced Voice Mode

ChatGPT’s Advanced Voice Mode now has video and screenshare capabilities.

The feature was previewed last May with the release of GPT-4o, but only the audio modality has been live. Now users can chat with ChatGPT using a phone camera and the model will “see” what they see.

In the livestream, CPO Kevin Weil and other OpenAI team members demoed ChatGPT helping them make pour-over coffee. With the camera pointed at the action, Advanced Voice Mode (AVM) demonstrated that it understood the principle of the coffee maker and walked the team through brewing their beverage. The team also showed how ChatGPT supports screensharing by understanding an open message on a phone with Weil wearing a Santa beard.

The long-awaited announcement comes a day after Google unveiled the next generation of its flagship model, Gemini 2.0. The new Gemini 2.0 can also process visual and audio inputs and has more agentic capabilities, meaning it can perform multi-step tasks on the user’s …
