ChatGPT’s Advanced Voice Mode could soon gain vision

ChatGPT has already impressed users with its conversational skills and task-handling capabilities, but OpenAI is preparing to take the AI assistant to the next level with vision capabilities.

Once the feature arrives, ChatGPT may be able to interact with the real world through live video.

The upcoming “Live Camera” feature builds on ChatGPT’s Advanced Voice Mode, which already allows natural, conversational interactions. With real-time vision added, ChatGPT would go beyond text and voice: it could recognize objects, identify people, and make associations between items in its surroundings.

Users would activate the Live Camera by tapping a camera icon within the app, allowing the AI to “see” through their device’s camera and comment on what it observes. In a demo during OpenAI’s GPT-4o launch earlier this year, ChatGPT identified a dog, recalled its name, recognized a ball, and understood the concept of fetch, all with minimal user input.

The feature, however, comes with a cautionary note. OpenAI advises users not to rely on it for critical decisions, such as navigation or health-related assessments, which underlines its experimental nature.

Recent beta updates spotted by Android Authority suggest the Live Camera feature is close to a broader rollout. The feature is still in beta and is expected to launch first for ChatGPT Plus subscribers, and potentially for other paid tiers.

If successful, this update could distinguish ChatGPT from competitors like Google’s Gemini. While Google Lens offers vision-based capabilities, it lacks the live and conversational functionality that ChatGPT’s Live Camera aims to deliver.

With over 10 million paying users, OpenAI’s ChatGPT trails Google’s Gemini Advanced, accessible to 100 million Google One subscribers. However, introducing vision capabilities could attract users seeking a more interactive and innovative AI assistant.

For entrepreneurs and developers, this new functionality opens up possibilities for enhanced AI-driven solutions, such as real-time customer service, interactive tutorials, and on-the-fly product identification. Businesses exploring augmented reality or advanced AI applications could find new opportunities to integrate ChatGPT’s capabilities into their workflows.

While OpenAI hasn’t confirmed a release date, the addition of real-time vision could mark a significant milestone, positioning ChatGPT as a more versatile and competitive AI platform in the rapidly evolving landscape of AI assistants.
