AInsights: Your executive-level insights on the latest in generative AI…
Meet Agent 00AI…your new Q to help you navigate your work and personal lives.
Google hosted its I/O event, and much of what the company announced isn’t just disruptive to the realm of genAI — it’s disruptive to Google itself.
Let’s run through the announcements and then dive deeper to analyze how they rival OpenAI, Meta, Anthropic, and Perplexity.
For starters, Google announced Gemini AI Integration across multiple key products, which is a sign of the future for next-gen hardware and software products. At some point on the horizon, AI will simply become part of the user interface, acting as an assistant to collaborate with you in real-time, or, eventually, as a proactive agent on your behalf.
- Automatic movie/TV show description generation in Google TV.
- Geospatial augmented reality content in Google Maps.
- AI-generated quizzes on educational YouTube videos.
- Natural language search in Google Photos (“Ask Photos” feature). This is huge in and of itself. How many pictures do you have on your phone or in the cloud that you’re likely never to see again? Now you can find pictures simply by describing them to AI!
- AI assistance for email drafting, summarization, and e-commerce returns in Gmail. Now, please also enhance email search! Why is this still a challenge in 2024!?
Gemma 2 Model Update
Google also announced Gemma 2, the next generation of its open model family, including a new 27-billion-parameter model optimized for efficient performance on GPUs. On the Gemini side (latest 1.5 details here), Google is expanding Gemini 1.5 Pro to a 2-million-token context window — the largest input of any commercially available AI model.
Veo and Imagen 3 for Creators
Google showcased Veo, its latest high-definition video generation model designed to compete against OpenAI’s Sora, and Imagen 3, its highest-quality text-to-image model and a rival to Midjourney, promising more lifelike visuals.
These tools will be available for select creators initially.
Audio Overviews and Music AI Sandbox
Google introduced ‘Audio Overviews,’ a feature that generates audio discussions based on text input, and ‘Music AI Sandbox,’ a suite of generative AI tools for creating music and sounds from user prompts.
AI Overviews in Search
Google Search is launching ‘AI Overviews’ to provide quick summaries of answers to complex search queries, along with assistant-like planning capabilities for multi-step tasks.
Google introduced the ability to ask open-ended questions and receive detailed, coherent responses generated by AI models. This allows users to get more comprehensive information beyond just a list of links.
AI Agents: Google unveiled AI agents that can engage in back-and-forth dialogue to help users accomplish multi-step tasks like research, analysis, and creative projects. These agents leverage the latest language models to provide personalized assistance.
Multimodal Search: Google expanded its search capabilities to understand and generate responses combining text, images, audio, and other modalities. This enables users to search with images or audio clips and receive relevant multimedia results.
Longer Context: Google’s search models now have the ability to understand and incorporate much longer context from a user’s query history and previous interactions. This allows for more contextually relevant and personalized search experiences.
These new AI-powered search features aim to provide more natural, interactive, and comprehensive information access compared to traditional keyword-based search. They leverage Google’s latest advancements in large language models and multimodal AI to deliver a more assistive and intelligent search experience.
What we have yet to see, though, are tools for businesses that need to be on the other side of AI search. It’s clear that search behaviors are changing, but how products and services appear on the other side of discovery is the next Wild West.
AI Teammate for Google Workspace
The ‘AI Teammate’ feature will integrate into Google Workspace, helping to build a searchable collection of work from messages and email threads, providing analyses and summaries.
Project Astra – AI Assistant
Google unveiled Project Astra, a prototype AI assistant built by DeepMind that can help users with tasks like identifying surroundings, finding lost items, reviewing code, and answering questions in real-time.
This is by far the most promising of Google’s AI assistants, and for the record, is not available yet. Project Astra represents Google’s vision for the future of AI assistants…and more.
We could also very well be on the cusp of a next-gen version of Google Glass. And this time, it won’t be so awkward now that Meta and Ray-Ban have helped to consumerize wearable AI.
So what is it?
Project Astra is a multimodal AI agent capable of perceiving and responding to real-time information through text, video, images, and speech. It can simultaneously access information from the web and its surroundings using a smartphone camera or smart glasses. The system encodes video frames and speech into a timeline, caching it for efficient recall and response. For example, in the demo below, you’ll see a live video feed panning a room where the user stops, draws an arrow on the screen, and asks the AI assistant to identify the object. In another example, the video feed continues to pan while the user asks it to recognize objects that produce sound. The AI assistant accurately identifies an audio speaker.
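To make the timeline-and-cache idea concrete, here is a minimal Python sketch of how such a rolling multimodal event cache might behave. The class names, fields, and plain-text “summaries” standing in for the model’s encoded representations are illustrative assumptions for this newsletter — not Google’s actual implementation:

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class TimelineEvent:
    timestamp: float  # seconds since the session started
    modality: str     # e.g. "video_frame" or "speech"
    summary: str      # stand-in for the model's encoded representation

class TimelineCache:
    """Rolling cache of recent multimodal events for efficient recall."""

    def __init__(self, max_events: int = 256):
        # deque with maxlen: the oldest events age out automatically
        self.events = deque(maxlen=max_events)

    def add(self, timestamp: float, modality: str, summary: str) -> None:
        self.events.append(TimelineEvent(timestamp, modality, summary))

    def recall(self, keyword: str) -> list:
        """Return cached events whose summary mentions the keyword."""
        return [e for e in self.events if keyword.lower() in e.summary.lower()]

# A tiny session, mirroring the demo: the assistant later answers
# "Where did you last see my glasses?" by recalling a cached frame.
cache = TimelineCache()
cache.add(1.0, "video_frame", "desk with red apple and glasses")
cache.add(2.5, "speech", "user asks what in the room makes sound")
cache.add(4.0, "video_frame", "audio speaker on a shelf")
hits = cache.recall("glasses")
print(hits[0].timestamp)  # timestamp of the frame where the glasses appeared
```

The design point is simple: by timestamping and caching lightweight representations of what the camera and microphone perceive, the assistant can answer questions about the recent past without reprocessing raw video.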
Project Astra Key Capabilities
- Identifies objects, sounds, and their specific parts in real-time using computer vision and audio processing.
- Understands context and location based on visual cues from the environment.
- Provides explanations and information related to objects, code snippets, or scenarios it perceives.
- Engages in natural, conversational interactions, adapting to interruptions and speech patterns.
- Offers proactive assistance and reminders based on the user’s context and past interactions.
Implications for Businesses
Project Astra represents a significant leap in AI capabilities, offering several potential benefits for businesses:
Enhanced Productivity: An AI assistant that can understand and respond to the complexities of real-world scenarios could streamline various tasks, boosting employee productivity and efficiency.
Improved Customer Experience: Businesses could leverage Project Astra’s multimodal capabilities to provide more intuitive and personalized customer support, enhancing the overall customer experience.
Augmented Decision-Making: By processing and synthesizing information from multiple sources in real-time, Project Astra could assist executives and decision-makers with data-driven insights and recommendations.
Innovation Opportunities: The advanced AI capabilities of Project Astra could pave the way for new products, services, and business models that leverage multimodal interactions and contextual awareness.
While Project Astra is still in development, Google plans to integrate some of its capabilities into products like the Gemini app and web experience later this year. Business executives should closely monitor the progress of Project Astra and explore how its cutting-edge AI capabilities could benefit their organizations and drive innovation.
And that’s your AInsights this time around. Now you and I can think about the future of AI-powered search, work, next-level creations we’ll produce, and how we’ll navigate our world, and our business, with AI by our side.
Please subscribe to AInsights, here.
If you’d like to join my master mailing list for news and events, please follow, a Quantum of Solis.
Brian Solis | Author, Keynote Speaker, Futurist
Brian Solis is a world-renowned digital analyst, anthropologist, and futurist. He is also a sought-after keynote speaker and an 8x best-selling author. In his new book, Lifescale: How to live a more creative, productive and happy life, Brian tackles the struggles of living in a world rife with constant digital distractions. His previous books, X: The Experience When Business Meets Design and What’s the Future of Business, explore the future of customer and user experience design and modernizing customer engagement in the four moments of truth.
Invite him to speak at your next event, or bring him into your organization to inspire colleagues, executives, and boards of directors.