Google’s AI tool Whisk lets you use images as prompts

Google has unveiled Whisk, a new AI-powered tool designed to simplify image generation by allowing users to provide images as prompts instead of crafting detailed text descriptions.

Whisk offers a fresh approach to AI-generated art by blending subject, scene, and style cues from user-provided visuals.

The process is straightforward: users upload images to represent the subject, scene, and style they want. For those without images on hand, Whisk includes a dice icon that generates sample prompts, though these appear to be AI-generated themselves.

Text input is optional, but users can add written details to refine the output further. Once the prompts are set, Whisk generates images, each with a corresponding text prompt. Users can favorite or download the results, or tweak them by editing the prompts or uploading new visuals.

In a blog post, Google emphasized that Whisk is aimed at “rapid visual exploration, not pixel-perfect edits,” acknowledging that results may sometimes “miss the mark.” The tool is built for iteration, letting users refine their creations and explore variations quickly.

Whisk leverages the latest iteration of Google’s Imagen 3 image-generation model, which promises more sophisticated results than its predecessors. Alongside Whisk, Google introduced Veo 2, its updated video-generation model, which is designed to understand “the unique language of cinematography.” Veo 2 reduces common AI artifacts like extra fingers and will debut in Google’s VideoFX before expanding to YouTube Shorts and other platforms in 2025.

While Whisk is still in its early stages, it offers a glimpse into how AI can evolve creative workflows. For now, the tool is only available to US users, with those outside the US given the option to be notified when it becomes accessible in their region.
