Trudy Painter

Gemini Multimodal Launch

Videos to introduce multimodal AI (faced criticism)

Year: 2023
Location: Google
Tags: AI, Film


In December 2023, Google launched Gemini, a multimodal AI model that could understand and generate text, images, and audio.

I worked on the small team tasked with creating a series of promotional videos for the launch. Multimodal AI was a new and complex concept, so the videos had to explain the technology to a wide audience.



The launch was an overnight success. However, the launch video's popularity was followed by a wave of scrutiny.

In the video, it was unclear that the prompts voiced aloud were abridged versions of the actual prompts sent to the model. The full prompts, which anyone can use to recreate the results, are outlined in this developer post.

Media Coverage

Personal Note

I am EXTREMELY glad news outlets were criticizing this launch. I believe it's important to hold companies accountable for their claims, especially when it comes to introducing complex technology.

The experience shifted my values from making AI feel magical toward making it feel approachable.

Another series from the launch, which faced less scrutiny, consisted of one-minute videos demonstrating novel AI use cases, like picking an outfit or combining different emojis.

I was responsible for concepting, scripting, and voicing the use-case video for coding. I personally love fractal trees, and especially the ties between biomimicry and code, so I made a video highlighting the model's ability to generate code by bridging patterns across different domains.
