- Google has unveiled Gemini, its generative AI model.
- Google made a big internal push to get Gemini out the door this year.
- Bard is getting Gemini right away, but a more advanced model won't arrive until next year.
After months of teasing us, Google is starting to roll out its generative artificial intelligence model, Gemini.
The new model, which will be launched in phases, is Google's chance to thwart the narrative that it's fallen behind rivals such as OpenAI.
But while users will have access to Gemini this month, the most advanced version of the model won't arrive until early next year.
Gemini has three "sizes" that will be available in stages: Ultra, Pro, and Nano – the last of which is designed to run locally on devices such as smartphones.
Google is giving users access to the Pro version Wednesday through its chatbot Bard, and to Cloud customers in the coming days, but says the Ultra model – the biggest and most technically advanced of the three – is still undergoing internal testing, and won't roll out until early 2024.
Google says it plans to infuse its most popular products with Gemini over time. It will also launch Gemini Ultra to a new version of Bard called Bard Advanced next year. Sissie Hsiao, Google's VP and general manager of Bard and Assistant, wouldn't say if Bard Advanced will cost money to use but didn't deny it was a possibility.
There's immense pressure on Google right now to prove it's still an AI industry leader with Gemini, which was trained to be multimodal, meaning it can process different types of media such as text, pictures, video, and audio. But Google boasts that Gemini is its "most flexible" model too, capable of running on a range of sources from data centers to smartphones.
In a roundtable with reporters this week, Google executives said the Ultra model of Gemini is the first model to outperform human experts on MMLU (massive multitask language understanding), a measurement that tests on subjects such as math, history, law, and ethics. The model scored 90.0%, beating OpenAI's GPT-4's 86.4%
That all sounds great, but we won't be able to test the full capabilities of Gemini just yet. Google says the Pro model outperformed GPT-3.5, which powers the free version of ChatGPT, and users will be able to test a fine-tuned version for Bard starting Wednesday (only in English to start, Google says, and not in the UK). But when asked how Gemini compares to GPT-4 overall, executives declined to comment (it did, however, benchmark Gemini Ultra against GPT-4 across a range of different areas and published the results here).
Publicly, Google has swatted away suggestions it's been scrambling to chase the competition, but things have looked very different inside the company as it has raced to get Gemini out the door and infuse AI into all of its key products.
Earlier this year, CEO Sundar Pichai merged Alphabet's prized DeepMind unit with its internal AI group, Brain, to quicken work on Gemini. Staff were also told that Google would reduce the amount of research it publishes to limit rivals commercializing their ideas, BI reported.
Where Google believes Gemini has the edge on the competition is in what it calls "sophisticated reasoning," which is the way the model processes complex information across different types of media.
In one demo shown to the press, DeepMind researchers used Gemini to scour hundreds of thousands of research papers to extract specific types of data. Google said Gemini was able to distinguish between papers that were relevant to the study and those that weren't. More interestingly, they were able to show Gemini a graph with old data and have it produce an updated version, with the new data plotted.
While Gemini can process different media types, Eli Collins, VP of product for DeepMind, said the initial Gemini models won't be able to generate images and videos, but suggested this was something that would come down the line in other models.
Collins added that Google has seen some "novel" capabilities in Gemini that could give it an edge over rival models, but would not elaborate on what those might be.
Gemini has been trained on and is powered by tensor processing units (TPUs), and Google is using Gemini's rollout to announce its new Cloud TPU v5p and a new AI hypercomputer that will be used to improve AI training and delivery. Interestingly, Amin Vahdat, a VP at Google Cloud AI, said Gemini will run on both GPUs and TPUs in the future, but did not elaborate beyond that.
Google says it's making Gemini Pro available to enterprise customers through its Vertex AI program, and for developers in AI Studio, on December 13.
As for its consumer products beyond Bard, Google is launching Gemini Nano on the Pixel 8 Pro smartphone on Wednesday, which will enable features such as summarizing the contents of voice recordings.
Google also said it plans to add Gemini to SGE, its generative AI-powered version of Search, as well as Chrome, Duet AI, and other products, in the coming months.
Are you a current or former Googler with a tip? You can reach Hugh via encrypted messaging apps Signal/Telegram (+1 628-228-1836) or email at hlangley@protonmail.com.
