Gemini, Google’s new multimodal AI model

By Frederic BOX

Google has announced the launch of Gemini, its new multimodal AI model. The model is reported to be built on a large language model (LLM) with 1.5 trillion parameters, a figure Google has not officially confirmed, and to be trained on a massive dataset of text and code.

Comparison with ChatGPT

Google compared Gemini with GPT-4, the OpenAI model behind ChatGPT. According to Google's published results, Gemini Ultra comes out ahead on 30 of 32 benchmarks (evaluation date: 2023/12/06). In particular, Google reports that Gemini is better than ChatGPT at:

  1. Massive multitask language understanding (MMLU)
  2. Generating Python code
  3. Interacting with video and audio

Gemini architecture

Gemini was built as a multimodal model from the outset. This means that it can process information from different kinds of input, such as text, code, images and audio. This multimodal approach is intended to make Gemini more accurate and complete than AI models that process only one type of input at a time.
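To make the idea of mixed input more concrete, here is a minimal sketch of a prompt that interleaves several modalities. It is purely illustrative: the names `Part`, `make_part` and `modalities_used` are hypothetical and do not come from Google's API or from Gemini's actual internals, and the image bytes are a placeholder.

```python
from dataclasses import dataclass
from typing import List

# Modalities named in the paragraph above (illustrative, not an official list).
MODALITIES = {"text", "code", "image", "audio"}


@dataclass
class Part:
    """One piece of a multimodal prompt (hypothetical structure)."""
    modality: str  # one of MODALITIES
    data: bytes    # raw payload; text and code are UTF-8 encoded


def make_part(modality: str, data: bytes) -> Part:
    """Build a Part, rejecting modalities this sketch does not know about."""
    if modality not in MODALITIES:
        raise ValueError(f"unsupported modality: {modality}")
    return Part(modality, data)


def modalities_used(prompt: List[Part]) -> List[str]:
    """Return the distinct modalities in a prompt, in order of first appearance."""
    seen: List[str] = []
    for part in prompt:
        if part.modality not in seen:
            seen.append(part.modality)
    return seen


# A single prompt mixing text, code and an image in one sequence.
prompt = [
    make_part("text", "What does this function print?".encode("utf-8")),
    make_part("code", "print(sum(range(4)))".encode("utf-8")),
    make_part("image", b"\x89PNG"),  # placeholder bytes, not a real image
]

print(modalities_used(prompt))
```

The point of the sketch is only that a multimodal model receives one ordered sequence of mixed parts, rather than separate requests per input type.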

Google’s responsible approach

Google has stated that it is important to proceed cautiously and step by step in the development of Gemini. The company has therefore put in place numerous safety and accountability measures to ensure that Gemini is used safely and responsibly. In particular, the "controlled beta" of Gemini Ultra, Gemini's most powerful model, will be rolled out gradually so that its capabilities can be tested.

Gemini available in three different versions

Gemini Nano

A lightweight version designed to run natively and offline on Android devices.

Gemini Pro

A more robust version designed to power many of Google’s AI services, including Bard and SGE.

Gemini Ultra

The most powerful Gemini model, designed for data centers and enterprise applications.

Gemini applications

  • Bard: Google’s chatbot
  • SGE: Google’s new search experience
  • All Google Ads products
  • Chrome browser
  • Vertex AI in Google Cloud
  • Other Google tools

Publication: Gemini: A Family of Highly Capable Multimodal Models

Gemini is a new AI technology that has the potential to transform the way we interact with machines. Google has taken significant steps to ensure that Gemini is used safely and responsibly.

ggSGE.com
Hands-on with Gemini: Interaction with multimodal AI