Google just released Gemma (Latin for “precious stone”), a family of lightweight, state-of-the-art open models. Gemma, built from the same research and technology as its more powerful sibling Gemini, is designed to make AI more accessible and open. With two versions on offer, a 7B-parameter model for those with powerful hardware and a leaner 2B-parameter version for more modest setups, Google says Gemma is all about flexibility and accessibility.
Gemma models are designed to be small, light, and quick. Despite their size, Google says these little models can outperform much bigger ones in some key areas, and the best part? You can run them on your own computer without any specialized hardware.
While Gemini is more of a "look, don’t touch" kind of deal unless you’re working through Google's specific platforms, Gemma is out there for anyone who wants to experiment. It’s a big deal because it opens up the playground to many more people, not just the ones with big budgets or special access.
Tris Warkentin from DeepMind says that smaller models have gotten really good at generating content, something that used to be a job for the big models, and he’s excited about what this means for everyone in the game. Developers no longer need to equip themselves with costly hardware; they can run inference or fine-tune these models on their regular laptops, which makes playing around with AI accessible to a lot more people.
What’s an “open model”?
Gemma is an open model, which is not the same as open source: the weights are freely available, but use is governed by Google’s own terms rather than an open-source license.
Google says, “Open models feature free access to the model weights, but terms of use, redistribution, and variant ownership vary according to a model’s specific terms of use, which may not be based on an open-source license. The Gemma models’ terms of use make them freely available for individual developers, researchers, and commercial users for access and redistribution. Users are also free to create and publish model variants. In using Gemma models, developers agree to avoid harmful uses, reflecting our commitment to developing AI responsibly while increasing access to this technology.” This means developers and researchers can use Gemma for tasks like fine-tuning, but usage remains subject to the terms of use.
Benchmarks
According to the benchmarks posted on Hugging Face, Gemma posts some impressive results. Gemma 7B is highly competitive with other 7B models, including Mistral 7B. A lot more will become clear once people run the model on real use cases and share the results, since leaderboard scores usually reflect the quality of pre-trained models, not chat-tuned ones. More descriptive benchmarks worth trying for chat models are MT-Bench, EQ-Bench, and the LMSYS Chatbot Arena.
Using Gemma
Using Gemma is pretty easy. It’s available through Kaggle, Hugging Face, Nvidia’s NeMo, and Google’s Vertex AI. First-time Google Cloud users get $300 in credits to use the models, and researchers can apply for up to $500,000 in cloud credits.
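If you’re running the instruction-tuned checkpoints yourself, it helps to know that they expect a simple turn-based prompt format. Here’s a minimal sketch of building that prompt by hand; the `build_gemma_prompt` helper is just an illustration (in practice, Hugging Face’s `tokenizer.apply_chat_template()` produces this for you, and the literal tags below follow Gemma’s documented chat format):

```python
# Sketch: wrapping a user message in the turn markers that Gemma's
# instruction-tuned variants expect. The helper name is ours; the
# <start_of_turn>/<end_of_turn> tags come from Gemma's chat format.

def build_gemma_prompt(user_message: str) -> str:
    """Wrap a single user message in Gemma's chat turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt("Why is the sky blue?")
print(prompt)
```

The trailing `<start_of_turn>model` line cues the model to begin its reply, so generation picks up right where the prompt ends.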
You can chat with the Gemma Instruct model on Hugging Chat.
Ethical regulations
With every model rollout (Grok perhaps excepted), companies invest heavily in red-teaming, and Google is no exception. Warkentin says they’ve been very careful with Gemma, given the risks of open models, and are employing automated techniques to filter sensitive data out of the training data sets.
Google is releasing Gemma with a “responsible AI toolkit” that lets developers define their own rules or “banned words” before deploying Gemma in their projects.
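To make the idea concrete, here’s a deliberately naive sketch of the kind of “banned words” output filter a developer might layer on top of Gemma before deployment. This is purely illustrative; the function and word list are ours, and Google’s actual toolkit is more sophisticated than a substring check:

```python
# Illustrative sketch only: a naive banned-words filter applied to
# model output before it reaches the user. Not Google's toolkit.

def violates_policy(text: str, banned_words: set[str]) -> bool:
    """Return True if any banned word appears in the text (case-insensitive)."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in banned_words)

# Hypothetical deployment rule set.
BANNED = {"password", "credit card number"}

if violates_policy("Here is my password: hunter2", BANNED):
    print("blocked")  # the response would be suppressed or regenerated
```

Real deployments typically combine filters like this with safety classifiers and prompt-level guardrails rather than relying on keyword matching alone.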
Wrapping up
Google's Gemma launch signals a move toward more accessible AI, offering powerful yet flexible models to a wide audience. As DeepMind's Tris Warkentin notes, these advancements promise to bring high-level AI tasks within reach of everyday developers. As Gemma rolls out, its open-model approach and ethical framework invite a broader base of innovators, setting the stage for a more inclusive and responsible future in AI development.