I. Google this morning announced the rollout of Gemini, its largest and most capable large language model to date. Starting today, the company’s Bard chatbot will be powered by a version of Gemini, and it will be available in English in more than 170 countries and territories. Developers and enterprise customers will get access to Gemini through an API next week, with a more advanced version set to become available next year.
How good is Gemini?
Google says the most capable version of its model “exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in LLM research and development.” Gemini also scored 90.0% on a test known as “Massive Multitask Language Understanding,” or MMLU, which assesses capabilities across 57 subjects, including math, physics, history and medicine. It is the first LLM to perform better than human experts on the test, Google said.
Gemini also appears to be a very good software engineer. Last year, using an older language model, DeepMind introduced an AI system called AlphaCode, which outperformed 54 percent of human coders in coding competitions. Using Gemini, Google built a next-generation version called AlphaCode 2. The sequel outperformed an estimated 85 percent of humans, the company said.
Competitive coding is different from day-to-day software engineering in some important ways: it can be both more and less difficult than what the average engineer is asked to do. But still, the rate of progress here is striking.
Gemini is natively multimodal, which means that it can analyze the contents of a picture and answer questions about it, or create an image from a text prompt. During a briefing on Tuesday, a Google executive uploaded a photo of some math homework in which the student had shown the work leading up to the final answer. Gemini was able to identify the step at which the student had gone wrong, and explained both the mistake and how to answer the question correctly.
“Multimodal” can read like awkward jargon, but the term came up constantly in conversations with Google executives. The ability of an AI system to take in various kinds of data (text, photos, video, audio), analyze them using a single tool, and translate them into different formats is the kind of foundational innovation that makes many other developments possible. (This is all a long way of saying: sorry for the number of times the word “multimodal” appears in the interview below.)
In keeping with Google’s preference for chaotic branding, Gemini will be available in three “sizes”: Nano, which is small enough to fit on a smartphone and will power features on the Pixel 8 Pro starting today; Pro, which now powers Bard; and Ultra, which will begin to make its way into products next year.
Without having used any of these models, it is difficult to compare them with those of rivals such as OpenAI and Anthropic. But my basic sense is that Gemini Pro is viewed as the company’s answer to GPT-3.5: in its announcement blog post, the company noted that Pro outperformed GPT-3.5 on many, but not all, benchmarks.
It is positioning Ultra, meanwhile, as a top competitor to GPT-4 Turbo for the crown of most capable general-purpose LLM. And Ultra will not be available until next year, so that Google can complete trust and safety testing.
When it is available to consumers, Ultra will power a new chatbot that the company is calling Bard Advanced. Although the company would not confirm it on Tuesday, the branding suggests that Bard Advanced could be Google’s answer to ChatGPT Plus: a subscription product offering access to the best available model.
From there, Google says, Gemini will begin making its way across the company’s ecosystem of consumer and enterprise products: search, Chrome, ads, and its productivity apps.
II. A few hours after being briefed on the news, I had the chance to speak with Google CEO Sundar Pichai and Google DeepMind co-founder and CEO Demis Hassabis.
It was my first time talking with Pichai about the state of the art in AI since March, and my first conversation with Hassabis. Over more than 30 minutes, we talked about Gemini’s novel capabilities, how AI is changing search, and whether Pichai thinks the company’s progress will result in it hiring fewer software engineers next year.
Highlights of that conversation follow; this interview has been edited for clarity and length.
Casey Newton: Today, you shared a variety of industry benchmarks showing the progress you have made with Gemini. But I want to know about your own personal testing of the models. What are you seeing that makes you feel like you have taken a step forward?
Demis Hassabis: I think you will see it just by using the new Bard: the overall quality has improved a lot over our previous models. The kind of thing I am particularly interested in is using it as a science assistant. Actually analyzing scientific papers and the graphs in those papers, and interpreting them. Turning tables into graphs, extending the graphs. It has been very useful, and I want to double down on that.
Sundar Pichai: The multimodality is very exciting. We’re working to bring it into products and expose it thoughtfully, but I think there will be many new uses for it.
The exciting thing to me is that this is just our 1.0. There is such a strong roadmap of innovation as we look into 2024. And one of the things Demis and his team are really good at is relentlessly iterating and coming out with new versions.
Earlier today, I asked Eli Collins, vice president of product at DeepMind, whether Gemini had shown any novel capabilities. He basically told me, “Stay tuned.” Do you believe this model will demonstrate capabilities beyond those of previous LLMs, or do you consider it more evolutionary?
Hassabis: I think we will see some new capabilities. That is part of what the Ultra testing is for. We are in a kind of beta, to check it for safety and responsibility, but also to see how else it can be fine-tuned.
Your blog post describes how capable Gemini is. If that’s the case, I wonder how good it might be at planning. Can you imagine building agents using Gemini, to do things like making reservations?
Hassabis: You hit the nail on the head there. This is something we are thinking about. It is in our heritage, really, from the old DeepMind. We are experts in these kinds of agent-based systems and planning systems. So watch this space. We are pushing hard on it.
But multimodality is an important thing. It is a fundamental thing you need [for building agents]. If you imagine robotics, or even digital agents, they need to understand the user interface and how to interact with things. Before you can act usefully in the world, you have to be able to analyze the environment in a multimodal way. So you can think of it as a prerequisite for planning and interaction.
Pichai: But these are the innovations.
You are now saying that Gemini will come to search next year. How do you see it changing the search experience?
Pichai: We are already experimenting with it in the Search Generative Experience, and as we experiment with it, it is improving things across the board. We think about Gemini as foundational: it will work across all our products. Search is no different.
Multimodality has usually been one of the hard parts. Until now, the team has had to work hard to make search multimodal. Gemini as a foundational model gives them that capability [natively], so I think this is an area where they will innovate.
Do you think about how, in the medium term, Gemini will increase how often people get the information they need from the results without going to a website?
Pichai: Our core principle is that people are looking for the breadth and diversity of the web and its content ecosystem. So although with the Search Generative Experience we can do more, we are actually designing the product in a way that lets people explore. And I think that’s what users want. I see it as the core value proposition of search, so as we evolve the product, it will remain part of our goal.
I also read that Gemini is coming to Chrome. What will you be able to do with Gemini in the web browser?
Pichai: It can understand what is on the web page, answer questions for you, and help you with tasks. You can imagine looking at something you want to understand, such as a set of data on a web page, and saying, “Summarize this quickly for me.” All of that is possible now, right? Again, it is about being an assistant for the user, helping them as they browse the web. These are all possibilities.
I want to get a sense of the state of the art. I imagine you could spend most of 2024 just improving Gemini 1.0. But when you look ahead to training a Gemini 2.0, does it look like a matter of applying more data and compute to the techniques you have already developed? Or are there some basic research breakthroughs you need to make first?
Hassabis: Great question. I think the answer is both: we are pushing forward on both fronts. We are doing a lot of blue-sky research on things like planning and lengthening the context. We will need all of these important capabilities, which current systems do not have, if we are going to get to AGI-level systems. So we are working hard on all of them.
There is also a lot of juice left to squeeze from scaling up, improving the architecture, and perhaps further refining these existing techniques. And in fact there are a large number of research areas that look promising.
Pichai: I would say it still feels very early to me. We have a clear line of sight to how Gemini 2.0 is going to be better. If I look at all the work Google DeepMind is doing, and say there are 10 to 15 areas, right now you are seeing rapid progress in one area, right? But there are innovations coming from the other areas too, and they will all come together.
It seems that your model is really good at winning coding competitions. A year from now, can you imagine it being so good that you do not need to hire as many engineers?
Pichai: I think this really makes programmers much more productive, and over time takes some of the toil out of the job.
I think programmers will have such sophisticated tools that more and more people will become programmers. We should not underestimate that. The bar will change, and it will expand access to the field.
Sundar, when we talked earlier this year, you mentioned that you would not mind if the pace of development in AI were a little slower. How are you feeling about the pace of progress now?
Pichai: I have two lenses I use. I am very optimistic about the potential. For example, if I step back and think that the breakthroughs here could help us make progress against cancer, I want it to move as fast as possible. Why wouldn’t you? But I think that as we get to more and more capable models, we need to take the time to make sure we put safety measures in place.
I think the pace is still in a place where it is exciting. But there will be moments where we feel that we could all take a breath and catch up. I think those will go hand in hand.
Hassabis: I agree. It has been a bit of a rocket ship ride for the entire field. I’ve been working on this for 20, 30 years, and for me it is fantastic to see all of this happening. Diseases will really be cured by AI-enabled technologies. New materials will help us with climate change. I think AI can be applied to help society. We are doing really practical, useful things, well beyond the games and other things we used to do so well.
But at the same time, I have always believed that this is one of the most transformative technologies humanity will ever invent. I think more and more people are coming around to that view. So we need to be really thoughtful and responsible, and get as far ahead as we can of the unintended consequences.