It is a bit annoying to hear an AI speak in an overly friendly tone and tell me to clean up the clutter on my workstation. I'm a little proud of that clutter, but I think it's time to stack the randomly scattered gadgets and tidy up the wire chaos.
My sister would agree, too. But the bigger picture is that an AI now “sees” my desk, recognizes the mess, and doles out housekeeping advice. Google's Gemini AI chatbot can do that now. And much more.
The secret sauce here is a recent feature update called Project Astra. It has been in development for years and finally launched at the beginning of this month. The overarching idea is to put an all-seeing, all-hearing, and overtly intelligent AI on your phone.
Google hawks these superpowers under a fairly uninspiring name: Gemini Live with camera and screen sharing. The feature was developed in the company's DeepMind unit and began life as a “universal AI assistant”. It is a shame that the final name is not as aspirational.
Nadeem Sarwar / Digital Trends
Let's start with the access situation. The feature is now available to Pixel 9 and Galaxy S25 users. However, if you have any other Android phone and a Gemini Advanced subscription, you can access the new toolkit as well.
Incidentally, that subscription costs $20 a month. I tried it on the two phones mentioned above and now also have it ready to roll on my OnePlus 13. The best part? You don't have to jump through technical hoops to access it.
The power/volume button or a screen-corner gesture to summon Gemini is all you need. It doesn't matter which app you are running.
Making sense of the world around you
I started by pointing the camera at a painting and asking about it. Gemini Live identified it as a Madhubani-style painting and explained its bold use of color and its depiction of animals.
It then gave me a short history lesson on the art form and the variations that have developed over the years. The information was accurate, down to the granular details. Fortunately, you can also opt for a text-based back-and-forth with Gemini if you are somewhere a voice conversation would be awkward.
What I like about Gemini Live's new camera and screen sharing avatar is that it is not overly talkative. You can interrupt it at any point, which only adds to the “natural” appeal of the conversations.
I tried Gemini in a variety of scenarios. I was not prepared for it.
The answers it delivers are usually concise, as if to give you a chance (or even a nudge) to ask a follow-up question instead of receiving an overwhelmingly long answer. It excels across a range of topics and visual scenarios, but there are some pitfalls.
It cannot use Google Lens yet, which means Gemini cannot match the images it sees on your phone's screen with the right results on the web. It also cannot access real-time information if you ask it to follow the latest developments on a topic or personality.
I asked it about plant species, restaurant listings, data captured from notice boards, and my medical prescription for a recent bout of flu. Gemini did pretty well, better than I have seen any AI chatbot perform so far.
Unlocking a knowledge bank
Next, I pushed Gemini to make sense of complex academic material. I put a book about machine learning in the camera frame. Gemini Live not only recognized it, but also gave me an overview of the book's content and its core topics.
Out of curiosity, I started leafing through the pages and landed on the list of chapters. The AI noticed the change, stopped talking, and asked whether I was now interested in a specific chapter, having seen the list of topics.
At that moment I was surprised.
I asked it to break down a few complex topics, and the AI did a respectable job, even going beyond the scope of the on-page material and drawing on its expansive knowledge bank.
For example, when I asked it about the content of the introductory page of Bhisham Sahni's pioneering novel Tamas, the AI correctly picked up the mention of the Sahitya Akademi Award. It then mentioned details that were not even listed on the page, such as the year the book won the prestigious literary honor and what the book is about.
On the other hand, Gemini's Hindi narration was terrible. It was not just the bad accent, but the fact that Gemini repeatedly uttered pure gibberish and non-words. It did a significantly better job reading Urdu, Persian, and Arabic, but often mixed up words from random lines.
On my first attempt with Urdu poetry, it not only recognized the Urdu text, but also gave a precise summary of the poem. The biggest challenge, once again, was the narration. Hearing an anglicized version of Urdu really hurt my ears.
Excels in surprising places
AI is a fantastic tool for problem solving, and there are numerous benchmarks to prove it. I tested it on physics problems dealing with thermodynamics, on electrochemical equations, and on statistics problems written out in a handwritten notebook. Gemini Live did a fantastic job at such tasks.
It also excelled at creative tasks. My sister, a fashion designer, held up one of her sketches in the camera view and asked for both feedback and improvements. Gemini Live began by praising the design, drew parallels with the design philosophy of a few fashion brands, and made a handful of recommendations.
Beyond that, the AI also advised my sister on the best tools for converting hand-drawn sketches into digital concepts. It followed up with helpful information about the software stack and where to find learning material.
When I put a few Duracell batteries in the camera view, it not only identified them correctly, but also told me which hyperlocal e-commerce platforms could deliver them to me within minutes.
The services it named, Blinkit and Swiggy Instamart, are available only in India and are mostly limited to urban areas. Even in a dimly lit room, it identified a pair of wired earphones on the first attempt.
Situational awareness is its strong suit.
Compared with the usual Gemini chat or what you find in the AI Overviews section of Google Search, Gemini Live conversations take a cautious approach to dispensing knowledge, especially when it is sensitive in nature. I noticed that topics such as dietary recommendations and medical treatment are handled with extra care, and users are often nudged to seek out the right expert resource.
A few familiar pitfalls
My overwhelming takeaway is that Gemini's “Project Astra” makeover is extremely impressive. It is a glimpse into the future of what smartphones can achieve. With a few improvements, integrations, and cross-app workflows, it could make the Google Search system feel like an outdated relic. But at the moment, there are a few glaring flaws.
On some occasions, I noticed that the memory system would go haywire. When I asked the AI to identify a fitness band in the camera view, it correctly recognized it as the Samsung Galaxy Fit 3. However, when I asked a follow-up question, it incorrectly referred to the device as a fitness band from Huawei.
It can apparently lie, too. And very confidently, I might add. For example, when I asked it to summarize my review of the wearable, the AI replied that Digital Trends had not reviewed it yet. In reality, the article was published a week ago.
Next, I enabled screen sharing and asked it to go through a few articles on my author page. Gemini did a decent job of explaining the stories, but occasionally stumbled on contextual understanding. For example, it said that only Intel and AMD make NPUs that qualify for the Copilot+ badge.
The article, on the other hand, clearly mentions that Qualcomm was the first to meet these criteria, ahead of the competition, and that it was only at the end of last year that AMD and Intel finally met this AI chip baseline with a new portfolio of processors.
In the middle of a conversation about one article, it ran into a memory problem again. Instead of summarizing the story under discussion, it went back to talking about the first article it had seen through screen sharing. When I interrupted it mid-narration, Gemini corrected its mistake.
Another issue I noticed with the narration of non-English languages is that Gemini Live's delivery changed at random halfway through. It was quite jarring, and the pronunciation was utterly mechanical, a far cry from its human-like English conversation skills.
The machine vision also visibly struggles with stylized fonts. On some occasions, it confidently spat out wrong information, and when asked to correct itself, the AI made no mention of looking up the latest information on the topic. These scenarios are rare, but Gemini's errors are here to stay.
To sum it all up, I think Gemini Live with camera and screen sharing is one of the biggest leaps that AI has made so far. It is one of the most practically rewarding implementations of generative AI. All it needs is a dose of versatility and a fix for the “confident liar” syndrome.
Things are now definitely, for the most part, on the right track, but it is still a few crucial milestones away from being the perfect AI companion of techno-futuristic dreams.