Category: Artificial intelligence

  • Stable Diffusion Models for Local Hosting on Google Pixel 10 Pro

    Kamila Wojciechowska says she has found multiple clues regarding upcoming AI features that Google intends to roll out on various Pixel devices in the coming years. She references a source from Google’s gChips division for all details concerning unreleased Tensor chipsets.

    New AI Features in Development

    Wojciechowska highlights that Google is working on Video Generative ML, which is anticipated to implement AI algorithms in Google Photos and potentially YouTube Shorts, enhancing video editing capabilities. Additionally, she mentions features like ‘Speak to Tweak’ and ‘Sketch-to-Image’, the latter already seen in some recent Samsung Galaxy models.

    Innovations in Pixel Devices

Moreover, a new ‘NanoTPU’ could be introduced with the Pixel 11 series, which would assist in monitoring sleep patterns, detecting falls, and managing sleep apnea. Google’s Stable Diffusion-based image-generation models are also expected to run locally on the Pixel 10 and later, as opposed to the cloud-based system currently in use.

    Timeline for Release

    It is said that the new image signal processors (ISPs) and Tensor Processing Units (TPUs) found in the Tensor G5 and Tensor G6 are responsible for these AI-driven innovations. Reports suggest that Google is unlikely to unveil any of these features until the Pixel 10, which is expected to arrive in summer 2025. However, it’s possible that the launch could be pushed back a year to align with the rollout of Android 17 and the Pixel 11 series.

    Sources: Android Authority & Kamila Wojciechowska

  • Google Developing AI Agent to Control Web Browsers

According to a report from The Information, Google is developing an AI tool that can control a web browser to automate tedious tasks such as filling out forms or booking flights.

    Project Jarvis Unveiled

This AI agent, known as Project Jarvis, is set to launch alongside the upcoming Gemini AI model, which might be released in December of this year. The name "Jarvis" stands for "Just A Rather Very Intelligent System," after the fictional AI assistant to Tony Stark in the Marvel films.

    Features of the AI Agent

    Google plans to restrict the agent’s functionality to browsers like Chrome. It will assist users with activities such as booking cinema tickets or buying goods online. People will have the ability to interact with the agent directly and give commands for various tasks.

    If this sounds a bit like something you’ve heard before, it’s because it bears a resemblance to Anthropic’s recent Claude 3.5 Sonnet, which enables app developers to "guide Claude to operate computers like humans do". OpenAI is also believed to be creating similar solutions.

The Information, Anthropic, Reuters

  • Survey Reveals What Smartphone Users Want Beyond AI

In the past few years, top smartphone makers have focused heavily on building artificial intelligence into their devices, Apple Intelligence being a clear example. Google’s Pixel phones stand out with the Gemini AI platform, which offers a language model supporting natural conversations, real-time translation, text generation, and a range of AI-driven photography tools. Samsung and Xiaomi have entered the competition with their own AI systems, named Galaxy AI and MiLM respectively. Despite these innovations, many consumers appear uninterested in AI features, suggesting they are not a priority for most buyers.

    Consumer Preferences

A large survey conducted by CNET asked 2,484 Americans which features they consider most important when buying a smartphone. 61% preferred a big battery, 46% looked for sufficient storage space, and 38% focused on camera quality. In contrast, only 18% said AI features influenced their purchasing decisions, and nearly half of the respondents said they would not pay more for a smartphone with AI capabilities.
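To put the percentages in perspective, here is a minimal Python sketch (our own back-of-envelope illustration, not part of CNET’s survey) that converts the reported shares into approximate headcounts out of the 2,484 respondents:

```python
# Approximate respondent counts implied by the reported survey shares.
# The percentages come from the article; rounding to whole respondents
# is our own illustration.
SAMPLE_SIZE = 2484

priorities = {
    "big battery": 0.61,
    "sufficient storage": 0.46,
    "camera quality": 0.38,
    "AI features": 0.18,
    "data-privacy concerns": 0.34,
}

def respondents(share: float, total: int = SAMPLE_SIZE) -> int:
    """Convert a reported share into an approximate headcount."""
    return round(share * total)

for label, share in priorities.items():
    print(f"{label}: ~{respondents(share)} of {SAMPLE_SIZE} respondents")
```

The gap is stark in absolute terms: roughly 1,500 respondents prioritized battery life, while only around 450 cared about AI features.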

    The Limits of AI Interest

    While artificial intelligence can be beneficial in daily life, the general interest seems low. One reason for this could be worries over personal data security, as 34% of participants expressed concerns in this area. Additionally, some critics claim that AI is primarily a marketing gimmick, used to create a sense of advancement without real hardware improvements. Although this critique wasn’t directly addressed in the survey, it might contribute to the overall lack of enthusiasm for AI features in smartphones.

    CNET

    Image source: Tim Dougles/Pexels

• Google Gemini 1.5 Successor Expected in December 2024

    Announced in early December 2023, Google’s Gemini lineup of multimodal large language models includes Gemini Ultra, Gemini Pro, Gemini Flash, and Gemini Nano. This family powers the chatbot of the same name, which was previously known as Google Bard. It started its journey in early February, rolling out to 10,000 testers for less than a month of trials before its general launch. Gemini faces stiff competition from ChatGPT, Copilot, and other AI systems. If everything goes according to plan, we can expect version 2.0 to be released this December.

    Current Versions and Updates

At the moment, the stable releases for Google Gemini are model version 1.5, dated May 14th, and Android app version 1.0 (build 668480831). The app launched less than two months ago, on August 29th, and received several language updates on October 1st while keeping the same build number. In June, Google introduced Gemma, a free and open-source series of LLMs based on Gemini. This lighter line is already at version 2, though that does not mean it derives from Gemini 2.0. The most recent Gemini model updates, Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, came out on September 24th.

    The Future of Gemini

    Unfortunately, details about Gemini 2.0 are still unclear. However, sources close to Google’s operations have disclosed to The Verge that this new model is on track to be released in December, coinciding with OpenAI’s anticipated launch of its next major AI model. But since Sam Altman has recently refuted this rumor, Google might seize the opportunity to launch its next-generation AI at least a few weeks ahead of its main competitor.

When it comes to features and capabilities, many speculate that Google Gemini 2.0 will advance toward reasoning skills that match or even surpass those of humans. The Android app will presumably be updated to support the new model’s capabilities.

  • Google’s Gemini Boosts Android AI with Lock Screen Control

    Google is enhancing Gemini’s features on Android, making it possible for the AI assistant to manage calls and messages even when your phone is locked. Recent code discoveries in a forthcoming update indicate that Gemini will soon incorporate functions that were previously exclusive to Gemini Live.

    New Lock Screen Functionality

    A fresh settings option will enable users to activate Gemini’s functionality from the lock screen. However, for safety, you’ll still have to unlock your device if the response involves sensitive information, such as details from Gmail.

    User Interface Improvements

    In addition to these new messaging capabilities, the update will also refresh the AI assistant’s appearance. The floating text overlay will now adapt vertically to accommodate longer messages more effectively, and Gemini Extensions will be organized into various categories like Communication, Device Control, Travel, Media, and Productivity.

    Simplified Commands and Increased Usability

    Moreover, the command examples are being reduced from three to just one in each category, making it easier to use. This aligns with Google’s wider strategy to establish Gemini as the top AI assistant on Android.

    The new lock screen features are particularly useful for hands-free situations, such as when you’re driving or engaged in activities that demand your full attention.

  • Sam Altman Refutes December OpenAI Model Release Claims

    OpenAI is gearing up to unveil a new AI model known as "Orion" in December, according to a recent report from The Verge. It suggests that the model will first be available to some of OpenAI’s close partners, with Microsoft set to host Orion on its Azure cloud platform starting in November.

    Details on the New Model

    The report highlights that OpenAI considers Orion to be the next step after GPT-4, although it’s not certain if the official name will be GPT-5 when it launches. Both OpenAI and Microsoft have chosen not to comment on this initial report, leaving many details about the new model under wraps. Back in September, Shaun Ralston from OpenAI shared a graph on X that illustrated the advancements made by the models since the release of GPT-3.

    Insights from Shaun Ralston

    In his post, Ralston mentioned a "GPT-Next" model expected to be released this year. He noted that this model was trained on a "compact Strawberry (OpenAI o1) version" and boasts a staggering 100 times more "computational volume" compared to GPT-4. Notably, Orion was also referenced in his post but as an independent model that was trained on "10K (Nvidia) H100 GPUs".

    Conclusion

As of now, details remain scant regarding Orion’s capabilities and features, and Sam Altman has publicly pushed back on the December release claims. Both the AI community and industry watchers are eager to learn more in the weeks ahead.

  • Reliance and Nvidia Join Forces for AI Infrastructure in India

Nvidia has reaffirmed its commitment, first announced in September 2023, to building AI infrastructure in India. The company has teamed up with Reliance Industries, a major conglomerate based in Mumbai.

    A Great Moment for India

    During Nvidia’s AI Summit in Mumbai, Mukesh Ambani, the Chairman of Reliance, stated, “This presents a fantastic chance for India” to leverage its “large pool of computer engineers.” Together, the two firms will work towards creating a scalable power infrastructure, which will have a capacity of 1 gigawatt and utilize green energy sources.

    Building AI Infrastructure

“To lead in artificial intelligence, it’s crucial to have AI tech that India possesses, data, and finally, an AI infrastructure,” Nvidia CEO Jensen Huang noted, as reported by Mint. He announced the partnership between Reliance and Nvidia to construct this AI infrastructure in India.

    Previous Collaborations

    In the previous year, the two companies had pledged to develop AI supercomputers in India, aiming to create extensive LLMs that are trained in local languages. Nvidia will supply the necessary technology while Reliance will oversee the infrastructure’s maintenance. In addition to Reliance, Nvidia has also disclosed collaborations with several Indian IT companies, such as Tata Consultancy Services (TCS), Tech Mahindra, Infosys, and Wipro.

    Nvidia, Mint

  • Apple Critiques AI Photos: iPhone Should Capture Reality, Not Fantasy

    Apple’s software chief, Craig Federighi, recently shared insights with the Wall Street Journal about Apple Intelligence and the upcoming AI features that will be launched next week for users in the US with the release of iOS 18.1. European users can expect to see these features at a later time. Initially, Apple Intelligence will provide just a handful of features, utilizing the GPT-4o AI model, the same one that powers ChatGPT, for some functionalities.

    Limited Features in iOS 18.1

This cautious approach towards AI, particularly in image processing, appears to be a deliberate choice. In the iOS 18.1 update, Apple introduces just one AI capability in the Photos app, called "Clean Up." The feature lets users erase unwanted items from a photo with a simple tap, much like Google’s Magic Eraser has offered for some time. Federighi mentioned that there were extensive internal debates at Apple about whether "Clean Up" might go too far, since removing objects can mean a photo no longer accurately represents reality.

    Comparison with Competitors

    In contrast, Google and Samsung are pushing the boundaries of AI in image editing much more aggressively. Google’s Magic Editor not only has the ability to eliminate objects but can also insert new elements, zoom in on subjects, rearrange them, or even replace the sky to alter the image’s atmosphere. Federighi voiced his worries that such capabilities may lead people to see pictures less as truthful representations and more as imaginative creations. As a result, differentiating between authentic photography and AI-generated images could become increasingly challenging in the future.

    Addressing Authenticity in Photography

    Adobe has proposed a potential answer with its Content Credentials, a system designed to confirm the authenticity of photos and track their editing history. However, the limitation is that only images taken with cameras compatible with this platform are eligible for verification, including models like the Leica M11-P, Sony A1, A7S III, and A9, as well as the Nikon Z6 III. Some of these camera models will receive support only after a future firmware upgrade.

  • Nvidia Collaborates with Indian IT Firms for AI Development Boost

    Nvidia is teaming up with Indian IT companies like Infosys, TCS, Tech Mahindra, and Wipro to boost AI development in India. The firm believes this "great upskill" initiative will pave the way for "a new wave of opportunity."

    New Job Creation and Training

The goal is to generate fresh job opportunities while preparing the next generation of AI developers. Tech Mahindra has set up a Center of Excellence (CoE) that will leverage Nvidia’s AI enterprise solutions to embed generative AI into business applications. It is working on Project Indus 2.0, a specialized AI model aimed at understanding Hindi and its various dialects. The model utilizes Nvidia’s Nemotron-4-Mini-Hindi-4B, a four-billion-parameter small language model trained with "real-world Hindi data, synthetic Hindi data and an equal amount of English data."

    Expanding AI Solutions

    In addition to Tech Mahindra’s efforts, Tata Consultancy Services (TCS) has rolled out enterprise solutions targeting the automotive, manufacturing, telecommunications, financial, and retail sectors. They have successfully trained "50,000+ AI associates" to assist clients in enhancing their skills and executing AI strategies.

    Wipro has developed a generative AI studio that employs Nvidia AI enterprise solutions to speed up applications in supply chains, user agents, retail, and beyond. Meanwhile, Infosys Topaz is designed to aid companies in weaving generative AI into their daily operations. Companies like Reliance and Ola Electric have also shared their plans to utilize Nvidia’s Omniverse simulation technology for testing factory layouts prior to their implementation.

    A Promising Future for AI in India

    According to a report by Reuters, during his speech at the Nvidia AI Summit currently happening in Mumbai, CEO Jensen Huang mentioned that by the end of this year, India is expected to possess "20 times more compute power than just a little over a year ago."

Nvidia, Tech Mahindra, Project Indus, Tata Consultancy Services, Wipro, Reuters

  • New MvACon AI Enhances Self-Driving Car Perception Accuracy

    Researchers at North Carolina State University have come up with a fresh method to assist self-driving cars in understanding their surroundings more effectively. This innovative system, called Multi-View Attentive Contextualization (MvACon), tackles some of the usual problems seen in existing vision transformer AI models that are designed to detect objects in 3D from various perspectives.

    Enhanced Detection Performance

    To evaluate its effectiveness, the team conducted multiple experiments using the nuScenes dataset, which is well-known in the realm of autonomous driving. MvACon significantly improved detection accuracy across different leading vision systems. When integrated with the BEVFormer system, it demonstrated noticeable advancements in identifying object locations, predicting their orientations, and estimating their speeds.

    Local Object-Context Awareness

    The researchers discovered that the attention mechanism of MvACon, which concentrates on clusters, keeps the detection precise for both vehicles and surrounding structures. They refer to this as a "local object-context aware coordinate system," suggesting that the system gains an enhanced understanding of spatial relationships, which is crucial for effectively tracking movement and orientation.
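MvACon’s actual implementation is not described in this summary. As a loose toy illustration of the general idea, attending to a small set of cluster centers rather than every spatial location, here is a minimal scaled dot-product attention over cluster centroids in NumPy (all shapes, names, and sizes are our own assumptions, not the researchers’ code):

```python
import numpy as np

def cluster_attention(queries, centroids, values):
    """Toy attention where each query attends to a few cluster
    centers instead of every spatial location.

    queries:   (n_q, d)  per-object query features
    centroids: (n_c, d)  cluster-center keys (n_c is much smaller
                         than the number of image locations)
    values:    (n_c, d)  features summarized at each cluster
    """
    d = queries.shape[-1]
    scores = queries @ centroids.T / np.sqrt(d)     # (n_q, n_c)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over clusters
    return weights @ values                         # (n_q, d)

rng = np.random.default_rng(0)
q = rng.normal(size=(5, 16))    # 5 object queries
c = rng.normal(size=(8, 16))    # 8 cluster centers
v = rng.normal(size=(8, 16))
out = cluster_attention(q, c, v)
print(out.shape)
```

The design point the sketch highlights is cost: softmax attention over 8 cluster centers is far cheaper than attention over thousands of pixel locations, which is consistent with the claim that the method adds no hardware requirements.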

    Compatibility and Versatility

    A standout feature of this technology is its ease of integration into existing autonomous vehicle vision systems without requiring additional hardware. Regardless of the configuration, it consistently enhances performance, making it a versatile tool for various implementations.

    Testing results indicate that the system operates well even in complex situations with numerous overlapping objects.