Category: Artificial intelligence

  • Meta Launches Movie Gen AI for Quick Video and Music Creation

    Meta has introduced Movie Gen, an advanced AI that can produce and edit videos while incorporating music and sound effects based on text prompts. Meta says its video and audio generation offers features and realism beyond those of comparable AI models currently available.

    AI Specifications

    Movie Gen is built on a 30-billion-parameter AI model that can create 16-second HD clips from text prompts. It was pre-trained on one billion images and 100 million videos, selected from a much larger dataset to improve training quality. On the audio side, Movie Gen Audio uses a 13-billion-parameter model designed to generate 48 kHz sound effects and music from text prompts; it was pre-trained on one million hours of audio. The AI was then refined with human feedback and with high-quality audio and video samples.

    Realistic Video Generation

    When provided with a photo of a person and a description of that person in a specific scene, the AI can produce a realistic video featuring an animated actor in that environment. It understands 22 different camera motions and positions, such as wide angle, tilt up, and truck left, allowing filmmakers to set virtual camera placement and movement much as they would on a real shoot. For filmmakers who prefer traditional methods, high-end mirrorless cameras like the Nikon Z6III, available on Amazon, can still be used. Movie Gen can also edit videos in a way that is both precise and realistic, which is something other AIs currently struggle to achieve.

    Audio Integration and Limitations

    Additionally, text prompts can be used to incorporate professional-quality audio into the video clips, featuring sound effects and music scores. While the AI can generate music lasting several minutes, it is restricted to 16-second video clips due to the significant computing power required. The audio is synchronized with the scene’s beats and can produce off-screen sounds, like birds chirping in a forest, based on the scene’s context.

    Meta is actively working on implementing safeguards for Movie Gen and plans to launch the AI once it is assured of its safety.


  • Google Lens Introduces Video and Voice Search Features

    Google introduced new voice and video search features for Google Lens at its I/O 2024 event in May. Users can now long-press and ask questions using their voice, which makes searching simpler and more convenient.

    Enhanced Interaction with Google Lens

    Once Lens starts capturing video, users can ask questions about what they see. For instance, when asked, “Why are they swimming together?”, Lens responded with an answer generated by Google Gemini. Video search lets users point their phone at moving objects and ask about them, which makes Google Lens useful in more situations. To access this feature, users can join the “AI Overviews and more” experiment within Search Labs.

    Custom Gemini Model Powers Video Search

    Rajan Patel, Google’s vice president of engineering, explained how the feature operates. Google captures the video as a series of image frames and applies the existing computer vision techniques used in Lens. The responses themselves are generated by a custom Gemini model designed to interpret multiple frames in sequence. Once the frames are processed, the model pulls relevant information from the web to formulate an answer.

    In conclusion, this development effectively utilizes existing technology, adding significant value to Google Lens.


  • OneUI 7 to Feature Useful Apple Intelligence Tool

    All the latest events showcasing high-end smartphones have focused on AI, and this trend is set to persist. Samsung’s OneUI 6.1 already includes numerous useful AI tools, and a leak hints that the next version, OneUI 7, may introduce a feature akin to Apple’s AI search in its Gallery app.

    AI Search Feature in OneUI 7’s Gallery App

    The AI capabilities will allow users to search their photo collections more effectively. Instead of scrolling through countless screenshots to find a specific image, users will be able to simply search for it. This not only streamlines the process but also improves the overall user experience.

    The information comes from the credible source ICE Universe, who also mentioned that the Gallery app in OneUI 7 will receive further enhancements. However, there were no details shared about other potential updates.

    Current AI Features in Samsung’s Galaxy Devices

    At present, some of the Galaxy AI tools include Circle to Search, various note-taking and summarization options, as well as translation and transcription capabilities that function with third-party applications like WhatsApp, among others.

    Timeline for OneUI 7 Release

    As for when OneUI 7 will be rolled out, there are no set dates as of yet. Samsung generally unveils its S series flagship models alongside a significant update. Therefore, it’s likely that the Galaxy S25 Ultra will be the first device to showcase OneUI 7, expected to launch in early 2025, potentially in January.

    In addition to AI advancements, the flagship model will feature a significant redesign. Previous rumors have indicated that Samsung is moving towards rounded corners for the S25 Ultra, which should make it feel much more comfortable in hand.

  • SenseRobot Launches AI Chess Robot with 2,900 Elo for Kids

    SenseRobot has introduced the SenseRobot Chess, an AI robotic chess coach designed to help children improve their chess strategies. The robot offers a wide range of difficulty levels, from 200 to 2,900 Elo, making it suitable for both novices and seasoned players. In its first match, it outplayed Hou Yifan, the world’s top active female grandmaster, who holds a standard Elo rating of 2,633.
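
    To put the ratings above in perspective, the standard Elo expected-score formula can be sketched as follows. This is only an illustration: it assumes the robot plays at its top 2,900 setting, which the announcement does not confirm for the match against Hou Yifan, and the function name is our own.

```python
def expected_score(rating_a, rating_b):
    """Expected score (win = 1, draw = 0.5, loss = 0) of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


ROBOT_TOP_LEVEL = 2900  # highest difficulty level quoted for the robot
HOU_YIFAN = 2633        # standard rating quoted in the article

# Assumption: the robot is set to its strongest level.
robot_expectation = expected_score(ROBOT_TOP_LEVEL, HOU_YIFAN)
print(f"Expected score for the robot: {robot_expectation:.2f}")  # 0.82
```

    Under that assumption, a 267-point gap means the higher-rated side is expected to score roughly 0.82 points per game on average.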

    Advanced Features

    The SenseRobot Chess features automatic player log-in through facial recognition, enabling it to remember player settings. It is equipped with a camera that recognizes chess pieces in 3D and has a unique three-fingered claw for moving them. This robotic coach is capable of managing over 145 endgame scenarios and offers 2,000 training exercises, along with the ability to reset pieces for new games. This design minimizes the hassle of moving pieces around when practicing solo. Additionally, it is built to be pinch-free, ensuring safety for younger users, unlike the Russian Konstantin Kosteniuk chess robot, which has been known to injure fingers.

    Learning and Playing

    While playing, the robotic coach gives verbal feedback and advice on moves, aiding children in their chess learning process as it plays against them. It includes a collection of a hundred games played by chess masters to assist children in cultivating advanced strategies and tactics. Furthermore, the robot supports remote chess gaming through Lichess, allowing players to connect globally. All games played can be recorded and shared for later review.

    The SenseRobot Chess can be purchased on JD.com for an MSRP of 4,799 yuan (~$680). Individuals who are unable to import it from China might consider alternative options, like an AI-powered talking chess board available on Amazon. For those curious about the evolution of chess computers, there are resources discussing the first computer that defeated a grandmaster, which can be found in a book on Amazon.

    SenseRobot, SenseRobot press release


  • Should Nvidia Be Concerned About Huawei’s Rising AI Chips?

    Huawei is currently testing its new AI chip, the Ascend 910C, with potential clients in China. This chip is designed to serve as a robust alternative to Nvidia’s top-tier GPUs, particularly following US restrictions that have limited Nvidia’s sales in China. Samples of the Ascend 910C have been provided to major server companies in China for testing and hardware setup.

    Upgraded Technology

    The Ascend 910C is an enhanced version of Huawei’s Ascend 910B chip, which has already been utilized in various sectors within China as a substitute for Nvidia’s A100 chip, particularly in AI training applications.

    Consequences of US Sanctions on Nvidia

    Since August 2022, US sanctions have barred Nvidia from selling its A100 and H100 GPUs to China. In response, Nvidia created modified versions, including the A800 and H800; however, these too faced additional export restrictions in 2023. Despite these challenges, Nvidia continues to be a significant player in China’s AI market, introducing new products such as the H20, L20, and L2 GPUs. The H20 chip is anticipated to generate substantial revenue in China, with expected sales reaching US$12 billion in 2024, despite previous low demand.

    Huawei’s Expanding Role in China

    The US sanctions imposed on Nvidia have opened doors for Huawei to enhance its AI infrastructure and computing capabilities in China. Eric Xu Zhijun, Huawei’s rotating chairman, highlighted that the company has established two computing divisions over the past five years to bolster the domestic AI sector. This strategic move has positioned Huawei as a formidable competitor in the AI chip industry.

    While Huawei’s AI chips, including the Ascend 910C, show significant promise, the company does encounter challenges. Huawei generally packages its AI chips with additional services, such as network and storage solutions, which might dissuade some potential clients. Moreover, many of Huawei’s AI chips currently in use are still the older 910B models.

    As the competition between Huawei and Nvidia escalates, Huawei’s ongoing advancements in AI technology may enable it to become a pivotal player in China’s AI chip market, especially as it strives for greater self-sufficiency in semiconductor manufacturing.

  • Aescape Expands AI Massage Robots to Miami Locations

    Aescape has broadened the deployment of its AI massage therapist robots to the Kimpton EPIC Hotel in Miami. This innovative robot offers tailored body massages without needing a human therapist or operator. This expansion follows the robot’s introduction earlier this year in Equinox clubs located in New York.

    How the Robot Works

    The Aescape massage robot employs two robotic arms that are strategically positioned next to and above the massage bed to provide full body massages. A camera scans the customer’s body at 1.2 million points to accurately identify the location of muscles and body tissues. To ensure privacy during the massage, customers don Aerwear body suits that resemble yoga attire. These suits also help reduce skin friction with the robotic massager.

    Features of the Massage Experience

    Instead of using hands, the massage robot utilizes Aerpoint surfaces to perform massages. Each Aerpoint is designed with seven distinct surface shapes, all heated to 95 °F (35 °C) for precise pressure application during the massage. Customers can choose from a range of massage programs, and additional options will be added in the future. Sessions can be personalized from 15 to 120 minutes by modifying the intensity, pressure, and specific areas of focus. Furthermore, the ambient music, lighting, and components of the massage table, including the armrest, bolster, and headrest, can be customized. All preferences are saved for easy access during subsequent visits.

    Installation Requirements and Pricing

    The Aescape setup requires a room measuring 8 by 10 feet, a 120V 5A power supply, and a 2MB/s Internet connection. The machine reportedly rents for $84,000 annually, which works out to approximately $230 per day, so operators could break even with about two appointments per day. Aescape offers an ROI calculator to help estimate the cost of ownership. At Kimpton, the pricing for a 15-minute session starts at $40, while a 60-minute session begins at $140. For readers who may not have access to an Aescape AI robotic massage therapist nearby, a heated chair massager pad like the one available on Amazon can be an alternative option.
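
    The break-even figures above can be checked with a quick calculation. The sketch below uses only the rental cost and Kimpton starting prices quoted in the article; it assumes the robot is bookable 365 days a year and ignores every other cost (room, staff, consumables), so it is a rough lower bound and not a substitute for Aescape’s ROI calculator.

```python
import math

ANNUAL_RENTAL_USD = 84_000  # rental cost quoted in the article
PRICE_15_MIN_USD = 40       # Kimpton starting price, 15-minute session
PRICE_60_MIN_USD = 140      # Kimpton starting price, 60-minute session
DAYS_OPEN_PER_YEAR = 365    # assumption: available for booking every day

# Rental cost that has to be covered each day.
daily_cost = ANNUAL_RENTAL_USD / DAYS_OPEN_PER_YEAR


def sessions_to_break_even(price_per_session):
    """Smallest whole number of daily sessions that covers the daily rental cost."""
    return math.ceil(daily_cost / price_per_session)


print(f"Daily rental cost: ${daily_cost:.2f}")           # $230.14
print(sessions_to_break_even(PRICE_60_MIN_USD))          # 2
print(sessions_to_break_even(PRICE_15_MIN_USD))          # 6
```

    At the 60-minute starting price, two sessions a day cover the rental, which matches the figure cited above; at the 15-minute price it takes six.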


  • Open NotebookLM: Convert PDFs to Podcasts with Open Source

    For those who are new to Google’s AI project, NotebookLM is a research assistant platform that lets users upload documents. It uses Gemini 1.5 Pro to work with the information extracted from these documents, with an emphasis on note-taking. NotebookLM summarizes all uploaded documents in the user’s notebook and lets users ask questions about the content. After processing the data, NotebookLM provides answers along with relevant citations from the uploaded files. One of its standout features is the ability to create podcasts based on the uploaded documents. The podcasts, generated by Gemini, feature AI-curated information and consist of audio discussions between two speakers about the topics found in the materials, with segments lasting between five and thirty minutes. However, some users might hesitate to upload their content to a proprietary large language model (LLM), which is where Open NotebookLM presents a different option.

    A User-Friendly Alternative

    Open NotebookLM offers a simple and user-friendly interface, built on various open-source and text-to-speech technologies to convert PDFs into podcasts. For PDF processing, it employs Llama 3.1, with a limit of 100,000 characters. Although the results may not match Gemini’s capabilities, MeloTTS delivers reliable text-to-speech performance, and users can set the AI’s tone to either "fun" or "formal." Open NotebookLM also supports just over ten languages, including Spanish, French, and German. Users can currently try the project on Chua’s Hugging Face page or run it locally using the resources provided in the project’s GitHub repository.

    Accessing the Project

    Gabriel Chua can be found on both Hugging Face and GitHub, where users can explore the Open NotebookLM project further.

  • Test Google’s Gemini Nano AI Model on Pixel 9 Devices Now

    Gemini Nano is a specialized, lightweight variant in the wider Gemini family of AI models. Although it belongs to the same lineup as the larger models, the Nano version focuses specifically on on-device AI functions. It aims to improve efficiency and privacy by processing data directly on the user’s device.

    Availability for Experimentation

    Google has launched Gemini Nano for testing on Android devices through its AI Edge SDK via AICore. This on-device AI model is said to be optimized for tasks such as text generation and rephrasing. With this release, developers can explore AI capabilities without relying heavily on cloud services. However, access is initially limited to Pixel 9 series devices.

    Use Cases and Features

    Gemini Nano is best suited to text-centric AI applications like summarization, proofreading, and generating smart replies. The AI Edge SDK lets developers adjust settings such as temperature, top-K sampling, and output length to refine the model’s outputs. Top-K sampling restricts the candidates for the next word to the K most likely options, striking a balance between coherence and randomness. This keeps generated text relevant and less repetitive while still allowing variability in responses. Because the model runs on the device, it also reduces the need for powerful servers.
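
    For readers unfamiliar with the technique, here is a minimal, generic sketch of top-K sampling with a temperature setting. It is not the AI Edge SDK API: the function name, parameters, vocabulary, and scores are all invented for illustration, and a real model would supply the scores (logits) for a much larger vocabulary.

```python
import math
import random


def top_k_sample(logits, k, temperature=1.0, rng=None):
    """Sample one token index from the k highest-scoring candidates."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Keep only the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature-scaled softmax weights over the surviving candidates.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]

    # Draw one candidate in proportion to its weight.
    return rng.choices(top, weights=weights, k=1)[0]


# Toy vocabulary and scores, invented for illustration.
vocab = ["the", "a", "cat", "dog", "zebra"]
logits = [2.0, 1.5, 0.3, 0.2, -3.0]

picked = top_k_sample(logits, k=2, temperature=0.8, rng=random.Random(0))
print(vocab[picked])  # always "the" or "a"; lower-ranked words are excluded
```

    A smaller K or a lower temperature makes the output more predictable, while larger values allow more varied text.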

    Performance Enhancements

    In terms of performance, Gemini Nano boasts significant improvements over its predecessor. According to the company, the model’s accuracy in tasks like paraphrasing and solving math problems has risen to 90% and 23%, respectively. Google is currently providing experimental access to developers, enabling them to incorporate these features into their applications.

    Developers interested in getting started can consult the SDK’s documentation, which offers a detailed guide on implementing Gemini Nano in mobile applications.

    Android Developers Blog via @AndroidDev on X/Twitter


  • Huawei Chips to Power New AI Model for TikTok’s Parent Company

    ByteDance is reportedly developing a new AI model with significant support from Huawei, which may provide the necessary hardware. The parent company of TikTok intends to utilize Huawei’s resources for training and advancing this AI initiative. Here’s what we have gathered so far.

    ByteDance Seeks Huawei’s Assistance for AI

    As per a report from Reuters, ByteDance faces challenges due to US export restrictions that hinder its ability to acquire NVIDIA chips. Initially, ByteDance’s AI project utilized NVIDIA’s H20 AI chips tailored for the Chinese market to sidestep US government restrictions. For context, the US carefully regulates which AI chips can be sold to Chinese companies to slow down the technological advancement in China.

    To navigate these obstacles, TikTok’s parent company is now looking to Huawei for assistance in training and developing its AI model. This new AI model will utilize chips from Huawei instead of NVIDIA’s offerings. This year, ByteDance has reportedly ordered 100,000 Ascend 910B chips from Huawei, but to date, it has only received about 30,000 of those units. Notably, Huawei’s Ascend 910B chips are said to outperform NVIDIA’s A100 chips in terms of GPU performance and energy efficiency.

    Challenges in Chip Supply and Future Outlook

    Despite these advantages, the development of the AI model has been hindered by chip shortages. With US restrictions limiting its access to NVIDIA chips, ByteDance’s turn to Huawei suggests a strategic move to lessen its dependence on US technology. However, this information is based on unverified reports, so it should be viewed cautiously for now. Earlier this year, ByteDance also introduced Coze, a platform akin to OpenAI’s custom GPTs that lets users without coding expertise create and deploy AI chatbots.

  • PlayStation Boosts AI Use to Speed Up Game Development and Cut Costs

    In the latest Corporate Report 2024, released on September 13, 2024, Sony outlines its vision for an entertainment company that aligns with societal changes and advancements in technology. The report highlights the significant roles that artificial intelligence (AI) and machine learning (ML) will play, particularly in the gaming industry.

    Advancements in Game Development

    Sony aims to incorporate technologies like real-time 3D processing and sensor technology into its game development processes. The objective is to accelerate production times and lower expenses while maintaining high standards of quality. An example mentioned in the report is the making of Marvel’s Spider-Man 2, where speech recognition software was utilized to automatically sync subtitles with character dialogues. This innovation considerably shortens the time required for development.

    Innovative Capture Techniques

    Additionally, Sony is investing in the Volumetric Capture Studio, which is designed to collect 3D data of people and settings to create images and 3D assets. This captured data will be utilized throughout the company and may also be offered for sale externally. In partnership with Epic Games, Sony is exploring the Unreal Engine to repurpose 3D elements from music video production for use in video games.

    Ethical Considerations in AI Use

    The application of AI in game development presents significant possibilities but also raises important ethical questions. Major concerns include the potential displacement of jobs and worries that AI might hinder the creative process for developers. Proponents argue that AI can help eliminate monotonous tasks, giving developers more time to concentrate on creative work. A conversation about Sony’s increased reliance on AI has already begun on Reddit.

    Image source: PlayStation, Frank_Reppold / Pixabay