Tag: AI model

  • Xiaomi Unveils MiMo-7B: First Open-Source LLM for Coding and Reasoning

    Key Takeaways

    1. Xiaomi has launched its first open-source AI system, MiMo-7B, designed for complex reasoning tasks and excelling in mathematics and code generation.
    2. MiMo-7B has 7 billion parameters and competes effectively with larger models from OpenAI and Alibaba, especially in mathematical reasoning and coding contests.
    3. The model’s pre-training drew on a dataset of 200 billion reasoning tokens and used a multi-token prediction objective, which Xiaomi says improves performance and reduces inference times.
    4. Post-training enhancements include unique algorithms for reinforcement learning and infrastructure improvements that significantly boost training and validation speeds.
    5. MiMo-7B is available in four public variants, with notable performance benchmarks in mathematics and coding, and it can be accessed on Hugging Face and GitHub.


    Xiaomi has quietly entered the large language model arena with MiMo-7B, its first open-source AI system released to the public. Created by the recently formed Big Model Core Team, MiMo-7B is designed for complex reasoning tasks, and Xiaomi says it outperforms models from rivals such as OpenAI and Alibaba in mathematical reasoning and code generation.

    Model Specifications

    As its name indicates, MiMo-7B has 7 billion parameters. Although it is much smaller than many leading LLMs, Xiaomi asserts that it competes on equal terms with larger reasoning models such as OpenAI’s o1-mini and Alibaba’s QwQ-32B-Preview.

    According to Xiaomi, MiMo-7B surpasses both of those models in mathematical reasoning (AIME 24-25) and code contests (LiveCodeBench v5).

    Training Details

    The foundation of MiMo-7B is a rigorous pre-training schedule. Xiaomi claims to have built a dataset of 200 billion reasoning tokens and to have fed the model a total of 25 trillion tokens across three training phases.

    In addition to conventional next-token prediction, the company adopted a multi-token prediction objective, which it says reduces inference times without compromising output quality.

    Post-Training Enhancements

    The post-training phase combines several reinforcement learning (RL) methods with infrastructure enhancements. Xiaomi developed an algorithm called Test Difficulty Driven Reward to mitigate the sparse-reward problem often seen in RL tasks involving intricate algorithms. It also introduced an Easy Data Re-Sampling technique to keep training stable.

    On the infrastructure side, Xiaomi has built a Seamless Rollout system to minimize GPU idle time during both training and validation. According to its internal metrics, this makes training 2.29× faster and validation almost 2× faster. The rollout engine also supports inference methods such as multi-token prediction in vLLM.
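    To illustrate why a difficulty-aware reward helps with sparse rewards, here is a minimal Python sketch. It is an illustration only, not Xiaomi's implementation: the function names, the difficulty groups, and the per-group scores are all hypothetical. The idea shown is that a program passing only some test cases still earns partial credit, with harder tests weighted more, whereas an all-or-nothing reward would return zero.

```python
from collections import defaultdict


def difficulty_weighted_reward(passed, test_group, group_score):
    """Return a partial-credit reward in [0, 1] (hypothetical scheme).

    passed:      dict test_id -> bool, whether the generated program passed
    test_group:  dict test_id -> difficulty group name
    group_score: dict group name -> total score available for that group
    """
    # Collect the tests that belong to each difficulty group.
    tests_in_group = defaultdict(list)
    for test_id, group in test_group.items():
        tests_in_group[group].append(test_id)

    earned = 0.0
    for group, tests in tests_in_group.items():
        # Each group's score is shared equally among its tests.
        per_test = group_score[group] / len(tests)
        earned += per_test * sum(1 for t in tests if passed.get(t, False))

    total = sum(group_score[g] for g in tests_in_group)
    return earned / total if total else 0.0


def all_or_nothing_reward(passed):
    """Sparse baseline: reward only when every test passes."""
    return 1.0 if passed and all(passed.values()) else 0.0


# Hypothetical example: two easy tests worth 1 point in total,
# two hard tests worth 3 points in total.
test_group = {"t1": "easy", "t2": "easy", "t3": "hard", "t4": "hard"}
group_score = {"easy": 1.0, "hard": 3.0}
passed = {"t1": True, "t2": True, "t3": True, "t4": False}

dense = difficulty_weighted_reward(passed, test_group, group_score)
sparse = all_or_nothing_reward(passed)
print(dense, sparse)
```

    In this made-up case the program fails one hard test, so the sparse baseline gives no learning signal at all, while the weighted scheme still rewards the three tests that passed.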

    Availability and Performance

    MiMo-7B is now open source, with four public variants:
    – Base: the raw, pre-trained model
    – SFT: a version fine-tuned on supervised data
    – RL-Zero: a variant trained with reinforcement learning directly from the base model
    – RL: a further-refined model built on SFT, claimed to offer the best accuracy

    Xiaomi has also shared benchmarks to support its claims, at least on paper. In mathematics, the MiMo-7B-RL variant is said to achieve 95.8% on MATH-500 and over 68% on the 2024 AIME dataset. In coding, it scores 57.8% on LiveCodeBench v5 and nearly 50% on version 6. General knowledge benchmarks such as DROP, MMLU-Pro, and GPQA are also included, with scores in the mid-to-high 50s: respectable for a model with 7 billion parameters, but not groundbreaking.

    MiMo-7B can now be accessed on Hugging Face under an open-source license, and all relevant documentation and model checkpoints are available on GitHub.


  • Samsung Unveils Second-Generation AI Model Gauss2

    Samsung presented the next version of its multimodal generative AI model, Gauss2, during the Samsung Developer Conference 2024. The model will come in three different sizes: Compact, Balanced, and Supreme.

    Compact Model Overview

    The Compact model is small and built for speed and efficiency, which makes it well suited to running directly on devices by "maximizing the utilization of the device’s computing resources."

    Balanced and Supreme Models

    The Balanced model strikes a balance between performance and efficiency and is tailored to a variety of tasks that need consistent results. The Supreme model, by contrast, is optimized for high performance, and Samsung claims it lowers "computational costs during training and inference processes while keeping both performance and efficiency at high levels."

    Gauss2 supports up to 14 languages as well as a variety of programming languages. To improve efficiency and performance, Samsung employs a "custom tokenizer" for the languages it supports. The company notes that the model’s processing speed "per hour is 1.5 to 3 times quicker" than that of open-source generative AI models.

    In-House Coding Assistant

    According to Samsung, its internal coding assistant, ‘code.i,’ is currently being used by 60% of all software developers at the company.

    Samsung aims to leverage the Gauss2 model to boost productivity internally. The company "will keep extending the reach of its AI-based services throughout all product lines so users can enjoy a more convenient and pleasant daily life."

  • ISRO Unveils AI Model for Aircraft Tracking and Surveillance

    Researchers at the Space Applications Centre (SAC), part of the Indian Space Research Organisation (ISRO), have created an AI model that provides a full suite of features for monitoring airports and tracking aircraft, in addition to offering surveillance capabilities.

    Collaboration and Testing

    According to a report by The Times of India, this model was developed in partnership with an institute and has been tested in airports located in Ahmedabad, Mumbai, and Pune. The AI model makes use of the CosmiQ Works RarePlanes dataset, which is an open-source dataset for machine learning provided by CosmiQ and AI Reverie. This dataset includes both real-world and artificially created satellite images.

    Training Data Sources

    In addition to the RarePlanes dataset, the researchers trained the model using the Airbus Aircraft detection dataset and satellite imagery from Indian remote sensing.

    "A senior official mentioned to The Times of India that the deep learning models used were named YOLOv5 and YOLOv7, which are part of the You Only Look Once (YOLO) series, known for their applications in detecting aircraft within satellite images," the report stated.

    Accuracy and Future Enhancements

    The model achieved an accuracy of 94% for larger aircraft and 88% for smaller ones, with YOLOv7 proving to be the better performer in these evaluations. The research team is currently focused on enhancing the model’s features by integrating Synthetic Aperture Radar (SAR) satellite data, which can assist in detecting aircraft even during severe weather events.

  • Open Source AI Video Generator Pyramid Flow Now Online

    Already gaining traction in YouTube tutorials, Pyramid Flow is an AI video generator trained on openly available datasets totaling about 10 million videos. The project is a collaboration between AI specialists from Peking University, Kuaishou Technology, and Beijing University of Posts and Telecommunications.

    Pyramid Flow is itself open source. Released under the MIT License, it can produce high-resolution (768p) video, although it performs particularly well at 384p. Its developers claim that it can generate a five-second video in under a minute on an A100 GPU, without specifying the rest of the hardware setup.

    Performance Insights

    Pyramid Flow performs very well in many situations, but with certain text prompts the output can be inadequate. As with many generative AI tools, its results are somewhat unpredictable. On the positive side, Pyramid Flow requires significantly less computational power than its rivals. And because its code is open source, anyone interested can run it locally or in the cloud without licensing concerns.

    Copyright Concerns

    Although the team behind Pyramid Flow has listed all the datasets used for training, it has not addressed the copyright issues that could arise. Some content creators argue that using openly available material to make AI-generated videos infringes on the rights of copyright owners. Nevertheless, Pyramid Flow might be useful for refining such content without needing to engage third parties.

    Pyramid Flow (on GitHub, via Tech Xplore)

  • Google Confirms Pixel 8 Exclusion from Gemini Nano Upgrade

    Google’s Pixel 8 and 8 Pro share similar hardware specifications, differing mainly in RAM capacity. Unfortunately, Google has disclosed disappointing news for Pixel 8 users. The standard model will not receive Gemini Nano, the smallest version of the company's AI model introduced in December. This omission is disheartening for Pixel 8 owners who anticipated having the mobile-friendly LLM on their devices.

    Google Confirms Absence of Gemini Nano on Pixel 8

    In a recent episode of The Android Show on the Android Developers channel, Google officially confirmed the absence of Gemini Nano on the Pixel 8. Terence Zhang, a developer relations engineer at Google, cited hardware limitations as the reason behind this decision. While specifics regarding these limitations were not disclosed, it's worth noting that the primary distinction between the standard and pro versions of the Pixel 8 lies in the RAM capacity, with the former featuring 8GB and the latter boasting 12GB.

    Gemini Nano currently runs on the Pixel 8 Pro and the Galaxy S24 series. It powers features such as Summarize in the Recorder app and Smart Reply in Gboard (including in WhatsApp), along with offline photography enhancements. Its absence from the standard model widens the gap between the Pixel 8 and 8 Pro, a factor potential buyers should weigh when considering a Pixel 8.

    Expansion Plans for Gemini Nano

    Google has affirmed its commitment to expanding Gemini Nano's compatibility to more high-end devices. Recently, chipmaker MediaTek announced a collaboration with Google to optimize Gemini Nano for the Dimensity 9300 and 8300 chips. Furthermore, Google has outlined its initiative to incorporate Gemini into Android smartphones starting in 2025.

  • Google Introduces AI Cyber Defense Initiative in Response to Hacker Threat

    Google has announced the AI Cyber Defense Initiative, a program that uses artificial intelligence (AI) to combat cyber threats. The initiative aims to improve internet security by staying ahead of hackers and cyberattacks. With the help of AI, Google hopes to simplify the work of identifying and stopping threats before they cause significant damage.

    Training AI to Recognize Cyber Threats

    The challenge in maintaining digital security today is that hackers only need to find one vulnerability to exploit, while defenders must be flawless at all times. Google believes that AI can help address this issue by empowering security experts to identify and mitigate threats more efficiently. By training AI systems to detect the early signs of cyberattacks, Google aims to predict and prevent attacks before they can cause harm.

    Introducing Magika: An AI-Powered File Identification Tool

    Google has developed a tool called Magika, which it is sharing with the cybersecurity community. Magika uses AI to identify file types, a basic step in detecting malware, which is software designed to compromise or infiltrate systems. Knowing reliably what kind of file they are dealing with helps security experts separate safe files from harmful ones and catch potentially malicious software early.

    A Collaborative Approach to Cybersecurity

    Google emphasizes that combating cyber threats requires a collective effort. They urge companies and governments to collaborate, share information, and leverage AI to enhance internet security for all users. By utilizing AI not just to respond to cyber threats but to prevent them, Google is taking a significant step towards creating a safer online environment. This initiative demonstrates the positive potential of AI in safeguarding the internet.

  • Nvidia Introduces Eos Supercomputer, Advancing the Frontiers of Artificial Intelligence

    Nvidia has unveiled Eos, a revolutionary supercomputer for data centers, at the Supercomputing 2023 trade show. This supercomputer, known as an “AI factory,” is designed to push the boundaries of artificial intelligence development. Eos represents a new era in AI acceleration and has been named after the Greek goddess of dawn.

    Impressive Performance

    Eos is powered by 576 Nvidia DGX H100 systems, which are integrated with Quantum-2 InfiniBand networking and specialized software. This impressive setup enables Eos to achieve a remarkable 18.4 exaflops of FP8 AI performance. It is a significant advancement from Nvidia’s previous supercomputing projects, SaturnV and Selene, showcasing the advanced DGX SuperPOD architecture. This architecture allows for the rapid scaling of AI data center solutions to meet high-performance demands.

    Hardware Configuration

    At the core of Eos are 4,608 H100 Tensor Core GPUs, eight in each DGX H100 system. This hardware configuration is designed to handle extensive workloads, including training large language models, running AI recommenders, conducting large-scale analytics, and performing quantum simulations.
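    The hardware figures quoted in this article can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers stated above (576 DGX H100 systems, eight GPUs per system, 18.4 exaflops of FP8 performance); the per-GPU figure it derives is a back-of-the-envelope average, not a specification quoted by Nvidia.

```python
# Figures taken from the article text.
DGX_SYSTEMS = 576          # DGX H100 systems in Eos
GPUS_PER_SYSTEM = 8        # H100 GPUs in each DGX H100 system
TOTAL_FP8_EXAFLOPS = 18.4  # claimed aggregate FP8 AI performance

# Total GPU count: systems multiplied by GPUs per system.
total_gpus = DGX_SYSTEMS * GPUS_PER_SYSTEM

# Average FP8 throughput per GPU, in petaflops (1 exaflop = 1,000 petaflops).
per_gpu_petaflops = TOTAL_FP8_EXAFLOPS * 1000 / total_gpus

print(f"Total GPUs: {total_gpus}")
print(f"Average FP8 petaflops per GPU: {per_gpu_petaflops:.2f}")
```

    The product of the system count and the GPUs per system matches the 4,608 GPUs cited, and the aggregate figure works out to roughly 4 petaflops of FP8 performance per GPU.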

    Optimized for AI Tasks

    Eos’s architecture is tuned for AI tasks that require ultra-low latency and high throughput across massive computing clusters. The supercomputer’s networking, with speeds of up to 400 Gb/s, is crucial for handling the large datasets needed to train AI models.

    Specialized Software Integration

    Eos also integrates specialized software to support AI development and deployment. Base Command handles AI workflow and cluster management and provides libraries for compute, storage, and network acceleration. AI Enterprise, a cloud-native platform, aims to speed up AI application development and positions itself as the “operating system” for enterprise-level AI. Eos’s capabilities have earned it ninth place on the TOP500 list of the world’s fastest supercomputers.

  • Shanghai Company Unveils Next-Generation Toy Bunnies Harnessing AI Technology

    Remember your childhood teddy bear? The one that patiently listened to all your secrets and dreams? Well, an upgrade has arrived: a new generation of companions is in town, and they’re not just soft and cuddly – they’re downright chatty. Meet the latest trend in smart toys: AI-powered robots that can hold real conversations with kids. Forget pre-programmed phrases and robotic voices; these little guys use clever technology to chat like a real friend, keeping youngsters entertained and engaged for hours.

    AI-Powered Robots: The New Generation of Smart Toys

    The bunnies use large language models to make toys more interactive

    One company leading the charge is FoloToy, a Shanghai-based start-up that uses something called large language models (LLMs) – the same tech behind chatbots like ChatGPT – to breathe life into ordinary toys. Imagine your bunny pal suddenly asking about your day or sharing funny stories. Pretty cool, right?

    Frank Murphy: The Multilingual Chatty Companion

    But FoloToy isn’t just for kids. They’ve teamed up with Yomiplanet to create Frank Murphy, a limited-edition “live” figure that chats in six languages! So, whether you’re practicing your Japanese or reminiscing about Steve Jobs with a digital pal, Frank’s got you covered.

    The Future of AI-Powered Devices

    This explosion of AI toys isn’t just about fun and games, though. It’s part of a bigger trend in China, where tech companies are racing to develop the next generation of AI-powered devices. Think voice assistants that learn your habits or screenless gadgets controlled by your thoughts. The future is here, and it’s interactive!

    Commercializing AI: The Power of Fun and Connection

    Of course, with all this new tech comes the question: how do we make money from it? Experts warn that commercializing AI can be tricky, but companies like FoloToy are betting on the power of fun and connection. After all, who wouldn’t want a chatty companion who’s always up for a conversation? So, the next time you’re feeling lonely, ditch the phone and (maybe) cuddle up with an AI buddy. You might be surprised at how much fun you can have with a robot who talks back.

  • Enhanced AI, Display, and Camera in Samsung Galaxy S24 Series Update

    Samsung recently released a software update for its Galaxy S24 series smartphones, which were launched last month. This update brings several new features, bug fixes, and improvements based on user feedback. Samsung aims to enhance the device’s display, camera, and AI translation capabilities through improved hardware and software integration.

    Adjustable Display Settings

    The latest update for the Galaxy S24 series introduces a new “Vividness” option under the screen’s “Advanced settings.” This feature allows users to customize the color and brightness levels of their screens according to their preferences. Now, you can enjoy a display that looks just the way you like it.

    Better Camera Features

    Samsung has also focused on enhancing the camera capabilities of the Galaxy S24 series. The update brings improvements to zoom, Portrait Mode, Night mode, and video shooting with the rear camera. These enhancements aim to make capturing pictures and videos in various situations even more enjoyable.

    Improved Communication

    In line with Samsung’s commitment to facilitating smooth communication for everyone, this update also works on reducing language barriers. The AI integrated into the Galaxy S24 series has been optimized to better understand users in different situations. This improvement will make it easier for users to communicate effectively with their devices.

    Rollout Details

    Samsung has announced that the rollout of these updates will begin in February. Users can expect to receive the update notification on their Galaxy S24 series smartphones soon.

    Overall, the update gives Galaxy S24 series owners a more personalized display, better camera features, and improved AI translation.

  • Using AI and VR Technology, Tencent is Revolutionizing Chinese Opera

    Tencent, a well-known company in the gaming and social media industry, is now venturing into a new field by giving Chinese opera a modern twist. They have begun with the story of Hua Mulan, the legendary heroine who disguised herself as a man to fight in her father’s place. Thanks to Tencent’s tech-savvy approach, this traditional story is gaining a new audience in China.

    Tencent’s Modern Makeover

    Tencent took footage of a 1956 opera performance and used artificial intelligence (AI) to restore it, cleaning up the video to produce a fresh, updated version. The project has been a huge success, attracting over 7 million viewers to its live stream. In collaboration with the Ministry of Culture and Tourism, Tencent aims to bring traditional Chinese opera into the digital age.

    Advancing Traditional Art

    Tencent’s efforts go beyond enhancing old videos. The company is using technologies such as six-degrees-of-freedom (6DoF) tracking, commonly used in virtual reality, to capture the intricate movements of characters from famous operas. This lets today’s audiences witness and appreciate the complex beauty of these traditional dances and stories.

    Preserving Cultural Heritage

    The ultimate goal of Tencent’s project is to preserve classic stories and share them with a new generation. Despite facing challenges such as the availability of old opera data, Tencent is finding innovative solutions. They are even planning to establish a digital library, making it easier for more people to access and enjoy these restored performances.

    In China, tech companies are leveraging their resources to celebrate and safeguard traditional culture. Through their efforts, they are ensuring that these significant pieces of China’s heritage will not be forgotten. This serves as a testament to how technology can bridge the gap between the past and the present.