Tag: Nvidia H100

  • Surge in Gray-Market Repairs for Banned Nvidia H100 and A100 GPUs

    Key Takeaways

    1. Emerging Repair Market: A profitable repair market for high-end Nvidia GPUs has developed in Shenzhen, driven by scarcity due to U.S. export bans.

    2. Impact of U.S. Restrictions: The U.S. has banned sales of the H100 and A100 GPUs to China, claiming they could aid military AI research, leading to increased demand for maintenance services.

    3. Essential Maintenance: Regular maintenance of these GPUs is crucial as many have been in continuous operation, resulting in higher failure rates and the need for repairs.

    4. Repair Costs: Repair prices range from ¥10,000 to ¥20,000 (approximately $1,400–2,800), reflecting the complexity of the work and the rarity of the hardware.

    5. Geopolitical Context: Despite U.S. concerns and legislative efforts to track high-end GPUs, the demand in China’s AI sector for maintaining these restricted products continues to thrive.


Surging demand in China for maintenance of high-end Nvidia graphics processors—despite Washington’s export ban—has led to a small yet profitable repair market in Shenzhen. Around a dozen specialized workshops now advertise their ability to fix and revive H100 and A100 accelerators. According to industry insiders who spoke to Reuters, these devices entered the country through gray-market channels.

    A Market Born from Scarcity

This market thrives only because the hardware is both rare and essential. In September 2022, the United States prohibited sales of the H100 to China and also restricted its predecessor, the A100. U.S. officials argued that these advanced GPUs could accelerate military AI research. Although Nvidia eventually produced a less powerful chip, the H20, that complies with the restrictions, Chinese research institutions and cloud service providers still want the more capable H100 for training large language models.

    Importance of Maintenance

Maintaining these chips has become crucial. Many of them have been running continuously in data centers for extended periods, so failure rates are climbing. Typical lifespans of two to five years mean that power-delivery circuits, high-bandwidth memory modules, and cooling fans require regular maintenance. One long-established graphics-card repair firm said it launched a separate subsidiary in late 2024. The new entity now refurbishes up to 500 AI GPUs each month, testing each repair in a 256-node server environment that replicates customer clusters.

    Pricing for Repairs

Prices reflect both the rarity of the hardware and the intricacy of the work. Shops charge between ¥10,000 and ¥20,000 (around $1,400–2,800) per card, about 10 percent of the unit’s original price, covering work ranging from solder reflow to HBM replacement. For comparison, an eight-way H20 server officially sells for well over ¥1 million (≈ $139,000), and traders report that a B200-equipped chassis can fetch more than ¥3 million (≈ $418,000).
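As a rough consistency check on the figures above, the yuan-to-dollar conversion and the implied gray-market card price can be sketched as follows; the exchange rate used here is an assumption (roughly ¥7.15 per dollar at the time), not a figure from the report:

```python
# Rough consistency check of the repair-price figures.
# The exchange rate is an assumption, not a reported number.
CNY_PER_USD = 7.15

repair_cny = (10_000, 20_000)
repair_usd = tuple(round(c / CNY_PER_USD) for c in repair_cny)
print(repair_usd)   # close to the quoted $1,400-2,800 range

# If a repair runs ~10% of the card's price, the cards changed hands
# for roughly 100k-200k yuan (~$14k-28k) on the gray market.
card_cny = tuple(10 * c for c in repair_cny)
print(card_cny)
```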

Washington’s worries extend beyond smuggling itself. Bipartisan bills introduced this year would require tracking the location of high-end accelerators after sale. Nvidia asserts that only the company and its authorized partners can deliver comprehensive technical support, and it cautions that using restricted products without updated firmware and software is “a non-starter.” Still, business in Shenzhen remains active: China’s AI companies currently see value in keeping illicit silicon running, despite the geopolitical risks.

  • US AI Chip Export Rules Raise Concerns and Global Tensions

    The United States has rolled out new regulations regarding the export of advanced AI chips. This move seeks to safeguard national security while ensuring that the US remains at the forefront of AI technology. The regulations classify different countries based on their ties with the US and establish different access levels to American AI innovations.

    Classification of Nations

    The new guidelines create a three-tier system for countries:
    Tier 1 consists of close partners like the UK, Japan, and the Netherlands, which can access US AI technologies without restrictions.
Tier 2 includes countries such as Singapore and Israel, which face export caps and licensing requirements designed to keep trade from compromising security.
    Tier 3 comprises nations like China, Russia, and Iran, which are completely prohibited from obtaining advanced AI technologies due to security issues.

    Main Aspects of the Regulations

    These new rules set limits on exports using a Total Processing Performance (TPP) standard. For instance, AI chips like Nvidia’s H100 GPUs are restricted from reaching Tier 3 countries. However, US cloud service providers, including Amazon Web Services, Microsoft, and Google, are granted exemptions, enabling them to operate globally under certain stringent conditions.
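The TPP metric can be sketched roughly as follows. The formula is paraphrased from the export rules (peak dense throughput, counting a multiply-accumulate as two operations, multiplied by the operand bit length, taking the maximum across supported precisions), and the H100 throughput numbers and the 4,800 control threshold are approximate figures assumed for illustration:

```python
# Sketch of the "Total Processing Performance" metric from the export
# rules: peak dense TOPS/TFLOPS (counting a multiply-accumulate as two
# operations) times the operand bit length, maximized over precisions.
# All spec numbers below are approximate assumptions.

def tpp(peak_tops_by_bits: dict) -> float:
    """Max over precisions of (peak dense TOPS x operand bit length)."""
    return max(tops * bits for bits, tops in peak_tops_by_bits.items())

# Approximate dense (non-sparse) peak throughput of an H100 SXM:
# FP8 ~1979 TOPS, FP16 ~989 TFLOPS.
h100 = {8: 1979.0, 16: 989.0}

CONTROL_THRESHOLD = 4800  # approximate TPP level at which controls apply

print(tpp(h100))                       # ~15832
print(tpp(h100) >= CONTROL_THRESHOLD)  # well above the threshold
```

This is why the H100 falls under the restrictions by a wide margin, while cut-down parts can be tuned to land under the line.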

    The purpose of these restrictions is to prevent adversarial nations from using AI chips for military improvements, surveillance, or cyber warfare. By protecting its technological advantage, the US aims to maintain its leadership in global AI and ensure national security.

    Worldwide Effects

Manufacturers like Nvidia are likely to face difficulties, with significant revenue declines anticipated in the restricted markets. US cloud service providers, on the other hand, stand to benefit from the exemptions, expanding their international footprint. There are also concerns that the regulations will fragment global supply chains, with knock-on effects for consumer markets, particularly gaming.

    With a 120-day period for public comments, the upcoming Trump administration has a chance to adjust the new framework. Analysts forecast ongoing stringent measures against China, but they also anticipate some flexibility in how these rules are implemented, aiming to balance economic growth with security requirements.

  • Apple and Foxconn Team Up for Custom AI Servers in Taiwan

Apple is teaming up with Foxconn and LCFC, a Lenovo subsidiary, to create its own AI servers powered by Apple Silicon in Taiwan. The move is designed to enhance Apple’s data center capacity for its upcoming Apple Intelligence services and reduce its dependence on Chinese manufacturers.

    The Reason for Choosing Taiwan

    Sources suggest that Apple chose Taiwan primarily to benefit from Foxconn’s extensive expertise in constructing AI servers. Foxconn is already producing servers equipped with Nvidia’s H100 and H200 GPUs and is preparing to collaborate on new Blackwell-based chips.

    Focus on AI Inference Management

Unlike rivals such as Amazon, Google, and Microsoft, Apple is focusing its server strategy on serving AI inference rather than on developing large language models. These servers are intended for internal operations, indicating that production volumes will be lower than in typical data center deployments.

    Collaboration and Design Support

    This partnership goes beyond just server production; it also involves engineering and design assistance from Foxconn and LCFC. Although Apple may not have a lot of experience in designing data center servers, the development is expected to progress rapidly since these servers are less complex than Nvidia’s GB200 systems.

    Foxconn has AI research facilities in Hsinchu, Taiwan, and San Jose, California, where they are currently collaborating with Nvidia on upcoming GB300 server initiatives. Moreover, additional manufacturing partners like Universal Scientific Industrial may also join to further diversify the production process.

  • Meta’s Llama 4 Uses 100,000 Nvidia H100 GPUs for Training

    Meta has just announced a new update on Llama 4, their upcoming language model. During a recent earnings call, CEO Mark Zuckerberg revealed that they are training Llama 4 using a setup with over 100,000 Nvidia H100 GPUs. This is a larger setup than anything previously reported by competitors.

    Upcoming Features of Llama 4

The new language model is set to launch in early 2025, starting with its smaller versions. While specific capabilities have not been fully disclosed, Zuckerberg suggested that Llama 4 will offer enhanced features, improved reasoning abilities, and faster overall performance.

    Meta’s Unique Strategy

    Meta continues its strategy of offering models for free download, unlike OpenAI and Google, which restrict access through APIs. This makes Llama 4 particularly appealing for startups and researchers who prefer more flexibility in using AI technologies.

    Significant Energy and Financial Implications

Given the extensive computing resources, the energy requirements are also considerable, estimated at around 150 megawatts, roughly five times what the largest supercomputer at a U.S. national lab consumes. Meta’s spending plans reflect this ambitious scale, with infrastructure expenditures projected to hit $40 billion in 2024, a 42 percent increase from 2023.
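The 150-megawatt figure can be checked with a back-of-the-envelope calculation; the per-GPU board power and the overhead multiplier below are assumptions for illustration, not numbers from the announcement:

```python
# Back-of-the-envelope check on the ~150 MW estimate.
# GPU board power and the overhead multiplier are assumptions.
NUM_GPUS = 100_000
GPU_TDP_W = 700      # approximate board power of an H100 SXM module
OVERHEAD = 2.1       # assumed factor for CPUs, networking and cooling

gpu_draw_mw = NUM_GPUS * GPU_TDP_W / 1e6   # 70 MW for the GPUs alone
total_mw = gpu_draw_mw * OVERHEAD          # ~147 MW, near the estimate
print(round(total_mw))

# Implied 2023 infrastructure spend, given $40B in 2024 at +42%:
capex_2023 = 40e9 / 1.42                   # roughly $28.2 billion
print(round(capex_2023 / 1e9, 1))
```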

    Despite these hefty expenses, Meta’s financial health remains strong, showing a 22 percent rise in sales, primarily driven by advertising revenue, which has helped offset a 9 percent increase in operating costs.