Tag: Nvidia H100

October 11, 2025

DeepSeek V3.2: Free Open-Source AI LLM Reduces Compute Costs

Key Takeaways

1. DeepSeek has launched the DeepSeek-V3.2-Exp AI model, known for lower compute expenses and advanced performance, ranking 11th among global LLMs.
2. The model uses a new DeepSeek Sparse Attention (DSA) framework, focusing on relevant tokens to improve speed and reduce memory usage, supporting up to a 128K-token window.
3. App developers can save over 50% in costs when using DeepSeek V3.2 Exp via its public API while maintaining similar performance levels.
4. The 400 GB model is available for free download on Hugging Face, requiring powerful hardware with multiple Nvidia GPUs or specific servers for proper operation.
5. Users wishing to run DeepSeek V3.2 on personal desktops must wait for quantized models and need a GPU with at least 24 GB of memory.

DeepSeek has unveiled a new large language model, DeepSeek-V3.2-Exp, with notably lower compute costs. The reduction benefits companies that use the firm’s API in their applications, letting them save money while accessing an advanced model that ranks 11th among the most capable LLMs released globally.

New Design Features

The breakthrough was made possible by a new DeepSeek Sparse Attention (DSA) framework. Unlike traditional transformers, which attend to every token, this design focuses only on the most pertinent tokens. As a result, the model processes text input more quickly and uses less memory while supporting a context window of up to 128K tokens.
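To make the idea of attending only to the most pertinent tokens concrete, here is a minimal sketch of top-k sparse attention for a single query. It is an illustration under stated assumptions, not DeepSeek's implementation: the article does not describe how DSA selects tokens, so the top-k rule and the function names below are hypothetical.

```python
import math


def _softmax_mix(scores, values):
    """Softmax the scores, then blend the value vectors with those weights."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    dim = len(values[0])
    return [sum(e / total * v[d] for e, v in zip(exps, values)) for d in range(dim)]


def _scores(query, keys):
    """Scaled dot-product score of the query against each key."""
    scale = math.sqrt(len(query))
    return [sum(q * k for q, k in zip(query, key)) / scale for key in keys]


def dense_attention(query, keys, values):
    """Standard attention: every key contributes to the output."""
    return _softmax_mix(_scores(query, keys), values)


def topk_sparse_attention(query, keys, values, k):
    """Hypothetical sparse variant: only the k best-scoring keys contribute."""
    scores = _scores(query, keys)
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    return _softmax_mix([scores[i] for i in keep], [values[i] for i in keep])


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

print(dense_attention(query, keys, values))
print(topk_sparse_attention(query, keys, values, k=1))
```

This toy version still scores every key before discarding most of them, so it saves nothing by itself. In a real system the benefit would depend on a cheap selection step, so that the expensive softmax and value mixing run over k tokens instead of the full context.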

Cost Reduction for Developers

App developers utilizing DeepSeek V3.2 Exp through its public API can anticipate spending over 50% less compared to the earlier version, all while ensuring similar performance levels across standard AI evaluations.

Download Requirements

The 400 GB AI LLM is available for free download on Hugging Face and can be operated locally on robust computers. Users should note that a setup with several Nvidia H100/H200/H20 GPUs or at least one NVIDIA B200/GB200 server is necessary because of the model’s requirement for more than 1.5 TB of VRAM.
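As a rough sizing sketch, the 1.5 TB figure can be turned into a minimum GPU count per card type. The per-GPU memory capacities below are assumptions based on commonly quoted spec-sheet figures and do not come from the article. The calculation also ignores interconnect, parallelism layout, and runtime overhead.

```python
import math

# Per-GPU memory in GB. ASSUMED values from commonly quoted spec sheets,
# not figures taken from the article.
ASSUMED_GPU_MEMORY_GB = {"H100": 80, "H20": 96, "H200": 141, "B200": 192}

# "More than 1.5 TB of VRAM", treated as a floor with 1 TB = 1000 GB.
REQUIRED_GB = 1500


def gpus_needed(required_gb, per_gpu_gb):
    """Smallest whole number of GPUs whose combined memory covers required_gb."""
    return math.ceil(required_gb / per_gpu_gb)


counts = {
    name: gpus_needed(REQUIRED_GB, capacity)
    for name, capacity in ASSUMED_GPU_MEMORY_GB.items()
}

for name, count in counts.items():
    print(f"{name}: at least {count} GPUs")
```

Under these assumptions, eight B200-class GPUs reach the threshold, which is consistent with the article's point that a single B200/GB200 server can suffice while H100-class setups need many more cards.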

Those wishing to run DeepSeek V3.2 on personal desktops will have to wait until quantized models become available on Hugging Face, like the one unsloth published for V3.1. A GPU with at least 24 GB of memory is also needed, such as the Nvidia RTX 5090.


Tags: Nvidia H100
September 4, 2025

Amkor Moves $2B Chip Plant to Peoria for U.S. Semiconductor Security

Key Takeaways

1. Amkor is establishing a new advanced packaging and testing facility on a 104-acre site in Peoria, Arizona, with construction starting soon and production expected by early 2028.
2. The investment for the facility is projected to be $2 billion, creating approximately 2,000 jobs and enhancing the U.S. semiconductor supply chain.
3. The new site aims to alleviate semiconductor supply chain issues by focusing on high-performance packaging and reducing reliance on Taiwan and South Korea.
4. The project is backed by $407 million from the CHIPS Act, but a national shortage of semiconductor talent may pose challenges for staffing.
5. Despite the new facility, immediate solutions for AI server shortages will not be realized until after it becomes operational in 2028, with ongoing reliance on Asian facilities for packaging capacity in the interim.

Amkor has moved its planned advanced packaging and testing facility to a 104-acre site in the Peoria Innovation Core in northern Peoria, Arizona. On August 29, the Peoria City Council approved the land exchange, which replaces the previously planned 56-acre Vistancia site. Construction is expected to begin shortly, with production scheduled to start in early 2028. The company estimates that the investment will reach $2 billion and generate around 2,000 jobs.

A Significant Step Forward

City officials describe this move as a “historic milestone” that will enhance the U.S. semiconductor supply chain. Amkor mentions that the larger space provides greater flexibility to meet the rising demand from customers. Having been active in the Greater Phoenix area since 1984, the company intends to cater to clients in the computing, automotive, and communications sectors from this new facility.

Addressing Supply Chain Challenges

The new facility is designed to tackle existing problems within the semiconductor supply chain. Currently, assembly, testing, and packaging are heavily concentrated in Taiwan and South Korea, leading to bottlenecks that have hindered the production of AI chips, like the Nvidia H100. The Peoria site will focus on high-performance packaging platforms, including TSMC’s CoWoS and InFO, which are utilized in data-center GPUs, and possibly Apple silicon, although this remains unconfirmed. TSMC has signed an agreement to send packaging from its Phoenix fabs to Amkor, which will help reduce turnaround times.

Funding and Labor Challenges

This project is supported by $407 million from the CHIPS Act along with federal tax incentives, making it one of the most ambitious outsourced packaging projects on American soil, aimed at keeping the U.S. competitive in multi-die systems. Nevertheless, the national shortage of semiconductor talent, estimated at around 70,000 to 90,000 workers, could create challenges for the new plant since automation alone won’t bridge the gap entirely. Amkor plans to collaborate with TSMC and other local Arizona entities to develop a supportive ecosystem.

Looking Ahead

However, don’t expect immediate solutions for AI server shortages. For the next few years, packaging capacity will still depend on Asian facilities, with the impact of the U.S. facility only beginning once Peoria starts operations in early 2028. Important milestones to keep an eye on include groundbreaking ceremonies, initial construction developments, installation of tools, hiring and training processes, localizing suppliers, and meeting initial capacity goals.


Tags: Nvidia H100, TSMC
July 28, 2025

Surge in Gray-Market Repairs for Banned Nvidia H100 and A100 GPUs

Key Takeaways

1. Emerging Repair Market: A profitable repair market for high-end Nvidia GPUs has developed in Shenzhen, driven by scarcity due to U.S. export bans.

2. Impact of U.S. Restrictions: The U.S. has banned sales of the H100 and A100 GPUs to China, claiming they could aid military AI research, leading to increased demand for maintenance services.

3. Essential Maintenance: Regular maintenance of these GPUs is crucial as many have been in continuous operation, resulting in higher failure rates and the need for repairs.

4. Repair Costs: Repair prices range from ¥10,000 to ¥20,000 (approximately $1,400–2,800), reflecting the complexity of the work and the rarity of the hardware.

5. Geopolitical Context: Despite U.S. concerns and legislative efforts to track high-end GPUs, the demand in China’s AI sector for maintaining these restricted products continues to thrive.

Surging demand in China for maintenance of high-end Nvidia graphics processors, despite Washington’s export ban, has led to a small yet profitable repair market in Shenzhen. Around a dozen specialized workshops are now promoting their ability to fix and revive H100 and A100 accelerators. According to industry insiders who spoke to Reuters, these devices have come into the country through gray-market avenues.

A Market Born from Scarcity

This market only thrives because the hardware is both rare and essential. In September 2022, the United States prohibited sales of the H100 to China and also restricted its predecessor, the A100. U.S. officials claimed that these advanced GPUs could speed up military AI research. Although Nvidia eventually produced a less powerful H20 silicon that meets the restrictions, Chinese research institutions and cloud service providers still desire the more capable H100 for training large language models.

Importance of Maintenance

Maintaining these chips has become crucial. Many of them have been operating continuously in data centers for an extended period, and failure rates are rising as a result. With typical lifespans of two to five years, power-delivery circuits, high-bandwidth memory modules, and cooling fans require regular maintenance. One long-time graphics-card repair business said it launched a separate subsidiary in late 2024. This new entity now refurbishes up to 500 AI GPUs each month, testing each repair in a 256-node server environment that replicates customer clusters.

Pricing for Repairs

Prices indicate both the rarity and the intricacy of the work. Shops charge between ¥10,000 and ¥20,000 (around $1,400–2,800) per card. This amounts to about 10 percent of the original price of the unit, covering repairs from solder-reflow tasks to HBM replacement. An equivalent eight-way H20 server officially sells for well over ¥1 million (≈ $139,000). Traders report that a B200-equipped chassis can be sold for more than ¥3 million (≈ $418,000).
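The figures in this paragraph can be cross-checked with a little arithmetic. The sketch below derives the exchange rate implied by each yuan/dollar pair and the original card price implied by the "about 10 percent" claim. The helper names are hypothetical, and the derived numbers are only as precise as the rounded figures quoted in the article.

```python
# (yuan, dollars) pairs as quoted in the article.
QUOTES = [
    (10_000, 1_400),       # low end of the repair price
    (20_000, 2_800),       # high end of the repair price
    (1_000_000, 139_000),  # eight-way H20 server
    (3_000_000, 418_000),  # B200-equipped chassis
]


def implied_usd_per_cny(cny, usd):
    """Exchange rate implied by one yuan/dollar pair."""
    return usd / cny


def implied_original_price(repair_cny, share_percent=10):
    """Card price implied if the repair fee is share_percent of it."""
    return repair_cny * 100 / share_percent


rates = [implied_usd_per_cny(cny, usd) for cny, usd in QUOTES]
for (cny, usd), rate in zip(QUOTES, rates):
    print(f"¥{cny:,} ≈ ${usd:,} implies {rate:.4f} USD per CNY")

low = implied_original_price(10_000)
high = implied_original_price(20_000)
print(f"Implied original card price: ¥{low:,.0f} to ¥{high:,.0f}")
```

All four pairs imply a rate close to 0.14 dollars per yuan, so the conversions are internally consistent. The 10 percent claim implies an original card price of roughly ¥100,000 to ¥200,000.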

Washington’s worries extend beyond smuggling. This year, bipartisan bills have been introduced that would require tracking the location of high-end accelerators after sale, aiming to curb illegal activity. Nvidia asserts that only the company and its authorized partners can deliver comprehensive technical support, and it cautions that using restricted products without updated firmware and software is “a non-starter.” Still, business in Shenzhen remains active. For now, China’s AI companies see value in keeping illicit silicon operational despite the geopolitical challenges.


Tags: Nvidia H100
January 16, 2025

US AI Chip Export Rules Raise Concerns and Global Tensions

The United States has rolled out new regulations regarding the export of advanced AI chips. This move seeks to safeguard national security while ensuring that the US remains at the forefront of AI technology. The regulations classify different countries based on their ties with the US and establish different access levels to American AI innovations.

Classification of Nations

The new guidelines create a three-tier system for countries:
– Tier 1 consists of close partners like the UK, Japan, and the Netherlands, which can access US AI technologies without restrictions.
– Tier 2 includes countries such as Singapore and Israel, which are subject to export limits and licensing to ensure that security is not jeopardized during trade.
– Tier 3 comprises nations like China, Russia, and Iran, which are completely prohibited from obtaining advanced AI technologies due to security issues.

Main Aspects of the Regulations

These new rules set limits on exports using a Total Processing Performance (TPP) standard. For instance, AI chips like Nvidia’s H100 GPUs are restricted from reaching Tier 3 countries. However, US cloud service providers, including Amazon Web Services, Microsoft, and Google, are granted exemptions, enabling them to operate globally under certain stringent conditions.

The purpose of these restrictions is to prevent adversarial nations from using AI chips for military improvements, surveillance, or cyber warfare. By protecting its technological advantage, the US aims to maintain its leadership in global AI and ensure national security.

Worldwide Effects

Manufacturers like Nvidia are likely to face difficulties, with major revenue declines anticipated in the restricted markets. US cloud service providers, on the other hand, could benefit from these exemptions and expand their international footprint. There are also worries that the regulations will fragment global supply chains, which could hurt consumer markets, particularly gaming.

With a 120-day period for public comments, the incoming Trump administration has a chance to adjust the new framework. Analysts forecast continued stringent measures against China, but they also anticipate some flexibility in how the rules are implemented, with the aim of balancing economic growth against security requirements.

Tags: national security, Nvidia H100
November 10, 2024

Apple and Foxconn Team Up for Custom AI Servers in Taiwan

Apple is teaming up with Foxconn and LCFC, a Lenovo subsidiary, to create its own AI servers powered by Apple Silicon in Taiwan. This strategy is designed to enhance Apple’s data center capabilities for its upcoming Apple Intelligence services and reduce its dependency on Chinese manufacturers.

The Reason for Choosing Taiwan

Sources suggest that Apple chose Taiwan primarily to benefit from Foxconn’s extensive expertise in constructing AI servers. Foxconn is already producing servers equipped with Nvidia’s H100 and H200 GPUs and is preparing to collaborate on new Blackwell-based chips.

Focus on AI Inference Management

Unlike its rivals such as Amazon, Google, and Microsoft, Apple’s server strategy focuses more on managing AI inference instead of developing large language models. These servers are intended for internal operations, indicating that production volumes will be lower compared to typical data center configurations.

Collaboration and Design Support

This partnership goes beyond just server production; it also involves engineering and design assistance from Foxconn and LCFC. Although Apple may not have a lot of experience in designing data center servers, the development is expected to progress rapidly since these servers are less complex than Nvidia’s GB200 systems.

Foxconn has AI research facilities in Hsinchu, Taiwan, and San Jose, California, where they are currently collaborating with Nvidia on upcoming GB300 server initiatives. Moreover, additional manufacturing partners like Universal Scientific Industrial may also join to further diversify the production process.

Tags: Apple, Foxconn, Nvidia H100
November 1, 2024

Meta’s Llama 4 Uses 100,000 Nvidia H100 GPUs for Training

Meta has announced an update on Llama 4, its upcoming language model. During a recent earnings call, CEO Mark Zuckerberg revealed that the company is training Llama 4 on a cluster of more than 100,000 Nvidia H100 GPUs. This is a larger setup than anything previously reported by competitors.

Upcoming Features of Llama 4

This new language model is set to launch in early 2025, starting with its smaller versions. While specific capabilities have not been fully disclosed, Zuckerberg suggested that Llama 4 will have new features, improved reasoning abilities, and faster overall operation.

Meta’s Unique Strategy

Meta continues its strategy of offering models for free download, unlike OpenAI and Google, which restrict access through APIs. This makes Llama 4 particularly appealing for startups and researchers who prefer more flexibility in using AI technologies.

Significant Energy and Financial Implications

Given the extensive computing resources, the energy requirements are also considerable, estimated at around 150 megawatts—this is five times more than what the largest supercomputer at a U.S. national lab consumes. Meta’s financial plan reflects this ambitious scale, with infrastructure expenditures projected to hit $40 billion in 2024, marking a 42 percent increase from 2023.
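A quick back-of-the-envelope check shows what these figures imply. The sketch below uses only the numbers quoted above, plus one labeled assumption: a maximum board power of about 700 W for an H100, a commonly quoted spec-sheet figure that does not appear in the article. The function names are hypothetical.

```python
# Figures quoted in the article.
TOTAL_POWER_MW = 150
GPU_COUNT = 100_000
SPEND_2024_BILLION = 40
INCREASE_PERCENT = 42
SUPERCOMPUTER_RATIO = 5

# ASSUMPTION (not from the article): commonly quoted maximum H100 board power.
ASSUMED_H100_WATTS = 700


def watts_per_gpu(total_megawatts, gpu_count):
    """Facility power divided evenly across GPUs, in watts."""
    return total_megawatts * 1_000_000 / gpu_count


def prior_year_spend(current, pct_increase):
    """Previous-year value implied by a percentage increase."""
    return current / (1 + pct_increase / 100)


per_gpu = watts_per_gpu(TOTAL_POWER_MW, GPU_COUNT)
print(f"All-in power per GPU: {per_gpu:.0f} W")
print(f"Ratio to assumed board power: {per_gpu / ASSUMED_H100_WATTS:.2f}x")
print(f"Implied supercomputer draw: {TOTAL_POWER_MW / SUPERCOMPUTER_RATIO:.0f} MW")
print(f"Implied 2023 spend: ${prior_year_spend(SPEND_2024_BILLION, INCREASE_PERCENT):.1f} billion")
```

The estimate works out to 1,500 W per GPU, roughly double the assumed board power, which is plausible if the 150 MW figure also covers CPUs, networking, and cooling. The article does not break the total down, so this remains an inference.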

Despite these hefty expenses, Meta’s financial health remains strong, showing a 22 percent rise in sales, primarily driven by advertising revenue, which has helped offset a 9 percent increase in operating costs.

Tags: Llama 4, Meta, Nvidia H100
