Key Takeaways
1. DeepSeek has launched the DeepSeek-V3.2-Exp AI model, known for lower compute expenses and advanced performance, ranking 11th among global LLMs.
2. The model uses a new DeepSeek Sparse Attention (DSA) framework, focusing on relevant tokens to improve speed and reduce memory usage, supporting up to a 128K-token window.
3. App developers can cut costs by more than 50% compared with the previous version by using DeepSeek-V3.2-Exp through its public API, with similar performance.
4. The 400 GB model is free to download from Hugging Face, but running it requires powerful hardware: multiple Nvidia H100/H200/H20 GPUs or a B200/GB200 server.
5. Users who want to run DeepSeek-V3.2-Exp on a personal desktop must wait for quantized models and need a GPU with at least 24 GB of memory.
DeepSeek has unveiled a new large language model, DeepSeek-V3.2-Exp, with notably lower compute costs. That benefits companies that use the firm’s API in their applications: they pay less while accessing an advanced model that ranks 11th among the most capable LLMs released worldwide.
New Design Features
The savings come from a new DeepSeek Sparse Attention (DSA) framework. Unlike traditional transformers, which attend to every token, this design focuses only on the most relevant tokens. As a result, the model processes text input faster and uses less memory while supporting a context window of up to 128K tokens.
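To make the idea of attending only to the most relevant tokens concrete, here is a minimal sketch of generic top-k sparse attention for a single query. It is not DeepSeek's DSA implementation: the function names, the use of the plain dot product as the relevance score, and the value of k are all assumptions made for illustration. Because this toy version still scores every key before selecting, it shows the selection step rather than the compute savings.

```python
import math


def _softmax_weighted_sum(scores, values):
    """Softmax over scores, then the weighted sum of the value vectors."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    dim = len(values[0])
    return [sum(e / total * v[d] for e, v in zip(exps, values)) for d in range(dim)]


def _scores(query, keys):
    """Scaled dot-product score of the query against each key."""
    scale = math.sqrt(len(query))
    return [sum(q * x for q, x in zip(query, key)) / scale for key in keys]


def dense_attention(query, keys, values):
    """Standard attention: the query attends to every token."""
    return _softmax_weighted_sum(_scores(query, keys), values)


def topk_sparse_attention(query, keys, values, k):
    """Hypothetical sparse variant: attend only to the k highest-scoring tokens."""
    scores = _scores(query, keys)
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    top.sort()  # restore original token order
    output = _softmax_weighted_sum([scores[i] for i in top], [values[i] for i in top])
    return output, top


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]

output, kept = topk_sparse_attention(query, keys, values, k=2)
print("tokens kept:", kept)
print("sparse output:", output)
print("dense output:", dense_attention(query, keys, values))
```

With k equal to the number of tokens, the sparse function reduces to dense attention. Smaller values of k shrink the softmax and the weighted sum to k entries per query, which is the kind of work a sparse design avoids.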
Cost Reduction for Developers
App developers who use DeepSeek-V3.2-Exp through its public API can expect to pay over 50% less than with the previous version, while performance on standard AI benchmarks stays at a similar level.
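As a rough illustration of what a price cut of this size means for a monthly bill, the sketch below compares API costs under two per-million-token prices. The prices and the token volume are placeholders chosen for the arithmetic, not DeepSeek's published rates, and the function names are hypothetical.

```python
def api_cost(tokens, price_per_million):
    """Cost of processing `tokens` tokens at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million


def savings_fraction(old_price, new_price):
    """Fraction of the bill saved when the price drops from old_price to new_price."""
    return 1 - new_price / old_price


# Placeholder figures (assumptions, not real pricing).
OLD_PRICE = 1.00          # currency units per million tokens, previous version
NEW_PRICE = 0.45          # currency units per million tokens, new version
MONTHLY_TOKENS = 500_000_000

old_bill = api_cost(MONTHLY_TOKENS, OLD_PRICE)
new_bill = api_cost(MONTHLY_TOKENS, NEW_PRICE)
print(f"old bill: {old_bill:.2f}, new bill: {new_bill:.2f}")
print(f"saved: {savings_fraction(OLD_PRICE, NEW_PRICE):.0%}")
```

Since cost scales linearly with token volume, the saved fraction depends only on the two prices, not on how many tokens an application processes.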
Download Requirements
The 400 GB model is free to download from Hugging Face and can be run locally on sufficiently powerful hardware. Because the model requires more than 1.5 TB of VRAM, that means a setup with several Nvidia H100/H200/H20 GPUs or at least one Nvidia B200/GB200 server.
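The VRAM figure above translates into a minimum GPU count with simple division. The sketch below assumes nominal per-GPU memory sizes taken from public spec sheets (usable memory is somewhat lower in practice) and treats "more than 1.5 TB" as a lower bound of 1,500 GB, so the results are minimums rather than recommended configurations.

```python
import math

# Nominal memory per GPU in GB (assumed values; check the vendor spec sheets).
GPU_MEMORY_GB = {"H100": 80, "H20": 96, "H200": 141, "B200": 192}

# The article's "more than 1.5 TB" of VRAM, taken as a decimal lower bound.
REQUIRED_GB = 1500


def min_gpus(required_gb, per_gpu_gb):
    """Smallest number of GPUs whose combined memory reaches required_gb."""
    return math.ceil(required_gb / per_gpu_gb)


for name, memory in GPU_MEMORY_GB.items():
    print(f"{name}: at least {min_gpus(REQUIRED_GB, memory)} GPUs")
```

Under these assumptions the B200 row comes out at eight GPUs, which is consistent with a single eight-GPU server being enough, while the H100 row requires more than two such servers.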
Those who want to run DeepSeek-V3.2-Exp on a personal desktop will have to wait until quantized models appear on Hugging Face, like the one unsloth published for V3.1. Even then, a GPU with at least 24 GB of memory is needed, such as the Nvidia GeForce RTX 5090.