DeepSeek V4: 1.6T-parameter Pro, 1M-token context, MIT license, Huawei Ascend support
DeepSeek released V4 Pro and Flash with a new hybrid attention architecture, 32T-token training, and support for Huawei Ascend chips. It is the company's first major release since V3 and R1, and the first with both Base and Instruct versions.