Long Context
Articles tagged "Long Context"
DeepSeek V4: 1M Token Context, Hybrid Attention, and What Actually Matters
DeepSeek V4 has arrived with two new Mixture-of-Experts models, a claimed 1M-token context window, and a novel hybrid attention mechanism that slashes KV cache …
DeepSeek V4 Is Here: Inside the 1.6T-Parameter Pro and Ultra-Efficient Flash Models
DeepSeek V4 has arrived with a 1.6-trillion-parameter Pro model and a highly efficient Flash variant that promise huge leaps in reasoning, coding, and long-cont…