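Just over ten hours ago, DeepSeek released a new paper titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models", written in collaboration with Peking University, with Liang Wenfeng again listed among the authors. A quick summary of the problem this new research sets out to solve: today's large language models rely mainly on mixture-of-experts (MoE) to ...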
The open-source DeepSeek R1 model and its distilled local versions are shaking up the AI community. The DeepSeek models are the best-performing open-source models and are highly useful as agents and ...
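In recent months, an interesting pattern has emerged in the large language model (LLM) field: although the open-source community remains active, closed-source models (such as the GPT 5 series, Claude 4.5, and Gemini 3.0) appear to be widening the gap at an accelerating pace. Perhaps because Christmas is approaching in the West, the major vendors have been rolling out their big releases one after another. The gap shows up not only in benchmark scores but even more in ...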
Threat actors are taking advantage of the rise in popularity of DeepSeek to promote two malicious infostealer packages on the Python Package Index (PyPI), where they impersonated developer tools ...
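The former focuses on balance and practicality, and suits everyday Q&A, general agent tasks, and tool calling in real application scenarios. Its reasoning reaches GPT-5 level, slightly below Gemini-3.0-Pro. The latter pursues maximum reasoning capability, with reasoning benchmark performance rivaling Gemini-3.0-Pro. It also took gold medals at IMO 2025, CMO 2025, ICPC World Finals 2025, and IOI 2025 in one sweep. Key point: at ICPC it reached human contestants' ...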
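"Server busy, please try again later." A year ago, I was one of the users held hostage by that sentence. DeepSeek burst onto the scene with R1 one year ago today (2025.1.20) and drew global attention the moment it appeared. Back then, just to use DeepSeek smoothly, I combed through self-deployment tutorials and downloaded quite a few ...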
DeepSeek is set to become the default decision-making tool for local government officials in China. In several towns, high-level officials have recently instructed their staff on using the technology, ...
South Korean officials on Saturday temporarily restricted Chinese AI Lab DeepSeek’s app from being downloaded from app stores in the country pending an assessment of how the Chinese company handles ...
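The DeepSeek R1 paper expands to 86 pages: reinforcement learning boosts reasoning ability, and open source rivals closed-source models. The R1 paper has ballooned to 86 pages! DeepSeek shows the world that open source can not only catch up with closed source but also teach it a thing or two. The whole internet is stunned! Two days ago, DeepSeek quietly updated the R1 paper, which "swelled" from the original 22 pages to 86. The new paper shows that all it takes is ...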