Muon - Search News

Moonshot AI open-sources an improved Muon optimizer, cutting compute requirements by 48% versus AdamW; it works for DeepSeek too

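Compute requirements drop 48% versus AdamW: Muon, the training optimization algorithm proposed by OpenAI technical staff, has been pushed a step further by the Moonshot AI team! The team identified a scaling law for the Muon method, made improvements, and demonstrated that Muon also works for larger models. On Llama-architecture models of various sizes up to 1.5B parameters, the improved Muon needs only 52% of the compute that AdamW does. The team also ...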

Sina

The open-source race is getting crowded! Moonshot AI open-sources a new version of the Muon optimizer

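Moonshot AI and DeepSeek have "collided" once again. Last time it was papers: the two released improved attention mechanisms almost back to back; see "Colliding with DeepSeek NSA: MoBA, a new attention architecture co-authored by Kimi's Yang Zhilin, is released with code" and "Just now! DeepSeek's Liang Wenfeng personally signs the paper unveiling the new attention architecture NSA". This time it is open source.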

36Kr

One hard-won blog post lands an OpenAI offer; Muon's author hits out: almost all optimizer papers ...

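Keller Jordan: what necessary connection is there between "writing an optimizer paper with pretty numbers and polished charts" and "whether the optimizer is actually useful"? It was not a top-conference paper, it was not posted on arXiv, and it hardly even counts as "formally published", yet this plain blog post earned a researcher an offer from OpenAI ...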

Tencent News

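Microsoft and Harvard open-source an innovative optimizer: it outperforms Muon across the board and improves large-model training efficiency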

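A professional community focused on AIGC, following the development and real-world deployment of large language models (LLMs) such as those from Microsoft & OpenAI, Baidu ERNIE Bot, and iFlytek Spark, with an emphasis on LLM market research and the AIGC developer ecosystem. Follow us! As large models grow more capable, the compute resources needed for training are growing explosively. For example, training a large language model with ten billion parameters ...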

From MSN

Muon optimizer: making AI training faster and cheaper, and how Essential AI extended deep learning's compute ...

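Reinventing how AI is trained: the practical efficiency of the Muon optimizer. In May 2025, the Muon optimizer developed by the Essential AI research team in San Francisco set off a small revolution in deep learning. The study, titled "Practical Efficiency of Muon for Pretraining", was published on arXiv (arXiv:2505.02222v1). The research shows ...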

Business Insider

A weirdly wobbly 'muon' particle might revolutionize physics by revealing a 5th force of ...

A subatomic particle called the muon is wobbling far more than leading physics models can explain. Its unusual behavior could be evidence of a fifth force of nature or a new dimension. Scientists ...

Nature

Physicists spellbound by deepening mystery of muon particle’s magnetism

The muon’s mysteries continue to leave physicists spellbound. Last year, an experiment suggested that the elementary particle had inexplicably strong magnetism, possibly breaking a decades-long streak ...

Wired

To Observe the Muon Is to Experience Hints of Immortality

All people want to enact a paradigm shift, don't they? Even if it's not mRNA, or Lego, we want at least, on our one chance on Earth, to make a meme happen. So imagine the excitement on April 7, when ...

Forbes

Why The Unexpected Muon Was The Biggest Surprise In Particle Physics History

Cosmic rays, which are ultra-high energy particles originating from all over the Universe, strike protons in the upper atmosphere and produce showers of new particles. The fast-moving charged ...

Ars Technica

Fermilab’s latest muon measurements hint at cracks in the Standard Model

The Muon g-2 experiment (pronounced “gee minus two”) is designed to look for tantalizing hints of physics beyond the Standard Model of particle physics. It does this by measuring the magnetic field ...
