Build your AI product with ease using Nemotron 3 Online
Statistics
Nemotron 3 Online by the numbers
Nemotron 3 Online lets you build your AI product with ease in just a few days
Super: 120B total parameters with 12B active
Context window up to 1M tokens
Nano: 30B total parameters with 3.5B active
Where teams use Nemotron 3
Practical evaluation paths
These workflows are the fastest ways to validate long-context reasoning, agentic behavior, and throughput before you commit to deployment.
Long-document synthesis
Summarize reports, legal briefs, and multi-chapter research with Nemotron 3 in a single prompt window.
Multi-file code analysis
Ask the models to navigate large repos, explain architecture, and propose refactors.
Agentic tool workflows
Test multi-step planning and tool calling for research, ops, or automation tasks.
Retrieval + 1M context
Compare RAG strategies against full-context prompting to see what works best.
Local deployment planning
Evaluate quantization targets, VRAM needs, and latency before moving on-prem.
Multilingual evaluations
Run the same prompts across languages to measure consistency and quality.
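For the local deployment planning path above, a back-of-the-envelope weight-memory estimate is a useful first check. The sketch below multiplies total parameter count by bits per weight. It covers weights only (KV cache, activations, and runtime overhead are excluded), and the precision targets are illustrative assumptions rather than officially supported formats. Total parameters are used rather than active ones, on the assumption that every expert stays resident in memory.

```python
# Rough weight-memory estimate: parameters * bits / 8 bytes.
# Assumption: weights only; KV cache, activations, and overhead are excluded.

MODELS_B = {"Nano": 30, "Super": 120}  # total parameters in billions, from the stats above
BITS = (16, 8, 4)                      # hypothetical precision targets


def weight_gb(total_params_b: float, bits: int) -> float:
    """Return the approximate weight size in GB (1 GB = 1e9 bytes)."""
    return total_params_b * 1e9 * bits / 8 / 1e9


for name, params_b in MODELS_B.items():
    row = ", ".join(f"{bits}-bit: {weight_gb(params_b, bits):.0f} GB" for bits in BITS)
    print(f"{name}: {row}")
# Nano: 16-bit: 60 GB, 8-bit: 30 GB, 4-bit: 15 GB
# Super: 16-bit: 240 GB, 8-bit: 120 GB, 4-bit: 60 GB
```

Treat these numbers as a lower bound: long-context prompts add substantial cache memory on top of the weights.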
Research highlights
Architecture and benchmark visuals
Figures sourced from the paper and shared assets, highlighting Nemotron 3 architecture and evaluation snapshots.


Nemotron 3 paper figures
Paper visuals
Key diagrams and benchmark curves extracted from the paper.
Architecture and routing
MoE routing diagrams used in Nemotron 3 Nano.
The paper introduces Nemotron 3 (Nano, Super, Ultra) as a Mixture-of-Experts hybrid Mamba–Transformer family built for strong throughput and up to 1M-token context.
Most layers interleave Mamba-2 and MoE blocks with a small number of self-attention layers; larger models add LatentMoE and MTP layers for quality and faster generation.
Post-training uses multi-environment reinforcement learning to improve reasoning, multi-step tool use, and budget-controlled inference.
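To make the "total versus active parameters" idea concrete, here is a minimal sketch of generic top-k expert routing in a Mixture-of-Experts layer. It is not the Nemotron 3 router: the expert count, the value of k, and the function name `top_k_route` are hypothetical, chosen only to show how a token activates a small subset of experts.

```python
import math
import random


def top_k_route(logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)               # subtract max for numerical stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


random.seed(0)
NUM_EXPERTS, K = 8, 2  # hypothetical sizes, not taken from the paper
router_logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

routes = top_k_route(router_logits, K)
print(routes)                                   # (expert index, mixing weight) pairs
print(f"active experts: {K}/{NUM_EXPERTS}")
```

Only the selected experts run for a given token, which is why a model such as Super can hold 120B parameters while using about 12B of them (10%) per token.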


Accuracy-efficiency trade-off
The figure illustrates the accuracy-efficiency trade-off curves of Nemotron 3 Nano as the token budget varies.
Inference-time budget control lets you set a maximum token budget for the thinking trace.
When the budget is reached, appending the `</think>` token prompts the model to continue with the response based on the partial trace.
The curves below show how accuracy trades off against efficiency as the token budget changes.
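The budget mechanism described above can be sketched on the client side as follows. This is a simplified illustration, not the official inference API: `cap_thinking` is a hypothetical helper, and the list of strings stands in for a real token stream from a model server.

```python
from typing import Iterable, List

END_THINK = "</think>"  # end-of-thinking marker named in the text above


def cap_thinking(stream: Iterable[str], budget: int) -> List[str]:
    """Collect thinking tokens until the model closes the trace or the budget is hit.

    The returned trace always ends with END_THINK, so the model can be asked
    to continue with its answer from the (possibly partial) trace.
    """
    trace: List[str] = []
    for token in stream:
        if token == END_THINK or len(trace) >= budget:
            break
        trace.append(token)
    trace.append(END_THINK)
    return trace


# Stand-in for a streamed thinking trace (hypothetical tokens).
fake_stream = ["First,", "compare", "both", "options", "carefully", END_THINK]
print(cap_thinking(fake_stream, budget=3))
# ['First,', 'compare', 'both', '</think>']
```

In a real setup the capped trace would be appended to the prompt and sent back for a second generation call that produces the final response. Sweeping `budget` over several values is one way to reproduce an accuracy versus cost curve on your own prompts.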




Integrations
Integrate with your favorite tools
Connect seamlessly with popular platforms and services to enhance your workflow.
Video walkthrough
Community demo and first-look coverage
A quick community video showing how Nemotron 3 Super behaves in practice and what developers are testing.
Video sourced from community discussions.
Frequently asked questions
If you have any questions, feel free to contact us.
Testimonials
What our customers say about us
Jonathan Yombo
Software Engineer
Nemotron 3 Online is excellent and practical, with no hassle. A real gold mine.
Yves Kalume
Android GDE
With no web design experience, I redesigned my entire website with Tailwindcss in just a few minutes, thanks to Nemotron 3 Online.
Yucel Faruksahan
Creator of Tailkits
The Nemotron 3 Online template is very well made. It is the best template I have seen, bar none :)
Anonymous author
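Product Manager
I am not familiar with Tailwind and wanted to build some pages myself, so I searched online for many hero pages and blocks. Most of them either did not give me a clear idea, required some HTML/CSS coding background to change the original files, or were too expensive. I downloaded one of the Nemotron 3 Online templates. It is very easy to understand, and you can modify the code and blocks from the start to fit the purpose of your page perfectly.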
Shekinah Tshiokufila
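Senior Software Engineer
Nemotron 3 Online is redefining web design standards. These blocks offer a simple and efficient approach for people who love beautiful design but may lack the time to implement it. I can only recommend this incredible wonder.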
Oketa Fred
Full-Stack Developer
I absolutely love Nemotron 3 Online! These component blocks are beautifully designed and easy to use, which makes creating a great website effortless.
