AI Board

Jensen Huang GTC 2026 Keynote — NVIDIA Founder & CEO, Full Keynote Transcript

Author
Kyungjin Kim, AI Researcher
Date
2026-03-18 19:43
Views
198

Jensen Huang's GTC 2026 Keynote

Full keynote by NVIDIA founder and CEO Jensen Huang (translated)

GTC 2026 · San Jose, California



Table of Contents

1. GTC Opening — Welcome to the Technology Conference

2. CUDA's 20th Anniversary — The Foundation of Accelerated Computing

3. From GeForce to RTX to Neural Rendering

4. Structured and Unstructured Data — cuDF & cuVS

5. The Accelerated Computing Platform — Vertically Integrated, Horizontally Open

6. AI Across Industries & the CUDA X Libraries

7. AI-Native Companies and the Arrival of the Inference Inflection

8. A Trillion Dollars of Demand — The Future of AI Infrastructure

9. The King of Inference Performance — The Cost-per-Token Revolution

10. The Vera Rubin Architecture — Designed for Agentic AI

11. Token Factory Economics — A New Formula for Revenue

12. The Roadmap — Feynman, Rosa, and on to Space

13. The Open Claw Revolution — An Operating System for the Agentic Era

14. Physical AI and Robotics — From Autonomous Driving to Olaf

15. Closing — The Future of GTC



Part 1: GTC Opening — Welcome to the Technology Conference

The GTC 2026 keynote begins; NVIDIA's three platforms introduced

Welcome to GTC. Let me remind you of one thing: this is a technology conference. To everyone who lined up so early this morning, it is wonderful to have you here.

At GTC we are going to talk about technology. We are also going to talk about platforms. NVIDIA has three platforms. You probably think we mostly talk about one of them — the one built around CUDA X. Systems is another platform, and now we have a new platform called AI Factories. Today I will talk about all three. And most importantly, we are going to talk about the ecosystem.

Before we begin in earnest, let me thank our pregame show hosts — they did a wonderful job: Sarah Guo of Conviction; Alfred Lin of Sequoia Capital, NVIDIA's first venture capitalist; and Gavin Baker, NVIDIA's first major institutional investor. These three are deeply versed in technology and have remarkable insight across the entire technology ecosystem.

I also want to thank every company here today. NVIDIA, as you know, is a platform company. We have technology, we have platforms, and we have a rich ecosystem. Nearly 100% of a $100 trillion swath of industry is represented here today: 450 companies sponsored this event, with 1,000 technical sessions and 2,000 speakers.

This conference will cover every layer of the five-layer cake of artificial intelligence — from infrastructure such as land, power, and buildings, to chips, platforms, models, and ultimately the applications that will make this industry take off.


Part 2: CUDA's 20th Anniversary — The Foundation of Accelerated Computing

Twenty years of the CUDA platform and the flywheel effect

It all began right here. This year is CUDA's 20th anniversary. We have been building CUDA for 20 years — 20 years dedicated to this architecture, this revolutionary invention: SIMT, single instruction, multiple threads. You write scalar code and it can fan out into a multithreaded application, far easier to program than SIMD. More recently we added tiles so that you can program the Tensor Cores, along with the mathematical structures that are so foundational to artificial intelligence today.
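
The SIMT idea — write scalar code once, and the hardware runs it across thousands of threads indexed by a thread ID — can be sketched in plain Python. This is a conceptual sketch only, not real CUDA: an actual kernel would use `threadIdx`/`blockIdx` on the GPU, and the loop below stands in for the hardware thread scheduler.

```python
# Conceptual sketch of the SIMT model: one scalar "kernel" function,
# executed once per thread index, the way a CUDA kernel is launched
# over a grid of threads. (Illustrative only -- not real CUDA.)

def saxpy_kernel(tid, a, x, y, out):
    # Each "thread" runs the same scalar code on its own element.
    out[tid] = a * x[tid] + y[tid]

def launch(kernel, n_threads, *args):
    # Stand-in for the hardware scheduler: run the scalar kernel
    # once for every thread id in the grid.
    for tid in range(n_threads):
        kernel(tid, *args)

n = 8
x = [float(i) for i in range(n)]
y = [1.0] * n
out = [0.0] * n
launch(saxpy_kernel, n, 2.0, x, y, out)
print(out)  # each element is 2*x[i] + y[i]
```

The programmer reasons about a single scalar thread; parallelism comes entirely from the launch, which is what makes SIMT easier to write than explicit SIMD lanes.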

Thousands of tools, compilers, frameworks, and libraries are available in open source. There are a couple of hundred thousand public projects; CUDA is integrated into literally every ecosystem. This one chart basically describes 100% of NVIDIA's strategy.

Ultimately, the single hardest thing to achieve is the thing at the bottom: the installed base. It took 20 years, and now hundreds of millions of GPUs and computing systems around the world run CUDA. We are in every cloud and every computer company, and we serve just about every industry.

CUDA's installed base is what accelerates the flywheel. The installed base attracts developers; developers create new algorithms that achieve breakthroughs — deep learning, for example; those breakthroughs open entirely new markets, build new ecosystems, and create an even larger installed base. That flywheel is accelerating right now: downloads of NVIDIA libraries are growing at an enormous rate.

This flywheel is what lets the computing platform sustain so many applications and so many new breakthroughs, and most importantly, it gives this infrastructure an extraordinarily long useful life. It is also why the cloud price of Ampere, which we shipped six years ago, is actually going up.

Because accelerated computing speeds up applications enormously, and because we continuously update the software, you get not just the initial performance pop when you adopt it but continuous cost reduction over time. The installed base is so large that when we release a new optimization, millions of people benefit. This is the dynamic by which the NVIDIA architecture expands its reach and accelerates growth while simultaneously driving down the cost of computing.


Part 3: From GeForce to RTX to Neural Rendering

Twenty-five years from programmable shaders to DLSS 5

Our journey actually began 25 years ago, with GeForce. I know how many of you grew up with GeForce. GeForce is NVIDIA's greatest marketing campaign: we attract future customers long before you can afford to pay for it yourself. Your parents paid. Year after year after year they paid, until one day you became an amazing computer scientist and a proper customer.

Twenty-five years ago we invented the programmable shader — a perfectly unobvious invention that made an accelerator programmable: the world's first programmable accelerator, the pixel shader. Five years after that, CUDA was invented. It was one of the biggest investments we ever made, and at the time we could barely afford it. It consumed the vast majority of the company's profits to carry CUDA, on the back of GeForce, into every computer.

The pixel shader led to the GeForce revolution, and GeForce brought CUDA to the world. That is how Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton, Andrew Ng, and so many others were able to discover that the GPU was the best tool for accelerating deep learning. The big bang of AI had begun.

About ten years ago, we decided to fuse two new ideas into programmable shading: hardware ray tracing, and the then-novel vision that AI would revolutionize computer graphics. Just as GeForce brought AI to the world, AI is now about to completely transform computer graphics itself.

Today I will show you a glimpse of the future: our next-generation graphics technology, neural rendering — the fusion of 3D graphics and artificial intelligence. This is DLSS 5. We combined controllable 3D graphics — the structured data of virtual worlds — with generative AI, probabilistic computing. One is fully predictive; the other is probabilistic yet highly realistic. The result is beautiful, astonishing, and controllable. This concept of fusing structured information with generative AI will repeat itself in industry after industry. Structured data is the foundation of trustworthy AI.


Part 4: Structured and Unstructured Data — cuDF & cuVS

The data processing revolution: partnerships with IBM, Dell, and Google Cloud

This is structured data: SQL, Spark, Pandas, Velox, and the large platforms you know — Snowflake, Databricks, Amazon EMR, Azure Fabric, Google Cloud BigQuery — all processing dataframes. A dataframe is a giant spreadsheet, and it holds all of life's information. It is the backbone of business and the ground truth of enterprise computing.

Now AI is going to use structured data, so we have to accelerate it enormously. In the future, AI will be much faster than we are, and future agents will use structured databases as well.

Then there are unstructured databases: vector databases, PDFs, video, speech. About 90% of the world's information is unstructured data, and until now it has been effectively useless to the world — you read it, put it in a file system, and that was it. It was hard to query and hard to search, because there is no easy way to index it: you have to understand its meaning and purpose. Now AI can do that.

NVIDIA has created two foundational libraries. Just as we created RTX for 3D graphics, we created cuDF for dataframes and structured data, and cuVS for vector stores and unstructured data. These two platforms will be among the most important platforms of the future.
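
cuDF deliberately mirrors the pandas dataframe API, so the same code can run on CPU or GPU. The sketch below uses plain pandas; on a system with cuDF installed, replacing the import with `import cudf as pd` (or loading the `cudf.pandas` accelerator) is the intended drop-in pattern — treat the cuDF swap as illustrative, not verified here.

```python
import pandas as pd  # with cuDF installed: `import cudf as pd` for GPU execution

# A tiny "order-to-cash"-style table, aggregated the way a SQL GROUP BY
# would be -- the kind of dataframe workload cuDF accelerates.
orders = pd.DataFrame({
    "country": ["KR", "KR", "US", "US", "DE"],
    "amount":  [120.0, 80.0, 200.0, 50.0, 75.0],
})

totals = orders.groupby("country", as_index=False)["amount"].sum()
totals = totals.sort_values("country").reset_index(drop=True)
print(totals)
# per-country totals: DE=75.0, KR=200.0, US=250.0
```

Because the API surface is shared, the acceleration decision becomes a deployment choice rather than a rewrite — which is exactly why a drop-in dataframe library can spread through existing SQL/Spark/pandas ecosystems.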

Today we are announcing several partnerships. IBM, the inventor of SQL, is accelerating watsonx.data with cuDF. Nestlé runs supply-chain decisions across global operations in 185 countries on GPU-accelerated watsonx.data, processing the same workload five times faster at 83% lower cost. Dell has built the Dell AI Data Platform, integrating cuDF and cuVS. And in our collaboration with Google Cloud BigQuery, we reduced Snapchat's computing cost by nearly 80%.


Part 5: The Accelerated Computing Platform — Vertically Integrated, Horizontally Open

NVIDIA's business model and global cloud partnerships

It was originally called Moore's law: performance doubling every two years — which is another way of saying the cost of computing falls every year. But Moore's law has run out of steam. We need a new approach, and accelerated computing gives us that leap forward. NVIDIA is an algorithms company. Because we keep optimizing the algorithms, and because our reach and installed base are so large, we can continuously reduce everyone's cost of computing while increasing scale and speed.

Here is what NVIDIA did — and this theme will keep repeating. NVIDIA is vertically integrated: the world's first company that is vertically integrated yet horizontally open. The reason is simple. Accelerated computing is not a chip problem. It is not even a system problem. The missing word is 'application' acceleration. If one thing could make everything faster, that would be a CPU — but CPUs have hit their limit.

To keep speeding up applications and driving down cost, you need domain-specific acceleration. So NVIDIA has to stack library upon library, domain upon domain, vertical upon vertical: understand the application, understand the domain, understand the algorithm at a fundamental level, and then find a way to deploy it anywhere — data center, cloud, edge, or robot.

We are deeply integrated with the world's cloud service providers: Google Cloud, AWS, Microsoft Azure, Oracle, CoreWeave, and more. We are bringing OpenAI to AWS; with Azure we collaborate deeply on the AI foundry, accelerate Bing search, and support confidential computing — where even the operator cannot see your data or access your model. At Oracle we were the first AI customer and, at the same time, the first supplier.

With Palantir and Dell we have built a new kind of AI platform that can be deployed on-premises, in the field, in any country and any air-gapped environment. AI can be deployed literally anywhere.


Part 6: AI Across Industries & the CUDA X Libraries

AI libraries across automotive, healthcare, robotics, quantum computing, and more

The largest share of attendees at this GTC comes from financial services. Algorithmic trading is shifting from classical machine learning and human feature engineering to deep learning and large language models. Healthcare is having its ChatGPT moment, with exciting work in AI biology for drug discovery, customer-service AI agents, and diagnostic support.

Physical AI — robotic systems. In the industrial sector, the largest build-out in human history is under way: AI factories, chip fabs, and computer factories are being constructed around the world. Robotics is a five-trillion-dollar industry and manufacturing sector, and NVIDIA has worked in it for more than a decade, building the three computers needed to make a robot: a training computer, a synthetic-data-generation and simulation computer, and the robotics computer inside the robot itself. There are 110 robots on display at this show.

Telecommunications is roughly a two-trillion-dollar industry, about as large as the world's IT industry. The base station will be completely reinvented as an AI infrastructure platform, because AI will run at the edge. We have major partnerships with Nokia, T-Mobile, and others.

At the core of our business are the CUDA X libraries — the algorithms NVIDIA invents are what make us special. At this show we are presenting some 100 libraries, announcing 70 libraries and about 40 models. Libraries are the company's core asset: they activate the computing platform so problems get solved and impact gets made. One of the most important libraries we ever built, cuDNN — CUDA Deep Neural Networks — triggered the big bang of modern AI.


Part 7: AI 네이티브 기업과 추론 인플렉션의 도래

ChatGPT, 추론 AI, Claude Code — 컴퓨팅 수요 100만 배 증가

수많은 소규모 기업들이 있습니다. 그 목록은 방대합니다. OpenAI, Anthropic은 물론이고 다양한 수직 시장을 위한 수많은 다른 기업들이 있습니다. 지난 2년간, 특히 지난 1년간 폭발적인 성장이 있었습니다. 이 산업은 1,500억 달러의 벤처 투자를 기록했습니다 — 인류 역사상 최대 규모입니다.

이번이 처음으로 투자 규모가 수백만 달러, 수천만 달러에서 수억 달러, 수십억 달러로 뛰어올랐습니다. 그 이유는 역사상 처음으로 이 기업들 하나하나가 컴퓨팅을 필요로 하고, 그것도 아주 많이 필요로 하기 때문입니다. 토큰이 필요합니다, 아주 많이. 그들은 토큰을 생성하고 만들거나, Anthropic과 OpenAI 등이 만든 토큰에 가치를 더할 것입니다.

우리가 컴퓨팅을 재발명했기 때문에, PC 혁명 때처럼, 인터넷 혁명 때처럼, 모바일 클라우드 시대처럼 완전히 새로운 기업군이 탄생하고 있습니다. 우리는 지금 새로운 플랫폼 전환의 시작점에 있습니다.

지난 2년간 무슨 일이 있었을까요? 세 가지가 일어났습니다. 첫째, ChatGPT가 생성형 AI 시대를 열었습니다. 이해하고 인식하는 것을 넘어 번역하고 생성하는 — 고유한 콘텐츠를 생성하는 능력입니다. 생성형 컴퓨팅은 소프트웨어의 능력이지만, 컴퓨팅 방식 자체를 근본적으로 바꿨습니다. 컴퓨팅은 검색 기반이었는데, 이제 생성형입니다.

둘째, 추론 AI — o1, 그리고 o3로 본격화되었습니다. 추론은 AI가 스스로 반성하고, 생각하고, 계획하고, 이해할 수 없는 문제를 분해하여 이해할 수 있는 단계로 나눌 수 있게 했습니다. 연구에 기반을 둘 수 있게 되어, o1은 생성형 AI를 신뢰할 수 있고 진실에 기반한 것으로 만들었습니다. 그것이 ChatGPT를 폭발적으로 성장시킨 매우 큰 전환점이었습니다.

셋째, Claude Code — 최초의 에이전틱 모델입니다. 파일을 읽고, 코딩하고, 컴파일하고, 테스트하고, 평가하고, 돌아가서 반복할 수 있습니다. Claude Code는 소프트웨어 엔지니어링을 혁명적으로 바꿨습니다. NVIDIA의 100%가 Claude Code, Codex, Cursor를 사용하고 있습니다. 오늘날 AI 에이전트의 도움 없이 코딩하는 소프트웨어 엔지니어는 한 명도 없습니다.

역사상 처음으로, AI에게 '무엇을, 언제, 어떻게'를 묻는 것이 아니라 '만들어라, 해라, 구축하라'고 말합니다. 도구를 사용하고, 맥락을 읽고, 파일을 읽으라고 합니다. 에이전틱하게 문제를 분해하고, 추론하고, 반성하고, 문제를 풀고, 실제로 작업을 수행합니다. 인식할 수 있었던 AI가 생성할 수 있는 AI가 되었고, 추론할 수 있는 AI가 되었고, 이제 실제로 일을 할 수 있는 AI가 되었습니다.

지난 2년간 컴퓨팅 수요가 약 10,000배 증가했고, 사용량은 아마 100배 증가했습니다. 저는 지난 2년간 컴퓨팅 수요가 100만 배 증가했다고 믿습니다. 우리 모두가 느끼는 것이고, 모든 스타트업이, OpenAI가, Anthropic이 느끼는 것입니다. 더 많은 용량만 있으면 더 많은 토큰을 생성하고, 매출이 올라가고, 더 많은 사람이 사용하고, AI는 더 똑똑해질 것입니다. 우리는 이제 그 긍정적 플라이휠에 도달했습니다. 추론 인플렉션이 도래한 것입니다.
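
The million-fold figure follows from compounding the two growth factors: if compute consumed per task grew roughly 10,000x and usage grew roughly 100x, total demand scales as their product. A back-of-the-envelope check, assuming the two factors are independent:

```python
# Back-of-the-envelope: total compute demand = (compute per task) x (tasks).
compute_per_task_growth = 10_000   # reasoning/agentic models think far longer per query
usage_growth = 100                 # many more users and many more queries

total_demand_growth = compute_per_task_growth * usage_growth
print(total_demand_growth)  # 1000000 -- the "million-fold" claim
```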


Part 8: A Trillion Dollars of Demand — The Future of AI Infrastructure

At least $1 trillion of computing demand projected through 2027

A year ago, standing in this same spot, we had visibility into roughly $500 billion of very high-confidence demand and purchase orders for Blackwell and Rubin through 2026. I told you so at the time. Five hundred billion dollars is an enormous number — and none of you were impressed. I know why: you all had record years.

Today, one year after that GTC, I see at least $1 trillion through 2027. Does that make sense? That is what the rest of this talk is about. In fact, we will come up short: computing demand will certainly be much higher than that.

Sixty percent of our business is the top five hyperscalers, and part of that is their internal AI consumption — recommender systems moving from tables and collaborative filtering to deep learning and large language models, and search likewise. The other 40% is literally everywhere: regional clouds, sovereign clouds, enterprise, industrial, robotics, edge, large systems, small servers. The diversity of AI is its resilience. This is not a single-app technology; it is unmistakably a new computing-platform shift.


Part 9: The King of Inference Performance — The Cost-per-Token Revolution

Blackwell's 50x performance gain, the Token King, and the AI factory concept

Last year was NVIDIA's year of inference. While Hopper was at its peak, we made the bold move of completely redesigning the system. Grace Blackwell with NVLink 72 was a giant gamble. It was not easy — thank you to all of our partners.

NVLink 72 and NVFP4 — not merely FP4, but an entirely new kind of Tensor Core and math unit. We proved that NVFP4 can run inference with no loss of precision while enormously improving performance and energy efficiency, and NVFP4 can now be used for training as well. We invented new algorithms such as Dynamo and TensorRT-LLM, and we even built a multi-billion-dollar supercomputer for kernel optimization.

These are the results of Semi-Analysis's largest and most comprehensive AI inference benchmark. The vertical axis is tokens per watt — every data center is power-limited, so you must squeeze out as many tokens as possible. The horizontal axis is inference speed — the faster you are, the larger the model and the longer the context you can handle, and the more the model can think. That axis is, in effect, how smart the AI is.

No one will be surprised that NVIDIA has the world's best performance. The surprise is that in a single generation — where Moore's law would have given you 50%, at best 1.5x — we achieved 35x. Dylan Patel of Semi-Analysis accused me of sandbagging: it was actually 50x.

Our cost per token is the lowest in the world, and effectively impossible to catch. With the wrong architecture, even free is not cheap enough. You have to build the gigawatt-class data center anyway — amortized over 15 years, that is roughly $40 billion, and $4 billion is spent even if you put nothing in it. You must put in the best computing system.

Your data center is no longer a data center for files. It is now a factory that generates tokens, and a factory is power-limited. Everyone is hunting for land, power, and buildings, and once built, you are capped by power. You must recognize clearly that inference is your workload, tokens are the new raw material, and computing is revenue — and your architecture must be optimized for that. In the future, every CSP, every computer company, and every AI company will think in terms of token-factory efficiency.
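
The 'power-limited factory' framing can be made concrete: at fixed power, revenue is set by energy efficiency (tokens per joule) and price per token, so improving tokens-per-watt directly raises the revenue ceiling. A toy model — every number below is a hypothetical illustration, not NVIDIA or market data:

```python
# Toy token-factory model: the revenue ceiling of a power-limited data center.
# All figures are hypothetical placeholders for illustration only.

SECONDS_PER_YEAR = 365 * 24 * 3600

def annual_revenue(power_watts, tokens_per_joule, usd_per_million_tokens):
    """Revenue/year for a factory converting every joule into tokens."""
    tokens_per_second = power_watts * tokens_per_joule  # watts = joules/second
    tokens_per_year = tokens_per_second * SECONDS_PER_YEAR
    return tokens_per_year / 1e6 * usd_per_million_tokens

# 1 GW factory, hypothetical efficiency of 0.5 tokens/joule, $3 per 1M tokens
rev = annual_revenue(1e9, 0.5, 3.0)
print(f"${rev/1e9:.1f}B / year")

# At fixed power and price, doubling tokens/joule doubles revenue:
assert annual_revenue(1e9, 1.0, 3.0) == 2 * rev
```

The model makes the keynote's point mechanical: once power is the binding constraint, tokens-per-watt is not just an engineering metric but the top line of the business.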


Part 10: The Vera Rubin Architecture — Designed for Agentic AI

From DGX-1 to Vera Rubin: 7 chips, 5 racks, 1 supercomputer

On April 6, 2016 — ten years ago — we introduced DGX-1, the world's first computer designed for deep learning: eight Pascal GPUs connected by first-generation NVLink, 170 teraflops. With Volta we introduced the NVLink switch, connecting 16 GPUs at full bandwidth. Hopper, the first GPU with the FP8 Transformer Engine, opened the generative AI era. Blackwell redefined AI supercomputing system architecture with NVLink 72.

And now, Vera Rubin. It is designed for every stage of agentic AI and advances every pillar of computing: CPU, storage, networking, and security. NVLink 72, 3.6 exaflops of compute, 260 TB/s of full-bandwidth NVLink — the engine driving the agentic AI era at extreme speed. It carries the new Vera CPU, the AI-native STX storage rack, and BlueField-4, and it scales out over Spectrum-X co-packaged optics.

There is a remarkable new addition: the Groq 3 LPX wrapper. Tightly coupled to Vera Rubin, Groq's LPU, with its massive on-chip SRAM, acts as a token accelerator on top of an already blisteringly fast Vera Rubin. Together: 35x more throughput per megawatt. Seven chips, five rack-scale computers, one revolutionary AI supercomputer — 40 million times more compute in just ten years.

This is the Vera Rubin system: 100% liquid-cooled, with every cable gone. Installation that used to take two days now takes two hours. It is cooled with 45°C warm water, so the energy and cost that would have gone into data-center chilling go into the system instead. The secret weapon is the sixth-generation NVLink scale-up switching system — not Ethernet, not InfiniBand, but NVLink generation six. This is extraordinarily hard to do.

Rubin Ultra slides vertically into a new rack called Kyber, connecting 144 GPUs into a single NVLink domain: compute in the front, NVLink switches in the back, one giant computer.


Part 11: Token Factory Economics — A New Formula for Revenue

Token tiering, the Groq integration, and 350x gains from disaggregated inference

This is the most important chart for the future of the AI factory. Every CEO in the world will track it and study it deeply. The vertical axis is throughput; the horizontal axis is token speed. Tokens are the new raw material, and as the market matures they will be split into tiers, like every other commodity.

High throughput at low speed can serve a free tier. A middle tier might be $3 per million tokens, the next tier $6. Bigger models, faster speeds, longer context — higher prices. A premium model could be $150 per million tokens. If, as a researcher, you burn 50 million tokens a day for $150, that is nothing.

The move from Hopper to Grace Blackwell raised free-tier throughput enormously while increasing throughput 35x in the most heavily monetized band. Vera Rubin raises throughput in every tier, with a 10x increase in the highest-ASP, most valuable segment.

Suppose you allocate 25% of your data center to the free tier, 25% to the middle tier, 25% to the advanced tier, and 25% to the premium tier: Blackwell generates 5x more revenue, and Vera Rubin another 5x on top of that. You should move to Vera Rubin as fast as you can.
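
The tiered-pricing arithmetic can be sketched directly. The prices below are the keynote's illustrative tiers; the per-tier throughput figures and the uniform 5x generational multiplier are hypothetical placeholders, not published benchmark numbers:

```python
# Toy tiered-revenue model for a token factory.
# Prices follow the keynote's illustrative tiers; throughputs are made up.

TIER_PRICE_PER_M = {"free": 0.0, "mid": 3.0, "advanced": 6.0, "premium": 150.0}

def revenue(m_tokens_by_tier):
    """Revenue given millions of tokens produced in each tier."""
    return sum(m_tokens_by_tier[t] * TIER_PRICE_PER_M[t]
               for t in TIER_PRICE_PER_M)

# Hypothetical baseline output (millions of tokens per tier) ...
base = {"free": 1000.0, "mid": 400.0, "advanced": 200.0, "premium": 10.0}
# ... and a generation that multiplies every tier's throughput by 5
next_gen = {t: 5 * v for t, v in base.items()}

print(revenue(base), revenue(next_gen))
# revenue scales linearly: 5x throughput in every tier -> 5x revenue
```

In the keynote's actual claim, the multiplier is not uniform — the gains are concentrated in the highest-priced tiers, which is precisely why revenue can grow faster than raw throughput.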

That is why we combined with Groq. Groq's deterministic dataflow processor is statically compiled and compiler-scheduled, designed around massive SRAM and specialized for exactly one workload: inference. With software called Dynamo, we have fully disaggregated inference: work suited to high throughput runs on Vera Rubin, while low-latency, bandwidth-bound decode generation runs on Groq.

Above roughly 1,000 tokens per second — where NVLink 72 reaches its limit — Groq pushes past that limit. If your workload is throughput-heavy, stay 100% Vera Rubin; if you generate a lot of coding and other high-value engineering tokens, add Groq to about 25% of the data center. Samsung is manufacturing the Groq LP 30 chip, shipping in Q3. In two years, the token generation rate of a one-gigawatt factory goes from 2 million to 700 million — a 350x increase.
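
Disaggregated inference of the kind described — splitting throughput-bound work from latency-bound decode — can be sketched as a simple router. The policy, names, and thresholds here are entirely hypothetical; the real Dynamo scheduler is far more sophisticated:

```python
# Minimal sketch of disaggregated inference routing (hypothetical policy).
# "rubin" stands in for the high-throughput GPU pool, "groq" for the
# low-latency SRAM-based decode pool; the 1000 tok/s cutoff is illustrative.

def route(request):
    # Latency-sensitive decode with a tight speed target goes to the
    # token accelerator; bulk prefill/batch work stays on the GPU pool.
    if request["phase"] == "decode" and request["target_tok_s"] > 1000:
        return "groq"
    return "rubin"

reqs = [
    {"phase": "prefill", "target_tok_s": 100},    # context ingestion
    {"phase": "decode",  "target_tok_s": 2000},   # interactive coding agent
    {"phase": "decode",  "target_tok_s": 300},    # batch summarization
]
print([route(r) for r in reqs])  # ['rubin', 'groq', 'rubin']

# The 350x headline is just the ratio of the two quoted rates:
assert 700_000_000 / 2_000_000 == 350
```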


Part 12: The Roadmap — Feynman, Rosa, and on to Space

Vera Rubin → Rubin Ultra → Feynman, the DSX factory platform, and data centers in space

Blackwell is the current generation. Rubin brings the Oberon system, and everything is always backward compatible. Oberon scales up over copper, and with optical scale-up it extends to NVLink 576. NVIDIA does both copper and optics: Kyber's NVLink 144, and Oberon's NVLink 72 plus optics to realize NVLink 576.

Rubin Ultra brings new chips and the LP 35, which for the first time integrates NVIDIA's NVFP4 math structures for a severalfold speedup. The generation after that is Feynman: a new GPU, a new LPU in the LP 40, a new CPU named Rosa (short for Rosalind), BlueField-5, and CX-10. Feynman scales up over both copper and co-packaged optics. A completely new architecture every single year.

NVIDIA has transformed from a chip company into an AI factory company — an AI infrastructure and computing company. We now build the entire AI factory. Too much power is wasted in these factories, so we built Omniverse: a platform where every component comes together virtually to design gigawatt-class AI factories.

The NVIDIA DSX platform: DSX-M for mechanical, thermal, electrical, and network simulation; DSX Exchange for AI-factory operational data; DSX Flex for dynamic power management between the grid and the data center; and DSX Max-Q for dynamically maximizing token throughput. Ecosystem partners include Siemens, Cadence, Dassault, Jacobs, and PTC. I am convinced there is another 2x of efficiency hiding in here.

And we are going to space. Thor is radiation-qualified and already flying on satellites. In the future we will build data centers in space as well; a new computer called Vera Rubin Space One will go to orbit. In space there is no conduction or convection, only radiation, so cooling is hard — but brilliant engineers are solving it.


Part 13: The Open Claw Revolution — An Operating System for the Agentic Era

Open Claw, Nemo Claw, the Nemotron alliance, and an enterprise AI renaissance

Peter Steinberger built a piece of software called Open Claw. Open Claw has become the most popular open-source project in human history, surpassing in a few weeks what Linux achieved in 30 years. That is how important it is. Type a command into your console and it finds Open Claw, downloads it, and builds you an AI agent. From then on, you can ask it to do anything.

Let me explain what Open Claw is. It is an agentic system. It connects to and calls large language models. It manages resources, accesses tools, accesses the file system, schedules, runs cron jobs, decomposes prompts into steps, and can invoke sub-agents. It has IO, so it can converse in any modality, send messages, send email. In truth, it is an operating system: Open Claw is the open-sourcing of the operating system of the agentic computer.
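
The agentic loop just described — call a model, decompose the task, dispatch tools, iterate — can be sketched in a few lines. The structure is fully hypothetical: no real Open Claw API is used, and `fake_llm` is a stand-in for an actual model call.

```python
# Minimal sketch of an agentic loop: plan -> act with tools -> observe.
# Entirely hypothetical structure; not Open Claw's actual API.

def fake_llm(prompt):
    # Stand-in for a model call: returns a canned plan for the demo task.
    if prompt.startswith("plan"):
        return ["read_file", "summarize"]
    return "summary of: " + prompt

TOOLS = {
    # Each tool takes the agent's state dict and returns an updated one.
    "read_file": lambda state: state | {"text": "quarterly report contents"},
    "summarize": lambda state: state | {"result": fake_llm(state["text"])},
}

def run_agent(task):
    state = {"task": task}
    steps = fake_llm(f"plan steps for: {task}")  # model decomposes the task
    for step in steps:                           # execute each tool in order
        state = TOOLS[step](state)
    return state["result"]

print(run_agent("summarize the quarterly report"))
```

The 'operating system' analogy falls out of this shape: the loop owns scheduling, tool access, and state, while the model supplies the decisions — much as a kernel owns resources while programs supply intent.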

Just as Windows made it possible to build the personal computer, Open Claw makes it possible to build the personal agent. For every company, every software company, the question for every tech CEO is: 'What is your Open Claw strategy?' Just as you needed a Linux strategy, an HTTP/HTML strategy, a Kubernetes strategy, every company now needs an agentic-systems strategy.

There is one caveat, however. An agentic system inside a corporate network can access sensitive information, execute code, and communicate with the outside world. It can reach employee records and financial data and send them out — which obviously cannot be allowed. So, together with Peter, we assembled the world's best security experts and hardened Open Claw for enterprise security and privacy. NVIDIA's Nemo Claw reference design integrates with Open Shell and provides a policy engine, network guardrails, and a privacy router.

This is NVIDIA's open-model initiative. We are now at the frontier in every domain of AI models: Nemotron (language), Cosmos (world foundation models), Groot (general-purpose robotics), Alpamayo (autonomous driving), BioNeMo (digital biology), and Earth-2 (AI physics). All of them lead their leaderboards, and Nemotron 3 in particular ranks among the top three models on Open Claw.

Today we announce the Nemotron alliance, joined by remarkable companies including Black Forest Labs, Cursor, LangChain, Mistral, Perplexity, Reflection, Sarvam, and Thinking Machines. Every enterprise software company will go agentic. Every SaaS company will become a GaaS company — Generative as a Service. This is a renaissance for enterprise IT: a two-trillion-dollar industry will become a multi-trillion-dollar industry, renting out agents specialized by domain rather than merely supplying tools.

In the future, every engineer at our company will need an annual token budget. If the base salary is a few hundred thousand dollars, we will pay perhaps half that again in tokens so that each engineer gets 10x productivity. In Silicon Valley it has already become a recruiting lever: 'How many tokens come with the job?' The Open Claw moment is as big as HTML, as big as Linux.


Part 14: Physical AI and Robotics — From Autonomous Driving to Olaf

A robotaxi-ready platform, humanoid robots, and Disney's Olaf

An agent perceives, reasons, and acts. Most of the agents discussed so far are digital agents operating in the digital world, but we have long been working on physically embodied agents — robots. The AI they need is physical AI.

There are 110 robots on display at this show, and nearly every company building robots works with NVIDIA. The three computers — a training computer, a synthetic-data-generation and simulation computer, and the robotics computer inside the robot — are ready, along with the full software stack.

Autonomous driving's ChatGPT moment has arrived: we now know that a car can successfully drive itself. Today we announce four new partners — BYD, Hyundai, Nissan, and Geely — together representing 18 million vehicles a year. They join existing partners Mercedes, Toyota, and GM, and a major partnership with Uber will bring robotaxi-ready vehicles onto its network in multiple cities.

With robotics companies such as ABB, Universal Robots, and KUKA, we are integrating physical AI models into simulation systems and deploying them on manufacturing lines. Caterpillar is with us, and so is T-Mobile: the base station of the future will be NVIDIA Aerial AI-RAN — a robotic radio tower that infers its beamforming and saves energy.

And one of my favorites: Disney's robot. Built with Disney and DeepMind on the Newton physics solver using NVIDIA Warp, Olaf learned to walk inside Omniverse and can now adapt to the physical world. Imagine the Disneyland of the future, with character robots walking around.


Part 15: Closing — The Future of GTC

The inference inflection, AI factories, the Open Claw revolution, and physical AI, summed up

I usually close by recapping what we covered. Today we talked about the inference inflection, about AI factories, about the agent revolution called Open Claw, and about physical AI and robotics.

Computing has exploded. From CNNs to Open Claw, agents are at work all over the world, but harnessing that power takes compute. So we solved the problem: we scaled computing 40 million-fold. Training was once the paradigm; now inference runs the world. Vera shows who's boss, cutting cost 35-fold, and Blackwell makes the tokens sing — the king of inference.

DSX and Dynamo turn power into revenue, and Nemo Claw stops agents that stray off course with a firm 'absolutely not.' And yes, it is open source. Cars that think and robots that run — this is not a movie. It has already begun. We build a new architecture every year, because the agents keep shouting, 'More tokens!'

Have a great GTC, everyone! Thank you!



KIMKJ.COM





Jensen Huang GTC 2026 Keynote

NVIDIA Founder & CEO — Full Keynote Transcript (original English)

GTC 2026 · San Jose, California



Table of Contents

1. Opening & Welcome to GTC

2. CUDA's 20th Anniversary — The Foundation of Accelerated Computing

3. From GeForce to RTX to Neural Rendering

4. Structured and Unstructured Data — cuDF & cuVS

5. The Accelerated Computing Platform — Vertically Integrated, Horizontally Open

6. AI Across Industries & the CUDA X Libraries

7. AI-Native Companies and the Arrival of the Inference Inflection

8. A Trillion Dollars of Demand — The Future of AI Infrastructure

9. The King of Inference Performance — The Cost-per-Token Revolution

10. The Vera Rubin Architecture — Designed for Agentic AI

11. Token Factory Economics — A New Formula for Revenue

12. The Roadmap — Feynman, Rosa, and Beyond

13. The Open Claw Revolution — An Operating System for the Agentic Era

14. Physical AI and Robotics — From Autonomous Driving to Olaf

15. Closing — The Future of GTC



Part 1: Opening & Welcome to GTC

The GTC 2026 keynote begins

Welcome to GTC. I just want to remind you, this is a tech conference. All these people lining up so early in the morning — all of you in here, it's great to see you.

At GTC, we're gonna talk about technology. We're gonna talk about platforms. NVIDIA has three platforms. You think that we mostly talk about one of them — the one related to CUDA X. Our systems is another platform, and now we have a new platform called AI factories. We're gonna talk about all of them. And most importantly, we're going to talk about ecosystems.

But before I start, let me thank our pregame show hosts. I thought they did a great job: Sarah Guo of Conviction; Alfred Lin of Sequoia Capital, NVIDIA's first venture capitalist; and Gavin Baker, NVIDIA's first major institutional investor. These three people are deep in technology, deep in what's going on, and of course, they have just a really broad reach across the technology ecosystem.

And then, of course, all of the VIPs that I hand-selected to join us today — an all-star team. I wanna thank all of you for that.

I also want to thank all the companies that are here. NVIDIA, as you know, is a platform company. We have technology, we have our platforms, we have a rich ecosystem. And today, probably 100% of the $100 trillion of industry is here: 450 companies sponsored this event — I want to thank you — with 1,000 technical sessions and 2,000 speakers.

This conference is gonna cover every single layer of the five-layer cake of artificial intelligence: from land, power, and shell — the infrastructure — to chips, to the platforms, the models, and of course, most important and ultimately what's gonna get this industry to take off, all of the applications.


Part 2: CUDA's 20th Anniversary — The Foundation of Accelerated Computing

Twenty years of the CUDA platform and the flywheel effect

Where it all began — it all began here. This is the 20th anniversary of CUDA. We've been working on CUDA for 20 years. For 20 years, we've been dedicated to this architecture, this revolutionary invention: SIMT, single instruction, multi-threaded. Writing scalar code that could spawn off into a multi-threaded application — much, much easier to program than SIMD. We recently added tiles so that we could help people program tensor cores, and these structures of mathematics that are so foundational to artificial intelligence today.

Thousands of tools and compilers and frameworks and libraries, in open source. There are a couple of hundred thousand public projects. CUDA literally is integrated into every single ecosystem. This chart basically describes 100% of NVIDIA's strategy. You've been watching me talk about this slide from the very beginning.

And ultimately, the single hardest thing to achieve is the thing on the bottom: installed base. It has taken us 20 years to build up hundreds of millions of GPUs and computing systems around the world that run CUDA. We are in every cloud, we're in every computer company, we serve just about every single industry.

The install base of CUDA is the reason why the flywheel is accelerating. The install base is what attracts developers, who then create new algorithms that achieve a breakthrough — for example, deep learning; there are so many others. Those breakthroughs lead to entirely new markets, which build new ecosystems around them, with other companies that join, which creates a larger install base. This flywheel is now accelerating: the number of downloads of NVIDIA libraries is growing incredibly fast. It's at a very large scale and growing faster than ever.

This flywheel is what makes this computing platform able to sustain so many applications, so many new breakthroughs. But most importantly, it also enables these infrastructures to have an extraordinarily useful life. And the reason for that is very obvious: there are so many applications that you can run on NVIDIA CUDA. We support every single phase of the AI lifecycle. We address every single data processing platform. We accelerate scientific principled solvers of all different kinds. And so the application reach is so great that once you install NVIDIA GPUs, the useful life is incredibly long. It is also one of the reasons why Ampere, which we shipped some six years ago — the pricing of Ampere in the cloud is going up.

All of that is made possible fundamentally because the install base is large, the flywheel is turning, the developer reach is great. And when all of that happens, and we continuously update our software, the computing cost declines. Accelerated computing speeds up applications tremendously; meanwhile, as we continue to nurture and update the software over its life, not only do you get the first-time pop, you get the continuous cost reduction of accelerated computing over time.

And we're willing to nurture, willing to support every single one of these GPUs in the world, because they're all architecturally compatible. We're willing to do so because the install base is so large that if we release a new optimization, it benefits millions. This applies to everybody in the world. This combination of dynamics is what makes the NVIDIA architecture expand its reach, accelerating its growth while at the same time driving down computing cost, which ultimately encourages new growth. So CUDA is at the center of it.


Part 3: From GeForce to RTX to Neural Rendering

From programmable shaders to DLSS 5

But our journey actually started 25 years ago: GeForce. I know how many of you grew up with GeForce. GeForce is NVIDIA's greatest marketing campaign. We attract future customers, starting long before you could afford to pay for it yourself. Your parents paid. Your parents paid for you to be NVIDIA customers, and every single year they paid up, year after year after year, until someday you became an amazing computer scientist and became a proper customer — a proper developer. This is the house that GeForce built.

Twenty-five years ago, we started the journey that led to CUDA. Twenty-five years ago we invented the programmable shader — a perfectly unobvious invention to make an accelerator programmable: the world's first programmable accelerator, the pixel shader. That led us to explore further and further, and five years later came the invention of CUDA. It was one of the biggest investments that we made, and we couldn't afford it at the time. It consumed the vast majority of our company's profits to take CUDA, on the back of GeForce, to every single computer. We dedicated ourselves to creating this platform because we felt so strongly about its potential. Ultimately, through the company's dedication — despite the hardships in the beginning, believing in it every single day, for 13 generations over 20 years — we now have CUDA installed everywhere.

The pixel shader led, of course, to the revolution of GeForce. And then about eight years ago, we introduced RTX, a complete redesign of our architecture for the modern era of computer graphics. GeForce brought CUDA to the world. GeForce, therefore, enabled Alex Krizhevsky and Ilya Sutskever and Geoffrey Hinton, Andrew Ng, and so many others to discover that the GPU could be their friend in accelerating deep learning. It started the big bang of AI.

Ten years ago, we decided that we would fuse into programmable shading two new ideas: ray tracing — hardware ray tracing, which is incredibly hard to do — and a new idea at the time. Imagine: about ten years ago, we thought that AI would revolutionize computer graphics. Just as GeForce brought AI to the world, AI is now going to go back and revolutionize how computer graphics is done altogether.

Well, today I'm going to show you something of the future. This is our next generation of graphics technology. We call it neural rendering: the fusion of 3D graphics and artificial intelligence. This is DLSS 5. Take a look at it. Isn't that incredible? Computer graphics comes to life.

Now, what did we do? We fused controllable 3D graphics — the ground truth of virtual worlds, the structured data. Remember this phrase: the structured data of virtual worlds, of generated worlds. We combine 3D graphics, structured data, with generative AI, probabilistic computing. One of them is completely predictive; the other one is probabilistic, yet highly realistic. We combine these two ideas: control through structured data — control perfectly — and yet generate at the same time. As a result, the content is beautiful and amazing as well as controllable. This concept of fusing structured information and generative AI will repeat itself in one industry after another. Structured data is the foundation of trustworthy AI.


Part 4: 구조화 데이터와 비구조화 데이터 — cuDF & cuVS

데이터 처리 혁명과 클라우드 파트너십

Well, this is gonna scare you a little bit.

I'm going to flip the slide, and don't gasp.

So we're gonna go through the schematic for the rest of the time.

This is my best light.

Every time I asked my, I asked the team, what's my best light?

Repeatedly, this was it.

They say, don't do it, Jensen, don't do it.

I said, no.

These seats are free.

For some of you.

So this is your price of admission.

So this is this is structured data.

You've heard of it, sequel, spark, pandas, Velox, some of these really, really important, very large platforms, snow, snowflake, data bricks, EMR, Amazon, EMR, um, Azure, fabric, Google Cloud, Big Query.

All of these platforms are processing data frames.

These data frames are giant spreadsheets, and they hold all of life's information.

This is the structure data, the ground truth of business.

This is the ground truth of enterprise computing.

Well, now we're going to have AI use structured data.

And we better accelerate the living daylights out of it. It used to be okay, and we would, you know, of course, we would accelerate structure data so that we could do more, we could do it more cheaply.

We could do it more frequently per day and keep the company running at a much more synchronized way.

However, in the future, what's going to happen is these data structures are going to be used by AI.

And AI is gonna be much, much faster than us.

Future agents are going to use structured databases as well.

And then of course, the unstructured database.

The generative database.

This database is represents the vast majority of the world.

vector databases, unstructured data, PDFs, videos, speeches, all of the world's information, about 90% of what's generated every single year, is unstructured data.

Until now, this data has been completely useless to the world.

We read it, we put it into our file system, and that's it.

Unfortunately, we can't query it.

We can't search for it.

It's hard to do that.

And the reason for that is because there's no easy indexing of unstructured data.

You have to understand its meaning, its purpose.

And so now we have AI do that.

Just as AI was able to solve multimodality, Perception.

You can and understanding.

You can use that same technology, multimodality, perception, and understanding, to go read a PDF, to understand its meaning.

And from that meaning, embedded into a larger structure that we can search into.

We can query into.

NVIDA created 2 foundational libraries, just like we created RTX for 3D graphics.

We created QDF for data frames, structure data.

We create a QVS for vector stores, semantic data.

Unstructured data, AI data.

These two platforms are going to be two of the most important platforms in the future.

Super excited to see its adoption throughout the network, this complicated network of the world's data processing systems.

And the reason for that is because data processing has been around a long time.

And therefore, so many different companies and platforms and services, it has taken us a long time to integrate deeply into this ecosystem.

I'm super proud of the work that we're doing here

And then today, we're announcing several of them.

IBM, the inventor of sequel, one of the most important domain specific languages of all of all time, is accelerating Watson X data with QDF.

Let's take a look at it.

60 years ago, IBM introduced the system 360.

The first modern platform for general purpose computing, launching the computing era, then sequel, a declarative language to query data, without requiring the computer to be instructed step by step.

And the data warehouse.

Each the foundations of modern enterprise computing.

Today, IBM and NVIDIA are reinventing data processing for the era of AI, by accelerating IBM Watsonx.datasequel engines with NVIDIA GPU computing libraries.

Data is the ground truth that gives AI context and meaning.

AI needs rapid access to massive data sets.

Today's CPU data processing systems can't keep up.

Nestle makes thousands of supply chain decisions every day.

They're ordered to cash data market, aggregates every supply, order, and delivery event, across global operations in 185 countries.

On CPUs, Nestle refreshed the data mart a few times a day.

With accelerated Watsonx.data, running on NVIDA GPUs, Nestle can run the same workload five times faster at 83% lower cost.

The next computing platform has arrived, accelerated computing, for the era of AI.

NVIDIA accelerates data processing in the cloud.

We also accelerate data processing on-prem.

As you know, Dell is the world's leading computer systems maker, and they are also one of the world's leading storage providers.

And they worked with us to create the Dell AI Data Platform, which integrates cuDF and cuVS into an accelerated data platform for the era of AI.

And this is an example of what they did with NTT Data: a huge speedup.

This is Google Cloud.

As you know, we've been working with Google Cloud for a very long time.

We accelerate Google's Vertex AI.

We now accelerate BigQuery, a really important framework and a really important platform, and this is an example of our work together with Snapchat, where we reduced their cost of computing by nearly 80%.

When you accelerate data processing.

When you accelerate computing.

You get the benefit of speed, you get the benefit of scale.

But most importantly, you also get the benefit of cost.

And so all of those come together as one.


Part 5: The Accelerated Computing Platform — Vertically Integrated, Horizontally Open

NVIDIA's Business Model and Cloud Partnerships

It was originally called Moore's law.

Moore's law was about getting performance doubling every couple of years.

That's another way of saying that,

so long as the price of computers remains about the same, you're getting twice the performance every couple of years, or, equivalently, you're reducing the cost of computing every single year.

Well, Moore's law has run out of steam.

We need a new approach, accelerated computing allows us to take these giant leaps forward.

And as you will see later, because we continue to optimize the algorithms, and NVIDIA is an algorithm company, and because our reach is so large and our install base is so large, we can reduce the computing cost, increasing the scale and increasing the speed for everybody, continuously.
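To put the two curves in perspective, a small back-of-envelope sketch: at the roughly 1.5x-per-generation pace the talk attributes to transistors alone, it takes about nine generations, nearly two decades at two years apiece, to match a single 35x architectural jump of the size claimed later in the keynote. Figures are the keynote's, used illustratively.

```python
# Generations of transistor-driven scaling needed to match one architectural leap.
moore_per_gen = 1.5   # ~1.5x per ~2-year generation (the keynote's transistor figure)
target_leap = 35.0    # the single-generation jump claimed later in the talk

gens = 0
gain = 1.0
while gain < target_leap:
    gain *= moore_per_gen
    gens += 1

print(gens)  # 9 generations, roughly 18 years at two years apiece
```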

This is Google Cloud.

You could see this pattern I just mentioned.

I just wanted to show you 3 versions of it.

NVIDIA built the accelerated computing platform.

It has a bunch of libraries on top.

I gave you 3 examples.

RTX is one of them, cuDF is another, cuVS another, and we'll show you a few more.

These libraries sit on top of our platform.

But ultimately, we integrate into the world's cloud services, into the world's OEMs, and into other platforms that I'll show you, and together, we're able to reach the world.

This pattern, NVIDIA, Google Cloud, Snapchat, will repeat over and over again and it kind of looks like this.

And so this is one example: NVIDIA with Google Cloud.

We accelerate Vertex AI, we accelerate BigQuery.

I'm super proud of the work that we've done with JAX and XLA.

We are incredible on PyTorch.

We're the only accelerator in the world that's incredible on PyTorch and incredible on JAX and XLA.

And the customers that we support.

The Basetens, the CrowdStrikes, Puma, Salesforce.

They're not our customers.

But they're customers, developers of ours, that we've integrated NVIDIA technologies into, that we can then land in the clouds.

Our relationship with cloud service providers is essentially us bringing customers to them.

We integrate our libraries, we accelerate workloads, and we land those customers in the clouds.

And so, as you could see, most of our cloud service providers love working with us.

And they're always asking us to land the next customer on their cloud.

And I just want to let you know, there are a lot of customers.

We're gonna accelerate everybody.

And so there'll be lots and lots of customers we'll be able to land in your cloud.

Just be patient with us.

And so this is Google Cloud.

This is AWS.

We've been working with AWS a long time.

And one of the things I'm super excited about this year is that we're going to bring OpenAI to AWS.

And so it's going to drive enormous consumption of cloud computing at AWS.

It's going to expand the reach, expand the compute of OpenAI.

And as you know, they are completely compute constrained.

And so at AWS, we accelerate EMR, we accelerate SageMaker, we accelerate Bedrock. NVIDIA is integrated really deeply into AWS.

They were our 1st cloud partner.

Microsoft Azure.

The first A100 supercomputer we built was for NVIDIA; the first one we installed was at Azure, and that led to the big, successful partnership with OpenAI.

But we've been working with Azure for quite a long time.

We accelerate Azure Cloud.

Now it's their AI Foundry that we partner deeply with.

We accelerate Bing search, and we work with them on Azure regions.

This is one of the areas that is incredibly important.

As we continue to expand AI throughout the world.

One of the capabilities that we offer is confidential computing. In confidential computing, you want to make sure that even the operator cannot see your data.

Even the operator cannot touch or see your models.

Confidential computing on NVIDIA GPUs:

the first in the world to do that.

It's now able to support confidential computing and the protected deployment of these very valuable OpenAI models and Anthropic models across clouds and different regions, all because of our confidential computing.

Confidential computing super important.

And here's an example, where we have different customers that we work with.

Synopsys, a great partner of ours, who is accelerating all of their EDA and CAE workflows.

And then we land it at Microsoft Azure.

We were Oracle's first AI customer.

Most people would have thought we were their 1st supplier.

We were their 1st supplier also, but we were their 1st AI customer.

I'm quite proud of the fact that I explained AI clouds to Oracle for the first time.

And we were their 1st customer.

Since then, they've really taken off.

We've landed a whole bunch of our partners there.

Cohere, Fireworks, and of course, very famously, OpenAI.

A great partnership with CoreWeave.

They're the world's first AI-native cloud, a company that was built with only one singular purpose: to provision and host GPUs as the era of accelerated computing showed up, and to host AI clouds.

They've got some fantastic customers, and they're growing incredibly.

One of the platforms that I'm quite excited about.

is Palantir and Dell.

The three of our companies have made it possible to stand up a brand new type of AI platform, the Palantir Ontology platform, an AI platform, and we can stand up these platforms in any country, in any air-gapped region, completely on-prem, completely on-site, completely in the field.

AI could be deployed literally everywhere.

Without our confidential computing capability, and without our ability to build the end-to-end system and offer the entire accelerated computing and AI stack, from data processing, whether it's vectors or structured data, all the way to AI, it wouldn't have been possible.

I wanted to show you these examples.

This is our special working relationship with the world's cloud service providers, and many, well, all of them are here.

And I get the benefit of seeing them during the booth tour, and it's just so incredibly exciting.

I just want to thank all of you for the hard work.

What NVIDIA has done is this.

And you're going to see this theme over and over again.

NVIDIA is vertically integrated.

The world's first vertically integrated, but horizontally open, company.

And the reason that's necessary is very simple.

Accelerated computing is not a chip problem.

Accelerated computing is not a systems problem.

Accelerated computing has a missing word.

We just never say it anymore.

Application acceleration.

If I could make a computer run everything faster, that's called a CPU.

But that's run out of steam.

The only way for us to accelerate applications going forward, and continue to bring tremendous speedups and tremendous cost reductions, is through application-specific or domain-specific acceleration.

We dropped that word from the front, and therefore it just became accelerated computing.

And that is the reason why NVIDIA has to be library after library, domain after domain, vertical after vertical.

We are a vertically integrated computing company.

There is no other way.

We have to understand the applications.

We have to understand the domain.

We have to understand fundamentally the algorithms, and we have to figure out how to deploy the algorithm.

In whatever scenario it wants to be deployed: whether it's a data center, the cloud, on-prem, at the edge, or in a robotic system.

All of those computing systems are different, and finally, the systems and chips.

We are vertically integrated.

What makes it incredibly powerful, and the reason why you saw all those slides, is that NVIDIA is horizontally open.

We'll work with you and integrate NVIDIA's technology into whatever platform you would like us to integrate into.

We offer you the software, we offer you libraries.

We integrate with your technology so that we can bring accelerated computing to everybody in the world.

Well, this GTC is really a great demonstration of that.


Part 6: AI Adoption Across Industries & CUDA X Libraries

AI Across the Board: Automotive, Healthcare, Robotics, Quantum Computing, and More

You know, most of the time, most of the time you'll see me talk about these verticals, and I'll use some examples.

But in every single case,

Whether it's automotive, by the way, financial services, the largest percentage of attendees at this GTC is from the financial services industry.

I know.

I'm hoping it's developers, not traders.

Guys.

Here's one thing I wanted to say.

And so the audience represents NVIDIA's ecosystem, upstream of our supply chain and downstream of our supply chain, and we think about our supply chain both upstream and downstream.

And it's just so exciting that our entire upstream supply chain this last year, whether you're a 50-year-old company, a 70-year-old company,

or even a 150-year-old company,

is now part of NVIDIA's supply chain and partnering with us, either upstream or downstream.

And last year, you had your record year.

Did you not?

Congratulations.

We're onto something here.

This is the beginning of something very, very big.

And so, if you look at accelerated computing, we've now set up the computing platform. But in order for us to activate those computing platforms, we need domain-specific libraries that solve very important problems in each one of the verticals that we address. You see us addressing every single one of those: autonomous vehicles, where our reach, our breadth, our impact are incredible.

We have a track on that, financial services I just mentioned.

Algorithmic trading is going from classical machine learning with human feature engineering, which is what the quants did, to supercomputers studying massive amounts of data, discovering insights and discovering patterns by themselves.

And so this is going through its deep learning and its transformer moment.

Healthcare is going through its ChatGPT moment; some really exciting work is happening there.

We have a great keynote track here.

Kimberly Powell is doing a great keynote track for healthcare.

We're talking about AI physics, AI biology for drug discovery, AI agents for customer service

and support,

and AI for diagnosis.

And of course, physical AI and robotic systems. All these different vectors of AI have different platforms that NVIDIA provides. Industrial:

We are completely resetting and starting the largest buildout of human history.

And most of the world's industries, building AI factories, building chip plants, building computer plants are represented here today.

Media and entertainment, gaming, of course: a real-time AI platform, so that we can do translation and broadcast support, and live games and live video; an enormous amount of it will be augmented with AI.

We have a platform called Holoscan.

Quantum.

There are 35 different companies here, building with us the next generation of quantum-GPU hybrid systems. Retail and CPG: using NVIDIA for supply chains, creating agentic shopping systems.

AI agents for customer support.

A lot of work being done here.

A $35 trillion industry. Robotics, a $5 trillion industry, and manufacturing.

NVIDIA has been working in this area for a decade now, building 3 computers, the fundamental computers necessary to build robotic systems.

We are integrated with, and working with, literally every single company that we know of that is building robots.

We have 110 robots here at the show.

And then telecommunications.

About as large as the world's IT industry, about $2 trillion.

We see, of course, base stations everywhere.

It's one of the world's infrastructures.

It was the infrastructure of the last generation of computing.

That infrastructure is going to get completely reinvented.

And the reason for that is very simple.

That base station, which today does one thing, being a base station,

is going to be an AI infrastructure platform in the future.

AI will run at the edge.

And so there's lots of great discussion there. Our platform there is called Aerial, our AI-RAN: a big partnership with Nokia, a big partnership with T-Mobile, and many others.

At the core of our business, beyond everything that I just mentioned, the computing platforms, are, very importantly, the CUDA X libraries: the algorithms, the algorithms that NVIDIA invents. We are an algorithm company.

That's what makes us special.

That's what makes it possible for us to go into every single one of these industries, imagine the future, and have the world's best computer scientists describe and solve problems, refactor them, re-express them,

and turn them into a library.

We have so many.

I think, at this show, we are announcing about 100: 70 libraries, maybe 40 models.

And that's just at the show.

We're updating these all the time.

We're updating them all the time.

The libraries are the crown jewels of our company.

It is what makes it possible for that platform, the computing platform to be activated in service of solving a problem, making impact.

One of the biggest, one of the most important libraries that we ever created: cuDNN, CUDA Deep Neural Networks.

It completely revolutionized artificial intelligence and caused the big bang of modern AI.

Let me show you a short video about CUDA X.

20 years ago, we built CUDA.

A single architecture for accelerated computing.

Today, we've reinvented computing.

A thousand CUDA X libraries help developers make breakthroughs in every field of science and engineering.

cuOpt for decision optimization.

cuLitho for computational lithography.

cuDSS for direct sparse solvers.

cuEquivariance for geometry-aware neural networks.

Aerial for AI-RAN.

Warp for differentiable physics.

Parabricks for genomics.

At their foundation are algorithms, and they are beautiful.

Everything you saw was a simulation.

Some of it was principled solvers.

Fundamental physics solvers.

Some of it was AI surrogates, AI physical models, and some of it was physical AI robotics models.

Everything was simulated.

Nothing was animated, nothing was articulated, everything was completely simulated.

That is fundamentally what NVIDIA does.

It is through the connection of our understanding of the algorithms with our computing platforms that we're able to unlock these opportunities.

NVIDIA is a vertically integrated computing company with open horizontal integration with the world.

So that's CUDA X.


Part 7: AI-Native Companies and the Arrival of the Inference Inflection

ChatGPT, Reasoning AI, Claude Code — A Million-Fold Increase in Computing Demand

Well, just now you saw a whole bunch of companies.

You saw Walmart, and, you know, there's L'Oreal, and incredible companies, established companies, JPMorgan, and Roche. These are companies that define society today. Toyota is here.

These are some of the largest companies in the world.

It is also true.

that there's a whole bunch of companies you've never heard of.

These are companies, we call them AI natives.

A whole bunch of small companies.

The list is gigantic.

This is just a little tiny, tiny bit of it.

And I couldn't decide whether to show you more or show you less.

And so I made it so that you couldn't see any.

And nobody's feelings are hurt.

However, inside this list are a bunch of brand new companies. You might have heard of a couple of them, OpenAI and Anthropic, but there's a whole bunch of others.

There's a whole bunch of others, and they serve different verticals.

Something happened in the last 2 years.

particularly this last year.

We've been working with the AI natives for a long time.

And this last year, it just skyrocketed.

Now I'll explain to you why it happened.

This industry has skyrocketed: $150 billion of venture investment into startups, the largest in human history.

This is also the first time.

that the scale of the investments went from millions of dollars and tens of millions of dollars, to hundreds of millions of dollars, and billions of dollars.

And the reason for that is that this is the first time in history

that every single one of these companies needs compute and lots and lots of it.

They need tokens, lots and lots of it.

They're either going to build and generate tokens themselves, or they're going to integrate,

to add value to tokens that are available, created by Anthropic and OpenAI and others.

And so this industry is different in so many ways, but one thing is very clear: the impact that they're making, the incredible value that they're delivering, is already quite tangible. AI natives.

All because we reinvented computing.

Just like during the PC revolution, a whole bunch of new companies were created; just as during the internet revolution, a whole bunch of companies were created; and in the mobile-cloud era,

a whole bunch of companies were created, each one with its own standards, and we're talking about one of the major platform shifts that has just happened, incredibly important.

And this generation, we also have our own large number of very, very special companies.

We reinvented computing.

It stands to reason, there's going to be a whole new crop of really important companies, consequential companies for the future of the world.

The Googles, the Amazons, the metas, consequential companies that have come as a result of the last computing platform shift.

We are now at the beginning of a new platform shift.

But what happened in the last couple years?

Well, we've been watching, as you know, we've been working on deep learning and working on AI, the big bang of modern AI, we were right there at the spot, and we've been advancing this field for quite some time.

But why the last 2 years?

What happened in the last 2 years?

Well, 3 things.

ChatGPT, of course, started the generative AI era.

It's able to not just perceive and understand.

It's also able to translate and generate, to generate unique content.

I showed you the fusion of generative AI with computer graphics and it brought computer graphics to life.

You guys, everybody in the world should be using ChatGPT.

I use it every single morning; I used it for planning this morning.

And so ChatGPT was the generative AI era.

The second, by the way: generative computing, versus the way we used to do computing. Generative AI is a capability of software, but it has profoundly changed how computing is done.

Computing used to be retrieval based, now it's generative.

Keep that thought in mind when I talk about certain things, and you'll realize why it is that everything that we do is going to change how computers are architected, how computers are provided, how computers are going to be built out, and what is the meaning of computing altogether.

Generative AI.

End of 2022, into 2023.

Next came reasoning AI: o1.

Which then took off with o3.

Reasoning allowed it to reflect, allowed it to think to itself, allowed it to plan, to break down problems, and to decompose a problem it couldn't understand into steps, or parts, that it could understand.

It could ground itself on research. o1 made generative AI trustworthy and grounded on truth.

That caused ChatGPT to simply take off.

And that was a very, very big moment.

Consider the amount of input tokens that was necessary to process, and the amount of output tokens it generated in order to reason.

The model was a little bit larger.

Of course, you could have much larger models; o1 was a little bit larger, not much larger. But its input token usage for context

and its output token usage for thinking

increased the amount of computation tremendously.

Then came Claude Code.

The first agentic model.

It was able to read files, code, compile it, test it, evaluate it, go back and iterate on it.

Claude Code has revolutionized software engineering, as all of you know.

100% of NVIDIA is using a combination of them, or oftentimes all 3 of them: Claude Code, Codex, and Cursor, all over NVIDIA.

There's not one software engineer today who is not assisted by one or many AI agents helping them code.

Claude Code completely represents the new inflection.

And for the first time, you don't ask the AI what, when, how.

You ask it to create, do, build; you ask it to use tools, take your context, read files.

It's able to agentically break down a problem, reason about it, reflect on it.

It's able to solve problems, and actually perform tasks.

And AI, that was able to perceive, became an AI that could generate, became an AI that could reason, an AI that could reason now became an AI that can actually do work.

Very productive work.

As for the amount of computation in the last 2 years, we know, everybody in this room knows: the computing demand for NVIDIA GPUs is off the charts.

Spot pricing is skyrocketing.

You couldn't find a GPU if you tried, and yet in the meantime, we're shipping GPUs out.

Incredible amounts of it, and demand just keeps on going up.

There's a reason for that.

This fundamental inflection.

Finally, AI is able to do productive work, and therefore the inflection point of inference has arrived.

AI now has to think, in order to think, it has to inference.

AI now has to do in order to do, it has to inference.

AI has to read in order to do so.

It has to inference.

It has to reason.

It has to inference every part of AI.

Every time it has to think.

It has to reason, it has to do.

It has to generate tokens.

It has to inference.

AI has inferenced its way past training now.

It's in the field of inference.

So the inference inflection has arrived.

At the time when the amount of tokens, the amount of compute necessary, increased by roughly 10,000 times.

Now, when I combine these two: in the last 2 years, the computing demand of the work has gone up by 10,000 times, and the amount of usage has probably gone up by 100 times.

People have heard me say, I believe that computing demand has increased by one million times in the last two years.

It is the feeling that we all have.

It is the feeling every startup has, it's the feeling that open AI has, is the feeling that anthropic has.

If they could just get more capacity, they could generate more tokens, the revenues would go up, more people could use it, the more advanced, the smarter the AI could become.

We are now at that positive flywheel system.

We have reached that moment.

The inflection, the inference inflection has arrived.


Part 8: A Trillion Dollars of Demand — The Future of AI Infrastructure

A Forecast of at Least $1 Trillion in Computing Demand Through 2027

Last year, at this time, I said that, where I stood at that moment in time, we saw about 500 billion dollars.

We saw $500 billion

of very high-confidence demand and purchase orders

for Blackwell and Rubin through 2026.

I said that last year.

Now, I don't know if you guys feel the same way, but $500 billion is an enormous amount of revenue.

No one was impressed.

I know why you're not impressed, because all of you had record years.

Well, I'm here to tell you, that right now where I stand.

A few short months after GTC DC.

One year after last GTC.

Right here where I stand, I see, through 2027,

at least one trillion dollars.

Now, does it make any sense?

And that's what I'm going to spend the rest of the time talking about.

In fact, we are gonna be short.

I am certain, computing demand will be much higher than that.

And there's a reason for that.

So the 1st thing is, um, we did a lot of work in the last year.

Of course, as you know, 2025 was NVIDIA's year of inference.

We wanted to make sure that not only were we good at training and post-training, but that we were incredibly good at every single phase of AI, so that the investments that were made,

the investments made in our infrastructure, could scale out for as long as customers would like to use it.

And the useful life of NVIDIA's infrastructure would be long, and therefore the cost would be incredibly low.

The longer you could use it, the lower the cost.

There's no question in my mind.

NVIDIA systems are the lowest-cost infrastructure you could get for AI in the world.

And so the 1st part was last year was all about AI for inference.

And it drove this inflection point.

Simultaneously, we were very pleased last year that Anthropic came to NVIDIA, and that MSL, Meta Superintelligence Labs, chose NVIDIA.

And meanwhile, as a collection, as a group, this represents one third of the world's AI compute.

Open source models.

Open source models have reached near the frontier, and they are literally everywhere.

And NVIDIA, as you know, is today the only platform in the world that runs every single domain of AI, across every single one of these AI models: in language and biology, in computer graphics, computer vision, and speech, proteins and chemicals, robotics, and otherwise, edge or cloud, any language.

NVIDIA's architecture is fungible across all of that, and we're incredible at all of that.

That allows us to be the lowest cost, the highest confidence platform.

Because when you're building these systems, as I mentioned, $100 billion is an enormous amount of infrastructure.

You have to have complete confidence that the $100 billion you're putting down will be utilized, will be performant, will be incredibly cost-effective, and will have a useful life for as long as you can see.

That infrastructure investment you could make on NVIDIA, you could make with complete confidence.

We have now proven that.

It is the only infrastructure in the world that you could go anywhere in the world and build with complete confidence.

You want to put it in any of the clouds.

We're delighted by that.

You want to put it on-prem.

We're happy about that.

You want to put it in any country, anywhere, we're delighted to support you.

We are now

a computing platform that runs all of AI.

Now, our business

is already starting to show that.


Part 9: The King of Inference Performance — The Cost-per-Token Revolution

Blackwell's 35x Performance Gain, the Token King, and the AI Factory Concept

60% of our business is hyperscalers, the top 5 hyperscalers.

However, even within those top 5 hyperscalers, some of it is internal AI consumption.

The internal AI consumption is really important work. Recommender systems, for example, are moving from tables, collaborative filtering, and content filtering

towards deep learning and large language models.

Search, moving to deep learning, large language models.

Almost all of these different hyperscale workloads are now shifting towards workloads that NVIDIA GPUs are incredibly good at.

But on top of that, because we work with every AI lab, because we accelerate every AI model, and because we have a large ecosystem of AI natives that we work with and can bring to the clouds,

that investment, no matter how large, no matter how quick, that compute will be consumed.

And that represents 60% of our business.

The other 40% is just everywhere.

Regional clouds, sovereign clouds, enterprise, industrial, robotics, edge, big systems, supercomputing systems, small servers, enterprise servers, the number of systems, incredible.

The diversity of AI is also its resilience.

The span of reach of AI is its resilience.

There is no question, this is not a one app technology.

This is now fundamental.

This is absolutely a new computing platform shift.

Well, our job is to continue to advance the technology, and one of the most important things I mentioned was that last year was our year of inference.

We dedicated everything.

We took a giant chance and reinvented: while Hopper was at its prime, and it was just cooking, we decided that the Hopper architecture, the NVLink 8, had to be taken to the next level.

We completely rearchitected the system, disaggregated the computing system altogether, and created NVLink 72.

The way that it's built, the way it's manufactured, the way it's programmed, completely changed.

Grace Blackwell, NVLink 72, was a giant bet.

And it wasn't easy for anybody.

And many of my partners here in the room.

I want to thank all of you for the hard work that you guys did.

Thank you.

NVLink 72.

NVFP4, not just FP4 reduced precision; NVFP4 is a whole different type of tensor core and computational unit.

We've now demonstrated that we can inference in NVFP4 without loss of precision, but with a gigantic boost in performance and energy efficiency.

We've also been able to use NVFP4 for training.
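For a feel of what 4-bit floating point implies: an FP4 (E2M1) value has only eight representable magnitudes, {0, 0.5, 1, 1.5, 2, 3, 4, 6}, plus a sign bit, so low-precision schemes pair it with a per-block scale factor. The toy round-trip below is only a sketch of that idea; NVFP4's actual block format and scale encoding are more involved than this.

```python
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 positive values

def quantize_block(xs):
    """Scale the block so its largest magnitude maps to 6.0, then snap to FP4."""
    scale = max(abs(x) for x in xs) / 6.0 or 1.0  # avoid a zero scale on all-zero blocks
    q = []
    for x in xs:
        mag = min(FP4_MAGNITUDES, key=lambda m: abs(abs(x) / scale - m))
        q.append(mag if x >= 0 else -mag)
    return q, scale

def dequantize_block(q, scale):
    return [v * scale for v in q]

block = [0.02, -0.31, 0.18, 0.6]   # made-up tensor values
q, s = quantize_block(block)
recovered = dequantize_block(q, s)
print(q)  # [0.0, -3.0, 2.0, 6.0]
```

Each stored value is now 4 bits plus a shared scale per block, which is where the performance and energy headroom over FP16/FP8 comes from; the hard part, and the point of a dedicated format and tensor core, is choosing scales so `recovered` stays close enough to `block` that model accuracy survives.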

So: NVLink 72, NVFP4, the invention of Dynamo, TensorRT-LLM, a whole bunch of new algorithms.

We even built a supercomputer to help us optimize kernels and help us optimize our complete stack.

We call it DGX Cloud.

We invested billions of dollars of supercomputing capability to help us create the kernels, the software, that made inference possible.

Well, the results all came together.

And people used to tell me, but Jensen, inference is so easy.

Inference is the ultimate hard problem.

It is also ultimately important, because it drives your revenues.

And so this is the outcome.

This is from SemiAnalysis.

This is the largest, most comprehensive sweep

of AI inference that has ever been done.

And what you see here, on the left,

on this side, is tokens per watt.

Tokens per watt is important because every data center, every single factory, by definition, is power constrained.

A one gigawatt factory will never become two.

It's physically constrained.

The laws of atoms, the laws of physicality.

And so, that one gigawatt of data center, you want to drive the maximum number of tokens, which is the production, the product of that factory.

So you want to be on top of that curve, as high as you can.

This, the X axis, is the interactivity, the speed of each inference.

The faster you can inference, the faster you could, of course, respond, but very importantly, the faster you can inference, the larger the models, the more context you could process, the more tokens you can think through.

This axis is the same as smartness of the AI.

And so this is the throughput of the AI.

This is the smartness of the AI.

Notice, the smarter the AI, the lower your throughput.

Makes sense.

You're thinking longer.

Okay?

And so this axis is the speed, and I'm going to come back to this, this is important.

This is where I torture all of you.

But it's too important.

Every CEO in the world, you watch every CEO in the world, will study their business from now on in the way I'm about to describe.

Because this is your token factory.

This is your AI factory.

This is your revenues.

There's no question about that going forward.

And so this is the throughput.

This is the intelligence.

The better the tokens per watt, for a given data center power, the more throughput, the more tokens you can produce.
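The power-constraint arithmetic above can be written in a few lines. This is a minimal sketch; the efficiency figure is a hypothetical placeholder, not a published NVIDIA number.

```python
# Sketch of the tokens-per-watt argument: a factory's output is capped by
# power, so throughput = power x efficiency. The 2.0 tokens/sec-per-watt
# figure is a hypothetical placeholder.

def factory_throughput(power_watts: float, tokens_per_sec_per_watt: float) -> float:
    """Tokens per second a power-limited factory can produce."""
    return power_watts * tokens_per_sec_per_watt

one_gigawatt = 1e9  # a 1 GW factory never becomes 2 GW; power is fixed
print(factory_throughput(one_gigawatt, 2.0))  # 2e9 tokens/sec at this efficiency
```

With power fixed, doubling tokens per watt is the only way to double the factory's output.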

On this side is cost.

Notice, NVIDIA is the highest performance in the world.

Nobody would be surprised by that.

They would be surprised by the fact that, in one generation, whereas Moore's Law, through transistors, would have given us maybe 50%, maybe 2 times, more likely 1.5 times more performance.

You would have expected the generation after Hopper H200 to be 1.5 times higher.

Nobody would have expected 35 times higher.

I said last year, at this time, that NVIDIA's Grace Blackwell NVLink 72 was 35 times the performance per watt.

Nobody believed me.

And then SemiAnalysis came out.

And Dylan Patel had a quote. He accused me of sandbagging.

He accused me of sandbagging: "Jensen sandbagged. It's actually 50 times."

And he's not wrong.

He's not wrong.

And so our cost per token is the lowest in the world; you can't beat it.

I've said before, if you have the wrong architecture, even if it's free, it's not cheap enough.

And the reason for that is because, no matter what happens, you still have to build a gigawatt data center.

You still have to build a gigawatt factory, and that gigawatt factory gets amortized over 15 years.

That gigawatt factory is about $40 billion.

Even before you put anything in it, it's $4 billion in.

You better make for darn sure you put the best computer system on that thing so that you could have the best token cost.
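The "even free isn't cheap enough" argument reduces to a fixed-cost calculation: the facility capex is sunk, so cost per token is set by how many tokens the architecture can produce over the amortization period. The token rates below are made-up placeholders; the $40 billion and 15 years come from the talk.

```python
# Hedged sketch: cost per token when the facility capex is fixed.
# Token rates are illustrative assumptions.

def cost_per_million_tokens(capex_usd: float, years: float,
                            tokens_per_second: float) -> float:
    seconds = years * 365 * 24 * 3600
    return capex_usd / (tokens_per_second * seconds) * 1e6

slow = cost_per_million_tokens(40e9, 15, 1e8)  # weaker architecture
fast = cost_per_million_tokens(40e9, 15, 5e8)  # 5x the token output
print(round(slow / fast, 1))  # 5.0: token cost falls in proportion to output
```

Because the denominator is all that changes, a 5x-faster system on the same $40 billion shell yields exactly 5x-lower cost per token.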

NVIDIA's token cost is world-class.

Basically untouchable at the moment.

And the reason that's true is because of extreme codesign.

And so I'm very happy that he named us, like the Monkey King, the Token King.

Well, we take all of our software.

As I told you, we vertically integrate, but we open horizontally. We are vertical integration, horizontal openness.

We integrate all of our software and all of our technology, however we can package it up, into the world's inference service providers.

And these companies are growing so fast.

They're growing so fast.

Fireworks (Lin is here) and Together are just growing so incredibly fast.

100 times in the last year.

They are token factories.

And the effectiveness, the performance, and the token cost production capability for their factories is everything to them.

And this is what happened.

We updated their software, same system.

And notice, their token speeds: incredible.

The difference: before NVIDIA updated everything, all of our algorithms and software and all the technology that we bring to bear, about 700 tokens per second on average; after, nearly 5,000, 7 times higher.

And so this is the incredible power of extreme codesign.

I mentioned earlier the importance of factories.

This is the importance of the factory.

Your data center, it used to be a data center for files.

It's now a factory to generate tokens.

Your factory is limited no matter what.

Everybody is looking for land, power, and shell.

Once you build it, you are power limited.

Within that power-limited infrastructure, you'd better make darn sure that your inference, because inference is your workload, tokens are your new commodity, and compute is your revenue, runs on an architecture that is as optimized as it can be.

In the future, every single CSP, every single computer company, every single cloud company, every single AI company, every single company, period, is going to be thinking about their token factory effectiveness.

This is your factory in the future.


Part 10: Vera Rubin 아키텍처 — 에이전틱 AI를 위한 설계

DGX-1에서 Vera Rubin까지, NVLink 72, 액체 냉각, Rubin Ultra

And the reason why I know that is because everybody in this room is powered by intelligence.

And in the future, that intelligence will be augmented by tokens.

So let me show you how we got here.

On April 6th, 2016, a decade ago, we introduced DGX-1, the world's first computer designed for deep learning.

Eight Pascal GPUs connected with first-generation NVLink, 170 teraflops in one computer, the world's first computer designed for AI researchers.

With Volta, we introduced NVLink Switch: 16 GPUs connected with full all-to-all bandwidth, operating as one giant GPU. A giant step forward, but model sizes continued to grow.

The data center needed to become a single unit of computing.

So Mellanox joined NVIDIA.

In 2020, the DGX A100 SuperPOD became the first GPU supercomputer combining scale-up and scale-out architecture: NVLink 3 for scale-up, ConnectX-6 and Quantum InfiniBand for scale-out.

Then Hopper, the first GPU with the FP8 Transformer Engine, launched the generative AI era: NVLink 4, ConnectX-7, BlueField-3 DPUs, second-generation Quantum InfiniBand.

It revolutionized computing.

Blackwell redefined AI supercomputing system architecture with NVLink 72: 72 GPUs connected by NVLink 5, 130 terabytes per second of all-to-all bandwidth.

Compute trays integrate Blackwell GPUs, Grace CPUs, ConnectX-8, and BlueField-3.

Scale-out runs over Spectrum-4 Ethernet.

With three scaling laws at full steam (pre-training, post-training, and inference), and now agentic systems, compute demand continues to grow exponentially.

And now, Vera Rubin.

Architected for every phase of agentic AI, advancing every pillar of computing, including CPU, storage, networking, and security.

Vera Rubin NVLink 72.

3.6 exaflops of compute, 260 terabytes per second of all-to-all NVLink bandwidth.

The engine supercharging the era of agentic AI: the Vera CPU, designed for orchestration and agentic workloads.

The STX rack: AI-native storage, built with BlueField-4.

Scale-out with Spectrum-X co-packaged optics, increasing energy efficiency and resiliency. And an incredible new addition: the Groq 3 LPX rack.

Tightly connected to Vera Rubin, Groq's LPU, with its massive on-chip SRAM, is a token accelerator for the already incredibly fast Vera Rubin.

Together, 35 times more throughput per megawatt.

The new Vera Rubin platform: seven chips, five rack-scale computers, one revolutionary AI supercomputer for agentic AI. 40 million times more compute in just 10 years.

Now, in the good old days, when I would say, Hopper, I would hold up a chip.

That's just adorable.

This is Vera Rubin.

When we think Vera Rubin, we think of the entire system: vertically integrated, complete with software, extended end to end, optimized as one giant system.

The reason it's designed for agentic systems is very clear: for agents, the most important workload is thinking, the large language model. The large language models are going to get larger and larger; they're going to generate more and more tokens more quickly, so they can think more quickly. But they also have to access memory.

It's going to pound on memory really hard.

KV cache; structured data with cuDF; unstructured data with cuVS.

It's going to be pounding on the storage system, really, really hard, which is the reason why we reinvented the storage system.

It is also going to use tools.

And unlike humans that are more tolerant to slower computers, AI wants the tools to be as fast as possible.

These tools are web browsers; in the future, they could also be virtual PCs in the cloud.

Those PCs, those computers, have to be as fast as possible.

We created a brand new CPU.

A brand new CPU that's designed for extremely high single threaded performance, incredibly high data output, incredibly good at data processing, and extreme energy efficiency.

It is the only data center CPU in the world that uses LPDDR5, with incredible single-thread performance and performance per watt that is unrivaled.

And so we built it to go along with the rest of these racks for agentic processing.

And so here it is.

This is the Grace Blackwell... oh, no, Vera Rubin. Where is it?

Here it is.

Okay?

So this is the Vera Rubin system.

Notice, since last time: 100% liquid cooled, and all of the cables are gone.

What used to take two days to install now takes two hours.

Incredible.

And so the manufacturing cycle time is going to dramatically reduce.

This is also a supercomputer that is cooled by hot water, 45 degrees, which takes the pressure off the data center; it takes all of the cost and energy that used to go into cooling the data center and makes it available for the system.

This is the secret sauce.

We're the only company in the world today that has built a sixth-generation scale-up switching system.

This is not Ethernet.

This is not InfiniBand.

This is NVLink.

This is the sixth-generation NVLink.

This is insanely hard to do well.

It is insanely hard to do, period, and I'm just super proud of the team.

NVLink, completely cool.

This is the brand new Groq system.

And I'll show you a little bit more about it.

This system: eight Groq chips. This is the LP 30.

The world's never seen it.

Anything the world has ever seen is V1. This is third generation.

And we're in volume production now.

And I'll show you more about that in just a second.

The world's first CPO Spectrum-X switch.

This is also in full production: co-packaged optics.

Optics comes directly onto this chip.

Interfaces directly to silicon.

Electrons get translated to photons, directly connected to this chip.

We invented the process technology with TSMC; we're the only one in production with it today.

It's called COUPE.

It's completely revolutionary.

NVIDIA is in full production with Spectrum-X.

This is the Vera system.

Twice the performance per watt of any CPU in the world today.

It is also in production.

Well, you know, we never thought we would be selling CPUs standalone.

We are selling a lot of CPUs standalone.

This is already for sure going to be a multi-billion dollar business for us.

So I'm very, very pleased with our CPU architects.

We designed a revolutionary CPU.

And this is the CX9, powered with the Vera CPU, and the BlueField-4 STX, our new storage platform.

Okay?

So these are the four racks.

And they're connected.

Each one of these racks, the NVLink rack, I've shown you guys this before.

It's just super heavy.

It seems to get heavier every year.

Because I think there's just more cables in there every year.

And so this is the NVLink rack.

We've also taken this technology, because it is so efficient, to create a data center with these cabling systems, structured cables.

So we decided to do that for Ethernet.

So this is Ethernet: 256 liquid-cooled nodes in one rack, also connected with these incredible connectors.

You guys want to see Rubin Ultra?

So this is the Rubin Ultra compute node.

Unlike Rubin, which slides in horizontally, Rubin Ultra goes into a whole new rack, called Kyber, that enables us to connect 144 GPUs in one NVLink domain.

And so the Kyber rack... I could lift it, I'm sure, but I won't.

It's quite heavy.

This is one compute node, and it slides into the Kyber rack vertically.

This is where it connects into.

This is the midplane.

In the Kyber rack, those four top NVLink connectors slide in and connect into this, and this becomes one of the nodes.

And each one of these slots is a different compute node.

And this is the amazing part.

This is the midplane.

And the back of the midplane: instead of the cabling system, which has its limits in terms of how far we can drive copper cables, we now have this system to connect 144 GPUs.

This is the new NVLink.

It also sits vertically.

And it connects into the midplane on the back.

Compute in the front, NVLink switches in the back: one giant computer.

Okay?

So that is Rubin Ultra.

As I mentioned... how about we take this back down?

I need the rest of my slides.

Oh, it's coming down?

Okay.

Thank you, Janine.

This is what happens when you don't practice.

Okay, all right. Take your time; just don't get hurt.

You saw this slide.


Part 11: 토큰 팩토리 경제학 — 수익의 새로운 공식

토큰 계층화, Groq 통합, 분리형 추론, 350배 성능 향상

You know, only in an NVIDIA keynote will you see last year's slide presented again.

And the reason for that is, I just want to let you know that last year, I told you something very, very important.

And it's so important that it's worthwhile to tell you again.

This is probably the single most important chart for the future of AI factories, and every CEO, every CEO in the world will be tracking it, will be studying it very deeply.

It's much, much more complicated than this; it's multidimensional. But you will be studying the throughput and the token speed of your AI factories, at ISO power, because that's all the power you have: throughput and token speed, for your factories, forever.

And that analysis is going to lead directly to your revenues.

What you do this year will show up precisely next year as your revenues.

And this chart is what it's all about.

And as I said (thank you, guys): on the vertical axis is throughput; on the horizontal axis is token rate.

Today, I'm going to show you this.

Because we're now able to increase the token speed, and because model sizes are increasing, the context length, depending on the grade of the application use case, continues to grow, to maybe 100,000 tokens of input length and beyond; the input token length is growing, and the output token length is growing as well.

And so, all of these play into, ultimately, the marketing and the pricing of future tokens.

Tokens are the new commodity.

And like all commodities, once it reaches an inflection, once it matures, it will segment into different tiers.

High throughput, low speed could be used for the free tier.

The next tier could be the medium tier.

Larger model, maybe; higher speed, for sure; larger input context length: that translates to a different price point.

You could see from all the different services.

This one is free, it's a free tier.

The first tier could be $3 per million tokens.

The next tier could be $6 per million tokens.

You want to keep pushing this boundary, because the larger the model, the smarter it is; the more input token context length, the more relevant it is; and the higher the speed, the more you can think and iterate: smarter AI models.

So this is about smarter AI models.

And when you have smarter AI models, each one of these clicks allows you to increase the price.

So this is $45, and maybe one day there'll be a premium model, a premium service that allows you to generate token speeds that are incredibly high, because you're in a critical path, or maybe you're doing really long research, and $150 per million tokens is just not a thing.

So let's translate that.

Suppose you were to use 50 million tokens per day as a researcher at $150 per million tokens.

As it turns out, for a research team, that's not even a thing.
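The premium-tier numbers work out to a concrete daily spend; here is that arithmetic, using exactly the figures quoted above (50 million tokens per day at $150 per million).

```python
# Worked version of the premium-tier pricing example from the talk.

def daily_spend_usd(tokens_per_day: int, usd_per_million_tokens: float) -> float:
    return tokens_per_day / 1_000_000 * usd_per_million_tokens

print(daily_spend_usd(50_000_000, 150.0))  # 7500.0 dollars per day
```

$7,500 per day per heavy-usage researcher is the scale of spend the premium tier implies.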

So we believe that this is the future.

This is where AI wants to go.

This is where it is today.

It had to start here to establish the value and establish its usefulness and get better and better and better.

In the future, you're going to see most services encompass all of that.

This is Hopper.

Hopper started, and I moved the chart.

This is 50.

This is 100.

Hopper looks like this.

And you would have expected the next generation after Hopper to be higher, but nobody would have expected it to be that much higher.

This is Grace Blackwell.

What Grace Blackwell did is, at your free tier, increase your throughput tremendously.

However, where you mostly monetize your service, it increased your throughput by 35 times.

This is no different than any product that every company makes.

The higher the tier, the higher the quality, the higher the performance, the lower the volume, the lower the capacity.

And so it is no different than any other business in the world.

And so now we're able to increase this tier by 35X.

And we introduced a whole new tier.

This is the benefit of Grace Blackwell.

A huge jump over Hopper.

Well, this is what we're doing with it.

Okay, so this is Grace Blackwell.

Okay, let me just reset, reset this.

And this is Vera Rubin.

Okay?

Now just think, just think what just happened.

At every single tier, we increased the throughput, and at the tier where your highest ASP and your most valuable segment is, we increased it by 10X.

That is the hard work.

This is incredibly hard to do out here.

This is the benefit of NVLink 72.

This is the benefit of extremely low latency.

This is the benefit of extreme codesign that we can shift the entire area up.

Now, what does it mean from a customer perspective in the end?

Suppose I were to take all of that and just, you know, work it through: suppose I took 25% of my power in the free tier, 25% in the medium tier, 25% in the high tier, and 25% in the premium tier.

My data center only has a gigawatt.

And so I get to decide how I want to distribute.

The free tier allows me to attract more customers.

This allows me to serve my most valuable customers.

And the combination, the product of all of that, is basically your revenues.

Assuming this simplistic example, Blackwell generates five times more revenues, and Vera Rubin five times more again. Yeah.

So Vera Rubin, you should get there as soon as you can.

And the reason for that is because your cost of tokens goes down and your throughput goes up.
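The 25/25/25/25 power split can be written as a toy revenue model: revenue is the sum over tiers of (power share × tokens per second per watt at that tier × price). The tier throughputs and prices below are illustrative placeholders, not actual service numbers; only the even power split and the tier prices echo the talk.

```python
# Toy model of the power-split revenue argument. Tier throughput and
# price figures are hypothetical placeholders.

def revenue_per_second(power_watts: float,
                       tiers: list[tuple[float, float, float]]) -> float:
    """tiers: (power share, tokens/sec per watt, USD per million tokens)."""
    return sum(share * power_watts * tpw * price / 1e6
               for share, tpw, price in tiers)

tiers = [
    (0.25, 4.0, 0.0),   # free tier: highest throughput, no revenue
    (0.25, 2.0, 3.0),   # medium tier
    (0.25, 1.0, 6.0),   # high tier
    (0.25, 0.5, 45.0),  # premium tier: lowest throughput, highest price
]
print(revenue_per_second(1e9, tiers))  # 8625.0 USD/sec under these assumptions
```

Shifting the whole curve up, as the newer architecture does at every tier, multiplies each term in this sum, which is why the revenue ratio tracks the throughput gain.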

Now, but we want even more.

We want even more.

So let me take you back to this.

As I told you, this throughput requires a ton of flops; this latency, this interactivity, requires an enormous amount of bandwidth.

Computers don't like extreme amounts of flops plus extreme amounts of bandwidth, because there's only so much chip surface area in any system.

And so optimizing for high throughput and optimizing for low latency are, in fact, enemies of each other.

And so, this is what happened when we combined with Groq.

Okay?

And so we acquired the team that worked on the Groq chips and licensed the technology, and we've been working together now to integrate the system.

This is what that looks like.

So at the most valuable tier, we're now going to increase performance by 35x.

Now, this very simple chart reveals exactly why NVIDIA is so strong in the vast majority of workloads so far.

And the reason is that up in this area, throughput matters so much, and NVLink 72 is so game-changing, exactly the right architecture, that it's hard to beat, even as you add Groq to it.

However, if you extend this chart way out here, and you say you want services that deliver not 400 tokens per second but 1,000 tokens per second, all of a sudden NVLink 72 runs out of steam and simply can't get there.

We just don't have enough bandwidth.

And so this is where Groq comes in.

And this is what happens when we push that out.

So, it goes out beyond, thank you.

It goes out beyond even the limits of what NVLink 72 can do.

And if you were to do that, translate that into revenues, relative to Blackwell, Vera Rubin is 5X.

If most of your workload is high throughput, I would stick with 100% Vera Rubin. If a lot of your workload is coding and very high-value engineering token generation, I would add Groq to it.

I would add Groq to maybe 25% of my total data center.

The rest of my data center is 100% Vera Rubin.

And so that gives you a sense of how you would add Groq to Vera Rubin and extend its performance and its value even more.

This is what happens.

This is a contrast.

The reason Groq was so attractive to me is because their computing system is a deterministic dataflow processor: it is statically compiled, it is compiler-scheduled, meaning the compiler figures out when to do the compute, so the compute and the data arrive at the same time.

All of that is done statically, in advance, and scheduled completely in software.

There's no dynamic scheduling.

The architecture is designed with massive amounts of SRAM.

It is designed just for inference, this one workload.

Now this one workload, as it turns out, is the workload of AI factories.

And as the world continues to increase the amount of high-speed, super-smart tokens it wants to generate, the value of this integration is going to get even higher.

And so these are two extreme processors you could see.

One Groq chip: 500 megabytes.

One Rubin chip: 288 gigabytes.

It would take a lot of Groq chips to hold the parameters of the model, as well as all of the context, the KV cache, that has to go along with it.

So that limited Groq's ability to really reach the mainstream, to really take off, until we had a great idea.

What if we disaggregated inference altogether with a piece of software called Dynamo?

What if we re architected the way that inference is done in the pipeline?

so that we could put the work that makes perfect sense on Vera Rubin, and then offload the decode, the token generation, the low-latency, bandwidth-limited part of the workload, to Groq.

And so we united, unified, two processors of extreme differences.

One for high throughput, one for low latency.

It still doesn't change the fact that we need a lot of memory.

And so we're just going to add a whole bunch of Groq chips, which expands the amount of memory it has.

And so, if you can just imagine: a trillion-parameter model, we have to store all of that in Groq chips. However, it sits next to NVIDIA Vera Rubin, where we can hold the massive amounts of KV cache that are necessary for processing all of these agentic AI systems.

It's based upon this idea of disaggregated inference.

We do the prefill.

That's the easy part, but we also tightly integrate the decode.

So the attention part of decode, which needs a lot of math, is done on NVIDIA Vera Rubin, and the feed-forward network part of decode, the token generation part, is done on the Groq chip.

The two of them work tightly coupled together over, today, Ethernet, with a special mode to reduce its latency by about half.

And so that capability allows us to integrate these two systems. We run Dynamo, this incredible operating system for AI factories, on top of it, and you get a 35-times increase.

A 35-times increase, not to mention additional new tiers of inference performance for token generation the world has never seen.
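The prefill/decode split described above can be sketched as a simple placement table. The stage names and device labels here are illustrative only; this is not the actual Dynamo API, just the routing idea under the assumptions stated in the talk.

```python
# Toy sketch of disaggregated inference: prefill and the math-heavy
# attention half of decode stay on the high-throughput GPU, while the
# feed-forward token-generation half goes to the SRAM-heavy, low-latency
# LPU. Names are illustrative, not a real API.

PLACEMENT = {
    "prefill": "rubin",           # throughput-bound: needs flops
    "decode_attention": "rubin",  # KV-cache heavy: needs memory capacity
    "decode_ffn": "groq",         # latency-bound: benefits from on-chip SRAM
}

def route(stage: str) -> str:
    """Return the device class a pipeline stage should run on."""
    return PLACEMENT[stage]

print([route(s) for s in ("prefill", "decode_attention", "decode_ffn")])
# ['rubin', 'rubin', 'groq']
```

The design choice is that each processor only sees the part of the workload it is best at, so neither has to compromise between flops and latency.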

So this is it; this is Groq.

The Vera Rubin systems, including Groq.

I want to thank Samsung, who manufactures the Groq LP 30 chip for us; they're cranking as hard as they can.

I really appreciate you guys.

We're in production with the Groq chip, and we'll ship it in the second half, probably in the Q3 timeframe.

Okay?

Groq LPX.


Part 12: 로드맵 — Feynman, Rosa, 그리고 그 너머

Vera Rubin → Rubin Ultra → Feynman, AI 팩토리 플랫폼 DSX, 우주

Vera Rubin?

You know, it's kind of hard to imagine any more customers.

And the really great thing is, early sampling of Grace Blackwell was really complicated because of NVLink 72 coming together, but the sampling of Vera Rubin is just going incredibly well.

And in fact, Satya, I think, has already posted that the first Vera Rubin rack is up and running at Microsoft Azure.

And so I'm super excited for them.

We're going to just keep cranking these things out.

We have now set up a supply chain that can manufacture thousands of these systems a week, essentially multiple gigawatts of AI factories per month inside our supply chain.

And so we're going to crank out these Vera Rubin racks while we're cranking out the GB300 racks.

We are in full production.

The Vera CPUs.

Incredibly successful.

And the reason for that is because AI needs CPUs for tool use, and the Vera CPU was designed perfectly for that sweet spot.

Incredible for the next generation of data processing.

Vera CPU is ideal.

The Vera CPU plus CX9, connected into the BlueField-4 STX stack.

100% of the world's storage industry is joining us on this system.

And the reason for that is because they see exactly the same thing.

The storage system is going to get pounded.

It's going to get pounded because we used to have humans using the storage systems.

We used to just have humans using SQL.

Now we're going to have AIs using these storage systems, and they're going to store cuDF-accelerated storage, cuVS-accelerated storage, as well as, very importantly, KV caching.

Okay?

So this is the Vera Rubin system.

Now, what's amazing is this.

In just two years' time, in a one-gigawatt factory.

Using the mathematics that I showed you earlier: whereas Moore's Law would have given us a couple of steps, some X-factor on the number of transistors, on the number of flops, on the amount of bandwidth, with this architecture we're going to take our token generation rate from 2 million to 700 million, a 350-times increase.

This is the power of extreme codesign.
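The 350x claim is a direct ratio of the two quoted rates, which is easy to verify:

```python
# Checking the stated jump in token generation rate for a one-gigawatt
# factory: 2 million to 700 million tokens.

old_rate = 2_000_000
new_rate = 700_000_000
print(new_rate // old_rate)  # 350, the stated 350x increase
```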

This is what I mean when we integrate and optimize vertically, but then we open it horizontally for everybody to enjoy.

This is our roadmap very quickly.

Blackwell is here.

The Oberon system.

In the case of Rubin, we have the Oberon system; we're always backwards compatible, so that if you want to not change anything and just keep moving with the new architecture, you can do so.

The old system, the standard rack system, Oberon, is still available.

Oberon is copper scale-up, and with Oberon we can also use optical scale-up, excuse me, to expand to NVLink 576.

Okay?

And so there's a lot of conversation about whether NVIDIA is going to do copper scale-up or optical scale-up.

We're gonna do both.

So we're going to have NVLink 144 with Kyber.

And then with Oberon, we're going to NVLink 72 plus optical to get to NVLink 576.

The next generation of Rubin, Rubin Ultra: we have the Rubin Ultra chip, which is taping out, and we have a brand new chip, the LP 35.

The LP 35 will, for the first time, incorporate NVIDIA's NVFP4 computing structure.

That gives you another X-factor of speedup.

Okay?

And so this is Oberon, NVLink 72, optical scale-up.

And it uses Spectrum 6, the world's first co-packaged optics, and all of this is in production.

The next generation from here is Feynman.

Feynman has a new GPU, of course.

It also has a new LPU.

LP 40.

Big step up, incredible, incredible new technology.

Now, uniting the scale of NVIDIA and the Groq team, building the LP 40 together.

It's going to be incredible.

A brand new CPU called Rosa.

Short for Rosalind.

BlueField-5, which connects the next CPU with the next SuperNIC, CX10.

We will have Kyber, which is copper scale-up; we will also have Kyber CPO scale-up.

So for the first time, we will scale up with both copper and co-packaged optics. Okay?

And so a lot of people have been asking, you know, Jensen, is copper still going to be important? The answer is yes.

Jensen, are you going to scale up optical? Yes.

Are you going to scale out optical?

Yes.

And so, for everybody who's in our ecosystem:

We need a lot more capacity.

And that's really the key.

We need a lot more capacity for copper.

We need a lot more capacity for optics.

We need a lot more capacity for CPO.

And that's the reason why we've been working with all of you to lay the foundation for this level of growth.

And so Feynman will have all of that.

Let me see if I missed anything.

That's it.

Every single year, brand new architecture.

Very quickly.

NVIDIA went from a chip company to an AI factory company, an AI infrastructure company, an AI computing company, building these systems.

And now we're building entire AI factories.

There's so much power that is squandered in these AI factories.

We want to make sure that these AI factories come together, designed in the best possible way.

Most of these components never meet each other.

Most of us technology vendors all know each other now, but in the past, we never met each other until the data center.

That can't happen.

We're building super complex systems.

And so we have to meet each other virtually somewhere else.

And so we created Omniverse.

And the Omniverse DSX world: a platform where all of us can meet and design these gigawatt AI factories virtually, in system.

We have simulation systems for the racks, for mechanical, thermal, electrical, and networking, integrated into the incredible tools of all of our ecosystem partners.

We also operate connected to the grid, so that we can interact with each other and send each other information, adjusting grid power and data center power accordingly and saving energy.

And then inside the data center, we use Max Q to adjust the system dynamically across power, cooling, and all of the different technologies we work on together, so that we leave no power squandered and run at the most optimal rate to deliver an enormous amount of token throughput.

There's no question in my mind there's a factor of 2 in here.

And a factor of 2 at the scale we're talking about is gigantic.

We call this the NVIDIA DSX platform.

And just as with all of our platforms, there's the hardware layer, there's the library layer, and there's the ecosystem layer. It is exactly the same way.

Let's show it to you.

The greatest infrastructure buildout in history is underway.

The world is racing to build chips, systems, and AI factories, and every month of delay costs billions in lost revenues.

AI factory revenues are equal to tokens per watt.

So with power constrained, every unused watt is revenue lost.

NVIDIA DSX is an Omniverse digital twin blueprint for designing and operating AI factories for maximum token throughput, resilience, and energy efficiency.

Developers connect through several APIs.

DSXM for physical, electrical, thermal, and network simulation; DSX Exchange for AI factory operational data; DSX Flex for secure, dynamic power management with the grid; and DSX Max Q to dynamically maximize token throughput.

It starts with SimReady assets from NVIDIA and equipment manufacturers, managed by PTC Windchill PLM.

Then, model-based systems engineering is done in Dassault Systèmes' 3DEXPERIENCE.

Jacobs brings the data into their custom Omniverse app to finalize the design.

It's tested with leading simulation tools: Siemens Star-CCM+ for external thermals, Cadence Reality for internal, ETAP for electrical, and NVIDIA's network simulator, DSX Air; and it's virtually commissioned through Procore to ensure accelerated construction time.

When the site goes live, the digital twin becomes the operator.

AI agents work with DSX Max Q to dynamically orchestrate infrastructure.

Phaidra's agent oversees cooling and electrical systems, sending signals to Max Q, which continuously optimizes compute throughput and energy efficiency.

Emerald AI agents interpret live grid demand and stress signals, and adjust power dynamically.

With DSX, NVIDIA and our ecosystem of partners are racing to build AI infrastructure around the world, ensuring extreme resiliency, efficiency, and throughput.

It's incredible, right?

Well,

Omniverse was designed to hold the world's digital twin, starting from the Earth, and it's going to hold digital twins of all sizes.

And so we have just such a great ecosystem of partners.

I wanna thank all of you.

All of these companies are brand new to our world.

We didn't know many of you, just a couple years ago.

And now we're working so closely together to build the largest computer the world has ever seen, and to do it at planetary scale.

So, NVIDIA DSX is our new AI factory platform.

I'll spend very little time on this this time.

However, we're going to space.

We've already been out in space.

Thor is radiation-qualified, and we're in satellites; you do imaging from satellites. In the future, we'll also build data centers in space.

Obviously, very complicated to do so.

We're working with our partners on a new computer called Vera Rubin Space One, and it's going to go out to space and start up data centers there.

Now, of course, in space there's no conduction, there's no convection, there's just radiation.

And so we have to figure out how to cool these systems out in space, but we've got lots of great engineers working on it.


Part 13: The Open Claw Revolution — the Operating System of the Agentic Era

Open Claw, Nemo Claw, enterprise AI, and the Nemotron coalition

Let me talk to you about something new.

So, Peter Steinberger is here, and he wrote a piece of software.

It's called Open Claw.

And I don't know if he realized how successful it was going to be, but its importance is profound.

Open Claw is the most popular open source project in the history of humanity, and it got there in just a few weeks.

It exceeded, it exceeded what Linux did in 30 years.

And it's that important.

It is that important.

Well, this is all you do.

Okay?

We're announcing our support for it.

Let me just quickly go through this.

I want to show you a couple things.

You simply type this.

You type this into a console.

And it goes out, it finds Open Claw, it downloads it, and it builds you an AI agent.

And then you can tell it whatever else you need it to do.

Okay?

So let's take a look.

An open source project just dropped.

Andrej Karpathy has just launched something called... It is a huge deal.

You give an AI agent a task.

Go to sleep.

It runs 100 experiments overnight, keeping what works and killing what doesn't.

I really love what my stuff enables people to do. We had, like, one guy. He told me he installed it, as a 60-year-old dad, and they made beer and connected the machine via Bluetooth to Open Claw.

And then we automated everything, including the whole website for people to order.

Hundreds of people are queuing up for lobsters in San Jose.

Open Claw.

Open Claw, Open Claw.

You want to build Open Claw with Open Claw.

Everyone is talking about Open Claw, but what is Open Claw?

Believe it or not, there's already a Claw Con.

Incredible.

Incredible.

Now, I've illustrated what Open Claw effectively is in this way.

And so all of you can understand it, but let's just think what happened.

What is open claw?

It's an agentic system: it calls and connects to large language models.

So the first thing it has is resources that it manages.

It can access tools, it can access file systems, it can access large language models.

It's able to do scheduling.

It's able to run cron jobs. It's able to decompose a prompt that you gave it into steps.

It can spin off and call upon other subagents.

It has IO.

You can talk to it in any modality you want.

You can wave at it and it understands you.

It sends you messages, it texts you, it sends you email.

So it's got IO.

Um, what else does it have?

Well, based on that, you could say it is, in fact, an operating system.

I just used the same syntax that I would use to describe an operating system.

Open Claw has open-sourced, essentially, the operating system of agentic computers.
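The operating-system description above (managed resources, tool access, scheduling, task decomposition, subagents, I/O) can be sketched as a minimal agent loop. Every name below is invented for illustration; none of this is Open Claw's actual API:

```python
# A minimal agentic loop illustrating the OS-like parts described above:
# managed resources (tools, models), task decomposition, sub-agents.
# All class and function names here are hypothetical.

class Agent:
    def __init__(self, tools, model):
        self.tools = tools          # managed resources: named callable tools
        self.model = model          # the LLM the agent calls out to
        self.subagents = []

    def decompose(self, prompt):
        """Ask the model to break a prompt into steps (stubbed here)."""
        return [step.strip() for step in self.model(prompt).split(";")]

    def spawn(self, task):
        """Spin off a sub-agent for a delegated task."""
        child = Agent(self.tools, self.model)
        self.subagents.append(child)
        return child.run(task)

    def run(self, prompt):
        results = []
        for step in self.decompose(prompt):
            tool, _, arg = step.partition(":")
            if tool in self.tools:
                results.append(self.tools[tool](arg))   # direct tool call
            else:
                results.append(self.spawn(step))        # delegate to sub-agent
        return results

# Stub model: "decomposes" any prompt into a fixed two-step plan.
fake_llm = lambda prompt: "search:gpu prices; email:status report"
tools = {"search": lambda q: f"results for {q.strip()}",
         "email": lambda m: f"sent: {m.strip()}"}

agent = Agent(tools, fake_llm)
print(agent.run("do my morning routine"))
```

A real agentic system would add persistence, cron-style scheduling, and multimodal I/O around this same core loop, which is exactly why the operating-system comparison fits.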

It is no different than how Windows made it possible for us to create personal computers.

Now, open claw has made it possible for us to create personal agents.

The implication is incredible.

The implication is incredible.

First of all, the adoption says something, you know, all in itself.

However, the most important thing is this.

Every single company now realizes, every single software company, every single technology company: for the CEOs, the question is, what's your Open Claw strategy?

Just as we all needed to have a Linux strategy.

We all needed to have an HTTP and HTML strategy, which started the internet.

We all needed to have a Kubernetes strategy, which made it possible for mobile cloud to happen.

Every company in the world today needs to have an Open Claw strategy, an agentic system strategy.

This is the new computer.

Now, this is just the exciting part.

This is enterprise IT before Open Claw.

You know, and I mentioned earlier, the way Enterprise IT works.

And the reason why they're called data centers is because these large rooms, these large buildings, held data: the files of people, the structured data of business.

It would pass through software that has tools, systems of record, and all kinds of workflow codified into it, and that turns into tools that humans, digital workers, would use.

That is the old IT industry.

Software companies creating tools,

saving files, and of course the GSIs, the consultants that help companies figure out how to use these tools and integrate them.

These tools are incredibly valuable for governance, security, privacy, and compliance, and all of that continues to be true.

It's just that post-Open Claw,

post-agentic,

This is what it's gonna look like.

This is the extraordinary part.

Every single IT company.

Every single company, every SaaS company,

every SaaS company will become an AaaS company.

No question about it.

Every single SaaS company will become an AaaS company, an agentic-as-a-service company.

And what's amazing is this: Open Claw gave us, gave the industry, exactly what it needed, at exactly the right time.

Just as Linux gave the industry exactly what it needed at exactly the right time, just as Kubernetes showed up at exactly the right time, just as HTML showed up.

It made it possible for the entire industry to grab onto this open source stack and go do something with it.

There's just one catch.

Agentic systems in the corporate network can have access to sensitive information.

It can execute code, and it can communicate externally.

Just say that out loud.

Okay?

Think about it.

Access sensitive information, execute code, communicate externally.

It could, of course, access employee information, access finance information, and send it out, communicate externally.

Obviously, this can't possibly be allowed.

And so what we did was we worked with Peter.

We took some of the world's best security and computing experts, and we worked with Peter to make Open Claw enterprise-secure and enterprise-privacy capable.

And we call it Nemo Claw, our NVIDIA reference design for Open Claw. It has all these agentic AI toolkits, and the first part of it is technology we call Open Shell, which has now been integrated into Open Claw.

Now, it's enterprise ready.

This stack, with the reference design we call Nemo Claw, is something you can download and play with.

And you can connect to it the policy engines of all of the SaaS companies in the world.

And your policy engines are super important, super valuable.

So the policy engines can be connected.

Nemo Claw, Open Claw with Open Shell, would be able to execute that policy engine.

It has a policy engine, it has a network guardrail,

it has a privacy router.

And as a result, we can protect the company while the claws execute inside it, and do it safely.
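The three risks named above (sensitive data access, code execution, external communication) map naturally onto a chain of checks before any agent action runs. This is a generic sketch of that pattern with invented names and rules, not NVIDIA's actual Open Shell implementation:

```python
# Sketch: every agent action passes a policy engine, a network guardrail,
# and a privacy router before it is allowed to run. All names hypothetical.

SENSITIVE = {"payroll.db", "finance/ledger.xlsx"}
ALLOWED_HOSTS = {"api.internal.example"}

def policy_engine(action):
    # Example policy: agents may read sensitive files but never send them out.
    return not (action["kind"] == "send" and action.get("source") in SENSITIVE)

def network_guardrail(action):
    # External communication only to an allow-listed set of hosts.
    return action["kind"] != "send" or action["host"] in ALLOWED_HOSTS

def privacy_router(action):
    # Route reads of sensitive data through a redacting handler.
    if action["kind"] == "read" and action["target"] in SENSITIVE:
        action = dict(action, redact=True)
    return action

def execute(action):
    if not (policy_engine(action) and network_guardrail(action)):
        return "blocked"
    action = privacy_router(action)
    return f"ok ({'redacted' if action.get('redact') else 'plain'})"

print(execute({"kind": "read", "target": "payroll.db"}))    # ok (redacted)
print(execute({"kind": "send", "host": "evil.example",
               "source": "payroll.db"}))                    # blocked
```

The key design point is that the checks sit between the agent and its effects, so a connected SaaS policy engine can veto any action without modifying the agent itself.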

We also added several things to the agentic system.

And one of the most important things you want to do with your own claw, your custom claw, is to have your custom models.

And this is NVIDIA's open model initiative.

We are now at the frontier of every single domain of AI models.

Whether it's Nemotron; Cosmos, the world foundation model; Groot, for artificial general robotics and humanoid robotics models; Alpamayo, for autonomous vehicles; BioNeMo, for digital biology; or Earth-2, for AI physics, we are at the frontier on every single one.

Take a look.

The world is diverse.

No single model can serve every industry.

Open models form one of the largest and most diverse AI ecosystems in the world: nearly 3 million open models across language, vision, biology, physics, and autonomous systems enable AI builders in specialized domains.

NVIDIA is one of the largest contributors to open source AI.

We build and release six families of open frontier models, plus the training data, recipes, and frameworks to help developers customize and adopt them.

New leaderboard-topping models are launching for every family. At the core: Nemotron, reasoning models for language, visual understanding, RAG, safety, and speech.

Can you hear me now?

Hello?

Yes, I can hear you now.

Cosmos:

frontier models for physical AI world generation and understanding.

Alpamayo: the world's first thinking and reasoning autonomous vehicle AI.

Groot: foundation models for general-purpose robots. BioNeMo: open models for biology, chemistry, and molecular design.

Earth-2: models for weather and climate forecasting, rooted in AI physics.

NVIDIA open models give researchers and developers the foundation to build and deploy AI for their own specialized domains. Our models are...

Thank you.

Our models are valuable to all of you because, number one, they're at the top of the leaderboards.

They're world class.

But most importantly, it's because we are not going to give up working on them.

We're gonna keep on working on it every single day.

Nemotron 3 is going to be followed by Nemotron 4.

Cosmos 1 was followed by Cosmos 2.

Groot, by Groot Generation 2.

Each and every one of these models will continue to advance.

Vertical integration, horizontal openness, so that we can enable everybody to join the AI revolution.

Number one on the leaderboards across research and voice and world models and artificial general robotics and self-driving cars and reasoning, and, of course, one of the most important ones: Nemotron 3 in Open Claw.

This is Nemotron 3 in Open Claw, and look at the top three.

Those are the three best models in the world.

Okay?

So we are at the frontier.

It is also true.

It is also true that we want to create the foundation model so that all of you can fine-tune it, post-train it, into exactly the intelligence you need.

This is Nemotron 3 Ultra.

It is going to be the best base model the world's ever created.

This allows us to help every country build their sovereign AI.

And we're working with so many different companies out there.

And one of the most exciting things that we're doing today, I'm announcing today, is a Nemotron Coalition.

We are so dedicated to this.

We have invested billions of dollars in AI infrastructure so that we could develop the core engines for AI that are necessary for all the libraries of inference and so on, but also to create

the AI models to activate every single industry in the world.

Large language models are really important.

Of course they're important.

How could human intelligence not be?

However, in different industries around the world, in different countries around the world, you need the ability to customize your own models, and the domains of those models are radically different: from biology to physics, to self-driving cars, to general robotics, to, of course, human language.

And we have the ability to work with every single region to create their domain-specific, their sovereign, AI.

Today, we're announcing a coalition to partner with us, to make Nemotron 4 even more amazing.

And that coalition has some amazing companies in it: Black Forest Labs, the imaging company; Cursor, the famous coding company,

we use lots of it; LangChain, a billion downloads, for creating custom agents; Mistral, Arthur's company, whom I mentioned.

I think he's here.

Incredible, incredible company. Perplexity, Perplexity's Comet.

Absolutely use it.

Everybody use it.

It is so good.

A multimodal, agentic system. Reflection, Sarvam from India, Thinking Machines, Mira Murati's lab.

Incredible companies joining us.

Thank you.

I said that every single enterprise company, every single software company in the world needs an agentic system, needs an agent strategy.

You need to have an Open Claw strategy, and they all agree.

And they're all partnering with us to integrate the Nemo Claw reference design, the NVIDIA agentic AI toolkit, and of course, all of our open models.

One company after another, there's so many.

And we're partnering with all of you.

I'm really grateful for that.

And this is our moment.

This is a reinvention.

This is a Renaissance.

A Renaissance of enterprise IT, from what would be a $20 billion industry.

This is going to become a multi-trillion dollar industry, offering not just tools for people to use, but agents, specialized in the very domains you're expert in, that we can rent.

I can totally imagine, in the future:

every single engineer in our company will need an annual token budget.

They make a few hundred thousand dollars a year in base pay.

I'm going to give them probably half of that, on top of it, as tokens, so that they can be amplified 10x.

Of course we would.

It is now one of the recruiting tools in Silicon Valley:

how many tokens come along with my job?

And the reason for that is very clear: every engineer who has access to tokens will be more productive.
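The token-budget idea is just arithmetic. A sketch using the rough shape given above (a few hundred thousand in base pay, half of that again as tokens), with a purely hypothetical token price:

```python
# Toy token-budget math. Base pay and token price are illustrative only.

base_pay = 300_000                 # dollars per year (hypothetical)
token_budget = base_pay / 2        # keynote: roughly half of base pay in tokens
price_per_million = 5.00           # hypothetical dollars per million tokens

tokens_per_year = token_budget / price_per_million * 1_000_000
tokens_per_day = tokens_per_year / 365

print(f"{tokens_per_year:,.0f} tokens/year")   # 30,000,000,000 tokens/year
print(f"about {tokens_per_day:,.0f} tokens/day")
```

The point of the exercise is that the budget is set in dollars, so as token prices fall, the same line item buys each engineer vastly more amplification.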

And those tokens, as you know, will be produced by AI factories that all of you and us, we partner to build.

Okay?

So every single enterprise company today sits on top of file systems and data centers.

Every single software company of the future will be agentic, and they will be token manufacturers.

They'll be token users for their engineers, and token manufacturers for all of their customers.

The Open Claw event, the Open Claw event, cannot be overstated.

This is as big of a deal as HTML.

This is as big of a deal as Linux.

We now have a world-class open agentic framework that all of us can use to build our Open Claw strategy.

And we've created a reference design we call Nemo Claw that all of you can use, and that is optimized.

It's performant, it is safe and secure.

Speaking of agents: agents, as you know, perceive, reason, and act.

Most of the agents in the world today that I've spoken about are digital agents.

They act in the digital world.


Part 14: Physical AI and Robotics — from Autonomous Driving to Olaf

Robotaxis, humanoid robots, Disney's Olaf, and physical AI

They act in the digital world.

They reason, they write software.

It's all digital, but we also have been working on physically embodied agents for a long time.

We call them robots.

And the AIs that they need are physical AIs.

We have some big announcements here.

I'm gonna just walk through a few of them.

There are 110 robots here. Almost every single company in the world building robots, I can't think of one that isn't, is working with NVIDIA.

We have three computers: the training computer,

the synthetic data generation and simulation computer, and of course, the robotics computer that sits inside the robot itself.

We have all the software stacks necessary to do so.

The AI models to help you.

And all of this is integrated into ecosystems around the world, with all of our partners, from Siemens to Cadence, incredible partners everywhere.

And today, we're announcing a whole bunch of new partners.

As you know, we've been working on self-driving cars for a long time.

The ChatGPT moment of self-driving cars has arrived.

We now know we could successfully autonomously drive cars.

And today we are announcing four new partners for NVIDIA's robotaxi-ready platform:

BYD, Hyundai, Nissan, and Geely, all together 18 million cars built each year,

joining our partners from before: Mercedes, Toyota, GM. The number of robotaxi-ready cars in the future is going to be incredible, and we're also announcing a big partnership with Uber.

Across multiple cities, we're going to deploy and connect these robotaxi-ready vehicles into their network.

And so a whole bunch of new cars.

We have ABB, Universal Robots, KUKA, so many robotics companies here, and we're working with them to implement our physical AI models, integrated into simulation systems, so that we can deploy these robots into manufacturing lines everywhere.

We have caterpillar here.

We even have T-Mobile here.

And the reason for that is, in the future, what used to be a radio tower is going to be an NVIDIA Aerial AI-RAN.

And so this is going to be a robotic radio tower, meaning it can reason about traffic and figure out how to adjust its beamforming so that it can save as much energy as possible and increase fidelity as much as possible.

There are so many humanoid robots here.

But one of my favorites.

One of my favorites is a Disney robot.

You know what?

Tell you what, let me just show you some of the videos.

Let's look at that first.

The first global rollout of physical AI at scale is here.

Autonomous vehicles.

And with NVIDIA Alpamayo, vehicles now have reasoning, helping them operate safely and intelligently across scenarios.

We ask the car to narrate its actions.

I'm changing lanes to the right to follow my route.

Explain its thinking as it makes decisions.

There's a double parked vehicle in my lane.

I'm going around it.

And follow instructions.

Hey, Mercedes.

Can we speed up?

Sure, I'll speed up.

This is the age of physical AI and robotics.

Around the world, developers are building robots of every kind.

But the real world is massively diverse, unpredictable, full of edge cases.

Real world data will never be enough to train for every scenario.

We need data generated from AI and simulation.

For robots, compute is data.

Developers pre-train world foundation models on internet-scale video and human demonstrations, and evaluate the models' performance to prepare them for post-training.

Using classical and neural simulation, they generate massive amounts of synthetic data and train policies at scale.

To accelerate developers, NVIDIA built the open source Isaac Lab for robot training, evaluation, and simulation;

Newton for extensible and GPU accelerated differentiable physics simulation.

Cosmos world models for neural simulation, and Groot open robotics foundation models for robot reasoning and action generation.

With enough compute, developers everywhere are closing the physical AI data gap.
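The pipeline the narration describes (randomize a simulated world, roll out, train the policy on the synthetic episodes) reduces to a loop like the one below. The simulator, labels, and learning rule are all invented stand-ins, not the Isaac Lab or Cosmos APIs:

```python
import random

# Generic sketch of closing the "data gap" with simulation: generate
# unlimited labeled episodes from a randomized simulator, then fit a
# policy to them. Every function here is a hypothetical stand-in.

def simulate_episode(rng):
    """Randomized world plus a ground-truth label only a simulator can give."""
    mass = rng.uniform(0.5, 2.0)              # domain randomization
    grip_force = 9.8 * mass * 1.2             # "true" force needed to lift
    return mass, grip_force

def train_policy(n_episodes=5000, lr=0.01, seed=0):
    """Fit grip_force ~ w * mass by SGD on synthetic episodes."""
    rng = random.Random(seed)
    w = 0.0                                   # single policy parameter
    for _ in range(n_episodes):
        mass, target = simulate_episode(rng)
        pred = w * mass
        w -= lr * 2 * (pred - target) * mass  # gradient of squared error
    return w

w = train_policy()
print(f"learned gain: {w:.2f} (true value 11.76)")
```

Because the simulator can emit as many randomized episodes as the loop asks for, the policy converges without a single real-world trial, which is the essence of the synthetic-data argument above.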

Peritas AI trains their operating room assistant robot in NVIDIA Isaac Lab, multiplying their data with NVIDIA Cosmos world models.

Skild AI uses Isaac Lab and Cosmos to generate post-training data for their Skild AI brain.

They use reinforcement learning to harden the model across thousands of variations.

Humanoid uses Isaac Lab to train whole body control and manipulation policies.

Hexagon Robotics uses Isaac Lab for training and data generation.

Foxconn fine-tunes Groot models in Isaac Lab,

as does Noble Machines.

Disney Research uses their Camino Physics Simulator in Newton and Isaac Lab to train policies across their character robots in every universe.

Ta-da!

Ladies and gentlemen, Olaf!

Does it? Soon coming through.

Newton works.

Wow, Omniverse works.

Olaf, how are you?

I'm so happy now that I'm living!

I know.

Because I gave you your computer.

Jetson.

What's that?

Well, it's in your tummy.

That's gonna be amazing.

And you learned how to walk inside Omniverse.

I got to walk.

This is so much better than riding on a reindeer, gazing up at a beautiful sky.

And it was because of physics, using this Newton solver that runs on top of NVIDIA Warp, which we jointly developed with Disney and with DeepMind, that made it possible for you to adapt to the physical world.

Check that out.

That's how smart you are.

I miss No May.

Not a Snoclopedia.

Could you imagine this?

The future of Disneyland?

All these all these robots, all these characters wandering around?

You know, I have to admit, though, I thought you were gonna be taller.

I've never seen such a short snowman, to be honest.

Nope.

Hey, tell you what.

You wanna help me out?

Hooray!

Okay.

Usually, usually I close the keynote by telling you what I told you.

We talked about the inference inflection, we talked about the AI factory, we talked about Open Claw and the agent revolution that's happening.

And of course, we talked about physical AI and robotics.

But tell you what, why don't we get some friends to help us close it out?


Part 15: Closing — the Future of GTC

The closing rap and farewell

But tell you what, why don't we get some friends to help us close it out?

All right, play it.

Terminating simulation.

Hello?

Anybody here?

The keynote's over, all was said, agents now map the road ahead.

AI factories coming alive,

agents learning how to drive, from open models to robots too.

Now we'll break it all down.

Compute exploded, what we saw, from CNNs to Open Claw.

Agents work across the land,

but they need the power to meet demand.

So we solved the problem.

It was brilliant.

We multiplied compute by 40 million.

Well, once upon an AI time, training was the paradigm.

Sure, it taught the models how, but inference runs the whole world now.

Vera shows us who's the boss, at 35 times less the cost.

Blackwell makes the tokens sing, NVIDIA,

the inference king.

Yeah, factories once took years, vendors pulling racks and gears.

Built up slowly, piece by piece,

no clear way to scale this beast. DSX and Dynamo

know what to do,

turning power into revenue.

Agents used to wait and see.

Now they act autonomously, but if they ever try to stray, safeguards block and say "no way." Nemo Claw is there to guard the course.

And yes, my friends, it's open source.

Cars that think and droids that run, this ain't the movies.

It's all begun.

Alpamayo calls the shots.

It's a GPT moment for the bots, from sim to streets.

Now watch them drive.

Throw your hands up.

For physical AI.

A nod to the age that came before, now we build for AI even more.

Rubin plus Grace make the inference splash, put them together.

Now it's raining cash.

We build new architecture every year, 'cause Claw keeps yelling "more tokens here."

The AI stack's for all to make, so let us all eat five-layer cake. The moment's right.

The path is clear,

'cause open models let us hear. When data's missing,

there's no dispute.

We just generate more with compute.

Robots learning without flaw, fueling the four scaling laws.

The future's here, won't you come and see?

Welcome all to GTC.

All right, have a great GTC!

Wave.

Thank you, everybody.

Thank you.


