
Nvidia CEO Jensen Huang on what’s next for the AI boom

Huang’s GTC keynote pitched an AI economy built on inference, tokens, and agentic systems — with Nvidia selling the factory floor where all of that work runs


Before Jensen Huang even got to his leather-jacket entrance at this year’s GTC, Nvidia $NVDA had already started selling the myth. The preshow soundtrack sounded suspiciously custom-built for a coronation — lyrics about amazing things arriving on schedule, legends being made, the future showing up right on cue; songs even Shazam couldn’t identify. (The first AI demo of the day could very well have been the playlist.) Half the room had phones raised for Huang’s entrance like Silicon Valley had booked its own arena act. For one afternoon, the San Jose Sharks’ home rink belonged to a different kind of power play. Because Huang walked onstage and did what he does best: turned a product keynote into a zoning hearing for the future.

The Nvidia founder opened GTC by promising a tour through “every single layer” of AI, then spent the next couple of hours arguing that the company isn’t just selling chips into a hot market. Nope. The company wants to define the whole physical plant of the AI economy: the compute, the networking, the storage, the software, the models, the factories, and — because subtlety is clearly out of season — maybe even the (still theoretical) data centers way up in space.

The keynote sprayed announcements in every direction, but the real message was tighter. Huang wanted investors, customers, and rivals to hear four things clearly: AI demand is still climbing fast enough to justify outrageous amounts of spending; inference is now the center of the battlefield; agents are supposed to spill out of chatbots and into the daily machinery of office work; and the next gold rush after digital AI could be physical AI, where robots, autonomous systems, and industrial software burn through even more data and infrastructure. You can't spell Nvidia without AI.

Huang opened where he usually opens when the market starts wondering whether Nvidia's moat might someday spring a leak: software. He spent the early stretch reminding everyone that CUDA is 20 years old and that Nvidia's installed base sits "in every cloud" and "every computer company." Nvidia's strongest shield is still the software ecosystem wrapped around the silicon, not the green rectangles by themselves.

That logic shaped the rest of the speech. Huang lingered on structured data, called it the “ground truth” of enterprise computing, and said that AI can finally make use of the ocean of unstructured information — PDFs, videos, speech, all the corporate attic junk companies have hoarded for years without really knowing how to search or monetize. Watch out, world; Nvidia wants a claim on the database, too.

GTC isn’t just about a faster, better chip anymore. This year’s big speech was about Nvidia’s attempt to become the company that owns the economics of AI work itself — the chips, the storage, the networking, the orchestration layer, the digital twin, the open-model politics, the agent runtime, and whatever comes after the data center once Earth starts feeling crowded. GTC 2026 was an inference keynote, an agent keynote, and an AI-factory keynote, with the hardware serving as the proof rather than the plot. 

Well, that’s a big number

Huang's biggest flex was numerical. He returned to CUDA's 20th anniversary, called the platform the flywheel behind accelerated computing, said computing demand has risen "1 million times over the last few years," and then raised the stakes by saying he now sees at least $1 trillion in revenue opportunity from 2025 through 2027, up from the $500 billion figure he had previously attached to Blackwell and Rubin demand through 2026. Nvidia shares closed up 1.6% on Monday, which reads like approval without full conversion.

That number, and Huang's framing of it, was the keynote's organizing principle. Nvidia wanted investors and customers to hear, in public and at volume, that the buildout is still early, still broadening, and still large enough to make current spending look like down-payment money. The number also did some quiet cleanup work. Nvidia has spent months fielding the usual questions that arrive whenever a company becomes the main cashier at a capital-spending frenzy: How long can this last? What happens when hyperscalers find religion on costs? How much of the next phase leaks to custom chips and cheaper alternatives?

Huang's answer was to widen the lens, making the market bigger and the workload messier. He declared that the inference inflection had arrived and built the middle of the keynote around a simple argument: AI can now do productive work. And once that happens, the demand picture changes. Training giant models and admiring them was never going to be the final stage. That all moves into production, where the meter never stops running.

This — this! — is your revenue, he was saying, turning a data center into a mint and a power bill into destiny. Nvidia was busy pitching reality so improved it could practically be invoiced, and the room was still full of people trying to decide whether the demo looked transcendent or just slightly more expensive.

Tokens were everywhere in the keynote: in the opening video, in the performance charts, in the economic argument. The point, essentially, is that the future value of AI lies in generating useful output continuously, which means inference becomes the part of the stack where cost, latency, and throughput genuinely matter. Huang is pitching dependency. He wants customers thinking in gigawatt campuses, integrated racks, megawatt budgets, and token throughput curves, not in servers they can mix and match at will.
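
The arithmetic behind that framing fits in a few lines. Here's a back-of-the-envelope sketch in Python; every number in it is an assumption chosen for illustration, not a figure from the keynote.

```python
# Back-of-the-envelope token economics. All figures below are assumptions
# for illustration only, not Nvidia numbers.
RACK_POWER_KW = 120          # assumed sustained rack draw
TOKENS_PER_SEC = 2_000_000   # assumed aggregate decode throughput
POWER_PRICE = 0.08           # assumed $ per kWh
RACK_COST = 4_000_000        # assumed hardware cost, amortized over 4 years

tokens_per_hour = TOKENS_PER_SEC * 3600
power_cost_per_hour = RACK_POWER_KW * POWER_PRICE
hardware_cost_per_hour = RACK_COST / (4 * 365 * 24)

def per_million_tokens(hourly_cost: float) -> float:
    """Convert an hourly cost into dollars per million tokens served."""
    return hourly_cost / (tokens_per_hour / 1_000_000)

print(f"power:    ${per_million_tokens(power_cost_per_hour):.4f} / M tokens")
print(f"hardware: ${per_million_tokens(hardware_cost_per_hour):.4f} / M tokens")
```

With these made-up inputs, amortized hardware comes out roughly an order of magnitude larger than the electricity bill per token, which is why Huang keeps steering the conversation toward throughput per watt and per dollar: squeeze more tokens out of the same rack, and both denominators improve at once.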

Inference takes center stage

One of the sharpest lines of the keynote was also the simplest: "The inference inflection has arrived." Nvidia knows the world has gotten interested in cheaper, leaner inference hardware. Fine. It would like to sell that, too.

Huang broke inference into two stages — prefill and decode — and laid out a system in which Nvidia’s Vera Rubin chips handle the prefill work, while Groq-derived silicon tackles decode, the step that actually spits out the answer. That matters; inference is where Nvidia’s next chapter gets messier. Training made the company rich. Serving hundreds of millions of users in real time is where customers start asking questions about cost, latency, and whether they really need the same silicon for every step.
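
For anyone outside the chip world, the two stages are easy to picture in code. Below is a toy sketch with a dummy stand-in for the model; it shows the shape of the work, not Nvidia's or Groq's actual software.

```python
# Toy illustration of the two inference stages. The "model" here is a
# dummy stand-in; real systems run transformer layers on GPUs.

def prefill(prompt_tokens: list[str]) -> list[int]:
    """One big parallel pass over the whole prompt, building the KV cache."""
    return [hash(tok) % 997 for tok in prompt_tokens]

def decode_step(kv_cache: list[int]) -> int:
    """One small serial step: emit a token, extend the cache, repeat."""
    next_token = sum(kv_cache) % 50_000
    kv_cache.append(next_token % 997)
    return next_token

prompt = "why does inference cost so much".split()
cache = prefill(prompt)                           # compute-heavy, done once
answer = [decode_step(cache) for _ in range(8)]   # bandwidth-heavy, looped
print(answer)
```

The split Huang described tracks a real asymmetry: prefill chews through the entire prompt at once and rewards raw compute, while decode emits tokens one at a time and mostly waits on memory bandwidth, so dedicating different silicon to each stage is coherent engineering rather than pure theater.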

Huang’s response was classic Nvidia. Don’t defend the GPU in isolation; swallow the whole stack. He described Vera Rubin as “a generational leap” built around seven chips and five rack-scale systems, with Nvidia claiming the platform can train large mixture-of-experts models with one-fourth the number of GPUs versus Blackwell and deliver up to 10 times higher inference throughput per watt at one-tenth the cost per token. He also used the keynote to look beyond Rubin to the future platform Feynman, because in Nvidia-land, the next generation is standing in the wings before the current one finishes taking its bow. 

Huang isn’t pitching a faster part so much as a bigger dependency. Nvidia announced a Vera Rubin DSX AI factory reference design, DSX simulation tools for planning AI factories before they’re built, and a broader menu of storage, networking, and system components meant to operate as one vertically integrated machine. The message was hard to miss: Stop thinking about servers, start thinking about campuses. Or, if you’re Nvidia, start sending invoices like a utility.

Agents leave the demo stage

If the hardware pitch was about keeping Nvidia at the center of inference, the software pitch was about making sure enterprise AI doesn't become someone else's party. Huang said "100% of Nvidia" is now using Claude Code, Codex, and Cursor. People are no longer asking AI who and what and when and where and how; they're asking it to create. To do. Sorry, chatbot companies: AI is now being treated less as a conversational novelty and more as a labor system.

Huang spent the day trying to make sure that labor system runs through Nvidia's stack. The company, in partnership with the all-too-trendy OpenClaw, rolled out NemoClaw for the OpenClaw community, pushed its Agent Toolkit and OpenShell runtime, and leaned into AI-Q, which is meant to route queries and cut costs by more than 50% through a hybrid mix of frontier models and Nvidia's open ones.
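
Nvidia didn't walk through AI-Q's internals onstage, but the routing idea it described is simple enough to sketch. Here's a minimal, purely illustrative version; the model names, prices, and difficulty heuristic are all invented for the example.

```python
# Minimal sketch of hybrid model routing: cheap open model for easy
# queries, expensive frontier model for hard ones. All names, prices,
# and heuristics are invented for illustration.

COST_PER_M_TOKENS = {"open-model": 0.60, "frontier-model": 15.00}  # assumed $

def difficulty(query: str) -> float:
    """Crude stand-in for a learned router: longer or code-heavy
    queries score as harder."""
    score = min(len(query) / 400, 1.0)
    if any(kw in query.lower() for kw in ("prove", "debug", "refactor")):
        score += 0.5
    return min(score, 1.0)

def route(query: str, threshold: float = 0.5) -> str:
    """Send hard queries to the frontier model, the rest to the open one."""
    return "frontier-model" if difficulty(query) > threshold else "open-model"

for q in ("What's our PTO policy?",
          "Debug this race condition across three services."):
    choice = route(q)
    print(f"{choice:15s} (${COST_PER_M_TOKENS[choice]}/M tokens) <- {q}")
```

If most traffic is easy, a router like this shifts the bulk of tokens onto the cheap path, which is the mechanism behind the more-than-50% savings claim, assuming the difficulty classifier is good enough not to send hard problems to the small model.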

There’s a strategic hedge tucked inside all that openness. 

Nvidia unveiled the Nemotron Coalition with Black Forest Labs, Cursor, LangChain, Mistral, Perplexity, Reflection AI, Sarvam, and Thinking Machines Lab, with the first project set to underpin the coming Nemotron 4 model family. Read the subtext, and it’s pretty clear that Nvidia doesn’t want the future of AI software split neatly between a few giant closed-model vendors and a pile of commodity hardware underneath. It wants a hand in the open-model layer, too — the piece that shapes who gets to build, tune, and own AI outside the walls of the biggest labs.

The empire pitch gets bigger

And then, because Huang has never met a metaphor he couldn’t upscale, the keynote spilled outward from the data center into almost every adjacent industry it could find. 

Huang has been widening Nvidia’s story beyond digital assistants for a while, and this year’s GTC pushed that theme even harder. Nvidia announced a Physical AI Data Factory Blueprint with Microsoft $MSFT Azure and Nebius that’s meant to automate how training data gets generated, augmented, and evaluated for robotics, vision AI agents, and autonomous vehicles. The pitch is straightforward enough: Real-world data is scarce, edge cases are annoying, and synthetic data plus simulation can turn compute into the raw material these systems need. 

Huang also previewed GR00T N2, a next-generation robot foundation model based on DreamZero research that the company says more than doubles the success rate of leading vision-language-action (VLA) models on new tasks in new environments. Chatbots got Wall Street excited. Physical AI is the part that could keep the infrastructure binge going for years, because robots, industrial systems, and autonomous machines don't just need models; they need endless training data, simulation, networking, sensors, and edge compute.

Huang even brought Disney $DIS's Olaf onstage, a small piece of physical-AI theater that made the broader point more cleanly than another architecture slide could have. Nvidia says Disney has been training Olaf and its BDX droids with a GPU-accelerated physics simulator built on Nvidia's Warp framework and integrated into Newton, with Olaf set to debut at Disneyland Paris on March 29.

Nvidia also made sure autonomous vehicles kept their place on everyone’s bingo cards. The company said BYD, Geely, Isuzu, and Nissan are building Level 4-ready vehicles on its DRIVE Hyperion stack, while Uber $UBER is slated to roll out Nvidia-powered robotaxis in Los Angeles and San Francisco in the first half of 2027 before expanding to 28 markets by 2028. Autonomy fits Huang’s broader case almost too neatly: The next phase of AI will move through the physical world, which means more sensors, more simulation, more networking, more edge compute, and, conveniently for Nvidia, more expensive hardware everywhere.

Huang took the bigger-and-better story a step further still and said Nvidia is going to space, with future Vera Rubin-based systems aimed at orbital data centers and autonomous space operations. Sure, that sounds a little like a man who has discovered there are still a few untouched sectors left. But it also sounds like a company determined to make "AI infrastructure" mean nearly every expensive machine in sight. Nvidia is still the chip king. But Huang no longer sounds especially interested in that title by itself. His company is trying to graduate from chip supplier to factory architect, operating-system vendor, and toll collector for a world where AI does more of the work and power-constrained data centers become revenue engines measured in tokens per watt.

By the time Huang was done, the keynote felt bigger than a launch calendar. It read like an empire map. Yes, there was DLSS 5 for graphics, new industrial software tie-ins, telecom edge partnerships, and an avalanche of developer plumbing. But the durable takeaway was simpler and much bigger: Nvidia wants AI to stop being understood as just a software category and start being treated as a utility-scale infrastructure project, with Nvidia's hardware and software embedded at every layer.

That’s a very Jensen Huang message. The unnerving part for rivals is that, for now at least, he still has plenty of customers willing to build around it.
