NVIDIA Is Not a GPU Company Anymore

Q: How powerful is the RTX Spark GPU?

The top RTX Spark configuration carries 6,144 Blackwell CUDA cores, the same count as a desktop GeForce RTX 5070. Because it runs inside a thin laptop or a small desktop, real performance will land below the desktop card, and NVIDIA's gaming claim of 1440p at over 100 frames per second has not yet been independently tested.

Q: Should you buy an RTX Spark, or is it overhyped?

For most buyers, no. The headline feature, running a large language model locally, needs the expensive 128GB configuration, and a cloud API delivers a faster and smarter model for a fraction of the upfront cost. Gamers get more frames per dollar from a conventional laptop, and Windows on Arm still struggles with software not compiled for it. RTX Spark makes sense mainly if you specifically need local AI for privacy or offline reasons, or want one quiet machine that does AI, creation and gaming. The wildcard is that NVIDIA is the first to pair Arm Windows with a real GPU, the ingredient that sank the earlier attempts.

A long time ago, NVIDIA used to be a GPU company. The line has become a running joke among people who watch this industry, and on the first morning of June, in a full hall at the Taipei Music Center, Jensen Huang spent two hours turning the joke into a thesis. He wore the leather jacket. He held up silicon. And by the time he walked off stage at GTC Taipei, the company best known for the card you slot into a gaming PC had announced the entire chip inside your next laptop, the model inside your next AI agent, an open humanoid robot for the world's research labs, and a supercomputer that fills five racks and behaves like one machine.

None of it was framed as a graphics launch, because for the first time in years there was no flagship gaming card to launch. That absence is the story. Huang has spent recent keynotes talking less about frame rates and more about what he now calls AI factories, and this year he said the quiet part with a straight face. His customers, he argued, do not want to buy computers at all. They want to build factories that turn electricity into intelligence.

This matters even if you will never set foot near a data center. For almost the whole history of personal computing, the chip running your machine came from Intel, AMD, Apple or Qualcomm, and NVIDIA sold you a card to bolt on beside it. That arrangement is the thing that ended in Taipei. With a new superchip called RTX Spark, NVIDIA stops being the part you add to a PC and starts trying to be the PC. And the same architecture that runs your laptop now runs the building where models are trained, the models themselves, and the robots meant to carry the results into the physical world.

The GPU did not disappear. It became the front door to everything else.

01The chip that walked into Intel's living room

The headline product for ordinary buyers is RTX Spark, and it is exactly the move the leakers spent months predicting under the codenames N1 and N1X. It is a single system on chip, what NVIDIA calls a superchip, that fuses a Grace class CPU of up to 20 Arm cores with a Blackwell GPU of up to 6,144 CUDA cores, all sharing up to 128GB of unified LPDDR5X memory on a chip built at TSMC on a 3 nanometre node. The CPU side was co-developed with MediaTek, which is how NVIDIA suddenly has a credible Arm laptop processor without having built a phone chip business first.

It is, in plain terms, the friendly Windows sibling of the DGX Spark, the Linux mini machine NVIDIA sold to researchers. The silicon shares the same GB10 foundation, but where DGX Spark spoke to data center networking and lab software, RTX Spark speaks Windows 11 on Arm, USB4 and the things a real laptop needs. Huang called it the most efficient PC chip ever built, which is the kind of sentence you should always read with one eyebrow raised, and we will get to that.

Anatomy of RTX Spark. A 20 core Arm CPU and a 6,144 core Blackwell GPU on one package, joined by NVLink C2C and feeding from a single 128GB memory pool. The unified pool is the trick: the GPU can read a large model without copying it across a bus first. Diagram by Silicon Tales.

The pitch is unified memory used as a weapon. Because the CPU and GPU share one large pool, NVIDIA says a Spark machine can run a 120 billion parameter language model locally with up to one million tokens of context, render a 90GB 3D scene with OptiX and DLSS, edit 12K video through the Blackwell decoder, and still play AAA games at 1440p above 100 frames per second with ray tracing, DLSS and Reflex. The framing is deliberate. This is not sold as a faster gaming laptop. It is sold as a personal AI computer that happens to game, built for a world where an agent runs quietly in the background as what NVIDIA describes as a teammate rather than a chatbot you poke at.

Here are the numbers worth keeping in your head, with the honest caveats attached.

NVIDIA RTX Spark// top configuration

GPU: 6,144 Blackwell CUDA coressame core count as a desktop RTX 5070
CPU: Up to 20 Arm coresGrace class, co-developed with MediaTek
Memory: Up to 128 GB LPDDR5Xunified, shared between CPU and GPU
Bandwidth: Up to about 300 GB/sGB10 class, exact figure awaits testing
Process: TSMC 3nmNVLink C2C linking CPU and GPU
Platform: Windows 11 on Armno discrete GPU, no ConnectX
Availability: Fall 2026around 30 laptops, 10 plus desktops
Builders: Acer, Asus, Dell, Gigabyte, HP, Lenovo, MSIMicrosoft Surface Laptop Ultra to follow

The competitive picture is where this gets spicy, because NVIDIA is not entering an empty room. It is walking into a market Apple reshaped with its own silicon, that Qualcomm tried and largely fumbled with Snapdragon X, and that Intel and AMD have defended for decades. Intel executives have publicly admitted to a healthy dose of paranoia about the arrival. The table below is positioning, not a benchmark scoreboard, and it reflects what each camp claims rather than measured results.

// how NVIDIA is positioning RTX Spark against the incumbents (vendor claims, not independent benchmarks)
	RTX Spark subject	Apple silicon	Qualcomm Snapdragon X	Intel / AMD x86
Instruction set	Arm	Arm	Arm	x86
Graphics pedigree	Full RTX and CUDA stack	Strong, but closed and no CUDA	Adreno, modest for gaming	Radeon and Arc, mature drivers
Unified memory	Up to 128 GB	Large on high end parts	Up to 64 GB tier	Smaller fast pools, split memory
Local AI headline	120B model, 1M context (claimed)	Capable, tied to its own frameworks	Copilot+ class NPU	AI split across CPU, NPU, dGPU
Big question	Real power and thermals on thin laptops	Will it ever open up	App and game compatibility	Can x86 match the AI memory story

// stay skeptical The one specification that should make you cautious is the GPU. RTX Spark carries the same 6,144 CUDA cores as a desktop RTX 5070, but it lives inside a laptop power budget, so the desktop card's performance is a ceiling it will not reach, not a target it will match. NVIDIA's 1440p gaming claim came with no game list, no settings and no power figures, and Windows on Arm still leans on translation for software that was never compiled for it. Treat the slide numbers as the start of the conversation, not the verdict, until independent reviewers measure retail machines this fall.

02The leak said one thing. The truth was smaller.

Hours before the keynote, a full spec sheet for the chips then called N1 and N1X leaked, and it was everywhere. It described a four tier family climbing to a monster top configuration. It is a useful artifact, not because it was right, but because it was so confidently wrong, and comparing it to what NVIDIA actually confirmed is a small lesson in how hype inflates on the way to a launch.

The leaked top tier claimed more than 20,000 CUDA cores, a figure that would have put a thin laptop ahead of a desktop RTX 5090. The real top configuration is 6,144 cores. That is a gap of more than three times, and it is the difference between a credible efficient laptop chip and a physically implausible one. The shape of the story survived. The specifics did not.

Rumor / pre-launch leak

The "N1 / N1X" leak sheet

Top tier name N1X
CUDA cores 20,480
Memory bus 256-bit
Memory 128 GB
Bandwidth ~273 GB/s
Power 120 W
Lineup 4 tiers, X to S

Confirmed by NVIDIA

RTX Spark, as announced

Brand RTX Spark
CUDA cores 6,144
CPU up to 20 Arm
Memory up to 128 GB
Bandwidth ~300 GB/s
Foundation GB10 / 3nm
Lineup several models, 16 to 128 GB

The instinct to believe a clean, detailed leak is strong, especially one that arrives the night before and confirms what you already wanted to be true. The 6,144 figure also happens to be NVIDIA's actual ceiling, not a midrange part, which is worth remembering the next time a chart promises you the moon a few hours before a keynote.

03A revolution, or a rerun?

There is a line older than computing itself, that the mountains went into labour and gave birth to a mouse. Watch the RTX Spark portion of the keynote a second time with the marketing turned down, and it starts to fit a little too well. Take away the word AI, which Huang repeated like a metronome, and what remains is a thin, expensive Arm laptop running Windows. The uncomfortable part is that we have been sold this laptop before. Not once, but twice.

We have stood here twice before

Windows on Arm is not a frontier, it is a graveyard with two headstones. In 2012 Microsoft shipped Windows RT on the Surface RT, an Arm machine that could not run ordinary Windows programs and was quietly retired inside two years. In 2024 it tried again with the Copilot+ PCs built on Qualcomm's Snapdragon X Elite, launched under an "AI PC" campaign so loud, and forgotten so fast, that most buyers today could not name a single model. Both promised Apple class battery life with Windows freedom. Both broke on the same rock.

That rock is the developer, and it has not moved. A Windows program has to be recompiled and tested to run natively on Arm, and a developer who can see billions of x86 machines in the world is not going to rewrite an app to chase a few hundred thousand Arm ones. This is why every earlier push stalled, and why the platform, even now that it has quietly grown competent, still sends serious creators to Apple and serious gamers to an x86 box with an NVIDIA card inside. RTX Spark arrives carrying that entire history on its first day.

Windows on Arm has two headstones. RTX Spark is the third attempt, and the first arriving with a serious GPU behind it. Whether that is enough to do what battery life and an AI label could not is the real question. Diagram by Silicon Tales.

There is a second, quieter awkwardness, which is that RTX Spark is barely new. A year ago NVIDIA sold what is essentially this same chip as the DGX Spark, a small Linux machine aimed at researchers who wanted to run large models on a desk. Same GB10 foundation, same 128GB of unified memory, same headline trick. What changed in Taipei is the operating system and the story wrapped around it: the lab box now plays games and speaks Windows. That is a refresh wearing the costume of a new category.

A fix for a problem most people do not have

The entire pitch leans on a single belief, that you will want to run a large language model on your own machine rather than in the cloud, and it is worth being blunt about how thin that belief is for most of us. The capability NVIDIA put on the poster, a 120 billion parameter model running locally, needs the 128GB configuration, which is the costly end of the family, the point where a Spark machine wanders into serious gaming laptop money. For that price you get a model that runs slower and reasons worse than what a few dollars of cloud API hands you the same afternoon. The people with a genuine reason to keep data off the cloud, regulated industries, privacy sensitive work, certain enterprises, already buy dedicated hardware and know exactly why. Everyone else is being offered a cure for a symptom they have never felt.

And computing revolutions are not won at the top of the price list, they are won at the bottom. Apple did not reshape the laptop because the M chip was clever in the abstract, it did so because a genuinely capable MacBook Air came within reach of a normal budget, the kind of machine a student can actually buy. A personal computer that starts cheap and does real work changes the world. One that asks several thousand dollars for a local AI trick most owners will never switch on is a product hunting for an audience it has not found yet. So who is it for? Not most gamers, who get more frames per dollar from a conventional laptop today. Not most developers, who run the numbers and rent the cloud. Not the handheld crowd, who were left out entirely, with no SteamOS console and no palm sized device, presumably because the chip cannot yet sip power that gently. Huang gave his consumer flagship a sliver of a two hour keynote and put no price on the screen, and from a showman this disciplined, that silence is its own kind of review.

// but here is the thing And yet you should not close the case, because one detail turns the whole story into a genuine question. Every previous Arm on Windows attempt died for the same reason above all others, a lack of graphics power, and graphics power is the one thing NVIDIA has in embarrassing surplus. For the first time the company shipping the Arm Windows chip is also the company that owns the gaming GPU, the drivers, the upscaling and the developer relationships the others never had. Maybe a credible GPU is precisely the missing ingredient that finally turns the graveyard into a garden. Or maybe it is a brilliant answer to a question almost nobody was asking, battery powered gaming, while the real walls, price and software, stand exactly where they have always stood. NVIDIA is betting the third time is different. After sitting through the first two, you would be entirely right to wonder, and that wondering is the most interesting thing this launch left behind.

04Meanwhile, on the data center floor

If RTX Spark is the consumer face of the day, the engine behind everything is Vera Rubin, and NVIDIA used Taipei to confirm it has reached full production. Named after the astronomer who found the first strong evidence for dark matter, Rubin is the successor to Grace Blackwell, and it is built for exactly the workload Huang kept returning to: agents that turn a single prompt into a long chain of reasoning, retrieval and tool use.

A single Vera Rubin NVL72 rack, the design once labelled NVL144, packs 72 Rubin GPUs across 144 dies with 36 Vera CPUs, and NVIDIA quotes roughly five times the inference and three and a half times the training performance of Blackwell, with up to ten times the agent throughput at scale. The platform now operates as a pod of five purpose built racks behaving as one supercomputer, cooled entirely by liquid, with cable free trays that drop a rack installation from two hours to about five minutes. It also introduces Spectrum-X Ethernet Photonics, co-packaged optics meant to make million GPU factories practical. With a proven open MGX design, NVIDIA says 150 partners in Taiwan alone, across more than 350 factories in 30 countries, are ramping it.

We took Rubin apart in detail when it was first unveiled, so rather than repeat the teardown here, this is the moment it stops being a roadmap and becomes a shipping product. If you want the deep architecture walk through, that is a separate read.

Computex 2026 keynote slide titled Vera Rubin in Full Production, listing ODM and OEM partners including Dell Technologies, HPE, Lenovo and Supermicro, with Jensen Huang on stage in front of a long row of Vera Rubin server racks. — **Vera Rubin in full production.** The data center engine behind the agentic AI pitch, with the supply chain to match. Source: NVIDIA Computex 2026 keynote.

05Nobody wants a computer. They want a factory.

The most revealing slide of the keynote was not a chip at all. It was an ecosystem chart for something NVIDIA calls DSX, its framework for the people who build AI factories: the clouds, the power and cooling vendors, the construction firms, the storage makers and the server builders, all arranged as layers of a single stack. Huang framed it bluntly. The world is in the middle of what he called the largest infrastructure build out in human history, and he means it literally, because in this business compute is revenue.

The economics are brutal and clarifying. A one gigawatt AI factory now costs somewhere between 50 and 100 billion dollars to stand up. At that scale, every watt that does useful work is income and every idle watt is loss, which is why Huang kept repeating that throughput per watt is revenue, and why DSX includes a configuration he says delivers 40 percent more GPUs inside the same power budget. The framework's operating layer is open source. NVIDIA wants to be the standard the whole industry builds on, not just a supplier to it.

NVIDIA DSX AI Factory Ecosystem slide grouping dozens of partners by layer, from AI clouds and AI factory software through design and construction, energy and cooling, infrastructure and facilities, down to compute systems, with a rendered AI factory floor behind Jensen Huang. — **The DSX AI factory ecosystem.** NVIDIA is no longer selling boxes, it is selling the blueprint for the building that holds them. Source: NVIDIA Computex 2026 keynote.

Underneath the partner logos sits the genuinely radical idea, and a tech writer at the event put it best: for 40 years every computer was designed around the assumption that a human was using it. NVIDIA's argument is that the primary user of the next generation of computing is not a person at all. It is an autonomous agent that receives a goal, breaks it into steps, picks its own tools and checks its own work. Build the machine for the agent, not the human, and the whole design changes. That single reframing is what ties a laptop chip, a data center rack and a robot into one coherent story.

06The shovel maker is selling gold now

For years the comfortable take on NVIDIA was that it sold the shovels in a gold rush and stayed politely out of the mining. That is over too. In Taipei the company shipped Nemotron 3 Ultra, the largest model in its open weights family, with roughly 550 billion total parameters and about 55 billion active per token through a hybrid Mamba and Transformer mixture of experts design. It uses tricks worth naming: a latent mixture of experts that compresses tokens before routing, multi token prediction for faster generation, and Mamba layers that make a one million token context window practical rather than theoretical.

The honest scorecard is more interesting than a victory lap. By the independent Artificial Analysis Intelligence Index, Nemotron 3 Ultra scores 48, which makes it the strongest open weights model to come out of the United States, comfortably ahead of Google's Gemma 4 and OpenAI's open release. It is not the overall open frontier. That title sits with the Chinese model Kimi K2.6 at 54. NVIDIA's own comparison slide was refreshingly unguarded about this, showing the Ultra winning on agent productivity, instruction following and long context while trailing rivals on long horizon planning, coding and broad knowledge work.

Where Nemotron 3 Ultra leads

Top United States open weights model, index score 48
Long context handling near the top of the field
Agent productivity and instruction following
Speed, with 300 plus tokens per second served

Where it trails

Overall open frontier, held by Kimi K2.6 at 54
Long horizon planning, behind GLM
Coding, behind Kimi
Broad knowledge work, behind GLM

Announcing NVIDIA Nemotron 3 Ultra benchmark slide comparing the 550 billion parameter model against GLM 5.1, Kimi K2.6 and Qwen3.5 across agent productivity, long horizon planning, coding, instruction following, knowledge work, professional tasks and long context. — **NVIDIA's own benchmark slide.** Note the bold numbers in the rival columns. A chip company that now ships a frontier class open model, and is candid that it does not win every row. Source: NVIDIA Computex 2026 keynote.

There is a delicious tension in all this. The same week NVIDIA shipped a model that competes with what software companies sell, Huang stood up and argued that software companies are more valuable than ever, not less. His evidence was that useful AI has arrived and that developer activity is surging, with commits on platforms like GitHub running far ahead of last year. His logic is consistent, if self serving: agents are only as useful as the software they can act on, so an explosion of agents is an explosion of demand for good software, and for the compute underneath it. Tokens, he said, are now profitable units of revenue. NVIDIA intends to sell you the tokens, the chip that generates them, and increasingly the model in between.

07And then it stood up and grew a body

The last act was the most cinematic and, fittingly, the least finished. NVIDIA announced the Isaac GR00T Reference Humanoid Robot, billed as the first open humanoid reference design, built on the Jetson Thor compute module and the Isaac GR00T software platform. The body comes from Unitree, the H2 Plus chassis, fitted with Sharpa five finger hands for dexterous manipulation. The brain is a Jetson AGX Thor with a Blackwell GPU and, once again, 128GB of unified memory, the same number that runs through this entire keynote like a watermark.

The point is openness as a land grab. By giving research labs at Ai2, ETH Zurich, Stanford and UC San Diego a shared, full stack platform spanning simulation, training, evaluation and on robot inference, NVIDIA is doing for humanoid robotics what CUDA did for AI: making its tools the default that an entire field learns on. Pair it with Cosmos 3, the open physical AI model NVIDIA also showed, and the ambition is clear. Huang called humanoid robots a multitrillion dollar opportunity, and he was not being shy about who intends to own the substrate.

// stay skeptical As has become a tradition at these humanoid reveals, there was no working robot on stage, only renders and a reference design. The hardware is promised from Unitree in late 2026, with developer workflows arriving on GitHub and Hugging Face. A reference design and an open platform are real and useful things. A robot that reliably does useful work in a warehouse is a much harder thing, and the gap between the two is where most robotics hype quietly goes to die.

08The map of the empire

Step back from the individual announcements and a single picture forms. NVIDIA now has a product at every altitude of computing, from the chip on your lap to the factory that trains the models to the robot that carries the work into the world, and the thing binding all of it together is not any one piece of silicon. It is CUDA and the software platform wrapped around it, the layer that makes the laptop, the rack and the robot speak the same language. That is the moat, and that is why a rival cannot answer this with a single faster chip.

NVIDIA AI Platform keynote slide showing the full software stack, with a row of CUDA-X libraries including cuDNN, Magnum IO, cuOpt, cuVS, cuDF, Omniverse, TRT-LLM, Dynamo, NeMo, PhysicsNeMo, Cosmos and Nemotron above a hardware row spanning Hopper, RTX PRO Blackwell, Grace Blackwell, Vera Rubin and ConnectX networking, under an NVIDIA and Nebius partnership banner. — **The platform, in one slide.** The libraries across the top and the hardware along the bottom are the same CUDA stack, drawn the way NVIDIA draws it. The Nebius banner up top is the tell: even the clouds renting this out plug into the same layer. Source: NVIDIA Computex 2026 keynote.

One company, every layer. The announcements are not separate products, they are floors of the same building, and CUDA is the staircase that connects them. Diagram by Silicon Tales.

09The verdict

So, is NVIDIA still a GPU company? Only in the way that Amazon is still a bookshop. The graphics processor is the seed everything grew from, and it remains at the center, but the company that walked off stage in Taipei sells the chip in your laptop, the engine in the data center, the framework for the factory, the model that thinks and the robot that acts. The strategy has a name that Huang likes, extreme co-design, and the unnerving thing for competitors is that it appears to work. When one company controls every layer, it can squeeze gains out of the seams between them that nobody buying parts from four vendors can match.

The risks are the obvious flip side. A computing world this dependent on one company is fragile, and regulators in more than one capital are already circling. The distance between the leak and the truth on RTX Spark, and the missing robot on stage, are reminders that the keynote is a sales document, not a delivery receipt. The local AI claims need reviewers, the gaming numbers need testing, and the robot needs to actually stand up and do something useful before the multitrillion dollar language means anything.

But the direction is not in doubt, and pretending otherwise is the real mistake. For 40 years we built computers for people. NVIDIA just spent two hours arguing, with a product at every level to back it up, that the next computer is built for the agent, and that it intends to be that computer top to bottom. A long time ago, NVIDIA used to be a GPU company. After Taipei, calling it one is like calling the ocean a puddle.

10Frequently asked questions

What is the NVIDIA RTX Spark?

RTX Spark is NVIDIA's first consumer PC superchip, a single Windows on Arm system on chip that pairs a Grace CPU of up to 20 Arm cores with a Blackwell GPU of up to 6,144 CUDA cores and up to 128GB of unified LPDDR5X memory. It is built on the same GB10 foundation as the DGX Spark and is aimed at running AI agents, creator workloads and games locally. RTX Spark laptops and mini PCs are due in fall 2026.

Is RTX Spark the same chip as the leaked NVIDIA N1 and N1X?

Yes. The chips that circulated for months under the codenames N1 and N1X are the silicon NVIDIA officially branded RTX Spark. The leaked spec sheets were not accurate, though. One widely shared leak claimed a top configuration with more than 20,000 CUDA cores, while the confirmed top configuration is 6,144 CUDA cores, a gap of more than three times.

How powerful is the RTX Spark GPU?

The top configuration carries 6,144 Blackwell CUDA cores, the same count as a desktop GeForce RTX 5070. Because it runs inside a thin laptop or small desktop, real performance will land below the desktop card, and NVIDIA's gaming claim of 1440p above 100 frames per second has not yet been independently tested.

When can I buy an RTX Spark laptop?

NVIDIA says RTX Spark laptops and compact desktops will be available beginning in fall 2026, with around 30 laptops and more than 10 desktops from Acer, Asus, Dell, Gigabyte, HP, Lenovo and MSI. Microsoft's Surface Laptop Ultra is expected later in the year. Pricing has not been announced, and the first wave targets the premium end of the market.

Should you buy an RTX Spark, or is it overhyped?

For most buyers, the honest answer is no. The headline feature, running a large language model locally, needs the expensive 128GB configuration, and a cloud API delivers a faster, smarter model for a fraction of the upfront cost. Gamers get more frames per dollar from a conventional laptop, and Windows on Arm still struggles with software that was never compiled for it. RTX Spark makes sense mainly if you specifically need local AI for privacy or offline reasons, or want one quiet machine that does AI, creation and gaming. The intriguing wildcard is that NVIDIA is the first to pair Arm Windows with a real GPU, the missing ingredient that sank the earlier attempts.

What is Nemotron 3 Ultra and is it the best AI model?

Nemotron 3 Ultra is NVIDIA's largest open weights model, with roughly 550 billion total parameters and about 55 billion active per token. By the Artificial Analysis Intelligence Index it is the strongest United States open weights model, scoring 48, ahead of Gemma 4 and gpt-oss. It is not the overall open frontier, which the Chinese model Kimi K2.6 holds at 54, and NVIDIA's own slide shows it trailing rivals on planning, coding and knowledge work.

Is NVIDIA still a GPU company?

Not in any narrow sense. NVIDIA now designs the CPU and GPU inside consumer laptops with RTX Spark, the rack scale engine in the data center with Vera Rubin, the framework for whole AI factories with DSX, the open models that reason with Nemotron 3 and Cosmos 3, and the reference design for humanoid robots with Isaac GR00T. The GPU is still at the center, but it has become the entry point to a full stack bound together by CUDA.