Decart Lands $300M Series B to Break Nvidia CUDA Lock-In

Decart, the Israeli AI software startup, announced a $300 million Series B today (May 18, 2026) at a $4 billion valuation, and the most interesting detail isn’t the headline number but the investor list. Nvidia participated alongside Radical Ventures, Atreides Management, Sequoia, Benchmark, Zeev Ventures, and individual investors including Andrej Karpathy, Michael Eisner, and the Yamauchi family (founders of Nintendo). The product Decart is selling is cross-chip portability for AI models: specifically, a software layer called DOS that lets AI workloads switch between Nvidia, Google, and Amazon chips without rewriting code. In other words, Nvidia is funding the company building the most credible threat to its own CUDA software lock-in. Decart’s $300M Series B is the strangest strategic bet of 2026, and it is worth understanding.

What’s actually new

The funding round itself is straightforward: $300 million led by Radical Ventures at a $4 billion valuation, bringing Decart’s total funding to $450M+. What’s distinctive is the product strategy. Decart ships three product lines: DOS (the AI optimization layer that makes models portable across chip vendors), Lucy (a real-time interactive video world model), and Oasis (a world model for physical AI and robotics simulations). The DOS product is the strategically critical one for the broader AI infrastructure landscape. It explicitly targets Nvidia’s CUDA moat — the software ecosystem that has locked AI workloads to Nvidia GPUs for over a decade.

Decart’s partnership with Amazon, announced as part of the round, is the practical demonstration of the value proposition: enterprise customers can use Decart’s technology to run AI applications across Amazon Web Services chips (Trainium, Inferentia) and Nvidia GPUs interchangeably, optimizing for cost, latency, or availability rather than being locked to any single vendor’s hardware. The strategic significance is large: if Decart’s DOS works as advertised at scale, it removes the primary technical reason buyers stay locked into Nvidia, even when AMD, Google TPUs, or AWS chips are cheaper or more available.

Why it matters

  • The CUDA lock-in is the most valuable software moat in tech. Nvidia’s market dominance isn’t just hardware — it’s the years of CUDA-specific tooling, libraries, and developer mindshare that make migrating away painful. A software layer that genuinely abstracts CUDA changes the strategic landscape for every AI vendor.
  • Nvidia’s investment is the most-discussed strategic puzzle of the week. Why does Nvidia fund a company whose product erodes Nvidia’s own moat? Most plausible reading: Nvidia gets a seat at the table on portability, can shape the standard, and gains visibility into customer churn risk. Alternative reading: Nvidia knows portability is coming anyway and would rather be inside than outside.
  • Amazon’s partnership is a real bet against Nvidia. AWS has been steadily building its own AI chips (Trainium, Inferentia); the Decart partnership accelerates AWS’s ability to offer an Nvidia-equivalent AI experience on AWS-controlled silicon.
  • It validates the cross-chip thesis. Multiple startups have tried to build CUDA alternatives or portability layers; few have raised $300M at $4B. Decart’s traction signals real enterprise demand.
  • Investor list signals confidence. Sequoia, Benchmark, Karpathy, and the Yamauchi family of Nintendo don’t typically co-invest in vaporware. The technical and strategic cases both got serious diligence.
  • Israeli AI startup momentum continues. Decart joins a wave of well-funded Israeli AI infrastructure companies; the country’s AI ecosystem is having a defining year.

How to use it today

Decart’s DOS product is enterprise-targeted; most direct users will be infrastructure teams at companies running significant AI workloads. Below are concrete actions to take based on the announcement.

  1. If you’re an AI infrastructure leader: contact Decart for an evaluation. The pitch — cost reduction, capacity flexibility, vendor risk hedging — is potentially compelling for workloads above a meaningful spend threshold. The economic case for portability layers strengthens at scale.
  2. If you’re running production AI workloads on Nvidia only: document your CUDA-specific dependencies. CUDA kernels, NCCL libraries, TensorRT optimizations, custom Triton kernels — each represents potential portability complexity. Knowing your dependency surface is the prerequisite for any portability decision.
  3. If you’re an enterprise IT leader: factor potential portability into your AI infrastructure planning. The next 12-24 months will see meaningful changes in the cost-quality-flexibility frontier for AI workloads.
  4. If you’re a startup founder: study Decart’s product positioning. The pattern of “this product is unblocked by a fundamental shift in the market” applies to many opportunities in 2026.
  5. If you’re an investor: the Decart round, together with adjacent companies like Modular, Lightmatter, and Tenstorrent, signals real venture conviction in CUDA-alternative infrastructure. The category is investable.
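
For item 2 in the list above, a rough text scan of the codebase is one way to start documenting CUDA-specific dependencies. The sketch below is a minimal illustration, not a complete audit: the pattern list, category names, and file suffixes are assumptions chosen for this example, and line-based regex matching will miss dependencies hidden in build files or compiled extensions.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical pattern list: illustrative markers of Nvidia-specific code,
# not an exhaustive or authoritative inventory.
PATTERNS = {
    "custom CUDA kernel": re.compile(r"__global__|<<<.*>>>"),
    "NCCL": re.compile(r"\bnccl", re.IGNORECASE),
    "TensorRT": re.compile(r"\btensorrt\b|\bnvinfer", re.IGNORECASE),
    "Triton kernel": re.compile(r"@triton\.jit|\bimport triton\b"),
    "hard-coded cuda device": re.compile(r"\.cuda\(\)|[\"']cuda(:\d+)?[\"']"),
}

# Assumed set of source-file suffixes worth scanning.
SUFFIXES = {".py", ".cu", ".cuh", ".cpp", ".cc", ".h", ".hpp"}


def scan_text(text: str) -> Counter:
    """Count the lines in `text` that match each dependency category."""
    hits = Counter()
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name] += 1
    return hits


def scan_repo(root: str) -> Counter:
    """Aggregate per-category line counts over source files under `root`."""
    total = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in SUFFIXES:
            total += scan_text(path.read_text(errors="ignore"))
    return total
```

Calling `scan_repo(".")` from a repository root returns a `Counter` keyed by category. The counts are a starting point for a manual review, not a measure of migration effort.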
# What CUDA portability typically looks like under the hood:

# Traditional Nvidia-CUDA stack:
#   Application
#     -> PyTorch / TensorFlow / JAX
#       -> CUDA libraries (cuDNN, cuBLAS, NCCL)
#         -> CUDA runtime
#           -> Nvidia GPU hardware

# Portability layers like Decart DOS:
#   Application
#     -> PyTorch / TensorFlow / JAX (unchanged)
#       -> Decart DOS layer  ←  intercepts here
#         -> CUDA libraries OR alternative (HIP, XLA, etc.)
#           -> CUDA runtime OR alternative
#             -> Nvidia / AMD / Google TPU / AWS Trainium / etc.

# What the portability layer must handle:
# - Kernel translation (CUDA kernels -> HIP, OpenCL, or equivalent)
# - Memory model differences (GPU memory hierarchies vary)
# - Performance tuning per backend (one-size-fits-all rarely optimal)
# - Distributed training primitives (NCCL alternatives)
# - Mixed-precision and quantization handling
# - Operator coverage (every PyTorch op needs target-backend equivalent)

# Why this is hard:
# - CUDA has years of optimization that competitors don't match
# - Custom kernels in CUDA may not have direct equivalents
# - Performance can degrade significantly with naive translation
# - New ops added to PyTorch / TF need timely portability layer updates

# Why it's solvable:
# - Major models converge on a relatively small set of operations
# - Compiler advances (MLIR, OpenXLA) reduce per-backend work
# - Hardware vendors increasingly publish optimized libraries
# - Enterprise demand makes the engineering investment worthwhile

# What to ask Decart (or competitors) before adopting:
# 1. Which models do you support today, with what performance?
# 2. What's the performance delta vs native CUDA on Nvidia?
# 3. What's the delta vs native HIP on AMD?
# 4. How do you handle custom CUDA kernels?
# 5. What's the migration cost from CUDA-native to your layer?
# 6. What enterprises are using this in production?
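
To make the "intercepts here" step more concrete, here is a toy dispatcher. It sketches the general pattern only and is not Decart’s implementation, which the announcement does not describe; the `Dispatcher` class, the backend names, and the kernels are all hypothetical. It illustrates two items from the list above: per-backend kernels and operator coverage, with a portable fallback when a backend lacks a tuned kernel.

```python
from typing import Callable, Dict, List, Tuple

Kernel = Callable[[List[float], List[float]], List[float]]


class Dispatcher:
    """Toy op router: (op, backend) -> kernel, with a portable fallback."""

    def __init__(self, fallback_backend: str = "reference") -> None:
        self._kernels: Dict[Tuple[str, str], Kernel] = {}
        self._fallback = fallback_backend

    def register(self, op: str, backend: str, kernel: Kernel) -> None:
        self._kernels[(op, backend)] = kernel

    def resolve(self, op: str, backend: str) -> Tuple[str, Kernel]:
        """Return (backend actually used, kernel), falling back if needed."""
        for candidate in (backend, self._fallback):
            kernel = self._kernels.get((op, candidate))
            if kernel is not None:
                return candidate, kernel
        raise NotImplementedError(f"no kernel for op {op!r} on {backend!r}")

    def coverage_gaps(self, ops: List[str], backend: str) -> List[str]:
        """Ops with no backend-specific kernel (served by fallback or missing)."""
        return [op for op in ops if (op, backend) not in self._kernels]


def add_reference(a: List[float], b: List[float]) -> List[float]:
    # Plain-Python stand-in for a slow but portable implementation.
    return [x + y for x, y in zip(a, b)]


def add_tuned(a: List[float], b: List[float]) -> List[float]:
    # Stand-in for a vendor-tuned kernel: same result, different code path.
    return list(map(sum, zip(a, b)))


dispatcher = Dispatcher()
dispatcher.register("add", "reference", add_reference)
dispatcher.register("add", "nvidia", add_tuned)

# No "trainium" kernel is registered, so the call falls back to "reference".
used, kernel = dispatcher.resolve("add", "trainium")
result = kernel([1.0, 2.0], [3.0, 4.0])
```

The fallback keeps a model running when coverage is incomplete, at the price of the performance degradation the list warns about. `coverage_gaps` reports which ops would take that slow path on a given backend.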

For developers and platform engineers, the immediate question is whether portability is worth the integration cost. The honest answer for most teams in 2026: not yet, but maybe in 18-24 months. Native CUDA performance and tooling maturity still beat portability layers for most workloads. Decart’s bet is that this gap closes meaningfully through 2027-2028, and that enterprises with significant AI spend will demand portability options once the gap is acceptable.
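
One back-of-the-envelope way to frame "worth the integration cost" is a break-even calculation. The function below is a sketch under simplifying assumptions: a one-time migration cost, a constant monthly compute spend, and a fixed price discount and performance penalty on the alternative hardware. The function name and parameters are hypothetical, and any inputs you feed it are your own estimates, not figures from Decart or any chip vendor.

```python
def breakeven_months(
    monthly_spend: float,
    price_discount: float,
    perf_penalty: float,
    migration_cost: float,
) -> float:
    """Months until a migration pays for itself; inf if it never does.

    price_discount: fractional price cut per chip-hour (0.4 = 40% cheaper).
    perf_penalty: fractional throughput loss vs native CUDA (0.2 = 20% slower).
    """
    if not (0 <= price_discount < 1 and 0 <= perf_penalty < 1):
        raise ValueError("price_discount and perf_penalty must be in [0, 1)")
    # Slower hardware needs more chip-hours to do the same amount of work.
    new_spend = monthly_spend * (1 - price_discount) / (1 - perf_penalty)
    monthly_saving = monthly_spend - new_spend
    if monthly_saving <= 0:
        return float("inf")
    return migration_cost / monthly_saving
```

With placeholder inputs of $100,000 monthly spend, a 40% price discount, a 20% performance penalty, and a $300,000 migration cost, the new spend is $75,000 a month and break-even comes at about 12 months. With a 10% discount and the same 20% penalty, the alternative costs more per unit of work and the function returns infinity.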

How it compares

Decart isn’t alone in the cross-chip portability space. The table below maps the competitive landscape for AI infrastructure portability in 2026.

| Company / Project | Approach | Backing | Maturity |
|---|---|---|---|
| Decart (DOS) | Software portability layer + world models | Radical, Nvidia, Sequoia, Benchmark, AWS partnership | Series B; enterprise pilots |
| Modular (MAX, Mojo) | New language + portability runtime | GV, General Catalyst, others | Productized; growing adoption |
| OpenXLA | Open-source compiler ecosystem | Google, AMD, Intel, others | Used by Google internally; spreading |
| MLIR / LLVM | Compiler infrastructure | Open source, vendor-supported | Foundation layer; widely used |
| AMD ROCm + HIP | CUDA-compatible API for AMD GPUs | AMD | Mature on Linux; improving on Windows |
| Tenstorrent (TT-NN) | Custom chip + custom stack | Jim Keller; major investors | Hardware shipping; software maturing |
| Cerebras Software Platform | Vendor-specific stack for wafer-scale | Self | Production for specific customers |

The clearest differentiator for Decart is its combination of cross-chip software (DOS) with strategic partnerships (Amazon) and major investor backing. Modular has been working on similar ambitions for longer but has had a more language-focused pitch. OpenXLA and MLIR are foundational infrastructure that other companies build on. AMD ROCm + HIP is the most direct CUDA-compatible alternative but is AMD-specific. Each plays a different role in the larger ecosystem; Decart’s positioning is the most direct attack on Nvidia’s software moat.

What’s next

Three threads to watch over the next 12-24 months. First, customer reference cases. Decart announcing one or two named enterprise customers running real production workloads through DOS would be the strongest validation. The AWS partnership alone isn’t enough; we need to see workloads moving. Second, performance benchmarks. Independent third-party benchmarks comparing DOS to native CUDA, native HIP, and other portability layers will determine credibility. Promises matter less than measured throughput-per-dollar across chip types. Third, Nvidia’s response. Nvidia’s investment can be read as alignment or as positional surveillance. How Nvidia treats Decart’s product over the next year (cooperating on integration, competing with its own portability solutions, or doing nothing) will show whether it sees portability as an opportunity or a threat.
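
The throughput-per-dollar comparison mentioned above is simple arithmetic once measurements exist. The following is a minimal sketch; the `Measurement` fields and function names are hypothetical, and tokens per second is just one possible throughput unit.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Measurement:
    name: str                 # label for the setup being measured
    tokens_per_second: float  # measured throughput on the workload
    dollars_per_hour: float   # price of the instance or chip-hour


def tokens_per_dollar(m: Measurement) -> float:
    """Tokens processed per dollar of compute at the measured throughput."""
    return m.tokens_per_second * 3600 / m.dollars_per_hour


def relative_to_baseline(
    baseline: Measurement, others: List[Measurement]
) -> Dict[str, float]:
    """Ratio of each option's tokens-per-dollar to the baseline (1.0 = parity)."""
    base = tokens_per_dollar(baseline)
    return {m.name: tokens_per_dollar(m) / base for m in others}
```

As an illustration with placeholder numbers (not benchmark results): a baseline at 1,000 tokens per second and $4 per hour yields 900,000 tokens per dollar, while an option at 800 tokens per second and $2 per hour yields 1,440,000. That is a ratio of 1.6, so the slower option wins on cost despite the lower raw throughput.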

For the broader AI infrastructure market, the Decart round signals continued investor conviction that the Nvidia hardware monopoly will be challenged at the software layer rather than purely at the hardware layer. Hardware competitors (AMD, Intel, Google TPUs, AWS Trainium) have spent years catching up on raw silicon performance; software portability has lagged. If Decart and adjacent companies succeed, the second half of the 2020s could see meaningfully more chip diversity in AI workloads than the first half. That changes pricing dynamics, vendor power dynamics, and the strategic positioning of every AI-native company that runs significant compute.

Frequently Asked Questions

Is Decart’s DOS available to use today?

DOS is available to selected enterprise customers through direct contact with Decart. It is not yet a self-serve product for individual developers. Watch for productization through 2026-2027 as the company scales.

Why would Nvidia invest in a company trying to break their lock-in?

Several theories. First, Nvidia may believe portability is coming anyway and prefers to be inside the deal rather than outside. Second, having a seat lets Nvidia influence direction (compatibility, performance targets). Third, Decart’s adjacent products (world models) make it a real Nvidia customer regardless of DOS. Fourth, the financial return on the investment may matter more to Nvidia than the strategic threat, given Decart’s $4B valuation could become $40B if the bet plays out.

Does Decart’s DOS make AMD and Intel competitive with Nvidia overnight?

No. Hardware performance still matters; software portability is necessary but not sufficient. AMD’s MI400 series and Intel’s Gaudi offerings are technically competitive on some workloads; portability layers like DOS make adoption easier but don’t change underlying hardware capability.

Should I bet my AI infrastructure strategy on Decart’s DOS?

Not yet. Pilot it; evaluate performance and reliability; understand integration cost. For mission-critical AI workloads in 2026, native CUDA on Nvidia GPUs remains the safer choice. The interesting decision will be in 2027-2028 if DOS matures further.

How does Decart compare to Modular’s approach?

Modular is building both a new language (Mojo) and a portability runtime (MAX). Decart focuses more narrowly on portability without the language layer. Different bets on what the bottleneck is for cross-chip AI infrastructure. Both can succeed; both could be acquired; both could be displaced by open-source alternatives like OpenXLA.

What does this mean for Nvidia’s stock and competitive position?

Not material in the short term — Nvidia’s revenue is dominated by data center GPU sales that won’t be affected by Decart’s product for years. In the medium term (2-4 years), if portability layers mature broadly, Nvidia faces real pricing pressure as customers can credibly threaten to move workloads. Nvidia’s strategic response is one of the most-watched questions in semiconductor strategy through 2027.
