XDOF Raises $70M to Build the Data Backbone for the Robotics AI Race

Nobody talks about the boring part of building robots. Everyone wants to discuss the model, the hardware, and the sci-fi future where machines fold your laundry. But the actual unglamorous work of collecting the data that makes any of that possible? That part gets skipped.

XDOF is not skipping it.

The robotics training data startup raised $70 million in funding and emerged from stealth on June 17, 2026. It is building the infrastructure that AI labs need to train robots on real-world physical tasks. And the timing is not accidental. Just weeks before the announcement, OpenAI confirmed it was reviving its robotics program, the same one it shut down in 2021. Physical AI is the next major frontier, and every serious lab knows it.

What Is XDOF and What Does It Do?

XDOF (pronounced “ecks-doff”) was founded in October 2024 by Philipp Wu (CEO), Fred Shentu (CTO), and Nemo Jin (COO). The name is a play on “degrees of freedom,” which refers to the number of independent movements a robot can make. The X means no limits on those movements. That is the ambition.

The team comes from Covariant, Meta, and Tesla. Before starting XDOF, Wu and Shentu worked on a UC Berkeley project called GELLO, a low-cost teleoperation system that lets human operators control robotic arms to generate training data. The GELLO paper became widely cited in robotics circles. A lot of research groups had the same problem of collecting robot training data affordably, and GELLO proved the solution had legs.

So they built a company around it.

XDOF does not build robots. It does not build AI models. What it builds is the engine underneath both. Data pipelines. Collection tools. Annotation systems. The stuff that frontier labs technically could build themselves but really do not want to.

Why Did XDOF Raise $70 Million?

Here is the kicker. Training a language model is hard, but you can scrape the internet for data. Training a robot to interact with the physical world? You cannot scrape anything. You have to go out and physically collect it. With real hardware. In real environments. At scale.

That costs money. A lot of it.

XDOF needs fleets of robots. Global teams of trained human teleoperators. Proprietary wearable sensors that it is building from scratch. And full data annotation pipelines running in parallel. CEO Philipp Wu has been direct about the scope of the operation: they need data infrastructure spanning hundreds of thousands of square meters with hundreds of robots requiring ongoing maintenance and calibration.

The $70 million makes that possible. And it needs to happen fast, because the window will not stay open forever.

Who Invested in XDOF’s $70M Funding Round?

The investor list matters here. This was not a quiet seed round from a couple of angels. XDOF’s $70M raise was backed by:

Thrive Capital
Spark Capital
Andreessen Horowitz (a16z)
Lux Capital
WndrCo

That is a serious lineup. These firms are not writing checks into robotics data infrastructure out of curiosity. They see a standalone category forming, one that sits separate from model development and has its own compounding value. When a16z, Thrive, and Spark are all in the same round, that is a signal worth paying attention to.

What Problem Is XDOF Solving for AI and Robotics Labs?

Let’s be honest about what is happening in the AI industry right now. Every major lab wants robots. OpenAI is back in the game. Google DeepMind is pushing hard. The competitive pressure is real, and it is accelerating.

But wanting robots and being able to train them are two very different things.

To train a capable robot, you need physical world data at a scale that simply does not exist yet. And building the systems to generate that data in-house means warehouse operations, hardware fleets, trained staff, and annotation infrastructure running around the clock. Most labs would rather focus on the model.

So they outsource the rest. XDOF is where they outsource it to.

The company’s core argument is that physical AI’s biggest bottleneck is not compute and not model architecture. It is the data feedback loop. Get that right, and everything else can move. XDOF already has 20 active customers, including several frontier AI labs that Wu is not yet able to name publicly.

What Is the ABC-130K Dataset XDOF Just Released?

At the same time as the funding announcement, XDOF dropped ABC-130K. It is the largest open-source bimanual robot manipulation dataset in the world, built in collaboration with researchers from UC Berkeley, Carnegie Mellon, MIT, and Amazon FAR.

The numbers:

130,000 trajectories of robotic manipulation data
300 hours of simulation data
100 hours of evaluation data

XDOF has already used this dataset to train robots on precision tasks. Folding T-shirts. Flattening cardboard boxes. Placing AirPods into their cases. The dataset is live on Hugging Face, free and open to anyone in the research community.

And there is a strategic logic to releasing it for free. Before you charge a frontier lab for bespoke data collection, you show them what your data quality actually looks like. ABC-130K is that proof.

How Does XDOF Collect Robot Training Data?

XDOF runs what Wu calls a three-tier data pyramid.

Tier 1 is the most valuable. Human operators directly control the specific robot being trained. Task-specific, high-fidelity, no shortcuts.

Tier 2 uses GELLO-style devices to collect broader manipulation data applicable across multiple robot types.

Tier 3 is egocentric data. Humans wear XDOF’s proprietary sensors and perform everyday tasks. First-person motion data for hand-tracking and physical interaction learning.

Hardware choices cascade in ways most people underestimate. Pick the wrong camera, and the data will have problems you will not spot until the robot starts failing in the field. Wu is clear on this. The pipeline has to be designed right from the beginning, or the data is quietly broken from day one.

Why Are Top AI Labs Paying XDOF Instead of Doing It Themselves?

Because building this in-house is not a side project. It is an entire company.

The reality is, most AI labs are not in the business of managing warehouse-scale robot fleets, recruiting and training global teams of human teleoperators, or maintaining calibration systems across hundreds of machines. Their job is to build great models. Not to run operations at the scale of a mid-size logistics company.

So they pay someone else to do it. Right now, that someone is XDOF.

With 60 employees, 20 customers, $70M in the bank, and the largest open-source bimanual robot manipulation dataset published so far, XDOF is not betting on the physical AI wave. It is building the infrastructure that the wave runs on. And the labs writing the checks already know it.

The data bottleneck in physical AI is real. Everyone in the industry feels it. XDOF just raised $70 million and released the largest open-source bimanual robot manipulation dataset on the same day. That is not a coincidence. That is a company that knows exactly what problem it is solving and exactly how fast it needs to move.

Swapnil Gupta

Hi friends, this is Swapnil. I love reading and sharing knowledge, and I currently work as a content writer at startupsunion.com.