What Is XDOF Robotics: The $70M Startup Building the Data Engine Behind the Future of AI Robots

Everyone is talking about robots right now. OpenAI is back in the game. Figure AI is raising hundreds of millions. Google DeepMind is pushing physical AI harder than ever. But here is something most people are glossing over: none of these companies can actually train their robots properly. Not because the models are bad. Not because the hardware is not ready.

Because the data does not exist.

That is the problem XDOF Robotics came out of stealth in June 2026 to solve. And honestly? It might be the most important robotics company nobody had heard of until now. With $70 million raised and 20 paying customers already on the books, XDOF is not building the robots. It is building what makes robots actually work.

What Is XDOF Robotics and Why Is Everyone Talking About It?

XDOF Robotics is a Berkeley-based startup that collects, annotates, and delivers training data for physical AI systems. Founded in October 2024 by Philipp Wu, Fred Shentu, and Nemo Jin, the company officially came out of stealth on June 17, 2026, alongside a major funding announcement and the release of the largest open-source bimanual robot manipulation dataset published to date.

The logic behind XDOF Robotics is straightforward once you hear it. Language models got smart because the internet existed. Decades of human writing, sitting there, ready to train on. Robots do not have that luxury. Teaching a robotic arm to pick up a glass, sort a package, or fold a shirt requires physical interaction data. The kind where you actually have to go out and collect it. In the real world. At scale.

YouTube videos? Low fidelity. Gig worker footage? Hard to reconcile with real physical laws. The reality is, nobody had built the right infrastructure to solve this at scale. So Wu, Shentu, and Jin built it themselves.

XDOF Robotics is part hardware company, part operations firm, part data platform. It designs the tools to collect robot interaction data, runs the physical operations behind that collection, then cleans and annotates everything into datasets that are ready for model training. For AI labs that want capable robots but have zero interest in managing warehouse-scale data operations, XDOF is the answer.

What Does XDOF Mean? The Story Behind the Name

The name is a direct reference to a core robotics concept: degrees of freedom (DOF). In robotics, degrees of freedom describe how many independent movements a robot can perform. A human arm from shoulder to wrist has seven. Figure AI’s latest humanoid robot reaches 30.
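To make the counting concrete, here is a minimal Python sketch that adds up the degrees of freedom of a serial chain of joints. The `Joint` class, the `total_dof` helper, and the per-joint breakdown are illustrative assumptions based on a common simplified model of the human arm. They are not taken from XDOF or any robotics library.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Joint:
    """A joint in a serial chain (hypothetical helper type)."""
    name: str
    dof: int  # number of independent axes of motion this joint contributes


def total_dof(joints: List[Joint]) -> int:
    """Sum the degrees of freedom contributed by every joint in the chain."""
    return sum(j.dof for j in joints)


# Assumed simplified model of the human arm, shoulder to wrist.
HUMAN_ARM = [
    Joint("shoulder", 3),  # flexion/extension, abduction/adduction, rotation
    Joint("elbow", 1),     # flexion/extension
    Joint("forearm", 1),   # pronation/supination
    Joint("wrist", 2),     # flexion/extension, radial/ulnar deviation
]

print(total_dof(HUMAN_ARM))  # 7
```

The same helper works for any chain. A robot with more joints, or joints with more axes, simply sums to a larger number.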

The X is the whole point. Wu describes it as standing for “arbitrary degrees of freedom. Unlimited degrees of freedom.” No ceiling. No fixed constraint on what robots should eventually be able to do.

It is a quietly ambitious name. Most startups pick something that sounds fast or clever. XDOF Robotics picked a term from mechanical engineering that basically says: we are building for everything, not just the narrow version of robotics that exists right now. And given the scale of what they are attempting, that ambition does not feel like marketing. It feels accurate.

The bottom line is this. XDOF Robotics is solving the problem that every serious physical AI company will eventually have to face. Training data for robots is scarce, structurally hard to collect, and brutally complex to manage at scale. With a $70 million raise, a world-class investor base, a growing list of frontier lab customers, and the largest open-source robotics dataset ever released, XDOF is making its case quietly and clearly. Not with noise. With infrastructure.

XDOF Raises $70 Million: Who Backed It and Why

In June 2026, XDOF Robotics closed a $70 million funding round. The backers include Thrive Capital, Spark Capital, Andreessen Horowitz, Lux Capital, and WndrCo. That is not a random collection of names. That is a who’s who of tier-one venture capital, all writing checks into the same bet at the same time.

So why now? Timing matters here.

Just weeks before XDOF’s announcement, OpenAI confirmed it was reviving its robotics program after shutting it down in 2021. That single decision sent a signal across the entire industry. Physical AI is no longer a side project sitting in a lab somewhere. It is a core priority for the most powerful AI companies in the world. And when that happens, the ecosystem around it moves fast.

XDOF Robotics had around 60 employees at launch. But here is the kicker: it already had approximately 20 paying customers, including several frontier AI labs. None of them have been named publicly. But the fact that multiple leading labs are already outsourcing this to XDOF instead of building it themselves says everything you need to know about how hard the problem actually is.

Why Robot Training Data Is the Biggest Problem in AI Right Now

Let’s be honest about something the AI industry does not say loudly enough. The hard part of building robots is not the model. It is not the chip. It is the data.

And not just any data.

XDOF Robotics was founded on a very specific insight: physical interaction data is structurally different from every other kind of AI training data. You cannot scrape it. You cannot crowdsource it easily. You have to go out, set up robots or operators in real environments, capture what happens when objects are touched and moved, and then make sense of all of it.

Philipp Wu hit this wall himself as a PhD student at UC Berkeley, where his research focused on teaching robots to learn from large-scale datasets.

“We didn’t have large-scale data to work with,” Wu said. “There was this chicken-and-egg problem. We first needed to actually collect data before we could even ask how to train a foundation model for robotics.”

And that is not just an academic headache. Building an in-house data operation means warehouse space. Robot fleets. Maintenance. Operator training. Calibration. It is operationally brutal. It is expensive. And it pulls focus away from the actual research these labs are trying to do. So they pay XDOF to do it instead.

How XDOF Collects Real-World Data to Train Robots

XDOF Robotics runs a three-tier data collection system. Each tier captures something different. Together they cover the full range of what robot training actually needs.

The first tier is deployment-robot teleoperation. Real robots, real environments, real interaction data captured at the source. The second tier uses the GELLO device, a low-cost teleoperation tool that lets a human operator physically control a robotic arm to generate training data. Wu and co-founder Fred Shentu built GELLO during their time at UC Berkeley. It became widely cited across the robotics research community because it addressed a shared problem nobody else had solved cheaply.

The third tier is different: egocentric wearable sensors worn by human workers performing everyday tasks. This captures the natural variability of human movement, the kind that is genuinely hard to replicate in controlled settings. And that variability is critical. A robot that has only learned from perfectly staged environments fails the moment something unexpected happens in the real world.

But data collection is only part of it. XDOF also handles cleaning, labeling, and annotation. Raw interaction footage is not training-ready. Someone has to structure it, label it, and make sure it actually teaches the model what you want it to learn. That full-stack capability is what separates XDOF Robotics from a simple data vendor.
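As a rough illustration of the pipeline described above, the Python sketch below tags each recording with its collection tier and keeps only episodes that contain data and have been labeled. Everything in it is hypothetical: the `Tier` enum, the `Episode` fields, and the readiness rule are assumptions made for this example, not XDOF's actual schema or tooling.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Tier(Enum):
    """The three collection tiers described in the article."""
    DEPLOYMENT_TELEOP = "deployment-robot teleoperation"
    GELLO_TELEOP = "GELLO teleoperation"
    EGOCENTRIC_WEARABLE = "egocentric wearable sensors"


@dataclass
class Episode:
    """One recorded interaction. All field names here are hypothetical."""
    episode_id: str
    tier: Tier
    frames: List[dict] = field(default_factory=list)  # raw sensor samples
    task_label: Optional[str] = None                  # filled in by annotation


def is_training_ready(ep: Episode) -> bool:
    """Assumed rule: an episode is usable only if it has data and a label."""
    return len(ep.frames) > 0 and ep.task_label is not None


def training_ready(episodes: List[Episode]) -> List[Episode]:
    """Keep only the episodes that pass the readiness check."""
    return [ep for ep in episodes if is_training_ready(ep)]


def count_by_tier(episodes: List[Episode]) -> Dict[Tier, int]:
    """Count how many episodes came from each collection tier."""
    return dict(Counter(ep.tier for ep in episodes))


raw = [
    Episode("ep-001", Tier.DEPLOYMENT_TELEOP, [{"t": 0.0}], "sort package"),
    Episode("ep-002", Tier.GELLO_TELEOP, [{"t": 0.0}]),             # not yet labeled
    Episode("ep-003", Tier.EGOCENTRIC_WEARABLE, [], "fold shirt"),  # empty recording
    Episode("ep-004", Tier.EGOCENTRIC_WEARABLE, [{"t": 0.0}], "fold shirt"),
]

ready = training_ready(raw)
print([ep.episode_id for ep in ready])  # ['ep-001', 'ep-004']
```

A real pipeline would carry far richer records (camera streams, joint states, calibration metadata), but the shape of the problem is the same: raw recordings go in, and only structured, labeled episodes come out.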

What Is ABC-130K and Why It Matters for Robotics Research

On the same day XDOF came out of stealth, it released ABC-130K, the world’s largest open-source bimanual robot manipulation dataset. It was built in collaboration with researchers at UC Berkeley, Carnegie Mellon, MIT, and Amazon FAR.

The numbers are significant. 130,000 robot interaction trajectories. 300 hours of simulation data. 100 hours of evaluations. Nothing in the open-source robotics world came close to this before.

Here is why bimanual matters specifically. Two robotic arms working in coordination is one of the hardest problems in physical AI. They have to work together fluidly, anticipate each other, and handle tasks that a single arm simply cannot do. Training for that requires enormous amounts of diverse, precisely labeled data. Until ABC-130K, no public dataset could provide that.

By releasing it openly through Hugging Face, XDOF Robotics did two things at once. It gave the broader research community access to something genuinely valuable. And it put XDOF’s data collection capabilities on public display in the most credible way possible. You cannot argue with 130,000 trajectories. The work speaks for itself.

Which AI Labs and Companies Are Already Using XDOF?

At launch, XDOF Robotics had roughly 20 paying customers including several of the world’s top frontier AI labs. Names have not been disclosed publicly. But the team’s background adds credibility here. XDOF’s founders and early hires come from Covariant, Meta, and Tesla. These are not people who stumbled into robotics. They built careers in it.

And the customer behavior itself is telling. These frontier labs are not hobbyists. They have enormous engineering teams and deep capital reserves. The fact that they are paying XDOF instead of building their own data infrastructure is a deliberate strategic choice. Keeping warehouse-scale operational complexity out of house, even as robotics becomes a research priority, is just smart resource allocation.

The market logic is simple. More companies across logistics, healthcare, manufacturing, and consumer electronics will deploy general-purpose robots. Every single one of them will need training data. XDOF Robotics is building the infrastructure layer that sits underneath all of it.

Swapnil Gupta

Hi friends, this is Swapnil. I love reading and sharing knowledge, and I currently work as a content writer at startupsunion.com.