There’s a warehouse just outside Boston where the sailcloth for the USS Constitution was made in the late 1700s. Somewhat ironically, that warehouse now contains about 100 humanoid robots who are busily putting things in boxes, taking them out again, stacking them, dropping them, and even folding clothes.

Mostly, however, they’re failing.

They’re all called Sonny. They’re vaguely rather than explicitly humanoid, like a minimalist artist’s conception of a humanoid robot. None has been on site longer than two months, and some arrived a few days ago. Each time they succeed in a task, a robot trainer gives them a tiny digital reward. Each time they fail, it’s logged, labeled, and fed back, along with all the successes, into the machine learning model that will ultimately run Sonny and all his brothers and sisters around the country. And given that they were all basically just born … they’re failing much more than they’re succeeding.

But that’s all perfectly according to plan.

I’m in DF1 — Data Factory One — and according to its operator and owner, Tutor Intelligence, it's the largest robot data collection operation in the United States. Co-founder and CEO Josh Gruenstein compares it, only half-joking, to the Large Hadron Collider: a giant, expensive, purpose-built instrument of discovery, designed to answer one question.

Can you actually scale your way to a generally capable industrial robot?

Nobody has done it yet. Tutor's bet is that they can be first.

Tutor’s contrarian thesis: nobody has built a robot product

Most robotics coverage focuses on hardware: the form factor, the degrees of freedom, the dexterity of the hands. (Guilty as charged!)

Gruenstein argues that misses the actual problem.

"Nobody has ever successfully built robot products," he told me during a tour of the Watertown, Massachusetts facility. "People have built robotic solutions providers , for narrow end users. But what does it mean to build one robot that can do the same task for thousands of customers?"

In other words, to build a general-purpose robot.

Gruenstein’s distinction matters. Industrial robotics has existed for over half a century and Japanese, German, Danish and some American firms have built significant businesses around custom automation. But they are typically one-customer relationships, deeply specialized, expensive to deploy and impossible to redeploy elsewhere. The classic robotics startup pattern, Gruenstein argues, is to bet the company on a single enterprise contract, win it, and then discover you've built something that doesn't generalize to anyone else.

Tutor is structured to avoid that trap.

The company describes itself as a "robotics research and deployment company" that is vertically integrated across foundation model research, robot manufacturing (assembled in-house in Watertown), and the deployment and maintenance of robots at customer sites. The thesis is that the data flywheel only works if you own all of it.

A recent Bessemer report on the future of humanoid robots agrees, by the way. The report argues that full-stack robotics companies are best positioned in the current market thanks to the data-model-capability flywheel: the more data you have, the better your model is, the more capable your robots become.

“All companies will converge to want to own both the platform and the application layer,” says Gruenstein. "One of the biggest reasons is data."

Two robots, one architecture

Tutor already manufactures and delivers several robots. One, Cassie, is in the field today: a 2,000-pound industrial robot designed to manipulate boxes and pallets, the workhorse units of B2B distribution and manufacturing. Cassie isn’t a humanoid and doesn’t try to be.

"Cassie is going to outwork a humanoid robot 24/7," Gruenstein said.

Sonny, on the other hand, while built on the same hardware and software architecture as Cassie, is optimized for tabletop manipulation tasks: picking, packing, sorting. Each Sonny has six degrees of freedom per arm (the minimum needed to reach an arbitrary position and orientation in 3D space), two arms for throughput and fault tolerance, and four cameras including ones mounted on the hands themselves. The grippers are FINRAY-style: bio-inspired, compliant, 3D-printed, deliberately simple. More sophisticated hands will arrive when Sonny exhausts the training possibilities in these low-tech grippers.

Sonny has wheels, not legs, and that’s because wheels are more efficient: you can carry more battery, you’re more stable, and you don’t waste tech on features that don’t really buy you any needed capability.

No Sonny units have shipped to customers yet. Gruenstein hopes to start customer pilots before the end of this month, but he's careful to hedge: "There's epistemic uncertainty attached with any timelines."

Most humanoid robotics companies prefer the polished demo video. Tutor is showing us DF1 while the robots are still learning and visibly struggling.

"In the robotics community, there's been a very demo-oriented approach. The success criteria is we want to show the robot doing the thing once, get a video, and post that video. That's a culture we're going to have to move away from as robotics moves from something that exists only in the lab to something that exists in the field."

SKU coverage: a metric robot makers aren’t tracking yet

While Sonny is still in the lab, Cassie is already doing real work in the field. One of Tutor's customers, Productive — a third-party logistics firm specializing in custom kitting and contract packaging — deployed its first Cassie palletizer about three months ago.

CFO and co-owner Paul Baker walked me through what he's actually measuring.

"One of the metrics we look at when we evaluate robotics is what we call SKU coverage," Baker said. "We handle tens of thousands of different items throughout the year. A human's SKU coverage is 100%. A robot's is a small fraction of that."

The math gets unforgiving fast. Productive will assemble roughly 30 million kits this year, each containing 10 to 15 items. That means the company will execute somewhere between 300 million and 450 million pick-and-place operations, approaching half a billion, across thousands of distinct products: band-aids, hammers, lip balm, body wash, fishing lures, surgical kit components.
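As a rough check on those numbers, the arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and the assumption that every item in a kit takes exactly one pick-and-place operation is a simplification of mine, not a figure from Productive.

```python
def pick_and_place_range(kits_per_year, min_items, max_items):
    """Return (low, high) estimates of yearly pick-and-place operations.

    Assumes one pick-and-place operation per item in each kit, which is
    an illustrative simplification rather than a number from Productive.
    """
    if min_items > max_items:
        raise ValueError("min_items must not exceed max_items")
    return kits_per_year * min_items, kits_per_year * max_items


# Figures quoted in the article: ~30 million kits, 10 to 15 items each.
low, high = pick_and_place_range(30_000_000, 10, 15)
print(f"{low:,} to {high:,} operations per year")
```

Under that assumption the total lands in the hundreds of millions, the same order of magnitude as the half-billion figure.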

"For a robot, it might not be enough work, because it might only be able to handle a portion of those SKUs," Baker told me. "If your SKU coverage is too low, we can't keep the robot busy even for a shift."

That’s a metric I haven’t heard of before, but I like it a lot. SKU coverage cuts through a lot of humanoid robotics hype. Most demo videos show a robot doing one task again and again, but in a lot of the world’s logistics and warehousing facilities, that’s not very realistic.

Productive’s question is: across our actual product mix, what percentage of our work can you do? Right now, getting to 25-40% would be meaningful. Hitting 100%, the human baseline, is a long way off.
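One way to make that question concrete is to compute coverage two ways: as a share of distinct SKUs, and as a share of actual pick volume. The sketch below is only an illustration. The function name, the data layout, and every number in it are made up for the example; the article does not describe how Productive actually calculates the metric.

```python
def sku_coverage(pick_volume_by_sku, robot_capable_skus):
    """Return (share of distinct SKUs, share of pick volume) a robot can handle.

    pick_volume_by_sku: dict mapping SKU name to picks per period.
    robot_capable_skus: set of SKU names the robot handles reliably.
    Both inputs are hypothetical structures chosen for this example.
    """
    total_volume = sum(pick_volume_by_sku.values())
    if not pick_volume_by_sku or total_volume <= 0:
        raise ValueError("need at least one SKU with positive pick volume")

    covered = [sku for sku in pick_volume_by_sku if sku in robot_capable_skus]
    by_count = len(covered) / len(pick_volume_by_sku)
    by_volume = sum(pick_volume_by_sku[sku] for sku in covered) / total_volume
    return by_count, by_volume


# Invented volumes, for illustration only.
volumes = {"band-aids": 400, "lip balm": 300, "hammers": 200, "fishing lures": 100}
capable = {"lip balm"}

by_count, by_volume = sku_coverage(volumes, capable)
print(f"{by_count:.0%} of SKUs, {by_volume:.0%} of pick volume")
```

The volume-weighted number is arguably closer to what Baker describes, since a robot that handles a few high-volume items may stay busy for a full shift even when its share of distinct SKUs is small.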

Baker is also running humanoids from Avatar (wheel-based, two arms, doing teleoperated pick-and-place with a roadmap to autonomy) and cobots from Blue Sky Robotics. He's bullish on wheels over legs for warehouse work: "I can't think of really any workflows in our warehouse that would require legs."

The capital is arriving fast

Whatever you think of the timelines, the capital flow is real.

Ola Simino, who leads physical AI for AWS’s Generative AI Innovation Center, cited investment numbers that have moved sharply in the past 18 months. Roughly $7 billion went into robotics and physical AI startups in 2024. In 2025, that number was $40.7 billion: about 9% of all global venture capital deployment.

That’s an important signal that robotics is suddenly at the center of global innovation.

Simino frames the demand side around labor shortages that are already a problem: a projected 1.9 million manufacturing jobs in the U.S. will go unfilled by 2033, she said, and construction has half a million current openings (with 40% of its existing workforce expected to retire within a decade). She also cited a World Health Organization estimate of an 11 million healthcare worker shortfall by 2030.

Inside Amazon’s own fulfillment network, which had more than a million deployed robots as of September, the company reports 25% efficiency gains and a 40% reduction in workforce injuries.

Whether Tutor’s bet pays off depends to some degree on whether DF1 actually produces what Gruenstein hopes: a foundation model general enough to drop a Sonny into a customer site and have it perform at industrial standard with minimal task-specific tuning.

That’s a high bar. Nobody has cleared it, and Tutor won’t get there immediately.

But standing in the middle of a hundred robots that are all visibly, publicly, unembarrassedly trying and failing and failing and failing … I get the sense Tutor is at least asking the right question.

And the good news is that you don’t have to solve 100% of the problems to get to useful work: the 25-40% SKU coverage that Baker mentioned. Anything over that, you’re into the gravy.