Physical AI Data Infrastructure

Real-world multimodal human interaction data for training robotics foundation models, world models, and embodied AI systems.

Egocentric RGB · Stereo Depth · 6-Axis IMU
Hand Pose · Body Pose · Action Labels · Temporal Sync

Dataset Request

We respond to enterprise inquiries within 24 hours.
Custom collections available.

500K+

DEMONSTRATED TASKS

RECORDED IN SITU

25+

WORK ENVIRONMENTS

FULLY INSTRUMENTED

7

MODALITY STREAMS

HARDWARE SYNCHRONIZED

±2ms

FRAME ALIGNMENT

SYNC TOLERANCE

HDF5

STORAGE FORMAT

NATIVE MODEL INPUT

The physical world,
instrumented.

DatraAI deploys egocentric wearable capture systems in factories, warehouses, kitchens, and service environments. We record synchronized multimodal streams (RGB video, stereo depth, 6-axis IMU, hand pose, and body pose), then structure, timestamp-align, and annotate every frame to produce training-ready datasets for physical AI systems.

I. Field Capture

Collection

Egocentric wearable rigs

deployed on real workers.

Real tasks. Real motion.

II. Data Engineering

Processing

Frame-level sync

across all modalities.

Annotation + QA pipeline.

III. Dataset Delivery

Delivery

Ready-to-train format

HDF5 / JSON / MP4+metadata

Exclusive or non-exclusive.

Seven modalities.
One synchronized stream.

±2ms sync precision
across all streams
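
As an illustration of what a ±2ms tolerance means in practice, the sketch below checks one frame's per-stream timestamps against a reference stream. The stream names, the choice of RGB as the reference, and the microsecond integer timestamps are assumptions made for this example, not a description of DatraAI's actual pipeline.

```python
TOLERANCE_US = 2_000  # ±2 ms expressed in microseconds


def out_of_sync(stamps, reference="rgb", tolerance_us=TOLERANCE_US):
    """Return the streams whose timestamp is more than tolerance_us away
    from the reference stream's timestamp for the same frame.

    `stamps` maps a (hypothetical) stream name to a timestamp in microseconds.
    """
    ref = stamps[reference]
    return sorted(
        name for name, t in stamps.items() if abs(t - ref) > tolerance_us
    )


# Hypothetical timestamps for a single frame across four streams.
frame = {
    "rgb": 1_000_000,
    "depth": 1_000_800,      # 0.8 ms late: within tolerance
    "imu": 998_500,          # 1.5 ms early: within tolerance
    "hand_pose": 1_002_600,  # 2.6 ms late: outside tolerance
}

print(out_of_sync(frame))  # ['hand_pose']
```

A check of this kind could run per frame during quality assurance, so that any frame with a stream outside tolerance is flagged before delivery.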

01

RGB VIDEO

Egocentric 1080p/30fps, 98° FOV wearable capture

1920×1080 · 30FPS · H.264

02

STEREO DEPTH

Binocular depth estimation for 3D spatial understanding

Disparity · Point Cloud · 30FPS

03

6-AXIS IMU

Accelerometer + gyroscope at 200Hz, timestamped

Accel · Gyro · 200Hz · ±2ms

04

HAND POSE

21-keypoint skeleton, per-frame grasp and manipulation labels

MediaPipe · 21 Keypoints · 30FPS

05

BODY POSE

Full-body skeleton for locomotion and posture datasets

33 Landmarks · 3D · 30FPS

06

ACTION LABELS

Task-level and frame-level action segmentation

Verb · Object · Phase · Confidence

07

TEMPORAL ANNOTATION

Start/end timestamps, phase labels, interaction boundaries

Millisecond · Phase · Boundary
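
To show how millisecond segment boundaries relate to frame-level labels, here is a minimal sketch that expands annotated segments into one label per 30FPS frame. The `Segment` fields mirror the Verb · Object · Phase · Confidence attributes listed above, but the field names, the half-open `[start_ms, end_ms)` convention, and the sample values are assumptions for illustration only.

```python
from dataclasses import dataclass

FPS = 30  # frame rate stated for the video and pose streams


@dataclass(frozen=True)
class Segment:
    """One annotated action segment (hypothetical field names)."""
    start_ms: int
    end_ms: int  # exclusive boundary, by assumption
    verb: str
    obj: str
    phase: str
    confidence: float


def _ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def frame_labels(segments, n_frames, fps=FPS):
    """Return a list with one entry per frame: the covering Segment or None.

    Frame i is taken to have timestamp i * 1000 / fps milliseconds.
    """
    labels = [None] * n_frames
    for seg in segments:
        first = _ceil_div(seg.start_ms * fps, 1000)
        stop = min(_ceil_div(seg.end_ms * fps, 1000), n_frames)
        for i in range(first, stop):
            labels[i] = seg
    return labels


# Hypothetical annotations for a 1.5-second clip (45 frames).
segments = [
    Segment(0, 500, "reach", "bolt", "approach", 0.97),
    Segment(500, 1200, "grasp", "bolt", "contact", 0.91),
]
labels = frame_labels(segments, n_frames=45)
print(labels[14].verb, labels[15].verb, labels[36])  # reach grasp None
```

Frames 0 to 14 fall in the first segment, frames 15 to 35 in the second, and the remaining frames carry no label.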

Collected across
the physical world.

Expanding to: Healthcare · Agricultural · Gig Workers · Residential

Data Pipeline

FROM WORKER TO WEIGHT UPDATE

01

CAPTURE

Wearable head-mounted rigs & sensors

02

INGEST

Lossless sensor stream aggregation & upload

03

SYNC

Hardware-level temporal synchronization (±2ms)

04

ANNOTATE

Semantic action parsing & joint kinematics

05

DELIVER

Unified HDF5 format & metadata schema

RGB · 30FPS

DEPTH · 30FPS

IMU · 200HZ

POSE · 30FPS

LABELS · FRAME-LEVEL
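
Because the IMU runs at 200Hz and the video and pose streams at 30FPS, each frame spans roughly 6.7 IMU samples (200 / 30). The sketch below groups IMU samples by the frame interval they fall into; the integer microsecond timestamps, the function name, and the half-open interval convention are assumptions for this example rather than the delivered schema.

```python
from bisect import bisect_left

FRAME_HZ = 30
IMU_HZ = 200


def imu_windows(frame_ts_us, imu_ts_us):
    """For each frame, return the (start, stop) index range of IMU samples
    whose timestamps lie in [frame_ts, next_frame_ts).

    Both inputs are sorted lists of timestamps in microseconds.
    The last frame is assumed to last one nominal frame period.
    """
    windows = []
    for k, t0 in enumerate(frame_ts_us):
        if k + 1 < len(frame_ts_us):
            t1 = frame_ts_us[k + 1]
        else:
            t1 = t0 + 1_000_000 // FRAME_HZ
        windows.append((bisect_left(imu_ts_us, t0), bisect_left(imu_ts_us, t1)))
    return windows


# Synthetic, ideal timestamps: 3 video frames and 20 IMU samples.
frame_ts = [k * 1_000_000 // FRAME_HZ for k in range(3)]
imu_ts = [k * 1_000_000 // IMU_HZ for k in range(20)]

windows = imu_windows(frame_ts, imu_ts)
print(windows)                          # [(0, 7), (7, 14), (14, 20)]
print([b - a for a, b in windows])      # [7, 7, 6]
```

Index ranges rather than copied samples keep the grouping cheap, and a training loader can slice the IMU array with them per frame.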

Dataset Catalog

Browse the
dataset catalog.

Access synchronized multimodal datasets from industrial, warehouse, household, and service environments. Exclusive and non-exclusive licensing available.

BROWSE DATASETS →
REQUEST CUSTOM COLLECTION →

“The bottleneck for physical AI is not compute. It is data — real-world, embodied, multimodal, grounded in physical reality. DatraAI is building the collection and delivery infrastructure that makes training-ready physical AI data as accessible as compute.”

RGB · DEPTH · IMU · POSE · ACTION