Enrichment services

Turn existing datasets into higher-value training assets.

HumanoidLayer enriches public or private robotics datasets with labels, metadata, QA, and format structure so teams can train, evaluate, and compare models faster.

Designed for robot learning

Outputs are organized around the fields robotics teams work with: actions, objects, phases, outcomes, sensors, source context, formats, and license notes.

Metadata cleanup

Normalizes source, license, task, robot, modality, and environment metadata.

Input data

Public or private dataset folders, manifests, or links.

Output

Dataset card, schema map, license notes, and searchable metadata.

Best for

Datasets that are valuable but hard to inspect or compare.
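
As a rough illustration of what metadata cleanup can involve, the sketch below normalizes one raw metadata record into a consistent shape. The field names mirror the list above (source, license, task, robot, modality, environment), but the `DatasetRecord` class, the `LICENSE_ALIASES` table, and the example URL are hypothetical and are not HumanoidLayer's actual schema.

```python
from dataclasses import dataclass, asdict

# Hypothetical alias table: maps raw license strings to one canonical label.
LICENSE_ALIASES = {
    "cc-by-4.0": "CC-BY-4.0",
    "cc by 4.0": "CC-BY-4.0",
    "mit": "MIT",
    "apache-2.0": "Apache-2.0",
    "apache 2.0": "Apache-2.0",
}


@dataclass
class DatasetRecord:
    """Normalized metadata for one dataset (illustrative field set)."""
    source: str
    license: str
    task: str
    robot: str
    modality: list
    environment: str


def normalize_record(raw: dict) -> DatasetRecord:
    """Trim whitespace, unify casing, and canonicalize the license label."""
    def clean(key: str) -> str:
        return str(raw.get(key, "")).strip()

    modalities = raw.get("modality", [])
    if isinstance(modalities, str):
        modalities = modalities.split(",")
    return DatasetRecord(
        source=clean("source"),
        # Unrecognized licenses are flagged rather than guessed.
        license=LICENSE_ALIASES.get(clean("license").lower(), "UNKNOWN"),
        task=clean("task").lower().replace(" ", "_"),
        robot=clean("robot").lower(),
        # Deduplicate and sort so records compare consistently.
        modality=sorted({m.strip().lower() for m in modalities if m.strip()}),
        environment=clean("environment").lower(),
    )


record = normalize_record({
    "source": " https://example.org/dataset ",
    "license": "CC BY 4.0",
    "task": "Pick and Place",
    "robot": "Franka",
    "modality": "RGB, depth, rgb",
    "environment": "Kitchen",
})
print(asdict(record))
```

Mapping unknown licenses to `UNKNOWN` instead of guessing keeps the license notes reviewable by a person.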

Action and object labeling

Adds action verbs, object categories, tool references, and interaction tags.

Input data

Video, RGB-D, or robot episode data.

Output

Frame-, clip-, or episode-level labels in the buyer's preferred schema.

Best for

Manipulation, tool-use, warehouse, and household datasets.

Temporal segmentation

Splits long demonstrations into task phases, attempts, recoveries, and outcomes.

Input data

Continuous videos, teleoperation sessions, or demonstrations.

Output

Segment boundaries, phase labels, success/failure tags, and QA notes.

Best for

Long-horizon tasks and egocentric workflow video.
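
To make the segmentation output more concrete, here is a minimal sketch of how segment boundaries, phase labels, and success/failure tags might be represented, together with a basic consistency check that produces QA notes. The `Segment` fields and the `validate_segments` helper are assumptions for illustration, not a delivered schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One labeled span of a demonstration (illustrative schema)."""
    start_s: float
    end_s: float
    phase: str            # e.g. "reach", "grasp", "place"
    attempt: int          # 1-based attempt index within the episode
    outcome: str          # "success" or "failure"
    is_recovery: bool = False


def validate_segments(segments: list, duration_s: float) -> list:
    """Return QA notes for overlaps, gaps, and out-of-bounds segments."""
    notes = []
    prev_end = 0.0
    for seg in sorted(segments, key=lambda s: s.start_s):
        if seg.end_s <= seg.start_s:
            notes.append(f"{seg.phase}: non-positive duration")
        if seg.start_s < prev_end:
            notes.append(f"{seg.phase}: overlaps previous segment")
        elif seg.start_s > prev_end:
            gap = seg.start_s - prev_end
            notes.append(f"{seg.phase}: gap of {gap:.1f}s before segment")
        prev_end = max(prev_end, seg.end_s)
    if prev_end > duration_s:
        notes.append("segments extend past end of recording")
    return notes


episode = [
    Segment(0.0, 2.5, "reach", attempt=1, outcome="success"),
    Segment(2.5, 4.0, "grasp", attempt=1, outcome="failure"),
    Segment(4.5, 6.0, "grasp", attempt=2, outcome="success", is_recovery=True),
]
qa_notes = validate_segments(episode, duration_s=6.0)
print(qa_notes)
```

In this example the second grasp is tagged as a recovery after a failed first attempt, and the check reports the unlabeled half second between the two attempts.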

Language instruction generation

Creates concise task instructions and natural-language episode descriptions.

Input data

Episodes with video, actions, or metadata.

Output

Instruction fields, captions, and task taxonomies ready for vision-language-action (VLA) workflows.

Best for

Language-conditioned policy training and retrieval.

Format conversion

Packages datasets into LeRobot, HDF5, RLDS, WebDataset, Parquet, or custom schemas.

Input data

Raw assets, TFDS/RLDS, HDF5, MP4, folders, or manifests.

Output

Versioned data pack with schema notes and validation report.

Best for

Teams that need data to match an existing training stack.

QA and validation

Flags duplicates, corrupt files, missing metadata, low-quality clips, and schema drift.

Input data

Dataset pack, manifest, or source archive.

Output

QA report, exclusion list, quality signals, and review notes.

Best for

Commercial delivery, procurement review, and benchmark hygiene.
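
The sketch below shows one simple way a QA pass could build an exclusion list and flag missing metadata. It assumes byte-identical hashing for duplicates and treats empty files as a crude stand-in for corruption checks; near-duplicate detection and clip quality scoring would need more than this. `REQUIRED_FIELDS` and `qa_report` are hypothetical names chosen for the example.

```python
import hashlib
from collections import defaultdict

# Assumed required fields for this example, not a fixed specification.
REQUIRED_FIELDS = ("source", "license", "task")


def content_hash(data: bytes) -> str:
    """SHA-256 digest used to detect byte-identical files."""
    return hashlib.sha256(data).hexdigest()


def qa_report(files: dict, metadata: dict) -> dict:
    """files maps name -> bytes; metadata maps name -> dict of fields."""
    by_hash = defaultdict(list)
    for name in sorted(files):
        by_hash[content_hash(files[name])].append(name)

    # Keep the first file in each duplicate group; exclude the rest.
    exclusion_list = sorted(
        name
        for names in by_hash.values()
        if len(names) > 1
        for name in names[1:]
    )
    empty_files = sorted(n for n, data in files.items() if len(data) == 0)

    # A field counts as missing if it is absent or blank.
    missing_metadata = {}
    for name in sorted(files):
        fields = metadata.get(name, {})
        absent = [f for f in REQUIRED_FIELDS if not fields.get(f)]
        if absent:
            missing_metadata[name] = absent

    return {
        "exclusion_list": exclusion_list,
        "empty_files": empty_files,
        "missing_metadata": missing_metadata,
    }


files = {"a.mp4": b"x1", "b.mp4": b"x1", "c.mp4": b""}
metadata = {
    "a.mp4": {"source": "lab", "license": "MIT", "task": "pick"},
    "b.mp4": {"source": "lab", "license": "", "task": "pick"},
}
report = qa_report(files, metadata)
print(report)
```

Sorting file names before grouping makes the choice of which duplicate to keep deterministic across runs.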

Custom annotation workflow

Designs a domain-specific labeling workflow for robotics teams.

Input data

Buyer taxonomy, sample data, target model use case, and acceptance criteria.

Output

Annotation protocol, pilot batch, QA rubric, and production estimate.

Best for

New tasks, specialized embodiments, and proprietary data.

Request Dataset Enrichment

Send the dataset, target labels, desired output format, and context on your model workflow. We will return a pilot scope, QA assumptions, and a delivery plan.

Object labels
Action labels
Task phase segmentation
Success/failure tags
Scene metadata
Hand-object interaction labels
Affordance labels
Temporal segmentation
Language instruction generation
QA and duplicate detection
Format conversion
License and source metadata cleanup

Enrichment request

Use this for public datasets in the catalog or private datasets your team already controls.

Need new data instead of enrichment?

When an existing dataset cannot cover the target task, environment, embodiment, or modality, move into a custom collection pilot.

Request Custom Collection