M4Human

A Large-Scale Multimodal mmWave Radar Benchmark for Human Mesh Reconstruction

¹NTU, ²University of Edinburgh, ³UPenn, ⁴UCL, ⁵MIT

Accepted by CVPR 2026

Overview of M4Human: the largest multimodal dataset for high-fidelity mmWave radar-based human motion sensing. It covers diverse free-space motions (e.g., rehabilitation, exercise, and sports) beyond simple in-place actions, with high-quality marker-based motion annotations. Such diversity supports a broad range of human sensing tasks, including tracking, human mesh recovery, action recognition, and human motion generation, as well as privacy-preserving applications in elderly care, rehabilitation, robotics, and VR gaming.

Abstract

Human mesh reconstruction (HMR) provides direct insights into body-environment interaction, which enables various immersive applications. While existing large-scale HMR datasets rely heavily on line-of-sight RGB input, vision-based sensing is limited by occlusion, lighting variation, and privacy concerns. To overcome these limitations, recent efforts have explored radio-frequency (RF) mmWave radar for privacy-preserving indoor human sensing. However, current radar datasets are constrained by sparse skeleton labels, limited scale, and simple in-place actions.

To advance HMR research, we introduce M4Human, currently the largest-scale multimodal benchmark (661K frames, 9 times the prior largest), featuring high-resolution mmWave radar, RGB, and depth data. M4Human provides both raw radar tensors (RT) and processed radar point clouds (RPC) to enable research across different levels of RF signal granularity. It includes high-quality motion capture (MoCap) annotations with 3D meshes and global trajectories, and spans 20 subjects and 50 diverse actions, including in-place, sit-in-place, and free-space sports or rehabilitation movements. We establish benchmarks on both the RT and RPC modalities, as well as on multimodal fusion with RGB-D. Extensive results highlight the significance of M4Human for radar-based human modeling while revealing persistent challenges under fast, unconstrained motion. The dataset and code will be released after the paper is published.

Sensor System Setup

Overview of the system setup: M4Human uses a multimodal sensing platform paired with a high-precision marker-based MoCap system. Dedicated calibration and synchronization workflows ensure accurate alignment between the modalities and the annotations.
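
This page does not describe the synchronization procedure in detail. As a purely illustrative sketch of one common way to align two sensor streams, the snippet below pairs each radar frame with the nearest MoCap frame by timestamp and rejects pairs whose time gap is too large. The function name `match_nearest`, the example timestamps, and the `max_gap` threshold are all assumptions made for illustration, not part of the M4Human pipeline.

```python
from bisect import bisect_left


def match_nearest(query_ts, ref_ts, max_gap):
    """Match each query timestamp to the nearest reference timestamp.

    query_ts: timestamps (seconds) of the stream to align, e.g. radar frames.
    ref_ts:   sorted timestamps (seconds) of the reference stream, e.g. MoCap.
    max_gap:  largest time difference (seconds) accepted as a valid match.

    Returns a list with one reference index per query timestamp,
    or -1 where no reference sample lies within max_gap.
    """
    matches = []
    for t in query_ts:
        # Position of the first reference timestamp >= t.
        i = bisect_left(ref_ts, t)
        # The nearest sample is either just before or at that position.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(ref_ts)]
        best = min(candidates, key=lambda j: abs(ref_ts[j] - t), default=None)
        if best is None or abs(ref_ts[best] - t) > max_gap:
            matches.append(-1)
        else:
            matches.append(best)
    return matches


# Hypothetical timestamps: a 100 Hz reference stream and three query frames.
mocap_ts = [0.0, 0.01, 0.02, 0.03, 0.04]
radar_ts = [0.001, 0.019, 0.1]
pairs = match_nearest(radar_ts, mocap_ts, max_gap=0.005)
```

Nearest-timestamp matching assumes both streams already share a common clock; any clock offset between devices would need to be estimated and removed beforehand.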

Dataset Videos

Videos of multimodal sensor data and MoCap mesh annotations in M4Human. We verify the accuracy of the alignment between mmWave radar data and MoCap annotations across diverse actions.

In-Place Daily Actions

In-Place Rehabilitation Actions

Non-In-Place Daily Actions

Non-In-Place Sports Actions

Dataset Scale

Comparison of M4Human with prior datasets († denotes non-public data): Overall, M4Human is the largest RF-based dataset with multi-granularity motion annotations across diverse sensing tasks. It provides both raw radar tensors (RT) and filtered radar point clouds (RPC) for high-fidelity HMR. Human body annotations are obtained with a high-precision marker-based MoCap system rather than from RGB(D) images (entries marked with *).

Action set in M4Human. M4Human extends beyond simple in-place activities to complex, non-in-place rehabilitation and sports movements.

mmWave HMR Prediction Results

We provide visualizations of frame-level human mesh reconstruction (HMR) results.

BibTeX

@article{fan2025m4human,
    title={M4Human: A Large-Scale Multimodal mmWave Radar Benchmark for Human Mesh Reconstruction},
    author={Fan, Junqiao and Zhou, Yunjiao and Yang, Yizhuo and Cui, Xinyuan and Zhang, Jiarui and Xie, Lihua and Yang, Jianfei and Lu, Chris Xiaoxuan and Ding, Fangqiang},
    journal={arXiv preprint arXiv:2512.12378},
    year={2025}
}

License

M4Human is released under the CC BY-NC 4.0 license.