MP1 takes the historical observation point cloud and the robot's state as inputs. These are processed by a visual encoder and a state encoder, respectively, and the resulting embeddings serve as conditional inputs to the UNet-integrated MeanFlow. The model then computes a regression loss (𝓁cfg) between the predicted mean velocity, generated from the initial noise, and the target velocity given by the MeanFlow Identity. This 𝓁cfg is combined with a Dispersive Loss (𝓁disp) imposed on the UNet’s hidden states to jointly optimize the network parameters.
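As a concrete illustration of this training pipeline, the sketch below shows one possible way to assemble the two losses: encoded point-cloud and state features condition a velocity network, the MeanFlow-Identity target for the mean velocity is obtained with a forward-mode JVP, and a batch-wise Dispersive Loss repels the network's hidden states. All module definitions, dimensions, the InfoNCE-style form of the repulsion term, and the weight lambda_disp are illustrative assumptions rather than MP1's actual implementation, which is built around a conditional UNet.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

# All names, dimensions, and hyper-parameters below are illustrative assumptions;
# they stand in for MP1's actual encoders and UNet-integrated MeanFlow network.

class PointCloudEncoder(nn.Module):
    """Toy per-point MLP + max-pool in place of the paper's visual encoder."""
    def __init__(self, out_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, out_dim))

    def forward(self, pc):                      # pc: (B, N, 3)
        return self.mlp(pc).max(dim=1).values   # global max-pool -> (B, out_dim)

class StateEncoder(nn.Module):
    """Small MLP for the robot state."""
    def __init__(self, state_dim=9, out_dim=32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

    def forward(self, s):                       # s: (B, state_dim)
        return self.mlp(s)

class MeanVelocityNet(nn.Module):
    """Predicts the mean velocity u(z_t, r, t | cond); an MLP stand-in for the UNet."""
    def __init__(self, act_dim=4, horizon=8, cond_dim=96, hidden=256):
        super().__init__()
        in_dim = act_dim * horizon + 2 + cond_dim          # noisy actions + (r, t) + condition
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.SiLU(),
                                  nn.Linear(hidden, hidden), nn.SiLU())
        self.head = nn.Linear(hidden, act_dim * horizon)

    def forward(self, z, r, t, cond):
        h = self.body(torch.cat([z.flatten(1), r, t, cond], dim=-1))
        return self.head(h).view_as(z), h                  # h feeds the dispersive loss

def dispersive_loss(h, tau=0.5):
    """Batch-wise repulsion of hidden states (an InfoNCE-style variant without positives)."""
    d = (h.unsqueeze(0) - h.unsqueeze(1)).pow(2).sum(-1)   # (B, B) pairwise squared distances
    return torch.logsumexp(-d.flatten() / tau, dim=0) - math.log(d.numel())

def mp1_training_step(pc, state, actions, enc_pc, enc_state, vel_net, lambda_disp=0.25):
    """One training step: MeanFlow regression loss plus Dispersive Loss on hidden states."""
    cond = torch.cat([enc_pc(pc), enc_state(state)], dim=-1)

    B = actions.shape[0]
    noise = torch.randn_like(actions)
    t = torch.rand(B, 1)
    r = torch.rand(B, 1) * t                               # sample r <= t
    z_t = (1 - t.view(B, 1, 1)) * actions + t.view(B, 1, 1) * noise   # linear path
    v_t = noise - actions                                  # instantaneous velocity of that path

    # MeanFlow-Identity target: u_tgt = v - (t - r) * du/dt, where du/dt is the total
    # derivative along the path, computed with a forward-mode JVP and then detached.
    u_of = lambda z, r_, t_: vel_net(z, r_, t_, cond)[0]
    _, dudt = torch.func.jvp(u_of, (z_t, r, t),
                             (v_t, torch.zeros_like(r), torch.ones_like(t)))
    u_tgt = (v_t - (t - r).view(B, 1, 1) * dudt).detach()

    u_pred, hidden = vel_net(z_t, r, t, cond)
    return F.mse_loss(u_pred, u_tgt) + lambda_disp * dispersive_loss(hidden)

# Toy usage: batch of 16, 512 points per cloud, 8-step action horizon, 4-D actions.
enc_pc, enc_state, vel_net = PointCloudEncoder(), StateEncoder(), MeanVelocityNet()
loss = mp1_training_step(torch.randn(16, 512, 3), torch.randn(16, 9),
                         torch.randn(16, 8, 4), enc_pc, enc_state, vel_net)
loss.backward()
```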
Hammer – Grasp a hammer and strike a pig.
Drawer Close – Close a drawer.
Heat Water – Position a kettle in a suitable location.
Stack Block – Stack a block.
Spoon – Put the spoon in the bowl.
Simulation results for Hammer, Drawer Close, and Pick Place tasks.
Simulation results for Assembly, Coffee Pull, and Stick Push tasks.
Robot learning is becoming a prevailing approach to robot manipulation. However, the generative models used in this field face a fundamental trade-off between the slow, iterative sampling of diffusion models and the architectural constraints of faster flow-based methods, which often rely on explicit consistency losses. To address these limitations, we introduce MP1, which pairs 3D point-cloud inputs with the MeanFlow paradigm to generate action trajectories in one network function evaluation (1-NFE). By directly learning the interval-averaged velocity via the MeanFlow Identity, our policy avoids any additional consistency constraints. This formulation eliminates numerical ODE-solver errors during inference, yielding more precise trajectories. MP1 further incorporates classifier-free guidance (CFG) for improved trajectory controllability while retaining 1-NFE inference and introducing no structural constraints. Because subtle scene-context variations are critical for robot learning, especially in few-shot settings, we introduce a lightweight Dispersive Loss that repels state embeddings during training, boosting generalization without slowing inference. We validate our method on the Adroit and Meta-World benchmarks, as well as in real-world scenarios. Experimental results show that MP1 achieves superior average task success rates, outperforming DP3 by 10.2% and FlowPolicy by 7.3%. Its average inference time is only 6.8 ms, 19× faster than DP3 and nearly 2× faster than FlowPolicy. Our code is available at https://anonymous.4open.science/r/xxxx.
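For context, the equations below restate the MeanFlow quantities the abstract refers to: the interval-averaged velocity, the MeanFlow Identity used as the regression target, and the single-evaluation sampling step that makes 1-NFE inference possible. The notation and the convention that data lies at t = 0 and noise at t = 1 follow the original MeanFlow formulation; MP1's exact parameterization may differ.

```latex
\begin{align*}
&\text{Average velocity over } [r,t] \text{ along the flow path } z_t: \\
&\qquad u(z_t, r, t) = \frac{1}{t-r}\int_{r}^{t} v(z_\tau, \tau)\,\mathrm{d}\tau, \\[4pt]
&\text{MeanFlow Identity (differentiate } (t-r)\,u = \textstyle\int_r^t v\,\mathrm{d}\tau \text{ with respect to } t): \\
&\qquad u(z_t, r, t) = v(z_t, t) - (t-r)\,\frac{\mathrm{d}}{\mathrm{d}t}u(z_t, r, t),
\qquad \frac{\mathrm{d}}{\mathrm{d}t}u = v\,\partial_{z}u + \partial_{t}u, \\[4pt]
&\text{1-NFE sampling (noise at } t=1\text{):}\qquad
\hat{x} = z_1 - u_\theta(z_1, 0, 1), \qquad z_1 \sim \mathcal{N}(0, I).
\end{align*}
```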
MP1 is compared with state-of-the-art (SOTA) diffusion-based and flow-based methods in terms of inference time and success rate on the Adroit and Meta-World tasks. With inference time on the x-axis and success rate on the y-axis, MP1 achieves SOTA performance in both inference speed and success rate.
Success rate curves of different methods on multiple Meta-World tasks. We compare the performance of MP1, FlowPolicy, and DP3 on four tasks. The x-axis represents training steps, and the y-axis shows the success rate. Shaded areas represent the standard deviation across different random seeds. The proposed method achieves higher success rates with smaller variance.
We test the effect of different numbers of demonstrations (0, 2, 5, 10, 20) on task performance. As the number of demonstrations increases, the success rate across the various tasks improves significantly. Notably, tasks such as Assembly and Lever Pull reach near-optimal performance with as few as 5 to 10 demonstrations. Our method, MP1, consistently outperforms FlowPolicy, especially when demonstrations are scarce. These results indicate that increasing the number of demonstrations effectively enhances model performance for most tasks, and that the proposed method is particularly well suited to few-shot learning scenarios.