CSCE 689 - Comparative Analysis of AI Video Generation Models for Robotic Manipulation Tasks
Video comparison grid: one generated clip per task and model.
Models (columns): Grok, Sora 2, Veo2, Veo3.1Fast, Veo3.1Quality, Wan2.1
Tasks (rows):
T1: Pour water into mug
T2: Place spatula on pan
T3: Wipe crumbs off table
T4: Lift lid from pot
T5: Push cube to target marker
📋 Task Descriptions
T1: Pour water into mug
Tabletop scene with static camera. A robot arm picks up a glass bottle, tilts it over a ceramic mug, pours water into the mug, and then returns the bottle to the table.
T2: Place spatula on pan
A single robotic arm on a kitchen counter picks up a metal spatula from the table and carefully places it inside a frying pan.
T3: Wipe crumbs off table
Top-down camera above a table. A robot arm uses a yellow sponge to wipe a line of crumbs toward the right side of the table in one smooth motion.
T4: Lift lid from pot
Side view of a table. A robot arm grasps the lid of a cooking pot, lifts it straight up, then moves it to the right and sets it down.
T5: Push cube to target marker
Top-down camera. A blue cube sits on the left of the table and a red circular marker is on the right. A robot arm uses its gripper as a pusher to push the cube across the table until it stops on top of the red marker.
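The comparison pairs each of the five tasks with each of the six models, giving 30 clips in total. As a rough sketch of how that grid could be enumerated for bookkeeping, the Python snippet below builds one record per (task, model) cell. The task and model labels come from this page; the `clip_name` helper and its file-naming scheme are purely hypothetical and are not how the clips on this page are actually stored.

```python
from itertools import product

# Model and task labels copied from the comparison grid on this page.
MODELS = ["Grok", "Sora 2", "Veo2", "Veo3.1Fast", "Veo3.1Quality", "Wan2.1"]
TASKS = {
    "T1": "Pour water into mug",
    "T2": "Place spatula on pan",
    "T3": "Wipe crumbs off table",
    "T4": "Lift lid from pot",
    "T5": "Push cube to target marker",
}


def clip_name(task_id: str, model: str) -> str:
    """Hypothetical file name for one clip (the naming scheme is an assumption)."""
    slug = model.lower().replace(" ", "").replace(".", "_")
    return f"{task_id.lower()}_{slug}.mp4"


def build_grid() -> list[dict]:
    """Return one record per (task, model) cell, ordered task by task."""
    return [
        {"task_id": t, "task": TASKS[t], "model": m, "clip": clip_name(t, m)}
        for t, m in product(TASKS, MODELS)
    ]


if __name__ == "__main__":
    for cell in build_grid():
        print(cell["task_id"], cell["model"], cell["clip"])
```

Keeping the grid as a flat list of records makes it straightforward to attach per-clip notes or scores later without changing the structure.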