
MIT & NVIDIA create framework for intuitive robot control


Researchers from MIT and NVIDIA have developed a framework that lets users correct a robot's actions in real time, using feedback similar to what people give one another.

The framework has been designed to help robots perform tasks that align with user intent without needing additional data collection or retraining of machine-learning models. Instead, users can guide the robot using intuitive interactions, such as pointing to objects, tracing a trajectory on a screen, or physically nudging the robot's arm.

"We can't expect laypeople to perform data collection and fine-tune a neural network model. The consumer will expect the robot to work right out of the box, and if it doesn't, they would want an intuitive mechanism to customize it. That is the challenge we tackled in this work," says Felix Yanwei Wang, an Electrical Engineering and Computer Science (EECS) graduate student and lead author of the research paper.

Wang's co-authors include Lirui Wang PhD '24 and Yilun Du PhD '24; senior author Julie Shah, an MIT Professor of Aeronautics and Astronautics and the Director of the Interactive Robotics Group in MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL); and NVIDIA's Balakumar Sundaralingam, Xuning Yang, Yu-Wei Chao, Claudia Perez-D'Arpino PhD '19, and Dieter Fox.

The research, which will be presented at the International Conference on Robotics and Automation, demonstrates that the framework gives users a more accessible way to correct robot actions that do not match their expectations.

By allowing human users to correct the robot's behaviour without inadvertently causing new errors, the team aimed to create actions that are aligned with user intent and feasible in execution. Wang explains, "We want to allow the user to interact with the robot without introducing those kinds of mistakes, so we get a behaviour that is much more aligned with user intent during deployment, but that is also valid and feasible."

The framework offers three interaction methods: pointing to an object in an interface that shows the robot's camera view, tracing a desired trajectory on that interface, or physically moving the robot's arm. According to Wang, "When you are mapping a 2D image of the environment to actions in a 3D space, some information is lost. Physically nudging the robot is the most direct way of specifying user intent without losing any of the information."
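For illustration, the three correction channels might be represented as simple data types like the following; the type names and fields are hypothetical, not drawn from the paper, and are intended only to show why a physical nudge preserves the most information.

# Hypothetical representation of the three correction channels (illustrative only).
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PointAtObject:
    pixel: Tuple[int, int]               # a click on the robot's 2D camera view

@dataclass
class TracedTrajectory:
    screen_path: List[Tuple[int, int]]   # a path drawn on the 2D display

@dataclass
class PhysicalNudge:
    end_effector_pose: Tuple[float, float, float]  # arm pose after the nudge, already in 3D

# The first two channels live in 2D image space and must be back-projected
# into 3D, which is where information can be lost; the nudge specifies the
# intended pose directly in 3D.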

To reduce the risk of invalid actions, the framework uses a sampling procedure: from the set of valid behaviours the robot has learned, it selects the action that most closely matches the user's objective. "Rather than just imposing the user's will, we give the robot an idea of what the user intends but let the sampling procedure oscillate around its own set of learned behaviours," Wang adds.
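A minimal sketch of that idea, under the assumption of a learned policy object with a sample method and actions represented as 3D waypoints; every name and interface here is illustrative, not the authors' implementation.

# Illustrative sketch only: the policy interface, distance metric, and action
# representation below are assumptions, not the paper's actual code.
import numpy as np

def sample_intent_aligned_action(policy, observation, user_intent, num_candidates=64):
    """Sample candidate actions from the learned policy and return the one
    closest to the user's correction.

    Because every candidate is drawn from the policy itself, each is a
    behaviour the robot has learned and can execute; choosing among them
    steers the robot toward the user's intent without forcing it into an
    invalid, out-of-distribution action.
    """
    # Assumed interface: policy.sample(obs) returns an action, treated here
    # as an end-effector waypoint in 3D, e.g. np.array([x, y, z]).
    candidates = [np.asarray(policy.sample(observation)) for _ in range(num_candidates)]

    # user_intent is the correction mapped into the same space, e.g. the 3D
    # position of a pointed-at object or of the arm after a physical nudge.
    target = np.asarray(user_intent)
    distances = [np.linalg.norm(a - target) for a in candidates]
    return candidates[int(np.argmin(distances))]

In the real system the correction and the candidates would likely be whole trajectories rather than single waypoints, with a correspondingly richer distance measure, but the principle is the same: choose among learned behaviours rather than overriding them.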

In simulations and real-world experiments with a robot arm in a toy kitchen, the framework achieved a task success rate 21 percent higher than an alternative method that did not incorporate human intervention.

This advance could lead to more user-friendly robots that adapt to new environments and tasks without extensive retraining, with the added benefit that repeated nudge corrections in similar situations could reinforce the robot's learning for future tasks.

"But the key to that continuous improvement is having a way for the user to interact with the robot, which is what we have shown here," Wang states.

Future research aims to further optimise the speed and performance of the sampling procedure and to explore robot policy generation in unfamiliar environments.
