Researchers from the University of Hertfordshire have developed a new algorithm that will allow robots to function more intuitively—that is, make decisions using their environment for guidance.
The principle is that, through the algorithm, the robot agent creates its own goals.
For the first time, the algorithm unifies different goal-setting approaches under a single concept that is tied directly to physics. It also makes this computation transparent so that others can study and adopt it.
The principle of the algorithm is related to chaos theory, because the method makes the agent “master of the chaos of the system’s dynamics.”
The study has been published in the journal PRX Life. Herts researchers explored robot “motivation models” that mimic the decision-making processes of humans and animals, even in the absence of clear reward signals.
The study introduces artificial intelligence (AI) formulas that compute a way for a robot to decide future actions without direct instructions or human input.
Daniel Polani, Professor of Computer Science and senior author, explains, “In an applied sense, what this could mean, for example, is getting a robot to play and manipulate objects on its own without being told to do so.
“It could enhance the way robots learn to interact both with humans and with other robots by encouraging more ‘natural’ behaviors and interactions.
“This has further applications—such as the survivability behavior of semiautonomous robots placed in situations where they are unreachable by a human operator, such as in subterranean or interplanetary locations.”
In humans and animals, one theory assumes the existence of an “intrinsic motivation,” where behaviors are driven only by the interaction between the being and its environment rather than by specific learned rewards, such as food. This paper successfully translates that “intrinsic motivation” theory into one that can be used by robotic agents.
Professor Polani adds, “This work is exciting because we can now implement a mechanism, similarly to those helping humans and animals solve new problems without prior experience, in robots.
“We expect that we can build on this work to develop more human-like robots in the future with more intuitive processes. It opens up a huge opportunity for more sophisticated robots with similar decision processes to us.”
The theory underlying this paper, called “empowerment maximization,” has been developed at Herts for many years. It suggests that by increasing the range of future outcomes it can bring about, a robot will also have better options further into the future. Importantly, this method could replace traditional reward systems (e.g., food signals) and possibly make them unnecessary.
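The idea of "increasing the range of future outcomes" can be illustrated with a toy example. The sketch below is not the algorithm from the study. It assumes a hypothetical 5x5 grid world with fully predictable movement, where empowerment reduces to the base-2 logarithm of the number of distinct positions the agent can reach within a fixed number of steps. The names `step`, `empowerment`, `best_action`, `SIZE` and `ACTIONS` are invented for this illustration.

```python
import math
from itertools import product

# Hypothetical toy world: a 5x5 grid; a move that would leave the grid is blocked.
SIZE = 5
ACTIONS = {"stay": (0, 0), "up": (0, -1), "down": (0, 1),
           "left": (-1, 0), "right": (1, 0)}


def step(state, action):
    """Apply one action; the agent stays in place if it would hit the border."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < SIZE and 0 <= ny < SIZE:
        return (nx, ny)
    return state


def empowerment(state, horizon):
    """n-step empowerment in bits, valid only for deterministic dynamics:
    log2 of the number of distinct end states over all action sequences."""
    endpoints = set()
    for sequence in product(ACTIONS, repeat=horizon):
        s = state
        for action in sequence:
            s = step(s, action)
        endpoints.add(s)
    return math.log2(len(endpoints))


def best_action(state, horizon):
    """Greedy choice: pick the action whose next state has the most empowerment."""
    return max(ACTIONS, key=lambda a: empowerment(step(state, a), horizon))


if __name__ == "__main__":
    print("centre:", round(empowerment((2, 2), 2), 3), "bits")
    print("corner:", round(empowerment((0, 0), 2), 3), "bits")
    print("move chosen from the corner:", best_action((0, 0), 2))
```

In this toy setting the agent has more empowerment in the middle of the grid than in a corner, so a greedy empowerment-maximizing agent moves away from the walls without ever being given a reward. Real robots face noisy, continuous dynamics, where the quantity is much harder to compute than this simple count.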
While empowerment maximization has shown promise, it is not yet fully understood or widely applied. Most studies so far have relied on simulations, and both the theory and the careful calculation of the necessary information for complex systems remain challenging.
However, this latest research aims to explain why empowerment-based motivations can create behaviors similar to those of living organisms, potentially leading to more intrinsically motivated robots. It also offers a significantly improved way to compute these motivations.
Professor Polani says the next steps are to use this breakthrough algorithm to allow robots to discover more about the world, developing direct learning and identifying and honing new skills that would drive their value in real-world scenarios.