More transparency and understanding of machine behavior | MIT News
Explaining, interpreting and understanding the human mind presents a unique set of challenges.
Doing the same for machine behavior is a whole different story.
As artificial intelligence (AI) models are increasingly used in complex situations – approving or denying loans, helping doctors make medical diagnoses, helping drivers on the road, or even taking full control – humans still lack a holistic understanding of their abilities and behaviors.
Existing research mainly focuses on the basics: how accurate is this model? Often, focusing only on simple accuracy can lead to dangerous oversights. What if the model makes mistakes with very high confidence? How would the model behave if it encountered something it had never seen before, like a self-driving car seeing a new kind of traffic sign?
In the quest for better human-AI interaction, a team of researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has created a new tool called Bayes-TrEx that lets developers and users gain transparency into their AI model. Specifically, it does this by finding concrete examples that lead to a particular behavior. The method uses “Bayesian posterior inference”, a mathematical framework widely used to reason about model uncertainty.
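The idea of searching for examples that trigger a chosen behavior can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the Bayes-TrEx work: the two-dimensional logistic `model_confidence` stands in for a real classifier, the standard normal `log_prior` stands in for a data distribution, and a Gaussian penalty around a target confidence plays the role of the likelihood. A random-walk Metropolis sampler then draws inputs from the resulting posterior, that is, plausible inputs on which the model reports roughly the requested confidence.

```python
import math
import random


def model_confidence(x, y):
    """Toy stand-in for a classifier: probability of class 1 for a 2D input."""
    return 1.0 / (1.0 + math.exp(-(2.0 * x - y)))


def log_prior(x, y):
    """Log-density (up to a constant) of a standard normal over inputs."""
    return -0.5 * (x * x + y * y)


def log_relaxed_likelihood(x, y, target, sigma):
    """Gaussian penalty: largest when the model's confidence equals `target`."""
    diff = model_confidence(x, y) - target
    return -0.5 * (diff / sigma) ** 2


def log_posterior(x, y, target, sigma):
    return log_prior(x, y) + log_relaxed_likelihood(x, y, target, sigma)


def sample_examples(target, n_steps=20000, burn_in=2000, step=0.5,
                    sigma=0.05, seed=0):
    """Random-walk Metropolis over inputs; returns post-burn-in samples."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    current = log_posterior(x, y, target, sigma)
    samples = []
    for i in range(n_steps):
        # Propose a nearby input.
        nx = x + rng.gauss(0.0, step)
        ny = y + rng.gauss(0.0, step)
        proposed = log_posterior(nx, ny, target, sigma)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            x, y, current = nx, ny, proposed
        if i >= burn_in:
            samples.append((x, y))
    return samples


# Ask for inputs on which the toy model is about 95% confident in class 1.
samples = sample_examples(target=0.95)
mean_conf = sum(model_confidence(x, y) for x, y in samples) / len(samples)
print(f"mean confidence over sampled inputs: {mean_conf:.3f}")
```

Changing `target` to 0.5 would instead surface inputs the toy model finds ambiguous. A real application would need a prior over realistic images and a sampler suited to high-dimensional inputs, both of which are outside the scope of this sketch.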
During experiments, the researchers applied Bayes-TrEx to several image-based datasets and found new insights that standard evaluations, which focus only on prediction accuracy, had previously overlooked.
“Such analyses are important to verify that the model works correctly in all cases,” says Yilun Zhou, a PhD student at MIT CSAIL and co-lead researcher on Bayes-TrEx. “A particularly alarming situation is when the model makes mistakes, but with very high confidence. Because users tend to trust a high reported confidence, these errors may go unnoticed for a long time and only be discovered after causing significant damage.”
For example, after a medical diagnostic system has finished learning from a set of X-ray images, a physician can use Bayes-TrEx to find images that the model has misclassified with very high confidence, in order to make sure that it does not miss any particular variant of a disease.
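To make “misclassified with very high confidence” concrete, here is a minimal sketch that simply filters an already labeled set of predictions. It is plain filtering, not the inference procedure Bayes-TrEx uses, and the function name `high_confidence_errors`, the 0.9 threshold, and the sample probabilities are all hypothetical.

```python
def high_confidence_errors(predictions, labels, threshold=0.9):
    """Return (index, predicted_class, confidence) for confident mistakes."""
    errors = []
    for i, (probs, label) in enumerate(zip(predictions, labels)):
        confidence = max(probs)
        predicted = probs.index(confidence)
        # Keep only wrong predictions made with at least `threshold` confidence.
        if predicted != label and confidence >= threshold:
            errors.append((i, predicted, confidence))
    return errors


# Hypothetical class probabilities for four images (0 = healthy, 1 = disease).
predictions = [
    [0.97, 0.03],  # correct and confident
    [0.40, 0.60],  # wrong, but not confident
    [0.05, 0.95],  # correct and confident
    [0.98, 0.02],  # wrong and confident: the alarming case
]
labels = [0, 0, 1, 1]

flagged = high_confidence_errors(predictions, labels)
print(flagged)
```

A filter like this can only flag mistakes on images that already have labels, which is one reason a method that actively searches for such examples is of interest.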
Bayes-TrEx can also help to understand how models behave in new situations. Take autonomous driving systems, which often rely on camera images to detect traffic lights, bike lanes and obstacles. These common occurrences can be easily recognized with high accuracy by the camera, but more complicated situations can present literal and metaphorical obstacles. A zippy Segway could potentially be interpreted as something as big as a car or as small as a bump in the road, leading to a tricky turn or costly collision. Bayes-TrEx could help address these new situations ahead of time and allow developers to correct any unwanted results before potential tragedies occur.
In addition to images, the researchers are also tackling a less static area: robots. Their tool, called “RoCUS”, inspired by Bayes-TrEx, uses additional adaptations to analyze robot-specific behaviors.
While still in a testing phase, experiments with RoCUS point to new findings that could easily be missed if the evaluation focused solely on task completion. For example, a 2D navigation robot that used a deep learning approach preferred to navigate closely around obstacles, due to the way the training data was collected. Such a preference, however, could be risky if the robot’s obstacle sensors are not fully accurate. For a robot arm reaching for a target on a table, the asymmetry in the robot’s kinematic structure had larger implications for its ability to reach targets on the left versus the right.
“We want to make human-AI interaction safer by giving humans more information about their AI collaborators,” said Serena Booth, PhD student at MIT CSAIL, co-lead author with Zhou. “Humans should be able to understand how these agents make decisions, predict how they will act in the world, and most importantly, anticipate and work around failures.”
Booth and Zhou are co-authors of the Bayes-TrEx work alongside MIT CSAIL PhD student Ankit Shah and MIT professor Julie Shah. They presented the paper virtually at the AAAI Conference on Artificial Intelligence. Along with Booth, Zhou and Shah, MIT CSAIL postdoctoral fellow Nadia Figueroa Fernandez contributed to the work on the RoCUS tool.