  • Brendan Stec

Investigating Deep Learning and the "Black Box Problem"

When I first learned about deep learning models, one of my initial concerns was their lack of interpretability. You feed in millions of data points, the data bounces around in a highly nonlinear manner, and out pop the results, which can be miraculously accurate.

Deep learning models are sophisticated machine learning algorithms that recognize your voice when you shout "Hey Alexa!" They know how to drive your Tesla Model S on auto-pilot on the highway. They know you'd like to start the next season of Billions on Netflix. But how do they know? Even to the bearded PhDs who build and train these models, it isn't clear why these algorithms do what they do or predict what they predict. The internal mechanics are too complex to explain (1).

We have on our hands the classic Black Box Problem (2). Data goes in, results come out. Hey, the results work! Why? We don't know! But hell, this automation sure does create some efficiency! and cost savings! and insights!

Fig. 1: The Black Box Problem of a Deep Learning Model

This diagram roughly describes how deep learning models work at a high level. They are fed training data, they find interesting patterns in this data via complex neural networks loosely modeled on the brain, and they apply these patterns to produce results. The pattern-finding stage, represented by crisscrossing networks and nodes, is too complex to fully grasp, creating the so-called Black Box.

To the suspicious businessperson, the Black Box Problem feels like a potential deal breaker for leveraging deep learning models in different business contexts, where transparency behind an algorithm's choice is important for both risk management and regulatory purposes. Technologists constantly broadcast the value of deep learning (and general AI) models: they will help organizations detect fraud risk, buy and sell stocks, predict machine failure in factories, drive vehicles, and so forth. They will execute extraordinarily important decisions, where a lot of money and human lives could be at stake. Hopefully, they will be better at this than humans. But there is little said about their apparent limitations: the Black Box Problem, the complexity of it all, the lack of transparency and explainability for ultimate end users. That's where some conservative organizations lose trust in deep learning.

At first glance, the alternative doesn't appear so bad: good ole human workers. Forget data-driven insights, forget efficiency! and cost savings!, forget opaque deep learning models and stick to intuition. At least humans can explain why they do what they do. At least humans have interpretability.

Well, there's a problem here too. Humans aren't that interpretable either.

Just as with deep learning models, we actually understand little about the human brain's internal mechanics. In fact, much of our brain's activity is automatic, unconscious, and therefore inaccessible for analysis by our conscious mind. Even worse, we invent narratives or reasons for why we choose to do one thing or the other, but the stories we tell ourselves and others aren't always accurate. Often, our brain succumbs to what Michael Shermer calls "patternicity" – the attempt to find structure in meaningless data (3, p. 139).

Since our brain depends so heavily on evolutionary heuristics and automatic wiring buried deep in the unconscious, where there are millions of neurons firing in millions of crisscrossing combinations, there is no way (at least in the foreseeable future) we can ever truly know why a person acted a certain way, said a certain thing, or predicted a certain event. They may not even know themselves. Consider this example from neuroscientist David Eagleman (3, p. 59):

"You may have a difficult time putting into words the characteristics of your father's walk, or the shape of his nose, or the way he laughs – but when you see someone who walks, looks, or laughs like him, you know it immediately."

Eagleman's point is that we know things from experience without being able to explain how or why we know them. We just know how to ride a bike, recognize a face, or perform a surgery. Our own expertise is unconscious, unexplainable, and therefore un-interpretable to others. As mathematician Stéphane Mallat puts it, "There are things we cannot verbalize. When you ask a medical doctor why he diagnosed this or this, he's going to give you some reasons. But how come it takes 20 years to make a good doctor? Because the information is just not in books." (4) In other words, in order to know why a doctor, or any other human for that matter, makes a particular decision or takes a particular action, we can't merely screw open his skull and see the true reasoning written out for us.

Fig. 2: The Black Box Problem of a Human Brain

In order to perform tasks and make decisions, we rely on our senses, memories, habits, skills, and so forth. It's not clear to what degree each of these influences our ultimate behavior. Even we don't know ourselves. In this way, the brain is a Black Box.

We have the same Black Box Problem with humans that we do with machines!

And yet, we trust the advice, knowledge, predictions, expertise, and so forth of people all day long, people we work with, people we love, people we see on TV or Instagram, without thinking for a minute about the brain mechanics and neuron firings generating their thoughts. These people could be riddled with unconscious bias, they could be overfitting based on their experience alone, they could be forgetting important details, but we often still trust them. Based on what exactly?

1: Their credibility, reputation, age, education, experience, resume, certifications. Are they competent or respected?

2: The results of their actions. Do the results make sense to us? Are they accurate or correct?

3: The "explanations" for their actions, when they're even possible (as we've seen, it's often impossible to accurately "explain" one's own behavior).

We do not peer into the brain to see how they use what they've learned in the past to execute some action in the future. We do not cry out for interpretability like we do with deep learning models. We only care about their credentials and that the results work.

We want deep learning models that are interpretable, that are transparent and unbiased, that can explain why they provide the outputs they provide based on historical data. But how often do we hold our fellow humans to that same standard?

Never – because we can't. That's a good thing. The human brain works so effectively because it is a gnarled, convoluted, divided, unfathomable, and unintelligible mass of neurons, synapses, chemicals, and tissue.

Deep learning models work so effectively because they are a convoluted mess of weights, hidden layers, and activation functions, challenging to fully grasp.

Deep learning models are effective not despite their complexity, but because of their complexity. If they were deeply transparent, if their mechanics were simple and linear, they wouldn't be able to perform at the level that they do.

You can't build a univariate linear regression in Excel to drive your car.

This is the trade-off businesses must accept for the time being. At the extremes, pure human intuition and pure deep learning models can both perform well, but neither is fully transparent. In the middle, simpler data-driven methods, such as regressions, decision trees, and other statistical methods, are more transparent but also more rudimentary.

Fortunately, hoping to alleviate this trade-off, researchers in AI are developing methods to improve the interpretability of complex deep learning models and of machine learning models in general. One of these tools is called SHAP (SHapley Additive exPlanations), and it's gathering interest as a unified approach for alleviating the Black Box Problem. The SHAP values computed for a trained model indicate how much each input variable contributes to a specific model output, providing a user some insight into why the model predicted a certain outcome or executed a particular action (5).
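To make the idea a little more concrete, here is a minimal sketch of the Shapley value calculation that SHAP builds on. It does not use the SHAP library itself; it brute-forces the exact values for a tiny, made-up model. The names `toy_model` and `shapley_values`, the three-feature input, and the all-zeros baseline are assumptions chosen purely for illustration, and "removing" a feature by swapping in a baseline value is just one simple convention.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical 'black box': one feature acts alone, two interact."""
    return 2.0 * x[0] + x[1] * x[2]


def shapley_values(model, x, baseline):
    """Exact Shapley values for a single prediction.

    A feature that is 'absent' from a coalition is replaced by its
    baseline value (an illustrative assumption, not the only option).
    """
    n = len(x)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        # Average feature i's marginal contribution over every coalition
        # of the other features, using the Shapley weighting.
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi += weight * (model(with_i) - model(without_i))
        values.append(phi)
    return values


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print([round(v, 6) for v in phi])  # [2.0, 3.0, 3.0]
```

The three contributions add up to the gap between the model's output for `x` (8.0) and its output for the baseline (0.0), and the two interacting features split their joint effect evenly. This brute-force loop grows exponentially with the number of features, which is why practical tools rely on approximations for real models.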

As for the human brain, we'll likely never achieve true transparency into its underlying mechanics, directly into the Black Box. But we've already addressed that problem. Resumes, SAT scores, experience, certifications, and reputations provide us with historical data to estimate an individual's credentials for making good choices. We look at the results of an action and assume they are a function of an individual's ability. We trust an individual's explanations for behavior, even if these explanations are biased or incorrect. As a result, we rely on the Black Boxes of our family members, co-workers, doctors, Uber drivers and beyond every day without thinking much about true, objective interpretability.

As machine learning researcher Pierre Baldi reminds us, “You use your brain all the time; you trust your brain all the time; and you have no idea how your brain works.” (4) Maybe we'll eventually trust deep learning models – and AI in general – in this same manner.

For the time being, true interpretability for both deep learning models and the human brain is a work in progress.



(1) "Dealing with Deep Learning’s Big Black Box Problem" by Alex Woodie

(2) "The Artificial Intelligence Black Box Problem & Ethics" by Théo Szymkowiak

(3) Incognito by David Eagleman (2011)

(4) "Can we open the black box of AI?" by Davide Castelvecchi

(5) "Demystifying Black-Box Models with SHAP Value Analysis" by Peter Cooman
