When I first started reading about machine learning, I was surprised to see linear regression presented as the first algorithm. And the second algorithm I learned was... logistic regression? Wait, how is this different from statistics or econometrics? Is the only difference that regressors are referred to as "features" in machine learning, or does it go deeper than that?

The reality - which I slowly grasped after doing some data wrangling of my own - is that the two approaches have two distinct goals: machine learning focuses on prediction, while classical statistics focuses on inference. When a bank like Wells Fargo implements an anomaly detection algorithm to flag fraudulent transactions, its primary goal is to predict bad transactions as accurately as possible. There are far too many transactions, and far too little time, to complete a detailed statistical analysis of each one. Better predictions save more money and reduce more risk, and for business applications, the more money the merrier.

That said, classical statistical methods also have a significant place in the world. Researchers in many fields use statistical methods to infer causal relationships between covariates and outcomes (for example, that smoking causes lung cancer). While predicting a certain outcome is important, science at its core focuses more on determining whether a certain treatment causes a certain outcome. That's why econometricians have tools such as instrumental variables, which help tease apart partial effects when a regressor is correlated with factors the researcher cannot observe. They seek to understand whether X causes Y so they can enact policy to either discourage or encourage X. Prediction has a secondary role.
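To make the instrumental variables idea a little more concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made up for illustration: the variable names, the true effect of 2.0, and the way the confounder `u` enters both `x` and `y`. It uses the simple one-instrument estimator (covariance of instrument and outcome divided by covariance of instrument and regressor), not a full two-stage least squares routine.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


rng = random.Random(0)
n = 20_000
true_effect = 2.0  # assumed causal effect of x on y in this simulation

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: shifts x, unrelated to u
u = [rng.gauss(0, 1) for _ in range(n)]  # confounder the researcher cannot observe
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + 3.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Naive regression slope: picks up the confounder's influence on both x and y.
ols_slope = cov(x, y) / cov(x, x)

# IV slope: uses only the variation in x that comes from the instrument z.
iv_slope = cov(z, y) / cov(z, x)

print(f"OLS slope: {ols_slope:.2f}")
print(f"IV slope:  {iv_slope:.2f}")
```

Under these assumptions the naive slope should land well above the true effect, while the instrumented slope should sit close to it. A model tuned purely for prediction would have no reason to prefer the second number, which is exactly the prediction versus inference gap.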

In the end, machine learning and statistics practitioners can learn from each other. Statisticians will need a firm grasp of machine learning techniques to handle the increasingly large datasets of the information age. Splitting data into training, cross-validation, and test sets to improve a model's out-of-sample performance is another machine learning technique that statisticians and econometricians may find useful. Conversely, machine learning practitioners would be wise to avoid thinking of their algorithm as simply a black box (which is definitely a challenge when working with neural networks). Focusing on domain expertise - fundamental knowledge of what is being modeled - is a step in the right direction.
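As a rough illustration of the data-splitting idea, here is a small sketch using only the Python standard library. The function name `train_val_test_split` and the 60/20/20 proportions are my own assumptions for the example, not a standard API. In practice you would likely reach for a library helper instead.

```python
import random


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows, then cut them into training, validation, and test sets."""
    if not 0 <= val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be in [0, 1)")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical dataset: 100 row identifiers standing in for real observations.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

The model is fit on `train`, tuning choices are compared on `val`, and `test` is touched only once at the end, so the final performance estimate comes from data the model has never seen.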

Further Reading: Big Data: New Tricks for Econometrics by Hal Varian