This talk introduces a new generation of machine learning methods that deliver state-of-the-art performance while remaining highly interpretable. Optimal classification trees (OCT) and optimal regression trees (ORT), with and without hyperplane splits, are introduced for both prediction and prescription. It will be shown that (a) these trees are highly interpretable, (b) they can be computed at large scale in practical times, and (c) on a large collection of real-world data sets they deliver comparable or better performance than random forests or boosted trees. Their prescriptive counterparts hold a significant edge in interpretability over causal forests, with comparable or better performance. Optimal trees with hyperplanes have at least as much modeling power as (feedforward, convolutional, and recurrent) neural networks, and achieve comparable performance on a variety of real-world data sets. Finally, a variety of applications of optimal trees in financial services will be discussed.
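To make the interpretability-versus-accuracy trade-off in the abstract concrete, here is a minimal sketch. Note the assumptions: optimal trees are constructed via mixed-integer optimization and are not available in scikit-learn, so a greedy CART tree stands in for the single-tree model, and the data set (breast cancer) and hyperparameters are illustrative choices, not ones from the talk.

```python
# Illustration only: compares a shallow single tree (stand-in for an optimal tree,
# which would be fit by mixed-integer optimization) against a random forest.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A depth-3 tree: every prediction is explained by at most three threshold tests.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

# A 100-tree forest: often more accurate, but not human-readable.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

print(f"tree accuracy:   {tree.score(X_te, y_te):.3f}")
print(f"forest accuracy: {forest.score(X_te, y_te):.3f}")

# Unlike the forest, the single tree can be printed and audited in full.
print(export_text(tree))
```

The point the talk makes is that once the single tree is chosen optimally rather than greedily, the accuracy gap to ensembles can close while this kind of full auditability is retained.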