Machine learning for markets
Tree-based models
Tree-based models are among the most effective and practical tools for tabular financial data — often outperforming neural networks when your features are a structured table of numbers rather than raw sequences or images.
Decision trees
A decision tree asks a sequence of yes/no questions to reach a prediction: 'Is volatility above X? If so, is momentum positive? Then predict up.' It is intuitive and easy to visualise — but a single deep tree overfits ferociously, memorising the training data including its noise.
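To make the overfitting concrete, here is a minimal sketch. It assumes NumPy and scikit-learn are installed (the lesson does not name a library), and the data are synthetic: two columns stand in for hypothetical volatility and momentum features, and the label depends only weakly on momentum.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for a feature table (not real market data).
# Column 0 plays the role of "volatility", column 1 of "momentum".
n = 2000
X = rng.normal(size=(n, 2))
# The label ("up" = 1) depends weakly on momentum; the rest is noise.
y = (0.3 * X[:, 1] + rng.normal(size=n) > 0).astype(int)

# Fit on the first half and evaluate on the second half.
X_train, y_train = X[:1000], y[:1000]
X_test, y_test = X[1000:], y[1000:]

# No depth limit: the tree keeps splitting until every leaf is pure.
deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
# At most two levels of yes/no questions.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

deep_train_acc = deep.score(X_train, y_train)
deep_test_acc = deep.score(X_test, y_test)
shallow_train_acc = shallow.score(X_train, y_train)
shallow_test_acc = shallow.score(X_test, y_test)

print(f"deep tree    depth={deep.get_depth():2d}  "
      f"train={deep_train_acc:.2f}  test={deep_test_acc:.2f}")
print(f"shallow tree depth={shallow.get_depth():2d}  "
      f"train={shallow_train_acc:.2f}  test={shallow_test_acc:.2f}")
```

The unrestricted tree scores perfectly on the rows it was trained on because it has memorised them, noise included, while its accuracy on the held-out rows is far lower.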
Random forests
A random forest fixes this by training many trees, each on a bootstrap sample of the rows and with only a random subset of the features considered at each split, then averaging their predictions. No single tree dominates, much of the noise in individual trees averages out, and the ensemble usually generalises far better than any one tree. The principle of combining many decorrelated learners is one of the most reliable ideas in machine learning.
Gradient boosting
Gradient boosting (XGBoost, LightGBM, CatBoost) builds trees sequentially, each new tree fitted to the errors that the ensemble so far still makes. It is frequently the top performer on tabular problems and is widely used in quantitative finance.
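The sequential idea can be written out by hand for the squared-error case. This is a teaching sketch only, assuming NumPy and scikit-learn. It is not how XGBoost, LightGBM or CatBoost are implemented, and the data, `learning_rate` and `n_rounds` values are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic regression data with a nonlinear signal (illustrative only).
X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.3, size=500)

learning_rate = 0.1   # how much of each new tree's correction is applied
n_rounds = 50         # number of trees built in sequence

# Round 0: the ensemble is just the mean of the target.
base_value = float(y.mean())
prediction = np.full(y.shape[0], base_value)
trees = []
train_mse = [float(np.mean((y - prediction) ** 2))]

for _ in range(n_rounds):
    # What the ensemble so far still gets wrong.
    residuals = y - prediction
    # A shallow tree is fitted to those errors, not to y itself.
    tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, residuals)
    prediction = prediction + learning_rate * tree.predict(X)
    trees.append(tree)
    train_mse.append(float(np.mean((y - prediction) ** 2)))


def boosted_predict(X_new):
    """Sum the base value and every tree's scaled correction."""
    out = np.full(X_new.shape[0], base_value)
    for t in trees:
        out = out + learning_rate * t.predict(X_new)
    return out


print(f"training MSE after  0 trees: {train_mse[0]:.3f}")
print(f"training MSE after {n_rounds} trees: {train_mse[-1]:.3f}")
```

Each round can only lower the training error, which is the mechanism behind both the strength of boosting and its appetite for fitting noise.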
Why they suit finance
- They capture nonlinear relationships and interactions between features automatically.
- They are largely insensitive to feature scaling, because a split depends only on how values are ordered, and they are relatively robust to irrelevant inputs.
- They give feature-importance scores, offering a window into what drives the prediction.
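The three properties in this list can be checked on a toy problem. The sketch below assumes NumPy and scikit-learn. The feature names are hypothetical, the label is an invented interaction ("up" only when both of the first two features are positive, with 10% of labels flipped), and the rescaling factor is a power of two so that the multiplication is exact in floating point.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
feature_names = ["volatility", "momentum", "noise_1", "noise_2"]  # hypothetical

n = 4000
X = rng.normal(size=(n, 4))
# Interaction: label is 1 only when BOTH of the first two features are positive.
y = ((X[:, 0] > 0) & (X[:, 1] > 0)).astype(int)
# Flip 10% of the labels to mimic noise; the last two columns are irrelevant.
flip = rng.random(n) < 0.10
y[flip] = 1 - y[flip]

X_train, y_train = X[:2000], y[:2000]
X_test, y_test = X[2000:], y[2000:]

# 1) Interactions and irrelevant inputs: no feature engineering is done here.
forest = RandomForestClassifier(
    n_estimators=200, max_depth=4, random_state=0
).fit(X_train, y_train)
test_acc = forest.score(X_test, y_test)

# 2) Impurity-based importance scores, one per feature, summing to 1.
importances = dict(zip(feature_names, forest.feature_importances_))

# 3) Scaling: multiply the first column by 1024 and refit the same tree.
scale = np.array([1024.0, 1.0, 1.0, 1.0])
tree_raw = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
tree_scaled = DecisionTreeClassifier(max_depth=4, random_state=0).fit(
    X_train * scale, y_train
)
same_predictions = np.array_equal(
    tree_raw.predict(X_test), tree_scaled.predict(X_test * scale)
)

print(f"forest test accuracy: {test_acc:.2f}")
for name, score in importances.items():
    print(f"{name:>10}: {score:.3f}")
print("predictions unchanged by rescaling:", same_predictions)
```

The importance scores here are the impurity-based ones scikit-learn reports by default. They are a rough guide to what the model uses, not proof of what drives the market.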
The same warning
Their power to fit is exactly the danger. A boosted model with enough trees and depth can fit historical noise almost perfectly in-sample. The honest practitioner constrains complexity, validates out-of-sample with the walk-forward methods in the final chapter, and stays sceptical of a backtest that looks too good.
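The warning can be demonstrated on data that contain no signal at all. The sketch below assumes NumPy and scikit-learn and uses `TimeSeriesSplit`, an expanding-window split that is one simple form of walk-forward validation. It is not necessarily the exact procedure the final chapter describes, and the model settings are deliberately unconstrained for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(3)

# Pure noise: by construction, nothing here is predictable.
n = 600
X = rng.normal(size=(n, 5))
y = rng.integers(0, 2, size=n)

in_sample, out_of_sample, folds = [], [], []

# Each fold trains on an initial stretch of rows and tests on the rows
# that immediately follow, so the model never sees the future.
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = GradientBoostingClassifier(
        n_estimators=200, max_depth=4, learning_rate=0.2, random_state=0
    )
    model.fit(X[train_idx], y[train_idx])
    in_sample.append(model.score(X[train_idx], y[train_idx]))
    out_of_sample.append(model.score(X[test_idx], y[test_idx]))
    folds.append((train_idx, test_idx))

mean_in = float(np.mean(in_sample))
mean_out = float(np.mean(out_of_sample))

print(f"mean in-sample accuracy:     {mean_in:.2f}")
print(f"mean out-of-sample accuracy: {mean_out:.2f}")
```

The in-sample figure looks impressive and the out-of-sample figure hovers around a coin flip. Fewer or shallower trees and a smaller learning rate would narrow the gap, but only the out-of-sample number says anything about predictive value.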
This content is for educational and informational purposes only and is not investment, financial, tax or legal advice. Trading and investing carry risk, including the possible loss of capital. Any performance shown by third-party tools is hypothetical and not a promise of future results. Do your own research and consider professional advice before making any decision.