Machine learning for markets
Supervised learning
Supervised learning is the most common machine-learning setup in finance. You show the model examples where you know the answer — historical features paired with the outcome that followed — and it learns a mapping from features to outcome.
Two flavours
- Regression: predict a number. For example, predict next week's return as a continuous value. Linear regression is the simplest case; the tree and neural models later in this chapter are nonlinear cousins.
- Classification: predict a category. For example, will the stock be up or down over the next five days? That is a two-class problem; up/down/flat is a three-class problem. Classification is often more tractable than regression in markets because it asks an easier question: direction, not magnitude.
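To make the two flavours concrete, here is a minimal sketch that derives a regression target and a classification target from the same forward return. The price series is made up, and the helper names `forward_return` and `direction_label`, as well as the 1% `flat_band`, are illustrative assumptions rather than anything prescribed by this course.

```python
def forward_return(prices, t, horizon):
    """Simple return from bar t to bar t + horizon."""
    return prices[t + horizon] / prices[t] - 1.0


def direction_label(ret, flat_band=0.0):
    """Map a return to 'up', 'down', or 'flat' (inside +/- flat_band)."""
    if ret > flat_band:
        return "up"
    if ret < -flat_band:
        return "down"
    return "flat"


# Hypothetical daily closes, for illustration only.
prices = [100.0, 101.0, 99.0, 102.0, 102.5, 104.0, 103.0]
horizon = 5  # five bars ahead

ret = forward_return(prices, 0, horizon)             # regression target: a number
two_class = "up" if ret > 0 else "down"              # classification: direction only
three_class = direction_label(ret, flat_band=0.01)   # up / down / flat

print(round(ret, 4), two_class, three_class)
```

The same raw quantity feeds both setups; the only difference is whether the model is asked for the number itself or for the bucket it falls into.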
Features and labels
- Features (inputs) — anything you think carries signal: past returns, volatility, volume, technical indicators, fundamental ratios, macro data, even text sentiment.
- Label (target) — the thing you are predicting, defined over a clear future horizon.
Defining the label well is half the battle. A sloppy label, such as one that peeks at information not available at decision time (look-ahead bias), produces a model that looks brilliant in a backtest and is useless live.
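One way to keep the label honest is to build each row so that features use only data up to the decision bar and the label uses only data after it. The sketch below assumes a single trailing-return feature and an up/not-up label; the function name `make_dataset`, the `lookback` and `horizon` parameters, and the prices are all hypothetical.

```python
def make_dataset(prices, lookback, horizon):
    """Pair features known at bar t with a label realised at bar t + horizon."""
    rows = []
    # t needs `lookback` bars of history behind it and `horizon` bars ahead of it.
    for t in range(lookback, len(prices) - horizon):
        past = prices[t - lookback : t + 1]       # bars up to and including t
        momentum = past[-1] / past[0] - 1.0       # feature: trailing return
        label = 1 if prices[t + horizon] > prices[t] else 0  # 1 = up over horizon
        rows.append((t, momentum, label))
    return rows


# Hypothetical daily closes, for illustration only.
prices = [100.0, 101.0, 99.0, 102.0, 102.5, 104.0, 101.0, 105.0]
rows = make_dataset(prices, lookback=2, horizon=3)

for t, momentum, label in rows:
    print(t, round(momentum, 4), label)
```

Note that the last `horizon` bars produce no rows at all: their outcome is not yet known, so they cannot be labelled. A feature slice that extended past `t` would be exactly the kind of peeking described above.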
The central tension
A supervised model is only as good as the assumption that the future resembles the past. Markets are adversarial and non-stationary: patterns that paid yesterday get arbitraged away. Every supervised model in finance fights a slow decay as the world changes and others discover the same edge. This is why validation discipline (the final chapter) matters more here than in almost any other domain.
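Because the data are ordered in time and the patterns drift, a common precaution is to evaluate only on samples that come after the training window, then roll the window forward. The following is a minimal sketch of that idea, not the validation procedure from the final chapter; the function name `walk_forward_splits` and the window sizes are assumptions chosen for illustration.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train, test) index lists where each test block follows its train block."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size  # roll the window forward by one test block


splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))

for train, test in splits:
    print("train", train, "test", test)
```

Comparing scores across successive test blocks gives a rough view of how quickly an edge decays. When labels span several bars, a real setup would probably also leave a gap of at least the label horizon between the train and test blocks so that their outcomes do not overlap.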
This content is for educational and informational purposes only and is not investment, financial, tax or legal advice. Trading and investing carry risk, including the possible loss of capital. Any performance shown by third-party tools is hypothetical and not a promise of future results. Do your own research and consider professional advice before making any decision.