Forecasting Stocks

Supervised learning is the most common machine-learning setup in finance. You show the model examples where you know the answer — historical features paired with the outcome that followed — and it learns a mapping from features to outcome.

Two flavours

Regression: predict a number. For example, predict next week's return as a continuous value. Linear regression is the simplest case; the tree and neural models later in this chapter are nonlinear cousins.
Classification: predict a category. For example, will the stock be up or down over the next five days? Up/down is a two-class problem; up/down/flat is a three-class problem. Classification often holds up better than regression in markets because it asks an easier question: direction, not magnitude.
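The two flavours can be built from the same price series, as the sketch below tries to show. It is a minimal illustration, not a recommended labelling scheme: the helper names forward_return and direction_label, the flat_band threshold, and the prices themselves are all assumptions made up for this example.

```python
from typing import List, Optional

def forward_return(prices: List[float], horizon: int) -> List[Optional[float]]:
    """Return over the next `horizon` steps for each index.

    The last `horizon` entries are None because their future is not known yet.
    """
    out: List[Optional[float]] = []
    for t in range(len(prices)):
        if t + horizon < len(prices):
            out.append(prices[t + horizon] / prices[t] - 1.0)
        else:
            out.append(None)
    return out

def direction_label(ret: Optional[float], flat_band: float = 0.0) -> Optional[str]:
    """Turn a forward return into an up/down/flat class.

    `flat_band` is a hypothetical threshold: moves smaller than it count as flat.
    """
    if ret is None:
        return None
    if ret > flat_band:
        return "up"
    if ret < -flat_band:
        return "down"
    return "flat"

# Made-up prices, purely for illustration.
prices = [100.0, 101.0, 100.2, 102.0, 99.0, 103.0]

# Regression target: the size of the move over the next two steps.
reg_labels = forward_return(prices, horizon=2)

# Classification target: only the direction of that same move.
cls_labels = [direction_label(r, flat_band=0.005) for r in reg_labels]
print(cls_labels)  # ['flat', 'up', 'down', 'up', None, None]
```

The classification labels throw away the magnitude that the regression labels keep, which is the sense in which the question becomes easier.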

Features and labels

Features (inputs): anything you think carries signal and that is known at decision time, such as past returns, volatility, volume, technical indicators, fundamental ratios, macro data, even text sentiment.
Label (target): the quantity you are predicting, defined over a clear future horizon.

Defining the label well is half the battle. A sloppy label — like one that peeks at information not available at decision time — produces a model that looks brilliant and is useless live.
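One way to keep the label honest is to build every row so that features use only prices up to the decision time t and the label uses only prices after it. The sketch below assumes one-step past returns as features and a forward return as the label; the function build_dataset and its parameters lookback and horizon are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

Row = Tuple[int, List[float], float]  # (decision index t, features, label)

def build_dataset(prices: List[float], lookback: int, horizon: int) -> List[Row]:
    """Pair point-in-time features with a forward-looking label.

    Features at t: the last `lookback` one-step returns, ending at t.
    Label at t: the return from t to t + horizon.
    """
    rows: List[Row] = []
    # Start at `lookback` so the feature window exists;
    # stop early enough that the label's future price exists.
    for t in range(lookback, len(prices) - horizon):
        feats = [prices[i] / prices[i - 1] - 1.0
                 for i in range(t - lookback + 1, t + 1)]
        label = prices[t + horizon] / prices[t] - 1.0
        rows.append((t, feats, label))
    return rows

# Made-up prices, purely for illustration.
prices = [100.0, 101.0, 100.2, 102.0, 99.0, 103.0, 104.0]
rows = build_dataset(prices, lookback=2, horizon=2)

# Leak check: changing prices after t=2 must not change the features at t=2.
altered = prices[:3] + [p * 2 for p in prices[3:]]
rows_altered = build_dataset(altered, lookback=2, horizon=2)
features_unchanged = rows_altered[0][1] == rows[0][1]
label_changed = rows_altered[0][2] != rows[0][2]
```

The leak check at the end is a cheap habit worth keeping: if perturbing the future changes a feature, that feature is peeking.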

The central tension

A supervised model is only as good as the assumption that the future resembles the past. Markets are adversarial and non-stationary: patterns that paid yesterday get arbitraged away. Every supervised model in finance fights a slow decay as the world changes and others discover the same edge. This is why validation discipline (the final chapter) matters more here than in almost any other domain.
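Because the edge decays over time, a model should be judged only on data that comes after the data it was trained on. The sketch below shows one simple way to generate such splits, an expanding-window walk-forward scheme. It is an assumption chosen for illustration, not necessarily the procedure the final chapter uses, and the name walk_forward_splits and its parameters are hypothetical.

```python
from typing import Iterator, List, Tuple

def walk_forward_splits(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs in time order.

    Each test block sits strictly after its training data, and the
    training window grows by one test block per step.
    """
    start = initial_train
    while start + test_size <= n_samples:
        train_idx = list(range(0, start))
        test_idx = list(range(start, start + test_size))
        yield train_idx, test_idx
        start += test_size

# Ten time-ordered samples: train on the first four, then test two at a time.
splits = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx}")
```

A random shuffle would mix future rows into the training set and hide exactly the decay this section describes, which is why the split follows time order.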