Factor Models using Machine learning Approach

Abhishek Chikara
6 min readFeb 1, 2021


Not a blog writing notes for myself :)

Why Factor investing is important and why it matters?

Securities(stocks, bonds…etc) earn their risk premium(It represents payment to investors for tolerating the extra risk in a given investment over that of a risk-free asset) through exposure to a small number of rewarded risk factors

Factors are fundamentally broad, persistent characteristics that can both impact and drive asset returns. They are generally persistent over time and have consistently demonstrated an ability to explain stock returns

Factors are the foundation of investing just as nutrients are the foundation of our food. — Outlook article

What Types of Factors models are there?

  1. Macro-Economic Factors models: GDP
  2. Micro-Economic Factors models: Fundamental Data
  3. Implicit Factor models: statistical artifact

Benefits of Factor Investing?

Factor Models

The market model is represented by the equation:

Ri,t- rf,t =αi+ βi(Rm,t−rf,t) + εi,t — Eq(1)

Ri,t: the return of security(i) on a day (t)

rf,t: risk-free rate

(Rm,t−rf,t): Excess return in terms of excess returns on the market(Market premium)

βi: movement of security (i) exposer with respect to the market return movements(Basically how much security is affected by the Market movements)

εi,t: Residual Component while αi is your abnormal return or the regression-intercept

LOL(use Google image next time)

So using this equation in terms of risk analysis context we take variance on both sides and we do remove the alpha to make the equation simple


So the Explanatory power or R² is the part of the variance explained by the factor model

R² = βiσ²M / σ²i(tell us how much variation in the movement of security (i) can be explained by taking Market movement into account)

also, The explanatory power of the factor model, measured by R2, allows us to decompose the risk of the stock into the systematic versus specific risk components.

The above equation is a linear regression that has a single explanatory variable(Market portfolio/Index) while we can also add multiple explanatory variables(other factors) which are also known as Multi-Factor Case

Decomposition of Risk

Single Factor model able to define the 30%–40% risk-return explained but still, 60% -70% returns left unexplained

Fama and French in 1992 looked at the explanatory power(R²) in terms of cross-sectional differences in expected returns when looking at factors other than the market factor.

They are looking at the average return on a set of stocks And then they are looking at the beta of each stock as an explanatory variable.

They’re also looking at the market cap of the stocks(size factor) and the value factor which is book to market estimate.

And then they do a cross-sectional regression, where they are trying to explain differences in average performance across stocks in terms of differences in details on the one hand and Exposure with respect to the other explanatory variables(market, size, value)

Basics Factors which is used in nowadays investment Practice are :

  1. Market
  2. Size(small-cap stocks tens to outperform large-cap stocks)
  3. Value( value stocks tend to outperform growth stocks )
  4. Momentum (past winners tends to outperform past losers)
  5. volatility(low volatility stocks tend to outperform high volatility stocks)

All these premium might explain why that value stocks and small-cap stocks are riskier companies they eventually have a higher return.

Using Factor Models for portfolio construction and analysis

Let’s Take CAPM(capital asset pricing model) as an example of a factor model that we can use to generate a more robust expected return parameter estimate.

Step-1 We are going to run this market model regression see equation 1. So we are going to try and estimate what’s the Alpha and the Beta of our model.(what’s the exposure of the stock with respect to the underlying market return and also what’s the abnormal return which we call Alpha(i) in this case)

Ri,t- rf,t =αi+ βi(Rm,t−rf,t) + εi,t

Step 2 As we get our best estimates alpha i hats and beta i hats and then we impose alpha i = 0 even there is positive or negative alpha. we gonna have to impose the structure of alpha i =0 which would be the case if CAPM was the true asset pricing model

Ri,t =rf,t+ βi(Rm,t−rf,t)

Machine Learning Approach(Penalty Methods)

Statistically, in the factor model, we select the features and compare them against the returns over time, and this is done through regression.

In a Machine learning Approach, we have factors or features that are selected through the selection process through the data itself. Then we conduct a traditional regression, minimizing some error between the return of the asset that we’re looking at and the loadings of the factors on that asset.

X is the Data matrix

Y is the vector of performance(Returns)

minimize : 1/2n |y-Xβ|(Cost Function in terms of machine learning)

optimal solution: β=(X^t X)^-1 * X^t *y (How did we arrive at this solution)

If we have explanatory variables that have high correlations, then you can end up with betas that are very high and very low because of that multicollinearity, not because of any fundamental reason besides that.

So a penalty method called ridge regression can solve this issue which basically took the Betas and squeezed them down, tilts them to lower numbers based on a penalty function. (also called regularization and penalty terms in regression.)

β(λ) = arg min|| y-Xβ ||² + λ|β|²(Ridge regression Penalty)

optimal solution: β=(X^t X+ λI)^-1 * X^t *y

The standard assumptions for a penalized regression model are:

  1. additive and linear (or transformed linear) relationship between the explanatory (independent) variables and the forecast (dependent) variable,
  2. The lack of high dependency between the explanatory variables, and
  3. A causal relationship between the explanatory variables and the forecast variable. The second assumption — the lack of multi-collinearity — is required for the solution of the estimation optimization model. Due to the nature of economic problems, we must be careful to avoid spurious relationships — sometimes called sunspot linkages. A penalized regression model is based on shrinkage assumptions taken from machine learning concepts.

It was realized early on that if you have a ridge regression or a traditional regression without penalties, you end up with most of the betas having non-zero numbers. Most of the variables have some small influence on it.

But we know that in most cases there are some explanatory variables that are much more powerful than other variables. We want a way to reduce and identify those.

Linear absolute value regression(Lasso Regression) is used now as an alternative to ridge regression.

Lasso is the same type of idea where we’ve taken a penalty term instead of minimizing the sums of the Betas

minimize the absolute value of the Betas. We basically add Lasso Term in our Cost Function

minimize: 1/2n |y-Xβ| + λ|β|(Lasso Term)

The advantage is that we’re going to have a smaller number of non-zero Betas than you would get with the ridge regression or the traditional regression

How to select the Optimal Penalty Parameter, The Lambda

Using Machine Learning Approach we can use Cross-validation.

Procedure For applying Regularized Regression to Asset allocation

  1. Select the Factors/Features that span the investor’s assets categories
  2. Conduct Regularized Regression over a chosen Historical Time period
  3. Calculate Optimal Tuning Parameter(Lambda)
  4. Compute Factor Loading with Lambda
  5. Apply Risk Allocation For Portfolio Construction