CRSP Calculations

This area contains formulas and methodologies used to derive CRSP variables in the stock and index files and generated by the CRSP data utilities.

Adjusted Data

Price, dividend, shares, and volume data are historically adjusted for split events to make data directly comparable at different times during the history of a security. CRSP provides raw, Unadjusted Data, but data utilities stk_print and ts_print can be used to generate Adjusted Data.

An adjustment base date is chosen as the anchor date. All data on this date are unadjusted, and other data are converted based on the split events between the base date and the time of that data. The adjustment base date is usually chosen to be the last available day of trading.

Split events always include stock splits, stock dividends, and other distributions with price factors such as spin-offs, stock distributions, and rights. Shares and volumes are only adjusted using stock splits and stock dividends. Split events are applied on the Ex-Distribution Date.

Price and dividend data are adjusted with the calculation:

A(t) = P(t) / C(t),

where A(t) is the adjusted value at time t, P(t) is the raw value at time t, and C(t) is the cumulative adjustment factor at time t.

Share and volume data are adjusted with the calculation:

A(t) = P(t) * C(t),

where A(t) is the adjusted value at time t, P(t) is the raw value at time t, and C(t) is the cumulative adjustment factor at time t.

In both cases, where C0 is the adjustment base date, the cumulative adjustment factor is:

if t = C0

C(t) = 1.0

if t > C0 and no split events since t-1

C(t) = C(t-1)

if t > C0 and a split event with factor f since t-1

C(t) = C(t-1) / f

if t < C0 and no split events between t and t+1

C(t) = C(t+1)

if t < C0 and a split event with factor f between t and t+1

C(t) = C(t+1) * f

where f is typically the Factor to Adjust Price variable + 1.
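
The recurrence above can be sketched in Python as follows. This is a minimal illustration, not CRSP utility code: the function names and sample data are hypothetical. It assumes one factor per period (1.0 when there is no split event), that a split after the base date divides the previous cumulative factor, and that a split before the base date multiplies the following one, which is what the price formula A(t) = P(t) / C(t) requires.

```python
from typing import List


def cumulative_factors(factors: List[float], base: int) -> List[float]:
    """Cumulative adjustment factor C(t) for each period (hypothetical helper).

    factors[t] is the split factor f for events with Ex-Distribution Dates
    after period t-1 and up to period t (1.0 when there is no split event).
    base is the index of the adjustment base date C0.
    """
    n = len(factors)
    c = [1.0] * n  # C(t) = 1.0 on the base date
    # Periods after the base date: divide by the factor of each new event.
    for t in range(base + 1, n):
        c[t] = c[t - 1] / factors[t]
    # Periods before the base date: multiply by the factor of the event
    # that falls between t and t+1.
    for t in range(base - 1, -1, -1):
        c[t] = c[t + 1] * factors[t + 1]
    return c


def adjust_prices(raw: List[float], c: List[float]) -> List[float]:
    """Price and dividend data: A(t) = P(t) / C(t)."""
    return [p / ct for p, ct in zip(raw, c)]


def adjust_shares(raw: List[float], c: List[float]) -> List[float]:
    """Share and volume data: A(t) = P(t) * C(t)."""
    return [s * ct for s, ct in zip(raw, c)]


# Hypothetical data: a 2-for-1 split takes effect in period 2.
prices = [100.0, 102.0, 51.0, 52.0]
shares = [1000.0, 1000.0, 2000.0, 2000.0]
factors = [1.0, 1.0, 2.0, 1.0]

# Base date is the last available period, as is usual.
c = cumulative_factors(factors, base=len(prices) - 1)
adj_prices = adjust_prices(prices, c)
adj_shares = adjust_shares(shares, c)
```

With the last period as the base date, the pre-split prices are halved and the pre-split shares doubled, so both series are comparable across the split.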

If there is a gap in trading where possible split events are not known, all adjusted values are set to missing when the gap is between the observation and the adjustment base date.

Monthly: If monthly summary data (Bid or Low Price, Ask or High Price, and Volume Traded) are adjusted, the adjustment factor cannot take into account adjustments that take place in the middle of the month. Therefore, the result assumes all adjustment events occur on the last trading day of the month. A more accurate monthly adjusted value can be derived by adjusting and resummarizing the underlying daily data.

Annualized Return

Annualized Return is the constant annual return applied to each period in a range that would result in the actual compounded return over that range. An Annualized Return is a special case of a Geometric Average Return where the time periods are expressed in terms of years.

Associated Portfolio Return

Associated Portfolio Returns are a composite of a group of portfolio index series based on a time-dependent portfolio assignment for a security. They are built for each security based on assignments within the specified portfolio type. The associated portfolio return at any time is the return of the portfolio to which the security belongs at that time. If the security is not assigned to a portfolio of that type at the time, the associated portfolio return is set to a missing value.

Cumulative Return

A Cumulative Return is a compounded return from a fixed starting point. Each period in a time series of Cumulative Returns contains the compounded return from the first period in the time series to the end of that period.
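
A minimal Python sketch of this compounding, assuming a series of non-missing period returns. The function name and sample returns are hypothetical, and handling of missing returns is not covered.

```python
from typing import List


def cumulative_returns(returns: List[float]) -> List[float]:
    """Compounded return from the first period through each period."""
    out = []
    growth = 1.0  # value of one dollar invested at the starting point
    for r in returns:
        growth *= 1.0 + r
        out.append(growth - 1.0)
    return out


# Hypothetical period returns.
period_returns = [0.10, -0.05, 0.02]
cum = cumulative_returns(period_returns)
```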

Delisting Return

Delisting Return is the return of a security after it is delisted. It is calculated by comparing a value after delisting against the price on the security’s last trading date. The value after delisting can include a price on another exchange or the total value of distributions to shareholders. If there is no opportunity to trade a stock after delisting before it is declared worthless, the value after delisting is zero. Delisting Returns are calculated similarly to total returns except that the value after delisting is used as the current price.

Valid delisting payment information is either a valid price with at least a bid and ask quote within ten trading periods, or a complete set of payments received for the shares. If information after delisting is insufficient to generate a return, a missing value is reported.

Monthly: The monthly Delisting Return is calculated from the last month ending price to the last daily trading price if no other delisting information is available. In this case the delisting payment date is the same as the delisting date. If the return is calculated from a daily price, it is a partial-month return. The partial-month returns are not truly Delisting Returns since they do not represent values after delisting, but allow the researcher to make a more accurate estimate of the Delisting Returns.

When valuing a portfolio, the Delisting Return or other representation can be used to assign a value to the delisted security. The researcher must decide whether to assign alternate estimated values based on the Delisting Code when delisting payment information is unavailable. If using monthly data and an alternate estimate for Delisting Return is used, partial month returns should also be adjusted by this factor.

Dividend Amount in Period (ts_print item)

Dividend Amount is the cash adjustment factor in a holding period return time period used to calculate returns. It is an adjusted summation of all distribution cash amounts available in the distribution history with Ex-distribution dates after the previous period and up to and including the current period, adjusted to the basis at the end of the previous period. Dividend Amount can be divided into nonordinary and ordinary types. Nonordinary dividends include return of capital distributions. Ordinary dividends are excluded from capital appreciation returns calculations.

To calculate an adjusted Dividend Amount in Period to its basis at the end of a date range, the following formula may be used with data items extracted through ts_print:

Divamt in period adjusted to end of range = divamt / cumfacpr / facpr

where:

  • divamt is the Dividend Amount in Period
  • cumfacpr is the Cumulative Factor to Adjust Prices over a Date Range
  • facpr is the Factor to Adjust Price in Period

Thus, to calculate a total return using adjusted prices and dividends,

Total Return = (adjprc + (divamt / cumfacpr / facpr)) / prev_adjprc - 1

where:

  • adjprc is the Price Adjusted, End of Period
  • prev_adjprc is the Price Adjusted, End of Previous Period
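
The two formulas above translate directly into Python. This is a sketch only: the argument names follow the ts_print item names used in the text, the function names are hypothetical, and the input values are made up for illustration (a period with a cash dividend and no split, so both factors are 1.0).

```python
def adjusted_dividend(divamt: float, cumfacpr: float, facpr: float) -> float:
    """Dividend Amount in Period adjusted to its basis at the end of the range."""
    return divamt / cumfacpr / facpr


def total_return(adjprc: float, prev_adjprc: float,
                 divamt: float, cumfacpr: float, facpr: float) -> float:
    """Total return from adjusted prices and the adjusted dividend amount."""
    return (adjprc + adjusted_dividend(divamt, cumfacpr, facpr)) / prev_adjprc - 1.0


# Hypothetical inputs: 0.50 cash dividend, no split events in the range.
ret = total_return(adjprc=51.0, prev_adjprc=50.0,
                   divamt=0.5, cumfacpr=1.0, facpr=1.0)
```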

 

Excess Return

An Excess Return is defined as the return in excess of a comparable benchmark. The benchmark can be a single associated index series or a composite of a group of portfolio index series based on security and time-dependent portfolio assignments.

If an Excess Return is based on a single index series, the Excess Return for a period is

E(t) = R(t)-I(t),

where E(t) is the Excess Return at time t, R(t) is the security return at time t, and I(t) is the index return at time t. If the security return R(t) is based on a previous price t’ that is not the previous time period, I(t) is the compounded index return from t’ + 1 to t.

If an Excess Return is based on associated portfolios, the Excess Return for a period is

E(t) = R(t)-I(p(t),t)

where E(t) is the Excess Return at time t, R(t) is the security return at time t, p(t) is the portfolio assignment of the security at time t, and I(p(t),t) is the return of that portfolio at time t. If the security return R(t) is based on a previous price t’ that is not the previous time period, I(p(t),t) is the compounded return of the security’s portfolio return from t’ + 1 to t. If the security is not assigned a portfolio assignment of the given type at time t, E(t) is set to a missing value.

When cumulating Excess Return, the security returns and the index returns are cumulated separately before taking the difference.
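
A short Python sketch of the single-index case, including the rule that the two series are compounded separately before differencing. Function names and sample returns are hypothetical, and all returns are assumed to be non-missing and aligned by period.

```python
from typing import List


def compound(returns: List[float]) -> float:
    """Compounded return over all periods in the list."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def excess_returns(security: List[float], index: List[float]) -> List[float]:
    """Per-period excess return E(t) = R(t) - I(t)."""
    return [r - i for r, i in zip(security, index)]


def cumulative_excess_return(security: List[float], index: List[float]) -> float:
    """Compound each series on its own, then take the difference."""
    return compound(security) - compound(index)


# Hypothetical aligned return series.
sec = [0.02, 0.01, -0.03]
idx = [0.01, 0.00, -0.01]
per_period = excess_returns(sec, idx)
cum_excess = cumulative_excess_return(sec, idx)
```

Note that the cumulative excess return is generally not the same as compounding the per-period excess returns.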

Factor to Adjust Prices in Period (ts_print item)

Factor to Adjust Prices in Period is the amount the current price is multiplied by in returns calculations so that current and previous prices are on the same split-adjusted basis. Factor to Adjust Prices in Period is derived from the Factor to Adjust Price field of distributions with Ex-Distribution Dates after the previous period and up to and including the current period. In simple stock splits, Factor to Adjust Prices in Period is distribution Factor to Adjust Price plus one.

Geometric Average Return

A Geometric Average Return is the constant return applied to each period in a range that would result in the compounded return over that range.

The Geometric Average Return is calculated using the formula below:

g_n = (1 + r_c)^{1/n} - 1

where:

  • g_n is the Geometric Average Return applicable to each subset period
  • r_c is the cumulative return over the entire period
  • n is the number of equal subset periods over which to average the return
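
As a small worked example, the formula can be written in Python as below. The function name and the input values are illustrative only. Passing the number of years as n gives an Annualized Return.

```python
def geometric_average_return(cumulative_return: float, n: float) -> float:
    """Constant per-period return that compounds to cumulative_return over n periods."""
    return (1.0 + cumulative_return) ** (1.0 / n) - 1.0


# Hypothetical example: a 21% cumulative return over two equal periods.
g = geometric_average_return(0.21, 2)

# Hypothetical example: the same 21% earned over 18 months, annualized (n = 1.5 years).
annualized = geometric_average_return(0.21, 1.5)
```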

Income Return

Income Return is the return on the ordinary dividends paid to shareholders of a security. It is the ratio of the amount of ordinary dividends since the end of the previous period up to and including the end of the period of interest to the price at the end of the previous period. It is similar to a dividend yield.

Income Return is calculated by CRSP as the difference between the Total Return and the Capital Appreciation Return:

iret_t = tret_t - aret_t

where:

  • iret_t is the income return for time t
  • tret_t is the total return for time t
  • aret_t is the capital appreciation return for time t

Index Count

Index Count for a time period is the number of securities in the portfolio during that time period. Rules are based on the specific index or portfolio methodology. See Total and Used Counts for more details.

Index Level

Index Level is the value of an investment relative to its value at one fixed point in time. Index Levels allow convenient comparison of the relative performance of different portfolios or asset classes. Differences arise when indexes are based on different underlying databases such as daily and monthly CRSP stock products.

The initial date and value are set arbitrarily, but must be consistent if comparing multiple indexes. The Index Level for any series at any time after the initial point indicates the value at that time of the initial value invested at the initial point. The Index Level for any series at any time before the initial point indicates the value invested at that time that will result in the initial value at the initial point. The Index Level of a series is missing prior to its first available return. Let:

  • I_t = Index Level for any series at time t
  • R_t = return for the period t-1 to t
  • F = First Return. The time of the first non-missing return of the series
  • D = Initial Date. An arbitrary date where the level is set to the initial value
  • L = Initial Level. An arbitrary value the level is set to on the initialization date

then

  • if t = D, then I_t = L
  • if t > D, then I_t = I_{t-1} * (1 + R_t)
  • if t < D, then I_t = I_{t+1} / (1 + R_{t+1})
  • if t-1 < F, then I_t is set to missing. Note: Missing values are file format specific.
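
The rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not CRSP code: the function name and data are hypothetical, missing values are represented by None, returns are assumed to be non-missing from the first return onward, and levels before the initial date are obtained by inverting the forward rule (dividing the following level by one plus the following return).

```python
from typing import List, Optional


def index_levels(returns: List[Optional[float]], base: int,
                 level: float) -> List[Optional[float]]:
    """Index levels from period returns.

    returns[t] is the return for the period t-1 to t (None where missing).
    base is the position of the initial date D; level is the initial level L.
    """
    n = len(returns)
    levels: List[Optional[float]] = [None] * n
    levels[base] = level
    # After the initial date: compound forward.
    for t in range(base + 1, n):
        if returns[t] is None:
            break
        levels[t] = levels[t - 1] * (1.0 + returns[t])
    # Before the initial date: invert the forward rule, stopping where
    # returns are no longer available (levels stay missing before that).
    for t in range(base - 1, -1, -1):
        if returns[t + 1] is None:
            break
        levels[t] = levels[t + 1] / (1.0 + returns[t + 1])
    return levels


# Hypothetical series: no return is defined for the very first period.
rets = [None, 0.10, 0.05, -0.02]
levels = index_levels(rets, base=2, level=100.0)
```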

Defined CRSP indexes use the following initial dates and levels:

  • CRSP Stock File Indexes: initial level 100.00, initial date December 29, 1972
  • CRSP Cap-Based Portfolios: initial level 1.00, initial date December 31, 1925
  • CRSP US Government Treasury and Inflation Indexes: initial level 100.00, initial date December 29, 1972

Publicly available indexes such as the S&P 500 Composite and NASDAQ Composite have initial values set by their creators and differ from the CRSP initializations.

Index Return

An Index Return is the change in value of a portfolio over some holding period. The return on an index (R_t) is calculated as the weighted average of the returns for the individual securities in the index:

R_t = ∑_i (w_{i,t} * r_{i,t}) / ∑_i w_{i,t}

where:

  • R_t is the index return
  • w_{i,t} is the weight of security i at time t
  • r_{i,t} is the return of security i at time t (see section xxxx of Stock guide for Security Return Calculation)

In a value-weighted index, the weight (w_{i,t}) assigned to each security is its total market value; see Index Weight below. In an equally-weighted index, every security has the same weight, and by convention w_{i,t} is set to one for every stock. Such an index would consist of n stocks, with the same dollar amount invested in each stock.
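
A minimal Python sketch of the two weighting schemes, assuming the weighted average is the weighted sum of security returns divided by the sum of the weights. The function name and the sample prices, shares, and returns are hypothetical.

```python
from typing import List


def index_return(weights: List[float], returns: List[float]) -> float:
    """Weighted average of security returns."""
    total_weight = sum(weights)
    return sum(w * r for w, r in zip(weights, returns)) / total_weight


# Hypothetical three-stock index.
prev_prices = [10.0, 20.0, 5.0]          # prices at the end of the previous period
prev_shares = [1000.0, 500.0, 4000.0]    # shares outstanding at the end of the previous period
rets = [0.02, -0.01, 0.03]               # security returns for the current period

# Value-weighted: weight is the previous-period market value of each security.
vw_weights = [p * s for p, s in zip(prev_prices, prev_shares)]
vw_return = index_return(vw_weights, rets)

# Equally-weighted: every weight is set to one.
ew_return = index_return([1.0] * len(rets), rets)
```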

The security returns can be total returns or capital appreciation (returns without dividends). This determines whether the index is a total return index or a capital appreciation index.

In an index where the individual components are not known, but an index level is available from an external source, such as the Standard & Poor’s 500 Composite Index, the return R_t is calculated as follows:

R_t = I_t / I_{t-1} - 1

where:

  • R_t is the index return for time t
  • I_t is the index level at time t
  • I_{t-1} is the index level at the end of the previous period (time t-1)

Index Weight

The weight of an index for a time period is the total market value of the securities in the index at the end of the previous trading period:

V_t = ∑_i w_{i,t} = ∑_i v_{i,t}, with v_{i,t} = p_{i,t-1} * s_{i,t-1}

where:

  • V_t is the weight of the index at time t
  • v_{i,t} is the value of security i at time t
  • p_{i,t-1} is the price of security i at the end of the previous trading period (time t-1)
  • s_{i,t-1} is the number of shares outstanding of security i at the end of the previous trading period (time t-1)

Market Capitalization

Market Capitalization (in 1000s) is a measurement of the size of a security defined as the price multiplied by the number of shares outstanding. CRSP uses the closing price or the absolute value of the bid/ask average from the Price or Bid/Ask Average variable and the applicable shares observation from the Shares Outstanding Observation Array for each calendar period to calculate Market Capitalization.
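
A small Python sketch of this calculation. The function name is hypothetical, and it assumes that shares outstanding are supplied in thousands, so that the result is also in thousands. The absolute value handles bid/ask averages, which are stored as negative numbers.

```python
def market_capitalization(price: float, shares_in_thousands: float) -> float:
    """Market capitalization in 1000s from a price (or negative bid/ask average)."""
    return abs(price) * shares_in_thousands


# Hypothetical observation: a bid/ask average of 10.25 (stored as -10.25)
# and 2,000 thousand shares outstanding.
cap = market_capitalization(-10.25, 2000.0)
```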

Rebasing Index Levels

It is possible to rebase an index to make index levels of two index level series comparable. To rebase an index, choose a new initial date and value, find the current index level on the new initial date, and multiply the levels on all dates by the new initial value divided by the original index level on the new initial date:

N_t = I_t * L / I_D

where:

  • I_t = Original Index Level for the series at time t
  • N_t = New Index Level for the series at time t
  • D = New Initial Date
  • I_D = Original Index Level for the series on the new initial date
  • L = New Initial Level
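
The rebasing step described above amounts to scaling every level by one constant. A minimal Python sketch, with a hypothetical function name and made-up levels:

```python
from typing import List


def rebase(levels: List[float], new_base: int, new_level: float) -> List[float]:
    """Scale a level series so that levels[new_base] becomes new_level."""
    scale = new_level / levels[new_base]
    return [lv * scale for lv in levels]


# Hypothetical level series, rebased so the second observation equals 1.0.
original = [100.0, 110.0, 121.0]
rebased = rebase(original, new_base=1, new_level=1.0)
```

Because every level is multiplied by the same constant, the period returns implied by the series are unchanged.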

Return

A Return is the change in the total value of an investment in a security over some period of time per dollar of initial investment. Total Return is the Holding Period Total Return for a sale of a security on the given day, taking into account and reinvesting all distributions to shareholders. It is based on a purchase on the most recent time previous to this day when the security had a valid price. Usually, this time is the previous calendar period, but may be up to ten calendar periods prior to the calculation.

Returns are calculated as follows:

For time t (a holding period), let

  • t’ = time of last available price < t
  • r(t) = return on purchase at t’, sale at t
  • p(t) = last sale price or closing bid/ask average at time t
  • d(t) = dividend amount for t
  • f(t) = factor to adjust price in period t
  • p(t’) = last sale price or closing bid/ask average at time of last available price < t

then

r(t) = (p(t) * f(t) + d(t)) / p(t’) - 1

t’ is usually one period before t, but t’ can be up to ten periods before t if there are no valid prices in the interval. If there is a trading gap with unknown status between t and t’, the previous price is considered invalid.

In daily databases, dividends are reinvested in the security on the Ex-Distribution Date. In monthly databases, the returns are holding period returns from month-end to month-end, not compounded daily returns, and dividends are reinvested in the security at month-end.

The Factor to Adjust Prices in Period is derived from the distribution history Factor to Adjust Price using all distributions with Ex-Distribution dates after the previous period and up to the end of the current period. The dividend amount is derived from the distribution history Dividend Cash Amount and Factor to Adjust Price in the same range. For example, if a 2-for-1 split is the only distribution event in the time range, Factor to Adjust Price is 1.0, Factor to Adjust Prices in Period is 2.0, and Dividend Cash Amount is 0.0. If a one dollar dividend is the only distribution event in the time range, both Dividend Cash Amount and dividend amount are 1.0.
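
The two examples in the paragraph above can be checked with a short Python sketch. It assumes the holding period return takes the form r(t) = (p(t) * f(t) + d(t)) / p(t’) - 1, which matches the description of the factor as the amount the current price is multiplied by. The function name and the prices are hypothetical, and prices are assumed to be positive (not negative bid/ask markers).

```python
def holding_period_return(p: float, p_prev: float,
                          d: float = 0.0, f: float = 1.0) -> float:
    """Return on a purchase at the previous valid price and a sale at p.

    d is the dividend amount for the period and f is the
    Factor to Adjust Prices in Period (1.0 when there is no split event).
    """
    return (p * f + d) / p_prev - 1.0


# 2-for-1 split only: the price halves, the factor is 2.0, no cash is paid.
split_only = holding_period_return(p=50.0, p_prev=100.0, d=0.0, f=2.0)

# One dollar dividend only: the price drops by the dividend, the factor is 1.0.
dividend_only = holding_period_return(p=99.0, p_prev=100.0, d=1.0, f=1.0)
```

In both hypothetical cases the investor's total value is unchanged, so the return is zero.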

A series of special return codes specify the reason a return is missing:

  • -66.0: Valid current price, but no valid previous price; either first price, unknown exchange between current and previous price, or more than 10 periods between time t and the time of the preceding price t’
  • -77.0: Not trading on the current exchange at time t
  • -88.0: Outside the security’s price range
  • -99.0: Missing return due to missing price at time t

Scholes-Williams Beta

Beta is a statistical measurement of the relationship between two time series, and has been used to compare security data with benchmark data to measure risk in financial data analysis. CRSP provides annual betas computed using the methods developed by Scholes and Williams (Myron Scholes and Joseph Williams, “Estimating Betas from Nonsynchronous Data,” Journal of Financial Economics, vol 5, 1977, 309-327).

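Beta is calculated each year as follows:

β_i = [∑(lr_{i,t} * M3_t) - (1/n_i) * (∑ lr_{i,t}) * (∑ M3_t)] / [∑(lM_t * M3_t) - (1/n_i) * (∑ lM_t) * (∑ M3_t)]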

where:

  • β_i is the Beta for security i for the year being calculated
  • r_{i,t} is the return of security i on day t
  • lr_{i,t} = ln(1 + r_{i,t}) is the natural log of one plus the return of security i on day t, that is, the continuously compounded return
  • M_t is the value-weighted market return on day t
  • lM_t = ln(1 + M_t) is the natural log of one plus the value-weighted market return on day t, that is, the continuously compounded return
  • M3_t = lM_{t-1} + lM_t + lM_{t+1} is the three-day moving window of the above market return
  • n_i is the number of non-missing returns for security i during the year

where the summations are over t and include all days on which security i traded, beginning with the first trading day of the year and ending with the last trading day of the year. There are two index families based on Scholes-Williams Beta calculations: NYSE/NYSE MKT and NASDAQ-only.
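
The quantities defined above can be combined in a short Python sketch. This is an illustration under explicit assumptions, not CRSP's implementation: it assumes beta is the ratio of the sample covariance of lr with M3 to the sample covariance of lM with M3, that the security has a non-missing return on every day supplied, and that the market series carries one extra day on each side so the three-day window is defined. The function name and data are hypothetical.

```python
import math
from typing import List


def scholes_williams_beta(sec_returns: List[float],
                          mkt_returns: List[float]) -> float:
    """Beta from daily returns, using a three-day market window.

    sec_returns: security returns for the days of the year.
    mkt_returns: value-weighted market returns for the same days, padded
                 with one extra day before and one after.
    """
    if len(mkt_returns) != len(sec_returns) + 2:
        raise ValueError("mkt_returns needs one extra day on each side")
    lr = [math.log1p(r) for r in sec_returns]      # ln(1 + r)
    lm_all = [math.log1p(m) for m in mkt_returns]  # ln(1 + M), padded
    lm = lm_all[1:-1]                              # aligned with lr
    # Three-day moving window centred on each day of the year.
    m3 = [lm_all[k] + lm_all[k + 1] + lm_all[k + 2] for k in range(len(lr))]
    n = len(lr)
    num = sum(a * b for a, b in zip(lr, m3)) - sum(lr) * sum(m3) / n
    den = sum(a * b for a, b in zip(lm, m3)) - sum(lm) * sum(m3) / n
    return num / den


# Hypothetical market returns: five days plus one padding day on each side.
market = [0.004, 0.010, 0.012, -0.008, -0.015, 0.003, 0.006]

# A security that exactly tracks the market has a beta of one.
beta_same = scholes_williams_beta(market[1:-1], market)
```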

In the NYSE/NYSE MKT family, only trading prices are considered in the beta calculation, and a security must have traded half the days in a year to be given a non-missing beta for that year. The index used in the calculation is the total returns on the Trade-only NYSE/NYSE MKT Value-Weighted Market Index.

Betas for the NASDAQ family do not use the standard Scholes-Williams trade-only data restriction, since most NASDAQ securities were not required to report transactions until 1992. Removing bid/ask averages would restrict NASDAQ data to only NASDAQ National Market securities after 1982. NASDAQ returns based on bid/ask averages have different characteristics from trade-based returns, and betas are provided for comparison. NASDAQ betas are based on the total returns on the NASDAQ Value-Weighted Market Index.

Standard Deviation

Standard Deviation is a statistical measurement of the volatility of a series. CRSP provides annual standard deviations of daily returns using the following calculations:

σ_i = sqrt( [∑ r_{i,t}^2 - (1/n_i) * (∑ r_{i,t})^2] / (n_i - 1) )

where:

  • σ_i is the standard deviation for security i for the year being calculated
  • r_{i,t} is the return of security i at time t
  • n_i is the number of non-missing returns for security i during the year

where the summations are over t and include all days on which security i had a non-missing return, beginning with the first trading day of the year and ending with the last trading day of the year. A security must have valid returns for eighty percent of the trading days in a year to have a Standard Deviation calculated. There are two families of indexes provided by CRSP with annual standard deviations as the statistic, the NYSE/NYSE MKT Standard Deviation Portfolios and the NASDAQ Standard Deviation Portfolios.
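
A Python sketch of an annual standard deviation with the eighty percent coverage rule. It assumes the sample form of the statistic (dividing by n - 1) and represents missing returns as None. The function name and data are hypothetical.

```python
import math
from typing import List, Optional


def annual_standard_deviation(
        daily_returns: List[Optional[float]]) -> Optional[float]:
    """Standard deviation of the non-missing daily returns in a year.

    daily_returns has one entry per trading day, with None for missing.
    Returns None when fewer than 80% of the days have valid returns.
    """
    valid = [r for r in daily_returns if r is not None]
    n = len(valid)
    if n < 2 or n < 0.8 * len(daily_returns):
        return None
    total = sum(valid)
    total_sq = sum(r * r for r in valid)
    # Guard against a tiny negative value caused by rounding.
    variance = max(0.0, (total_sq - total * total / n) / (n - 1))
    return math.sqrt(variance)


# Hypothetical five-day "year" with complete data.
sd = annual_standard_deviation([0.01, 0.03, 0.02, -0.01, 0.0])

# Too many missing days: no statistic is produced.
sd_missing = annual_standard_deviation([0.01, None, None, 0.02])
```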

Total Counts (totcnt) and Used Counts (usdcnt)

Total Counts and Used Counts are provided for all indexes and portfolios. The differences are as follows.

Total Count:

  • Current day closing price is required for inclusion.
  • On the same date, the Total Count will always be greater than or equal to the Used Count. The difference will be the number of securities with missing prices on the previous day (usually adds).
  • Total Count will fluctuate throughout the year.

Used Count:

  • Previous day and current day closing prices are required for inclusion.
  • The Total Count on day t will be greater than or equal to the Used Count on day t+1. The difference will be the number of securities with missing prices on t+1 (usually the drops).
  • Used Count will fluctuate throughout the year.

Total Value (totval) and Used Value (usdval)

Total Value and Used Value are provided for all CRSP stock indexes. The differences are as follows.

Total Value:

  • Current day market value of eligible securities. Price and shares for the current day are required for inclusion.
  • On the same date, the Total Value will always be greater than or equal to the Used Value.

Used Value:

  • For value-weighted indexes, this is the Index Weight: the market value of eligible securities. Price for the current day and price and shares for the previous day are required for inclusion.

Trade-Only Data

CRSP provides Price or Bid/Ask Average as the standard daily price field, and derives returns from this field. Bid/ask averages are marked as negative numbers by convention. A trade-only price is derived from Price or Bid/Ask Average by setting all bid/ask average prices to missing. Trade-only returns are calculated using trade-only prices. A trade-only index is calculated using trade-only prices and returns.
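
Deriving a trade-only price series from this convention is a simple filter. In the Python sketch below, missing values are represented by None and any non-positive value is treated as not being a trade price. Both choices, like the function name and data, are assumptions made for illustration.

```python
from typing import List, Optional


def trade_only_prices(prices: List[Optional[float]]) -> List[Optional[float]]:
    """Keep trade prices; set bid/ask averages (negative values) to missing."""
    return [p if p is not None and p > 0 else None for p in prices]


# Hypothetical Price or Bid/Ask Average series: the second value is a
# bid/ask average, marked with a negative sign.
raw = [10.0, -10.25, 10.5]
trades = trade_only_prices(raw)
```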

Unadjusted Data

Unadjusted Data is price, dividend, shares, and volume data expressed in the amounts reported at the time of the observations. All CRSP data are provided unadjusted. However, the distribution history can be used to generate Adjusted Data from the raw data.

Weighted Return

Weighted Return is the relative weight of a security within a portfolio or index multiplied by its return. In a value-weighted portfolio, Weighted Return is the capitalization at the end of the previous period multiplied by the return for the period.