Corrigendum to Pustejovsky and Tipton (2018), redux

A revised version of Theorem 2

robust variance estimation
econometrics
matrix algebra
Author: James E. Pustejovsky

Published: November 7, 2022

UPDATE, March 8, 2023

The correction to our paper has now been published in the Journal of Business and Economic Statistics. It is available at https://doi.org/10.1080/07350015.2023.2174123.

In my 2018 paper with Beth Tipton, published in the Journal of Business and Economic Statistics, we considered how to do cluster-robust variance estimation in fixed effects models estimated by weighted (or unweighted) least squares. As explained in my previous post, we were recently alerted that Theorem 2 in the paper is incorrect as stated. It turns out that the conditions in the original version of the theorem are too general. A more limited version of the theorem does hold, but only for models estimated using ordinary (unweighted) least squares, under a working model that assumes independent, homoskedastic errors. In this post, I’ll give the revised theorem, following the notation and setup of the previous post (so it is best to read that first, or what follows won’t make much sense!).

Theorem 2, revised

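Consider the model
$$\mathbf{y}_i = \mathbf{R}_i \boldsymbol\beta + \mathbf{S}_i \boldsymbol\gamma + \mathbf{T}_i \boldsymbol\mu + \boldsymbol\epsilon_i, \tag{1}$$
where $\mathbf{y}_i$ is an $n_i \times 1$ vector of responses for cluster $i$, $\mathbf{R}_i$ is an $n_i \times r$ matrix of focal predictors, $\mathbf{S}_i$ is an $n_i \times s$ matrix of additional covariates that vary across multiple clusters, and $\mathbf{T}_i$ is an $n_i \times t$ matrix encoding cluster-specific fixed effects, all for $i = 1,...,m$. Let $\mathbf{U}_i = [\mathbf{R}_i \ \mathbf{S}_i]$ be the set of predictors that vary across clusters and $\mathbf{X}_i = [\mathbf{R}_i \ \mathbf{S}_i \ \mathbf{T}_i]$ be the full set of predictors. Let $\ddot{\mathbf{U}}_i = (\mathbf{I}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i') \mathbf{U}_i$ be an absorbed version of the focal predictors and the covariates. The cluster-robust variance estimator for the coefficients of $\mathbf{U}_i$ is
$$\mathbf{V}^{CR2} = \mathbf{M}_{\ddot{\mathbf{U}}} \left(\sum_{i=1}^m \ddot{\mathbf{U}}_i' \mathbf{W}_i \mathbf{A}_i \mathbf{e}_i \mathbf{e}_i' \mathbf{A}_i \mathbf{W}_i \ddot{\mathbf{U}}_i \right) \mathbf{M}_{\ddot{\mathbf{U}}}, \tag{2}$$
where $\mathbf{A}_1,...,\mathbf{A}_m$ are the CR2 adjustment matrices.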

If we assume a working model in which $\boldsymbol\Psi_i = \sigma^2 \mathbf{I}_i$ for $i = 1,...,m$ and estimate the model by ordinary least squares, then the CR2 adjustment matrices have a fairly simple form:
$$\mathbf{A}_i = \left(\mathbf{I}_i - \mathbf{X}_i \mathbf{M}_{\mathbf{X}} \mathbf{X}_i'\right)^{+1/2}, \tag{3}$$
where $\mathbf{B}^{+1/2}$ is the symmetric square root of the Moore-Penrose inverse of $\mathbf{B}$. However, this form is computationally expensive because it involves the full set of predictors, $\mathbf{X}_i$, including the cluster-specific fixed effects $\mathbf{T}_i$. If the model is estimated after absorbing the cluster-specific fixed effects, then it would be convenient to use the adjustment matrices based on the absorbed predictors only,
$$\tilde{\mathbf{A}}_i = \left(\mathbf{I}_i - \ddot{\mathbf{U}}_i \mathbf{M}_{\ddot{\mathbf{U}}} \ddot{\mathbf{U}}_i'\right)^{+1/2}. \tag{4}$$
The original version of Theorem 2 asserted that $\mathbf{A}_i = \tilde{\mathbf{A}}_i$, which is not actually the case. However, for ordinary least squares with the independent, homoskedastic working model, we can show that $\mathbf{A}_i \ddot{\mathbf{U}}_i = \tilde{\mathbf{A}}_i \ddot{\mathbf{U}}_i$. Thus, it doesn’t matter whether we use $\mathbf{A}_i$ or $\tilde{\mathbf{A}}_i$ to calculate the cluster-robust variance estimator. We’ll get the same result either way, but $\tilde{\mathbf{A}}_i$ is a bit easier to compute.
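The distinction between the two adjustment matrices can be checked numerically. The following is a minimal sketch in Python with NumPy, not code from the paper. It assumes simulated data, cluster-specific intercepts as the only fixed effects, and the ordinary least squares definitions M_T = (T'T)^{-1}, M_U = (U'U)^{-1} for the absorbed predictors, and M_X = (X'X)^{-1}. The helper name `sym_pinv_sqrt` is hypothetical.

```python
import numpy as np

def sym_pinv_sqrt(B, tol=1e-10):
    """Symmetric square root of the Moore-Penrose inverse of a symmetric PSD matrix."""
    vals, vecs = np.linalg.eigh(B)
    inv_sqrt = np.zeros_like(vals)
    keep = vals > tol                      # treat tiny eigenvalues as exact zeros
    inv_sqrt[keep] = 1.0 / np.sqrt(vals[keep])
    return (vecs * inv_sqrt) @ vecs.T

rng = np.random.default_rng(2022)
m, n_i, p = 6, 5, 2                        # clusters, obs per cluster, r + s (assumed sizes)
cluster = np.repeat(np.arange(m), n_i)
U = rng.normal(size=(m * n_i, p))          # simulated focal predictors and covariates
T = (cluster[:, None] == np.arange(m)).astype(float)  # cluster dummies
X = np.hstack([U, T])

M_T = np.linalg.inv(T.T @ T)
U_dd = U - T @ M_T @ T.T @ U               # absorbed predictors
M_U = np.linalg.inv(U_dd.T @ U_dd)
M_X = np.linalg.inv(X.T @ X)

same_matrix, same_product = [], []
for i in range(m):
    rows = cluster == i
    X_i, U_i = X[rows], U_dd[rows]
    I_i = np.eye(n_i)
    A_i = sym_pinv_sqrt(I_i - X_i @ M_X @ X_i.T)   # equation (3)
    A_t = sym_pinv_sqrt(I_i - U_i @ M_U @ U_i.T)   # equation (4)
    same_matrix.append(bool(np.allclose(A_i, A_t)))
    same_product.append(bool(np.allclose(A_i @ U_i, A_t @ U_i)))

print("A_i equals A-tilde_i:", same_matrix)
print("A_i U_i equals A-tilde_i U_i:", same_product)
```

In this setup the matrices themselves differ in every cluster, while their products with the absorbed predictors agree up to floating-point tolerance.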

Here’s a formal statement of Theorem 2:

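Let $\mathbf{L}_i = \left(\ddot{\mathbf{U}}'\ddot{\mathbf{U}} - \ddot{\mathbf{U}}_i'\ddot{\mathbf{U}}_i\right)$ and assume that $\mathbf{L}_1,...,\mathbf{L}_m$ have full rank $r + s$. If $\mathbf{W}_i = \mathbf{I}_i$ and $\boldsymbol\Phi_i = \mathbf{I}_i$ for $i = 1,...,m$, then $\mathbf{A}_i \ddot{\mathbf{U}}_i = \tilde{\mathbf{A}}_i \ddot{\mathbf{U}}_i$, where $\mathbf{A}_i$ and $\tilde{\mathbf{A}}_i$ are as defined in (3) and (4), respectively.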
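The rank condition on the leave-one-cluster-out cross-products is easy to check for a given design. Here is a small sketch, assuming simulated predictors and cluster-specific intercepts as the fixed effects (so that absorbing them amounts to demeaning within cluster). All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(31)
m, n_i, p = 6, 5, 2                        # clusters, obs per cluster, r + s (assumed sizes)
cluster = np.repeat(np.arange(m), n_i)
U = rng.normal(size=(m * n_i, p))          # simulated predictors

# Absorb cluster intercepts by demeaning each column of U within cluster.
cluster_means = np.vstack([U[cluster == i].mean(axis=0) for i in range(m)])
U_dd = U - cluster_means[cluster]

total = U_dd.T @ U_dd
ranks = []
for i in range(m):
    U_i = U_dd[cluster == i]
    L_i = total - U_i.T @ U_i              # cross-product from all clusters except i
    ranks.append(int(np.linalg.matrix_rank(L_i)))

print(ranks)                               # the theorem requires every entry to equal p
```

The condition would fail if, for example, some predictor varied within only one cluster, since dropping that cluster would leave a zero column in the absorbed design.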

Proof

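We can prove this revised Theorem 2 by showing how $\mathbf{A}_i$ can be constructed in terms of $\tilde{\mathbf{A}}_i$ and $\mathbf{T}_i$. First, because $\mathbf{T}_i \mathbf{T}_k' = \mathbf{0}$ for any $i \neq k$, it follows that $\mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'$ is idempotent, i.e., $\mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i' \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i' = \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'$.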
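The idempotency step can be illustrated with fixed effects that are a little richer than plain cluster intercepts. The sketch below assumes each cluster has its own intercept and its own slope on a simulated variable, so the full fixed effects matrix is block diagonal, and it takes M_T = (T'T)^{-1} as in ordinary least squares. The variable `trend` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
m, n_i = 4, 6                              # clusters, obs per cluster (assumed sizes)
trend = rng.normal(size=(m, n_i))          # hypothetical within-cluster trend variable

# Full T is block diagonal: cluster i gets its own intercept and slope columns.
T = np.zeros((m * n_i, 2 * m))
for i in range(m):
    rows = slice(i * n_i, (i + 1) * n_i)
    T[rows, 2 * i] = 1.0
    T[rows, 2 * i + 1] = trend[i]

M_T = np.linalg.inv(T.T @ T)
T_blocks = [T[i * n_i:(i + 1) * n_i] for i in range(m)]

# T_i T_k' = 0 for every pair of distinct clusters.
cross_zero = all(
    np.allclose(T_blocks[i] @ T_blocks[k].T, 0.0)
    for i in range(m) for k in range(m) if i != k
)

# Each T_i M_T T_i' is then idempotent.
projections = [T_i @ M_T @ T_i.T for T_i in T_blocks]
idempotent = all(np.allclose(P @ P, P) for P in projections)

print(cross_zero, idempotent)
```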

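Next, denote the thin QR decomposition of $\ddot{\mathbf{U}}_i$ as $\mathbf{Q}_i \mathbf{R}_i$, where $\mathbf{Q}_i$ is semi-orthogonal ($\mathbf{Q}_i'\mathbf{Q}_i = \mathbf{I}$) and $\mathbf{R}_i$ has the same rank as $\ddot{\mathbf{U}}_i$. Let $\tilde{\mathbf{B}}_i = \mathbf{I}_i - \ddot{\mathbf{U}}_i \mathbf{M}_{\ddot{\mathbf{U}}} \ddot{\mathbf{U}}_i'$ and observe that this can be written as
$$\tilde{\mathbf{B}}_i = \mathbf{I}_i - \mathbf{Q}_i \mathbf{Q}_i' + \mathbf{Q}_i \left(\mathbf{I} - \mathbf{R}_i \mathbf{M}_{\ddot{\mathbf{U}}} \mathbf{R}_i'\right) \mathbf{Q}_i'.$$
It can then be seen that
$$\tilde{\mathbf{A}}_i = \tilde{\mathbf{B}}_i^{+1/2} = \mathbf{I}_i - \mathbf{Q}_i \mathbf{Q}_i' + \mathbf{Q}_i \left(\mathbf{I} - \mathbf{R}_i \mathbf{M}_{\ddot{\mathbf{U}}} \mathbf{R}_i'\right)^{+1/2} \mathbf{Q}_i'.$$
It follows that $\tilde{\mathbf{A}}_i \mathbf{T}_i = \mathbf{T}_i$ because $\mathbf{Q}_i'\mathbf{T}_i = \mathbf{0}$. Further, $\tilde{\mathbf{B}}_i \mathbf{T}_i = \mathbf{T}_i$ as well.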
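The QR representation can be verified numerically as well. This is a sketch under the same assumptions as before (simulated data, cluster intercepts as fixed effects, ordinary least squares), with the additional assumption that each absorbed block has full column rank so that NumPy's reduced QR matches the thin QR used in the proof. The helper `sym_pinv_sqrt` is a hypothetical name.

```python
import numpy as np

def sym_pinv_sqrt(B, tol=1e-10):
    """Symmetric square root of the Moore-Penrose inverse of a symmetric PSD matrix."""
    vals, vecs = np.linalg.eigh(B)
    inv_sqrt = np.zeros_like(vals)
    keep = vals > tol
    inv_sqrt[keep] = 1.0 / np.sqrt(vals[keep])
    return (vecs * inv_sqrt) @ vecs.T

rng = np.random.default_rng(37)
m, n_i, p = 6, 5, 2                        # clusters, obs per cluster, r + s (assumed sizes)
cluster = np.repeat(np.arange(m), n_i)
U = rng.normal(size=(m * n_i, p))
T = (cluster[:, None] == np.arange(m)).astype(float)
U_dd = U - T @ np.linalg.inv(T.T @ T) @ T.T @ U   # absorbed predictors
M_U = np.linalg.inv(U_dd.T @ U_dd)

checks = []
for i in range(m):
    rows = cluster == i
    U_i, T_i = U_dd[rows], T[rows]
    Q_i, R_i = np.linalg.qr(U_i)           # reduced QR: Q_i is n_i x p, R_i is p x p
    I_n, I_p = np.eye(n_i), np.eye(p)
    core = I_p - R_i @ M_U @ R_i.T
    B_t = I_n - U_i @ M_U @ U_i.T          # B-tilde_i computed directly
    B_qr = I_n - Q_i @ Q_i.T + Q_i @ core @ Q_i.T
    A_qr = I_n - Q_i @ Q_i.T + Q_i @ sym_pinv_sqrt(core) @ Q_i.T
    checks.append((
        bool(np.allclose(B_t, B_qr)),                  # QR form of B-tilde_i
        bool(np.allclose(A_qr, sym_pinv_sqrt(B_t))),   # QR form of A-tilde_i
        bool(np.allclose(A_qr @ T_i, T_i)),            # A-tilde_i T_i = T_i
        bool(np.allclose(B_t @ T_i, T_i)),             # B-tilde_i T_i = T_i
    ))

print(checks)
```

One practical upside of the QR form is that the matrix square root is taken on a p x p matrix instead of an n_i x n_i one.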

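Now, let $\mathbf{B}_i = \left(\mathbf{I}_i - \mathbf{X}_i \mathbf{M}_{\mathbf{X}} \mathbf{X}_i'\right)$ and observe that this can be written as
$$\mathbf{B}_i = \mathbf{I}_i - \ddot{\mathbf{U}}_i \mathbf{M}_{\ddot{\mathbf{U}}} \ddot{\mathbf{U}}_i' - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i' = \tilde{\mathbf{B}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'$$
because $\ddot{\mathbf{U}}_i'\mathbf{T}_i = \mathbf{0}$.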

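We then construct the full adjustment matrix $\mathbf{A}_i$ as
$$\mathbf{A}_i = \tilde{\mathbf{A}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'. \tag{5}$$
Showing that $\mathbf{B}_i \mathbf{A}_i \mathbf{B}_i \mathbf{A}_i = \mathbf{B}_i$ will suffice to verify that $\mathbf{A}_i$ is the symmetric square root of the Moore-Penrose inverse of $\mathbf{B}_i$. Because $\mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'$ is idempotent, $\tilde{\mathbf{B}}_i \mathbf{T}_i = \mathbf{T}_i$, and $\tilde{\mathbf{A}}_i \mathbf{T}_i = \mathbf{T}_i$, we have
$$\begin{aligned}
\mathbf{B}_i \mathbf{A}_i \mathbf{B}_i \mathbf{A}_i &= \left(\tilde{\mathbf{B}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'\right)\left(\tilde{\mathbf{A}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'\right)\left(\tilde{\mathbf{B}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'\right)\left(\tilde{\mathbf{A}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'\right) \\
&= \left(\tilde{\mathbf{B}}_i \tilde{\mathbf{A}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'\right)\left(\tilde{\mathbf{B}}_i \tilde{\mathbf{A}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'\right) \\
&= \left(\tilde{\mathbf{B}}_i \tilde{\mathbf{A}}_i \tilde{\mathbf{B}}_i \tilde{\mathbf{A}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'\right) \\
&= \left(\tilde{\mathbf{B}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i'\right) \\
&= \mathbf{B}_i.
\end{aligned}$$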
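The construction in (5) can also be compared against a direct computation of the adjustment matrix from the full design. The sketch below again assumes simulated data, cluster intercepts as the fixed effects, and the ordinary least squares definitions of M_T, M_X, and M_U for the absorbed predictors. The helper `sym_pinv_sqrt` is a hypothetical name.

```python
import numpy as np

def sym_pinv_sqrt(B, tol=1e-10):
    """Symmetric square root of the Moore-Penrose inverse of a symmetric PSD matrix."""
    vals, vecs = np.linalg.eigh(B)
    inv_sqrt = np.zeros_like(vals)
    keep = vals > tol                      # B_i is singular, so zero eigenvalues are dropped
    inv_sqrt[keep] = 1.0 / np.sqrt(vals[keep])
    return (vecs * inv_sqrt) @ vecs.T

rng = np.random.default_rng(41)
m, n_i, p = 6, 5, 2                        # clusters, obs per cluster, r + s (assumed sizes)
cluster = np.repeat(np.arange(m), n_i)
U = rng.normal(size=(m * n_i, p))
T = (cluster[:, None] == np.arange(m)).astype(float)
X = np.hstack([U, T])

M_T = np.linalg.inv(T.T @ T)
M_X = np.linalg.inv(X.T @ X)
U_dd = U - T @ M_T @ T.T @ U               # absorbed predictors
M_U = np.linalg.inv(U_dd.T @ U_dd)

matches_direct, satisfies_identity, same_product = [], [], []
for i in range(m):
    rows = cluster == i
    X_i, U_i, T_i = X[rows], U_dd[rows], T[rows]
    I_i = np.eye(n_i)
    P_i = T_i @ M_T @ T_i.T
    B_i = I_i - X_i @ M_X @ X_i.T
    A_t = sym_pinv_sqrt(I_i - U_i @ M_U @ U_i.T)   # A-tilde_i, equation (4)
    A_c = A_t - P_i                                # constructed A_i, equation (5)
    matches_direct.append(bool(np.allclose(A_c, sym_pinv_sqrt(B_i))))
    satisfies_identity.append(bool(np.allclose(B_i @ A_c @ B_i @ A_c, B_i)))
    same_product.append(bool(np.allclose(A_c @ U_i, A_t @ U_i)))

print(matches_direct, satisfies_identity, same_product)
```

The third check is the statement of the revised theorem: subtracting the fixed effects projection changes the adjustment matrix but not its product with the absorbed predictors.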

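From the representation of $\mathbf{A}_i$ in (5), it is clear that $\mathbf{A}_i \ddot{\mathbf{U}}_i = \tilde{\mathbf{A}}_i \ddot{\mathbf{U}}_i - \mathbf{T}_i \mathbf{M}_{\mathbf{T}} \mathbf{T}_i' \ddot{\mathbf{U}}_i = \tilde{\mathbf{A}}_i \ddot{\mathbf{U}}_i$, where the last equality holds because $\mathbf{T}_i'\ddot{\mathbf{U}}_i = \mathbf{0}$.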
