lfe2fixest

Start by loading the package. The example scripts that follow assume that lfe, fixest, and modelsummary are all installed on your system.

library(lfe2fixest) ## This package

## Aside: Make sure you have the following packages installed on your system if
## you want to run the example scripts below:
## library(lfe); library(fixest); library(modelsummary)
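If you are not sure whether those packages are available, a quick check along the following lines may help. This is only a sketch in base R; the variable names (`pkgs`, `installed`, `missing_pkgs`) are illustrative and not part of lfe2fixest.

```r
## Packages that the example scripts below rely on
pkgs = c("lfe", "fixest", "modelsummary")

## requireNamespace() returns TRUE/FALSE instead of erroring when a package is missing
installed = vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs = pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Install before running the examples: ",
          paste(missing_pkgs, collapse = ", "))
}
```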

Let’s create an lfe-based R script that is deliberately messy (inconsistent formatting, etc.) to pose an additional challenge for the conversion.

lfe_string = "
library(lfe)
library(modelsummary)

## Toy dataset
aq = airquality
names(aq) = c('y', 'x1', 'x2', 'x3', 'mnth', 'dy')

## Simple OLS
mod1 = felm(y ~ x1 + x2, aq)

## Add FE & cluster var
mod2 = felm(y ~ x1 + x2 |
              dy |
              0 |
              mnth, aq)

## Add 2nd cluster var & some estimation options
mod3 = felm(y ~ x1 + x2 |
              dy |
              0 |
              dy + mnth,
            cmethod = 'reghdfe',
            exactDOF = TRUE,     ## Irrelevant for feols (should be ignored)
            aq)

## IV reg with weights
mod4 = felm(y ~ 1 |
              dy |
              (x1 ~ x3) |
              mnth,
            weights = aq$x2,
            data = aq
            )

## Multiple IV
mod5 = felm(y ~ 1 |
              0 |
              (x1|x2 ~ x3 + dy + mnth) |
              dy,
            data = aq
            )

## Regression table
mods = list(mod1, mod2, mod3, mod4, mod5)
msummary(mods, gof_omit = 'Pseudo|Within|Log|IC', output = 'markdown')
"
writeLines(lfe_string, 'lfe_script.R')

We can now convert this script to the fixest equivalent using the package’s main function, lfe2fixest(), or its alias, felm2feols(). Both accept several arguments, but only the input file is required. If no output file is provided, the conversion results are simply printed to the screen.

# felm2feols('lfe_script.R') ## same thing
lfe2fixest('lfe_script.R')
#> 
#> library(fixest)
#> library(modelsummary)
#> 
#> ## Toy dataset
#> aq = airquality
#> names(aq) = c('y', 'x1', 'x2', 'x3', 'mnth', 'dy')
#> 
#> ## Simple OLS
#> mod1 = feols(y ~ x1 + x2,  data = aq)
#> 
#> ## Add FE & cluster var
#> mod2 = feols(y ~ x1 + x2 | dy, cluster = ~mnth,  data = aq)
#> 
#> ## Add 2nd cluster var & some estimation options
#> mod3 = feols(y ~ x1 + x2 | dy, cluster = ~dy + mnth,  data =
#>             aq)
#> 
#> ## IV reg with weights
#> mod4 = feols(y ~ 1 | dy | x1 ~ x3, cluster = ~mnth, weights = aq$x2, data = aq )
#> 
#> ## Multiple IV
#> mod5 = feols(y ~ 1 | x1 + x2 ~ x3 + dy + mnth, cluster = ~dy, data = aq )
#> 
#> ## Regression table
#> mods = list(mod1, mod2, mod3, mod4, mod5)
#> msummary(mods, gof_omit = 'Pseudo|Within|Log|IC', output = 'markdown')

Looks good. Note that the converted feols() calls (formerly felm()) have been cleaned up, with comments removed and everything collapsed onto a single line.¹ Let’s write it to disk by supplying an output file this time.

# felm2feols(infile = 'lfe_script.R', outfile = 'fixest_script.R') ## same thing
lfe2fixest(infile = 'lfe_script.R', outfile = 'fixest_script.R')
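To make the mapping in the printed conversion concrete, here is a toy sketch of how the four parts of a felm() formula (main | fixed effects | IV | cluster) line up with a feols() formula plus a `cluster` argument. The helper `felm_parts_to_feols()` is hypothetical and is not how lfe2fixest is implemented. It takes the parts as separate strings and ignores everything else, such as `data`, `weights` and comments.

```r
## Hypothetical helper (NOT part of lfe2fixest): rebuild the feols() formula and
## cluster argument from the four '|'-separated parts of a felm() formula.
felm_parts_to_feols = function(main, fe = "0", iv = "0", cl = "0") {
  fml = main
  ## felm uses 0 as a placeholder for an empty slot; feols simply omits it
  if (fe != "0") fml = paste(fml, "|", fe)
  if (iv != "0") {
    iv = gsub("^\\(|\\)$", "", trimws(iv)) ## drop the enclosing parentheses
    iv = gsub("\\s*\\|\\s*", " + ", iv)    ## (x1|x2 ~ z) becomes x1 + x2 ~ z
    fml = paste(fml, "|", iv)
  }
  ## the cluster slot moves out of the formula into its own argument
  if (cl != "0") fml = paste0(fml, ", cluster = ~", cl)
  fml
}

felm_parts_to_feols("y ~ x1 + x2", fe = "dy", cl = "mnth")
#> [1] "y ~ x1 + x2 | dy, cluster = ~mnth"
felm_parts_to_feols("y ~ 1", iv = "(x1|x2 ~ x3 + dy + mnth)", cl = "dy")
#> [1] "y ~ 1 | x1 + x2 ~ x3 + dy + mnth, cluster = ~dy"
```

The two calls reproduce the formula and cluster portions of mod2 and mod5 in the printed conversion.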

Note that lfe2fixest() is a pure conversion function: it never actually runs anything from either the input or output files. That said, here’s a quick comparison of the resulting regressions, i.e. what we get if we actually do run the scripts. As an aside, my scripts use the excellent modelsummary package to generate the simple regression tables that you see below, although we’re really not showing off its functionality here.

First, the original lfe version:

source('lfe_script.R', print.eval = TRUE)
#> 
#> 
#> |            | Model 1 | Model 2  |    Model 3    | Model 4  | Model 5  |
#> |:-----------|:-------:|:--------:|:-------------:|:--------:|:--------:|
#> |(Intercept) | 77.246  |          |               |          |  93.452  |
#> |            | (9.068) |          |               |          | (65.757) |
#> |x1          |  0.100  |  0.099   |     0.099     |          |          |
#> |            | (0.026) | (0.031)  |    (0.029)    |          |          |
#> |x2          | -5.402  |  -5.577  |    -5.577     |          |          |
#> |            | (0.673) | (1.100)  |    (1.053)    |          |          |
#> |`x1(fit)`   |         |          |               |  0.733   |  0.236   |
#> |            |         |          |               | (0.254)  | (0.179)  |
#> |`x2(fit)`   |         |          |               |          |  -9.558  |
#> |            |         |          |               |          | (3.510)  |
#> |Num.Obs.    |   111   |   111    |      111      |   111    |   111    |
#> |R2          |  0.449  |  0.665   |     0.665     |  -2.658  |  0.071   |
#> |R2 Adj.     |  0.439  |  0.527   |     0.527     |  -4.093  |  0.054   |
#> |Std.Errors  |         | by: mnth | by: dy & mnth | by: mnth |  by: dy  |

Second, the fixest conversion:

source('fixest_script.R', print.eval = TRUE)
#> 
#> 
#> |            | Model 1 | Model 2  |    Model 3    | Model 4  | Model 5  |
#> |:-----------|:-------:|:--------:|:-------------:|:--------:|:--------:|
#> |(Intercept) | 77.246  |          |               |          |  93.452  |
#> |            | (9.068) |          |               |          | (65.757) |
#> |x1          |  0.100  |  0.099   |     0.099     |          |          |
#> |            | (0.026) | (0.031)  |    (0.029)    |          |          |
#> |x2          | -5.402  |  -5.577  |    -5.577     |          |          |
#> |            | (0.673) | (1.100)  |    (1.053)    |          |          |
#> |fit_x1      |         |          |               |  0.733   |  0.236   |
#> |            |         |          |               | (0.254)  | (0.179)  |
#> |fit_x2      |         |          |               |          |  -9.558  |
#> |            |         |          |               |          | (3.510)  |
#> |Num.Obs.    |   111   |   111    |      111      |   111    |   111    |
#> |R2          |  0.449  |  0.665   |     0.665     |  -2.658  |  0.071   |
#> |R2 Adj.     |  0.439  |  0.527   |     0.527     |  -4.093  |  0.054   |
#> |Std.Errors  |   IID   | by: mnth | by: dy & mnth | by: mnth |  by: dy  |
#> |FE: dy      |         |    X     |       X       |    X     |          |

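Some minor formatting differences aside (the fixest table labels the fitted IV terms fit_x1 and fit_x2, marks Model 1’s standard errors as IID, and adds a fixed-effects row), it looks like the conversion worked and we get the exact same results from both scripts. Great!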
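If you would rather not compare tables by eye, the coefficients can also be checked programmatically. The sketch below does this for the simple OLS model only, and only if both lfe and fixest are installed. The object names (`have_pkgs`, `m_lfe`, `m_fixest`, `same_coefs`) are illustrative, and the tolerance is an arbitrary choice.

```r
## Sketch only: the comparison runs if both estimation packages are available
have_pkgs = requireNamespace("lfe", quietly = TRUE) &&
  requireNamespace("fixest", quietly = TRUE)

if (have_pkgs) {
  ## Same toy dataset as in the example scripts
  aq = airquality
  names(aq) = c('y', 'x1', 'x2', 'x3', 'mnth', 'dy')

  m_lfe    = lfe::felm(y ~ x1 + x2, aq)
  m_fixest = fixest::feols(y ~ x1 + x2, data = aq)

  ## as.numeric() drops names and dimensions so that only the values are compared
  same_coefs = all.equal(as.numeric(coef(m_lfe)),
                         as.numeric(coef(m_fixest)),
                         tolerance = 1e-6)
  print(same_coefs)
}
```

The same pattern should extend to the other models, though coefficient names (e.g. for the fitted IV terms) differ between the two packages, which is why names are dropped before comparing.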

Let’s clean up before closing.

file.remove(c('lfe_script.R', 'fixest_script.R'))
#> [1] TRUE TRUE