Resources

The internet abounds with helpful resources for learning data science, programming, etc. For classical statistics, however, I find most of the popular online resources fairly cursory. My hunch is that accessible, detailed resources are scarce because statistics for most professionals nowadays boils down to A/B testing (i.e., the new name for Stats 101 hypothesis testing) and predictive machine learning models. This leaves people like me, who work more on the inference side, high and dry when we find ourselves wondering why a particular method isn’t working, or when we’re trying to improve our inference for a future analysis.

Udemy and the like are helpful as a starting point when learning about new statistical methods, but I find that individual blog posts, forums like Cross Validated, and (unfortunately :P) academic papers are the best resources for insightful discussions by experts on the finer details of statistical methods.

This page is my attempt to compile helpful, lesser-known resources in one place. I’ve tried to select only resources that are approachable enough for me to understand, and therefore actually useful to my work as a statistician.

General Statistics Stuff

“Basic” hypothesis testing, ordinary Gaussian linear regression and its modifications, variable coding, power calculations, and other general philosophical considerations.

Analyzing Experimental Data/Post-hoc comparisons

Focusing on estimating marginal means and multiplicity adjustments for experimental data. The resources in the Mixed Effects Models section below can also be used to analyze experimental data (the ones from Keith Lohse are a good place to start).

GLMs

General considerations on GLMs, what they mean, how they are computed, deciding on distributions and link functions, etc.

Mixed Effects Models

LMMs and GLMMs, and their various considerations. This topic can get very hairy, so there are a ton of resources out there. Some are more helpful than others.

Ben Bolker is the GOAT when it comes to mixed effects models; this lecture is a good place to start
He has a bunch of textbooks too, none of which I have read. Some of the supplementary material is available online though
For getting started with mixed effects models in R, the CRAN page is actually a great resource, with brief explanations of the overwhelming number of different packages
lme4 and nlme are the most commonly used R packages; here is a good answer summarizing their strengths and weaknesses
A nice conceptual tutorial on LMMs with R examples is provided here (On the same site, the author also has some nice articles on the difference between ML and REML that I found interesting, as well as a comparison of LMMs + bootstrapping and Bayesian hierarchical models)
Mixed effects model tutorials for planned factorial designs (Keith Lohse is a professor at WashU and has a YouTube channel that walks through a lot of these workshops)
A workshop by University of North Dakota on power analyses in GLMMs using the simr package, emphasizing tests of fixed effects
An insightful presentation by Bates on why lme4 doesn’t have p-values/CIs/etc. on variance components
A nice answer on using bootstrapping to make CIs for lmer models
A discussion of crossed vs. nested random effects
Paper comparing conditional models (e.g. GLMMs) and marginal models (e.g. GEEs), arguing that conditional models are the logical superset of models
An SE answer also talking about conditional models vs. marginal models
Yet another one, but this one I think is more cogent
This one gets right to the point in binary model setting, no fluff
Paper from Gelman about the strengths (prediction) and limitations (causal inference in observational data) of multilevel models
A paper on Kenward-Roger degrees of freedom when doing hypothesis testing with LMMs
Satterthwaite vs. Kenward-Roger
An SE discussion of GLMMs vs. GLS models
Doug Bates doesn’t like the term BLUP and prefers “conditional mode”
A discussion of compound symmetry in the covariance matrix, and how it makes certain models equivalent
Speed test! Fitting mixed effects models in Python, Julia, and R
Really interesting post on negative intraclass correlations in GLMMs

GEEs

Marginal models, when you should use them, and how they stack up conceptually with other models like GLMMs.

Bayesian Statistics

Emphasis on McElreath’s Statistical Rethinking course.

Intro resources for Bayesian statistics in R (rstan, brms), with basic examples
McElreath’s Statistical Rethinking but using brms
Paper on Bayesian model evaluation using LOOCV (haven’t read this yet, but might eventually get to it)
What does Frank Harrell mean when he says “make the sample size a random variable when possible?”
Vignette for estimating nonlinear models in brms
p-values vs. Bayes Factors

Survival Models

Robust Statistics

Nonlinear modeling (nls, GAMs, NLMMs)

R Programming and Shiny

If you are subscribed to Shiny tags on LinkedIn, you’ve probably seen this guy evangelizing his “truly open source” alternative to Posit Connect and Appsilon. To be fair, he probably knows what he’s doing.

Miscellaneous Interesting Things

Basically like the general statistics section up above, but this section goes into more advanced “niche” topics.

Nightmares from MS program

Things that remind me of probability classes in grad school. Surprisingly enough, grad school probability (beyond knowing pbinom and the like) can occasionally be pretty useful in a workplace context, though I prefer to estimate probabilities through simulation when needed.
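As a rough illustration of estimating a probability through simulation, here is a minimal sketch in Python (rather than R, purely so it runs with the standard library alone). It compares an exact binomial CDF, the quantity R’s pbinom returns, against a Monte Carlo estimate. The function names, the example values (k = 3, n = 10, p = 0.5), the number of simulations, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), analogous to R's pbinom(k, n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def simulate_binom_cdf(k, n, p, n_sims=20_000, seed=1):
    """Monte Carlo estimate of P(X <= k): simulate n Bernoulli(p) trials many times."""
    rng = random.Random(seed)  # local generator so the result is reproducible
    hits = 0
    for _ in range(n_sims):
        successes = sum(rng.random() < p for _ in range(n))
        if successes <= k:
            hits += 1
    return hits / n_sims


# Hypothetical example values, not taken from any resource on this page.
exact = binom_cdf(3, 10, 0.5)
approx = simulate_binom_cdf(3, 10, 0.5)
print(f"exact: {exact:.4f}, simulated: {approx:.4f}")
```

The simulated value should land close to the exact one, with Monte Carlo error shrinking roughly as one over the square root of the number of simulations. The appeal of the simulation route is that it still works when the event of interest has no convenient closed form.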

Miscellaneous papers

A collection of PDFs, written by others, that I’ve picked up over time.

These papers are a good complement to the resources above. In particular, I think the multilevel text written by Gelman and Hill is one of the 3 seminal texts for all working statisticians.