Giuseppe Ragusa: The Limits of Machine Learning for Causal Inference

Date: Wednesday, 6 May 2026, 15:30

Venue: Aula Seminari Bruguier Pacini, DEM – Online via Teams

Speaker: Giuseppe Ragusa (Università di Roma, La Sapienza)

Title: The Limits of Machine Learning for Causal Inference

Abstract:

Double/Debiased Machine Learning (DML) provides √n-consistent estimation of low-dimensional parameters in the presence of high-dimensional nuisance functions, provided those nuisance functions are estimated at a sufficiently fast rate. We examine when this rate requirement is met in practice. Through a systematic review of convergence-rate theory for Lasso, random forests, neural networks, and other learners, we show that the required rate is harder to achieve than widely assumed, especially for random forests, which lack any general L2(P)-rate guarantee strong enough for DML. We document a misalignment between standard cross-validation, which targets prediction risk, and the nuisance estimation quality that DML actually requires. As a result of this misalignment, two learners with nearly identical prediction errors can produce dramatically different biases in the target estimator. Extensive simulations across four partially linear regression designs confirm that (i) finite-sample bias, not variance, is the dominant source of inference failure; (ii) prediction-optimal tuning can produce 50-fold more bias than a well-chosen fixed learner pair; and (iii) implementation randomness from stochastic learners may be a non-negligible additional variance source. We propose a target-aware tuning algorithm that supplements prediction-based cross-validation with a stability screen and a least-regularized selection step, motivated by semiparametric undersmoothing logic.
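For readers less familiar with the estimator the abstract refers to, the following is a minimal sketch of cross-fitted DML for the partially linear model Y = θD + g(X) + ε, D = m(X) + v. It is not the speaker's implementation, and the proposed target-aware tuning step is not shown; scikit-learn random forests are used purely as an illustrative nuisance learner. The point relevant to the talk is that the bias of the estimate of θ is governed by the L2(P) errors of the fitted nuisance functions ĝ and m̂ (in standard DML theory their product must shrink faster than n^(-1/2)), not by their cross-validated prediction risk.

```python
# Minimal, illustrative cross-fitted DML for the partially linear model.
# Assumptions: numpy arrays y, d (n,) and X (n, p); learners are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted DML estimate of theta in Y = theta*D + g(X) + eps."""
    folds = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    y_res = np.zeros_like(y, dtype=float)
    d_res = np.zeros_like(d, dtype=float)
    for train, test in folds.split(X):
        # Nuisance 1: E[Y | X], fit on the training folds only.
        g_hat = RandomForestRegressor(random_state=seed).fit(X[train], y[train])
        # Nuisance 2: E[D | X], the treatment regression.
        m_hat = RandomForestRegressor(random_state=seed).fit(X[train], d[train])
        # Out-of-fold residuals: cross-fitting removes own-observation overfitting bias.
        y_res[test] = y[test] - g_hat.predict(X[test])
        d_res[test] = d[test] - m_hat.predict(X[test])
    # Neyman-orthogonal moment: regress residualized Y on residualized D.
    theta_hat = np.dot(d_res, y_res) / np.dot(d_res, d_res)
    # Plug-in standard error based on the orthogonal score.
    psi = (y_res - theta_hat * d_res) * d_res
    n = len(y)
    se = np.sqrt(np.mean(psi ** 2) / np.mean(d_res ** 2) ** 2 / n)
    return theta_hat, se

# Example usage on simulated data (illustrative only):
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
d = X[:, 0] + rng.normal(size=500)
y = 0.5 * d + np.sin(X[:, 1]) + rng.normal(size=500)
print(dml_plr(y, d, X))
```

Swapping the two RandomForestRegressor calls for any other learner pair leaves the estimator unchanged; as the abstract argues, the resulting bias can differ sharply even between learners with nearly identical prediction error, which is what the proposed stability screen and least-regularized selection step are meant to address.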

Seminar organizer: Francesco Valentini
