arXiv Analytics

arXiv:2403.03772 [cs.LG]

AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs

Victor Akinwande, J. Zico Kolter

Published 2024-03-06, Version 1

Existing causal discovery methods based on combinatorial optimization or search are slow, prohibiting their application to large-scale datasets. More recent methods address this limitation by formulating causal discovery as structure learning with continuous optimization, but such approaches so far provide no statistical guarantees. In this paper, we show that by efficiently parallelizing existing causal discovery methods, we can in fact scale them to thousands of dimensions, making them practical for substantially larger-scale problems. In particular, we parallelize the LiNGAM method, which is quadratic in the number of variables, obtaining up to a 32-fold speed-up on benchmark datasets over existing sequential implementations. Specifically, we focus on the causal-ordering subprocedure in DirectLiNGAM and implement GPU kernels to accelerate it. This allows us to apply DirectLiNGAM to causal inference on large-scale gene expression data with genetic interventions, yielding results competitive with specialized continuous optimization methods, and Var-LiNGAM to causal discovery on U.S. stock data.
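The causal-ordering subprocedure the paper accelerates can be sketched roughly as below. This is a minimal NumPy illustration, not the paper's implementation: the actual DirectLiNGAM uses a mutual-information-based independence measure, whereas this sketch substitutes a simple nonlinear-correlation proxy, and the names `causal_order` and `_dependence` are made up for the example.

```python
import numpy as np

def _dependence(x, r):
    # Proxy for dependence between regressor x and residual r, based on
    # third-order cross-moments (zero in expectation when x and r are
    # independent). DirectLiNGAM proper uses a mutual-information estimate;
    # this proxy is an assumption made to keep the sketch short.
    x = (x - x.mean()) / (x.std() + 1e-12)
    r = (r - r.mean()) / (r.std() + 1e-12)
    return abs(np.mean(x**2 * r)) + abs(np.mean(x * r**2))

def causal_order(X):
    """Estimate a causal ordering by repeatedly extracting the most
    exogenous variable, in the spirit of DirectLiNGAM.

    X: (n, d) data matrix. Returns column indices from root to leaf.
    Each step regresses every remaining variable on every candidate,
    which is the quadratic-in-d cost the paper parallelizes on GPU.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    remaining = list(range(X.shape[1]))
    cols = {j: X[:, j].copy() for j in remaining}
    order = []
    while len(remaining) > 1:
        best_j, best_score = None, np.inf
        for j in remaining:                      # candidate root
            xj = cols[j]
            score = 0.0
            for i in remaining:                  # residual of each other var
                if i == j:
                    continue
                b = cols[i] @ xj / (xj @ xj)     # OLS slope of x_i on x_j
                score += _dependence(xj, cols[i] - b * xj)
            if score < best_score:
                best_j, best_score = j, score
        order.append(best_j)
        remaining.remove(best_j)
        xj = cols[best_j]
        for i in remaining:                      # regress the root out
            b = cols[i] @ xj / (xj @ xj)
            cols[i] = cols[i] - b * xj
    order.append(remaining[0])
    return order
```

Note that the inner double loop over (candidate, remaining-variable) pairs consists of independent one-dimensional regressions and independence scores, which is why the step parallelizes so naturally onto GPU kernels.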

Comments: Accepted at MLGenX @ ICLR 2024. Open source at https://github.com/Viktour19/culingam
Categories: cs.LG, cs.DC, stat.ML