{ "id": "2209.02142", "version": "v1", "published": "2022-09-05T21:02:12.000Z", "updated": "2022-09-05T21:02:12.000Z", "title": "Harvesting the Lyα forest with convolutional neural networks", "authors": [ "Ting-Yun Cheng", "Ryan Cooke", "Gwen Rudie" ], "comment": "22 pages including 2 pages of appendices, 14 figures plus 4 figures in the appendices. Submitted to MNRAS; the first referee report has been addressed", "categories": [ "astro-ph.GA", "physics.data-an" ], "abstract": "We develop a machine-learning-based algorithm using a convolutional neural network (CNN) to identify low HI column density Ly$\\alpha$ absorption systems ($\\log{N_{\\mathrm{HI}}}/{\\rm cm}^{-2}<17$) in the Ly$\\alpha$ forest, and predict their physical properties, such as their HI column density ($\\log{N}_{\\mathrm{HI}}/{\\rm cm}^{-2}$), redshift ($z_{\\mathrm{HI}}$), and Doppler width ($b_{\\mathrm{HI}}$). Our CNN models are trained using simulated spectra (S/N $\\simeq10$), and we test their performance on high quality spectra of quasars at redshift $z\\sim2.5-2.9$ observed with the High Resolution Echelle Spectrometer on the Keck I telescope. We find that $\\sim78\\%$ of the systems identified by our algorithm are listed in the manual Voigt profile fitting catalogue. We demonstrate that the performance of our CNN is stable and consistent for all simulated and observed spectra with S/N $\\gtrsim10$. Our model can therefore be consistently used to analyse the enormous volume of both low and high S/N data available with current and future facilities. Our CNN provides state-of-the-art predictions within the range $12.5\\leq\\log{N_{\\mathrm{HI}}}/\\mathrm{cm^{-2}}<15.5$ with a mean absolute error of $\\Delta(\\log{N}_{\\mathrm{HI}}/{\\rm cm}^{-2})=0.13$, $\\Delta(z_{\\mathrm{HI}})=2.7\\times{10}^{-5}$, and $\\Delta(b_{\\mathrm{HI}})=4.1\\ \\mathrm{km\\ s^{-1}}$. The CNN prediction takes $<3$ minutes per model per spectrum with a size of 120\\,000 pixels using a laptop computer. We demonstrate that CNNs can significantly increase the efficiency of analysing Ly$\\alpha$ forest spectra, and thereby greatly increase the statistics of Ly$\\alpha$ absorbers.", "revisions": [ { "version": "v1", "updated": "2022-09-05T21:02:12.000Z" } ], "analyses": { "keywords": [ "convolutional neural network", "voigt profile fitting catalogue", "hi column density ly", "high resolution echelle spectrometer", "identify low hi column density" ], "note": { "typesetting": "TeX", "pages": 22, "language": "en", "license": "arXiv", "status": "editable" } } }