arXiv Analytics

Sign in

arXiv:2211.06516 [cs.LG]AbstractReferencesReviewsResources

Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms

Vashist Avadhanula, Omar Abdul Baki, Hamsa Bastani, Osbert Bastani, Caner Gocmen, Daniel Haimovich, Darren Hwang, Dima Karamshuk, Thomas Leeper, Jiayuan Ma, Gregory Macnamara, Jake Mullett, Christopher Palow, Sung Park, Varun S Rajagopal, Kevin Schaeffer, Parikshit Shah, Deeksha Sinha, Nicolas Stier-Moses, Peng Xu

Published 2022-11-11Version 1

We describe the current content moderation strategy employed by Meta to remove policy-violating content from its platforms. Meta relies on both handcrafted and learned risk models to flag potentially violating content for human review. Our approach aggregates these risk models into a single ranking score, calibrating them to prioritize more reliable risk models. A key challenge is that violation trends change over time, affecting which risk models are most reliable. Our system additionally handles production challenges such as changing risk models and novel risk models. We use a contextual bandit to update the calibration in response to such trends. Our approach increases Meta's top-line metric for measuring the effectiveness of its content moderation strategy by 13%.

Related articles: Most relevant | Search more
arXiv:2311.05075 [cs.LG] (Published 2023-11-09)
Mental Health Diagnosis in the Digital Age: Harnessing Sentiment Analysis on Social Media Platforms upon Ultra-Sparse Feature Content
arXiv:2007.13525 [cs.LG] (Published 2020-07-27)
Detecting Transaction-based Tax Evasion Activities on Social Media Platforms Using Multi-modal Deep Neural Networks
arXiv:2205.09095 [cs.LG] (Published 2022-05-18)
Conformalized Online Learning: Online Calibration Without a Holdout Set