arXiv:2409.15848 [cs.LG]

iGAiVA: Integrated Generative AI and Visual Analytics in a Machine Learning Workflow for Text Classification

Yuanzhe Jin, Adrian Carrasco-Revilla, Min Chen

Published 2024-09-24, Version 1

In developing machine learning (ML) models for text classification, one common challenge is that the collected data is often not ideally distributed, especially when new classes are introduced in response to changes in data and tasks. In this paper, we present a solution that uses visual analytics (VA) to guide the generation of synthetic data with large language models. Because VA enables model developers to identify data-related deficiencies, data synthesis can be targeted to address them. We discuss different types of data deficiency, describe VA techniques for supporting their identification, and demonstrate the effectiveness of targeted data synthesis in improving model accuracy. In addition, we present a software tool, iGAiVA, which maps four groups of ML tasks onto four VA views, integrating generative AI and VA into an ML workflow for developing and improving text classification models.

Categories: cs.LG, cs.CL

Keywords: text classification, machine learning workflow, visual analytics, integrated generative AI

Related articles:

arXiv:2211.00369 [cs.LG] (Published 2022-11-01)

Anytime Generation of Counterfactual Explanations for Text Classification

Daniel Gilo, Shaul Markovitch

arXiv:2107.10314 [cs.LG] (Published 2021-07-21)

Small-text: Active Learning for Text Classification in Python

Christopher Schröder, Lydia Müller, Andreas Niekler, Martin Potthast

arXiv:2010.05223 [cs.LG] (Published 2020-10-11)

End to End Binarized Neural Networks for Text Classification

Harshil Jain, Akshat Agarwal, Kumar Shridhar, Denis Kleyko