{
  "id": "2006.15437",
  "version": "v1",
  "published": "2020-06-27T20:12:33.000Z",
  "updated": "2020-06-27T20:12:33.000Z",
  "title": "GPT-GNN: Generative Pre-Training of Graph Neural Networks",
  "authors": [
    "Ziniu Hu",
    "Yuxiao Dong",
    "Kuansan Wang",
    "Kai-Wei Chang",
    "Yizhou Sun"
  ],
  "comment": "Published at KDD 2020",
  "categories": ["cs.LG", "cs.SI", "stat.ML"],
  "abstract": "Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data. However, training GNNs usually requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabeled data with self-supervision and then transfer the learned model to downstream tasks with only a few labels. In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. GPT-GNN introduces a self-supervised attributed graph generation task to pre-train a GNN so that it can capture the structural and semantic properties of the graph. We factorize the likelihood of the graph generation into two components: 1) Attribute Generation and 2) Edge Generation. By modeling both components, GPT-GNN captures the inherent dependency between node attributes and graph structure during the generative process. Comprehensive experiments on the billion-scale Open Academic Graph and Amazon recommendation data demonstrate that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.",
  "revisions": [
    {
      "version": "v1",
      "updated": "2020-06-27T20:12:33.000Z"
    }
  ],
  "analyses": {
    "keywords": [
      "graph neural networks",
      "outperforms state-of-the-art gnn models",
      "significantly outperforms state-of-the-art gnn",
      "attributed graph generation task",
      "generative pre-training"
    ],
    "note": {
      "typesetting": "TeX",
      "pages": 0,
      "language": "en",
      "license": "arXiv",
      "status": "editable"
    }
  }
}