{ "id": "1803.04947", "version": "v1", "published": "2018-03-13T17:31:17.000Z", "updated": "2018-03-13T17:31:17.000Z", "title": "Takeuchi's Information Criteria as a form of Regularization", "authors": [ "Matthew Dixon", "Tyler Ward" ], "categories": [ "stat.CO", "stat.ME" ], "abstract": "Takeuchi's Information Criteria (TIC) is a linearization of maximum likelihood estimator bias which shrinks the model parameters towards the maximum entropy distribution, even when the model is mis-specified. In statistical machine learning, $L_2$ regularization (a.k.a. ridge regression) also introduces a parameterized bias term with the goal of minimizing out-of-sample entropy, but generally requires a numerical solver to find the regularization parameter. This paper presents a novel regularization approach based on TIC; the approach does not assume a data generation process and results in a higher entropy distribution through more efficient sample noise suppression. The resulting objective function can be directly minimized to estimate and select the best model, without the need to select a regularization parameter, as in ridge regression. Numerical results applied to a synthetic high dimensional dataset generated from a logistic regression model demonstrate superior model performance when using the TIC based regularization over a $L_1$ and a $L_2$ penalty term.", "revisions": [ { "version": "v1", "updated": "2018-03-13T17:31:17.000Z" } ], "analyses": { "keywords": [ "takeuchis information criteria", "regularization", "regression model demonstrate superior model", "logistic regression model demonstrate superior", "model demonstrate superior model performance" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }