Contrastive Pessimistic Likelihood Estimation for Semi-Supervised Classification

M Loog

doi:10.1109/TPAMI.2015.2452921

Pattern Recognition and Bioinformatics

Research output: Contribution to journal › Article › Scientific › peer-review

49 Citations (Scopus)

86 Downloads (Pure)

Abstract

Improvement guarantees for semi-supervised classifiers can currently only be given under restrictive conditions on the data. We propose a general way to perform semi-supervised parameter estimation for likelihood-based classifiers for which, on the full training set, the estimates are never worse than the supervised solution in terms of the log-likelihood. We argue, moreover, that we may expect these solutions to really improve upon the supervised classifier in particular cases. In a worked-out example for LDA, we take it one step further and essentially prove that its semi-supervised version is strictly better than its supervised counterpart. The two new concepts that form the core of our estimation principle are contrast and pessimism. The former refers to the fact that our objective function takes the supervised estimates into account, enabling the semi-supervised solution to explicitly control the potential improvements over this estimate. The latter refers to the fact that our estimates are conservative and therefore resilient to whatever form the true labeling of the unlabeled data takes on. Experiments demonstrate the improvements in terms of both the log-likelihood and the classification error rate on independent test sets.
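As a rough sketch of how the abstract's two concepts might be written down: the notation below is assumed for illustration and is not quoted from the paper. Here X denotes the labeled data, U the unlabeled data, q a (soft) labeling of U, L the log-likelihood on the full training set, and \hat{\theta}_{\mathrm{sup}} the supervised estimate.

```latex
% Contrast: compare a candidate \theta with the supervised estimate
% on the full training set, for a hypothesized labeling q of U.
\[
  \mathrm{CL}(\theta, \hat{\theta}_{\mathrm{sup}} \mid X, U, q)
  = L(\theta \mid X, U, q) - L(\hat{\theta}_{\mathrm{sup}} \mid X, U, q)
\]

% Pessimism: assume the least favorable labeling of U.
\[
  \mathrm{CPL}(\theta, \hat{\theta}_{\mathrm{sup}} \mid X, U)
  = \min_{q} \, \mathrm{CL}(\theta, \hat{\theta}_{\mathrm{sup}} \mid X, U, q)
\]

% Semi-supervised estimate: maximize the pessimistic contrast.
\[
  \hat{\theta}_{\mathrm{semi}}
  = \operatorname*{arg\,max}_{\theta} \,
    \mathrm{CPL}(\theta, \hat{\theta}_{\mathrm{sup}} \mid X, U)
\]

% Why the estimate is never worse: taking \theta = \hat{\theta}_{\mathrm{sup}}
% gives a contrast of 0 for every q, so the maximum is at least 0, and hence
\[
  L(\hat{\theta}_{\mathrm{semi}} \mid X, U, q)
  \ge L(\hat{\theta}_{\mathrm{sup}} \mid X, U, q)
  \quad \text{for every labeling } q .
\]
```

Under these assumptions, the last inequality holds for the true labeling of U in particular, which is one way to read the abstract's claim that the estimates are never worse than the supervised solution in terms of the log-likelihood on the full training set.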

Original language	English
Pages (from-to)	462-475
Number of pages	14
Journal	IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume	38
Issue number	3
DOIs	https://doi.org/10.1109/TPAMI.2015.2452921
Publication status	Published - 2016

Keywords

Maximum likelihood
semi-supervised learning
contrast
pessimism
linear discriminant analysis

Access to Document

10.1109/TPAMI.2015.2452921

bare_jrnl_compsoc_final: Accepted author manuscript, 237 KB

Cite this

@article{26f0713f9ceb4939a8bd0b72d229bfe7,

title = "Contrastive Pessimistic Likelihood Estimation for Semi-Supervised Classification",

abstract = "Improvement guarantees for semi-supervised classifiers can currently only be given under restrictive conditions on the data. We propose a general way to perform semi-supervised parameter estimation for likelihood-based classifiers for which, on the full training set, the estimates are never worse than the supervised solution in terms of the log-likelihood. We argue, moreover, that we may expect these solutions to really improve upon the supervised classifier in particular cases. In a worked-out example for LDA, we take it one step further and essentially prove that its semi-supervised version is strictly better than its supervised counterpart. The two new concepts that form the core of our estimation principle are contrast and pessimism. The former refers to the fact that our objective function takes the supervised estimates into account, enabling the semi-supervised solution to explicitly control the potential improvements over this estimate. The latter refers to the fact that our estimates are conservative and therefore resilient to whatever form the true labeling of the unlabeled data takes on. Experiments demonstrate the improvements in terms of both the log-likelihood and the classification error rate on independent test sets.",

keywords = "Maximum likelihood, semi-supervised learning, contrast, pessimism, linear discriminant analysis",

author = "M Loog",

year = "2016",

doi = "10.1109/TPAMI.2015.2452921",

language = "English",

volume = "38",

pages = "462--475",

journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",

issn = "0162-8828",

publisher = "IEEE",

number = "3",

}

TY - JOUR

T1 - Contrastive Pessimistic Likelihood Estimation for Semi-Supervised Classification

AU - Loog, M

PY - 2016

Y1 - 2016

N2 - Improvement guarantees for semi-supervised classifiers can currently only be given under restrictive conditions on the data. We propose a general way to perform semi-supervised parameter estimation for likelihood-based classifiers for which, on the full training set, the estimates are never worse than the supervised solution in terms of the log-likelihood. We argue, moreover, that we may expect these solutions to really improve upon the supervised classifier in particular cases. In a worked-out example for LDA, we take it one step further and essentially prove that its semi-supervised version is strictly better than its supervised counterpart. The two new concepts that form the core of our estimation principle are contrast and pessimism. The former refers to the fact that our objective function takes the supervised estimates into account, enabling the semi-supervised solution to explicitly control the potential improvements over this estimate. The latter refers to the fact that our estimates are conservative and therefore resilient to whatever form the true labeling of the unlabeled data takes on. Experiments demonstrate the improvements in terms of both the log-likelihood and the classification error rate on independent test sets.

AB - Improvement guarantees for semi-supervised classifiers can currently only be given under restrictive conditions on the data. We propose a general way to perform semi-supervised parameter estimation for likelihood-based classifiers for which, on the full training set, the estimates are never worse than the supervised solution in terms of the log-likelihood. We argue, moreover, that we may expect these solutions to really improve upon the supervised classifier in particular cases. In a worked-out example for LDA, we take it one step further and essentially prove that its semi-supervised version is strictly better than its supervised counterpart. The two new concepts that form the core of our estimation principle are contrast and pessimism. The former refers to the fact that our objective function takes the supervised estimates into account, enabling the semi-supervised solution to explicitly control the potential improvements over this estimate. The latter refers to the fact that our estimates are conservative and therefore resilient to whatever form the true labeling of the unlabeled data takes on. Experiments demonstrate the improvements in terms of both the log-likelihood and the classification error rate on independent test sets.

KW - Maximum likelihood

KW - semi-supervised learning

KW - contrast

KW - pessimism

KW - linear discriminant analysis

U2 - 10.1109/TPAMI.2015.2452921

DO - 10.1109/TPAMI.2015.2452921

M3 - Article

SN - 0162-8828

VL - 38

SP - 462

EP - 475

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

IS - 3

ER -
