Co-Training by Committee: A Generalized Framework for Semi-Supervised Learning with Committees
Received: September 28, 2008  Revised: December 13, 2008
Mohamed Farouk Abdel Hady, Friedhelm Schwenker. Co-Training by Committee: A Generalized Framework for Semi-Supervised Learning with Committees. International Journal of Software and Informatics, 2008, 2(2): 95–124
Mohamed Farouk Abdel Hady  Friedhelm Schwenker
Fund: This work was supported by the German Science Foundation (DFG) under grant SCHW623/4-3 and a scholarship of the German Academic Exchange Service (DAAD).
Abstract: Many data mining applications have a large amount of data but labeling data is often difficult, expensive, or time consuming, as it requires human experts for annotation. Semi-supervised learning addresses this problem by using unlabeled data together with labeled data to improve the performance. Co-Training is a popular semi-supervised learning algorithm that has the assumptions that each example is represented by two or more redundantly sufficient sets of features (views) and additionally these views are independent given the class. However, these assumptions are not satisfied in many real-world application domains. In this paper, a framework called Co-Training by Committee (CoBC) is proposed, in which an ensemble of diverse classifiers is used for semi-supervised learning that requires neither redundant and independent views nor different base learning algorithms. The framework is a general single-view semi-supervised learner that can be applied on any ensemble learner to build diverse committees. Experimental results of CoBC using Bagging, AdaBoost and the Random Subspace Method (RSM) as ensemble learners demonstrate that error diversity among classifiers leads to an effective Co-Training style algorithm that maintains the diversity of the underlying ensemble.
Keywords: data mining; semi-supervised learning; co-training; classification; ensemble learning; decision tree; visual object recognition
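
The idea summarized in the abstract, a committee of diverse classifiers labeling unlabeled data within a single view, can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the procedure from the paper: the committee is built by Bagging only, the base learner is a toy nearest-centroid classifier (the keywords suggest the paper uses decision trees), and confidence is taken to be the fraction of committee members that agree. All function names and parameters (`cobc_sketch`, `per_round`, and so on) are hypothetical.

```python
import random
from collections import Counter


def train_centroid(sample):
    """Toy base learner (assumption): nearest-centroid on (features, label) pairs."""
    sums, counts = {}, Counter()
    for x, y in sample:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        # Return the label whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def bagging_committee(labeled, n_members, rng):
    """Train each member on a bootstrap resample so that members differ."""
    return [
        train_centroid([rng.choice(labeled) for _ in labeled])
        for _ in range(n_members)
    ]


def committee_vote(committee, x):
    """Return (majority label, fraction of members voting for it)."""
    votes = Counter(h(x) for h in committee)
    label, count = votes.most_common(1)[0]
    return label, count / len(committee)


def cobc_sketch(labeled, unlabeled, n_members=5, rounds=3, per_round=2, seed=0):
    """Simplified committee-based self-labeling loop (not the paper's exact method)."""
    rng = random.Random(seed)
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        committee = bagging_committee(labeled, n_members, rng)
        # Rank unlabeled points by committee agreement, highest first.
        ranked = sorted(
            unlabeled,
            key=lambda x: committee_vote(committee, x)[1],
            reverse=True,
        )
        chosen = ranked[:per_round]
        # Add the most confidently labeled points to the training set.
        for x in chosen:
            labeled.append((x, committee_vote(committee, x)[0]))
        unlabeled = [x for x in unlabeled if x not in chosen]
    # Retrain a final committee on the enlarged labeled set.
    return bagging_committee(labeled, n_members, rng), labeled
```

Calling `cobc_sketch(labeled, unlabeled)` with `labeled` as a list of `(feature_tuple, label)` pairs and `unlabeled` as a list of feature tuples returns the final committee and the enlarged labeled set; `committee_vote(committee, x)` then classifies a new point. Swapping `bagging_committee` for another ensemble builder would be the natural place to try the other ensemble learners the abstract mentions.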