Global and Local (Glocal) Bagging Approach for Classifying Noisy Dataset
Received: October 15, 2008    Revised: December 20, 2008
Peng Zhang, Zhiwang Zhang, Aihua Li, Yong Shi. Global and Local (Glocal) Bagging Approach for Classifying Noisy Dataset. International Journal of Software and Informatics, 2008, 2(2): 181-197
|
Fund: This work is supported by grants from the National Natural Science Foundation of China (#70621001, #70531040, #70501030, #10601064, #70472074), the National Natural Science Foundation of Beijing (#9073020), 973 Project #2004CB720103, Ministry of Science and Technol
|
Abstract: Learning from noisy data is a challenging task for data mining research. In this paper, we argue that for noisy data both the global bagging strategy and the local bagging strategy suffer from their own inherent disadvantages and thus cannot form accurate prediction models. Consequently, we present a Global and Local Bagging (called Glocal Bagging: GB) approach to tackle this problem. GB assigns weight values to the base classifiers under the consideration that: (1) for each test instance Ix, GB prefers bags close to Ix, which is the nature of the local learning strategy; (2) for base classifiers, GB assigns larger weight values to the ones with higher accuracy on the out-of-bag samples, which is the nature of the global learning strategy. Combining (1) and (2), GB assigns large weight values to the classifiers which are close to the current test instance Ix and have high out-of-bag accuracy. The diversity/accuracy analysis on synthetic datasets shows that GB improves the classifier ensemble's performance by increasing its base classifiers' accuracy. Moreover, the bias/variance analysis also shows that GB's accuracy improvement mainly comes from the reduction of the bias error. Experiment results on 25 UCI benchmark datasets show that when the datasets are noisy, GB is superior to previously proposed bagging methods such as classical bagging, bragging, nice bagging, trimmed bagging and lazy bagging.
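The abstract describes combining a local factor (proximity of each bag to the test instance) with a global factor (each base classifier's out-of-bag accuracy) into one per-classifier weight. The abstract does not give the exact formula, so the following Python sketch is an illustration only: the Gaussian distance kernel, the use of bag centroids as proximity proxies, and the simple product combination are all assumptions, not the paper's stated method.

```python
import numpy as np

def glocal_weights(test_x, bag_centroids, oob_accuracies, bandwidth=1.0):
    """Assumed sketch of a 'glocal' weighting: each base classifier gets
    weight = (local proximity of its bag to test_x) * (its out-of-bag
    accuracy), then weights are normalized to sum to 1."""
    # Local factor: distance from the test instance to each bag's centroid,
    # passed through a Gaussian kernel (kernel choice is an assumption).
    dists = np.linalg.norm(np.asarray(bag_centroids) - np.asarray(test_x), axis=1)
    local = np.exp(-(dists ** 2) / (2.0 * bandwidth ** 2))
    # Global factor: out-of-bag accuracy of each base classifier.
    weights = local * np.asarray(oob_accuracies)
    return weights / weights.sum()

# Toy usage: two bags; the first is near the test instance and also has
# higher out-of-bag accuracy, so it should dominate the weighted vote.
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
w = glocal_weights(np.array([0.1, -0.2]), centroids, oob_accuracies=[0.9, 0.6])
```

The normalized weights would then scale each base classifier's vote when predicting the label of the test instance.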
Keywords: bagging; ensemble learning; sampling
|
|
|