Pattern Recognition

Volume 44, Issue 8, August 2011, Pages 1761-1776

An overview of ensemble methods for binary classifiers in multi-class problems: Experimental study on one-vs-one and one-vs-all schemes

https://doi.org/10.1016/j.patcog.2011.01.017

Abstract

Classification problems involving multiple classes can be addressed in different ways. One of the most popular techniques consists of dividing the original data set into two-class subsets and learning a different binary model for each new subset. These techniques are known as binarization strategies.

In this work, we are interested in ensemble methods built by binarization techniques; in particular, we focus on the well-known one-vs-one and one-vs-all decomposition strategies, paying special attention to the final step of the ensembles: the combination of the outputs of the binary classifiers. Our aim is to develop an empirical analysis of different aggregations to combine these outputs. To do so, we carry out a two-fold study: first, we use different base classifiers in order to observe the suitability and potential of each combination within each classifier. Then, we compare the performance of these ensemble techniques with the classifiers themselves. Hence, we also analyse the improvement with respect to classifiers that handle multiple classes inherently.

We carry out the experimental study with several well-known algorithms from the literature, such as Support Vector Machines, Decision Trees, Instance Based Learning and Rule Based Systems. We will show, supported by several statistical analyses, the good behaviour of the binarization techniques with respect to the base classifiers, and finally we will point out the most robust techniques within this framework.

Research highlights

  • One-vs-one and one-vs-all are ensembles for multi-class problems.
  • The confidence estimates and their aggregation are key factors of these ensembles.
  • Aggregations based on voting and estimation of probabilities are the most robust.
  • One-vs-one is more robust; one-vs-all has received less attention.
  • Binarization is beneficial even when it is not necessary.

Introduction

Supervised Machine Learning consists of extracting knowledge from a set of n input examples x1, …, xn, each characterized by i features a1, …, ai ∈ A, which may take numerical or nominal values. Each instance has an associated desired output yj, and the aim is to learn a system capable of predicting this output for a new unseen example in a reasonable way (that is, with good generalization ability). This output can be a continuous value yj ∈ ℝ or a class label yj ∈ C (considering an m-class problem C = {c1, …, cm}). In the former case, it is a regression problem, while in the latter it is a classification problem [22]. In classification, the system generated by the learning algorithm is a mapping function defined over the patterns, A^i → C, and it is called a classifier.

Classification tasks are widely used in real-world applications, and many of them involve more than two classes: the so-called multi-class problems. Their application domains are diverse. In bioinformatics, for instance, the classification of microarrays [51] and tissues [71] operates with several class labels. In computer vision, multi-class classification techniques play a key role in object [72], fingerprint [41] and sign language [8] recognition tasks, whereas in medicine, multiple categories are considered in problems such as cancer [6] or electroencephalogram signal [38] classification.

Usually, it is easier to build a classifier that distinguishes between only two classes than one that considers more than two, since the decision boundaries in the former case can be simpler. This is why binarization techniques have emerged to deal with multi-class problems: they divide the original problem into easier-to-solve binary classification problems, each faced by a binary classifier. These classifiers are usually referred to as the base learners or base classifiers of the system [30].

Different decomposition strategies can be found in the literature [52]. The most common strategies are called “one-vs-one” (OVO) [47] and “one-vs-all” (OVA) [17], [7].

  • OVO divides the problem into as many binary problems as there are pairs of classes; one classifier is learned to discriminate between each pair, and the outputs of these base classifiers are then combined in order to predict the output class.

  • OVA learns one classifier per class, in which that class is distinguished from all the other classes, so the base classifier giving a positive answer indicates the output class.
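As a minimal sketch of the two decompositions (not the authors' implementation; the function names and the trivial list-based data representation are ours for illustration), OVO builds one two-class subset per pair of classes, while OVA builds one class-vs-rest subset per class:

```python
from itertools import combinations

def ovo_subsets(X, y):
    """OVO: one two-class training subset per pair of classes.

    Returns a dict keyed by (class_i, class_j), each holding only the
    examples belonging to one of those two classes.
    """
    classes = sorted(set(y))
    return {
        (ci, cj): [(x, lbl) for x, lbl in zip(X, y) if lbl in (ci, cj)]
        for ci, cj in combinations(classes, 2)
    }

def ova_subsets(X, y):
    """OVA: one two-class subset per class (class vs. all the rest).

    Labels are rewritten to 1 (the class) or 0 (any other class).
    """
    classes = sorted(set(y))
    return {
        c: [(x, 1 if lbl == c else 0) for x, lbl in zip(X, y)]
        for c in classes
    }
```

For an m-class problem, OVO therefore trains m(m-1)/2 base classifiers on reduced subsets, while OVA trains m classifiers, each on the whole (relabelled) data set.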

In recent years, different methods to combine the outputs of the base classifiers from these strategies have been developed, for instance, new approaches in the framework of probability estimates [76], binary-tree based strategies [23], dynamic classification schemes [41] or methods using preference relations [44], [24], in addition to more classical well-known combinations such as Pairwise Coupling [39], the Max-Wins rule [29] or Weighted Voting (whose robustness has recently been proved in [46]).
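To make the aggregation step concrete, the following sketch (our own hedged illustration, assuming an OVO score matrix R where R[i][j] ∈ [0, 1] is the confidence of the classifier for pair (i, j) in favour of class i, and typically R[j][i] = 1 - R[i][j]) implements two of the classical combinations named above:

```python
def max_wins(R):
    """Max-Wins rule: each pairwise classifier casts one vote for the
    class it prefers; the class with the most votes is predicted."""
    m = len(R)
    votes = [0] * m
    for i in range(m):
        for j in range(i + 1, m):
            if R[i][j] >= R[j][i]:
                votes[i] += 1
            else:
                votes[j] += 1
    return votes.index(max(votes))

def weighted_voting(R):
    """Weighted Voting: sum the raw confidences for each class, so a
    classifier's vote counts in proportion to how sure it is."""
    m = len(R)
    scores = [sum(R[i][j] for j in range(m) if j != i) for i in range(m)]
    return scores.index(max(scores))
```

The two rules can disagree: a class winning many pairwise duels narrowly may beat one winning fewer duels by large margins under Max-Wins, but lose under Weighted Voting.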

In the specialized literature, there are few works comparing these techniques, either between OVO and OVA or among the different aggregation strategies. In [42], a study of OVO, OVA and Error Correcting Output Codes (ECOC) [21] is carried out, but only within the multi-class Support Vector Machine (SVM) framework, whereas in [52] an enumeration of the existing binarization methodologies is presented, again without comparing them against each other. Fürnkranz [31] compared the suitability of OVO strategies for decision trees and decision lists with other ensemble methods such as boosting and bagging, also showing the improvement obtained by using confidence estimates in the combination of the outputs. In [76], a comparison in the framework of probability estimates is developed, but no other possible aggregations of the classifiers' outputs are considered.

Our aim is to carry out an exhaustive empirical study of OVO and OVA decompositions, paying special attention to the different ways in which the outputs of the base classifiers can be combined. The main novelties of this paper with respect to the aforementioned previous studies [42], [31], [76], [52] consist in the following points:

  • We develop a study of the state-of-the-art on aggregation strategies for OVO and OVA schemes. To do so, we present an overview of the existing combination methods and compare their performance over a set of different real-world problems. Although a previous comparison of probability estimates by pairwise coupling exists [76], to the best of our knowledge a comparison spanning all kinds of aggregation methods is missing.

  • We analyse the behaviour of the OVO and OVA schemes with different base learners, studying the suitability of these techniques in each base classifier.

  • Since binarization techniques have already been proven appropriate for dealing with multi-class problems [30], [31], [42], [63] where the original classifiers do not naturally handle multiple class labels, we analyse whether they also improve the behaviour of classifiers that have built-in multi-class support.

Thus, our intention is to make a thorough analysis of the framework of binarization, answering two main questions:

  • 1.

    Given that we want or have to use binarization, how should we do it? This is the main objective of this paper: to show the most robust aggregation techniques within the framework of binarization, which is still an open question. Therefore, we analyse empirically which binarization technique is most appropriate and which aggregation should be used in each case.

  • 2.

    But, should we do binarization at all? This is an essential question when the multi-class problem can be overcome in different ways (that is, when the base classifier is able to manage multiple classes directly). Previous works have shown the benefits of binarization techniques [30], [31], [42], [63]; we develop a complementary study that stresses their suitability with a complete statistical analysis across different learning paradigms that support multi-class data.

In order to achieve well-founded conclusions, we develop a complete empirical study. The experimental framework includes a set of nineteen real-world problems from the UCI repository [9]. The measures of performance are based on the accuracy rate and Cohen's kappa metric [18]. The significance of the results is supported by the proper statistical tests as suggested in the literature [20], [35], [34]. We chose several well-known classifiers from different Machine Learning paradigms as base learners, namely, SVMs [73], decision trees [62], instance-based learning [1], fuzzy rule based systems [16] and decision lists [19].
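Cohen's kappa, used above as a performance measure alongside the accuracy rate, corrects the observed agreement for the agreement expected by chance. A minimal sketch of the metric, computed from a confusion matrix (our own illustration; the function name is hypothetical):

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement (diagonal mass) and p_e the chance agreement implied
    by the row and column marginals.
    """
    m = len(confusion)
    n = sum(sum(row) for row in confusion)
    p_o = sum(confusion[i][i] for i in range(m)) / n
    p_e = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(m)
    ) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Unlike raw accuracy, kappa is 0 for a classifier that merely reproduces the class marginals, which makes it more informative on imbalanced multi-class data sets.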

Finally, we include an in-depth discussion of the results acquired throughout the experimental study. This allows us to answer the issues previously raised and to summarize the lessons learned in this paper. Additionally, we point out some new challenges on the topic in light of the obtained results.

The rest of this paper is organized as follows. Section 2 presents a thorough overview of the existing binarization techniques, with special attention to the OVO and OVA strategies. Section 3 presents the state-of-the-art on the aggregation strategies for the outputs of those strategies that we use in this work. The experimental framework set-up is given in Section 4, that is, the algorithms used as base classifiers, the performance measures, the statistical tests, the data sets, the parameters for the algorithms and the description of a Web page associated with the paper (http://sci2s.ugr.es/ovo-ova), which contains complementary material to the experimental study. We develop the empirical analysis in Section 5. The discussion, including the lessons learned throughout this study and the future works that remain to be addressed, is presented in Section 6. Finally, in Section 7 we make our concluding remarks.

Section snippets

Reducing multi-class problems by binarization techniques

In this section, we first describe the idea behind binarization techniques to deal with multi-class problems and review the existing decomposition strategies. Then, we explain with relative detail the most common strategies that we have used in the experimental study: OVO and OVA.

State-of-the-art on aggregation schemes for binarization techniques

In this section, we describe the state-of-the-art on aggregation strategies for binarization techniques. We divide them into two subsections: the first is devoted to the combinations for the OVO decomposition, where the aggregation is made from a score matrix; the second reviews the combinations for the OVA scheme, where the outputs of the classifiers are given by a score vector.

A more extensive and detailed description of these methods can be found in the web page http://sci2s.ugr.es/ovo-ova.

Experimental framework

In this section, we present the set-up of the experimental framework used to develop the experiments in Section 5. We first describe the algorithms that we have selected to use as base classifiers in the study in Section 4.1. Section 4.2 describes the measures employed to evaluate the performance of the algorithms analysed in this paper. Next, we present the statistical tests applied to compare the results obtained with the different aggregations and decomposition techniques in Section 4.3.

Experimental study

In this section, we present the results of the experimental study. We will answer the following questions:

  • 1.

    Should we do binarization? How should we do it?

  • 2.

    Which is the most appropriate aggregation for each decomposition scheme?

The study is thereby divided into two parts, each one dedicated to one question. Since the main objective of this paper is the analysis of the different combinations, we will answer these questions in reverse order, starting from the second one. Hence we

Discussion: lessons learned and future work

This paper has provided an exhaustive empirical analysis of the main ensemble methods for binary classifiers in multi-class problems, specifically the methods based on the OVO and OVA strategies. We structured the analysis in two sections, studying the different ways in which the outputs of the underlying binary classifiers can be combined and then completing the analysis by investigating the use of binarization techniques when the multi-class problem can also be faced by a single classifier.

From

Concluding remarks

We made a thorough analysis of several ensemble methods applied to multi-class classification problems in a general classification framework. All of them are based on two well-known strategies, OVO and OVA, whose suitability has been tested on several real-world data sets.

From this work we conclude that actually OVO methods and specifically the ensembles using WV, LVPC, PC and PE combinations are the ones with the best average behaviour, but the best aggregation within a problem depends on

Acknowledgements

This work has been supported by the Spanish Ministry of Education and Science under Projects TIN2010-15055 and TIN2008-06681-C06-01.

References (77)

  • J. Luengo et al., Domains of competence of fuzzy rule based classification systems with data complexity measures: a case of study using a fuzzy hybrid genetic based machine learning method, Fuzzy Sets and Systems (2010)
  • S.A. Orlovsky, Decision-making with a fuzzy preference relation, Fuzzy Sets and Systems (1978)
  • O. Pujol et al., An incremental node embedding technique for error correcting output codes, Pattern Recognition (2008)
  • L. Rueda et al., Multi-class pairwise linear dimensionality reduction using heteroscedastic schemes, Pattern Recognition (2010)
  • D.W. Aha et al., Instance-based learning algorithms, Machine Learning (1991)
  • J. Alcalá-Fdez et al., KEEL Data-mining software tool: data set repository, integration of algorithms and experimental analysis framework, Journal of Multiple-Valued Logic and Soft Computing (2011)
  • J. Alcalá-Fdez et al., KEEL: a software tool to assess evolutionary algorithms for data mining problems, Soft Computing (2009)
  • E.L. Allwein et al., Reducing multiclass to binary: a unifying approach for margin classifiers, Journal of Machine Learning Research (2000)
  • E. Alpaydin, Introduction to Machine Learning (2004)
  • R. Anand et al., Efficient classification for multiclass problems using modular neural networks, IEEE Transactions on Neural Networks (1995)
  • A. Asuncion, D.J. Newman, UCI machine learning repository (2007), URL: ...
  • R.A. Baeza-Yates et al., Modern Information Retrieval (1999)
  • M. Basu et al., Data Complexity in Pattern Recognition (2006)
  • E. Bernado-Mansilla et al., Domain of competence of XCS classifier system in complexity measurement space, IEEE Transactions on Evolutionary Computation (2005)
  • N.V. Chawla, N. Japkowicz, A. Kolcz (Eds.), Special Issue on Learning from Imbalanced Datasets, vol. 6, no. 11, ...
  • Y. Chen et al., Support vector learning for fuzzy rule-based classification systems, IEEE Transactions on Fuzzy Systems (2003)
  • P. Clark et al., Rule induction with CN2: some recent improvements
  • J. Cohen, A coefficient of agreement for nominal scales, Educational and Psychological Measurement (1960)
  • W.W. Cohen, Fast effective rule induction, in: ICML'95: Proceedings of the 12th International Conference on Machine ...
  • J. Demšar, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research (2006)
  • T.G. Dietterich et al., Solving multiclass learning problems via error-correcting output codes, Journal of Artificial Intelligence Research (1995)
  • R.O. Duda et al., Pattern Classification (2001)
  • B. Fei et al., Binary tree of SVM: a new fast multiclass training and classification algorithm, IEEE Transactions on Neural Networks (2006)
  • A. Fernández, M. Calderón, E. Barrenechea, H. Bustince, F. Herrera, Enhancing fuzzy rule based systems in ...
  • A. Fernández et al., Genetics-based machine learning for rule induction: state of the art, taxonomy and comparative study, IEEE Transactions on Evolutionary Computation (2010)
  • E. Frank et al., Ensembles of nested dichotomies for multi-class problems
  • J.H. Friedman, Another approach to polychotomous classification, Technical Report, Department of Statistics, Stanford ...
  • J. Fürnkranz, Round robin classification, Journal of Machine Learning Research (2002)

    Mikel Galar received his M.Sc. degree in Computer Sciences from the Public University of Navarra, Pamplona, Spain, in 2009. Currently he holds a research position at the Department of Automatics and Computation. His research interests are data-mining, classification, multi-classification, ensemble learning, evolutionary algorithms and fuzzy systems.

    Alberto Fernández received his M.Sc. degree in Computer Sciences in 2005 and the Ph.D. degree in Computer Science in 2010, both from the University of Granada, Spain. He is currently a Supply Assistant Professor in the Department of Computer Science, University of Jaén, Jaén, Spain. His research interests include data mining, classification in imbalanced domains, fuzzy rule learning, evolutionary algorithms and multi-classification problems.

    Edurne Barrenechea is an Assistant Lecturer at the Department of Automatics and Computation, Public University of Navarra. She received an M.Sc. in Computer Science at the Pais Vasco University in 1990. She worked in a private company (Bombas Itur) as analyst programmer from 1990 to 2001, and then she joined the Public University of Navarra as Associate Lecturer. She obtained the Ph.D. in Computer Science in 2005 on the topic interval-valued fuzzy sets applied to image processing. Her research interests are fuzzy techniques for image processing, fuzzy sets theory, interval type-2 fuzzy sets theory and applications, decision making, and industrial applications of soft computing techniques. She is member of the board of the European Society for Fuzzy Logic and Technology (EUSFLAT).

    Humberto Bustince is a Full Professor at the Department of Automatics and Computation, Public University of Navarra, Spain. He holds a Ph.D. degree in Mathematics from Public University of Navarra from 1994. His research interests are fuzzy logic theory, extensions of Fuzzy sets (Type-2 fuzzy sets, Interval-valued fuzzy sets, Atanassov's intuitionistic fuzzy sets), Fuzzy measures, Aggregation functions and fuzzy techniques for Image processing. He is author of over 50 published original articles and involved in teaching Artificial Intelligence for students of Computer Sciences.

    Francisco Herrera received the M.Sc. degree in Mathematics in 1988 and the Ph.D. degree in Mathematics in 1991, both from the University of Granada, Spain. He is currently a Professor in the Department of Computer Science and Artificial Intelligence at the University of Granada. He has published more than 150 papers in international journals. He is coauthor of the book “Genetic Fuzzy Systems: Evolutionary Tuning and Learning of Fuzzy Knowledge Bases” (World Scientific, 2001). As edited activities, he has co-edited five international books and co-edited 20 special issues in international journals on different Soft Computing topics. He acts as associated editor of the journals IEEE Transactions on Fuzzy Systems, Information Sciences, Mathware and Soft Computing, Advances in Fuzzy Systems, Advances in Computational Sciences and Technology, and International Journal of Applied Metaheuristics Computing. He currently serves as area editor of the Journal Soft Computing (area of genetic algorithms and genetic fuzzy systems), and he serves as member of several journal editorial boards, among others Fuzzy Sets and Systems, Applied Intelligence, Knowledge and Information Systems, Information Fusion, Evolutionary Intelligence, International Journal of Hybrid Intelligent Systems and Memetic Computation. His current research interests include computing with words and decision making, data mining, data preparation, instance selection, fuzzy rule based systems, genetic fuzzy systems, knowledge extraction based on evolutionary algorithms, memetic algorithms and genetic algorithms.
