Statistical Pattern Recognition - Nonlinear Classifiers
1. Origin of nonlinear classifiers
The concept of a nonlinear classifier
In many situations we cannot guarantee that the decision boundaries between classes are linear (linearity being the simplest case), and for many complex problems a nonlinear classifier is the better fit. The decision boundary of a nonlinear classifier can be a curved surface or a combination of hyperplanes.
2. Types of nonlinear classifiers
Commonly used nonlinear classifiers fall into two major categories: those based on a discriminant function and those not based on a discriminant function.
Nonlinear classifiers based on a discriminant function
1) Piecewise linear classifier: the idea is that a nonlinear function can be fitted and approximated by several linear segments. An example is the piecewise linear distance classifier, i.e., a combination of several minimum distance classifiers. The minimum distance classifier is the minimum-error-rate Bayes decision rule under the conditions that the priors are equal, the features of each dimension are independent, and the variances are equal. Its idea is very simple: take the mean of each class as that class's center point, and assign a new sample to the class whose center point it is closest to (a minimal sketch follows this list);
2) Quadratic discriminant function: if each class-conditional density is a normal distribution, the Bayes decision function is a quadratic function;
3) Multilayer perceptron: this is the neural network (NN) idea, realized as a combination of multiple perceptrons;
4) SVM: we have already seen the optimal separating hyperplane, i.e., the linear SVM, as well as the error-tolerant (soft-margin) linear SVM mentioned earlier; with a kernel, the SVM becomes a nonlinear classifier;
5) Kernel function method: this naturally recalls the Fisher linear discriminant studied earlier, and indeed the kernel method includes a nonlinear (kernelized) implementation of Fisher's discriminant. The creative presentation of the kernel method, however, comes mainly from the two central ideas of the SVM: the large margin and the kernel function. Using these two ideas, people have reworked traditional linear methods to form the kernel function method, also simply called the kernel method;
6) LR, i.e., logistic regression, a generalized linear classifier.
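To make item 1 concrete, here is a minimal sketch of a two-class minimum distance classifier in Python/NumPy (the toy data and function names are illustrative assumptions, not from the original text): each class is represented by its sample mean, and a new sample is assigned to the class whose mean is nearest. A piecewise linear classifier combines several such units, yielding a piecewise linear decision boundary.

```python
import numpy as np

def fit_means(X_pos, X_neg):
    """Represent each class by the mean of its training samples."""
    return X_pos.mean(axis=0), X_neg.mean(axis=0)

def predict(x, mu_pos, mu_neg):
    """Assign x to the class whose center point (mean) is nearer.

    Under equal priors, per-dimension independent features, and equal
    variances, this nearest-mean rule is the minimum-error-rate Bayes
    decision described in item 1.
    """
    d_pos = np.linalg.norm(x - mu_pos)
    d_neg = np.linalg.norm(x - mu_neg)
    return +1 if d_pos <= d_neg else -1

# Toy two-class Gaussian data (illustrative only).
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(50, 2))
X_neg = rng.normal(loc=[-2.0, -2.0], scale=1.0, size=(50, 2))
mu_pos, mu_neg = fit_means(X_pos, X_neg)
print(predict(np.array([1.5, 1.0]), mu_pos, mu_neg))  # expected: +1
```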
Non-linear classifiers not based on a discriminant function
1) Nearest neighbor methods: e.g., the nearest neighbor rule, k-nearest neighbors, etc. (minimal usage sketches of the four methods in this list follow below);
2) Decision tree
3) Random Forest
4) Boosting method
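All four of these methods are available off the shelf; the following sketch (assuming scikit-learn and a toy two-moons dataset, neither of which comes from the original text) trains each of them on the same nonlinearly separable two-class problem:

```python
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier

# Toy two-class problem with a nonlinear boundary (illustrative only).
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

models = {
    "nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "boosting (AdaBoost)": AdaBoostClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    model.fit(X, y)
    # Training accuracy only, to keep the sketch short.
    print(f"{name}: {model.score(X, y):.2f}")
```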
3. Multi-class classification problems
For a k-class classification problem, what we need to do is determine which of the k categories a sample belongs to. This is in fact a generalization of the two-class problem, and the handling methods are likewise extensions of two-class approaches.
The commonly used multi-classification methods can be divided into two types: One-Versus-All and One-Versus-One.
One-Versus-All
1. Train k classifiers using logistic regression or another suitable classification method. The training procedure for classifier Dk is: take the samples belonging to the kth class as one class and the samples of the remaining k-1 classes as the other class, then train a two-class classifier on these samples;
2. At decision time, feed a test sample x into each of the k classifiers, and take the class whose classifier gives the largest output (i.e., the highest likelihood) as the class of the sample.
Sample imbalance problem: when each classifier in the one-versus-all scheme is trained, the training set pits 1 class against the other k-1 classes, so the positive and negative sample counts can differ greatly, e.g., 1 positive sample versus 99 negative samples. Training on such data can easily produce the trivial classifier D(x) = -1, which outputs -1 no matter what the input is; its error rate is as small as 0.01, yet the training has achieved nothing. One way to deal with this problem is the one-versus-one scheme.
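The following is a minimal sketch of the one-versus-all scheme with logistic regression as the base classifier (assuming scikit-learn and the iris dataset as a stand-in k = 3 problem, neither of which is from the original text; class_weight="balanced" is one common reweighting that softens the 1-versus-(k-1) imbalance just described):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # toy k = 3 problem
classes = np.unique(y)

# Train one classifier per class: class k versus the remaining k-1 classes.
# class_weight="balanced" reweights samples against the 1-vs-(k-1) imbalance.
classifiers = []
for k in classes:
    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    clf.fit(X, (y == k).astype(int))
    classifiers.append(clf)

def predict(x):
    # Feed x into every classifier; the class with the largest output
    # (highest predicted probability of "positive") wins.
    scores = [clf.predict_proba(x.reshape(1, -1))[0, 1] for clf in classifiers]
    return classes[int(np.argmax(scores))]

print(predict(X[0]), "true:", y[0])
```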
One-Versus-One
1. Train k(k-1)/2 classifiers, one for each pair of classes, each using only the samples of its two classes as positive and negative training samples;
2. At decision time a voting rule is used: the test sample x is fed into all k(k-1)/2 classifiers, each classifier outputs one decision, and the class that receives the most votes is taken as the class of the sample.
The one-versus-one scheme has no sample imbalance problem, but it is slower than the one-versus-all scheme, since k(k-1)/2 classifiers must be trained and evaluated instead of k.
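For comparison, here is a minimal sketch of one-versus-one voting under the same assumptions (scikit-learn, iris data); scikit-learn also ships this scheme ready-made as sklearn.multiclass.OneVsOneClassifier:

```python
import numpy as np
from itertools import combinations
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # toy k = 3 problem
classes = np.unique(y)

# Train k(k-1)/2 classifiers, one per pair of classes, each using only
# the samples of its own two classes as training data.
pairwise = {}
for a, b in combinations(classes, 2):
    mask = (y == a) | (y == b)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[mask], y[mask])
    pairwise[(a, b)] = clf

def predict(x):
    # Voting rule: each pairwise classifier casts one vote;
    # the class with the most votes is taken as the answer.
    votes = {c: 0 for c in classes}
    for clf in pairwise.values():
        votes[clf.predict(x.reshape(1, -1))[0]] += 1
    return max(votes, key=votes.get)

print(predict(X[0]), "true:", y[0])
```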