Key technologies involved in machine learning
Bayesian
A typical example is Naive Bayes. Its central idea is to use conditional probabilities to compute which class an unlabeled point most likely belongs to.
It is a relatively simple model, but one that spam filters still rely on today.
Applicable situation:
When you need a model that is fairly easy to explain and whose feature dimensions have little correlation with one another.
It can handle high-dimensional data efficiently, although the results may not always be satisfactory.
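To make this concrete, here is a minimal sketch of a Naive Bayes spam filter using scikit-learn (the library choice and the toy messages are my own, not the article's):

```python
# Minimal Naive Bayes sketch: classify toy "spam"-like messages by word counts.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win money now", "cheap pills win", "meeting at noon", "lunch with the team"]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = normal mail (made-up toy data)

vec = CountVectorizer()
X = vec.fit_transform(texts)           # sparse, high-dimensional word counts
clf = MultinomialNB().fit(X, labels)   # learns per-class word probabilities

print(clf.predict(vec.transform(["win cheap money"])))  # -> [1], i.e. spam
```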
Nearest Neighbor
A typical example is KNN. The idea: for a point to be classified, find the data points closest to it and decide its class according to theirs.
Its defining trait is that it follows the data completely, with no mathematical model at all.
Applicable situation:
When you need a model that is especially easy to explain.
For example, a recommendation algorithm that has to explain its reasoning to the user.
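A minimal KNN sketch in the same spirit (scikit-learn again; the data points and labels are invented for illustration):

```python
# KNN stores the data and votes among the k closest points; no model is fit.
from sklearn.neighbors import KNeighborsClassifier

X = [[150, 40], [160, 50], [180, 80], [185, 90]]  # toy (height cm, weight kg)
y = ["child", "child", "adult", "adult"]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)  # "fit" only stores the data
print(knn.predict([[178, 75]]))          # majority vote of the 3 nearest points
dist, idx = knn.kneighbors([[178, 75]])  # the neighbors themselves are the explanation
print(idx)                               # indices of the points that decided the vote
```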
Discriminant analysis
Discriminant analysis is mainly used on the statistics side, so I am not very familiar with it; for now I have had a friend in the statistics department give me a crash course, and I am passing along what I just learned.
A typical example of discriminant analysis is Linear Discriminant Analysis (LDA).
(Be careful not to confuse this with Latent Dirichlet Allocation. Both are called LDA, but they are not the same thing.)
The central idea of LDA is to project high-dimensional samples into a low-dimensional space. To separate two categories, project onto one dimension; to separate three, project onto a two-dimensional plane. There are of course many possible projections. The criterion of the LDA projection is to bring samples of the same class as close together as possible and to push samples of different classes as far apart as possible. Samples to be classified later are projected the same way, after which their categories are easy to tell apart.
Applicable situation:
Discriminant analysis suits situations where high-dimensional data needs dimensionality reduction, and the built-in reduction makes it convenient to inspect the sample distribution. Its correctness can be proven with mathematical formulas, so it is also a very reliable method.
However, its classification accuracy is often not very high, so people outside statistics mostly use it as a dimensionality-reduction tool.
Keep in mind that it assumes the samples are normally distributed, so do not try it on data shaped like concentric circles.
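A minimal LDA sketch on the iris data (my choice of example, not the article's); with three classes the samples can be projected onto at most two discriminant axes:

```python
# LDA both reduces dimensionality and classifies; here 4-D iris goes to 2-D.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)            # 4 features, 3 classes
lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)

X2 = lda.transform(X)    # same-class points end up close together in 2-D
print(X2.shape)          # (150, 2): the built-in dimensionality reduction
print(lda.score(X, y))   # it can also classify directly
```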
Neural network
Neural networks are very hot right now. Their central idea is to improve the parameters gradually using the training samples. Take the height example again: suppose one input feature is gender (1: male; 0: female) and the output is height (1: tall; 0: short). When a training sample is a tall boy, the path from "male" to "tall" in the network is strengthened. In the same way, when a tall girl comes along, the path from "female" to "tall" is strengthened.
In the end, which paths in the network are strong is determined by our samples.
The advantage of a neural network is that it can have many layers. If the input were connected directly to the output, it would be no different from LR. With many intermediate layers, however, it can capture relationships among the input features. There are classic visualizations of what the different layers of a convolutional neural network learn, which I won't go into here.
Neural networks actually appeared very early, but their accuracy depends on a huge training set, and they were originally limited by the speed of computers, so for a long time their classification performance lagged behind classical algorithms such as random forest and SVM.
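To show the mechanics on the article's toy example, here is a minimal sketch with scikit-learn's MLPClassifier (the data is invented and far too small to train a real network; it only illustrates how samples shape the weights):

```python
from sklearn.neural_network import MLPClassifier

X = [[1], [1], [1], [0], [0], [0]]   # gender feature: 1 = male, 0 = female
y = [1, 1, 0, 0, 0, 1]               # height label: 1 = tall, 0 = short (toy data)

# One small hidden layer; with no hidden layer the model is essentially LR.
net = MLPClassifier(hidden_layer_sizes=(4,), max_iter=3000, random_state=0)
net.fit(X, y)                         # each sample nudges the connection weights
print(net.predict_proba([[1], [0]]))  # learned "path strengths", read as probabilities
```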
Decision tree
The characteristic of a decision tree is that it always splits along one feature at a time. As the layers deepen, this division becomes finer and finer.
Even if the generated tree is not shown directly to the user, when analyzing data you can get an intuitive feel for the classifier's core logic by inspecting the upper levels of the tree.
For example, when we guess a child's height, the first layer of the decision tree might be the child's gender: boys go down the left subtree for further prediction, girls down the right. That would show that gender has a strong influence on height.
Applicable situation:
Because it generates a clear, feature-based tree of decisions, data analysts often use a decision tree when they want to better understand the data at hand.
At the same time, it is also a classifier that is relatively easy to attack [3]. An attack here means artificially changing some features so that the classifier misjudges the sample, as is common when spam tries to evade detection. Since a decision tree's final decision rests on single conditions, an attacker often needs to change only a few features to escape monitoring.
Limited by its simplicity, the decision tree finds its greater use as a building block of more powerful algorithms.
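A minimal decision-tree sketch (toy data invented here); printing the rules makes the layer-by-layer splits visible, which is exactly what an analyst would inspect:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[1, 6], [1, 12], [0, 6], [0, 12]]   # toy (gender, age) features
y = [0, 1, 0, 0]                         # made-up label: 1 = tall, 0 = short

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["gender", "age"]))  # the top splits, as text
```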
Random forest
When it comes to decision trees, you have to mention random forests. As the name suggests, a forest is simply many trees.
Strictly speaking, random forest is actually an ensemble algorithm. It randomly selects different features and training samples, generates many decision trees, and then aggregates the results of these trees to make the final classification.
Random forests are widely used in real-world analysis. Compared with a single decision tree they bring a large gain in accuracy, and at the same time they mitigate the decision tree's vulnerability to attack.
Applicable situation:
The data dimension is relatively low (tens of dimensions) and high accuracy is required.
Since it achieves good results without much parameter tuning, random forest is the method to try when you basically don't know what else to use.
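A minimal random-forest sketch (iris is my stand-in for a low-dimensional data set):

```python
# Many randomized trees, each seeing a bootstrap sample and a feature subset.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, random_state=0)  # defaults mostly suffice
print(cross_val_score(rf, X, y, cv=5).mean())  # solid accuracy with no real tuning
```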
SVM (Support vector machine)
The central idea of SVM is to find a separating interface between the categories, so that the two classes of samples fall as far as possible on its two sides and lie as far as possible from the interface itself.
The earliest SVMs were planar, which limited their applicability. With a kernel function, however, the plane can be bent into a curved surface, which greatly widens the range of problems SVM can handle.
The improved SVM is applied in many settings and shows excellent accuracy in real-world classification.
Applicable situation:
SVM performs excellently on many data sets.
Relatively speaking, SVM's insistence on a margin between the samples makes it more resistant to attacks.
Like random forest, it is an algorithm worth trying first when you get a new data set.
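A minimal kernel-SVM sketch; the concentric circles (my example, the same shape LDA was warned about) show what the kernel buys:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: no straight line can separate them.
X, y = make_circles(n_samples=200, noise=0.1, factor=0.4, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)            # kernel trick: the interface can bend
print(linear.score(X, y), rbf.score(X, y))   # the kernel version scores far higher
```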
Logistic regression
The name "logistic regression" is rather odd, so I will just call it LR; since we are only discussing classifiers here, no other method goes by LR. Taken literally, it is a variant of the regression methods.
The core of a regression method is to find the most appropriate parameters for a function so that the function's values come as close as possible to the sample values. For example, linear regression finds the most suitable a and b for the function f(x) = ax + b.
What LR fits is not a linear function but a probability-shaped one, and the value of f(x) reflects the probability that the sample belongs to the class.
Applicable situation:
LR is the fundamental component of many classification algorithms. Its advantage is that its output naturally falls between 0 and 1 and carries a probabilistic meaning.
Since it is essentially a linear classifier, it does not handle strongly correlated features well.
Although its performance is only average, the model is clear and the probability theory behind it stands up to scrutiny. The parameters it fits represent each feature's influence on the outcome, which also makes it a good tool for understanding the data.
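Concretely, LR fits f(x) = 1 / (1 + e^(-(ax + b))), the sigmoid of a linear function, which is what pins the output to (0, 1). A minimal sketch (toy data mine):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])  # toy single feature
y = np.array([0, 0, 0, 1, 1, 1])

lr = LogisticRegression().fit(X, y)
print(lr.predict_proba([[2.0]]))   # outputs land in (0, 1), read as probabilities
print(lr.coef_, lr.intercept_)     # fitted a and b show the feature's influence
```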