Probability theory in artificial intelligence
Probability theory is also an essential mathematical foundation for AI research. With the rise of connectionism, probability and statistics have replaced mathematical logic as the mainstream tools of artificial intelligence research.
Like linear algebra, probability theory represents a way of looking at the world, and its focus is the possibility that pervades everything. Giving the likelihood of random events a rigorous mathematical description is the axiomatization of probability theory, and the axiomatic construction of probability reflects an understanding of its essence.
When the same coin is tossed 10 times, the number of heads may be anywhere from none to all ten, corresponding to frequencies of 0% and 100% respectively. The frequency clearly fluctuates randomly, but as the number of repeated trials grows, the frequency of a particular event stabilizes, gradually approaching a constant.
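This stabilization of frequency can be sketched with a quick simulation of fair-coin tosses (an illustrative sketch; the checkpoint counts are our own choice):

```python
import random

random.seed(0)

# Simulate repeated fair-coin tosses and watch the frequency of heads
# stabilize as the number of trials grows.
heads = 0
for n in range(1, 100_001):
    heads += random.random() < 0.5  # one fair-coin toss (True counts as 1)
    if n in (10, 100, 1_000, 10_000, 100_000):
        print(f"after {n:>6} tosses: frequency of heads = {heads / n:.4f}")
```

With few tosses the frequency may wander far from 0.5, but after 100,000 tosses it settles very close to it.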
Because a stable frequency is the expression of statistical regularity, it is a reasonable idea to estimate frequency through a large number of independent repeated trials and use it to represent the probability of an event.
In the quantitative calculation of probability, the frequentist school relies on the classical probability model. In the classical model, an experiment has only a finite number of elementary events, and each elementary event is equally likely. Assuming the total number of elementary events is n and the random event A of interest contains k of them, the probability of an event in the classical model is:

P(A) = k / n
From this fundamental formula, we can deduce the probability of complex random events.
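The classical formula amounts to counting outcomes. As a small sketch (the choice of two dice and the event "sum is 7" is our own example), the sample space can simply be enumerated:

```python
from itertools import product

# Classical model: every elementary outcome is equally likely, so
# P(A) = k / n, where k counts the outcomes in A and n all outcomes.
# Example: the probability that two fair dice sum to 7.
sample_space = list(product(range(1, 7), repeat=2))        # n = 36 outcomes
event = [(a, b) for a, b in sample_space if a + b == 7]    # k = 6 outcomes
p = len(event) / len(sample_space)
print(p)  # 6/36, i.e. about 0.1667
```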
The definition of probability above applies to a single random event. To describe the relationship between two random events, we need to introduce the concept of conditional probability.
Conditional probability is a new probability obtained by adjusting the sample space according to available information. Given two random events A and B, the conditional probability of A given B is the probability that event A occurs under the condition that event B has occurred. It is expressed by the following formula:

P(A|B) = P(AB) / P(B)
P(AB) in the formula is called the joint probability, the probability that events A and B occur together. If the joint probability equals the product of the two events' individual probabilities, that is, P(AB) = P(A)·P(B), then the two events do not affect each other and are said to be independent. For independent events, the conditional probability equals the unconditional one: P(A|B) = P(A).
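Both the conditional-probability formula and the independence check can be verified by counting, again using two fair dice as a hypothetical example (the events A and B below are our own choices):

```python
from itertools import product

# Enumerate two fair dice and compute P(A|B) = P(AB) / P(B) by counting.
space = list(product(range(1, 7), repeat=2))
A = {(a, b) for a, b in space if a + b == 7}   # event A: the dice sum to 7
B = {(a, b) for a, b in space if a == 1}       # event B: first die shows 1

p = lambda e: len(e) / len(space)
p_ab = p(A & B)               # joint probability P(AB)
p_a_given_b = p_ab / p(B)     # conditional probability P(A|B)

print(p_a_given_b)            # 1/6, the same as P(A)
# P(AB) = P(A) * P(B) holds here, so A and B are independent,
# and indeed P(A|B) equals P(A).
print(abs(p_ab - p(A) * p(B)) < 1e-12)
```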
Based on conditional probability, we can derive the law of total probability. Its role is to transform the probability of a complex event into a sum of probabilities of simpler events occurring under different cases.
P(A) = Σ_{i=1}^{N} P(A|B_i) · P(B_i)

where the events B_1, …, B_N form a partition of the sample space, so that

Σ_{i=1}^{N} P(B_i) = 1
The law of total probability embodies the frequentist way of handling probability problems: first make some assumptions (P(B_i)), then discuss the probability of the random event under each of those assumptions (P(A|B_i)).
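As a hypothetical illustration of the law of total probability, suppose a part comes from one of three machines (the partition) with known production shares and per-machine defect rates; all numbers below are invented for the sketch:

```python
# Partition B1..B3: which machine produced the part.
priors = {"B1": 0.5, "B2": 0.3, "B3": 0.2}          # P(Bi), sums to 1
defect_rate = {"B1": 0.01, "B2": 0.02, "B3": 0.05}  # P(A|Bi), A = "defective"

# Law of total probability: P(A) = sum over i of P(A|Bi) * P(Bi)
p_defect = sum(defect_rate[b] * priors[b] for b in priors)
print(p_defect)  # 0.5*0.01 + 0.3*0.02 + 0.2*0.05 = 0.021
```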
Rearranging the law of total probability solves the problem of "inverse probability": inferring the likelihood of each hypothesis (P(B_i|A)) under the condition that the outcome of the event (P(A)) has already been observed. The general formula is called Bayes' formula.
P(B_i|A) = P(A|B_i) · P(B_i) / Σ_{j=1}^{N} P(A|B_j) · P(B_j)
Bayes' formula can be further generalized as Bayes' theorem:
P(H|D) = P(D|H) · P(H) / P(D)
P(H) in the formula is called the prior probability, i.e., the probability assigned to the hypothesis in advance; P(D|H) is called the likelihood, the probability of observing the data given that the hypothesis holds; and P(H|D) is called the posterior probability, the probability of the hypothesis after the data has been observed.
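These three quantities fit together in a classic diagnostic sketch (all numbers are hypothetical): a rare condition with a 1% prior, and a test with 95% sensitivity and a 5% false-positive rate.

```python
# Bayes' theorem: P(H|D) = P(D|H) * P(H) / P(D).
p_h = 0.01               # prior P(H): prevalence of the condition
p_d_given_h = 0.95       # likelihood P(D|H): test positive if condition present
p_d_given_not_h = 0.05   # P(D|not H): false-positive rate

# Evidence P(D) via the law of total probability over H and not-H.
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)
posterior = p_d_given_h * p_h / p_d  # posterior P(H|D)
print(round(posterior, 4))  # about 0.161: a positive test is far from certainty
```

Despite the accurate-sounding test, the small prior keeps the posterior low, which is exactly the prior-likelihood-posterior interplay the formula describes.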
From the perspective of scientific inference, Bayes' theorem provides an entirely new logic: it seeks the most reasonable hypothesis given the observations, that is, the best theoretical explanation of the observed data. Its focus is the posterior probability. The Bayesian school of probability theory descends from this idea.
In the eyes of the Bayesian school, probability describes the degree of belief in a random event.
The frequentist school holds that the hypothesis is objective and unchanging, i.e., that a fixed prior distribution exists. Therefore, when computing the probability of a specific event, one should first determine the type and parameters of the probability distribution and then carry out the probabilistic deduction.
By contrast, the Bayesian school believes that no fixed prior distribution exists and that the parameters themselves are random variables. In other words, a hypothesis depends on the outcomes; it is not fixed and can be revised. The role of data is to continually revise the hypothesis, bringing the observer's subjective understanding of probability closer to objective reality.
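This view of a parameter as a random variable revised by data can be sketched with the standard Beta-Bernoulli update for a coin's bias (the prior choice and the toss sequence below are our own illustration, not from the original text):

```python
# The coin's bias is itself treated as a random variable.
# Start from a uniform Beta(1, 1) prior; each observed toss revises it.
alpha, beta = 1.0, 1.0            # Beta(1, 1): uniform prior over the bias

observations = [1, 0, 1, 1, 1, 0, 1, 1]  # 1 = heads, 0 = tails
for x in observations:
    alpha += x                    # a head shifts belief toward a higher bias
    beta += 1 - x                 # a tail shifts belief toward a lower bias

posterior_mean = alpha / (alpha + beta)   # E[bias | data]
print(posterior_mean)  # (1 + 6) / (2 + 8) = 0.7
```

Each observation moves the distribution over the parameter, which is precisely the "data continually revising the hypothesis" described above.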
Probability theory is, besides linear algebra, another theoretical foundation of artificial intelligence.