Introduction

Date: 23.04.25

Writer: 9tailwolf : doryeon514@gm.gist.ac.kr

This chapter is an introduction to Natural Language Processing (NLP). We study the definition of NLP and the basic mathematics behind it.

What is NLP?

Definition of NLP

Natural Language Processing (NLP) is a field that studies how computers can understand human language. NLP consists of two parts: Natural Language Understanding (NLU) and Natural Language Generation (NLG). The algorithms for NLU and NLG are different, but they are generally used together.

Paradigm of NLP

Rule Based NLP : Algorithms that process natural language with hand-written rules. Writing the rules is hard because a language has a very large number of them.
Statistics Based NLP : Algorithms that process natural language with statistics. This approach started from the idea that the rules of a language are related to the statistics of its words.
Deep Learning Based NLP : Algorithms that process natural language with deep learning.
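
To make the difference between the first two paradigms concrete, here is a minimal sketch. The corpus, the `DETERMINERS` set, and the function names are all invented for illustration; a real system would be far larger. The rule-based part relies on a list a person wrote by hand, while the statistics-based part estimates \(P(word|prev)\) from bigram counts.

```python
from collections import Counter

# Toy corpus, invented for illustration only.
corpus = ["the cat sat", "the dog sat", "the cat ran"]

# Rule-based: a person writes the rule by hand.
DETERMINERS = {"the", "a", "an"}

def is_determiner(word):
    """Return True if the hand-written rule marks `word` as a determiner."""
    return word.lower() in DETERMINERS

# Statistics-based: estimate P(word | prev) from bigram counts.
bigram_counts = Counter()
prev_counts = Counter()
for sentence in corpus:
    words = sentence.split()
    for prev, word in zip(words, words[1:]):
        bigram_counts[(prev, word)] += 1
        prev_counts[prev] += 1

def next_word_probability(prev, word):
    """Relative-frequency estimate of P(word | prev); 0.0 if `prev` was never seen."""
    if prev_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / prev_counts[prev]

print(is_determiner("the"))
print(next_word_probability("the", "cat"))
```

In this toy corpus "the" is followed by "cat" in two of the three sentences, so the estimate for that pair is 2/3. Nobody wrote that number down as a rule; it comes only from the counts.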

Basic Math

Basic Statistics

Definition of probability
The probability of an event \(x\) is written as \(P(x)\).

It satisfies \(\int^{\infty}_{-\infty} P(x) dx = 1\) in the continuous case and \(\sum_{i=0}^{n}P(i) = 1\) in the discrete case.
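
As a small check of the discrete condition, the sketch below builds a probability distribution from word counts and confirms that it sums to 1. The sentence is made up, and `Fraction` is used only so that the sum is exact rather than a rounded float.

```python
from collections import Counter
from fractions import Fraction

# Made-up sentence used as a tiny sample.
words = "the cat sat on the mat".split()

counts = Counter(words)
total = sum(counts.values())

# P(word) = count(word) / total number of words.
P = {word: Fraction(count, total) for word, count in counts.items()}

# The probabilities of all outcomes must add up to exactly 1.
assert sum(P.values()) == 1
print(P["the"])  # 1/3
```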

Kolmogorov’s axioms
When events \(a\) and \(b\) happen at the same time, we write the probability as \(P(a \land b)\), and we write the probability of \(a\) or \(b\) as \(P(a \lor b)\). They satisfy the following relation: \(P(a \lor b) = P(a) + P(b) - P(a \land b)\)
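
The relation can be checked by counting outcomes. This is a minimal sketch that assumes a fair six-sided die; the events `a` (even number) and `b` (number greater than 3) and the helper `prob` are chosen only for illustration.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (an assumption for this example).
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event, given as a set of equally likely outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

a = {2, 4, 6}  # the roll is even
b = {4, 5, 6}  # the roll is greater than 3

lhs = prob(a | b)                      # P(a or b)
rhs = prob(a) + prob(b) - prob(a & b)  # P(a) + P(b) - P(a and b)

assert lhs == rhs
print(lhs)  # 2/3
```

Subtracting \(P(a \land b)\) is needed because the outcomes 4 and 6 belong to both events and would otherwise be counted twice.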

Conditional Probability
\(P(a|b)\) is the probability of \(a\) given that \(b\) has occurred. When \(P(b) > 0\), it is defined as \(P(a|b) = \frac{P(a \land b)}{P(b)}\).
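
Here is a minimal sketch of a conditional probability estimated from counts. The (word, tag) pairs are invented, not taken from a real corpus. It computes \(P(a|b)\) as \(P(a \land b) / P(b)\), where \(b\) is "the word is book" and \(a\) is "the tag is noun".

```python
from fractions import Fraction

# Invented (word, tag) observations, for illustration only.
pairs = [
    ("book", "noun"), ("book", "verb"), ("book", "noun"),
    ("run", "verb"), ("run", "noun"), ("cat", "noun"),
]
n = len(pairs)

# P(b): the word is "book".
p_b = Fraction(sum(1 for word, _ in pairs if word == "book"), n)

# P(a and b): the word is "book" and the tag is "noun".
p_a_and_b = Fraction(
    sum(1 for word, tag in pairs if word == "book" and tag == "noun"), n
)

# P(a | b) = P(a and b) / P(b)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 2/3
```

The result matches direct counting: of the three occurrences of "book", two are tagged as a noun.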

Bayes’s Rule
Since \(P(a|b)P(b) = P(a \land b) = P(b|a)P(a)\), dividing by \(P(b)\) gives \(P(a|b) = \frac{P(b|a)P(a)}{P(b)}\).
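
A worked example may help here. The sketch below applies Bayes’s rule to a toy spam filter; all three input probabilities are invented numbers, and \(P(b)\) is obtained with the law of total probability, which is an extra step not stated above.

```python
from fractions import Fraction

# Invented numbers, for illustration only.
p_spam = Fraction(1, 5)              # P(a): a message is spam
p_word_given_spam = Fraction(1, 2)   # P(b|a): spam contains the word "free"
p_word_given_ham = Fraction(1, 20)   # P(b|not a): non-spam contains "free"

# P(b) by the law of total probability.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes's rule: P(a|b) = P(b|a) P(a) / P(b)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(p_spam_given_word)  # 5/7
```

Seeing the word raises the probability of spam from 1/5 to 5/7, because the word is ten times more likely in spam than in non-spam under these assumed numbers.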

Mean and Variance
In probability, the mean value is written as \(E(x)\), where \(E(x) = \int_{-\infty}^{\infty} xP(x) dx\) in the continuous case, or \(E(x) = \sum_{x}xP(x)\) in the discrete case.

The variance is written as \(Var(x)\), and \(Var(x) = E((x - E(x))^{2}) = E(x^{2})-E(x)^{2}\).
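
The discrete formulas can be verified numerically. This minimal sketch assumes a fair six-sided die as the distribution and checks that both expressions for the variance give the same value; the helper `expectation` is a name chosen for this example.

```python
from fractions import Fraction

# Fair six-sided die (an assumption for this example): P(x) = 1/6 for x = 1..6.
P = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(f):
    """E(f(x)) = sum over x of f(x) * P(x) for the discrete distribution P."""
    return sum(f(x) * p for x, p in P.items())

mean = expectation(lambda x: x)

# Definition: Var(x) = E((x - E(x))^2)
var_definition = expectation(lambda x: (x - mean) ** 2)

# Shortcut: Var(x) = E(x^2) - E(x)^2
var_shortcut = expectation(lambda x: x ** 2) - mean ** 2

assert var_definition == var_shortcut
print(mean, var_definition)  # 7/2 35/12
```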