## Bayes' Theorem and Naive Bayes Classifier
### Bayes' Theorem
Bayes' theorem provides a way to update the probability estimate for a hypothesis as more evidence or information becomes available.
#### Formula
For events $A$ and $B$:
$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$
where:
- $P(A|B)$ is the posterior probability of $A$ given $B$.
- $P(B|A)$ is the likelihood of $B$ given $A$.
- $P(A)$ is the prior probability of $A$.
- $P(B)$ is the marginal probability of $B$ (the evidence).
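As a quick numerical illustration (with made-up numbers): suppose a test detects a condition with 99% sensitivity, has a 5% false-positive rate, and the condition has a 1% prior. Bayes' theorem then gives the probability of actually having the condition after a positive test:
```python
# Worked Bayes' theorem example (illustrative numbers only)
# A = "has condition", B = "test is positive"
p_a = 0.01              # prior P(A)
p_b_given_a = 0.99      # likelihood P(B|A): sensitivity
p_b_given_not_a = 0.05  # false-positive rate P(B|not A)

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Posterior via Bayes' theorem
p_a_given_b = p_b_given_a * p_a / p_b
print(f"P(A|B) = {p_a_given_b:.3f}")  # about 0.167
```
Despite the positive result, the posterior is only about 17%, because the low prior dominates.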
### Naive Bayes Classifier
Naive Bayes is a probabilistic classifier based on Bayes' theorem, with the "naive" assumption that features are conditionally independent given the class.
#### Formula
For a given class $C$ and features $x_1, x_2, \ldots, x_n$:
$P(C|x_1, x_2, \ldots, x_n) \propto P(C) \prod_{i=1}^{n} P(x_i|C)$
#### Python Implementation
```python
from sklearn.naive_bayes import GaussianNB
# Toy dataset: two features, two well-separated classes
X = [[1, 2], [2, 1], [2, 3], [4, 5], [5, 4], [5, 6]]
y = [0, 0, 0, 1, 1, 1]
# Fit a Gaussian Naive Bayes classifier and predict on the training data
model = GaussianNB()
model.fit(X, y)
y_pred = model.predict(X)
# Results
print("Predicted labels:", y_pred)
```
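To connect `GaussianNB` back to the formula above, the sketch below recomputes $P(C) \prod_{i} P(x_i|C)$ by hand from per-class Gaussian densities for an arbitrary test point, and compares the normalized result with `predict_proba`. (scikit-learn adds a tiny `var_smoothing` term to the variances, so minute numerical differences are possible.)
```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[1, 2], [2, 1], [2, 3], [4, 5], [5, 4], [5, 6]])
y = np.array([0, 0, 0, 1, 1, 1])
model = GaussianNB().fit(X, y)

def gaussian_pdf(x, mean, var):
    """Univariate normal density, evaluated element-wise."""
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Manual posterior for one point: P(C) * prod_i N(x_i; mean_iC, var_iC)
x_new = np.array([2, 2])
joint = []
for c in (0, 1):
    Xc = X[y == c]
    prior = len(Xc) / len(X)
    joint.append(prior * np.prod(gaussian_pdf(x_new, Xc.mean(axis=0), Xc.var(axis=0))))
posterior = np.array(joint) / sum(joint)

print("Manual posterior:", posterior)
print("predict_proba   :", model.predict_proba([x_new])[0])
```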
---
## Support Vector Machines (SVM)
Support Vector Machines (SVM) are supervised learning models used for classification and regression. They aim to find the hyperplane that best separates the data into classes.
### Mathematical Formulation
For a binary classification problem, the SVM algorithm finds the optimal hyperplane:
$w \cdot x - b = 0$
where $w$ is the weight vector and $b$ is the bias. The goal is to maximize the margin, defined as:
$\text{Margin} = \frac{2}{\|w\|}$
### Python Implementation
```python
from sklearn import svm
# Toy dataset: two features, two linearly separable classes
X = [[1, 2], [2, 1], [2, 3], [4, 5], [5, 4], [5, 6]]
y = [0, 0, 0, 1, 1, 1]
# Fit a linear SVM classifier and predict on the training data
model = svm.SVC(kernel='linear')
model.fit(X, y)
y_pred = model.predict(X)
# Results
print("Predicted labels:", y_pred)
```
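With a linear kernel, the fitted hyperplane is exposed through `coef_` and `intercept_`, so the margin formula above can be checked directly on the trained model:
```python
import numpy as np
from sklearn import svm

X = [[1, 2], [2, 1], [2, 3], [4, 5], [5, 4], [5, 6]]
y = [0, 0, 0, 1, 1, 1]
model = svm.SVC(kernel='linear').fit(X, y)

# Note: scikit-learn's convention is w.x + b = 0, so intercept_
# corresponds to -b in the formulation above
w = model.coef_[0]
b = model.intercept_[0]
margin = 2 / np.linalg.norm(w)

print("w =", w, " intercept =", b)
print("Margin =", margin)
print("Support vectors:\n", model.support_vectors_)
```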
---
## Linear Discriminant Analysis (LDA)
Linear Discriminant Analysis (LDA) is a classification method that projects data onto a lower-dimensional space to maximize class separability.
### Mathematical Formulation
LDA finds the linear combinations of features that best separate two or more classes. The objective is to maximize the ratio of between-class variance to within-class variance, so that the projected classes are as far apart and as compact as possible.
#### Formula
The LDA maximizes the following objective function:
$J(w) = \frac{w^T S_B w}{w^T S_W w}$
where:
- $S_B$ is the between-class scatter matrix.
- $S_W$ is the within-class scatter matrix.
### Python Implementation
```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
# Toy dataset: two features, two well-separated classes
X = [[1, 2], [2, 1], [2, 3], [4, 5], [5, 4], [5, 6]]
y = [0, 0, 0, 1, 1, 1]
# Fit an LDA classifier and predict on the training data
model = LinearDiscriminantAnalysis()
model.fit(X, y)
y_pred = model.predict(X)
# Results
print("Predicted labels:", y_pred)
```
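To make the objective $J(w)$ concrete, the sketch below builds $S_W$ and $S_B$ from the same toy data and takes the leading eigenvector of $S_W^{-1} S_B$, which maximizes $J(w)$. This is a from-scratch illustration of the criterion, not how scikit-learn's default `svd` solver works internally:
```python
import numpy as np

X = np.array([[1, 2], [2, 1], [2, 3], [4, 5], [5, 4], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

overall_mean = X.mean(axis=0)
S_W = np.zeros((2, 2))  # within-class scatter
S_B = np.zeros((2, 2))  # between-class scatter
for c in np.unique(y):
    Xc = X[y == c]
    mean_c = Xc.mean(axis=0)
    S_W += (Xc - mean_c).T @ (Xc - mean_c)
    diff = (mean_c - overall_mean).reshape(-1, 1)
    S_B += len(Xc) * (diff @ diff.T)

# Direction maximizing J(w): leading eigenvector of S_W^{-1} S_B
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
w = eigvecs[:, np.argmax(eigvals.real)].real

print("Discriminant direction w:", w)
print("J(w) =", (w @ S_B @ w) / (w @ S_W @ w))
```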
---
## Quadratic Discriminant Analysis (QDA)
Quadratic Discriminant Analysis (QDA) is a classification method similar to LDA but allows for quadratic decision boundaries.
### Mathematical Formulation
QDA assumes that each class follows a Gaussian distribution, but unlike LDA, it does not assume that the covariance matrices of the classes are identical.
#### Formula
For a given class $C$ and feature vector $x$:
$P(C|x) \propto P(C) \, |\Sigma_C|^{-1/2} \exp \left( -\frac{1}{2} (x - \mu_C)^T \Sigma_C^{-1} (x - \mu_C) \right)$
where:
- $\mu_C$ is the mean vector of class $C$.
- $\Sigma_C$ is the covariance matrix of class $C$.
Unlike in LDA, the normalization factor $|\Sigma_C|^{-1/2}$ cannot be absorbed into the proportionality constant, because $\Sigma_C$ differs from class to class.
### Python Implementation
```python
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
# Toy dataset: three samples per class so each class covariance is invertible
X = [[1, 2], [2, 1], [2, 3], [4, 5], [5, 4], [5, 6]]
y = [0, 0, 0, 1, 1, 1]
# Fit a QDA classifier and predict on the training data
model = QuadraticDiscriminantAnalysis()
model.fit(X, y)
y_pred = model.predict(X)
# Results
print("Predicted labels:", y_pred)
```
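The class scores behind `predict` can be recomputed by hand: taking the log of the expression above gives $\log P(C) - \frac{1}{2} \log |\Sigma_C| - \frac{1}{2} (x - \mu_C)^T \Sigma_C^{-1} (x - \mu_C)$, and the predicted class is the one with the largest score. A minimal sketch on the same toy data, for an arbitrary test point:
```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

X = np.array([[1, 2], [2, 1], [2, 3], [4, 5], [5, 4], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
model = QuadraticDiscriminantAnalysis().fit(X, y)

# Manual quadratic discriminant score per class for one point
x_new = np.array([3.0, 3.0])
scores = []
for c in (0, 1):
    Xc = X[y == c]
    prior = len(Xc) / len(X)
    mu = Xc.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)  # unbiased estimate, matching scikit-learn
    diff = x_new - mu
    scores.append(np.log(prior)
                  - 0.5 * np.log(np.linalg.det(cov))
                  - 0.5 * diff @ np.linalg.inv(cov) @ diff)

print("Manual prediction:", int(np.argmax(scores)))
print("QDA prediction   :", model.predict([x_new])[0])
```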
## Conclusion
### Summary of Advanced Regressors and Classifiers
- **Bayes' Theorem and Naive Bayes Classifier:** Probabilistic models based on Bayes' theorem, assuming conditional independence between features given the class.
- **Support Vector Machines (SVM):** Models that find the optimal hyperplane to separate data into classes.
- **Linear Discriminant Analysis (LDA):** Projects data onto a lower-dimensional space to maximize class separability using linear decision boundaries.
- **Quadratic Discriminant Analysis (QDA):** Similar to LDA but allows for quadratic decision boundaries by considering different covariance matrices for each class.
Continue: [[07-Unsupervised Learning Algorithms]]