Definition
Scalar: a single number, $x \in \mathbb{R}$.
Vector: an ordered array of numbers, $\mathbf{x} \in \mathbb{R}^n$.
Linearity: $f$ is linear if $f(a\mathbf{x} + b\mathbf{y}) = a f(\mathbf{x}) + b f(\mathbf{y})$.
Estimator: a function of the data, $\hat{\theta}_m = g(x^{(1)}, \dots, x^{(m)})$.
Bias: $\operatorname{bias}(\hat{\theta}_m) = \mathbb{E}[\hat{\theta}_m] - \theta$;
unbiased if $\mathbb{E}[\hat{\theta}_m] = \theta$.
Mean Squared Error: $\operatorname{MSE} = \mathbb{E}\big[(\hat{\theta}_m - \theta)^2\big] = \operatorname{bias}(\hat{\theta}_m)^2 + \operatorname{Var}(\hat{\theta}_m)$.
Under a Gaussian noise model, minimizing the mean squared error is equivalent to maximum likelihood estimation.
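The bias–variance decomposition of the MSE can be checked by simulation. A minimal sketch (the estimator, sample sizes, and distribution are arbitrary choices for illustration): estimate the mean of a Gaussian with the sample mean, then verify $\operatorname{MSE} = \operatorname{bias}^2 + \operatorname{Var}$ empirically.

```python
import numpy as np

# Hypothetical setup: estimate mu of N(3, 1) with the sample mean of 20 draws,
# repeated 50,000 times to approximate the estimator's sampling distribution.
rng = np.random.default_rng(0)
true_mu = 3.0
estimates = rng.normal(true_mu, 1.0, size=(50_000, 20)).mean(axis=1)

bias = estimates.mean() - true_mu           # E[theta_hat] - theta
variance = estimates.var()                  # Var(theta_hat), ddof=0
mse = np.mean((estimates - true_mu) ** 2)   # E[(theta_hat - theta)^2]

# the decomposition MSE = bias^2 + variance is an algebraic identity
assert abs(mse - (bias ** 2 + variance)) < 1e-8
```

The identity holds exactly for any finite sample, since it is just an expansion of the square; the simulation only makes the three terms concrete.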
Moore-Penrose pseudoinverse: $A^{+} = \lim_{\alpha \to 0} (A^\top A + \alpha I)^{-1} A^\top$; when the columns of $A$ are linearly independent, $A^{+} = (A^\top A)^{-1} A^\top$.
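A sketch checking the closed form against NumPy's `pinv` for a tall matrix with independent columns (the matrix and right-hand side are made-up examples):

```python
import numpy as np

# For a tall A with linearly independent columns, A+ = (A^T A)^{-1} A^T.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
A_plus = np.linalg.inv(A.T @ A) @ A.T
assert np.allclose(A_plus, np.linalg.pinv(A))

# A+ b is the least-squares solution of A x = b; this b lies in the
# column space of A, so the fit is exact.
b = np.array([1.0, 2.0, 3.0])
x = A_plus @ b
assert np.allclose(A @ x, b)
```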
Probability
$P(A, B)$ is the probability that $A$ and $B$ both happen.
$P(A \mid B)$ is the probability of $A$ given that $B$ is known to have happened.
Conditional Probability
$P(A \mid B) = \dfrac{P(A, B)}{P(B)}$. If $A$ and $B$ are independent, so that $P(A, B) = P(A)\,P(B)$, then $P(A \mid B) = P(A)$.
Bayes's Rule: $P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$
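Bayes's rule is easy to misjudge numerically; a classic worked example (the prevalence, sensitivity, and false-positive rate below are hypothetical numbers):

```python
# P(disease | positive) = P(positive | disease) P(disease) / P(positive)
p_disease = 0.01                 # prevalence
p_pos_given_disease = 0.99       # sensitivity
p_pos_given_healthy = 0.05       # false-positive rate

# marginalize: P(positive) = sum over both cases
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
# despite the accurate test, P(disease | positive) is only 1/6
```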
Expectation: $\mathbb{E}_{x \sim P}[f(x)] = \sum_x P(x)\,f(x)$ (discrete) or $\int p(x)\,f(x)\,dx$ (continuous).
Variance: $\operatorname{Var}(X) = \mathbb{E}\big[(X - \mathbb{E}[X])^2\big]$
Covariance: $\operatorname{Cov}(X, Y) = \mathbb{E}\big[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])\big]$
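A sketch computing variance and covariance directly from their definitions and checking against NumPy's built-ins (the data are arbitrary correlated samples):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(size=1000)          # y correlated with x

var_x = np.mean((x - x.mean()) ** 2)          # E[(X - E[X])^2]
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

assert abs(var_x - x.var()) < 1e-12
assert abs(cov_xy - np.cov(x, y, ddof=0)[0, 1]) < 1e-12
```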
Normal Distribution: $\mathcal{N}(x; \mu, \sigma^2) = \dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\!\Big(\!-\dfrac{(x - \mu)^2}{2\sigma^2}\Big)$
If $X \sim \mathcal{N}(\mu, \sigma^2)$, then $\mathbb{E}[X] = \mu$ and $\operatorname{Var}(X) = \sigma^2$.
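A sketch checking that the Gaussian density integrates to 1 and that the sample mean and variance match $\mu$ and $\sigma^2$ (the parameters, grid, and tolerances are arbitrary choices):

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    # N(x; mu, sigma^2) density
    return np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

rng = np.random.default_rng(1)
samples = rng.normal(2.0, 0.5, size=200_000)
assert abs(samples.mean() - 2.0) < 0.01       # E[X] = mu
assert abs(samples.var() - 0.25) < 0.01       # Var(X) = sigma^2

# trapezoid rule over a wide grid: the density integrates to 1
xs = np.linspace(-4.0, 8.0, 10_001)
fx = normal_pdf(xs, 2.0, 0.5)
integral = np.sum((fx[:-1] + fx[1:]) / 2.0) * (xs[1] - xs[0])
assert abs(integral - 1.0) < 1e-6
```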
Logistic sigmoid: $\sigma(x) = \dfrac{1}{1 + e^{-x}}$
Softplus function: $\zeta(x) = \log(1 + e^{x})$
Softmax: $\operatorname{softmax}(\mathbf{x})_i = \dfrac{e^{x_i}}{\sum_j e^{x_j}}$
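Minimal sketches of the sigmoid, softplus, and softmax, with the usual numerical-stability tricks (subtracting the max before exponentiating in softmax, and the `log1p` rewrite of softplus):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    # stable form of log(1 + e^x): max(x, 0) + log1p(e^{-|x|})
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

def softmax(x):
    z = np.exp(x - np.max(x))    # shifting by max(x) avoids overflow
    return z / z.sum()

x = np.array([1.0, 2.0, 3.0])
assert abs(softmax(x).sum() - 1.0) < 1e-12
assert np.allclose(softmax(x), softmax(x + 100.0))  # shift invariance
assert np.allclose(softplus(x) - softplus(-x), x)   # zeta(x) - zeta(-x) = x
assert abs(sigmoid(0.0) - 0.5) < 1e-12
```

The shift invariance of softmax is what makes the max-subtraction trick valid, and the softplus identity $\zeta(x) - \zeta(-x) = x$ is a quick sanity check on the stable rewrite.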
$L^p$ norm: $\|\mathbf{x}\|_p = \Big(\sum_i |x_i|^p\Big)^{1/p}$
XOR
A linear model cannot represent the XOR function: no line in the plane separates $\{(0,0), (1,1)\}$ from $\{(0,1), (1,0)\}$.
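This failure can be made concrete: fitting $\hat{y} = \mathbf{w}^\top \mathbf{x} + b$ to the four XOR points by least squares gives the constant prediction $0.5$, so no threshold on the output can separate the two classes. A sketch:

```python
import numpy as np

# the four XOR input/output pairs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

A = np.hstack([X, np.ones((4, 1))])     # append a bias column
w, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ w

# the best linear fit is w = 0, b = 0.5: the output is 0.5 everywhere
assert np.allclose(pred, 0.5)
```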
General Problem
For a data set with $m$ samples
Linear Case
Model: $\hat{y} = \mathbf{w}^\top \mathbf{x} + b$
Likelihood function
The likelihood $L(\theta \mid x) = p(x \mid \theta)$ is the probability that a particular outcome $x$ is observed when the true value of the parameter is $\theta$.
Unlike probabilities, likelihood functions do not have to integrate (or sum) to 1 over $\theta$.
Quadratic Problem
If $X^\top X$ is invertible, $\mathbf{w} = (X^\top X)^{-1} X^\top \mathbf{y}$.
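A sketch of the normal-equations solution on synthetic data (the dimensions, true weights, and noise level are arbitrary), checked against NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=100)   # small Gaussian noise

# closed form: w = (X^T X)^{-1} X^T y (valid when X^T X is invertible)
w = np.linalg.inv(X.T @ X) @ X.T @ y
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

assert np.allclose(w, w_lstsq)
assert np.allclose(w, true_w, atol=0.05)       # recovers the true weights
```

In practice `lstsq` (or a QR/SVD-based solver) is preferred over forming $(X^\top X)^{-1}$ explicitly, since inverting squares the condition number.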
Summary
The following are equivalent for a linear model with Gaussian noise:
- maximum likelihood estimation
- least-squares regression
- minimizing the cross-entropy between the empirical and model distributions
- minimizing the KL divergence
To prevent overfitting (equivalent, with a Gaussian prior on the weights):
- maximum a posteriori (MAP) estimation
- regularized least squares
Cross-entropy: the negative log-likelihood of a Bernoulli or softmax distribution.
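A sketch of this identity in the binary case (the labels and predicted probabilities are made-up numbers): the binary cross-entropy loss is exactly the average negative log-likelihood of a Bernoulli model.

```python
import numpy as np

y = np.array([1.0, 0.0, 1.0, 1.0])       # labels
p = np.array([0.9, 0.2, 0.7, 0.6])       # predicted P(y = 1)

# binary cross-entropy loss
cross_entropy = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
# Bernoulli negative log-likelihood: -log P(y | p), averaged
nll = -np.mean(np.log(np.where(y == 1, p, 1 - p)))

assert abs(cross_entropy - nll) < 1e-12
```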
PCA
- maximize the variance of the data after projection
maximize variance -- Lagrange multiplier --> eigenvector of the covariance matrix
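The eigenvector route can be sketched directly (the data below are an arbitrary anisotropic Gaussian cloud): the top eigenvector of the covariance matrix is the direction whose projection has maximal variance, and that variance equals the top eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(3)
# anisotropic 2-D data: stretch an isotropic cloud with a fixed linear map
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
Xc = X - X.mean(axis=0)                   # center the data

cov = Xc.T @ Xc / (len(Xc) - 1)           # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)    # ascending eigenvalues
top = eigvecs[:, -1]                      # first principal component

# variance along the top eigenvector equals the largest eigenvalue ...
assert abs((Xc @ top).var(ddof=1) - eigvals[-1]) < 1e-9
# ... and no other unit direction does better
for theta in np.linspace(0.0, np.pi, 100):
    d = np.array([np.cos(theta), np.sin(theta)])
    assert (Xc @ d).var(ddof=1) <= eigvals[-1] + 1e-9
```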
Theorem (Bochner’s theorem)
A continuous function of the form $k(x, y) = k(x - y)$ is positive definite if and only if $k(\delta)$ is the Fourier transform of a non-negative measure.
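A sketch of the theorem in action, via random Fourier features (a standard consequence, not part of the theorem's statement; the points and sample count are arbitrary): the RBF kernel $k(\delta) = e^{-\|\delta\|^2/2}$ is the Fourier transform of a Gaussian measure, so averaging $\cos(w^\top(x - y))$ over frequencies $w \sim \mathcal{N}(0, I)$ approximates the kernel.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n_features = 3, 200_000
W = rng.normal(size=(n_features, d))      # frequencies w ~ N(0, I_d)

x = np.array([0.1, -0.2, 0.3])
y = np.array([0.4, 0.0, -0.1])

# Monte Carlo estimate of k(x - y) = E_w[cos(w^T (x - y))]
approx = np.cos(W @ (x - y)).mean()
exact = np.exp(-np.sum((x - y) ** 2) / 2.0)
assert abs(approx - exact) < 0.01
```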