Recently, in a machine learning class taught by professor Oriol Pujol at UPC/Barcelona, he described the most common algorithms, principles, and concepts to use for a wide range of machine-learning tasks. Here I share them with you and ask:
- Is there any comprehensive framework that matches tasks with approaches or methods for the different types of machine learning problems?
How do I learn a simple Gaussian? Probability, random variables, distributions; estimation, convergence and asymptotics, confidence intervals.
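As a concrete sketch of this first item (assuming NumPy and SciPy; the data below is synthetic and the parameters are just illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=500)   # samples from an "unknown" Gaussian

mu_hat = x.mean()                               # maximum likelihood estimate of the mean
sigma_hat = x.std(ddof=1)                       # unbiased estimate of the standard deviation

# 95% confidence interval for the mean (t-based)
ci = stats.t.interval(0.95, len(x) - 1, loc=mu_hat, scale=stats.sem(x))
print(mu_hat, sigma_hat, ci)
```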
How do I learn a mixture of Gaussians (MoG)? Likelihood, Expectation-Maximization (EM); generalization, model selection, cross-validation; k-means, hidden Markov models (HMMs)
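A minimal sketch of fitting a MoG with EM, assuming scikit-learn is available (the two 1-D clusters are synthetic):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(3, 1.0, 200)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(x)   # EM under the hood
print(gmm.means_.ravel(), gmm.weights_)
labels = gmm.predict(x)                                        # hard assignments, k-means-like
```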
How do I learn any density? Parametric vs. non-parametric estimation, Sobolev and other functional spaces; L2 error; kernel density estimation (KDE), optimal kernel, KDE theory
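For instance, a non-parametric density estimate via KDE (a small sketch assuming SciPy; the bandwidth rule is the library default, not a tuned choice):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)

kde = gaussian_kde(x)                 # bandwidth set by Scott's rule by default
grid = np.linspace(-4, 4, 200)
density = kde(grid)                   # estimated density evaluated on a grid
```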
How do I predict a continuous variable (regression)? Linear regression, regularization, ridge regression, and LASSO; local linear regression; conditional density estimation.
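A short regression sketch contrasting plain least squares with ridge and LASSO, assuming scikit-learn; the data and the regularization strengths are illustrative only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + 0.1 * rng.normal(size=200)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)      # shrinks all coefficients toward zero
lasso = Lasso(alpha=0.1).fit(X, y)      # drives some coefficients exactly to zero
print(ols.coef_, ridge.coef_, lasso.coef_, sep="\n")
```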
How do I predict a discrete variable (classification)? Bayes classifier, naive Bayes, generative vs. discriminative; perceptron, weight decay, linear support vector machine; nearest neighbor classifier and theory
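A classification sketch comparing a generative model (naive Bayes) with a discriminative one (linear SVM), assuming scikit-learn; the iris data set is just a convenient stand-in:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_tr, y_tr)
svm = LinearSVC(C=1.0, max_iter=10000).fit(X_tr, y_tr)
print(nb.score(X_te, y_te), svm.score(X_te, y_te))
```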
Which loss function should I use? Maximum likelihood estimation theory; L2 estimation; Bayesian estimation; minimax and decision theory; Bayesianism vs. frequentism
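A tiny illustration of how the loss shapes the estimator: under squared (L2) loss the best constant prediction is the sample mean, under absolute (L1) loss it is the median (plain NumPy, outliers injected artificially):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 95), np.full(5, 50.0)])  # a few gross outliers

print(x.mean())       # minimizes squared error, pulled by the outliers
print(np.median(x))   # minimizes absolute error, robust to them
```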
Which model should I use? AIC and BIC; Vapnik-Chervonenkis theory; cross-validation theory; bootstrapping; Probably Approximately Correct (PAC) theory; Hoeffding-derived bounds
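A minimal model-selection sketch with k-fold cross-validation, assuming scikit-learn; the candidate models differ only in the regularization parameter C, which is an arbitrary grid:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
for C in [0.01, 0.1, 1.0, 10.0]:
    scores = cross_val_score(SVC(C=C), X, y, cv=5)   # 5-fold CV estimate of generalization
    print(C, scores.mean(), scores.std())
```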
How can I learn fancier (combined) models? Ensemble learning theory; boosting; bagging; stacking
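A sketch of bagging vs. boosting over tree-based base learners, assuming scikit-learn (the defaults stand in for any ensemble configuration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

bag = BaggingClassifier(n_estimators=100)      # bootstrap resampling of base trees
boost = AdaBoostClassifier(n_estimators=100)   # sequentially reweighted weak learners
print(cross_val_score(bag, X, y, cv=5).mean())
print(cross_val_score(boost, X, y, cv=5).mean())
```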
How can I learn fancier (nonlinear) models? Generalized linear models, logistic regression; Kolmogorov theorem, generalized additive models; kernelization, reproducing kernel Hilbert spaces, non-linear SVM, Gaussian process regression
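A sketch contrasting a linear model (logistic regression) with a kernelized, non-linear SVM, assuming scikit-learn; the two-moons data is synthetic and deliberately not linearly separable:

```python
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

linear = LogisticRegression().fit(X_tr, y_tr)
rbf_svm = SVC(kernel="rbf", gamma=1.0).fit(X_tr, y_tr)   # implicit feature map via an RKHS
print(linear.score(X_te, y_te), rbf_svm.score(X_te, y_te))
```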
How can I learn fancier (compositional) models? Recursive models, decision trees, hierarchical clustering; neural networks, backpropagation, deep belief networks; graphical models, mixtures of HMMs, conditional random fields, max-margin Markov networks; log-linear models; grammars
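Two of these compositional models on the same data, as a rough sketch assuming scikit-learn: a decision tree (recursive partitioning) and a small feed-forward neural network trained by backpropagation; the architecture sizes are arbitrary:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=5).fit(X_tr, y_tr)
mlp = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000).fit(X_tr, y_tr)
print(tree.score(X_te, y_te), mlp.score(X_te, y_te))
```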
How do I reduce or relate features? Feature selection vs dimensionality reduction, wrapper methods for feature selection; causality vs correlation, partial correlation, Bayes net structure learning
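A minimal wrapper-style feature-selection sketch with recursive feature elimination (RFE), assuming scikit-learn; keeping 5 features is an arbitrary illustrative choice:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
selector = RFE(LogisticRegression(max_iter=5000), n_features_to_select=5).fit(X, y)
print(selector.support_)     # boolean mask of the retained features
print(selector.ranking_)     # 1 = selected, higher = eliminated earlier
```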
How do I create new features? principal component analysis (PCA), independent component analysis (ICA), multidimensional scaling, manifold learning, supervised dimensionality reduction, metric learning
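A short sketch of creating new features with PCA and ICA, assuming scikit-learn; two components is just an illustrative choice:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, FastICA

X, _ = load_iris(return_X_y=True)
X_pca = PCA(n_components=2).fit_transform(X)                       # directions of maximal variance
X_ica = FastICA(n_components=2, random_state=0).fit_transform(X)   # statistically independent components
print(X_pca.shape, X_ica.shape)
```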
How do I reduce or relate the data? Clustering, bi-clustering, constrained clustering; association rules and market basket analysis; ranking/ordinal regression; link analysis; relational data
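For the clustering part, a minimal k-means sketch assuming scikit-learn; the blobs are synthetic and the number of clusters is assumed known:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=600, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)   # learned prototypes
print(km.labels_[:10])       # cluster assignment of the first few points
```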
How do I treat time series? ARMA; Kalman filter and state-space models, particle filter; functional data analysis; change-point detection; cross-validation for time series
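A minimal 1-D Kalman filter sketch in plain NumPy, assuming a random-walk state observed under Gaussian noise; the noise variances are made-up illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)
T, q, r = 100, 0.01, 1.0                            # steps, process and observation noise variances
x_true = np.cumsum(rng.normal(0, np.sqrt(q), T))    # latent random walk
y = x_true + rng.normal(0, np.sqrt(r), T)           # noisy observations

x_hat, P = 0.0, 1.0                                 # initial state estimate and variance
estimates = []
for t in range(T):
    P = P + q                                       # predict: variance grows by process noise
    K = P / (P + r)                                 # Kalman gain
    x_hat = x_hat + K * (y[t] - x_hat)              # update with the innovation
    P = (1 - K) * P
    estimates.append(x_hat)
```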
How do I treat non-ideal data? covariate shift; class imbalance; missing data, irregularly sampled data, measurement errors; anomaly detection, robustness
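For the missing-data part, a small imputation sketch assuming scikit-learn's SimpleImputer; the NaNs are injected artificially:

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan], [4.0, 5.0]])
X_filled = SimpleImputer(strategy="mean").fit_transform(X)   # column means replace the NaNs
print(X_filled)
```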
How do I optimize the parameters? Unconstrained vs. constrained/convex optimization, derivative-free methods, first- and second-order methods, backfitting; natural gradient; bound optimization and EM
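A toy first-order example: gradient descent on a least-squares objective in plain NumPy; the step size and iteration count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

w = np.zeros(3)
lr = 0.01
for _ in range(1000):
    grad = X.T @ (X @ w - y) / len(y)   # gradient of the mean squared error
    w -= lr * grad                      # first-order update
print(w)
```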
How do I optimize linear functions? computational linear algebra, matrix inversion for regression, singular value decomposition (SVD) for dimensionality reduction
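A sketch of using the SVD both to solve least squares without an explicit matrix inversion and to build a low-rank approximation, in plain NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.5]) + 0.1 * rng.normal(size=100)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
w = Vt.T @ ((U.T @ y) / s)               # pseudo-inverse solution to the regression
X_rank2 = (U[:, :2] * s[:2]) @ Vt[:2]    # best rank-2 approximation of X
print(w)
```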
How do I optimize with constraints? Convexity, Lagrange multipliers, Karush-Kuhn-Tucker conditions, interior point methods, SMO algorithm for SVM
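A small constrained-optimization sketch with SciPy, using SLSQP as a stand-in for the solvers listed above; the objective and equality constraint are toy examples:

```python
import numpy as np
from scipy.optimize import minimize

objective = lambda w: (w[0] - 1) ** 2 + (w[1] - 2) ** 2
constraint = {"type": "eq", "fun": lambda w: w[0] + w[1] - 1}   # enforce w0 + w1 = 1

result = minimize(objective, x0=np.zeros(2), method="SLSQP", constraints=[constraint])
print(result.x)   # the KKT conditions hold (approximately) at this point
```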
How do I evaluate deeply-nested sums? Exact graphical model inference, variational bounds on sums, approximate graphical model inference, expectation propagation
How do I evaluate large sums and searches? Generalized N-body problems (GNP), hierarchical data structures, nearest neighbor search, fast multipole method; Monte Carlo integration, Markov chain Monte Carlo, Monte Carlo SVD
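For the Monte Carlo part, a minimal integration sketch in NumPy: estimating E[f(X)] for X ~ N(0, 1), where f is any integrand of interest:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.exp(-np.abs(x))          # example integrand
samples = rng.standard_normal(100_000)
estimate = f(samples).mean()              # converges at the usual O(1/sqrt(n)) rate
stderr = f(samples).std(ddof=1) / np.sqrt(len(samples))
print(estimate, stderr)
```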
How do I treat even larger problems? Parallel/distributed EM, parallel/distributed GNP; stochastic subgradient methods, online learning
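A sketch of online learning with stochastic (sub)gradient updates, assuming scikit-learn's SGDClassifier; the mini-batch stream is simulated from one synthetic data set:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
clf = SGDClassifier(loss="hinge", alpha=1e-4)          # a linear SVM trained by SGD

classes = np.unique(y)
for start in range(0, len(X), 1000):                   # process the data in mini-batches
    sl = slice(start, start + 1000)
    clf.partial_fit(X[sl], y[sl], classes=classes)     # incremental update, no full-data refit
print(clf.score(X, y))
```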
How do I apply all this in the real world? Overview of the parts of ML, choosing among the methods to use for each task, prior knowledge and assumptions; exploratory data analysis and information visualization; evaluation and interpretation using confidence intervals, hypothesis tests, and ROC curves; where the research problems in ML are
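Finally, an evaluation sketch with a ROC curve and its AUC, assuming scikit-learn; the data, split, and model are placeholders for whatever method was chosen above:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]          # predicted probabilities of the positive class
fpr, tpr, thresholds = roc_curve(y_te, scores)    # one (FPR, TPR) point per threshold
print(roc_auc_score(y_te, scores))
```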