Shanshan Pythoner Love CPP

Machine Learning Intro

2016-12-03

In general, a learning problem considers a set of n samples of data and then tries to predict properties of unknown data. Each sample is typically described by several attributes, also called features.

We can separate learning problems into a few categories:

supervised learning

data comes with additional attributes that we want to predict:

classification: samples belong to two or more classes and we want to learn from already labeled data how to predict the class of unlabeled data.

regression: if the desired output consists of one or more continuous variables, then the task is called regression.
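To make the two supervised tasks concrete, here is a minimal sketch using only the Python standard library. The data points, labels, and function names (`nearest_neighbor_predict`, `fit_line`) are made up for illustration. The sketch assumes a 1-nearest-neighbor rule for classification and a least-squares line for regression, which are just two of many possible choices.

```python
# Toy illustration of the two supervised tasks.
# All data values and function names are hypothetical, chosen for the example.

def nearest_neighbor_predict(train_x, train_y, query):
    """Classification: return the label of the training sample closest to query."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    best_index = min(range(len(train_x)),
                     key=lambda i: squared_distance(train_x[i], query))
    return train_y[best_index]


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Classification: labeled 2-D points from two classes, then an unlabeled query.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
labels = ["small", "small", "large", "large"]
predicted_label = nearest_neighbor_predict(points, labels, (2.0, 1.5))

# Regression: a continuous target; this toy data lies exactly on y = 2x + 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
predicted_value = slope * 4.0 + intercept

print(predicted_label, slope, intercept, predicted_value)
```

In both cases the model learns from samples whose targets are known. The only difference is whether the target is a discrete class or a continuous number.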

unsupervised learning

the training data consists of a set of input vectors x without any corresponding target values. The goal in such problems may be to discover groups of similar examples within the data, which is called clustering; to determine the distribution of the data within the input space, known as density estimation; or to project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization.
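As an illustration of clustering, the following is a small sketch of k-means (Lloyd's algorithm) on 2-D points, written with the standard library only. The function name `k_means`, the toy points, and the choice of initial centroids are assumptions made for this example. A real application would more likely use a library implementation with a better initialization.

```python
# Minimal k-means sketch on 2-D points; no labels are used anywhere.
# Function name, data, and initial centroids are hypothetical.

def k_means(points, centroids, iterations=10):
    """Run Lloyd's algorithm; return (centroids, assignments)."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                              + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the clustering is stable
        centroids = new_centroids
    return centroids, assignments


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 7.5), (8.5, 9.0)]
# Start from one point in each visually obvious group.
centroids, assignments = k_means(points, [points[0], points[3]])
print(assignments)
print(centroids)
```

The algorithm only ever sees the input vectors. The group indices it returns are discovered from the data rather than supplied as targets.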

What are the top 10 data mining or machine learning algorithms?

One potential answer to this question comes from the Analytics 1305 documentation:

  • Kernel Density Estimation and Non-parametric Bayes Classifier

  • K-Means

  • Kernel Principal Components Analysis

  • Linear Regression

  • Neighbors (Nearest, Farthest, Range, k, Classification)

  • Non-Negative Matrix Factorization

  • Support Vector Machines

  • Dimensionality Reduction

  • Fast Singular Value Decomposition

  • Decision Tree

  • Bootstrapped SVM

An awesome Tour of Machine Learning Algorithms was published online by Jason Brownlee in 2013.

