Machine Learning – a 10-minute overview

What is Machine Learning?

A vast number of programming problems can be solved by writing a set of well-defined rules. Some problems, however, are so complex that either we cannot define a clear set of rules, or there are so many rules that codifying them by hand is impractical. Speech recognition is a good example.

Machine Learning gives us a way to have computers learn these rules from a set of labeled data (the training set) and then apply them to data that was not seen during training.

Categories of Machine Learning

Supervised Learning

The goal of Supervised Learning is to predict an output variable from given input variables. To do so, we need a data set with labeled output variables to train our machine learning function. If the output variable is a category, we have a Classification Problem, and the functions we learn to solve such problems are called Classifiers. If the output is a real value, we have a Regression Problem, and the function we learn is called a Regression Function.
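To make the two problem types concrete, here is a minimal sketch in plain Python. The data, the feature meanings, and the function names (`classify`, `fit_line`, `predict_price`) are all made up for illustration. The classifier is a simple 1-nearest-neighbour rule and the regression function is an ordinary least-squares line; these are only two of many possible choices.

```python
from statistics import mean

# Hypothetical labeled training set: (height_cm, weight_kg) -> category.
train_points = [(20.0, 4.0), (25.0, 5.0), (60.0, 25.0), (70.0, 30.0)]
train_labels = ["cat", "cat", "dog", "dog"]


def classify(point):
    """Classifier: return the label of the closest training point (1-nearest-neighbour)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = min(
        range(len(train_points)),
        key=lambda i: squared_distance(train_points[i], point),
    )
    return train_labels[nearest]


# Hypothetical labeled training set for regression: floor area -> price.
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]


def fit_line(xs, ys):
    """Regression function: least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


slope, intercept = fit_line(areas, prices)


def predict_price(area):
    """Predict a real-valued output for an input not seen in training."""
    return slope * area + intercept


print(classify((22.0, 4.5)))   # output is a category
print(predict_price(90.0))     # output is a real value
```

In both cases the function is learned from labeled examples and then applied to inputs that were not in the training set.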

Unsupervised Learning

The main goal here is to find some structure in an input data set that has no labels. This involves tasks such as:

  • Finding clusters of similar data
  • Detecting outliers
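Both tasks can be sketched in a few lines of plain Python. The example below assumes one-dimensional data, uses a basic k-means loop for clustering, and flags outliers with a simple z-score rule; the data, the threshold of 2.0, and the function names are illustrative assumptions, not a recommended setup.

```python
from statistics import mean, pstdev


def kmeans_1d(values, k=2, iterations=10):
    """Group 1-D values into k clusters (k >= 2) with a basic k-means loop."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centres evenly between min and max.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Update step: move each centre to the mean of its cluster.
        centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters


def find_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical unlabeled measurements with two visible groups.
values = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
centres, clusters = kmeans_1d(values, k=2)

# Hypothetical sensor readings with one suspicious value.
readings = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 25.0]
outliers = find_outliers(readings)

print(centres, clusters)
print(outliers)
```

Note that no labels appear anywhere: the structure is inferred from the input values alone.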

Machine Learning Workflow

A machine learning workflow mainly consists of three components.

Representation

This involves representing your problem in such a way that a computer can deal with it, so you have to extract a set of features from your input objects. Extracting the most relevant set of features from your input objects is called Feature Extraction or Feature Engineering. The next part of the problem is choosing an appropriate machine learning method for the problem.
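As a rough illustration of feature extraction, the sketch below turns a raw text message into a small numeric feature vector that a learning algorithm could consume. The chosen features (`num_words`, `num_digits`, `upper_ratio`, `has_link`) and the sample message are hypothetical; which features are actually relevant depends on the problem.

```python
def extract_features(message):
    """Map a raw text message to a dictionary of numeric features."""
    words = message.split()
    letters = [c for c in message if c.isalpha()]
    return {
        "num_words": len(words),
        "num_digits": sum(c.isdigit() for c in message),
        # Share of letters that are upper case; 0.0 when there are no letters.
        "upper_ratio": sum(c.isupper() for c in letters) / len(letters)
        if letters
        else 0.0,
        "has_link": int("http" in message.lower()),
    }


def to_vector(features):
    """Fix the feature order so every message yields a comparable vector."""
    order = ["num_words", "num_digits", "upper_ratio", "has_link"]
    return [float(features[name]) for name in order]


features = extract_features("WIN 100 now http://example.com")
vector = to_vector(features)
print(features)
print(vector)
```

The input object (a string) is now a list of numbers, which is the form most learning methods expect.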

Evaluation

You need a framework to evaluate the accuracy of your ML algorithm. For this purpose, we usually set aside a portion of the data as a test set that is not used for training.
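A minimal sketch of such an evaluation framework, assuming a simple hold-out split: the data is shuffled, a fraction is set aside as the test set, a model is fitted on the rest, and accuracy is measured only on the held-out part. The data set, the 25% split, and the threshold "model" are made up for illustration.

```python
import random
from statistics import mean


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and hold out a fraction as the test set."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train_set):
    """Toy model: place a threshold halfway between the two class means."""
    lows = [x for x, label in train_set if label == "low"]
    highs = [x for x, label in train_set if label == "high"]
    return (mean(lows) + mean(highs)) / 2


def accuracy(predict, test_set):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(1 for x, label in test_set if predict(x) == label)
    return correct / len(test_set)


# Hypothetical labeled data: a number and whether it counts as "high" or "low".
data = [(x, "high" if x >= 50 else "low") for x in range(30, 70)]

train_set, test_set = train_test_split(data)
threshold = fit_threshold(train_set)
acc = accuracy(lambda x: "high" if x >= threshold else "low", test_set)
print(f"threshold={threshold:.2f} accuracy={acc:.2f}")
```

Because the test examples play no part in fitting the threshold, the measured accuracy is an estimate of how the model behaves on unseen data.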

Optimization

After the evaluation step, you may find improvements you could make, such as choosing a more appropriate algorithm or extracting more relevant features.
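One way to picture this step is as a comparison of candidate models on held-out data, keeping the one that scores best. The sketch below uses three hand-written rules that differ in which features they look at; the data, the rules, and their names are hypothetical. In practice the comparison is usually done on a separate validation set, so that the final test set stays untouched.

```python
def accuracy(predict, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


# Hypothetical held-out examples: (hours_studied, hours_slept) -> passed the exam?
validation = [
    ((1, 8), False),
    ((2, 4), False),
    ((6, 7), True),
    ((8, 5), True),
    ((5, 9), True),
    ((3, 3), False),
]

# Candidate models that use different features of the same input.
candidates = {
    "sleep_only": lambda f: f[1] >= 6,
    "study_only": lambda f: f[0] >= 4,
    "both": lambda f: f[0] >= 4 and f[1] >= 6,
}

# Score every candidate and keep the one with the highest accuracy.
scores = {name: accuracy(predict, validation) for name, predict in candidates.items()}
best = max(scores, key=scores.get)
print(scores)
print(best)
```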

Usually you have to iterate through the steps described above several times before you find an optimal solution for your problem.