Get in Touch With Different Types of Machine Learning Algorithms

A proper understanding of these algorithms is essential, because a machine learning algorithm must be matched to the right task. Below are some machine learning algorithms you should know.

Random Forest: An algorithm used for both classification and regression. It is a supervised learning algorithm built by combining a large number of decision trees; in general, the more trees, the higher the accuracy. To understand random forests you should first understand decision trees: a decision tree is a tree-like graph of conditional statements and their outcomes, used for decision analysis. In a random forest, a random subset of features is selected from the full feature set, the best split point among those features is used to create a node, and that node is split further into daughter nodes by the same best-split method. The process repeats until the desired number of nodes is reached, and building many such trees in this way produces the random forest.

Logistic Regression: A linear model that predicts an outcome after statistical analysis of known factors. It is used when the dependent variable is binary, and it models the relationship between that dependent variable and one or more independent variables by finding the best-fitting model. It predicts the value of a categorical outcome on the basis of observed characteristics.

K-means Clustering: An unsupervised learning algorithm used on unlabelled data to cluster a dataset. A chosen number of cluster centres is initialised, ideally not too close to one another. The distance from each data point to each cluster centre is computed and each data point is assigned to its closest centre; the cluster centres and distances are then recalculated, and the process repeats until no data point changes its assignment.

SVM: Support Vector Machine is a supervised learning algorithm used for both classification and regression. When the target is categorical it performs classification, whereas for a continuous target, as in weather forecasting or stock prediction, it performs regression. The dataset is divided by a hyperplane chosen by focusing on the closest, hardest-to-classify data points; the optimal hyperplane is the one whose margin to these closest points is maximal.

Linear Regression: A linear approach for predicting a continuous target value from a given input vector. When there is only one predictor variable (X) used to make predictions for another variable (y), it is called simple linear regression, and a plot of the predicted y as a function of X is a straight line. Alongside linear regression, exponential regression is also available; the choice depends on the task to be performed.

Naive Bayes: A classifier, well suited to large datasets, that applies Bayes' theorem under the assumption that all predictors (features) are independent of one another. The posterior probability of each class is computed from the different attributes, and the class with the highest probability is returned as the prediction. It is mostly used for multi-class and real-time prediction.
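A minimal sketch, assuming scikit-learn is installed and using its built-in toy datasets, of how the algorithms above are typically invoked: the supervised classifiers (random forest, logistic regression, SVM, naive Bayes) on a binary-labelled dataset, K-means on unlabelled features, and linear regression on a continuous target. The dataset choices and hyperparameters are illustrative only, not recommendations.

```python
# Sketch only: datasets and hyperparameters are illustrative.
from sklearn.datasets import load_breast_cancer, load_iris, load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Supervised classification on a binary-labelled dataset: random forest,
# logistic regression, SVM and naive Bayes all share the same fit/score API.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

classifiers = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Naive Bayes": GaussianNB(),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)  # learn from labelled examples
    print(f"{name}: test accuracy = {clf.score(X_test, y_test):.3f}")

# Unsupervised clustering: K-means groups the unlabelled iris features into 3 clusters.
X_unlabelled, _ = load_iris(return_X_y=True)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_unlabelled)
print("K-means cluster sizes:", sorted((kmeans.labels_ == k).sum() for k in range(3)))

# Regression: linear regression fits a continuous target (diabetes disease progression).
X_reg, y_reg = load_diabetes(return_X_y=True)
reg = LinearRegression().fit(X_reg, y_reg)
print(f"Linear Regression R^2: {reg.score(X_reg, y_reg):.3f}")
```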
K-Nearest Neighbors (KNN): A non-parametric algorithm for classification and regression; which of the two is appropriate depends on the type of output. There is no real training step: during the training phase the labelled vectors are simply stored, and during the classification phase an unlabelled vector is assigned a label by a majority vote of its nearest neighbours. The notion of "nearest" depends on a distance measure such as Euclidean distance or Hamming distance, and neighbourhood components analysis can be used to learn a better-fitting metric.

AdaBoost: Essentially a meta-algorithm, used in combination with other algorithms to boost their performance and accuracy. It is one of the best classifiers when used with decision trees: shallow decision trees are weak learners, and combining these weak learners yields a strong learning algorithm called AdaBoost. Each weak learner takes the inputs x and predicts their classes.
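A minimal sketch, again assuming scikit-learn and one of its toy datasets, of the two algorithms above. The value of k and the number of boosting rounds are illustrative defaults, not tuned choices.

```python
# Sketch only: k and the number of boosting rounds are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import AdaBoostClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# KNN: "training" just stores the labelled vectors; a new vector is labelled
# by majority vote of its 5 nearest neighbours under Euclidean distance.
knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X_train, y_train)
print(f"KNN test accuracy: {knn.score(X_test, y_test):.3f}")

# AdaBoost: a meta-algorithm that combines weak learners (by default, depth-1
# decision trees), re-weighting the training samples each round so later
# learners concentrate on the examples earlier ones misclassified.
ada = AdaBoostClassifier(n_estimators=50, random_state=0)
ada.fit(X_train, y_train)
print(f"AdaBoost test accuracy: {ada.score(X_test, y_test):.3f}")
```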
