Machine Learning

Machine Learning is the branch of science that enables computers to learn patterns from data without being explicitly programmed. It is one of the most interesting and challenging technologies today. This ability to learn lets machines behave in a more human-like way. Companies use machine learning to support organizational decisions, increase productivity, assist in medical domains, produce forecasts and much more. Some applications need intelligent machines, where the intelligence must be built into the hardware itself. It is always better if a machine can learn from its input data, and this is where machine learning comes into the picture. The basic steps of a machine learning workflow are as follows:

  • Data Collection.
  • Data Preparation.
  • Model Selection.
  • Training a model.
  • Model Evaluation.
  • Parameter Tuning.

Types of Machine Learning: there are 4 basic types of machine learning, discussed as follows:

  • Supervised Machine Learning: it deals with labeled data (we know the output). The purpose of supervised machine learning is to learn the function that maps inputs to outputs when both are readily available. The basic steps are preparing the data, selecting an algorithm, fitting the model, validating the fitted model and then using it for predictions.
  • Unsupervised Machine Learning: it deals with unlabeled data (we don’t know the output). Unsupervised machine learning is widely used in domains such as customer segmentation, understanding business strategies, analysis of evolutionary biology, clustering DNA patterns etc. Some examples of unsupervised machine learning techniques are Hierarchical Clustering, Principal Component Analysis and K-Means Clustering.
  • Semi-Supervised Machine Learning: it combines supervised and unsupervised machine learning techniques, training on a small amount of labeled data together with a larger amount of unlabeled data. This reduces the amount of annotated data that is required, and the resulting methods can be simple, stable and efficient.
  • Reinforcement Learning: an agent perceives and interprets its environment, takes actions and learns from its errors. Feedback in the form of rewards or penalties tells the agent whether its actions are producing good results, so that it can choose more appropriate actions over time. It is used in different fields such as healthcare, recommendation systems, finance etc.
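
The supervised steps listed above (prepare the data, select an algorithm, fit the model, predict) can be sketched in a few lines of Python. This is only an illustrative sketch: the choice of a least-squares straight line as the algorithm, the function names `fit_line` and `predict`, and the tiny labeled dataset are all assumptions made for the example, not something prescribed by the article.

```python
# Minimal supervised-learning sketch: fit y = slope * x + intercept
# to labeled examples with ordinary least squares, then predict.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error on the labeled data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the fitted (slope, intercept) pair to predict the output for x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labeled data: each input has a known output (here y = 2x + 1).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)   # model fitting
prediction = predict(model, 5.0)     # prediction on an unseen input
print(prediction)
```

Because the labels are known, the fitted model can be checked against them; an unsupervised method would receive only `train_x` and would have to find structure without `train_y`.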

Data in Machine Learning:

Data is the most crucial part of Machine Learning, Artificial Intelligence and Data Analytics. The data in machine learning can be structured or unstructured, and can take the form of audio, video, images or live-streamed video (such as YouTube videos) etc. Without data the model cannot be trained and further analysis of the algorithm is not possible. Data can be either “Numerical” or “Categorical”. Some machine learning algorithms work well with categorical input and others do not. In the latter case the categorical data needs to be converted to numerical data. The conversion has two steps: first each category is assigned an integer value (called “Integer Encoding”), and then each integer is represented as a binary vector that contains a single 1 at the position of that category (called “One Hot Encoding”).
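
The two encoding steps can be illustrated with a short Python sketch. The helper names `integer_encode` and `one_hot_encode` and the sample column of colors are hypothetical and chosen only for illustration; in practice a library encoder would normally be used instead.

```python
def integer_encode(values):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return [mapping[v] for v in values], mapping


def one_hot_encode(codes, num_categories):
    """Turn each integer code into a binary vector containing a single 1."""
    return [[1 if i == code else 0 for i in range(num_categories)]
            for code in codes]


# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]

codes, mapping = integer_encode(colors)        # step 1: integer encoding
one_hot = one_hot_encode(codes, len(mapping))  # step 2: one-hot encoding

print(codes)    # [0, 1, 2, 1]
print(one_hot)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
```

One-hot vectors avoid implying an order between categories, which plain integer codes would do (blue = 2 is not "greater than" red = 0).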

Data Splitting in Machine Learning:

  • Training Data: this is the data from which the model learns the patterns. About 70% of the data is typically used as “Training Data”.
  • Validation Data: this data is used to evaluate the model fitted on the training dataset while its hyperparameters are being tuned. It is usually set aside from the data that is not reserved for testing.
  • Testing Data: this data is essential for testing the accuracy of the final model. About 30% of the data is typically used as testing data.
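
A split like the one described above might be sketched as follows. The 30% test fraction comes from the text; taking a further 10% of the rows as validation data (leaving 60% for training), the function name `train_val_test_split` and the fixed random seed are assumptions made for this example.

```python
import random


def train_val_test_split(rows, test_frac=0.3, val_frac=0.1, seed=42):
    """Shuffle the rows and cut them into training, validation and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical dataset of 100 rows, represented here by their indices.
rows = list(range(100))
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))  # 60 10 30
```

Shuffling before cutting matters when the rows are stored in some order (for example by date or by class), since otherwise the three sets would not be representative of the whole dataset.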

Applications of Machine Learning:

  • Database Mining: database mining is widely used in healthcare domains for better automation, and in web development for a better user experience (UX) etc.
  • Unprogrammed Applications: these are applications that cannot be programmed by hand. Examples include Autonomous Driving, Handwriting Recognition, Face Recognition, Computer Vision, Natural Language Processing etc.
  • Marketing and Sales: ecommerce websites such as Flipkart and Amazon recommend the most popular items from their catalog to their customers. Another very popular area where machine learning algorithms are used is the “Chatbot”. A chatbot makes use of Natural Language Processing and Machine Learning to interact with customers efficiently. Chatbots are very popular nowadays because they can provide sophisticated answers to critical questions, which reduces human effort and time.
  • Image Recognition: this application is used to identify a person, place, object etc. One of the most popular use cases in image recognition is “Automatic Friend Tagging Suggestion”, which is based on face recognition and person identification from a picture.
  • Email Spam and Malware Filtering: the new emails we receive in our inbox are automatically filtered as important, spam or normal. Some of the spam filters used are Header Filters, Content Filters, Rule-based Filters, Permission Filters etc. Machine learning algorithms such as Decision Tree, Naïve Bayes and Multi-Layer Perceptron are widely used for malware detection and email spam filtering.
  • Virtual Personal Assistant: this application helps us find essential information using voice instructions. Google Assistant, Siri, Cortana and Alexa are some of the famous virtual personal assistants available today. Various commands can be used, such as calling someone, playing music, opening a particular email or scheduling an appointment. The assistant records our voice instructions, sends them to the cloud, decodes them with the help of machine learning algorithms and then acts accordingly.
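
To make the spam filtering application above more concrete, here is a minimal sketch of a Naïve Bayes text classifier, one of the algorithms named in that item. It is a toy illustration under stated assumptions: the function names, the four training messages, the "spam"/"ham" labels and the use of add-one (Laplace) smoothing are all choices made for this example, and a real filter would need far more data and preprocessing.

```python
import math
from collections import Counter


def train_naive_bayes(messages):
    """messages: list of (text, label) pairs. Returns the counts the model needs."""
    word_counts = {}          # label -> Counter of word frequencies
    label_counts = Counter()  # label -> number of messages with that label
    vocabulary = set()        # every distinct word seen during training
    for text, label in messages:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for the given text."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior of the label.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Word likelihood with add-one smoothing, so unseen words
            # do not force the probability to zero.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled training messages.
training = [
    ("win a free prize now", "spam"),
    ("free money claim now", "spam"),
    ("meeting schedule for monday", "ham"),
    ("project report attached", "ham"),
]

model = train_naive_bayes(training)
result_spam = classify(model, "claim your free prize")
result_ham = classify(model, "monday project meeting")
print(result_spam, result_ham)  # spam ham
```

Working with log-probabilities instead of multiplying raw probabilities keeps the scores numerically stable for longer messages.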

Factors such as the growing volume and variety of available data, computational processing power and affordable data storage make machine learning a popular part of the Data Science domain. Good data preparation, basic and advanced algorithms, the required number of iterations, scalability and ensemble modeling are some of the capabilities needed to build a good machine learning system.
