Everyone talks about machine learning, but few people can answer the question “What is machine learning?” People tend to call everything artificial intelligence, whether it’s a phone using deep learning for face recognition or a travel app using machine learning algorithms to find the best time to buy airline tickets. In reality, these are different things. In this article, we’ll explain what machine learning really is and clear up common misconceptions.
To clearly understand what machine learning really is, it’s important to know what it is not. Since the terms artificial intelligence, machine learning, deep learning, and statistical learning are often used interchangeably, we’ll cover their differences.
What is the difference between machine learning, statistical learning, artificial intelligence, and deep learning?
Statistical learning and machine learning are closely related but still different. To clarify the distinction between them, let’s first define what statistical learning is. As stated in the book An Introduction to Statistical Learning:
Statistical learning refers to a set of tools for modeling and understanding complex datasets. It is a recently developed area in statistics and blends with parallel developments in computer science and, in particular, machine learning.
Statistical learning blends with machine learning because both deal with data. But note that their goals, processes, and results are different.
| Machine learning | Statistical learning |
| --- | --- |
| Subfield of artificial intelligence | Subfield of mathematics |
| Requires minimum human effort; is automated | Requires a lot of human effort |
| Can learn from large data sets | Deals with smaller data sets |
| Has strong predictive abilities | Gives a best estimate: you gain some insights into one thing, but it’s of little or no help with predictions |
| Learns from data and discovers patterns | Learns from samples, populations, and hypotheses |
To clearly understand the difference between machine learning and statistical learning, consider Netflix. In 2006, the company announced the Netflix Prize, a competition for the best recommendation system. As Brian Caffo suggested, contestants could approach this task using either machine learning or statistical learning.
Machine learning: Build an automated movie recommendation system dependent on the star rating system. You build a machine learning algorithm to predict what movies users might like to watch.
Statistical learning: Build a parsimonious and interpretable model to better understand why people choose certain movies. With the statistical approach, you learn something true about movie choices, such as the kinds of films certain demographics prefer. But that may not really help you with predictions.
Machine learning is a subset of artificial intelligence: all machine learning is artificial intelligence, but not all artificial intelligence is machine learning. In addition to machine learning, artificial intelligence includes fields such as computer vision, robotics, and expert systems.
Have a look at what Gary Sims from Android Authority says about the differences between AI and machine learning:
- Artificial intelligence is the idea of a computer being able to do abstract thinking, analyze things within context, and be creative while not being intelligent itself. It’s a machine with the ability to solve problems that are typically solved by humans with our natural intelligence. Artificial intelligence is much broader and more general than machine learning.
There are two types of artificial intelligence: weak and strong. Weak artificial intelligence imitates intelligence without being self-aware. Think about asking Google for the weather. Strong artificial intelligence, in turn, would complete tasks that are typically associated with human beings and would be self-aware. There are no true examples of strong artificial intelligence yet. It’s still just a subject of science fiction, as Doug Rose says.
- Machine learning is a large area within artificial intelligence. It refers to the process of a machine learning from experience. It deals only with algorithms that automatically extract patterns from data. The idea of machine learning is that you take a data set and feed it into an algorithm, which learns from it and then outputs predictions.
Deep learning is a subtype of machine learning, which is why many people confuse them. Both deep learning and machine learning offer ways to train models and classify data. The difference between them is in the learning process itself. With machine learning, you upload data (such as images), manually define features, create a model, and the machine makes predictions. With deep learning, you skip the step of manually defining features because deep learning algorithms work directly with the raw data. It’s a self-teaching system: a multi-layered neural network trained on large amounts of data.
In recent years, deep learning has gained wide attention and adoption across various industries. In the healthcare industry, deep learning is used to automatically detect cancer cells. In the automotive industry, it’s used to automatically identify objects like traffic lights and stop signs. Another great example of deep learning algorithms is found in the mobile industry. To keep personal information secure, Apple lets users of iPhones in the X series unlock the phone with Face ID. Face ID is also used for payment authorization and signing in to third-party applications like banking apps. To verify it’s you, the iPhone X uses Apple’s TrueDepth camera, which projects and analyzes thousands of invisible dots to generate a depth map of your face.
Having discussed what machine learning is not, it’s time to define what it is. Machine learning, or ML for short, is a method grounded in the idea that machines can learn from data, identify patterns, and take actions with minimum human input.
The lifecycle of machine learning looks like this: ask the right question/set the problem ➡ collect and prepare data ➡ train the algorithm ➡ test it ➡ collect feedback ➡ use feedback to improve the algorithm
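The middle of this lifecycle (prepare data, train, test) can be sketched in a few lines of Python. This is only a toy illustration under our own assumptions: the data is made up, and the "model" is nothing more than a cutoff value placed halfway between the averages of the two classes. The function names `split_data`, `train_threshold`, and `accuracy` are hypothetical helpers, not part of any library.

```python
import random

def split_data(rows, test_fraction=0.25, seed=0):
    """Prepare data: shuffle the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def train_threshold(train_rows):
    """Train: place a cutoff halfway between the two class averages."""
    pos = [x for x, label in train_rows if label == 1]
    neg = [x for x, label in train_rows if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def accuracy(threshold, rows):
    """Test: share of rows where the cutoff agrees with the known label."""
    hits = sum((x > threshold) == bool(label) for x, label in rows)
    return hits / len(rows)

# Made-up data set: (measurement, label) pairs with two clearly separated groups.
data = [(x, 0) for x in range(1, 11)] + [(x, 1) for x in range(21, 31)]

train_rows, test_rows = split_data(data)
threshold = train_threshold(train_rows)
test_accuracy = accuracy(threshold, test_rows)
print(f"cutoff={threshold:.1f}, accuracy on held-out data={test_accuracy:.0%}")
```

In a real project, the feedback steps would follow: if accuracy on the held-out data is too low, you go back and collect more data or change the algorithm.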
There are four types of machine learning algorithms: supervised, unsupervised, semi-supervised, and reinforcement.
Supervised learning is an approach in which a model predicts the outcome for new data based on past examples. With supervised learning, the data you’re dealing with is labeled: it’s organized in rows, and each row has at least one column with a known outcome. This known outcome is referred to as a label, and the labeled rows are the examples used to train the model.
Let’s make this idea concrete with an example. Say you want to create a model that predicts the price of a house. There’s a row for each house and a column for each feature. The features are size and the number of bedrooms and bathrooms; the label is price. To predict the price for a two-bedroom house with one bathroom and 1200 square feet, the algorithm uses previous examples of houses with known prices.
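Here is a minimal sketch of the house price example. The prices and houses below are invented for illustration, and we assume one particular technique, a nearest-neighbors average, simply because it shows "using previous examples" very directly. Real systems would typically use a library and far more data.

```python
# Made-up training data: (size_sqft, bedrooms, bathrooms) -> price (the label)
houses = [
    ((1000, 2, 1), 200_000),
    ((1250, 2, 1), 240_000),
    ((1500, 3, 2), 310_000),
    ((2000, 4, 2), 400_000),
    ((2400, 4, 3), 480_000),
]

def distance(a, b):
    """How different two houses are; size is divided by 100 so that
    square footage doesn't swamp the room counts."""
    scaled_a = (a[0] / 100, a[1], a[2])
    scaled_b = (b[0] / 100, b[1], b[2])
    return sum((x - y) ** 2 for x, y in zip(scaled_a, scaled_b)) ** 0.5

def predict_price(features, examples, k=2):
    """Average the known prices of the k most similar houses."""
    nearest = sorted(examples, key=lambda ex: distance(features, ex[0]))[:k]
    return sum(price for _, price in nearest) / k

# Two bedrooms, one bathroom, 1200 square feet
estimate = predict_price((1200, 2, 1), houses)
print(estimate)  # 220000.0 (the average of the two closest examples)
```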
With supervised learning, you can answer other questions like:
- How many customers will apply for a loan next month?
Training data: loan applications from previous months
- Are these cells cancerous?
Training data: examples of cancerous and non-cancerous cells
- Is this email spam?
Training data: previous emails known to be spam or not spam
- Is this transaction fraudulent?
Training data: previous transactions known to be fraudulent or not fraudulent
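To show how labeled examples answer a yes/no question like the spam one above, here is a deliberately crude sketch. The emails are invented, and the word-voting rule is our own simplification for illustration, not how production spam filters work.

```python
from collections import Counter

# Made-up labeled training emails: (text, is_spam)
training_emails = [
    ("win a free prize now", True),
    ("free money claim your prize", True),
    ("meeting agenda for monday", False),
    ("lunch on monday with the team", False),
]

def train(emails):
    """Count how often each word appears in spam and in non-spam emails."""
    spam_words, ham_words = Counter(), Counter()
    for text, spam in emails:
        (spam_words if spam else ham_words).update(text.split())
    return spam_words, ham_words

def is_spam(text, spam_words, ham_words):
    """Each word votes for the class it was seen in more often."""
    score = sum(spam_words[w] - ham_words[w] for w in text.split())
    return score > 0

spam_words, ham_words = train(training_emails)
print(is_spam("claim your free prize", spam_words, ham_words))   # True
print(is_spam("team meeting on monday", spam_words, ham_words))  # False
```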
Unsupervised learning is an approach in which a model extracts useful information or features from data and finds patterns. The difference from supervised learning is that the training data provides examples but no labels. Whereas the goal of supervised learning is to build a model that can predict the outcome for new instances based on previous examples, with unsupervised learning you aim to build a model that makes a discovery rather than a prediction. Consider the following examples of possible uses of unsupervised learning:
- Finding out what customers buy along with other items. For example, you might find that customers who buy coffee also tend to buy milk.
- Dividing the user base into groups with similar tastes, locations, or demographics.
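The first of these examples can be sketched by counting which items show up together. The shopping baskets below are made up, and plain pair counting is just one simple way to surface such a pattern. Note that nothing in the data is labeled; the pattern comes out of the data itself.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets; there are no labels anywhere.
baskets = [
    {"coffee", "milk", "sugar"},
    {"coffee", "milk"},
    {"bread", "butter"},
    {"coffee", "milk", "bread"},
    {"tea", "sugar"},
]

def pair_counts(baskets):
    """Count how many baskets contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

top_pair, times_seen = pair_counts(baskets).most_common(1)[0]
print(top_pair, times_seen)  # ('coffee', 'milk') 3
```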
Semi-supervised learning takes the middle ground between supervised and unsupervised learning. Here, some of the data is labeled and the rest, which is the greater portion, is unlabeled. Semi-supervised learning is practical when you have large data sets that would take too long to label in full. You can start by manually labeling part of the data and using it as a training set for your model. Once trained, the model can make predictions on the remaining unlabeled part of the data.
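A minimal sketch of that workflow, with invented numbers and a very simple model (each label is represented by the average of its known values). The final retraining step on the model's own predictions is an optional extra we've assumed here, often called self-training; it isn't required by the description above.

```python
def centroids(labeled):
    """Train: compute the average value for each label."""
    groups = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

def predict(value, centers):
    """Pick the label whose average is closest to the value."""
    return min(centers, key=lambda label: abs(value - centers[label]))

# Made-up data: a few hand-labeled points and a larger unlabeled portion.
labeled = [(1.0, "small"), (2.0, "small"), (9.0, "large"), (10.0, "large")]
unlabeled = [1.5, 2.5, 3.0, 8.0, 8.5, 11.0]

# 1. Train on the manually labeled part.
centers = centroids(labeled)

# 2. Let the trained model label the rest.
pseudo_labeled = [(v, predict(v, centers)) for v in unlabeled]

# 3. Optionally retrain on everything (assumed self-training step).
centers = centroids(labeled + pseudo_labeled)
print(pseudo_labeled)
```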
Reinforcement learning is quite different from the other three types of machine learning. In reinforcement learning, there’s no training data. Instead, the algorithm works on a rewards-based system. Reinforcement learning involves an autonomous agent that observes the environment and then selects an action that will lead to rewards. This helps the algorithm improve in the long run on its own. A classic example of the reinforcement learning approach is teaching an agent to play a game.
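To make the agent, environment, action, and reward idea tangible, here is a sketch of a tiny game of our own invention: a five-cell corridor where the agent earns a reward for reaching the last cell. We assume one specific technique, tabular Q-learning with purely random exploration, and the constants (`ALPHA`, `GAMMA`, 500 practice games) are arbitrary choices for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the game
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (arbitrary)

def step(state, action):
    """Environment: move, stay inside the corridor, reward 1 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)
# q[(state, action)] is the agent's estimate of how good an action is.
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):                   # 500 practice games
    state = 0
    while state != N_STATES - 1:
        action = rng.choice(ACTIONS)   # explore by acting at random
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Nudge the estimate toward reward plus discounted future value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# The learned policy: the best-looking action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1], i.e. always step right toward the goal
```

Nobody told the agent that moving right is correct. It worked that out from rewards alone, which is the key difference from the supervised examples earlier.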
Machine learning is of great help to businesses. They use it to solve complex issues, identify patterns, get new insights, and take intelligent actions based on the data provided. Below, we outline some of the industries that can greatly benefit from machine learning.
Has this ever happened to you? After browsing for children’s clothing, you start seeing ads for children’s items. And what about finding exactly what you’re looking for on the first page of search results? These things are powered by machine learning algorithms. For eCommerce giants like Amazon and eBay that list millions of items, it’s hard to personalize the customer experience. To predict and provide relevant recommendations and search results, marketplaces use algorithms that are based on customers’ preferences and purchase histories.
In the healthcare sector, machine learning algorithms are mainly used to provide predictive analytics. One notable example comes from Google, which uses machine learning algorithms to identify breast cancer. What’s interesting is that the algorithm’s tumor localization score reached 89%, compared to 73% for a pathologist. On the Google AI blog, Martin Stumpe and Lily Peng say, “We showed that it was possible to train a model that either matched or exceeded the performance of a pathologist who had unlimited time to examine the slides.”
Social media networks greatly benefit from machine learning. Algorithms are used for news feed ranking and search relevance. Apart from this, there are other great things that machine learning can do for social media. Have you ever been notified that you’re in your friend’s photo on Facebook? That’s a machine learning algorithm recognizing your face. Facebook also uses machine learning to make it easier for people with visual impairments to interact on the platform. Blind people can now react to the pictures their friends post because Facebook describes what’s in an image, along with details such as the number of likes and shares.
The core goals of machine learning in the financial industry are to gain essential insights, identify profitable investment opportunities, forecast returns, and detect fraud by predicting high-risk clients. Additionally, financial services companies use machine learning for process automation. JPMorgan Chase, an international investment bank and financial services company, uses algorithms to review documents and extract important information from them. This saves a great deal of time and human resources.
Since the contemporary world is data-driven, it’s important to systemize and analyze information that comes from multiple channels. Machine learning is a good choice for structuring data comprehensively to make evidence-based decisions. If you want to develop a machine learning project with Steelkiwi or have any questions on machine learning, feel free to get in touch with our team.