With algorithms driving purchases and clicks all over the internet, nationwide facial recognition systems starting to come online in foreign nations, and autonomous vehicles on the horizon, there seems to be a shroud of fear surrounding anything having to do with AI. Some of these fears are legitimate, but many may simply reflect a lack of understanding of the subject at hand. This article explains some concepts and terminology of Machine Learning, Deep Learning, and Artificial Intelligence at a surface level, and recommends additional reading for a more in-depth look at the subject.
It all starts with Artificial Intelligence (AI)
Any computer system that mimics the way humans make decisions can be considered Artificial Intelligence, no matter how primitive its systems are. If a programmer created a massive decision tree of if/else statements to diagnose sick patients, it probably wouldn't do a very good job, but it would still be considered artificial intelligence. In fact, what I just described was one of the earlier attempts at AI, called an "Expert System". It turns out that expert systems are hard to maintain and don't scale well, and they quickly faded into obscurity.
Machine Learning (ML)
Machine Learning (ML) is a field within AI; the two are not mutually exclusive, and without AI, ML would not exist. ML consists of many different tools and algorithms, and is a quickly advancing field in both computer science and statistics. The seemingly sudden growth of ML is due to the internet age and the Big Data that came along with it: the algorithms used in ML need lots of pertinent data to train on, data that was not previously available.
An example of an ML algorithm is K-Nearest Neighbors, often abbreviated KNN. The KNN algorithm uses distance measurements to classify data points. Think of a scatterplot with two different colors of dots on it. Say you place a new dot onto it, and of the three closest existing dots, two are red and one is blue. Most likely the dot you placed would be red, right? That's exactly how the KNN algorithm works. KNN is a good starting example because it is simple to explain, but there are far more complicated ML algorithms too, like Gradient Boosting and Support Vector Machines.
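The red-and-blue-dot example above can be sketched in a few lines with scikit-learn (the library recommended at the end of this article); the coordinates here are made up purely for illustration:

```python
# A minimal sketch of the scatterplot example: classify a new dot by a
# majority vote among its 3 nearest labeled neighbors.
from sklearn.neighbors import KNeighborsClassifier

# Training data: a handful of 2-D points, each labeled "red" or "blue".
points = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],   # red cluster
          [8.0, 8.0], [8.5, 9.0], [9.0, 8.5]]   # blue cluster
colors = ["red", "red", "red", "blue", "blue", "blue"]

# n_neighbors=3 means "look at the three closest dots and take a vote".
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(points, colors)

# A new dot placed near the red cluster gets classified as red.
print(knn.predict([[1.8, 1.8]])[0])  # -> red
```

Note that nothing here is "learned" in the deep learning sense: KNN simply stores the labeled points and measures distances at prediction time.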
Now, within the realm of Machine Learning lies Deep Learning. So Deep Learning is under the umbrella of Machine Learning, which is under the umbrella of AI. Simple, right? So what is Deep Learning? It approaches ML problems (and some problems previously unapproachable by legacy ML methods) with something called a neural network. These tools are called 'neural' networks because, similar to the neurons in the human brain, the connections between the parts of the network are just as important as the parts themselves.
A neural network is composed of layers of (mostly) non-linear data transformations connected together to solve a problem. As the data passes through the first layer, it is transformed into a representation of the original data. Then it goes through the next layer and becomes a representation of the representation of the original data, and so on, layer after layer, until it reaches the output layer. These increasingly abstract representations are called 'deep representations' of the data, and that is where the 'deep' in deep learning comes from.
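The layer-after-layer idea can be shown with a toy forward pass in plain NumPy; the weights here are random stand-ins (a real network would learn them), so this is only a sketch of data becoming successive representations:

```python
# Toy sketch: data flowing through stacked non-linear layers. Each layer's
# output is a new representation of the previous layer's output.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, out_dim):
    """One dense layer: a linear transform followed by a ReLU non-linearity."""
    w = rng.normal(size=(x.shape[0], out_dim))  # random stand-in weights
    return np.maximum(0.0, w.T @ x)             # ReLU(Wx)

x = np.array([0.5, -1.2, 3.0])  # the original input data
h1 = layer(x, 4)                # a representation of the data
h2 = layer(h1, 4)               # a representation of the representation
out = layer(h2, 2)              # ...and so on, until the output layer
print(out.shape)                # -> (2,)
```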
How about an example: “cat”, or “no cat”
Say we want the AI to identify whether a picture is of a cat, or not a cat, through image recognition. First, we give the AI many pictures that have already been correctly labeled either "cat" or "no cat". We don't give the AI any other information about cats or any programming for how to recognize a cat (like whether it has fur, a tail, whiskers, etc.). Instead, the AI generates its own identifying characteristics and rules for recognizing a cat.
The more data given to the AI (as correctly labeled pictures), the more accurate the end result becomes, even though the standards the AI uses to make the determination may not be what a human would typically use. For example, a person might see fur and a tail and, combined with personal experience, conclude "cat". The AI might process the same image and determine "cat" because of the pupil shape, or the spatial relationship between the eyes and ears, or a complex combination of factors.
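The labeled-examples-in, rules-out workflow can be sketched with scikit-learn's small neural network class. To keep this runnable, the 8x8 "pictures" below are random arrays where "cat" images just happen to be brighter; a real cat detector would use a deep convolutional network trained on thousands of real photos:

```python
# Hedged sketch of supervised labeling: the model is given only labeled
# examples, never explicit rules about fur, tails, or whiskers.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(42)

# Pretend dataset: flattened 8x8 grayscale "pictures", each pre-labeled.
# In this toy setup, "cat" pictures are simply brighter on average.
cats = rng.uniform(0.5, 1.0, size=(50, 64))
not_cats = rng.uniform(0.0, 0.5, size=(50, 64))
images = np.vstack([cats, not_cats])
labels = ["cat"] * 50 + ["no cat"] * 50

# The model derives its own internal rules from the labeled examples alone.
model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
model.fit(images, labels)

print(model.predict(cats[:1])[0])      # classifies a bright "picture"
print(model.predict(not_cats[:1])[0])  # classifies a dark "picture"
```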
The Backpropagation Algorithm
An incredible thing about neural networks is that they self-optimize using the backpropagation algorithm. When the entire training dataset has passed through the network once, that is called an epoch. At the end of each epoch, the algorithm uses a loss function to calculate a statistic called loss, which is essentially the error: how far the network's results were from the correct answers. Backpropagation uses the chain rule to determine how much each layer contributed to the final output, and thus to the loss, and automatically updates the weights associated with each layer to minimize the loss for the next epoch. When training a neural network, epochs are usually run back to back, and many epochs may be needed to optimize the network's accuracy. While this means it is possible to run a network for an obscene number of epochs to get near-perfect accuracy on the training data, this is actually a very bad idea. The network will, for lack of a better word, 'memorize' the training data and will most likely give horrible results when presented with data it has never seen before. This problem is referred to as 'overfitting' and is a huge roadblock for legacy ML methods as well as deep learning methods.
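The loop above can be boiled down to a bare-bones example: one weight, a squared-error loss, and the chain rule giving the gradient used to update the weight each epoch. The numbers are invented for illustration (the "true" weight is 2):

```python
# Minimal sketch of backpropagation-style training: forward pass, loss,
# chain-rule gradient, weight update, repeated for many epochs.
import numpy as np

x = np.array([1.0, 2.0, 3.0])  # inputs
y = 2.0 * x                    # correct answers (true weight is 2)
w = 0.0                        # start with a bad guess

for epoch in range(50):
    pred = w * x                        # forward pass through the "network"
    loss = np.mean((pred - y) ** 2)     # loss: how wrong we were this epoch
    grad = np.mean(2 * (pred - y) * x)  # chain rule: d(loss)/d(w)
    w -= 0.05 * grad                    # nudge the weight to reduce the loss

print(round(w, 2))  # -> 2.0 (the loss shrinks epoch after epoch)
```

With real networks the same idea applies across millions of weights and many layers, which is exactly what makes the chain rule so central.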
Shortcoming: Lack of Explainability
A shortcoming unique to deep learning is often called 'lack of explainability'. While the backpropagation algorithm lets the network optimize itself, and gives some sense of how much each layer affected the output, it is effectively impossible to tell how a network actually arrives at its output. Remember the KNN algorithm? With it, it is a very simple matter to trace back the decisions and tell how the algorithm arrived at its output. But neural networks are considered black boxes, even by those who build them. It's impossible to trace the network's steps and say, "Aha, this is why it gave that answer!" This problem is not due to a lack of understanding of the systems or the mathematics; it is simply the nature of the beast, at least for now.
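To make the contrast concrete, scikit-learn's KNN classifier can report exactly which training points drove a prediction, something a deep network has no equivalent of (the points below are again made up for illustration):

```python
# KNN is explainable: kneighbors() returns the exact stored points that
# produced the answer, so the decision can be traced back completely.
from sklearn.neighbors import KNeighborsClassifier

points = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]]
labels = ["red", "red", "blue", "blue"]

knn = KNeighborsClassifier(n_neighbors=2).fit(points, labels)

query = [[0.1, 0.1]]
distances, indices = knn.kneighbors(query)

print(knn.predict(query)[0])   # the answer: red
print(indices[0].tolist())     # the exact training points that decided it
```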
We may never know the secret of how deep learning identifies a cat.
For a more in-depth look at machine learning and deep learning, and assuming you have an understanding of the Python programming language, I recommend the book Deep Learning with Python by Francois Chollet. If you would like to experiment with neural networks yourself, Google Colab is a good place to start if you don't have the hardware and don't want to commit to installing anything on your computer. For general ML tools, the scikit-learn library for Python 3 is an excellent all-round machine learning kit for small-to-medium-sized personal projects. For larger enterprise projects, Apache Spark is an excellent choice. Finally, while the website Kaggle.com is most famous for hosting machine learning competitions, it also offers learning projects users can walk through for those wanting to step into the field, and an entire community willing to help newcomers. I hope this article helps people understand a bit more about the mysterious and frightening world of AI, or at least piques some curiosity about the subject. While there are definitely some things to watch out for, such as data privacy, AI, ML, and deep learning stand to do a lot of good in the future.
This post was written by Zach Johnson.