Artificial Intelligence (AI) is making huge headlines around the world. But what is the practical reality of AI as a game-changer for solving real-world problems across industries? Before tackling this question, it’s useful to gain some insight into what AI is all about.
This blog post is the opening of a whitepaper with the same title, which begins by embarking on a tour of contemporary AI and how it is accomplished. Although AI and machine learning (ML) can be highly technical subjects involving lots of math and statistics, the aim here is to present these topics in a non-technical way, but at the same time provide a reasonable amount of breadth, depth, and rigor. While there will be some mathematical notation, this can be safely ignored if so desired.
This first look presents a high-level overview of some of the critical ideas of AI and ML. We will move rapidly through key aspects of the theory and, by the end of this installment, walk through building a simple deep neural network to recognize handwritten digits. Don’t worry if some of the ideas presented here are not entirely clear, as we won’t have time to delve into all of the details; subsequent installments in this series of whitepapers will flesh out the key ideas in greater detail. Once a solid foundation in the mechanics of AI and machine learning is in place, later installments will examine the promise of AI and, specifically, how its various incarnations can be brought to bear in industry and commercial applications to solve hard real-world problems.
As we will see, the rudiments of AI and ML arise from probability and statistics. There are many ways to interpret how AI works, yet the “how” remains an open question and has become a focal area of research. Answers to this question may touch on advanced topics such as convex function theory, Bayesian inference, information theory, and even aspects of statistical mechanics, such as spin glasses. These questions are not merely academic, for they underscore crucial elements of the successful practice of AI. Indeed, it is a vibrant field! But we needn’t complicate the discussion for now, as we will get to all of that in due course.
We will approach this discussion by building up our understanding of AI from first principles. We will begin with linear regression, which may be thought of as the “Hello World” of AI. From there, we will turn to the classification problem and delve into logistic regression, which is all about teaching a computer to classify an input as belonging to one class or another. If you think about it, classification is a vital part of the foundations of exhibited intelligence. An agent that can recognize and distinguish different entities in the world may be said to show a modicum of intelligence.
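For readers who like to see things concretely, here is a minimal sketch of what that “Hello World” might look like: fitting a straight line to a handful of points by least squares. It is an illustration only and can be safely skipped. The function name `fit_line` and the data are made up for this example, and NumPy is assumed to be available; the whitepaper itself may take a different approach.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y ≈ slope * x + intercept (illustrative helper)."""
    # Build a two-column matrix: the inputs, plus a column of ones
    # so that the fit can also learn an intercept.
    A = np.column_stack([x, np.ones_like(x)])
    # Solve for the slope and intercept that minimize the squared error.
    (slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
    return slope, intercept

# Made-up, noise-free data lying on the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

slope, intercept = fit_line(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Because the sample points lie exactly on a line, the fit recovers that line; with real, noisy data it would instead return the line that comes closest to the points in the least-squares sense.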
Logistic regression will serve as a foundation for understanding neural networks and deep learning, which are among the most popular techniques in supervised learning today. Along the way, we hope to highlight the essential aspects of these techniques. For example, we will learn the two most important aspects of neural networks and deep learning that have enabled them to be so successful in areas such as image classification, natural language processing, speech recognition, and speech synthesis.