AI Bites is a series of short and simple explanations of concepts related to Artificial Intelligence. I feel I truly understand something when I can put it into a few simple words. I'm sharing as I'm learning... on LinkedIn.
AI = Code + Data
Artificial Intelligence is when computers can perform complex tasks with a level of performance similar to that of humans. For example: identifying objects in images, doing hard calculations, or navigating a street filled with cars and people.
For that to happen, we need Code.
Code is the way we tell computers what to do. It can be as simple as "if this happens, please do that". However, AI requires more advanced code that gives machines the power to learn like a human.
This learning is done from Data.
We are able to recognize a dog after seeing other similar animals. When we see a dog for the first time, somebody has to tell us that it is called a "dog". The more dogs we see, the faster and more accurately we will identify one in a future encounter.
Just like computers do.
Everyone has heard some version of the story of AI's development: starting in the 1950s, going through two winters where funding and enthusiasm dried up, and coming back to reach the hype we live in today.
But what, in fact, contributed to AI's fantastic evolution and adoption in recent years?
3 main areas of development compounded to create the "perfect storm": more powerful and cheaper hardware, better software and algorithms, and an ever-growing amount of available data.
These advancements have impacted how software is developed and how we interact with computers. Programs became increasingly useful because they were able to process and store more data, which in turn augmented our capabilities to analyze data, do calculations, and retrieve information.
Today's state-of-the-art hardware and software allow machines to go beyond basic functions and start collecting data autonomously, perceiving the world, applying reasoning, and even having agency over a given context. *Enter Artificial Intelligence*
This is not an overnight success; it's the effect of several decades of research, development, and societal changes. Here are some historical landmarks that made this possible:
Many other incredible achievements that are not listed here also represent incremental gains and big leaps in the history of AI.
People and organizations using AI today are standing on the shoulders of giants!
If you dig into the Artificial Intelligence rabbit hole, it gets deep very fast. One way to understand the topic is to explore the foundational concepts that set the base for more advanced knowledge.
Think of focusing on the tree trunk first and the branches later.
So, AI systems - programs that can do complex intellectual tasks - work generally like this:
DATA ---> AI SYSTEM ---> OUTPUT
The data that is presented to the system can be pretty much anything - numbers, text, images, audio... (It will all be transformed into numbers anyway).
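To make that last point concrete, here is a minimal sketch (in Python with NumPy - a tooling choice of mine, not something from the text) of how text, images, and audio all end up as numbers:

```python
import numpy as np

# Text: each character already has a numeric code.
text = "dog"
text_as_numbers = [ord(ch) for ch in text]   # [100, 111, 103]

# Images: a grayscale image is just a grid of pixel intensities (0-255).
image = np.array([[0, 255, 0],
                  [255, 0, 255],
                  [0, 255, 0]], dtype=np.uint8)

# Audio: a waveform is a sequence of amplitude samples.
audio = np.array([0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0])

print(text_as_numbers)   # a list of numbers
print(image.shape)       # a (3, 3) grid of numbers
print(audio)             # a sequence of numbers
```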
The majority of ML programs today do one task - labeling.
In a picture, it can identify a dog; in transaction records, it can find fraud; in emails, it can find spam. This happens because the model that processes the data was built through Supervised Learning: it had access to training data (aka labeled examples), like several images of dogs, so it can predict with a certain level of confidence whether or not there is a dog in a picture it has never "seen" before.
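As an illustration, here is a minimal Supervised Learning sketch using scikit-learn (my library choice; the features, values, and labels below are invented stand-ins for the "dog pictures" example):

```python
from sklearn.tree import DecisionTreeClassifier

# Invented training data: each example is [weight_kg, ear_length_cm]
# and the label says whether it is a dog (1) or not (0).
X_train = [[10, 8], [25, 12], [4, 3], [30, 11], [3, 2], [12, 9]]
y_train = [1, 1, 0, 1, 0, 1]

model = DecisionTreeClassifier()
model.fit(X_train, y_train)   # "learning" from the labeled examples

# A new, unseen example: the model predicts a label for it.
print(model.predict([[20, 10]]))        # e.g. [1] -> "dog"
print(model.predict_proba([[20, 10]]))  # the model's confidence per class
```

The important part is the `fit()` call: that is where the labeled examples are used to build the model that later labels data it has never seen.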
ML systems are also beneficial when we have huge amounts of data that are virtually impossible for a human to process. Sometimes we don't know very well what we are looking for or what we will find in a given dataset. In these cases, Unsupervised Learning comes in handy.
The data is subjected to an analysis that uncovers inherent structures or groups (clusters). These models are good at finding patterns in data where we do not have a clear mapping between inputs and outputs.
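Here is a minimal Unsupervised Learning sketch, again with scikit-learn and invented data, using k-means clustering as one possible technique:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: customers described as [purchases_per_month, avg_spend].
# The values are invented; note there are no labels this time.
X = np.array([[2, 20], [3, 25], [2, 22],
              [15, 200], [14, 210], [16, 190]])

# Ask the algorithm to find 2 clusters on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.labels_)            # which cluster each customer ended up in
print(kmeans.cluster_centers_)   # the group "centers" the model discovered
```

Nobody told the model what the groups mean - it only found that the data naturally splits into two groups; interpreting them is up to us.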
AI programs can also be taught through incentives and penalties. An AI agent that interacts with a certain environment - imagine playing Chess or Go - has a goal to achieve. Reinforcement Learning rewards the model when a given action gets it closer to the desired outcome. There's a tutor that watches every move - the interpreter - and feeds back the decision to reward or penalize, improving the chances of winning on the next attempt.
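Here is a minimal Reinforcement Learning sketch using tabular Q-learning on a toy "corridor" environment (both the technique and the environment are my choices for illustration, not something from the text):

```python
import random

# A toy corridor: states 0..4, and the goal is to reach state 4.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # move left, move right

# Q-table: the agent's estimate of how good each action is in each state.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for episode in range(200):
    state = 0
    while state != GOAL:
        # Explore sometimes; otherwise pick the best-known action.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])

        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0  # the incentive

        # Update the estimate using the reward received (Q-learning rule).
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# Best action in each state before the goal (should be +1, i.e. move right).
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)})
```

The agent is only told when it reaches the goal; over many episodes, that single incentive is enough for it to learn to always move right.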
With the recent developments in Generative AI, it's common to see people focus too much on one specific part of the broad field of Artificial Intelligence - Machine Learning.
You've probably seen the diagrams that drill down from AI to Generative AI:
AI > Machine Learning > Deep Learning > Generative AI
But, if you wonder what fills in the AI square beyond ML, here it is...
First, let's establish that AI is the field that focuses on the research and development of systems capable of performing tasks that typically require human intelligence - decision-making and reasoning, mimicking domain expertise, problem-solving, adaptability, etc.
From this big bucket, we can exclude the sub-field of Machine Learning, which is about systems that learn from existing data to generalize patterns and make predictions. Meaning, they extract behaviors and relationships from the training data to make increasingly accurate predictions on new, unseen data.
So, we are left with AI techniques that are not ML. Here are some examples:
1️⃣ Rule-based systems: decisions are made based on a set of predefined rules that are explicitly programmed, and the system follows them to make decisions or take actions. e.g. A diagnostic expert system in medicine that provides recommendations based on a set of rules and medical knowledge (see the sketch after this list).
2️⃣ Search algorithms: they explore the solution space systematically to find solutions to problems. They use strategies to navigate through possibilities. e.g. In a chess-playing program, a search algorithm explores possible moves and game states to find the best move.
3️⃣ Robotics and Control Systems: they make decisions about robot movement and actions. These systems may use predefined algorithms and rules. e.g. A robot arm that follows a set of rules for precise movements in a manufacturing process.
4️⃣ Natural Language Processing (NLP) Techniques: While machine learning is often applied to NLP, rule-based approaches exist. These systems use linguistic rules and heuristics for language processing. e.g. A chatbot that understands and responds to user queries based on predefined language rules.
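To show how different this is from learning-based approaches, here is a minimal rule-based sketch along the lines of the medical example in 1️⃣ (the rules and thresholds are invented purely for illustration - not taken from any real system):

```python
# A toy rule-based "triage" system: every decision comes from explicit,
# hand-written rules, not from parameters learned from data.

def triage(temperature_c: float, has_cough: bool, has_rash: bool) -> str:
    if temperature_c >= 39.5:
        return "urgent: see a doctor today"
    if temperature_c >= 38.0 and has_cough:
        return "possible flu: rest and monitor"
    if has_rash:
        return "possible allergy: consider an antihistamine"
    return "no action needed"

print(triage(39.8, has_cough=False, has_rash=False))  # urgent
print(triage(38.2, has_cough=True, has_rash=False))   # possible flu
print(triage(36.8, has_cough=False, has_rash=True))   # possible allergy
```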
This question has been in the back of my mind for some time now... A Machine Learning (ML) model is a set of steps that define the way the system deals with data and makes predictions.
If a programmer writes the code that defines the architecture and rules of the algorithm, in my mind, it should remain static until the programmer makes any change. However, we've heard of how ML programs are capable of learning and improving by ingesting more and more data. So, what's happening? Is the code changing by itself?
Well, yes and no...
ML is a broad collection of techniques like Linear Models, Decision Trees, Random Forests, Naive Bayes, K-Nearest Neighbors, etc.
These are known as "traditional" ML algorithms, and each has a set of parameters - coefficients, split points, or others, depending on the approach. The programmer writes the code that defines the algorithm and its settings (the hyperparameters), but the parameter values themselves are estimated from the training data when the model is fitted - and it's those fitted values the model uses to make predictions.
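As a minimal sketch of that idea (using scikit-learn, a library choice of mine, with invented data): the code below stays exactly the same from run to run, but the coefficient values come out of the fitting step, not out of the programmer's head.

```python
from sklearn.linear_model import LinearRegression

# Invented toy data: house size (m^2) vs. price.
X = [[50], [80], [100], [120]]
y = [150_000, 240_000, 300_000, 360_000]

model = LinearRegression()   # the algorithm the programmer writes/chooses
model.fit(X, y)              # the parameters are estimated from data here

print(model.coef_, model.intercept_)  # learned coefficient and intercept
print(model.predict([[90]]))          # prediction using the fitted parameters
```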
In the context of neural networks (Deep Learning), the same idea applies, but the process is worth spelling out step by step. Neural networks are designed to learn from data, and their parameters (weights and biases) are not manually specified by the programmer. Instead, the weights and biases are randomly initialized, and the learning process adjusts them during training.
When the model is presented with the training data, it makes predictions with the initial weights and biases and calculates the loss - a measure of the difference between the predictions and the actual labels of the training data.
Then, a step called backpropagation uses the loss information to work out how each weight and bias should change, and those parameters are updated so the model produces better results. This cycle is repeated many times until a satisfactory loss level is reached.
So, in the case of neural networks, the programmer doesn't manually set the weights and biases. Instead, the learning process automatically adjusts these parameters based on the training data and the optimization algorithm used. The goal is to find parameter values that minimize the difference between predicted and actual outcomes on the training data, allowing the model to generalize well to new, unseen data.
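Here is a minimal sketch of that whole loop - random initialization, prediction, loss, gradients, update - for a single "neuron" trained with gradient descent, written in plain NumPy with invented data (a simplified stand-in for backpropagation in a full network, not anyone's production code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented training data: 2 features per example, binary labels.
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # the "actual labels"

# 1. The weights and bias start as random values, not values set by hand.
w = rng.normal(size=2)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(500):
    # 2. Forward pass: predictions with the current weights and bias.
    p = sigmoid(X @ w + b)

    # 3. Loss: how far the predictions are from the actual labels.
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

    # 4. Backward pass: gradients of the loss w.r.t. the parameters.
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)

    # 5. Update: nudge the parameters to reduce the loss.
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

print("final loss:", round(float(loss), 4))
print("learned weights:", w, "bias:", round(b, 3))
```

Notice that nothing in the code rewrites itself: the instructions stay exactly as written, and only the values of `w` and `b` change as training progresses.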
The term "Artificial Intelligence" (AI) was coined by John McCarthy, a computer scientist often called the "father of AI." He introduced the term in the 1955 proposal for the Dartmouth Conference, the first-ever conference dedicated to AI, held in 1956 at Dartmouth College.
His (and his co-authors') proposal outlined the vision for a research project that would explore the possibility of machines being able to use language, form abstractions and concepts, solve kinds of problems reserved for humans, and improve themselves.
This historic conference is widely considered the birth of AI as a field of research. McCarthy's vision was to investigate ways in which machines could be made to simulate aspects of human intelligence.
The development of Artificial Intelligence (AI) is a multidisciplinary endeavor that draws on expertise from a variety of fields. Professionals from diverse backgrounds contribute their knowledge and skills to advance AI technology and make it more efficient, ethical, and accessible across various sectors.
Here are some of the key professions contributing to AI development:
1️⃣ Computer Scientists and AI Researchers
2️⃣ Data Scientists and Engineers
3️⃣ Software Engineers and Developers
4️⃣ Robotics Engineers
5️⃣ Cognitive Scientists, Neuroscientists, and Psychologists
6️⃣ Ethicists and Policy Makers
7️⃣ Philosophers
8️⃣ Linguists
9️⃣ Industry Specialists and Domain Experts