Predictive Models and Machine Learning Gentle Intro (Free of math)
- Adam Longmire

- Apr 15, 2025
Before Machine learning
The processes involved in machine learning have been around far longer than machine learning itself. In reality, machine learning is based on prediction at effectively all levels, and we have been making predictive decisions for as long as humans have existed. You may think: hasn't AI been a great revolution? Yes, it has, but its roots lie more in prediction than in anything else, and that prediction is derived from data. In the past we made predictions based on a number of characteristics: when to plant crops and when to plan for the harvest are just two simple examples of predictions being used to drive decisions.
A very clear example is the simple awareness of when it was getting colder. It was well known that winter brought a much higher probability of rain given certain indicators, and those indicators are what informed the predictions made from this "data". From that information we could decide to collect sufficient resources for the winter: rain brings a risk of flooding, colder weather means more heating and more stored food, and reduced temperatures lead to slower crop growth, if anything grows at all. These "traits" are what we call "features" in AI, and we build predictions from them.
Temperature (low, medium, high)
Weather (sunny, rainy, cold, snow)
Heat (do some fuel sources generate more heat for less?)
Fuel (how much do I need to get through winter?)
Trees (are the trees too wet to be used?)
These are the features we could use to predict whether, say, we need more food or more wood to keep a fire going. This is effectively the essence of how AI works. The only difference is the amount of data: the number of "features" has gone from 1 to 5 or so to hundreds or even thousands. Would you want to read a giant chart of all the data you have about a specific task and then try to make predictions from it? That is why computers are often used to sort and clean this data, removing any irrelevant features.
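To make the idea of predicting from features a little more concrete, here is a minimal Python sketch. The past observations, the feature values and the function name predict_need_more_wood are all made up for illustration, and the "model" is just a vote over similar past days, not how a real system would be trained.

```python
from collections import Counter

# Hypothetical past observations: (temperature, weather, needed_more_wood)
past_winters = [
    ("low", "snow", True),
    ("low", "rainy", True),
    ("medium", "rainy", True),
    ("medium", "sunny", False),
    ("high", "sunny", False),
    ("low", "sunny", True),
]

def predict_need_more_wood(temperature, weather):
    """Vote over past days that share at least one feature value with today."""
    votes = Counter(
        needed
        for past_temp, past_weather, needed in past_winters
        if past_temp == temperature or past_weather == weather
    )
    if not votes:
        # No similar day on record; falling back to False is an arbitrary assumption
        return False
    return votes.most_common(1)[0][0]

print(predict_need_more_wood("low", "snow"))
print(predict_need_more_wood("high", "sunny"))
```

With two features and six rows you could do this in your head. The point of handing it to a computer is that the same idea still works when there are thousands of features and millions of rows.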
What is machine learning / AI
Machine learning is simply the collection of data and information, followed by forming a set of predictions from that data. It comes in two main forms: discriminating (more commonly called discriminative) and generative. Most of the models you find in recommenders, which pick the next video you are most likely to view, or in the systems used to detect fraud, are what we call "discriminators" or discriminating models. They look at the data and determine from it what the best choice is. For fraud, that means determining whether a financial transaction is fake or potentially unauthorized by examining the many data points about how you spend money. Examples of patterns which could be extracted from your financial activity include:
How much you typically spend per transaction
The type of transaction
Location
Frequency of payments or withdrawals
How often you fall into the negative in some accounts
When you deposit money
How much you deposited
While these data points might appear pointless on their own, together they can yield insights into how all these things are interlinked. This interconnection allows a discriminating model, trained on this set of behavioral patterns, to make better decisions about whether a given behavior is anomalous and should therefore be classified as fraud.
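As a rough illustration of combining several data points into one decision, the sketch below builds a small "profile" from past transactions and flags a new one only when two signals line up. The transaction history, the function names and the amount_factor threshold are assumptions made for this example. A real fraud model would learn its rules from large amounts of training data instead of having them written by hand.

```python
from statistics import median

# Hypothetical transaction history for one account: (amount, location)
history = [
    (12.50, "London"),
    (40.00, "London"),
    (8.99, "London"),
    (25.00, "Manchester"),
    (31.20, "London"),
]

def build_profile(transactions):
    """Summarise past behaviour: the typical amount and the locations seen so far."""
    amounts = [amount for amount, _ in transactions]
    locations = {location for _, location in transactions}
    return {"typical_amount": median(amounts), "known_locations": locations}

def looks_anomalous(profile, amount, location, amount_factor=10):
    """Flag a transaction that is far larger than usual AND from an unseen location."""
    unusually_large = amount > amount_factor * profile["typical_amount"]
    new_location = location not in profile["known_locations"]
    return unusually_large and new_location

profile = build_profile(history)
print(looks_anomalous(profile, 30.00, "London"))
print(looks_anomalous(profile, 900.00, "Tokyo"))
```

Neither signal is treated as suspicious on its own here. It is the combination that triggers the flag, which is the "interlinked" idea in miniature.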
Generative AI
Avoiding a lot of the math here is fairly simple, as this isn't a technical deep dive and I might send people to sleep, so I will explain this with a simple idea: words and sentences contain relationships, and those relationships are fairly easy to see. A sentence such as "I need to go to the [insert here]" has various possible completions, but those choices are informed by the context that came before. At its core, generative AI is simply learning these "relationships" between words mathematically. Where you might have a concept of grammar, such as a subject-verb-object structure, a machine learning model does not understand this, so we represent these relationships using numbers. These numbers are exactly what generative models such as large language models are built on: an abstract representation of how words work and of which word is most likely to come next. So generative AI is still a type of discriminating AI, in that it chooses among possible next words, but it is doing so much more than just discriminating.
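The "which word is likely to come next" idea can be sketched with nothing more than counting. The tiny corpus and the predict_next function below are invented for illustration. Large language models learn far richer relationships than "which word followed which", so treat this as a toy stand-in and not as how they actually work.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only)
corpus = [
    "i need to go to the shop",
    "i need to go to the bank",
    "i need to go to the shop today",
    "the cat sat on the mat",
]

# Count which word follows which: a crude stand-in for learned relationships
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1

def predict_next(word):
    """Return the word most often seen after `word`, or None if it never had a follower."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

print(predict_next("the"))
print(predict_next("cat"))
```

In this corpus "shop" follows "the" more often than any other word, so that is the prediction. The numbers here are plain counts, but the principle of turning relationships between words into numbers is the same.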
Take a simple sentence that everyone at school will have been familiar with: "The fat cat sat on the mat". Each word has numbers associated with it that capture how close it is to the other words in the sentence, such as the adjective ("fat") and the subject ("cat"), because it is the cat that sat on the mat. The generative model then creates the first word, then generates the next, and so on. We call these "words" tokens, and the output is created token by token until you get a sentence. In reality this number-based representation is high dimensional and not easily understood, and we call it embedding space: each word is just a list of numbers which tells you how close it is to another word in relative meaning.
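For readers who want to see what "a list of numbers that tells you how close two words are" might look like, here is a small Python sketch. The three-number lists are hand-made for illustration, whereas real embeddings are learned from data and have hundreds or thousands of numbers per word. The closeness function uses cosine similarity, one common way of comparing such lists, and it does involve a little arithmetic.

```python
import math

# Hand-made 3-number "embeddings" (purely illustrative; real ones are learned)
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "mat": [0.1, 0.2, 0.9],
}

def closeness(word_a, word_b):
    """Cosine similarity: values nearer 1.0 mean the two lists point the same way."""
    a, b = embeddings[word_a], embeddings[word_b]
    dot = sum(x * y for x, y in zip(a, b))
    length_a = math.sqrt(sum(x * x for x in a))
    length_b = math.sqrt(sum(x * x for x in b))
    return dot / (length_a * length_b)

print(closeness("cat", "kitten"))
print(closeness("cat", "mat"))
```

With these made-up numbers, "cat" scores closer to "kitten" than to "mat", which is the kind of relative meaning an embedding space is meant to capture.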
