Machine Learning (ML): Turning Data into Predictions

A practical introduction to how machines learn, what they need, and why it matters.
The Goal: Stop Writing Rules, Start Learning Patterns
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
- Tom Mitchell, Machine Learning (1997)
Traditional programming means writing every rule by hand. Machine learning flips this: instead of telling a computer what to do, you show it examples and let it discover the rules itself.
The computer builds its own program from data, with no hand-written rules required.
Machine Learning

•For example, one might predict whether an email is spam based on its content and sender information (classification).
•Another task could be grouping books into categories according to the words they contain, then assigning any new book to one of the established clusters (clustering).
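The spam example can be sketched in a few lines. This is a minimal illustration, assuming a single invented feature (the number of suspicious words in an email) and a tiny hypothetical dataset. Instead of hard-coding a cut-off, the code tries each candidate threshold and keeps the one that classifies the labelled examples best.

```python
# Hypothetical toy data: (number of suspicious words in the email, is_spam).
# The feature and the values are assumptions made up for illustration.
emails = [(0, False), (1, False), (2, False), (4, True), (5, True), (7, True)]

def accuracy(threshold, data):
    """Fraction of emails classified correctly by the rule: count >= threshold."""
    correct = sum((count >= threshold) == is_spam for count, is_spam in data)
    return correct / len(data)

def learn_threshold(data):
    """Try every observed count as a threshold and keep the most accurate one."""
    candidates = sorted({count for count, _ in data})
    return max(candidates, key=lambda t: accuracy(t, data))

threshold = learn_threshold(emails)
print(threshold, accuracy(threshold, emails))
```

The rule "flag as spam when the count reaches the threshold" is never written by hand; the threshold comes from the examples.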
ML's Core Ingredients
📊 Dataset
A collection of input-output pairs — features (what you measure) and labels (what you want to predict).
🧠 Model
A mathematical structure with internal parameters that are adjusted during training to fit the data.
⚙️ Training Process
The mechanism that tunes parameters iteratively, guided by how wrong the model's predictions currently are.
Three Main Types of Machine Learning
Supervised Learning
Learn from labelled examples to predict outputs on new inputs. Used in classification and regression tasks.
Unsupervised Learning
Discover hidden structure in data when no labels exist — clustering, dimensionality reduction, anomaly detection.
Reinforcement Learning
Learn behaviour through trial, error, and reward signals over time — the approach behind game-playing AI and robotics.
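Supervised and unsupervised learning get their own examples later, so here is a rough sketch of the reinforcement learning idea only. It assumes the simplest possible setting, a two-action "bandit" with fixed rewards, and an epsilon-greedy strategy. The reward values, step count, and function name are hypothetical choices for illustration.

```python
import random

# Hypothetical environment: two actions with fixed rewards (assumed values).
REWARDS = [0.2, 0.8]

def run_bandit(steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: usually exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0, 0.0]   # running average reward observed for each action
    counts = [0, 0]          # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)                            # explore
        else:
            action = max(range(2), key=lambda a: estimates[a])   # exploit
        reward = REWARDS[action]
        counts[action] += 1
        # Incremental update of the average reward for the chosen action
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

estimates, counts = run_bandit()
best_action = max(range(2), key=lambda a: estimates[a])
print(best_action, counts)
```

The agent is never told which action is better; it finds out from the reward signal alone.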
Supervised Learning
•Supervised learning is the most common type of machine learning. In this approach, the model is trained on a labelled dataset. In other words, the data is accompanied by a label that the model is trying to predict. This could be anything from a category label to a real-valued number.
•The model learns a mapping between the input (features) and the output (label) during the training process. Once trained, the model can predict the output for new, unseen data.
•Common examples of supervised learning algorithms include linear regression for regression problems and logistic regression, decision trees, and support vector machines for classification problems.
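As a minimal sketch of the supervised workflow described above, the code below trains a one-feature logistic regression with plain gradient descent. The dataset (tumour size against a malignant/benign label), the learning rate, and the number of epochs are all assumptions invented for illustration, not values from a real study.

```python
import math

# Hypothetical labelled dataset: feature = tumour size (arbitrary units),
# label = 1 for malignant, 0 for benign. Values are invented for illustration.
sizes = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Estimated probability that the label is 1 for input x."""
    return sigmoid(w * x + b)

def train(xs, ys, lr=0.1, epochs=5000):
    """Fit w and b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [predict_proba(x, w, b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

w, b = train(sizes, labels)

# Predict for new inputs that were not in the training data
for new_size in (2.5, 6.5):
    print(new_size, predict_proba(new_size, w, b) > 0.5)
```

Training learns the mapping from feature to label; the final loop applies it to inputs the model has not seen.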
Types of Supervised Learning
Classification predicts a category label; regression predicts a real-valued number.

Unsupervised Learning
•Unsupervised learning is the process of uncovering hidden patterns and structures from unlabelled data. 
•For instance, a business might aim to group its customers into distinct categories based on their purchasing behaviour, without knowing beforehand what these categories will be. 
•This technique, known as clustering, represents one branch of unsupervised learning.
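The customer-grouping idea can be sketched with k-means, one common clustering algorithm. The source does not name a specific algorithm, so k-means is an assumption here, as are the one-dimensional spend figures and the choice of two clusters.

```python
# Hypothetical data: monthly spend per customer (invented values, no labels).
spend = [12.0, 15.0, 14.0, 90.0, 95.0, 88.0]

def nearest(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))

def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on 1-D data: assign to the nearest centroid, then re-average."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            clusters[nearest(v, centroids)].append(v)
        # Keep the old centroid if a cluster ends up empty
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Start from two of the data points as initial centroids (an arbitrary choice)
centroids, clusters = kmeans_1d(spend, centroids=[spend[0], spend[3]])
print(centroids, clusters)

# A new customer is assigned to whichever established cluster is closest
print(nearest(20.0, centroids))
```

No labels are supplied at any point; the two groups emerge from the spend values alone.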
Working with Real-World Data
This is the challenging part!

Example: Supervised Learning
The target values (labels) are known for the training data.
Example: Cancer Data Analysis
Aim: Predict the target values of unseen data, given the features.

Visualising the Data

Select a Performance Measure
The Training Loop: Loss Function Turns Guessing into Learning
What is a Loss Function?
A loss function L measures how far the model's predictions are from the true labels. The larger the error, the higher the loss.
Training is an optimisation problem: find the parameters that minimise L across all training examples. At each iteration the parameters are nudged in the direction that reduces the loss, so the model's predictions gradually improve.
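To make the loop concrete, here is a minimal sketch assuming a one-parameter model y_hat = w * x, mean squared error as the loss, and gradient descent as the optimiser. The data, learning rate, and step count are hypothetical choices for illustration.

```python
# Hypothetical training data, generated from y = 2x for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def mse_loss(w):
    """Mean squared error of the one-parameter model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_gradient(w):
    """Derivative of the loss with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0                # initial guess for the parameter
learning_rate = 0.05   # assumed step size
losses = []
for step in range(20):
    losses.append(mse_loss(w))
    w -= learning_rate * mse_gradient(w)   # step against the gradient

print(w, losses[0], losses[-1])
```

The recorded losses shrink as w approaches 2, the slope used to generate the data.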
Generalisation: The Hidden Test Everyone Forgets
A model that memorises training data but fails on new inputs has learnt nothing useful. This is called overfitting.
No model can see infinite inputs during training — it always learns from a finite sample. The real measure of success is whether the model performs well on data it has never seen before. Generalisation separates a useful model from one that merely gets top marks on past examples.
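The difference between memorising and generalising can be sketched with a held-out test set. The data below is hypothetical (it follows y = 2x), and both "models" are deliberately simplistic stand-ins: one is a lookup table of the training pairs, the other learns a single slope.

```python
# Hypothetical data following y = 2x, split into training and held-out test sets.
train_set = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
test_set = [(4.0, 8.0), (5.0, 10.0)]

def mean_abs_error(predict, data):
    """Average absolute difference between predictions and true values."""
    return sum(abs(predict(x) - y) for x, y in data) / len(data)

# Model A memorises the training pairs and falls back to 0.0 for anything new.
lookup = dict(train_set)
def memoriser(x):
    return lookup.get(x, 0.0)

# Model B learns one slope w for y_hat = w * x (least squares through the origin).
w = sum(x * y for x, y in train_set) / sum(x * x for x, _ in train_set)
def line(x):
    return w * x

for name, model in (("memoriser", memoriser), ("line", line)):
    print(name, mean_abs_error(model, train_set), mean_abs_error(model, test_set))
```

Both models are perfect on the training set, so training error alone cannot tell them apart. Only the held-out data shows that the memoriser has learnt nothing that transfers.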
A Concrete Example: Linear Regression as ML in Miniature
The Simplest ML Model
Fit a straight line y = wx + b through data points by minimising the total squared prediction error.
x → Features
The inputs you measure
y → Labels
The outputs you want to predict
w, b → Parameters
Learned automatically during training
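A minimal sketch of fitting y = wx + b, assuming ordinary least squares (minimising squared error) as the fitting criterion and using its standard closed-form solution. The data points and the function name fit_line are hypothetical.

```python
# Hypothetical data points, invented for illustration (they lie exactly on y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def fit_line(xs, ys):
    """Closed-form least-squares estimates of slope w and intercept b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x          # slope
    b = mean_y - w * mean_x     # intercept
    return w, b

w, b = fit_line(xs, ys)
print(w, b)

# Use the learned parameters to predict the label for a new feature value
print(w * 10.0 + b)
```

Here x holds the features, y the labels, and w and b are the parameters recovered from the data rather than set by hand.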
When ML Goes Wrong
Wrong Question
No model can rescue a poorly defined problem. If you optimise for the wrong outcome, you'll achieve it perfectly — and still fail.
Bad or Missing Data
Incorrect labels, too few examples, or unrepresentative samples will produce a model that misleads rather than helps.
Wrong Yardstick
Using the wrong evaluation metric means you can't tell whether your model is actually working — even when it looks good on paper.
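The "wrong yardstick" problem is easy to reproduce. The sketch below assumes a hypothetical imbalanced dataset (5 positive cases out of 100) and a useless model that always predicts the negative class, then compares accuracy with recall.

```python
# Hypothetical imbalanced labels: 1 = positive case (rare), 0 = negative case.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100          # a "model" that always predicts negative

def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of actual positive cases that the model found."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == 1 for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0

print(accuracy(y_true, y_pred), recall(y_true, y_pred))
```

Accuracy looks strong because negatives dominate, while recall shows that the model misses every positive case. Which metric matters depends on the task defined at the start.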
Takeaway: ML Isn't Magic
It's a disciplined process. Follow these five steps every time.
1. Define the Task
What exactly are you predicting? Be precise.
2. Gather the Right Data
Collect correct input-output pairs that represent the real world.
3. Choose a Model Class
Pick an architecture suited to your data and task type.
4. Train with a Loss
Define how to measure error, then minimise it systematically.
5. Verify Generalisation
Always test on data the model has never seen. That's the real exam.
Machine learning is data + models + training + generalisation. Master these four building blocks and you master ML.