Module 6: Machine Learning Basics

Introduction to building predictive models using Scikit-Learn.

6.1 Scikit-Learn Overview

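Scikit-learn is a free, open-source machine learning library for Python. It provides classification, regression, and clustering algorithms, along with tools for preprocessing, model selection, and evaluation. Its estimators share a consistent interface: fit() trains a model and predict() applies it to new data.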

6.2 Supervised Learning

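Supervised learning trains a model on labeled data: each example pairs input features with a known target, and the model learns to predict the target for inputs it has not seen. Common algorithms include Linear Regression (numeric targets), Decision Trees, and Support Vector Machines (both usable for classification and regression). The example below fits a linear regression on 80% of the data and predicts on the remaining 20%.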

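from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# X = features, y = target (synthetic placeholder data; substitute your own)
X, y = make_regression(n_samples=100, n_features=3, noise=10.0, random_state=0)

# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)          # learn coefficients from the training rows
predictions = model.predict(X_test)  # predict targets for the held-out rows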
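
The same fit/predict pattern applies to classification. The following is a minimal sketch using a Decision Tree, one of the algorithms named above. The synthetic dataset from make_classification and the max_depth=3 setting are assumptions chosen for illustration, not part of the course material.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data (an assumption for illustration only)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# max_depth=3 is an arbitrary choice that keeps the tree small
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)
predicted_labels = clf.predict(X_test)  # one class label (0 or 1) per test row
```

Only the estimator class changes between the regression and classification examples; the split, fit, and predict steps stay the same.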

6.3 Unsupervised Learning

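Unsupervised learning finds structure in data that has no labels. Common algorithms include K-Means Clustering, which groups similar samples into clusters, and PCA (Principal Component Analysis), which projects the data onto a smaller number of dimensions.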
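
As a rough sketch of both algorithms, the code below clusters synthetic data with K-Means and reduces it to two dimensions with PCA. The make_blobs dataset, the choice of three clusters, and the two output components are assumptions made for this illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic data with 4 features; the generated labels are discarded
# because unsupervised methods do not use them
X, _ = make_blobs(n_samples=150, centers=3, n_features=4, random_state=0)

# Assign each sample to one of 3 clusters (the cluster count is chosen by us)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Project the 4 features onto the 2 directions of greatest variance
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
```

Note that no y is passed to either estimator: K-Means returns a cluster index per sample, and PCA returns a transformed copy of X.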

6.4 Model Evaluation

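Evaluating a model on data it was not trained on shows how well it is likely to generalize. The appropriate metric depends on the problem type: for example, accuracy (the fraction of correct predictions) for classification, and mean squared error (MSE, the average squared difference between predicted and true values) for regression.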
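
Both metrics are available in sklearn.metrics. The short sketch below computes them on small hand-written lists so the arithmetic can be checked by hand; the values are made up for illustration and do not come from a trained model.

```python
from sklearn.metrics import accuracy_score, mean_squared_error

# Classification: 4 of the 5 predicted labels match the true labels
y_true_cls = [0, 1, 1, 0, 1]
y_pred_cls = [0, 1, 0, 0, 1]
accuracy = accuracy_score(y_true_cls, y_pred_cls)  # 4 / 5

# Regression: squared errors are 1, 0, and 4
y_true_reg = [3.0, 5.0, 2.0]
y_pred_reg = [2.0, 5.0, 4.0]
mse = mean_squared_error(y_true_reg, y_pred_reg)  # (1 + 0 + 4) / 3
```

In practice, the true values would be y_test and the predicted values would come from model.predict(X_test).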

🎯 Practical Exercise

Train a simple classifier on the Iris dataset using Scikit-Learn and evaluate its accuracy.
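
One possible starting point, offered as a scaffold rather than a full solution: load the Iris dataset and create a train/test split. The 80/20 split, the stratify option, and the random_state value are assumptions; choosing, training, and scoring the classifier is left to you.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Iris: 150 flowers, 4 measurements each, 3 species
X, y = load_iris(return_X_y=True)

# stratify=y keeps the species proportions the same in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Next steps: fit a classifier on (X_train, y_train),
# predict on X_test, and compare the predictions with y_test.
```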