Links and exercises for the course Practical Machine Learning, Green Data Science, 2nd semester 2024/2025
Instructor: Manuel Campagnolo, ISA/ULisboa
The course follows a mixed flipped classroom model: students are expected to work on the suggested topics autonomously before each class. Work outside class draws on a range of Machine Learning resources, including the book by Sebastian Raschka, Yuxi (Hayden) Liu, and Vahid Mirjalili, Machine Learning with PyTorch and Scikit-Learn, Packt Publishing, 2022. During classes, Python notebooks will be run on Google Colab.
Links for class resources:
Overview notebook: this notebook provides an overview of the full course and contains pointers to other sources of relevant information and Python scripts.
Sessions: Each description below includes the summary of the topics covered in the session, as well as the description of assignments and links to videos or other materials that students should work through.
The goal of the first class is to give an introduction to ML and also to show some of the problems that can be addressed with the techniques and tools that will be discussed during the semester. The examples will be run on Colab.
The goal of the following classes is to understand how ML models can be trained and used to solve regression and classification problems. We start by applying the machine learning approach to well-known statistical problems like linear regression, to illustrate the stepwise approach followed in ML. We use synthetic data generated from a linear or quadratic regression, where one can control the underlying model and the amount of noise. Then, we consider the Iris tabular data set, with 4 explanatory variables and a categorical label that can be one of three species.
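The two kinds of data mentioned above can be sketched as follows. This is a minimal illustration, not the course notebook: the helper name make_regression_data, the coefficients, the noise levels, and the seed are all assumptions chosen for the example, and the Iris data is loaded through scikit-learn's load_iris.

```python
import numpy as np
from sklearn.datasets import load_iris

rng = np.random.default_rng(0)  # seed chosen only for reproducibility


def make_regression_data(n_samples=100, noise=0.1, quadratic=False):
    """Hypothetical helper: draw (X, y) from a known model plus Gaussian noise."""
    X = rng.uniform(0.0, 1.0, size=(n_samples, 1))  # one feature in [0, 1)
    y = 2.0 + 3.0 * X[:, 0]                         # linear part (assumed coefficients)
    if quadratic:
        y = y + 1.5 * X[:, 0] ** 2                  # optional quadratic term (assumed)
    y = y + noise * rng.standard_normal(n_samples)  # noise level controlled by `noise`
    return X, y


# Synthetic regression data: the true model and the noise are under our control
X_lin, y_lin = make_regression_data(noise=0.1)
X_quad, y_quad = make_regression_data(noise=0.5, quadratic=True)

# Iris: 4 explanatory variables and a label with three possible species
iris = load_iris()
X_iris, y_iris = iris.data, iris.target
print(X_iris.shape, iris.target_names)
```

Because the generating model is known, the fitted coefficients can be compared directly with the true ones, which is not possible with real data such as Iris.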
Pseudo code for SGD (stochastic gradient descent) to fit a linear regression:
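Inputs: data with N observations and n features
Learning rate: a small positive value
Number of epochs
Initial weights (bias and one weight per feature): typically all zero
For each epoch:
    For each example (weights are updated after each example):
        Predict the response with the current weights
        Update the bias weight
        Update the weight of each feature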
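The update steps in the pseudo code can be written out explicitly. The formulas below assume a squared-error loss with a factor of 1/2 for each example and use the symbols b (bias), w_j (feature weights), and η (learning rate), which are notation introduced here and may differ from the course notebooks.

```latex
% Prediction and per-example loss for observation i (assumed squared-error loss)
\hat{y}_i = b + \sum_{j=1}^{n} w_j x_{ij},
\qquad
\ell_i = \tfrac{1}{2}\,(\hat{y}_i - y_i)^2

% SGD updates after each example, with learning rate \eta > 0
b \leftarrow b - \eta\,(\hat{y}_i - y_i),
\qquad
w_j \leftarrow w_j - \eta\,(\hat{y}_i - y_i)\,x_{ij}
\quad (j = 1, \dots, n)
```

Each update moves a weight against the gradient of the loss for that single example, which is why the bias update has no feature factor while each w_j update is scaled by x_ij.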
Create a LinearRegression class with a fit method that implements the pseudo code above. Add a predict method to your class to make new predictions with the fitted model. Test your class with the following example.
import numpy as np

# Create synthetic data
np.random.seed(0)
X = np.random.rand(100, 1) # array with 100 rows and 1 column (1 feature)
y = 2 + 3 * X + np.random.randn(100, 1) * 0.1
# Create and train the model
model = LinearRegression(learning_rate=0.1, max_iter=1000)
model.fit(X, y)
# Make predictions
X_test = np.array([[0.5]])
y_pred = model.predict(X_test)
print(f"Prediction for X=0.5: {y_pred[0]}")
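For reference, one possible way to write the LinearRegression class from the exercise is sketched below. It is only an illustration under some assumptions: a squared-error loss, examples visited in random order at each epoch, and an extra random_state parameter (not part of the exercise statement) to make the shuffling reproducible. Other designs are equally valid.

```python
import numpy as np


class LinearRegression:
    """Linear regression fitted by stochastic gradient descent (illustrative sketch)."""

    def __init__(self, learning_rate=0.01, max_iter=100, random_state=0):
        self.learning_rate = learning_rate  # small positive value
        self.max_iter = max_iter            # number of epochs
        self.random_state = random_state    # assumed extra parameter: seeds the shuffling
        self.weights = None
        self.bias = 0.0

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).ravel()  # accept targets of shape (N,) or (N, 1)
        n_samples, n_features = X.shape         # N observations, n features
        self.weights = np.zeros(n_features)     # typically, all zero
        self.bias = 0.0
        rng = np.random.default_rng(self.random_state)
        for _ in range(self.max_iter):
            # Update weights after each example, visiting examples in random order
            for i in rng.permutation(n_samples):
                y_hat = self.bias + X[i] @ self.weights            # predict with current weights
                error = y_hat - y[i]
                self.bias -= self.learning_rate * error            # update weight (bias)
                self.weights -= self.learning_rate * error * X[i]  # update weight (each feature)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return self.bias + X @ self.weights  # one prediction per row of X
```

With the synthetic data of the example above, the fitted bias and weight should end up close to the true values 2 and 3, so the prediction for X=0.5 should be close to 3.5. With a constant learning rate the estimates keep fluctuating slightly around the least-squares solution rather than converging exactly.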