Masters in Data Science applied to agricultural and food sciences, environment, and forestry engineering.
Instructor: Manuel Campagnolo (mlc@isa.ulisboa.pt)
Teaching assistant: Dominic Welsh (djwelsh@edu.ulisboa.pt)
CS50P | Contents | PP.fi | Contents |
---|---|---|---|
Lecture 0 | Creating Code with Python; Functions; Bugs; Strings and Parameters; Formatting Strings; More on Strings; Integers or int; Readability Wins; Float Basics; More on Floats; Def; Returning Values | Part 1 | Intro; I/O; More about variables; Arithmetic operations; Conditional statements |
Lecture 1 | Conditionals; if Statements; Control Flow; Modulo; Creating Our Own Parity Function; Pythonic; match | Part 2 | Programming terminology; More conditionals; Combining conditions; Simple loops |
Lecture 2 | Loops; While Loops; For Loops; Improving with User Input; More About Lists; Length; Dictionaries, More on code modularity | Part 3 | Loops with conditions; Working with strings; More loops; Defining functions |
| | | Part 4 | The Visual Studio Code editor, Python interpreter and built-in debugging tool; More functions; Lists; Definite iteration; Print statement formatting; More strings and lists |
| | | Part 5 | More lists; References; Dictionary; Tuple |
Lecture 3 | Exceptions, Runtime Errors, try, else, Creating a Function to Get an Integer, pass | Part 6 | Reading files; Writing files; Handling errors; Local and global variables |
Lecture 4 | Libraries, Random, Statistics, Command-Line Arguments, slice, Packages, APIs, Making Your Own Libraries | Part 7 | Modules; Randomness; Times and dates; Data processing; Creating your own modules; More Python features |
Lecture 5 | Unit Tests; assert; pytest; Testing Strings; Organizing Tests into Folders | ||
Lecture 6 | File I/O; open; with; CSV; Binary Files and PIL | ||
Lecture 7 | Regular Expressions; Case Sensitivity; Cleaning Up User Input; Extracting User Input | ||
Lecture 8 | Object-Oriented Programming; Classes; raise; Decorators; Class Methods; Static Methods; Inheritance; Inheritance and Exceptions; Operator Overloading | Part 8 | Objects and methods; Classes and objects; Defining classes; Defining methods; More examples of classes |
| | | Part 9 | Objects and references; Objects as attributes; Encapsulation; Scope of methods; Class attributes; More examples with classes |
| | | Part 10 | Class hierarchies; Access modifiers; Object oriented programming techniques; Developing a larger application |
Lecture 9 | set; Global Variables; Constants; Type Hints; Docstrings; argparse; Unpacking; args and kwargs; map; List Comprehensions; filter; Dictionary Comprehensions; enumerate; Generators and Iterators | Part 11 | List comprehensions; More comprehensions; Recursion; More recursion examples |
| | | Part 12 | Functions as arguments; Generators; Functional programming; Regular expressions |
- `code filename.py` to create a new file
- `ls` to list files in folder
- `cp filename newfilename` to copy a file, e.g. `cp ../hello.py farewell.py` (`..` represents the parent folder)
- `mv filename newfilename` to rename or move a file, e.g. `mv farewell.py goodbye.py` or `mv farewell.py ..` (move one folder up)
- `rm filename` to delete (remove) a file
- `mkdir foldername` to create a new folder
- `cd foldername` to change directory, e.g. `cd ..`
- `rmdir foldername` to delete a folder
- `clear` to clear the terminal window
- Strings (`str`), variables, `print` (a function), parameters (e.g. `end=`), `input`, comments, formatted strings (`f"..."`), `.strip()`, `.title()` (methods)
- Integers (`int`), operations for integers, casting (e.g. `str` to `int`)
- Floats (`float`), `round`, format floats (e.g. `f"{z:.2f}"`)
- `True`, `False`, `and`, `or`, `not`
- `def`, `return`
- `if`, `elif`, `else`:
if score >= 70:
    print("Grade: C to A")
elif score >= 60:
    print("Grade: D")
else:
    print("Grade: F")
- `match`:
match species:
    case 'versicolor':
        label=0
    case 'virginica':
        label=1
    case _:
        label=2
- `def main()`, define other functions, call `main()`. The code must be modular.
- `break`, `break` and `return`
- `[]`: methods `append`, `extend`
- `{}`, `items()`, keys `.keys()` and values `.values()`
knights = {'gallahad': 'the pure', 'robin': 'the brave'}
for k, v in knights.items():
    print(k, v)
if 'gallahad' in knights:
    print('Go Gallahad')
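As a small illustration of the list and dictionary methods listed above, the following sketch contrasts `append` with `extend` and extracts the keys and values of a dictionary. The variable names (`species`, `labels`) are made up for this example.

```python
# append adds a single element; extend adds every element of another iterable
species = ['setosa']
species.append('versicolor')
species.extend(['virginica', 'other'])

# a dictionary maps keys to values
labels = {'versicolor': 0, 'virginica': 1}
labels['other'] = 2                 # add a new key-value pair

keys = list(labels.keys())          # keys, in insertion order
values = list(labels.values())      # values, in the same order
print(species)
print(keys, values)
```

Note that `species.append(['virginica', 'other'])` would instead add the whole list as one single element.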
Exercises from CS50 Problem set 0, 1 and 2.
Handling exceptions in Python: raising and catching exceptions.
def main():
    spacecraft = input("Enter a spacecraft: ")
    au=get_au(spacecraft)
    m = convert(au)
    print(f"{m} m")
For the fuel gauge problem (https://cs50.harvard.edu/python/2022/psets/3/fuel/), try to organize your code as follows. As suggested in the hints, you should catch `ValueError` and `ZeroDivisionError` exceptions in your code. In the code below, the user is asked for values of `x,y` until they satisfy the requirements: `x,y` must be entered as a string `x/y`, `x` has to be less than or equal to `y`, and `y` cannot be zero. The function `get_string_of_integers_X_less_than_Y` in the code below should take care of that.
def main():
    # asks user for input until the input is as expected
    x,y=get_string_of_integers_X_less_than_Y()
    # compute percentage from two integers
    p=compute_percentage(x,y)
    # print output
    print_gauge(p)
Example of basic use of `try`-`except` to catch a `ValueError`:
try:
    x = int(input("What's x?"))
except ValueError:
    print("x is not an integer")
else:
    print(f"x is {x}")
Function for requesting an integer from the user until no exceptions are caught:
def get_int():
    while True:
        try:
            x = int(input("What's x?"))
        except ValueError:
            print("x is not an integer")
        else:
            break
    return x
We may want to exit the execution of our script if some exception is caught. This can be done with `sys.exit()`, which can also be used to print a message.
import sys # import module
try:
    x = int(input("What's x?"))
except ValueError:
    sys.exit("x is not an integer")
Example of code that catches `CTRL-C` or `CTRL-D`:
while True:
    try:
        x=int(input())
    except ValueError:
        print('x is not integer')
    except KeyboardInterrupt: #CTRL-C
        print('\n KeyboardInterrupt')
        break
    except EOFError: # CTRL-D
        print('\n EOFError')
        break
    else:
        print(x)
For a list of Python Built-in Exceptions, you can explore (https://www.w3schools.com/python/python_ref_exceptions.asp)
You can store your own functions in modules (which are just Python scripts) and import them into your main code. Let’s imagine you created a file named `mymodule.py` in a given folder. In your main script, you can import the file if the folder belongs to the list of folders where the Python interpreter looks for modules. You can check that list by running the following lines of code in the Python interpreter:
>>>import sys
>>>sys.path
If the folder where `mymodule.py` was created does not belong to that list, you can add it with `sys.path.append`, which allows you to import your module. To that end, you can add the following lines to your main script:
import sys
sys.path.append(r'path-to-folder') # folder where mymodule is
import mymodule
where path-to-folder
is the path that you can easily copy in your IDE.
If your module includes a function named, say, get_integer
, you can then use the function in your main script either by calling mymodule.get_integer()
or you can instead load the function with from mymodule import get_integer
and then just call it with get_integer()
in the main script as in the following script.
import sys
sys.path.append(r'/workspaces/8834091/modules') # where file mymodule.py is
from mymodule import get_integer
def main():
    x=get_integer()
    print(x)
main()
Contents of mymodule.py
:
import sys
def get_integer() -> int:
    while True:
        try:
            return(int(input('type a number: ')))
        except ValueError:
            print('not an integer number: try again')
        except KeyboardInterrupt: #CTRL-C
            print('\n If you want to exit type CTRL-D')
        except EOFError: # CTRL-D
            sys.exit('\n exit as requested')
Often, you import a package that is installed with `pip` (https://pypi.org/project/pip/). If a package, say `requests`, is not already available, you can typically install it from your terminal with
$pip install requests
and then import it in your main script with `import requests`. Some modules belong to the Python standard library and need no installation. This is the case of the module `random`, which provides a series of functions for sampling, shuffling, and extracting random numbers from a variety of probability distributions: it is loaded directly with `import random`. If you want to know which is the folder where the module is located, you can get that information with `random.__file__`.
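A minimal sketch of a few functions from the standard-library module `random` is shown below. The seed value and the list `cards` are arbitrary choices for the illustration.

```python
import random

random.seed(42)                         # fix the seed so that the run is reproducible
sample = random.sample(range(100), 5)   # 5 distinct integers taken from 0..99
die = random.randint(1, 6)              # integer between 1 and 6, both included
cards = ['A', 'K', 'Q', 'J']
random.shuffle(cards)                   # shuffles the list in place
u = random.uniform(0, 1)                # float drawn from a uniform distribution
print(sample, die, cards, u)
print(random.__file__)                  # location of the module on disk
```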
sys.argv
Previously, we used the module `sys`, in particular the function `sys.exit()` and the list `sys.path`. Another useful attribute is `sys.argv`, a list that gives you access to what the user typed at the command line `$`, as in
import sys
print(len(sys.argv)) # returns the number of words in the command line after $python
print(sys.argv[1]) # returns the 2nd word, i.e., the first word after $python myscript.py
For instance, the following script named sum.py
prints the sum of two numbers that were specified in the command line with $python sum.py 1.2 4.3
:
import sys
try:
    x,y = float(sys.argv[1]), float(sys.argv[2])
    print('the sum is',x+y)
except IndexError:
    print('missing argument')
except ValueError:
    print('The arguments are not numbers')
Application programming interfaces (APIs) allow you to communicate with a remote server. For instance, `requests` is a package that allows your program to behave as a web browser would. Consider the following script `myrequest.py` that allows you to explore the iTunes database (https://performance-partners.apple.com/search-api):
import requests
import sys
try:
    response = requests.get("https://itunes.apple.com/search?entity=song&limit=1&term=" + sys.argv[1])
    print(response.json())
except IndexError:
    sys.exit('Missing argument')
except requests.RequestException:
    sys.exit('Request failed')
You can easily adapt that code to access a different database. For instance if you want to explore the GBIF database (https://data-blog.gbif.org/post/gbif-api-beginners-guide/), you can just replace the main line of code in myrequest.py
with
response=requests.get('https://api.gbif.org/v1/species/match?name='+ sys.argv[1])
and execute it with, say, $python myrequest.py Tracheophyta
in the terminal.
There are many ways of using an API in Python. The following example shows how you can access satellite imagery through the Google Earth Engine API and compute the mean land surface temperature at some location from the MODIS MOD11A1 product. To be able to use the API, you need a Google account and an Earth Engine project associated with it.
# pip install earthengine-api
import ee
# Trigger the authentication flow.
ee.Authenticate()
# Initialize the library.
ee.Initialize(project='project-name') # e.g. 'ee-my-mlc-math-isa-utl'
# Import the MODIS land surface temperature collection.
lst = ee.ImageCollection('MODIS/006/MOD11A1')
# Selection of appropriate bands and dates for LST.
lst = lst.select('LST_Day_1km', 'QC_Day').filterDate('2020-01-01', '2024-01-01')
# Define the urban location of interest as a point near Lyon, France.
u_lon = 4.8148
u_lat = 45.7758
u_poi = ee.Geometry.Point(u_lon, u_lat)
scale = 1000 # scale in meters
# Calculate and print the mean value of the LST collection at the point.
lst_urban_point = lst.mean().sample(u_poi, scale).first().get('LST_Day_1km').getInfo()
print('Average daytime LST at urban point:', round(lst_urban_point*0.02 -273.15, 2), '°C')
Solve problems from CS50P Problem_set_4. In particular, for problem Bitcoin price index organize your code so the main function is the following:
def main():
    x=read_command_line_input()
    price=get_bitcoin_price()
    print(f"${x*price:,.4f}")
A virtual environment (https://docs.python.org/3/library/venv.html) is:
- Contained in a directory, conventionally named `.venv` or `venv` in the project directory, or under a container directory for lots of virtual environments.
- Not checked into source control systems such as Git.
- Considered as disposable – it should be simple to delete and recreate it from scratch. You don’t place any project code in the environment.
- Not considered as movable or copyable – you just recreate the same environment in the target location.
In your system you have the base environment by default, and you can create one or more virtual environments. Below, we describe how to create a virtual environment and how to activate it, so your commands in the terminal are interpreted within that environment. That allows you to encapsulate in each virtual environment you create a given Python version, and a set of Python packages with their given versions. Your data and script files remain in the usual working folders: they should not be moved to the folders where the virtual environment files are stored.
The following commands work in the CS50 codespace that runs Linux (check with $cat /etc/os-release
in the terminal). Some need to be slightly adapted for Windows.
Firstly, let’s check what are the available packages and their versions in the base environment, and also let’s get extra information about the package requests
(e.g. dependencies):
$ pip list
$ pip show requests
Next, let’s create a virtual environment. One can first create (with `mkdir`) a folder called, say, `my_venvs` so all the virtual environments are created in that folder. This makes sense since virtual environment folders are created independently from the working folders that contain data and scripts. The virtual environment `myvenv` can then be created with:
my_venvs/ $ python3 -m venv myvenv # creates environment called myvenv with Python 3
In case one needs to delete the virtual environment, one just needs to delete the folder. This can be done with $ sudo rm -rf myvenv
in the terminal (Linux). After the virtual environment has been created, one needs to activate it. In Linux, this is done by executing activate
which lies in the bin
folder of the virtual environment:
my_venvs/ $ source myvenv/bin/activate # note that activate needs to be sourced
As a result, the prompt shows (myvenv) my_venvs/ $
which indicates that myvenv
is now activated. One can check the Python version with $python -V
. To de-activate a virtual environment, the command is $ deactivate
. With the environment activated, let’s try to install a few packages, specifying the versions. For instance, install the following packages.
(myvenv) my_venvs/ $ pip install random11==0.0.1
(myvenv) my_venvs/ $ pip install geopy==1.23.0
(myvenv) my_venvs/ $ pip install requests==2.25.0
Some of these packages depend on additional packages that are installed automatically. To list all installed packages within the environment `myvenv` one can execute `(myvenv) $ pip list` as before. Compare the version of `requests` in `myvenv` with the version returned initially in the base environment: this one is 2.25.0 while the one in the base environment is more recent. One can also check where `requests` is installed in `myvenv` with the command `(myvenv) $ pip show requests`.
Check the system path (where Python will look for installed packages) by executing print(sys.path)
: one can do this from the terminal with the command
(myvenv) my_venvs/ $ python -c 'import sys; print(sys.path)'
Notice that the folder in myvenv
where the virtual environment packages are installed is listed, but the path to where base packages are stored is not.
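The same check can be made from within a script. The sketch below relies on the standard attributes `sys.prefix` and `sys.base_prefix`, which differ when the interpreter runs inside a virtual environment; the helper name `in_virtual_environment` is made up for this example.

```python
import sys

def in_virtual_environment() -> bool:
    """Return True if the interpreter runs inside a virtual environment."""
    # In a virtual environment, sys.prefix points to the environment folder,
    # while sys.base_prefix still points to the base Python installation.
    return sys.prefix != sys.base_prefix

print('prefix:', sys.prefix)
print('base prefix:', sys.base_prefix)
print('inside a virtual environment:', in_virtual_environment())
```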
If one wishes to share a virtual environment, the way to do that is to share a file (typically, requirements.txt
) that allows a collaborator to re-create the environment. requirements.txt
stores the information about the installed packages in a file in case one intends to share the environment (e.g. in GitHub). Towards that end, one needs to create requirements.txt
with the packages names and versions, that can be used to create a clone of the environment on another machine. This is done, still within myvenv
(i.e. with myvenv
activated) with the following command:
(myvenv) my_venvs/ $ pip freeze > requirements.txt
Note that the file requirements.txt
is created in the folder that contains myvenv
and not within myvenv
itself: this makes sense, since one does not want to store scripts or data within myvenv
but just packages and the Python version. Since requirements.txt
is now available, one can create a copy of myvenv
called, say, myvenv2
. Firstly, one needs to de-activate myvenv
. Then, the commands to be executed in the terminal are:
my_venvs/ $ python3 -m venv myvenv2 # create new virtual environment called myvenv2 with the Python 3 interpreter
my_venvs/ $ source myvenv2/bin/activate # activate myvenv2
(myvenv2) my_venvs/ $ pip install -r requirements.txt # install packages and versions listed in requirements.txt
Exercise: go back to myvenv
, add package (say, emoji==0.1.0
), re-build requirements.txt
, and create new environment myvenv3
and install the set of packages listed in the new requirements.txt
.
As discussed in (https://cs50.harvard.edu/python/2022/notes/6/), `open` is a functionality built into Python that allows you to open a file and utilize it in your program. The `open` function allows you to open a file such that you can read from it or write to it. In its most basic use, `open` enables file I/O with respect to a given file. In the example below, `w` is the argument value that indicates that the file is opened in writing mode. Opening a file in this mode deletes its previous contents, so after `file.write(...)` the file contains only what was just written.
name='Bob'
file = open("names.txt", "w")
file.write(name)
file.close()
As an alternative, if the goal is to add new contents to the file, which is appended to the existent content, then w
should be replaced by a
(append). Each call to file.write(name)
will then add the value of name
to the end of file
.
Instead of explicitly opening and closing a file, it’s simpler to use the so-called context manager in Python, using the keyword with
, which automatically closes the file:
with open("names.txt", "w") as f:
    f.write(name)
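To compare the writing and appending modes, here is a minimal sketch that writes to a file in a temporary folder (an assumption made so that no existing file is overwritten) and then reads the result back.

```python
import os
import tempfile

# hypothetical file location inside a fresh temporary folder
path = os.path.join(tempfile.mkdtemp(), "names.txt")

with open(path, "w") as f:    # "w": the file is created or emptied
    f.write("Bob\n")
with open(path, "w") as f:    # opening again with "w" discards "Bob"
    f.write("Alice\n")
with open(path, "a") as f:    # "a": new content goes after the existing content
    f.write("Carol\n")

with open(path, "r") as f:
    contents = f.read()
print(contents)               # Alice and Carol, one name per line
```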
If one wishes to read from a file, then the file has to be opened in reading mode as in the following example. The method `readlines` reads all lines of the file, and stores them in a list, where each element of the list is the contents of the corresponding line.
with open("names.txt", "r") as f:
    L=f.readlines()
However, it is possible to read one line at the time:
with open('myfile.txt','r') as f:
    N=0
    for line in f:
        N+=1
print('number of lines', N)
As an alternative, this can be done with method `readline`. This can be included in a loop to read the whole file. Notice that when the end of the file is reached, `readline` returns the empty string, and this can be easily tested with a condition.
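A possible sketch of such a loop is given below. It first creates a small file in a temporary folder (a hypothetical file, used only for the illustration) and then counts its lines with `readline`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "myfile.txt")   # hypothetical file
with open(path, "w") as f:
    f.write("first\n\nthird\n")       # the second line is blank

n_lines = 0
with open(path, "r") as f:
    while True:
        line = f.readline()
        if line == "":               # empty string: the end of the file was reached
            break
        n_lines += 1                  # a blank line is "\n", not the empty string
print('number of lines', n_lines)
```

The blank line in the middle does not stop the loop, since it is read as `"\n"`.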
Reading a file in Python gives the flexibility of visiting any position in the file. The initial position is 0 by default but can be set with `f.seek(n)`. Then, `f.read(10)` for instance reads 10 characters from that position. Method `f.tell()` returns the current position in the file.
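The following sketch illustrates `seek`, `read` and `tell` on a small temporary file. As an assumption for this example, the file is opened in binary mode, where positions are plain byte offsets; in text mode the values returned by `tell` should be treated as opaque.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "positions.bin")   # hypothetical file
with open(path, "wb") as f:
    f.write(b"0123456789abcdefghij")

with open(path, "rb") as f:
    start = f.tell()       # position right after opening the file
    f.seek(5)              # move to byte offset 5
    chunk = f.read(10)     # read 10 bytes from that position
    position = f.tell()    # position after reading
print(start, chunk, position)
```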
A file can be of type text (human-readable) or binary. Binary files, like images for instance, are read with `with open('myfile.txt','rb') as f`.
Exercise: Consider the file rural_fires.csv, downloaded from INE (the Portuguese Institute of Statistics), about causes of fires by geographical location. The source is INE: “Rural fires (No.) by Geographic localization (NUTS - 2013) and Cause of fire; Annual” for 2023. Write a script to read the file and exclude the lines which are not formatted as a table (header lines). The formatted lines should be written into a new file, say `table_rural_fires.csv`.
with open('rural_fires.csv','rb') as f: # binary mode: each line is a bytes object
    with open('table_rural_fires.csv','wb') as fw: # write the bytes back unchanged
        for line in f:
            if line[:1] in [b'1', b'2', b'3']: # or smth like line.startswith(b'1')
                fw.write(line)
Since the file contains non ASCII characters, one might want to try to decode those characters correctly. Note that Python provides methods encode
and decode
as in the example below.
str_original = 'ção'
bytes_encoded = str_original.encode(encoding='utf-8')
print(type(bytes_encoded))
str_decoded = bytes_encoded.decode()
print(type(str_decoded))
print('Encoded bytes =', bytes_encoded)
print('Decoded String =', str_decoded)
print('str_original equals str_decoded =', str_original == str_decoded)
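When a file is opened in text mode, the decoding can be delegated to `open` through its `encoding` parameter. The sketch below writes a short string to a temporary file (a hypothetical file, used only for the illustration) and reads it back as bytes, with the right encoding, and with a wrong one.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "encoded.txt")   # hypothetical file
text = 'ção'

with open(path, 'w', encoding='utf-8') as f:     # text is encoded as UTF-8 on writing
    f.write(text)

with open(path, 'rb') as f:                      # raw bytes, no decoding
    raw = f.read()
with open(path, 'r', encoding='utf-8') as f:     # decoded with the right encoding
    decoded = f.read()
with open(path, 'r', encoding='latin-1') as f:   # decoded with a wrong encoding
    wrong = f.read()

print(len(text), 'characters stored as', len(raw), 'bytes')
print('utf-8 recovers the text:', decoded == text)
print('latin-1 recovers the text:', wrong == text)
```

The encoding of a given file is not always known in advance, so `utf-8` here is only an assumption that holds for the file created in this example.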
Pandas dataframes have an intrinsic tabular structure represented by rows and columns where each row and column has a unique label (name) and position number inside the dataframe. The row labels, called dataframe index, can be integer numbers or string values, the column labels, called column names, are usually strings. Use the following script to create a dataframe with random values. Notice the terminology for rows (index
) and columns (columns
).
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(6, 4), index=list('abcdef'), columns=list('ABCD'))
print(df)
Exercises:
- Get the column names of `df` with `.columns`.
- Get the `Series` that corresponds to column `A` with `['A']`.
- Select columns `A` and `C` with `[['A','C']]`.
Notice that `.columns` returns a `pd.Index` object. This is to provide extra functionality and performance compared to lists. To extract a list of names, one can use `.columns.tolist()`; `.columns.values` returns a NumPy array instead.
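As a short illustration of these selections, the sketch below rebuilds the random dataframe and checks the type of each result. The variable names `names`, `col_a` and `sub` are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(6, 4), index=list('abcdef'), columns=list('ABCD'))

names = df.columns.tolist()    # plain Python list with the column names
col_a = df['A']                # single brackets: a Series
sub = df[['A', 'C']]           # double brackets: a DataFrame with two columns

print(type(df.columns).__name__, names)
print(type(col_a).__name__, type(sub).__name__, sub.shape)
```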
Consider the dataset that describes 517 fires from the Montesinho natural park in Portugal. For each incident, the weekday, month, coordinates, and burnt area are recorded, as well as several meteorological variables such as rain, temperature, humidity, and wind (https://www.kaggle.com/datasets/vikasukani/forest-firearea-datasets). For reference, a copy of the file is available in forestfires.csv. The variables are:
The goal is to download the file and use package `Pandas` to explore it and solve the following tasks.
- Read the file with `pd.read_csv` into a new object `fires`, and show the first 10 rows with `fires.head(10)`.
- Check the data types of the columns with `.dtypes`.
- Get a summary of the dataframe with `.info()`.
- Create a `Series` with the temperature values for all 517 fires.
- Create a `DataFrame` just with columns `month` and `day`.
- Select rows with conditions: each condition goes within `(...)` and conditions can be connected with `&` or `|` or negated with `~`.
- Select rows whose values belong to a list with `.isin()`.
- Check for `Null` values in the dataframe with `.notna()`. You can sum along columns with `.sum()`.
These are operators to select rows and columns from a dataframe. `loc` selects rows and columns using the row and column names. `iloc` uses the positions in the table. Notice that new values can be assigned to selections defined with `loc` and `iloc`.
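The difference between the two operators, and the assignment of new values, can be sketched on a toy dataframe. The dataframe `toy` and its values are made up for this example and only stand in for the fires data.

```python
import pandas as pd

# toy dataframe standing in for the fires data (values are made up)
toy = pd.DataFrame({'month': ['aug', 'sep', 'mar'],
                    'temp': [22.0, 18.5, 10.1],
                    'rain': [0.0, 0.2, 1.4]},
                   index=['f1', 'f2', 'f3'])

by_label = toy.loc['f2', 'temp']     # row named 'f2', column named 'temp'
by_position = toy.iloc[1, 1]         # second row, second column: the same cell

toy.loc['f3', 'rain'] = 0.0          # assign through a label-based selection
toy.iloc[0, 1] = 25.0                # assign through a positional selection
print(by_label, by_position)
print(toy)
```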
- `fires.iloc[0:3,2:4]`
- Use `loc` and `.isin()` to select fires from August and September and just the FWI based variables values for those fires.
- Use `iloc` to select the first 20 fires and just the FWI based variables values.
There are several possibilities to combine positional and label-based indexing:
- (`iloc`) Using `df.columns.get_loc()`, which converts the name of one column into its position. Then `iloc` can be used to perform the selection. For multiple columns determined by a list of column names, one can use instead `df.columns.get_indexer()`. Example: Use `iloc` to select the first 20 fires and just the FWI based variables values, using the names rather than the positions of those variables. Solution: `FWI_positions=fires.columns.get_indexer(['FFMC','DMC','DC','ISI'])` and `fires.iloc[0:20,FWI_positions]`
- (`loc`) Using `df.index[]` to extract the index names. Then, `loc` can be used to perform the selection. Solution: `fires.loc[fires.index[0:20], ['FFMC', 'DMC', 'DC', 'ISI']]`.
Exporting is done with operations named `.to_...` as listed in (https://pandas.pydata.org/docs/user_guide/io.html)
- `.to_excel("filename.xlsx", sheet_name="fires", index=False)`
- `pd.read_excel("filename.xlsx", sheet_name="fires")`
- Create a dataframe `months_df` from a dictionary: for instance create a dictionary where keys are `jan`, `feb`, `mar`, for all 12 months, and the values are `January`, `February`, `March` and so on.
month_data = {
    'Month': [
        'January', 'February', 'March', 'April', 'May', 'June',
        'July', 'August', 'September', 'October', 'November', 'December'
    ],
    'mth': [
        'jan', 'feb', 'mar', 'apr', 'may', 'jun',
        'jul', 'aug', 'sep', 'oct', 'nov', 'dec'
    ]
}
months_df = pd.DataFrame(month_data)
merged_df = pd.merge(fires, months_df, left_on='month', right_on='mth', how='left')
merged_df.drop(columns='mth', inplace=True)
Create a jupyter notebook for this class. If you’re using your CS50 codespace, create a new file in the terminal with $code mynotebook.ipynb
and follow the suggestions for jupyter notebooks in your codespace session.
There are many available cheatsheets for Pandas that can help visualizing Pandas’ functionalities. Since there are many possibilities, a single page cheatsheet is either too limited or too cryptic. This 12-page cheatsheet is pretty self-contained and includes several examples.
- Group the `fires` dataframe with method `.groupby` to get just one row per month, and average temperature, average RH, and number of fires per month. The goal is to create a dataframe named `firesbymonth` with columns `avg_temp`, `avg_RH` and `fire_count`. See (https://pandas.pydata.org/docs/user_guide/groupby.html).
- What changes if you add `.reset_index()` to the previous command?
- Sort `firesbymonth`, such that the 12 rows are ordered by month correctly: `jan`, `feb`, `mar`, and so on.
- Create a column `conditions` in `firesbymonth` of type string that indicates if a month is `dry&hot`, `dry&cold`, `wet&hot` or `wet&cold`. Use the mean values of `avg_temp` and `avg_RH` to establish the appropriate thresholds. Use method `.apply` and define the function to apply with `lambda`.
- Convert `fires` into a two-way table that shows the total area of fires per day of the week and per month, where `NaN` are replaced by 0. Towards that end, explore the `.pivot_table` method.
Suppose that one wants to write a script in Python using classes to monitor plants at a nursery. Initially plants grow from seeds in trays and one wants to keep track of the trays and number of plants per tray. All plants in a given tray are from the same species. Then, at some point, some plants are transferred from trays to individual pots (one plant per pot). At the end, pots are sold. One wants to track the number of plants of each species that are in the nursery.
For this type of problem, one wants to mimic entities of the real world (plants, trays, pots, and the nursery) as objects in Python code. Object-oriented programming is an intuitive form of doing so. A class in Python is an object constructor, or a blueprint for creating objects.
The simplest example of class, with very little functionality, is a class to store constant values, which are not supposed to change. The two properties `MAX_PLANTS_PER_TRAY` and `SALE_PRICE` of the class `Constants` defined below can be read directly from the class, without creating an instance.
class Constants:
    MAX_PLANTS_PER_TRAY=50
    SALE_PRICE=10
print(Constants.SALE_PRICE)
However, in general we intend to call the class to create one instance (one object) of the class and set the properties of that object. To indicate the values of the instance properties we use the `__init__` method:
class Plant:
    def __init__(self, species):
        self.species = species
my_plant=Plant("Rose") # create instance where property `species` has value `Rose`
print(my_plant.species)
Alternatively, a class can be created with the @dataclass
decorator, see (https://docs.python.org/3/library/dataclasses.html). In this case, the __init__
method is set automatically.
from dataclasses import dataclass
@dataclass
class Plant:
    species: str
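As a quick check of what the decorator provides, the sketch below creates two instances and relies only on methods that `@dataclass` generates automatically (`__init__`, `__repr__` and `__eq__`). The variable names are made up for this example.

```python
from dataclasses import dataclass

@dataclass
class Plant:
    species: str

rose = Plant("Rose")            # uses the generated __init__
same = Plant("Rose")
print(rose)                     # uses the generated __repr__
print(rose == same)             # generated __eq__ compares the field values
print(rose is same)             # still two distinct objects
```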
A class can have methods, which are functions defined for objects of the class. In the example below, Tray
is a class with properties species
and number_of_plants
, and methods remove_plants
and is_empty
. The first has one argument which is the number of plants to remove from the tray; it returns a list of objects of the class Plant
which correspond to the plants that were removed from the tray. The method is_empty
doesn’t have an argument and returns True
or False
.
from dataclasses import dataclass
@dataclass
class Plant:
    species: str
@dataclass
class Tray:
    species: str
    number_of_plants: int
    def remove_plants(self, number): # self refers to the object of the class
        number=min(number,self.number_of_plants) #cannot remove more than available
        self.number_of_plants -= number
        return [Plant(self.species) for _ in range(number)] # returns list of instances of Plant
    def is_empty(self): # returns True or False
        return self.number_of_plants == 0
tray=Tray('Lily', 28)
plants=tray.remove_plants(10)
if tray.is_empty():
    print('The tray is empty')
else:
    print('There are still', tray.number_of_plants, tray.species, 'plants in the tray')
first_plant=plants[0]
print('The plant removed is', first_plant.species)
The code for the full problem that involves plants of several species, trays, pots and sales can be organized in the following manner:
- Plant class: Simple class to represent a plant with a species.
- Pot class: Holds one plant each.
- Tray class: Holds plants of a single species and can remove plants.
- Nursery class: Manages trays, pots, and keeps track of plant counts by species. It has methods like add_tray, transfer_to_pots, and sell_pot to handle operations for tracking and updating counts.
- `__init__`. Start with a simplified version of the problem where there are only trays and plants of distinct species in the nursery, which can be represented with 3 classes: `Plant`, `Tray` and `Nursery`. Trays can be created with a given number of plants of the same species, and plants can be removed from trays. The goal in this simplified version is to create the inventory that keeps track of the number of plants of each species that are in trays.
One possible solution for this simplified problem that was generated by ChatGPT when asked not to use `@dataclass` is nursery_v1.py. Note that this code lacks the `__str__` or `__repr__` methods and therefore `print(nursery.trays)` prints a list of objects with their memory address.
- Add a `__repr__` method similar to the one below to class `Tray` to redefine the output of `print(nursery.trays)` and make it more informative.
def __repr__(self):
    return f"Tray(species={self.species}, count={self.count})"
Add to the previous script a class that represents pots and adapt your script accordingly. When plants are removed from trays, they are always placed in a pot (one plant per pot). The goal is that the inventory tracks the plants and the species in both trays and pots (instead of just in trays as in nursery_v1.py).
Finally, consider that pots can be sold and therefore removed from the inventory.
Verify if your script removes trays that are empty from the inventory, and update it if it is not the case.
The four main concepts of Object-Oriented Programming (OOP) are Encapsulation, Abstraction, Inheritance, and Polymorphism. These concepts work together to create modular, scalable, and maintainable code in object-oriented programming.
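A compact sketch of how these four concepts can show up in the nursery setting is given below. The subclass `Cactus` and the method `care` are hypothetical names introduced only for this illustration; they are not part of the nursery scripts.

```python
class Plant:
    def __init__(self, species):
        self._species = species      # encapsulation: leading underscore marks it as internal

    @property
    def species(self):               # read-only access to the internal attribute
        return self._species

    def care(self):                  # generic behaviour shared by all plants
        return f"Water the {self.species}"

class Cactus(Plant):                 # inheritance: a Cactus is a kind of Plant
    def care(self):                  # polymorphism: same method name, different behaviour
        return f"Water the {self.species} sparingly"

plants = [Plant("Lily"), Cactus("Opuntia")]
# abstraction: the caller only needs to know that every plant has care()
instructions = [p.care() for p in plants]
print(instructions)
```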
This is a central topic in computer science, and therefore you can find all kinds of resources about it. Among them, you can find simple descriptions of those concepts, with examples, at the following links:
Building on the plant nursery example of last class, the following scripts illustrate the implementation of those concepts:
The next assignment will be the Cookie jar problem described at (https://cs50.harvard.edu/python/2022/psets/8/jar/). You will need to create a script for the problem and test it with check50 cs50/problems/2022/python/jar
.
This topic corresponds to Section 5 of the CS50 course: you can find the necessary resources on that link. In particular, see the short https://cs50.harvard.edu/python/2022/shorts/pytest/.
The idea is to create functions in Python (the names of those functions start with `test_`) that are used to test existing functions or classes in the script. To execute the test functions we call `pytest` (https://docs.pytest.org/) in the terminal instead of `python`:
$ pytest -v # -v is optional for a more verbose output
If no arguments are given, `pytest` will execute all functions whose names start with `test_` in files named `test_*.py` or `*_test.py` in the current directory and all its subdirectories. However, `$pytest my_file.py` will only execute the tests within that file. Moreover, `$pytest my_directory` will only execute the tests defined in files located in that directory. There are further options to select the tests to be executed with `pytest`.
Consider you have two Python modules: one with the definition of a class and the other that implements tests over that class.
# farm_carbon_footprint.py
import math
class Farm:
    def __init__(self, name, area_hectares):
        """Initialize the farm with a name and area in hectares."""
        self.name = name
        self.area_hectares = area_hectares
        self.activities = []
    def add_activity(self, activity, emissions_per_unit, units):
        """Add an activity with emissions in kg CO2e per unit and units."""
        self.activities.append((activity, emissions_per_unit, units))
    def total_emissions(self):
        """Calculate total carbon emissions from all activities."""
        return sum(emissions_per_unit * units for _, emissions_per_unit, units in self.activities)
    def emissions_per_hectare(self):
        """Calculate carbon emissions per hectare."""
        if self.area_hectares == 0:
            raise ValueError("Farm area cannot be zero.")
        return self.total_emissions() / self.area_hectares
    def radius_circle_with_farm_area(self):
        """ Calculate the radius (in meters) of a circle that has the same area as the farm"""
        return(math.sqrt(self.area_hectares/math.pi)*100) # 1 hectare = 10,000 square meters
and
# test_farm_carbon_footprint.py
import pytest
from farm_carbon_footprint import Farm
def test_add_activity():
farm = Farm("Green Pastures", 10)
farm.add_activity("Tractor Usage", 50, 5) # 50 kg CO2e per hour, 5 hours
farm.add_activity("Fertilizer Use", 10, 20) # 10 kg CO2e per kg, 20 kg
assert len(farm.activities) == 2
def test_total_emissions():
farm = Farm("Green Pastures", 10)
farm.add_activity("Tractor Usage", 50, 5) # 50 kg CO2e per hour, 5 hours
farm.add_activity("Fertilizer Use", 10, 20) # 10 kg CO2e per kg, 20 kg
assert farm.total_emissions() == 450 # 250 + 200
def test_emissions_per_hectare():
farm = Farm("Green Pastures", 10)
farm.add_activity("Tractor Usage", 50, 5) # 50 kg CO2e per hour, 5 hours
farm.add_activity("Fertilizer Use", 10, 20) # 10 kg CO2e per kg, 20 kg
assert farm.emissions_per_hectare() == 45 # 450 total / 10 hectares
def test_emissions_per_hectare_zero_area():
farm = Farm("Tiny Farm", 0)
farm.add_activity("Tractor Usage", 50, 2) # 50 kg CO2e per hour, 2 hours
with pytest.raises(ValueError, match="Farm area cannot be zero."): # optional: matches Value Error message in emissions_per_hectare()
farm.emissions_per_hectare()
def test_radius_of_circle_with_farm_area():
farm = Farm("Circle Farm", 1)
assert farm.radius_circle_with_farm_area() == pytest.approx(56.38, abs=0.1)
farm = Farm("Circle Farm", 10)
assert farm.radius_circle_with_farm_area() == pytest.approx(178.3, abs=0.01)
Adapt the `Farm` class definition and `test_farm_carbon_footprint.py` in order to:
- Add a method `.number_of_activities()` to class `Farm` that returns the number of activities. Check the correctness of that method with a new test in `test_farm_carbon_footprint.py`.
- Modify the `Farm` class so that a `ValueError` is raised if `area_hectares` is negative when you try to create an instance of `Farm`. Check with a new test in `test_farm_carbon_footprint.py` that the behavior of the class is as expected when `area_hectares` is negative.

The packing/unpacking operators allow us to deal with structures of variable length. The example below illustrates packing several numbers into a list.
x=[1,2,3,4,5,6,7,8,9]
a,*b,c=x # b is the list [2,3,4,5,6,7,8]
print(a,b,c)
The same operator can be used to unpack:
list1=[1,2,3]
list2=[6,7,8]
new_list=[*list1,4,5,*list2] # values are unpacked
print(new_list)
The `*` and `**` operators are mostly used in the definition of functions that accept a variable number of arguments (like `print`): the operator `*` packs all positional arguments into a tuple and the operator `**` packs all named arguments into a dictionary. In the example below, the variable `kwargs` refers to keyword arguments (i.e. named arguments). Note that a function can combine regular arguments, regular named arguments, `*args`, and `**kwargs`, as long as keyword arguments follow positional arguments.
def pack(*args, **kwargs):
    return args,kwargs
x,y=pack(1,2,10, num_years=10,rate=0.03)
print('Positional arguments are packed into tuple',x)
print('Named arguments are packed into dictionary',y)
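The ordering rule mentioned above can be illustrated with a minimal sketch. The function `describe_loan` and its parameter names are hypothetical, chosen only for this illustration: it mixes a regular argument, `*args`-style extra positionals, a named argument with a default, and `**kwargs`-style extra named arguments.

```python
# Hypothetical function mixing the four kinds of parameters (illustration only)
def describe_loan(borrower, *amounts, currency="EUR", **options):
    """borrower: regular positional argument
    amounts: any extra positional arguments, packed into a tuple
    currency: regular named argument with a default value
    options: any remaining named arguments, packed into a dictionary"""
    return borrower, amounts, currency, options

result = describe_loan("Ana", 100, 250, currency="USD", num_years=10, rate=0.03)
print(result)
# ('Ana', (100, 250), 'USD', {'num_years': 10, 'rate': 0.03})

# The same operators also work at the call site: * unpacks a sequence into
# positional arguments and ** unpacks a dictionary into named arguments.
amounts = [100, 250]
options = {"num_years": 10, "rate": 0.03}
same = describe_loan("Ana", *amounts, currency="USD", **options)
print(same == result)  # True
```

Placing `currency` after `*amounts` makes it keyword-only, which is one way to keep named arguments from being confused with the variable-length positional ones.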
This can be used, for instance, to perform computations over a variable-length sequence, as in the following example.
# Compute accumulated interest on a sequence of borrowed amounts
def main(*args, **kwargs):
    '''
    args is a tuple of amounts borrowed
    kwargs is a dictionary with keys num_years and rate
    '''
    S=add(args)
    # Call function compute_debt with **kwargs or with the individual values in kwargs
    D=compute_debt(S,**kwargs) # compute_debt expects a number and two named arguments with names num_years and rate
    # same as:
    D=compute_debt(S,kwargs['num_years'],kwargs['rate'])
    # print results
    print('Borrowed:',S)
    print('Debt:',round(D,3))

def add(values):
    s=0
    for x in values:
        s+=x
    return s

def compute_debt(s,num_years,rate):
    for i in range(num_years):
        s+=s*rate
    return s

if __name__=='__main__':
    main(1,2,10,5,4,num_years=10, rate=0.05)
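As a sanity check on the script above: each pass through the loop in `compute_debt` multiplies the debt by `(1 + rate)`, so after `num_years` iterations the result should agree with the compound-interest formula `S * (1 + rate) ** num_years`. The sketch below repeats `compute_debt` so that it is self-contained; the variable names `borrowed`, `loop_value` and `closed_form` are introduced here only for illustration.

```python
import math

def compute_debt(s, num_years, rate):
    # Same loop as in the script above: add one year of interest at a time
    for _ in range(num_years):
        s += s * rate
    return s

borrowed = sum((1, 2, 10, 5, 4))           # same amounts as in the call to main
loop_value = compute_debt(borrowed, num_years=10, rate=0.05)
closed_form = borrowed * (1 + 0.05) ** 10  # S * (1 + rate) ** num_years

print(round(loop_value, 3), round(closed_form, 3))
print(math.isclose(loop_value, closed_form))  # True
```

The comparison uses `math.isclose` rather than `==` because the two computations may differ by floating-point rounding.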
*args
Write a function `sum_all` that takes any number of positional arguments and returns their sum.
def sum_all(*args):
    pass # Your code here
# Example usage:
print(sum_all(1, 2, 3)) # Output: 6
print(sum_all(10, 20, 30, 5)) # Output: 65
*args
Create a function `concat_strings` that takes any number of string arguments using `*args` and concatenates them into a single string.
def concat_strings(*args):
    pass # Your code here
# Example usage:
print(concat_strings("Hello", " ", "world", "!")) # Output: "Hello world!"
print(concat_strings("Python", " is", " fun!")) # Output: "Python is fun!"
**kwargs
Write a function `greet` that accepts a keyword argument `name` (default value: `"Guest"`) and an optional keyword argument `greeting` (default value: `"Hello"`). Return the formatted greeting message.
def greet(**kwargs):
    pass # Your code here
# Example usage:
print(greet(name="Alice", greeting="Hi")) # Output: "Hi Alice"
print(greet(name="Bob")) # Output: "Hello Bob"
print(greet()) # Output: "Hello Guest"
`*args` and `**kwargs`
Write a function `describe_person` that takes positional arguments (`*args`) for hobbies and keyword arguments (`**kwargs`) for personal details (e.g., name, age). Return a formatted string describing the person.
def describe_person(*args, **kwargs):
    pass # Your code here
# Example usage:
print(describe_person("reading", "traveling", name="Alice", age=30))
# Output: "Alice (30 years old) enjoys reading, traveling."
**kwargs
Create a function `filter_kwargs` that takes any number of keyword arguments and returns a new dictionary containing only those with values greater than 10.
def filter_kwargs(**kwargs):
    pass # Your code here
# Example usage:
print(filter_kwargs(a=5, b=15, c=20, d=3)) # Output: {'b': 15, 'c': 20}
Suppose one wants to create a list with all the cubes of even numbers up to N. The following scripts show how this can be done with different operators that replace the traditional loop structure: list comprehension, `filter`, `map` and `lambda`.
Operator `map` applies a given function to each element of a list. Likewise, `filter` applies a boolean function to select elements of a list. Both operations can in principle be executed in parallel over the elements of the list, since each output is independent of the outputs for the remaining elements of the list.
- With list comprehension:
def cube(x):
    return x*x*x
L=[cube(x) for x in range(N) if x%2==0]
- With `filter` to select even numbers and `map` to compute cubes:
def even(x):
    return x%2==0
numbers=list(range(N))
even_numbers=list(filter(even, numbers))
cubes=list(map(cube,even_numbers))
- With `filter` and `map`, but defining the cube and even functions implicitly with `lambda` instead of `def`:
numbers=list(range(N))
even_numbers=list(filter(lambda x: x%2==0, numbers))
cubes=list(map(lambda x: x*x*x,even_numbers))
- With `lambda` and list comprehension. In the example below, one needs to indicate that the lambda function has to be applied to the variable `x` in the list comprehension, using `(lambda x: x*x*x)(x)`. Otherwise, the output list would be a list of lambda functions.
cubes=[(lambda x: x*x*x)(x) for x in range(N) if x%2==0]
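The variants above can be compared side by side in one runnable sketch. The value `N = 10` and the variable names `with_comprehension`, `with_filter_map`, `with_lambda` and `not_called` are assumptions made here for illustration.

```python
N = 10  # assumed value, for illustration only

def cube(x):
    return x * x * x

def even(x):
    return x % 2 == 0

# Three equivalent ways of building the list of cubes of even numbers
with_comprehension = [cube(x) for x in range(N) if even(x)]
with_filter_map = list(map(cube, filter(even, range(N))))
with_lambda = list(map(lambda x: x * x * x, filter(lambda x: x % 2 == 0, range(N))))

print(with_comprehension)  # [0, 8, 64, 216, 512]
print(with_comprehension == with_filter_map == with_lambda)  # True

# Forgetting to call the lambda yields a list of function objects instead of numbers
not_called = [lambda x: x * x * x for x in range(N) if x % 2 == 0]
print(callable(not_called[0]))  # True
```

Note that `filter` and `map` return lazy iterators in Python 3, which is why the results are wrapped in `list(...)` before being compared or printed.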
Convert the following for loop into a list comprehension:
result = []
for x in range(10):
    result.append(x**2)
# output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Rewrite this code using a list comprehension:
result = []
for x in range(20):
    if x % 2 == 0:
        result.append(x)
# output: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Convert the following code to a dictionary comprehension:
squares = {}
for x in range(5):
    squares[x] = x**2
# output: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Rewrite the nested loop as a list comprehension:
pairs = []
for x in range(3):
    for y in range(2):
        pairs.append((x, y))
# output: [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
Transform the following code into a dictionary comprehension with a condition:
filtered_squares = {}
for x in range(10):
    if x % 2 == 0:
        filtered_squares[x] = x**2
# output: {0: 0, 2: 4, 4: 16, 6: 36, 8: 64}
Convert the following loop into a list comprehension that includes a conditional transformation:
result = []
for x in range(15):
    if x % 3 == 0:
        result.append(x**2)
    else:
        result.append(x)
# output: [0, 1, 2, 9, 4, 5, 36, 7, 8, 81, 10, 11, 144, 13, 14]
Transform the following loop into a dictionary comprehension, using strings as keys:
word_lengths = {}
words = ["apple", "banana", "cherry", "date"]
for word in words:
    word_lengths[word] = len(word)
# output: {'apple': 5, 'banana': 6, 'cherry': 6, 'date': 4}
Rewrite this code using a single list comprehension to flatten the nested list:
nested_list = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
flattened = []
for sublist in nested_list:
    for item in sublist:
        flattened.append(item)
# output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
Convert the following nested loop into a dictionary comprehension with a condition:
result = {}
for i in range(1,3):
    for j in range(3, 6):
        if j % i != 0:
            result[(i, j)] = i + j
# output: {(2, 3): 5, (2, 5): 7}
Use a dictionary comprehension to filter and transform the following dictionary of dictionaries:
data = {
"A": {"score": 90, "passed": True},
"B": {"score": 65, "passed": False},
"C": {"score": 75, "passed": True},
"D": {"score": 50, "passed": False},
}
# Goal: Include only students who passed, and create a dictionary of their scores.
result = {}
for key, value in data.items():
    if value["passed"]:
        result[key] = value["score"]
# output: {'A': 90, 'C': 75}
In this class we use Python to control physical devices through GPIO (general-purpose input/output) ports on a Raspberry Pi microcomputer. We will rely on the `gpiozero` Python package https://gpiozero.readthedocs.io/en/latest/recipes.html.
Topics of the class:
- `hostname -I`
- `ssh` (secure shell)
- `sudo python3 test.py`
- Recipes from the `gpiozero` documentation that use the following physical devices: LEDs, buttons, and a line sensor
- `LED_board`: interpret the code and verify that it is behaving as expected.
Look at the advanced recipes for `LEDBoard`. Create a “pyramid” of lights 5-3-1-3-5 that turn on and off with a 1 second pause. You can build a loop such that the pyramid runs only 4 times and then the execution stops.
Adapt the code `LED_board.py` so that if you execute `sudo python3 LED_morse.py some_word` the LEDs turn on and off to encode the input word: a dah (-) has a duration of 3 seconds and a dit (.) has a duration of 1 second. After each letter, there should be a 3 second pause before the next letter. The example below should correspond to LEDs 1 and 2 being on for 3 seconds, then LEDs 1, 2 and 3 being on for 3 seconds, then LEDs 1 and 3 being on for 1 second while LED 2 is on for 3 seconds, and so on.
−− −−− ·−· ··· · −·−· −−− −·· ·
M O R S E C O D E
There are many hardware adapters that make it easier to connect sensors to a microcomputer. Here we look at the Raspberry Pi hat included in the Grove_Base_Kit_for_Raspberry_Pi. The Grove Base Hat for Raspberry Pi provides Digital/Analog/I2C/PWM/UART ports to the RPi, allowing it to be connected to a large range of modules.
The following code shows how to access the readings of a temperature and humidity sensor programmatically. The sensor is connected to digital port D5. This code also uses GPIO pin 17 to power an LED.
import time
from seeed_dht import DHT
from gpiozero import LED
led=LED(17)
# Grove - Temperature&Humidity Sensor connected to port D5
sensor = DHT('11', 5)
while True:
    humi, temp = sensor.read()
    print('temperature {}C, humidity {}%'.format(temp, humi))
    if humi > 85:
        led.on()
    else:
        led.off()
    time.sleep(0.5)
The next script reads a Grove ultrasonic ranger connected to port D5 and blinks the LED when the measured distance is below 20 cm.
import time
from grove.grove_ultrasonic_ranger import GroveUltrasonicRanger
from gpiozero import LED
led=LED(17)
# Grove - Ultrasonic Ranger connected to port D5
sensor = GroveUltrasonicRanger(5)
while True:
    distance = sensor.get_distance()
    print('{} cm'.format(distance))
    if distance < 20:
        led.on()
        print('LED on')
        time.sleep(0.5)
        led.off()
        print('LED off')
        continue
    time.sleep(1)