7 Object Oriented Programming#
7.1 The difference between procedural and object oriented programming#
Typically when performing self-contained data analysis or writing relatively short scripts it is preferable to keep code structure relatively simple. Our code might consist of a single file, perhaps with an imported additional module containing a series of functions. The structure of our program is to store data in variables and then to operate on that data with functions. The program starts at the top and sequentially works line by line towards the bottom, with simple tasks and repeated operations delegated to functions to increase code reuse. There is nothing wrong with this method! When programs are small, this kind of code is usually simpler to understand, quicker to write and easier to maintain. This approach to programming is often called Procedural programming.
Object oriented programming (OOP) takes a different approach. The idea is that in normal life people think of the world by dividing it up into objects, whether physical or abstract: Cars, People, Internet, Mind. Objects have attributes (a car is red) and they have actions (a car moves). These two things (attributes and actions) both belong to an object. There can be multiple objects of the same type which have the same type of attributes (a car has a colour) and actions, even if the values might be different (a car could be red or blue). Objects act on one another to change each other’s attributes (a person paints a car a different colour, thus changing the colour attribute of the car).
In OOP we can define a blueprint which specifies the attributes and actions that a particular type of object should have. We could therefore define a generic car object as something which has the attributes colour and velocity, and the action accelerate. Actions are called methods. One can think of an attribute like a variable and a method like a function, but in both cases these are attached to an object. This blueprint is called a class. Once we have a class or blueprint we can use it to create an actual object, known as an instance. So a Car class could be used to create a red car with a velocity of 30mph. The same class could then also be used to create a blue car with a velocity of 20mph.
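As a rough sketch of the car example, the blueprint and two instances might look like the code below. The class name Car, the parameter names and the accelerate method are taken from the description above, but the details are only an illustration; the syntax (__init__ and self) is explained in section 7.3.

```python
class Car:
    """Blueprint (class) for a car with a colour and a velocity."""

    def __init__(self, colour, velocity=0):
        self.colour = colour      # attribute: like a variable attached to the object
        self.velocity = velocity  # attribute: speed in mph

    def accelerate(self, amount):
        """Method: like a function attached to the object."""
        self.velocity += amount


# Two instances built from the same blueprint, with different values
red_car = Car('red', velocity=30)
blue_car = Car('blue', velocity=20)

red_car.accelerate(10)  # only the red car changes
print(red_car.colour, red_car.velocity)    # red 40
print(blue_car.colour, blue_car.velocity)  # blue 20
```

Each instance keeps its own copy of colour and velocity, which is why accelerating the red car leaves the blue car untouched.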
At first, this might sound a bit complicated. So why is this approach popular?
It extends the logic we used when writing functions. It limits the amount of code I have to hold in my mind at once. A good function groups code into a block that performs a single task. Once I’m confident it works, I only need to know what the function does, its inputs and outputs. As programs get bigger this is essential. Classes enable me to group larger sections of related code together, both methods and variables (this is known as encapsulation). If well constructed this reduces the cognitive load immensely. For example, if my class is called Camera and it has methods start_record() and stop_record() and attributes framerate=30 and number_of_pics=20, it is very easy to work out how to use it without knowing the details. This hiding of details is often called abstraction.
7.2 You’ve already been using classes#
You’ve been using classes everywhere in Python, perhaps without even realising it. Whilst you didn’t define them, every time you create a new variable, collection etc. in Python you are using someone else’s class. Let’s create a list. This creates an instance of list from the blueprint or class list. Python allows you some shortcuts using [] but calling list() is what you are really doing. Notice that if you ask Python for the type, it tells you the object is of class list. That is, it was prepared according to the list blueprint. We can then call the method append, which is defined in the class list.
instance_of_list = list((1,2,3))
print(type(instance_of_list))
# Let's access a method of this instance. The method append is defined in the class list and so can
# be used by an instance to add an item to the list.
instance_of_list.append(4)
instance_of_list
<class 'list'>
[1, 2, 3, 4]
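The same is true of the other built-in datatypes. The short sketch below (variable names are just for illustration) shows that the [] shortcut and list() produce the same kind of object, and that strings and integers are also instances of classes with their own methods.

```python
# The literal syntax and the class call build the same kind of object
print(type([]) is list)               # True
print([1, 2, 3] == list((1, 2, 3)))   # True

# Strings and integers are instances of classes too, so they have methods
greeting = 'hello'
print(type(greeting))         # <class 'str'>
print(greeting.upper())       # HELLO

number = 5
print(type(number))           # <class 'int'>
print(number.bit_length())    # 3, because 5 is 101 in binary
```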
Similarly when we looked at the Pandas library we were creating instances from a class and accessing their attributes and methods. Here is an example of the DataFrame creation.
import pandas as pd
# Create an instance of the DataFrame class
instance_df = pd.DataFrame({'a':[1,2],'b':[3,4]})
# print an attribute called shape
print(instance_df.shape)
# Call the method drop to remove the column labelled b.
instance_df.drop('b',axis=1, inplace=True)
instance_df
(2, 2)
| | a |
|---|---|
| 0 | 1 |
| 1 | 2 |
7.3 Defining a simple class in python#
Below we will define our own simple camera class (remember this is a blueprint from which objects can be created). This class will have the name Camera. Clearly it will not work as we don’t have a camera, but it demonstrates the principles!
N.B. By convention, class names are written with the first letter of each word capitalised.
# Here we are creating the blueprint or class definition
import time

class Camera:
    """A class for taking a picture with a camera"""
    num_cameras = 0  # This is a class variable which we will use to count how many cameras have been created.

    # All instance methods must have self as the first parameter. This is a reference to the instance of the class and allows us to access its attributes and methods within the class definition.
    # We can pass positional and keyword arguments to the methods just as with normal functions
    def __init__(self, camera_id=0):
        # The initialisation method is called and run automatically when we generate an instance
        print(f'Setting up camera {camera_id}')
        self.camera_id = camera_id  # We can create instance variables / attributes by assigning them to self
        self._complicated_setup_code()  # I can call methods within the class like this
        Camera.num_cameras += 1  # Add 1 to the num_cameras class variable. We can read class variables with the class name or the instance name

    def _complicated_setup_code(self):  # Convention is that any method that is only used internally and not by the user starts with an _
        # Imagine there was a load of code here to handle connecting to the camera
        pass

    def take_pic(self):
        # Note each method has as its first parameter "self". self refers to the object created with the class (instance)
        # It means I can access camera_id even though it was defined in __init__
        print(f'Taking a pic with camera {self.camera_id}')
        image = 'Imagine this is an image'
        return image

    def get_camera_settings(self):
        return self.camera_id
Having created the blueprint of the Camera class, let’s now use it.
cam = Camera() # Define an "instance" of the class Camera. Note we don't call Camera().__init__(). This is called automatically when an object is created.
cam2 = Camera(camera_id=2) # Create another instance with same blueprint but different properties.
cam3 = Camera(camera_id=3) # And another
# Access a property of the instance
print(cam3.camera_id)
# Call the take_pic method on cam with camera_id=0. Note we don't pass self, this is done internally by python.
img=cam.take_pic()
print(img)
# Methods can return values just like regular functions.
settings_cam2 = cam2.get_camera_settings()
print(settings_cam2)
#We can also have properties and methods associated with the class or blueprint e.g how many cameras are there currently?
print(f'There are currently {Camera.num_cameras} active cameras')
print(f'Can also access num cameras like this: {cam2.num_cameras}')
Setting up camera 0
Setting up camera 2
Setting up camera 3
3
Taking a pic with camera 0
Imagine this is an image
2
There are currently 3 active cameras
Can also access num cameras like this: 3
Notice there might have been some complicated code involved to establish a connection with a camera but as a user we just create a Camera instance and tell it to take a picture. This is exactly how I’d interact with a physical camera and so it makes intuitive sense.
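One detail of class variables is easy to trip over, so here is a small sketch. It uses a stripped-down stand-in for the Camera class above (only the counter and camera_id are kept) to show the difference between reading a class variable through an instance and assigning to it through an instance.

```python
class Camera:
    """Stripped-down stand-in for the Camera class above."""
    num_cameras = 0  # class variable, shared by every instance

    def __init__(self, camera_id=0):
        self.camera_id = camera_id  # instance variable, one per object
        Camera.num_cameras += 1     # update the shared counter via the class


cam_a = Camera(camera_id=1)
cam_b = Camera(camera_id=2)
print(Camera.num_cameras, cam_a.num_cameras)  # 2 2 : both read the shared value

# Assigning through an instance does NOT change the class variable.
# It creates a new instance attribute that hides it for cam_a only.
cam_a.num_cameras = 99
print(cam_a.num_cameras)   # 99
print(cam_b.num_cameras)   # 2
print(Camera.num_cameras)  # 2
```

This is why __init__ updates the counter with Camera.num_cameras rather than self.num_cameras.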
7.4 Some general software principles for writing code and particularly classes#
It is easy to read this stuff and gloss over it. As you develop your project it is worth returning to these principles and reviewing whether your code violates any of them. They are handy principles to have in your mind as you develop.
DRY - We have met “Don’t Repeat Yourself” before in the context of functions, but it is worth repeating: logically grouping code should mean you never need to make the same change in multiple places, making your code easier to maintain / change.
KISS stands for “Keep it simple stupid”. There is a big temptation once one starts writing classes to create huge classes with millions of properties and methods. There are a few reasons for not doing this:
i. YAGNI - “You ain’t gonna need it”. Sometimes people try and pre-empt all the possible functionality they might need. YAGNI says you should keep a simple clear structure and just implement the most basic code that serves your needs. If you find you need something later you can come back and add it in.
ii. SRP - “Single Responsibility Principle”. Another mistake is to mix what should be multiple objects together into the same class. Consider the Camera class above. The method take_pic() returns an image from the camera to the attached computer. One might therefore be tempted to add a save_img() method to the camera class. Then you might think wouldn’t it be nice if a GUI dialogue popped open to allow the user to enter the filename. This kind of thinking results in huge classes which are really hard to understand and therefore use. Saving the images is a borderline case, but GUI dialogues have nothing logically to do with cameras. What happens if you want to change the GUI framework from Tkinter to PyQt? You would have to completely rewrite your class. The core responsibility of your camera class is to control the camera, so restrict it to properties and methods that are just associated with that. It can then interact with other classes which handle GUI dialogues or whatever is needed.
7.5 Having objects interact#
In Python, OOP does not mean that everything has to be written in classes. Python is flexible so we should do whatever keeps our programs simple (KISS). If we want a program where a GUI opens and then the image from the camera is saved, we could write it like this. We just need our Camera class and two functions.
def save_img(filename, img):
    print('Code that saves an image to the specified filename')

def save_gui_dialog(default_filepath='c:/Documents'):
    print('This would open a gui and return a filename')
    filename = 'example_filename.png'
    return filename

cam = Camera()
img = cam.take_pic()
filename = save_gui_dialog()
save_img(filename, img)
Setting up camera 0
Taking a pic with camera 0
This would open a gui and return a filename
Code that saves an image to the specified filename
7.6 Composition#
Another way to have objects interact is called “Composition”.
Suppose we have a simple camera and we want to use Python to create a timelapse video. The central part is still a camera that takes a picture, so we can reuse that. We’ll also need some way to add images to a video. A video makes a natural object. It will need some setup code, a method to add images and a method to close the video.
import time

class Video:
    def __init__(self, filename):
        self.filename = filename
        print('Open the video for writing')

    def add_frame(self, img):
        print(f'adding img {img} to video')  # f-string so that the value of img is shown

    def close(self):
        print('close the video')

class Timelapse:
    def __init__(self, video, camera):
        # The Timelapse "has a" video and a camera: it stores them and delegates work to them
        self.video = video
        self.camera = camera

    def start_timelapse(self, num_imgs, delay_imgs=0.5):
        for i in range(num_imgs):
            img = self.camera.take_pic()
            self.video.add_frame(img)
            time.sleep(delay_imgs)
        self.video.close()  # Close the video once all images have been added

video = Video('C:/Documents/videofilename.mp4')  # Create a video instance
cam = Camera()  # Create a camera instance
time_lapse = Timelapse(video, cam)  # Create a timelapse instance to which we supply a video and a camera.
time_lapse.start_timelapse(5, delay_imgs=1)  # Start the timelapse
Open the video for writing
Setting up camera 0
Taking a pic with camera 0
adding img Imagine this is an image to video
Taking a pic with camera 0
adding img Imagine this is an image to video
Taking a pic with camera 0
adding img Imagine this is an image to video
Taking a pic with camera 0
adding img Imagine this is an image to video
Taking a pic with camera 0
adding img Imagine this is an image to video
close the video
One of the major advantages of using this kind of composition is that it enables flexibility. Suppose we want to change how we write the video. Perhaps our Video class initially used OpenCV, but now we want to use FFMPEG. We can swap out the Video class for an FFMPEG class without it affecting the rest of our code. Our FFMPEG class needs the methods add_frame() and close() (the technical term is that it has the same interface), but the Timelapse code can be reused without any changes. This modularity becomes really important as projects get bigger. Now to use this code we would just write:
ffmpeg_video = FFMPEG('C:/Documents/videofilename.mp4')
cam=Camera()
time_lapse = Timelapse(ffmpeg_video, cam)
time_lapse.start_timelapse(5)
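The FFMPEG class is not defined in this chapter. To make the idea concrete, here is a minimal sketch of what such a class could look like. Everything in it is an assumption for illustration: it does not call the real FFmpeg tool, it simply stores frames in a list, and the Camera and Timelapse classes are cut-down stand-ins for the ones above so that the example runs on its own.

```python
class Camera:
    """Cut-down stand-in for the Camera class defined earlier."""
    def take_pic(self):
        return 'Imagine this is an image'


class Timelapse:
    """Same structure as the Timelapse class above, without the delay."""
    def __init__(self, video, camera):
        self.video = video
        self.camera = camera

    def start_timelapse(self, num_imgs):
        for _ in range(num_imgs):
            self.video.add_frame(self.camera.take_pic())
        self.video.close()


class FFMPEG:
    """Hypothetical replacement for Video. It does not call the real FFmpeg tool."""
    def __init__(self, filename):
        self.filename = filename
        self.frames = []     # placeholder for frames that would be sent to an encoder
        self.is_open = True

    def add_frame(self, img):
        self.frames.append(img)

    def close(self):
        self.is_open = False


ffmpeg_video = FFMPEG('videofilename.mp4')
time_lapse = Timelapse(ffmpeg_video, Camera())
time_lapse.start_timelapse(3)
print(len(ffmpeg_video.frames), ffmpeg_video.is_open)  # 3 False
```

Timelapse never checks what type of object it was given. It only relies on add_frame() and close() existing, which is what having the same interface means in practice.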
7.7 Inheritance#
Sometimes we have classes which do mostly the same thing but have a few small differences. For example we might have a Panasonic camera and a Canon camera. They are both cameras but there might be a few small differences. Writing a whole new class for each camera would violate the DRY principle (Don’t Repeat Yourself). Instead we can use the idea of inheritance. We might set up what is called a parent class called Camera which contains all the things in common. We would then write two child classes called Panasonic and Canon. The child classes get access to the properties and methods of the parent class as well as their own. The child class can also reimplement (override) methods in the parent class.
# Create a parent class
class Camera:
    """A parent class for taking a picture with a camera"""

    def __init__(self, camera_id=0):  # We can pass positional and keyword arguments to the methods just as with normal functions
        # The initialisation method is called and run when we generate an instance
        print(f'Setting up camera {camera_id}')
        self.camera_id = camera_id
        self._complicated_setup_code()  # I can call methods within the class like this

    def _complicated_setup_code(self):
        # This code is likely to be camera specific. We add this method to the parent so that people
        # writing the child class know they should implement it.
        # Notice the method name begins with an underscore to indicate this is an internal method not used by the end user.
        pass

    def take_pic(self):
        # Note each method has as its first parameter "self". self refers to the object created with the class (instance)
        # It means I can access camera_id even though it was defined in __init__
        print(f'Taking a pic with camera {self.camera_id}')
        image = 'Imagine this is an image'
        return image

# This notation means the class Panasonic inherits from the class Camera
class Panasonic(Camera):
    def __init__(self, camera_id=0):
        # This line calls the __init__ method of the Camera parent class
        super().__init__(camera_id)

    # Since it has the same name as a method in the parent class it overrides it
    def _complicated_setup_code(self):
        print('implement the bits specific to a Panasonic camera')

    def panasonic_only_method(self):
        print('A function which exists only in the Panasonic class')

panasonic_camera = Panasonic()
panasonic_camera.take_pic()  # This method is in the parent class but can be used by the child class because of inheritance.
print(f"Camera id is {panasonic_camera.camera_id}")  # The attribute camera_id is set in the parent Camera's __init__ but panasonic_camera can access it.
Setting up camera 0
implement the bits specific to a Panasonic camera
Taking a pic with camera 0
Camera id is 0
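The Canon child class mentioned above is not shown, so here is a sketch of how the two child classes could sit side by side. The brand attribute and the cut-down Camera parent are assumptions made to keep the example short and self-contained.

```python
class Camera:
    """Cut-down parent class, mirroring the one above."""
    def __init__(self, camera_id=0):
        self.camera_id = camera_id
        self._complicated_setup_code()

    def _complicated_setup_code(self):
        pass

    def take_pic(self):
        return f'Image from camera {self.camera_id}'


class Panasonic(Camera):
    def _complicated_setup_code(self):
        self.brand = 'Panasonic'  # stands in for Panasonic-specific setup


class Canon(Camera):
    def _complicated_setup_code(self):
        self.brand = 'Canon'      # stands in for Canon-specific setup


cameras = [Panasonic(camera_id=1), Canon(camera_id=2)]
for cam in cameras:
    # take_pic() is inherited; brand comes from each child's own setup code
    print(cam.brand, isinstance(cam, Camera), cam.take_pic())
```

Neither child defines __init__ here, so Python falls back to the parent's __init__. That in turn calls whichever _complicated_setup_code belongs to the child, and the loop can treat both objects simply as cameras.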
7.8 Should I use Composition or Inheritance?#
Maybe neither! Again, make sure that you are genuinely making the code simpler to read and simpler to use. However, done well, both approaches can help make code transparent. So is Composition or Inheritance the better way to go for your problem? A handy rule is to ask: is the relationship between the two objects / classes an “is a” relationship or a “has a” relationship?
A Panasonic is a Camera. You can use a Panasonic wherever you can use a Camera, therefore this is expressed with inheritance.
A Timelapse object has a Video and a Camera. One cannot use a Timelapse in place of a Video; the video forms a part of a timelapse object. Therefore, this is best coded in terms of composition.
Designing classes and their relationships well is not easy. I usually find that code that uses the wrong relationships gets really messy. For example, I end up overriding a large number of parent class methods to get it to work. Like most elements of design, it’s really hard to be prescriptive in each case. However, a good class design feels natural. When given to someone else who doesn’t understand the details, the structure means they should intuitively grasp the correct way to use it.
7.9 Using __dunder__ methods and a brief look under the hood of how python datatypes work#
In a basic class we have met the __init__ method. This is a reserved method with a specific purpose: to initialise a new instance of a class. Python has a lot of other predefined magic methods which follow the pattern __method__ and which can be implemented. Here we create a new datatype that is basically a float but that implements a special method so that when you divide by zero it returns numpy’s infinity value rather than throwing a divide by zero error.
import numpy as np

class DivideZeroFloat(float):
    """A new datatype that handles the case of dividing by zero. Returning +/- infinity as appropriate"""
    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        """Set up a magic method that defines what to do when we divide value by other"""
        if other == 0:
            if self.value >= 0:
                return np.inf
            else:
                return -np.inf
        else:
            return super().__truediv__(other)  # For all other numbers follow the usual division rules

    def __str__(self):
        return "Divide by zero float"

numerator = DivideZeroFloat(10)
print(numerator / 5)  # Here it works just like a float since the value other is passed to the parent class of float.
print(numerator / 0)  # Note divide by zero would produce an error for a normal float, but we've told it how to handle this special case
print(numerator)  # Calls the __str__ method
2.0
inf
Divide by zero float
As discussed earlier, Python datatypes are actually classes that implement many of these special methods to define, for example, what the ‘+’ operator should do. This is why 5 + 3 = 8 but ‘5’ + ‘3’ = ‘53’: the __add__ method has been defined differently in the int and str classes. You can also use some of these methods to implement your own functionality.
A few cases that I have found really useful to know about:
Creating your own context managers: If a class has the methods __enter__ and __exit__ defined, it can be used as a context manager. We saw this in connection with reading files, but it is also really helpful when communicating with devices. You can make sure that the connection is properly closed even if the code crashes.
with Arduino('COM1') as arduino:
    # do stuff
Creating your own iterators (objects that behave like generators): an iterator needs the methods __iter__ and __next__ to be defined. Here’s an implementation of the Fibonacci sequence.
class Fib:
    def __init__(self):
        self.a, self.b = 0, 1

    def __next__(self):
        return_value = self.a
        self.a, self.b = self.b, self.a + self.b
        return return_value

    def __iter__(self):
        return self  # __iter__ must return an iterator, here the object itself

f = Fib()
for _ in range(10):
    print(next(f))
0
1
1
2
3
5
8
13
21
34
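The context manager case mentioned above can be sketched in the same way. The Device class below is hypothetical and no real hardware is involved: it only flips a flag where a real class would open and close a connection. The point is to show that __exit__ runs even when the code inside the with block raises an error.

```python
class Device:
    """Hypothetical device connection. No real hardware is involved."""
    def __init__(self, port):
        self.port = port
        self.connected = False

    def __enter__(self):
        self.connected = True   # imagine opening the connection here
        return self             # this is what "as device" would be bound to

    def __exit__(self, exc_type, exc_value, traceback):
        self.connected = False  # always runs, even if the block raised
        return False            # do not swallow the exception


device = Device('COM1')
try:
    with device:
        print(f'Inside the block, connected: {device.connected}')
        raise RuntimeError('simulated crash')
except RuntimeError as err:
    print(f'Caught: {err}')

print(f'After the block, connected: {device.connected}')
```

Returning False from __exit__ lets the error carry on to the surrounding code after the clean-up has happened, which is usually what you want when talking to a device.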
To explore these ideas further take a look at this article