August 11, 2022

A crash course on Python object oriented programming (ver 1.0)

My intent writing this is as follows: I'll assume the reader has a basic understanding of Python.
I'll also assume that the reader has a basic understanding of OO (object oriented) programming, probably from some other language.

Every language seems to use different words for the same aspects of object oriented programming. I'll do my best to clarify how I am using language, but you will need an agile mind if you are coming from some other language. Both syntax and vocabulary will probably be quite different in at least some places.

A person may approach OO programming from several levels. The most basic level is to want to use existing libraries that are written in an OO style (as most python libraries are). The next level is to want to write your own python classes in a basic way. The final level is to use what I have chosen to label as "advanced" object oriented features.

Object oriented programming as a consumer

Suppose some genius has written a Python package we want to use. The package is called "Data" and can store a single value for us, and then retrieve the value when desired.

To create (instantiate) a Data object, you make a call like this (which also sets an initial value):

mydata = Data ( 999 )
Most OO languages use the word "new" in some syntax to create ("instantiate") a new object. Not so Python. You call the name of the class as if it was a function and that creates a new object of that type. The above call creates an object of the "Data" type, initializing it with the value 999. The variable "mydata" is a reference to the newly created object.

Most objects will have "instance methods". These are functions that are part of the object. We check the documentation for our Data object and find that it provides two methods, namely "set" and "get". They can be used as follows:

mydata = Data ( 999 )
print ( mydata.get() )
mydata.set ( 888 )
print ( mydata.get() )
The first line creates (instantiates) the object, as we have already discussed.
The second line gets the value (and prints it).
The third line sets a new value (888) in our object.
The last line prints the new value (namely 888).

Objects consist of methods (functions) and fields (values). It isn't documented, but by looking at the source code (not shown as yet) we learn that the value for our Data object is stored in a field named "value". We can bypass the get and set methods and directly access the value by using the following syntax:

mydata = Data ( 999 )
print ( mydata.value )
mydata.value = 888
print ( mydata.value )
This works just the same as using the get and set methods. Not all object oriented languages allow this sort of thing, but python does. We will talk more about the pros and cons involved with this later.

So, now you know all you need to know about Python's syntax as a user of existing classes. Not really, but we are keeping it simple for now and skipping over some things that you don't absolutely need to know. Remember - objects consist of methods and fields. You access (invoke) methods when you put parentheses (perhaps enclosing arguments) after the name of the object component. If you reference an object component without parentheses, you are accessing a field (a variable that is part of the object). Naming a method without parentheses does not call it; it just hands you the method itself.
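A small sketch of the parentheses rule may help. The Data class here is a stand-in written just for this example (the real one is shown in the next section), and the variable names are made up.

```python
# Stand-in for the Data class described in the text.
class Data:
    def __init__(self, val):
        self.value = val

    def get(self):
        return self.value

    def set(self, val):
        self.value = val


mydata = Data(999)

field = mydata.value      # no parentheses: reads the field
result = mydata.get()     # parentheses: calls the method
getter = mydata.get       # a method named without parentheses: not called yet

print(field, result)      # 999 999
print(callable(getter))   # True
print(getter())           # 999
```

Forgetting the parentheses on a method is not an error in Python, which is why it is an easy mistake to make.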

Object Oriented programming as a writer of classes

As you have no doubt heard, classes are just factories for generating objects. That is one way to look at them. Another way is that classes are a way of lumping together data fields and related functions as a handy unit. In other words they are an aid to organizing your programs.

Our "Data" class provides a simple (some would say "trivial") example:

class Data () :
    # called when we write Data ( something )
    def __init__ ( self, val ) :
        self.value = val

    # accessor methods for the stored value
    def get ( self ) :
        return self.value
    def set ( self, val ) :
        self.value = val
There you go, that is all there is to our Data class, and you have a full-fledged Python class, capable of creating myriads of Data objects.

The first thing to note is that every method has a first argument "self". Whenever Python calls an object method, it is sneaky and injects (prepends) this argument "self" that you never knew about or specified when you called the object method. It gives it a value too, namely the object the method was called on.

First take note of the __init__ method. This gets called when we call the name of the class to create a new object. We can do pretty much whatever we like in this method. In general though, we set values in various fields that are part of the object. Here self refers to the fresh new object being created, as you might expect.

The get and set methods also get the magically inserted "self" value. Here it is very important. It is a reference to the object for which we are making the call, i.e. what you might call the "current object". Notice that it is not there in the calls, but in a sense it is. The value that will get assigned to "self" is in front of the ".". You could think of python doing the following rearrangement to make the method call:

    myvar.set ( 0 )  ===>  Data.set ( myvar, 0 )
Of course there is more going on than just rearranging things. Python has to look at myvar, figure out what class that object belongs to, find the class and then find the "set" method, then it does the rearranging and makes the call, if you want to think of it that way.
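The rearrangement can be tried out directly. This is only a sketch, with the Data class repeated so the example stands alone; the names first, second and third are made up.

```python
class Data:
    def __init__(self, val):
        self.value = val

    def get(self):
        return self.value

    def set(self, val):
        self.value = val


myvar = Data(999)

myvar.set(0)                 # the usual spelling, "self" is filled in for us
first = myvar.get()

Data.set(myvar, 1)           # the rearranged spelling, "self" passed by hand
second = myvar.get()

type(myvar).set(myvar, 2)    # same again, finding the class from the object
third = myvar.get()

print(first, second, third)  # 0 1 2
```

Nobody writes the second or third form in everyday code, but seeing that they work takes some of the magic out of "self".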

Some comments on style, conventions, and a bit of axe grinding

Let's talk about accessor methods.

The methods "get" and "set" that I coded up as part of my Data class are accessor methods. They allow us to read or write a value inside of the class without knowing how it is stored or what it is called. This is a very good thing and a practice I recommend adopting.

As we mentioned, you can just reach inside our Data object and grab hold of the "value" field. That is not so good. It is OK if the Data class never changes, and somewhat to my surprise python allows it.

Other languages don't allow such things, or require you to explicitly make fields public to permit such questionable programming.

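I am not here to teach you proper manners, or coding style, just to let you know how things work in Python. The convention in Python is to name fields that are intended to be private with a leading underscore, such as _ref_count. But this is just a convention and nothing prevents you in such a case from doing:

data._ref_count = 0

(Two leading underscores, as in __ref_count, go a step further. Inside the class Python quietly renames such a field to _ClassName__ref_count, which makes accidents less likely but still does not stop anyone who is determined.)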

You can only imagine how much havoc this might cause in a class maintaining a reference count for some reason if done willy nilly and unexpectedly.
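Here is a small sketch of what the underscores actually do. The class Counted and its fields are invented for illustration.

```python
class Counted:
    def __init__(self):
        self._ref_count = 1      # one underscore: a "keep out" sign, nothing more
        self.__secret = 42       # two underscores: stored as _Counted__secret

    def secret(self):
        return self.__secret     # inside the class the renaming is automatic


obj = Counted()

obj._ref_count = 0               # allowed, the underscore is only a hint
obj.__secret = 0                 # outside the class this creates a NEW field

print(obj._ref_count)            # 0
print(obj.secret())              # 42
print(obj._Counted__secret)      # 42
```

So the single underscore protects nothing, and the double underscore only protects against accidents, not against someone who knows the renamed field.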

Most people writing classes don't expect you to directly access fields and feel free to make any kind of changes to their names or meaning without telling you about it. They may go out of their way to add underscores to fields you should not access or mess with, but that is only a convention and not something you can count on.

Now let's talk about singleton classes.

It is not at all uncommon to have a class that will only ever be called once to generate the one and only object of that class that will ever exist. Some languages have ways to enforce this and call such objects Singleton objects and get religious about them. Not so Python. If you are hell bent on enforcing one and one only in Python, you have to work at it. I use such classes all the time and never worry about enforcement.
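For the curious, here is one way the "work at it" could look. This is only a sketch of one common approach (overriding __new__), not something I use myself, and the Config class and its settings field are made up.

```python
class Config:
    _instance = None             # the one shared object, once it exists

    def __new__(cls):
        # __new__ runs before __init__ and decides which object is handed back
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}
        return cls._instance


a = Config()
b = Config()
a.settings["debug"] = True

print(a is b)                    # True
print(b.settings)                # {'debug': True}
```

Every call to Config() hands back the same object, so a change made through one name is seen through the other.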

Now let's whine and complain about "self".

This is my first, only (in this essay) and major gripe about Python OO programming. My gripe is "self". It is unfortunate that python OO code gets cluttered up with "self" references everywhere. It is just syntax, but it is ugly and most other languages provide nicer ways to express ourselves. It traces to a language called Modula-3, which was a model for the design of the object oriented features in python, and we are stuck with it. I don't like it, but there isn't anything you or I can do about it.

Advanced Python OO programming

There are no end of things that could be discussed, so I will only cover a few that I think are important and useful.

First inheritance. I make little or no use of inheritance in classes that I write. When I use libraries such as wxPython, it turns out to be very useful to write classes that inherit from the basic classes in wxPython, but I won't talk about that here. I'll let you learn about that in the context of wxPython if you go down that road.

Next class constants. These are constants that are just plain old constants, but are part of a class. By and large this just serves to include them in the class and keep them in a namespace specific to the class. You often don't care about the actual value (which may be a meaningless enum anyway). An example from wxPython might be:

self.SetBackgroundColour ( wx.RED )
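Here wx.RED is the constant, used to set a background color of red. (Strictly speaking wx is a module rather than a class, but a constant defined inside a class is reached with the same dotted syntax.)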
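A class constant of your own might look like the following sketch. The Light class and its colors are invented for this example and have nothing to do with wxPython.

```python
class Light:
    # class constants: plain values that live in the class's namespace
    RED = 0
    GREEN = 1

    def __init__(self, color):
        self.color = color

    def is_red(self):
        return self.color == Light.RED


lamp = Light(Light.RED)
print(lamp.is_red())             # True
print(Light.GREEN)               # 1
```

The caller writes Light.RED and never needs to know or care that the value behind it is 0.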

There are also class methods (what we discussed above were instance methods). These are methods that have to do with the class as a whole, not individual objects. For example we could augment our Data class to have a default value which we could set by calling "set_default", allowing us to write code like this:

Data.set_default ( -1 )
myvar2 = Data ()
The code to implement this class could look like this:
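class Data () :
    # class variable, shared by every Data object
    default_value = 0

    @classmethod
    def set_default ( cls, val) :
        # must go through cls, a bare name would just be a local variable
        cls.default_value = val

    def __init__ ( self, val=None ) :
        # test for None explicitly so that 0 or "" count as real values
        if val is not None :
            self.value = val
        else :
            self.value = Data.default_value

    def get ( self ) :
        return self.value
    def set ( self, val ) :
        self.value = val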
I am actually illustrating several new things here, but this is the "advanced" section, so that is OK.

First we have the class variable "default_value". We could be bad dogs and just set it via Data.default_value, but we have an accessor method "set_default" that we ought to use instead. Note that in the __init__ function we reference it (and here we must) as "Data.default_value". Inside the class method we must qualify it as well, either as "cls.default_value" or as "Data.default_value". A bare "default_value = val" there would only create a local variable that vanishes when the method returns, and the class variable would never change.
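The difference between a qualified and an unqualified name inside a class method is easy to check. In this sketch the class names Broken and Working are made up, and each holds only the part of the Data class that matters here.

```python
class Broken:
    default_value = 0

    @classmethod
    def set_default(cls, val):
        default_value = val          # only a local variable, thrown away on return


class Working:
    default_value = 0

    @classmethod
    def set_default(cls, val):
        cls.default_value = val      # rebinds the class variable


Broken.set_default(-1)
Working.set_default(-1)

print(Broken.default_value)          # 0
print(Working.default_value)         # -1
```

Python raises no error for the unqualified version, so this kind of mistake can go unnoticed for a long time.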

You may as well be warned that if you have a class variable (aka field) and an instance variable with the same name, one will hide the other. In particular the instance variable hides the class variable whenever you go through the object (as in self.name), while the class variable is still reachable through the class (as in Data.name). Just avoid having instance and class variables with the same name and you can ignore any fine points.
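If you do want to see the fine points, this sketch shows the hiding in action. The class Thing and the field "name" are invented for the example.

```python
class Thing:
    name = "class-level"             # class variable


t = Thing()
before = t.name                      # no instance variable yet, so the class one is found

t.name = "instance-level"            # creates an instance variable with the same name
after = t.name                       # now the instance variable hides the class one
via_class = Thing.name               # the class variable itself is untouched

del t.name                           # remove the instance variable again
restored = t.name                    # the class variable shows through once more

print(before, after, via_class, restored)
```

Note that assigning through the object never changes the class variable; it only covers it up for that one object.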

Second, we have the class method "set_default". We have to put the line with "@classmethod" in front of the definition to clue in Python. Note here that the first argument (usually "self") now becomes "cls", which stands for the class. This argument is what a class method uses to get at class variables.

Note also that the names "self" and "cls" are just conventions. You can replace "self" with "porcupine" all through your code if you are consistent doing it, and it will work just fine. If anyone tries to read or modify your code they might well say bad things (really bad things) about you. You might even say bad things about yourself later in life if you do such nonsense. But you are free to do so if you see fit.
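Just to prove the point (and only to prove the point), here is a sketch of a cut-down Data class with "self" replaced throughout.

```python
class Data:
    def __init__(porcupine, val):
        porcupine.value = val

    def get(porcupine):
        return porcupine.value


print(Data(999).get())   # 999
```

Python only cares that the first argument is there, not what it is called.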

Lastly note how we handle polymorphism (what some languages call overloading) of the constructor for a Data object. We allow x=Data("fred") as well as y=Data(). This is done by "building it ourselves", in this case by giving the argument a default of "None" and later testing for it. Other languages allow us to provide more than one __init__ function, but not Python.
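One detail of the "None" trick deserves a sketch of its own: the test should be "is None" rather than a plain truth test, otherwise perfectly good values like 0 get mistaken for "nothing given". The default of -1 below is just an assumption for the example.

```python
class Data:
    default_value = -1

    def __init__(self, val=None):
        # "is None" rather than a truth test, so that 0 and "" count as real values
        self.value = Data.default_value if val is None else val


x = Data("fred")
y = Data()
z = Data(0)

print(x.value, y.value, z.value)     # fred -1 0
```

With a plain "if val" test, Data(0) would silently end up holding -1.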

Finally, along with class methods, Python has something called static methods. They are essentially class methods, but without the "cls" argument magically injected. They could be useful for just collecting a bunch of related functions together in a class and enforcing a sort of namespace management. You use them like so:

class MyMath () :
    @staticmethod
    def sqrt ( x ) :
        # stand-in for a real square root algorithm
        return x ** 0.5

val = MyMath.sqrt ( 4.0 )

Many other OO languages have only one kind of class-level method (called a static method or a class method, depending on the language) and use it for purposes like this. That works because those languages don't have the magic insertion of the cls argument like Python does.
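To see the two kinds of method side by side, here is a sketch. The methods "double" and "name" are invented for this example.

```python
class MyMath:
    @staticmethod
    def double(x):
        # nothing is injected, x is the only argument
        return 2 * x

    @classmethod
    def name(cls):
        # cls is injected and refers to the class itself
        return cls.__name__


print(MyMath.double(4.0))        # 8.0
print(MyMath().double(4.0))      # 8.0
print(MyMath.name())             # MyMath
```

A static method behaves the same whether it is called through the class or through an object, because in neither case does Python slip in an extra argument.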


Feedback? Questions? Drop me a line!

Tom's Computer Info / tom@mmto.org