[python] Splitting methods (Composition / inheritance / ... )

Hello,

I am looking for good techniques to split up the many methods that a class can end up with.

  • Inheritance is interesting for putting core methods in a superclass.
  • Composition seems interesting too, but with it you have to redefine each method in the outer class and then delegate to the method of the ‘leaf’.

(main resource : Learn Python the Hard Way)

I was wondering to what extent the following ‘pattern’ setup is a good idea:
Group all methods by category and put them in a separate class that is not hooked in.
In the __init__ of this class you pass the main object, so that the methods can use that (instead of using self).

Example:
I have a very complex object called Job.
Job does a lot of different things, such as creating processes, constructing file paths, etc.
If we focus on constructing file paths, we have a ‘construct_output_filepath()’ method.

With composition or inheritance you get:
job.construct_output_filepath()

but the idea I had is to do following

job.path.construct_output_filepath()

This way you keep all methods concerning path construction encapsulated in a ‘path’ attribute.

To get this we would have the following code structure:

class JobPath(object):
    def __init__(self, job):
        self.job = job

    def construct_output_filepath(self):
        return self.job.name + '.exr'

class Job(object):
    def __init__(self):
        self.path = JobPath(self)

To what extent is this a good idea?
Because JobPath will know about and depend on Job, and Job will depend on JobPath.

Doesn't this make both classes very tightly coupled to each other, and possibly fragile?
Are there other good techniques or ideas to use?
Or maybe this kind of structure is OK, and there is a pattern that describes this behaviour?

Some more ideas about this would be great!

Thanks in advance!

Sven

Grouping functions is not really composition. If I read you correctly, you’re really talking about grouping functions into sub-classes for organization. This is pretty common in some languages (MaxScript, JavaScript), but in Python the usual mechanism is to put related functions into a package with submodules.

Classes are for operations that share persistent data. In the classic example, a ‘car’ class might have position, speed, and fuel parameters, since the car’s internal state will be affected by lots of different functions, from ‘drive()’, which uses up fuel, to ‘refuel()’, which puts in more, and so on.
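To make that concrete, here's a minimal sketch of the car example (the fuel numbers and burn rate are invented for this sketch):

```python
class Car(object):
    """Operations that read and change the same persistent state belong in one class."""

    def __init__(self, fuel=50.0):
        self.position = 0.0
        self.speed = 0.0
        self.fuel = fuel  # arbitrary units, invented for this sketch

    def drive(self, distance):
        # driving advances the car and burns fuel (0.1 units per distance unit)
        burn = min(distance * 0.1, self.fuel)
        self.fuel -= burn
        self.position += burn / 0.1

    def refuel(self, amount):
        self.fuel += amount
```

Both drive() and refuel() touch self.fuel, which is exactly the shared-state situation a class is for.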

On the other end of the spectrum, a bunch of functions which all do string operations (‘reverse_string()’, ‘italicise_string()’, etc.) do not need to be a class, since they operate on data and don’t store any information from one invocation to the next. These would make good candidates for inclusion in a ‘stringops’ module instead of a class. Other languages often use classes as an organizing tool for this sort of thing, but in Python that’s not necessary or useful, since we have modules and packages.
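That module could be as small as this (a hypothetical stringops.py; the italics markup is just an invented convention):

```python
# stringops.py - stateless helpers: a module, not a class

def reverse_string(text):
    # no state kept between calls, so no class is needed
    return text[::-1]

def italicise_string(text):
    # wrap in <i> tags; the markup convention is invented for this sketch
    return '<i>' + text + '</i>'
```

Callers just do `import stringops` and call the functions; there is no instance to construct.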

There is a middle ground for class and static methods which don’t affect the internal state of a class instance but do depend on special knowledge of how the class works. The classic example is a static constructor function: say you have a class which takes a root path and a sub path as arguments; you might make a static method in that class which splits a single string according to some rule of your own and returns a new instance with the right pieces. In that case it’s not affecting internal data, but it is relying on special knowledge about the class, so it’s legitimate to include it in the class.
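A sketch of that static-constructor idea; in Python this is usually a classmethod, and the splitting rule below (first path component is the root, the rest is the sub path) is invented for illustration:

```python
class JobLocation(object):
    def __init__(self, root, sub):
        self.root = root
        self.sub = sub

    @classmethod
    def from_string(cls, path):
        # relies on 'special knowledge' of how this class wants its
        # arguments split, so it belongs inside the class
        root, _, sub = path.partition('/')
        return cls(root, sub)
```

So `JobLocation.from_string('projects/shots/shot01')` returns a new instance without ever touching the internal state of an existing one.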

The usual rules of thumb are:

  1. limit the scope of a function to one job and one job only.
  2. limit the functions in a class to the set that shares state.
  3. when composing objects into bigger objects, never use ‘secret knowledge’ about the inner workings of one class inside another.
  4. avoid circular dependencies!

You will never be able to avoid having some classes which know about / depend on other classes. That’s not avoidable. What you can do, though, is make sure to use only the public face of other classes, not relying on ‘secret’ knowledge about how they work under the hood. Classes should be as black-boxed as they can be.

Thanks for writing out these guidelines.
Indeed, the way I wanted it to work would create a huge circular dependency.

I've written out my way of thinking underneath, to make clear what my goal is / was.
Could you actually give an example of how this setup would work using a static constructor?

The problem I find with a small module of simple functions is that you have to drag a lot of arguments into each function.

# 1. many arguments
def construct_complex_idname(id, duty, name): ...
def construct_complex_dutyname(id, duty, name): ...
def construct_complex_filename(id, duty, name): ...

But of course I assume that the more logical way is to set these up as :


# 2. passing the whole object
def construct_complex_idname(job): ...
def construct_complex_dutyname(job): ...
def construct_complex_filename(job): ...

But in that case you are passing the job into each function, so why not wrap it in a class?
It seems you would have more power in passing data around,
but that probably breaks the rules in the list above.

# 3. class wrap
class JobPathUtils(object):
    def __init__(self, job):
        self.job = job

    def construct_complex_idname(self): ...
    def construct_complex_dutyname(self): ...
    def construct_complex_filename(self): ...

# usage
JobPathUtils(job).construct_complex_filename()

Or you keep this pathutils instance on the class, so that it is assigned to one job:


class Job(object):
    def __init__(self):
        self.pathutils = JobPathUtils(self)

job = Job()
job.pathutils.construct_complex_filename()

I think in the end I will go for a separate path utils module and pass the Job object around (example 2).
Quick snippet:


# base job class
class Job(object):
    def __init__(self):
        super(Job, self).__init__()
        self.id = None
        self.name = None
        self.duty = None


# in a separate 'job_path_utils' module
def construct_complex_jobname(job):
    return job.id + '_' + job.name + '_' + job.duty


# way of using in a script
import job_path_utils

job = Job()
job.name = 'helloworld'
job.duty = 'render'
job.id = '4581faadf54'
job_path_utils.construct_complex_jobname(job)

It’s perfectly OK - in fact, it’s a very good idea - to invent your own classes which don’t do much besides grouping relevant state data. For example, your ‘Job’ class can just be a passive data container that holds all the stuff that other functions need to process the job. It’s also fine to make functions which only work on particular classes.

If you use this strategy, it’s a great idea to use collections.namedtuple to create classes which have nice, readable properties but which are immutable - that way there’s less chance that some code will accidentally change one item from a set of data that is supposed to live together. Your example #2 would work fine like this.
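A standalone sketch of that, reusing the field names from your snippets above:

```python
import collections

# an immutable, passive container for the job data
Job = collections.namedtuple('Job', ['id', 'name', 'duty'])

def construct_complex_jobname(job):
    return job.id + '_' + job.name + '_' + job.duty

job = Job(id='4581faadf54', name='helloworld', duty='render')
print(construct_complex_jobname(job))  # 4581faadf54_helloworld_render
```

Trying `job.name = 'other'` raises AttributeError, which is the immutability guarantee mentioned above.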

Deciding between

 
def construct_complex_dutyname(job):

and


job.construct_complex_dutyname()

has a lot to do with what kind of ‘state’ the data in job represents. If it’s a passive bundle of data, the namedtuple + functions method is simple and easy to maintain. If the data is supposed to change - say, if a job has a status which changes as it passes along the pipeline - then it’s more likely that the job class will grow methods to manage that internal state.

Another thing to think about is whether you may eventually need different kinds of job. For example, you might have a LocalJob object which does something on your own machine and a NetworkedJob object which shares some functionality but does things on remote servers. In that case you will have a much nicer time if you are passing along job objects and calling their methods since you don’t have to check every instance that comes along and make decisions about it:



import os
import signal
import urllib2


class Job(object):
    def __init__(self, name, root, data):
        self.name = name
        self.root = root
        self.data = data

    def start(self):
        # make sure that every job has the method, even if it doesn't work,
        # so you can catch incomplete implementations...
        raise NotImplementedError()

    def stop(self):
        raise NotImplementedError()

    def get_name(self):
        return self.name


class LocalJob(Job):

    # uses the same constructor as the base Job, but also stores a process ID
    def __init__(self, name, root, data):
        super(LocalJob, self).__init__(name, root, data)
        self.PID = -1

    def start(self):
        self.PID = os.spawnl(os.P_NOWAIT, "program.exe", "program.exe", self.root, self.data)
        return self.PID != -1  # should return True if it didn't crash

    def stop(self):
        return os.kill(self.PID, signal.SIGTERM)


class NetworkJob(Job):

    def __init__(self, name, root, data, server='http://some.server.here', pwd='password'):
        super(NetworkJob, self).__init__(name, root, data)
        self.server = server
        self.pwd = pwd

    def start(self):
        opener = urllib2.build_opener(urllib2.HTTPHandler)
        request = urllib2.Request(self.server, data='jobs/' + self.data)
        request.add_header('Content-Type', 'text/html')
        result = opener.open(request)
        return 'OK' in result.read()  # True if the server accepts the request

    def stop(self):
        opener = urllib2.build_opener(urllib2.HTTPHandler)
        request = urllib2.Request(self.server, data='/ABORT')
        request.add_header('Content-Type', 'text/html')
        result = opener.open(request)
        return 'ABORTED' in result.read()


In that scenario all of your other code should be able to create, start, and stop jobs without knowing whether or not they are local or networked. This is the most important benefit of using classes: you can localise the complexity of a particular problem so other code does not need to know or care about the inner workings of the process.
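For instance, the calling code can treat a mixed list of jobs uniformly. A toy sketch, with trivial stand-ins for the job classes so it runs on its own:

```python
class Job(object):
    def start(self):
        raise NotImplementedError()

class LocalJob(Job):
    def start(self):
        return 'started locally'

class NetworkJob(Job):
    def start(self):
        return 'started on the farm'

def start_all(jobs):
    # no isinstance() checks anywhere: each job knows how to start itself
    return [job.start() for job in jobs]

print(start_all([LocalJob(), NetworkJob()]))  # ['started locally', 'started on the farm']
```

The caller never branches on the concrete type, which is the polymorphism payoff described above.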

PS: the example is totally made up, I doubt it works as written :slight_smile:

I see, this was a very clear explanation. Thanks!

If it will just get data out of the object without affecting it:

construct_complex_filename(job)

If it will affect the data of the object, I’ll go for this one:

job.update_status('completed')

Thanks for the job example; this is indeed what I am constructing at the moment.
I assume that NetworkJob and LocalJob both inherit from Job.

I actually constructed this in the end in a different way.
I tried to use composition instead of inheritance,
and then a factory creates the complex job.

Even though I hope this is implemented correctly :smiley: I am still new to the whole design patterns idea.


class Process(object):
    def __init__(self, job):
        self.job = job

    def run(self):
        raise NotImplementedError()


class LocalProcess(Process):
    def run(self):
        pass  # run the job on the local machine


class NetworkProcess(Process):
    def run(self):
        pass  # send the job to the render farm


class Job(object):
    JOBTYPE_LOCAL = 'jobtype_local'
    JOBTYPE_NETWORK = 'jobtype_network'

    def __init__(self):
        super(Job, self).__init__()
        self.jobtype = None
        self.local_process = None
        self.network_process = None

    def run(self):
        if self.jobtype == self.JOBTYPE_LOCAL:
            self.local_process.run()
        elif self.jobtype == self.JOBTYPE_NETWORK:
            self.network_process.run()


def make_job():
    job = Job()
    job.jobtype = job.JOBTYPE_LOCAL
    job.local_process = LocalProcess(job)
    job.network_process = NetworkProcess(job)
    job.run()