Object-Oriented Programming (OOP)
Understand PyTypeObject architecture, the Method Resolution Order (MRO), and Dunder Protocols.
Object-Oriented Programming (OOP) is a paradigm where code is organized into Classes (Blueprints) and Objects/Instances (Physical Manifestations). Instead of having disconnected variables and functions floating in the abyss, OOP binds related data (Attributes) and behaviors (Methods) together into a single, cohesive payload.
In Python, the OOP architecture is absolute. Everything is an object. A string is an instance of the `str` class. A number is an instance of the `int` class. Even a class definition is itself an object: an instance of a higher-level blueprint called `type`, the default metaclass!
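The claim above can be checked directly in an interpreter. This is a minimal sketch; the `Car` class is a hypothetical placeholder used only for illustration.

```python
# Hypothetical example class, used only for illustration.
class Car:
    pass

# Ordinary values are instances of their classes.
print(type("hello"))            # <class 'str'>
print(type(42))                 # <class 'int'>

# Classes are objects too: they are instances of `type`.
print(type(Car))                # <class 'type'>
print(isinstance(Car, type))    # True
print(isinstance(int, type))    # True

# `type` is an instance of itself, and every class is also an `object`.
print(type(type) is type)       # True
print(isinstance(Car, object))  # True
```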
Imagine a Car Factory.
A Class is the engineering blueprint paper sitting on the manager's desk. You cannot drive a blueprint. It just explains *how* to build a car.
An Object (Instance) is the physical car that rolls off the assembly line. Because it was stamped from the blueprint, it possesses the guaranteed Attributes (`color="Red"`, `engine="V8"`) and the guaranteed Methods (`accelerate()`, `brake()`). You can stamp out 10,000 independent physical cars (Objects) from exactly 1 piece of paper (Class).

    # Scenario: Inheritance and Polymorphism in an ML Pipeline
    class BaseEstimator:
        def __init__(self, name):
            self.name = name

        def predict(self, x):
            raise NotImplementedError("Subclasses must implement this.")

    # Random Forest INHERITS from the Base Estimator
    class RandomForest(BaseEstimator):
        def __init__(self, name, trees):
            super().__init__(name)  # Run the parent's __init__ on this instance
            self.trees = trees

        def predict(self, x):  # Polymorphic Override
            return sum(x) / self.trees

    model = RandomForest("RF_Classifier", 100)
    print(model.predict([10, 20]))  # 0.3

| Code Line | Explanation |
|---|---|
| `class RandomForest(BaseEstimator):` | Python executes the class body and collects its names into a namespace dictionary. The class in the parentheses is recorded as a base (`RandomForest.__bases__`), and the lookup order through the parents (the MRO) is computed once, when the class is created. |
| `model = RandomForest(...)` | Calling the class first runs `__new__`, which allocates a new, empty instance. Python then looks up `__init__` (finding it on `RandomForest`) and calls it, passing the new instance in as the `self` argument. |
| `super().__init__(name)` | The `super()` proxy skips the current class, finds the next class in the MRO (`BaseEstimator`), and runs its `__init__` on the same instance (`self`). |
| `self.trees = trees` | The instance's internal dictionary (`__dict__`) gets a new key `"trees"` that references the integer object `100`. |
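The steps in the table can be spelled out by hand. This is a rough sketch of what calling the class does by default, assuming no custom `__new__` or metaclass is involved; the classes are repeated (without `predict`) so the snippet runs on its own.

```python
class BaseEstimator:
    def __init__(self, name):
        self.name = name


class RandomForest(BaseEstimator):
    def __init__(self, name, trees):
        super().__init__(name)
        self.trees = trees


# The parent is recorded on the class when the class statement runs.
print(RandomForest.__bases__ == (BaseEstimator,))  # True

# Step 1: allocate an uninitialized instance.
obj = RandomForest.__new__(RandomForest)
print(obj.__dict__)  # {}

# Step 2: initialize it, passing the new instance as `self`.
RandomForest.__init__(obj, "RF_Classifier", 100)
print(obj.__dict__)  # {'name': 'RF_Classifier', 'trees': 100}
```

In everyday code you would simply write `RandomForest("RF_Classifier", 100)`; the two-step form is only useful for seeing what happens in between.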
The `self` dictionary
When you type `model.train(data)`, how does the function know what data belongs to `model`?
Under the hood, Python treats that call as equivalent to calling the function on the class and passing the instance explicitly: `NeuralNetwork.train(model, data)` (assuming `model` is a `NeuralNetwork`). The name `self` is nothing more than a reference to the instance the method was called on. Most objects in Python (except specialized ones, such as instances of classes that define `__slots__` and many built-in types) carry a `__dict__` attribute that stores their instance attributes. When you type `self.layers = 50`, Python by default executes the equivalent of `self.__dict__["layers"] = 50`.
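A minimal sketch of both points, assuming a toy `NeuralNetwork` class whose `train` body is invented purely for illustration:

```python
class NeuralNetwork:
    def train(self, data):
        # Attribute assignment stores the value in the instance's __dict__.
        self.layers = 50
        return f"training on {len(data)} samples"


model = NeuralNetwork()
data = [1, 2, 3]

# The bound call and the explicit call through the class do the same thing.
bound_result = model.train(data)
explicit_result = NeuralNetwork.train(model, data)
print(bound_result == explicit_result)  # True

# Instance attributes live in __dict__, and writing to it creates attributes.
print(model.__dict__)  # {'layers': 50}
model.__dict__["epochs"] = 10
print(model.epochs)    # 10


# One of the "specialized" exceptions: __slots__ removes the per-instance dict.
class Slotted:
    __slots__ = ("layers",)


print(hasattr(Slotted(), "__dict__"))  # False
```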
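Unlike Java, which allows a class to extend only one parent class, Python supports Multiple Inheritance: `class AI(Robot, Computer):`.
If both `Robot` and `Computer` have a `calculate()` method, which one gets used? Python relies on the C3 Linearization Algorithm to compute the Method Resolution Order (MRO). C3 flattens the inheritance graph into a single ordered sequence in which every class comes before its parents, and parents keep the left-to-right order in which they were listed. The result is stored as a tuple of classes. When searching for a method, Python walks this tuple in order. The first class that defines a matching name wins, and the rest of the sequence is not consulted. (You can view it via `AI.__mro__`; here the order is `AI`, `Robot`, `Computer`, `object`, so `Robot.calculate()` is used.)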
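A short sketch of that lookup, with `calculate()` bodies made up for illustration:

```python
class Robot:
    def calculate(self):
        return "Robot.calculate"


class Computer:
    def calculate(self):
        return "Computer.calculate"


class AI(Robot, Computer):
    pass


# The MRO is a tuple of classes, searched in order.
print([cls.__name__ for cls in AI.__mro__])
# ['AI', 'Robot', 'Computer', 'object']

# Robot is listed first, so its method is found first.
print(AI().calculate())  # Robot.calculate


# Swapping the base order swaps the winner.
class AI2(Computer, Robot):
    pass


print(AI2().calculate())  # Computer.calculate
```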
The Class Variable Contamination Bug:

    class Employee:
        skills = []  # Class Variable (Danger!)

    e1 = Employee()
    e2 = Employee()
    e1.skills.append("Python")
    print(e2.skills)  # Prints ['Python']! e2 was infected!

Variables declared directly under `class` are glued to the Blueprint Paper, NOT the physical Cars. Both `e1.skills` and `e2.skills` resolve to the same list object stored on the class, so if Employee 1 writes his skills on the Blueprint Paper, Employee 2 (who also looks at the Blueprint) will see them! Fix: create mutable per-instance state inside `__init__` (`self.skills = []`), which binds a fresh list to each individual Car.
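A minimal sketch of the fix. The `company` attribute is a hypothetical addition, included only to show that a shared, immutable class variable is still fine.

```python
class Employee:
    company = "Acme"  # hypothetical shared constant: safe as a class variable

    def __init__(self):
        # Each instance gets its own list, stored in its own __dict__.
        self.skills = []


e1 = Employee()
e2 = Employee()
e1.skills.append("Python")

print(e1.skills)               # ['Python']
print(e2.skills)               # []
print(e1.skills is e2.skills)  # False: two separate list objects
```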
DataClasses (PEP 557):
If you are building an AI Training Config object that strictly holds data and no deep functions, writing an `__init__` with 20 `self.x = x` lines is excruciating. Python 3.7 introduced the `@dataclass` decorator in the standard `dataclasses` module. It reads the type-annotated fields in the class body and, at the moment the class is created, generates `__init__`, `__repr__`, and `__eq__` for you automatically. It is an ordinary class decorator written in Python, not a metaclass.
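As an illustration, here is a sketch of such a config object. The class name `TrainingConfig` and its fields are assumptions chosen for the example, not part of any particular library.

```python
from dataclasses import dataclass, field


@dataclass
class TrainingConfig:
    # Hypothetical fields; each annotation becomes an __init__ parameter.
    learning_rate: float = 0.001
    batch_size: int = 32
    epochs: int = 10
    # Mutable defaults need default_factory so instances do not share a list.
    tags: list = field(default_factory=list)


a = TrainingConfig(batch_size=64)
b = TrainingConfig(batch_size=64)

print(a)       # generated __repr__
# TrainingConfig(learning_rate=0.001, batch_size=64, epochs=10, tags=[])
print(a == b)  # generated __eq__ compares field by field: True

a.tags.append("baseline")
print(b.tags)  # []
```

Using `field(default_factory=list)` also sidesteps the shared-list problem that a plain `tags = []` class variable would cause; `@dataclass` rejects a bare list default with a `ValueError` for exactly that reason.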
Mistake: Accessing "Private" variables from the outside.
In Java, `private float salary;` structurally prevents outside code from reading the salary. In Python, typing `self.__salary = 100` does NOT make the variable private. Python just applies Name Mangling: inside the class body, the name is silently rewritten to `_ClassName__salary`, which prevents a subclass from accidentally overriding it. Anyone can bypass this by simply using the mangled name. Python relies on developer trust (the single underscore convention `self._salary` signals "please do not touch this") and has no enforced access control.
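The mangling is easy to observe. This is a small sketch using a hypothetical `Employee` class with made-up attribute values:

```python
class Employee:
    def __init__(self):
        self.__salary = 100  # stored under the mangled name _Employee__salary
        self._bonus = 10     # convention only: "internal, please do not touch"


e = Employee()

# The unmangled name does not exist on the instance.
try:
    e.__salary
except AttributeError:
    print("no attribute named __salary")

# The mangled name is fully accessible from outside.
print(e._Employee__salary)  # 100
print(e.__dict__)           # {'_Employee__salary': 100, '_bonus': 10}
```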
How does Python construct the `class` blueprint itself?
When Python executes a `class` statement, it runs the class body to fill a namespace Dictionary (the methods and class variables), takes the class name as a string, gathers the Parent classes into a Tuple, and hands all 3 components to the metaclass, which by default is `type(name, bases, dict)`. The Metaclass is the factory that creates the Class blueprints themselves. By replacing Python's default Metaclass with a custom one, framework authors can inspect or rewrite a class while it is being created. Frameworks such as Django ORM and SQLAlchemy build on this kind of class-creation hook to map Python class variables to database columns as soon as the class is defined.
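Both halves of this can be sketched in a few lines. The first part builds a class by calling `type` directly; the second defines a toy metaclass. `ModelMeta`, `User`, and the `columns` attribute are hypothetical names for illustration and do not reflect how Django or SQLAlchemy are actually implemented.

```python
def predict(self, x):
    return sum(x) / self.trees


# Roughly what a `class RandomForest:` statement does behind the scenes.
RandomForest = type("RandomForest", (object,), {"trees": 100, "predict": predict})
print(RandomForest().predict([10, 20]))  # 0.3


# A toy metaclass that records plain (non-callable, non-underscore) attributes.
class ModelMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.columns = [
            key for key, value in namespace.items()
            if not key.startswith("_") and not callable(value)
        ]
        return cls


class User(metaclass=ModelMeta):
    id = "INTEGER"   # hypothetical column declarations
    email = "TEXT"


print(User.columns)             # ['id', 'email']
print(type(User) is ModelMeta)  # True
```

The metaclass runs once, when the `class User` statement finishes executing, so `User.columns` is available before any instance is created.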