Modules & Packages
Understand sys.path architecture, the Python Namespace
Dictionary, and Singleton Caching.
In Python, a Module is simply a text file ending in .py
containing executable Python code. A Package is a folder on your
hard drive containing multiple module files and, traditionally, a special __init__.py file.
The import statement is one of the most mechanically complex operations in
Python. It does not just paste code from one file into another (like C++
#include). It locates the file, executes it from top to bottom in its own fresh
namespace, bundles all the resulting objects into a dictionary (the module's __dict__),
and hands you a module object wrapping that dictionary.
Imagine your main Python script is a mechanic in a garage.
Typing import math means your mechanic calls a specialized tool vendor on the
phone. The vendor sends over a sealed toolbox. The toolbox is labeled "math". To use a tool
inside it, the mechanic must ask the toolbox for it (math.sqrt()).
Typing from math import sqrt means the mechanic opens the vendor's
toolbox, pulls out the exact `sqrt` drill, places it directly on his personal workbench, and
sends the toolbox to the back room (it stays cached in sys.modules; its name just never lands
on the bench). He can now use sqrt() directly without the `math.` prefix.
# Scenario: importing a heavy deep-learning library (assumes PyTorch is installed)
import sys
# 1. Verify where Python is looking for files
print(sys.path)
# 2. Check the global module cache
import torch
print("torch" in sys.modules) # Returns True
# 3. Memory pointer equivalence
import torch as t
print(t is torch) # Returns True (both names point to the same module object)
| Code Line | Explanation |
|---|---|
| `import torch` | 1. Python checks `sys.modules` to see if PyTorch is already loaded. If not, it scans the folder paths listed in `sys.path` for a package named `torch` (a folder containing an `__init__.py` file). 2. It compiles that file into bytecode and executes it entirely. 3. It creates a module object, caches it in `sys.modules`, and binds the name `torch` to it. |
| `sys.path` | A list of strings holding literal folder paths on your OS (like `C:\Users\Appdata\Lib\site-packages`). This is exactly why you get `ModuleNotFoundError: No module named 'X'`: if the library isn't physically located in one of these paths, Python is totally blind to its existence. |
Input: from math import pi
Transformation: Python loads the `math` module in full (`math` is a compiled C extension, so there is no `math.py` to run; for a pure-Python module, the whole file would execute top to bottom). It then searches the resulting module namespace dictionary for the key `"pi"`, grabs the memory pointer to that float object, and copies ONLY that pointer into your current script's global dictionary.
Output State: The variable `pi` (3.141592653589793) is available in your script's global scope without any prefix.
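A quick, runnable check of that behavior (a sketch, meant to run as a standalone script): the full module gets loaded and cached, but only the single name crosses over.

```python
import sys

from math import pi

print(pi)                      # 3.141592653589793
print("math" in sys.modules)   # True: the full module WAS loaded and cached
print("math" in globals())     # False: only the name `pi` was copied over
```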
If you have a project with 50 different python scripts, and every single script has
import numpy at the top, does Python compile and load the massive NumPy library
50 separate times into your RAM?
No. The very first time import numpy is executed, Python loads it into RAM and
saves a pointer inside a process-wide dictionary called
sys.modules. For the remaining 49 scripts, Python intercepts
the `import` statement, sees that NumPy is already registered in `sys.modules`, and instantly
hands the script a pointer to the existing memory structure. Modules are inherently
Singletons.
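You can verify the singleton behavior directly. The sketch below uses the standard-library `json` module; `MY_FLAG` is a made-up attribute, added purely to show that every reference sees the same object.

```python
import sys
import json

import json as j        # second import: intercepted and served from the cache
print(j is json)        # True: both names point at one module object

# Because the module is a singleton, a mutation is visible everywhere.
json.MY_FLAG = "shared"              # hypothetical attribute, for demonstration
print(j.MY_FLAG)                     # shared
print(sys.modules["json"] is json)   # True: the cache holds that same object
```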
Memory layout of the Global Namespace after an import:
Current Script Dictionary:
globals() -> {"tf": [Pointer to TensorFlow Module Object]}
TensorFlow Module Object:
tf.__dict__ -> {"keras": [Pointer], "constant": [Pointer]}
A module structurally behaves like a 1-level Dictionary mapping string names to memory
pointers. When you type math.log, it is fundamentally executing the exact same
C-level lookup algorithm as math_dict["log"].
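This equivalence is easy to demonstrate: a module's namespace is exposed as a plain dict via `__dict__` (or `vars()`), and attribute access retrieves the very same pointers.

```python
import math

# Attribute access and a raw dictionary lookup retrieve the same pointer
print(math.log is math.__dict__["log"])   # True
print(math.log is vars(math)["log"])      # True: vars() returns __dict__
print(type(math.__dict__) is dict)        # True: it really is a plain dict
```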
The import statement itself does not return a value you can assign (x = import math
is a SyntaxError). It modifies the namespace directly.
If you need to programmatically import a module whose name lives in a string variable
(lib = "math"), use the standard-library helper:
module_object = importlib.import_module(lib). The lower-level built-in
__import__(lib) also works, but is easier to misuse.
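A minimal sketch of both spellings; they resolve through the same `sys.modules` cache, so they hand back the identical object.

```python
import importlib

lib = "math"

# importlib.import_module is the recommended, readable spelling
module_object = importlib.import_module(lib)
print(module_object.sqrt(25))        # 5.0

# The low-level builtin works too, but returns the TOP-LEVEL package for
# dotted names, which trips people up. Here both hit the same cache entry.
legacy = __import__(lib)
print(legacy is module_object)       # True
```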
Circular Imports:
File A imports File B. But File B imports File A! What happens?
When File A executes import B, Python pauses File A and starts running File B.
Inside File B, it hits import A. Python checks `sys.modules`, sees that File A
is already registered (even though it is only half-executed), and hands File B
the half-finished File A module object. If File B then touches a name in File A that
hasn't been defined yet, the import blows up: attribute access raises `AttributeError`,
and `from A import x` raises `ImportError: cannot import name 'x'`.
Fix: Restructure your codebase or move the import statement inside a
function definition instead of the top of the file.
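The deferred-import fix can be reproduced end to end. The sketch below generates two throwaway files (`a_mod` and `b_mod` are made-up names) in a temp folder: `a_mod` imports `b_mod` at the top, while `b_mod` imports `a_mod` only inside a function, so the cycle is resolved after `a_mod` has finished loading.

```python
import os
import sys
import tempfile

pkg_dir = tempfile.mkdtemp()

# a_mod imports b_mod eagerly, at the top of the file
with open(os.path.join(pkg_dir, "a_mod.py"), "w") as f:
    f.write("import b_mod\nVALUE = 42\n")

# b_mod imports a_mod lazily, inside the function body
with open(os.path.join(pkg_dir, "b_mod.py"), "w") as f:
    f.write(
        "def get_value():\n"
        "    import a_mod  # deferred: runs only when called, cycle is safe\n"
        "    return a_mod.VALUE\n"
    )

sys.path.insert(0, pkg_dir)
import a_mod
import b_mod

print(b_mod.get_value())   # 42
```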
Relative Imports (.):
Instead of import my_package.database, you can use relative dots:
from . import database. A single dot means "look for the database module
inside the same package as the current module". Two dots
(..) mean "go up one parent package". Note that relative imports only work
inside a package; a script executed directly (python app.py) cannot use them.
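The following sketch builds a throwaway package on disk (`my_package`, `database`, and `app` are made-up names) to show a single-dot relative import resolving within that package.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, "my_package")
os.makedirs(pkg)

# The marker file that turns the folder into a regular package
open(os.path.join(pkg, "__init__.py"), "w").close()

with open(os.path.join(pkg, "database.py"), "w") as f:
    f.write("def connect():\n    return 'connected'\n")

with open(os.path.join(pkg, "app.py"), "w") as f:
    f.write("from . import database  # '.' = this same package\n\n"
            "def run():\n    return database.connect()\n")

sys.path.insert(0, root)
from my_package import app

print(app.run())   # connected
```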
Mistake: Star imports.
from sklearn.metrics import * ❌ (Disastrous)
Why is this bad?: This dumps every public name out of the massive Scikit-Learn metrics module straight into your script's personal namespace dictionary (everything listed in the module's `__all__`, or, failing that, every name not starting with an underscore): hundreds of pointers at once. It overwrites your own custom variables without warning (Namespace Pollution) and makes it impossible for linters and readers to know where a function came from. Never use star imports in production.
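A two-line demonstration of the overwrite, using the standard `math` module: our own `log` function is silently replaced by `math.log`.

```python
def log(message):
    return f"[LOG] {message}"

print(log("starting up"))   # [LOG] starting up

from math import *          # silently rebinds `log` to math.log

print(log(1.0))             # 0.0 -- our function is gone, with no warning
```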
import math vs from math import sqrt.
Which is faster to execute? In microbenchmarks, sqrt(25) is measurably faster than
math.sqrt(25) (the exact margin varies by machine and Python version). Why? Because
math.sqrt requires two dictionary lookups: first, check the Global scope to find `math`;
second, check the `math` dictionary to find `sqrt`. By using from math import sqrt,
the `sqrt` pointer is copied directly into your script's own namespace, requiring only
one lookup!
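A benchmark sketch using the standard `timeit` module; absolute timings depend entirely on your machine and Python build, so no specific speed-up is asserted here.

```python
import timeit
import math
from math import sqrt

# Time one million-ish calls of each spelling (numbers vary per machine)
attr_lookup = timeit.timeit("math.sqrt(25)", globals=globals(), number=500_000)
one_lookup = timeit.timeit("sqrt(25)", globals=globals(), number=500_000)

print(f"math.sqrt: {attr_lookup:.3f}s   sqrt: {one_lookup:.3f}s")
print(math.sqrt(25) == sqrt(25) == 5.0)   # True: identical result either way
```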
Challenge: You downloaded a file named `keras.py` from GitHub and put it in your Downloads folder. Write the code to dynamically force Python to import it even though Python doesn't know about your Downloads folder.
Expected Answer: Because module searching relies entirely on the `sys.path` list, you simply append your folder to the list first!
import sys
sys.path.append("C:/Users/DELL/Downloads")
import keras # Python will now successfully find it!
The role of __init__.py:
Prior to Python 3.3, if a folder didn't contain an `__init__.py` file (even an empty one), Python flatly refused to recognize it as an importable Package. Why? The marker file tells the import machinery which folders are deliberate packages, so an ordinary directory that happens to share a name with a module you want cannot accidentally shadow it. (Python only ever scans the folders listed in `sys.path`; it never crawls your whole hard drive.) Python 3.3 introduced Implicit Namespace Packages (PEP 420), which allow imports without `__init__.py` and let one logical package be split across multiple directories, but creating the file remains the standard for regular packages.
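The namespace-package behavior can be confirmed with a throwaway folder (`ns_pkg` and `tools` are made-up names) that deliberately has NO `__init__.py`; on Python 3.3+ it still imports.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "ns_pkg"))   # note: no __init__.py created

with open(os.path.join(root, "ns_pkg", "tools.py"), "w") as f:
    f.write("ANSWER = 7\n")

sys.path.insert(0, root)
from ns_pkg import tools   # implicit namespace package (PEP 420)

print(tools.ANSWER)   # 7
```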
Prior to Python 3.3, if a folder didn't have a blank `__init__.py` file in it, Python flatly refused to recognize it as an importable Package. Why? It was a massive security and performance protocol. Without the marker file, typing `import math` would force Python to endlessly scan every single folder on your entire hard drive looking for a `math.py` file. The `__init__` file acts as a massive neon sign saying "I am official Python code, scan here". Python 3.3 introduced Implicit Namespace Packages, allowing imports without `__init__.py` for advanced directory splicing, but creating the file remains the definitive architecture standard.