When you run a Python script, you are running CPython — the reference implementation of the language written in C. Understanding what CPython is, and how it differs from Python the language specification, is one of the most clarifying things you can do as a developer working at any level of the stack.
- What CPython is and how it differs from Python the spec
- How source code becomes bytecode in 5 stages
- Frame objects, RESUME, and co_linetable internals
- How the GIL works and when it isn't enough
- __pycache__ invalidation and .pyc injection attacks
- CPython vs PyPy, Jython, MicroPython, GraalPy
- Free-threaded builds and the CPython JIT (3.13–3.15)
- Security: SAST, marshal ACE, TOCTOU race conditions
What's in this Python Tutorial
- Python the Language vs. CPython the Implementation
- How CPython Turns Source Code into Running Code
- The Global Interpreter Lock
- CPython vs. Other Python Implementations
- CPython Security Considerations
- Recent CPython Milestones (3.11–3.15)
- Why CPython Dominates Despite Faster Alternatives
- Key Takeaways
- Frequently Asked Questions
Python is both a language and an ecosystem, and separating those two things is essential once you start asking questions like "why is Python slow for CPU-heavy work?" or "what is the GIL?" or "how does memory management work?" The answers all point to CPython, not to the language specification itself. This tutorial covers the fundamentals: what CPython is, how it processes your code, what the GIL does, and how CPython compares to alternatives such as PyPy and Jython.
Foundation
Python the Language vs. CPython the Implementation
Python, as a language, is a specification. It defines syntax rules, data types, semantics, and standard library behavior. That specification does not mandate how those things must be implemented internally — it only describes what results the implementation must produce. CPython is one implementation of that specification, and it is the original one. According to the official python.org download page, Python was created in the early 1990s by Guido van Rossum at Stichting Mathematisch Centrum in the Netherlands, and CPython remains the canonical benchmark for language behavior today.
Because CPython was the first implementation and remains the reference, new language features almost always appear in CPython first. The latest stable release as of April 2026 is CPython 3.14, with 3.14.1 released December 2, 2025, and 3.15 currently in late alpha development (3.15.0a8, released April 7, 2026) targeting a final release in October 2026 per PEP 790. When a PEP (Python Enhancement Proposal) is accepted and merged, CPython is the implementation where that change lives first.
CPython-Specific Behaviors You May Already Be Relying On
Several behaviors that feel like "Python" are actually CPython implementation details not guaranteed by the language specification. Knowing which ones you are leaning on prevents subtle bugs when code must run under PyPy, Jython, or GraalPy — or when CPython's own internals shift between major versions.
Reference counting and immediate destruction. CPython uses reference counting as its primary garbage collection mechanism. Every Python object carries a reference count field (ob_refcnt, declared in Include/object.h). When that field drops to zero, the object is deallocated immediately — in the same thread, on the same call stack. This means a file object that simply goes out of scope is closed the moment its last reference disappears, with no garbage collector involved (a with statement remains the portable way to guarantee closure, since it calls __exit__ explicitly on every implementation). PyPy and Jython use tracing collectors that defer reclamation, so code that depends on timely finalization is relying on CPython-specific behavior, not a Python language guarantee.
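The timing difference is easy to observe. The sketch below (the Tracker class is an illustrative name of mine, and the ordering it demonstrates is CPython-specific) shows __del__ running synchronously, at the exact statement where the last reference disappears:

```python
class Tracker:
    """Records the moment the interpreter destroys the instance."""
    def __init__(self, log):
        self.log = log

    def __del__(self):
        self.log.append("destroyed")

events = []
t = Tracker(events)
del t                        # refcount hits zero -> __del__ runs right here
events.append("after del")
print(events)                # ['destroyed', 'after del'] on CPython
```

On a tracing collector like PyPy's, "destroyed" may appear much later, or only when the collector happens to run — which is exactly why finalizer-dependent code is not portable.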
String interning. CPython automatically interns short strings that look like identifiers — strings composed of ASCII letters, digits, and underscores below a certain length. Two such strings with the same value will often share the same memory address, making is comparisons return True. This is a CPython optimization, not something the language specifies. Code that uses is to compare string values is technically wrong regardless, but it will silently misbehave more frequently on non-CPython runtimes.
Dictionary insertion order. CPython 3.6 made dict insertion-order-preserving as an implementation detail; the language specification formally adopted it as a guarantee in 3.7. This one crossed from implementation detail to spec — but it is a useful example of how CPython ships a feature first and the spec follows.
Integer caching. CPython pre-allocates integer objects for the range −5 through 256. Any expression producing a value in that range returns the same cached object, so a = 256; b = 256; a is b returns True. For values outside that range, fresh objects are allocated. This is an optimization baked into CPython's Objects/longobject.c, not a language feature, and it is a common source of confusion in code reviews.
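You can see the cache boundary directly, provided the values are produced at runtime — literals in the same code object may be merged or folded by the compiler, which muddies the demonstration. The helper name below is mine:

```python
def successor(n):
    # Computed at runtime, so compile-time constant folding cannot interfere
    return n + 1

a = successor(255)   # 256 — the top of the cached range
b = successor(255)
print(a is b)        # True: both names point at the cached 256 singleton

c = successor(256)   # 257 — outside the range, freshly allocated
d = successor(256)
print(c is d)        # False on CPython: distinct objects with equal values
```

As with string interning, code should never rely on this: use == for value comparison and treat identity of small ints as an accident of the implementation.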
"CPython's memory model is not part of the Python specification."
— PEP 703 (Making the GIL Optional in CPython), Python Software Foundation
This distinction matters because PEP 703 was accepted precisely because the GIL, reference counting semantics, and threading model are all CPython-specific implementation choices — not language requirements. Removing the GIL does not break Python; it changes CPython.
If your code relies on timely file closure, deterministic object destruction, or string interning behavior — you are relying on CPython, not on Python. Those behaviors will differ under PyPy, Jython, or GraalPy.
The name "CPython" uses the "C" prefix to distinguish it from other implementations, not because Python is written entirely in C. The CPython interpreter itself is largely written in C, but a substantial portion of the standard library is written in Python.
If you want to confirm which implementation you are running, check sys.implementation.name in the Python REPL. On a standard install, it returns 'cpython'. On PyPy it returns 'pypy'.
import sys
print(sys.implementation.name) # 'cpython'
print(sys.version) # '3.14.0 (main, Oct 7 2025, ...) [GCC 14.2.0]'
print(sys.implementation.version) # sys.version_info(major=3, minor=14, micro=0, ...)
print(sys.implementation.cache_tag) # 'cpython-314' — used as the __pycache__ prefix
Compilation Pipeline
How CPython Turns Source Code into Running Code
When you execute a .py file, CPython does not translate your source code directly into machine code the way a C or Rust compiler would. Instead, it goes through a multi-stage pipeline that ends with a bytecode interpreter — a virtual machine — executing the compiled instructions one at a time.
The pipeline has five main stages. First, the tokenizer breaks the raw source text into a stream of tokens — keywords, identifiers, operators, literals. Second, the PEG parser (introduced in Python 3.9 via PEP 617) converts those tokens into an Abstract Syntax Tree, which is a tree representation of the program's structure. Third, the compiler traverses the AST, applies optimizations via a control flow graph, and emits bytecode — a sequence of instructions for the CPython virtual machine. Fourth, those instructions are stored in a code object. Finally, the evaluation loop fetches and executes each instruction.
- Tokenization. The tokenizer reads raw source text and breaks it into a stream of tokens — keywords, identifiers, operators, and literals — discarding whitespace and comments. This is the first contact CPython has with your source file.
- PEG Parsing and AST Construction. The PEG parser (introduced in Python 3.9 via PEP 617) consumes the token stream and produces an Abstract Syntax Tree — a tree-structured representation of the program's grammar. Syntax errors are detected at this stage.
- Compilation via Control Flow Graph. The compiler traverses the AST, builds a control flow graph (CFG), applies optimizations such as constant folding and dead-code elimination, and emits a sequence of bytecode instructions stored in a code object.
- Bytecode Caching. The compiled bytecode is serialized and cached as a .pyc file inside __pycache__. On subsequent runs, CPython checks the magic number and either the source mtime or a hash of the source; if nothing changed, it loads the cached bytecode directly, skipping steps 1–3.
- Evaluation Loop (Virtual Machine). The CPython VM fetches each bytecode instruction in turn and executes it. From Python 3.11, the adaptive specialization engine (PEP 659) can replace instructions with faster type-specialized variants at runtime. The optional JIT (PEP 744, enabled via PYTHON_JIT=1) can compile hot micro-op traces to native machine code.
CPython never compiles your source directly to machine code in the default build. Every .py file goes through tokenization, AST construction, and bytecode compilation first, and what actually runs on your CPU is the C code of the evaluation loop, interpreting bytecode one instruction at a time. The JIT (Python 3.13+) is the first path that can skip that loop, and even then only for hot traces.
Inspecting the AST Yourself
The Python standard library exposes the Abstract Syntax Tree through the ast module, giving you direct access to the tree CPython builds before compilation. This is not just an educational tool — linters like Flake8, formatters like Black, and type checkers like mypy all operate on the AST.
import ast
source = """
def greet(name: str) -> str:
    return f"Hello, {name}"
"""
tree = ast.parse(source)
print(ast.dump(tree, indent=2))
The output of ast.dump() is the complete tree structure CPython uses to compile that function. You can walk it programmatically with ast.walk(tree) or subclass ast.NodeVisitor to build your own static analysis. This is the same mechanism CPython uses internally to detect syntax errors and perform constant folding before any bytecode is emitted.
CPython's compiler performs constant folding during the CFG optimization pass. An expression like 2 ** 32 in your source is computed at compile time and stored as the integer 4294967296 in the bytecode. You can confirm this with dis: the emitted instruction will be LOAD_CONST 4294967296 rather than two LOAD_CONST instructions followed by a BINARY_OP. This optimization has a size guard — excessively large constants are not folded to avoid bloating .pyc files.
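You can confirm the folding without reading a full disassembly by inspecting the constant pool directly — the pre-computed result appears in co_consts. The function name here is mine:

```python
import dis

def shifted():
    return 2 ** 32   # folded at compile time by CPython's optimizer

# The folded value sits in the constant pool; no runtime BINARY_OP computes it
print(4294967296 in shifted.__code__.co_consts)  # True
dis.dis(shifted)  # the disassembly shows LOAD_CONST with 4294967296
```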
Bytecode and the dis Module
The bytecode CPython produces is not machine code. It is a compact, platform-independent instruction set designed for CPython's virtual machine. These instructions are stored in .pyc files inside the __pycache__ directory, allowing CPython to skip re-compilation on subsequent runs when the source has not changed.
Each .pyc file begins with a four-byte magic number that encodes the CPython version and the bytecode format version. When CPython loads a .pyc, it checks this magic number first; a mismatch forces full recompilation. This is why .pyc files from one Python version cannot be used by another — and why, as noted in the Python 3.14 release notes, bumping the bytecode format during a release candidate forces all cached files to be regenerated. The magic number is defined in Lib/importlib/_bootstrap_external.py and changes whenever the bytecode instruction set changes in a backward-incompatible way.
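If you want to see the magic number without parsing a .pyc by hand, importlib exposes the running interpreter's value. The exact bytes differ per CPython version, but the four-byte layout with a trailing CRLF is stable:

```python
import importlib.util

magic = importlib.util.MAGIC_NUMBER
print(magic)        # version-specific 4-byte value
print(len(magic))   # 4
print(magic[-2:])   # b'\r\n' — the CRLF tail catches text-mode file corruption
```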
You can inspect the bytecode for any function using the dis module from the standard library. This is a valuable diagnostic tool for understanding exactly what the interpreter is doing.
import dis
def add_two(x, y):
    return x + y
dis.dis(add_two)
Running the above produces output similar to the following (exact offsets and opcodes vary by Python version):
2 RESUME 0
3 LOAD_FAST 0 (x)
LOAD_FAST 1 (y)
BINARY_OP 0 (+)
RETURN_VALUE
Each line is one bytecode instruction. LOAD_FAST pushes a local variable onto the evaluation stack. BINARY_OP pops the two values, adds them, and pushes the result. RETURN_VALUE pops the top of the stack and returns it to the caller. Understanding bytecode is not required for everyday Python development, but it gives you precise visibility into what your code is actually doing at the interpreter level.
Code Objects: What dis Is Actually Showing You
Every function, class body, and module in CPython is represented at runtime as a code object — an instance of types.CodeType. Code objects are immutable and carry all the information the evaluation loop needs: the bytecode itself (exposed as co_code; since 3.11 that bytes view is materialized lazily from the interpreter's internal adaptive copy), the constant pool (co_consts), variable names (co_varnames, co_freevars, co_cellvars), source-location data (co_linetable, which replaced co_lnotab in 3.11), and metadata like argument count and flags. When you call dis.dis(add_two), you are printing a human-readable disassembly of the add_two.__code__ code object.
def add_two(x, y):
    return x + y

co = add_two.__code__
print(co.co_varnames)       # ('x', 'y')
print(co.co_argcount)       # 2
print(co.co_consts)         # (None,)
print(co.co_filename)       # path to the source file
print(co.co_firstlineno)    # line number where the function starts
print(co.co_qualname)       # 'add_two' — qualified name, new in Python 3.11
print(co.co_stacksize)      # 2 — max eval stack depth needed by this code object
print(co.co_exceptiontable) # b'' — exception handler table, new in Python 3.11
# In 3.11+, co_code is computed lazily from the interpreter's adaptive
# bytecode; co_linetable is the compact source-location table:
print(co.co_linetable)      # bytes — line and column info per instruction
Code objects are also what marshal serializes when CPython writes .pyc files — they are the unit of compilation and the unit of caching. Nested functions each get their own code object, nested inside the parent's co_consts tuple. Through Python 3.11, comprehensions did too: a list comprehension [x for x in items] compiled to a separate code object with its own local scope, which is why the iteration variable does not leak into the surrounding scope in Python 3. Since Python 3.12, PEP 709 inlines list, dict, and set comprehensions into the enclosing code object, while preserving that non-leaking scoping behavior.
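Nested code objects are visible in the parent's constant pool. The sketch below finds them with an ad-hoc hasattr filter (my own illustrative logic, not a CPython API):

```python
def outer():
    def inner():
        return 42
    return inner

# The nested function's code object is stored inside outer's co_consts tuple
nested = [c for c in outer.__code__.co_consts if hasattr(c, "co_name")]
print([c.co_name for c in nested])   # ['inner']
```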
Starting with CPython 3.11, the interpreter introduced adaptive specialization (PEP 659): bytecode instructions can be replaced at runtime with faster, type-specialized variants when the interpreter observes a consistent pattern of types. This is sometimes called the "specializing adaptive interpreter" (SAI) and is part of the Faster CPython project. In Python 3.14, adaptive specialization was also enabled for free-threaded mode, contributing to the reduced single-threaded performance penalty in no-GIL builds.
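On 3.11+ you can watch specialization happen through dis's adaptive=True flag: after enough calls with a consistent type pattern, generic instructions are rewritten into specialized variants. The function and warm-up count below are my own choices, and the exact specialized opcode names vary by version:

```python
import dis
import sys

def add(a, b):
    return a + b

# Warm the function with a consistent (int, int) pattern so the
# adaptive interpreter has something to specialize on.
for _ in range(100):
    add(1, 2)

if sys.version_info >= (3, 11):
    # adaptive=True shows the quickened instructions actually in use;
    # expect something like BINARY_OP_ADD_INT instead of plain BINARY_OP.
    dis.dis(add, adaptive=True)
else:
    dis.dis(add)  # pre-3.11 interpreters have no adaptive instructions
```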
Frame Objects and How Function Calls Actually Work
Every time CPython calls a Python function, it creates an execution frame — a data structure that holds the local variable array, the evaluation stack, a reference to the code object, the current instruction pointer, and bookkeeping for exception handling. Understanding frames is essential for making sense of profilers, debuggers, tracebacks, generators, and coroutines, all of which manipulate or inspect frames directly.
Before Python 3.11, every frame was a heap-allocated PyFrameObject. Heap allocation has non-trivial cost: it involves a memory allocator call, cache pressure from the new allocation, and eventually a garbage collection cycle to reclaim it. For short-lived functions called in tight loops — which is a very common pattern — this was a consistent source of overhead. CPython 3.11 introduced _PyInterpreterFrame, an inlined frame structure that lives in a contiguous per-thread frame stack managed by the interpreter rather than on the general heap. The Python-visible PyFrameObject still exists but is created lazily — only when something actually needs to inspect it, such as a debugger, a profiler hook, or code that calls sys._getframe(). For most ordinary function calls, the heap-allocated frame object is never created at all.
You can observe frames directly in Python:
import sys

def inner():
    frame = sys._getframe()       # the current frame (_PyInterpreterFrame internally)
    caller = sys._getframe(1)     # the caller's frame (one level up the call stack)
    print(frame.f_code.co_name)   # 'inner'
    print(caller.f_code.co_name)  # 'outer'
    print(frame.f_lineno)         # current line number being executed
    # f_locals behaviour changed in Python 3.13:
    # - 3.12 and earlier: returns a plain dict snapshot of local variables
    # - 3.13+: returns a FrameLocalsProxy — a live view that stays in sync
    #   with the fast-locals array; mutations propagate back to the frame.
    # Use frame.f_locals.items() to iterate; dict(frame.f_locals) to snapshot.
    print(type(frame.f_locals))   # <class 'dict'> on 3.12, FrameLocalsProxy on 3.13+
    # Walk the full call stack using f_back
    f = frame
    while f is not None:
        print(f.f_code.co_name, f.f_lineno)
        f = f.f_back

def outer():
    inner()

outer()
Generators and coroutines work by suspending a frame mid-execution: when a generator hits yield, its _PyInterpreterFrame is moved to the heap and attached to the generator object, preserving the entire execution state — local variables, evaluation stack, and instruction pointer. When the generator is resumed, the frame is placed back on the C stack. This is how async/await achieves its non-blocking behaviour: an awaited coroutine suspends its frame and returns control to the event loop without blocking the thread.
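You can poke at a suspended frame through the generator's gi_frame attribute — the local variables and the paused position survive across suspensions. The generator below is my own minimal example:

```python
def counter():
    x = 1
    yield x
    x = 2
    yield x

g = counter()
next(g)                       # run to the first yield, then suspend
frame = g.gi_frame            # the suspended frame, preserved on the generator
print(frame.f_code.co_name)   # 'counter'
print(frame.f_locals["x"])    # 1 — locals survive suspension
next(g)                       # resume: the frame goes back on the stack
print(frame.f_locals["x"])    # 2 — state advanced during the resumed slice
```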
If you run dis.dis() on any function in Python 3.11 or later, the very first instruction is always RESUME 0. This is not a no-op. RESUME does two things: it checks whether a tracing hook is active (e.g., a debugger or profiler set via sys.settrace() or sys.setprofile()) and branches to the tracing path only when needed, avoiding that check on every other instruction; and it handles the case where a generator or coroutine frame is being re-entered after a yield or await, restoring the suspended execution context correctly. The argument to RESUME (0, 1, 2, or 3) encodes which re-entry scenario applies. This consolidation replaced several scattered entry-point checks that existed in older CPython versions.
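A quick check, version-guarded since RESUME only exists from 3.11 onward:

```python
import dis
import sys

def noop():
    pass

if sys.version_info >= (3, 11):
    # The first instruction of any 3.11+ code object is RESUME
    first = next(iter(dis.Bytecode(noop)))
    print(first.opname, first.arg)   # RESUME 0
```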
The co_lnotab to co_linetable Change (Python 3.11+)
If you inspect code objects on Python 3.10 and earlier, line-number information lives in co_lnotab — a compact bytes table mapping bytecode offsets to source lines (deprecated since 3.10, per PEP 626). Python 3.11 replaced it with co_linetable, a compressed byte sequence that encodes richer location data — line number, end line, and column offsets for every instruction — far more efficiently than the old format. The new co_positions() method exposes that data down to individual columns, which is what powers the precise error messages with caret indicators introduced in 3.11. The bytecode itself is still available as co_code, though in 3.11+ that attribute is computed lazily from the interpreter's internal adaptive copy rather than stored as a plain field.
def greet(name):
    return f"Hello, {name}"

co = greet.__code__
# Python 3.11+: co_code is computed lazily from the adaptive bytecode;
# co_linetable stores the compact source-location table
print(type(co.co_code))  # <class 'bytes'>
print(co.co_stacksize)   # max eval stack depth needed
print(co.co_flags)       # bitmask: generator, coroutine, varargs, etc.
# Fine-grained source positions (3.11+): column-accurate locations per instruction
for pos in co.co_positions():
    print(pos)  # (line, end_line, col_offset, end_col_offset) or None
How __pycache__ Invalidation Works
When CPython loads a .py file, it checks whether a corresponding .pyc exists in __pycache__. If it does, CPython reads the 16-byte header: four bytes of magic number (encoding the CPython version and bytecode format version), four bytes of bit flags, and then either the source modification timestamp and file size (the default) or a hash of the source file. A mismatch on any field causes CPython to discard the cached bytecode and recompile from source.
The default mtime-based invalidation is fast but has a known edge case: if you modify a source file and the filesystem's mtime resolution rounds the new timestamp to the same value as the old one (common on FAT32 and some network filesystems with 2-second granularity), CPython will incorrectly reuse stale bytecode. To avoid this, Python 3.7 added hash-based .pyc files (PEP 552): when you compile with py_compile.compile(source, invalidation_mode=PycInvalidationMode.CHECKED_HASH), CPython stores an 8-byte hash of the source — computed with the interpreter's internal SipHash, exposed as importlib.util.source_hash() — instead of the mtime, and recomputes it on every load. The --check-hash-based-pycs always flag to the interpreter forces this for all cached files. There is also an UNCHECKED_HASH mode for deployment environments where you want to skip validation entirely and trust that the .pyc files are correct.
import py_compile, struct, sys, pathlib, importlib.util

# Compile with hash-based invalidation (avoids mtime rounding edge cases on FAT32 / NFS)
py_compile.compile(
    'my_module.py',
    invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH
)

# The .pyc filename uses sys.implementation.cache_tag as its prefix
tag = sys.implementation.cache_tag  # e.g. 'cpython-314'
pyc = pathlib.Path(f'__pycache__/my_module.{tag}.pyc')

# Read the 16-byte .pyc header (PEP 552 layout):
#   bytes 0–3:  magic number (CPython version + bytecode format)
#   bytes 4–7:  flags (bit 0: 1=hash-based, 0=timestamp-based;
#               bit 1 if hash-based: 1=checked, 0=unchecked)
#   bytes 8–15: payload (timestamp mode: mtime[4] + source_size[4];
#               hash mode: 8-byte SipHash of source,
#               computed by importlib.util.source_hash())
with pyc.open('rb') as f:
    magic = f.read(4)
    flags = struct.unpack('<I', f.read(4))[0]
    payload = f.read(8)

is_hash_based = bool(flags & 1)
is_checked = bool(flags & 2) if is_hash_based else None
print(f"magic={magic.hex()}")
print(f"hash-based={is_hash_based}, checked={is_checked}")
print(f"payload={payload.hex()}")

# Verify the stored hash matches what importlib.util.source_hash() produces:
source_bytes = pathlib.Path('my_module.py').read_bytes()
expected = importlib.util.source_hash(source_bytes)
print(f"hash matches: {payload == expected}")  # True
Concurrency Model
The Global Interpreter Lock
The Global Interpreter Lock — commonly called the GIL — is a mutex inside CPython that ensures only one thread executes Python bytecode at a time within a single interpreter process. It is one of the most discussed aspects of CPython, and also one of the most mischaracterized.
The GIL exists because CPython's memory management is not thread-safe without it. CPython uses reference counting to track object lifetimes: every object carries a count of how many references point to it. When that count reaches zero, the object's memory is freed. Without the GIL, two threads could modify a reference count simultaneously, corrupting it. The GIL avoids this without requiring per-object locks, which would be far more expensive.
How the GIL Is Actually Implemented
CPython releases the GIL at regular intervals so other threads get a chance to run. The switch interval defaults to 5 milliseconds and is configurable via sys.setswitchinterval(). Prior to Python 3.2, the GIL was released every 100 bytecode instructions (the "check interval"), which caused thrashing on multicore machines because a thread could lose and reacquire the GIL many times before making useful progress. The switch to a time-based interval in 3.2 — proposed by Antoine Pitrou — was a significant practical improvement for I/O-bound multi-threaded code.
import sys
# See the current switch interval (default: 0.005 seconds / 5ms)
print(sys.getswitchinterval()) # 0.005
# Change it — useful when benchmarking threading behavior
sys.setswitchinterval(0.001) # 1ms: more context switches, higher overhead
# C extensions that release the GIL explicitly use the C-level macros:
# Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS
# NumPy, hashlib, zlib, and many others do this for long-running operations.
The GIL also interacts with CPython's cyclic garbage collector. Reference counting alone cannot reclaim objects that form reference cycles — for example, two objects that each hold a reference to the other, with no external references. CPython's gc module runs a generational cycle-detecting collector to find and break these cycles. The cyclic GC must pause all threads while it runs, which it can do safely precisely because the GIL serializes bytecode execution. In the free-threaded build, a new incremental garbage collector was designed specifically to avoid this stop-the-world requirement, which the Python 3.14 release notes describe as "incremental garbage collection."
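A reference cycle and the collector's role are easy to demonstrate. The Node class below is my own minimal example:

```python
import gc

class Node:
    def __init__(self):
        self.partner = None

a, b = Node(), Node()
a.partner, b.partner = b, a   # cycle: each refcount stays >= 1 forever
del a, b                      # now unreachable, but refcounts never hit zero

found = gc.collect()          # the cyclic collector detects and frees them
print(found >= 2)             # True: at least the two Node objects were reclaimed
```

Without the gc.collect() call (or an automatic collection), those two objects would linger despite being unreachable — exactly the gap reference counting cannot close on its own.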
The GIL does not prevent multithreading. It prevents two Python threads from executing bytecode simultaneously in the same process. I/O-bound threads — waiting on a network response, a file read, or a database query — release the GIL while waiting, so they can run concurrently with other threads. The GIL is primarily a bottleneck for CPU-bound Python code across multiple cores.
The GIL is a performance constraint — not a security boundary. It releases between bytecode instructions, which means read-modify-write sequences are not atomic, and shared mutable state still needs explicit locks.
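The classic demonstration: an unprotected counter loses updates under threads, while a lock restores correctness. How many updates the unsafe version loses varies by machine and Python version (on some runs it may even come out correct by luck), so only the locked result is guaranteed. The names and loop counts here are my own:

```python
import threading

N_THREADS, N_INCREMENTS = 4, 50_000
counter = 0        # shared, unprotected
safe_counter = 0   # shared, lock-protected
lock = threading.Lock()

def bump_unsafe():
    global counter
    for _ in range(N_INCREMENTS):
        counter += 1          # LOAD / ADD / STORE: the GIL can switch mid-sequence

def bump_safe():
    global safe_counter
    for _ in range(N_INCREMENTS):
        with lock:            # the lock makes the read-modify-write atomic
            safe_counter += 1

threads = [threading.Thread(target=fn)
           for fn in (bump_unsafe, bump_safe)
           for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("unsafe:", counter)        # may be less than 200000 (lost updates)
print("safe:  ", safe_counter)   # always 200000
```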
For CPU-bound parallelism in CPython, the standard solution is the multiprocessing module, which spawns separate interpreter processes, each with its own GIL. Starting with Python 3.13, CPython also shipped an experimental free-threaded build that could be compiled with the GIL disabled. In Python 3.14, the Python Steering Council accepted PEP 779, promoting the free-threaded build from experimental to officially supported — the first time a no-GIL CPython build has carried that status. The full story of what PEP 703 means for Python concurrency goes deeper than this article can cover. The free-threaded build remains optional (it is not yet the default), but as of 3.14 the single-threaded performance penalty dropped to roughly 5–10%, down from the earlier 10–20%, because the specializing adaptive interpreter was re-enabled for free-threaded mode. The free-threaded executable is typically named python3.13t for 3.13 or installed as a separate variant in 3.14.
# Check whether the free-threaded build is running (Python 3.13+)
import sys

# sys._is_gil_enabled() is available in free-threaded builds (Python 3.13+).
# On a standard GIL-enabled build this attribute does not exist.
if hasattr(sys, '_is_gil_enabled'):
    print("GIL currently enabled:", sys._is_gil_enabled())
    # Free-threaded builds run with the GIL off by default.
    # Pass -X gil=1 to re-enable it for compatibility testing:
    #   python3.14t -X gil=1 your_script.py
else:
    print("Standard GIL build — sys._is_gil_enabled not present")
# ── Free-threaded installation ──────────────────────────────────────────
# The free-threaded binary uses a 't' suffix: python3.13t / python3.14t
# (the suffix is the convention used by the python.org installers;
# exact package names vary across package managers).
#
# Windows / macOS: download the "free-threaded" installer from python.org
#   (separate installer option on the downloads page)
#
# Ubuntu (deadsnakes PPA):
#   sudo add-apt-repository ppa:deadsnakes/ppa
#   sudo apt-get install python3.14t python3.14t-dev
#
# Homebrew (macOS):
#   brew install python-freethreaded  # formula name may vary; verify with 'brew search'
#
# Docker (official image):
#   FROM python:3.14-bookworm  # standard build
#   # Check Docker Hub for a free-threaded variant tag; otherwise build
#   # from source or use the deadsnakes PPA inside a Debian/Ubuntu container.
#
# ── PYTHON_JIT is a separate toggle ────────────────────────────────────
# PYTHON_JIT=1 enables the experimental JIT compiler (3.13+, disabled by
# default) on binaries built with JIT support. It is independent of the
# GIL setting, but note that JIT support and free-threading are separate
# build-time options; a given binary may ship with either, both, or neither.
If your code is I/O-bound (network requests, database calls, file operations), the GIL is rarely your bottleneck. If it is CPU-bound, reach for multiprocessing, a C extension, NumPy, or consider the officially supported free-threaded builds available from Python 3.14 (PEP 779) or an alternative implementation like PyPy for long-running pure-Python loops.
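A minimal multiprocessing sketch for CPU-bound work: each worker is a separate interpreter process with its own GIL, so the four tasks below can run truly in parallel. The function name and workload are my own illustrative choices:

```python
from multiprocessing import Pool

def sum_of_squares(n):
    # Pure-Python CPU-bound work; inside one process the GIL would serialize it
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # Four tasks dispatched to four worker processes
        results = pool.map(sum_of_squares, [100_000] * 4)
    print(results[0] == sum_of_squares(100_000))  # True
```

The trade-off versus threads: processes pay for startup and for pickling arguments and results across process boundaries, so this wins only when the per-task computation outweighs that overhead.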
Ecosystem
CPython vs. Other Python Implementations
Because Python is a specification, multiple implementations exist. Each makes different engineering trade-offs, and choosing the right one depends heavily on the workload.
CPython
- Written in: C and Python
- Execution model: Bytecode interpreter with adaptive specialization and an optional experimental JIT (disabled by default as of 3.14)
- Best for: General-purpose development, maximum compatibility, access to the full C extension ecosystem (NumPy, Pandas, etc.)
- GIL: Present by default; optional free-threaded builds available from Python 3.13+

PyPy
- Written in: RPython (a restricted subset of Python)
- Execution model: Tracing JIT compiler that generates native machine code for hot paths — typically 3–10x faster than CPython for long-running CPU-bound loops
- Best for: CPU-bound workloads, long-running server processes, code that doesn't rely heavily on CPython C extensions
- GIL: Has its own GIL; different threading semantics than CPython

Jython
- Written in: Java
- Execution model: Compiles Python to JVM bytecode; runs on the Java Virtual Machine, giving access to Java libraries and true multi-threading without a GIL
- Best for: Projects that need to interoperate with Java codebases; no GIL means true multi-threaded Python execution on the JVM
- GIL: No GIL; leverages the JVM's native thread model
- Note: The current stable release (Jython 2.7) implements Python 2; Python 3 support has long been in development

MicroPython
- Written in: C
- Execution model: Lean bytecode interpreter optimized for constrained environments; implements a subset of Python 3 with minimal memory footprint
- Best for: Embedded systems, IoT devices, microcontrollers (Raspberry Pi Pico, ESP32, etc.) where CPython's memory footprint is too large
- GIL: Single-threaded by design on most targets; cooperative multitasking via asyncio-like patterns

GraalPy
- Written in: Java (runs on GraalVM)
- Execution model: Truffle AST interpreter with GraalVM's JIT — compiles Python to native code via the Truffle/Graal pipeline; supports polyglot interop with Java, JavaScript, Ruby, and R in the same runtime
- Best for: Polyglot environments, long-running server workloads, and use cases that can accept a larger runtime footprint in exchange for competitive JIT performance; some benchmarks show GraalPy matching or exceeding PyPy on certain workloads
- GIL: No GIL; uses GraalVM's thread model — true multi-threaded execution, though C extension compatibility is limited
- Note: Maintained by Oracle; tracks CPython compatibility closely but lags by a minor version or two on standard library coverage; not suitable when broad C extension support is required
For the overwhelming majority of Python developers, CPython is the right choice. Its compatibility with the C extension ecosystem — NumPy, Pandas, SciPy, Cryptography, and thousands of others — is unmatched by any alternative implementation. PyPy is the strongest alternative for pure-Python CPU-bound workloads where C extensions are not a requirement, and GraalPy is worth considering in polyglot JVM environments or when you need true multi-threading without the overhead of migrating to a free-threaded CPython build.
CPython Security Considerations
Understanding CPython's internals is not just a performance topic — several of its implementation details have direct security consequences. The four areas below are where CPython-specific behavior most commonly surfaces in security code reviews, penetration tests, and vulnerability reports.
String Interning and Authentication Bypass
CPython automatically interns short identifier-like strings, which means two such strings with equal values often share the same memory address. This makes the is operator return True when comparing them — not because their values are equal, but because they are literally the same object. When developers mistakenly use is instead of == for string comparison in authentication or authorization checks, the result depends entirely on whether CPython happens to intern both strings. The check passes for short, identifier-like tokens and silently fails for longer, non-interned ones — or vice versa depending on the runtime and Python version.
# DANGEROUS: never use 'is' to compare string values in security checks
def check_role_unsafe(user_role, required_role):
    return user_role is required_role  # may give wrong answer — result depends on interning
# CORRECT: always use == for value comparison
def check_role_safe(user_role, required_role):
    return user_role == required_role
# Illustration of the trap — use runtime-built strings to see real behavior.
# Two literals in the same code object share a constant entry, so 'is' always
# returns True for them. The danger surfaces when strings come from external
# sources (user input, network, database) rather than literals.
role_from_db = "admin" # imagine this came from a database query
required_role = "admin" # literal in source
print(role_from_db is required_role) # True on CPython — 'admin' happens to be interned
# Build the same string at runtime without interning:
role_dynamic = "".join(["a", "d", "m", "i", "n"]) # runtime-constructed, NOT interned
print(role_dynamic is required_role) # False — same value, different objects
print(role_dynamic == required_role) # True — == always compares by value
# A hyphenated role name further illustrates the fragility:
role_a = "-".join(["admin", "user"]) # 'admin-user' built at runtime
role_b = "admin-user" # literal
print(role_a is role_b) # False — runtime string is not interned
print(role_a == role_b) # True — == is always correct
Use == for all string value comparisons in security-sensitive code. Use is only to test object identity — for example, x is None. For comparing secrets and tokens, use hmac.compare_digest(), which is constant-time and immune to timing attacks as well as interning quirks.
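As a minimal sketch of that last recommendation (the token values here are hypothetical placeholders), hmac.compare_digest() takes the same arguments as == but compares every byte regardless of where the first mismatch occurs:

```python
import hmac

# compare_digest does not exit early on the first mismatched byte,
# so an attacker cannot measure how much of a guess was correct.
stored_token = "3f9a8c2e"    # hypothetical value from your session store
supplied_token = "3f9a8c2e"  # hypothetical value from the incoming request

if hmac.compare_digest(stored_token, supplied_token):
    print("token accepted")
else:
    print("token rejected")
```

Both arguments must be the same type (both str or both bytes); mixing them raises TypeError.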
The AST as a Security Analysis Surface
The same ast module you explored in section 2 is the foundation of every Python SAST (Static Application Security Testing) tool. Bandit — the standard Python security linter — works entirely by walking the AST looking for dangerous call patterns: eval() and exec() with user input, pickle.loads() on untrusted data, subprocess calls with shell=True, hardcoded passwords, use of weak cryptographic algorithms, and dozens of other issues. Semgrep's Python rules and PyLint's security checks use the same mechanism.
import ast
# Minimal AST-based dangerous-call detector (illustrates how Bandit works)
DANGEROUS_CALLS = {'eval', 'exec', 'compile', '__import__'}
class DangerousCallFinder(ast.NodeVisitor):
    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            if node.func.id in DANGEROUS_CALLS:
                print(f"Warning: call to {node.func.id!r} at line {node.lineno}")
        self.generic_visit(node)
source = """
user_input = input("Enter expression: ")
eval(user_input) # arbitrary code execution if unvalidated
"""
tree = ast.parse(source)
DangerousCallFinder().visit(tree)
# Warning: call to 'eval' at line 3
Running bandit -r your_project/ in CI catches a wide class of Python security issues before code ships. Because it operates on the AST rather than running the code, it finds vulnerabilities in code paths that tests never exercise. A complete treatment of secure Python coding practices (input validation, secrets management, dependency auditing) goes well beyond what CPython internals alone can cover, but CPython's transparency — the ability to inspect the AST and bytecode of any program from the standard library — is a large part of why Python is so widely used for building security tooling.
The .pyc Injection Attack
A less-discussed supply chain attack vector is .pyc file injection. Because CPython loads cached bytecode from __pycache__ without executing the source file, an attacker who can write to the __pycache__ directory on a shared filesystem, container image layer, or build artifact can inject malicious bytecode that runs in place of the legitimate source — while the .py file remains entirely unmodified. The default mtime-based validation only checks whether the source file timestamp matches what was recorded at compile time; it does not verify that the .pyc itself is trustworthy.
# Production hardening: prevent .pyc injection in deployed applications
# 1. Use hash-based .pyc validation — immune to mtime rounding on FAT32/NFS.
# The --invalidation-mode flag accepts: checked-hash | unchecked-hash | timestamp
# -j 0 uses all available CPU cores for parallel compilation.
python3 -m compileall --invalidation-mode checked-hash -j 0 your_package/
# 2. Make __pycache__ directories and their contents read-only after deployment.
# 555 = r-xr-xr-x (no write for anyone). Use 444 for the .pyc files themselves.
find . -type d -name __pycache__ -exec chmod 555 {} +
find . -name "*.pyc" -exec chmod 444 {} +
# 3. Disable .pyc generation entirely in audited environments (e.g. read-only containers).
# Trade-off: every import pays the compilation cost on each process start.
PYTHONDONTWRITEBYTECODE=1 python3 your_app.py
# 4. Dockerfile: compile with hash-based validation during image build,
# then lock the entire app tree read-only before the final stage.
#
# RUN python3 -m compileall --invalidation-mode checked-hash -j 0 /app \
# && find /app -type d -name __pycache__ -exec chmod 555 {} + \
# && find /app -name "*.pyc" -exec chmod 444 {} +
#
# Add to your docker run / k8s securityContext to enforce at runtime:
# readOnlyRootFilesystem: true
marshal.loads() and Arbitrary Code Execution
The marshal module serializes Python code objects — the internal format CPython uses for .pyc files. The Python documentation explicitly states that marshal is not secure against erroneous or maliciously constructed data. Calling marshal.loads() on untrusted input is an arbitrary code execution vulnerability: a crafted byte sequence can produce a code object whose bytecode, when executed via exec() or eval(), runs any code the attacker chooses. The same applies to pickle.loads() — both deserializers reconstruct live Python objects, including callable code, from raw bytes.
# NEVER do this with untrusted data
import marshal
with open('untrusted.pyc', 'rb') as f:
    f.read(16)  # skip the 16-byte .pyc header
    code_obj = marshal.loads(f.read())
exec(code_obj)  # arbitrary code execution if file is malicious
# SAFER alternatives for serializing data (not code):
# - json.dumps() / json.loads() (strings, numbers, lists, dicts only)
# - tomllib / tomli (config files)
# - protobuf / msgpack (structured binary data)
# Never use marshal or pickle for data received over the network or from users.
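To make concrete what marshal actually serializes, here is a round-trip on bytes we built ourselves, safe only because the input is trusted by construction:

```python
import marshal

# marshal round-trips live code objects, which is precisely why it is
# dangerous on untrusted bytes: loads() hands back something executable.
code = compile("result = 2 + 3", "<demo>", "exec")
blob = marshal.dumps(code)      # same format as the payload of a .pyc file
restored = marshal.loads(blob)  # reconstructs a fully executable code object

assert restored.co_code == code.co_code
namespace = {}
exec(restored, namespace)  # on attacker-controlled blobs this is the ACE step
print(namespace["result"])  # 5
```

If the blob had come from the network instead of from compile(), that exec() call would run whatever the sender chose.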
The GIL Does Not Prevent Race Conditions
This is one of the most dangerous misconceptions in Python security engineering. The GIL ensures that only one thread executes Python bytecode at any given instant — but it can be released between individual bytecode instructions. A read-modify-write sequence that looks atomic at the Python source level is not atomic at the bytecode level. This creates real TOCTOU (Time-of-Check Time-of-Use) vulnerabilities in rate limiters, session counters, token buckets, and authentication caches.
import threading
# ── UNSAFE: looks atomic at the source level, but is not ─────────────────
rate_limit_counter = {}
class RateLimitExceeded(Exception):
    pass

def handle_request_unsafe(user_id):
    count = rate_limit_counter.get(user_id, 0)  # READ — GIL may release after this
    if count >= 100:
        raise RateLimitExceeded(user_id)
    rate_limit_counter[user_id] = count + 1  # WRITE — separate bytecode instruction
# Window between READ and WRITE: another thread can increment the same key.
# Two threads can both read 99, both pass the check, and both write 100.
# ── CORRECT: explicit lock protects the entire read-check-write sequence ─
_lock = threading.Lock()
def handle_request_safe(user_id):
    with _lock:
        count = rate_limit_counter.get(user_id, 0)
        if count >= 100:
            raise RateLimitExceeded(user_id)
        rate_limit_counter[user_id] = count + 1
# ── Notes on common alternatives ─────────────────────────────────────────
# collections.Counter is NOT thread-safe — individual method calls on a dict
# subclass are not atomic; you still need a Lock around read-modify-write.
#
# threading.local() gives each thread its own isolated copy of a value —
# useful for per-request context (e.g. current user), but not for shared
# state like a global rate-limit counter that must span all threads.
#
# For production rate limiting, push the counter to an atomic store:
# Redis INCR / INCRBY — atomic server-side increment
# Redis SETNX + EXPIRE — token bucket or sliding window
# These avoid the in-process lock entirely and work across multiple processes.
In CPython's free-threaded builds (Python 3.14+), the GIL is absent entirely, so any code that relied on the GIL as implicit synchronization will have race conditions on every read-modify-write sequence. Security-sensitive shared state — session stores, nonce caches, rate limiters — must use explicit locks in all Python code targeting free-threaded builds.
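To see that correctness comes from the lock rather than from the GIL, here is a small deterministic sketch: eight threads hammer a shared counter, and only the explicit Lock guarantees the final count, on standard and free-threaded builds alike. The thread count and iteration count are arbitrary.

```python
import threading

counter = {"hits": 0}
lock = threading.Lock()

def hit():
    for _ in range(10_000):
        with lock:  # protects the entire read-modify-write sequence
            counter["hits"] = counter["hits"] + 1

threads = [threading.Thread(target=hit) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter["hits"])  # 80000 every time; drop the lock and the total can come up short
```

Removing the `with lock:` line turns this into exactly the unsafe read-modify-write pattern from the rate limiter above.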
Release History
Recent CPython Milestones
CPython's development has accelerated significantly since the Faster CPython project began contributing to the 3.11 release cycle. Several milestones are worth knowing as context for the internals covered later in this series.
- Python 3.11 (Oct 2022): Introduced the specializing adaptive interpreter (PEP 659), which rewrites bytecode instructions in place with faster, type-specialized variants at runtime. This delivered roughly a 1.22x average speedup over 3.10 on the standard pyperformance benchmark suite. Also added significantly improved error messages with precise source location highlighting.
- Python 3.12 (Oct 2023): Added per-interpreter GILs via PEP 684, enabling isolated subinterpreters with independent locks — a foundation for future parallelism improvements. Delivered roughly 4% speedup over 3.11.
- Python 3.13 (Oct 2024): Introduced experimental free-threaded mode (PEP 703, --disable-gil) and an experimental JIT compiler (PEP 744), both disabled by default. Also shipped a new interactive REPL with color syntax highlighting. Delivered roughly 7% speedup over 3.12.
- Python 3.14 (Oct 2025): The free-threaded build moved from experimental to officially supported via PEP 779 — the single-threaded overhead dropped to roughly 5–10% after adaptive specialization was re-enabled for free-threaded mode (Python 3.14 release notes). A new tail-calling interpreter was added, using tail calls between small C functions rather than a single large switch statement; for Clang 19+ on x86-64 and AArch64, early benchmarks showed a 3–5% geometric mean speedup. The JIT was bundled into official Windows and macOS binaries (enabled via PYTHON_JIT=1). The new concurrent.interpreters module (PEP 734) brought multiple-interpreter support from the C API into standard Python. Overall, Python 3.14 is roughly 8% faster than 3.13. Also introduced template strings (t-strings, PEP 750) and deferred annotation evaluation (PEP 649).
- Python 3.15 (in development, targeted Oct 2026): As of April 2026, Python 3.15 is in alpha (3.15.0a8). The JIT was rebuilt around an overhauled tracing frontend and now supports significantly more bytecode operations and control flow than in 3.14, including simple object creation and partial generator support. Microsoft cancelled its sponsorship of the Faster CPython project in May 2025, but the community-led team kept development moving. Per Ken Jin's March 2026 update, the 3.15 alpha JIT is roughly 5–6% faster on x86-64 Linux and 11–12% faster on AArch64 macOS compared to the tail-calling interpreter, hitting the team's targets ahead of schedule. The official 3.15 alpha documentation (a7) reports 3–4% on x86-64 Linux and 7–8% on AArch64 macOS over the standard interpreter — numbers that continue to improve as the optimizer matures. Register allocation and expanded constant propagation were also added to the JIT optimizer. UTF-8 is set to become the default file encoding (PEP 686), and explicit lazy imports are being added via PEP 810. Free-threaded JIT support is targeted for 3.15 or 3.16.
CPython's JIT (PEP 744) uses a copy-and-patch strategy, generating machine-code templates for micro-ops at compile time using LLVM, then patching them at runtime. The JIT operates on Tier 2 micro-ops — a lower-level internal IR that the adaptive specialization engine (Tier 1) promotes hot bytecode into after repeated execution. As of CPython 3.14, the JIT's performance was a mixed picture: Ken Jin noted that the 3.13 and 3.14 JIT was often slower than the interpreter, and in May 2025 Microsoft cancelled its funding for the Faster CPython team, laying off several core developers including Mark Shannon. The project transitioned to community stewardship. For 3.15, a team of volunteer contributors rewrote the JIT tracing frontend and added register allocation and expanded constant-propagation to the optimizer — delivering the first consistently meaningful speedups (see the 3.15 bullet above). Free-threaded JIT support is targeted for 3.15 or 3.16.
CPython core developer Mark Shannon articulated the compounding logic behind the Faster CPython approach in 2021: optimizations made to the interpreter are, in effect, automatically inherited by any future compiler layer built on top of it — meaning faster interpreter work is never wasted when a JIT arrives. That philosophy explains why CPython invested heavily in the adaptive specialization engine before the JIT: a faster interpreter makes a better foundation for JIT-compiled traces.
You can check JIT status at runtime in Python 3.13 and later using the sys._jit namespace:
import sys
if hasattr(sys, '_jit'):
    print("JIT available:", sys._jit.is_available())  # True if this build includes the JIT
    print("JIT enabled: ", sys._jit.is_enabled())  # True if PYTHON_JIT=1 was set
    print("JIT active: ", sys._jit.is_active())  # True if the current frame is JIT-compiled
else:
    print("No JIT in this build (pre-3.13 or stripped)")
Ecosystem Dominance
Why CPython Dominates Despite Faster Alternatives
PyPy is typically 3–10x faster than CPython for long-running pure-Python CPU-bound workloads. GraalPy can be faster still on some benchmarks. Yet CPython powers the overwhelming majority of Python deployments. The reason is not inertia — it is the C extension Application Binary Interface (ABI).
Guido van Rossum captured the core tension when discussing the Faster CPython project constraints in 2021: "Backward compatibility is the hardest problem to solve." That constraint is why CPython's performance roadmap has always proceeded incrementally through the bytecode layer — the one part of the interpreter that carries no ABI guarantee across major versions — rather than attempting wholesale rewrites that would break the extension ecosystem.
Packages like NumPy, Pandas, SciPy, Cryptography, lxml, and PyArrow are not pure Python. They expose C or C++ code to Python through CPython's C extension API, defined in Python.h. These extensions are compiled as shared libraries (.so on Linux/macOS, .pyd on Windows) that link directly against CPython's internal structures — including the object layout, the reference counting macros, and the stable ABI (introduced in PEP 384) that guarantees binary compatibility across CPython minor versions. PyPy, Jython, and GraalPy implement compatibility layers for this API, but coverage is incomplete and performance characteristics differ. A C extension that calls Py_INCREF thousands of times per millisecond gets a fundamentally different experience under a tracing GC runtime than under CPython's reference counting.
The free-threaded build's biggest ecosystem challenge is precisely this: C extensions that assumed the GIL provided implicit thread safety must be audited and rebuilt for the no-GIL ABI. The Python 3.14 release notes state that extension modules must now explicitly signal GIL-disabled compatibility via the Py_GIL_DISABLED preprocessor flag. The ongoing work at py-free-threading.github.io tracks which packages have achieved compatibility. As of early 2026, many core scientific and ML packages — NumPy, SciPy, Pillow, PyO3-based crates — have working free-threaded wheels, but the long tail of smaller packages remains a barrier to broad adoption.
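You can inspect the ABI surface your own interpreter exposes to compiled extensions. The sketch below reads the build's extension-module suffix and free-threading flag from sysconfig; note that sys.abiflags is POSIX-only, hence the getattr guard, and Py_GIL_DISABLED may be absent on older builds.

```python
import sys
import sysconfig

# The tag baked into compiled-extension filenames, e.g.
# '.cpython-313-x86_64-linux-gnu.so' on Linux or '.cp313-win_amd64.pyd'
# on Windows. A wheel built for one tag will not load under another.
print(sysconfig.get_config_var("EXT_SUFFIX"))

# Free-threaded builds advertise themselves in the ABI: Py_GIL_DISABLED
# is 1 and the ABI flags gain a 't' suffix (as in python3.14t).
print(sysconfig.get_config_var("Py_GIL_DISABLED"))
print(getattr(sys, "abiflags", ""))  # not defined on Windows builds
```

This tag mismatch is the mechanical reason a wheel compiled for standard CPython cannot simply be installed into a free-threaded interpreter.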
PEP 779, authored by Thomas Wouters and other Steering Council members, makes clear that promoting the free-threaded build to a supported default hinges on community readiness — specifically, whether the third-party extension ecosystem can achieve broad compatibility with the no-GIL ABI. That bar has not yet been met.
This is the key context for understanding where CPython stands in 2026: the language is evolving faster than at any point in two decades, but the path to a GIL-free default runs directly through the ecosystem's willingness to update its C extensions — a process that will take years, not months.
Summary
Key Takeaways
- CPython is the reference implementation: Written in C and Python, it is the version distributed on python.org and the one you are running unless you have specifically installed an alternative.
- Python is a specification; CPython is one implementation of it: Other implementations such as PyPy, Jython, and MicroPython all implement the Python language but differ internally in ways that affect performance, threading, and ecosystem compatibility.
- CPython compiles to bytecode before executing: Your source code passes through tokenization, AST construction, and compilation before the virtual machine's evaluation loop runs it. The dis module lets you inspect the resulting bytecode.
- The GIL allows only one thread to run Python bytecode at a time: It is not a concurrency killer for I/O-bound work but is a genuine limitation for CPU-bound multi-threaded code. As of Python 3.14, free-threaded builds are officially supported (PEP 779) — no longer experimental — with a single-threaded overhead of roughly 5–10%.
- CPython is evolving rapidly: Python 3.14 delivered the first officially supported free-threaded build, a new tail-calling interpreter, the concurrent.interpreters stdlib module (PEP 734), and a bundled JIT in official binaries. After Microsoft ended its Faster CPython sponsorship in May 2025, community contributors rebuilt the JIT tracing frontend and delivered meaningful speedups in Python 3.15 alpha. Python 3.15, targeted for October 2026, is also adding explicit lazy imports (PEP 810) and UTF-8 as the default file encoding (PEP 686). These represent the most significant architectural changes to the interpreter in two decades.
- CPython internals have direct security implications: Never use is for string comparison in security checks — use == or hmac.compare_digest(). Never call marshal.loads() or pickle.loads() on untrusted data. Lock __pycache__ directories read-only in production. Treat the GIL as a performance mechanism, not a thread-safety guarantee — always protect shared mutable state with explicit locks.
With a clear picture of what CPython is and how it processes code, you are ready to explore the lower layers: how objects are laid out in memory, how reference counting and the cyclic garbage collector interact, and how the evaluation loop handles frame execution. Those topics are covered in the articles that follow in this Python Internals series.
Frequently Asked Questions
What is CPython?
CPython is the reference implementation of Python, written in C and Python. It is the default version installed from python.org and the one you are almost certainly running when you type python or python3. Other implementations such as PyPy and Jython also run Python code, but CPython is the original and the benchmark for correct language behavior.
What is the difference between Python and CPython?
Python is a language specification — a set of rules describing syntax, semantics, and behavior. CPython is one implementation of that specification, written in C. Other implementations such as PyPy, Jython, and MicroPython also implement the Python language but use different internal approaches.
How do I check which Python implementation I am running?
Use sys.implementation.name in the Python REPL or a script. On a standard CPython install it returns 'cpython'. On PyPy it returns 'pypy'. On MicroPython it returns 'micropython'. You can also check sys.implementation.cache_tag to see the string used in __pycache__ filenames.
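For example, on a stock python.org install:

```python
import sys

print(sys.implementation.name)       # 'cpython' on the reference implementation
print(sys.implementation.cache_tag)  # e.g. 'cpython-313'; used in __pycache__ filenames
```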
What does CPython do with my source code before running it?
CPython passes your source code through five stages: tokenization, PEG parsing into an Abstract Syntax Tree, compilation through a control flow graph into bytecode, caching the bytecode as .pyc files in __pycache__, and finally executing the bytecode in the evaluation loop of the virtual machine.
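The first three stages can be driven by hand from the standard library; the remaining two (.pyc caching and the evaluation loop) happen inside the import system and the interpreter itself. A minimal sketch:

```python
import ast
import io
import tokenize

src = "x = 1 + 2\n"

# Stage 1: tokenization
tokens = [t.string for t in tokenize.generate_tokens(io.StringIO(src).readline)]
print(tokens[:4])  # ['x', '=', '1', '+']

# Stage 2: PEG parsing into an AST
tree = ast.parse(src)
print(type(tree).__name__)  # Module

# Stage 3: compilation to a code object holding bytecode
code = compile(tree, "<demo>", "exec")
print(type(code).__name__)  # code
```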
What is the GIL in CPython?
The Global Interpreter Lock (GIL) is a mutex that allows only one thread to execute Python bytecode at a time within a single CPython process. It simplifies memory management but limits CPU-bound parallelism. Python 3.14 introduced an officially supported free-threaded build (PEP 779) that can disable the GIL.
Does the GIL prevent all multi-threaded Python programs from being useful?
No. I/O-bound threads — those waiting on network requests, file reads, or database queries — release the GIL while waiting and can run concurrently. The GIL is mainly a bottleneck for CPU-bound workloads that need true multi-core parallelism.
How can I inspect the bytecode CPython generates for a function?
Use the dis module from the Python standard library. Call dis.dis(your_function) and CPython will print the bytecode instructions, including opcode names, argument values, and source line numbers.
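For instance, for a trivial two-argument function:

```python
import dis

def add(a, b):
    return a + b

dis.dis(add)
# On Python 3.11+ the output looks roughly like:
#   RESUME        0
#   LOAD_FAST     0 (a)
#   LOAD_FAST     1 (b)
#   BINARY_OP     0 (+)
#   RETURN_VALUE
# (exact opcodes vary by version)
```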
Does CPython have a JIT compiler?
CPython 3.13 added an experimental JIT compiler (PEP 744), disabled by default. CPython 3.14 bundled it in official Windows and macOS binaries (enable with PYTHON_JIT=1). CPython 3.15 alpha delivered the first consistently meaningful JIT speedups: roughly 5–6% faster on x86-64 Linux and 11–12% faster on AArch64 macOS.
What is the difference between CPython and PyPy?
CPython uses a bytecode interpreter with optional adaptive specialization (PEP 659) and an experimental JIT. PyPy uses a tracing JIT that generates native machine code for hot loops, making it typically 3 to 10 times faster than CPython for long-running CPU-bound pure Python code. The trade-off is that PyPy has less compatibility with CPython C extensions, so most production projects use CPython.
What is a free-threaded Python build?
A free-threaded Python build is a version of CPython compiled with the Global Interpreter Lock disabled, allowing true multi-threaded execution across CPU cores. It was introduced as experimental in Python 3.13 via PEP 703 and became officially supported in Python 3.14 via PEP 779. The binary uses a t suffix: python3.14t.
What is the Faster CPython project?
The Faster CPython project was a multi-year initiative to significantly improve interpreter performance. Originally funded by Microsoft, it delivered the specializing adaptive interpreter in Python 3.11 (1.22x speedup) and continued through 3.14. Microsoft cancelled funding in May 2025 and the project transitioned to community stewardship, with the 3.15 JIT delivering the first consistently meaningful speedups.
Key references used in this tutorial: What's New in Python 3.13 — Python Software Foundation; What's New in Python 3.14 — Python Software Foundation; PEP 703 — Making the GIL Optional; PEP 779 — Supported Status for Free-threaded Python; PEP 744 — JIT Compilation; PEP 659 — Specializing Adaptive Interpreter; PEP 617 — New PEG Parser for CPython; PEP 810 — Explicit Lazy Imports; What's New in Python 3.15 (alpha) — Python Software Foundation; Python 3.15's JIT is now back on track — Ken Jin, Python Insider (March 2026).