Python Lists Learning Path
Lists are the most-used data structure in Python. They are flexible, ordered, mutable, and the first tool you reach for when you need to collect and process data. But there is more depth here than most tutorials cover.
At the CPython level, a list is a C struct called PyListObject that holds an array of pointers to PyObject instances. The list does not store your data directly -- it stores references to wherever your data lives on the heap. This is why a single list can hold integers, strings, and other lists at the same time, and it is the root cause of the aliasing bugs that catch developers off guard.
list_resize() function over-allocates using the formula new_allocated = ((size_t)newsize + (newsize >> 3) + 6) & ~(size_t)3, producing the growth sequence 0, 4, 8, 16, 24, 32, 40, 52, 64, 76... This amortizes the O(n) copy cost so that repeated append() calls run in O(1) amortized time. The list shrinks back only when its length falls below half its allocated capacity.
Tutorials marked with the cert badge include a final exam that awards a certificate of completion you can download and share.
Python's sort is equally worth understanding at a deeper level. Both list.sort() and sorted() use Timsort, a hybrid of merge sort and insertion sort written by Tim Peters in 2002 specifically for Python. Timsort finds already-sorted sub-sequences (runs) in real data and merges them, giving O(n) performance on nearly-sorted input and O(n log n) worst-case.
The Python documentation guarantees sort stability: equal-key records always retain their original relative order. This applies to both list.sort() and sorted(), and has been an explicit guarantee since Python 2.2.
This path covers lists from first principles through advanced manipulation patterns, including the copy semantics and aliasing behaviors that cause the majority of hard-to-trace bugs in Python code.
List Operation Complexity at a Glance
| Operation | Syntax | Time Complexity | Notes |
|---|---|---|---|
| Index access | list[i] | O(1) | Direct pointer lookup in ob_item array |
| Length | len(list) | O(1) | Reads ob_size field directly -- never loops |
| Append | list.append(x) | O(1) amortized | Occasional O(n) resize, amortized constant |
| Pop from end | list.pop() | O(1) | No shift required |
| Pop from middle | list.pop(i) | O(n) | Elements after i must shift left |
| Insert | list.insert(i, x) | O(n) | Elements after i must shift right |
| Delete by index | del list[i] | O(n) | Elements after i must shift left |
| Membership test | x in list | O(n) | Linear scan; use set for repeated lookups |
| Slicing | list[a:b] | O(k) | k = b - a; creates a new list |
| Concatenation | list + list2 | O(k) | k = total elements in both lists |
| Sort | list.sort() | O(n log n) | Timsort; O(n) best case on sorted input |
| Reverse | list.reverse() | O(n) | In-place; swaps pointers |
| Extend | list.extend(it) | O(k) | k = length of iterable being added |
list[i] is O(1) -- direct pointer lookup. len(list) is O(1) -- reads the stored ob_size field without looping.append() is O(1) amortized. An occasional O(n) resize happens but the cost spreads across many appends. extend(it) is O(k) where k is the length of the iterable.pop() from the end is O(1). pop(i) from a position and insert(i, x) are both O(n) because all elements after the position must shift.x in list is O(n) -- linear scan. For repeated membership tests use a set. Slicing list[a:b] is O(k) where k is the slice length.list.sort() and sorted() use Timsort: O(n log n) worst case, O(n) best case on already-sorted data. Sort is guaranteed stable.Python Lists from Zero to Mastery: The Definitive Guide
Complete coverage of list creation, indexing, slicing, methods, comprehensions, nested lists, sorting, copying, and performance. Includes CPython memory model details you will not find in most tutorials.
append vs extend in Python
The difference between append and extend, when to use each, and the performance implications of both. Includes what actually happens at the C level when each method is called.
Concatenating Lists in Python
Every way to combine lists: +, extend, unpacking, itertools.chain, and when each approach is the right choice. Covers memory implications of each method.
How to Find List Indices in Python
Finding element positions with index(), enumerate, comprehensions, and handling missing elements gracefully. Covers the ValueError raised by index() and how to avoid it.
Python List Conversion
Converting between lists and other types: tuples, sets, dicts, strings, and generators. Includes which conversions are O(1) and which require an O(n) copy.
Sorting in Python
sort() vs sorted(), key functions, reverse sorting, stable sort behavior, and how Timsort exploits real-world partial ordering to outperform generic O(n log n) sorts.
Learn What List Comprehensions are in Python
Syntax, conditions, nested loops, and common use cases for list comprehensions. Covers when a comprehension is faster than a for-loop append and when to switch back to a loop for readability. Includes interactive exercises.
Flattening Nested Lists in Python
Recursive flattening, itertools.chain, list comprehensions, and handling arbitrary nesting depth. Includes performance comparison of each approach on lists with varying structure.
Understanding Flatten in Python
Different approaches to flattening data structures and when each is the right choice. Covers the distinction between one-level and full recursive flattening.
Why Is My List Changing Unexpectedly?
Mutable default arguments, aliasing, shallow copies, and the reference model that causes unexpected mutations. One of the most common Python bugs explained from the memory level up.
Shallow Copy vs Deep Copy in Python
Understanding copy behavior with nested objects, when shallow copies break, and how deepcopy resolves it. Includes the copy module's protocol and when to avoid deepcopy for performance reasons.
How to Work Through This Learning Path
This path is structured to build on itself. Work through the fundamentals in order before moving to the advanced patterns group. Each tutorial is self-contained but written to assume the concepts from earlier tutorials in the sequence.
Start with the Definitive Guide
Read Python Lists from Zero to Mastery first. It establishes the full mental model: how PyListObject works in memory, what ob_item and allocated mean, and why the list holds references rather than values. Every other tutorial in this path assumes this foundation.
Learn the core mutation methods
Work through append vs extend and Concatenating Lists together. These two tutorials cover how Python grows and merges lists, which directly determines memory usage and performance in real code. Pay attention to which operations are O(1) amortized and which are O(n).
Master indexing, slicing, and searching
Read How to Find List Indices. Negative indexing, step slices, and the difference between index() and enumerate() are used constantly and misunderstood more often than they should be. Also covers how to handle the ValueError that index() raises when an element is absent.
Understand sorting and conversion
Read Sorting in Python and Python List Conversion. Sorting with key functions, understanding sort stability, and knowing when to convert a list to a set or tuple are all standard daily-use skills. These also establish the groundwork for the advanced patterns.
Tackle the advanced patterns
Work through the four tutorials in the Advanced Patterns group. Start with Why Is My List Changing Unexpectedly and Shallow Copy vs Deep Copy since aliasing bugs affect every level of Python code. Follow with the flattening tutorials once the reference model is solid.
Frequently Asked Questions
my_list = [1, 'hello', 3.14]. Because they are mutable, elements can be added, removed, or changed after creation.
append() adds a single object as a new element at the end of the list. If you pass it another list, that entire list becomes one nested element. extend() iterates over its argument and adds each item individually, expanding the list by the number of items in the iterable. For example: my_list.append([1,2]) results in one new element that is itself a list, while my_list.extend([1,2]) adds two separate elements.
list.sort() and sorted() use an algorithm called Timsort, created by Tim Peters in 2002. Timsort is a hybrid of merge sort and insertion sort that identifies already-sorted sub-sequences (called runs) and merges them efficiently. It guarantees stable sort behavior, meaning elements with equal keys keep their original relative order. The worst-case time complexity is O(n log n), and the best case for already-sorted data is O(n).
copy.deepcopy(), recursively copies every object so the copy is completely independent of the original. Use list.copy(), list[:], or list() for shallow copies, and copy.deepcopy() when you need full independence.
PyListObject containing three core fields: the standard object header (PyObject_VAR_HEAD, which includes the reference count, type pointer, and ob_size), a pointer to an array of PyObject pointers called ob_item, and an integer called allocated that tracks the total number of slots reserved in memory. The list stores references to objects, not the objects themselves. When the list grows beyond its allocated slots, CPython calls list_resize(), which over-allocates memory using the growth sequence 0, 4, 8, 16, 24, 32, 40, 52... to amortize the cost of future appends.
[expression for item in iterable if condition]. List comprehensions are generally faster than equivalent for-loop appends because the interpreter can optimize the iteration. Use them for simple, readable transformations. When the logic becomes complex, a regular for loop is usually more maintainable.
list.sort() and sorted(). It has been guaranteed since Python 2.2 and is a deliberate design property of Timsort, not an implementation detail.
list[i]) and len() are O(1). append() is O(1) amortized. insert() and pop() at an arbitrary position are O(n) because elements must shift. Searching with in is O(n). Sorting is O(n log n). Slicing list[a:b] is O(k) where k is the slice length. Concatenation with + is O(k) where k is the total length of both lists. Deletion from the end with pop() is O(1).
[item for sublist in nested for item in sublist]. For one level handled more efficiently, use list(itertools.chain.from_iterable(nested)). For deeply nested or irregular structures, a recursive function or a stack-based loop is required. NumPy's ndarray.flatten() is the best choice when working with numeric arrays.
b = a, both variables point to the same list object in memory. Modifying b also modifies a. The same issue occurs with nested lists inside a shallow copy. To avoid this, use b = a.copy() or b = a[:] for a new top-level list, or copy.deepcopy(a) when nested mutable objects must also be independent.