Converting a string to an integer is one of the operations you will perform most often in Python — and also one of the easiest to get subtly wrong. This Python tutorial covers every aspect of the conversion: the mechanics of int(), base conversions, validation strategies, error handling, the security-driven digit limit introduced in Python 3.11, and the cases where a safer alternative like ast.literal_eval() is worth reaching for.
What's in this Python Tutorial
- Why Strings Hold Numbers
- The int() Function in Detail
- Base Conversions: Binary, Octal, Hex
- Error Handling and Input Validation
- Edge Cases Worth Knowing
- The Python 3.11 Security Limit
- ast.literal_eval() as a Safer Alternative
- How to Convert a String to an Integer in Python
- Practical Patterns
- Key Takeaways
- Frequently Asked Questions
Python is a strongly typed language. That means you cannot treat a string and an integer as interchangeable, even when the string contains nothing but digits. The value "42" and the value 42 are fundamentally different objects: one is a sequence of characters, the other is a numeric type that supports arithmetic. Conversion bridges that gap, and understanding exactly what happens during conversion will save you from bugs that can be surprisingly hard to trace.
Why Strings Hold Numbers
Before getting into the mechanics, it is worth asking: where do numeric strings come from? The answer reveals why conversion is such a routine task.
User input is always a string. When you call input(), Python returns whatever the user typed as a str, regardless of content. Reading a CSV, a config file, an environment variable, or a command-line argument produces the same result — raw text. HTTP request parameters, JSON parsed from an API response (when the field was serialized without type information), and database drivers that return untyped text columns all hand you strings. If you want to do arithmetic on any of those values, you have to convert them first.
This is not a quirk of Python; it reflects how computers represent external input. The conversion step is deliberate: it forces you to acknowledge that parsing can fail and decide what to do when it does.
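As a small illustration of the point above, the sketch below reads one CSV row and one environment variable and shows that both arrive as str. The sample data and the variable name DEMO_PORT are invented for the example.

```python
import csv
import io
import os

# Hypothetical sample data standing in for a real CSV file.
sample = io.StringIO("name,age\nAda,36\n")
row = next(csv.DictReader(sample))

# DEMO_PORT is an invented variable name, set here only for the demo.
os.environ["DEMO_PORT"] = "8080"
port_text = os.environ["DEMO_PORT"]

print(type(row["age"]))  # <class 'str'>
print(type(port_text))   # <class 'str'>

# Arithmetic only works after an explicit conversion.
age_next_year = int(row["age"]) + 1
next_port = int(port_text) + 1
```

Without the int() calls, row["age"] + 1 would raise a TypeError, because Python will not add a str and an int.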
As a quick check, consider how to convert the string "99" to an integer and assign it to a variable called score. The statement is score = int("99"). The variable name comes first, on the left of the assignment operator =, followed by the int() call with the string argument "99". Note that str() converts in the opposite direction (to a string), and float() produces a float, not an integer.
The int() Function in Detail
The primary tool for this job is the built-in int() function. According to the official Python documentation, int() is defined with the signature:
int(x=0)
int(x, base=10)
When called with a string as its first argument, int() parses the string and returns an integer object. The default base is 10, which means the string is treated as a decimal number unless you specify otherwise.
# Basic conversion
age_str = "29"
age = int(age_str)
print(age) # 29
print(type(age)) # <class 'int'>
The return type is always a Python int. Unlike languages with fixed-width integer types (such as C's int32_t or Java's int), Python integers can be arbitrarily large. The official documentation describes int as having unlimited precision, bounded only by available memory. This means there is no overflow to worry about for large numeric strings — with one important caveat covered in the security section below.
According to the Python 3 built-in functions documentation (docs.python.org), when x is not a number, or when a base is given, the argument must be a string, bytes, or bytearray object holding an integer literal in the specified radix. This is why passing a float string like "3.14" directly raises ValueError: it is not a valid integer literal in any standard base.
That note highlights something easy to miss: int() also accepts bytes and bytearray objects, not just strings. The parsing rules are the same.
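A short sketch of that behavior follows. The variable names are illustrative only. It also contrasts int() on a bytes object, which parses ASCII text, with int.from_bytes(), which interprets the raw byte values.

```python
# bytes and bytearray are parsed as ASCII text, just like str.
from_bytes_obj = int(b"42")
from_bytearray = int(bytearray(b"-7"))
from_hex_bytes = int(b"ff", 16)

print(from_bytes_obj)   # 42
print(from_bytearray)   # -7
print(from_hex_bytes)   # 255

# int.from_bytes() is a different operation: it reads the raw byte
# values 0x34 and 0x32 rather than the characters "4" and "2".
raw_value = int.from_bytes(b"42", byteorder="big")
print(raw_value)        # 13362
```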
What int() Accepts
When parsing a string in base 10, int() accepts the following forms:
- A plain sequence of decimal digits: "42", "1000"
- A leading sign: "-7", "+3"
- Leading and trailing whitespace: " 42 " is valid
- Underscore separators between digits, since Python 3.6: "1_000_000"
int(" -42 ") # Returns -42 (whitespace is stripped)
int("+99") # Returns 99 (leading + is accepted)
int("1_000") # Returns 1000 (underscore grouping, Python 3.6+)
The underscore grouping behavior was added in Python 3.6 to mirror the way numeric literals can be written in code. int("1_000_000") returns 1000000. The Python changelog documents this addition under the 3.6 release notes for built-in functions.
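The underscore rule is stricter than it may look: a single underscore is allowed only between two digits. The sketch below, which uses a hypothetical helper named try_int, probes a few placements to show which ones int() accepts.

```python
def try_int(text):
    """Return the parsed int, or the ValueError message if parsing fails."""
    try:
        return int(text)
    except ValueError as exc:
        return f"ValueError: {exc}"

# Only a single underscore between two digits is accepted.
for candidate in ["1_000", "-1_0", "1__000", "_1000", "1000_"]:
    print(repr(candidate), "->", try_int(candidate))
```

The first two candidates parse to 1000 and -10. The doubled, leading, and trailing underscores are all rejected with ValueError.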
Base Conversions: Binary, Octal, Hex
The second parameter of int() is base. It accepts any integer between 2 and 36 inclusive, or the special value 0. When you supply a base, you are telling Python how to interpret the digits in the string.
# Binary (base 2)
int("1010", 2) # Returns 10
# Octal (base 8)
int("17", 8) # Returns 15
# Hexadecimal (base 16)
int("1F", 16) # Returns 31
int("ff", 16) # Returns 255 (case-insensitive)
int("0xFF", 16) # Returns 255 (0x prefix accepted)
# Base 36 (digits 0-9 and letters a-z)
int("z", 36) # Returns 35
The output is always a Python integer in base 10, regardless of the input base. There is no "hex integer" or "binary integer" type in Python — there is only int, and the base you supply tells the parser how to read the input string.
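To make that concrete, the following sketch parses a hex string once and then renders the resulting int back into several bases with the built-ins bin(), oct(), hex(), and format(). The variable names are illustrative.

```python
value = int("ff", 16)              # parse hexadecimal text into a plain int

# The same int can be rendered in any base for display.
as_binary = bin(value)             # '0b11111111'
as_octal = oct(value)              # '0o377'
as_hex = hex(value)                # '0xff'
as_plain_hex = format(value, "x")  # 'ff' (no prefix)

# Round trip: parsing each rendering gives the original value back.
round_trip_ok = (
    int(as_binary, 2) == int(as_octal, 8) == int(as_hex, 16) == value
)
print(value, as_binary, as_octal, as_hex, as_plain_hex, round_trip_ok)
```

The base is a property of the text, not of the number: value is the same int object regardless of which string it was parsed from.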
Passing base=0 is a special case: it tells Python to infer the base from the string's prefix. A string starting with 0b or 0B is treated as binary, 0o or 0O as octal, 0x or 0X as hexadecimal, and a plain string of decimal digits as base 10. This mirrors how Python parses integer literals in source code.
int("0b1010", 0) # Returns 10 (binary inferred from 0b prefix)
int("0o17", 0) # Returns 15 (octal inferred from 0o prefix)
int("0xff", 0) # Returns 255 (hex inferred from 0x prefix)
int("42", 0) # Returns 42 (decimal, no prefix)
Note that passing base=0 means a plain string like "042" (without a prefix) will raise a ValueError. In Python 2, a string starting with zero was interpreted as octal, but Python 3 eliminates that ambiguity entirely.
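One place base=0 is handy is a setting that users may write in either decimal or hex. The sketch below wraps it in a hypothetical helper, parse_prefixed, and shows the zero-padding pitfall described above.

```python
def parse_prefixed(text):
    """Parse text the way Python source code reads an integer literal."""
    return int(text, 0)

print(parse_prefixed("4096"))    # 4096
print(parse_prefixed("0x1000"))  # 4096
print(parse_prefixed("-0x1f"))   # -31
print(parse_prefixed("0"))       # 0

# A zero-padded decimal string is ambiguous and therefore rejected.
try:
    parse_prefixed("042")
except ValueError as exc:
    padded_error = str(exc)
    print("rejected:", padded_error)
```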
Error Handling and Input Validation
When int() cannot parse the string you give it, it raises a ValueError. This is the correct exception type: the value is of the right type (a string), but its content is not a valid integer representation.
int("hello") # ValueError: invalid literal for int() with base 10: 'hello'
int("3.14") # ValueError: invalid literal for int() with base 10: '3.14'
int("") # ValueError: invalid literal for int() with base 10: ''
int("1 2") # ValueError: invalid literal for int() with base 10: '1 2'
Notice the third example: an empty string raises ValueError. This catches many developers off guard when reading from sources that may produce empty fields.
Also worth noting: "3.14" fails even though the string represents a valid number. int() does not perform float parsing. If you have a string like "3.14" and you want the integer 3, you need to convert to float first:
result = int(float("3.14")) # Returns 3 (truncates toward zero)
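Two details of the float detour are worth seeing in code: truncation goes toward zero rather than down, and a float carries only 53 bits of precision, so very large values can change on the way through. The sketch below uses decimal.Decimal from the standard library as one possible alternative for the second case; that choice is a suggestion, not something the pattern above requires.

```python
from decimal import Decimal

# Truncation is toward zero, which differs from floor for negatives.
print(int(float("3.99")))    # 3
print(int(float("-3.99")))   # -3
print(round(float("3.99")))  # 4 (use round() if rounding is wanted)

# 2**53 + 1 cannot be represented exactly as a float.
big = "9007199254740993.0"
via_float = int(float(big))      # off by one
via_decimal = int(Decimal(big))  # exact

print(via_float)    # 9007199254740992
print(via_decimal)  # 9007199254740993
```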
Using try/except for Robust Conversion
The standard pattern for safe conversion is a try/except block targeting ValueError:
def safe_to_int(value, default=None):
    try:
        return int(value)
    except (ValueError, TypeError):
        return default
safe_to_int("42") # Returns 42
safe_to_int("hello") # Returns None
safe_to_int(None) # Returns None (TypeError caught)
safe_to_int("bad", 0) # Returns 0 (custom default)
The TypeError catch in the function above is intentional. If you pass a non-string type that int() cannot handle — like None, a list, or a dict — Python raises TypeError, not ValueError. A utility function meant to handle untrusted input should catch both.
Pre-validation with str.isdigit() and str.isnumeric()
You can also check whether a string is convertible before calling int(). The string methods isdigit() and isnumeric() are commonly used for this, but they have a limitation worth understanding.
"42".isdigit() # True
"-42".isdigit() # False — the minus sign is not a digit
"42.0".isdigit() # False — the dot is not a digit
"".isdigit() # False
isdigit() returns True only when every character in the string is a Unicode digit character. It does not account for a leading sign. For input that may include a sign, you need a slightly more careful check:
def is_valid_int_string(s):
    s = s.strip()
    if s.startswith(('+', '-')):
        s = s[1:]
    return bool(s) and s.isdigit()
is_valid_int_string("-42") # True
is_valid_int_string("+7") # True
is_valid_int_string("3.14") # False
is_valid_int_string("") # False
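isnumeric() accepts a wider range of Unicode characters than isdigit(), including numeric characters from other writing systems and fractions like ½. These are technically numeric in Unicode but are not valid inputs for int() in base 10. Rely on isdigit() or a try/except approach rather than isnumeric() for pre-validation of integer strings. Be aware that isdigit() is itself slightly too permissive: it returns True for characters such as the superscript "²", which int() rejects. str.isdecimal() is the closest match to what int() accepts, and try/except remains the only check that is exact by construction.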
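The differences between the three string predicates are easiest to see side by side. The sketch below compares str.isdecimal(), str.isdigit(), and str.isnumeric() against what int() really accepts, for a handful of sample strings chosen for illustration. The helper name int_accepts is hypothetical.

```python
def int_accepts(text):
    """Return True if int(text) succeeds in base 10."""
    try:
        int(text)
    except ValueError:
        return False
    return True

samples = {
    "ASCII digits": "42",
    "superscript two": "\u00b2",
    "fraction one half": "\u00bd",
    "Arabic-Indic digits 4 and 2": "\u0664\u0662",
}

for label, text in samples.items():
    print(
        f"{label}: isdecimal={text.isdecimal()} isdigit={text.isdigit()} "
        f"isnumeric={text.isnumeric()} int_accepts={int_accepts(text)}"
    )
```

In this sample set, isdecimal() agrees with int() on every row, while isdigit() says True for the superscript that int() rejects and isnumeric() additionally says True for the fraction.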
A common bug in a helper like safe_to_int() is catching only TypeError. The fix is to change except TypeError to except (ValueError, TypeError). When int() receives a string it cannot parse, like "hello" or "3.14", it raises ValueError, not TypeError. TypeError is only raised when the argument is the wrong type entirely (such as None or a list). Catching only TypeError means every invalid string will still cause an unhandled exception.
Edge Cases Worth Knowing
Leading Zeros
In base 10, a string with leading zeros converts just fine — the zeros are discarded:
int("007") # Returns 7
int("0042") # Returns 42
However, using base=0 with a zero-padded decimal string raises ValueError, because Python 3 eliminates the Python 2 octal ambiguity. If you are using base=0, strings representing decimal numbers must have no leading zeros (unless the number is just "0").
Whitespace Handling
Leading and trailing whitespace is silently stripped before parsing. Whitespace embedded within the digit sequence causes a ValueError:
int(" 42 ") # Returns 42 (outer whitespace stripped)
int("4 2") # ValueError (internal space)
Negative Numbers
int("-100") # Returns -100
int("--100") # ValueError
int("1-00") # ValueError
The Python 3.11 Security Limit
Converting a very long string to an integer has quadratic time complexity — O(n²) — because of how Python's internal integer representation works. An attacker who can control a string being passed to int() could construct an extremely long numeric string and trigger a denial-of-service condition by consuming excessive CPU time. This vulnerability was tracked as CVE-2020-10735.
The fix shipped in Python 3.11 and was backported to the security releases 3.10.7, 3.9.14, 3.8.14, and 3.7.14 in September 2022. Since then, Python enforces a default limit of 4300 digits for conversions between integers and decimal strings. Attempting to convert a string with more than 4300 digits raises a ValueError.
# Raises ValueError in Python 3.11+ (default limit is 4300 digits)
huge_str = "9" * 5000
int(huge_str)
# ValueError: Exceeds the limit (4300 digits) for integer string conversion
# The limit can be raised or removed at runtime
import sys
sys.set_int_max_str_digits(10000) # Raise to 10,000 digits
sys.set_int_max_str_digits(0) # 0 disables the limit entirely
The limit can also be set via the environment variable PYTHONINTMAXSTRDIGITS or the command-line option -X int_max_str_digits=N. It applies to decimal and other non-power-of-two bases, in both directions (string to integer and integer to string). Bases that are powers of 2, such as binary, octal, and hexadecimal, are exempt because their conversion algorithms are linear-time.
ast.literal_eval() as a Safer Alternative
Python's ast module provides literal_eval(), which safely evaluates an expression string containing only Python literals: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
import ast
ast.literal_eval("42") # Returns 42 (int)
ast.literal_eval("3.14") # Returns 3.14 (float)
ast.literal_eval("'hi'") # Returns 'hi' (str)
Unlike eval(), literal_eval() refuses to execute arbitrary code. A string containing anything other than a literal, such as a function call, raises ValueError (or SyntaxError if it is not valid Python at all) rather than being executed.
However, there are two reasons int() remains the right choice for converting numeric strings in the general case. First, int() is significantly faster — benchmarks consistently show int() completing in a fraction of the time required by literal_eval(), because literal_eval() invokes the full parser and AST machinery. Second, int() gives you explicit control over the base, which literal_eval() does not expose in the same way.
Where ast.literal_eval() earns its place is in situations where the string might represent multiple possible types and you want to recover the correct one without eval(), or when the source of the string is untrusted, you would otherwise be tempted to reach for eval(), and you want the strongest possible barrier against code injection.
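A minimal sketch of using literal_eval() for integers follows. The helper name literal_to_int is hypothetical. It catches both ValueError and SyntaxError, and checks the exact result type because bool is a subclass of int.

```python
import ast

def literal_to_int(text):
    """Return an int if text is a Python integer literal, else None."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None
    # bool is a subclass of int, so compare the exact type.
    return value if type(value) is int else None

print(literal_to_int("42"))                # 42
print(literal_to_int("0x1f"))              # 31
print(literal_to_int("3.14"))              # None (a float literal)
print(literal_to_int("True"))              # None (a bool literal)
print(literal_to_int("__import__('os')"))  # None (not a literal)
print(literal_to_int("007"))               # None (not valid source syntax)
```

The last line shows a behavioral difference worth remembering: int("007") returns 7, but literal_eval() follows source-code rules, where a zero-padded decimal literal is a syntax error.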
How to Convert a String to an Integer in Python
The following steps cover the complete, production-appropriate process for converting string values to integers safely.
- Call int() with the string: Pass the string as the first argument to int(). For a decimal string, no second argument is needed: int('42') returns 42. For binary, octal, or hex strings, pass the base as the second argument, for example int('FF', 16) for hexadecimal.
- Wrap the call in a try/except block: Enclose int() in a try block and catch ValueError for invalid string content and TypeError for wrong-type arguments. In your except clause, either return a default value, re-prompt the user, or raise a descriptive error depending on your use case.
- Validate the string before converting if needed: For pre-validation without exceptions, use str.isdigit() on the stripped string. Remember that isdigit() returns False for strings with a leading sign, so strip the sign character first if signed integers are valid input. A try/except approach handles all cases correctly and is often simpler.
- Account for the Python 3.11 digit limit when working with large numbers: If your application processes numeric strings longer than 4300 digits, call sys.set_int_max_str_digits() to raise or remove the limit before attempting conversion. This limit exists to prevent denial-of-service through the O(n²) conversion complexity and does not apply to power-of-two bases such as binary, octal, and hexadecimal.
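The first two steps above can be sketched as one small function that raises a descriptive error instead of returning a default. The name parse_int_field and its parameters are hypothetical, chosen for the example.

```python
def parse_int_field(name, raw, base=10):
    """Convert raw text to int, or raise ValueError naming the field."""
    try:
        return int(raw, base)
    except (ValueError, TypeError) as exc:
        # TypeError covers None and other non-string inputs.
        raise ValueError(f"{name}: cannot convert {raw!r} to int") from exc

print(parse_int_field("age", " 42 "))     # 42
print(parse_int_field("mask", "ff", 16))  # 255

try:
    parse_int_field("age", "forty-two")
except ValueError as exc:
    print(exc)  # age: cannot convert 'forty-two' to int
```

Re-raising with the field name keeps the original exception chained via "from exc", so the underlying message is still available when debugging.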
Practical Patterns
The patterns below go beyond the basic int() call. Each addresses a category of real-world problem that basic tutorials tend to skip: user input loops, hex color parsing, environment configuration, locale-formatted numbers, binary protocol bytes, and mixed-string extraction.
Converting User Input in a Loop
The idiomatic Python pattern for prompted numeric input is a while True loop with a break on success. This avoids the need for pre-validation and handles all invalid input cleanly.
while True:
    raw = input("Enter a whole number: ")
    try:
        number = int(raw)
        break
    except ValueError:
        print(f"'{raw}' is not a valid integer. Try again.")
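For code that needs to be tested without a keyboard, the same loop can be wrapped in a function that takes its input and output callables as parameters. This is a sketch under that assumption. The name prompt_int, the read and write parameters, and the max_attempts cap are all additions for illustration.

```python
def prompt_int(prompt, read=input, write=print, max_attempts=3):
    """Ask until the reply parses as an int, up to max_attempts times."""
    for _ in range(max_attempts):
        raw = read(prompt)
        try:
            return int(raw)
        except ValueError:
            write(f"'{raw}' is not a valid integer. Try again.")
    raise ValueError(f"no valid integer after {max_attempts} attempts")

# Simulated session: two bad replies, then a good one.
answers = iter(["abc", "4.5", "12"])
messages = []
result = prompt_int(
    "Enter a whole number: ",
    read=lambda _prompt: next(answers),
    write=messages.append,
)
print(result)         # 12
print(len(messages))  # 2
```

In normal use the defaults apply, so prompt_int("Enter a whole number: ") behaves like the loop above with an upper bound on retries.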
Parsing Hex Color Codes
The base-16 overload of int() makes RGB extraction from hex color strings concise and readable. Each two-character slice is an independent channel value.
def hex_to_rgb(hex_color):
    hex_color = hex_color.lstrip("#")
    r = int(hex_color[0:2], 16)
    g = int(hex_color[2:4], 16)
    b = int(hex_color[4:6], 16)
    return (r, g, b)
hex_to_rgb("#306998") # Returns (48, 105, 152)
hex_to_rgb("#FFD43B") # Returns (255, 212, 59)
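The function above trusts its input to be exactly six hex digits. A stricter variant might look like the sketch below, which also expands the three-digit shorthand. The name parse_hex_color and the shorthand handling are assumptions added for illustration. The explicit character check matters because int() with base 16 would otherwise accept a sign or whitespace inside a slice.

```python
import string

def parse_hex_color(text):
    """Parse '#rrggbb' or '#rgb' into an (r, g, b) tuple of ints."""
    digits = text.strip()
    if digits.startswith("#"):
        digits = digits[1:]
    if len(digits) == 3:
        # Shorthand: 'abc' means 'aabbcc'.
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6 or any(ch not in string.hexdigits for ch in digits):
        raise ValueError(f"not a hex color: {text!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

print(parse_hex_color("#306998"))  # (48, 105, 152)
print(parse_hex_color("#abc"))     # (170, 187, 204)
```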
Reading Environment Variables
Environment variables are always strings. Converting them at the point of reading, with a typed default, is cleaner than scattering int() calls throughout application logic.
import os
port = int(os.environ.get("PORT", "8080"))
max_retries = int(os.environ.get("MAX_RETRIES", "3"))
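A bare int() call at import time will crash with a generic message if the variable is set to something unparsable. One way to tighten this up is a small helper that names the variable in the error and enforces a range. The sketch below is hypothetical: env_int, its parameters, and the DEMO_PORT variable are not part of any library.

```python
import os

def env_int(name, default, minimum=None, maximum=None):
    """Read an integer environment variable with a default and bounds."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if minimum is not None and value < minimum:
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{name} must be <= {maximum}, got {value}")
    return value

# DEMO_PORT is an invented name, set here only for the demonstration.
os.environ["DEMO_PORT"] = "9000"
port = env_int("DEMO_PORT", default=8080, minimum=1, maximum=65535)
print(port)  # 9000
```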
Comma-Separated and Locale-Formatted Numbers
Data from spreadsheets, financial APIs, and international sources often delivers numbers as formatted strings: "1,234,567" in US English, or "1.234.567" in German. Neither passes directly to int().
For US-formatted strings where the comma is always a thousands separator, stripping the comma before conversion is the simplest approach:
# US-style: comma is the thousands separator
int("1,234,567".replace(",", "")) # Returns 1234567
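A plain replace() also accepts malformed input such as "1,2,3,4". If that matters, the grouping can be validated first. The sketch below is one way to do it with a regular expression, with the separator passed in explicitly. The name parse_grouped_int and its group_sep parameter are hypothetical.

```python
import re

def parse_grouped_int(text, group_sep=","):
    """Parse an int whose thousands groups are separated by group_sep."""
    sep = re.escape(group_sep)
    # Either properly grouped digits (1,234,567) or no separators at all.
    pattern = rf"[+-]?(?:\d{{1,3}}(?:{sep}\d{{3}})+|\d+)"
    cleaned = text.strip()
    if not re.fullmatch(pattern, cleaned):
        raise ValueError(f"badly grouped integer: {text!r}")
    return int(cleaned.replace(group_sep, ""))

print(parse_grouped_int("1,234,567"))       # 1234567
print(parse_grouped_int("1.234.567", "."))  # 1234567
print(parse_grouped_int("-42"))             # -42
```

Passing the separator explicitly avoids any dependence on process-wide state, at the cost of the caller having to know which convention the data uses.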
For applications that must handle multiple locales correctly — where the separator character itself depends on the region — the standard library's locale module provides locale.atoi(). This function parses an integer string using the rules of the active locale, stripping grouping separators automatically according to LC_NUMERIC:
import locale
# Use the system's preferred locale
locale.setlocale(locale.LC_NUMERIC, "")
# Parses according to the active LC_NUMERIC locale
value = locale.atoi("1234567")
print(value) # 1234567 (int)
print(type(value)) # <class 'int'>
The Python documentation notes that setlocale() is a process-wide setting and is not thread-safe. In multi-threaded applications, set the locale once at startup rather than toggling it per request. The locale module's LC_NUMERIC category controls number parsing; LC_ALL sets all categories at once.
Converting Binary Protocol Bytes to Integers
Network protocols, binary file formats, and hardware interfaces frequently deliver integers as raw bytes rather than text strings. Python's int.from_bytes() class method handles this directly, with explicit control over byte order (endianness) and signedness — two things that int() cannot express.
# Big-endian unsigned: most significant byte first
# Used in network protocols (TCP/IP, DNS, HTTP/2)
int.from_bytes(b'\x00\x2a', byteorder='big') # Returns 42
# Little-endian: least significant byte first
# Native byte order of x86/x64 CPUs; used by formats such as ZIP, BMP, and WAV
int.from_bytes(b'\x2a\x00', byteorder='little') # Returns 42
# Signed integer — two's complement interpretation
int.from_bytes(b'\xff', byteorder='big', signed=True) # Returns -1
int.from_bytes(b'\xff', byteorder='big', signed=False) # Returns 255
# 4-byte big-endian packet field (e.g. a sequence number)
raw_packet = b'\x00\x00\x04\xd2'
seq_num = int.from_bytes(raw_packet, byteorder='big')
print(seq_num) # 1234
The inverse operation — converting an integer back to bytes for transmission — is int.to_bytes(length, byteorder). Together these two methods give you a clean, pure-Python way to serialize and deserialize fixed-width integer fields without importing any external library.
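A brief round-trip sketch shows the two methods working together, along with what happens when a value does not fit in the requested width. The variable names are illustrative.

```python
seq_num = 1234

# Encode into a fixed-width, big-endian field and decode it again.
encoded = seq_num.to_bytes(4, byteorder="big")
decoded = int.from_bytes(encoded, byteorder="big")
print(encoded)  # b'\x00\x00\x04\xd2'
print(decoded)  # 1234

# Negative values need signed=True on both sides.
neg_encoded = (-1).to_bytes(2, byteorder="little", signed=True)
print(neg_encoded)  # b'\xff\xff'

# A value that does not fit in the requested width raises OverflowError.
try:
    (256).to_bytes(1, byteorder="big")
except OverflowError as exc:
    overflow_message = str(exc)
    print("OverflowError:", overflow_message)
```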
When a binary record contains multiple adjacent fields of different types (for example, a 2-byte unsigned short followed by a 4-byte signed int), use struct.unpack() from the standard library. It unpacks an entire record into a tuple in one call and is more efficient than multiple int.from_bytes() calls with manual slice offsets. For a single integer field, int.from_bytes() is cleaner.
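The record described above, a 2-byte unsigned short followed by a 4-byte signed int, can be sketched as follows. The field names msg_type and delta are invented, and big-endian order is assumed. The ">" prefix in the format string also turns off alignment padding, so the record is exactly 6 bytes.

```python
import struct

# ">": big-endian, no padding. "H": unsigned 16-bit. "i": signed 32-bit.
RECORD = struct.Struct(">Hi")

payload = RECORD.pack(42, -1234)
msg_type, delta = RECORD.unpack(payload)
print(RECORD.size)      # 6
print(msg_type, delta)  # 42 -1234

# The equivalent with int.from_bytes() needs manual slice offsets.
msg_type_2 = int.from_bytes(payload[0:2], byteorder="big")
delta_2 = int.from_bytes(payload[2:6], byteorder="big", signed=True)
print(msg_type_2, delta_2)  # 42 -1234
```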
Extracting Integers from Mixed Strings with re.fullmatch()
Sometimes the string containing the number also contains non-numeric characters — units, labels, noise from parsing — and you need to validate and extract the integer component before converting. A regular expression with re.fullmatch() is the right tool here: it checks the entire string against a pattern and either returns a match object or None, with no exception machinery needed.
import re
# Pattern: optional sign, then one or more digits, nothing else
_INT_PATTERN = re.compile(r'^[+-]?\d+$')
def strict_int(s: str) -> int | None:
    """Return int if s is a valid signed integer string, else None."""
    s = s.strip()
    if _INT_PATTERN.fullmatch(s):
        return int(s)
    return None
strict_int("42") # Returns 42
strict_int("-7") # Returns -7
strict_int("42px") # Returns None (suffix not allowed)
strict_int("3.14") # Returns None (float string rejected)
strict_int("0b101") # Returns None (prefix notation rejected)
The compiled pattern is defined at module level so it is only compiled once rather than on every call. The approach is particularly useful in data cleaning pipelines where you want to flag unconvertible values explicitly rather than silently substituting a default, and where a try/except loop would obscure whether the failure came from a bad value or a different code path.
For cases where the integer is embedded within a larger string — say, extracting a numeric ID from a log line like "user_id=8421 action=login" — switch from re.fullmatch() to re.search() with a named capture group:
import re
log_line = "user_id=8421 action=login"
m = re.search(r'user_id=(?P<uid>\d+)', log_line)
if m:
    user_id = int(m.group("uid"))
    print(user_id) # 8421
Named groups make the extraction self-documenting and resilient to changes in surrounding field order. The int() call on the captured group is always safe because the group pattern \d+ guarantees the match contains only digit characters.
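The same idea extends to pulling every integer-valued field out of a line at once. The sketch below uses re.finditer() with two named groups. The helper name int_fields and the sample log line are made up for the example, and the trailing \b is there so that a value like "12ab" is skipped rather than partially matched.

```python
import re

# key=value pairs where the value is an optionally negative integer.
_FIELD = re.compile(r"(?P<key>\w+)=(?P<value>-?\d+)\b")

def int_fields(line):
    """Return a dict of all integer-valued key=value fields in line."""
    return {m.group("key"): int(m.group("value")) for m in _FIELD.finditer(line)}

fields = int_fields("user_id=8421 action=login retries=3 delta=-2")
print(fields)  # {'user_id': 8421, 'retries': 3, 'delta': -2}
```

Non-numeric fields such as action=login simply do not match, so they are left out of the result rather than raising.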
Key Takeaways
- int() is the correct tool: For converting a numeric string to an integer in Python, int() is the built-in function designed for exactly this purpose. It handles decimal, signed, whitespace-padded, and underscore-grouped strings out of the box.
- Always catch ValueError: Any code path that calls int() on externally supplied data should handle ValueError. If the input can also be None or a non-string type, catch TypeError as well.
- The base parameter extends int() to other number systems: Binary, octal, and hexadecimal strings are all supported. Passing base=0 lets Python infer the base from the string's prefix.
- Know the 4300-digit security limit: Python 3.11 and corresponding patch releases enforce a default limit of 4300 digits on decimal string-to-integer conversions to mitigate CVE-2020-10735. This limit is configurable via sys.set_int_max_str_digits().
- isdigit() has a sign limitation: The str.isdigit() method returns False for strings with a leading minus sign. Use re.fullmatch(r'^[+-]?\d+$', s) for a single-pass check that handles signs correctly and rejects noise like units or prefixes.
- Use locale.atoi() for formatted international numbers: Strings like "1,234,567" or "1.234.567" are not valid inputs for int(). The standard library's locale.atoi() strips grouping separators according to LC_NUMERIC before converting, making it the correct tool for data from spreadsheets, financial feeds, and international sources.
- Use int.from_bytes() for binary data: Network protocols and binary file formats deliver integers as raw bytes, not text strings. int.from_bytes(data, byteorder='big') converts those bytes directly and correctly, with explicit control over endianness and signedness that int() cannot express. For multi-field binary records, use struct.unpack() instead.
- ast.literal_eval() is safer in adversarial contexts: When you need to prevent code injection and the input could represent multiple Python literal types, ast.literal_eval() is the right choice. For performance-sensitive or straightforward numeric parsing, prefer int().
Frequently Asked Questions
How do I convert a string to an integer in Python?
Use the built-in int() function. Pass the string as the first argument: int('42') returns 42. The function strips leading and trailing whitespace automatically and accepts an optional base parameter for non-decimal conversions. If the string cannot be parsed, int() raises a ValueError.
What happens when int() cannot parse a string?
int() raises a ValueError when the string content is not a valid integer representation. For example, int('hello'), int('3.14'), and int('') all raise ValueError. If the argument is the wrong type entirely (such as None or a list), int() raises a TypeError instead.
Can int() convert binary, octal, and hexadecimal strings?
Yes. Pass the base as the second argument: int('1010', 2) converts a binary string and returns 10, int('17', 8) converts an octal string and returns 15, and int('FF', 16) converts a hex string and returns 255. Passing base=0 tells Python to infer the base from the string prefix: 0b for binary, 0o for octal, and 0x for hexadecimal.
What is the integer string conversion limit in Python 3.11?
Python 3.11 and the corresponding patch releases of earlier versions introduced a default limit of 4300 digits for decimal string-to-integer conversions. Strings longer than this raise a ValueError. The limit exists to prevent denial-of-service attacks exploiting the O(n²) complexity of large integer conversions, as documented in CVE-2020-10735. The limit can be adjusted using sys.set_int_max_str_digits().
Does int() handle whitespace in a string?
Yes. int() strips leading and trailing whitespace before parsing, so int(' 42 ') returns 42. However, whitespace embedded within the digit sequence causes a ValueError. For example, int('4 2') raises ValueError because the space is between digits, not at the edges.
When should I use ast.literal_eval() instead of int()?
Use ast.literal_eval() when the input is untrusted and might represent multiple Python literal types, or when you need the strongest possible barrier against code injection. Unlike eval(), ast.literal_eval() refuses to execute arbitrary code. For straightforward numeric string parsing, int() is faster and preferred.
Why does str.isdigit() return False for negative numbers?
str.isdigit() returns True only when every character in the string is a Unicode digit character. A leading minus sign is not a digit character, so '-42'.isdigit() returns False. To validate a string that may include a sign, strip the sign first and then check the remainder with isdigit(), or use a try/except block around int().
How do I safely convert user input to an integer?
Wrap the int() call in a try/except block targeting ValueError. A loop that prompts again on failure is the standard pattern: call input() to get the raw string, attempt int() conversion inside try, break on success, and print an error message and continue on ValueError. Catching TypeError as well handles cases where None or a non-string type is passed.
References
- Python Software Foundation. Built-in Functions — Python 3 documentation. docs.python.org/3/library/functions.html
- Python Software Foundation. Integer string conversion length limitation. docs.python.org/3/library/stdtypes.html
- Python Software Foundation. ast.literal_eval — ast module. docs.python.org/3/library/ast.html
- Python Software Foundation. locale — Internationalization services. docs.python.org/3/library/locale.html
- Python Software Foundation. re — Regular expression operations. docs.python.org/3/library/re.html
- MITRE Corporation. CVE-2020-10735: Prevent DoS by large int<->str conversions. cve.mitre.org
- CPython Issue Tracker. Issue #95778: int string conversion O(n^2). github.com/python/cpython/issues/95778
- Python Software Foundation. What's New in Python 3.11. docs.python.org/3/whatsnew/3.11.html