Converting Strings to Integers in Python

Converting a string to an integer is one of the operations you will perform most often in Python, and also one of the easiest to get subtly wrong. This Python tutorial covers every aspect of the conversion: the mechanics of int(), base conversions, validation strategies, error handling, the security-driven digit limit introduced in Python 3.11, and the cases where ast.literal_eval(), the safe alternative to eval(), is worth reaching for.

Python is a strongly typed language. That means you cannot treat a string and an integer as interchangeable, even when the string contains nothing but digits. The value "42" and the value 42 are fundamentally different objects: one is a sequence of characters, the other is a numeric type that supports arithmetic. Conversion bridges that gap, and understanding exactly what happens during conversion will save you from bugs that can be surprisingly hard to trace.

Why Strings Hold Numbers

Before getting into the mechanics, it is worth asking: where do numeric strings come from? The answer reveals why conversion is such a routine task.

User input is always a string. When you call input(), Python returns whatever the user typed as a str, regardless of content. Reading a CSV file, a config file, an environment variable, or a command-line argument produces the same result: raw text. HTTP request parameters, JSON fields in which a number was serialized as a quoted string, and database drivers that return untyped text columns all hand you strings. If you want to do arithmetic on any of those values, you have to convert them first.

This is not a quirk of Python; it reflects how computers represent external input. The conversion step is deliberate: it forces you to acknowledge that parsing can fail and decide what to do when it does.
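
As a small illustration of the point above, the sketch below reads a CSV from an in-memory string and shows that every field arrives as a str until it is converted explicitly. The column names and values are invented for the example.

```python
import csv
import io

# Hypothetical CSV content standing in for a real file.
raw_csv = "item,quantity\nwidget,12\ngadget,30\n"

rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Every field is text, even the ones that look numeric.
field_types = {type(row["quantity"]) for row in rows}
print(field_types)  # {<class 'str'>}

# Arithmetic only works after an explicit conversion.
total = sum(int(row["quantity"]) for row in rows)
print(total)  # 42
```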

Exercise

Build the correct Python statement to convert the string "99" to an integer and assign it to a variable called score, using some of these tokens: int(  str(  score  "99"  =  )  float(

Answer: score = int("99"). The variable name comes first, on the left of the assignment operator =, followed by the int( call, the string argument "99", and the closing parenthesis. str() converts to a string (the opposite direction), and float() produces a float, not an integer.

The int() Function in Detail

The primary tool for this job is the built-in int() function. According to the official Python documentation, int() is defined with the signature:

python
int(x=0)
int(x, base=10)

When called with a string as its first argument, int() parses the string and returns an integer object. The default base is 10, which means the string is treated as a decimal number unless you specify otherwise.

python
# Basic conversion
age_str = "29"
age = int(age_str)
print(age)         # 29
print(type(age))  # <class 'int'>

The return type is always a Python int. Unlike languages with fixed-width integer types (such as C's int32_t or Java's int), Python integers can be arbitrarily large. The official documentation describes int as having unlimited precision, bounded only by available memory. This means there is no overflow to worry about for large numeric strings — with one important caveat covered in the security section below.

Python Documentation

According to the Python 3 built-in functions documentation (docs.python.org), if x is not a number or if base is given, then x must be a string, bytes, or bytearray instance representing an integer literal in that radix. This is why passing a float string like "3.14" directly raises ValueError: it is not a valid integer literal in any supported base.

That note highlights something easy to miss: int() also accepts bytes and bytearray objects, not just strings. The parsing rules are the same.
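
A minimal sketch of the bytes and bytearray behaviour mentioned above; the variable names are only for illustration.

```python
# bytes and bytearray are parsed with the same rules as str.
from_bytes_obj = int(b"42")
from_bytearray = int(bytearray(b"-7"))
from_hex_bytes = int(b"ff", 16)

print(from_bytes_obj, from_bytearray, from_hex_bytes)  # 42 -7 255
```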

What int() Accepts

When parsing a string in base 10, int() accepts the following forms:

  • A plain sequence of decimal digits: "42", "1000"
  • A leading sign: "-7", "+3"
  • Leading and trailing whitespace: " 42 " is valid
  • Underscore separators between digits, since Python 3.6: "1_000_000"
python
int("  -42  ")    # Returns -42  (whitespace is stripped)
int("+99")       # Returns 99   (leading + is accepted)
int("1_000")     # Returns 1000 (underscore grouping, Python 3.6+)
Note

The underscore grouping behavior was added in Python 3.6 to mirror the way numeric literals can be written in code. int("1_000_000") returns 1000000. The Python changelog documents this addition under the 3.6 release notes for built-in functions.
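
Underscores are only accepted as single separators between digits, following the same rule as numeric literals. The sketch below uses a small hypothetical helper, try_int, to show which placements are rejected.

```python
def try_int(text):
    """Return the parsed int, or the string 'ValueError' if parsing fails."""
    try:
        return int(text)
    except ValueError:
        return "ValueError"

print(try_int("1_000_000"))  # 1000000
print(try_int("1_0_0"))      # 100
print(try_int("1__000"))     # ValueError (doubled underscore)
print(try_int("_1000"))      # ValueError (leading underscore)
print(try_int("1000_"))      # ValueError (trailing underscore)
```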

Base Conversions: Binary, Octal, Hex

The second parameter of int() is base. It accepts any integer between 2 and 36 inclusive, or the special value 0. When you supply a base, you are telling Python how to interpret the digits in the string.

python
# Binary (base 2)
int("1010", 2)   # Returns 10

# Octal (base 8)
int("17", 8)     # Returns 15

# Hexadecimal (base 16)
int("1F", 16)    # Returns 31
int("ff", 16)    # Returns 255 (case-insensitive)
int("0xFF", 16)  # Returns 255 (0x prefix accepted)

# Base 36 (digits 0-9 and letters a-z)
int("z", 36)     # Returns 35

The result is always an ordinary Python int, which print() and repr() display in decimal regardless of the input base. There is no "hex integer" or "binary integer" type in Python. There is only int, and the base you supply tells the parser how to read the input string.
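
To make the point concrete, the sketch below parses one hexadecimal string and then renders the resulting int in several bases using the built-ins bin(), oct(), hex(), and format(). The base only matters when reading or displaying the value.

```python
value = int("ff", 16)

print(value)               # 255 (decimal is only the default display)
print(bin(value))          # 0b11111111
print(oct(value))          # 0o377
print(hex(value))          # 0xff
print(format(value, "b"))  # 11111111 (no prefix)
print(f"{value:#06x}")     # 0x00ff (prefix plus zero padding to width 6)

# Parsing the same quantity in different bases yields equal int objects.
same_value = int("1010", 2) == int("12", 8) == int("a", 16) == 10
print(same_value)          # True
```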

Pro Tip

Passing base=0 is a special case: it tells Python to infer the base from the string's prefix. A string starting with 0b or 0B is treated as binary, 0o or 0O as octal, 0x or 0X as hexadecimal, and a plain string of decimal digits as base 10. This mirrors how Python parses integer literals in source code.

python
int("0b1010", 0)  # Returns 10  (binary inferred from 0b prefix)
int("0o17", 0)    # Returns 15  (octal inferred from 0o prefix)
int("0xff", 0)    # Returns 255 (hex inferred from 0x prefix)
int("42", 0)      # Returns 42  (decimal, no prefix)

Note that passing base=0 means a plain string like "042" (without a prefix) will raise a ValueError. In Python 2, a string starting with zero was interpreted as octal, but Python 3 eliminates that ambiguity entirely.

Error Handling and Input Validation

When int() cannot parse the string you give it, it raises a ValueError. This is the correct exception type: the value is of the right type (a string), but its content is not a valid integer representation.

python
int("hello")   # ValueError: invalid literal for int() with base 10: 'hello'
int("3.14")    # ValueError: invalid literal for int() with base 10: '3.14'
int("")        # ValueError: invalid literal for int() with base 10: ''
int("1 2")     # ValueError: invalid literal for int() with base 10: '1 2'

Notice the third example: an empty string raises ValueError. This catches many developers off guard when reading from sources that may produce empty fields.

Also worth noting: "3.14" fails even though the string represents a valid number. int() does not perform float parsing. If you have a string like "3.14" and you want the integer 3, you need to convert to float first:

python
result = int(float("3.14"))  # Returns 3 (truncates toward zero)
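
Two consequences of going through float are easy to overlook: truncation is not rounding, and floats cannot represent every large integer exactly. The sketch below shows both, and uses decimal.Decimal from the standard library as one possible exact alternative (a suggestion, not something the conversion requires).

```python
from decimal import Decimal

print(int(float("3.99")))    # 3   (truncated, not rounded)
print(int(float("-3.99")))   # -3  (toward zero, not floor)
print(round(float("3.99")))  # 4   (use round() if rounding is intended)

# A float holds roughly 15 to 17 significant digits, so large values can change.
via_float = int(float("9007199254740993"))
print(via_float)             # 9007199254740992

# Decimal parses the text exactly before truncating.
via_decimal = int(Decimal("9007199254740993.7"))
print(via_decimal)           # 9007199254740993
```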

Using try/except for Robust Conversion

The standard pattern for safe conversion is a try/except block targeting ValueError:

python
def safe_to_int(value, default=None):
    try:
        return int(value)
    except (ValueError, TypeError):
        return default

safe_to_int("42")        # Returns 42
safe_to_int("hello")     # Returns None
safe_to_int(None)         # Returns None (TypeError caught)
safe_to_int("bad", 0)    # Returns 0 (custom default)

The TypeError catch in the function above is intentional. If you pass a non-string type that int() cannot handle — like None, a list, or a dict — Python raises TypeError, not ValueError. A utility function meant to handle untrusted input should catch both.

Pre-validation with str.isdigit() and str.isnumeric()

You can also check whether a string is convertible before calling int(). The string methods isdigit() and isnumeric() are commonly used for this, but they have a limitation worth understanding.

python
"42".isdigit()     # True
"-42".isdigit()    # False — the minus sign is not a digit
"42.0".isdigit()  # False — the dot is not a digit
"".isdigit()      # False

isdigit() returns True only when every character in the string is a Unicode digit character. It does not account for a leading sign. For input that may include a sign, you need a slightly more careful check:

python
def is_valid_int_string(s):
    s = s.strip()
    if s.startswith(('+', '-')):
        s = s[1:]
    return bool(s) and s.isdigit()

is_valid_int_string("-42")    # True
is_valid_int_string("+7")     # True
is_valid_int_string("3.14")  # False
is_valid_int_string("")       # False
Watch Out

isnumeric() accepts a wider range of Unicode characters than isdigit(), including fractions like ½ and numerals such as the Chinese 五. These are numeric in Unicode but are not valid inputs for int(). isdigit() is stricter but still accepts characters such as the superscript ², which int() also rejects; str.isdecimal() is the method that corresponds to the digit characters int() actually parses. Prefer isdecimal() or a try/except approach rather than isnumeric() for pre-validation of integer strings.
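
The differences between the three string methods are easiest to see side by side. The sketch below runs a few sample characters through each method and through int() itself; int_or_none is a hypothetical helper written for this comparison.

```python
# "42", Arabic-Indic digits for 42, superscript two, one half, CJK numeral five
samples = ["42", "\u0664\u0662", "\u00b2", "\u00bd", "\u4e94"]

def int_or_none(text):
    """Return int(text), or None when int() rejects the string."""
    try:
        return int(text)
    except ValueError:
        return None

for s in samples:
    print(ascii(s), s.isdecimal(), s.isdigit(), s.isnumeric(), int_or_none(s))

# isdecimal  isdigit  isnumeric  int()
# True       True     True       42     plain ASCII digits
# True       True     True       42     Arabic-Indic digits
# False      True     True       None   superscript two
# False      False    True       None   vulgar fraction one half
# False      False    True       None   CJK numeral five
```

Only the isdecimal() column agrees with int() on every row, which is why it is the closest pre-check for unsigned digit strings.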

Exercise: spot the bug

The function below is supposed to safely convert user input to an integer, but one line contains a bug that makes it fail.

python
def get_age(user_input):
    try:
        age = int(user_input)
    except TypeError:
        return None
    return age

The fix: Change except TypeError to except (ValueError, TypeError). When int() receives a string it cannot parse, such as "hello" or "3.14", it raises ValueError, not TypeError. TypeError is only raised when the argument is the wrong type entirely (such as None or a list). Catching only TypeError means every invalid string will still cause an unhandled exception.
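
With the fix described above applied, the function might look like this:

```python
def get_age(user_input):
    try:
        age = int(user_input)
    except (ValueError, TypeError):
        # ValueError: unparseable text; TypeError: None, list, and similar.
        return None
    return age

print(get_age("29"))     # 29
print(get_age("hello"))  # None
print(get_age(None))     # None
```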

Edge Cases Worth Knowing

Leading Zeros

In base 10, a string with leading zeros converts just fine — the zeros are discarded:

python
int("007")   # Returns 7
int("0042")  # Returns 42

However, using base=0 with a zero-padded decimal string raises ValueError, because Python 3 eliminates the Python 2 octal ambiguity. If you are using base=0, strings representing decimal numbers must have no leading zeros (unless the number is just "0").
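
The sketch below contrasts the two bases on the same zero-padded input. The parse helper is hypothetical and exists only to turn the exception into a printable value.

```python
def parse(text, base):
    """Return int(text, base), or the string 'ValueError' on failure."""
    try:
        return int(text, base)
    except ValueError:
        return "ValueError"

print(parse("042", 10))  # 42
print(parse("042", 0))   # ValueError (ambiguous leading zero)
print(parse("0o42", 0))  # 34 (explicit octal prefix)
print(parse("0", 0))     # 0
```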

Whitespace Handling

Leading and trailing whitespace is silently stripped before parsing. Whitespace embedded within the digit sequence causes a ValueError:

python
int("  42  ")  # Returns 42  (outer whitespace stripped)
int("4 2")     # ValueError  (internal space)

Negative Numbers

python
int("-100")   # Returns -100
int("--100")  # ValueError
int("1-00")   # ValueError

The Python 3.11 Security Limit

Converting a very long decimal string to an integer has quadratic time complexity, O(n²), in CPython, because translating between base 10 and the interpreter's internal binary representation is not a linear-time operation. An attacker who controls a string passed to int() could supply an extremely long numeric string and trigger a denial-of-service condition by consuming excessive CPU time. This vulnerability was tracked as CVE-2020-10735.

The security releases 3.10.7, 3.9.14, 3.8.14, and 3.7.14 (September 2022) and every release from Python 3.11 onward enforce a default limit of 4300 digits on conversions between int and str in base 10 and other bases that are not powers of 2. Attempting to convert a string with more than 4300 digits raises a ValueError.

python
import sys

huge_str = "9" * 5000

# Python 3.11+ (and the patched releases) reject this by default
try:
    int(huge_str)
except ValueError as exc:
    print(exc)  # Exceeds the limit (4300 digits) ... (exact wording varies by version)

sys.set_int_max_str_digits(10000)  # Raise the limit to 10,000 digits
int(huge_str)                      # Now succeeds
sys.set_int_max_str_digits(0)      # 0 disables the limit entirely
Note

The limit can also be set via the environment variable PYTHONINTMAXSTRDIGITS or the command-line option -X int_max_str_digits=N. The limit applies to decimal and other non-power-of-2 bases, in both directions (str to int and int to str). Bases that are powers of 2, such as binary, octal, and hexadecimal, are exempt because their conversion algorithms are linear-time.
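
Raising the interpreter-wide limit is rarely what an application that handles untrusted input wants. One possible alternative, sketched below, is to enforce a much smaller application-level bound before int() does any work. The helper name parse_bounded_int and the 20-digit bound are assumptions made for illustration. The last lines show that a power-of-2 base is not subject to the interpreter's limit.

```python
import sys

MAX_INPUT_DIGITS = 20  # Hypothetical application limit, far below 4300.

def parse_bounded_int(text, max_digits=MAX_INPUT_DIGITS):
    """Reject over-long input before int() does any parsing work."""
    stripped = text.strip()
    digits = stripped.lstrip("+-")
    if len(digits) > max_digits:
        raise ValueError(
            f"integer string has {len(digits)} characters; limit is {max_digits}"
        )
    return int(stripped)

print(parse_bounded_int("  -12345 "))  # -12345

try:
    parse_bounded_int("9" * 5000)
except ValueError as exc:
    print(exc)  # integer string has 5000 characters; limit is 20

# Power-of-2 bases are exempt from the interpreter's limit.
big = int("f" * 5000, 16)
print(big.bit_length())  # 20000

# The getter exists only on interpreters that ship the limit.
if hasattr(sys, "get_int_max_str_digits"):
    print(sys.get_int_max_str_digits())
```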

ast.literal_eval() as a Safer Alternative

Python's ast module provides literal_eval(), which safely evaluates an expression string containing only Python literals: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.

python
import ast

ast.literal_eval("42")     # Returns 42 (int)
ast.literal_eval("3.14")   # Returns 3.14 (float)
ast.literal_eval("'hi'")   # Returns 'hi' (str)

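Unlike eval(), literal_eval() refuses to execute arbitrary code. A string containing anything other than a literal, such as a function call, raises an exception (typically ValueError, or SyntaxError if the text is not valid Python) rather than being executed.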

However, there are two reasons int() remains the right choice for converting numeric strings in the general case. First, int() is typically much faster, because literal_eval() has to parse the string into an abstract syntax tree before evaluating it. Second, int() gives you explicit control over the base through its second argument; literal_eval() has no base parameter and only understands the prefixed literal forms (0b, 0o, 0x) that Python source code allows.

Where ast.literal_eval() earns its place is in situations where the string might represent several possible literal types and you want to recover the correct one without resorting to eval(). The safety advantage is relative to eval(): int() never executes code either, so for input that should be an integer, int() is already safe against code injection.
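
A minimal sketch of that type-recovery use case follows. The wrapper parse_literal and the INVALID sentinel are hypothetical names; a sentinel is used instead of None because the string "None" is itself a valid literal.

```python
import ast

INVALID = object()  # Sentinel: distinguishes failure from the literal None.

def parse_literal(text):
    """Return the Python literal that text represents, or INVALID."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return INVALID

print(type(parse_literal("42")).__name__)             # int
print(type(parse_literal("3.14")).__name__)           # float
print(parse_literal("[1, 2, 3]"))                     # [1, 2, 3]
print(parse_literal("None") is None)                  # True
print(parse_literal("__import__('os')") is INVALID)   # True (not a literal)
print(parse_literal("1 +") is INVALID)                # True (not valid syntax)
```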

How to Convert a String to an Integer in Python

The following steps cover the complete, production-appropriate process for converting string values to integers safely.

  1. Call int() with the string

    Pass the string as the first argument to int(). For a decimal string, no second argument is needed: int('42') returns 42. For binary, octal, or hex strings, pass the base as the second argument, for example int('FF', 16) for hexadecimal.

  2. Wrap the call in a try/except block

    Enclose int() in a try block and catch ValueError for invalid string content and TypeError for wrong-type arguments. In your except clause, either return a default value, re-prompt the user, or raise a descriptive error depending on your use case.

  3. Validate the string before converting if needed

    For pre-validation without exceptions, use str.isdigit() on the stripped string. Remember that isdigit() returns False for strings with a leading sign, so strip the sign character first if signed integers are valid input. A try/except approach handles all cases correctly and is often simpler.

  4. Account for the Python 3.11 digit limit when working with large numbers

    If your application processes numeric strings longer than 4300 digits, call sys.set_int_max_str_digits() to raise or remove the limit before attempting conversion. This limit exists to prevent denial-of-service through the O(n²) conversion complexity; it applies to base 10 and other non-power-of-2 bases, while binary, octal, and hexadecimal conversions are exempt.
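
The first two steps can be combined into one small helper. The sketch below is one possible shape, with to_int as a hypothetical name; it extends the earlier safe_to_int idea with a base argument. Because the digit-limit error is also a ValueError, over-long input falls through to the default as well.

```python
def to_int(value, base=10, default=None):
    """Convert value to int, returning default when conversion fails."""
    try:
        if isinstance(value, (str, bytes, bytearray)):
            return int(value, base)  # Step 1: text input, explicit base
        return int(value)            # Non-text input cannot take a base
    except (ValueError, TypeError):  # Step 2: bad content or wrong type
        return default

print(to_int("42"))             # 42
print(to_int("FF", 16))         # 255
print(to_int("0x1f", 0))        # 31
print(to_int("4 2"))            # None
print(to_int(None, default=0))  # 0
```

The isinstance check is there because int() raises TypeError when a base is passed together with a non-text argument.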

Practical Patterns

The patterns below go beyond the basic int() call. Each addresses a category of real-world problem that basic tutorials tend to skip: user input loops, hex color parsing, environment configuration, locale-formatted numbers, binary protocol bytes, and mixed-string extraction.

Converting User Input in a Loop

The idiomatic Python pattern for prompted numeric input is a while True loop with a break on success. This avoids the need for pre-validation and handles all invalid input cleanly.

python
while True:
    raw = input("Enter a whole number: ")
    try:
        number = int(raw)
        break
    except ValueError:
        print(f"'{raw}' is not a valid integer. Try again.")
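
The same loop is often extended with a range check. The sketch below wraps it in a function, prompt_int (a hypothetical name), and takes the reader as a parameter so the behaviour can be demonstrated without a live terminal; in real use the default read=input applies.

```python
def prompt_int(prompt, read=input, minimum=None, maximum=None):
    """Keep asking until the reader returns a valid integer in range."""
    while True:
        raw = read(prompt)
        try:
            number = int(raw)
        except ValueError:
            print(f"{raw!r} is not a valid integer. Try again.")
            continue
        if minimum is not None and number < minimum:
            print(f"Enter a number of at least {minimum}.")
            continue
        if maximum is not None and number > maximum:
            print(f"Enter a number of at most {maximum}.")
            continue
        return number

# Simulated user typing three answers in turn.
answers = iter(["abc", "150", "42"])
age = prompt_int("Age: ", read=lambda _prompt: next(answers), minimum=0, maximum=120)
print(age)  # 42
```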

Parsing Hex Color Codes

The base-16 overload of int() makes RGB extraction from hex color strings concise and readable. Each two-character slice is an independent channel value.

python
def hex_to_rgb(hex_color):
    hex_color = hex_color.lstrip("#")
    r = int(hex_color[0:2], 16)
    g = int(hex_color[2:4], 16)
    b = int(hex_color[4:6], 16)
    return (r, g, b)

hex_to_rgb("#306998")  # Returns (48, 105, 152)
hex_to_rgb("#FFD43B")  # Returns (255, 212, 59)
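
The version above assumes well-formed input. Since int() with base 16 also accepts a sign (int("+f", 16) returns 15), a malformed value can slip through the slices unnoticed. A stricter variant, sketched below under the assumption that only the #rgb and #rrggbb forms should be accepted, validates the whole string first. The name parse_hex_color is hypothetical.

```python
import re

_HEX_COLOR = re.compile(r"#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})")

def parse_hex_color(text):
    """Return (r, g, b) for '#rgb' or '#rrggbb'; raise ValueError otherwise."""
    match = _HEX_COLOR.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a hex color: {text!r}")
    digits = match.group(1)
    if len(digits) == 3:
        # Shorthand: each digit is doubled, so 'f80' means 'ff8800'.
        digits = "".join(ch * 2 for ch in digits)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

print(parse_hex_color("#306998"))  # (48, 105, 152)
print(parse_hex_color("f80"))      # (255, 136, 0)
```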

Reading Environment Variables

Environment variables are always strings. Converting them at the point of reading, with a typed default, is cleaner than scattering int() calls throughout application logic.

python
import os

port = int(os.environ.get("PORT", "8080"))
max_retries = int(os.environ.get("MAX_RETRIES", "3"))
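
When a variable is set to something unparseable, the bare int() call fails with a message that does not say which variable was at fault. One way to address that, sketched below with a hypothetical helper env_int and invented variable names, is to centralize the lookup and name the variable in the error.

```python
import os

def env_int(name, default):
    """Read an integer environment variable, naming it in any error."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(
            f"environment variable {name} must be an integer, got {raw!r}"
        ) from None

os.environ["DEMO_PORT"] = "9090"         # Hypothetical variable for the demo
os.environ.pop("DEMO_UNSET_VAR", None)   # Make sure this one is absent

print(env_int("DEMO_PORT", 8080))        # 9090
print(env_int("DEMO_UNSET_VAR", 3))      # 3
```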

Comma-Separated and Locale-Formatted Numbers

Data from spreadsheets, financial APIs, and international sources often delivers numbers as formatted strings: "1,234,567" in US English, or "1.234.567" in German. Neither passes directly to int().

For US-formatted strings where the comma is always a thousands separator, stripping the comma before conversion is the simplest approach:

python
# US-style: comma is the thousands separator
int("1,234,567".replace(",", ""))   # Returns 1234567

For applications that must handle multiple locales correctly — where the separator character itself depends on the region — the standard library's locale module provides locale.atoi(). This function parses an integer string using the rules of the active locale, stripping grouping separators automatically according to LC_NUMERIC:

python
import locale

# Use the system's preferred locale
locale.setlocale(locale.LC_NUMERIC, "")

# Parses according to the active LC_NUMERIC locale
value = locale.atoi("1234567")
print(value)          # 1234567 (int)
print(type(value))    # <class 'int'>
Note on locale.setlocale()

The Python documentation notes that setlocale() is a process-wide setting and is not thread-safe. In multi-threaded applications, set the locale once at startup rather than toggling it per request. The locale module's LC_NUMERIC category controls number parsing; LC_ALL sets all categories at once.
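
Because locale.atoi() depends on which locales are installed and on process-wide state, some codebases prefer to pass the grouping separator explicitly. The sketch below is one such locale-independent approach, not part of the locale module; parse_grouped_int is a hypothetical helper that also checks that the groups are well formed.

```python
import re

def parse_grouped_int(text, group_sep=","):
    """Parse an integer whose thousands groups are separated by group_sep."""
    sep = re.escape(group_sep)
    # Either properly grouped digits (1-3, then groups of exactly 3)
    # or a plain run of digits with no separators at all.
    pattern = rf"[+-]?(?:\d{{1,3}}(?:{sep}\d{{3}})+|\d+)"
    stripped = text.strip()
    if re.fullmatch(pattern, stripped) is None:
        raise ValueError(f"not a grouped integer: {text!r}")
    return int(stripped.replace(group_sep, ""))

print(parse_grouped_int("1,234,567"))       # 1234567
print(parse_grouped_int("1.234.567", "."))  # 1234567
print(parse_grouped_int("-42"))             # -42
```

Unlike a bare replace(), this rejects strings such as "12,34" where the separator is in the wrong place.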

Converting Binary Protocol Bytes to Integers

Network protocols, binary file formats, and hardware interfaces frequently deliver integers as raw bytes rather than text strings. Python's int.from_bytes() class method handles this directly, with explicit control over byte order (endianness) and signedness — two things that int() cannot express.

python
# Big-endian unsigned: most significant byte first
# Used in network protocols (TCP/IP, DNS, HTTP/2)
int.from_bytes(b'\x00\x2a', byteorder='big')           # Returns 42

# Little-endian: least significant byte first
# Used on x86/x64 CPUs and in file formats such as ZIP and BMP
int.from_bytes(b'\x2a\x00', byteorder='little')        # Returns 42

# Signed integer — two's complement interpretation
int.from_bytes(b'\xff', byteorder='big', signed=True)  # Returns -1
int.from_bytes(b'\xff', byteorder='big', signed=False) # Returns 255

# 4-byte big-endian packet field (e.g. a sequence number)
raw_packet = b'\x00\x00\x04\xd2'
seq_num = int.from_bytes(raw_packet, byteorder='big')
print(seq_num)  # 1234

The inverse operation — converting an integer back to bytes for transmission — is int.to_bytes(length, byteorder). Together these two methods give you a clean, pure-Python way to serialize and deserialize fixed-width integer fields without importing any external library.
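
A short round trip, reusing the sequence number from the example above, shows the two methods working together. It also shows that to_bytes() raises OverflowError when the value does not fit in the requested length.

```python
seq_num = 1234

encoded = seq_num.to_bytes(4, byteorder="big")
print(encoded)                                   # b'\x00\x00\x04\xd2'
print(int.from_bytes(encoded, byteorder="big"))  # 1234

# Negative values need signed=True on both sides.
neg = (-1).to_bytes(2, byteorder="little", signed=True)
print(neg)                                       # b'\xff\xff'

# A value that does not fit in the requested length is an error.
try:
    (70000).to_bytes(2, byteorder="big")
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)                                # True
```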

When to use struct instead

When a binary record contains multiple adjacent fields of different types (for example, a 2-byte unsigned short followed by a 4-byte signed int), use struct.unpack() from the standard library. It unpacks an entire record into a tuple in one call and is more efficient than multiple int.from_bytes() calls with manual slice offsets. For a single integer field, int.from_bytes() is cleaner.
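
As a sketch of that comparison, the record below holds a 2-byte unsigned short followed by a 4-byte signed int, both big-endian. The field names port and offset are invented for the example.

```python
import struct

record = b"\x00\x2a\xff\xff\xfb\x2e"

# ">" = big-endian with no padding; "H" = unsigned 16-bit; "i" = signed 32-bit.
port, offset = struct.unpack(">Hi", record)
print(port, offset)  # 42 -1234

# The same result with int.from_bytes() and manual slice offsets.
port_manual = int.from_bytes(record[0:2], byteorder="big")
offset_manual = int.from_bytes(record[2:6], byteorder="big", signed=True)
print((port_manual, offset_manual) == (port, offset))  # True
```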

Extracting Integers from Mixed Strings with re.fullmatch()

Sometimes the string containing the number also contains non-numeric characters — units, labels, noise from parsing — and you need to validate and extract the integer component before converting. A regular expression with re.fullmatch() is the right tool here: it checks the entire string against a pattern and either returns a match object or None, with no exception machinery needed.

python
import re

# Pattern: optional sign, then one or more digits, nothing else
_INT_PATTERN = re.compile(r'^[+-]?\d+$')

def strict_int(s: str) -> int | None:
    """Return int if s is a valid signed integer string, else None."""
    s = s.strip()
    if _INT_PATTERN.fullmatch(s):
        return int(s)
    return None

strict_int("42")      # Returns 42
strict_int("-7")      # Returns -7
strict_int("42px")    # Returns None  (suffix not allowed)
strict_int("3.14")    # Returns None  (float string rejected)
strict_int("0b101")   # Returns None  (prefix notation rejected)

The compiled pattern is defined at module level so it is only compiled once rather than on every call. The approach is particularly useful in data cleaning pipelines where you want to flag unconvertible values explicitly rather than silently substituting a default, and where a try/except loop would obscure whether the failure came from a bad value or a different code path.

For cases where the integer is embedded within a larger string — say, extracting a numeric ID from a log line like "user_id=8421 action=login" — switch from re.fullmatch() to re.search() with a named capture group:

python
import re

log_line = "user_id=8421 action=login"
m = re.search(r'user_id=(?P<uid>\d+)', log_line)
if m:
    user_id = int(m.group("uid"))
    print(user_id)  # 8421

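Named groups make the extraction self-documenting and resilient to changes in surrounding field order. The int() call on the captured group cannot fail on malformed content, because the group pattern \d+ guarantees the match contains only decimal digits; the only remaining failure mode is a digit run longer than the integer string conversion limit described earlier.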
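
When a line carries several numeric fields, the same idea extends to re.finditer(). The sketch below collects every key=integer pair into a dict and skips non-numeric fields; the log format and field names are invented for the example.

```python
import re

log_line = "user_id=8421 action=login retries=3 delta=-15"

# key=value pairs where the value is an optionally signed integer.
# The trailing \b stops a value like "12abc" from matching as 12.
pairs = re.finditer(r"(?P<key>\w+)=(?P<value>[+-]?\d+)\b", log_line)
fields = {m.group("key"): int(m.group("value")) for m in pairs}

print(fields)  # {'user_id': 8421, 'retries': 3, 'delta': -15}
```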

Key Takeaways

  1. int() is the correct tool: For converting a numeric string to an integer in Python, int() is the built-in function designed for exactly this purpose. It handles decimal, signed, whitespace-padded, and underscore-grouped strings out of the box.
  2. Always catch ValueError: Any code path that calls int() on externally supplied data should handle ValueError. If the input can also be None or a non-string type, catch TypeError as well.
  3. The base parameter extends int() to other number systems: Binary, octal, and hexadecimal strings are all supported. Passing base=0 lets Python infer the base from the string's prefix.
  4. Know the 4300-digit security limit: Python 3.11 and corresponding patch releases enforce a default limit of 4300 digits on string-to-integer conversions in non-power-of-2 bases, including base 10, to mitigate CVE-2020-10735. This limit is configurable via sys.set_int_max_str_digits().
  5. isdigit() has a sign limitation: The str.isdigit() method returns False for strings with a leading minus sign. Use re.fullmatch(r'^[+-]?\d+$', s) for a single-pass check that handles signs correctly and rejects noise like units or prefixes.
  6. Use locale.atoi() for formatted international numbers: Strings like "1,234,567" or "1.234.567" are not valid inputs for int(). The standard library's locale.atoi() strips grouping separators according to LC_NUMERIC before converting, making it the correct tool for data from spreadsheets, financial feeds, and international sources.
  7. Use int.from_bytes() for binary data: Network protocols and binary file formats deliver integers as raw bytes, not text strings. int.from_bytes(data, byteorder='big') converts those bytes directly and correctly, with explicit control over endianness and signedness that int() cannot express. For multi-field binary records, use struct.unpack() instead.
  8. ast.literal_eval() is the safe replacement for eval(): When the input could represent multiple Python literal types, ast.literal_eval() recovers the right type without executing code. For performance-sensitive or straightforward numeric parsing, prefer int(), which never executes code either.

Frequently Asked Questions

How do I convert a string to an integer in Python?

Use the built-in int() function. Pass the string as the first argument: int('42') returns 42. The function strips leading and trailing whitespace automatically and accepts an optional base parameter for non-decimal conversions. If the string cannot be parsed, int() raises a ValueError.

What happens when int() cannot parse a string?

int() raises a ValueError when the string content is not a valid integer representation. For example, int('hello'), int('3.14'), and int('') all raise ValueError. If the argument is the wrong type entirely (such as None or a list), int() raises a TypeError instead.

Can int() convert binary, octal, or hexadecimal strings?

Yes. Pass the base as the second argument: int('1010', 2) converts a binary string and returns 10, int('17', 8) converts an octal string and returns 15, and int('FF', 16) converts a hex string and returns 255. Passing base=0 tells Python to infer the base from the string prefix: 0b for binary, 0o for octal, and 0x for hexadecimal.

What is the 4300-digit limit?

Python 3.11 and the corresponding patch releases of earlier versions introduced a default limit of 4300 digits for string-to-integer conversions in base 10 and other non-power-of-2 bases. Strings longer than this raise a ValueError. The limit exists to prevent denial-of-service attacks exploiting the O(n²) complexity of large integer conversions, as documented in CVE-2020-10735. The limit can be adjusted using sys.set_int_max_str_digits().

Does int() ignore whitespace?

Yes, at the edges. int() strips leading and trailing whitespace before parsing, so int(' 42 ') returns 42. However, whitespace embedded within the digit sequence causes a ValueError. For example, int('4 2') raises ValueError because the space is between digits, not at the edges.

When should I use ast.literal_eval() instead of int()?

Use ast.literal_eval() when the input might represent several Python literal types (an int, a float, a list, and so on) and you want to recover the right one without eval(). Unlike eval(), ast.literal_eval() refuses to execute arbitrary code. For straightforward numeric string parsing, int() is faster and preferred, and it never executes code either.

Why does str.isdigit() return False for negative numbers?

str.isdigit() returns True only when every character in the string is a Unicode digit character. A leading minus sign is not a digit character, so '-42'.isdigit() returns False. To validate a string that may include a sign, strip the sign first and then check the remainder with isdigit(), or use a try/except block around int().

How do I safely convert user input to an integer?

Wrap the int() call in a try/except block targeting ValueError. A loop that prompts again on failure is the standard pattern: call input() to get the raw string, attempt int() conversion inside try, break on success, and print an error message and continue on ValueError. Catching TypeError as well handles cases where None or a non-string type is passed.

References

  1. Python Software Foundation. Built-in Functions — Python 3 documentation. docs.python.org/3/library/functions.html
  2. Python Software Foundation. Integer string conversion length limitation. docs.python.org/3/library/stdtypes.html
  3. Python Software Foundation. ast.literal_eval — ast module. docs.python.org/3/library/ast.html
  4. Python Software Foundation. locale — Internationalization services. docs.python.org/3/library/locale.html
  5. Python Software Foundation. re — Regular expression operations. docs.python.org/3/library/re.html
  6. MITRE Corporation. CVE-2020-10735: Prevent DoS by large int<->str conversions. cve.mitre.org
  7. CPython Issue Tracker. Issue #95778: int string conversion O(n^2). github.com/python/cpython/issues/95778
  8. Python Software Foundation. What's New in Python 3.11. docs.python.org/3/whatsnew/3.11.html