Migrating from Python 2 to Python 3 involves several syntax and API changes. This guide will walk you through using Codegen to automate this migration, handling print statements, string handling, iterators, and more.

You can find the complete example code in our examples repository.

Overview

The migration process involves five main steps:

  1. Converting print statements to function calls
  2. Updating Unicode to str
  3. Converting raw_input to input
  4. Updating exception handling syntax
  5. Modernizing iterator methods

Let’s walk through each step using Codegen.

Step 1: Convert Print Statements

First, we need to convert Python 2’s print statements to Python 3’s print function calls:

def convert_print_statements(file):
    """Convert Python 2 print statements to Python 3 function calls"""
    lines = file.content.split('\n')
    new_content = []

    for line in lines:
        stripped = line.strip()
        if stripped.startswith('print '):
            indent = line[:len(line) - len(line.lstrip())]
            args = stripped[6:].strip()
            new_content.append(f"{indent}print({args})")
        else:
            new_content.append(line)

    if new_content != lines:
        file.edit('\n'.join(new_content))

This transforms code from:

print "Hello, world!"
print x, y, z

to:

print("Hello, world!")
print(x, y, z)

In Python 3, print is a function rather than a statement, requiring parentheses around its arguments.

Step 2: Update Unicode to str

Next, we update Unicode-related code to use Python 3’s unified string type:

def update_unicode_to_str(file):
    """Convert Unicode-related code to str for Python 3"""
    # Update imports from 'unicode' to 'str'
    for imp in file.imports:
        if imp.name == 'unicode':
            imp.set_name("str")

    # Update function calls from Unicode to str
    for func_call in file.function_calls:
        if func_call.name == "unicode":
            func_call.set_name("str")

        # Check function arguments for Unicode references
        for arg in func_call.args:
            if arg.value == "unicode":
                arg.set_value("str")

    # Find and update Unicode string literals (u"...")
    for string_literal in file.find('u"'):
        if string_literal.source.startswith('u"') or string_literal.source.startswith("u'"):
            new_string = string_literal.source[1:]  # Remove the 'u' prefix
            string_literal.edit(new_string)

This converts code from:

from __future__ import unicode_literals
text = unicode("Hello")
prefix = u"prefix"

to:

text = str("Hello")
prefix = "prefix"

Python 3 unifies string types, making the unicode type and u prefix unnecessary.

Step 3: Convert raw_input to input

Python 3 renames raw_input() to input():

def convert_raw_input(file):
    """Convert raw_input() calls to input()"""
    for call in file.function_calls:
        if call.name == "raw_input":
            call.edit(f"input{call.source[len('raw_input'):]}")

This updates code from:

name = raw_input("Enter your name: ")

to:

name = input("Enter your name: ")

Python 3’s input() function always returns a string, like Python 2’s raw_input().

Step 4: Update Exception Handling

Python 3 changes the syntax for exception handling:

def update_exception_syntax(file):
    """Update Python 2 exception handling to Python 3 syntax"""
    for editable in file.find("except "):
        if editable.source.lstrip().startswith("except") and ", " in editable.source and " as " not in editable.source:
            parts = editable.source.split(",", 1)
            new_source = f"{parts[0]} as{parts[1]}"
            editable.edit(new_source)

This converts code from:

try:
    process_data()
except ValueError, e:
    print(e)

to:

try:
    process_data()
except ValueError as e:
    print(e)

Python 3 uses as instead of a comma to name the exception variable.

Step 5: Update Iterator Methods

Finally, we update iterator methods to use Python 3’s naming:

def update_iterators(file):
    """Update iterator methods from Python 2 to Python 3"""
    for cls in file.classes:
        next_method = cls.get_method("next")
        if next_method:
            # Create new __next__ method with same content
            new_method_source = next_method.source.replace("def next", "def __next__")
            cls.add_source(new_method_source)
            next_method.remove()

This transforms iterator classes from:

class MyIterator:
    def next(self):
        return self.value

to:

class MyIterator:
    def __next__(self):
        return self.value

Python 3 renames the next() method to __next__() for consistency with other special methods.

Running the Migration

You can run the complete migration using our example script:

git clone https://github.com/codegen-sh/codegen-examples.git
cd codegen-examples/python2_to_python3
python run.py

The script will:

  1. Process all Python files in your codebase
  2. Apply the transformations in the correct order
  3. Maintain your code’s functionality while updating to Python 3 syntax

Next Steps

After migration, you might want to:

  • Add type hints to your code
  • Use f-strings for string formatting
  • Update dependencies to Python 3 versions
  • Run the test suite to verify functionality

Check out these related tutorials:

Learn More