refactoring

Refactoring in programming is the process of restructuring existing code without changing its external behavior. It is done to improve the readability, maintainability, and performance of the code. Refactoring can involve reorganizing code, renaming, and replacing complex code with simpler alternatives.

There are a few general things to look for when identifying 'poor' code:

  • Poor readability: Code that is hard to read and understand is also hard to maintain and debug.
  • Poor structure: Code that is not organized properly makes it hard to find where a change should be made.
  • Unnecessary complexity: Code that is more complex than the problem requires is hard to understand and maintain.
  • Duplicate code: Code that is repeated multiple times has to be changed in multiple places, so fixes are easily missed.
  • Unused code: Code that is never called still has to be read and maintained.

However, this doesn't tell us much about 'how' to refactor.

That is somewhat different from small code refactorings, where it is often obvious how the code could be improved. On a more atomic level there are some (Pythonic) patterns to identify:

  • Small loops can often be rewritten as list comprehensions.
  • 'Old' style index lookups in loops should be replaced by for...in...
  • Poor function and variable names should be replaced with obvious, easy-to-understand names.
  • Getter and setter methods should be replaced by @property.
  • Use (built-in) special methods in classes instead of custom methods.
  • Use decorators when useful.

Modern tools help to identify (and fix) most of these (and many more). These types of refactoring will probably be automated in the near future.
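A few of these small patterns can be sketched in code. The example below is only an illustration: the `prices` list and the `Circle` class are hypothetical names invented here, not taken from any real code base.

```python
# Hypothetical data, used only for illustration.
prices = [12.5, 3.0, 40.0]

# Before: 'old' style index lookup inside a small loop.
expensive = []
for i in range(len(prices)):
    if prices[i] > 10:
        expensive.append(prices[i])

# After: for...in combined with a list comprehension.
expensive_refactored = [price for price in prices if price > 10]


class Circle:
    """Hypothetical class showing @property and a special method."""

    def __init__(self, radius):
        self._radius = radius

    @property
    def radius(self):
        # Replaces a get_radius() method.
        return self._radius

    @radius.setter
    def radius(self, value):
        # Replaces a set_radius(value) method.
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value

    def __repr__(self):
        # Replaces a custom to_string() method.
        return f"Circle(radius={self._radius})"
```

Callers now write `circle.radius = 3` instead of `circle.set_radius(3)`, and `print(circle)` uses `__repr__` without any extra call.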

Larger refactoring however requires a somewhat different approach. What are the steps involved in refactoring complex code?

1. Testing!

The first step in refactoring is checking the tests. Tests are essential when refactoring code. They provide a safety net, allowing developers to change the code without fear of breaking existing functionality. That confidence makes developers refactor more often and more experimentally. Tests also reveal errors or bugs that were introduced during the refactoring process. High test coverage speeds up refactoring, but only if the tests run quickly; if it takes hours for the tests to finish, a lot of development time is wasted and the refactoring process becomes frustrating.
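As a minimal sketch of such a safety net, the tests below pin down the current behavior of a function before it is touched. The function `legacy_total` and its input format are assumptions made up for this example.

```python
import unittest


def legacy_total(items):
    """Hypothetical function that is about to be refactored."""
    total = 0
    for item in items:
        if item["active"]:
            total = total + item["price"] * item["qty"]
    return total


class LegacyTotalTest(unittest.TestCase):
    """Records what the code does today, so a refactoring cannot change it unnoticed."""

    def test_inactive_items_are_ignored(self):
        items = [
            {"active": True, "price": 2.0, "qty": 3},
            {"active": False, "price": 100.0, "qty": 1},
        ]
        self.assertEqual(legacy_total(items), 6.0)

    def test_empty_input(self):
        self.assertEqual(legacy_total([]), 0)
```

Saved in a test module, these can be run with `python -m unittest` after every small change.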

2. Duplicate

Make a new function or module instead of trying to update the existing code. This makes it easy to compare the old and new code, to run A/B tests, and to find where each version is used by means of logging or warnings.
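One possible way to set this up is shown below. All three function names are hypothetical; the idea is only that the old version stays untouched, the new version lives next to it, and a thin wrapper compares both and logs differences.

```python
import logging

logger = logging.getLogger(__name__)


def slugify_old(title):
    """Existing implementation (hypothetical), left untouched."""
    result = ""
    for ch in title.lower():
        if ch.isalnum():
            result += ch
        elif ch == " ":
            result += "-"
    return result


def slugify_new(title):
    """Refactored copy living next to the original."""
    return "".join(
        "-" if ch == " " else ch
        for ch in title.lower()
        if ch.isalnum() or ch == " "
    )


def slugify(title):
    """Run both versions, log any mismatch, and keep returning the old result."""
    old, new = slugify_old(title), slugify_new(title)
    if old != new:
        logger.warning("slugify mismatch for %r: %r != %r", title, old, new)
    return old
```

Once the log stays quiet for long enough, the wrapper can return the new result and the old function can be removed.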

3. import warnings

The warnings module in Python is used to alert developers to potential problems in their code. This is especially useful when refactoring, as it can help to find out whether a piece of code is still used somewhere. Additionally, it can be used to suppress specific warnings that are not relevant to the current refactoring task. Python can also run in a mode where all warnings become errors: python -W error my_code.py
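A short sketch of both uses follows. The functions `parse_v1` and `parse_v2` are hypothetical; `warnings.warn`, `warnings.catch_warnings` and `warnings.simplefilter` are part of the standard library.

```python
import warnings


def parse_v1(text):
    """Hypothetical old function that should eventually disappear."""
    warnings.warn(
        "parse_v1() is deprecated, use parse_v2() instead",
        DeprecationWarning,
        stacklevel=2,  # report the caller's line, not this one
    )
    return text.split(",")


def parse_v2(text):
    """Hypothetical replacement."""
    return [part.strip() for part in text.split(",")]


# Record warnings to find out whether the old function is still called.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    parse_v1("a, b")

# Suppress a warning that is not relevant to the current task.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    parse_v1("a, b")
```

With `python -W error`, the same call to `parse_v1` raises a `DeprecationWarning` exception instead, which makes leftover callers show up as failing tests.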

4. Performance tests

Refactoring should not make performance worse, so it makes sense to measure speed before and after the changes. Python's timeit module is a great tool for this and makes it easy to temporarily add performance tests. Keeping these performance tests available will help to identify bottlenecks over time.
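A before/after measurement could look roughly like this. The two functions are stand-ins invented for the example; the repeat and number values are arbitrary and should be tuned to the code being measured.

```python
import timeit


def squares_loop(n):
    """Stand-in for the code before refactoring."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n):
    """Stand-in for the code after refactoring."""
    return [i * i for i in range(n)]


# Equal output is a precondition for a fair comparison.
assert squares_loop(1000) == squares_comprehension(1000)

# repeat() returns one total time per run; the minimum is the least noisy value.
before = min(timeit.repeat(lambda: squares_loop(1000), number=200, repeat=3))
after = min(timeit.repeat(lambda: squares_comprehension(1000), number=200, repeat=3))
print(f"before: {before:.4f}s  after: {after:.4f}s")
```

The absolute numbers depend on the machine, so only the comparison between the two runs on the same machine is meaningful.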

5. Small steps

Many incremental changes are better than one massive change. In the end, after many smaller steps, you might end up with a massive change after all. Small steps make it possible to test continually whether any of the changes broke anything.

6. Know the libraries

Quite often, complex code is the result of trying to re-invent some fancy algorithm. Chances are that it already exists in some form as a design pattern or as a function in a built-in library. The standard-library modules itertools, functools and abc, and the third-party library NumPy, are packed with useful patterns and functions.
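Two small illustrations, using made-up helper functions: a hand-written flatten loop that `itertools.chain.from_iterable` already covers, and a hand-written cache that `functools.lru_cache` already covers.

```python
from functools import lru_cache
from itertools import chain


def flatten_by_hand(rows):
    """Re-invented: nested loops to flatten a list of lists."""
    result = []
    for row in rows:
        for value in row:
            result.append(value)
    return result


def flatten(rows):
    """Same result using the standard library."""
    return list(chain.from_iterable(rows))


_cache = {}


def fib_by_hand(n):
    """Re-invented: manual memoization in a module-level dict."""
    if n in _cache:
        return _cache[n]
    value = n if n < 2 else fib_by_hand(n - 1) + fib_by_hand(n - 2)
    _cache[n] = value
    return value


@lru_cache(maxsize=None)
def fib(n):
    """Same result, with the caching handled by a decorator."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

In both cases the library version is shorter and removes state (the result list, the cache dict) that the hand-written version had to manage itself.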

7. Iterables

Nested loops (with if-statements) can result in complex code. Therefore, it sometimes makes sense to move a loop into a function of its own. (Interestingly, this sometimes even speeds things up a bit, but not always.) Sometimes a better way is to create an iterable object. To do so, create a class that has at least the two methods __iter__ and __next__. (Strictly speaking, a class with both methods is an iterator; an iterable only needs an __iter__ that returns an iterator.) This forces the developer to think about the object and its purpose. When implemented correctly, this results in more understandable and (re)usable objects.
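As a sketch of this idea, the hypothetical class below hides a nested loop with an if-statement behind `__iter__` and `__next__`. The grid layout (a list of lists where None marks an empty cell) is an assumption made for the example.

```python
class NonEmptyCells:
    """Hypothetical iterator over the non-empty cells of a grid."""

    def __init__(self, grid):
        self._grid = grid
        self._row = 0
        self._col = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Resume where the previous call stopped and skip empty cells.
        while self._row < len(self._grid):
            row = self._grid[self._row]
            while self._col < len(row):
                col = self._col
                self._col += 1
                if row[col] is not None:
                    return self._row, col, row[col]
            self._row += 1
            self._col = 0
        raise StopIteration


grid = [[1, None], [None, 4]]
cells = list(NonEmptyCells(grid))

# The calling code no longer contains nested loops or if-statements.
for row_index, col_index, value in NonEmptyCells(grid):
    print(row_index, col_index, value)
```

For simple cases a generator function with `yield` gives the same behavior with less code; the class form is mainly worth it when the object carries state or meaning of its own.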

8. Documentation

The best moment to update documentation is right after refactoring. By reserving some time for writing documentation, as part of the refactoring planning, you end up with more tangible results.