Week 3, Session 3: Writing Robust, Maintainable Code
In our previous sessions, you learned to write structured, well-organized code using object-oriented principles. Today, we're focusing on the practices that distinguish good code from great code: robustness, readability, and maintainability. We'll integrate these concepts into a continuous process, showing you how to transform "code smells" into professional, production-ready code.
The Starting Point: Code Smells
We'll use these two examples to guide our lesson. These patterns are common in rapid prototyping but lead to maintenance headaches:
- Code Smell 1: Ambiguous Logic and Return Types
  - What are the inputs (`x`, `y`)? An `int`? A `str`?
  - What are the outputs? It can return an `int` or a `dict`, leading to confusion for anyone calling this function.
- Code Smell 2: Unstructured Data and Fragile Access
  - This code assumes the input `data` (a dictionary) has the keys `age` (which is an integer) and `status` (which is a string). If any key is missing or the type is wrong, the code will crash at runtime.
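The original snippets are not reproduced in these notes, so the following is only a hypothetical reconstruction of what the two smells might look like. The names `do_it` and `process` and the described behavior come from the text; the function bodies are assumptions made for illustration.

```python
# Hypothetical reconstruction: only the names do_it and process and the
# described behavior come from the notes; the bodies are illustrative.


def do_it(x, y):
    # Smell 1: no hints on x or y, and two different return types.
    if x > y:
        return x - y  # an int
    return {"error": "x must be greater than y"}  # a dict


def process(data):
    # Smell 2: assumes both keys exist and hold the expected types.
    if data["age"] >= 18 and data["status"] == "active":
        return "ok"
    return "rejected"


print(do_it(5, 3))  # the caller gets an int here...
print(do_it(1, 3))  # ...and a dict here
print(process({"age": 30, "status": "active"}))
# process({"age": 30}) would raise KeyError because "status" is missing.
```

Every caller of `do_it` has to inspect the result before using it, and every caller of `process` has to know the dictionary layout by heart. The tools below address each problem in turn.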
Tool 1: Type Hints and Static Analysis (Fixing Ambiguity)
The first step in professional development is clarifying intent. Type hints tell developers and machines what types of data are expected, solving the ambiguity in Code Smell 1.
Applying Type Hints
- We start by clarifying that the inputs to `do_it` are numbers, but we still have an ambiguous return type (`Union`).
- We introduce Enums and Literals to restrict inputs to a fixed set of allowed values, catching common logic bugs before they happen.
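A minimal sketch of both steps follows. The name `do_it_typed` comes from these notes; its body, the `Operation` enum, the `Unit` alias, and the helper functions `combine` and `to_seconds` are hypothetical and exist only to illustrate the idea.

```python
from enum import Enum
from typing import Literal, Union


def do_it_typed(x: int, y: int) -> Union[int, dict[str, str]]:
    """Inputs are now explicit, but the return type is still ambiguous."""
    if x > y:
        return x - y
    return {"error": "x must be greater than y"}


class Operation(Enum):  # hypothetical enum, not from the notes
    ADD = "add"
    SUBTRACT = "subtract"


# Hypothetical Literal alias: only these two strings are accepted.
Unit = Literal["seconds", "minutes"]


def combine(x: int, y: int, op: Operation) -> int:
    """A type checker rejects combine(1, 2, "add"): a str is not an Operation."""
    if op is Operation.ADD:
        return x + y
    return x - y


def to_seconds(value: int, unit: Unit) -> int:
    """A type checker rejects to_seconds(5, "hours"): not an allowed literal."""
    return value * 60 if unit == "minutes" else value


print(combine(2, 3, Operation.ADD))
print(to_seconds(2, "minutes"))
```

Type hints are not enforced at runtime, so the protection described in the docstrings only applies when a static checker is run over the code.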
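Static Analysis: After adding type hints, we run mypy to perform static analysis. mypy is a static type checker: it analyzes the code without executing it. With `do_it_typed` annotated as returning a `Union` of `int` and `dict`, mypy reports every call site that uses the result as if it were always an `int` (or always a `dict`); if the function is annotated as returning only `int`, mypy reports the branch that returns a `dict` instead. Either way the ambiguity surfaces before runtime, which pushes the developer to refactor the logic for clarity.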
Tool 2: Linting, Formatting, and Docstrings (Fixing Readability)
Once the code is logically correct, we make it beautiful and easy to read.
Linting and Formatting with Ruff
Ruff is an extremely fast tool that combines both linting (checking for programmatic errors) and formatting (enforcing style). It ensures all code is consistently spaced and follows best practices.
Running Ruff
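- We run `ruff format .` on the entire project. This instantly fixes indentation, line length, and spacing issues.
- We run `ruff check .` to enforce things like Python's naming conventions and complexity rules. These particular rule sets are not part of Ruff's default selection, so they need to be enabled in the Ruff configuration first.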
This process is typically automated using a pre-commit hook, ensuring that bad style or format never even makes it into the version control history.
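To make the effect concrete, here is a small before/after sketch. The function `display_name` is hypothetical. The "before" version is kept in comments so the block stays runnable; it contains the kind of findings `ruff check` reports with its default rules, such as unused imports (F401) and comparison to `None` with `==` (E711), plus a camelCase function name that the optional naming rules would flag.

```python
# Before, as a quick prototype might look:
#
#   import os, sys                      # both unused
#   def displayName(u):
#       if u == None: return "anonymous"
#       return u["name"]
#
# After formatting and addressing the lint findings:

from typing import Optional


def display_name(user: Optional[dict[str, str]]) -> str:
    if user is None:
        return "anonymous"
    return user["name"]


print(display_name(None))
print(display_name({"name": "Ada"}))
```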
Docstrings
A docstring explains what the code does, why it does it, and what inputs/outputs are involved.
Standardized Docstrings
- We will use the VS Code extension `njpwerner.autodocstring` to quickly generate a template.
- We'll adopt the Google Style because it's highly readable and uses clear `Args:`, `Returns:`, and `Raises:` sections.
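As an illustration of the Google Style, here is a sketch of a fully documented function. The function `calculate_discount` is a hypothetical example and is not part of the lesson code.

```python
def calculate_discount(price: float, percent: float) -> float:
    """Apply a percentage discount to a price.

    Args:
        price: Original price. Must be non-negative.
        percent: Discount in percent, between 0 and 100.

    Returns:
        The price after the discount has been applied.

    Raises:
        ValueError: If price is negative or percent is outside 0 to 100.
    """
    if price < 0:
        raise ValueError("price must be non-negative")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


print(calculate_discount(200.0, 25.0))
```

Because the types already live in the signature, the `Args:` entries can focus on meaning and constraints rather than repeating the type names.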
Tool 3: Pydantic (Fixing Fragile Data Structures)
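Pydantic is the solution to Code Smell 2 (Unstructured Data). It forces dictionary-like data to conform to an object-oriented structure: missing keys and wrong types are rejected with a descriptive validation error when the object is created, instead of crashing somewhere deeper in the logic at runtime.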
Building Robust Data Models
We will refactor the process(data) function by defining its input structure as a Pydantic BaseModel.
Basic Model Definition: We define the expected input using type hints. Pydantic validates and coerces the data upon instantiation.
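A minimal sketch of such a model is shown below, assuming Pydantic is installed. The field names `age` and `status` come from the description of Code Smell 2, and the class name `InputData` is borrowed from the class diagram later in these notes; everything else is illustrative.

```python
from pydantic import BaseModel, ValidationError


class InputData(BaseModel):
    age: int
    status: str


# Coercion: the string "42" is converted to the integer 42 on instantiation.
user = InputData(age="42", status="active")
print(user.age + 1)

# Validation: a missing key is reported as soon as the object is built.
try:
    InputData(age=30)
except ValidationError as exc:
    print(len(exc.errors()), "validation error(s)")
```

The failure now happens at the boundary where data enters the program, and the error message names the offending field.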
Complex Models (Inheritance & Composition):
We use OOP principles to define complex data structures, which are then used as type hints.
- Inheritance: We extend a base model to add specific fields.
- Composition: A model "has a" list of other models.
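- Alternatively, we have other options for building a model from existing data.
  - We could use the `.parse_obj()` classmethod, like `Project.parse_obj(project_data)`. In Pydantic v2 this method is deprecated in favor of `.model_validate()`.
  - Fill out the model directly, passing each field as a keyword argument.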
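The sketch below puts inheritance, composition, and both construction options together, assuming Pydantic v2. Only the names `Project` and `project_data` appear in these notes; the `Task` and `TimedTask` models and all field names are hypothetical.

```python
from pydantic import BaseModel


class Task(BaseModel):  # hypothetical base model
    title: str
    done: bool = False


class TimedTask(Task):  # inheritance: extends Task with one more field
    estimate_hours: float


class Project(BaseModel):  # composition: a Project "has a" list of tasks
    name: str
    tasks: list[TimedTask]


project_data = {
    "name": "Website",
    "tasks": [{"title": "Design", "estimate_hours": 3}],
}

# Option 1: validate a whole dictionary (the Pydantic v2 spelling of parse_obj).
project = Project.model_validate(project_data)

# Option 2: fill out the model directly with keyword arguments.
same_project = Project(
    name="Website",
    tasks=[TimedTask(title="Design", estimate_hours=3)],
)

print(project.tasks[0].title)
print(project.model_dump() == same_project.model_dump())
```

Nested dictionaries are validated recursively, so a typo inside one task is reported with its full path within the project.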
The "Good Code" Result
By applying these practices, the initial "code smells" are transformed into clear, reliable, and production-ready code:
| Old Code Smell | Applied Tools | New, Robust Code |
|---|---|---|
| Code Smell 1 (Ambiguity) | Type Hints, Ruff, Docstrings | The types are clear, and the logic is broken into readable, single-purpose functions (which ruff encourages). The return type is always int for consistency. |
| Code Smell 2 (Fragile Data) | Pydantic, Type Hints | The function input is a Pydantic object, eliminating runtime key errors. Logic is clean. |
Dictionaries or Pydantic?
As a personal preference, I usually default to Pydantic objects rather than working with dictionary types; however, you don't have to choose only one. Sometimes you want the speed and comfort of the standard dict() or set(), and sometimes there is a benefit to having something with a more defined structure.
Class Structure
classDiagram
class AccessLevelsEnum {
ADMIN
MEMBER
}
class UserStatusEnum {
ACTIVE
INACTIVE
SUSPENDED
}
class Permissions {
level: AccessLevelsEnum
extra: str | None
}
class InputData {
age: int
status: UserStatusEnum
permissions: Permissions
}
class OutputData {
ok: bool
msg: str | None
}
Permissions --> AccessLevelsEnum
InputData --> UserStatusEnum
InputData --> Permissions
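One possible translation of this diagram into code is sketched below, assuming Pydantic is installed. The class and field names follow the diagram. The lowercase enum values, the `None` defaults, and the acceptance rule inside `process` are assumptions, since the diagram only lists names and types. `Optional[str]` is used as the equivalent of `str | None`.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class AccessLevelsEnum(str, Enum):
    ADMIN = "admin"  # enum values are assumed, the diagram lists names only
    MEMBER = "member"


class UserStatusEnum(str, Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    SUSPENDED = "suspended"


class Permissions(BaseModel):
    level: AccessLevelsEnum
    extra: Optional[str] = None


class InputData(BaseModel):
    age: int
    status: UserStatusEnum
    permissions: Permissions


class OutputData(BaseModel):
    ok: bool
    msg: Optional[str] = None


def process(data: InputData) -> OutputData:
    """Hypothetical rule: only active adults are accepted."""
    if data.status is not UserStatusEnum.ACTIVE:
        return OutputData(ok=False, msg=f"user is {data.status.value}")
    if data.age < 18:
        return OutputData(ok=False, msg="user is under 18")
    return OutputData(ok=True)


raw = {"age": 30, "status": "active", "permissions": {"level": "admin"}}
result = process(InputData(**raw))
print(result.ok)
```

By the time `process` runs, the data has already been validated, so the function body contains only business logic and always returns the same type.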
Recommended Exercises & Homework
This week's homework is a comprehensive exercise that asks you to take a "bad code" example and apply all the professional practices we covered today.
The Starting Code (Configuration Loader):
Imagine you are given this script that loads a complex configuration dictionary for a hypothetical application:
Your Tasks (Refactor and Validate):
- Code Organization & Readability:
  - Install ruff into your choice of development environment (preferably a devcontainer).
  - Run `ruff format` and `ruff check` on the script. Address and fix any errors they report.
  - Write clear, standardized docstrings (e.g., Google Style) for all functions and methods.
- Type Constraint Implementation:
  - Create a custom `StrEnum` for the environment setting (e.g., `Environment.DEV`, `Environment.PROD`).
  - Create a `Literal` type to constrain the allowed database type (e.g., `"postgresql"`, `"mysql"`).
- Pydantic Validation (Complex Structures):
  - Define a `BaseModel` called `DBConfig` that uses the appropriate type hints for host, port, user, type (using your `Literal`), and password.
  - Define a final model, `AppConfig`, that uses composition to include the `DBConfig` model and includes its own field for `environment` (using your custom `StrEnum`).
  - Modify `load_config` to accept and return your `AppConfig` Pydantic model, ensuring that the function will now robustly validate any input data automatically.
Challenge: Auto-computed Fields
Try adding a new auto-computed field to `AppConfig` called `conn_string`, which combines the fields needed to reconstruct the string `f"postgresql://{settings['database']['user']}@{settings['database']['host']}"`.
This exercise will give you hands-on experience in making a codebase robust and production-ready.
Suggested Readings & Resources
- Ruff Documentation: Getting Started with Ruff - Learn how to set up and configure this powerful linter/formatter.
- Pydantic Documentation: Pydantic v2 Documentation - Explore the "Defining models" and "Validation" sections.
- Real Python: Python Docstrings Guide - Excellent resource showing different docstring styles.
- Python `typing`: PEP 586 (Literal Types) and PEP 647 (Type Guards) - Deepen your understanding of specific type hints.