Python Read YAML File to Dict

Complete guide to reading YAML files into Python dictionaries

Last updated: November 2024

Introduction

Reading YAML files into Python dictionaries is a common task in configuration management and data processing. This guide covers how to use PyYAML, the most popular Python library for YAML processing, to read YAML files and convert them to Python dictionaries.

Installation

Install PyYAML using pip:

# Install PyYAML
pip install pyyaml

# Or using conda
conda install pyyaml

# For Python 3, you may need to use pip3
pip3 install pyyaml
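
Note that the package is installed as pyyaml but imported as yaml. To confirm the install worked, a quick check along these lines should print the installed version (a minimal sketch; the exact version string depends on your environment):

```python
import yaml

# The PyPI package is "pyyaml", but the module you import is "yaml".
print(yaml.__version__)
```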

Basic Usage

Reading from a File

The simplest way to read a YAML file into a dictionary:

import yaml

# Read YAML file into dictionary
with open('config.yaml', 'r') as file:
    data = yaml.safe_load(file)

print(data)
# Example output (depends on the file's contents):
# {'key': 'value', 'nested': {'item': 123}}

Reading from a String

You can also parse YAML from a string:

import yaml

yaml_string = """
name: John Doe
age: 30
city: New York
"""

data = yaml.safe_load(yaml_string)
print(data)
# Output: {'name': 'John Doe', 'age': 30, 'city': 'New York'}

Complete Example

A complete example with error handling:

import yaml
import sys

try:
    with open('config.yaml', 'r') as file:
        data = yaml.safe_load(file)
        if data is None:
            data = {}  # Handle empty files
        print("YAML loaded successfully:")
        print(data)
except FileNotFoundError:
    print("Error: config.yaml not found")
    sys.exit(1)
except yaml.YAMLError as e:
    print(f"Error parsing YAML: {e}")
    sys.exit(1)
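
The example above assumes the file holds a mapping at the top level. A YAML document can just as well be a list or a single scalar, in which case safe_load() returns a list or a string instead of a dict. The sketch below shows one way to guard against that; load_mapping is a hypothetical helper name, not part of PyYAML, and it parses a string so that it runs without a file on disk:

```python
import yaml


def load_mapping(text):
    """Parse YAML text and insist that the top-level node is a mapping."""
    data = yaml.safe_load(text)
    if data is None:
        return {}  # empty document
    if not isinstance(data, dict):
        raise TypeError(
            f"Expected a mapping at the top level, got {type(data).__name__}"
        )
    return data


print(load_mapping("name: demo"))  # {'name': 'demo'}

try:
    load_mapping("- a\n- b")  # a top-level list, not a dict
except TypeError as e:
    print(e)
```

The same check can be applied to the result of yaml.safe_load(file) when reading from disk.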

Example YAML File

Sample config.yaml file:

# config.yaml
app:
  name: My Application
  version: 1.0.0
  debug: false

database:
  host: localhost
  port: 5432
  name: myapp_db
  credentials:
    username: admin
    password: secret123

features:
  enabled:
    - authentication
    - logging
    - caching
  max_users: 1000

Reading and Accessing Data

import yaml

with open('config.yaml', 'r') as file:
    config = yaml.safe_load(file)

# Access nested values
app_name = config['app']['name']
db_host = config['database']['host']
db_password = config['database']['credentials']['password']

# Access list items
first_feature = config['features']['enabled'][0]

print(f"App: {app_name}")
print(f"Database: {db_host}")
print(f"First feature: {first_feature}")
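
PyYAML converts each scalar to a Python type based on how it is written, so it helps to know what you get back. The sketch below loads a fragment of the sample config from a string and prints the type of a few values:

```python
import yaml

sample = """
app:
  version: 1.0.0
  debug: false
database:
  port: 5432
"""

config = yaml.safe_load(sample)

for section, key in [("app", "version"), ("app", "debug"), ("database", "port")]:
    value = config[section][key]
    print(f"{section}.{key} = {value!r} ({type(value).__name__})")

# app.version = '1.0.0' (str)
# app.debug = False (bool)
# database.port = 5432 (int)
```

A version written as 1.0 (one dot) would be loaded as a float. Quote the value in the YAML file, for example version: "1.0", if it must stay a string.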

safe_load vs load

yaml.safe_load() (Recommended)

Safe method that only loads standard YAML tags:

import yaml

# Safe loading (recommended)
with open('config.yaml', 'r') as file:
    data = yaml.safe_load(file)

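# Builds only plain Python objects from standard YAML tags:
# dict, list, str, int, float, bool, None, plus
# datetime.date / datetime.datetime for timestamps such as 2024-11-01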

Benefits:

  • Prevents arbitrary code execution
  • More secure for untrusted YAML files
  • Recommended for configuration files
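
To illustrate the first point, the sketch below feeds safe_load() a document that uses a Python-specific tag asking the loader to call a function. The tag and the harmless os.getcwd target are chosen purely for illustration; with safe_load() nothing is called, and the document is rejected with a YAMLError subclass:

```python
import yaml

# A document that asks the loader to call a Python function.
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
except yaml.YAMLError as e:
    rejected = True
    print(f"Rejected: {type(e).__name__}")
else:
    rejected = False
    print("Loaded without error")
```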

yaml.load() (Not Recommended)

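yaml.load() requires an explicit Loader argument (mandatory since PyYAML 6.0), and how safe it is depends on the loader you pass:

import yaml

# Not recommended for untrusted input
with open('config.yaml', 'r') as file:
    data = yaml.load(file, Loader=yaml.FullLoader)

# FullLoader accepts more tags than SafeLoader and has had
# code-execution vulnerabilities in older PyYAML releases.
# yaml.UnsafeLoader (and the legacy yaml.Loader) can construct
# arbitrary Python objects. Only use these with trusted YAML files.

Security Risk: yaml.load() with yaml.UnsafeLoader or yaml.Loader can execute arbitrary code from YAML files. Always use yaml.safe_load() unless you fully trust the source.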

Advanced Usage

Loading Multiple Documents

YAML files can contain multiple documents separated by ---:

import yaml

# YAML with multiple documents
yaml_content = """
---
name: Document 1
value: 100
---
name: Document 2
value: 200
"""

# Load all documents
documents = list(yaml.safe_load_all(yaml_content))
for doc in documents:
    print(doc)
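
When the documents live in a file, keep in mind that safe_load_all() returns a lazy generator, so it has to be consumed while the file is still open. A minimal sketch, assuming a hypothetical helper named load_all_documents and a temporary file created only for the demo:

```python
import tempfile
from pathlib import Path

import yaml


def load_all_documents(file_path):
    """Return every non-empty document in a multi-document YAML file."""
    with open(file_path, "r", encoding="utf-8") as file:
        # Build the list inside the with-block: the generator reads lazily.
        return [doc for doc in yaml.safe_load_all(file) if doc is not None]


# Demo with a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "multi.yaml"
    path.write_text("name: Document 1\n---\nname: Document 2\n", encoding="utf-8")
    docs = load_all_documents(path)

print(docs)  # [{'name': 'Document 1'}, {'name': 'Document 2'}]
```

Skipping None entries is a design choice here; drop the filter if empty documents are meaningful in your files.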

Custom Loader with Error Handling

import yaml
from pathlib import Path

def load_yaml_to_dict(file_path):
    """Load YAML file to dictionary with comprehensive error handling."""
    path = Path(file_path)
    
    if not path.exists():
        raise FileNotFoundError(f"YAML file not found: {file_path}")
    
    try:
        with open(path, 'r', encoding='utf-8') as file:
            data = yaml.safe_load(file)
            return data if data is not None else {}
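    except yaml.YAMLError as e:
        # Chain the original exception so the traceback keeps the parser details
        raise ValueError(f"Invalid YAML syntax: {e}") from e
    except OSError as e:
        # Permission errors, directories, and other I/O failures
        raise RuntimeError(f"Error reading YAML file: {e}") from e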

# Usage
try:
    config = load_yaml_to_dict('config.yaml')
    print(config)
except (FileNotFoundError, ValueError, RuntimeError) as e:
    print(f"Error: {e}")

Accessing Nested Values Safely

import yaml

with open('config.yaml', 'r') as file:
    config = yaml.safe_load(file)

# Safe access with default values
app_name = config.get('app', {}).get('name', 'Unknown')
db_port = config.get('database', {}).get('port', 5432)

# Using try-except
# TypeError covers an intermediate key that exists but is null (None)
try:
    password = config['database']['credentials']['password']
except (KeyError, TypeError):
    password = None
    print("Password not found in config")

print(f"App: {app_name}, Port: {db_port}")
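
Chained .get() calls become hard to read for deeper paths, and they fail if an intermediate key exists but holds null. One possible alternative is a small helper that walks a sequence of keys; get_nested below is a hypothetical name, not a PyYAML function:

```python
import yaml


def get_nested(mapping, *keys, default=None):
    """Follow keys through nested dicts; return default if the path breaks."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


config = yaml.safe_load("""
database:
  port: 5432
  credentials: null
""")

print(get_nested(config, "database", "port"))                     # 5432
print(get_nested(config, "database", "credentials", "password"))  # None
print(get_nested(config, "app", "name", default="Unknown"))       # Unknown
```

If the final key exists with a null value, this sketch returns None, not the default.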

Best Practices

  • Always use yaml.safe_load() - Prevents security vulnerabilities from arbitrary code execution
  • Handle empty files - Check if the result is None and provide a default empty dict
  • Use context managers - Always use with open() for proper file handling
  • Validate data structure - Verify expected keys exist before accessing nested values
  • Specify encoding - Use encoding='utf-8' when opening files
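
The "validate data structure" practice can be sketched as a check run once, right after loading. The REQUIRED_KEYS layout and the find_missing_keys name are assumptions made for this example, based on the sample config.yaml above:

```python
import yaml

# Hypothetical requirements: section name -> keys that must be present
REQUIRED_KEYS = {
    "app": ["name", "version"],
    "database": ["host", "port"],
}


def find_missing_keys(config, required=REQUIRED_KEYS):
    """Return a list of missing sections or 'section.key' paths."""
    missing = []
    for section, keys in required.items():
        block = config.get(section)
        if not isinstance(block, dict):
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in block)
    return missing


config = yaml.safe_load("""
app:
  name: My Application
database:
  host: localhost
  port: 5432
""")

print(find_missing_keys(config))  # ['app.version']
```

Reporting every missing key at once tends to be friendlier than failing on the first KeyError deep inside the application.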

Common Issues and Solutions

Issue: ModuleNotFoundError

PyYAML is not installed:

# Error: ModuleNotFoundError: No module named 'yaml'
# Solution: Install PyYAML
pip install pyyaml

Issue: YAMLError on parsing

Invalid YAML syntax in the file:

# Catch syntax errors while loading
import yaml

try:
    with open('config.yaml', 'r') as file:
        data = yaml.safe_load(file)
except yaml.YAMLError as e:
    print(f"YAML syntax error: {e}")
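
Most PyYAML syntax errors are instances of yaml.MarkedYAMLError, which carries a problem_mark with the zero-based line and column where parsing failed. The sketch below turns that into a short, one-based message; describe_yaml_error is a hypothetical helper name, and it parses a string so that it runs without a file:

```python
import yaml


def describe_yaml_error(text):
    """Return a short description of the first syntax error, or None if valid."""
    try:
        yaml.safe_load(text)
    except yaml.MarkedYAMLError as e:
        mark = e.problem_mark
        if mark is not None:
            # Marks are zero-based; add 1 to match what editors display.
            return f"line {mark.line + 1}, column {mark.column + 1}: {e.problem}"
        return str(e)
    except yaml.YAMLError as e:
        return str(e)
    return None


print(describe_yaml_error("name: app\nitems: [1, 2\n"))  # unclosed bracket
print(describe_yaml_error("name: app"))                  # None
```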

Issue: KeyError when accessing nested values

Use safe access methods:

# Instead of: config['database']['port']
# Use safe access:
port = config.get('database', {}).get('port', 5432)

# Or check first:
if 'database' in config and 'port' in config['database']:
    port = config['database']['port']
else:
    port = 5432  # fall back to a default so port is always defined
