Python Read YAML File to Dict
Complete guide to reading YAML files into Python dictionaries
Introduction
Reading YAML files into Python dictionaries is a common task in configuration management and data processing. This guide covers how to use PyYAML, the most popular Python library for YAML processing, to read YAML files and convert them to Python dictionaries.
Installation
Install PyYAML using pip:
# Install PyYAML
pip install pyyaml
# Or using conda
conda install pyyaml
# For Python 3, you may need to use pip3
pip3 install pyyaml
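To confirm the installation, a quick check like the following can help. It is a minimal sketch that relies only on the `yaml.__version__` attribute; note that the package is installed as `pyyaml` but imported as `yaml`.

```python
import yaml  # the PyPI package is "pyyaml", but the import name is "yaml"

# Read the installed PyYAML version to confirm the install worked.
installed_version = yaml.__version__
print(f"PyYAML version: {installed_version}")
```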
Basic Usage
Reading from a File
The simplest way to read a YAML file into a dictionary:
import yaml
# Read YAML file into dictionary
with open('config.yaml', 'r') as file:
    data = yaml.safe_load(file)
print(data)
# Output: {'key': 'value', 'nested': {'item': 123}}
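One caveat: `yaml.safe_load()` returns whatever the top level of the document is, so a file that starts with a list yields a Python list, and an empty file yields `None`. The sketch below shows one possible way to insist on a dictionary. The helper name `load_mapping` is hypothetical and not part of PyYAML.

```python
import yaml

def load_mapping(text):
    """Parse YAML text and insist that the top level is a mapping.

    load_mapping is a hypothetical helper name, not part of PyYAML.
    """
    data = yaml.safe_load(text)
    if data is None:
        # Empty documents parse to None; treat them as an empty dict.
        return {}
    if not isinstance(data, dict):
        raise TypeError(
            f"expected a mapping at the top level, got {type(data).__name__}"
        )
    return data

mapping_result = load_mapping("key: value")
print(mapping_result)

try:
    load_mapping("- a\n- b")
except TypeError as exc:
    list_error = str(exc)
    print(list_error)
```

The same check works on file contents if the open file object is passed to `yaml.safe_load()` instead of a string.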
Reading from a String
You can also parse YAML from a string:
import yaml
yaml_string = """
name: John Doe
age: 30
city: New York
"""
data = yaml.safe_load(yaml_string)
print(data)
# Output: {'name': 'John Doe', 'age': 30, 'city': 'New York'}
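Note that `age` came back as an integer rather than a string: PyYAML infers a Python type for each unquoted scalar. The following sketch, using made-up keys, illustrates how a few common scalar forms are resolved.

```python
import yaml

# Sample keys below are illustrative only.
sample = """
port: 5432
port_text: "5432"
ratio: 1.5
version: 1.0.0
debug: false
missing: null
"""

parsed = yaml.safe_load(sample)

# Show each value together with the Python type it was resolved to.
for key, value in parsed.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

Quoting a value, as in `port_text`, is the usual way to force it to stay a string.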
Complete Example
A complete example with error handling:
import yaml
import sys
try:
    with open('config.yaml', 'r') as file:
        data = yaml.safe_load(file)
    if data is None:
        data = {}  # Handle empty files
    print("YAML loaded successfully:")
    print(data)
except FileNotFoundError:
    print("Error: config.yaml not found")
    sys.exit(1)
except yaml.YAMLError as e:
    print(f"Error parsing YAML: {e}")
    sys.exit(1)
Example YAML File
Sample config.yaml file:
# config.yaml
app:
  name: My Application
  version: 1.0.0
  debug: false
database:
  host: localhost
  port: 5432
  name: myapp_db
  credentials:
    username: admin
    password: secret123
features:
  enabled:
    - authentication
    - logging
    - caching
  max_users: 1000
Reading and Accessing Data
import yaml
with open('config.yaml', 'r') as file:
    config = yaml.safe_load(file)
# Access nested values
app_name = config['app']['name']
db_host = config['database']['host']
db_password = config['database']['credentials']['password']
# Access list items
first_feature = config['features']['enabled'][0]
print(f"App: {app_name}")
print(f"Database: {db_host}")
print(f"First feature: {first_feature}")
safe_load vs load
yaml.safe_load() (Recommended)
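Safe method that resolves only standard YAML tags and never constructs arbitrary Python objects:
import yaml
# Safe loading (recommended)
with open('config.yaml', 'r') as file:
    data = yaml.safe_load(file)
# Produces plain Python values:
# dict, list, str, int, float, bool, None
# (timestamp scalars such as 2024-01-31 become date/datetime objects)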
Benefits:
- Prevents arbitrary code execution
- More secure for untrusted YAML files
- Recommended for configuration files
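yaml.load() (Not Recommended)
Lower-level method whose safety depends on the Loader you pass. Since PyYAML 6.0 the Loader argument is required:
import yaml
# Loading with an explicit loader (not recommended for untrusted input)
with open('config.yaml', 'r') as file:
    data = yaml.load(file, Loader=yaml.FullLoader)
# Warning: FullLoader is more restricted than yaml.UnsafeLoader,
# which can construct arbitrary Python objects and so run arbitrary code
# Only use yaml.load() with trusted YAML files
Security Risk: yaml.load() with an unsafe loader such as yaml.UnsafeLoader can execute arbitrary code from YAML files. Always use yaml.safe_load() unless you fully trust the source.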
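To see the difference in practice, the sketch below feeds `yaml.safe_load()` a document that uses a Python-specific tag. The tuple tag chosen here is harmless; it simply stands in for the family of `!!python/...` tags that unsafe loaders act on.

```python
import yaml

# A document that uses a Python-specific tag. The tuple tag is harmless,
# but it belongs to the same family of tags that unsafe loaders act on.
tagged = "value: !!python/tuple [1, 2]"

try:
    yaml.safe_load(tagged)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load has no constructor for python/* tags, so it raises.
    rejected = True
    print(f"safe_load refused the document: {type(exc).__name__}")
```

Because the error is a subclass of `yaml.YAMLError`, the usual parsing error handler also covers rejected tags.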
Advanced Usage
Loading Multiple Documents
YAML files can contain multiple documents separated by ---:
import yaml
# YAML with multiple documents
yaml_content = """
---
name: Document 1
value: 100
---
name: Document 2
value: 200
"""
# Load all documents
documents = list(yaml.safe_load_all(yaml_content))
for doc in documents:
    print(doc)
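When the documents live in a file, keep in mind that `yaml.safe_load_all()` returns a generator, so it should be consumed while the file is still open. The following sketch writes a temporary two-document file (the file and its contents are made up for illustration) and reads it back.

```python
import os
import tempfile
import yaml

# Write a small two-document file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".yaml", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("name: Document 1\nvalue: 100\n---\nname: Document 2\nvalue: 200\n")
    multi_path = tmp.name

# safe_load_all is lazy, so build the list before the file is closed.
with open(multi_path, "r", encoding="utf-8") as file:
    docs = [doc for doc in yaml.safe_load_all(file)]

os.remove(multi_path)

total = sum(doc["value"] for doc in docs)
print(len(docs), total)
```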
Custom Loader with Error Handling
import yaml
from pathlib import Path
def load_yaml_to_dict(file_path):
    """Load YAML file to dictionary with comprehensive error handling."""
    path = Path(file_path)
    if not path.exists():
        raise FileNotFoundError(f"YAML file not found: {file_path}")
    try:
        with open(path, 'r', encoding='utf-8') as file:
            data = yaml.safe_load(file)
        # Empty files parse to None, so fall back to an empty dict
        return data if data is not None else {}
    except yaml.YAMLError as e:
        # Chain the original exception so the traceback keeps the parser details
        raise ValueError(f"Invalid YAML syntax: {e}") from e
    except OSError as e:
        raise RuntimeError(f"Error reading YAML file: {e}") from e

# Usage
try:
    config = load_yaml_to_dict('config.yaml')
    print(config)
except (FileNotFoundError, ValueError, RuntimeError) as e:
    print(f"Error: {e}")
Accessing Nested Values Safely
import yaml
with open('config.yaml', 'r') as file:
    config = yaml.safe_load(file)
# Safe access with default values
app_name = config.get('app', {}).get('name', 'Unknown')
db_port = config.get('database', {}).get('port', 5432)
# Using try-except
try:
    password = config['database']['credentials']['password']
except KeyError:
    password = None
    print("Password not found in config")
print(f"App: {app_name}, Port: {db_port}")
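Chained `.get()` calls still fail if an intermediate key exists but holds `None`, which is what a bare `database:` line produces. One possible workaround is a small path-walking helper. The name `get_nested` and the dotted-path convention below are assumptions for illustration, not part of PyYAML.

```python
def get_nested(mapping, dotted_path, default=None):
    """Follow a dotted path such as 'database.port' through nested dicts.

    get_nested is a hypothetical helper, not a PyYAML function.
    """
    current = mapping
    for key in dotted_path.split("."):
        # Stop as soon as the path leaves dict territory or a key is absent.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

config = {
    "app": {"name": "My Application"},
    "database": None,  # what safe_load returns for a bare "database:" line
}

app_name = get_nested(config, "app.name", "Unknown")
db_port = get_nested(config, "database.port", 5432)
print(app_name, db_port)
```

This sketch assumes keys never contain a dot; a list of keys could be passed instead if they do.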
Best Practices
- Always use yaml.safe_load(): prevents security vulnerabilities from arbitrary code execution
- Handle empty files: check if the result is None and provide a default empty dict
- Use context managers: always use with open() for proper file handling
- Validate data structure: verify expected keys exist before accessing nested values
- Specify encoding: use encoding='utf-8' when opening files
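The "validate data structure" practice can be sketched as a small check that runs right after loading. The section names mirror the sample config.yaml from this guide; the `find_problems` function and the `REQUIRED_SECTIONS` table are hypothetical names chosen for this example.

```python
# Hypothetical table of required top-level sections and their expected types.
REQUIRED_SECTIONS = {"app": dict, "database": dict, "features": dict}

def find_problems(config, required=REQUIRED_SECTIONS):
    """Return a list of human-readable problems found in a loaded config."""
    problems = []
    for key, expected_type in required.items():
        if key not in config:
            problems.append(f"missing section: {key}")
        elif not isinstance(config[key], expected_type):
            problems.append(
                f"{key} should be a {expected_type.__name__}, "
                f"got {type(config[key]).__name__}"
            )
    return problems

# A deliberately incomplete config, as safe_load might return it.
example = {"app": {"name": "My Application"}, "database": None}
problems = find_problems(example)
for problem in problems:
    print(problem)
```

Collecting every problem before reporting, rather than failing on the first one, tends to make configuration errors quicker to fix.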
Common Issues and Solutions
Issue: ModuleNotFoundError
PyYAML is not installed:
# Error: ModuleNotFoundError: No module named 'yaml'
# Solution: Install PyYAML
pip install pyyaml
Issue: YAMLError on parsing
Invalid YAML syntax in the file:
# Validate YAML before loading
import yaml
try:
    with open('config.yaml', 'r') as file:
        data = yaml.safe_load(file)
except yaml.YAMLError as e:
    print(f"YAML syntax error: {e}")
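For syntax errors, PyYAML usually records where the problem occurred. The sketch below, using a deliberately broken inline document, reads the `problem_mark` attribute of `yaml.MarkedYAMLError`; its `line` and `column` values are zero-based, so the code adds one for display.

```python
import yaml

# The flow sequence is never closed, which is a syntax error.
broken = "app:\n  name: My Application\n  ports: [80, 443\n"

location = None
try:
    yaml.safe_load(broken)
except yaml.MarkedYAMLError as exc:
    mark = exc.problem_mark
    if mark is not None:
        # Marks are zero-based; convert to the one-based numbers editors show.
        location = (mark.line + 1, mark.column + 1)
        print(f"Syntax error near line {location[0]}, column {location[1]}: {exc.problem}")
```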
Issue: KeyError when accessing nested values
Use safe access methods:
# Instead of: config['database']['port']
# Use safe access:
port = config.get('database', {}).get('port', 5432)
# Or check first:
if 'database' in config and 'port' in config['database']:
    port = config['database']['port']