Common Issues and Solutions
This runbook provides solutions to frequently encountered issues during development.
Table of Contents
- Database Issues
- Docker Issues
- Authentication & API Issues
- Performance Issues
- Background Job Issues
- Build & Deployment Issues
- Development Environment Issues
Database Issues
Issue: Database Connection Failed
Symptom:
sqlalchemy.exc.OperationalError: could not connect to server: Connection refused
Cause: PostgreSQL service not running or wrong connection settings
Solution:
# Check if PostgreSQL is running
docker-compose ps db
# If not running, start it
docker-compose up -d db
# Check logs
docker-compose logs db
# Verify connection settings in .env
cat .env | grep DATABASE_URL
# Test connection
docker-compose exec db psql -U postgres -c "SELECT 1"
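When the `psql` check is not available (for example, from inside the app container), the same "is anything listening?" question can be answered from Python. This is a minimal sketch using only the standard library; the helper names `db_host_port` and `can_connect` and the example URL are illustrative assumptions, not part of the project.

```python
import socket
from urllib.parse import urlsplit


def db_host_port(database_url, default_port=5432):
    """Extract (host, port) from a URL such as postgresql://user:pw@db:5432/app."""
    parts = urlsplit(database_url)
    return parts.hostname, parts.port or default_port


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and DNS failures
        return False


# Hypothetical URL for illustration; substitute the DATABASE_URL value from your .env
host, port = db_host_port("postgresql://postgres:secret@localhost:5432/app")
print(host, port)
```

If `can_connect(host, port)` returns False, the problem is likely at the network or container level rather than in credentials or the database name.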
Issue: Migration Conflicts
Symptom:
alembic.util.exc.CommandError: Target database is not up to date.
Cause: The database is behind the latest revision, often because migrations were created on different branches and left multiple heads
Solution:
# Check current version
alembic current
# Check migration history
alembic history
# List head revisions (more than one means branches have diverged)
alembic heads
# If migrations conflict, you may need to:
# Option 1: Merge migrations (if in development)
alembic merge heads -m "merge migrations"
alembic upgrade head
# Option 2: Reset database (DEVELOPMENT ONLY)
docker-compose down -v
docker-compose up -d db
alembic upgrade head
# Option 3: Manually resolve (production)
# Contact team lead for guidance
Issue: Slow Query Performance
Symptom:
- API endpoints taking >1 second
- Database CPU usage high
Cause: Missing indexes or N+1 queries
Solution:
# 1. Enable query logging
# In your endpoint, add:
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
# 2. Check for N+1 queries
# Look for multiple SELECT queries in logs
# 3. Use eager loading
from sqlalchemy.orm import joinedload
# BAD - N+1 query
orders = db.query(Order).all()
for order in orders:
    print(order.user.email)  # Each iteration triggers a query
# GOOD - Eager loading
orders = db.query(Order).options(joinedload(Order.user)).all()
for order in orders:
    print(order.user.email)  # No additional queries
# 4. Add indexes
# See docs/02-standards/database-patterns.md
# Add index in migration:
op.create_index('idx_orders_user_id', 'orders', ['user_id'])
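To see why the N+1 pattern matters, it can help to count the statements each approach sends. The sketch below is not the project's ORM code: it uses the standard-library `sqlite3` module with a made-up two-table schema and records every executed statement through `set_trace_callback`, so the per-row lookups can be compared with a single JOIN (roughly what `joinedload` emits).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id));
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO orders VALUES (1, 1), (2, 2), (3, 1);
""")

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement executed

# N+1 pattern: one query for the orders, then one more per order for its user
orders = conn.execute("SELECT id, user_id FROM orders").fetchall()
for _, user_id in orders:
    conn.execute("SELECT email FROM users WHERE id = ?", (user_id,)).fetchone()
n_plus_one_count = len(statements)

statements.clear()

# Single JOIN: all orders and their users in one round trip
conn.execute(
    "SELECT o.id, u.email FROM orders o JOIN users u ON u.id = o.user_id"
).fetchall()
joined_count = len(statements)

print(n_plus_one_count, joined_count)
```

The first count grows with the number of orders, while the second stays constant; the SQLAlchemy query log from step 1 shows the same shape.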
Issue: Database Locked (SQLite Development)
Symptom:
sqlite3.OperationalError: database is locked
Cause: Multiple processes accessing SQLite simultaneously
Solution:
# Use PostgreSQL instead of SQLite for development
# Update docker-compose.yml and use PostgreSQL service
# If you must use SQLite:
# 1. Close all database connections
# 2. Delete the database file
# 3. Run migrations again
rm test.db
alembic upgrade head
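If you must stay on SQLite, lock errors can sometimes be reduced rather than worked around by deleting the file. The following is a sketch under assumptions: the helper name `open_sqlite`, the 10-second timeout, and the temporary database path are illustrative, and whether WAL mode suits your setup depends on the filesystem (it generally does not work on network shares).

```python
import os
import sqlite3
import tempfile


def open_sqlite(path, busy_timeout_s=10.0):
    """Open a SQLite database with a longer lock wait and write-ahead logging."""
    # timeout: how long this connection waits for a lock before raising
    # "database is locked"
    conn = sqlite3.connect(path, timeout=busy_timeout_s)
    # WAL allows readers to proceed while a single writer is active
    conn.execute("PRAGMA journal_mode=WAL")
    return conn


# Temporary file used only for this demonstration
db_path = os.path.join(tempfile.mkdtemp(), "dev.db")
conn = open_sqlite(db_path)
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
print(mode)
conn.close()
```

This only softens contention between a few local processes; PostgreSQL remains the recommended option above.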
Docker Issues
Issue: Port Already in Use
Symptom:
Error: Bind for 0.0.0.0:5432 failed: port is already allocated
Cause: Another service using the same port
Solution:
# Find process using the port
lsof -i :5432 # On macOS/Linux
netstat -ano | findstr :5432 # On Windows
# Kill the process
kill -9 <PID>
# Or change port in docker-compose.yml
ports:
  - "5433:5432"  # Use different host port
# Restart services
docker-compose down
docker-compose up -d
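A cross-platform way to check a port before starting the stack is to try binding to it. This is a small standard-library sketch; the function name `port_is_free` is an assumption for illustration, and it checks the loopback interface only.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"
            return False
        return True


# 5432 is the PostgreSQL port from the error message above
print(port_is_free(5432))
```

A False result means some process (or another container with a published port) still holds the port, so either stop it or remap the host port as shown above.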
Issue: Docker Container Won’t Start
Symptom: Container exits immediately after starting
Cause: Configuration error, missing dependencies, or resource limits
Solution:
# Check container logs
docker-compose logs app
# Check container status
docker-compose ps
# Common fixes:
# 1. Environment variable missing
docker-compose config # Validate docker-compose.yml
# 2. Build issues
docker-compose build --no-cache app
docker-compose up -d app
# 3. Resource limits
# Increase Docker Desktop memory/CPU limits
# 4. Permission issues
chmod +x entrypoint.sh
# 5. Start container interactively for debugging
docker-compose run --rm app /bin/bash
Issue: Out of Disk Space
Symptom:
Error: no space left on device
Cause: Docker images and volumes filling up disk
Solution:
# Check Docker disk usage
docker system df
# Clean up unused resources
docker system prune -a # Remove all unused images, containers, networks
# Remove unused volumes
docker volume prune
# Remove specific volume
docker volume rm yourapp_postgres_data
# Check system disk space
df -h
Authentication & API Issues
Issue: 401 Unauthorized on All Requests
Symptom: All API requests return 401 even with valid token
Cause: JWT secret mismatch or token expiration
Solution:
# 1. Check JWT_SECRET_KEY in .env matches across services
cat .env | grep JWT_SECRET_KEY
# 2. Verify token hasn't expired
# Decode JWT at https://jwt.io (avoid pasting production tokens into third-party sites)
# Check exp (expiration) claim
# 3. Generate new token
curl -X POST http://localhost:8000/api/v1/auth/login \
-H "Content-Type: application/json" \
-d '{"username": "user@example.com", "password": "password"}'
# 4. Clear old tokens from localStorage (frontend)
localStorage.clear()
# 5. Restart services
docker-compose restart app
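The expiration check in step 2 can also be done locally without sending the token anywhere. The sketch below decodes the payload segment of a JWT with the standard library only. It does NOT verify the signature, so it is a debugging aid, not an authentication check; the helper names `jwt_claims` and `is_expired` are illustrative assumptions.

```python
import base64
import json
import time


def jwt_claims(token):
    """Decode the payload of a JWT WITHOUT verifying its signature (debugging only)."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_expired(token, now=None):
    """Return True if the token's exp claim is at or before the given time."""
    exp = jwt_claims(token).get("exp")
    if exp is None:
        return False  # no exp claim: the token carries no expiry of its own
    return exp <= (time.time() if now is None else now)
```

If `is_expired(token)` is False and requests still return 401, a secret mismatch between services (step 1) becomes the more likely cause.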
Issue: CORS Errors
Symptom:
Access to fetch at 'http://localhost:8000' from origin 'http://localhost:3000'
has been blocked by CORS policy
Cause: Frontend origin not in CORS_ORIGINS
Solution:
# 1. Check CORS configuration in .env
CORS_ORIGINS=["http://localhost:3000", "http://localhost:5173"]
# 2. Update app/main.py
from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000", "http://localhost:5173"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
# 3. Restart API server
docker-compose restart app
# 4. Clear browser cache
# Chrome: Cmd+Shift+R (Mac) or Ctrl+Shift+R (Windows)
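Steps 1 and 2 list the same origins in two places, which makes them easy to get out of sync. One option is to parse the `.env` value and pass the result to `allow_origins`. This is a sketch that assumes `CORS_ORIGINS` holds a JSON list, as in step 1; the function name `parse_cors_origins` is hypothetical and the project may already load settings differently.

```python
import json


def parse_cors_origins(raw):
    """Parse a JSON list of origins, e.g. '["http://localhost:3000"]'."""
    origins = json.loads(raw)
    if not isinstance(origins, list) or not all(isinstance(o, str) for o in origins):
        raise ValueError("CORS_ORIGINS must be a JSON list of strings")
    # Browsers send the Origin header without a trailing slash, so strip it
    return [origin.rstrip("/") for origin in origins]


origins = parse_cors_origins('["http://localhost:3000", "http://localhost:5173/"]')
print(origins)
```

In `app/main.py` the parsed list could then replace the hard-coded one, for example `allow_origins=parse_cors_origins(os.environ["CORS_ORIGINS"])`.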
Issue: API Endpoint Returns 500 Internal Server Error
Symptom: Request succeeds in Postman but returns 500 from frontend
Cause: Usually a Python exception in the endpoint
Solution:
# 1. Check API logs
docker-compose logs -f app
# 2. Look for stack trace
# Error will show which line caused the exception
# 3. Common causes:
# - Missing database record (use .first() not .one())
# - Type mismatch (string passed where int expected)
# - Missing environment variable
# - External API call failed
# 4. Add try/except with logging
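try:
    result = some_operation()
except Exception:
    # logger.exception records the message together with the full stack trace
    logger.exception("Operation failed")
    # Return a generic message; str(e) can leak internal details to clients
    raise HTTPException(status_code=500, detail="Internal server error")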
Performance Issues
Issue: Slow API Response Times (>500ms)
Symptom: API endpoints taking longer than 500ms to respond
Cause: Multiple possible causes
Solution:
# 1. Check which part is slow
# Add timing logs to your endpoint:
import time
start = time.perf_counter()  # monotonic clock, unaffected by system clock changes
# Your code here
logger.info(f"Operation took {time.perf_counter() - start:.2f}s")
# 2. Common slow operations:
# Database queries
# Solution: Add indexes, use eager loading, check for N+1
# External API calls
# Solution: Use async/await or move to background job
# File processing
# Solution: Move to background job
# 3. Use profiling
# Install:
pip install py-spy
# Profile running process:
py-spy top --pid <PID>
# 4. Check database slow query log
# See debugging.md for EXPLAIN ANALYZE
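The timing logs from step 1 get repetitive when several parts of an endpoint need measuring. A reusable context manager keeps them tidy. This is a minimal sketch; the name `timed` and the optional `timings` dictionary are assumptions for illustration.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def timed(label, timings=None):
    """Log how long the enclosed block takes; optionally store it in a dict."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failures are timed too
        elapsed = time.perf_counter() - start
        logger.info("%s took %.3fs", label, elapsed)
        if timings is not None:
            timings[label] = elapsed


timings = {}
with timed("db_query", timings):
    sum(range(100_000))  # stand-in for the real work
```

Wrapping the database call, the external API call, and the serialization step separately usually shows which of the causes in step 2 applies.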
Issue: Memory Leak
Symptom: Container memory usage grows over time, eventually crashes
Cause: Objects not being garbage collected
Solution:
# 1. Monitor memory usage
docker stats yourapp_api
# 2. Check for common causes:
# - Database connections not closed
# Solution: Use context managers or dependencies
# - Large objects cached in memory
# Solution: Clear cache periodically, use Redis
# - Circular references
# Solution: Use weakref where appropriate
# 3. Profile memory usage
pip install memory_profiler
# Add to your code:
from memory_profiler import profile
@profile
def your_function():
    ...  # Your code
# Run and check output (your_script.py is a placeholder):
python -m memory_profiler your_script.py
# 4. Restart container as temporary fix
docker-compose restart app
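If installing `memory_profiler` is not an option, the standard-library `tracemalloc` module can show which lines allocated the most memory between two points in time. The sketch below is illustrative: `top_allocations` and `build_cache` are made-up names, and `build_cache` stands in for whatever code you suspect.

```python
import tracemalloc


def top_allocations(func, limit=5):
    """Run func and return (result, largest allocation differences by line)."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = func()  # keep the result alive so its memory is still allocated
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Entries are sorted by the absolute size difference, largest first
    stats = after.compare_to(before, "lineno")
    return result, stats[:limit]


def build_cache():
    # Stand-in for code suspected of holding on to memory
    return [str(i) * 10 for i in range(50_000)]


_, stats = top_allocations(build_cache)
for stat in stats:
    print(stat)
```

In a long-running service, taking snapshots a few minutes apart and comparing them the same way can point to the line whose allocations keep growing.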
Background Job Issues
Issue: Celery Worker Not Processing Jobs
Symptom: Jobs queued but not executing
Cause: Worker not running or Redis connection failed
Solution:
# 1. Check if worker is running
docker-compose ps worker
# 2. Check worker logs
docker-compose logs -f worker
# 3. Check Redis connection
docker-compose exec redis redis-cli ping
# Should return: PONG
# 4. Check queue depth
docker-compose exec redis redis-cli LLEN celery
# 5. Restart worker
docker-compose restart worker
# 6. Purge queue if needed (DEVELOPMENT ONLY)
celery -A app.worker purge
Issue: Background Jobs Failing
Symptom: Jobs start but fail with errors
Cause: Exception in task code
Solution:
# 1. Check worker logs for stack trace
docker-compose logs -f worker
# 2. Test task manually
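python
>>> from app.tasks.your_task import your_task
>>> your_task.delay(args)  # enqueue for the worker
>>> your_task.apply(args=(args,)).get()  # or run in this process so the exception surfaces here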
# 3. Common causes:
# - Database connection not available in worker
# - Environment variables not set for worker
# - External API unreachable
# 4. Add retry logic
@celery_app.task(bind=True, max_retries=3)
def your_task(self, arg):
    try:
        ...  # Your code
    except Exception as exc:
        # Re-queue the task; Celery gives up after max_retries attempts
        raise self.retry(exc=exc, countdown=60)
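A fixed `countdown=60` retries at a constant interval, which can keep hammering an external API that is still down. A common alternative is exponential backoff. The helper below is a plain-Python sketch; the name `retry_countdown` and the default base and cap values are assumptions, not project settings.

```python
def retry_countdown(retries, base=60, cap=3600):
    """Seconds to wait before the next attempt: base * 2**retries, capped at cap."""
    return min(cap, base * 2 ** retries)


# Delays for the first four attempts
delays = [retry_countdown(n) for n in range(4)]
print(delays)  # [60, 120, 240, 480]
```

Inside a bound task this could be used as `raise self.retry(exc=exc, countdown=retry_countdown(self.request.retries))`, where `self.request.retries` is the number of retries Celery has already made.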
Issue: Redis Connection Refused
Symptom:
redis.exceptions.ConnectionError: Error connecting to Redis
Cause: Redis service not running
Solution:
# Check Redis status
docker-compose ps redis
# Start Redis
docker-compose up -d redis
# Test connection
docker-compose exec redis redis-cli ping
# Check REDIS_URL in .env
cat .env | grep REDIS_URL
# Should be: redis://redis:6379/0 (in Docker)
# or: redis://localhost:6379/0 (local)
Build & Deployment Issues
Issue: Pre-commit Hooks Failing
Symptom: Commit blocked by pre-commit hooks
Cause: Code doesn’t meet formatting/linting standards
Solution:
# 1. See what failed
# Output shows which hook failed and why
# 2. Run hooks manually to see details
pre-commit run --all-files
# 3. Common failures:
# Black (formatting)
black app/
# isort (imports)
isort app/
# flake8 (linting)
flake8 app/
# Fix issues shown
# mypy (type checking)
mypy app/
# Add type hints
# 4. Try commit again
git add .
git commit -m "your message"
# 5. If urgent, skip hooks (NOT RECOMMENDED)
git commit --no-verify
Issue: Tests Passing Locally but Failing in CI
Symptom: pytest passes on your machine but fails in GitHub Actions
Cause: Environment differences
Solution:
# Common causes:
# 1. Different Python/Node version
# Solution: Match versions in CI config
# 2. Missing environment variables
# Solution: Add to GitHub Secrets
# 3. Time zone differences
# Solution: Use UTC in tests
# 4. File path differences (Windows vs Linux)
# Solution: Use pathlib instead of string paths
# 5. Test depends on local data
# Solution: Use fixtures, don't rely on local files
# 6. Race condition in async tests
# Solution: Await the operation or poll for the expected state; avoid fixed sleeps
# Replicate CI environment locally:
docker run -it python:3.11 /bin/bash
# Run tests in container
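Causes 3 and 4 are the easiest to fix in code. The sketch below shows the two habits in isolation: building paths with `pathlib` and producing timezone-aware UTC timestamps. The `fixtures/orders.json` path and the `utc_now` helper are hypothetical examples, not files or functions from this project.

```python
from datetime import datetime, timezone
from pathlib import Path

# Resolve paths relative to this file rather than the current working directory.
# Falls back to the working directory when __file__ is undefined (e.g. in a REPL).
BASE_DIR = Path(__file__).resolve().parent if "__file__" in globals() else Path.cwd()

# The / operator joins segments with the correct separator on every OS
fixture_path = BASE_DIR / "fixtures" / "orders.json"  # hypothetical fixture file


def utc_now():
    """Timezone-aware current time in UTC, independent of the machine's local zone."""
    return datetime.now(timezone.utc)


print(fixture_path.name, utc_now().isoformat())
```

Tests that compare timestamps or open files this way behave the same on a developer laptop and on a CI runner set to a different time zone or operating system.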
Development Environment Issues
Issue: Hot Reload Not Working
Symptom: Changes to code don’t trigger automatic restart
Cause: Volume mounting issue or file watcher not detecting changes
Solution:
# For FastAPI (uvicorn):
# 1. Ensure --reload flag is set
uvicorn app.main:app --reload
# 2. Check volume mounts in docker-compose.yml
volumes:
  - .:/app  # Should mount current directory
# 3. Restart container
docker-compose restart app
# For React (Vite):
# 1. Check vite.config.ts
server: {
  watch: {
    usePolling: true  // For Docker
  }
}
# 2. Restart frontend
npm run dev
# WSL2 specific:
# File changes on Windows might not trigger in WSL
# Solution: Move project inside WSL filesystem
Issue: Import Errors After Adding New Module
Symptom:
ModuleNotFoundError: No module named 'your_module'
Cause: Virtual environment not activated or package not installed
Solution:
# 1. Activate virtual environment
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
# 2. Install package
pip install your-package
# 3. For development dependencies
pip install -r requirements-dev.txt
# 4. In Docker
docker-compose build app
docker-compose up -d app
# 5. Check Python path
python -c "import sys; print(sys.path)"
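Steps 1 and 5 can be combined into one quick diagnostic that reports which interpreter is running, whether it belongs to a virtual environment, and whether a module can be found. This is a standard-library sketch; the helper names `in_virtualenv` and `module_available` are assumptions for illustration.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def module_available(name):
    """True if the module can be located without actually importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names whose parent package is missing
        return False


print("interpreter:", sys.executable)
print("virtualenv active:", in_virtualenv())
print("json importable:", module_available("json"))
```

If `sys.executable` points outside your `venv` directory, the package was probably installed into a different interpreter than the one running the app.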
Getting Help
If these solutions don’t resolve your issue:
- Check logs - Most issues show clear errors in logs
- Search Slack - #dev-team channel history
- Ask in Slack - #dev-team with:
  - What you’re trying to do
  - What error you’re seeing
  - What you’ve already tried
- Create GitHub issue - For bugs that need tracking
- Pair with senior dev - For complex issues
Prevention
Many issues can be prevented by:
- ✅ Running tests before committing
- ✅ Using pre-commit hooks
- ✅ Keeping dependencies updated
- ✅ Reading error messages carefully
- ✅ Following our standards (see docs/02-standards/)
Next Steps
- For debugging techniques, see debugging.md
- For production incidents, see incident-response.md
- For deployment issues, see ../03-workflows/deployment.md