Troubleshooting
Common issues and solutions across RavenmaskOS.
Quick Diagnostics
# Check all containers
docker ps --format "table {{.Names}}\t{{.Status}}" | sort
# Check disk space
df -h ~/ravenhelm/data
# Check memory
docker stats --no-stream
# Check logs
docker logs --tail 50 <container>
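To narrow the container list to the ones that need attention, the status column can be filtered. This is a minimal sketch: the helper name `unhealthy_containers` is hypothetical, and it assumes tab-separated name/status lines (the format string above without the `table` prefix).

```shell
# Hypothetical helper: reads "name<TAB>status" lines on stdin and prints
# the containers whose status does not start with "Up" or is marked unhealthy.
unhealthy_containers() {
  awk -F '\t' '$2 !~ /^Up/ || $2 ~ /unhealthy/ { print $1 ": " $2 }'
}

# Usage (requires a running Docker daemon):
#   docker ps -a --format '{{.Names}}\t{{.Status}}' | unhealthy_containers
```

An empty result means every container reports an "Up" status and none is flagged unhealthy.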
Common Issues
Service Unavailable (502/504)
Symptoms: Browser shows 502 Bad Gateway or 504 Gateway Timeout
Diagnosis:
# Check Traefik
docker logs --tail 20 traefik
# Check backend
docker ps | grep <service>
docker logs --tail 20 <service>
Solutions:
- Restart the backend service
- Verify Traefik labels are correct
- Check that the service is attached to the ravenhelm_net network
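The network check can be scripted rather than read off by eye. This is a sketch under assumptions: the helper name `has_network` is hypothetical, and `<service>` remains a placeholder.

```shell
# Hypothetical helper: succeeds if the network name given as $1 appears as a
# whole word in the whitespace-separated list read from stdin.
has_network() {
  awk -v target="$1" '
    { for (i = 1; i <= NF; i++) if ($i == target) found = 1 }
    END { exit !found }
  '
}

# Usage (requires Docker):
#   docker inspect \
#     --format '{{range $name, $net := .NetworkSettings.Networks}}{{$name}} {{end}}' \
#     <service> | has_network ravenhelm_net || echo "not attached to ravenhelm_net"
```

The exact-word comparison avoids a false positive on similarly named networks such as ravenhelm_net_old.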
Container Crash Loop
Symptoms: Container keeps restarting
Diagnosis:
docker ps -a | grep <service>
docker logs <service>
Solutions:
- Check for configuration errors
- Verify environment variables
- Check that dependencies are running
- Review resource limits
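The exit code of the last run often points at which of the causes above applies. The mapping below is a sketch of conventional exit-code meanings, not a complete list; the helper name `explain_exit_code` is hypothetical.

```shell
# Hypothetical helper: translates a container exit code into a likely cause.
# The meanings are conventions (shell and Docker), not guarantees.
explain_exit_code() {
  case "$1" in
    0)   echo "clean exit; check the restart policy and the container command" ;;
    1)   echo "application error; check the logs and configuration" ;;
    126) echo "container command could not be invoked" ;;
    127) echo "container command not found" ;;
    137) echo "killed by SIGKILL; often the OOM killer or docker kill" ;;
    143) echo "terminated by SIGTERM" ;;
    *)   echo "unrecognized exit code $1; check the logs" ;;
  esac
}

# Usage (requires Docker); prints restart count, exit code, and OOM flag:
#   docker inspect --format '{{.RestartCount}} {{.State.ExitCode}} {{.State.OOMKilled}}' <service>
#   explain_exit_code 137
```

If `.State.OOMKilled` is true or the exit code is 137, the resource limits are the first thing to review.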
Database Connection Failed
Symptoms: Services can't connect to PostgreSQL
Diagnosis:
docker exec postgres pg_isready
docker logs postgres
Solutions:
- Restart PostgreSQL:
docker restart postgres
- Check credentials in .env
- Verify network connectivity
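The exit status of pg_isready distinguishes a server that is down from one that is still starting. A small sketch that decodes it follows; the helper name `explain_pg_isready` is hypothetical.

```shell
# Hypothetical helper: decodes the documented pg_isready exit statuses.
explain_pg_isready() {
  case "$1" in
    0) echo "accepting connections" ;;
    1) echo "rejecting connections (for example, still starting up)" ;;
    2) echo "no response to the connection attempt" ;;
    3) echo "no attempt made (for example, invalid parameters)" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Usage (requires Docker). Note that docker exec itself can return a non-zero
# status if the postgres container is not running, so check docker ps first:
#   docker exec postgres pg_isready; explain_pg_isready "$?"
```

A status of 0 while services still fail to connect points toward credentials or network configuration rather than PostgreSQL itself.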
SSO Login Fails
Symptoms: Redirect loop or error after Zitadel login
Diagnosis:
docker logs zitadel 2>&1 | grep -i error
docker logs <service> 2>&1 | grep -i oauth
Solutions:
- Verify Client ID/Secret
- Check redirect URI matches exactly
- Clear browser cookies
- Verify Zitadel is healthy
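Redirect URI mismatches are easy to miss by eye because the comparison is an exact string match. The sketch below flags two common differences; the helper name `compare_redirect_uri` and the example URLs are hypothetical.

```shell
# Hypothetical helper: compares the redirect URI configured in the identity
# provider ($1) with the one the service sends ($2) and names the difference
# when it is only a trailing slash or only the scheme.
compare_redirect_uri() {
  configured="$1"
  requested="$2"
  if [ "$configured" = "$requested" ]; then
    echo "match"
  elif [ "${configured%/}" = "${requested%/}" ]; then
    echo "mismatch: trailing slash"
  elif [ "${configured#*://}" = "${requested#*://}" ]; then
    echo "mismatch: scheme"
  else
    echo "mismatch"
  fi
}

# Usage:
#   compare_redirect_uri "https://app.example.com/callback" "https://app.example.com/callback/"
```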
Disk Space Full
Symptoms: Services fail, write errors in logs
Diagnosis:
df -h
du -sh ~/ravenhelm/data/*
docker system df
Solutions:
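- Clean Docker (removes all unused images, stopped containers, and unused networks):
docker system prune -a
- Clean logs (truncates the container's log file; assumes the default json-file logging driver):
sudo truncate -s 0 "$(docker inspect --format '{{.LogPath}}' <container>)"
- Rotate large data directories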
- Review retention policies
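To catch a filling disk before services start failing, the df output can be checked against a threshold. This is a minimal sketch: the helper names `usage_percent` and `disk_over_threshold` are hypothetical, and the 90% figure is only an example.

```shell
# Hypothetical helper: reads `df -P` output on stdin and prints the capacity
# of the first filesystem row as a bare integer (for example, 91).
usage_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeeds when the usage read from stdin is at or
# above the threshold given as $1.
disk_over_threshold() {
  [ "$(usage_percent)" -ge "$1" ]
}

# Usage:
#   df -P ~/ravenhelm/data | disk_over_threshold 90 && echo "disk nearly full"
```

The `-P` flag requests the POSIX output format, which keeps each filesystem on a single line so the column positions stay predictable.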
Certificate Errors
Symptoms: Browser shows certificate warning
Diagnosis:
docker logs traefik 2>&1 | grep -i acme
echo | openssl s_client -servername <domain> -connect <domain>:443 2>/dev/null | openssl x509 -noout -dates
Solutions:
- Verify AWS credentials
- Check DNS resolution
- Force renewal: delete acme.json and restart Traefik
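Before forcing a renewal, it can help to confirm how long the served certificate remains valid. This is a sketch under assumptions: the helper names `cert_not_after` and `cert_valid_for_days` are hypothetical, and `<domain>` remains a placeholder.

```shell
# Hypothetical helper: extracts the expiry date from `openssl x509 -noout -dates`
# output read on stdin.
cert_not_after() {
  sed -n 's/^notAfter=//p'
}

# Hypothetical helper: succeeds if the certificate served for domain $1 stays
# valid for at least $2 more days (uses the openssl x509 -checkend option).
cert_valid_for_days() {
  echo | openssl s_client -servername "$1" -connect "$1:443" 2>/dev/null \
    | openssl x509 -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# Usage (requires network access and openssl):
#   cert_valid_for_days <domain> 14 || echo "certificate expires within 14 days"
```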
Service-Specific Issues
See the individual service pages for detailed troubleshooting.
Memory/Embedding Type Errors
Symptoms: Norns agent fails with TypeError on memory operations
TypeError: Cannot convert Python list to PostgreSQL type
Diagnosis:
# Check if pgvector is registered
ssh ravenhelm@100.115.101.81 "docker logs norns-agent 2>&1 | grep -i pgvector"
# Test embedding query
ssh ravenhelm@100.115.101.81 "docker exec -i postgres psql -U ravenhelm -d ravenmaskos -c 'SELECT COUNT(*) FROM episodic_memories;'"
Solution:
Ensure pgvector.asyncpg.register_vector() is called during startup in main.py and that embeddings are passed as Python lists, not strings:
# CORRECT
embedding = [0.123, 0.456, ...] # Python list
await db.execute("INSERT ... VALUES (:embedding)", {"embedding": embedding})
# INCORRECT
embedding_str = '[' + ','.join(map(str, embedding)) + ']' # Don't do this
See Norns Memory System for details.
OpenFGA Permission Denied
Symptoms: User gets 403 Forbidden on resources they should be able to access
Diagnosis:
# Check authorization tuples
ssh ravenhelm@100.115.101.81 "docker exec -i postgres psql -U ravenhelm -d openfga -c \"
SELECT COUNT(*) FROM tuple WHERE user_object_id = 'user:USERID';
\""
# Check OpenFGA logs
ssh ravenhelm@100.115.101.81 "docker logs openfga 2>&1 | grep -i error"
Solutions:
- Verify that an authorization tuple exists for the user-resource relationship
- Check user_id matches auth_provider_id from Zitadel (not UUID)
- Ensure OpenFGA model ID is correct:
01KE1W3RJH1E13G84N3ERN5XDN
See OpenFGA for details.