症状:服务启动失败,错误日志显示权限问题或损坏的表
sudo systemctl status mysql
sudo tail -n 50 /var/log/mysql/error.log
修复方法:
# 检查并修复权限
sudo chown -R mysql:mysql /var/lib/mysql
# 修复损坏的表
sudo mysqlcheck --all-databases --auto-repair -u root -p
# 强制恢复模式(严重损坏时)
sudo mysqld_safe --skip-grant-tables &
sudo mysql_upgrade -u root -p
症状:"Too many connections"错误
修复方法:
-- 临时增加连接数
SET GLOBAL max_connections = 500;
-- 永久修改(编辑my.cnf)
[mysqld]
max_connections = 500
症状:数据库操作失败,磁盘空间已满
修复方法:
# 查找大表
SELECT table_schema, table_name,
ROUND(data_length/1024/1024,2) as size_mb
FROM information_schema.tables
ORDER BY data_length DESC;
# 清理二进制日志
PURGE BINARY LOGS BEFORE '2023-01-01 00:00:00';
症状:服务启动失败,可能有WAL日志问题
修复方法:
# 检查日志
sudo tail -n 100 /var/log/postgresql/postgresql-13-main.log
# 尝试恢复模式
sudo -u postgres pg_resetwal -f /var/lib/postgresql/13/main/
症状:查询挂起,系统响应缓慢
修复方法:
-- 查看当前锁
SELECT blocked_locks.pid AS blocked_pid,
blocking_locks.pid AS blocking_pid
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_locks blocking_locks
ON blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
AND blocking_locks.pid != blocked_locks.pid;
-- 终止阻塞进程
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction';
症状:启动时报错,无法读取数据文件
修复方法:
# 修复数据库
sudo mongod --repair --dbpath /var/lib/mongodb
# 修复后启动
sudo systemctl start mongod
症状:复制延迟高,成员状态异常
修复方法:
// 检查复制集状态
rs.status()
// 强制重新同步
rs.syncFrom("primary-member-hostname:27017")
症状:写入失败,"OOM command not allowed"错误
修复方法:
# 检查内存使用
redis-cli info memory
# 临时解决方案
redis-cli config set maxmemory 4gb
# 持久化配置
echo "maxmemory 4gb" >> /etc/redis/redis.conf
症状:Redis拒绝启动,AOF文件错误
修复方法:
# 修复AOF文件
redis-check-aof --fix /var/lib/redis/appendonly.aof
# 或者使用最后有效的RDB快照
mv /var/lib/redis/appendonly.aof /var/lib/redis/appendonly.aof.bak
cp /var/lib/redis/dump.rdb /var/lib/redis/dump.rdb.bak
定期备份:
# MySQL
mysqldump -u root -p --all-databases > full_backup.sql
# PostgreSQL
pg_dumpall -U postgres > full_backup.sql
# MongoDB
mongodump --out /backup/mongodb-$(date +%F)
# Redis
redis-cli save
cp /var/lib/redis/dump.rdb /backup/redis-$(date +%F).rdb
监控设置:
定期维护:
日志轮转:
# 配置logrotate
sudo nano /etc/logrotate.d/mysql
通过以上方法,可以解决Linux服务器上大多数常见的数据库故障,并预防潜在问题的发生。