插件窝 干货文章 Linux服务器上常见的数据库故障及其修复方法

Linux服务器上常见的数据库故障及其修复方法

locks 修复 blocking blocked 306    来源:    2025-04-17

Linux服务器常见数据库故障及修复方法

MySQL/MariaDB常见故障

1. 数据库无法启动

症状:服务启动失败,错误日志显示权限问题或损坏的表

sudo systemctl status mysql
sudo tail -n 50 /var/log/mysql/error.log

修复方法

# 检查并修复权限
sudo chown -R mysql:mysql /var/lib/mysql

# 修复损坏的表
sudo mysqlcheck --all-databases --auto-repair -u root -p

# 强制恢复模式(严重损坏时)
sudo mysqld_safe --skip-grant-tables &
sudo mysql_upgrade -u root -p

2. 连接数耗尽

症状:"Too many connections"错误

修复方法

-- 临时增加连接数
SET GLOBAL max_connections = 500;

-- 永久修改(编辑my.cnf)
[mysqld]
max_connections = 500

3. 磁盘空间不足

症状:数据库操作失败,磁盘空间已满

修复方法

# 查找大表
SELECT table_schema, table_name, 
ROUND(data_length/1024/1024,2) as size_mb 
FROM information_schema.tables 
ORDER BY data_length DESC;

# 清理二进制日志
PURGE BINARY LOGS BEFORE '2023-01-01 00:00:00';

PostgreSQL常见故障

1. 数据库无法启动

症状:服务启动失败,可能有WAL日志问题

修复方法

# 检查日志
sudo tail -n 100 /var/log/postgresql/postgresql-13-main.log

# 尝试恢复模式
sudo -u postgres pg_resetwal -f /var/lib/postgresql/13/main/

2. 死锁问题

症状:查询挂起,系统响应缓慢

修复方法

-- 查看当前锁
SELECT blocked_locks.pid AS blocked_pid,
       blocking_locks.pid AS blocking_pid
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_locks blocking_locks 
ON blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
AND blocking_locks.pid != blocked_locks.pid;

-- 终止阻塞进程
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction';

MongoDB常见故障

1. 数据文件损坏

症状:启动时报错,无法读取数据文件

修复方法

# 修复数据库
sudo mongod --repair --dbpath /var/lib/mongodb

# 修复后启动
sudo systemctl start mongod

2. 复制集成员不同步

症状:复制延迟高,成员状态异常

修复方法

// 检查复制集状态
rs.status()

// 强制重新同步
rs.syncFrom("primary-member-hostname:27017")

Redis常见故障

1. 内存耗尽

症状:写入失败,"OOM command not allowed"错误

修复方法

# 检查内存使用
redis-cli info memory

# 临时解决方案
redis-cli config set maxmemory 4gb

# 持久化配置
echo "maxmemory 4gb" >> /etc/redis/redis.conf

2. AOF文件损坏

症状:Redis拒绝启动,AOF文件错误

修复方法

# 修复AOF文件
redis-check-aof --fix /var/lib/redis/appendonly.aof

# 或者使用最后有效的RDB快照
mv /var/lib/redis/appendonly.aof /var/lib/redis/appendonly.aof.bak
cp /var/lib/redis/dump.rdb /var/lib/redis/dump.rdb.bak

通用预防措施

  1. 定期备份

    # MySQL
    mysqldump -u root -p --all-databases > full_backup.sql
    
    # PostgreSQL
    pg_dumpall -U postgres > full_backup.sql
    
    # MongoDB
    mongodump --out /backup/mongodb-$(date +%F)
    
    # Redis
    redis-cli save
    cp /var/lib/redis/dump.rdb /backup/redis-$(date +%F).rdb
    
  2. 监控设置

    • 监控磁盘空间、内存使用和连接数
    • 设置关键指标的警报阈值
  3. 定期维护

    • 优化表/索引
    • 清理旧数据
    • 更新统计信息
  4. 日志轮转

    # 配置logrotate
    sudo nano /etc/logrotate.d/mysql
    

通过以上方法,可以解决Linux服务器上大多数常见的数据库故障,并预防潜在问题的发生。