Hardware configuration:
System tuning:
# Raise the file descriptor limits
echo "* soft nofile 1000000" >> /etc/security/limits.conf
echo "* hard nofile 1000000" >> /etc/security/limits.conf
# Tune kernel parameters
echo "vm.swappiness = 1" >> /etc/sysctl.conf
echo "net.core.somaxconn = 4096" >> /etc/sysctl.conf
echo "net.ipv4.tcp_max_syn_backlog = 4096" >> /etc/sysctl.conf
sysctl -p
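The values above only help once they are actually in effect: `sysctl -p` loads the kernel parameters immediately, while the limits.conf entries apply to new login sessions. A minimal sketch for comparing live values with the targets follows. The helper names `check_min` and `check_max` are hypothetical, not part of any tool, and a broker started by systemd is assumed to have its descriptor limit configured separately (for example via `LimitNOFILE`).

```shell
# check_min NAME ACTUAL REQUIRED (hypothetical helper)
# Succeeds when ACTUAL is "unlimited" or an integer >= REQUIRED.
check_min() {
  if [ "$2" = "unlimited" ] || [ "$2" -ge "$3" ]; then
    echo "OK $1=$2"
  else
    echo "LOW $1=$2 (want >= $3)"
    return 1
  fi
}

# check_max NAME ACTUAL LIMIT (hypothetical helper)
# Succeeds when ACTUAL is an integer <= LIMIT.
check_max() {
  if [ "$2" -le "$3" ]; then
    echo "OK $1=$2"
  else
    echo "HIGH $1=$2 (want <= $3)"
    return 1
  fi
}

# Example checks, run in a fresh login shell on the broker host:
#   check_min nofile "$(ulimit -n)" 1000000
#   check_min net.core.somaxconn "$(sysctl -n net.core.somaxconn)" 4096
#   check_min net.ipv4.tcp_max_syn_backlog "$(sysctl -n net.ipv4.tcp_max_syn_backlog)" 4096
#   check_max vm.swappiness "$(sysctl -n vm.swappiness)" 1
```

Each helper returns a non-zero status on a mismatch, so the checks can be chained in a provisioning script.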
Key server.properties settings:
# Log retention policy
log.retention.hours=168
# 1 GiB segment size (kept on its own line: properties files do not support trailing comments)
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
# Network and I/O
num.network.threads=8
num.io.threads=16
socket.send.buffer.bytes=1024000
socket.receive.buffer.bytes=1024000
socket.request.max.bytes=104857600
# Replication and ISR
default.replication.factor=3
min.insync.replicas=2
unclean.leader.election.enable=false
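With default.replication.factor=3 and min.insync.replicas=2, a producer using acks=all can keep writing while one replica is offline; if the two values were equal, a single broker outage would block such writes. The sketch below reads both keys from a properties file and reports that margin. `get_prop` and `check_isr` are hypothetical helper names, and the parser is deliberately simple (plain key=value lines, no line continuations).

```shell
# get_prop FILE KEY (hypothetical helper)
# Prints the value of KEY from a Java-style properties file; the last match wins.
get_prop() {
  awk -v k="$2" '
    /^[ \t]*[#!]/ { next }                 # skip comment lines
    {
      eq = index($0, "=")
      if (eq == 0) next                    # not a key=value line
      name = substr($0, 1, eq - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim whitespace around the key
      if (name == k) {
        value = substr($0, eq + 1)
        sub(/^[ \t]+/, "", value)          # trim leading whitespace of the value
        found = 1
      }
    }
    END { if (found) print value }
  ' "$1"
}

# check_isr FILE (hypothetical helper)
# Reports how many replicas may be offline before acks=all writes are rejected.
check_isr() {
  local rf isr
  rf="$(get_prop "$1" default.replication.factor)"
  isr="$(get_prop "$1" min.insync.replicas)"
  if [ -z "$rf" ] || [ -z "$isr" ]; then
    echo "MISSING: set both default.replication.factor and min.insync.replicas"
    return 2
  fi
  if [ "$isr" -lt "$rf" ]; then
    echo "OK: acks=all writes tolerate $((rf - isr)) replica(s) offline"
  else
    echo "RISK: min.insync.replicas=$isr leaves no margin with replication factor $rf"
    return 1
  fi
}

# Example (the path is an assumption about the installation layout):
#   check_isr config/server.properties
```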
JVM tuning:
# Adjust in kafka-server-start.sh
export KAFKA_HEAP_OPTS="-Xms8g -Xmx8g -XX:MetaspaceSize=96m -XX:+UseG1GC"
export KAFKA_JVM_PERFORMANCE_OPTS="-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35"
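Key monitoring metrics:
Recommended tools:
Cluster deployment recommendations:
Partitioning strategy:
# Set the partition count, replication factor, and min ISR when creating a topic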
bin/kafka-topics.sh --create --topic your_topic \
--partitions 6 --replication-factor 3 \
--config min.insync.replicas=2 \
--bootstrap-server your_brokers:9092
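For a sense of how such a topic spreads across the cluster: 6 partitions with replication factor 3 yield 18 partition replicas in total. The arithmetic can be sketched as below. The broker counts in the examples are assumptions (the cluster size is not stated here), `replica_layout` is a hypothetical helper, and the per-broker figures are upper bounds under an evenly balanced assignment.

```shell
# replica_layout PARTITIONS REPLICATION_FACTOR BROKERS (hypothetical helper)
# Prints the total replica count and the per-broker maxima for a balanced layout.
replica_layout() {
  local partitions="$1" rf="$2" brokers="$3" total
  if [ "$rf" -gt "$brokers" ]; then
    # Each replica of a partition must live on a different broker.
    echo "ERROR: replication factor $rf exceeds broker count $brokers"
    return 1
  fi
  total=$((partitions * rf))
  # Ceiling division gives the most replicas/leaders any single broker holds.
  echo "total_replicas=$total" \
       "max_replicas_per_broker=$(( (total + brokers - 1) / brokers ))" \
       "max_leaders_per_broker=$(( (partitions + brokers - 1) / brokers ))"
}

# Example with an assumed 3-broker cluster:
#   replica_layout 6 3 3
#   total_replicas=18 max_replicas_per_broker=6 max_leaders_per_broker=2
```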
Routine maintenance:
bin/kafka-leader-election.sh --bootstrap-server broker1:9092 --election-type preferred --all-topic-partitions
Disaster recovery:
Benchmarking tools:
# Producer test
bin/kafka-producer-perf-test.sh --topic test --num-records 1000000 \
--record-size 1000 --throughput -1 --producer-props \
bootstrap.servers=your_brokers:9092 acks=all
# Consumer test
bin/kafka-consumer-perf-test.sh --topic test --messages 1000000 \
--bootstrap-server your_brokers:9092
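To put the benchmark parameters in perspective: 1,000,000 records of 1,000 bytes amount to 10^9 bytes of payload, roughly 953.67 MiB. The sketch below converts a run's record count, record size, and elapsed time into records per second and MiB per second. `perf_summary` is a hypothetical helper, and the 20-second duration in the example is made up for illustration, not a measured result.

```shell
# perf_summary RECORDS RECORD_SIZE_BYTES ELAPSED_SECONDS (hypothetical helper)
# Prints total payload, record rate, and byte rate for a benchmark run.
perf_summary() {
  awk -v n="$1" -v size="$2" -v secs="$3" 'BEGIN {
    if (secs <= 0) { print "ERROR: elapsed seconds must be positive"; exit 1 }
    mib = n * size / 1048576               # bytes -> MiB
    printf "%.2f MiB total, %.0f records/s, %.2f MiB/s\n", mib, n / secs, mib / secs
  }'
}

# Example with an assumed 20-second run:
#   perf_summary 1000000 1000 20
#   953.67 MiB total, 50000 records/s, 47.68 MiB/s
```

Comparing these figures between runs (for instance with different acks settings) is usually more informative than any single number.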
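Taken together, these measures can significantly improve the stability and reliability of a Kafka cluster and help keep the messaging system running steadily under high load.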