| | | | | Node down → check system.log, hardware, network |
| | cassandra_gossip_phi_accrual_failures > 0 | | | Network partition → check phi_convict_threshold |
| | cassandra_client_request_latency_99th_percentile{scope="Read"} > 100 | | | Compaction falling behind, slow disk, hotspots → run nodetool tablestats |
| | cassandra_client_request_latency_99th_percentile{scope="Write"} > 50 | | | Memtable flush backlog, slow CommitLog disk |
| | cassandra_dropped_message_dropped_total > 0 | | | |
| | cassandra_compaction_pending_tasks > 100 | | | I/O saturation → raise concurrent_compactors or throttle compaction |
| | rate(cassandra_compaction_bytes_compacted[1h]) < expected throughput | | | |
| | node_filesystem_free_bytes{mountpoint="/data"} / node_filesystem_size_bytes < 0.15 | | | |
| | cassandra_storage_total_hints_count > 100000 | | | |
| | increase(jvm_gc_pause_seconds_sum[5m]) > 5 | | | Heap too large, THP enabled → tune G1 parameters, disable THP |
| | jvm_memory_heap_used / jvm_memory_heap_max > 0.8 | | | Memtable bloat → switch to offheap_objects |
| | cassandra_thread_pools_pending_tasks{pool="MutationStage"} > 200 | | | Write overload → raise concurrent_writes |
| | rate(cassandra_thread_pools_dropped_tasks{pool="ReadStage"}[5m]) > 0 | | | |
| | cassandra_cache_key_hits_rate < 0.7 | | | |
| | cassandra_client_request_unavailable_total > 0 | | | Nodes DOWN, QUORUM not reached → check RF and consistency level |
| | cassandra_memtable_off_heap_used > cassandra_memtable_off_heap_limit * 0.9 | | | Slow trie reclamation → upgrade or adjust memtable_allocation_type |
| | cassandra_compaction_unified_strategy_switch_failures > 0 | | | |
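
As a rough illustration, one row of the table could be written as a Prometheus alerting rule along the following lines. This is only a sketch: it assumes the expressions are PromQL evaluated by Prometheus, and the group name, alert name, `for` duration, `severity` label, and annotation wording are all hypothetical values chosen for the example rather than taken from the table.

```yaml
# Sketch of a Prometheus rule file for the compaction-backlog row.
# Only the `expr` line and the remediation hint come from the table;
# every other value is an illustrative assumption.
groups:
  - name: cassandra-example            # hypothetical group name
    rules:
      - alert: CassandraCompactionBacklog   # hypothetical alert name
        expr: cassandra_compaction_pending_tasks > 100
        for: 10m                       # assumed hold time before firing
        labels:
          severity: warning            # assumed severity
        annotations:
          summary: "Compaction backlog on {{ $labels.instance }}"
          description: "I/O saturation is likely; consider raising concurrent_compactors or throttling compaction."
```

If `promtool` is available, `promtool check rules <file>` can be used to validate a file like this before loading it.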