Runaway Process: One Core Pinned at 100%
Easy

Problem

Alerts fired: report-host is sluggish and the nightly report job is way behind. The box has a single CPU and something is sitting on it: the load average that uptime reports is around 1.4, which on one core means there is more runnable work than the core can serve. Nothing is out of memory and the disk is fine; this is pure CPU.

Find the process that's pinning the core, confirm it's the cause, and kill it so the core frees up.

Initial setup

  • Host: report-host, Alpine, 1 vCPU.
  • Several app worker processes plus the usual init/cron/shell.
  • One process is stuck at ~100% CPU (it never sleeps).

Acceptance

You've solved it when:

  • You've used top to spot the process at ~99–100% %CPU with STAT R and read its PID (it's the report/aggregate.py pass, PID 1847).
  • You've confirmed it's CPU saturation: free shows plenty of available memory and df shows the disk is fine, so it's neither RAM nor disk.
  • You've killed that PID and shown via top that the core dropped back to idle (load average lags, so don't expect it to fall instantly).
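
If you'd rather not eyeball the top screen, the first step can be sketched in shell. This is only an illustration, not the expected solution: busiest_pid is a hypothetical helper name, and it assumes top's batch output has a header row containing PID and %CPU, with whitespace-separated fields that line up with that header (which should hold for BusyBox top on Alpine as long as the STAT field has no embedded spaces).

```shell
#!/bin/sh
# Sketch only: busiest_pid is a hypothetical helper, not part of the exercise.

# Read `top -b -n 1` output on stdin and print the PID of the row with the
# highest %CPU. Column positions are taken from the header row, so nothing
# is hard-coded about which field holds what.
busiest_pid() {
    awk '
        # Until the header row is seen, look for one that names PID and %CPU.
        !found {
            pidcol = 0; cpucol = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "PID")  pidcol = i
                if ($i == "%CPU") cpucol = i
            }
            found = (pidcol && cpucol)
            next
        }
        # Process rows only: the PID field must be all digits, which skips
        # summary lines and repeated headers.
        $pidcol ~ /^[0-9]+$/ {
            cpu = $cpucol
            sub(/%$/, "", cpu)          # strip a trailing "%" if present
            if (cpu + 0 > max) { max = cpu + 0; pid = $pidcol }
        }
        # Print nothing when every process is at 0%.
        END { if (pid != "") print pid }
    '
}

# Hypothetical usage on report-host:
#   pid=$(top -b -n 1 | busiest_pid)
#   free -m; df -h     # rule out RAM and disk before blaming the CPU
#   kill "$pid"        # SIGTERM first; reach for kill -9 only if it is ignored
#   top -b -n 1        # the core should now show as mostly idle
```

Check the PID it prints against what top shows before killing anything; a helper like this only reports the busiest row, it doesn't know whether that process is the real culprit. The load average is an exponentially damped average, so it drifts down over a minute or so after the kill even though the core frees up right away.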