A Heartbeat for the Main Thread
If you can’t measure it, you can’t fix it.
The sidebar was slow. Not always. Not predictably. During large scans, with duplicate detection running, the sidebar would stutter. Scrolling felt wrong. Selections lagged. The app was doing things, but the UI was not keeping up. On macOS, the main thread owns the UI: block it with computation, and every frame, every scroll event, every selection highlight waits.
“Slow” is not a bug report. You cannot fix “slow.” You can fix “main thread blocked for 340ms every time a duplicate hash resolves.”
PerfLogWriter was the first thing I built to address this. Not a fix: a measurement tool.
It is a simple file-based logger. Events are written to disk with timestamps and categories. Each log file is capped at a size limit (writing stops once the cap is reached), and old files are pruned across sessions, so the logs do not grow unbounded. Entries are structured, one per line, parseable by the analysis scripts in Scripts/Performance/.
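A minimal sketch of that shape of logger, under stated assumptions: the type name, file layout, and size cap below are illustrative, not PerfLogWriter's actual API.

```swift
import Foundation

// Illustrative sketch of a size-capped, line-oriented perf logger.
// One structured entry per line: timestamp, category, message.
struct PerfLog {
    let url: URL
    let maxBytes = 5_000_000  // assumed cap; writing stops when reached

    func write(_ category: String, _ message: String) {
        // Respect the size cap: once the file is full, drop new entries.
        let attrs = try? FileManager.default.attributesOfItem(atPath: url.path)
        let size = (attrs?[.size] as? Int) ?? 0
        guard size < maxBytes else { return }

        if !FileManager.default.fileExists(atPath: url.path) {
            FileManager.default.createFile(atPath: url.path, contents: nil)
        }
        let line = "\(Date().timeIntervalSince1970) \(category) \(message)\n"
        if let data = line.data(using: .utf8),
           let handle = try? FileHandle(forWritingTo: url) {
            handle.seekToEndOfFile()
            handle.write(data)
            try? handle.close()
        }
    }
}
```

One entry per line keeps the analysis side trivial: a grep or a line-by-line script is enough, no parser needed.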
The key addition was a heartbeat. Same principle as a health-check probe: if the response is late, something is blocking.
A GCD (Grand Central Dispatch) timer source on DispatchQueue.main fires every 500ms. Each time it fires, it records how long it actually waited. On an uncontested main thread, the timer fires close to on time. When the main thread is blocked, the timer fires late. The difference between the scheduled interval and the actual interval is the stall duration.1
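The mechanism can be sketched in a few lines. This is a hypothetical reconstruction, not the app's actual code; the class name and the 100ms jitter threshold are assumptions.

```swift
import Dispatch
import Foundation

// Sketch: a GCD timer on the main queue records how long each fire
// actually waited. A late fire means the main thread was blocked.
final class Heartbeat {
    private let interval: TimeInterval = 0.5   // scheduled every 500ms
    private let threshold: TimeInterval = 0.1  // ignore minor scheduling jitter
    private var lastFire = Date()
    private var timer: DispatchSourceTimer?

    func start() {
        lastFire = Date()
        let timer = DispatchSource.makeTimerSource(queue: .main)
        timer.schedule(deadline: .now() + interval, repeating: interval)
        timer.setEventHandler { [weak self] in
            guard let self else { return }
            let actual = Date().timeIntervalSince(self.lastFire)
            self.lastFire = Date()
            // Deviation from the scheduled interval is the stall duration.
            if actual - self.interval > self.threshold {
                print("MAIN_THREAD_BLOCKED \(Int(actual * 1000))ms " +
                      "(expected \(Int(self.interval * 1000))ms)")
            }
        }
        timer.resume()
        self.timer = timer
    }
}
```

The timer must live on the main queue: the whole point is that its event handler cannot run until the main thread is free.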
During duplicate detection on a 5.2M-node scan (M3 Pro, Debug build), with six concurrent hash workers feeding results to the UI, the heartbeat captured this entry:

```
MAIN_THREAD_BLOCKED 840ms (expected 500ms)
```
The timer expected to fire after 500ms but actually fired after 840ms: the main thread was blocked for the extra 340ms. If that line is not in the log, the main thread was not blocked. If it is, you know exactly when and for how long. This turns a vague feeling into a specific event with a timestamp.
The logs pointed straight at the cause.
Duplicate detection runs a TaskGroup with six workers. Each worker hashes a file, computes a digest, and reports a result. Six workers producing results as fast as they can hash. The original implementation called back into the ViewModel for each result:
```swift
// Bad: per-result mutation triggers a SwiftUI diff on every iteration
for await result in group {
    self.cleanupResult = result  // triggers re-render every time
}
```
With thousands of duplicate candidates, this meant thousands of forced main-thread hops. In Swift’s structured concurrency, each await resumption in a for await loop yields back to the run loop2, giving SwiftUI a chance to schedule a render pass for every single result.
If the mutations had landed synchronously in one run-loop turn, SwiftUI would coalesce them into a single diff. The async iteration prevents that. The main thread never got a quiet moment to do its actual job.
The fix is batching:
```swift
// Good: batch updates every 250ms (simplified; real code uses a Task.sleep loop)
@MainActor
func flushResults(_ pending: [Result]) {
    // All mutations in a single flush land in one run-loop turn,
    // so SwiftUI coalesces them into a single diff.
    self.results.append(contentsOf: pending)
}

// In the TaskGroup context: accumulate locally instead of mutating
// @MainActor state per result. A separate task calls flushResults(_:)
// with the buffer every 250ms; a final flush catches the remainder.
var buffer: [Result] = []
for await result in group {
    buffer.append(result)
}
await flushResults(buffer)
```
Instead of updating state per result, accumulate results in a local buffer. Every 250ms, flush the buffer to @MainActor state (the main-thread-only context) in a single batch. SwiftUI runs one diff per flush, not one per result. The main thread has time to breathe between batches.
After this change, the MAIN_THREAD_BLOCKED entries disappeared from the duplicate detection phase. Before: 200–400ms of stall per heartbeat tick, three severe hangs (>2s) per session. After the 250ms batch flush: zero stall events across three consecutive runs. The sidebar scrolled smoothly during scans.
The heartbeat caught problems that my own usage patterns never triggered consistently.
The logs also surfaced a metric I had initially misread.
`sidebar body_eval selection_roundtrip=Xms` records the elapsed time since the previous sidebar evaluation. Early on I saw values like 17,000ms and assumed something was wrong: no computation should take 17 seconds in a sidebar render.
It does not mean that. The number is time since the last evaluation, not time the evaluation took. A value of 17,000ms means the sidebar was not re-evaluated for 17 seconds, because nothing in it changed. When the scan is running and the sidebar has not yet received new data, there is nothing to re-render. The absence of re-renders is not a bug: SwiftUI’s @Observable tracking only re-evaluates views whose state actually changed. No new data, no re-render.
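The no-change-no-render behavior is concrete enough to show. A minimal sketch, assuming the macOS 14+ Observation framework; the model and view names are invented for illustration:

```swift
import SwiftUI
import Observation

// With @Observable tracking, a view's body re-runs only when state
// the body actually reads changes.
@Observable final class ScanModel {
    var sidebarItems: [String] = []  // read by the sidebar below
    var filesScanned = 0             // NOT read by the sidebar
}

struct SidebarView: View {
    let model: ScanModel

    var body: some View {
        // Incrementing model.filesScanned during a scan does not
        // re-evaluate this body; only changes to sidebarItems do.
        List(model.sidebarItems, id: \.self) { Text($0) }
    }
}
```

So while a scan churns through files and bumps counters the sidebar never reads, the sidebar's body is simply never called, and the elapsed-since-last-evaluation metric climbs.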
Misreading that metric would have sent me on a two-day detour optimizing code that was already working correctly. The most dangerous performance metric is the one you think you understand.
One practical gotcha: the app is sandboxed, so logs land in the container path, not where you might expect.3 If you are tailing logs and they are not updating, you are probably looking in the wrong directory.
The batch flush pattern applies anywhere a high-frequency producer feeds a UI consumer. The rule: do not let background work drive UI update frequency. 250ms is imperceptible to a human and gives the main thread room to handle input. Progress counters jump in steps rather than ticking smoothly, but no one notices.
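A generic form of the pattern can be sketched as follows. This is not the app's code: the actor, the function, and their names are assumptions, and `Task.sleep(for:)` requires macOS 13+.

```swift
import Foundation

// Producers of any kind append to the actor; the UI side drains it
// on a fixed cadence, so background work never drives UI frequency.
actor BatchBuffer<T: Sendable> {
    private var items: [T] = []
    func add(_ item: T) { items.append(item) }
    func drain() -> [T] {
        defer { items.removeAll() }
        return items
    }
}

// Flush loop: every 250ms, hand whatever accumulated to the UI in
// one batch. One diff per flush, not one per item.
func startFlushLoop<T: Sendable>(
    _ buffer: BatchBuffer<T>,
    apply: @escaping @Sendable @MainActor ([T]) -> Void
) -> Task<Void, Never> {
    Task {
        while !Task.isCancelled {
            try? await Task.sleep(for: .milliseconds(250))
            let batch = await buffer.drain()
            if !batch.isEmpty { await apply(batch) }
        }
    }
}
```

Cancelling the returned task stops the loop; a final `drain()` plus `apply` on teardown would catch any remainder.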
Without the heartbeat log, I would have optimized the sidebar rendering. I was already eyeing List performance, data structures, lazy loading. I had theories. They were all wrong. The rendering was fine. The problem was that the rendering was being triggered thousands of times per second.
Why build a custom heartbeat logger when Instruments exists? I did use Instruments: Time Profiler, Allocations, the hang detection template. They confirmed what the logs already showed. But the PerfLogWriter + heartbeat approach had one advantage that mattered more in practice: it ran during every debug session without attaching anything. No Instruments process competing for CPU, no post-hoc trace analysis. Just a grep through a log file. For the day-to-day development loop (run, reproduce, check the log), that immediacy was worth more than Instruments’ richer visualization. The trade-off: the heartbeat’s own 500ms timer competes for main-thread scheduling, so it is not zero-overhead either. It is lighter than Instruments, not free.
PerfLogWriter started as a quick hack: a file logger and a timer. It grew. The sidebar optimization that followed (flat list rendering, a single filterTree() call, 67% CPU reduction per keystroke) came directly from patterns the heartbeat surfaced. Over time, systematic baselines replaced ad-hoc log inspection: each release candidate got a profile run against a reference scan, and regressions showed up as numbers, not feelings.
Instruments came later for the problems the 500ms granularity could not catch: call-tree profiling for hot spots, system-level correlation for thread scheduling issues. In hindsight, I should have reached for Instruments sooner, especially for the subtler scheduling problems. The two approaches complement each other: the heartbeat for continuous, low-overhead monitoring; Instruments for surgical deep dives. I just reached for the simpler one first, and it happened to be enough for the first round of fixes.
References
- Analyze hangs with Instruments: WWDC 2023
- Track down hangs with Xcode and on-device detection: WWDC 2022
- Instruments: Apple Developer Documentation
- Event loop: Wikipedia
Footnotes
1. The heartbeat measures a lower bound on stall duration: a block that starts and ends between two 500ms fires may register as a smaller deviation than its actual length. For the coarse-grained detection this was built for (200ms+ blocks during duplicate detection), the 500ms interval was sufficient. ↩
2. The run loop is the main thread's event-processing cycle: it handles UI events, timer callbacks, and scheduled work in sequence. Each time an `await` resumes, it re-enters the run loop as a new unit of work, giving SwiftUI a chance to schedule a render pass before the next result arrives. ↩
3. Logs go to `~/Library/Containers/com.renala.app/Data/Library/Application Support/Renala/perf-logs/`. To enable the heartbeat, use `defaults write ~/Library/Containers/com.renala.app/Data/Library/Preferences/com.renala.app.plist enablePerfLogging -bool true` (the naive `defaults write com.renala.app` writes to the non-sandboxed domain). Restart the app to start the heartbeat. ↩
