TECHNOLOGY · BITE · 2 MIN · INTERMEDIATE

The Leap Second That Broke Reddit

On June 30, 2012, clocks got an extra second. Linux treated it as an infinite loop.

At the end of June 30, 2012, the International Earth Rotation and Reference Systems Service (IERS) inserted a leap second into UTC. After 23:59:59, clocks read 23:59:60 for one second before rolling over to 00:00:00. Earth's rotation runs slightly slower than atomic clocks assume, and the extra second kept UTC in step with the planet.
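One reason the extra second trips up software is that many time representations have no slot for it. The sketch below illustrates this using only Python's standard library; the helper name `datetime_accepts` is made up for this example.

```python
import calendar
from datetime import datetime


def datetime_accepts(second: int) -> bool:
    """Return True if datetime can represent 2012-06-30 23:59:<second>."""
    try:
        datetime(2012, 6, 30, 23, 59, second)
        return True
    except ValueError:
        # datetime only allows seconds in the range 0..59.
        return False


# POSIX-style timestamps count exactly 86,400 seconds per day, so the
# leap second gets no number of its own: 23:59:60 lands on the same
# value as 00:00:00 of the following day.
leap_tick = calendar.timegm((2012, 6, 30, 23, 59, 60))
next_midnight = calendar.timegm((2012, 7, 1, 0, 0, 0))

print(datetime_accepts(59), datetime_accepts(60))
print(leap_tick == next_midnight)
```

Because two distinct moments share one timestamp, a system clock has to repeat or stall a second to absorb the leap, and code that assumes time only moves forward can misbehave.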

Many Linux servers did not handle it well. A bug in the kernel's hrtimer code left the high-resolution timer subsystem out of step with the system clock once the extra second had been applied. Timers set for less than a second ahead appeared to be already in the past, so they fired immediately, and programs that slept on short timeouts woke up again and again. CPU usage pinned at 100%. The machines kept running but burned power doing nothing.

Reddit went down. LinkedIn, Mozilla's servers, Yelp, and Foursquare were all unavailable for minutes to hours. Qantas check-in systems in Australia failed, delaying passengers. Sysadmins around the world spent the night restarting boxes or resetting the clock by hand.

Google had seen the problem coming and, ahead of the 2008 leap second, devised a workaround it called leap smearing. Instead of inserting one full second at midnight, Google's NTP servers lengthened every second by a tiny fraction over a window of hours around the leap. Clients drifted smoothly through it without ever seeing 23:59:60. Other large operators later adopted the idea.
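The arithmetic behind a smear can be sketched in a few lines. This is a minimal illustration and not Google's implementation: it assumes a linear smear over a 24-hour window, and the constants and the function name `smeared_elapsed` are hypothetical.

```python
CLOCK_WINDOW = 86_400            # seconds the smeared clock shows during the window (assumed 24 h)
REAL_WINDOW = CLOCK_WINDOW + 1   # SI seconds that actually pass, leap second included


def smeared_elapsed(real_elapsed: float) -> float:
    """Smeared clock reading, in seconds since the smear window opened.

    `real_elapsed` is the number of true (SI) seconds since the window opened.
    """
    if real_elapsed <= 0:
        return float(real_elapsed)        # before the window: no adjustment
    if real_elapsed >= REAL_WINDOW:
        return real_elapsed - 1.0         # after the window: the full leap is absorbed
    # Inside the window the clock runs slightly slow, so 86,401 real
    # seconds are displayed as 86,400 clock seconds.
    return real_elapsed * CLOCK_WINDOW / REAL_WINDOW


# Each displayed second lasts this much longer than a true second.
stretch_per_second = REAL_WINDOW / CLOCK_WINDOW - 1.0

print(f"stretch per second: {stretch_per_second * 1e6:.1f} microseconds")
print(f"lag at the halfway point: {REAL_WINDOW / 2 - smeared_elapsed(REAL_WINDOW / 2):.3f} s")
```

Under these assumptions each second is stretched by roughly 11.6 microseconds, and the smeared clock never repeats or skips a value. The trade-off is that during the window it disagrees with unsmeared UTC by up to one second.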

#leap-second #linux #timekeeping #outages
Sources
Ars Technica, Google