Dev Boots Linux 292,612 Times to Find Intel, AMD Kernel Bug
Red Hat Linux developer Richard W.M. Jones has shared an eyebrow-raising story about Linux bug hunting. Jones found a bug in Linux 6.4 that caused the kernel to hang about once in every 1,000 boots. He set out to pinpoint the bug and prove he had caught it red-handed. His headline-grabbing effort of booting Linux 292,612 times (plus 1,000 more to check for the bug) apparently took him “only 21 hours.” The bug also seems to occur less often on Intel hardware than on AMD-based machines.
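To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes each boot hangs independently with probability 1 in 1,000, which is a simplification and not something Jones states; the function names are made up for illustration.

```python
# Assumed per-boot hang probability, taken from the "about once in 1,000 boots" figure.
HANG_PROBABILITY = 1 / 1000

def p_no_hang(boots, p=HANG_PROBABILITY):
    """Chance that `boots` consecutive boots all succeed while the bug is still present,
    assuming every boot is an independent trial."""
    return (1 - p) ** boots

def boots_per_second(boots, hours):
    """Average boot throughput over the whole test run."""
    return boots / (hours * 3600)

if __name__ == "__main__":
    for n in (1_000, 10_000, 292_612):
        print(f"{n:>7} clean boots by luck alone: {p_no_hang(n):.3g}")
    print(f"throughput: {boots_per_second(292_612, 21):.2f} boots/s")
```

Under that assumption, 1,000 clean boots would still happen by luck more than a third of the time, which is why a much longer run is far more convincing. The throughput works out to just under four boots per second on average.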
Jones got his first whiff of this elusive but reproducible Linux boot bug when testing some server software: nbdkit (a server for the Network Block Device protocol, used to access block devices over the network) and libguestfs (tools for accessing and modifying virtual machine disk images) seemed to “hang randomly.” Although the loop test phase took only 21 hours (even though an astronomical 292,612 boots were started), Jones said, “it took me days to get to this point.” He says that it was useful for narrowing down the … The cause is claimed to be a regression in the printk time function. Reverting this code commit “fixes the problem,” Jones claims.
A clue as to the cause was that the hang always occurs at the same early stage of the boot process when starting under the latest qemu. Following this lead, Jones found that the easiest way to reproduce the hang is to run the guestfish command in a loop, with many instances running in parallel, and parse the output to detect when a boot hang occurs. Typically, he ran the guestfish loop 10,000 times as a viable threshold for collecting useful log data.
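The general shape of such a reproduction loop can be sketched in Python. This is not Jones's actual script: the guestfish arguments below are an assumption (the article names only the command), and the sketch detects a hang with a timeout instead of parsing the boot output as he did. The worker count and timeout are likewise arbitrary placeholders.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Assumed reproducer command; the exact guestfish arguments are not given in the article.
BOOT_CMD = ["guestfish", "-a", "/dev/null", "run"]

def boot_once(cmd, timeout):
    """Run one boot attempt; return True if it failed to finish within `timeout` seconds."""
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # on expiry the child is killed and TimeoutExpired is raised
        )
        return False
    except subprocess.TimeoutExpired:
        return True

def count_hangs(cmd, runs, workers, timeout):
    """Run `cmd` `runs` times, `workers` at a time, and count the attempts that hung."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda _: boot_once(cmd, timeout), range(runs)))

# Example (requires guestfish to be installed):
# hangs = count_hangs(BOOT_CMD, runs=10_000, workers=8, timeout=60)
# print(f"{hangs} hangs in 10,000 boots")
```

Threads are sufficient here because each worker spends its time waiting on a child process, so the parallelism comes from the boots themselves.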
Perhaps of interest to hardware enthusiasts, Jones says this strange boot hang occurs less frequently on Intel systems than on AMD systems. In any case, hopefully the exposure and identification of this bug means it gets squashed and never returns.