The most-executed bit of code in a mainframe operating system had a hidden bug that was causing processes to get "lost", i.e. they were not on any dispatchable queue. They were effectively zombie processes: they could not be communicated with and would never execute another instruction. I helped find the bug.
We were profiling the code to see which instructions were using the most time, so we could further optimize the routine. (It was already carefully handcrafted in assembler, aligned on cache lines, etc. to be as fast as possible.)
In the bar graph of time used per instruction, there was one instruction which never got executed. The strange part: it was in a section of code with no branches. That should be impossible. In a linear run of instructions, one of them was never found executing, while the ones before and after it were found executing millions of times.
The bug turned out to be a comment on the previous line of code. The comment extended one column too far, into the "continuation" column. This made the assembler treat the next line, which contained the instruction that never got executed, as a continuation of the comment. So even though the instruction appeared in the listing, no generated binary was shown to the left of that line in the assembler output. In other words, it was in the source but never assembled. Nobody looked over there, because we were reviewing the code, not the binary instruction stream generated from the code.
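To make the mechanism concrete, here is a minimal Python sketch of how a fixed-column assembler might join source lines. It is only an illustration under stated assumptions: the comment above does not name the assembler, so the continuation column is assumed to be column 72 (as in IBM-style card-image source), and the three instructions and their operands are made up for the demo.

```python
# Sketch only: column 72 as the continuation column is an assumption,
# and the instructions below are hypothetical, not the real routine.
CONTINUATION_COLUMN = 72  # 1-based column number


def logical_statements(lines):
    """Join physical lines into logical statements.

    A non-blank character in the continuation column means the next
    physical line is part of the same statement.
    """
    statements = []
    current = None
    for line in lines:
        padded = line.ljust(CONTINUATION_COLUMN)
        continued = padded[CONTINUATION_COLUMN - 1] != " "
        text = padded[:CONTINUATION_COLUMN - 1].rstrip()
        if current is None:
            current = text
        else:
            # The continued line is swallowed into the previous statement.
            current += " " + text.strip()
        if not continued:
            statements.append(current)
            current = None
    if current is not None:
        statements.append(current)
    return statements


def opcodes(lines):
    """Return the opcode field of every logical statement."""
    result = []
    for stmt in logical_statements(lines):
        fields = stmt.split()
        # A statement that starts in column 1 has a label before the opcode.
        idx = 0 if stmt.startswith(" ") else 1
        if len(fields) > idx:
            result.append(fields[idx])
    return result


first = "         L     R1,QHEAD         load queue head"
middle = "         LR    R3,R1            copy the pointer"
last = "         ST    R3,QTAIL         store new tail"

# Same middle line, but the comment is padded so that its last
# character lands exactly in the continuation column.
overlong = (middle + " ").ljust(CONTINUATION_COLUMN, "*")

good = [first, middle, last]
bad = [first, overlong, last]

print(opcodes(good))
print(opcodes(bad))
```

With the overlong comment, the third line still appears in the source, but it is absorbed into the second statement's comment, so only two instructions are assembled. A check that compares the number of source lines that look like instructions against the number of statements actually assembled would flag this kind of problem.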
That one missing instruction caused all the lost-process problems.
No-code-generated bug