Scientists Forced To Change Names Of Human Genes Because Of Microsoft's Failure To Patch Excel
from the code-is-law dept
Six years ago, Techdirt wrote about a curious issue with Microsoft's Excel. A default date conversion feature was altering the names of genes, because they looked like dates. For example, the tumor suppressor gene DEC1 (Deleted in Esophageal Cancer 1) was being converted to "1-DEC". Hardly a widespread problem, you might think. Not so: research in 2016 found that nearly 20% of 3500 papers taken from leading genomic journals contained gene lists that had been corrupted by Excel's re-interpretation of names as dates. Although there don't seem to be any instances where this led to serious errors, there is a natural concern that it could distort research results. The good news is this problem has now been fixed. The rather surprising news is that it wasn't Microsoft that fixed it, even though Excel was at fault. As an article in The Verge reports:
Help has arrived, though, in the form of the scientific body in charge of standardizing the names of genes, the HUGO Gene Nomenclature Committee, or HGNC. This week, the HGNC published new guidelines for gene naming, including for "symbols that affect data handling and retrieval." From now on, they say, human genes and the proteins they express will be named with one eye on Excel's auto-formatting. That means the symbol MARCH1 has now become MARCHF1, while SEPT1 has become SEPTIN1, and so on. A record of old symbols and names will be stored by HGNC to avoid confusion in the future.
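For anyone still handling older gene lists, the kind of symbol at risk can be approximated in a few lines of Python. This is only a rough sketch: the pattern below (an English month name or abbreviation followed by one or two digits) is an assumption made for illustration, not a description of Excel's actual, locale-dependent parsing rules, and the function names are hypothetical.

```python
import re

# Assumption: English month names/abbreviations followed by 1-2 digits
# roughly approximate the symbols Excel may reinterpret as dates.
# Excel's real conversion rules are not reproduced here.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# Month prefix, an optional hyphen, then one or two digits; case-insensitive.
DATE_LIKE = re.compile(
    r"(?:" + "|".join(MONTH_PREFIXES) + r")-?\d{1,2}",
    re.IGNORECASE,
)


def looks_like_date(symbol: str) -> bool:
    """Return True if the whole symbol matches the date-like pattern."""
    return DATE_LIKE.fullmatch(symbol.strip()) is not None


def flag_risky_symbols(symbols):
    """Return the symbols that a spreadsheet might read as dates."""
    return [s for s in symbols if looks_like_date(s)]


if __name__ == "__main__":
    sample = ["DEC1", "MARCH1", "SEPT1", "MARCHF1", "SEPTIN1", "HECA"]
    print(flag_risky_symbols(sample))
```

Run against the symbols mentioned above, this flags DEC1, MARCH1 and SEPT1, but not the new MARCHF1 or SEPTIN1, nor HECA.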
So far, 27 genes have been re-named in this way. Modifying gene names is not in itself unheard of. The Verge article notes that names that made sense to experts, but which might alarm or offend lay people, have also been changed from time to time:
"We always have to imagine a clinician having to explain to a parent that their child has a mutation in a particular gene," says [Elspeth Bruford, the coordinator of HGNC]. "For example, HECA [a cancer-related human gene] used to have the gene name 'headcase homolog (Drosophila),' named after the equivalent gene in fruit fly, but we changed it to 'hdc homolog, cell cycle regulator' to avoid potential offense."
It is nice to know that we won't need to worry about serious problems flowing from Excel's habit of automatically re-naming cell entries. But it's rather troubling that Microsoft doesn't seem to have thought the problem worthy of its attention or a fix, despite it being known for at least six years. It shows once again how people are being forced to adapt to the software they use, rather than the other way around. Or, as Lawrence Lessig famously wrote: "code is law".
Follow me @glynmoody on Twitter, Diaspora, or Mastodon.
Filed Under: autoconversion, dates, excel, gene names, genes, spreadsheets
Companies: microsoft