What3Words Sends Ridiculous Legal Threat To Security Researcher Over Open Source Alternative
from the never-use-what3words dept
A couple of years ago we wrote about What3Words, and noted that it was a clever system that created an easy way for people to share exact locations in an easily communicated manner (every bit of the globe can be described with just 3 words -- so something like best.tech.blog is a tiny plot near Hanover, Ontario). While part of this just feels like fun, a key part of the company's marketing message is that the system is useful in emergency situations where someone needs to communicate a very exact location quickly and easily.
However, as we noted in our article, as neat and clever as the idea is, it's very, very proprietary, and that could lead to serious concerns for anyone using it. In our article, we wrote about a bunch of reasons why What3Words and its closed nature could lead to problems -- including the fact that the earth is not static and things move around all the time, such that these 3 word identifiers may not actually remain accurate. But there were other problems as well.
And, apparently one of those problems is that they're censorial legal bullies. Zach Whittaker has the unfortunate story of how What3Words unleashed its legal threat monkeys on a security researcher named Aaron Toponce. Toponce had been working with some other security researchers who had been highlighting some potentially dangerous flaws in the What3Words system beyond those we had mentioned a few years back. The key problem was that some very similar 3 word combos were very close to one another, such that someone relying on them in an emergency could risk sending people to the wrong location.
The company insists that this is rare, but the research (mainly done by researcher Andrew Tierney) indicates otherwise. He seemed to find a fairly large number of similar 3 word combos near each other. You can really see this when Tierney maps out some closely related word combos:
When this happens, you get cells with these offset areas *very* closely matched.
We can see that the row above the banding has a "q" (the value on "n" on the lower left) that is approximately 14,560,000 lower than the cell below. pic.twitter.com/pYumzdxyTh
— Cybergibbons (@cybergibbons) April 27, 2021
In a follow up article, Tierney detailed a bunch of examples where this confusion could be dangerous. Some of them are really striking. Here's just one:
“I think I’m having a heart attack. I’m walking at North Mountain Park. Deep Pinks Start.” – 1053m.
(Try reading both out)
Anyway, Toponce had been tweeting about Tierney's findings, and talked about WhatFreeWords, which had been "an open-source, compatible implementation of the What3Words geocoding algorithm." It was a reverse engineered version of the proprietary What3Words system. That tool was created back in 2019, but a week after it went online, What3Words lawyers sent incredibly overbroad takedown letters about it to everyone who had anything even remotely connected to WhatFreeWords, and had it pulled offline basically everywhere.
First up: this is ridiculous. While reverse engineering is unfortunately fraught with legal risk, there are many areas in which it is perfectly legal. And it seems like the WhatFreeWords implementation should be legal. But it appeared to have been a fun side project, and not worth the legal headache.
Even though WhatFreeWords was disappeared from the world in late 2019, it appears that Toponce still had some of the code. So in tweeting about Tierney's research, he offered up the tool to researchers to help investigate more problems with What3Words, similar to what Tierney had found.
And that's when What3Words' lawyers pounced. And, in pouncing, the mere chilling effects of the legal threat worked:
I've been served legal threats by @what3words. Both via email and post.
I am complying with all their demands. This is not a battle worth fighting.
Just let it be known however, they are evil.
— Aaron Toponce ⚛️ (@AaronToponce) April 30, 2021
Toponce also admits he couldn't even sleep after receiving the threat letter. This is an underappreciated aspect of the insanely litigious nature of many censorial bullies these days. Even if you're in the right, getting sued can be completely destructive. Toponce was trying to help security researchers better research an application that is promoted for being safe, and security researchers should be allowed to make use of reverse engineering to do exactly that. But What3Words and their bullying lawyers made sure that's impossible.
To be fair to their bullying lawyers, the threat letter is not as aggressive as some others, and they even make it explicit that they are not seeking that Toponce stop criticizing the company:
In this connection, and to be clear, our client does not require the deletion of your criticism of and feedback in respect of its service.
But... it still makes pretty stringent demands.
i) delete all copies of "What Free Words" and any other works derivative of W3W's software and wordlist presently in your possession or under your control;
ii) confirm, to the best of your knowledge, the identities of all parties / individuals to whom you have provided copies or derivations of the software and/or wordlist;
iii) agree that you will not in the future make further copies or derivations of and/or distribute copies or derivations of the software and/or wordlist;
iv) delete any Tweets or other online references made to the copies / derivations of our client's software and wordlist and that are connected with or emanate from the "What Free Words", and agree not to make similar representations in the future.
Of course, there are some questions about what intellectual property is actually being infringed upon here as well. When the company's lawyers got the original WhatFreeWords site taken down, they claimed copyright and trademark rights, though extraordinarily broadly. They claim their own software is covered by copyright, but WhatFreeWords isn't using their software. They also claim that all the 3 word combos are covered by copyright and... eh... it might be in the UK where W3W is based, but in the US, it would be harder to claim that three random word combos are creative enough to get a copyright. Also, in the US there would be a strong fair use defense. Unfortunately, in the UK, there is a ridiculous concept known as "database rights" that lets you claim a right over a mere collection of things, even if you have no claim to the underlying rights. But, even so, it seems that there should be a fair use defense here. The UK has a fair dealing exception for research and private study, which seems like it should apply as well.
As for the trademark claims, well, no one's going to get confused about it, since it's pretty clear that WhatFreeWords was designed explicitly not to be from What3Words, and in this particular case, it's not being offered widely, just to knowledgeable security researchers. Even more insane: the original threat letter over WhatFreeWords claimed that there could be criminal penalties for violating consumer protection laws, and that's just insane.
Still, as Mike Dunford notes in his thread about this situation, W3W's decision to focus on locking up and threatening everyone perhaps explains why so few people know about or use What3Words. Imagine if they had built this as an open tool that others could build on and incorporate into other offerings. Then they could have others experiment and innovate and get more people to adopt it. By making it proprietary, and locking it down with threats and asshole lawyers, there's simply no reason to bother.
The only proper response to this is never, ever use What3Words for anything that matters. Beyond not giving in to censorial, abusive bullies, their legal reaction to a security researcher doing reverse engineering work to help find potentially dangerous problems with What3Words screams loudly to the world that What3Words has no confidence that its products are safe. They're scared to death of security researchers being able to really test their work.
Both of these reasons mean that What3Words should be remembered as little more than a failed.dumpster.fire rather than the cool.mapping.idea it could have been.
Filed Under: 3 words, aaron toponce, andrew tierney, bullies, copyright, location, open source, security, threats, trademark, whatfreewords
Companies: what3words
Reader Comments
rubs between his eyes again
something something over & over pretending it has never been done before despite mountains of carved out Malibu Hills explaining how this will end poorly.
[ link to this | view in thread ]
Mathematical distance theorems...
So three words isn't enough to effectively hash the geography of the entire earth...
Can the researchers
a) give a precise definition for "effectively"?
b) determine how many (latin, since I can't do arabic or Kanji!) words are needed to cover the earth?
[ link to this | view in thread ]
Re: Mathematical distance theorems...
Why wouldn't it? The English language has about 170,000 words in common use, which is more than adequate to hash the geography 28 times over if you use 3m²/10ft² squares. The number of words used by W3W should be ~56 000, which gives 175 616 000 000 000 unique 3m² location identifiers. The earth has a surface area of ~510 100 000 000 000m². You do the math.
Currently only English and Korean cover the entire globe with W3W's system, all other languages only cover land.
What the researchers pointed out was that some locations use similar words, which can lead to confusion, and since the system is touted as a way to easily tell rescue services where you are, it's a weakness which can have deadly consequences.
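The arithmetic in this comment can be checked in a few lines of Python. This is a rough sketch, assuming each cell is a 3 m x 3 m square (9 m² of area) and taking the ~56,000-word estimate and the surface-area figure from the comment at face value; neither is an official What3Words number.

```python
# Back-of-the-envelope check: grid cells on the globe vs. three-word combinations.
EARTH_SURFACE_M2 = 510_100_000_000_000  # ~510.1 trillion m^2, as quoted in the comment
CELL_SIDE_M = 3                         # assumption: square cells, 3 m on each side
WORDLIST_SIZE = 56_000                  # the commenter's estimate, not an official figure

cells = EARTH_SURFACE_M2 // (CELL_SIDE_M ** 2)  # number of 9 m^2 cells
addresses = WORDLIST_SIZE ** 3                  # ordered three-word combinations

print(f"cells needed:   {cells:,}")
print(f"3-word combos:  {addresses:,}")
print(f"coverage ratio: {addresses / cells:.2f}")
```

Under these assumptions the globe needs roughly 57 trillion cells, and a 56,000-word list yields about three times that many three-word combinations.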
[ link to this | view in thread ]
Since their whole sales pitch in favor of the word conversion over reading your GPS coordinates as numbers is to supposedly reduce errors from misheard digits, the existence of phonetic ambiguity in the wordlists just blows the whole purpose of the system out of the water. "pink.start" vs "pinks.start" is just unforgivable.
The whole concept seems to me more like a funny toy for kids to play with than a serious tool. If you were going to use an app to help people send their location to emergency responders, why should it produce a string of words you have to speak on the phone instead of transmitting the coordinates directly?
[ link to this | view in thread ]
Here's 3 words for them.
Out. Of. Business.
[ link to this | view in thread ]
Turns out, code is free
And unconstrained. After all, what are a few gits among coders? Not much effort at all.
The researcher can rest easy. The work is available in America, where things are free unless they tick off the GQP.
[ link to this | view in thread ]
I only got 3 letters
WTF
[ link to this | view in thread ]
So... great way for identifying your location, only you aren't allowed to use their word combinations to do that without being sued into the ground.
Seems like a winning business model.
[ link to this | view in thread ]
You can tell a lot about someone by their enemies
If a company treats security researchers as enemies that says louder than words that they are not offering a secure product/service.
[ link to this | view in thread ]
Opensource Alternatives
There's plenty of other solutions to the problem of locating people. Since you are more likely to have Google Maps on your phone than w3w, just use Google's 'Plus Codes' alternative. It's open source and on your phone already with Maps.
[ link to this | view in thread ]
Re: Mathematical distance theorems...
what3words maps 57 trillion squares to 3 word addresses by shipping a 40,000 word list. Due to the sheer size of the word list, it's cluttered with problems:
A smaller word list could easily address all these issues by adding one more word. Something like 4,096 words is more than enough to reach all 57 trillion 3m x 3m squares on the globe with 4 word addresses.
One of the problems is the lack of error detection. A more tightly curated list would greatly minimize that risk, but a "check word" could eliminate it. Simply hash the 4 words with SHA-256, read the 12 least or most significant bits to deterministically pick the 5th check word, similar to BIP39. The person on the phone would then communicate five words to emergency services, and the risk they're sent to the wrong location is non-existent.
Granted, 5 words is more to manage than 3, I admit. But it should not be hard to build a list of 4,096 common words of say 3-8 characters. The what3words word list minimum length is 4 characters, and the maximum is 18 characters. This is actually the breakdown:
The average character count is unnecessarily heavy in its current implementation, where few characters per word with one or two more words could on average be the same length to type.
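The check-word idea described in this comment could look something like the following. It is a minimal sketch, not anyone's actual implementation: the `WORDLIST` placeholder, the function names, and the choice to join words with a dot before hashing are all assumptions made for illustration.

```python
import hashlib

# Hypothetical 4,096-entry word list (12 bits per word); a real list would be curated.
WORDLIST = [f"word{i:04d}" for i in range(4096)]

def check_word(words, wordlist=WORDLIST):
    """Derive a check word from the 12 most significant bits of SHA-256."""
    if len(wordlist) != 4096:
        raise ValueError("wordlist must contain exactly 4,096 entries")
    digest = hashlib.sha256(".".join(words).encode("utf-8")).digest()
    index = int.from_bytes(digest[:2], "big") >> 4  # keep the top 12 of 16 bits
    return wordlist[index]

def verify(words_with_check, wordlist=WORDLIST):
    """Return True if the final word matches the check word of the others."""
    *address, check = words_with_check
    return check_word(address, wordlist) == check

address = ["word0001", "word0002", "word0003", "word0004"]
spoken = address + [check_word(address)]
print(spoken, verify(spoken))
```

One caveat on the design: a 12-bit check word does not make errors impossible. A randomly wrong address still passes verification about once in 4,096 tries, so the scheme sharply reduces the risk rather than eliminating it.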
[ link to this | view in thread ]
Re: I only got 3 letters
...you know those letters stand for words, right?
[ link to this | view in thread ]
Re: Turns out, code is free
What the hell are you talking about?
[ link to this | view in thread ]
Re: Re: Mathematical distance theorems...
Make no mistake: This is a mathematical encoding system, and it should be discussed technically in terms of its formal properties. Hamming distance between valid symbols comes to mind.
That math says coverage is 1/3 of the 3m squares on the planet under optimum assumptions. Check with a street beggar in Mumbai on whether that's enough for where she sleeps, then get back to me about your favorite tall building.
The code is not effective -- assuming all of us will recognize the same 56000 words, (fact not in evidence -- anecdotally, I think the literate recognize about 20K, but that's kinda ableist), -- because it's still ambiguous in the presence of noise and common human and machine errors.
I'd expect loss of singulars/plurals/tense information, not to mention homonyms and even near homonyms (pinks/sphynx/lynx/punks) to be a problem in a stressed communication environment.
All of these end up being computer-translated, so I'd want some redundancy too...not all code points should be valid locations, unless you expect them to always be copy/pasted.
This is why I think in terms of 4 words.
Google 'Plus codes' (thanks Anonymous Coward!) have some of these properties -- no 9/4 or digit 0 versus letter O or digit 1 versus lowercase l kind of issues.
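One way to look for the singular/plural and near-homonym problems raised in this comment is to scan a word list for pairs that sit one edit apart. The sketch below uses edit distance (one insertion, deletion, or substitution) rather than strict Hamming distance, and it compares spellings only, so it will miss words that sound alike but are spelled differently. The sample words are illustrative and not taken from any real list.

```python
from itertools import combinations

def within_one_edit(a, b):
    """True if a and b differ by at most one insertion, deletion, or substitution."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]          # one extra character in b at position i

def confusable_pairs(words):
    """List every pair of words that is a single edit apart (quadratic scan)."""
    return [(a, b) for a, b in combinations(sorted(words), 2) if within_one_edit(a, b)]

sample = ["pink", "pinks", "punks", "lynx", "start", "tart"]
print(confusable_pairs(sample))
```

A curated list would presumably reject any word that shows up in such a pair, though a phonetic comparison would be needed to catch cases like "pinks" and "sphynx".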
[ link to this | view in thread ]
Re: Re: Mathematical distance theorems...
Couldn't you do this with four words of slightly longer length, with the fourth word being the check word?
[ link to this | view in thread ]
Re:
I can answer part of this:
The entire database can fit on a device you carry with you, even if you don't have network access.
This means that as long as you have the app on your phone, all you need is some means of receiving the three words, and you can look up the authoritative address.
So this isn't for someone who's out and has a heart attack and calls 9-1-1. This is for the 9-1-1 dispatcher to broadcast a call on the radio with the three words. That way, the nearest responder knows exactly where they're going, even if they don't have a reasonable address or description.
But doing such a scheme without a check word just shows that they didn't have a mathematician involved in the design process.
Ironically, a mathematician just did a video on the logic behind such a system, as implemented in a card game:
https://www.youtube.com/watch?v=VTDKqW_GLkw
Applying this logic to GPS coordinates should be able to create a system MUCH better that uses four emojis instead of three words, with a guarantee that the word combinations will be unique across any specific geography. And it can't be patented because there's over a century of prior art.
[ link to this | view in thread ]
Re: Re: Mathematical distance theorems...
THANKS. This is the sort of discussion I wanted.
I was wondering if we could get a comparison of properties with similar algorithms, such as the Google "Plus Codes" first mentioned by a helpful Anonymous Coward below? And how well does this stuff work with the 57th floor of the Empire State building?
It might also be worth looking into the use cases...in the dysfunctional US telecom market, it's not too hard to get out of cell tower range in mountains.
[ link to this | view in thread ]
Re: Here's 3 words for them.
Well, that doesn't map to their system, but if you were reading the following over a noisy CB radio....
https://what3words.com/outs.soft.business
[ link to this | view in thread ]
Re: Re: Turns out, code is free
He's stating that the code is in a git repo for anyone to grab. He's intentionally not being any more direct than that so that he doesn't get sued.
[ link to this | view in thread ]
Re: Re: I only got 3 letters
https://what3words.com/laugh.outs.loud
[ link to this | view in thread ]
Re: Re: Re: Mathematical distance theorems...
So you actually don't need 57 trillion squares. The surface area of the globe is (only) 510 billion m^2. Dividing that into 3m^2 blocks like what3words did, you end up with 170 billion squares, a far cry from 57 trillion. This means you only need 5,540 unique words to generate all 3 word address possibilities.
So yes, you could still stick with 3 word addresses with an optional 4th check word.
[ link to this | view in thread ]
Re: Re: Re: Mathematical distance theorems...
No worse than latitude and longitude.
[ link to this | view in thread ]
Re: Re: Re: Mathematical distance theorems...
Okay. Going way too deep on the math and linguistics involved in an esoteric geopolitical topic that will be of no consequence to me in my personal life.
My specialty!
Let's start by putting down the numbers and facts:
W3W's system divides the world into, officially, 57,000,000,000,000 squares. I don't know whether this is exact, but at scales this large -- 500 billion is less than 1% -- a potential few billion one way or the other should not make any effective difference.
According to a 2016 study, the average young American adult has a reading vocabulary of 42,000 words. The speaking vocabulary is lower, in the range of 20 to 25 thousand.
The minimum number of words to reach 57 trillion, in combinations of 3, is 38,486.
The minimum number of words to reach 57 trillion, in combinations of 4, is the incredibly small figure of 2,748.
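The two minimum word counts above can be reproduced with exact integer arithmetic. This is a small sketch; the function name `min_vocabulary` is made up for illustration, and the 57 trillion figure is the one quoted in this comment.

```python
def min_vocabulary(cells, words_per_address):
    """Smallest n such that n ** words_per_address >= cells."""
    # Start from a floating-point root, then correct with exact integer checks.
    n = max(1, round(cells ** (1.0 / words_per_address)))
    while n ** words_per_address < cells:
        n += 1
    while n > 1 and (n - 1) ** words_per_address >= cells:
        n -= 1
    return n

CELLS = 57_000_000_000_000  # the 57 trillion squares quoted above

for k in (3, 4):
    print(f"{k} words per address: at least {min_vocabulary(CELLS, k):,} words")
```

The floating-point root is only a starting guess. The two loops settle the final answer using integers, so rounding error cannot shift the result by one.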
Assumptions about the suitability of words, that may vary based on individual interpretations of common sense:
Homonyms (different words, spelled the same way -- lead the horse to water, lead pipes) and homophones (different words, spelled differently, but commonly pronounced the same -- she's a witch, which is bad) are to be avoided when at all possible.
Words that are nearly homophones should be avoided as well. B, V, F, P, TH can all be very ambiguous depending on audio interference and speaker accent. TH, SH, S, Z, CH also represent a group of sounds that, while distinct in clear conditions, are similar enough to become indistinguishable at times.
But what about "dour"? That's a word, but it could be a similar typo for "door", "our", "pour", "odour", "sour", "four", "hour", "tour", or "your". If we want to be as clear and error-proof as possible, there can only be one three- or four-letter-word ending in "our". This obviously cramps the number of short words allowed, but that's fine. Typically, whenever encountering familiar expressions, moderately intelligent observers overcome significant character quantities without considerable confusion; successfully understanding embarrassingly long-winded communications.
No proper nouns. No John, no Janet, no California, no Wilson, no Samsung. No words that should be capitalized mid-sentence according to the AP style guide. If you need a reason why, I suggest you stop by a Starbucks.
It should not be possible to include plural, possessive, or tense variations of a word in the same position. If it's possible for word 1 to be "park", then it should not be possible for word 1 to be "parks", "parked", "parker", "parking", or, heaven forbid, the likes of "repark"...
Hell, let's take that one a step further in regards to plurals. No words that end in "s" or "z", no matter how many things they're talking about. Like in the problematic example, there's no reason there should ever be any combination of "pink start", "pinks start", and "pinks tart". I don't think this ban has to extend to the other similar sounds described earlier; it's a lot harder to slur together "porch champ" than "ports stamp".
I don't care that shit is a good, unequivocal-sounding word -- I don't want Karen trying to censor herself, little Billy being afraid of getting in trouble if he reads it out, or Officer Jones on the other end of the line deciding it's a prank call when someone phones in that they're stuck at humongous.dog.turd. Obviously, the definition of profanity is purely subjective, but we could cut out the vast majority of objectionable phrases by simply eliminating any words typically or commonly used to refer to bodily excretions (poop), genitals (wiener), sex and sexuality (gay), negatively referring to intelligence (stupid), ethnicity (latino), nationality (kraut), or disabilities (can you just imagine the crap that could fly if autistic or crippled were included? yikes).
That's quite a few rules, huh? Especially considering that the average English speaker doesn't share the same set of 42,000 words in their vocabulary, it doesn't seem likely that all those rules can be followed and still have 38,000 useful words left over. So in all likelihood, you'd want to include a fourth word. Following those rules with a list of 3,000 would be way less of a headache.
And as Dropbox taught us, remembering four words is still super easy. As long as I live, even if a terrible dementia ravages through my frontal lobe until I'm incapable of remembering my age, or my name, or even the Alamo... I will never be free of correct horse battery staple.
Oh, and there's one more rule I'd want to apply. It doesn't eliminate any words outright, but it limits where they can be used.
Take the following examples:
"werewolf.package.end.alone"
"where.wolfpack.agenda.loan"
"where.wolfpack.urgent.alone"
"carton.sure.leaving.complete"
"cart.unsure.leaf.incomplete"
"carton.surely.fin.complete"
"car.tincture.leaf.incomplete"
"car.tincture.leaving.complete"
There's an easy way to avoid this potential syllable uncertainty: mandate the length of components.
Say, the first word has to be one syllable. The second word and the final word have to be two syllables. The third word has to be at least two syllables, but it can be as many more as you want -- it doesn't matter, you know it's whatever is left over.
So in the above two examples, the only valid options would be:
"where.wolfpack.urgent.alone"
"car.tincture.leaving.complete"
Anyway, that's my eighty-three cents on the subject. No, you don't get a TL;DR on a grammar post. Put on your reading glasses and get over it. Or ignore my ramblings entirely; I'm just a nerd, not a cop.
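The syllable rule proposed in this comment (one syllable, then two, then two or more, then two) is easy to check mechanically once syllable counts are known. In this sketch the `SYLLABLES` table is hand-entered for the example words only and is an assumption; a real system would need a pronunciation dictionary.

```python
# Hand-entered syllable counts for the example words (an assumption, not a real dictionary).
SYLLABLES = {
    "werewolf": 2, "package": 2, "end": 1, "alone": 2, "where": 1,
    "wolfpack": 2, "agenda": 3, "loan": 1, "urgent": 2, "carton": 2,
    "sure": 1, "leaving": 2, "complete": 2, "cart": 1, "unsure": 2,
    "leaf": 1, "incomplete": 3, "surely": 2, "fin": 1, "car": 1,
    "tincture": 2,
}

def fits_pattern(address):
    """Check the proposed pattern: 1 syllable, 2 syllables, 2 or more, 2 syllables."""
    counts = [SYLLABLES.get(word) for word in address.split(".")]
    if len(counts) != 4 or None in counts:
        return False
    first, second, third, fourth = counts
    return first == 1 and second == 2 and third >= 2 and fourth == 2

candidates = [
    "werewolf.package.end.alone",
    "where.wolfpack.agenda.loan",
    "where.wolfpack.urgent.alone",
    "carton.sure.leaving.complete",
    "cart.unsure.leaf.incomplete",
    "carton.surely.fin.complete",
    "car.tincture.leaf.incomplete",
    "car.tincture.leaving.complete",
]
valid = [c for c in candidates if fits_pattern(c)]
print(valid)
```

With these counts, only the two addresses the comment names as valid pass the check.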
[ link to this | view in thread ]
IKEA Invested $16 million in What3words
[ link to this | view in thread ]
Mapcode is better anyway
Mapcode.com
[ link to this | view in thread ]