Copyright Enforcement Company Uses Sketchy Algorithms And Questionable Math In Hopes Of Becoming Copyright Trolls' Go-To Resource
from the DOES-NOT-COMPUTE dept
Yet another person thinks there's money to be made (albeit indirectly) in the copyright trolling business. (h/t to the Cyberlaw and Policy Blog)
Stephen Moignard lives a quiet life in the Coonawarra wine district in South Australia, tending his vineyard and small wine company, the Hundred of Comaum.
He also beavers away until 4am most mornings writing software for a new business venture which he’s hoping will be a global winner in the internet age.
It detects breaches of international copyright on millions of websites and produces almost instantaneous legal letters of demand.
Moignard survived the turn-of-this-century dotcom bust. He used to have a successful company that installed high-speed internet connections in office buildings, but his fortunes crashed with many others in the early 2000s.
Now, he's looking to make some money by using an algorithm to hunt down "substantially similar" text across multiple websites and serve demand letters to alleged copyright infringers. His new business is called Plfer, and its detection algorithm bears many similarities to commercial plagiarism detection software, albeit with a few tweaks that allow it to bypass web formatting and other obstacles that might throw off comparisons.
Moignard designates "victims" as "Plferees" and those using words written by others as "Plferers." At the site, you can view scans requested by site visitors, along with some very sketchy math used to determine potential damages. (Bad news for those of you who block Java by default: nearly the entire site is Java, so you'll be greeted with nothing but a banner. Incredibly annoying, but presumably there to prevent people like me from copying and pasting Moignard's words and thus becoming one of those pesky "Plferers.")
One such example of sketchy math and questionable algorithms involves perfume site Fragrantica and a short-lived WordPress blog. Somehow, the use of Cartier-related words adds up to more than $600,000 in potential damages. [pdf link to printed report]
The report contains a lot of cool-sounding "weights" and "scores," all of which are presumably part of Plfer's proprietary algorithm.
Shallow scan: (stage one)
Found with string: "Cartier gained notoriety in 1904 when Louis Cartier created the first wristwatch" on search page: 0
amongst total results of: 16 (weighted value: 1.6)
with snippet: "Cartier gained notoriety in 1904 when Louis Cartier created the first wristwatch for aviator Alberto Santos-Dumont. This famous timepiece was known as the ..."
Recorded on Plfer search page:fragrantica.com (in full:fragrantica.com/designers/Cartier.html)
This string was number: 16 on the page.
It has an improbability weighting of: 520.
The infringement has a duration of: 708 days.
The Plfer score is:-1741.
The Plfer score is explained on the "Getting Started" page:
The complexity of the string of text, the time between the earliest and later dates and the total number of copies in existence can be used to create a score (plfer score)(10).
The lower the number (or the larger the negative number) the more serious the breach.
After a deep scan, the plfer score is updated with many more known factors. A shallow scan plfer score should not be solely relied upon to issue infringement notices.
Using both of these, Plfer arrives at this conclusion:
The plferer earned 1164 points which is greater than the score required to amount to an 'actionable infringement'.
The last sentence makes no sense, but there it is. "Actionable infringement" doesn't need a score. Either it's infringement or it isn't, and much of what gets highlighted by Plfer's "Deep Scan" seems to be nothing but language that would be common to two sites covering the same subject matter. Here's a screenshot from one Plfer report on two SEO/web design companies' websites.
"Substantially similar" phrases include "understanding... signals algorithmically" and "reach your audience." For the two sites noted above, the "substantially similar" wording contains phrases that would be common across all Cartier biographical information. ("Cartier gained notoriety in 1904 when Louis Cartier created the first wristwatch…")
Finding matching phrases and keywords across two marketing sites and claiming it's copyright infringement is a bit like looking over the resume of someone applying for the same position as you and claiming the similar buzzwords and job descriptions are due to your competitor reading over your shoulder.
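To make the point concrete, here is a minimal sketch of phrase-overlap matching. It is not Plfer's algorithm, which is proprietary and not described in enough detail to reproduce. It assumes a generic word n-gram ("shingle") comparison of the kind many plagiarism checkers use; the function names and both sample sentences are made up for illustration.

```python
import re


def shingles(text, n=3):
    """Return the set of lowercase word n-grams (shingles) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b, n=3):
    """Jaccard similarity of two texts' shingle sets (0.0 means no overlap)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Two invented marketing blurbs written independently about the same service.
site_a = "We help you reach your audience with search engine optimization and web design."
site_b = "Our web design team helps you reach your audience through search engine optimization."

shared = shingles(site_a) & shingles(site_b)
for phrase in sorted(shared):
    print(" ".join(phrase))
print(f"similarity: {jaccard(site_a, site_b):.2f}")
```

Every phrase the two blurbs share is stock industry wording, yet a matcher of this kind still reports overlap. A raw match count says nothing about whether anything was copied.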
Now, we get to the really fun stuff: potential damages. These numbers are key to Plfer's success. Plfer charges minimal fees. "Deep Scans" and "Shallow Scans" run $1 each plus $0.85 in fees. There will presumably be small fees for demand letters and other forms, but the site is still in beta and no pricing is available. Plfer, notably, does not want a cut of recovered damages, which makes it less a copyright troll than a copyright troll facilitator. From Moignard's advertorial PDF "2015 - the end of copyright?":
Plfer differs from other online copyright service providers in that it takes no pecuniary interest in any of the copyright infringements it uncovers. It does not become a party to any of the cases it reveals but merely assists to provide evidence, pro-forma documents and "wizards" for users and their advisors.
Plfer may not partake of any damages recovered, but it still needs to sell its services. And when a scan returns an amount in the low hundreds, it still looks like a bargain because the infringed party only spent a few bucks in return for this "evidence" of "actionable infringement." (The PDF quoted above also hints at Plfer entering into mutually-beneficial contracts with IP-oriented law firms, but there appears to be nothing in place at the moment.)
In the case of Fragrantica, the potential damages are huge. Here's the "math" behind the massive number.
The total value of fragrantica is $ 2,389,600 according to Alexa.com and WorthOfWeb.com. We have calculated the plferee's actual losses as follows:
Our daily advertising income is valued at a minimum of $3314. The proportion of our site contained in parentalstyle.wordpress.com is 5.51%, giving a proportionate advertising revenue loss of $182.60 per day.
The value of this loss over 708 days is therefore $129280.8 USD. Applying a penalty multiplier of 5 times gives a total fair and just actual damages amount of $646,404.00 USD. A standard fee for enforcing an infringement of this nature and degree is $1,998.00 USD.
The total amount payable is therefore $1,998.00 + $646,404.00 = $648,402.00 USD.
Plferer Alexa ranking: 15,105,799
Plferer value: 64
Plferee Alexa ranking: 8,185
Plferee value: 2389600
Duration (years): + 1.94
Penalty: + 646404.00
Fee: + 1998.00
Total: + 648,402.00
That's some, um, interesting math, especially when the "plifering" site ranks 14 million places lower than the "victim" and would probably never surface in a search for Cartier products -- which would seem to make it more difficult to claim damages. Sure, Fragrantica could pursue this payout and present Plfer's proprietary Alexa math to a judge, but the numbers cited here as mathematically sound are actually well beyond speculative.
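For anyone who wants to check the report's arithmetic, the steps can be retraced in a few lines of Python. This is only a sketch of the calculation as the report states it: every input is the report's own claim, the function name is made up, and rounding the daily loss to cents is an assumption chosen to match the quoted $182.60.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def plfer_style_damages(daily_ad_income, site_share, days, multiplier, fee):
    """Retrace the report's stated steps; inputs are its claims, not verified facts."""
    # Step 1: claimed daily ad income times the share of the site said to be copied
    # (rounded to cents, an assumption that matches the report's $182.60).
    daily_loss = (daily_ad_income * site_share).quantize(CENT, rounding=ROUND_HALF_UP)
    # Step 2: multiply by the claimed duration in days.
    loss_over_period = daily_loss * days
    # Step 3: apply the report's "penalty multiplier".
    penalty = loss_over_period * multiplier
    # Step 4: add the "standard fee".
    total = penalty + fee
    return daily_loss, loss_over_period, penalty, total


daily_loss, loss, penalty, total = plfer_style_damages(
    daily_ad_income=Decimal("3314"),   # "minimum" daily advertising income
    site_share=Decimal("0.0551"),      # 5.51% of the site
    days=708,
    multiplier=5,
    fee=Decimal("1998.00"),
)
years = (Decimal(708) / Decimal(365)).quantize(CENT)

print(f"daily loss: ${daily_loss:,.2f}")
print(f"loss over 708 days ({years} years): ${loss:,.2f}")
print(f"after 5x multiplier: ${penalty:,.2f}")
print(f"total demanded: ${total:,.2f}")
```

The arithmetic itself holds up. What is unsupported is every input: the income estimate, the 5.51% share, and above all the multiplier of 5, which scales the final figure linearly.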
Going beyond the sketchy math, there's the reality of the situation. Has anyone ever made money going after "scrapers," who "republish" posts of others in their entirety and whose sites contain 100% infringing material? Of course not. Smaller infringements like these -- which are closer to plagiarism than copyright infringement -- won't be moneymakers either. Plfer might have limited success selling $1 scans to the curious and litigiously stupid, but it's not going to change the face of copyright enforcement, much less supplant Moignard's vineyard as his primary moneymaker.
So, why is Moignard doing this? Well, according to his own statements, it appears to be some sort of crusade against the internet's "devaluing" of copyright-protected content. In the FAQ, under the heading "Is copyright evil?," Moignard first points out that copyright isn't a moral right...
[C]opyright, like all intellectual property rights, is an incentive device, designed to elicit more of certain kinds of 'learning' or knowledge creation and certain kinds of knowledge processing by government, rather than being any fundamental sort of moral right...
... before going on to make this a moral issue by quoting two supposed copyright opponents (at least one of which will be very familiar to Techdirt readers)...
For instance, Mike Masnick at TechDirt says:
"People copy stuff all the time, because it's a natural and normal thing to do. People make copies because it's convenient and it serves a purpose -- and quite often they know that doing so causes no harm in those situations."
There are a raft of similar postings by annonymous file-sharing fans such as Enigmax [TorrentFreak], who argues that all information should be free and authors should not receive anything.
... and summing it up by claiming the high ground.
Plfer stands in total opposition to the Enigmaxs and Mike Masnick's of this world, and can prove that the technology that makes copying easy also makes prosecuting infringers just as easy.
He also presents the copyright industry's attitude towards technological advancement in a far better light than it deserves, while simultaneously portraying innovation as an "attack" on rightholders. (From the "End of copyright" PDF.)
Digital 'internet' transmissions have obviously increased the risk that copyrighted works will be 'reproduced' and 'distributed' in violation of the exclusive rights granted to copyright owners. Copyright law, however, has withstood attacks from other developing media.
Specifically, copyright has coped with the invention of broadcast media, copy machines, and the video cassette recorder, and technology is assisting copyright law to step up again today.
Yeah, if by "coped" you mean "pushed for favorable legislation" and "sued endlessly." That's not coping. That's finally relenting to the inevitable because you've exhausted all your options.
Plfer is positioning itself as a "volume" business, making money from quantity rather than quality.
Its developers’ are assuming that the sheer volume of infringements will enable it to generate significant income despite offering these services at a fraction of the cost of equivalent legal advice.
This puts it in the same group as copyright trolls like Malibu Media and Prenda Law, even if it doesn't directly benefit from settlements and awarded damages. What it hopes to do is become the starting point for aspiring copyright trolls, using questionable algorithms and damage assessments. It even wants to further limit fair use protections -- again, by using some questionable rationalizations.
With the increasingly commercial nature of all aspects of the public internet and the "monetisation" of site traffic via ubiquitous advertising services such as Google™ AdSense™ and other variants, it is difficult to argue any part of the internet is truly "non-commercial" and so the application of the "fair use" defence would seem to remain limited.
Fair use isn't limited to non-commercial enterprises. This misconception refuses to die, and self-proclaimed copyright enforcers like Plfer are doing their best -- either out of spite or ignorance -- to keep it alive. You can make money and still avail yourself of the fair use defense.
Plfer is a mess. Moignard may be ambitious, but his "solution" to small-time infringement will either become another also-ran or the tool of copyright trolls. There's nothing here that doesn't point to either of these two outcomes.
Filed Under: automated threat letters, copyright, copyright troll, copyright trolling, damages, dmca, infringement detection, plagiarism, stephen moignard
Companies: plfer
Reader Comments
Well, this is awkward...
So fair use should be severely limited apparently. Boy, that sure does make this bit rather awkward...
For instance, Mike Masnick at TechDirt says:
"People copy stuff all the time, because it's a natural and normal thing to do. People make copies because it's convenient and it serves a purpose -- and quite often they know that doing so causes no harm in those situations."
He's using someone else's quote to promote his own service, which, according to his own argument, would almost certainly count as commercial use, and therefore fair use wouldn't apply.
... I wonder just how much his service would qualify his use of someone else's work, and the 'harm' it caused? Perhaps a couple hundred thousand or so, depending on how long his post has been up?
Re: Well, this is awkward...
People claiming that no protection should exist are naive, but in terms of being damaging to society, it is a lesser evil.
Re: Re: Well, this is awkward...
Let's see, for most of human history, stories, music, dance etc spread by one performer copying from another performer, and sometimes modifying the work slightly to suit their audience. Indeed the only way that a creator gained a reputation was by having their work copied, as without it being copied it did not reach a large audience, or even persist.
Copyright, which was developed from censorship licensing, has an advantage only when production of copies is via a batch process, where a large number of copies is produced before a copy is sold. If two printers in the same area tried printing the same title, the first to complete the printing would sell their copies, while the second would probably be left with most of the copies that they produced.
The second effect of copyright has been that the middlemen gatekeepers, who arranged access to the production and distribution of copies, used this power to gain control of the works, and most of the profit from producing copies. They also had the power to decide what works were put in front of the public, and for how long.
So despite what politicians claim the objective of copyright is, it only controlled the production, and not the generation of new works. Because of limited production facilities, for producing build copies, there have always been more works from more creators vying for publication. That restraint on publication no longer exists, and advantage in the publication of creative works is swinging to those who can self publish, as they can gain an income from a much smaller fan base than those who go the traditional route.
Assuming that people have a limited budget for entertainment, the more people publishing creative works, the harder it will be for an individual creator to build a large enough fan base to pay the middlemen, including the collecting societies.
It is worth noting that fans of a creator will give them money, given a means of doing so. If anything, the availability of free copies serves to build the fan base and increase income.
Whenever the copying of works is restricted, due to either the cost of copying, or because of gatekeepers and copyright, only a small number of the works created ever get a chance to find an audience. Note, that is not a problem of creation, but one of publication and circulation.
Well that is comforting
It's also a widely held fact that Alexa is a largely (entirely) useless metric.
Re: Well that is comforting
Yes. Using Alexa data to support what they say or do is a strong indicator that they are trying to snow you.
speaking of trolls, anyone hear how last week's Prenda hearing went?
Re: speaking of trolls, anyone hear how last week's Prenda hearing went?
http://madisonrecord.com/issues/310-defamation/268981-defendants-in-defamation-case-want-prenda-law-duffy-held-in-contempt-further-sanctioned-hearing-slated-for-thursday-in-chicago
I don't imagine VC money is beating a path to his door....
Re:
- have main investment sell their less junk patents to troll, to keep a lean and clean portfolio there.
- have patent troll sue competitors using the junk.
- the patent troll may struggle in the long run, but in the short run, the headstart they have bought the main investment is invaluable and may mean a strong roi.
Hehe, Mike should send a take down notice to his hosting company for "Copyright Infringement". Checkmate.
I think when he named this company, he missed an "i" in the name, shouldn't it be Pilfer, since that is what he is trying to do? Pilfer money from people.
Challenge accepted
Really? Okay, then I'll copy/paste something from somewhere and find out how much Shakespeare's gonna sue me for.
Ay, that's well known:
But what particular rarity? What strange,
Which manifold record not matches? See,
Magic of bounty! all these spirits thy power
Hath conjured to attend. I know the merchant.
*shrugs* Well, 'tis somewhat fitting.
Java or JavaScript
(of course, I have them both blocked by default.)
Because that "Deep Scan" report has all the almost-makes-sense attributes of an Onion-level goof.
Re:
(http://youtu.be/RB6wQxugJn4?t=40s)
T,FTFY.
Another extortionist attempting to claim both money and a moral high ground he has no claim to.