Company Claims Its Software Can Magically Identify 'Rogue Sites'
from the that's-not-how-it-works dept
A company called RogueFinder is claiming that it has automated the process for finding rogue sites:
The basic idea is to draw links between seemingly unconnected “rogue” web sites, e.g. web sites selling counterfeit goods. According to the RogueFinder web site, its software takes minutes to do what it takes forensics teams months to achieve.
Sounds useful for playing parlor tricks. Not so sure for a system involved in blocking protected speech. As we've discussed time and time again, one of the issues in all of this is that determining what is and what is not infringing is not an easy task. At all. It takes a human being who can actually analyze the situation and how it falls under copyright law -- including exploring specific exemptions. It's time that we got rid of the myth that there's any significant way to magically identify what's infringing and what's not.
It uses data from registries, registrars, web hosts, servers, and ISPs as well as inspecting the sites’ “invisible source code”.
Filed Under: monitoring, rogue sites
Reader Comments
And I can honestly state It has identified one again
And WTF is invisible source code? is that like Imaginary Hollywood Matrix stuff.. oooo scary
[ link to this | view in chronology ]
Re:
ps my tablet typingskills need work...that and i forgot my pw again...
[ link to this | view in chronology ]
Re: Re:
[ link to this | view in chronology ]
Re: Re: Re:
"The magic pixies who live inside the thinking box, when you double click it they are forced to once again pick up their instruments and reproduce the drivel to appease their human captors.
The magic smoke one sometimes sees leaving a computer case is actually the souls of pixies pushed too far and too hard to reproduce too many songs in a public performance.
Before you torrent that next album, won't you stop and think of the pixies?
/sallystuthers"
http://www.techdirt.com/articles/20111003/12570316188/us-supreme-court-lets-stand-ruling-that-says-music-downloads-are-not-public-performances.shtml#c179
[ link to this | view in chronology ]
Re: Re: Re: Re:
It's all that Technical Hitch called TIKUF who is the real culprit [ http://www.youtube.com/watch?v=Yq_i-swEK14 ]
They say if you pronounce TIKUF backwards three times, that he will appear to smite your enemies.. Honest and for true!
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Already?
*facepalm*
[ link to this | view in chronology ]
What a waste of time/money/effort.
[ link to this | view in chronology ]
Response to: Anonymous Coward on Dec 13th, 2011 @ 11:25pm
[ link to this | view in chronology ]
Administrative Contact:
Gioconda, Joseph Joseph.Gioconda@RogueFinder.com
RogueFinder LLC
42-40 Bell Boulevard
Suite 607
Bayside, New York 11361
United States
+1.7184233610
Which apparently is these folks.
http://www.giocondalaw.com/
N.
[ link to this | view in chronology ]
Re:
A lawyer who just so happens to make all of his money from IP.
Just sayin'.
[ link to this | view in chronology ]
Re: Re:
We have software that can identify thousands of people you can sue automatically... Imagine not having to spend all that time and effort gathering ip addresses and fake names for your extortion schemes... er legal filings, with our automated software, you just point it to a piece of content, enter the number of suckers (aka litigants) you want to try and extort money from, and our system will use its "Magic Six Degrees of Kevin Bacon Methodology" to identify the appropriate number of individuals to include in your suit.
fine print: no warranty expressed or implied, all results made up on the spot based on random ip address associations, no guarantee of actual infringement or any proof is ever provided by this software, the results of this system are not valid for legal filings and should not be relied upon for initiating legal proceedings... (we know nobody reads the fine print... so if you use our software to identify people to sue, you are violating our licensing agreement on any suits filed, and you agree to pay us $1000 per name identified by our softwar and used in your suit)
Yes, THIS IS SOFTWAR.... get in the game or move on...
[ link to this | view in chronology ]
Re: Re: Re:
[ link to this | view in chronology ]
*headdesk*
The worst part is, the SOPA crowd would probably buy that. If I went to a congressman's house and showed him where "view page source" is in Internet Explorer 4 (or possibly Netscape Navigator), he'd think it was some kind of secret legendary hacker trick.
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Re: Re:
And probably illegal. Unauthorized access to a computer system, anyone?
[ link to this | view in chronology ]
I doubt it
Everyone likes to shout "free market" all the time, but forget to take it into account in most discussions that actually require it.
Plus first of all, all they are doing is using bots to crawl websites they 'mark' as rogue, and then to make a database of other sites that are linked to by sites that link to that original marked site(s).
This would be the 3rd homework problem in any class teaching how to make a search engine after 1) how to make a crawler-bot, 2) how to make a DB of sites 3) (this) how to link sites together in groups
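A minimal sketch of the grouping step described above, assuming the crawl has already happened and its results sit in a plain dictionary. The function name `co_linked_sites`, the `links` structure, and the example domains are all hypothetical; nothing here is known to reflect how RogueFinder actually works.

```python
from collections import defaultdict


def co_linked_sites(links, marked):
    """Count, for each unmarked site, how many referrers of a marked site also link to it.

    links  -- dict mapping a site to the set of sites it links to (assumed crawl output)
    marked -- iterable of sites already flagged by hand
    """
    marked = set(marked)
    # Step 1: every crawled site that links to at least one marked site.
    referrers = {src for src, targets in links.items() if targets & marked}
    # Step 2: tally the other sites those referrers link to.
    counts = defaultdict(int)
    for src in referrers:
        for target in links[src] - marked:
            if target != src:
                counts[target] += 1
    return dict(counts)


# Hypothetical crawl results.
links = {
    "blog-a.example": {"rogue-1.example", "shop-x.example"},
    "blog-b.example": {"rogue-1.example", "shop-x.example", "news.example"},
    "blog-c.example": {"news.example"},
}
result = co_linked_sites(links, ["rogue-1.example"])
print(result)
```

Note that `news.example` ends up in the result simply because one referrer links to it. A count like this only produces candidates to look at; it says nothing about what those sites actually do.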
[ link to this | view in chronology ]
Re: I doubt it
All that would do is give you what is 'potentially infringing' not what is infringing.
[ link to this | view in chronology ]
Spectral Evidence
From Wikipedia, the free encyclopedia
(Citations omitted.)
[ link to this | view in chronology ]
meanwhile those _poor_ industries struck by piracy "help" add to the list... ...
[ link to this | view in chronology ]
It might be useful as a tool to gather potential instances of infringement. However, these instances will still need people to verify if they are infringing or not.
Although, based upon many companies' previous behavior when given tools to locate potentially infringing material, they are likely to take this software's list and send out mass takedown notices without properly checking.
Can't wait to see the false positive rate from this software.
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
DMCA law gave them cruise missiles. Over half target a business rival. One third were invalid attacks.
Now SOPA will give them nuclear weapons and you can watch part of the Internet get obliterated before your eyes.
Then what better than for lazy copyright owners to put the pending WWIII all on computer control.
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Sounds like Mike Masnick's worst nightmare.
[ link to this | view in chronology ]
Re:
Sounds like a troll with penis envy talking to hear himself talk again.
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Re:
It's more like a dash of poor understanding and a bucket of desperation sweat mixed with a cup of web crawling search engine bots to make a "magic" potion that cures the poor, poor ailment all the IP welfare leeches are suffering from known as "being forced to adapt your business model to fit reality."
No one else in the world gets to sit around like a lazy piece of trash and perpetually make money off of work they did in the past. So make sure you keep the tear stains off your resume as you go out there and look for a real job.
[ link to this | view in chronology ]
Re:
Sounds like a criminals' dream come true.
[ link to this | view in chronology ]
This is obviously some alternate reality where invisible means what you can see.
[ link to this | view in chronology ]
Re:
For example, on our 'job application' page, we have been getting several 'error' emails (every time someone doesn't fill out the page correctly, and attempts to submit the job application in a 'bad' aka 'sql injection' type format, we are sent an email informing us of the submitting computer/user info...yes, we coded this ourselves...not a software package).
If you physically look at our page, all you see is a submit button. In the "source code" section, you can see where some things are processed and then it jumps to a different page. That 2nd page does some additional checking, and then inserts the data into the database. The user never sees anything except 'processing'.
We have had bots recently that have been skipping the first page, and going directly to the 2nd page and attempting to inject code there. However, we have already built in for that possibility, so the 2nd page errors out and shoots us an email.
If you were to attempt to explain 'invisible source code' to a non techie, then technically, to them, the 2nd page is 'invisible'. Nothing on their screen gives them the impression that they are on a different page.
[ link to this | view in chronology ]
Re: Re:
It's invisible to those who do not know it is there.
Not really in line with the actual definition of the word however.
[ link to this | view in chronology ]
rogue:
1 vagrant, tramp
2 a dishonest or worthless person : scoundrel
3 a mischievous person : scamp
4 a horse inclined to shirk or misbehave
5 an individual exhibiting a chance and usually inferior biological variation
http://www.merriam-webster.com/dictionary/rogue
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Re: Re:
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Many of the "rogue sites" do things that are common between sites. From linking images from "file hosts" to intentional misspellings of words, there are plenty of things you can use to filter down and give the old mark 1 eyeball something to look at.
Many of these sites use similar source code and layouts, those who move from domain to domain and host to host often upload the same site over and over again, with minor variations. Over time, you can build up a library of these pages and be able to spot similar sites. Duplicate content is one of the ways these sites often stand out.
You can also look at the products they offer, the hosts they use, the payment processors, and all of that stuff to look for commonalities. If you can filter down 100,000 sites offering "nike shoes" down to a list of 200-300 that are likely rogue, then review them by hand, you would probably have a pretty high success rate.
You could also use honeypots to catch their spam. Opening a wordpress site and allowing open comments is a great way to find out who is scamming what. Similar results can happen using various forum software and other types of sites that permit user comments or postings.
100% success rate? No way. Reasonably successful? I suspect that it can be done.
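The duplicate-content idea above can be sketched with word shingles and Jaccard similarity, one common way to compare pages. This is a swapped-in textbook technique, not anything the comment or RogueFinder specifies; the function names, the `threshold` of 0.5, and the sample page text are all assumptions for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pages(known_pages, candidate, threshold=0.5, k=3):
    """Names of known pages whose shingle overlap with the candidate meets the threshold."""
    cand = shingles(candidate, k)
    return [
        name
        for name, text in known_pages.items()
        if jaccard(shingles(text, k), cand) >= threshold
    ]


# Hypothetical library of previously reviewed page text.
known = {"old-shop": "cheap nike shoes free shipping order today"}
print(similar_pages(known, "cheap nike shoes free shipping order now"))
print(similar_pages(known, "local weather report for the week"))
```

A match only means the text overlaps heavily with a page someone flagged before, so anything it returns would still need the hand review described above.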
[ link to this | view in chronology ]
Re:
Obviously you automagically know what the MAFIAA, IFPI and BSA don't.
Do not think you are superior to God.
[ link to this | view in chronology ]
Re:
Some of us 'old time network admins' that just now got out from under the 'omg how do we filter spam' umbrella know that it took YEARS before filtering out spam without filtering legit emails became manageable.
The hours we spent with configuring software solution after software solution....the months we spent reading log files, the years we spent making phone calls to the ISP, to the sending server IT dept, the finger pointing about whose fault it is that a legit email didn't make it.
(don't give me that '3rd party crap', those are the hardest to track down why an email didn't make it to its destination...but I digress.)
Now....now....because some dying "entertainment" industry can't save their own ass and want to go crying to Gov for a handout.....now I get to find out why our purchasing agent can't find the rivets he needs to build this sidewall to the plane, because of more filtering crap.
Haven't we learned by now?
SO I wanna know......are the **AAs or the government going to reimburse businesses for IT time spent tracking down problems with legit business activities...such as purchasing steel, or shipping products because of the 92 different filtering softwares that are going to flood the market with horrible code, and a lack of understanding of business rules outside of their own world?
[ link to this | view in chronology ]
It might actually work
So, based on various factors (a site 2 days old with gigs of content for example?) registrant's associations with previous "rogues", traffic patterns (if they could get access to this or deduce it from response times), I could see how sites could be characterised into types (news, eCommerce etc) pretty rapidly.
And this doesn't need to be perfect - it just needs to make the xxAA's job slightly less of a needle in a haystack. Simply finding a site which has music for download on has already narrowed the field a bit. (It's not like the results are ever going to be used as actual proof of anything). And the backlinks (the company they keep) will give clues too.
If they can cheaply trawl a million newly registered domains and give a vague probability that a site might be a non legit download site, that changes the odds and the timelag in the game of whack a mole.
I suspect that this company will sell the "software" as a service, charge a fortune, but their "server" will actually be some google-literate students told to locate content in return for pocket money, focusing particularly on content pertinent to the paying customers they have signed up. In the distorted world of said content holders, this will appear to offer great value, and a follow on service of filing a takedown will be sold by the lawyers for each site located. Content holders will think this is hugely helpful and will be reminded of all the lost sales that it is preventing.
And not a single extra CD will be sold as a result.
One interesting question arises.
If said "software" finds a site (or thousands of sites) and verifies that they are indeed offering infringing content (how, by downloading ?) Is the holder of the software in violation of any laws ? Every time this software downloads, a sale is lost !
[ link to this | view in chronology ]
Re: It might actually work
[citation needed]
[ link to this | view in chronology ]
Re: Re: It might actually work
[ link to this | view in chronology ]
I don't know about this software...
[ link to this | view in chronology ]
Cool
[ link to this | view in chronology ]
The other possibility is that it's code that runs on the server and the client side never sees it during the code's execution. From what they describe it could be either or nothing at all. If that's what's happening they may be going to use the application to break into servers, something itself that's illegal but I guess this band of lawyers gets to excuse this because they're on the side of the "angels". At least in their minds.
Now data mining CAN be useful. Not "will be useful", as their site (a multi-page advertisement in reality) suggests, because there are no guarantees. First you have to know what you're looking for. They claim they do, though the sites they describe are usually those associated with harvesting credit card numbers, passwords, identity theft and that sort of thing, in the sense that they set up look-alike sites of a bank and ask questions of the user no bank ever would. They may also have to do with the gray/black market for prescription drugs. They claim that by their software's analysis of the data mined they can create a collection of, frankly, unbelievable connections between owners, hosts, ISPs and other data to bring the offender to court.
The thing is this, found on the About Us page.
"ROGUEFINDER™ Investigative Software is currently in active development by a team at RogueFinder LLC, located in New York City.
The impressive team includes experienced intellectual property attorneys, private investigators, software analysts and technical consultants. Each team member is involved in critically important elements of the software, including:.."
Whoops. The software isn't finished yet. But, hey, we're working on it.
Notably missing from the list are statistical analysts, which one needs to do effective data mining, as all data mining does is result in a stack of statistics which get tossed out of whack the moment some unexpected data comes along if you're relying on a collection of preset rules.
They also claim the software is patent pending, along with the usual copyright and trade mark claims. While I won't, completely, dispute the last two, the first seems unlikely as they would be relying on an aircraft carrier stuffed full of prior art to do what they claim to be able to do. (With unfinished software even!). As for copyright, there may be questions there too as some things cannot be subject to copyright. Things like facts, mathematical equations (aka algorithms) and many others that appear in software. The specific expression in that software is protected with copyright before someone tries to jump on me for that.
More than anything the site looks like an almost well written ad for vapourware stuffed with an overabundance of stock photos. If they're looking to tag those who send out spam with the Nigerian scam in them, fake bank notices about expired passwords and what have you, it's gonna fail. If for no other reason than that those sites are run by organized crime, often Russian, who have far more resources available to them to counter this vapourware than this law firm has. And I can hear them laughing from here some 4000 miles away to the east as the crow flies. I can hear them tapping out software right now to counter what this software claims to do.
As for file sharing sites, the ones copyright purists want to target as SOPA and PIPA claim to do, virtually all of those are small operations with few, if any ads, collecting some support through donations and stuff. Not the kind of sites that are likely to be raking in money.
File lockers are both ad and subscription supported but their legitimate uses far outweigh any illegitimate uses. They do respond to takedown notices so they follow the letter and spirit of the DMCA as it is.
As I said, all they've done is warn the very people best equipped to counter them. And counter them they will while the law firm collects a hoped for ton of fees on games of whack a mole. I have yet to figure out how a New York based law firm can bring suit in Russia, Canada, France, the UK and so on when they're not members of the Bar in any of those countries. Unless, once again, the idea is to collect a liability award in the United States and whack the site owners if they're foolish enough to visit the U.S. at some point in the future under the name they used to register their site(s). Good luck there.
I'm not for a moment minimizing the threat of fake prescription drugs, the possibility of identity theft or other serious issues where organized crime would see a profit. Hell, I'll even concede that perhaps another fake Dior handbag might hurt someone, somewhere though we already know and have known for years that the majority of those come from Hong Kong.
File sharing by individuals it won't stop.
Still, if I was tasked with reviewing this software with an eye to using it I'd want to see real world data, test results, a complete and detailed description of the methodology and the complete source code. Until I got all of that not a penny would go their way.
Something about this stinks. Badly.
[ link to this | view in chronology ]
Re:
Even if they do write software to isolate suspicious transactions, at the end of the day it will still take human eyeballs to verify it all.
Of course, all it takes is to bust one 14 year old girl and one granny sharing their own photos that are mistakenly identified as bearing an actionable copyright. Not that we haven't been down that road before. Of course it won't happen. Not in a million years! Ok, a million microseconds then.
[ link to this | view in chronology ]
What Protected Speech?
[ link to this | view in chronology ]
Re: What Protected Speech?
[ link to this | view in chronology ]
[ link to this | view in chronology ]
Re: Rogue Sites
[ link to this | view in chronology ]
Re: Re: Rogue Sites
[ link to this | view in chronology ]