stories filed under: "tools"

Content Moderation Case Studies: Using AI To Detect Problematic Edits On Wikipedia (2015)

from the ai-to-the-rescue? dept

Fri, Oct 30th 2020 3:33pm — Copia Institute

Summary: Wikipedia is well known as an online encyclopedia that anyone can edit. This has enabled a massive corpus of knowledge to be created, that has achieved high marks for accuracy, while also recognizing that at any one moment some content may not be accurate, as anyone may have entered in recent changes. Indeed, one of the key struggles that Wikipedia has dealt with over the years is with so-called “vandals” who change a page not to improve the quality of an entry, but to deliberately decrease the quality.

In late 2015, the Wikimedia Foundation, which runs Wikipedia, announced an artificial intelligence tool, called ORES (Objective Revision Evaluation Service) which they hoped might be useful to effectively pre-score edits for the various volunteer editors so they could catch vandalism quicker.

ORES brings automated edit and article quality classification to everyone via a set of open Application Programming Interfaces (APIs). The system works by training models against edit- and article-quality assessments made by Wikipedians and generating automated scores for every single edit and article.

What’s the predicted probability that a specific edit be damaging? You can now get a quick answer to this question. ORES allows you to specify a project (e.g. English Wikipedia), a model (e.g. the damage detection model), and one or more revisions. The API returns an easily consumable response in JSON format:

The system was not designed, necessarily, to be user-facing, but rather as a system that others could build tools on top of to help with the editing process. Thus it was designed to feed some of its output into other existing and future tools.

Part of the goal of the system, according to the person who created it, Aaron Halfaker, was to hopefully make it easier to teach new editors how to be productive editors on Wikipedia. There was a concern that more and more of the site was controlled by an increasingly small number of volunteers, and new entrants were scared off, sometimes by the various arcane rules. Thus, rather than seeing ORES as a tool for automating content moderation, or as a tool for “quality control” over edits, Halfaker saw it more as a tool to help experienced editors better guide new, well-meaning, but perhaps inexperienced editors in ways to improve.

The motivation for Mr. Halfaker and the Wikimedia Foundation wasn’t to smack contributors on the wrist for getting things wrong. “I think we who engineer tools for social communities, have a responsibility to the communities we are working with to empower them,” Mr. Halfaker said. After all, Wikipedia already has three AI systems working well on the site’s quality control, Huggle, STiki and ClueBot NG.

“I don’t want to build the next quality control tool. What I’d rather do is give people the signal and let them work with it,“ Mr. Halfaker said.

The artificial intelligence essentially works on two axes. It gives edits two scores: first, the likelihood that it’s a damaging edit, and, second, the odds that it was an edit made in good faith or not. If contributors make bad edits in good faith, the hope is that someone more experienced in the community will reach out to them to help them understand the mistake.

“If you have a sequence of bad scores, then you’re probably a vandal,” Mr. Halfaker said. “If you have a sequence of good scores with a couple of bad ones, you’re probably a good faith contributor.”

Decisions to be made by Wikipedia:

How useful is artificial intelligence in helping to determine the quality of edits?
How best to implement a tool like ORES?
- Should it automatically revert likely “bad” edits?
- Should it be used for quality control?
- Should it be a tool to just highlight edits for volunteers to review?
What is likely to encourage more editors to help keep Wikipedia as up to date and clean of vandalism?
What data do you train ORES on? How do you validate the accuracy of the training data?

Questions and policy implications to consider:

Are there issues when, because the AI has scored something, the tendency is to assume the AI must be “correct”? How do you make sure the AI is accurate?
Does AI help bring on new editors or does it scare away new editors?
Are there ways to prevent inherent bias from being baked into any AI moderation system, especially one trained by existing moderators?

Resolution: Halfaker, who later left Wikimedia to go to Microsoft Research, has published a few papers about ORES since it launched. In 2017, a paper by Halfaker and a few others noted that the tool was increasingly used over the previous three years.

The ORES service has been online since July 2015. Since then, usage has steadily risen as we’ve developed and deployed new models and additional integrations are made by tool developers and researchers. Currently, ORES supports 78 different models and 37 different language-specific wikis.

Generally, we see 50 to 125 requests per minute from external tools that are using ORES’ predictions (excluding the MediaWiki extension that is more difficult to track). Sometimes these external requests will burst up to 400-500 requests per second

One thing they noticed was that those using the ORES output often wanted search through the metrics and set their own thresholds rather than accepting the hard coded ones in ORES:

Originally, when we developed ORES, we defined these threshold optimizations in our deployment configuration. But eventually, it became apparent that our users wanted to be able to search through fitness metrics to choose thresholds that matched their own operational concerns. Adding new optimizations and redeploying quickly became a burden on us and a delay for our users. In response, we developed a syntax for requesting an optimization from ORES in realtime using fitness statistics from the models tests

The project also appeared to be successful in getting built into various editing tools, and possibly inspiring ideas for new editing quality tools:

Many tools for counter-vandalism in Wikipedia were already available when we developed ORES. Some of them made use of machine prediction (e.g. Huggle27, STiki, ClueBot NG), but most did not. Soon after we deployed ORES, many developers that had not previously included their own prediction models in their tools were quick to adopt ORES. For example, RealTime Recent Changes includes ORES predictions along-side their realtime interface and FastButtons, a Portuguese Wikipedia gadget, began displaying ORES predictions next to their buttons for quick reviewing and reverting damaging edits. Other tools that were not targeted at counter-vandalism also found ORES predictions— specifically that of article quality (wp10)—useful. For example, RATER,30 a gadget for supporting the assessment of article quality began to include ORES predictions to help their users assess the quality of articles and SuggestBot,31[5] a robot for suggesting articles to an editor, began including ORES predictions in their tables of recommendations.

Many new tools have been developed since ORES was released that may not have been developed at all otherwise. For example, the Wikimedia Foundation product department developed a complete redesign on MediaWiki’s Special:RecentChanges interface that implements a set of powerful filters and highlighting. They took the ORES Review Tool to it’s logical conclusion with an initiative that they referred to as Edit Review Filters. In this interface, ORES scores are prominently featured at the top of the list of available features, and they have been highlighted as one of the main benefits of the new interface to the editing community.

In a later paper, Halfaker explored, among other things, concerns about how AI systems like ORES might reinforce inherent bias.

A 2016 ProPublica investigation [4] raised serious allegations of racial biases in a ML-based tool sold to criminal courts across the US. The COMPAS system by Northpointe, Inc. produced risk scores for defendants charged with a crime, to be used to assist judges in determining if defendants should be released on bail or held in jail until their trial. This exposé began a wave of academic research, legal challenges, journalism, and organizing about a range of similar commercial software tools that have saturated the criminal justice system. Academic debates followed over what it meant for such a system to be “fair” or “biased”. As Mulligan et al. discuss, debates over these “essentially contested concepts” often focused on competing mathematically-defined criteria, like equality of false positives between groups, etc.

When we examine COMPAS, we must admit that we feel an uneasy comparison between how it operates and how ORES is used for content moderation in Wikipedia. Of course, decisions about what is kept or removed from Wikipedia are of a different kind of social consequence than decisions about who is jailed by the state. However, just as ORES gives Wikipedia’s human patrollers a score intended to influence their gatekeeping decisions, so does COMPAS give judges a similarly functioning score. Both are trained on data that assumes a knowable ground truth for the question to be answered by the classifier. Often this data is taken from prior decisions, heavily relying on found traces produced by a multitude of different individuals, who brought quite different assumptions and frameworks to bear when originally making those decisions

Filed Under: ai, content moderation, ores, tools, wikipedia
Companies: wikimedia

5 Comments

Cloudflare Makes It Easier For All Its Users To Help Stop Child Porn Distribution

(Mis)Uses of Technology

from the this-is-good dept

Tue, Dec 24th 2019 11:09am — Mike Masnick

We recently wrote about how Senators Lindsey Graham and Richard Blumenthal are preparing for FOSTA 2.0, this time focused on child porn -- which is now being renamed as "Child Sexual Abuse Material" or "CSAM." As part of that story, we highlighted that these two Senators and some of their colleagues had begun grandstanding against tech companies in response to a misleading NY Times article that seemed to blame internet companies for the rising number of reports to NCMEC of CSAM found on the internet, when that should be seen as more evidence of how much the companies are doing to try to stop CSAM.

Of course, working with NCMEC and other such organizations takes a lot of effort. Being able to scan for shared hashes of CSAM isn't something that every internet site can do. It's mostly just done by the larger companies. But last week Cloudflare (one of the companies that Senators are demanding "answers" from), did something quite fascinating: it enabled all Cloudlfare users, no matter what level of service, to start using Cloudflare CSAM scanning tools for free, even allowing them to set their own rules and preferences (something that might become very, very important if the Graham/Blumenthal bill becomes the law.

I highly recommend reading the entire article, because it's quite a clear, interesting, and easy to read article about how fuzzy hashing works (including pictures of dogs and bicycles). As the Cloudflare post notes, those who use such fuzzy hashing tools have intentionally kept at least some of the details secret -- because being too public about it would allow those who are producing and distributing CSAM to make changes that "dodge" the various tools and filters, which would obviously be a problem. However, that also results in two potential issues: (1) a lack of transparency in how these filtering systems really operate and (2) an inability for all but the largest players to make use of these tools -- which would be disastrous for smaller companies if they were required to make use of such things.

And that's where Cloudflare's move is quite interesting. In providing the tool for free to all of its users, it keeps the proprietary nature of the tool secret, but it's also letting them set the thresholds.

If the threshold is too strict — meaning that it's closer to a traditional hash and two images need to be virtually identical to trigger a match — then you're more likely to have have many false negatives (i.e., CSAM that isn't flagged). If the threshold is too loose, then it's possible to have many false positives. False positives may seem like the lesser evil, but there are legitimate concerns that increasing the possibility of false positives at scale could waste limited resources and further overwhelm the existing ecosystem. We will work to iterate the CSAM Scanning Tool to provide more granular control to the website owner while supporting the ongoing effectiveness of the ecosystem. Today, we believe we can offer a good first set of options for our customers that will allow us to more quickly flag CSAM without overwhelming the resources of the ecosystem.

Different Thresholds for Different Customers

The same desire for a granular approach was reflected in our conversations with our customers. When we asked what was appropriate for them, the answer varied radically based on the type of business, how sophisticated its existing abuse process was, and its likely exposure level and tolerance for the risk of CSAM being posted on their site.

For instance, a mature social network using Cloudflare with a sophisticated abuse team may want the threshold set quite loose, but not want the material to be automatically blocked because they have the resources to manually review whatever is flagged.

A new startup dedicated to providing a forum to new parents may want the threshold set quite loose and want any hits automatically blocked because they haven't yet built a sophisticated abuse team and the risk to their brand is so high if CSAM material is posted -- even if that will result in some false positives.

A commercial financial institution may want to set the threshold quite strict because they're less likely to have user generated content and would have a low tolerance for false positives, but then automatically block anything that's detected because if somehow their systems are compromised to host known CSAM they want to stop it immediately.

This is an incredibly thoughtful and nuanced approach, recognizing that when it comes to any sort of moderation, one size can never fit all. And, by allowing sites to set their own thresholds, it actually does add in a level of useful transparency, without exposing the inner workings that would allow bad actors to game the system.

That said, I can almost guarantee that someone (or perhaps multiple someones) will come along before too long and Cloudflare's efforts to help all of its users combat CSAM will somehow be incorrectly or misleadingly spun to claim that Cloudflare is somehow helping sites to hide or enable CSAM. No good deed goes unpunished.

However if you want to support actual solutions -- not grandstanding nonsense -- to try to deal with CSAM, approaches like Cloudflare's are ones worth paying attention to. This is especially true if Graham/Blumenthal and others get their way. Under proposals like the one they're suggesting, it will become virtually impossible for smaller companies to take the actions necessary to meet the standards to avoid legal liability. And that means that (once again) the big internet companies will end up getting bigger. They all have access to NCMEC and the necessary tools to scan and submit CSAM. Smaller companies don't. Cloudflare offering up its scan tool for everyone helps level the playing field in a really important way.

Filed Under: child porn, csam, fuzzy hashing, infrastructure, tools
Companies: cloudflare

17 Comments

Facebook Asked To Change Terms Of Service To Protect Journalists

Journalism

from the a-chance-to-fix-things dept

Tue, Aug 7th 2018 10:43am — Mike Masnick

There are plenty of things to be concerned about regarding Facebook these days, and I'm sure we'll be discussing them for years to come, but the Knight First Amendment Center is asking Facebook to make a very important change as soon as possible: creating a safe harbor for journalists who are researching public interest stories on the platform. Specifically, the concern is that basic tools used for reporting likely violate Facebook's terms of service, and could lead to Facebook being able to go after reporters for CFAA violations for violating its terms. From the letter:

Digital journalism and research are crucial to the public’s understanding of Facebook’s platform and its influence on our society. Many of the most important stories written about Facebook and other social media platforms in recent months have relied on basic tools of digital investigation. For example, research published by an analyst with the Tow Center for Digital Journalism, and reported in The Washington Post, uncovered the true reach of the Russian disinformation campaign on Facebook. An investigation by Gizmodo showed how Facebook’s “People You May Know” feature problematically exploits “shadow” profile data in order to recommend friends to users. A story published by ProPublica revealed that Facebook’s self-service ad platform had enabled advertisers of rental housing to discriminate against tenants based on race, disability, gender, and other protected characteristics. And a story published by the New York Times exposed a vast trade in fake Twitter followers, some of which impersonated real users.

Facebook’s terms of service limit this kind of journalism and research because they ban tools that are often necessary to it—specifically, the automated collection of public information and the creation of temporary research accounts. Automated collection allows journalists and researchers to generate statistical insights into patterns, trends, and information flows on Facebook’s platform. Temporary research accounts allow journalists and researchers to assess how the platform responds to different profiles and prompts.

Journalists and researchers who use tools in violation of Facebook’s terms of service risk serious consequences. Their accounts may be suspended or disabled. They risk legal liability for breach of contract. The Department of Justice and Facebook have both at times interpreted the Computer Fraud and Abuse Act to prohibit violations of a website’s terms of service. We are unaware of any case in which Facebook has brought legal action against a journalist or researcher for a violation of its terms of service. In multiple instances, however, Facebook has instructed journalists or researchers to discontinue important investigative projects, claiming that the projects violate Facebook’s terms of service. As you undoubtedly appreciate, the mere possibility of legal action has a significant chilling effect. We have spoken to a number of journalists and researchers who have modified their investigations to avoid violating Facebook’s terms of service, even though doing so made their work less valuable to the public. In some cases, the fear of liability led them to abandon projects altogether.

This is a big deal, as succinctly described above. We've talked in the past about how Facebook has used the CFAA to sue useful services and how damaging that is. But the issues here have to do with actual reporters trying to better understand aspects of Facebook, for which there is tremendous and urgent public interest, as the letter lays out. Also, over at Gizmodo, Kash Hill has a story about how Facebook threatened them over their story investigating Facebook's "People You May Know" feature, showing that this is not just a theoretical concern:

In order to help conduct this investigation, we built a tool to keep track of the people Facebook thinks you know. Called the PYMK Inspector, it captures every recommendation made to a user for however long they want to run the tool. It’s how one of us discovered Facebook had linked us with an unknown relative. In January, after hiring a third party to do a security review of the tool, we released it publicly on Github for users who wanted to study their own People You May Know recommendations. Volunteers who downloaded the tool helped us explore whether you’ll show up in someone’s People You Know after you look at their profile. (Good news for Facebook stalkers: Our experiment found you won’t be recommended as a friend just based on looking at someone’s profile.)

Facebook wasn’t happy about the tool.

The day after we released it, a Facebook spokesperson reached out asking to chat about it, and then told us that the tool violated Facebook’s terms of service, because it asked users to give it their username and password so that it could sign in on their behalf. Facebook’s TOS states that, “You will not solicit login information or access an account belonging to someone else.” They said we would need to shut down the tool (which was impossible because it’s an open source tool) and delete any data we collected (which was also impossible because the information was stored on individual users’ computers; we weren’t collecting it centrally).

The proposal in the letter is that Facebook amend its terms of service to create a "safe harbor" for journalism. While Facebook recently agreed to open up lots of data to third party academics, it's important to note that journalists and academics are not the same thing.

The safe harbor we envision would permit journalists and researchers to conduct public-interest investigations while protecting the privacy of Facebook’s users and the integrity of Facebook’s platform. Specifically, it would provide that an individual does not violate Facebook’s terms of service by collecting publicly available data by automated means, or by creating and using temporary research accounts, as part of a news-gathering or research project, so long as the project meets certain conditions.

First, the purpose of the project must be to inform the general public about matters of public concern. Projects designed to inform the public about issues like echo chambers, misinformation, and discrimination would satisfy this condition. Projects designed to facilitate commercial data aggregation and targeted advertising would not.

Second, the project must protect Facebook’s users. Those who wish to take advantage of the safe harbor must take reasonable measures to protect user privacy. They must store data obtained from the platform securely. They must not use it for any purpose other than to inform the general public about matters of public concern. They must not sell it, license it, or transfer it to, for example, a data aggregator. And they must not disclose any information that would readily identify a user without the user’s consent, unless the public interest in disclosure would clearly outweigh the user’s interest in privacy.

There are a few more conditions in the proposal, including not interfering with the proper working of Facebook. The letter includes a draft amendment as well.

While there may be some hesitation among certain people with anything that seems to try to carve out different rules for a special class of people, I appreciate that the approach here is focused on carving out a safe harbor for journalism rather than journalists. That is, as currently structured, anyone could qualify for the safe harbors if they are engaged in acts of journalism, and it does not have any silly requirement about being attached to a well known media organization or anything like that. The entire setup seems quite reasonable, so now we'll see how Facebook responds.

Filed Under: cfaa, data collection, journalism, reporting, safe harbor, tools
Companies: facebook, knight 1st amendment center

Valve Decides To Get Out Of The Curation Business When It Comes To 'Offensive' Games

Culture

from the the-good-and-the-bad dept

Fri, Jun 8th 2018 3:32am — Timothy Geigner

As we've said in the past, Valve has always had a tricky line to walk with it's Steam platform, having to straddle the needs of both the gamers that use the service and the game developers that make it worthwhile. Frankly, it's walked this line fairly well for the most part. The platform, which was always popular, has exploded as the place to release a new game title online. As we noted way back in ye olde 2016, this popularity has also presented a problem for Steam: saturation. There are now simply so many games available on the platform that blindly wading into it and expecting to find new content you didn't know you wanted is a dicey proposition at best. More content is an undeniably good thing, but it would be silly to suggest that the deluge of new games released in the past few years hasn't also had a deleterious effect on the usability of the platform.

Our solution? It won't surprise you. We advocated that Steam empower the gamers that use it to act as curators. If done properly, this would allow an ecosystem of trusted advisers among gamers that share interests to tell them which titles they should be looking at. To that end, Steam subsequently employed a curators program within the platform that attempted to build exactly this ecosystem. To date, it's been mediocre at best.

But this isn't the only publicized problem Steam has had in recent days. In addition, the platform has been in the news for its wishy-washy but ultimately heavy-handed approach to games that have either mature sexual content or are offensive to large swaths of people. Combinations of so-called sex-games and games that make such topics as school shootings central to gameplay have been banned, or not, often to much critique from every side from gamers.

The concept of empowering its community to serve as its own filters and the no-win situation when it comes to offensive games has now collided, causing Valve to announce that it's getting out of the game content moderating business entirely.

In a blog post musing on the difficulty of deciding on a case-by-case basis what should and should not be allowed on Steam, Valve’s Erik Johnson explained that the company does, in fact, have a team of humans that looks at “every controversial title submitted to us,” and employees frequently disagree like Steam users do. “The harsh reality of this space, that lies at the root of our dilemma, is that there is absolutely no way we can navigate it without making some of our players really mad,” Johnson wrote.

“We’ve decided that the right approach is to allow everything onto the Steam Store, except for things that we decide are illegal, or straight up trolling,” said Johnson. “Taking this approach allows us to focus less on trying to police what should be on Steam, and more on building those tools to give people control over what kinds of content they see.”

There are two reactions that leap immediately to mind. First and foremost, this will be a good and useful experiment by Valve. Empowering customers and communities is almost always the right approach. Acting as a gatekeeper or the warden managing the walled garden is not an approach we believe in. Moreso, an approach by a company that puts its trust in the everyday customer is typically an inherently consumer-friendly one. The ideals behind this kind of move are a good one. Censorship sucks, choice is better.

On the other hand, the other immediate reaction has to be that Valve had damned well better have its user tools in order when it rolls this out en masse, because two things will happen otherwise. Most directly, gamers who are being inundated with games and content they find horrifying, offensive, or otherwise view negatively are going to be fully up in arms. It's easy to imagine families that game together, between parents and young children, losing their shit if the Steam homepage is suddenly full of games laden with overt sexual content or school shootings.

Even more so, if you thought the floodgates had been open when it came to the sheer volume of titles on Steam previously, this is going to introduce a potential tidal wave of new games onto the marketplace. If Valve isn't supremely prepared to empower users now with far better curating tools than it already has, the platform is likely going to take a severe dip in its usability as a place to discover games.

In other words: decent idea, assuming Valve has put a ton of thought into how this will impact its platform.

Filed Under: content moderation, steam, tools, users
Companies: valve

65 Comments

DailyDirt: Helping The Blind With Technology

Studies

from the urls-we-dig-up dept

Tue, Oct 6th 2015 5:00pm — Michael Ho

We've seen some early-stage advances for ways that might help restore sight to people with low vision (or no vision), but it will take many more years before the clinical trials and safety approvals are complete. And not everyone will want to undergo an eye surgery to try to regain some vision, either. Fortunately, robots and wearable technology continue to improve, and these gadgets could become very useful for the blind (and the rest of us, too). Maybe we won't just see telecommuting iPads for remote workers -- but also robot assistants for casual and everyday uses, as well.

Can robots become better than guide dogs at helping the blind? Given that some of the most advanced robots still have trouble navigating the world by themselves, robots helping the blind might not happen for a long time -- but progress will undoubtedly be welcome by both the sighted and the blind. [url]
A wearable device could help blind users by providing tactile or audio feedback based on sensors embedded in a ring. A "smart ring" could have cameras and haptic feedback to allow a user to point it at something and have it read text or recognize objects.... But maybe a smart watch app might be a better way to start this kind of assistive tool? [url]
Tactile Navigation Tools is a company founded by a visually-impaired doctor, making a sensor-equipped vest and "smart cane" to help the blind. The vest and cane can work together to help a user identify dangerous obstacles -- and could also be useful for fire-fighting or military personnel to navigate in low-visibility environments. [url]

After you've finished checking out those links, take a look at our Daily Deals for cool gadgets and other awesome stuff.

Filed Under: baxter, blind, haptics, low vision, robots, smartcane, smartring, tools, visually impaired, wearables
Companies: tactile navigation tools

7 Comments

DailyDirt: Technology For Lawyers

Legal Issues

from the urls-we-dig-up dept

Wed, Feb 18th 2015 5:00pm — Michael Ho

Every profession faces some disruption with technological improvements. Robots have slowly been taking over dangerous and labor-intensive jobs in manufacturing for decades, but advanced algorithms are starting to creep into careers that were previously safe from automation. Sure, translation software has provided some hilarious examples of how bad they are, but the first chess programs weren't so good, either. Lawyers could outlaw their robotic replacements, but they might have to act fast. Here are just a few links on technology getting into the field of law.

Science fiction writers are thinking about how artificial intelligence systems could take over legal processes, and a world of automated contracts actually isn't too far in the future. However, do we want to govern society (or parts of it) with infallible logic trees that don't necessarily adapt to changing conditions? [url]
Watson can play a mean game of Jeopardy, but can it answer your legal questions, too? IBM's cognitive computer can scan through bazillions of legal documents far faster than any team of junior attorneys, so maybe it can come up with a few good insights, too. [url]
The legal network of the European Union can be analyzed to point out how resilient a legal system is or to help legislators determine the possible effects of proposed changes. Or help lobbyists manipulate legislators... [url]

If you'd like to read more awesome and interesting stuff, check out this unrelated (but not entirely random!) Techdirt post via StumbleUpon.

Filed Under: ai, artificial intelligence, automation, blockchain, cognitive computing, science fiction, smart contracts, tools, watson
Companies: ibm

17 Comments

Tim Cushing's Subjectively Awesome Stuff Of No Particular Timeframe

Innovation

from the and-now-for-something-completely-different dept

Sat, Feb 15th 2014 12:00pm — Tim Cushing

Our weekly Awesome Stuff post generally deals with interesting and/or useful new products currently being crowdfunded. This week's post will be a little different (actually, entirely different) as I was tapped to fill in for Mike and, unlike many award winners claim while being handed a statuette, I actually had nothing prepared.

So, rather than put together a small list of poorly researched crowdfunded projects, I have decided to write what I know. This post will deal with the tools I use as a Techdirt contributor, as well as with some of my favorite places on the web. Will it be as awesome as most Awesome Stuff posts? Considering we have a pretty active set of commenters, this question is far from rhetorical.

We'll start (I say "we" as if you had a choice where we start) with my most-used tools.

Feedly

When Google decided to pull the plug on its RSS reader, it prompted a panic among those who had become reliant on a service they assumed would be around forever. (There's a lesson in there that has something to do with eggs and baskets…) I'm currently subscribed to 400+ feeds (nothing compared to Nate Hoffelder of The Digital Reader, who claims more than 1,900), so a capable replacement was essential, rather than just a nicety. After trying a few others out, I settled on Feedly, which offered a streamlined, text-only feed that roughly approximated the Google Reader experience.

I find Feedly's mobile version superior to the desktop version and use it almost exclusively. Sharing is pretty seamless and its inline browser can usually handle the job. I do wish Feedly would give you the option to open links in another browser because when it can't handle a site, it tends to lock up for a bit before gracelessly crashing to the "desktop." The other small issue I have seems to be cache-related. Clicking on one article might open up one from a "page" or two back, or from somewhere else on the current page. This is often more disconcerting than annoying and is liveable.

Evernote

Another essential tool. Anything I think I might write about or simply want to read later is sent to Evernote via Feedly. Unlike Feedly, the desktop version is actually superior to the mobile version, and it has an extension that allows you to "clip" nearly anything from any webpage.

It's a very streamlined note-taking tool that I rarely use to take actual notes. Mine's filled with links to articles and posts (nearing 4,000 at the moment) and that number swells by about 300 notes per month. I often look at the growing number of notes and swear that I'm going to clean it out, but I think we all know that's not going to happen. Falling storage costs have made the idea of "cleaning something out" (email, Evernote, etc.) completely absurd. Why clean it out when you've got space to store it all?

Google Docs

This became my go-to composing tool once Google released its offline version. Even without a Wi-fi signal, I can still write and simply sync it up when a signal is acquired. A lack of a signal is becoming increasingly uncommon, but it's still handy having that option, just in case.

That's the Holy Trinity (as it were) of my writing process. From Feedly to Evernote to Google Docs before ending up on Techdirt. The main problem with the feed reader is lag. Feeds trail other sources like Twitter, where information flows almost instantly. To make that change would mean organizing lists on Twitter to trim down its firehose tendencies. And building lists on Twitter is about as enjoyable as bathing multiple cats. Failing that, I could just do what Mike Masnick does and keep the entire internet open at all times.

Moving on from the essential, here are some other essentials not entirely related to the writing process.

BaconReader

If you're going to browse Reddit using your phone, do yourself a favor and pick up BaconReader. It's a much cleaner experience, it handles multiple accounts, and it has an inline browser that actually works without periodically dumping you out of the program. (I'm looking at you, Feedly.) The premium version is only $1.99 and completely worth it.

Reddit Enhancement Suite

If you're going to surf Reddit anywhere else, you have to grab RES (Reddit Enhancement Suite). Visible vote counts on comments and posts, an inline Imgur/Youtube viewer, Never Ending Reddit (a tumblr-esque neverending scroll of posts) and a ton of other enhancements makes vanilla, unenhanced Reddit about as appealing as navigating the US government's PACER system.

And if you're going to surf Reddit, you may as well subscribe to these subreddits:

r/mildlyinteresting
r/mildlyinfuriating
r/nottheonion
r/shutupandtakemymoney
r/CrappyDesign

RECAP

Speaking of which, if you have to navigate PACER, go get RECAP. It's an extension that will fires up when PACER is accessed and lets you know if the documents you seek have already been archived (at the Internet Archive), thus allowing you to browse those without paying the US government $0.10/page for documents created with taxpayer funds. If the documents you're attempting to view aren't already stashed at the IA, RECAP will stash them as soon as you access them, which means your $0.10/page hasn't been spent in vain. You may be getting screwed (by both the per page charge and PACER's hasn't-been-updated-since-1995 interface) but at least you're helping pry some PDFs out of The Man's hands.

The Great Suspender

Another useful extension targets those of us who keep too many browser tabs open. The Great Suspender (whose name riffs on the The Platters' classic hit) suspends tabs once the user-defined time limit has passed, freeing up more RAM. This is a Chrome essential because as Chrome users know, each tab is a new "instance" and multiple tabs leak RAM faster than Snowden leaks documents. (Timely!) Sites can also be "whitelisted" to prevent them from being "put to sleep."

Recommended Reading

Not everything I read is directly related to Techdirt's many wheelhouses. Hidden among the 400+ feeds are a few sites I read simply because I enjoy them. A few of these publish very infrequently, so subscribing to a feed makes more sense than simply bookmarking them.

actionbutton.net

The sometimes-home of the wordiest developer/game critic in the business, Tim Rogers. If you like reading game reviews that approach (or frequently exceed) 10,000 words and leave no tangent unexplored, this is the place for you. Rogers' review of Bulletstorm is still a stone classic.

Ministry of Truth

Pseudonymous blogger Unity uses well-researched facts and incisive wit to deflate the hysterical for-the-children claims made by politicians and talking heads. UK-oriented.

Make It So

Detailed deconstructions of the tech user interfaces as presented in Hollywood scifi films -- what works, what doesn't and how it could be improved. 51 posts alone dedicated to the tech on display in The Fifth Element.

That Guy's A Maniac

Named after a badly-delivered line from the original Resident Evil, TGAM tackles Resident Evil stuff, console RPGS and is resolutely Nintendo-focused. Runs on in-jokes, Pokemon and deftly-deployed self-effacement. Tackles the State of Gaming by criticizing game criticism, new console releases and the UK's underwhelming game retail outlets. Some mild swearing.

Finally, here are three blogs well worth reading, all coming from the criminal defense perspective.

Simple Justice

NY defense lawyer Scott Greenfield's blog, published early and often and featuring a comment section where fools are suffered not at all.

A Public Defender

The public defense perspective via the pseudonymous Gideon of Connecticut. Published not quite as frequently but still essential. Most recently quoted by Techdirt expressing his shocked disbelief that a grand jury could crank out 276 indictments in four hours. Also helms a fairly entertaining Twitter account.

Defending People

Houston-area defense lawyer Mark Bennett's blog. Again, no one keeps up with Greenfield, but Bennett has produced some remarkable work, including several posts deconstructing Prof. Mary Anne Franks' misguided legislative attempts to criminalize "revenge porn." Perhaps best known here at Techdirt for calling out investigative reporter Teri Buhl (Gideon from A Public Defender was also involved) for her claims that her tweets couldn't be quoted without permission (and receiving a lawsuit threat in return).

Hopefully, this post pointed you to at least one piece of software/web destination you weren't already aware of. If not, then the good news is that the regularly scheduled programming will be returning shortly. Feel free to make recommendations of your own in the comments so I can scrape them for a quick post if this situation should present itself again next week. (I'm only halfway kidding.)

Filed Under: awesome stuff, tim cushing, tools

18 Comments

DailyDirt: Parenting With Technology

Computers

from the urls-we-dig-up dept

Tue, Nov 5th 2013 5:00pm — Michael Ho

Recently, the LA school district has had some problems with its adoption of iPads into its classrooms. Kids will be kids, and some of them figured out how to use their school-issued iPads for unsanctioned activities like watching movies, playing games and wasting time on Facebook. Clearly, there's a bit of a learning curve for using technology as an educational tool as educators try to figure out how to lock down the devices and keep kids focused on using them solely for school-related work. Many parents with young kids want to encourage their children to use all kinds of technology, but the advice and recommendations from various experts can be a bit conflicting or almost useless. Here are just a few links on kids and screen time.

According to recent surveys, technology is not at the top of the list of concerns that parents have about their newborn to 8yo kids. Parents who have grown up using technology are flexible in their own adoption of technology as a parenting tool -- and not all of them use iPads as toddler pacifiers/babysitters. [url]
Some prestigious pre-schools are handing iPads to 18mo kids, but not every parent agrees that barely-potty-trained kids should be interacting with a touchscreen instead of other humans. Also, you don't want to touch that iPad after a kid who's not potty trained has used it. [url]
Plenty of folks are discussing the effects of screen time on kids. Apparently, some are making a distinction between "active" and "passive" screen time -- but it's still not clear if educational apps are actually educational. [url]

If you'd like to read more awesome and interesting stuff, check out this unrelated (but not entirely random!) Techdirt post via StumbleUpon.

Filed Under: babysitter, computers, education, ipads, kids, mental development, parenting, pre-school, school, screen time, tools, trend

6 Comments

DailyDirt: Useful Cooking Gadgets?

Too Much Free Time

from the urls-we-dig-up dept

Tue, Sep 17th 2013 5:00pm — Michael Ho

There is a constant stream of kitchen gadgetry aiming to save everyone time, effort and money. Just watch any TV channel when the paid programming kicks in, and you'll eventually see an infomercial for a revolutionary new cooking tool that you can't imagine ever living without. Sure, most of these inventions are probably made of flimsy plastic that will break after a few weeks of casual use, but you'll already have broken even on the time you saved, right? Here are just a few links to some neat kitchen tools that you might want to ~~try~~ throw in a drawer and forget about.

A submission for the James Dyson award proposes a kitchen device that uses a laser for automatically cutting up common fruits/vegetables and meat. Warning: Do not look at laser with remaining eye. [url]
There are a couple cheap robots that will help stir the pots on your stove while you're cooking, but they aren't very good at doing their job. What do you expect for a robot that costs under $10? [url]
If a robot can't help with the simple job of stirring liquids in the kitchen, maybe a passive design would be more reliable -- a specially shaped pot can create a small whirlpool when water is heated in it. If you're too lazy to stir a pot of boiling water (and whatever food you're cooking), you're probably not cooking that much in the first place.... [url]

If you'd like to read more awesome and interesting stuff, check out this unrelated (but not entirely random!) Techdirt post via StumbleUpon.

Filed Under: cooking, gadgets, inventions, james dyson award, kitchen, laser cutting, robot stirring, tools

4 Comments

RIAA Still Can't Figure Out How To Use Google's DMCA Tools, Blames Google

Failures

from the but-of-course dept

Wed, Feb 20th 2013 7:42am — Mike Masnick

This will hardly comes as a surprise, but the RIAA and other "anti-piracy groups" are still complaining that Google "isn't doing enough" to prop up their old and obsolete business models. The latest complaint? That Google's system only accepts a mere 10,000 DMCA takedowns per day and somehow that's just not enough. It turns out that this isn't actually true, but we'll get to that in a moment. Much of the article focuses on Dutch extremist anti-piracy group BREIN saying that the limit needs to go away. But there is this bizarre statement from the RIAA as well:

“Google has the resources to allow take downs that would more meaningfully address the piracy problem it recognizes, given that it likely indexes hundreds of millions of links per day. Yet this limitation remains despite requests to remove it,” RIAA noted.

In addition to unthrottling the URL limits, RIAA also says it wants to lift the cap on the number of queries they can execute per day to find infringing content.

“Google places artificial limits on the number of queries that can be made by a copyright owner to identify infringements.”

This seems wrong on a variety of levels. As we noted last year when the RIAA raised some of these complaints, part of the problem appears to be that the RIAA doesn't understand how Google's tools work. There are some technical limitations in terms of how many URLs a "trusted partner" using automated means can submit at once, but no actual limit on the number of URLs that can be submitted total. There's a practical reason for the setup: in case an automated system goes haywire, Google wants to be able to catch it. But that's it. It does not limit the searches or the ability to submit DMCAs. We asked Google for specifics, and they confirmed:

While there is no limit on the number of DMCA notices that a copyright owner or reporting organization may send us, we put safety limits on the number of automated submissions that partners can make at one time using our tools in order to protect our systems from technical problems. We increase these limits for partners who have demonstrated a consistent track record of submission quality and volume.

On top of that, there's the issue that takedown notices go through a review process before the takedowns happen, to hopefully weed out abuse. For the RIAA to compare handling of takedown messages to the automated process of searching is really bizarre. It's basically them saying they want to be able to automatically takedown any content with no review whatsoever. That's a massive problem for a variety of obvious reasons. Indexing the web for search is an automated process. Taking sites down requires at least some level of review, even if only cursory. Apparently, the RIAA not only misunderstands the tools available, but also the DMCA process itself.

Filed Under: copyright, dmca, limits, takedowns, tools
Companies: brein, google, riaa

105 Comments

Older Stories >>

Follow Techdirt

Essential Reading

The Techdirt Greenhouse

Read the latest posts:

read all »

Techdirt Deals

Report this ad | Hide Techdirt ads

Techdirt Insider Discord

The latest chatter on the Techdirt Insider Discord channel...

Older Stuff

Thursday
13:33	Former Employees Say Mossad Members Dropped By NSO Officers To Run Off-The-Books Phone Hacks (2)
12:01	No, Creating An NFT Of The Video Of A Horrific Shooting Will Not Get It Removed From The Internet (18)
10:49	San Francisco Cops Are Running Rape Victims' DNA Through Criminal Databases Because What Even The Fuck (18)
10:44	Daily Deal: The Complete 2022 Java Coder Bundle (0)
09:31	As Expected, Trump's Social Network Is Rapidly Banning Users It Doesn't Like, Without Telling Them Why (44)
06:30	Comcast Continues To Bleed Olympics Viewers After Years Of Bumbling (19)
Wednesday
20:42	Apple Finally Defeats Dumb Diverse Emoji Lawsuit One Year Later (6)
15:39	Clearview Pitch Deck Says It's Aiming For A 100 Billion Image Database, Restarting Sales To The Private Sector (10)
13:41	Peloton Outage Prevents Customers From Using $2,500 Exercise Bikes (16)
12:09	The GOP Knows That The Dem's Antitrust Efforts Have A Content Moderation Trojan Horse; Why Don't The Dems? (16)
10:51	Hertz Ordered To Tell Court How Many Thousands Of Renters It Falsely Accuses Of Theft Every Year (24)
09:21	Even As Trump Relies On Section 230 For Truth Social, He's Claiming In Lawsuits That It's Unconstitutional (34)
06:16	Medical, Home Alarm Industries Warn Of Major Outages As AT&T Shuts Down 3G Network (25)
Tuesday
20:37	Video Game History Foundation: Nintendo Actions 'Actively Destructive To Video Game History' (29)
15:35	Massachusetts Court Says No Expectation Of Privacy In Social Media Posts Unwittingly Shared With An Undercover Cop (17)
13:30	Techdirt Podcast Episode 312: Regulating The Internet (2)
12:03	US Copyright Office Gets It Right (Again): AI-Generated Works Do Not Get A Copyright Monopoly (60)
10:42	LA Sheriff Threatens To 'Subject' City Council To 'Defamation Law' If They Won't Stop Calling His Deputies 'Gang Members' (20)
10:37	Daily Deal: codeSpark Academy Sibling Bundle (0)
09:25	Trump's Truth Social Bakes Section 230 Directly Into Its Terms, So Apparently Trump Now Likes Section 230 (128)
06:22	15 Years Late, The FCC Cracks Down On Broadband Apartment Monopolies (31)
Sunday
12:05	Funniest/Most Insightful Comments Of The Week At Techdirt (11)
Saturday
12:00	This Week In Techdirt History: February 13th - 19th (1)
Friday
19:39	Letter From High-Ranking FBI Lawyer Tells Prosecutors How To Avoid Court Scrutiny Of Firearms Analysis Junk Science (25)
15:52	Nintendo Is Beginning To Look Like The Disney Of The Video Game Industry (44)
13:49	Seattle Public Radio Station Manages To Partially Brick Area Mazdas Using Nothing More Than Some Image Files (44)
12:13	Thankfully, Jay Inslee's Unconstitutional Bill To Criminalize Political Speech Dies In The Washington Senate (8)
10:52	How Our Convoluted Copyright Regime Explains Why Spotify Chose Joe Rogan Over Neil Young (136)
10:47	Daily Deal: The Complete Blocs Website Builder Bundle (0)
09:33	Arizona Prosecutor Who Brought Bogus Gang Charges Against Protesters Files Ridiculous Defamation Suit Against Her Boss (12)

Content Moderation Case Studies: Using AI To Detect Problematic Edits On Wikipedia (2015)

from the ai-to-the-rescue? dept

Cloudflare Makes It Easier For All Its Users To Help Stop Child Porn Distribution

from the this-is-good dept

Facebook Asked To Change Terms Of Service To Protect Journalists

from the a-chance-to-fix-things dept

Valve Decides To Get Out Of The Curation Business When It Comes To 'Offensive' Games

from the the-good-and-the-bad dept

DailyDirt: Helping The Blind With Technology

from the urls-we-dig-up dept

DailyDirt: Technology For Lawyers

from the urls-we-dig-up dept

Tim Cushing's Subjectively Awesome Stuff Of No Particular Timeframe

from the and-now-for-something-completely-different dept

DailyDirt: Parenting With Technology

from the urls-we-dig-up dept

DailyDirt: Useful Cooking Gadgets?

from the urls-we-dig-up dept

RIAA Still Can't Figure Out How To Use Google's DMCA Tools, Blames Google

from the but-of-course dept

The Techdirt Greenhouse

Thursday

Wednesday

Tuesday

Sunday

Saturday

Friday

More

Tools & Services

Company

Contact

More

from the ai-to-the-rescue? dept

from the this-is-good dept

from the a-chance-to-fix-things dept

from the the-good-and-the-bad dept

from the urls-we-dig-up dept

from the urls-we-dig-up dept

from the and-now-for-something-completely-different dept

from the urls-we-dig-up dept

from the urls-we-dig-up dept

from the but-of-course dept

Techdirt Daily Newsletter

The Techdirt Greenhouse

Tools & Services

Company

Contact

More