How Newspapers Can Make Their Data More Useful: Uncovering The Semantic Newspaper

from the stuff-to-think-about dept

Earlier this week, when we wrote about yet another weak strategy that newspaper industry-types were discussing as a plan to "fight back" against the internet, a few people complained in the comments that we only seem to focus on the negative side of what newspapers do, and never highlight the positives or come up with any suggestions of our own. Part of this may be because so few newspapers seem to be doing much right. However, it's also not entirely true. In the past, we've discussed ways that newspapers can better customize and also why newspapers should recognize that their role has shifted from being just an information deliverer to being an enabling party that helps its own readers spread the news -- something sites like Digg have shown many people want to do.

Techmeme has pointed us to another interesting idea, this time suggested by Adrian Holovaty, who has worked for many years on the digital side of various newspapers. Rather than coming up with vague statements about blogs, tags or whatever the latest buzzword is, Holovaty points out that newspapers need to fundamentally shift how they think about the data they create. That is, they need to recognize that it's data they produce. Rather than focus on each "story" as a blackbox, they should be willing to break it up into chunks of useful metadata. That is, each story is likely to have certain consistent attributes, and making sure the newspaper database understands those attributes allows the newspaper to become a data source, rather than just a collection of news articles. This doesn't mean getting rid of the story itself, but at least making sure the database recognizes the different data attributes.
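To make the "story plus attributes" idea a bit more concrete, here is a minimal sketch of what such a record might look like. Everything in it is an assumption for illustration: the `Story` class, its field names, the `stories_matching` helper and the sample entries are all made up, and none of it comes from Holovaty's piece or from any real newspaper system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Story:
    """A filed story plus the structured attributes a database could query."""
    headline: str
    body: str            # the journalist's prose, kept intact
    story_type: str      # hypothetical category, e.g. "crime" or "election"
    event_date: date
    location: str
    attributes: dict = field(default_factory=dict)  # type-specific facts


def stories_matching(stories, story_type, location):
    """Return stories of one type at one location, oldest first."""
    hits = [s for s in stories
            if s.story_type == story_type and s.location == location]
    return sorted(hits, key=lambda s: s.event_date)


# Made-up sample archive, purely illustrative.
archive = [
    Story("Burglary on Elm St.", "Full text...", "crime",
          date(2006, 9, 5), "Chicago", {"offense": "burglary"}),
    Story("Mayor wins second term", "Full text...", "election",
          date(2006, 9, 4), "Chicago", {"office": "mayor"}),
    Story("Car theft downtown", "Full text...", "crime",
          date(2006, 9, 1), "Chicago", {"offense": "auto theft"}),
]

crimes = stories_matching(archive, "crime", "Chicago")
for story in crimes:
    print(story.event_date, story.headline, story.attributes)
```

The point is that the prose in `body` is untouched; the extra fields simply let the archive answer questions that a pile of articles cannot.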

This is a very powerful idea that may bring to mind Tim Berners-Lee's idea of the semantic web, where there's a lot more metadata for computers to understand. Of course, the big stumbling block for the semantic web over the years has often been that it involves setting up too rigid a structure, eliminating much of what made the web so useful in the first place. It forces people to make choices and to assign specific labels or categories when they might just want to put the full content out there. In fact, Holovaty acknowledges some of this when he complains that too many in the newspaper industry just see the content management system as the fastest possible means of delivering their story. They just want to be able to dump the story in and have it published. However, as Holovaty has also seen, some are beginning to see the light -- and with the consistency of certain types of news stories, there's really very little need for the "flexibility" that often holds back attempts at the semantic web. Just last month, for example, we pointed out that Thomson Financial is trying to automate the process of writing certain stories, such as those on earnings releases. That takes the same concept from a different angle, easing the labor side, but at the same time inherently recognizing the metadata involved.
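As a rough illustration of why consistent story types lend themselves to this, here is a small sketch of producing an earnings sentence from structured fields. It is not a description of how Thomson Financial's system works; the `earnings_sentence` function, the field names, the company and the figures are all hypothetical.

```python
def earnings_sentence(release):
    """Build one sentence of an earnings story from structured fields."""
    eps = release["eps"]
    prev = release["eps_year_ago"]
    if eps > prev:
        direction = "up"
    elif eps < prev:
        direction = "down"
    else:
        direction = "unchanged"
    return (f"{release['company']} reported quarterly earnings of "
            f"${eps:.2f} per share, {direction} from ${prev:.2f} a year earlier.")


# Made-up figures for a made-up company.
sample = {"company": "Example Corp", "eps": 1.25, "eps_year_ago": 1.10}
print(earnings_sentence(sample))
```

The same fields that feed the sentence are exactly the metadata a database would want to keep.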

While some journalists may protest this attempt to "chunkify" their stories, there's nothing in this process that needs to take anything away from their traditional journalism. The story is still filed and is still important. What the additional data (or the classification/categorization of that data) does is open up a goldmine of additional information and services a newspaper can provide. Rather than just focusing on the qualitative angle, the data is exposed and can be used in a variety of ways -- many of which may not be obvious at first, but will come to light later. Holovaty uses an example of being able to break up a ton of useful weather forecast data, and easily combine it with a system for keeping track of little league games (where weather info is important). That's just a small example, but making news data, rather than stories, useful has plenty of other benefits that could revitalize the news business. As an example of how such things could be useful, I was going to point to the ChicagoCrime website that maps where crimes have occurred in Chicago -- and in looking it up, only now realized that it was actually created by Holovaty as well (no wonder). So the good news is that there are some really good ideas out there for improving the value of traditional news organizations. It's just a matter of getting more in the industry to embrace them.
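The weather and little league example can be sketched in a few lines, assuming both kinds of data are stored with a shared date and town. The data layout, the `games_with_weather` function, the threshold and all the sample values below are invented for illustration, not taken from Holovaty's example.

```python
from datetime import date

# Hypothetical forecast data, keyed by (date, town).
forecasts = {
    (date(2006, 9, 9), "Springfield"): {"rain_chance": 0.8, "high_f": 71},
    (date(2006, 9, 9), "Shelbyville"): {"rain_chance": 0.1, "high_f": 75},
}

# Hypothetical little league schedule.
games = [
    {"teams": "Cubs vs. Tigers", "date": date(2006, 9, 9), "town": "Springfield"},
    {"teams": "Hawks vs. Bears", "date": date(2006, 9, 9), "town": "Shelbyville"},
    {"teams": "Owls vs. Rams", "date": date(2006, 9, 10), "town": "Springfield"},
]


def games_with_weather(games, forecasts, rain_threshold=0.5):
    """Attach the matching forecast to each game and flag likely rainouts."""
    report = []
    for game in games:
        forecast = forecasts.get((game["date"], game["town"]))  # None if no forecast
        at_risk = forecast is not None and forecast["rain_chance"] >= rain_threshold
        report.append({**game, "forecast": forecast, "rain_risk": at_risk})
    return report


report = games_with_weather(games, forecasts)
for row in report:
    print(row["teams"], row["town"], "rain risk" if row["rain_risk"] else "ok")
```

Neither data set was collected with the other in mind, which is the appeal: once both are structured, combining them is a lookup rather than a new reporting project.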

Reader Comments


  1. dorpus, 7 Sep 2006 @ 3:29am

    Freak Accident Categories?

    How often do newsrooms get shot up by psychos?

    How often do tow trucks get towed?

    How often do novice skydivers go into a panic, entangle their instructor's parachute, and kill both of them?

    How often do fire trucks get stolen?

    How often do lifeguards drown?

    How often do serial killer nurses kill doctors?

    How often do Arab terrorists accidentally kill other terrorists while building bombs or practicing firearms?

  2. "Mike", 7 Sep 2006 @ 4:38am

    Freak

    I hate you dorpus.

  3. Clifford VanMeter, 7 Sep 2006 @ 8:18am

    News

    I personally like the idea of the Semantic approach, but there is an underlying obstacle to dealing with all that meta-data. Money.

    How does the newspaper monetize the meta-data, and how do the journalists who collect this information get paid? It's easy enough to pay 2¢ per word, but journalists already want to be paid again for web publication of articles they created for print publications. How about photographers? Editors?

    Who gets paid when two or more journalists' "chunks" get mashed up into the result for a single custom query? These are the real world problems that must be dealt with BEFORE we can even consider moving in that direction.

    Maybe the real question should be this -- If every journalist can create and distribute their own content directly to a mass audience, what real purpose do newspapers serve anymore? Aren't they really little more than content aggregators? A kind of manual RSS feed?

  4. Anonymous Coward, 7 Sep 2006 @ 9:30am

    semantic spam

    The semantic web is a dream that will only work within very trusted sites (maybe a newspaper site would qualify). We've already had a taste of the semantic web when people used to put META tags in their HTML headers to list the subject matter of their page. False information was stuffed in there to show up on certain search engines.
    The semantic web seems a wonderful thing - for spam-style search engine hijackers.

  5. Phil Wolff, 7 Sep 2006 @ 9:34am

    Re: News

    Providing structured data doesn't address the revenue issue, at least not directly.

    I'll agree that it would be a value added service, and that you would readily synthesize new products to sell. The act of publishing structured data won't solve the problem of a business (news) and an occupation (journalism) with falling barriers to entry and a long line of new entrants.

  6. Anonymous Coward, 7 Sep 2006 @ 11:03am

    Re: Re: News

    If the structured data were used correctly, I think it could generate revenue in a similar way to how ESPN Insider works, as well as possibly attracting online advertisers.

    For example, it's local election night and a reporter, in addition to writing the story about Suzie Q winning the election and hugging her nephew and crying at city hall upon hearing the results, actually inputs those results into a Web GUI back in the office that goes into a database. That database can then be maintained over every election cycle and made available online to readers for a premium price (or who subscribe to the print version). Same could be done for campaign contributions, local crime data, school testing scores, etc.

    The problem: convincing crotchety editors and crotchety reporters, who would rather tell stories about the old Linotype machines, that this could actually be a good thing that serves their readers well.

  7. Anonymous Coward, 7 Sep 2006 @ 12:09pm

    To Mike

    This is about your reply on the other newspaper post in which I was complaining. You stated that the purpose of a newspaper is to draw people to the classifieds.

    Do you seriously think that's what newspapers are for? I certainly hope not. People pay the newspaper for space to advertise things they're selling and for various other purposes. Classifieds are advertisements from individual people. The keyword being advertisements. Is the purpose of televised news for people to watch the commercials? The purpose is the news; the advertisements are the financial means to allow for distribution of the news.

    I am glad to finally see suggestions from techdirt though.

  8. Joe Wroblewski, 7 Sep 2006 @ 6:37pm

    Only the innovators will survive

    I find myself reading the newspaper less and less all the time. I have often wondered if publishers don't understand the basic shift that is taking place or they don't know what to do about it.

    I think Adrian's idea is exactly the type of innovation the newspaper industry needs. It will surely provide a lot of value to readers and it will do it by leveraging new and evolving technologies.

    There just isn't a need for so many newspapers anymore, so only the innovators will survive.

  9. It's me again, 9 Oct 2006 @ 9:05am

    Um, so who's going to pay for REAL news?

    I for one, would like to get my news from a REAL reporter, not EveryBlogger in his skivvies typing his opinions into a laptop. And if you want REAL reporters, someone's going to have to pay them.

    Just recently, I read this and thought it was great: "If newspapers were invented tomorrow, people would think they were the latest, greatest, newest thing! Portable news! Take it with you where you want! Read it when you want!"

    It's a loose paraphrase, and if someone knows who wrote that, I'd love to know.

  10. Anonymous Coward, 30 Nov 2008 @ 12:34pm

    "That is, they need to recognize that it's data they produce. Rather than focus on each "story" as a blackbox, they should be willing to break it up into chunks of useful metadata..."

    So who would benefit the most from this ... oh yes, the bloggers; they would then not have to recycle 3rd or 4th hand interpretations of newspaper stories, they could go directly to the news source and build their contrived and predictable opinions directly on the "facts".

    If it ever happens you will soon find facts getting in the way of a good blog.
