The Warehousing And Delivery Of Digital Goods? Nearly Free, Pretty Easy, Mostly Trivial
from the relax,-be-happy dept
One of the most important moments in the rise of a radical idea is when the fightback begins, because it signals an acceptance by the establishment that the challenger is a real threat. That moment has certainly arrived for open access, most obviously through moves like the Research Works Act, which would have cut off open access to research funded by the US government. That attack soon stalled, but the sniping at open access and its underlying model of free distribution has continued.
Here, for example, is an interesting post by Kent Anderson, who is CEO/Publisher of the Journal of Bone & Joint Surgery, with the title "Not Free, Not Easy, Not Trivial -- The Warehousing and Delivery of Digital Goods." The starting point is as follows:
There is a persistent conceit stemming from the IT arrogance we continue to see around us, but it's one that most IT professionals are finding real problems with -- the notion that storing and distributing digital goods is a trivial, simple matter, adds nothing to their cost, and can be effectively done by amateurs.
As a result, he thinks, there is "a consistent theme among dew-eyed idealists about publishing -- that digital goods are infinitely reproducible at no marginal cost, and therefore can be priced at the rock-bottom price of 'free'."
Well, they're certainly "infinitely reproducible", but nobody seriously claims that can be done at zero marginal cost. It is, however, extremely small. Indeed, in another post, Anderson himself provides a rough estimate for one part of the cost -- the online delivery of a 1Mbyte file: $0.001. It's true that delivering millions of copies would represent a more significant sum, but that ignores things like BitTorrent, which effectively shares the cost of distributing digital goods among many downloaders. Using such P2P delivery systems, the cost to the publisher really is vanishingly small.
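As a rough back-of-the-envelope check on those figures, the sketch below multiplies Anderson's estimate of $0.001 per 1 Mbyte delivered by a number of downloads. The download counts and the share of traffic carried by peers are made-up illustrative assumptions, not numbers from either post.

```python
# Back-of-the-envelope delivery cost, using Anderson's rough estimate
# of $0.001 per 1 Mbyte delivered. Everything else is an assumption.

COST_PER_MB_USD = 0.001  # Anderson's estimate, as quoted in the post


def delivery_cost(file_mb, downloads, peer_fraction=0.0):
    """Cost to the publisher of serving `downloads` copies of a file.

    peer_fraction is the hypothetical share of traffic carried by
    other downloaders (for example via BitTorrent) instead of the
    publisher's own servers.
    """
    if not 0.0 <= peer_fraction <= 1.0:
        raise ValueError("peer_fraction must be between 0 and 1")
    publisher_mb = file_mb * downloads * (1.0 - peer_fraction)
    return publisher_mb * COST_PER_MB_USD


if __name__ == "__main__":
    for n in (1, 1_000, 1_000_000):
        print(f"{n:>9,} downloads of a 1 Mbyte file: ${delivery_cost(1, n):,.2f}")
    # Hypothetical case: peers carry 95% of the traffic.
    cost = delivery_cost(1, 1_000_000, peer_fraction=0.95)
    print(f"1,000,000 downloads, 95% via peers: ${cost:,.2f}")
```

The point of the peer_fraction parameter is only to show the direction of the effect: the more of the traffic the downloaders carry themselves, the closer the publisher's own bill gets to zero.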
But Anderson thinks there are other issues:
Even beyond just their power requirements, digital goods have particular traits that make them difficult to store effectively, challenging to distribute well, and much more effective when handled by paid professionals.
Why might that be?
First, digital goods are not intangible. They occupy physical space, be that on a hard drive, on flash memory, or during transmission. A full Kindle weighs an attogram more when fully loaded with digital goods, and there are hundreds of thousands of Kindles in the field.
According to the source referenced, "the difference between an empty e-reader and a full one is just one attogram" -- a million-trillionth of a gram. Even with "hundreds of thousands of Kindles in the field," that extra fraction of a gram spread around the world is hardly going to be a major problem. But leaving aside the issue of weight, it's certainly true that this data takes up space on storage media:
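To put that in numbers, here is a minimal sketch of the arithmetic. The only figure taken from the source is one attogram per full e-reader; the device count of 500,000 is an assumed stand-in for "hundreds of thousands".

```python
# How much extra mass do full e-readers add up to?
ATTOGRAM_IN_GRAMS = 1e-18  # SI prefix "atto" means 10^-18


def extra_mass_grams(devices, attograms_per_device=1.0):
    """Total extra mass, in grams, across `devices` fully loaded readers."""
    return devices * attograms_per_device * ATTOGRAM_IN_GRAMS


if __name__ == "__main__":
    # 500,000 is an illustrative guess, not a reported figure.
    total = extra_mass_grams(500_000)
    print(f"Extra mass across 500,000 full devices: {total:.1e} g")
```

Under that assumption the worldwide total is still less than a trillionth of a gram.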
The proliferation of digital goods -- photos, music, Web pages, blog posts, social media shares, tweets, ratings, movies and videos, and so much more -- puts incredible and growing pressure on metadata management techniques and layers. This means building more and larger warehouses, which adds to both ongoing costs for current users and migration costs as older warehouses are outstripped by new demands. Megabytes become gigabytes become terrabytes become zettabytes and beyond. Where will they all fit?
One answer is "in your pocket": according to Amazon, a 1 terabyte portable hard disc currently costs around $100. Yes, a zettabyte might be a little more pricey, but judging by this recent large-scale, real-life project, we're still in the sub-petabyte era, so storing all this data isn't really going to require a warehouse -- a few rack systems should suffice.
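For a sense of scale, the sketch below extrapolates the roughly $100 per terabyte consumer price to larger units. It is a deliberately naive estimate: it assumes decimal units and counts raw disc price only, ignoring redundancy, power, enclosures and staff.

```python
# Naive raw-disc cost at roughly $100 per terabyte (the consumer
# price mentioned in the post). Decimal (SI) units are assumed.
USD_PER_TB = 100

TB_PER_UNIT = {
    "terabyte": 1,
    "petabyte": 10**3,
    "exabyte": 10**6,
    "zettabyte": 10**9,
}


def raw_disk_cost(unit, count=1):
    """Raw disc cost in US dollars for `count` units of storage."""
    return count * TB_PER_UNIT[unit] * USD_PER_TB


if __name__ == "__main__":
    for unit in TB_PER_UNIT:
        print(f"1 {unit:<9} ~ ${raw_disk_cost(unit):,}")
```

On these assumptions a petabyte comes to about $100,000 of raw disc, which is rack-scale money, while a zettabyte would indeed be "a little more pricey" at about $100 billion.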
But independently of where you are going to put it, another question is: Where is all that important metadata going to come from? As Anderson rightly says:
Creating, updating, and tracking the metadata is a chore for owners of digital goods. Poor metadata -- like a photo name off your digital camera of DX0023 -- can make the photo hard to find or use. Better metadata -- usually applied by humans, like "Rose in bloom, August 2006" for that elusive photo -- makes more sense.
That's mostly true, most of the time. But in another paragraph, quoting from a description of the Library of Congress's effort to archive all Twitter messages since 2006, Anderson also shows us why metadata is not always an issue:
Each tweet is a JSON file, containing an immense amount of metadata in addition to the contents of the tweet itself: date and time, number of followers, account creation date, geodata, and so on.
That is, the data comes with "an immense amount of metadata" automatically, because of the way Twitter (wisely) designed its system. And even for datasets that require metadata to be applied by hand, crowdsourcing is proving an efficient and low-cost way of providing it.
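To illustrate what "metadata comes for free" means in practice, here is a minimal sketch that reads a tweet-like JSON record and pulls out the kinds of fields the quote lists. The sample record and its field names are illustrative assumptions, not a statement of Twitter's actual schema.

```python
import json

# A made-up, tweet-like record. Field names are hypothetical and only
# mirror the kinds of metadata mentioned in the quoted description.
SAMPLE_TWEET = """
{
  "text": "Hello, world",
  "created_at": "2012-07-01T12:00:00Z",
  "user": {
    "followers_count": 42,
    "created_at": "2009-03-15T08:30:00Z"
  },
  "geo": null
}
"""


def extract_metadata(raw_json):
    """Return the metadata that arrives alongside the message text."""
    record = json.loads(raw_json)
    user = record.get("user", {})
    return {
        "posted_at": record.get("created_at"),
        "followers": user.get("followers_count"),
        "account_created": user.get("created_at"),
        "geo": record.get("geo"),
    }


if __name__ == "__main__":
    for key, value in extract_metadata(SAMPLE_TWEET).items():
        print(f"{key}: {value}")
```

Nobody had to type any of those fields in by hand; they were recorded by the system at the moment the message was created.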
Other issues raised by Anderson are that digital goods need to be backed up, and secure, but that's hardly rocket science: open source solutions that cost nothing to acquire (but not to run, obviously) have been around for years. His main concern, however, seems to be about the physical infrastructure required:
Digital warehouses are more expensive to build. Site planning is a major undertaking. A physical warehouse is something a small business owner can buy and construct with relative ease. They aren’t expensive (a concrete pad, a sheet metal structure, some crude HVAC, and a security system is usually all it takes). A digital warehouse is expensive to construct -- servers, site planning, redundant power requirements, high-grade HVAC, earthquake-proofing, and so forth. This means that digital goods have to work off a much higher fixed warehouse cost.
It seems unlikely that it is cheaper to build a typical physical warehouse than to install a typical LAMP stack on rented commodity servers in a few different geographical locations (or in the cloud) to provide resiliency and backups. This exposes the central problem with Anderson's argument about the amount of data that must be handled, and the necessity for huge and expensive infrastructure to handle it: he seems to be lumping together very different kinds of digital data.
In the realm of digital goods, we’re reaching a point at which we’re facing trade-offs. Already, some data sets are propagating at a rate that exceeds Moore’s Law, which may still accurately predict our ability to expand capacity. And these are purposeful data sets. As data becomes an effect of just living -- traffic monitoring software, GPS outputs, tweets, reviews, star ratings, emails, blog posts, song recommendations, text messages -- we as a collective will easily outstrip Moore’s Law with our data. If there’s no place to put it, and nobody to manage it, does it exist?
Yes, genomic data is spewing out of DNA sequencers at an incredible rate; yes, the Large Hadron Collider produces almost unimaginable quantities of data. But these are exceptions: nobody is talking about letting the general public access this stuff in the same way that they can download media files, say. As I've pointed out in a previous post, we are fast approaching the point where we could store every Spotify track on a single hard disc, and the same will soon be true for every film, book -- and academic article.
For the latter, despite Anderson's title, it really is the case that storing and sharing them is nearly free, pretty easy and mostly trivial, which is why open access makes sense and is constantly gaining ground. The sooner traditional publishers stop fearing and fighting this trend, the sooner they can embrace and enjoy the possibilities this new abundance opens up for them.
Follow me @glynmoody on Twitter or identi.ca, and on Google+
Filed Under: data, kent anderson, open access, publishing
Companies: amazon
Reader Comments
archive.org
[ link to this | view in chronology ]
Re: archive.org
From a great fan, thank you. I have a few terabytes available if needed.
[ link to this | view in chronology ]
Re: archive.org
[ link to this | view in chronology ]
An attogram more?
But it gives me an idea: I'm going to request my ISP bill my data usage by weight. Even at a billion dollars a gram it will be practically free!
[ link to this | view in chronology ]
Re: An attogram more?
"First, digital goods are not intangible. They occupy physical space, be that on a hard drive, on flash memory, or during transmission."
Fine, I'll take my buddy's external hard drive with 3,000 CDs and 800 movies on it. We'll load physical copies of those movies and CDs into boxes for Mr. Anderson to carry (all at the same time - assuming he'd even be able to lift them all at once). Then we'll go for a five-mile walk. We'll see who gets there first.
[ link to this | view in chronology ]
Result: Guy writes that Kindles are going to weigh more because of all the digital goods inside them.
Maybe an editor at the Society for Scholarly Publishing misheard "open-source software" as "orthopedic surgery", and things just went downhill from there? I can't think of any other reason for sending in an amateur instead of a trained professional, anyway...
[ link to this | view in chronology ]
Tangibility and costs
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Re:
"You didn't argue against things the author didn't say!"
[ link to this | view in chronology ]
Econ 101 FAIL
Congrats on not understanding the difference between Fixed Cost and Marginal Cost and why Fixed Cost (which is what you brought up) is irrelevant here.
[ link to this | view in chronology ]
Re: Econ 101 FAIL
This particular commenter has basically been professionally "not understanding" the difference between fixed and marginal costs in our comments for over five years now. It's impressive. It's been explained to him probably over 100 times and he still feigns ignorance. At this point, it's pretty clear he's not here for honest debate.
[ link to this | view in chronology ]
Re: Re: Econ 101 FAIL
I guess we can surmise that he is one of the second category.
[ link to this | view in chronology ]
Re: Re: Econ 101 FAIL
Actually, what is impressive is that someone with your level of education seems unwilling to accept that the rules are somewhat different in a business where most of the costs are fixed costs, or pre-production costs. Paying attention ONLY to marginal costs will put you in the poor house.
It's impressive that you seem unwilling to acknowledge how the whole process goes, and that you would rather narrowly focus on one area. It's entirely misleading, and you know it. Why keep it up?
[ link to this | view in chronology ]
Re: Re: Re: Econ 101 FAIL
As for Mike, well... again, since the article isn't about what you want it to be about, there's no need for him to acknowledge any of that. The topic is something else. Why you keep trying to bring up the rest of the process is entirely beyond me. It's irrelevant. Why keep it up? Can you not acknowledge that the warehousing and delivery of digital goods is cheap and trivial (to an extent)? Answer that. Focus only on that. Leave the rest for a related article. Otherwise, realize that the type of behavior you're exhibiting is why people mock you and don't take you seriously.
[ link to this | view in chronology ]
Re: Re: Re: Econ 101 FAIL
[ link to this | view in chronology ]
Re: Re: Econ 101 FAIL
How can you not believe in something until you understand it?
He is the canary in our gold mine.
[ link to this | view in chronology ]
Re:
it was about warehousing and distribution. which are marginal costs so far as the product the consumer buys is concerned (even if setting them up is a fixed cost). MAKING the thing is a fixed cost and, as mentioned every other bloody time this comes up, is IRRELEVANT to the price per item to the consumer. it is the amount your overall PROFITS per item have to overcome for the product line/business to have been worth the effort. it's a different layer of calculation. there's per item profit (sale price per item - marginal cost per item) and then there's your product's profit ( (per item profit * items sold) - fixed cost of product ) then there's your business's profit (though i'm not sure if the last two are usually separated) which is the total of the profits and losses of all products less whatever other expenses your business incurs in its running (overhead i guess? taxes? whatever.)
they're all numbers. they're all money going in and out. they're all Different Things.
Techdirt is, generally speaking, talking about the First one. morons like You keep trying to claim that somehow magically translates into the second and by implication the third. the only one that MATTERS is the third. the second is relevant to that, but if one thing makes a loss on the second level to allow another thing to make a greater Gain on that level the third level increases.
the constant attempt to apply a fixed percentage of the second level's costs as an excuse for the excessively high prices per infinitely (or near) reproducible item at the first level leads to over-pricing, which drops the number of sales per item, which drops the income per product but not the cost, thus dropping the profit of the product and the profit of the overall business, which is the one you should CARE about.
can you follow the logic?
it's not that bloody hard.
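The three layers described in this comment can be sketched as a few lines of arithmetic. The formulas follow the comment; every price, cost and sales figure below is invented purely to illustrate the argument and is not evidence for it.

```python
# The three "layers" of profit described in the comment above.
# All numbers used in the demo are hypothetical.


def per_item_profit(sale_price, marginal_cost):
    """Layer 1: sale price per item minus marginal cost per item."""
    return sale_price - marginal_cost


def product_profit(sale_price, marginal_cost, items_sold, fixed_cost):
    """Layer 2: (per item profit * items sold) minus the product's fixed cost."""
    return per_item_profit(sale_price, marginal_cost) * items_sold - fixed_cost


def business_profit(product_profits, overhead):
    """Layer 3: sum of all product profits and losses, minus overhead."""
    return sum(product_profits) - overhead


if __name__ == "__main__":
    # Same product, same fixed cost, two hypothetical pricing choices.
    high_price = product_profit(10, 1, items_sold=1_000, fixed_cost=5_000)
    low_price = product_profit(5, 1, items_sold=3_000, fixed_cost=5_000)
    print("High price, fewer sales:", high_price)
    print("Low price, more sales:", low_price)
    # A loss on one product can coexist with a healthy business total.
    print("Business profit:", business_profit([low_price, -500], overhead=1_000))
```

Whether a lower price really triples sales is an empirical question; the sketch only shows that the fixed cost enters at the second layer and does not, by itself, dictate the per-item price.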
[ link to this | view in chronology ]
Re:
A spacefaring ship used to cost billions of dollars to design; now some people are doing it for pennies, so to speak. See Copenhagen Suborbitals: nobody laughs at them anymore.
The genome cost billions to do it the first time around, now it costs thousands of dollars.
So I am curious: which goods are you referring to?
The only way to compete with centralized production is by being decentralized: no more being dependent on one company/entity/person to deliver the goods.
In fact, property rights are an impediment to economic growth. The days when a few could demand that others do something because they held all the cards are over. In this cycle people are going back to basics: they will start producing their own products, which means companies will get desperate and try to block that from happening.
Which means more attempts at granted monopolies, until the people get fed up and fuck them up.
[ link to this | view in chronology ]
Re: Re:
It's monopolies like copyright and patents that are the problem
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
Re:
[ link to this | view in chronology ]
.... and on, which makes it even less of a problem.
[ link to this | view in chronology ]
Re:
(and now you can download awesome as well)
[ link to this | view in chronology ]
Re:
but yeah. point.
[ link to this | view in chronology ]
Theoretically, it marks the exact halfway point: first they ignore you, then they laugh at you, then they fight you, then you win.
[ link to this | view in chronology ]
Re:
https://en.wikipedia.org/wiki/Mass%E2%80%93energy_equivalence
The more energy something has the more mass it has the more weight it has.
That is putting it in simple terms; there are other ways to explain it that are more complicated than that, of course, and I never finished high school, so don't take my word for it.
https://en.wikipedia.org/wiki/Lorentz_ether_theory#Electromagnetic_mass
[ link to this | view in chronology ]
Re: Re:
The thing about flash and hard disks is they actually start in a higher energy state before you put data on them as you change 0xFF (all 1's, highest energy) to whatever data you have.
So I would counter the argument that Kindles get heavier and would say they actually get lighter by an attogram. But that is coming from an electronic engineer (me) who has slightly more grasp of physics and how flash works than the guy who originally made this argument (Professor Kubiatowicz), who is a computer scientist.
[ link to this | view in chronology ]
Re: Re: Re:
From what I know, that is not true. For flash memory, writing means injecting electrons into a floating gate; this makes the output read as 0 due to the way it is wired (the normal state reads as 1). So what reads as zero is the state with more stored energy.
As to hard disks, they start at 0x00, not 0xFF (just look at the contents of a new hard disk).
[ link to this | view in chronology ]
He willfully ignores his own bias and arrogance while saying it is the other guy who is arrogant.
If someone framed this exact same argument in the same terms, but in areas where he isn't an "expert" he would call them an idiot. Sometimes it is useful to step outside yourself and look at the situation without all of your own bias in the way.
[ link to this | view in chronology ]
Approaches zero
More than anything the guy who wrote the original article seems completely oblivious to scale. Sure a "data warehouse" (whatever that is) might be more expensive than a physical one that takes up the same amount of real world space. It also holds countless orders of magnitude more. And when you compare based on capacity, the only thing that matters, digital is vastly cheaper and more efficient. He'd have to be willfully blind not to see that.
[ link to this | view in chronology ]
Re: Approaches zero
if you have/know an IT department, i invite you to ask them what their overhead is monthly. some data centers do in fact sell digital goods, most do not, and both have overhead. lots and lots and lots of overhead.
[ link to this | view in chronology ]
Re: Re: Approaches zero
[ link to this | view in chronology ]
Let the old guard scream.
What they'll find, is that having spent their resources fighting against it, they have little left to join it when they no longer have a choice. By fighting the future, instead of embracing, guiding, and becoming leaders of it, they effectively are destroying any profitable future they may have.
Let the idiots die. They always do.
[ link to this | view in chronology ]
Re: Let the old guard scream.
You are exactly, precisely, correct. My only problem is with the damage they do during their death throes. It can be substantial.
[ link to this | view in chronology ]
Interesting...
But he is right to say that data storage and bandwidth is not free. For a popular site, the costs can be relatively enormous.
Here's the thing. The ones who bear the cost of the storage and bandwidth are usually not the ones who actually produce the material to be stored or transmitted. They are the ones who have found business models that are content-agnostic, scalable, and sustainable.
Yet, the "warehousing and delivery" aspect is almost always brought up by the idiots claiming "free content isn't free." See e.g. Lowery's letter to Emily White: "It turns out the supposedly 'free' stuff really isn’t free" (linking to this story at Scholarly Kitchen). This directly contradicts the notion (again from Lowery) that sites like iTunes are "simply hosting the songs on their servers. They do absolutely nothing else."
Well, yeah, they do something else. They warehouse, archive, and deliver digital goods. Something that Lowery insists costs money and isn't free. Yet, they are absolutely livid when the people who actually bear those costs take a cut of the content sales. It is really disgusting, frankly.
[ link to this | view in chronology ]
Re: Interesting...
[ link to this | view in chronology ]
My response to him ....
Within the next 7-15 years we will have nanotechnology. This will allow us to store all current human knowledge (circa 2012) in a storage device the size of a deck of cards. Your image of warehouse-sized storage facilities is reminiscent of early 1950s speculation about computing. By their standards, using vacuum tubes and relays, a home PC would have been the size of a house and used the power output of a coal-fired power plant.
[ link to this | view in chronology ]
Anyway, "warehousing and delivery" feels like a bullshit argument from someone who wants to disguise his profit margin. I mean... look at Spotify: they use more storage and bandwidth than anyone and their service is still ridiculously cheap. That said, I'm guessing there's other costs for a STORE, that wouldn't affect a streaming service nearly as much.
I'm thinking the costs of a typical online store for purely digital goods pretty much break down like this:
Bandwidth: 1%
Storage: 1%
Tech staff: 3%
Customer support: 95%
I could be wrong of course, but bandwidth is cheap, and so is hardware. If the system is cleverly built, you don't need a huge number of techies to maintain and develop it. The one big expense I can think of is the same as always, customer support: my order didn't arrive, you sent the wrong CD, your site is rejecting my VISA card... That has to be the real cost, right? That's where you need a warehouse (full of support staff, not hard drives) costing you a pile of money every month.
If someone argued that the staff is a big expense in online distribution of digital content, then I'm thinking they may be right. If someone talks about expensive bandwidth and storage, I just feel like jumping up on a table, waving my arms and screaming like a monkey.
Anyway, that's just what I'm thinking. If anyone has any actual experience with it, by all means chip in.
[ link to this | view in chronology ]
Re:
"I didn't get the email with the download link..."
"The download link is linking to the wrong product..."
"I ordered the wrong season of Star Trek by accident..."
"Internet Explorer says your site is dangerous!"
...and so on and so forth. Physical or digital - shit happens, and at the end of the day, customers want a human being to sort it out for them.
[ link to this | view in chronology ]
Human Costs (to: maclypse, #33)
Reputable medical research is mostly paid for by grants from reputable funding agencies, such as the National Institutes of Health. The additional work of preparing things for publication is small compared to the cost of actually doing the research, and the cost of actual distribution even smaller. Parenthetically, journal articles tend to be so specialized that the potential audience is less than a thousand, or certainly less than ten thousand. The cost of distributing the information is also small compared to the cost of reading it. At any rate, sooner or later, the more reputable funding agencies will want to publish the research they sponsor, in order to disassociate themselves from research funded by less reputable organizations. The sponsors who hang onto publication in proprietary journals will inevitably be those sponsors, such as the tobacco industry, who have the most to lose from a true statement of who is paying for the research.
[ link to this | view in chronology ]
Mixed Vertical Markets
Here's an example; Hollywood doesn't need to create huge datacenters for each movie they make to distribute it electronically, just as they don't need to directly procure and manage a fleet of trucks to deliver DVDs to customers.
Existing datacenters are there with more than enough capacity and he should only consider costs related to the production of content and server hosting/co-location costs. He doesn't need to worry about what it cost to build the facility, cool it, or anything else that's not in his own segment.
...Unless he's just trying to make a point that "It costs money to build datacenters with lots of disk drives that send and receive lots of data". That's kind of a no-brainer. But for people who produce content and need to leverage that infrastructure for distribution; it's painfully cheap.
[ link to this | view in chronology ]
Regardless of what they publish each Journal pays Editors, copywriters, customer support/sales, peer reviewers (these may be low or no cost depending on availability and quality, but generally there will be a quality dependent cost) and then there is the per issue cost for assembly and publication of a print edition & monthly costs for the IT and internet presence. Many of these costs have a fixed component+marginal component so that a small distribution has a much higher per copy cost with each edition.
Individual researchers are free to publish their work independently; this has not changed in centuries. However, it is usually established researchers or well-established research centers with a solid reputation who publish such works. Publications from unknowns will automatically be judged as poor or fraudulent simply because they did not get into a 'proper' Journal.
The Journals need to pay their bills so they price their submissions & subscriptions according to their estimated costs & income. Better quality with low distribution will cost more, larger distributions will generally lower the per piece cost & mediocre quality will lower it still more. There is a bottom where the price levels out due to periodicals priced below the point where they are taken seriously being ignored. (That seeming paradox is why small items sell better at $19.99 than they do at $9.99 ... the lower price means it isn't worth $20 :P ) For mass market periodicals the sweet spot minimum seems to be $4-$6 based on checking newsstands. Limited subscription periodicals do not have the economy of scale that allows those prices though.
[ link to this | view in chronology ]
Warehousing
[ link to this | view in chronology ]