Awesomeness: Millions Of Public Domain Images Being Put Online

from the go-use-them dept

Fri, Aug 29th 2014 5:57pm — Mike Masnick

Here's some nice news. Kalev Leetaru has been liberating a ton of public domain images from books and putting them all on Flickr. He's been going through Internet Archive scans of old, public domain books, isolating the images, and saving each one as an individual file. That matters because, while the books and images are all public domain, very few of the images have been separated from the books and released in a digital format.

To achieve his goal, Mr Leetaru wrote his own software to work around the way the books had originally been digitised.

The Internet Archive had used an optical character recognition (OCR) program to analyse each of its 600 million scanned pages in order to convert the image of each word into searchable text.

As part of the process, the software recognised which parts of a page were pictures in order to discard them.

Mr Leetaru's code used this information to go back to the original scans, extract the regions the OCR program had ignored, and then save each one as a separate file in the Jpeg picture format.
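
For the curious, the idea described above (reuse the OCR layout data, then crop out the regions it flagged as pictures) can be sketched in a few lines of Python. To be clear, this is not Leetaru's code, which isn't shown here. It's a rough illustration under some loud assumptions: the `Block` type, the `"picture"` label, and the `crop` and `extract_pictures` helpers are all hypothetical names, and the page is modeled as a plain grid of grayscale values rather than a real scan. An actual version would read the scan and write JPEG files with an imaging library.

```python
from dataclasses import dataclass
from typing import List

Pixel = int               # grayscale value, 0-255
Page = List[List[Pixel]]  # page[row][col]


@dataclass
class Block:
    """One region reported by a hypothetical OCR layout pass."""
    kind: str    # assumed labels: "text" or "picture"
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def crop(page: Page, block: Block) -> Page:
    """Copy the pixels inside a block's bounding box, clamped to the page."""
    height = len(page)
    width = len(page[0]) if page else 0
    top, bottom = max(0, block.top), min(height, block.bottom)
    left, right = max(0, block.left), min(width, block.right)
    return [row[left:right] for row in page[top:bottom]]


def extract_pictures(page: Page, blocks: List[Block]) -> List[Page]:
    """Return one cropped image per block the OCR pass marked as a picture."""
    return [crop(page, b) for b in blocks if b.kind == "picture"]


# A tiny 4x6 stand-in for a scanned page: one line of text, one picture.
page = [[10 * r + c for c in range(6)] for r in range(4)]
blocks = [
    Block("text", left=0, top=0, right=6, bottom=1),
    Block("picture", left=2, top=1, right=5, bottom=3),
]
pictures = extract_pictures(page, blocks)
```

The point of the design is that no image recognition is needed at this stage: the OCR pass has already decided which rectangles are not text, so the extraction is just a filter plus a crop.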

Already over 2.6 million images have been posted to Flickr in this manner -- all completely in the public domain. From a historical perspective, the images are fascinating -- and the fact that anyone can do anything with them, free of charge, is important culturally as well. Just scrolling through the images is amazing.

There seem to be lots of images of musical scores, sewing machines, individual portraits, buildings and machinery. Each Flickr page associated with an image gives information about the book, including the text before and after the image, which is pretty cool. The one (only slightly) annoying thing is that on the Flickr pages, rather than saying these are public domain images, it says that there are "no known copyright restrictions." While that's accurate, and a potentially reasonable hedge against some miraculous finding that says these images are covered by copyright, it's really too bad that it's so problematic to come out and say "this is in the public domain, do whatever the hell you want with it."

Filed Under: book scans, copyright, flickr, internet archive, kalev, leetaru, old books, public domain

15 Comments

Reader Comments

That One Guy (profile), 29 Aug 2014 @ 5:35pm

Come one, come all, and place your bets!
While awesome for archival purposes if nothing else, I give it a week at most before some bot starts tagging and demanding pictures be removed and claiming that at least some of them are still under copyright, followed shortly thereafter (assuming Flickr doesn't just pull them immediately) by the ones running the bot doubling down and insisting that yes, they do indeed own the rights to the images, and will be filing a lawsuit if they aren't taken down immediately.

Because when there's absolutely no penalty for copyfraud, well, why not try to claim everything you can, on the off chance that at least some of the claims will stick and/or the target will pay up?
Toestubber (profile), 29 Aug 2014 @ 7:28pm

Anyone know a good way...
...to download the originals en masse? This archive seems too important to entrust to Flickr.
- jupiterkansas (profile), 29 Aug 2014 @ 10:20pm
  
  Re: Anyone know a good way...
  I don't know why they can't be added back to archive.org too.
s7, 29 Aug 2014 @ 9:36pm

Haha, I Stumbled upon some of these last night while browsing flickr looking for some PD images to use for a project. Re-did my search today, and yep, it was them.
Anonymous Coward, 29 Aug 2014 @ 11:38pm

Oh, boy. Whatever's not going to like this, not one bit.
- Anonymous Coward, 31 Aug 2014 @ 9:06am
  
  Re:
  IP extremists will try to argue that this is going to kill art and make all artists starve or something.
Anonymous Coward, 30 Aug 2014 @ 7:02am

Thanks for the Info and story.
orbitalinsertion (profile), 30 Aug 2014 @ 7:53am

In jpeg format?
Once more, just to help me mentally process this:
Retrieving and publishing public domain images in jpeg format?
1st Dread Pirate Roberts (profile), 30 Aug 2014 @ 12:27pm

Way Cool!
This is way cool! This guy needs to patent this technique.
bob, 30 Aug 2014 @ 2:54pm

Lather, rinse, repeat
It's interesting that Leetaru has taken on images. He is a major force behind GDELT, the Global Database of Events, Language, and Tone which uses automated techniques to mine news sources for event summaries (among other things).

Unlike GDELT, here all the source material is demonstrably public domain, so publishing the image extracts (in whatever form) should not cause any hiccoughs.
Antsan (profile), 31 Aug 2014 @ 3:49am

Unfortunately there seems to be something strange going on on Flickr. I cannot just right click on the images and save them like I am used to.
Would be nice if the pictures were uploaded somewhere where they are more easily accessible.
- NikFromNYC, 2 Sep 2014 @ 3:54pm
  
  Re:
  There's a small, hard-to-hit three-dot icon that leads to various sizes, including the original, which I can download just fine in an iPhone browser. I just have to zoom in to avoid missing the dots button, since the hot area for the next image is, irritatingly, the whole right edge of the image right down to that button.
- Victoria Love, 9 Sep 2014 @ 5:06pm
  
  Re: saving images
  I was able to isolate and save by playing around with the "all sizes" option on flickr. Once the image was displayed without the caption information I was able to use the "save image" option. This was on my iPad. I was able to save a single image to "my photos" on iPad.
Anonymous Coward, 31 Aug 2014 @ 9:12am

Torrent?
Would be cool if someone could create a .torrent file of all the images.
- Ninja (profile), 1 Sep 2014 @ 7:54am
  
  Re: Torrent?
  Free, distributed backup plan. Hell yeah!