(I write software that helps people transcribe old handwriting.)
Now is this a pretty image? I don't know. We need to do some things with this, and this is how RMagick has helped me solve some of my problems. One of the things we need to do is to turn it into a thumbnail.
We call image.thumbnail and pass it a scaling factor, and that very quickly goes through and transforms the image. Write it to a file and we're done.
So that's kind of basic. RMagick does a lot of other basic kinds of things like this, like rotate or negate--sometimes I get files that are negatives--things like that.
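A minimal sketch of the thumbnail step. It assumes a recent RMagick release (loaded with lowercase `require 'rmagick'`) and placeholder file names. `thumbnail_scale` and `write_thumbnail` are hypothetical helpers written for this example; `Image#thumbnail`, `Image#rotate` and `Image#negate` are the RMagick calls.

```ruby
# Pick a scaling factor so the longest side ends up at most max_side pixels.
# Never scales up. (Hypothetical helper, not part of RMagick.)
def thumbnail_scale(columns, rows, max_side = 200)
  [max_side.to_f / [columns, rows].max, 1.0].min
end

# Read an image, shrink it by that factor and write the thumbnail to disk.
def write_thumbnail(source_path, thumb_path, max_side = 200)
  require 'rmagick' # loaded lazily so the helper above works without the gem
  image = Magick::Image.read(source_path).first
  scale = thumbnail_scale(image.columns, image.rows, max_side)
  image.thumbnail(scale).write(thumb_path)
end

# Usage (file names are placeholders):
#   write_thumbnail('page.jpg', 'page_thumb.jpg')
#
# The other basic operations follow the same pattern:
#   image.rotate(90).write('rotated.jpg')  # rotate 90 degrees clockwise
#   image.negate.write('positive.jpg')     # invert a scanned negative
```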
What about the cool stuff? Here's an example of something that I have to deal with.
http://tinyurl.com/GridifyGist . And what you'll see is:
Enter deskew. So here's an example of ruby code, very simple:
require 'RMagick' # Which is very idiosyncratically capital-R capital-M
s = Magick::ImageList.new('skew.jpg') # We get a new ImageList from skew.jpg
d = s.deskew # And poof! Zap!
d.write('deskew.jpg') # We write out deskew and get this:
Now this looks kind of weird. The image looks off. But the image's contents are not off. And if we throw the grid on it:
I think that's pretty slick.
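For readers who want to try the grid overlay themselves, here is one possible sketch using `Magick::Draw`. It is not the code from the gist linked above; the helper names, the red colour and the 100-pixel spacing are all assumptions made for illustration.

```ruby
# Coordinates of evenly spaced grid lines, each as [x1, y1, x2, y2].
# (Hypothetical helper, not part of RMagick.)
def grid_lines(width, height, spacing)
  vertical   = (spacing...width).step(spacing).map  { |x| [x, 0, x, height - 1] }
  horizontal = (spacing...height).step(spacing).map { |y| [0, y, width - 1, y] }
  vertical + horizontal
end

# Draw the grid on top of an image and write the result to a new file.
# Assumes the image is larger than one grid cell, so there is something to draw.
def write_with_grid(source_path, out_path, spacing = 100)
  require 'rmagick' # loaded lazily so grid_lines works without the gem
  image = Magick::Image.read(source_path).first
  pen = Magick::Draw.new
  pen.stroke('red')
  pen.stroke_width(1)
  grid_lines(image.columns, image.rows, spacing).each { |line| pen.line(*line) }
  pen.draw(image)
  image.write(out_path)
end

# Usage (file names are placeholders):
#   write_with_grid('deskew.jpg', 'deskew_grid.jpg')
```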
RMagick can do some other slick things, and this is where you get into the programming aspects of RMagick and why you'd use RMagick instead of just calling ImageMagick from the command line.
Here what we're doing is extracting files from a PDF. Lots of scanners produce images in PDF format, which just uses PDF as a container for a bunch of images.
This goes through, creates an ImageList from the file, and then goes through for each image in the image list--actually more specifically for each page in that PDF--and it writes it out correctly. (RMagick may not be the right thing to use for some files because it's going to get page by page images -- if you have PDFs that are composed of a bunch of images [on each page] then you're going to want to use some other tools.)
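The slide with the original code is not reproduced here, so the following is only a sketch of what such a loop might look like. The output naming scheme and both helper names are assumptions, and reading PDFs through ImageMagick also requires Ghostscript to be installed.

```ruby
# Build an output file name such as "scan_001.jpg" for a zero-based page index.
# (Hypothetical naming scheme.)
def page_filename(prefix, index)
  format('%s_%03d.jpg', prefix, index + 1)
end

# Write every page of a PDF out as its own JPEG.
def extract_pages(pdf_path, prefix)
  require 'rmagick' # loaded lazily so page_filename works without the gem
  pages = Magick::ImageList.new(pdf_path) # one image per PDF page
  pages.each_with_index do |page, index|
    page.write(page_filename(prefix, index))
  end
end

# Usage (file names are placeholders):
#   extract_pages('diary.pdf', 'diary')  # writes diary_001.jpg, diary_002.jpg, ...
```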
So I came up with this idea: Let's look for vertical dark stripes. Let's look for the darkest stripe that is vertical in a deskewed version of this image and see if we can identify that. So this is something we can do. What I've done here is I've said-- this is the inside of a loop where I've said for each x let's pull all the pixels out, and then come up with a total brightness for that stripe. Then later on, I'm going to find the minimum brightness for those vertical stripes.
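The loop described above could be sketched roughly as follows. This is a reconstruction under assumptions, not the code from the talk: it uses `Image#get_pixels` to pull out one-pixel-wide columns and takes "brightness" to be the sum of the red, green and blue channels. Both helper names are hypothetical.

```ruby
# Total brightness of every one-pixel-wide vertical stripe in the image.
# Brightness is approximated as red + green + blue for each pixel.
def column_totals(image)
  (0...image.columns).map do |x|
    image.get_pixels(x, 0, 1, image.rows).sum { |p| p.red + p.green + p.blue }
  end
end

# x coordinate of the darkest stripe (the first one if there is a tie).
def darkest_column(totals)
  totals.index(totals.min)
end

# Usage with RMagick (file name is a placeholder):
#   require 'rmagick'
#   image = Magick::Image.read('spread.jpg').first.deskew
#   spine_x = darkest_column(column_totals(image))
```

Summing every pixel of a large scan in Ruby is slow, so shrinking the image first and scaling the resulting x coordinate back up may be worth trying.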
If I do this on some of the files and indicate that stripe by a red line--which I hope you can see--it did pretty well on this!
crop it, so left page/right page
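Given an x coordinate for the spine, however it was found, the split into left and right pages might look like this. The helper names and file names are assumptions; `Image#crop` with `x, y, width, height` is the RMagick call.

```ruby
# Crop rectangles for the left and right page, each as [x, y, width, height].
# (Hypothetical helper, not part of RMagick.)
def page_boxes(width, height, spine_x)
  left  = [0, 0, spine_x, height]
  right = [spine_x, 0, width - spine_x, height]
  [left, right]
end

# Cut a two-page spread at the spine and write each half to its own file.
def split_pages(source_path, spine_x, left_path, right_path)
  require 'rmagick' # loaded lazily so page_boxes works without the gem
  image = Magick::Image.read(source_path).first
  left_box, right_box = page_boxes(image.columns, image.rows, spine_x)
  image.crop(*left_box).write(left_path)
  image.crop(*right_box).write(right_path)
end

# Usage (file names and the spine position are placeholders):
#   split_pages('spread.jpg', 480, 'left.jpg', 'right.jpg')
```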
The only caveat I'd give you about RMagick is that I find it necessary to call GC.start a lot -- at least I did in Ruby 1.8.6 -- because RMagick--I don't know, man--swaps out the garbage collector or something and you run out of memory really fast.
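One way to apply that advice in a batch job is to force a collection after each file, as sketched below. `each_with_gc` is a hypothetical helper. Whether the explicit `GC.start` is still needed on current Ruby and RMagick versions is not something this sketch settles; RMagick 2 and later also offer `Image#destroy!` for releasing an image's memory explicitly.

```ruby
# Yield each path in turn, forcing a garbage collection every `every` items
# so that images from earlier iterations are released promptly.
def each_with_gc(paths, every = 1)
  paths.each_with_index do |path, index|
    yield path
    GC.start if ((index + 1) % every).zero?
  end
end

# Usage with RMagick (file names are placeholders):
#   require 'rmagick'
#   each_with_gc(Dir.glob('scans/*.jpg')) do |path|
#     image = Magick::Image.read(path).first
#     thumb = image.thumbnail(0.25)
#     thumb.write(path.sub(/\.jpg\z/, '_thumb.jpg'))
#     thumb.destroy! # release pixel memory explicitly
#     image.destroy!
#   end
```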
RMagick: I love it!
This is a great example of how programming can save a huge amount of labour. When I digitized that regimental history I used Irfanview for batch processing the scans and hoped that the spine would be in roughly the same place in every scan. Luckily it was, but you've found a much better way of doing it.
I never thought anything would drive me to use Ruby... very interesting work!