Wednesday, July 25, 2012

RMagick Lightning Talk at Austin On Rails

This is a transcript of my five-minute lightning talk at the July meeting of Austin On Rails, the local Ruby on Rails user group. It highlights a few things I've learned from my work on MyopicVicar and Autosplit.
Hi everyone, I'm Ben Brumfield and I've been a professional Ruby on Rails developer for three whole months, [applause] and I am here to talk about RMagick!  RMagick is a wrapper for ImageMagick, which is great for processing images.  These are the kinds of images that I process:

(I write software that helps people transcribe old handwriting.)

Now is this a pretty image?  I don't know.  We need to do some things with this, and this is how RMagick has helped me solve some of my problems.  One of the things we need to do is to turn it into a thumbnail.
This is an example of RMagick used as a very simple wrapper for ImageMagick: All we're doing is creating an ImageList out of that file, we say image.thumbnail and pass it a scaling factor, and that will very quickly go through and transform that image. Write it to a file and we're done.
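The slide's code is not reproduced in this transcript, so here is a minimal sketch of the thumbnail step rather than the code shown in the talk. The helper names (thumbnail_scale, write_thumbnail), the 200-pixel default, and the file names are all hypothetical; only thumbnail, columns, rows, and write are RMagick calls.

```ruby
# Scale factor that brings the longer side of an image down to at most
# max_side pixels. Never scales up (the factor is capped at 1.0).
def thumbnail_scale(columns, rows, max_side)
  [max_side.to_f / [columns, rows].max, 1.0].min
end

# Writes a thumbnail of `image` to `path`.
# `image` only needs to respond to columns, rows, and thumbnail(scale),
# which Magick::Image and Magick::ImageList both do.
def write_thumbnail(image, path, max_side = 200)
  scale = thumbnail_scale(image.columns, image.rows, max_side)
  image.thumbnail(scale).write(path)
end

# Usage with RMagick (file names are placeholders):
#   require 'rmagick'
#   image = Magick::ImageList.new('register.jpg')
#   write_thumbnail(image, 'register_thumb.jpg')
```

Passing a single number to thumbnail treats it as a scaling factor, which is the form described above; passing two numbers gives an explicit width and height instead.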

So that's kind of basic.  RMagick does a lot of other basic kinds of things like this, like rotate or negate--sometimes I get files that are negatives--things like that.
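As a rough illustration of those basic operations, this sketch chains negate and rotate. The tidy_scan helper and its keyword arguments are made up for the example; negate and rotate are the RMagick methods, and each returns a new image.

```ruby
# Applies the basic clean-up steps to a scan and returns the result.
# `image` only needs to respond to negate and rotate(degrees),
# as Magick::Image does.
def tidy_scan(image, degrees: 0, negative: false)
  image = image.negate if negative                  # photographic negative -> positive
  image = image.rotate(degrees) unless degrees.zero?
  image
end

# Usage with RMagick (file names are placeholders):
#   require 'rmagick'
#   scan = Magick::Image.read('negative.jpg').first
#   tidy_scan(scan, degrees: 90, negative: true).write('positive.jpg')
```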

What about the cool stuff?  Here's an example of something that I have to deal with.
This is a parish registry from 1810, which would be fine, but this is a scan of a bad microfilm of a parish registry from 1810 that is tilted.  Now my users need to draw rectangles around individual lines, and the tilt is going to really throw them off.
How badly is this tilted? If you're interested in seeing some RMagick code, here's a script that I wrote to draw a grid over that image or any other image: http://tinyurl.com/GridifyGist . And what you'll see is:
Man, it kind of sucks!  You draw a rectangle over that and you're going to get all kinds of weird stuff.

Enter deskew.  So here's an example of ruby code, very simple:

require 'RMagick' # Which is very idiosyncratically capital-R capital-M
s = Magick::ImageList.new('skew.jpg') # We get a new ImageList from skew.jpg
d = s.deskew # And poof!  Zap!
d.write('deskew.jpg') # We write out deskew and get this:


Now this looks kind of weird. The image looks off.  But the image's contents are not off.  And if we throw the grid on it:
You can see that what RMagick has done is that it's gone through and it's looked for lines inside the content of the image, and it's rotated the image in correspondence to where it thinks the lines go and where it thinks the orientation is.

I think that's pretty slick.

RMagick can do some other slick things, and this is where you get into the programming aspects of RMagick and why you'd use RMagick instead of just calling ImageMagick from the command line.

Here what we're doing is extracting files from a PDF.  Lots of scanners produce images in PDF format; that is, they just use PDF as a container for a bunch of images.

This goes through, creates an ImageList from the file, and then goes through for each image in the image list--actually more specifically for each page in that PDF--and it writes it out correctly.  (RMagick may not be the right thing to use for some files because it's going to get page by page images -- if you have PDFs that are composed of a bunch of images [on each page] then you're going to want to use some other tools.)
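The extraction code from the slide is not in the transcript. A minimal sketch of the page-by-page loop might look like the following, where page_filename, write_pages, the naming scheme, and the file names are all assumptions made for illustration. Reading a PDF this way also assumes ImageMagick has Ghostscript available.

```ruby
# Builds an output name such as "scan-003.jpg" for page index 2.
def page_filename(basename, index, ext = 'jpg')
  format('%s-%03d.%s', basename, index + 1, ext)
end

# Writes every page to its own file and returns the file names.
# `pages` is any enumerable of objects that respond to write(path),
# such as a Magick::ImageList read from a PDF (one image per page).
def write_pages(pages, basename)
  pages.each_with_index.map do |page, index|
    name = page_filename(basename, index)
    page.write(name)
    name
  end
end

# Usage with RMagick (file names are placeholders):
#   require 'rmagick'
#   pages = Magick::ImageList.new('scan.pdf')
#   write_pages(pages, 'scan')
```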
Okay.  So here's a file that I'm dealing with, and when I want to present it to my users I actually want to present them with a single page.  Is this file a single page? No -- this file is two pages, all scanned on a flatbed scanner.  (For those that are curious, it's a sixteenth-century Spanish legal document; I don't know what it says.)  But how do I find the spine?  How do I know where to split it?  It's easy enough to go through and do cropping, but you know, what do we do?

So I came up with this idea: Let's look for vertical dark stripes.  Let's look for the darkest strip that is vertical in a deskewed version of this image and see if we can identify that.  So this is something we can do.  What I've done here is I've said-- this is the inside of a loop where I've said for each x let's pull all the pixels out, and then come up with a total brightness for that image [stripe].  Then later on, I'm going to find the minimum brightness for those vertical stripes.
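The loop from the slide is not included in the transcript, so this is only a sketch of the idea as described: sum the brightness of each one-pixel-wide vertical stripe, then take the darkest one. The helper names are hypothetical, and adding the red, green, and blue channels is one assumed definition of "brightness"; the talk does not say which measure was used.

```ruby
# Total brightness of each one-pixel-wide vertical stripe, left to right.
# `image` needs columns, rows, and get_pixels(x, y, width, height) returning
# pixels with red/green/blue readers, as Magick::Image does.
def column_brightness(image)
  (0...image.columns).map do |x|
    stripe = image.get_pixels(x, 0, 1, image.rows)
    stripe.sum { |pixel| pixel.red + pixel.green + pixel.blue }
  end
end

# x position of the darkest stripe (the leftmost one if there is a tie).
def darkest_column(totals)
  totals.index(totals.min)
end

# Usage with RMagick (file name is a placeholder):
#   require 'rmagick'
#   spread = Magick::Image.read('spread.jpg').first.deskew
#   spine_x = darkest_column(column_brightness(spread))
```

On real scans the outer edges of the image can be darker than the spine, so restricting the search to the middle of the image may be worth trying; that refinement is not part of the talk.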

If I do this on some of the files and indicate that stripe by a red line--which I hope you can see--it did pretty well on this!
It does well on this awful scan of a microfilm, although the red line is hard to see at this resolution.
And wow, it does great on this piece!
This is just an example of using RMagick to solve my problems.  After I've gotten the line I want to crop it, so left page/right page.
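The cropping step is not shown in the transcript. Assuming the spine's x position is already known, a sketch of the left page/right page split could be as simple as two crop calls. split_at is a hypothetical helper; crop(x, y, width, height) is the RMagick method.

```ruby
# Splits an image into [left, right] at the vertical line x.
# `image` needs columns, rows, and crop(x, y, width, height),
# as Magick::Image does.
def split_at(image, x)
  left  = image.crop(0, 0, x, image.rows)
  right = image.crop(x, 0, image.columns - x, image.rows)
  [left, right]
end

# Usage with RMagick (file names and the x value are placeholders):
#   require 'rmagick'
#   spread = Magick::Image.read('spread.jpg').first
#   left, right = split_at(spread, 1240)
#   left.write('page_left.jpg')
#   right.write('page_right.jpg')
```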

The only caveat that I'd give you about RMagick is that I find it necessary to call GC.start a lot -- at least I did in Ruby 1 6 [ed: 1.8.6] -- because RMagick--I don't know, man--because it swaps out the garbage collector or something and you run out of memory really fast.
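One way to follow that advice in a batch job is to force a collection every few files. This is a sketch under assumptions, not the code from the talk: each_with_gc and the interval of 10 are invented for the example, and whether it is still needed depends on the Ruby and RMagick versions in use.

```ruby
# Yields each item, forcing a garbage collection after every `every` items
# so that memory held by finished images can be reclaimed sooner.
def each_with_gc(items, every = 10)
  items.each_with_index do |item, index|
    yield item
    GC.start if ((index + 1) % every).zero?
  end
end

# Usage with RMagick (paths are placeholders):
#   require 'rmagick'
#   each_with_gc(Dir.glob('scans/*.jpg')) do |path|
#     image = Magick::Image.read(path).first
#     image.deskew.write(path.sub('.jpg', '-deskewed.jpg'))
#     image.destroy!  # RMagick 2 and later: releases the image's memory explicitly
#   end
```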

RMagick: I love it!

3 comments:

Anonymous said...

This is a great example of how programming can save a huge amount of labour. When I digitized that regimental history I used Irfanview for batch processing the scans and hoped that the spine would be in roughly the same place in every scan. Luckily it was, but you've found a much better way of doing it.

Crwth said...

I never thought anything would drive me to use Ruby... very interesting work!
