Tools: unpaper

Recently I wanted to print a few scanned pages. However, due to the low quality of the source material (see below), reducing the pages to black and white didn't exactly improve them.

Enter unpaper. It is a little-known tool, but an extremely useful one, because it offers a range of image-processing steps for cleaning up scanned pages, e.g.:

  • black, gray, noise and blur filtering
  • deskewing
  • border-aligning
  • mask-centering
  • etc.

If you aren't sold yet, the documentation offers more great examples of applied image-processing techniques.

Below I've included the minimal1) script I used to process the scanned pages. While the settings produced good enough results for my needs, they are probably still far from perfect, so don't forget to toy with your settings to get optimal results ;)

autounpaper

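#!/bin/bash
# Usage: autounpaper *.jpg

# Work in a private temp directory: the .pgm files are then really ours,
# and the trap removes everything on exit, even after an error.
tmpdir=$(mktemp -d -t autounpaper-XXXXXXXXXX) || exit 1
trap 'rm -rf "$tmpdir"' EXIT
tmpin=$tmpdir/in.pgm
tmpout=$tmpdir/out.pgm

# unpaper options, kept in an array so quoting stays intact
opts=(-q --overwrite)
opts+=(--no-deskew --no-mask-scan)
opts+=(--grayfilter-size 1,1 --grayfilter-step 1,1 --grayfilter-threshold 0.4)

echo -n "Processing:"

# "$@" keeps file names with spaces in one piece
for input in "$@"; do
	echo -n " $input"
	ending=${input##*.}	# text after the last dot, e.g. jpg
	output=${input%.*}-unpaper.$ending

	# image -> PGM -> unpaper -> original format
	convert "$input" "$tmpin" && echo -n . && \
	unpaper "${opts[@]}" "$tmpin" "$tmpout" && echo -n . && \
	convert "$tmpout" "$output"
done

echo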
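
To make the "toy with your settings" part less tedious, here is a rough sketch that runs the same convert/unpaper/convert pipeline once per --grayfilter-threshold value, so the results can be compared side by side. The script name sweep-threshold, the helpers sweep_name and sweep, and the default values 0.3, 0.4 and 0.5 are made up for illustration; the unpaper flags are simply the ones used in autounpaper.

```shell
#!/bin/bash
# Usage: sweep-threshold page.jpg [threshold...]
# Sketch only: writes page-t0.3.jpg, page-t0.4.jpg, ... next to the input.

# Print the output name for one input file and one threshold.
sweep_name() {
	local input=$1 threshold=$2
	echo "${input%.*}-t$threshold.${input##*.}"
}

# Convert the input to PGM once, then run unpaper once per threshold.
sweep() {
	local input=$1; shift
	local tmpdir threshold
	tmpdir=$(mktemp -d) || return 1
	convert "$input" "$tmpdir/in.pgm" || { rm -rf "$tmpdir"; return 1; }
	for threshold in "$@"; do
		unpaper -q --overwrite --no-deskew --no-mask-scan \
			--grayfilter-size 1,1 --grayfilter-step 1,1 \
			--grayfilter-threshold "$threshold" \
			"$tmpdir/in.pgm" "$tmpdir/out.pgm" &&
		convert "$tmpdir/out.pgm" "$(sweep_name "$input" "$threshold")"
	done
	rm -rf "$tmpdir"
}

# Only do work when called with arguments, so the file can also be sourced.
if [ $# -gt 0 ]; then
	input=$1; shift
	[ $# -gt 0 ] || set -- 0.3 0.4 0.5	# hypothetical default values
	sweep "$input" "$@"
fi
```

Calling it as `sweep-threshold scan.jpg 0.3 0.4 0.5` should leave scan-t0.3.jpg, scan-t0.4.jpg and scan-t0.5.jpg in place; pick the one that looks best and put that threshold back into autounpaper.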

1) meaning: quick and dirty

Discussion

Andreas Gohr, 2008/07/01 09:26

Interesting tool. I just recently had to hand process a number of scans in Gimp. Nothing I'd like to do for a living ;-)

PSV, 2009/11/17 18:36

- #!/bin/sh

+ #!/bin/bash

demod, 2009/11/17 18:42

thanks, fixed it

Andreas Gohr, 2009/11/18 09:13

This would be really helpful as an online tool. You could set up a small website for it where people upload an image, maybe tick some checkboxes, and get the corrected image back. Maybe even with an API?

blog/2008/06/unpaper.txt · Last modified: 2009/11/17 18:42 by demod