IMSLP talk:Scanning music scores

Free public domain sheet music from IMSLP / Petrucci Music Library
  • I've made a great discovery (haha) regarding why some people can make scans that are so tiny, and yet high quality. The secret lies in a monochrome image compression algorithm called CCITT (Group 4; Group 3 is much inferior), which is commonly used for faxes. Not only does it have a very high compression ratio, it is also, surprisingly, lossless (wow). The catch is that it only compresses monochrome (1-bit) images well, and fails miserably (or so I heard) when trying to compress color images. Currently the only way I've succeeded in creating a CCITT Group 4 compressed PDF is via ImageMagick:
convert -compress Group4 input.bmp output.pdf
Of course, you can then combine the PDF files with pdftk. --Feldmahler 12:21, 29 September 2006 (EDT)
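If you have a whole directory of files to convert, and you're using a Linux machine, save the script below as png2pdf.sh in that directory, make it executable (chmod +x png2pdf.sh), and run:
$ find . -maxdepth 1 -name '*.png' -exec ./png2pdf.sh {} \;
$ pdftk *.pdf output outfile.pdf
png2pdf.sh:
#!/bin/bash
# Convert one PNG (given as $1) to a 1-bit, CCITT Group 4 compressed PDF
# with the same base name in the current directory.
echo "Converting $1 to pdf"
convert "$1" -compress Group4 -monochrome "$(basename "$1" .png).pdf"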
  • It is also interesting to note that the average compression ratio of CCITT Group 4 is 4-5%, meaning that a 1000KB monochrome image will compress to around 40-50KB, all the while being lossless. CCITT performs best when there are many repeating pixels of the same color, hence the reason why it compresses monochrome well and not color. --Feldmahler 14:12, 7 October 2006 (EDT)
With most scanners and scanner software, once you choose TIFF as the output format you should be able to set compression to CCITT-4, and this usually gives even higher compression and is, as you point out, best with 1-bit images. For those who use or wish to use Acrobat Professional, it offers JBIG2 compression (an ITU-T/ISO standard for 1-bit images rather than an Adobe format; it has lossless and lossy modes and must be selected in the preferences), which does even better when compiling these CCITT-4 compressed TIFFs into your final PDF. Daphnis 10:20, 30 August 2007 (EDT)
  • The highest compression ratios nowadays are achieved with JPEG2000, a wavelet-based format supported in PDF since version 1.5 (DjVu uses a similar wavelet codec of its own, IW44, rather than JPEG2000 itself). However this format is relatively hard to use, and achieving high ratios is not easy. Please share your experiences!
  • Indeed JPEG2000 is a good compression format, but it is a generic one. It is usually used in its lossy mode, and it is also not specially made for compressing monochrome images... and so will perform much better than CCITT Group 4 on color/grayscale images, but not as well on monochrome images. --Feldmahler 14:12, 7 October 2006 (EDT)
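
To illustrate the remark above that CCITT does best when there are many repeating pixels of the same color, here is a toy run-length encoder in bash. It is only a sketch of the general idea and is not the CCITT Group 4 algorithm, which is considerably more elaborate. The function name rle and the W/B notation for white and black pixels are made up for this example.

```shell
#!/bin/bash
# Toy run-length encoder: turns a row of pixels written as letters
# (W = white, B = black) into "count+letter" pairs.
# Illustration only; this is NOT how CCITT Group 4 actually encodes a page.
rle() {
    local row=$1 out="" prev="" count=0 i ch
    for (( i = 0; i < ${#row}; i++ )); do
        ch=${row:i:1}
        if [ "$ch" = "$prev" ]; then
            count=$((count + 1))          # same color: extend the current run
        else
            if [ -n "$prev" ]; then
                out+="${count}${prev} "   # color changed: emit the finished run
            fi
            prev=$ch
            count=1
        fi
    done
    if [ -n "$prev" ]; then
        out+="${count}${prev}"            # emit the final run
    fi
    echo "$out"
}

rle "WWWWWWWWBBBWWWWW"   # long runs, prints: 8W 3B 5W
rle "WBWBWBWB"           # no runs at all, so the output is longer than the input
```

A scanned monochrome page is mostly long white runs broken by short black ones, which is the first case; a dithered or color image looks more like the second.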


How to make PDF files

(This contains info that could be useful to some on making pdf files. Edit, add, delete, or modify as you see fit!)

How to make PDF files of scanned public domain sheet music

DETERMINE IF YOUR EDITION OF SHEET MUSIC IS PUBLIC DOMAIN OR NOT!!!

Procedure No.1: Using Freeware/Shareware programs


1) Obtain sheet music that is in the public domain, such as a volume of sheet music published before 1923 or a reprint of a public domain source. DETERMINE IF YOUR EDITION OF SHEET MUSIC IS PUBLIC DOMAIN OR NOT.

2) Using the Black and White setting, scan the sheet music into TIFF CCITT-4 graphics format. A program called Infothek 2000 scan (shareware from http://www.informatik.com) does this automatically.

3) Number each TIFF file sequentially, with the first file named 001.tif, the second named 002.tif, then 003.tif, etc. The Infothek program also does this automatically as you scan.

4) When finished scanning, place all of the TIFF files into a directory. Next, place in the same directory a freeware program called C42pdf (freeware from http://c42pdf.ffii.org).

5) Finally, if using Windows, exit to the DOS command prompt and go to the directory with the TIFF files and the C42 program. Type the following command at the command prompt:

C42 *.tif

All of the sequentially numbered CCITT-4 TIFF files in the directory will be combined into a single pdf file (called 001.pdf) containing all of the pages in sequential order.

6) Go back to Windows, open the 001.pdf file in Adobe Acrobat Reader (available free from Adobe) and check that all of the pages are there, clear, not cut off, and in the proper order. If a page is missing, scan it and add its TIFF file to the directory with the other TIFF files, naming it so that it falls in the right place in the sequence (for example, page 39 should be named 039.tif). If that means every later TIFF file must be renumbered by 1 (for example, 040.tif becomes 041.tif), use a batch rename program to do so, such as the freeware Rename program at http://www.1-4a.com/rename/ . Then run C42 again to rebuild the pdf.

7) Once you are satisfied with your pdf file, give it a new file name and e-mail it as an attachment to the Sheet Music Archive. This is a great technique for making pdf files, since it is simple, quick, and the files are small, typically 50-60 KB per page.
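
For those working in a bash shell instead of Windows, the numbering in step 3 and the renumbering in step 6 can be sketched as two small functions. This is only a rough alternative to the Infothek and Rename programs mentioned above; the function names number_scans and make_room are made up here, and the first function assumes none of the files are already named in the 001.tif pattern.

```shell
#!/bin/bash
# Rename every .tif file in a directory to 001.tif, 002.tif, ... following
# the alphabetical order of the current names.
# Assumption: no file is already named in the NNN.tif pattern.
number_scans() {
    local dir=$1 n=1 f
    for f in "$dir"/*.tif; do
        if [ ! -e "$f" ]; then continue; fi      # directory has no .tif files
        mv -- "$f" "$dir/$(printf '%03d' "$n").tif"
        n=$((n + 1))
    done
}

# Free up a page number by shifting that page and every later one up by 1,
# e.g. make_room scans 39 turns 039.tif into 040.tif, 040.tif into 041.tif ...
make_room() {
    local dir=$1 page=$((10#$2)) last=0 f n i
    # Find the highest page number currently present.
    for f in "$dir"/[0-9][0-9][0-9].tif; do
        if [ ! -e "$f" ]; then continue; fi
        n=$(basename "$f" .tif)
        n=$((10#$n))                             # force base 10: "039" is not octal
        if [ "$n" -gt "$last" ]; then last=$n; fi
    done
    # Work from the last page downward so nothing gets overwritten.
    for (( i = last; i >= page; i-- )); do
        f="$dir/$(printf '%03d' "$i").tif"
        if [ -e "$f" ]; then
            mv -- "$f" "$dir/$(printf '%03d' $((i + 1))).tif"
        fi
    done
}
```

After running make_room on the directory with the missing page's number, save the new scan under the freed name (039.tif in the example from step 6) and rebuild the pdf.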

Procedure No.2: Typeset your own

You may use freeware programs such as MusiXTeX to typeset public domain music. You can also use a freeware program called GhostScript (version 6.0 and above) to turn Postscript files (.ps) of sheet music into Adobe PDF files. See this page from the Werner Icking Sheet Music Archive for information on these and other programs.

Procedure No.3: Using the Adobe Acrobat Suite

Buy the Adobe Acrobat Suite from Adobe.com or from your software dealer, and use the Adobe Acrobat Exchange program in the suite to scan public domain sheet music directly into pdf files. Use a resolution of at least 300 DPI (Dots Per Inch); 400-600 DPI gives better results, but the file size will be larger. After making the pdf, you may use the Adobe Acrobat Distiller (which comes with the Acrobat Suite) to reduce the size of the PDF file further.
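
To get a feel for how much larger the files get at higher resolutions, the arithmetic for an uncompressed 1-bit scan can be sketched in bash. The 8.5 x 11 inch page size is only an assumption for the example, and the function name mono_page_bytes is made up; the figures are raw sizes before any compression is applied.

```shell
#!/bin/bash
# Uncompressed size in bytes of a 1-bit (black and white) scan.
# Arguments: page width and height in tenths of an inch, resolution in DPI.
mono_page_bytes() {
    local w_tenths=$1 h_tenths=$2 dpi=$3
    local width=$(( w_tenths * dpi / 10 ))     # pixels per row
    local height=$(( h_tenths * dpi / 10 ))    # number of rows
    # 8 pixels fit in one byte; each row is rounded up to a whole byte.
    echo $(( (width + 7) / 8 * height ))
}

# An 8.5 x 11 inch page (85 x 110 tenths) is an assumption; use your own size.
for dpi in 300 400 600; do
    echo "$dpi DPI: $(mono_page_bytes 85 110 "$dpi") bytes uncompressed"
done
```

The raw size grows with the square of the resolution, so going from 300 to 600 DPI roughly quadruples it (about 1 MB versus about 4 MB per page here).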

Procedure No.4: Using Photoshop with the Adobe Acrobat Suite

Sometimes you may want to add text to your scanned sheet music PDF file. On the title page, for example, you might want to add your own title or other info.

1) To do so, scan the first page into Adobe Photoshop. Then, use the Erase tool to erase text or other things you do not want to see on the title page. Next, use the Text tool to write in your own text, and drag the text onto the title page.

2) After you are done with your editing, save the file in Photoshop EPS format.

3) Then, use Adobe Acrobat Distiller to change the EPS file into a pdf file.

4) Now, open that title page PDF file in Adobe Acrobat Exchange. Use the Exchange program to insert, after that first title page that you have edited, a PDF file containing the other pages of the sheet music. Save the file. You now have the complete sheet music PDF file with the edited title page.
