Internet Archive Book Ripper Beta

The Internet Archive Book Ripper is a userscript that enables you to download, free from DRM, the pages of a book borrowed from the Internet Archive. The pages are downloaded as a ZIP containing .jp2 images from which a complete PDF can be generated.

The Internet Archive Book Ripper is in a beta state, meaning that it is likely to change and that errors are expected. If an error occurs, please report it.

Only Chromium-based browsers are supported so far, excluding Ungoogled Chromium.

Installation

Violentmonkey is required. Install it, then click here to install the Internet Archive Book Ripper. Other userscript managers are not supported.

Guide

The Internet Archive Book Ripper adds a new button to the book reader's interface once a book has been borrowed. Once the pages of the book have been indexed the button will become enabled. Clicking on the button will reveal a menu from which one of five sizes can be selected to download.

...

Because not all pages in a book are necessarily the same size, each listed size is the average across all pages in the book and is not a guarantee of any individual page's size. The largest size is the native resolution of the images stored on the Internet Archive servers. Downscaling is performed on the Internet Archive servers, not in your browser, so the download time is primarily affected by the size selected.
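As a rough illustration of how an averaged size list could be derived, the following Python sketch computes the mean page dimensions at several reduction factors. The factors (1, 2, 4, 8, 16) and the function name are assumptions made for this example; the userscript's actual values and logic are not documented here.

```python
from statistics import mean


def average_sizes(pages, scale_factors=(1, 2, 4, 8, 16)):
    """Return the average (width, height) of all pages at each scale factor.

    pages: list of (width, height) tuples at native resolution.
    scale_factors: hypothetical reduction factors; 1 means native resolution.
    """
    avg_width = mean(width for width, _ in pages)
    avg_height = mean(height for _, height in pages)
    return [(round(avg_width / s), round(avg_height / s)) for s in scale_factors]


# Three pages of slightly different dimensions.
pages = [(2000, 3000), (2100, 3100), (1900, 2900)]
sizes = average_sizes(pages)
for width, height in sizes:
    print(f"{width} x {height}")
```

No single page in this example need match a listed size exactly, which is why the sizes are averages rather than guarantees.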

Once a size has been selected the download progress menu will appear.

...

The dark blue bars show the progress of single files and the accompanying text shows the name of the file currently being downloaded, the number of kilobytes downloaded, and the total size of the file in kilobytes. The light blue bar shows the overall progress and the accompanying text shows how many pages have been downloaded as well as the current size of the created ZIP file.
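The two kinds of progress text described above could be produced along these lines. This is only a sketch in Python: the function names, the exact wording of the text, and the 1024-byte kilobyte are assumptions, not the userscript's own code.

```python
def file_progress(name, downloaded_bytes, total_bytes):
    """Text for a single file: name, kilobytes downloaded, total kilobytes."""
    kb_done = downloaded_bytes // 1024
    kb_total = total_bytes // 1024
    # Guard against a zero-length file to avoid dividing by zero.
    percent = 100 * downloaded_bytes / total_bytes if total_bytes else 0.0
    return f"{name}: {kb_done} KB / {kb_total} KB ({percent:.0f}%)"


def overall_progress(pages_done, pages_total, zip_bytes):
    """Text for the whole download: pages finished and current ZIP size."""
    zip_mb = zip_bytes / 1024**2
    return f"{pages_done} / {pages_total} pages, ZIP size {zip_mb:.1f} MB"


print(file_progress("page_0001.jp2", 51200, 102400))
print(overall_progress(3, 12, 5 * 1024 * 1024))
```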

The console will display any important messages, such as error messages prompted by a download failure.

The Abort button will terminate the download. Because pages are streamed straight to disk an aborted download will leave behind a ZIP containing all the pages that were successfully downloaded prior to termination.
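The idea that an aborted download still leaves a usable ZIP can be illustrated with Python's standard zipfile module. This is a minimal sketch of the general technique, not the userscript's implementation: pages are appended one at a time, and the archive is finalized whenever the loop stops. The function name, the should_abort callback, and the file names are hypothetical.

```python
import io
import zipfile


def stream_pages_to_zip(target, pages, should_abort=lambda: False):
    """Write pages into a ZIP one at a time; stop early if should_abort() is true.

    target: path or file-like object that receives the ZIP.
    pages: iterable of (filename, data) pairs.
    Returns the number of pages written.
    """
    written = 0
    # ZIP_STORED is used because .jp2 images are already compressed.
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_STORED) as archive:
        for name, data in pages:
            if should_abort():
                break
            archive.writestr(name, data)
            written += 1
    # Leaving the with-block writes the central directory, so the archive
    # is valid even when the loop stopped early.
    return written


# Simulate a download that is aborted after two pages.
buffer = io.BytesIO()
pages = [(f"page_{i:04d}.jp2", b"fake image data") for i in range(1, 6)]
calls = {"n": 0}


def abort_after_two():
    calls["n"] += 1
    return calls["n"] > 2


count = stream_pages_to_zip(buffer, pages, abort_after_two)
names = zipfile.ZipFile(buffer).namelist()
print(count, names)
```

The resulting archive contains only the pages written before the abort and can still be opened normally.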

As soon as the download is complete the progress menu will disappear and you will be free to download the book again in a different size if you wish.

Closing or refreshing the page while a download is in progress will terminate the download as though you had clicked the Abort button. Attempting to close or refresh the page while a download is in progress will trigger a prompt, ensuring that accidental termination of a download is not possible.

Multiple books may be downloaded simultaneously.

Limitations

The final ZIP can be no larger than 4 GB.
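A quick way to judge whether a book is likely to fit is to add up the page sizes before choosing a download size. The Python sketch below assumes the limit means 4 GiB (4 * 1024^3 bytes) and ignores the small amount of metadata a ZIP adds per file; both are assumptions, and the function name is hypothetical.

```python
ASSUMED_LIMIT_BYTES = 4 * 1024**3  # assumption: "4 GB" interpreted as 4 GiB


def fits_within_limit(page_sizes_bytes, limit=ASSUMED_LIMIT_BYTES):
    """Return True if the summed page sizes stay within the limit.

    ZIP metadata overhead is ignored, so treat a result close to the
    limit with caution.
    """
    return sum(page_sizes_bytes) <= limit


one_gib = 1024**3
print(fits_within_limit([one_gib] * 3))  # three 1 GiB files
print(fits_within_limit([one_gib] * 5))  # five 1 GiB files
```

If the estimate exceeds the limit, selecting a smaller size from the menu reduces the total.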

Contact

To report an issue, email bookripper-issues@protonmail.com and give as much detail as you can. Emails to this address that are not issue reports will be ignored.

All other inquiries can be sent to bookripper@protonmail.com.

Creating a PDF on Linux

The following tools can be used to turn the downloaded images into a PDF: pdftk, tesseract, unpaper, img2pdf, and ImageMagick's convert.
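As one possible starting point, the Python sketch below assembles an img2pdf command line for a folder of extracted .jp2 pages, sorted so that page 10 follows page 9. It assumes img2pdf is installed and on the PATH, and that the pages were unzipped into a single directory; the function names, directory layout, and file names are hypothetical. OCR with tesseract or cleanup with unpaper is not covered here.

```python
import re
import subprocess
from pathlib import Path


def natural_key(path):
    """Sort key that orders 'page_10.jp2' after 'page_9.jp2'."""
    parts = re.split(r"(\d+)", path.name)
    return [int(part) if part.isdigit() else part.lower() for part in parts]


def build_img2pdf_command(image_dir, output_pdf):
    """Return the img2pdf argument list for all .jp2 files in image_dir."""
    pages = sorted(Path(image_dir).glob("*.jp2"), key=natural_key)
    if not pages:
        raise FileNotFoundError(f"no .jp2 files found in {image_dir}")
    return ["img2pdf", *(str(page) for page in pages), "-o", str(output_pdf)]


def create_pdf(image_dir, output_pdf):
    """Run img2pdf; raises CalledProcessError if the tool reports failure."""
    subprocess.run(build_img2pdf_command(image_dir, output_pdf), check=True)
```

Calling create_pdf("book_pages", "book.pdf") would then write book.pdf from the images in the book_pages directory, provided img2pdf accepts the files.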