
Command Line Capture of Web Pages to PDF, PNG, SVG, etc.

January 5, 2010

Posted by Robert Harder in: Utility

If you’re on a Mac, you probably know that in any application that prints, you can “print” to a PDF file — handy to be sure — but from the command line, it’s not so easy. Enter CutyCapt, a cross-platform tool that lets you capture web pages in a variety of formats including SVG, PDF, PS, PNG, JPEG, TIFF, GIF, and BMP using WebKit as the rendering engine.

Update. I fixed the page size so that CutyCapt creates US Letter sized pages instead of A4 pages. I had to change line 262 of CutyCapt.cpp to printer.setPaperSize(QSizeF(8.5,11),QPrinter::Inch);.

Only a Windows executable is provided on the website, so I had to download the source and compile it myself. I don’t remember exactly what I did to compile it, except that I needed to install Qt. I think I typed qmake and then xcodebuild.

Download

Here is what I compiled. I have not packaged it as a PKG file. You’ll get a generic-looking application (.app) and a symbolic link pointing to the executable within the application. I recommend copying both files to /usr/local/bin, if that is in your PATH, or somewhere else that is accessible.
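Before copying anything, it may help to confirm that /usr/local/bin really is in your PATH. The sketch below is one way to check from a shell; the helper name in_path is made up for this illustration and is not part of CutyCapt.

```shell
# Hypothetical helper: succeed if the directory given as $1 is listed in $PATH.
in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if in_path /usr/local/bin; then
    echo "/usr/local/bin is in PATH"
else
    echo "/usr/local/bin is not in PATH; pick another directory or extend PATH"
fi
```

If the check fails, any other directory that is in your PATH should work just as well.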

Download CutyCapt.zip

Usage

Running CutyCapt without arguments gives us the following options:

-----------------------------------------------------------------------------
Usage: CutyCapt --url=http://www.example.org/ --out=localfile.png
-----------------------------------------------------------------------------
 --help                         Print this help page and exit
 --url=<url>                    The URL to capture (http:...|file:...|...)
 --out=<path>                   The target file (.png|pdf|ps|svg|jpeg|...)
 --out-format=<f>               Like extension in --out, overrides heuristic
 --min-width=<int>              Minimal width for the image (default: 800)
 --max-wait=<ms>                Don't wait more than (default: 90000, inf: 0)
 --delay=<ms>                   After successful load, wait (default: 0)
 --user-styles=<url>            Location of user style sheet, if any
 --header=<name>:<value>        request header; repeatable; some can't be set
 --method=<get|post|put>        Specifies the request method (default: get)
 --body-string=<string>         Unencoded request body (default: none)
 --body-base64=<base64>         Base64-encoded request body (default: none)
 --app-name=<name>              appName used in User-Agent; default is none
 --app-version=<version>        appVers used in User-Agent; default is none
 --user-agent=<string>          Override the User-Agent header Qt would set
 --javascript=<on|off>          JavaScript execution (default: on)
 --java=<on|off>                Java execution (default: unknown)
 --plugins=<on|off>             Plugin execution (default: unknown)
 --private-browsing=<on|off>    Private browsing (default: unknown)
 --auto-load-images=<on|off>    Automatic image loading (default: on)
 --js-can-open-windows=<on|off> Script can open windows? (default: unknown)
 --js-can-access-clipboard=<on|off> Script clipboard privs (default: unknown)
-----------------------------------------------------------------------------
 <f> is svg,ps,pdf,itext,html,rtree,png,jpeg,mng,tiff,gif,bmp,ppm,xbm,xpm
-----------------------------------------------------------------------------
http://cutycapt.sf.net - (c) 2003-2008 Bjoern Hoehrmann - bjoern@hoehrmann.de

Based on the example provided, try a simple test:

$ CutyCapt --url=http://blog.iharder.net --out=blog.png

I get a nice long capture of my blog as a PNG file.
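According to the help text above, the output format is guessed from the extension given to --out, so the same page can be captured in several formats by changing only the file name. The loop below is a rough sketch of that idea. The file names are just examples, and the command is skipped if CutyCapt is not found on your PATH.

```shell
url="http://blog.iharder.net"

# Capture the same page once per format; the extension picks the format.
for ext in png pdf svg; do
    out="blog.$ext"
    if command -v CutyCapt >/dev/null 2>&1; then
        CutyCapt --url="$url" --out="$out"
    else
        echo "CutyCapt not found; would have written $out"
    fi
done
```

If the extension is ambiguous, the --out-format option listed above is described as overriding the heuristic.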

The program does not seem to work if you use a tilde (~) in the path to represent your home folder.
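A likely cause, though I have not confirmed it, is that shells such as bash expand ~ only at the start of a word, so in --out=~/blog.png the tilde reaches CutyCapt as a literal character. A possible workaround is to spell out the home folder with $HOME, which the shell expands anywhere in a word. This is only a sketch, and the output file name is an example.

```shell
# Assumption: "~" after "--out=" is not expanded by the shell, but $HOME is.
out="$HOME/blog.png"
printf 'Output path: %s\n' "$out"

# Only attempt the capture if CutyCapt is actually on the PATH.
if command -v CutyCapt >/dev/null 2>&1; then
    CutyCapt --url=http://blog.iharder.net --out="$out"
fi
```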

Enjoy!

Comments»

1. davidmit - July 11, 2010

Can you talk about the compilation process? The CutyCapt website suggests running qmake, but I get an error stating that there is no makefile…

2. Robert Harder - July 11, 2010

Wow, I’ve completely forgotten how I compiled it, but I’ll see what I can dredge up when I can. Rob

3. Ezekiel Templin - July 16, 2010

This is the tutorial I used to install it:
http://daveelkins.com/2009/04/10/setting-up-headless-xserver-and-cutycapt-on-ubuntu/

4. Robert Harder - August 2, 2010

In recovering from a hard drive crash, I noticed that I also had to install Qt for CutyCapt to work. From the page http://qt.nokia.com/downloads/qt-for-open-source-cpp-development-on-mac-os-x I downloaded the file “Cocoa: Mac binary package for Mac OS X 10.5 – 10.6 (32-bit and 64-bit)
http://get.qt.nokia.com/qt/source/qt-mac-cocoa-opensource-4.6.3.dmg (172 MB, includes build and interface tools).” -Rob

5. Cameron Eagans - January 5, 2012

Here are the steps that I used:

Install Qt from Homebrew:

brew install qt

In the CutyCapt source directory:

qmake
make
./CutyCapt.app/Contents/MacOS/CutyCapt --url=http://cweagans.net --out=cweagans.net.png

6. Cameron Eagans - January 5, 2012

I guess you can also just use Homebrew to install CutyCapt:

brew install qt cuty_capt

7. Gabriel Le Breton - August 26, 2012

3 last comments are spam xD

Anyway, just to say your compiled Mac OS X version of CutyCapt doesn’t work on Mac OS X Mountain Lion 10.8.1 (I did not try earlier versions).

I will try to compile it tonight once Qt is installed. Thanks for pointing out your compilation instructions. If I get it to work, I’ll post a link below.

8. Robert Harder - August 26, 2012

@Gabriel Thanks for the spam alert. WordPress missed them.

Doesn’t work on Mountain Lion? Nuts. I really only use CutyCapt once each year to archive a blog, so I would not have noticed this for several months yet. Hope it still compiles…

BTW you might need to update the source to hard code US Letter size paper, if that’s what you use.

9. Robert Harder - August 26, 2012

@Gabriel I’ve got Mountain Lion 10.8.0, and CutyCapt still works for me, though I get a funny warning: “WARNING: Phonon needs QCoreApplication::applicationName to be set to export audio output names through the DBUS interface”