Encrypted Password-Protected PDF Files

David A. Harding

I have a favourite program, much(1), that I use to generate randomized tests in PDF format. I've used it for tests I've distributed, like the LUG/IP LPI102 practice test, and for personal tests I've written for myself.

With a little work, much can be configured to generate multiple-choice tests of increasing difficulty. Combining this feature with encrypted, password-protected PDFs lets me create a series of tests. To access test 2, you need the correct answers to test 1; to access test 3, you need the correct answers to test 2; and so forth. But now I realise that passwords based on multiple-choice answers are weak. In response, I could put more questions on each test. But how many questions are enough?

Brute-forcing password-protected PDFs is easy. A simple shell script:

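dave@europa:~$ cat bin/crack-pdf
#!/bin/ash
# Usage: crack-pdf FILE.pdf < wordlist
# Try each line of standard input as the user password for the PDF in $1.

while IFS= read -r pw
do
        # Quote both variables so spaces or glob characters in a candidate
        # password or filename are passed through intact. pdfinfo fails on a
        # wrong password; its error messages are discarded.
        pdfinfo -upw "$pw" "$1" 2> /dev/null && echo "Password: $pw" && exit 0
done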
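
Because the script reads candidates from standard input, anything that prints one guess per line can feed it. The sketch below shows roughly what an attack on the test series might look like. It assumes, purely for illustration, that a password is the answer letters concatenated in question order (for example "bdac"); gen_keys is a hypothetical helper, not part of much or crack-pdf.

```shell
#!/bin/sh
# gen_keys N PREFIX: print every answer key of N more letters, each letter
# one of a, b, c, d, appended to PREFIX. The one-letter-per-question
# password format is an assumption made for this example.
gen_keys() {
    if [ "$1" -eq 0 ]; then
        printf '%s\n' "$2"
        return
    fi
    for letter in a b c d; do
        gen_keys "$(($1 - 1))" "$2$letter"
    done
}

# Three questions with four answers each: 4 * 4 * 4 = 64 candidates.
gen_keys 3 ""
```

Piping the output into the cracker, as in `gen_keys 3 "" | crack-pdf test2.pdf` (test2.pdf being a placeholder name), would try every key in turn.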

To test, I created an encrypted password-protected PDF with a password equal to the last word in the local dictionary.

dave@europa:~$ tail -n 1 /usr/share/dict/words 
études

dave@europa:~$ pdftk report-cover.pdf output foo.pdf encrypt_128bit owner_pw foo user_pw études

Then I counted the words in the dictionary and timed how long my script took to check them all.

dave@europa:~$ wc -l /usr/share/dict/words 
96274 /usr/share/dict/words

dave@europa:~$ time crack-pdf foo.pdf < /usr/share/dict/words 
Creator:        TeX
Producer:       pdfTeX-1.10b
CreationDate:   Thu Sep  1 19:34:00 2005
Tagged:         no
Pages:          2
Encrypted:      yes (print:no copy:no change:no addNotes:no)
Page size:      595.276 x 841.89 pts (A4)
File size:      18252 bytes
Optimized:      no
PDF version:    1.4
Password: études

real    8m24.843s
user    6m29.093s
sys     1m26.598s
 

My script was very inefficient, but I have a baseline. Roughly 190 passwords were checked every second (96,274 words in about 505 seconds). The fine science (art?) of statistics eludes me, but I reckon 15 or more questions, each with 4 or more answers, will prevent a brute-force attack from succeeding in a reasonable amount of time.
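
The estimate can be checked with a little integer arithmetic. This is a rough sketch: it takes the measured rate as about 190 passwords per second (96,274 words in 8m24.8s) and assumes every question has exactly 4 answers and every answer key is equally likely. The keyspace function name is my own.

```shell
#!/bin/sh
# keyspace QUESTIONS CHOICES: print the number of possible answer keys,
# i.e. CHOICES multiplied by itself QUESTIONS times.
keyspace() {
    total=1
    n=$1
    while [ "$n" -gt 0 ]; do
        total=$((total * $2))
        n=$((n - 1))
    done
    echo "$total"
}

rate=190                        # passwords per second, from the timing above
keys=$(keyspace 15 4)           # 15 questions, 4 answers each
seconds=$((keys / rate))        # time to try every key, rounded down

echo "answer keys: $keys"
echo "worst case:  $((seconds / 86400)) days"
```

At this rate, 15 questions give 1,073,741,824 keys and a worst case of about 65 days, or about half that on average. Ten questions give only 1,048,576 keys, which the same script would exhaust in roughly an hour and a half. A faster cracker would shrink both figures.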