The beginning of the book is composed largely of Arial, Helvetica, and the occasional Times New Roman. As you might expect, these are by far the most common fonts used in documents. By pages 46 and 47, things have progressed to a lot of Arial Bold and Times Italic. In the 200s, commonly used script fonts, as well as much more obscure faces, begin to appear. By the end, the book has devolved significantly: non-Roman fonts, highly specialized typefaces, and even pictogram fonts abound.
For each of the 5,483 unique words in the book, we ran a search (using the Yahoo! Search API) restricted to PDF files. We downloaded the top 10 to 15 hits for each word, producing 64,076 PDF files (some were no longer available; others were duplicates). Inside these PDFs were 347,565 subsetted fonts. From these fonts, 55,382 unique glyph shapes were used to fill the 342,889 individual letters found in the text of Shelley's Frankenstein.
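The text does not spell out how glyphs were assigned to letters, but the drift from common to obscure fonts over the course of the book suggests that fonts were consumed roughly in order of frequency. The following is a minimal sketch of that idea in Python, not the project's actual code. The input format (a list of `(font_name, character)` pairs, one per extracted glyph), the `fill_text` helper, and the rule of reusing the last remaining glyph for a character are all assumptions made for illustration.

```python
from collections import Counter, defaultdict, deque


def fill_text(text, glyphs):
    """Pick a font for every non-space character, most common fonts first.

    `glyphs` is a hypothetical list of (font_name, character) pairs,
    one entry per glyph extracted from the PDFs.
    Returns a list of (character, font_name or None).
    """
    # Rank fonts by how many glyphs they contributed (rank 0 = most common).
    font_counts = Counter(font for font, _ in glyphs)
    rank = {font: i for i, (font, _) in enumerate(font_counts.most_common())}

    # Group the available fonts by character, ordered by font rank.
    pools = defaultdict(list)
    for font, char in glyphs:
        pools[char].append(font)
    queues = {c: deque(sorted(fonts, key=rank.__getitem__))
              for c, fonts in pools.items()}

    result = []
    for char in text:
        if char.isspace():
            continue
        queue = queues.get(char)
        if not queue:
            result.append((char, None))  # no glyph was found for this character
            continue
        font = queue[0]
        if len(queue) > 1:
            queue.popleft()  # assumption: reuse the last glyph once the pool runs dry
        result.append((char, font))
    return result
```

With this ordering, early occurrences of a letter draw on the most widespread fonts and later occurrences fall back to rarer ones, which matches the progression described above. Because the last glyph is reused, the number of letters filled can exceed the number of unique glyph shapes.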
This project started from a fascination with the way PDF files contain incomplete versions of fonts. The shape data is of high enough quality to reproduce the original document; however, only the necessary characters (and little of the font's “metrics” that are used for proper typographic layout) are included in the PDF. This prevents others from extracting the fonts for practical use, but it creates an opportunity for a curious Victor Frankenstein who wants to use these incomplete pieces to create something entirely different.
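Subsetted fonts are easy to recognize inside a PDF: the PDF specification requires a subset's font name to begin with a tag of exactly six uppercase letters followed by a plus sign, as in `ABCDEF+Arial-BoldMT`. The sketch below shows how such names could be split and tallied in Python. The helper names `split_subset_name` and `count_base_fonts` are hypothetical, and the code assumes the font names have already been read out of the PDFs by some other tool.

```python
import re
from collections import Counter

# Subset tag defined by the PDF specification: six uppercase letters and a "+".
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def split_subset_name(base_font):
    """Return (is_subset, name_without_tag) for a PDF font name."""
    match = SUBSET_TAG.match(base_font)
    if match:
        return True, base_font[match.end():]
    return False, base_font


def count_base_fonts(names):
    """Tally font names after stripping any subset tag."""
    return Counter(split_subset_name(name)[1] for name in names)
```

For example, `split_subset_name("ABCDEF+Arial-BoldMT")` returns `(True, "Arial-BoldMT")`, while a name without a tag, such as `"Helvetica"`, comes back unchanged with `False`.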