All your research in a picture

For some Friday fun, here’s a word cloud generated from my Mendeley library. It’s based on a collection of about 400 PDF files, the papers I read for my research. Neutral ecology is a bit overrepresented as I used to maintain a Mendeley group on the topic, but otherwise the keywords do reflect what I do.

Word cloud of research keywords


The “making of”

If you’d like to make your own, the first step is to extract the text from all the PDFs automatically. I used pdftotext, a command-line tool that ships with both Xpdf and Poppler. On OS X with MacPorts, installation is as simple as sudo port install poppler. On Ubuntu, pdftotext is in the poppler-utils package.

On Unix-like systems, the following loop writes the text of every PDF in the current folder into a single file, allpapers.txt. It appends to that file, so delete any existing allpapers.txt before re-running it:

for i in *.pdf; do pdftotext "$i" - >> allpapers.txt ; done

Now we can use this data to make the cloud. There are several tools that can do this, the most popular being Wordle. I wanted finer control over the end result than what Wordle provides, so I used Mathematica instead, based on this code.
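If you just want to check which words will dominate the cloud before drawing anything, a quick frequency count in the shell is enough. This is only a rough sketch of the counting step, not the Mathematica code I used; the function name word_freq and the "longer than three letters" cutoff are my own assumptions, and there is no real stop-word filtering.

```shell
# Rough word-frequency count for the extracted text.
# word_freq is a hypothetical helper name, not part of any tool.
word_freq() {
    tr -cs '[:alpha:]' '\n' < "$1" |   # put each run of letters on its own line
    tr '[:upper:]' '[:lower:]' |       # fold to lower case
    awk 'length($0) > 3' |             # assumed cutoff: drop words of 3 letters or fewer
    sort | uniq -c | sort -rn          # count occurrences, most frequent first
}
```

Running word_freq allpapers.txt | head -n 50 then lists the 50 most frequent words with their counts. Depending on your tr implementation, accented letters may be split into separate pieces, so treat the output as approximate.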
