To run the VM you have to download and install VirtualBox.
If you're on OS X, Windows, or Solaris you can download VirtualBox from its homepage. Run the installer and follow the steps.
If you're on Linux, chances are VirtualBox can be installed through your package manager. If that doesn't work, you can download .deb and .rpm packages from the homepage.
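To confirm that the installation worked, you can check whether the `VBoxManage` command-line tool that ships with VirtualBox is reachable. The following is a minimal sketch, assuming `VBoxManage` is on your `PATH` (on Windows it often is not, in which case the check reports it as missing even though VirtualBox is installed). The function names are made up for this example.

```python
import shutil
import subprocess


def find_vboxmanage():
    """Return the full path to VBoxManage, or None if it is not on PATH."""
    return shutil.which("VBoxManage")


def virtualbox_version(executable):
    """Return the version string printed by `VBoxManage --version`."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    path = find_vboxmanage()
    if path is None:
        print("VBoxManage not found on PATH; VirtualBox may not be installed.")
    else:
        print("Found VirtualBox", virtualbox_version(path))
```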
The virtual machine is distributed as an Open Virtualization Archive (OVA). We have tested it on OS X (Mavericks), Windows 7, and Ubuntu.
The VM image (1.6 GB) can be downloaded from the SURFsara Beehub server.
Open VirtualBox and import the VM by selecting "File → Import Appliance". Follow the on-screen instructions.
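If you prefer the command line over the GUI, the same import can probably be done with `VBoxManage import`. The sketch below only builds and prints the command; the file name `example-vm.ova` is a placeholder for wherever you saved the download, and `import_command` is a helper invented for this example. Passing `--dry-run` asks VirtualBox to show what it would import without changing anything.

```python
import shlex


def import_command(ova_path, dry_run=False):
    """Build the VBoxManage command line that imports an OVA appliance."""
    cmd = ["VBoxManage", "import", ova_path]
    if dry_run:
        # Show how the appliance would be imported without actually importing it.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # "example-vm.ova" is a placeholder path, not the real file name.
    print(shlex.join(import_command("example-vm.ova", dry_run=True)))
```

To run the import for real, pass the returned list to `subprocess.run(cmd, check=True)`.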
The VM comes with sensible default settings:
These settings should work on most systems, but if you run into trouble, they are what you might want to tweak.
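Settings can be changed in the VirtualBox GUI, or from the command line with `VBoxManage modifyvm` while the VM is powered off. As a hedged illustration, the sketch below builds such a command for memory size and CPU count. These two settings are assumed here as typical candidates for tweaking, not taken from the VM's actual defaults. The VM name `example-vm` is a placeholder (`VBoxManage list vms` shows the real name after import), and `modify_command` is a helper invented for this example.

```python
import shlex


def modify_command(vm_name, memory_mb=None, cpus=None):
    """Build a VBoxManage modifyvm command for the given settings."""
    cmd = ["VBoxManage", "modifyvm", vm_name]
    if memory_mb is not None:
        # Amount of RAM given to the guest, in megabytes.
        cmd += ["--memory", str(memory_mb)]
    if cpus is not None:
        # Number of virtual CPUs given to the guest.
        cmd += ["--cpus", str(cpus)]
    return cmd


if __name__ == "__main__":
    # Placeholder VM name and example values; adjust to your host machine.
    print(shlex.join(modify_command("example-vm", memory_mb=2048, cpus=2)))
```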
You can start the VM by clicking the Start arrow in the VirtualBox interface. Once Ubuntu has booted, you can start getting a feel for Hadoop and the Common Crawl data!
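Starting the VM can also be scripted with `VBoxManage startvm`. This is a small sketch under the same assumptions as before: `example-vm` is a placeholder for the name the VM received on import, and `start_command` is a hypothetical helper. The `--type gui` option opens the normal VirtualBox window, while `--type headless` starts the VM without one.

```python
import shlex


def start_command(vm_name, headless=False):
    """Build the VBoxManage command line that boots a VM."""
    frontend = "headless" if headless else "gui"
    return ["VBoxManage", "startvm", vm_name, "--type", frontend]


if __name__ == "__main__":
    # Placeholder VM name; replace it with the name shown in VirtualBox.
    print(shlex.join(start_command("example-vm")))
```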