Setting up the psiTurk Docker container
September 17, 2017
This is the third post in a series of blog posts about how to run an experiment with Amazon Mechanical Turk, using psiTurk.
This post will cover how to run the psiTurk example experiment inside of a Docker container.
Setting up the psiTurk Docker container
Now that we have Docker installed, let’s work on setting up the psiTurk Docker container. First, create a directory for your experiment on your Linode server:
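A minimal sketch of this step, assuming the directory is named ~/psiturk-example (the name used in the rest of this post). The -p flag is an addition here so that the command does not fail if the directory already exists:

```shell
# Create the experiment directory in your home folder on the Linode server.
mkdir -p ~/psiturk-example
```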
Next, change to the newly created directory and create a file called Dockerfile in that directory:
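Something along these lines should do it. This is a sketch that assumes the ~/psiturk-example directory name used throughout this post; touch simply creates an empty file:

```shell
mkdir -p ~/psiturk-example   # no-op if the directory already exists
cd ~/psiturk-example

# Create an empty Dockerfile that we will fill in next.
touch Dockerfile
```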
A Dockerfile is a set of instructions for building Docker images. An image is the basis for a container. You can think of a running Docker container as an instance of a Docker image. (See this Stack Overflow question for more discussion.) The set of instructions in the Dockerfile is what allows this process to be automated and reproducible.
Now, we want to create a Docker image that runs psiTurk and serves our experiment. I’ve already created a publicly available Docker image with psiTurk that can be used as the basis for other Docker images, so the Dockerfile that we have to create is pretty simple, consisting of only two lines: a FROM instruction that names the base image adamliter/psiturk:latest, and a VOLUME instruction that names the folder /psiturk.
You can add those two lines to your Dockerfile with a text editor or from the command line.
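As a sketch of the command-line route, the two lines can be written with a here-document. The image name and the /psiturk path come from this post; the exact VOLUME spelling (plain path rather than JSON array form) is an assumption, and both forms are valid Dockerfile syntax:

```shell
mkdir -p ~/psiturk-example   # no-op if the directory already exists
cd ~/psiturk-example

# Write the two-line Dockerfile (this overwrites any existing Dockerfile here).
cat > Dockerfile <<'EOF'
FROM adamliter/psiturk:latest
VOLUME /psiturk
EOF

# Print the file to double-check its contents.
cat Dockerfile
```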
The FROM instruction tells Docker to build the image on top of the publicly available image called adamliter/psiturk:latest, which is an image that contains the latest version of psiTurk.
The VOLUME instruction tells Docker to expect the user to map a folder from the host computer to the folder /psiturk inside the Docker container. This will allow you to edit the experiment files on the host computer (i.e., the Linode server) and have these changes automatically reflected inside the container itself! 🎈
Now that you have your Dockerfile set up, go ahead and run the following command:
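Here is a sketch of the build step. Typing <YOUR_USERNAME> literally would be read by the shell as a redirection, so the username goes in a variable instead. The variable names and the helper function build_psiturk_image are hypothetical conveniences, not part of Docker or psiTurk:

```shell
# Placeholder: your DockerHub username, or your account name on the Linode server.
DOCKER_USER="your_username"
IMAGE="${DOCKER_USER}/psiturk-example"

# Hypothetical helper: build and tag the image from the Dockerfile
# in the current directory (that is what the trailing dot means).
build_psiturk_image() {
  docker build -t "$IMAGE" .
}
```

Calling build_psiturk_image from inside ~/psiturk-example is then equivalent to typing docker build -t your_username/psiturk-example . by hand.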
The docker build command is used to build images from Dockerfiles.
The -t flag allows you to specify the name of the image. If you ever intend on making this publicly available on DockerHub, you should replace <YOUR_USERNAME> with the username for your DockerHub account.
If you don’t have a DockerHub account and/or don’t intend to ever make it publicly available, it’s still a good idea to replace <YOUR_USERNAME> with something. You could use the same username as the account on your Linode server that you’re currently signed in to.
Finally, the . at the end of the command just tells docker to look for a Dockerfile in the current directory.
After this runs, you’ll now have a Docker image containing psiTurk and a prespecified place for where you ought to mount your experiment files. Let’s move on to generating and setting up the experiment files.
Setting up the experiment files
For the purposes of this blog post, we’re just going to use the default experiment from psiTurk, which you can generate with the psiTurk command psiturk-setup-example. However, you currently don’t have Python installed on your Linode server, much less psiTurk! But Python and psiTurk are installed inside the Docker image that you just built! 👀
Let’s see how we can take advantage of that fact. First, make sure that your ~/psiturk-example directory only contains the Dockerfile by running the ls command inside of it.
The ls command should just return Dockerfile. Next, run the following command from inside of the ~/psiturk-example directory. You can run the command cd ~/psiturk-example first, if you want to make sure you’re in that directory:
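The flags are explained one by one below. As a sketch, the full command can be wrapped like this. The variable names and the helper function run_setup_container are hypothetical; the flags, port, container name, and mount point are the ones described in this post:

```shell
# Placeholder: must match the name you gave the image when building it.
DOCKER_USER="your_username"
IMAGE="${DOCKER_USER}/psiturk-example"

# Hypothetical helper: start an interactive, self-removing container with
# the current directory mounted at /psiturk and port 22362 published.
run_setup_container() {
  docker run -it --rm \
    --name psiturk-example \
    -p 22362:22362 \
    -v "$(pwd):/psiturk" \
    "$IMAGE"
}
```

Run run_setup_container from inside ~/psiturk-example, since $(pwd) is evaluated at the moment the function is called.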
Let’s break down the docker run command, starting from the back. The part at the very end of the command (<YOUR_USERNAME>/psiturk-example) is the name of the image that is the basis for the container that you’re instantiating and running (remember, a container is a running instance of an image). So, whatever you named your image in the command above, you’ll want to make sure that it is the same name here.
The next part is the -v flag. This is short for --volume, and it is what allows you to map a directory on the host to a directory inside the container. In this case, $(pwd) is being mapped to the folder /psiturk inside of the container. $(pwd) is a command substitution that evaluates to your current working directory, which should be something like /home/<YOUR_USERNAME>/psiturk-example.
The -p flag (short for --publish) “publishes” a port from the container to a port on the host computer. Like with the -v flag, the part on the left side of the : is for the host computer, and the stuff on the right side of the : is for the container. Mapping port 22362 on the host computer to port 22362 of the container just means that the traffic on port 22362 of the host computer will be passed to port 22362 of the Docker container.[1]
The next part from the back of the command is --name psiturk-example. This one is pretty straightforward. The --name flag allows you to specify a name for the container. In principle, it could be whatever you want it to be. It doesn’t have to be related to the name of the image (which is <YOUR_USERNAME>/psiturk-example), but it is nonetheless often convenient to give the container a name that is related to the image, which is why I’ve named the container psiturk-example.
Working backwards still, the next part of the command is --rm. The --rm flag simply means that the container will be automatically removed when it is exited. This helps reduce clutter so you don’t accidentally end up with a bunch of stopped Docker containers that you’ve long since forgotten about.
Lastly, the -i and -t flags to the docker run command are related, so I will talk about them together. The -t flag allocates a pseudo-TTY, which basically just means that the shell running inside of the Docker container gets a terminal to talk to. The -i flag keeps STDIN open even if you are not attached to the container, so that the commands you type actually reach that shell. Together, they give you an interactive shell where you can execute commands inside of the Docker container.
After running this command, you should now be looking at a shell prompt that looks something like this:
root@66e24f5bfce1:/psiturk#
This means you’re signed in as root to the machine 66e24f5bfce1, and you’re currently in the directory /psiturk (the alphanumeric string is the ID of your container, which Docker uses as the container’s hostname, so it will be different for you). Congratulations, you’re inside the Docker container! 🔧 🎉
Remember that this container includes both Python and psiTurk, so we can run psiTurk commands. In particular, run the psiturk-setup-example command from inside of the Docker container.
This will create a directory called psiturk-example, which contains the experiment files for the example experiment that ships with psiTurk. You should also see a message indicating that a file called ~/.psiturkconfig was created. I’ve configured the Docker image that your image is built on top of to expect the .psiturkconfig file to be in the same directory as the experiment files. So, go ahead and move it there by running the following command from inside of the Docker container:
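A minimal sketch of that move. It assumes the example experiment was generated at /psiturk/psiturk-example, which follows from running psiturk-setup-example in /psiturk; the EXPERIMENT_DIR variable and the existence checks are additions for safety:

```shell
# Assumed location of the generated experiment files inside the container.
EXPERIMENT_DIR=/psiturk/psiturk-example

# Move the global config next to the experiment files, but only if
# both the config file and the target directory actually exist.
if [ -f "$HOME/.psiturkconfig" ] && [ -d "$EXPERIMENT_DIR" ]; then
  mv "$HOME/.psiturkconfig" "$EXPERIMENT_DIR/"
fi
```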
After doing that, go ahead and disconnect from the Docker container by either typing exit and hitting ENTER or by hitting CTRL+d on the keyboard.
You should now be back at a shell prompt for your Linode server. If you run the command docker ps -a to show all containers (plain docker ps only lists the running ones), you won’t see any containers since the container you were just in was configured to remove itself (--rm) when you disconnected from it. So did we lose the files that we just created inside of the container?!
Nope, those files live on the host computer, your Linode server. This is because we mapped the folder ~/psiturk-example on your Linode server to the folder /psiturk inside of the container. All changes made under /psiturk inside the container appear in ~/psiturk-example on the host machine, and vice-versa.
If you run the command ls ~/psiturk-example, you should see:
Dockerfile psiturk-example
And if you run the command ls ~/psiturk-example/psiturk-example, you should see:
config.txt custom.py herokuapp.py herokuapp.pyc Procfile requirements.txt runtime.txt static templates
Yes, you do have a folder called psiturk-example inside of a folder called psiturk-example, but so it goes … ¯\_(ツ)_/¯
The folder ~/psiturk-example/psiturk-example is what contains the experiment files for the example experiment that psiTurk automatically generated.
Let’s go ahead and make one small edit to the config.txt file in that folder. In the config.txt file, there’s a line that says host = localhost. I’ve never had any luck getting psiTurk to work with this default setting. Instead, we’re going to change this line to say host = 0.0.0.0.
Since the files were all created by the root user inside the Docker container, your user currently does not have sufficient permission to edit the files. Go ahead and change the ownership of the files by running the following command on your Linode server (replacing both instances of your_username with the username for your account on your Linode server):
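A sketch of the ownership change, with the username held in a variable so that it is only typed once. It assumes your account has a group with the same name as the user (the usual default on Ubuntu and Debian) and that you have sudo rights; the LINODE_USER variable and the id check are additions:

```shell
# Placeholder: the username for your account on the Linode server.
LINODE_USER="your_username"

# The files were created by root inside the container, so sudo is needed.
# The id check skips the command if the placeholder was not replaced
# with a real account name.
if id "$LINODE_USER" >/dev/null 2>&1; then
  sudo chown -R "${LINODE_USER}:${LINODE_USER}" "$HOME/psiturk-example"
fi
```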
Now you can edit the config.txt file to change the line to say 0.0.0.0 for the host. Do so by running the following command on your Linode server:
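One non-interactive way to make this edit is with sed. The sketch below runs against a scratch copy so that you can see the effect first; the sample file contents are illustrative, and only the host line matters:

```shell
# Scratch file standing in for ~/psiturk-example/psiturk-example/config.txt
scratch="$(mktemp -d)"
printf '[Server Parameters]\nhost = localhost\nport = 22362\n' > "$scratch/config.txt"

# Rewrite the host line in place; -i.bak keeps a backup copy as config.txt.bak.
sed -i.bak 's/^host = localhost$/host = 0.0.0.0/' "$scratch/config.txt"

# Show the edited line.
grep '^host' "$scratch/config.txt"
```

To apply it for real, run the same sed line against ~/psiturk-example/psiturk-example/config.txt. Opening the file in a text editor such as nano works just as well.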
We are now finally ready to try out the example experiment! To do so, restart the docker container. Note that the command this time is going to be slightly different!
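As a sketch, the second docker run looks like this. The variable names and the helper function run_experiment_container are hypothetical; the only change from the earlier command is the host side of the -v mapping:

```shell
# Placeholder: must match the name you gave the image when building it.
DOCKER_USER="your_username"
IMAGE="${DOCKER_USER}/psiturk-example"

# Hypothetical helper: same flags as before, but this time the nested
# psiturk-example directory is what gets mounted at /psiturk.
run_experiment_container() {
  docker run -it --rm \
    --name psiturk-example \
    -p 22362:22362 \
    -v "$(pwd)/psiturk-example:/psiturk" \
    "$IMAGE"
}
```

As before, call run_experiment_container from inside ~/psiturk-example so that $(pwd)/psiturk-example points at the generated experiment files.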
The command is almost the same, except that this time, you’re mapping $(pwd)/psiturk-example on the host computer to /psiturk inside of the container (instead of just $(pwd)). The /psiturk directory inside of the container should contain the experiment files; you don’t want to have the experiment files nested one directory down. So we want to map the nested psiturk-example directory on the host machine to the /psiturk directory inside of the container.
You should now be looking at a shell prompt for the Docker container, and if you run the ls command you should see the experiment files:
config.txt custom.py herokuapp.py herokuapp.pyc Procfile requirements.txt runtime.txt static templates
Go ahead and start the psiTurk shell by running the command psiturk inside of your Docker container. You should now see the psiTurk shell’s startup output, followed by its prompt.
Again, this tutorial series assumes familiarity with how psiTurk works, so if you’re not familiar with the psiTurk shell, I suggest you spend some time reading the psiTurk documentation.
Once the psiTurk shell is running, type the command server on from inside of the psiTurk shell.
If all goes well, you should see a message that says:
Now serving on http://0.0.0.0:22362
This means that the psiTurk example experiment is accessible on port 22362 of the Docker container … which is published to port 22362 of your Linode server … which is publicly accessible … which means that you can access the example experiment by navigating to the following URL in a browser: http://<LINODE_IP_ADDRESS>:22362 (replacing <LINODE_IP_ADDRESS> with the static IP address of your Linode server)! 🙌 🍾
As exciting as this is, you might want to wait on popping the champagne. (Sorry, I got ahead of myself.)
We still need to set up a MySQL Docker container and hook it up to the psiTurk experiment, which will be covered in the next post.
For now, go ahead and turn the psiTurk server off by running the command server off from inside of the psiTurk shell.
Then hit CTRL+d twice in order to exit the psiTurk shell and then the Docker container.
That’s it for this post! In the final post of this series, I’ll cover how to set up a Docker container for MySQL and hook everything together. As always, please feel free to comment with any questions! And, if you do sign up for a Linode account, please consider signing up using my referral link. Thanks! 🤓
Notes
1. It’s worth noting that you could have used a different host port here. However, in the first blog post in this series, we only opened two ports to outside traffic on the Linode server: 22 and 22362. 22 is for SSH. Thus, the only other publicly accessible port on your Linode server is 22362, so that’s what we must use. Of course, feel free to configure your publicly accessible ports differently, if you’re comfortable doing so and have good reason for it.