How to save output after running new analyses in your Docker container
Recently I gave a brief presentation introducing Docker to fellow scientists, covering how to create a reproducible environment for completed analyses and how to facilitate sharing between lab members. Among the questions that came up were concerns over data privacy and security (which I have addressed here), and issues with persisting outputs and new analyses added to the original script that came with the image. For instance, say I’ve shared an image (pull it here) on Docker Hub, and my colleague pulls it and runs the container to reproduce my results. They then decide to add some new analyses to the script (see code comments), and now they would like to save the newly added code and its output.
Snippet of R script that you can find by pulling the docker image I created:
Option 1: Create a new Docker image
The most common approach involves committing changes made within a container to a new image. This can be done using the docker commit command, which takes the ID (or name) of the container, as listed by docker container ls, and creates a new image that includes the changes. For example:
docker commit <container_id> <new_image_name>:<tag>
docker commit 5cb1dedcc204 soohyun/infl_marker_analysis:1.0.0
When a container is started from the new image with docker run soohyun/infl_marker_analysis:1.0.0, the new code is visible in the R script, the libraries and their versions are unchanged, and the script produces the expected outputs.
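If you would rather not copy the container ID by hand, the lookup and the commit can be scripted. The following is a minimal sketch rather than a definitive recipe: it assumes a container started from the original image is currently running, and it reuses the image names from the example above. The docker ps --filter ancestor=... lookup is my own addition for illustration.

```shell
#!/bin/sh
# Image names are taken from the example above; replace them with your own.
source_image="nghuixin/infl_marker_analysis:1.0.0"
new_image="soohyun/infl_marker_analysis:1.0.0"

if command -v docker >/dev/null 2>&1; then
  # ID of the first running container created from the source image (empty if none).
  container_id=$(docker ps -q --filter "ancestor=${source_image}" | head -n 1)
  if [ -n "${container_id}" ]; then
    # Snapshot the container's current filesystem as a new image.
    docker commit "${container_id}" "${new_image}"
    # List the new image to confirm it was created.
    docker image ls "${new_image}"
  else
    echo "No running container found for ${source_image}" >&2
  fi
else
  echo "docker not found on PATH" >&2
fi
```

If several containers were started from the same image, this picks only the first one listed, so it is safer to check docker container ls yourself in that case.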
Option 2: Save the modified script and/or analysis results outside the Docker container
Save the modified script to your local machine
You can do so by running the following commands:
docker container ls
CONTAINER ID   IMAGE                                 COMMAND   CREATED             STATUS             PORTS                    NAMES
5cb1dedcc204   nghuixin/infl_marker_analysis:1.0.0   "/init"   About an hour ago   Up About an hour   0.0.0.0:8787->8787/tcp   eager_poincare
docker cp 5cb1dedcc204:/home/rstudio/analysis/infl_marker_huixin.R ./container_r.R
5cb1dedcc204 is the container ID, which can be obtained by running docker container ls, and ./container_r.R is the copy of the R script with the added lines of code. It is now saved in the current working directory on the host machine (here, the root directory of the project).
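docker cp also accepts the container name in place of the ID. Here is a small sketch, assuming the container is still named eager_poincare as in the listing above (yours will differ), that copies the script and confirms the copy exists on the host:

```shell
#!/bin/sh
# Name from the NAMES column of `docker container ls`; yours will differ.
container="eager_poincare"
src_path="/home/rstudio/analysis/infl_marker_huixin.R"
dest_path="./container_r.R"

# docker cp expects CONTAINER:PATH with no space around the colon.
cp_source="${container}:${src_path}"

if command -v docker >/dev/null 2>&1; then
  if docker cp "${cp_source}" "${dest_path}"; then
    # Show the copied file on the host.
    ls -l "${dest_path}"
  else
    echo "Copy failed; is ${container} the right container name?" >&2
  fi
else
  echo "docker not found on PATH" >&2
fi
```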
🚨 However, this is not recommended: once the file is saved outside the Docker container, there is no guarantee that the results of the analyses will be replicable, because the versions of R and the associated libraries on the local machine might not match those in the container.
Save the analysis results to your local machine (text output)
If for some reason you wish to save only the output, such as that of summary(mod3), you can write it to a text file by using sink():
#### ---- New analyses that were NOT already part of the container -------
sink('new_analyses_output.txt')
# Fit the mixed-effects model using lme() from the nlme package
mod3 <- lme(lgvegf ~ time * (agem * dxgroup + gender) , random = ~ time | subnum, data = complete_data)
# Print model summary
summary(mod3)
print('analyses completed')
sink()
Next, you can copy the text file from the container to your local machine.
docker cp <container_id>:/path/to/container/file /path/to/local/destination
docker cp 5cb1dedcc204:/home/rstudio/new_analyses_output.txt ./new_analyses_output.txt
Save the analysis results to your local machine (image output)
You can also do the same for the plots you create. For instance, if there isn’t already a figures directory in this Docker container, you can create it manually, just as you would in RStudio on your local machine, or run dir.create('figures').
Then run the code for creating a new plot: