Introduction

So you’ve got this great application, and it’s a perfect solution for your stakeholder. You’ve Dockerized your application code and set up orchestration for the individual services. Your Proof of Concept worked brilliantly; now it’s just time to ship it.

But, wait! Hold on! We can’t just publish this image to Docker Hub. Why? Well, your situation may be different, and the reasons may change; regardless, you’re locked out. The deployment environment is sealed off from the internet, and there’s no way to build the image by delivering the code and Dockerfile to the deployment endpoint. I guess you’re stuck.

Then, like a ghost of Christmas future, a whisper of wind seems to say: “Don’t forget: docker save! docker save…”

What is Docker? And Why?

Docker is a wonderfully designed code-isolation engine. As part of this technology, Docker defined and carried forward a new standard for containers. A container instance is run by the Docker Engine and isolates the application and its dependencies from the underlying computing resources. This means your developers and data scientists no longer have to wrestle with dependency management; no longer have to create and manage development environments; and no longer have to worry about the small differences between the final deployment environment and the development environment. Containers accomplish this by sharing only the host machine’s kernel (managed by the Docker Engine) and keeping the rest of the operating-system layer inside the container (where the code itself actually runs and lives). This way conflicts, dependencies, and the complicated little details are a worry of the past for your software teams.

For the sake of brevity, we’ll avoid a deep dive into Docker and instead assume a few things:

  1. You are already at least lightly familiar with Docker (If not, get started with Docker’s Tutorial).
  2. Your application has been Dockerized, whether with your own Dockerfile or from pulling images from some Docker Image Registry (most commonly Docker Hub).
  3. The Docker Engine and Docker Compose are installed on Production and will be used for this deployment.

Solution Introduction

Regardless of the reasons, sometimes the production environment is someplace where you can’t build the application. Even if you’ve got Docker installed there, enterprises are focusing more heavily on security by narrowing what can reach the production environment; this typically means that internet access is not available to the box.

The build-and-deploy workflow can be executed manually; however, it is best run under a DevOps-supported Continuous Integration and Continuous Delivery (hereafter CI/CD) pipeline. This gives you confidence that your deployments are consistent in form and execution, effectively eliminating the human variables that can confound engineers trying to debug a bad release.

Step 1 – Image Build

With your application code completed and a well-formed Dockerfile included in the repository, you’ll first need to build your images. You’ll need to perform this step in an environment that can reach the resources the build requires. This is the time to let Docker access the internet and gather resources for the short time it needs to perform the build. If security is still a concern, you can narrow traffic from your build server down to the endpoints and IP addresses you know the build process requires; however, you’ll have some fun figuring out the exact endpoints and proxies where your resources live. A few examples of resources your build may need to reach from the build server:

  • Image Registries – Docker Hub, or a private registry of your own.
  • Language Package Indexes – PyPI, CRAN, RubyGems, npm, etc.
  • System Package Repositories – apt, yum, and similar.
  • Proxy Solutions – The above resources can always be mirrored internally if security is your biggest concern.

Now that you’ve got access to your resources figured out, we can kick off a Docker build fairly simply. From the root directory of the project, you can simply invoke:

docker build .

This invokes the docker build command and provides the directory where the Dockerfile lives (the root directory of the project, as we mentioned). Further options and features can be explored in the build command reference page.
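
In practice you’ll almost always want to name the image at build time, so that the save and load steps later can reference it by name. A minimal sketch, assuming a hypothetical image name my-app and version tag 1.0:

# Build from the project root and tag the result so later steps can reference it by name
docker build -t my-app:1.0 .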

Step 2 – Docker Save

Once your build has completed, you’ll then need to save the image (or images) to a tarball that will be delivered to production.

While this step is simple, running it through an automation engine as part of your Continuous Integration service will help keep your deployments consistent. We’ve used Ansible for projects requiring CI/CD and task automation. But regardless of what you use, have your automation service run the docker command:

docker save -o name-your-file.tar name-of-your-image

When you’re automating this step, be sure to pass your image name from a definite source (such as the image defined for your service in your docker-compose file). This way you can be sure you’re delivering the right image, as well as not running into naming errors. If you’re not sure of the image name, you can check the local image store after the image is built by running:

docker image list

Then refer to the REPOSITORY and TAG columns for your specific image name (referenced as repository:tag).
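
For example, your automation might resolve the image name into a single variable and write the tarball explicitly with the -o flag. A minimal sketch, assuming the hypothetical my-app:1.0 image from the build step:

# Resolve the image name from one definite source (here, a shell variable)
IMAGE_NAME=my-app:1.0
# Write the image, with its layers and tags, to a tarball for delivery
docker save -o my-app-1.0.tar "$IMAGE_NAME"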

Step 3 – Ship to Production Resource

Now that you’ve got your tarball generated with your image binaries, it needs to be delivered to its final destination. There’s a multitude of options available: rsync, ftp, sftp, cp, scp; just pick whatever works best for your automation or CI/CD pipeline. Regardless, you’ll want to achieve the same ends as running the following command would:

scp /path/to/tarball.file deploy_user@deployment.host:/some_host_directory/
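
If rsync is available on both ends, it can make repeated deliveries a bit friendlier (resumable transfers, progress output). A sketch with the same hypothetical paths and host as above:

# Archive mode, compressed in transit, with progress output
rsync -avz --progress /path/to/tarball.file deploy_user@deployment.host:/some_host_directory/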

Step 4 – Docker Load

Once the tarball is delivered to the deployment host, we’re going to load it into the host’s local Docker image store before we run the images as containers. Open a session on the deployment host (likely over SSH) and change to the directory where you just placed the tarball. To load the image from the file, run:

docker load -i your_tarball_file

After the command is run, you can check that the image made it by running:

docker image list

This should list the image under the same name as you built it. Once it’s there, you’re ready to run your application!

Step 5 – Make it Run

Now, we’ve deployed a Docker image; it’s time to make it run! Whether you’re orchestrating several container images from your deployment, or just running a single container service, you can have your automation engine start up your newly deployed application! Simply use the command below:

docker run -dit image-name
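
If you’re orchestrating several services with Docker Compose (per assumption 3), the equivalent is to have your docker-compose file reference the freshly loaded image names and bring the stack up detached. A sketch, assuming the compose file on the host already points at those images:

# From the directory containing your docker-compose.yml on the deployment host
docker-compose up -d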

Voila! You’ve built your container and delivered the application to your deployment server! What was once a major problem is now handled easily, with a few major benefits to boot. With a bit of automation, these deployments can be made extremely simple.

Make it Better with CI/CD

To aid our data science and software development teams even further, and to speed up our time to release new applications, we can develop a CI/CD workflow that automates the tedious and repetitive deployment tasks. Manual deployments take time, are prone to error, and are even more tedious to roll back.

This is made even easier under a CI/CD solution; at eSage Group we’ve used Ansible for this scenario. For each step we defined above, our Ansible playbook has a task definition associated with that step. If you’re using Ansible, most of the Docker commands can be handled using the community-provided docker_image module. While this can be helpful for those seeking to let Ansible handle Docker, if you’d like finer control over the commands executed, you’ll want to use the command or shell Ansible modules (we prefer shell, so that Ansible has access to environment variables stored on the build server).
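
As an illustration, a playbook task for the save step might look roughly like the following. This is a minimal sketch using the shell module; the artifact_dir, image_name, and image_tag variables are hypothetical placeholders, not a drop-in from our actual playbooks:

- name: Save the built image to a tarball for delivery
  shell: docker save -o "{{ artifact_dir }}/{{ image_name }}.tar" "{{ image_name }}:{{ image_tag }}"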

If Ansible is new to you, or you’d just like to learn more, they have wonderful and extensive documentation, and there are plenty of excellent third-party tutorials and learning resources to boot.

Given the vast array of CI/CD providers and the automation systems that work with them, whatever you choose, you’ll see enormous benefits from investing in this type of infrastructure. With this level of automation, developers can release their code more quickly, operations can get that code to production more reliably, and product owners and business stakeholders can be assured that mistakes will be minimized. Combined with Docker, your time to develop new models or products will see a major improvement.

Conclusion

Docker is a great tool for managing dependencies and packaging application layers for a narrow, specific purpose. This is a huge value for software developers and data scientists who need to share work developed in their local environment and push it to a production environment without having to change their entire dependency tree.

And, with tools like docker save and docker load, we can isolate our production and build environments and achieve a robust level of security with almost no loss in productivity. Add in an automation and CI/CD pipeline, and your development and data science woes will be a thing of the past.

Have Questions? Reach out!

Our team has developed several deployment-automation solutions to serve our customers and their teams at large.

If you want to discuss how eSage Group can help you optimize your Data Science and Machine Learning Development efforts, contact us for a free 30 minute consultation.