Cosmophile's Blog: Home

Docker notes

Docker Compose configuration file caveats

Production issues with Docker Compose often come from the structure of the compose.yml (or docker-compose.yml) file, because YAML has several surprising, opinionated rules.

  1. No tabs, spaces only. Indentation is conventionally two spaces.

  2. YAML may convert types automatically. true, on, yes can become boolean true, and false, off, no can become boolean false. This includes case variants as well. Values should be quoted to prevent type conversion: "on" or "off".

  3. A value left empty means null, not an empty string. An empty string needs to be written explicitly as "".

  4. Leading zeros can coerce a number to octal. 024 is read as octal, which is 20 in decimal. To be safe on such edge cases, quote the value: "024"

  5. In multiline strings, | and > are different. | (literal block) preserves line breaks whereas > (folded block) does not.

command: >
  echo "Starting dev"
  npm run dev

# becomes: echo "Starting dev" npm run dev
  6. The environment section can be written in map style or list style. Prefer map style, quoting values wherever needed: it is easier to read, gives cleaner diffs and causes fewer bugs.

# List style, bad:
environment:
  - NODE_ENV=production
  - DEBUG="false"  # the quotes end up inside the value
# Worst:
environment:
  - "NODE_ENV=production"
  - "DEBUG=false"
# Map style, good:
environment:
  NODE_ENV: production
  DEBUG: "false"

  7. A literal $ needs to be escaped by doubling it ($$), because a single $ starts variable expansion.

command: "echo $$HOME"

  8. Validate early: the docker compose config command expands variables and anchors and prints the final resolved configuration. It catches YAML errors as well.
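
The effect of the folded block in item 5 can be tried in a plain shell, without Docker. The following is only a sketch of what the joined single line amounts to, not of how Compose itself launches the command: once the two lines are folded together, npm run dev is no longer a second command but just more arguments to echo.

```shell
#!/bin/sh
# Sketch: the folded (>) value from item 5 after its two lines are joined.
# The words "npm run dev" are now plain arguments to echo and are never run.

folded_output=$(sh -c 'echo "Starting dev" npm run dev')

echo "$folded_output"
```

The script prints the text of the second line instead of running it, which is the kind of silent failure that docker compose config helps to spot early.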

Docker Compose file variable expansion

First of all, the result of variable expansion can be inspected by running the docker compose config command.

Variable expansion happens outside the container, and sources are consulted in the following order (highest priority first):

  1. Shell environment variables,
  2. File passed with the command line --env-file parameter,
  3. .env file in project directory,
  4. Default expression :-.

An important note here is that a variable in .env does not override one that is already set in the shell.

Variable expansion works only for values (not on keys). The following expansion formats are accepted:

  1. ${VAR}: Plain interpolation.
  2. ${VAR:-default}: Strict, value default is used if VAR is unset or empty.
  3. ${VAR-default}: Value default is used if VAR is unset.
  4. ${VAR:?error message}: Strict, error is raised with error message if VAR is unset or empty.
  5. ${VAR?error message}: Error is raised with error message if VAR is unset. "Must exist, but can be empty."
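
The colon and non-colon forms above follow the same rules as POSIX shell parameter expansion, so the difference between "unset" and "empty" can be checked in a plain shell first. This is a minimal sketch; DEMO_VAR is a hypothetical variable name used only for illustration.

```shell
#!/bin/sh
# Sketch: POSIX parameter expansion, which the ${VAR:-default} and
# ${VAR-default} forms mirror. DEMO_VAR is a hypothetical name.

unset DEMO_VAR
unset_colon="${DEMO_VAR:-fallback}"   # unset: default is used
unset_plain="${DEMO_VAR-fallback}"    # unset: default is used

DEMO_VAR=""
empty_colon="${DEMO_VAR:-fallback}"   # empty: default is used
empty_plain="${DEMO_VAR-fallback}"    # empty: the empty value is kept

echo "unset: colon=[$unset_colon] plain=[$unset_plain]"
echo "empty: colon=[$empty_colon] plain=[$empty_plain]"
```

Only the last case differs: without the colon, an empty but defined variable is passed through as is.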

Nested expansions are not supported and commands are not substituted, so ${VAR_${ENV}} and ${$(pwd)} do not work as they would in a shell.

It is better to have variable expansions quoted to avoid implicit type coercions:

environment:
  DEBUG: "${DEBUG:-false}"

For API keys, connection strings and the like, it is better to raise an error when a variable is undefined, so that the problem surfaces immediately instead of as a harder-to-trace failure later. For example, if a database connection is a must, then this should be written as ${DB_URL:?DB_URL is required}.

Another edge case is a configuration that works in the development environment but fails during the CI stage. The env -i docker compose config command tests for this by preventing local shell variables from flowing into the compose file.
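
What env -i does can be seen without Docker. The sketch below uses a hypothetical variable, DEMO_DB_URL, and shows that a child process started through env -i no longer sees exported shell variables.

```shell
#!/bin/sh
# Sketch: env -i starts the child process with an empty environment, so
# exported shell variables do not leak in. DEMO_DB_URL is a hypothetical name.

export DEMO_DB_URL="postgres://localhost/dev"

# Normal child process: inherits the exported variable.
with_env=$(/bin/sh -c 'echo "${DEMO_DB_URL:-missing}"')

# Child started through env -i: the variable is gone.
without_env=$(env -i /bin/sh -c 'echo "${DEMO_DB_URL:-missing}"')

echo "inherited: $with_env"
echo "env -i:    $without_env"
```

Since env -i clears PATH as well, a form such as env -i PATH="$PATH" docker compose config may be needed on systems where docker is not found otherwise.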

ENTRYPOINT and CMD

ENTRYPOINT (entrypoint in Compose files) and CMD (command in Compose files) need to be understood correctly; otherwise they can cause unexpected errors and broken signal handling. What Docker does is simply:

Final Command = ENTRYPOINT + CMD

Best practice is to use ENTRYPOINT for the main executable, runtime, or worker and CMD for its default arguments. For example:

ENTRYPOINT ["node"]
CMD ["app.js"]

# becomes: node app.js

A user can easily override CMD with docker run:

docker run image_name test.js

# becomes: node test.js

But overriding ENTRYPOINT requires the explicit --entrypoint flag:

docker run --entrypoint npm image_name

An image's existing ENTRYPOINT can be inspected with the docker inspect image_name command.

Exec form vs shell form

Always prefer exec form, as this is the best practice, and never mix exec form and shell form.

# exec form, good:
ENTRYPOINT ["node", "--tls-min-v1.3", "--check"]
CMD ["app.js"]

# shell form, bad:
ENTRYPOINT node --tls-min-v1.3 --check
CMD app.js

In shell form, the whole instruction is wrapped in /bin/sh -c, which breaks signal forwarding: signals such as SIGTERM, which is sent when the container is stopped, do not reach the inner executable.

Another issue arises here: Exec form does not expand variables and passes literal text.

CMD ["node", "app.js", "$INPUT_FILE"]

# becomes: node app.js $INPUT_FILE

A solution is to wrap the command in a shell invocation, passing it as a single string after -c:

CMD ["sh", "-c", "node app.js \"$INPUT_FILE\""]

# The shell expands $INPUT_FILE when the container starts.

Now this works:

docker run -e INPUT_FILE=file.json image_name
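
One detail of sh -c is easy to get wrong in exec form: only the first argument after -c is run as the command string, and any further arguments become $0, $1 and so on. The sketch below shows this in a plain shell, with echo standing in for node so that it runs anywhere.

```shell
#!/bin/sh
# Sketch: how sh -c treats its arguments. echo stands in for node here.

export INPUT_FILE="file.json"

# Separate arguments: the command string is just 'echo'. The remaining
# arguments become $0 and $1 of the inner shell and are never passed to echo.
split_args=$(sh -c 'echo' 'app.js' '$INPUT_FILE')

# One command string: the inner shell expands $INPUT_FILE itself.
single_string=$(sh -c 'echo app.js "$INPUT_FILE"')

echo "split:  [$split_args]"
echo "single: [$single_string]"
```

The first call prints an empty line, while the second prints the file name, which is why the whole command belongs in one string after -c.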

However, since the main process is now wrapped in a shell, the shell becomes PID 1 (the initial process) and node becomes its child. In this form the shell does not forward signals, which is not ideal for production environments, because it may leave zombie processes behind and break graceful shutdown. The best practice here is to create an entrypoint script, entrypoint.sh:

#!/bin/sh
set -e

exec node app.js "$INPUT_FILE"

# In the Dockerfile:
ENTRYPOINT ["./entrypoint.sh"]

Here the exec command replaces the shell process with node, which makes node PID 1.
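
The PID behavior of exec can be verified in a plain shell. In this sketch an outer shell prints its own PID and then uses exec to start a second shell, which prints its PID as well; since exec replaces the process instead of forking a child, both numbers should be the same.

```shell
#!/bin/sh
# Sketch: exec replaces the current process image, so the PID is unchanged.
# The outer sh prints its PID, then execs a new sh that prints its own PID.

pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

pid_before=$(echo "$pids" | sed -n 1p)
pid_after=$(echo "$pids" | sed -n 2p)

echo "before exec: $pid_before"
echo "after exec:  $pid_after"
```

Inside a container the same mechanism lets node take over PID 1 from entrypoint.sh, so SIGTERM is delivered to node directly.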