Friday, April 17, 2020

Docker Basics + Security

Here are answers to common questions from the Docker for web developers course.
 

difference between image and build
using image: image_name - docker compose will run a container based on that pre-built image
using build: - docker compose will first build an image based on the Dockerfile found in the path specified after the build: option (or inside its context: option), and then run a container based on the resulting image. Alongside build: we can also specify the image: option, which names and tags the built image. Example:
build: ./
image: webapp:tag
This results in an image named webapp, tagged tag

why do we run apt-get clean or npm cache clean?
The apt package cache gets baked into a cached image layer. If we install packages with apt install, we should immediately (&&, on the same line) run apt clean afterwards, or use: && rm -rf /var/lib/apt/lists/* Reason: the next time we add a new package to be installed in the container, Docker may reuse the cached apt layer, work with stale package lists, and fail to detect the changes and install the expected package version. We can use the docker history command to see the different layers that make up the image.

optimizing image size: Docker images are structured as a series of additive layers, and cleanup needs to happen in the same RUN command that installed the packages. Otherwise, the deleted files will be gone in the latest layer, but will still be present (and take up space) in the previous layer.
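A minimal sketch of this pattern (the curl package is just an example):
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/* # cleanup happens in the same RUN, i.e. in the same layer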

why do we copy package.json from our host directory to the container?
We first COPY the dependency lists (package.json, composer.json, requirements.txt, etc.) into the container so that Docker can cache the result of the npm install that follows. This way, when we change other parts of the container configuration and rebuild, Docker will not reinstall all the dependencies, but will use the cached layer. At the same time, if we change a line inside the dependency list file, all the dependencies will be reinstalled, because the file now produces a different cached layer inside Docker.

Then why do we copy just package.json and not all the source files of the project, saving them in one Docker layer? Because if we made a change to just one of our source files, this would bust the Docker cache, and even though the required packages had not changed, they would need to be reinstalled (npm/composer install).
For this reason we (see the Dockerfile sketch below):
1) copy the dependency list
2) install dependencies so they will be cached
3) copy our source files
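A minimal Dockerfile sketch of this ordering for a Node project (base image and start command are just examples):
FROM node:14
WORKDIR /app
COPY package*.json ./ # 1) copy only the dependency list
RUN npm install # 2) cached as long as package*.json does not change
COPY . . # 3) copy the source files last
CMD ["node", "index.js"] # example start command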


combining commands
We can combine multiple RUN (and consecutive COPY) commands into one; this creates only one layer, which will be cached for later lookups. Also, instead of ADD we can use COPY for plain file copying; with --from, COPY can also transfer files from another image or build stage.

multiple builds
To have development, build and test stages we can use the target option under build: in the compose file, like:
target: dev
We can also build a specific target directly with: docker build -t app:prod --target prod .
This builds only up to the stage named prod in the Dockerfile and tags the resulting image app:prod.
The same can be done for a development environment:
docker build -t app:dev --target dev .
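A minimal sketch of how the pieces fit together (stage and service names are just examples):

Dockerfile:
FROM node:14 AS base
# ... common setup ...
FROM base AS dev
# ... development-only steps ...
FROM base AS prod
# ... production-only steps ...

docker-compose.yml:
version: '3.8'
services:
  app:
    build:
      context: ./
      target: dev # build only up to the 'dev' stage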

mounts
- a named volume is created and managed by Docker (outside the container's writable layer) and is suitable for storing persistent information, such as database data.
- a bind mount (pointing outside of the container to a host path) is used for information residing on our local machine. When is it good to use bind mounts? They allow us not to copy our source code into the container, but to use the local code files directly, e.g. local development files.

version: "3.8"
services:
  web:
    image: nginx:alpine
    volumes:
      - type: volume # named volume
        source: dbdata # refers to the named volume declared below
        target: /data # path inside the container
      - type: bind # bind mount
        source: ./static # local directory on the host
        target: /app/static # path inside the container
volumes:
  dbdata: # declare the named volume (managed by Docker)
 
Note: anonymous volumes
They are the same as named volumes, but without a specified name.
During the image build, directories such as /app/node_modules are created inside the image.
At run time, a bind mount will cover those freshly built contents: mounting the local directory over /app hides whatever the image placed there. In such cases anonymous volumes can be used to preserve certain container sub-directories from being overwritten at runtime by host directories:
volumes:
      - '.:/app' # bind mount - mount the local host dir into the container at runtime
      - '/app/node_modules' # anonymous volume - preserve the container-built /node_modules at runtime


node_modules
Why would we like /node_modules to be rebuilt inside the container rather than copied directly from our host? Because the container libraries might be built against a different distribution than our host. For example, if we run a project on Windows and create a container for it based on a Linux image, the contents of /node_modules might not be the same for Linux and Windows. The solution in those cases is to list /node_modules in the .dockerignore file, so it is excluded from the build context. This way the libraries inside /node_modules will be rebuilt from scratch inside the container and will get their proper versions for the Linux image, which may differ from the host installation.
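A minimal .dockerignore sketch for a Node project (the entries are typical examples):
# .dockerignore - keep these host files out of the image build context
node_modules
npm-debug.log
.git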

environment variables
In the docker-compose file, outside of the build phase, we can use pre-made published images and pass variables to the running container using the environment: section. The second benefit of this technique is that there is no need to rebuild the image: we just change the variables and restart the container for it to pick up the changes. Inside the build phase, the image receives external variables through ARGs (build arguments).
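A minimal sketch of the runtime (environment:) case - the service and variable names are just examples:
version: '3'
services:
  service1:
    image: alpine
    environment:
      USER_VAR: 'USER1' # available inside the running container as an environment variable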

Example 1
docker-compose:
version: '3'
services:
  service1:
    build: # note: we are in the build phase
      context: ./
      args:
        USER_VAR: 'USER1' # set the USER_VAR build argument
# note: if USER_VAR is already defined as an ENV inside the alpine image (used in the Dockerfile),
# that value will override the USER_VAR set here and be shown instead

Dockerfile:
FROM alpine
# note accessing the USER_VAR after the FROM line !
ARG USER_VAR # access the docker-compose set USER_VAR
RUN echo "variable is: $USER_VAR" # echo on screen

Example 2
.env:
ENV_USER_VAR=USER1
docker-compose:
version: '3'
services:
  service1:
    build: # note: we are in the build phase
      context: ./
      args:
        USER_VAR: ${ENV_USER_VAR} # set the USER_VAR build argument from the .env file

Dockerfile:
FROM alpine
ARG USER_VAR # access the docker-compose set USER_VAR
RUN echo "variable is: $USER_VAR" # echo on screen
 
Example secrets:

Optionally, we can create named secrets from .txt files (note: docker secret create works in Docker Swarm mode; the compose example below uses file-based secrets instead):
docker secret create mysql_root_password ./db_root_password.txt
docker secret create db_password ./db_password.txt 
docker secret ls
  
version: '3.1'

services:
   db:
     image: mysql:8
     volumes:
       - db_data:/var/lib/mysql # using a persistent volume for the database files
     environment:
       MYSQL_ROOT_PASSWORD_FILE: /run/secrets/mysql_root_password
       MYSQL_DATABASE: wordpress
       MYSQL_USER: wordpress
# read the password from memory and set the container environment variable 
       MYSQL_PASSWORD_FILE: /run/secrets/db_password 
     secrets:
       - mysql_root_password # enable access to the in-memory secrets 
       - db_password # enable access to the in-memory secrets

secrets:
   db_password:
#  Docker mounts the db_password.txt file under /run/secrets/db_password
     file: db_password.txt # read the password from the db_password.txt file into the in-memory filesystem
# note: if a container stops running, the secrets shared to it are unmounted
# from the in-memory filesystem and flushed from the node's memory.
 
   mysql_root_password:
     file: mysql_root_password.txt

volumes:
    db_data: # creating a persistent volume managed by Docker


non-root environment
Keep in mind that the Docker daemon starts with full root privileges in order to create networking, work with namespaces, open ports, etc.
Each container process then runs with the UID configured for its service, and unless user-namespace remapping is enabled, that UID is the same UID on the host.
The special UID 0 in the container can perform privileged operations in the container. This means that if a container gets compromised and an attacker gains a root account inside the container, this is effectively equal to the host root account. So it is good to use a non-root account for the following reasons:
- a non-root user cannot read or write system files, create users, read memory secrets, etc.
- memory secrets can only be read by the user who created them.

web servers
Some software (Nginx, Apache) already runs one master process with maximum privileges (root) for administrative purposes, and worker processes for serving user applications (web sites) with non-root privileges.
In the same way, applications developed in Node.js, Angular, or Express run as Linux processes with the privileges of the calling user.

The Apache web server has one master process owned by root, which spawns child processes (workers) for serving web pages; these are configured to run as the user 'www-data':
ps -aef --forest|grep apache2
root  /usr/sbin/apache2 -k start
www-data  /usr/sbin/apache2 -k start
Keep in mind that when running Apache as a non-root user (www-data), it will not be allowed to open the default port 80, because port 80, like all ports below 1024, can by default only be bound by root in Unix environments. So you'll need to open a port that is greater than 1024.
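A minimal Dockerfile sketch of switching Apache to an unprivileged port (the file paths follow the standard Debian layout used by the php:apache images; adjust as needed):
FROM php:7.4-apache
# switch Apache from port 80 to the unprivileged port 8080, so a non-root user may bind it
RUN sed -i 's/Listen 80/Listen 8080/' /etc/apache2/ports.conf \
 && sed -i 's/:80>/:8080>/' /etc/apache2/sites-available/000-default.conf
EXPOSE 8080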

dockerhub images
One must note that the predefined official images from dockerhub use root permissions for their installation process. In a container context, valid usage of running commands with root privileges is when we would like to perform system administration activities such as:
- run npm for updating the npm version: RUN npm i npm@latest -g
- install software inside the image with apt and other package managers
- copy files from outside to the inside of the container
- create and set up a 'non-root' user
- set correct permissions for application project directories such as /var/www/html/app etc. using chown and chmod
- setup/change webserver configuration
Note: following the above-described process, when the official image installation completes, the created container/service ends up running with root permissions (unless specified otherwise, e.g. via the USER instruction in the Dockerfile or the user: option in docker-compose).

In such cases, in order to create a non-root environment, we can divide the container setup into 2 phases (see the sketch below):
1) build-time dependencies:
prepare the project's main installation directory, set up a local 'non-root' user, and set proper project directory permissions with chown so that our 'non-root' user can access it -> ALL done with root permissions
2) run-time dependencies:
When the system environment is ready, we can perform project-specific package installations and customizations. We switch to a 'non-root' user (example: USER node) and install the project packages as the current 'non-root' user. Example:
USER node
RUN npm install
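A minimal Dockerfile sketch of the two phases (the node base image already provides a 'node' user; directory names and the start command are just examples):
FROM node:14
# 1) build-time: run as root to prepare the project directory and its permissions
RUN mkdir -p /home/node/app && chown -R node:node /home/node/app
WORKDIR /home/node/app
# 2) run-time: switch to the non-root user and install project packages as that user
USER node
COPY --chown=node:node package*.json ./
RUN npm install
COPY --chown=node:node . .
CMD ["node", "index.js"] # example start command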


web development case
If we would like to develop locally on our host and then use our data inside the container via a bind mount:
1) we can first create a non-privileged user inside our container.
2) Then we need to make our local user UID match the container user UID. Reason: the OS might assign the freshly created container user a different UID, which will not match our local user ID and will prevent us from working with the files correctly.
Solution:
1) We can specify and pass the UID from .env file to the service/container in the docker-compose file
2) Then pass the UID via ARGs from the compose file to the Dockerfile in order to achieve the same UID inside and outside the container.
Details: To specify the user that we want a service to run as, we can set user: "uid:gid" directly in docker-compose.yml, or we can set variables in an .env file (UID=1000, GID=1000) and then use them in docker-compose like: user: "${UID}:${GID}" (see the sketch below).
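A minimal sketch of passing the UID/GID through (user, service, and image names are just examples):

.env:
UID=1000
GID=1000

docker-compose.yml:
version: '3.8'
services:
  app:
    build:
      context: ./
      args:
        UID: ${UID}
        GID: ${GID}
    user: "${UID}:${GID}" # run the container process with the host user's UID/GID

Dockerfile:
FROM php:7.4-apache
ARG UID=1000
ARG GID=1000
# create a non-root user whose UID/GID match the host user, so bind-mounted files keep usable ownership
RUN groupadd -g $GID appgroup && useradd -m -u $UID -g $GID appuser
USER appuser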

more on security: If Apache runs under the www-data group, then the group www-data should be able to read and traverse user directories such as /var/www/html/user_dir and read their files.
So for the directories we set the following permissions: owner: rwx, group: rx (the group can traverse directories, and the developer can also create and update files), and for the files: owner: rw, group: r (the developer reads and writes, Apache interprets PHP, i.e. reads the file). All other users are denied access:
0) set initial ownership of /var/www/html to the current user/developer
sudo chown -R $USER:www-data /var/www/html
 
1) user www-data (apache) can only read files (+r) and read/traverse directories (+rx)
sudo find /var/www/html -type d -exec chmod g+rx {} +
sudo find /var/www/html -type f -exec chmod g+r {} +

2) the user/developer is able to read and create directories, as well as read, update, and write files.
We prevent the user from executing files (such as .php) directly on the host (not on the web). When the .php files are requested over the web, Apache will handle them.
sudo chown -R $USER /var/www/html/
sudo find /var/www/html -type d -exec chmod u+rwx {} +
sudo find /var/www/html -type f -exec chmod u+rw {} +

3) revoke access for other users
 sudo chmod -R o-rwx /var/www/html/

4) set default permissions for newly created files & directories

chmod g+s .
This sets the setgid bit on the current directory - all newly created files and subdirectories will inherit the directory's group ID, rather than the primary group of the creating user.


use a specific version of the image instead of :latest
It is better to pin a specific image version, so the newly created container stays immutable and doesn't pick up problematic changes when the image moves between versions, for example from ver. 5 to ver. 7. If we use :latest, we cannot be sure that our code will run correctly on every vendor version. By setting a specific, known version of the source image, we ensure that our configuration/application/service will work on the chosen version.
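For example (tags taken from the PHP image already used in this post), instead of:
FROM php:latest
prefer a pinned tag:
FROM php:7.4.5-apache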

networks
If your containers reside on the same network, which docker-compose creates automatically for the containers in a compose project, they will be able to reach all the listening ports of the other containers using the service name as a DNS hostname. The default network driver created by compose is bridge; if containers span multiple hosts, we need an overlay network to connect them together.

'depends_on' gives us some control over the order in which the containers are created and started (see the sketch below).
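A minimal sketch (service names and the password are just examples): the web service reaches the database at the hostname 'db' on port 3306, and depends_on asks compose to start db before web (it does not wait for the database to be ready):
version: '3.8'
services:
  web:
    image: php:7.4-apache
    depends_on:
      - db # create/start db before web
  db:
    image: mysql:8
    environment:
      MYSQL_ROOT_PASSWORD: example # example value only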

RUN apt-get update vs RUN [ "apt-get", "update" ]
The first (shell form) uses /bin/sh -c to run the command; the second (exec form) runs the executable directly without a shell (useful for images that have no shell).

Multi-stage builds

PRODUCTION: using local-dev project files and building dependencies inside the container
dockerfile
# 1st stage
FROM composer AS builder
COPY composer.* /app/ # copy the local dependency lists into the container's /app directory

RUN composer install --no-dev # build the project dependencies in the container's /app/vendor folder, so the container builds its own dependencies, excluding dev-dependencies

# 2nd stage
FROM php:7.4.5-apache as base # start a new build stage with the php-apache image as its base
RUN docker-php-ext-install mysqli
# Note: COPY --from copies just the built artifact from a previous stage into the new stage.
COPY ./ /var/www/html/ # copy our local project files into the container in the base stage
COPY --from=builder /app/vendor /var/www/html/vendor/ # copy the pre-built vendor folder from the composer stage into the container

docker-compose.yaml
version: '3.7'
services:
  app:
   build:
     context: .
     target: base # build only up to the 'base' stage
                  # i.e. the 1st stage of the build (composer install) is reused from cache when unchanged
   ports:
     - 80:80
   volumes:
     - ./:/var/www/html # bind mount the local development directory into the container
     - /var/www/html/vendor # anonymous volume - preserve the container's built dependencies from being overwritten by the bind mount


DEVELOPMENT: using both local-dev project files and dependencies (we need to manually install dependencies using composer install)
dockerfile
FROM php:7.4.5-apache as base # start a new build stage with the php-apache image as its base
RUN docker-php-ext-install mysqli
COPY ./ /var/www/html/ # copy our local project files into the container
docker-compose.yaml
version: '3.7'
services:
  app:
   build: .
   ports:
     - 80:80
   volumes:
     - ./:/var/www/html # getting bind mount inside of the container to local development directory


Separating build and runtime dependencies using stages:

1st stage - build:
FROM node AS build
WORKDIR /usr/src/app # working directory inside the image (also bind-mounted in the compose file)
COPY package.json .
RUN npm install # install the app package dependencies
COPY . . # copy the source code into the container
RUN npm run build # generate the production build into /usr/src/app/build (assumes a 'build' npm script)
2nd stage - serve the generated .js & html files
FROM nginx:alpine 
COPY nginx.conf /etc/nginx/nginx.conf
COPY --from=build /usr/src/app/build /usr/share/nginx/html # copy the generated static files from the build stage into Nginx's web root

Production vs Development environment

FROM php:7.4-fpm-alpine as base

FROM base as development
# build the development environment

FROM base as production
COPY data /var/www/html # copy the generated source files into the container

docker-compose.yaml

version: '3.7'
services:
  php-dev:
    build:
      context: .
      target: development
    ports:
      - "9000:9000"
    volumes:
      - ./:/var/www/html # bind mount the local sources into the development container

  php-prod:
    build:
      context: .
      target: production
    ports:
      - "9000:9000"


docker build . -t app-dev --target=development
docker build . -t app-prod --target=production


FAQ:

How to inspect containers:
Here is how to inspect the open ports inside of both MySQL and Apache containers.
1) we need to get the running container's process id:
docker container ls (to get the container_id)
then:
docker inspect -f '{{.State.Pid}}' <container_id>
2) having the container process_id run netstat inside the container namespace:
sudo nsenter -t <container_process_id> -n netstat
which will show us which ports are open for connections from the outside world to the container.
If needed, you can also start a temporary shell in the container: docker exec -it <container_id> /bin/bash and analyze what is happening, e.g. missing file/directory permissions with ls -la, checking the container logs, etc., just as when you are running the Apache server locally. For example, you can easily check which port the Apache server is listening on with: sudo netstat -anp | grep apache2, sudo lsof -i -P | grep apache2, or cat /etc/apache2/ports.conf. Then, having the right port, update your docker container configuration: delete and rebuild the container.

Enable / disable PHP extensions:
It is possible with: RUN docker-php-ext-install name_of_extension
Note: some extensions require additional system libraries to be installed as well. For example, for the zip extension you need to install, in the same RUN command before docker-php-ext-install, the packages: apt-get install libzip-dev zlib1g-dev
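A minimal sketch for the zip case (a Debian-based php image is assumed):
RUN apt-get update && apt-get install -y libzip-dev zlib1g-dev \
 && docker-php-ext-install zip \
 && rm -rf /var/lib/apt/lists/* # clean up in the same layer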
 
 
How to import database from local drive into a mariadb/mysql database:
If the container is already present, execute the following commands: docker exec -i mysql8 mysql -udevuser -pdevpass mysql < db_backup.sql
or docker exec -i mysql8 sh -c 'exec mysql -udevuser  -pdevpass' <  db_backup.sql
Of course, you can also just mount a local database directory (bind mount) to be used by the container with: docker run -v /var/lib/mysql:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=root mysql:8


How to create tables inside of a mysql container?
You can create a sample table with the help of PHP (mysqli; the host 'db' and the credentials below are example values matching a compose setup):
$conn = new mysqli('db', 'devuser', 'devpass', 'wordpress'); // example connection - 'db' is the MySQL service name from docker-compose
$sql = "CREATE TABLE MyTable (
  id INT(6) UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  firstname VARCHAR(30) NOT NULL,
  email VARCHAR(50),
  reg_date TIMESTAMP
)";
if ($conn->query($sql) === TRUE) {
  echo "Table MyTable created successfully";
}
 
 
How to copy a local folder from /var/www/html/folder_name to folder html inside a docker container?
COPY folder_name /app # note: the source path is relative to the build context, so build the image from /var/www/html; absolute host paths outside the context cannot be copied
 

How to create a persistent volume and link the container to use it?
dockerfile:
FROM php:7.4-apache
COPY --chown=www-data:www-data  . /var/www/html # we copy the current directory contents into /var/www/html directory inside of the container

docker-compose.yaml
version: '3.8'
services:
  php:
    build: ./  # use the above dockerfile to create the image
    ports:
      - 8080:80
    volumes:
       - type: volume
         source: phpdata
         target: /var/www/html
volumes:
  phpdata:  
 

How can you dockerize a website and then run the containers on another server?
First create a backup of the current container images and their content:
1) commit the changes made so far in the container: docker commit container_id backup_image_name
2) save the image to your local(node) machine: docker save backup_image_name > backup_image.tar
On the second server restore the image via:
1) docker load < backup_image.tar
2) start a container/s using the backup_image
Please note that if you have bind mounts (or volumes that reside outside of the container), you need to back them up manually!


How to install phpmyadmin?
docker-compose.yml file:
 phpmyadmin:
     image: phpmyadmin/phpmyadmin:latest
     env_file: .env
     environment:
       PMA_HOST: db
       MYSQL_ROOT_PASSWORD: $MYSQL_ROOT_PASSWORD
     ports:
       - 3333:80
.env file:
MYSQL_ROOT_PASSWORD=<the root password from the MySQL installation>


How to use local domains to access the container like domain.local?
You can start a NEW container from an image specifying the -h (hostname) option: docker run -h domain.local <image_name>

How to forward localhost:8000 to some_domain.com
You can create an Nginx reverse-proxy container, which will expose your service container when browsing the Nginx container at port 80. Let's suppose you have a "web" service defined inside a docker-compose.yaml file.
1) Nginx configuration
default.conf
server {
  listen 80;
  listen [::]:80; # listen for connections on port 80
  server_name web-app.localhost;
  location / {
    proxy_pass http://web:80; #web is the name of the service(container) you would like to expose, 80 is the port, the service is listening on
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
  }
}
2) Dockerfile image configuration:
FROM nginx
COPY default.conf /etc/nginx/conf.d/
3) create an entry in your hosts file (e.g. /etc/hosts) pointing the domain to the loopback address:
127.0.0.1 web-app.localhost
You can now browse: http://web-app.localhost
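A minimal docker-compose sketch wiring the proxy to the service (names and the web image are just examples):
version: '3.8'
services:
  proxy:
    build: ./proxy # the directory containing the Nginx Dockerfile and default.conf from above
    ports:
      - 80:80 # expose the reverse proxy on the host
    depends_on:
      - web
  web:
    image: php:7.4-apache # example service listening on port 80 inside the compose network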

Congratulations and enjoy the Docker course !
