Puppet, Git, Docker in DevOps – a simple yet powerful workflow

In this article I’ll briefly describe how I’m managing my code (configs, scripts, etc.) between my workstation and my virtual private server playground. I will try to point out where I’m using simple solutions instead of enterprise-appropriate ones.

To automate the workflow, I am using:

  • Docker – to run services in sandboxed networks, without installing their dependencies on the host
  • Git – for proper version control of my code
  • Cronie – a lightweight cron implementation, for simple scheduling (enterprise alternatives exist)
  • Puppet – for file orchestration and integrity monitoring
    First of all, I need a code repository with the ability to control versions and to review commits. Git seems the most appropriate, as it is easy to configure and is available by default in my Linux distribution (Gentoo). It is also available in the more common enterprise Linux choices, like RHEL, SLES or Debian.

    It is highly recommended to generate a key pair and use key-based authentication with the Git server – or, to be precise, with the ssh daemon running there. Use ssh-keygen to generate the key pair (it comes with the openssh package). From there, copy the public key (the one ending in .pub) and place it in the Git user’s ~/.ssh/authorized_keys, as sketched below.
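
    A minimal sketch of that key setup, assuming the git user and the custom ssh port 9999 used later in this article:

    # Generate an ed25519 key pair; the file name is just an example
    ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_git -C "workstation"
    # Append the public key to the git user's authorized_keys on the server
    ssh-copy-id -i ~/.ssh/id_ed25519_git.pub -p 9999 git@rzski.com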

    Since there’s plenty of guides available about setting up a Git repository, I will not describe it in detail here. Briefly, what I did was a bare init (git init --bare) for configs.git and scripts.git on the server, while on the client side I added two remotes over ssh with a custom port:

    git remote add configs ssh://git@rzski.com:9999/path/to/configs.git
    git remote add scripts ssh://git@rzski.com:9999/path/to/scripts.git
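
    A quick sanity check that both remotes were registered (this is standard git remote -v output):

    % git remote -v
    configs ssh://git@rzski.com:9999/path/to/configs.git (fetch)
    configs ssh://git@rzski.com:9999/path/to/configs.git (push)
    scripts ssh://git@rzski.com:9999/path/to/scripts.git (fetch)
    scripts ssh://git@rzski.com:9999/path/to/scripts.git (push)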

    This allowed me to push all my files after aggregating their copies in one directory, and then pull it on my workstation. Now I can edit files locally and push them to the central repository on the server:

    % echo "# End of file" >> configs/ntpd/ntp.conf
    % git add configs/ntpd/ntp.conf
    % git commit -m "configs: for the purpose of the article"
    [master 3a26138] configs: for the purpose of the article
    1 file changed, 1 insertion(+)
    % git push configs master
    Enumerating objects: 16, done.
    Counting objects: 100% (16/16), done.
    Delta compression using up to 8 threads
    Compressing objects: 100% (9/9), done.
    Writing objects: 100% (11/11), 1.29 KiB | 1.29 MiB/s, done.
    Total 11 (delta 3), reused 0 (delta 0)
    To ssh://rzski.com:9999/path/to/configs.git
    c5858d0..3a26138 master -> master


    Time to set up Puppet to grab the files from the Git repository and push them to chosen environments. For simplicity, I’m using a single environment (production) here. Puppet needs a server (master) and an agent to provide the file orchestration and integrity monitoring functionality. It also needs a connector to grab files from Git and use them as source for modules. There are many ways to integrate Git and Puppet, such as:

  • Puppet Enterprise (PE) + PE Code Manager (which obsoletes r10k)
  • Puppet Enterprise (PE) + PE Bolt running scripts
  • a git pull command scheduled by cronie to run in the Puppet external mount every minute (or so)
    I went with the last approach, though it is probably the least appropriate for enterprise or production environments. It suited me because I had decided to install puppet-agent (which has no dependencies) from portage (Gentoo’s package manager), but to run the master and the pdk as Docker containers:

    docker pull puppet/puppetserver-standalone
    docker pull terzom/pdk

    Running the Puppet master from a Docker container is very convenient. The images are tiny. I have created a /30 network for just the master and the agent to operate in:

    docker network create --internal --subnet=192.168.123.0/30 --gateway 192.168.123.1 puppet-nw
    docker run --name puppetmstr --hostname puppetmstr --network puppet-nw -d -v /work/puppetlabs:/etc/puppetlabs puppet/puppetserver-standalone

    Conveniently, the pdk can be run in “disposable” mode (like the ansible container, if you decide to use it), with storage bound to the same config path as the master:

    docker run --rm -it -v /work/puppetlabs:/etc/puppetlabs terzom/pdk

    Then run pdk to generate the templates for a new module. A module is the recommended organizational unit for the files Puppet should control; NTP seems to be the good, simple example most people start with. I’m skipping the interview since I’m not planning to open-source my configs 😉

    pdk new module ntp --skip-interview

    This generates the skeleton for the ntp module. Meanwhile, the server needs to be configured to accept connections from the agent (there are plenty of guides online, so I’ll jump over this part; a rough sketch of the certificate-signing step follows) and to be authorized to grab files from a clone of the central repo.
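
    For completeness, a rough sketch of that certificate exchange; the exact commands depend on the Puppet version, so treat these as assumptions to verify against your release:

    # on the agent: request a certificate and wait for it to be signed
    puppet agent -t --waitforcert 60
    # on the master (Puppet 6+; older releases use "puppet cert list/sign"):
    puppetserver ca list
    puppetserver ca sign --certname rzski.com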

    To set up the repo:

    cd /work/puppetlabs/files
    git clone ssh://git@rzski.com:9999/path/to/configs.git
    chown -R puppet:puppet configs/

    GOTCHA: all files and folders in the Puppet config, as well as those cloned from the Git repository, need to match the UID and GID used by the Puppet master inside the container. Matching the username and group name is not enough, since these are mapped in /etc/passwd and /etc/group respectively and can resolve to different numeric IDs on the host and in the container. Matching the numeric IDs between the host and the container is the recommended approach.
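
    A quick way to check this, using the puppetmstr container from above:

    # numeric UID/GID of the puppet user as the container sees it
    docker exec puppetmstr id puppet
    # numeric ownership of the bind-mounted files as the host sees it
    ls -ln /work/puppetlabs/files/configs
    # if they differ, chown by number to match the container's view, e.g.
    # chown -R 999:999 /work/puppetlabs/files/configs   (999 is illustrative)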

    The Puppet master needs to know the location of the central repository pulled from Git. In the example below, I configure this directory as a Puppet mount point:

    # cat /work/puppetlabs/puppet/fileserver.conf
    [files]
    path /etc/puppetlabs/files
    allow 192.168.123.0/30

    [configs]
    path /etc/puppetlabs/files/configs
    allow 192.168.123.0/30

    Note the paths refer to /etc rather than /work, since that’s how the master running from within the container will see them.
    To restrict access to this mount (it should be enabled by default, but I’d rather cut it down to just the participants of my puppet-nw network), the following regexp stanzas are needed. Use docker attach puppetmstr to hook into the running Puppet master and watch how it handles these paths as agents try to connect:

    # cat /work/puppetlabs/puppet/auth.conf
    (...)
    # Authorization for serving files from the Git repo
    path ~ ^/file_(metadata|content)s?/configs/
    method find
    allow 192.168.123.0/30
    path ~ ^/file_(metadata|content)s?/files/configs/
    method find
    allow 192.168.123.0/30
    (...)

    Now each time I want to link a file in the central repo with a file in production, be it a script, config, or source code, I need to generate a module for it. Inside each such module, I’m defining a config class to manage the actual config file. The rest of the module code has been pre-populated by pdk.

    # cat /work/puppetlabs/code/modules/ntp/manifests/config.pp

    # ntp::config
    #
    # A description of what this class does
    #
    # @summary A short summary of the purpose of this class
    #
    # @example
    #   include ntp::config
    class ntp::config {
      file { "ntp.conf" :
        path   => "/etc/ntp.conf",
        owner  => "root",
        group  => "root",
        mode   => "0644",
        source => "puppet:///configs/ntpd/ntp.conf",
      }
    }

    In the above example, the source is defined as a location relative to Puppet’s fileserver: the mount name equals “configs” and the path is set to ntpd/ntp.conf to match the config file’s location in the central repository pulled from Git.

    The Puppet master also needs to know on which nodes (servers) it should manage these config files. In this case, the management happens in the production environment, and only on the VPS hosting the Puppet master container – rzski.com (the hostname is picked from the host’s /etc/hosts):

    # cat /work/puppetlabs/code/environments/production/manifests/site.pp
    node "rzski.com" {
      include ntp::config
    }

    The above two steps (the class definition with the correct source path and the per-node class reference in site.pp) have to be repeated for every module, i.e. for each set of config files, single files or scripts; a sketch of what a second module would look like follows.
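
    For illustration, a hypothetical scripts module would follow the same pattern; the class and file names below are made up, and a [scripts] mount analogous to [configs] is assumed in fileserver.conf and auth.conf:

    # scripts::config – sketch only
    class scripts::config {
      file { "backup.sh" :
        path   => "/usr/local/bin/backup.sh",
        owner  => "root",
        group  => "root",
        mode   => "0755",
        source => "puppet:///scripts/backup.sh",
      }
    }

    # and the matching node entry in site.pp:
    node "rzski.com" {
      include ntp::config
      include scripts::config
    }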

    To verify the configs are being served, run the local Puppet binary in agent test mode:

    # puppet agent -t
    Info: Using configured environment 'production'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Retrieving locales
    Info: Caching catalog for rzski.com
    Info: Applying configuration version '1543335030'
    Notice: Applied catalog in 0.05 seconds

    And to observe the change committed earlier (the comment appended to the end of ntp.conf):

    # tail /var/log/puppetlabs/puppet/puppet.log
    puppet-agent[18560]: Applied catalog in 0.05 seconds
    puppet-agent[18728]: (/Stage[main]/Ntp::Config/File[ntp.conf]/content) content changed '{md5}96db7670882085ea77ce9b5fa14dc46f' to '{md5}06f8cea8589b23e43dcea88cce5ac8ea'

    Finally, to “sync” the Git repo with the Puppet repo, cron can call git pull every minute (as described above; optionally that can land in a small shell script). Either switch user to ‘puppet’ and clone (su - puppet -c “git clone…”, but this requires giving the puppet user a valid shell, which is not ideal), or just pull && chown – a minimal crontab sketch follows.
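
    A minimal sketch of that cron entry for the pull && chown variant, assuming the clone location used earlier (installed with crontab -e as root):

    # pull the central repo every minute and re-assert ownership for the master
    * * * * * cd /work/puppetlabs/files/configs && git pull -q && chown -R puppet:puppet .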

    To go further from here, I can also execute a command (using Bolt for an agentless approach, or a shell script) to, for example, push my Solidity code into the Docker container running my Geth Ethereum node. Let’s say I’ve pushed the file from the workstation and accepted the commit. Once the new version gets cloned into the Puppet repo, run docker cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH. Then, to register the ABI and, if required, compile, I can chain another task calling docker exec container_name command and collect the outputs. But that’s material for another article 😉
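
    As a teaser, a rough sketch of that chain; the container name geth and the file paths are made up for illustration, and solc is assumed to be available in the image:

    # copy the freshly pulled contract from the Puppet repo into the container
    docker cp /work/puppetlabs/files/scripts/MyContract.sol geth:/root/MyContract.sol
    # compile inside the container and collect the ABI and bytecode
    docker exec geth solc --abi --bin /root/MyContract.sol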