
2017/09/14

MongoDB MongoD Replica Set Installation and Configuration with Puppet and Docker

One of the key benefits that containerization has brought is the ability to rapidly implement and scale up setups that used to be deemed complex. For example, implementing a MongoDB MongoD Replica Set used to take hours, but with Docker that implementation time drops to less than 5 minutes. Coupled with a configuration management tool like Puppet, the complexity of such an implementation is reduced to modifying a simple set of variables in a YAML file, and the setup can be replicated with no effort.
Below I describe one approach that leverages both Docker and Puppet to achieve a good level of automation in implementing a MongoDB Replica Set.
As usual, to make the technical implementation details easier to digest, I am following a set of well-known Puppet best practices, with a great emphasis on the following:
  • Roles and Profiles Pattern
  • Hiera for Configuration Data
The post is divided in the following sections:
  1. A few words on the Architecture
  2. Puppet Module Installation and others prerequisites
  3. Profile Module creation
  4. Hiera Configuration
  5. Roles Configuration
  6. MongoDB Replica Set Additional Configuration
   I. A few words on the Architecture:

A picture is worth a thousand words, and that has proved especially true when it comes to technology architecture. So let's first see what the implementation looks like:

Dockerized MongoDB Replica Set

We have two physical nodes and one virtual system (it does not absolutely need to be a virtual system; the point is simply that this node is a small one whose only purpose is to act as MongoDB arbiter).
The two physical nodes serve as Docker hosts. Each Docker host runs several distinct MongoDB Docker instances, and each instance is published on a particular TCP port (27017, 27018, 27019...). The data directories (/data/db) of the MongoDB instances are Docker volumes (mapped to Docker host directories, which I usually create as distinct LVM Logical Volumes).
To make the containers easy to identify, each container gets a name built by concatenating the physical node hostname and the container port ("${hostname}-p${public_port}", e.g. node1-p27017 for the physical node whose hostname is node1).
The complete Docker installation and initial configuration is handled by Puppet and covered in Sections II-V of this post. The last section (Section VI) covers the initialization of the Replica Set using the mongo shell.

Arbiters are mongod instances that are part of a replica set but do not hold data. Arbiters participate in elections in order to break ties. If a replica set has an even number of members, add an arbiter.


   II. Puppet Module Installation and others prerequisites:

To complete this implementation, we only need to install the garethr/docker module. I am also adding puppetlabs/lvm, as I prefer to map Docker volumes to LVM Logical Volumes, but this isn't mandatory. As usual, the modules can be installed either on the command line or through a Puppetfile.

[root@pe-master ~] # puppet module install garethr-docker
[root@pe-master ~] # puppet module install puppetlabs-lvm

or simply by adding the following to the Puppetfile:


mod 'garethr-docker'
mod 'puppetlabs-lvm'
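
If your control repository is deployed with r10k or Code Manager, you will probably also want to pin the module versions in the Puppetfile. A minimal sketch (the version numbers below are only illustrative; check the Forge for the current releases):

mod 'garethr-docker', '5.3.0'
mod 'puppetlabs-lvm', '1.0.0'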

   III. Profile Module Creation:

The profile is really where most of the code is written. In fact, the profile code shared below acts as the glue between all these technologies (Docker, MongoDB and optionally LVM), and it provides the level of automation we are after.

Note that this profile can easily be adapted for a plain Docker installation, and that I am using host as the network mode for Docker.



class profiles::mongodb_docker
(
  $mongo_version            = "3.4",          # mongo Docker image tag, defaults to "3.4"
  $is_using_lvm             = false,  # Whether or not LVM Module should be included
)
{

  ### Add the needed Modules and Docker Images
  include 'docker'
  docker::image { 'mongo':
    image_tag => $mongo_version,
  } 

  # Include LVM if needed 
  # Note that all the needed parameters must be defined in Hiera
  if $is_using_lvm { include ::lvm }
  
  # Create a hash from Hiera Data with the Docker Run
  $myDockerContainer = hiera('docker::run', {})
  
  # Looping through container
  $myDockerContainer.each |$container_name, $container_param| {

    ### Get the must have parameters - variables , for now ports only
    if !has_key($container_param, 'extra_parameters') {
      fail('Need to have extra_parameters --publish xxxx:27001 set for this to work properly')
    }
    $extra_params_publish = grep($container_param[extra_parameters], 'publish')
    if empty($extra_params_publish) {
      fail('Need to have extra_parameters --publish xxxx:27001 set for this to work properly')
    }

    ### Create the Container Needed Directory for Volumes
    if has_key($container_param, 'volumes') {
      $container_param[volumes].each |$volume_hiera| {
        $mydir = split("${volume_hiera}", ':')[0]
        file { "${mydir}":
          ensure   => directory,
        }
      }
    }
    
    ### Create the Container Hostname
    # Form is <hostname>-p<port>; e.g: dladc2-infdoc01-p27017
    # grep() returns an array of matching entries; take the first '--publish <port>:<port>' one
    $port_mapping         = split($extra_params_publish[0], /\s+/)[1]
    $public_port          = split("${port_mapping}", ':')[0]
    $docker_hostname      = "${hostname}-p${public_port}"
    
    ## Docker Container Run Configuration
    # Then run the Containers
    docker::run{$container_name:
      * => {
           hostname   =>  $docker_hostname,
           net        => 'host',
           } + $container_param # Merging with container parameters
    }
  }
}
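
Once this profile has been applied on a Docker host, a quick sanity check is to list the running containers and confirm that the mongod ports are listening on the host itself (remember we are using host networking). For example, on a host named node1:

[root@node1 ~]# docker ps
[root@node1 ~]# ss -lntp | grep -E '2701[789]'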


  IV. Hiera Configuration:

As noted above, I am exclusively using Hiera to store the nodes' configuration data. The configuration below sets up three MongoD instances running on three different ports, with the following parameters specified for each:


  • logpath: log file to which mongod writes its logs
  • port: port on which mongod listens for requests
  • dbpath (volumes): directory where mongod stores all its data (mapped to a volume that lives on the physical node)
  • replSet: name of the replica set
  • wiredTigerCacheSizeGB: the maximum size of the internal cache that WiredTiger will use for all data


---
docker::run: 
  'mongod_db27017':
    image            : 'mongo'
    command          : 'mongod --replSet rs1 --port 27017 --shardsvr --logpath /var/log/mongodb/mongod.log --wiredTigerCacheSizeGB 100'
    volumes          :
      - "/var/docker_db/db_p27017:/data/db"
    extra_parameters :
      - '--publish 27017:27017'
      - '--restart=always'
    memory_limit     : '128g'
  'mongod_db27018':
    image            : 'mongo'
    command          : 'mongod --replSet rs2 --port 27018 --shardsvr  --logpath /var/log/mongodb/mongod.log --wiredTigerCacheSizeGB 100'
    volumes          :
      - "/var/docker_db/db_p27018:/data/db"
    extra_parameters :
      - '--publish 27018:27018'
      - '--restart=always'
    memory_limit     : '128g'
  'mongod_db27019':
    image            : 'mongo'
    command          : 'mongod --replSet rs3 --port 27019 --shardsvr  --logpath /var/log/mongodb/mongod.log --wiredTigerCacheSizeGB 100'
    volumes          :
      - "/var/docker_db/db_p27019:/data/db"
    extra_parameters :
      - '--publish 27019:27019'
      - '--restart=always'
    memory_limit     : '128g'
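
This data is node-specific, so it normally lives in a per-node Hiera layer. As a sketch, assuming a Hiera 5 setup with the data kept in the environment's data directory (adapt the paths and layer names to your own hiera.yaml), the hierarchy could look like:

---
version: 5
defaults:
  datadir: data
  data_hash: yaml_data
hierarchy:
  - name: "Per-node data"
    path: "nodes/%{trusted.certname}.yaml"
  - name: "Common data"
    path: "common.yaml"

The docker::run hash shown above would then go into the nodes/<certname>.yaml file of each Docker host.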


   V. Roles Configuration:

Using the profile described in the section above, together with the right Hiera data, we can move forward with the role module configuration:

class roles::containers_ha_02  {
 
    # Install Docker and MongoDB Instances
    include profiles::mongodb_docker

}
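
The Docker hosts then only need to be classified with this role, either through the PE console / an ENC or directly in site.pp. A minimal site.pp sketch (the node names are hypothetical):

node 'node1.example.com', 'node2.example.com', 'node3.example.com' {
  include roles::containers_ha_02
}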

   VI. MongoDB Replica Set Additional Configuration:


Once the role/profile (and the corresponding Hiera data shown above) is applied, we end up with three containers per Docker host, all created from the mongo image and all sharing the host's network stack (host network mode, as set in the profile).

Assuming that our Docker hosts are named node1, node2 and node3, then, in line with the Puppet code in the profile, our container hostnames and replica set names will be as follows (to make things easier, I recommend adding these names to DNS; see the /etc/hosts sketch after this list):
  • On node1: node1-p27017 (rs1), node1-p27018 (rs2), node1-p27019 (rs3)
  • On node2: node2-p27017 (rs1), node2-p27018 (rs2), node2-p27019 (rs3)
  • On node3: node3-p27017 (rs1), node3-p27018 (rs2), node3-p27019 (rs3)
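
If you don't want to touch DNS right away, the same mapping can be sketched in /etc/hosts on the machines that need access. The IP addresses below are placeholders; since the containers use host networking, all three names on a host simply resolve to that host's IP:

192.0.2.11  node1-p27017 node1-p27018 node1-p27019
192.0.2.12  node2-p27017 node2-p27018 node2-p27019
192.0.2.13  node3-p27017 node3-p27018 node3-p27019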

As described in the first section, we can access any of these containerized MongoD instances from our network using the mongo shell interface, if we need to (you will have to install the MongoDB client tools on your own machine to do this).
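
For example, from a workstation with the mongo shell installed, and assuming the DNS (or /etc/hosts) entries mentioned above are in place, connecting to the first instance would look like:

$ mongo --host node1-p27017 --port 27017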

What remains is to initiate the replica set for each group of instances and add the other members. Below is the example for rs1 (the same should be done for rs2 and rs3).


#### configuration of rs1

rs.initiate( {
   _id : "rs1",
   members: [ { _id : 0, host : "node1-p27017:27017" } ]
})
rs.add("node2-p27017:27017")
rs.addArb("node3-p27017:27017")
rs.status()

#### Setting the primary

cfg = rs.conf();
cfg.members[0].priority = 10;
cfg.members[1].priority = 5;
rs.reconfig(cfg);
rs.conf();
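
Once the new configuration has propagated, you can confirm which member ended up as primary, still from the mongo shell:

db.isMaster().primary   // should eventually return "node1-p27017:27017" given the priorities above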

