
2016/05/08

puppet, git, gitlab & r10k: Split that Monolithic git workflow

Configuration Management and Version Control Software! What better way to start a new year in this DevOps era than to write a post about these two hot topics. Indeed, moving forward on my DevOps path makes these two unavoidable.

I have been using Puppet for more than a year now and have written a dozen useful modules. As I introduce the rest of the team to Configuration Management and Puppet, I'm also starting to understand the great benefits of using Version Control Software (VCS) to manage Puppet code (or any other source code), especially when that code is edited by many people (it's even more obvious when those other people are SysAdmins).

The aim of this post is mainly to share my experience moving from a monolithic environment, where I used to edit my Puppet modules directly on my Puppet Master and use git (a bunch of git clone/git remote/git push/git fetch...) to maintain the Puppet directory environments (Production/Dev/Test), to a much cleaner workflow where I (and any other contributor) can easily contribute. In that cleaner workflow, I intend to make use of:

  • A smart tool which provides a general-purpose toolset for deploying Puppet environments and modules in a seamless way (r10k).
  • An internal repository hosting system (GitLab CE/Git) to publish and store Puppet code in a central repo.
  • A more modern, integrated toolset for developing Puppet modules and manifests (Geppetto) from my workstation (not from the Puppet Master).
Note that similar workflows have already been discussed on many great websites (some are listed in the references section), but I intend to share, with as much detail as possible, my own experience moving from a monolithic workflow to one with the features listed above. My hope is that this will help other people design their own workflow.


To make it more readable, I've divided this post into 03 sections:

I. A Basic and Monolithic Git workflow: Description of this Monolithic Environment

This section describes in detail the basic monolithic Git workflow and how it was implemented. I will also highlight some pros and cons of that workflow.

II. Moving to "one repo per Puppet Module" and a Centralized Git

In this section, I describe in detail how the huge Puppet repositories are split into many smaller repos hosted on a centralized enterprise Git server (GitLab CE).

III. Plastering the walls with r10k:

This last section covers the r10k implementation used to synchronize the Puppet environments, their associated modules and data.


   I. A Basic and Monolithic Git workflow: Description of the Monolithic Environment

   *** One ring to rule them all


This section describes in detail the basic monolithic Git workflow and how it was implemented.

The setup I'm using for this post is a classic one for a mid-sized infrastructure: PE 2015.2 with the Puppet master, PE console, and PuppetDB all installed on one node.
For the Puppet directory environments (production/dev/test...), I'm using the recommended PE 2015.2 environment-based workflow for testing new code in the node classifier. This workflow is well described on the Puppet site, but basically it involves creating and configuring two types of node groups: a first set of node groups used exclusively for assigning environments to nodes, and a second set used for applying classes to nodes.
Below is a picture which summarizes that workflow. On the left side is the set of node groups used for assigning nodes to either the production, dev or test environment, and on the right side is the structure put in place for classifying nodes (using roles).

Source: http://docs.puppetlabs.com/pe/2015.2/console_classes_groups_environment_override.html# 




To implement this design, I have the default production environment plus new test and dev environments. The test and dev environments were created simply by initializing /etc/puppetlabs/code/environments/production as a git repo and cloning that repo into the dev and test directories. Synchronization between the test/dev and production repos is (obviously) manual, with the development of new features/modules done exclusively in the test environment. Once satisfied with a feature/module in the test environment, the test repository is merged into the dev environment and later on into the production environment. Below is a picture that tries to summarize this basic git workflow.





        I.a: Creation of Git Repo and Directory Environments

Let's see how this is implemented on my Puppet Master (name: pe-master). Note that I'm using the default (already configured) production directory to initialize the first git repo.


### Creation of production repo
[root@pe-master production]# cd /etc/puppetlabs/code/environments/production
[root@pe-master production]# git init 
Initialized empty Git repository in /etc/puppetlabs/code/environments/production/.git/
[root@pe-master production (master)]# git add --all :/
[root@pe-master production (master)]# git commit -m "Production Initial Repository with few official modules installed"
[......]

#### Cloning of production repo to test and dev Repos
[root@pe-master production]# cd ..
[root@pe-master environments]# git clone production test
Cloning into 'test'...
done.
[root@pe-master environments]# git clone production dev
Cloning into 'dev'...
done.

### Adding Remote repositories to test/dev Repositories
[root@pe-master environments]# cd test/
[root@pe-master  test (master)]# pwd
/etc/puppetlabs/code/environments/test
[root@pe-master test (master)]# git remote add dev /etc/puppetlabs/code/environments/dev
[root@pe-master test (master)]$ git fetch dev 
From /etc/puppetlabs/code/environments/dev
 * [new branch]      master     -> dev/master
[root@pe-master test (master)]# git remote -v
dev     /etc/puppetlabs/code/environments/dev (fetch)
dev     /etc/puppetlabs/code/environments/dev (push)
origin  /etc/puppetlabs/code/environments/production (fetch)
origin  /etc/puppetlabs/code/environments/production (push)
[root@pe-master test (master)]# git branch -r
  dev/master
  origin/HEAD -> origin/master
  origin/master
[root@pe-master test (master)]# cd ../dev
[root@pe-master dev (master)]# pwd
/etc/puppetlabs/code/environments/dev
[root@pe-master dev (master)]# git status
# On branch master
nothing to commit, working directory clean
[root@pe-master dev (master)]# git remote add test /etc/puppetlabs/code/environments/test
[root@pe-master dev (master)]$ git fetch test 
From /etc/puppetlabs/code/environments/test
 * [new branch]      master     -> test/master
[root@pe-master dev (master)]# git remote -v
origin  /etc/puppetlabs/code/environments/production (fetch)
origin  /etc/puppetlabs/code/environments/production (push)
test    /etc/puppetlabs/code/environments/test (fetch)
test    /etc/puppetlabs/code/environments/test (push)
[root@pe-master dev (master)]# git branch -r
  origin/HEAD -> origin/master
  origin/master
  test/master

### Checking the Production Repositories and adding test/dev as remote
[root@pe-master dev (master)]# cd ../production
[root@pe-master production (master)]$ git remote add test /etc/puppetlabs/code/environments/test    
[root@pe-master production (master)]$ git remote add dev /etc/puppetlabs/code/environments/dev 
[root@pe-master production (master)]$ git fetch test 
From /etc/puppetlabs/code/environments/test
 * [new branch]      master     -> test/master
[root@pe-master production (master)]$ git fetch dev
From /etc/puppetlabs/code/environments/dev
 * [new branch]      master     -> dev/master
[root@pe-master production (master)]# git remote -v
dev     /etc/puppetlabs/code/environments/dev (fetch)
dev     /etc/puppetlabs/code/environments/dev (push)
test    /etc/puppetlabs/code/environments/test (fetch)
test    /etc/puppetlabs/code/environments/test (push)



        I.b: Creation of Environment Node Groups:

As the test and dev environment directories are already created, creating the corresponding environment node groups (test/dev) in the Puppet Enterprise console is quite easy.

In the Enterprise console, under the Nodes ==> Classification section, we create a new group named test (and likewise one named dev) and set its Environment to "test" (respectively "dev").



Once that is done, we go to that group, click "Edit node group metadata", and mark the group as an environment group.





        I.c: Create and use a new Module:

With this implementation, modules can be written in the test environment and then added in a seamless way to the dev environment, and later on to the production environment, simply by merging the git repositories.
Let's illustrate this by creating a quick (and dirty) newfeature module (which simply notifies the environment fact).


### Get to the test modules directory and create newfeature Branch
[root@pe-master test (master)]# cd modules/
[root@pe-master modules (master)]# git status 
# On branch master
nothing to commit, working directory clean
[root@pe-master modules (master)]# git checkout -b newfeature
Switched to a new branch 'newfeature'
[root@pe-master modules (newfeature)]# git status
# On branch newfeature
nothing to commit, working directory clean
[root@pe-master modules (newfeature)]# pwd
/etc/puppetlabs/code/environments/test/modules
[root@pe-master modules (newfeature)]# git branch 
  master
* newfeature


### Edit Puppet Manifest
[root@pe-master modules (newfeature)]# mkdir -p newfeature/{manifests,tests,files}
[root@pe-master modules (newfeature)]# cd newfeature/manifests/
[root@pe-master manifests (newfeature)]# ls
[root@pe-master manifests (newfeature)]# vim init.pp
[root@pe-master manifests (newfeature)]# cat init.pp 
class newfeature {
        notify { "This is the environment I am assigned to : ${::environment}": }
}
[root@pe-master manifests (newfeature)]# puppet parser validate init.pp 
[root@pe-master manifests (newfeature)]# puppet module list --environment test | grep newfeature

├── newfeature (???)

### Commit this change
[root@pe-master manifests (newfeature)]# cd /etc/puppetlabs/code/environments/test/modules
[root@pe-master modules (newfeature)]# git status 
# On branch newfeature
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       newfeature/
nothing added to commit but untracked files present (use "git add" to track)
[root@pe-master modules (newfeature)]# git add newfeature
[root@pe-master modules (newfeature)]# git commit -m "Added newfeature module"
[newfeature f037c7b] Added newfeature module
 Committer: root <root@pe-master.stiv.local>
 1 file changed, 3 insertions(+)
 create mode 100644 modules/newfeature/manifests/init.pp

Next, we classify a test node (adding my small newfeature class to the right profile) and run the Puppet agent on that test node.
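For illustration, the roles/profiles wiring behind that classification might look roughly like the sketch below; the profile and role names are hypothetical, since the real ones depend on what the test node's role already includes.

# Hypothetical profile: pull the new module in alongside whatever else the profile manages.
class profiles::base {
  include ::newfeature
}

# Hypothetical role assigned to the test node; it simply stacks profiles.
class roles::testnode {
  include ::profiles::base
}

With that classification in place, a Puppet run on the test node picks up the new class: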


[root@pe-test ~]# puppet agent -t
Warning: Local environment: "production" doesn't match server specified node environment "test", switching agent to "test".
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for pe-test.stiv.local
Info: Applying configuration version '1452445799'
Notice: This is the environment I am assigned to : test
Notice: /Stage[main]/newfeature/Notify[This is the environment I am assigned to : test]/message: defined 'message' as 'This is the environment I am assigned to : test'
Notice: Applied catalog in 0.73 seconds

The test seems to be working well, so we can merge the newfeature branch back into the test repo's master branch and delete it.


[root@pe-master modules (newfeature)]# git branch 
  master
* newfeature
[root@pe-master modules (newfeature)]# git checkout master
Switched to branch 'master'
[root@pe-master modules (master)]# git log master..newfeature 
commit e8e6c5a5481b2e211a71f912a7185990bcdacf10
Author: root <root@pe-master.stiv.local>
Date:   Sun Jan 10 18:21:28 2016 +0100
    Added newfeature module
[root@pe-master modules (master)]# git merge newfeature 
Updating 755df81..e8e6c5a
Fast-forward
 modules/newfeature/manifests/init.pp | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 modules/newfeature/manifests/init.pp
[root@pe-master modules (master)]# git branch -d newfeature 
Deleted branch newfeature (was e8e6c5a).

The next step is merging this test master branch into the dev master branch.


[root@pe-master modules (master)]# cd ../../dev/
[root@pe-master dev (master)]# ls
environment.conf  hieradata  manifests  modules
[root@pe-master dev (master)]# git branch -r
  origin/HEAD -> origin/master
  origin/master
  test/master
[root@pe-master dev (master)]# git log master..test/master 
[root@pe-master dev (master)]# git fetch test 
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 2), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
From /etc/puppetlabs/code/environments/test
   755df81..e8e6c5a  master     -> test/master
[root@pe-master dev (master)]# git log master..test/master 
commit e8e6c5a5481b2e211a71f912a7185990bcdacf10
Author: root <root@pe-master.stiv.local>
Date:   Sun Jan 10 18:21:28 2016 +0100
    Added newfeature module
[root@pe-master dev (master)]# git branch 
* master
[root@pe-master dev (master)]# git merge test/master 
Updating 755df81..e8e6c5a
Fast-forward
 modules/newfeature/manifests/init.pp | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 modules/newfeature/manifests/init.pp
[root@pe-master dev (master)]# git log master..test/master      
 
We can now test that feature on a development system, and later on merge the dev repository into production to have the change live on the production systems.


[root@pe-dev ~]# puppet agent -t
Warning: Local environment: "production" doesn't match server specified node environment "dev", switching agent to "dev".
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for pe-dev.stiv.local
Info: Applying configuration version '1452446957'
Notice: This is the environment I am assigned to : dev
Notice: /Stage[main]/newfeature/Notify[This is the environment I am assigned to : dev]/message: defined 'message' as 'This is the environment I am assigned to : dev'
Notice: Applied catalog in 0.65 seconds

To summarize this section: once the production, dev, and test directory environment repos are initially synchronized, the workflow (with its associated git commands) goes as follows for any module modification/addition:


### From the test repository, create a new branch for the new feature/module
git checkout -b newmodule

### Develop the module in the normal way
### (with any number of commits in that newmodule branch)
### and once satisfied with the new module/feature, still in the test repository,
### merge the newmodule branch into the master branch
git checkout master
git merge newmodule

### Get to the dev repository and fetch the test repo in order to refresh
### the remote branch listing
git fetch test

### Check what the remote branch has that isn't in the current master branch
git log master..test/master

### Merge the remote test branch into the dev master branch
git merge test/master

### Once satisfied with that module on the development systems,
### get to the production repository to bring the change to our most
### critical systems: fetch and merge dev
git fetch test
git fetch dev
git merge dev/master

### Get to the dev repository and fetch origin in order to refresh its branch listing
git fetch origin

### Get to the test repository and fetch origin
### and dev in order to refresh their branch listings
git fetch origin
git fetch dev



        I.d: Some advantages and disadvantages of that Workflow:


All of that seems good, and we can easily see the advantages of such a design: no changes are pushed to the production systems without extensive tests, first in the test environment and then in the dev environment, and we get a well-kept history of the evolution of our main environment. But there are also some (or many...) disadvantages to using git repos and Puppet in this way. In fact, one of the main purposes of using Version Control Software (and git) is collaboration, and this configuration doesn't really help achieve that noble goal. To put it simply:

1. How do we enable the tons of awesome external developers in the world to help enhance the modules we're writing (in my case, how can I push some of my Puppet module code to a public GitHub/GitLab account)? We obviously can't push our whole test repo to GitHub and expect people to collaborate on that; we need a way to start a new (single) project/module and push only that module to a public VCS.

2. How do we pull publicly written modules that aren't on the Forge (e.g. hosted on GitHub, GitLab...) into our environment? If we do that, we'll be pulling another repository inside our repo (a nested git repo), and that's a bit tough to deal with.

3. Eight (08) manual steps to get a module modification onto the production systems! That seems like a lot, and it's probably prone to mistakes!


   II. Moving to one repo per Puppet Module and a Centralized Git:

   *** The Fellowship of the Ring



Now that we have these huge test, dev and production repos containing all our downloaded and written modules, this section describes how I'm moving to smaller repos which are more manageable and easily shareable. As described above, I have 03 similar environments: test, dev and production. Test is where I make most of the modifications, while dev and production are replicas of that repository (mainly used for node classification). For the split, I'm mainly working on the test environment.

But before moving forward with the test environment's split, there is a set of requirements that must be met:
  • The first requirement is to have another system that will be used as the centralized git server. It is possible to use a public central repo for this, but in my case some modules must be kept private (the ones specific to the running environment) while others will be shared publicly, so having an internal central Git system is nice. As you will see below, I'm using GitLab for that internal Git system.
  • The second is to make sure we have no branch other than master, so we have to merge all our modifications into the master branch of our test repo.
  • The last requirement is to have a backup of all our environments :-). For that, a simple tar will do the trick, something like:
  cd /etc/puppetlabs/code && tar zcvf /tmp/environments_before_split.tar.gz environments

        II.a: Split the monolithic repo:

Once the requirements described above are met, we can apply the recommended strategy for such a split, which is to make a full clone of the main repo (test) into a local directory named after each module. Then, for each module, we use the git filter-branch --subdirectory-filter command to keep only the commits that apply to that module, and we're done.
If that seems a bit tough to understand, let's see how it is done in practice. First, we need to have a look at the modules we have to migrate, using the tree command as seen below.

[root@pe-master test (master)]# tree -d -L 2 modules
modules
├── apache
│   ├── files
│   ├── lib
│   ├── manifests
│   ├── spec
│   ├── templates
│   └── tests
├── banner
│   ├── manifests
│   ├── templates
│   └── tests
├── chrony
│   ├── manifests
│   ├── spec
│   ├── templates
│   └── tests
├── newfeature
│   ├── files
│   ├── manifests
│   └── tests
├── ntp
│   ├── lib
│   ├── manifests
│   ├── spec
│   ├── templates
│   └── tests
├── profiles
│   ├── files
│   ├── manifests
│   ├── templates
│   └── tests
├── roles
│   ├── files
│   ├── manifests
│   ├── templates
│   └── tests
├── stdlib
│   ├── examples
│   ├── lib
│   ├── manifests
│   └── spec

So, we have a bunch of Puppet modules installed from the Puppet Forge (puppetlabs-apache, puppetlabs-ntp, puppetlabs-stdlib and ringingliberty-chrony), along with the roles and profiles modules and two other internally written modules (banner and newfeature).
Let us first try to split a single module (newfeature) by following the described strategy (after that, we will automate the process):
  1. Create a new (temporary) working directory and get into that directory
  2. Clone the test directory environment into a repo named after the module we want to split (and remove origin after the cloning)
  3. Keep only the history relevant to that module
  4. Check that our module now contains only the relevant subdirectory

###1.### Create the new working directory and get in that directory
[root@pe-master tmp]# cd /tmp/
[root@pe-master tmp]# mkdir modules
[root@pe-master tmp]# cd modules/

###2.### Clone the test directory environment, better remove origin after the cloning
[root@pe-master modules]# git clone /etc/puppetlabs/code/environments/test newfeature
Cloning into 'newfeature'...
done.
[root@pe-master modules]# cd newfeature/
[root@pe-master newfeature (master)]# git remote -v
origin  /etc/puppetlabs/code/environments/test (fetch)
origin  /etc/puppetlabs/code/environments/test (push)
[root@pe-master newfeature (master)]# git remote remove origin

###3.### Keep only the history relevant to that module
[root@pe-master newfeature (master)]# git filter-branch --subdirectory-filter modules/newfeature
Rewrite e8e6c5a5481b2e211a71f912a7185990bcdacf10 (1/1)
Ref 'refs/heads/master' was rewritten

###4.### Check that our module only contains the relevant subdirectory
[root@pe-master newfeature (master)]# cat manifests/init.pp 
class newfeature {
        notify { "This is the environment I am assigned to : ${::environment}": }
}
[root@pe-master newfeature (master)]# git status 
# On branch master
nothing to commit, working directory clean
[root@pe-master newfeature (master)]# git log --oneline
81ca0ca Added newfeature module



Having many modules to split with that same sequence, I think it's better to write a script for that activity. Here is a Python script written for that purpose, and below is the output of that script in this environment (my strong suggestion for running such a script is to switch to an unprivileged system user who has only read rights on the directory environment folder).


[stiv@pe-master ~]$ ./split_monolithic_puppet_git_repository.py  /tmp/modules /etc/puppetlabs/code/environments/test
### Start Split for apache
running git clone /etc/puppetlabs/code/environments/test apache ...
Cloning into 'apache'...
done.

running git remote remove origin ...

running git filter-branch --subdirectory-filter modules/apache ...
Rewrite 755df8167361f1d5fff96cdba1b5b67f63c7e50f (1/1)
Ref 'refs/heads/master' was rewritten

### End Split for apache 


### Start Split for banner
running git clone /etc/puppetlabs/code/environments/test banner ...
Cloning into 'banner'...
done.

running git remote remove origin ...

running git filter-branch --subdirectory-filter modules/banner ...
Rewrite 8c103db90111a1d9230fdefe5917e6d74c463c39 (2/2)
Ref 'refs/heads/master' was rewritten

### End Split for banner


### Start Split for chrony
running git clone /etc/puppetlabs/code/environments/test chrony ...
Cloning into 'chrony'...
done.

running git remote remove origin ...

running git filter-branch --subdirectory-filter modules/chrony...
Rewrite 8c103db90111a1d9230fdefe5917e6d74c463c39 (2/2)
Ref 'refs/heads/master' was rewritten

### End Split for chrony


### Start Split for newfeature
running git clone /etc/puppetlabs/code/environments/test newfeature ...
Cloning into 'newfeature'...
done.

running git remote remove origin ...

running git filter-branch --subdirectory-filter modules/newfeature ...
Rewrite e8e6c5a5481b2e211a71f912a7185990bcdacf10 (1/1)
Ref 'refs/heads/master' was rewritten

### End Split for newfeature 


### Start Split for ntp
running git clone /etc/puppetlabs/code/environments/test ntp ...
Cloning into 'ntp'...
done.

running git remote remove origin ...

running git filter-branch --subdirectory-filter modules/ntp ...
Rewrite 755df8167361f1d5fff96cdba1b5b67f63c7e50f (1/1)
Ref 'refs/heads/master' was rewritten

### End Split for ntp 


### Start Split for profiles
running git clone /etc/puppetlabs/code/environments/test profiles ...
Cloning into 'profiles'...
done.

running git remote remove origin ...

running git filter-branch --subdirectory-filter modules/profiles ...
Rewrite 0572ead6fa6a23f4eca967482eae247e438f9472 (14/14)
Ref 'refs/heads/master' was rewritten

### End Split for profiles 


### Start Split for roles
running git clone /etc/puppetlabs/code/environments/test roles ...
Cloning into 'roles'...
done.

running git remote remove origin ...

running git filter-branch --subdirectory-filter modules/roles ...
Rewrite ea7b531ad9caabf7fb93bc792edb2fd7dbbee3f4 (11/11)
Ref 'refs/heads/master' was rewritten

### End Split for roles 

### Start Split for stdlib
running git clone /etc/puppetlabs/code/environments/test stdlib ...
Cloning into 'stdlib'...
done.

running git remote remove origin ...

running git filter-branch --subdirectory-filter modules/stdlib ...
Rewrite 755df8167361f1d5fff96cdba1b5b67f63c7e50f (1/1)
Ref 'refs/heads/master' was rewritten

### End Split for stdlib 


        II.b: Create new remote repositories (our Central git repo):

The next step is to push these small module repositories to our centralized git server. To do that, we obviously need to have that centralized git server installed and configured.
If you are using a known (public) centralized git service (such as Bitbucket, GitLab or GitHub), then you can simply start creating the repos that will host the modules and move forward to section II.c (pushing to the central git repo).

For this post, I've installed GitLab CE using the vshn/gitlab Puppet Forge module. In its simplest form, the following is enough to have it installed on the target node:


class { 'gitlab':
 external_url => 'http://git.stiv.local',
}

I strongly advise updating GitLab CE to the latest version by running yum update right after its installation.
On that centralized git server, I'm creating one project for each module. As an example, I created a repo on my GitLab for the newfeature module developed above.



Having many repos to create, it's obviously better to automate the operation. It is also a great idea (though not mandatory) at this stage to create a GitLab group used exclusively by Puppet administrators and users; for this post, I created a "puppet" group. To automate the repo creation, I slightly modified the script described here as follows (basically, the script uses the GitLab API to create the repo). The script (and the API) needs my GitLab user token as an input parameter (available on GitLab under profile ==> account), and it also needs the namespace id of the GitLab group we are using (puppet). This namespace id can easily be retrieved with:
  

$ curl --header "PRIVATE-TOKEN: MY_TOKEN_ID" "http://git.stiv.local/api/v3/namespaces/"

The script below is located under /home/stiv/git-init-gitlab.sh:


#!/bin/bash
# Location: /home/stiv/git-init-gitlab.sh
set -x

# Set RepoName
repo_name=$1 

# Private Gitlab Token
token=XXXXXXXXXXXXXXXX

# Namespace of the group this project belongs to
group_id=10

# Test that repo name is set
test -z $repo_name && echo "Repo name required." 1>&2 && exit 1

# Create the Project in the group
curl -H "Content-Type:application/json" http://git.stiv.local/api/v3/projects?private_token=$token -d "{ \"name\": \"$repo_name\",\"namespace_id\":$group_id}"

With that little script, I can easily create all my centralized git repos. The only small challenge before running the script over my new module repos is to figure out what to do with modules that were downloaded from the Puppet Forge. In my opinion, the simplest thing to do is to exclude these Forge modules and create central repos only for our own modules, but this only works if you haven't modified a downloaded Forge module. As seen below, we can leverage the "puppet module changes" command to verify whether a Forge module was modified after it was downloaded.


[stiv@pe-master modules]$ cd /etc/puppetlabs/code/environments/test/modules
[stiv@pe-master modules]$ for module in $(ls)
> do
> echo "##Module Name: $module"
> puppet module changes $module
> echo " "
> done
##Module Name: apache
Notice: No modified files

##Module Name: banner
Error: No file containing checksums found.
Error: Try 'puppet help module changes' for usage

##Module Name: chrony
Warning: 1 files modified
manifests/config.pp

##Module Name: newfeature
Error: Could not find a valid module at "newfeature"
Error: Try 'puppet help module changes' for usage

##Module Name: ntp
Notice: No modified files

##Module Name: profiles
Error: Could not find a valid module at "profiles"
Error: Try 'puppet help module changes' for usage

##Module Name: roles
Error: Could not find a valid module at "roles"
Error: Try 'puppet help module changes' for usage

##Module Name: stdlib
Notice: No modified files

Based on that output, I can safely assume that, among the modules downloaded from the Forge, apache, ntp and stdlib haven't been modified, while chrony has been modified and I need to decide what to do with that modified version (in this case, I'll keep the modified version in my central repository). banner, newfeature, profiles and roles are my own modules and will obviously be pushed to that central repository.

Moving forward with the push to GitLab, I now simply have to create my new (empty) GitLab projects:



[stiv@pe-master modules]$ for module in `ls | egrep -v "apache|ntp|stdlib"`
> do
> /home/stiv/git-init-gitlab.sh $module
> done


        II.c: Push the new set of modules to their Central Repos and create production branch:

With our central repos created, the next step is simply to push our new set of modules to these repos (I assume here that remote authentication to GitLab/GitHub is already configured). Let's see how this works for a single module (in this case, newfeature); later on, we'll run a loop for all the modules.



[stiv@pe-master modules]$ cd newfeature/
[stiv@pe-master newfeature]$ git remote
[stiv@pe-master newfeature]$ git remote add origin git@git.stiv.local:puppet/newfeature.git
[stiv@pe-master newfeature]$ git remote -v
origin  git@git.stiv.local:puppet/newfeature.git (fetch)
origin  git@git.stiv.local:puppet/newfeature.git (push)
[stiv@pe-master newfeature]$ git push -u origin master
Counting objects: 4, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (4/4), 348 bytes | 0 bytes/s, done.
Total 4 (delta 0), reused 0 (delta 0)
To git@git.stiv.local:puppet/newfeature.git
 * [new branch]      master -> master
Branch master set up to track remote branch production from origin.

Again, doing that manually for every module is kind of boring (Culture, Automation...). A simple loop will do the trick for all of them.



[stiv@pe-master modules]$ for module in `ls | egrep -v "apache|ntp|stdlib"`
> do
> cd $module
> git remote add origin git@git.stiv.local:puppet/$module.git
> git push -u origin master
> cd ..
> done

Once that is completed, we can move forward to the last step, which is about r10k...


   III. Plastering the walls with r10k:


   *** The return of the king


Though we've successfully split our huge repos, we're still using the main monolithic repos (test, dev and production) on the Puppet Master; this is the step where we break those repos apart using r10k.
So, what is r10k? It is a very simple tool that does just two things (more details on the Puppet site):
  1. Handles Puppet module installation using an authoritative, standalone list (Puppetfile) that you create for it.
  2. Creates and manages directory environments based on the branches in your Git control repository.
Puppet Enterprise 2015.2 (and later versions) already includes r10k, so there's no need to download and install the tool; we only need to configure it.
One of the great features r10k provides is the on-the-fly creation of directory environments from the git branches of our control repo, meaning that a new environment is created just by adding a new branch to that repository. Anyway, let's see below what that control repository is and how to implement it.

        III.a: Create R10k Control Repo:

The control repository, in its simplest form, is a Git repository which stores a Puppetfile and Hiera data (hieradata/). In fact, to use r10k we must split a Puppet repo into two parts: a control repo that contains some configuration, including a Puppetfile that points to the other repos holding our actual Puppet code. For the control repo creation, we can either go for an empty directory that we populate ourselves or use the template repo made available by Puppet. For educational purposes, I've settled for creating my own control repo. Below are the steps I went through:
  • Create a project named "control-repo" and add it to our "puppet" group
  • Copy the following folders/files from one of the existing directory environments (production in this case, but any other would do) to a temporary folder:
    [stiv@pe-master tmp]$ cd /tmp/
    [stiv@pe-master tmp]$ mkdir control-repo
    [stiv@pe-master tmp]$ cd control-repo/
    [stiv@pe-master control-repo]$ cp -pr /etc/puppetlabs/code/environments/production/{environment.conf,hieradata,manifests} .
    [stiv@pe-master control-repo]$ ls -lrth
    total 4.0K
    -rw-r--r-- 1 stiv stiv 879 Jul 21  2015 environment.conf
    drwxr-xr-x 2 stiv stiv   6 Jul 21  2015 hieradata
    drwxr-xr-x 2 stiv stiv  20 Aug 31  2015 manifests
    
    
  • Create a Puppetfile in that same folder. This is the main configuration file which lists the modules we want for our environment and where to get them (Forge, or git - GitLab, GitHub, Bitbucket...). Below, you can see the one I created for this case. Note that as I'm using GitLab SSH authentication, I had to create another GitLab user in the GitLab puppet group (with the id pe-puppet). I also created a public/private SSH keypair (ssh-keygen) for the user that invokes r10k (typically root, but this may be site-specific) and authorized that public key to connect to my GitLab group (and to all the module repositories). r10k simply invokes git in order to clone the repositories defined in that Puppetfile.
  • [stiv@pe-master control-repo]$ cat Puppetfile 
    forge "http://forge.puppet.com"
    
    # Modules from the Puppet Forge
    mod "puppetlabs-apache",                     '1.5.0'
    mod "puppetlabs-ntp",                        '3.3.0'
    mod "puppetlabs-stdlib",                     '4.11.0'
    
    # Modules from Git
    mod 'roles',
      :git    =>  'git@git.stiv.local:puppet/roles.git'
    mod 'profiles',
      :git    =>  'git@git.stiv.local:puppet/profiles.git'
    mod 'newfeature',
      :git    =>  'git@git.stiv.local:puppet/newfeature.git'
    mod 'banner',
      :git    =>  'git@git.stiv.local:puppet/banner.git'
    mod 'chrony',
      :git    =>  'git@git.stiv.local:puppet/chrony.git'
    
  • Turn that temporary control-repo directory into a git repo and sync it with the control-repo project (note that the branch is renamed to production in order to match our Puppet environment name):
    [stiv@pe-master control-repo]$ git init 
    Initialized empty Git repository in /tmp/control-repo/.git/
    [stiv@pe-master control-repo]$ git remote add origin git@git.stiv.local:puppet/control-repo.git
    [stiv@pe-master control-repo]$ git add :/
    [stiv@pe-master control-repo]$ git commit -a -m "Initial Control Repo creation"
    [master (root-commit) 8d3b8bb] Initial Control Repo creation
     3 files changed, 131 insertions(+)
     create mode 100644 Puppetfile
     create mode 100644 environment.conf
     create mode 100644 manifests/site.pp
    [stiv@pe-master control-repo]$ git branch -m master production
    [stiv@pe-master control-repo]$ git push -u origin production 
    

  • On GitLab, create control-repo branches for the dev and test environments

        III.b: Test and Deploy r10k:


We have everything in place now, so it's time to configure our Puppet master to use r10k for module deployment. This is the most critical step, as it erases the existing directory environments. The strong advice here is to first master this in another environment before touching production.

On PE 2015.2, r10k is configured either during installation (answer file) or in the PE console (after deployment). In this case, we'll do it in the PE console; the steps are as follows:
  1. In the PE console, navigate to the Classification page.
  2. Click the PE Master group.
  3. In the PE Master group page, click the Classes tab.
  4. Locate or add the pe_r10k class.
  5. In the pe_r10k class, set r10k's parameters. At minimum, set the remote parameter (the control-repo git address) and the git_settings parameter (see screenshots below).
  6. Click Add parameter, and then click the Commit change button.
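To give an idea of the values involved, here is a rough sketch of what that pe_r10k configuration amounts to, written as a class declaration; the remote is our control-repo address, while the private key path is an assumption and should point to whichever key was authorized on GitLab earlier.

class { 'pe_r10k':
  # Control repo whose branches become the Puppet directory environments
  remote       => 'git@git.stiv.local:puppet/control-repo.git',
  # Assumed location of the private key authorized on the GitLab puppet group
  git_settings => {
    'private_key' => '/root/.ssh/id_rsa',
  },
}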



We can now run r10k on the master and enjoy the output:

[root@pe-master ~ ]# r10k deploy environment -pv
INFO     -> Deploying environment /etc/puppetlabs/code/environments/dev
INFO     -> Deploying module /etc/puppetlabs/code/environments/dev/modules/apache
INFO     -> Deploying module /etc/puppetlabs/code/environments/dev/modules/apt
INFO     -> Deploying module /etc/puppetlabs/code/environments/dev/modules/stdlib
INFO     -> Deploying module /etc/puppetlabs/code/environments/dev/modules/roles
INFO     -> Deploying module /etc/puppetlabs/code/environments/dev/modules/profiles
INFO     -> Deploying module /etc/puppetlabs/code/environments/dev/modules/newfeature
INFO     -> Deploying module /etc/puppetlabs/code/environments/dev/modules/banner
INFO     -> Deploying module /etc/puppetlabs/code/environments/dev/modules/chrony
INFO     -> Deploying environment /etc/puppetlabs/code/environments/production
INFO     -> Deploying module /etc/puppetlabs/code/environments/production/modules/apache
INFO     -> Deploying module /etc/puppetlabs/code/environments/production/modules/apt
INFO     -> Deploying module /etc/puppetlabs/code/environments/production/modules/stdlib
INFO     -> Deploying module /etc/puppetlabs/code/environments/production/modules/roles
[.................]


III.c: Benefits of the split and way forward:

So, our monolithic Git workflow has been divided into smaller units, and with r10k in place we can easily collaborate and publish some modules on a public version control service.
But those are just a few of the benefits. One of the huge improvements we get with r10k is that we can now refer to a single text file to know the exact state of each environment. Also, since we now have one control-repo branch (thus one Puppetfile) per environment, we can pin (while keeping a trace of it) a different version of the same module for each environment.
As an example, let's say I want a specific "banner" module version for the dev environment: I simply get to the dev control-repo branch and modify the Puppetfile entry for banner as shown below. This setting in the dev Puppetfile tells r10k that in the dev environment, the banner module branch I want to use is "dev".


mod 'banner',
  :git    =>  'git@git.stiv.local:puppet/banner.git',
  :ref => 'dev'

There is a set of four options available to specify the module "version" (see the official Puppet documentation for more details):
  • ref: Determines the Git reference to check out. Can be any of a tag, commit, or branch.
  • tag: Clones the repo at a certain tag value.
  • commit: Clones the repo at a certain commit.
  • branch: Clones a specific branch of the repo.
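For instance, assuming a 1.0.0 tag had been pushed to the banner repo, pinning an environment to that exact release would look like this in its Puppetfile:

mod 'banner',
  :git => 'git@git.stiv.local:puppet/banner.git',
  :tag => '1.0.0'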
Our next step will most probably be the optimization of the roles/profiles modules, along with discussions around IDEs (Geppetto), but for now, let's enjoy r10k (or Code Manager :-))...

Reference: