May 16 2021

MySQL / PostgreSQL on iSCSI Fail to Start at Boot

Published under Linux, MySQL, PostgreSQL

You are hosting a MySQL or PostgreSQL data directory on iSCSI disks, but the service fails to start at boot because it cannot find the data directory. You can however start the service manually if you log on to the server once SSH is available.


These are logs for MariaDB, but they would be similar for MySQL:

mariadbd[795]: 0 [Note] /usr/sbin/mysqld (mysqld 10.5.9-MariaDB-1:10.5.9+maria~buster-log) starting as process 795 ...
mariadbd[795]: 0 [Warning] Can't create test file /opt/db/data/database_server.lower-test
mariadbd[795]: /usr/sbin/mysqld: Cannot change dir to '/var/lib/mysql/data/' (Errcode: 2 "No such file or directory")
mariadbd[795]: 0 [ERROR] Aborting
systemd[1]: mariadb.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: mariadb.service: Failed with result 'exit-code'.
systemd[1]: Failed to start MariaDB 10.5.9 database server.


You may also get logs similar to these for PostgreSQL. I stored PostgreSQL on an XFS partition hosted on LVM for more flexible disk space management, such as adding space while the filesystem is mounted. This is probably the main reason why iSCSI disks are popular.
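
As a sketch of what that flexibility looks like (the volume group and logical volume names are illustrative), space can be added online with a single command, which also grows the mounted XFS filesystem:

lvextend -r -L +10G /dev/vg_data/lv_pgsql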

systemd: mounting /var/lib/pgsql
starting PostgreSQL database server
sd 2:0:0:0: [sdb] attached SCSI disk
xfs (dm-4): Mounting V4 Filesystem
postgresql-check-db-dir: "/var/lib/pgsql/data" is missing or empty
postgresql.service: control process exited, code=exited status=1
Failed to start PostgreSQL database server.


The database starting manually after boot indicates the problem most likely lies in the boot order of services: databases should start only after the iSCSI disks are made available. You can solve this issue by adding "After=remote-fs.target" to the systemd service file, such as /usr/lib/systemd/system/postgresql-9.5.service for PostgreSQL. This is how systemd manages service ordering and dependencies.

Note that you may lose these changes the next time the package is upgraded. Systemd lets you create an extra file where you can define your own settings; upgrades will never modify it since this is your own file.
Just create /etc/systemd/system/mariadb.service.d/override.conf as follows for MariaDB; the process is the same for MySQL or PostgreSQL:

[Service]
Environment="UMASK_DIR=0750"

[Unit]
After=remote-fs.target


In this file, I also changed the default data directory permissions so users in the mysql group can traverse it.
Run systemctl daemon-reload so systemd takes the new settings into account, then reboot the server. The database service should now start.
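
A minimal sketch of the commands, assuming the unit is mariadb.service (adapt the name for MySQL or PostgreSQL):

mkdir -p /etc/systemd/system/mariadb.service.d
# edit override.conf as shown above, then:
systemctl daemon-reload
systemctl reboot

Alternatively, systemctl edit mariadb creates and opens that same override.conf for you.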

 


May 07 2021

Nginx Behind Reverse Proxy 301 https to http Redirect When URL has no Trailing Slash

Published under Nginx

An Nginx web server hosted on Kubernetes was sending me permanent 301 redirects to http although I was sending https requests. The web server was behind a reverse proxy also running Nginx, but it would be the same story with HAProxy or another. Worse, the reverse proxy was redirecting http requests to https (this is normal behaviour) but without a trailing slash, creating a loop!



Whenever I tried to connect to https://mydomain/app, I got a redirect to http://mydomain/app/. And the other way around… A simple curl confirmed the redirect:

$ curl -v https://mydomain/app
[...]
 > 
 * Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
 < HTTP/2 301 
 < date: Thu, 06 May 2021 08:56:48 GMT
 < content-type: text/html
 < content-length: 170
 < location: http://mydomain/app/
 [...]
 <html>
 <head><title>301 Moved Permanently</title></head>
 <body>
 <center><h1>301 Moved Permanently</h1></center>
 <hr><center>nginx/1.19.10</center>
 </body>
 </html>


Nginx adds a trailing slash because that’s its default behaviour, as described in the Nginx documentation:

“In response to a request with URI equal to this string, but without the trailing slash, a permanent redirect with the code 301 will be returned to the requested URI with the slash appended.” If this is not desired, an exact match of the URI and location can be defined with 2 blocks:

location /app/ {
    [...]
}

location = /app {
    [...]
}


Right… but this cannot be done for the root directory, and you don’t want to do it for every single directory. Don’t expect developers to give you a ring every time they create a new one!


Accept URIs with and without a Trailing Slash

Nginx provides another way to accept URLs with or without a trailing slash, short of duplicating location blocks.
try_files lets you do that: it checks for the existence of files in the specified order and uses the first one found to process the request.
It is even possible to check for a directory’s existence by appending a slash to $uri, which gives the following configuration:

location / {
    try_files $uri $uri/index.html $uri/ =404;
}


Sounds perfect, but there’s a catch. Once I had reloaded Nginx, the web browser could no longer fetch the CSS and Javascript links.
Checking the code returned to the browser, CSS links had been changed from /css/my_file.css to /my_file.css.

The problem is that try_files serves the file $uri/index.html but leaves the URI as /app, so any relative URI is resolved against / (the parent directory) and not /app/.

OK, that works, but I would need to change all the links to absolute paths. From a developer’s point of view, I understand this is not something they want.


Relative Redirects

We want something cleaner, but we can’t prevent Nginx’s default behaviour of appending a trailing slash.
Let’s focus on the https to http redirect instead. This is where the main problem lies, and most browsers deny that kind of redirection… which is why the problem might be visible with curl but not in most browsers.

Nginx redirects to the full (absolute) URL with a trailing slash. The definitive solution is to make the redirect relative:

absolute_redirect off;
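
The directive is valid at http, server and location level (Nginx 1.11.8 and later). A minimal sketch in server context, with illustrative names and paths:

server {
    listen 8080;
    # return relative Location headers such as /app/
    absolute_redirect off;

    location /app/ {
        root /usr/share/nginx/html;
    }
}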


Reload nginx and run another test with curl:

$ curl -v https://mydomain/app
[...]
 > 
 * Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
 < HTTP/2 301 
 < date: Thu, 06 May 2021 08:56:48 GMT
 < content-type: text/html
 < content-length: 170
 < location: /app/
 [...]
 <html>
 <head><title>301 Moved Permanently</title></head>
 <body>
 <center><h1>301 Moved Permanently</h1></center>
 <hr><center>nginx/1.19.10</center>
 </body>
 </html>


The redirect location was http://mydomain/app/ and is now /app/.
This way, the client simply reconnects with the protocol it used in the first request.

 


Apr 28 2021

Show All Available Icons in Diagram as Code

Published under Python

Diagram as Code by Mingrammer lets you draw beautiful diagrams and offers a great many icons. It is just frustrating to dig through all the subdirectories to see which icons are available. I decided to write a piece of code that imports the classes from the Diagram as Code package’s submodules.
I then gather each provider’s submodules in a cluster – which shows up as a blue area – and the classes in green. A class matches a specific icon.

You will find in this image all the icons available for your diagrams as code. Click to expand the image at a bigger resolution, and right-click to save it to your desktop.

Diagram as code all available icons


Here is the source code that generates the above image; you can adapt it to your needs to get all the latest icons.
You might want to show icons for a specific provider, or generate separate images for each (see the sketch after the script).

I had to link icons with one another so they display horizontally rather than as a single unending vertical column. This is what the recursive function display_icons is for. For some reason, the recursion works when removing the last element, but not the first.

#!/usr/bin/env python3
 
import importlib
import pkgutil
import sys, inspect
 
import diagrams
from diagrams import Cluster, Diagram, Edge
 
def get_modules(module):
   path_list = []
   spec_list = []
   for importer, modname, ispkg in pkgutil.walk_packages(module.__path__):
      import_path = f"{module.__name__}.{modname}"
      if ispkg:
         spec = pkgutil._get_spec(importer, modname)
         importlib._bootstrap._load(spec)
         spec_list.append(spec)
      else:
         path_list.append(import_path)
   return path_list
 
def get_classes(module):
   class_list = []
   for name, obj in inspect.getmembers(module,inspect.isclass):
      if not name.startswith('_'):
         class_list.append([name,obj]) 
   return class_list
 
def add_module_to_provider_list(providers, module):
   # eg diagrams.azure.database
   # add "database" to array linked to key "azure"
   (diagram,provider,pclass) = module.split('.')
   if provider not in providers:
      providers[provider] = [pclass]
   else:
      providers[provider].append(pclass)
 
def get_provider_list(providers):
   return providers.keys()
 
def get_provider_classes(providers, provider):
   return providers[provider]
 
def display_icons(class_list):
   if len(class_list) < 1: return
   if len(class_list) == 1: return class_list[0][1](class_list[0][0])
   length = len(class_list)
   return display_icons(class_list[:-1]) - class_list[length-1][1](class_list[length-1][0])
 
if __name__ == "__main__":
   providers = {}
   modules = get_modules(diagrams)
 
   for module in modules:
      # Module exception
      # /usr/local/lib/python3.8/site-packages/diagrams/oci/database.py
      # NameError: name 'AutonomousDatabase' is not defined
      if module not in ['diagrams.oci.database']:
         add_module_to_provider_list(providers, module)
 
   with Diagram("all_Icons", show=False, outformat="png"):
      for provider in get_provider_list(providers):
         with Cluster(provider):
            for pclassname in get_provider_classes(providers, provider):
               with Cluster(pclassname):
                  classes = get_classes(importlib.import_module('diagrams.'+provider+'.'+pclassname))
                  display_icons(classes)
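
As a sketch of the per-provider variant mentioned above – one image per provider instead of a single big one, with an illustrative file name – the final loop could become:

   for provider in get_provider_list(providers):
      with Diagram("icons_" + provider, show=False, outformat="png"):
         for pclassname in get_provider_classes(providers, provider):
            with Cluster(pclassname):
               classes = get_classes(importlib.import_module('diagrams.'+provider+'.'+pclassname))
               display_icons(classes)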
 


Apr 13 2021

Make Ansible 6X Faster with these 3 Tips

Published under Ansible

Ansible is generally slow because it connects to the remote host for every task it runs. Let’s run some checks on a tiny role that gets the latest kubectl version (the Kubernetes client) from a URL and installs the binary.
Let’s look at 3 easy ways to speed up Ansible and get better execution times and overall performance.

- name: get kubectl last version
  uri:
    url: "{{kubectl_url}}/stable.txt"
    return_content: yes
    status_code: 200
  register: kubectl_latest_version

- name: Download kubectl binary
  get_url:
    url: "{{kubectl_url}}/v{{ kubectl_version |
      default(kubectl_latest_version.content |
      regex_replace('^v', '')) }}/bin/{{kubectl_os}}/{{kubectl_arch}}/kubectl"
    dest: "/usr/local/bin"
    mode: '755'
    owner: "{{ kubectl_owner }}"
    group: "{{ kubectl_group }}"


Execution time with default settings: 32.2s



Ansible Fact Cache

Gathering facts is the first task Ansible runs when connecting to a host and, to say the least, it is slow. Very slow. This is also an issue when you are writing a playbook and testing it over and over.

Luckily, it can be tweaked to save some time by adding these lines in ansible.cfg:

# implement fact caching and a smaller subset of facts gathered 
# for improved performance

gathering = smart
gather_subset = !hardware,!facter,!ohai

fact_caching_connection = /tmp/ansible_fact_cache
fact_caching = jsonfile

# expire the fact cache after 2 hours
fact_caching_timeout = 7200


According to the ansible.cfg example available online, smart gathering “gathers by default, but doesn’t regather if already gathered”.

Hardware facts are the longest to retrieve, but you may need them, especially if you build roles based on network interfaces; without them you may get errors such as “ansible_eth0 is undefined”.
Facter and ohai are related to the Puppet and Chef clients.

The most efficient setting is of course fact caching, which here stores the information in a JSON file. The cache could also live in memory, or even in a shared Redis database.
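
As a sketch, the Redis backend would replace the jsonfile lines in ansible.cfg (the host:port:db connection string is illustrative, and it requires the redis Python package):

fact_caching = redis
fact_caching_connection = localhost:6379:0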

Facts can also be disabled within a playbook if you don’t need them there. That’s a potentially significant speed-up, though one you can’t use too often: most playbooks need facts.

- name: my_playbook
  hosts: all
  gather_facts: no
  tasks:


Execution time: 19.2s


SSH Speedup with Pipelining

Enabling pipelining reduces the number of SSH operations required to execute a module on the remote server. This improves performance, but ‘requiretty’ must first be disabled in /etc/sudoers on all managed hosts, which is why pipelining is disabled by default. Add in ansible.cfg:

[ssh_connection]
pipelining = True


If requiretty is set, sudo will only run when the user is logged in to a real tty: sudo can then only be run from a login session and not via other means such as cron or cgi-bin scripts. However, this flag is off by default according to the man page, on Debian and Ubuntu at least.
It is safe to use pipelining in this case. Check your Linux distro’s sudo man page.
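
A quick way to check on a managed host (a sketch; sudoers layouts vary across distros):

sudo grep -R requiretty /etc/sudoers /etc/sudoers.d/
# if it shows up, disable it through visudo:
# Defaults !requiretty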

Execution time: 11.6s

Delegate_to localhost

Most improvements depend on the way you write Ansible tasks. In this role, you could connect to the URL from any host – localhost? – and spare an SSH connection.
This is the purpose of delegate_to, which is usually set to localhost. The first task becomes:

- name: get kubectl last version
  delegate_to: localhost
  become: false
  uri:
    url: "{{kubectl_url}}/stable.txt"
    return_content: yes
    status_code: 200
  register: kubectl_latest_version


This is a non-negligible optimisation that you can use any time the task can run anywhere.

You’d better set become: false, or you might get this error message when Ansible tries to sudo as root on your local host:

fatal: [backup]: FAILED! => {"changed": false, "module_stderr": "sudo: a password is required\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}


Execution time: 4.7s


Execution times are the mean of 10 runs in a row, and tests were conducted through a VPN link that was far from good.

Of course, results are not linear; not every playbook will run 6 times faster, but you get the idea. The fact cache saves a few seconds at the start of the playbook, while delegation to localhost only applies to a small set of cases.
There are other possible improvements to speed up Ansible, such as async tasks that launch a task and move on immediately to the next one (see the sketch below), but your best bet is to run Ansible on a server as close as possible to the target hosts, because a slow SSH connection is often the real culprit.
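
A minimal async sketch (the task name and script path are hypothetical):

- name: trigger a long-running job
  command: /usr/local/bin/long_job.sh
  async: 600   # let it run for up to 10 minutes
  poll: 0      # fire and forget, move on to the next task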

 


Apr 01 2021

Build Ansible Inventory from IBM Cloud Resources

Published under Ansible, Python

IBM Cloud relies on the SoftLayer API – also called IBM Classic Infrastructure – to automate things. The API is available in several languages such as Python, Go, Java and PHP, and can be used to build an Ansible inventory.


Here I will automatically generate a host inventory collected from my IBM Cloud account, directly usable by Ansible, along with datacenter groups. You may also want to sort servers into all kinds of categories such as Databases, MySQL, Linux, Web, etc. No worries, we can do that too in a really simple way.
This is achieved by adding tags to hosts on the IBM Cloud portal: a host can have multiple tags and will belong to the Ansible groups matching those tags.

Once these groups are declared in the inventory file, you can apply Ansible playbooks such as “install the Apache package on servers that belong to the Web hostgroup”.


You first need to put your API key into a shell file that you execute before running the script. The key could be in the script itself, but a separate file allows each user to keep their own. The API key can be generated or found on the IBM Cloud portal.

export SL_USERNAME=username
export SL_API_KEY=a1b2c3[...]3456


Run . ./IBM_SL_env.sh first to set the username and API key as environment variables for the following Python script.
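
The script relies on the SoftLayer package; if it is not installed yet, something like this should do (SoftLayer is the name published on PyPI):

pip3 install SoftLayer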

#!/usr/bin/env python3  
import os
import SoftLayer
 
HOST_VARS_DIR  = "host_vars"
INVENTORY_FILE = "inventory"
 
class Inventory:
   def __init__(self):
      # Variables
      self.categories = {}
      self.datacenter = {}
      self.servers = {}
 
      # Create Softlayer connection
      self.client = SoftLayer.create_client_from_env()
 
      # Init Methods
      self.get_server_list()
      self.write_inventory()
 

   def short_host_name(self, host):
      return host['fullyQualifiedDomainName'][:host['fullyQualifiedDomainName'].find('.mydomain.')]
 

   def add_host_to_cat(self, host, cat):
      if cat == "ibm-kubernetes-service": cat = "kube"
      if cat not in self.categories:
         self.categories[cat] = [host]
      else:
         self.categories[cat].append(host)
 

   def add_server_to_list(self, host):
      try:
         host["primaryBackendIpAddress"]
      except KeyError:
         pass
      else:
         host["ShortHostname"] = self.short_host_name(host)
             
         # Build server Categories list
         if host["tagReferences"] != []:
            for tagRef in host["tagReferences"]:
               self.add_host_to_cat(host["ShortHostname"], tagRef["tag"]["name"].strip())
             
         # Build datacenter lists
         if host["datacenter"]["name"] not in self.datacenter:
            self.datacenter[host["datacenter"]["name"]] = [host["ShortHostname"]]
         else:
            self.datacenter[host["datacenter"]["name"]].append(host["ShortHostname"])
             
         # Build server attribute list
         serverAttributes = {}
         serverAttributes['IP'] = host["primaryBackendIpAddress"]
         self.servers[host["ShortHostname"]] = serverAttributes
     

   def get_server_list(self):
      object_mask = "mask[id,fullyQualifiedDomainName," \
                    "primaryBackendIpAddress," \
                    "tagReferences[tag[name]]," \
                    "datacenter]"
 
      # Get virtual server list
      mgr = SoftLayer.VSManager(self.client)
      for vs in mgr.list_instances(mask=object_mask):
         self.add_server_to_list(vs)
 
      # Get bare metal servers
      hsl = SoftLayer.HardwareManager(self.client)
      for hs in hsl.list_hardware(mask=object_mask):
         self.add_server_to_list(hs)
 

   def write_inventory(self):
      # host_vars structure
      if not os.path.exists(HOST_VARS_DIR):
         os.makedirs(HOST_VARS_DIR)
 
      inventoryFile = open(INVENTORY_FILE,"w")
 
      for cat in self.categories.keys():
         if cat != "kube":
            inventoryFile.write("[" + cat + "]\n")
            for host in self.categories[cat]:
               # write host vars
               inventoryFile.write(host + "\n")
               file = open(HOST_VARS_DIR+"/"+host,"w")
               file.write("ansible_host: " + self.servers[host]['IP'] + "\n")
               file.close()        
            inventoryFile.write("\n")
      for dc in self.datacenter.keys():
         inventoryFile.write("[" + dc + "]\n")
         for host in self.datacenter[dc]:
            if not host.startswith("kube"):
               inventoryFile.write(host + "\n")
         inventoryFile.write("\n")
      inventoryFile.close()

if __name__ == "__main__":
   Inventory()


A few notes:
Managed Kubernetes nodes are visible in IBM Classic Infrastructure like any other VM (VSI in IBM terminology), but you cannot connect to them. They’re added to the “kube” category, which is then ignored.

A distinct file is created in host_vars for every server, containing its IP address. You could of course add more variables to it, such as iSCSI settings, etc.
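
As an illustration (the host name and IP address are hypothetical), the generated files look like this:

# inventory
[web]
webserver01

# host_vars/webserver01
ansible_host: 10.0.0.12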

The script collects virtual machines as well as bare metal machines. You can comment out either section if you don’t need it, saving one API call to IBM.

Now you can build an Ansible inventory from IBM Cloud hosts on the fly in less than 10 seconds.

 
