Recent Posts

HowTo Mount LVM Partitions

Find out which LVM partitions you have by running

lvdisplay
and mount the one you need with
mount /dev/vg0/vol1 /mnt
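If you only need a compact overview of volume groups, volume names and device paths, a field selection with lvs can also be handy (a small sketch; it assumes the volume group is already active):
# Compact listing of logical volumes and their device paths
lvs -o vg_name,lv_name,lv_path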
See also: LVM - Cheat Sheet

Is there hope when your Couchbase cluster is stuck in compacting buckets?

Well to be anticlimactic: no.

Scope

This seems to be at least a Couchbase 3.x problem. So far I haven't experienced it with Couchbase 4. For both versions I only have experience with the so-called Community Edition.

As for the frequency: Couchbase 3 getting stuck on bucket compacting is probabilistic. In the setups I've run so far it happens about every half a year, but this might be load-dependent. Having never had the issue on some "smaller" clusters, I think it is.

The Symptoms

If you do not monitor explicitly for the compacting status, you will probably notice it when some nodes' disks run full. Compacting no longer working means the Couchbase disk fragmentation keeps growing and finally fills your disks.

In the admin GUI you will see a constant "Compacting..." indicator in the top right. In normal operation it never takes more than a few minutes to finish (again depending on your usage).
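One way to monitor for this explicitly might be polling the cluster tasks REST endpoint, which also lists running compactions. A rough sketch (host, port 8091 and credentials are the usual admin defaults, adjust as needed):
# Check whether a bucket compaction task is currently listed
curl -s -u <admin>:<password> http://localhost:8091/pools/default/tasks | grep -i compact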

Things that do not work...

What does help...

The root cause

What actually happened is a data structure corruption from which Couchbase 3 does not recover. This is also the reason why flushing buckets helps.

There are several bug reports in Couchbase 2, 3 and 4 about compacting stuck for different reasons. In general Couchbase is not a very stable product in this regard...

How to search Confluence for macro usage

When you want to find all pages in Confluence that embed a certain macro you cannot simply use the search field, as it seemingly only searches the rendered content. A normal search query does not check the markup for the macro code.

To search for a certain macro do a request like this

https://<base url>/dosearchsite.action?cql=macro+%3D+"<macro name>"
So to search for the "sql-query" macro for example do
https://<base url>/dosearchsite.action?cql=macro+%3D+"sql-query"
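The same CQL should also work against the REST API if you prefer scripting the search, e.g. with curl (a sketch, assuming your user may use the API):
curl -u <user>:<password> "https://<base url>/rest/api/content/search?cql=macro%3D%22sql-query%22"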

Solving d3.scale is undefined

When porting older code and examples of d3.js visualizations you might encounter the following exception:

TypeError: d3.scale is undefined
The causing code might be something like:
var xscale = d3.scale.linear().range([0, chartWidth]);
The problem is an API change from d3.scale.linear() to d3.scaleLinear() with d3.js 4.0. So to fix it rewrite the code with
var xscale = d3.scaleLinear().range([0, chartWidth]);

Automatically Download Oracle JDK

When downloading Oracle JDK via scripts you might run into the login page.

While in the past it was sufficient to do something like

wget --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz
you now also need to ignore the HTTPS certificate as you will get redirected. So when using wget you might want to run
wget -c --no-cookies --no-check-certificate --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz
Note the added "-c" and "--no-check-certificate" options.
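If you prefer curl over wget, an equivalent call might look like this (untested sketch using the same URL and cookie):
# -L follows the redirect, -k skips certificate checks, -C - resumes partial downloads
curl -L -C - -k -O -H "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz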

RabbitMQ Does Not Start: init terminating in do_boot

If you have a RabbitMQ cluster and a crashed node fails to start again with

{"init terminating in do_boot",{undef,[{rabbit_prelaunch,start,[]},{init,start_it,1},{init,start_em,1}]}}
in /var/log/rabbitmq/startup_log and something like
Error description:
   {could_not_start,rabbitmq_management,
       {{shutdown,
            {failed_to_start_child,rabbit_mgmt_sup,
                {'EXIT',
                    {{shutdown,
                         [{{already_started,<9180.461.0>},
                           {child,undefined,rabbit_mgmt_db,
                               {rabbit_mgmt_db,start_link,[]},
                               permanent,4294967295,worker,
                               [rabbit_mgmt_db]}}]},
                     {gen_server2,call,
                         [<0.427.0>,{init,<0.425.0>},infinity]}}}}},
        {rabbit_mgmt_app,start,[normal,[]]}}}

Log files (may contain more information): /var/log/rabbitmq/rabbit@<node>.log /var/log/rabbitmq/rabbit@<node>-sasl.log
in /var/log/rabbitmq/rabbit@<node>.log, then you might want to try dropping the node from the cluster by running
rabbitmqctl forget_cluster_node rabbit@<crashed node>
on a working cluster node and rejoining the node by running
rabbitmqctl join_cluster rabbit@rabbit-02
on the disconnected node (given rabbit-02 is a working cluster member).
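Before and after the rejoin it can help to verify the cluster membership. A rough sketch of the full sequence (standard rabbitmqctl commands; depending on your version the app has to be stopped before joining):
# On the disconnected node: stop the app, rejoin, start again
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@rabbit-02
rabbitmqctl start_app
# On any node: verify the cluster membership afterwards
rabbitmqctl cluster_status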

Note: Doing this might make you lose messages!

Match Structured Facts in MCollective

If you are using Facter 2+, which is what you do when you run at least Puppet 4, then you have structured facts (meaning nested values) like these:

processors => {
  count => 2,
  isa => "unknown",
  models => [
    "QEMU Virtual CPU version 2.1.2",
    "QEMU Virtual CPU version 2.1.2"
  ],
  physicalcount => 2
}
Now you cannot match those using
mco find -F <fact name>=<fact value>
If you try, you just get an empty result. The only way to match structured facts is using -S:
mco find -S 'fact("<fact name>").value=<value>'
For example:
mco find -S 'fact("networking.network").value=192.168.5.0'
mco find -S 'fact("os.distro.codename").value=jessie'
See also Mcollective Cheat Sheet

Nagios Check for Systemd Failed Units

Just a short bash script to check for faulty systemd units to avoid 107 lines of Python...

#!/bin/bash

if [ -f /bin/systemctl ]; then
    failed=$(/bin/systemctl --failed --no-legend)
    failed=${failed/ */}        # Strip everything after first space
    failed=${failed/.service/}  # Strip .service suffix

    if [ "$failed" != "" ]; then
        echo "Failed units: $failed"
        exit 1
    else
        echo "No failed units."
        exit 0
    fi
else
    echo "No systemd. Nothing was checked!"
    exit 0
fi

How to fix debsecan for Wheezy

Someone at Debian increased security for all Debian servers a while ago by breaking debsecan for everything before Jessie, moving the vulnerability definitions from

http://secure-testing.debian.net/debian-secure-testing/project/debsecan/release/1/

to

https://security-tracker.debian.org/tracker/debsecan/release/1

Of course there was no way to issue a security fix for Wheezy debsecan...

Workaround 1: Hotfix

So if you still want to scan your Wheezy systems you can hotfix debsecan before running it like this:
sed -i "s/http:\/\/secure-testing.debian.net\/debian-secure-testing/https:\/\/security-tracker.debian.org\/tracker/;s/project\/debsecan\/release\/1\//debsecan\/release\/1\//" /usr/bin/debsecan 

Workaround 2: Pass Config

You can also pass an inline config file using process substitution:
debsecan --config <(echo SOURCE="https://security-tracker.debian.org/tracker/debsecan/release/1/")
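Putting it together, a full scan of a Wheezy host using the second workaround might look like this (a sketch; --suite and --format are standard debsecan options):
debsecan --suite wheezy --format summary --config <(echo SOURCE="https://security-tracker.debian.org/tracker/debsecan/release/1/")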

More Changes after the Dyn DDoS Attack

Looking again at the NS records of all scanned top 500 Alexa domains after the recent Dyn DDoS attack, 13 of the 14 Dyn customers that previously relied solely on Dyn have now either switched away entirely or added additional non-Dyn DNS servers.

Who Switched Entirely

about.com, addthis.com, exoclick.com, github.com/.io, quora.com, speedtest.net, zendesk.com

Who Switched to "Multi"-DNS

etsy.com, paypal.com, shutterstock.com, theverge.com, weebly.com

Details

Here is a summary of the NS record changes. To automatically compare the server names, all numbers in the DNS server names have been stripped in the following table:

Site              Before (15.10.)    After (24.10.)
about.com         ns.p.dynect.net.   dns.p.nsone.net.
addthis.com       ns.p.dynect.net.   matt.ns.cloudflare.com., wanda.ns.cloudflare.com.
etsy.com          ns.p.dynect.net.   ns-.awsdns-.co.uk., ns-.awsdns-.com., ns-.awsdns-.net., ns-.awsdns-.org., ns.p.dynect.net.
exoclick.com      ns.p.dynect.net.   dns.p.nsone.net., ns.p.dynect.net.
github.com        ns.p.dynect.net.   ns-.awsdns-.co.uk., ns-.awsdns-.com., ns-.awsdns-.net., ns-.awsdns-.org.
github.io         ns.p.dynect.net.   ns-.awsdns-.co.uk., ns-.awsdns-.com., ns-.awsdns-.net., ns-.awsdns-.org.
paypal.com        ns.p.dynect.net.   ns.p.dynect.net., pdns.ultradns.com., pdns.ultradns.net.
quora.com         ns.p.dynect.net.   ns-.awsdns-.co.uk., ns-.awsdns-.com., ns-.awsdns-.net., ns-.awsdns-.org.
shutterstock.com  ns.p.dynect.net.   a.verisigndns.com., ns.p.dynect.net.
speedtest.net     ns.p.dynect.net.   ns-.awsdns-.co.uk., ns-.awsdns-.com., ns-.awsdns-.net., ns-.awsdns-.org.
theverge.com      ns.p.dynect.net.   ns-.awsdns-.co.uk., ns-.awsdns-.com., ns-.awsdns-.net., ns-.awsdns-.org., ns.p.dynect.net.
weebly.com        ns.p.dynect.net.   ns-.awsdns-.co.uk., ns-.awsdns-.com., ns-.awsdns-.net., ns-.awsdns-.org., ns.p.dynect.net.
zendesk.com       ns.p.dynect.net.   pdns.ultradns.biz., pdns.ultradns.co.uk., pdns.ultradns.com., pdns.ultradns.info., pdns.ultradns.net., pdns.ultradns.org.
The noteworthy non-changer is Twitter, which is still exclusively at Dyn. Everyone else seems to have mitigated: some by using two providers, most of them switching to AWS DNS, some to UltraDNS exclusively.

Changes after the Dyn DNS Outage

Looking at NS records today, some of yesterday's affected companies decided to change things after the DoS on Dyn. As the NS record is not really a customer-facing feature this is more an indication of Dyn's customers' expectations. One could argue that switching away from Dyn signals fear of more downtimes to come.

Here is a summary of changed NS records so far:
Site           Before (15.10.)   After (22.10.)
about.com      dynect.net        nsone.com
etsy.com       dynect.net        dynect.net, awsdns
github.com     dynect.net        awsdns
paypal.com     dynect.net        dynect.net, ultradns.org
xhamster.com   dynect.net        anycastns*.org
zendesk.com    dynect.net        ultradns.*
speedtest.net  dynect.net        awsdns
I only checked some NS records for changes, just several of the top sites. There are two noteworthy non-changers, Twitter and Github; everyone else seems to have mitigated, some by using two providers, several switching to AWS DNS or UltraDNS exclusively.

Whom you can DDoS via DynDNS

After today's outage of various major websites you might ask who else is affected besides well-known sites such as Amazon, Twitter or Github.

Using the results of a monthly scan I automatically run on the top 500 Alexa sites, it is easy to find out. The only thing you need to know is that the Dyn DNS server domain is dynect.net (detailed results).
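If you want to check a single domain yourself, a quick test could look like this (assuming dig is installed; twitter.com just serves as an example):
# Print the NS records and look for Dyn (dynect.net) name servers
dig +short NS twitter.com | grep -i dynect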

Top Affected Sites

6pm.com about.com adcash.com addthis.com amazon.ca amazon.cn amazon.co.jp amazon.co.uk amazon.com amazon.de amazon.es amazon.fr amazon.in amazon.it answers.com bitly.com businessinsider.com chase.com chip.de disqus.com ebay.co.uk ebay.com ebay.com.au ebay.de ebay.in ebay.it etsy.com evernote.com exoclick.com github.com github.io goodreads.com hostgator.com huffingtonpost.com imdb.com indeed.com indiatimes.com jmpdirect01.com moz.com nytimes.com outbrain.com overstock.com pandora.com paypal.com photobucket.com pornhub.com quora.com redtube.com scribd.com shutterstock.com soundcloud.com speedtest.net stumbleupon.com t.co theguardian.com theverge.com tripadvisor.com trovi.com tube8.com tumblr.com twimg.com twitch.tv twitter.com uploaded.net webmd.com weebly.com wikia.com wix.com xhamster.com youporn.com zendesk.com zillow.com

Probably Different Impact

Note that not all of these sites were equally affected, as some of them, like Amazon, are using multiple DNS providers. The Amazon main domains' NS records point to both Dyn and UltraDNS. In the same way probably none of the major adult sites was down, as they also relied on at least two providers.

So while Amazon users probably got to the website after one DNS timeout and a switch-over to UltraDNS, Twitter and Github users were not so lucky and had to hope for Dyn to respond. It will be interesting to see whether Twitter and Github add a second DNS provider as a result of this.

The Need For "Multi-DNS"

Reading different reports on this incident it seems to me the headlines are focussing on those sites using just Dyn's DNS and not on those having a "Multi-DNS".

Detailed results on who is using which DNS domain can be found in the monthly DNS usage index.

Apply Changes to limits.conf Immediately

See also ulimit - Cheat Sheet

Sometimes you need to increase the open file limit for an application server or the maximum shared memory for your ever-growing master database. In such a case you edit your /etc/security/limits.conf and then wonder how to check whether the changed limits are actually in effect. You do not want to find out that they were wrong when your master DB doesn't come up after some incident in the middle of the night...
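As a reminder, the limits.conf entries themselves look like this (a sketch with a hypothetical user "appuser" and arbitrary values):
# /etc/security/limits.conf
# <domain>   <type>   <item>    <value>
appuser      soft     nofile    4096
appuser      hard     nofile    8192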

Instant Applying Limits to Running Processes

Actually you might want to apply the changes directly to a running process in addition to changing /etc/security/limits.conf. Recent Linux distributions (e.g. Debian Jessie) provide a tool "prlimit" to get/set limits.

Usage for changing limits for a PID is

prlimit --pid <pid> --<limit>=<soft>:<hard>
for example
prlimit --pid 12345 --nofile=1024:2048
If you are unlucky and do not have prlimit yet, check out this instruction to compile your own version, because despite the missing user-space tool the prlimit() system call has been in the kernel for quite a while (since 2.6.36).

Alternative #1: Re-Login with "sudo -i"

If you do not have prlimit yet and want a changed limit configuration to become visible you might want to try "sudo -i". The reason: you need to re-login as limits from /etc/security/* are only applied on login!

But wait: what about users without login? In such a case you log in as root (which might not share their limits) and sudo into the user, so there is no real login as the user. In this case you must ensure to use the "-i" option of sudo:
sudo -i -u <user>
to simulate an initial login with sudo. This will apply the new limits.

Alternative #2: Make it work for sudo without "-i"

Whether you need "-i" depends on the PAM configuration of your Linux distribution. If you need it, then PAM probably loads "pam_limits.so" only in /etc/pam.d/login, which means at login time but not on sudo. This was introduced in Ubuntu Precise, for example. By adding this line

session    required   pam_limits.so
in /etc/pam.d/sudo limits will also be applied when running sudo without "-i". Still using "-i" might be easier.

Finally: Always Check Effective Limits

The best way is to change the limits and check them by running
prlimit               # for current shell
prlimit --pid <pid>   # for a running process
because it shows both soft and hard limits together. Alternatively call
ulimit -a                # for current shell
cat /proc/<pid>/limits   # for a running process
with the affected user.

Sharing Screen With Multiple Users

How to detect screen sessions of other users:

screen -ls <user name>/

How to open screen to other users:

  1. Ctrl-A :multiuser on
  2. Ctrl-A :acladd <user to grant access>

Attach to another user's screen session:

With session name
screen -x <user name>/<session name>
With PID and tty
screen -x <user name>/<pid>.<ptty>.<host>

Linux HTML Rendering Widgets

In 2010 I compiled a summary of HTML rendering widgets useful for embedding in Linux applications. Given recent changes and switching Liferea from Webkit to Webkit2 I felt it was time to post an updated version.

The following table gives a summary of the different HTML renderers, some long gone, some fully maintained:
Name          Toolkit              Platform       Derived From   Driving Force       Active
KHTML         QT                   %              KDE            KDE                 Yes
wxHtml        wxWidgets            GTK, Windows   KHTML          wxWidgets           Yes
GtkHtml       GTK+ 1.0             GNOME 1        KHTML          GNOME 1             No, long gone
GtkHtml2      GTK+ 2.0             GNOME 2        GtkHtml        GNOME 2             No, v2.11: Aug 2007
GtkHtml3      GTK+ 2.0             GNOME 2        GtkHtml        Ximian, Evolution   No, v3.14: May 2008
GtkHtml4      GTK+ 3.0             GNOME 3        GtkHtml        Ximian, Evolution   No, v4.6.6: Jul 2013
GtkMozEmbed   GTK+ 2.0             Gecko          %              Mozilla             No
WebKitGtk     GTK+ 2.0, GTK+ 3.0   Webkit         KHTML/Webkit   Apple Safari        No
WebKitGtk2    GTK+ 3.0             Webkit         Webkit         Apple Safari        Yes
Note: My summary somewhat complements this Wikipedia list. Still it focusses more on Linux renderers and correctly distinguishes the rather mad history of GtkHtml*.

Given the list above one could conclude the only acceptable renderers are KHTML, wxHtml and WebkitGtk, simply based on project activity. Still, other renderers like GtkHtml2 and GtkHtml3 have come a long way and provide limited but stable functionality.

But the important question is: What features are supported by the different renderers?
Name          Widget Embed   Full HTML   CSS          JS   Java/Flash   Editor   MathML
KHTML         y              y           1,2,3        y    y            n        n
wxHtml        y              n           none         n    n            n        n
GtkHtml       y              y           none         n    n            y        n
GtkHtml2      y              y           1,2 inline   n    n            n        n
GtkHtml3      y              y           none         n    n            y        n
GtkHtml4      y              y           none         n    n            y        n
GtkMozEmbed   n              y           1,2,3        y    y            n        y
WebKitGtk     n              y           1,2,3        y    y            n        n
WebKitGtk2    n              y           1,2,3        y    y            n        in work
The feature matrix along with the platform listing explains why a lot of those old renderer libraries are still around. Given you want to render simple markup in an email client you might still choose wxHtml or GtkHtml4, with the latter providing you with an HTML editor for rich mail editing. Of course, when you want to allow your users fully fledged inline browsing you need to use either KHTML or Webkit. If you are developing for GTK you need to use Webkit; if on KDE you will probably use KHTML.

If you find mistakes or have something to add please post a comment!

Hiera EYAML GPG Troubleshooting

When using Hiera + Eyaml + GPG as Puppet configuration backend one can run into a multitude of really bad error messages. The problem here is mostly the obscene layering of libraries, e.g. Eyaml on top of Eyaml-GPG on top of either GPGME or Ruby GPG on top of GnuPG. Most errors originate from or are reported by GnuPG and are badly unspecified.

This post gives some hints on some of the errors.

[hiera-eyaml-core] General error

This is one of the worst errors you can get. One common cause is an expired GPG key. Check for it using
LANG=C gpg -k | grep expired
and remove the expired key with
gpg --delete-key <name>
As the error label indicates this can have other causes. In such a case check out the GPGME Debugging section below.

[hiera-eyaml-core] no such file to load -- hiera/backend/eyaml/encryptors/gpg

If you got this you probably forgot to install the Ruby GEM. Fix it by running
gem install hiera-eyaml-gpg

[hiera-eyaml-core] GPG command (gpg --homedir /home/lars/.gnupg --quiet --no-secmem-warning --no-permission-warning --no-tty --yes --decrypt) failed with: gpg: Sorry, no terminal at all requested - can't get input

This error indicates a problem getting your secret key password. As Eyaml triggers GPG in the background, no password prompt can be issued. So the only way to get one is the GPG agent. In this case it might be dead. Check if one is running:
pgrep -fl gpg-agent
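If none is running you can usually start one by hand; the exact invocation depends on your GnuPG version (with GnuPG 2.1+ the agent is normally auto-started), for example:
# Start a gpg-agent and export its environment (GnuPG 2.0 style)
eval $(gpg-agent --daemon)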

[gpg] !!! Fatal: Failed to decrypt ciphertext (check settings and that you are a recipient) [hiera-eyaml-core] !!! Decryption failed

If you get this error message you might want to check whether you have a matching private key for one of the GPG recipients using
gpg -K

GPGME Debugging

No matter what error message you get: if you cannot solve it, consider enabling debug traces by setting
export GPGME_DEBUG=9
Then run "eyaml" and check the output for sections of "_gpgme_io_read" that indicate the GnuPG responses like this one:
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_run_io_cb: call: item=0x2363d70, handler (0x21abc30, 7)
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_io_read: enter: fd=0x7, buffer=0x238b6c0, count=1024
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_io_read: check: 5b474e5550473a5d 20494e565f524543 [GNUPG:] INV_REC
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_io_read: check: 5020302035444136 3939343530393537 P 0 5DA699450957
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_io_read: check: 3346354543394341 4138413232433134 3F5EC9CAA8A22C14
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_io_read: check: 3846433938453339 374335430a5b474e 8FC98E397C5C.[GN
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_io_read: check: 5550473a5d204641 494c55524520656e UPG:] FAILURE en
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_io_read: check: 6372797074203533 0a               crypt 53.
GPGME 2016-06-16 12:33:55 <0x45b7>    _gpgme_io_read: leave: result=89
If you look past the bad wrapping you see the following info here:
INV_RECP 0 5DA699450957.... FAILURE encrypt 53
Google for those messages and you often get a GnuPG-related result hinting at the cause. The above trace is about an invalid key with fingerprint 5DA699450957.... which you can find by listing your GPG keys and checking for expiration messages.
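To capture just the relevant reads in one go something like the following may help (the secrets file name is only an example):
# Run eyaml with GPGME tracing and keep only the GnuPG responses
GPGME_DEBUG=9 eyaml decrypt -f secrets.eyaml 2>&1 | grep _gpgme_io_read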

Workaround OpenSSH 7.0 Problems

OpenSSH 7+ deprecates the weak key exchange algorithm diffie-hellman-group1-sha1 and DSA public keys for both host and user keys, which leads to the following error messages:

Unable to negotiate with 172.16.0.10 port 22: no matching key exchange method found. Their offer: diffie-hellman-group1-sha1
or a simple permission denied when using a user DSA public key or
Unable to negotiate with 127.0.0.1: no matching host key type found.
Their offer: ssh-dss
when connecting to a host with a DSA host key.

Workaround

Allow the different deprecated features in ~/.ssh/config
Host myserver
  # To make pub ssh-dss keys work again
  PubkeyAcceptedKeyTypes +ssh-dss

  # To make host ssh-dss keys work again
  HostkeyAlgorithms +ssh-dss

  # To allow weak remote key exchange algorithm
  KexAlgorithms +diffie-hellman-group1-sha1
Alternatively pass those three options using -o. For example, to allow the key exchange when running SSH:
ssh -oKexAlgorithms=+diffie-hellman-group1-sha1 <host>

Solution

Replace all your DSA keys to avoid keys suddenly no longer working. And upgrade all SSH versions to avoid offering legacy key exchange algorithms.
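For example, a replacement user key pair could be generated and installed roughly like this (key size and paths are just suggestions):
# Generate a new RSA key and install the public part on the remote host
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub <host>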