Analysis And Usage of SSHGuard

This post has moved to eklausmeier.goip.de/blog/2021/02-28-analysis-and-usage-of-sshguard.

To ban annoying ssh access attempts to your Linux box you can use fail2ban or, alternatively, SSHGuard. SSHGuard’s installed size is 1.3 MB on Arch Linux. Its source code, including all C files, headers, manuals, configuration, and makefiles, is 8 KLines. In contrast, the Python source code alone of fail2ban version 0.11.2 is 31 KLines, not counting configuration files, manuals, and text files; its installed size is 3.3 MB. fail2ban is also much slower than SSHGuard: on one machine, for example, fail2ban used 7 minutes of CPU time where SSHGuard used 11 seconds. I have written on fail2ban in “Blocking Network Attackers”, “Chinese Hackers”, and “Blocking IP addresses with ipset”.

SSHGuard is a package in Arch Linux, and there is a Wiki page on it.

1. Internally SSHGuard maintains three lists:

  1. whitelist: allowed IP addresses, given by configuration
  2. blocklist: list of IP addresses which are blocked, but which can become unblocked after some time, in-memory only
  3. blacklist: permanently blocked IP addresses, stored in cleartext in file

SSHGuard’s main function is summarized in the excerpt below from its shell script /bin/sshguard.

eval $tailcmd | $libexec/sshg-parser | \
    $libexec/sshg-blocker $flags | $BACKEND &
wait

The pipeline consists of four programs. Each does one well-defined job and hands its output via stdout to the stdin of the next program. Each program stays in an infinite loop.

  1. $tailcmd reads the log, for example via tail -f, which might contain the offending IP address
  2. sshg-parser parses stdin for offending IP addresses
  3. sshg-blocker decides which IP addresses to block or release and writes the corresponding commands
  4. $BACKEND is a firewall shell script which uses iptables, ipset, nft, etc.

In addition to writing to stdout, sshg-blocker also writes to a file, usually /var/db/sshguard/blacklist.db. This is the blacklist file. Its content looks like this:

1613412470|100|4|39.102.76.239
1613412663|100|4|62.210.137.165
1613415749|100|4|39.109.122.173
1613416009|100|4|80.102.214.209
1613416139|100|4|106.75.6.234
1613418135|100|4|42.192.140.183

The first entry is the time in time_t format, the second entry is the service, in our case always 100=ssh, the third entry is either 4 for IPv4 or 6 for IPv6, and the fourth entry is the blocked IP address.
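
The record layout just described can be read with a few lines of Python. This is only a sketch for inspecting the file; the names BlacklistEntry and parse_blacklist_line are made up for this illustration and are not part of SSHGuard.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class BlacklistEntry(NamedTuple):
    blocked_at: datetime  # first field, time_t seconds since the epoch
    service: int          # second field, e.g. 100 = ssh
    addr_kind: int        # third field, 4 = IPv4, 6 = IPv6
    address: str          # fourth field, the blocked IP address


def parse_blacklist_line(line: str) -> BlacklistEntry:
    """Split one 'time|service|kind|address' record into typed fields."""
    timestamp, service, addr_kind, address = line.strip().split("|")
    return BlacklistEntry(
        datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
        int(service),
        int(addr_kind),
        address,
    )


entry = parse_blacklist_line("1613412470|100|4|39.102.76.239")
print(entry.blocked_at.isoformat(), entry.service, entry.addr_kind, entry.address)
```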

SSHGuard handles the following services:

enum service {
    SERVICES_ALL            = 0,    //< anything
    SERVICES_SSH            = 100,  //< ssh
    SERVICES_SSHGUARD       = 110,  //< SSHGuard
    SERVICES_UWIMAP         = 200,  //< UWimap for imap and pop daemon
    SERVICES_DOVECOT        = 210,  //< dovecot
    SERVICES_CYRUSIMAP      = 220,  //< cyrus-imap
    SERVICES_CUCIPOP        = 230,  //< cucipop
    SERVICES_EXIM           = 240,  //< exim
    SERVICES_SENDMAIL       = 250,  //< sendmail
    SERVICES_POSTFIX        = 260,  //< postfix
    SERVICES_OPENSMTPD      = 270,  //< OpenSMTPD
    SERVICES_COURIER        = 280,  //< Courier IMAP/POP
    SERVICES_FREEBSDFTPD    = 300,  //< ftpd shipped with FreeBSD
    SERVICES_PROFTPD        = 310,  //< ProFTPd
    SERVICES_PUREFTPD       = 320,  //< Pure-FTPd
    SERVICES_VSFTPD         = 330,  //< vsftpd
    SERVICES_COCKPIT        = 340,  //< cockpit management dashboard
    SERVICES_CLF_UNAUTH     = 350,  //< HTTP 401 in common log format
    SERVICES_CLF_PROBES     = 360,  //< probes for common web services
    SERVICES_CLF_LOGIN_URL  = 370,  //< CMS framework logins in common log format
    SERVICES_OPENVPN        = 400,  //< OpenVPN
    SERVICES_GITEA          = 500,  //< Gitea
};

2. A typical configuration file might look like this:

LOGREADER="LANG=C /usr/bin/journalctl -afb -p info -n1 -t sshd -o cat"
THRESHOLD=10
BLACKLIST_FILE=10:/var/db/sshguard/blacklist.db
BACKEND=/usr/lib/sshguard/sshg-fw-ipset
PID_FILE=/var/run/sshguard.pid
WHITELIST_ARG=192.168.178.0/24

Furthermore, one has to add the lines below to /etc/ipset.conf:

create -exist sshguard4 hash:net family inet
create -exist sshguard6 hash:net family inet6

Also, /etc/iptables/iptables.rules and /etc/iptables/ip6tables.rules each need a rule that links them to the corresponding ipset, respectively:

-A INPUT -m set --match-set sshguard4 src -j DROP
-A INPUT -m set --match-set sshguard6 src -j DROP

3. Firewall script sshg-fw-ipset, called “BACKEND”, is essentially:

fw_init() {
    ipset -quiet create -exist sshguard4 hash:net family inet
    ipset -quiet create -exist sshguard6 hash:net family inet6
}

fw_block() {
    ipset -quiet add -exist sshguard$2 $1/$3
}

fw_release() {
    ipset -quiet del -exist sshguard$2 $1/$3
}

...

while read -r cmd address addrtype cidr; do
    case $cmd in
        block)
            fw_block "$address" "$addrtype" "$cidr";;
        release)
            fw_release "$address" "$addrtype" "$cidr";;
        flush)
            fw_flush;;
        flushonexit)
            flushonexit=YES;;
        *)
            die 65 "Invalid command";;
    esac
done

sshg-blocker drives the “BACKEND” by printing block and release commands to stdout as follows:

static void fw_block(const attack_t *attack) {
    unsigned int subnet_size = fw_block_subnet_size(attack->address.kind);

    printf("block %s %d %u\n", attack->address.value, attack->address.kind, subnet_size);
    fflush(stdout);
}

static void fw_release(const attack_t *attack) {
    unsigned int subnet_size = fw_block_subnet_size(attack->address.kind);

    printf("release %s %d %u\n", attack->address.value, attack->address.kind, subnet_size);
    fflush(stdout);
}
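
To make the line protocol between sshg-blocker and the “BACKEND” concrete, here is a small Python sketch that turns one command line into the ipset call that fw_block or fw_release in the shell script would issue. The function name ipset_command is hypothetical, the subnet size 32 in the example is an assumed value, and the flush and flushonexit commands are deliberately left out. The sketch only builds the argument list, it does not run ipset.

```python
def ipset_command(line):
    """Translate one 'block'/'release' line into an ipset argument list."""
    cmd, address, addr_kind, cidr = line.split()
    # Any other command raises KeyError; flush/flushonexit are not covered here.
    action = {"block": "add", "release": "del"}[cmd]
    return ["ipset", "-quiet", action, "-exist",
            "sshguard" + addr_kind, address + "/" + cidr]


# Example line in the format printed by fw_block(); 32 is an assumed subnet size.
print(" ".join(ipset_command("block 39.102.76.239 4 32")))
```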

SSHGuard uses the list implementation SimCList by Michele Mazzucchi.

4. sshg-parser uses flex (=lex) and bison (=yacc) for evaluating log-messages. An introduction to flex and bison is here. Tokenization for ssh using flex is:

"Disconnecting "[Ii]"nvalid user "[^ ]+" "           { return SSH_INVALUSERPREF; }
"Failed password for "?[Ii]"nvalid user ".+" from "  { return SSH_INVALUSERPREF; } 

Actions based on these tokens are defined using bison:

%token SSH_INVALUSERPREF SSH_NOTALLOWEDPREF SSH_NOTALLOWEDSUFF

msg_single:
    sshmsg            { attack->service = SERVICES_SSH; }
  | sshguardmsg       { attack->service = SERVICES_SSHGUARD; }
  . . .
  ;

/* attack rules for SSHd */
sshmsg:
    /* login attempt from non-existent user, or from existent but non-allowed user */
    ssh_illegaluser
    /* incorrect login attempt from valid and allowed user */
  | ssh_authfail
  | ssh_noidentifstring
  | ssh_badprotocol
  | ssh_badkex
  ;

ssh_illegaluser:
    /* nonexistent user */
    SSH_INVALUSERPREF addr
  | SSH_INVALUSERPREF addr SSH_ADDR_SUFF
    /* existent, unallowed user */
  | SSH_NOTALLOWEDPREF addr SSH_NOTALLOWEDSUFF
  ;
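
The interplay of the two flex rules with the grammar rule SSH_INVALUSERPREF addr can be approximated in Python. This is a rough sketch, not a faithful port: the regular expressions are hand translations of the flex patterns, the first matching pattern wins (flex would choose the longest match), the ipaddress module stands in for the addr rule, and the log lines are illustrative examples. The name match_illegal_user is made up.

```python
import ipaddress
import re

# Hand-translated versions of the two flex patterns for SSH_INVALUSERPREF.
PREFIXES = [
    re.compile(r"Disconnecting [Ii]nvalid user [^ ]+ "),
    re.compile(r"(?:Failed password for )?[Ii]nvalid user .+ from "),
]


def match_illegal_user(message):
    """Return the attacker address for 'SSH_INVALUSERPREF addr', else None."""
    for pattern in PREFIXES:
        found = pattern.match(message)
        if not found:
            continue
        # The word right after the prefix must be a valid IPv4/IPv6 address.
        candidate = message[found.end():].split(" ", 1)[0]
        try:
            return str(ipaddress.ip_address(candidate))
        except ValueError:
            continue
    return None


print(match_illegal_user("Failed password for invalid user admin from 192.0.2.1 port 22 ssh2"))
print(match_illegal_user("Accepted password for root from 192.0.2.1 port 22 ssh2"))
```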

Once an attack is noticed, it is just printed to stdout:

static void print_attack(const attack_t *attack) {
    printf("%d %s %d %d\n", attack->service, attack->address.value,
           attack->address.kind, attack->dangerousness);
}
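
The line written by print_attack() has four space-separated fields. A minimal Python sketch of writing and reading such a line follows; the class Attack and both functions are names chosen for this illustration, and the dangerousness value 10 is an assumed example value.

```python
from dataclasses import dataclass


@dataclass
class Attack:
    service: int        # e.g. 100 = ssh
    address: str        # offending IP address
    addr_kind: int      # 4 = IPv4, 6 = IPv6
    dangerousness: int  # score attached to this single attack


def format_attack(attack):
    """Mirror the printf format '%d %s %d %d' of print_attack()."""
    return "%d %s %d %d" % (attack.service, attack.address,
                            attack.addr_kind, attack.dangerousness)


def parse_attack(line):
    """Read one such line back, as a consumer of the parser's output would."""
    service, address, addr_kind, dangerousness = line.split()
    return Attack(int(service), address, int(addr_kind), int(dangerousness))


print(format_attack(Attack(100, "39.102.76.239", 4, 10)))
```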

5. For exporting fail2ban’s blocked IP addresses to SSHGuard one would use the SQL below:

select ip from (select ip from bans union select ip from bips)

to extract from /var/lib/fail2ban/fail2ban.sqlite3.
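
A possible way to run this query and write the result in the blacklist format shown earlier is sketched below in Python. It is an untested assumption that SSHGuard accepts entries produced this way; the function name export_blacklist_lines is made up, and the demo uses an in-memory database with simplified stand-ins for the bans and bips tables instead of the real fail2ban.sqlite3.

```python
import sqlite3
import time

QUERY = "select ip from (select ip from bans union select ip from bips)"


def export_blacklist_lines(conn, now=None):
    """Return 'time|100|kind|ip' lines for every IP address known to fail2ban."""
    stamp = int(time.time()) if now is None else now
    lines = []
    for (ip,) in conn.execute(QUERY):
        kind = 6 if ":" in ip else 4  # crude IPv4/IPv6 distinction
        lines.append("%d|100|%d|%s" % (stamp, kind, ip))
    return lines


# Demo with simplified tables; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.execute("create table bans (ip text)")
conn.execute("create table bips (ip text)")
conn.execute("insert into bans values ('192.0.2.7')")
conn.execute("insert into bips values ('192.0.2.7')")
conn.execute("insert into bips values ('2001:db8::1')")
for line in export_blacklist_lines(conn, now=1613412470):
    print(line)
```

The union in the query removes duplicates, so an address present in both tables is exported only once.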

6. To unblock an IP address that was blocked inadvertently, simply issue

ipset del sshguard4 <IP-address>

if you are using ipset as “BACKEND”. If this IP address is also present in the blacklist, you have to delete it there as well. For that, you must stop SSHGuard first.
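
Removing an address from the blacklist file amounts to dropping every record whose last field equals that address. A small Python sketch, with the hypothetical helper name drop_address, operating on lines already read from the file while SSHGuard is stopped:

```python
def drop_address(lines, address):
    """Keep only blacklist records whose last '|' field differs from address."""
    return [line for line in lines
            if line.rstrip("\n").split("|")[-1] != address]


records = [
    "1613412470|100|4|39.102.76.239\n",
    "1613412663|100|4|62.210.137.165\n",
]
print(drop_address(records, "62.210.137.165"))
```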

ssh as SOCKS server

This post has moved to eklausmeier.goip.de/blog/2021/02-21-ssh-as-socks-server.

Assume three computers A, B, and C. A can connect to B via ssh and B can connect to C, but A cannot connect to C directly.

A -> B -> C

On A open ssh as SOCKS-server with

ssh -N -D 9020 user@B

Now on A one can use

brave --proxy-server="socks5://localhost:9020"

The browser then appears to be surfing directly from B, thereby circumventing the limitations on A.

Instead of the Brave browser, one can use Chromium or Firefox. Option “-N” means: do not execute a remote command. Option “-D 9020” opens a local SOCKS proxy on port 9020 (dynamic port forwarding).

See How to Set up SSH SOCKS Tunnel for Private Browsing, or SOCKS.

Poisson Log-Normal Distributed Random Numbers

This post has moved to eklausmeier.goip.de/blog/2021/02-09-poisson-log-normal-distributed-random-numbers.

Task at hand: Generate random numbers which follow a lognormal distribution, where the number of draws is governed by a Poisson distribution. I.e., the Poisson distribution determines how many lognormal random values are drawn. Inputs to the program are \lambda of the Poisson distribution, the modal value, and either the 95% or the 99% percentile of the lognormal distribution.

From Wikipedia’s entry on the log-normal distribution we find the formula for the quantile q belonging to probability p (0<p<1), given the parameters \mu and \sigma, i.e., the mean and standard deviation of the logarithm of the variable:

q = \exp\left( \mu + \sqrt{2}\,\sigma\, \hbox{erf}^{-1}(2p-1)\right)

and the modal value m as

m = \exp\left( \mu - \sigma^2 \right).

So if q and m are given, we can compute \mu and \sigma:

\mu = \log m + \sigma^2,

and \sigma is the solution of the quadratic equation:

\log q = \log m + \sigma^2 + \sqrt{2}\,\sigma\, \hbox{erf}^{-1}(2p-1),

hence

\sigma_{1/2} = -{\sqrt{2}\over2}\, \hbox{erf}^{-1}(2p-1) \pm\sqrt{ {1\over2}\left(\hbox{erf}^{-1}(2p-1)\right)^2 - \log(m/q) },

or, more simply,

\sigma_{1/2} = -R/2 \pm \sqrt{R^2/4 - \log(m/q) },

with

R = \sqrt{2}\,\hbox{erf}^{-1}(2p-1).

For quantiles 95% and 99% one gets R as 1.64485362695147 and 2.32634787404084 respectively. For computing the inverse error function I used erfinv.c from lakshayg.
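
The derivation can be checked numerically. The Python sketch below computes \sigma and \mu from mode m, quantile q, and probability p; it is not the author's C program. It uses the fact that R = \sqrt{2}\,\hbox{erf}^{-1}(2p-1) is the p-quantile of the standard normal distribution, so statistics.NormalDist().inv_cdf(p) replaces erfinv.c here. Only the root with the plus sign is taken, since for q>m and p>1/2 it is the positive one. The function name lognormal_params and the inputs m=10, q=100 are assumptions for illustration.

```python
import math
from statistics import NormalDist


def lognormal_params(mode, quantile, p):
    """Return (mu, sigma) of the lognormal with given mode and p-quantile."""
    r = NormalDist().inv_cdf(p)  # equals sqrt(2) * erfinv(2p - 1)
    # Positive root of sigma^2 + R*sigma + log(m/q) = 0.
    sigma = -r / 2 + math.sqrt(r * r / 4 - math.log(mode / quantile))
    mu = math.log(mode) + sigma ** 2
    return mu, sigma


mu, sigma = lognormal_params(10.0, 100.0, 0.95)
r95 = NormalDist().inv_cdf(0.95)
# Plugging back in should reproduce the mode and the quantile.
print(math.exp(mu - sigma ** 2), math.exp(mu + r95 * sigma))
```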

The actual generation of random numbers according to the Poisson and lognormal distributions is done using GSL. My program is here: gslSoris.c.
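
For readers without GSL at hand, the two-stage drawing can be sketched with Python's standard library. This is not gslSoris.c: the Poisson draw uses Knuth's multiplication method (adequate for small \lambda only), the lognormal values come from random.lognormvariate, and the function names as well as the parameter values in the example are assumptions.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) count with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def poisson_lognormal(rng, lam, mu, sigma):
    """Draw N ~ Poisson(lam), then return N lognormal(mu, sigma) values."""
    n = poisson(rng, lam)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


rng = random.Random(42)  # fixed seed so that runs are repeatable
for _ in range(3):
    print(poisson_lognormal(rng, 3.0, 3.12, 0.90))
```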

The Poisson distribution looks like this (from the GSL documentation):
[Figure: Poisson distribution]

The lognormal distribution looks like this (from the GSL documentation):
[Figure: Lognormal distribution]

Performance comparison Ryzen vs. Intel vs. Bulldozer vs. ARM

This post has moved to eklausmeier.goip.de/blog/2021/02-01-performance-comparison-ryzen-vs-intel-vs-bulldozer-vs-arm.

For comparing different machines I invert the Hilbert matrix

H = \left(\begin{array}{ccccc} 1 & {1\over2} & {1\over3} & \cdots & {1\over n} \\ {1\over2} & {1\over3} & {1\over4} & \cdots & {1\over n+1} \\ {1\over3} & {1\over4} & {1\over5} & \cdots & {1\over n+2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ {1\over n} & {1\over n+1} & {1\over n+2} & \cdots & {1\over2n-1} \end{array} \right) = \left( {\displaystyle {1\over i+j-1} } \right)_{ij}

This matrix is known to have a very high condition number. Program xlu5.c stores four double precision matrices of dimension n: matrices H and A store the Hilbert matrix, X is the identity matrix, and Y is the inverse of H. Finally, the maximum norm of I-H\cdot H^{-1} is printed, which should be zero. These four double precision matrices occupy roughly 1.6 MB for n=230.
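
The idea of the benchmark can be sketched in pure Python for small n. This is not xlu5.c: the sketch inverts H by Gauss-Jordan elimination with partial pivoting, whatever method the C program uses, and it takes "maximum norm" to mean the largest absolute entry of I-H\cdot H^{-1}, which is an assumption. All function names are made up.

```python
def hilbert(n):
    """Hilbert matrix with entries 1/(i+j-1) for 1-based i, j."""
    return [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def residual(n):
    """Largest absolute entry of I - H * inverse(H)."""
    h = hilbert(n)
    y = invert(h)
    return max(abs((1.0 if i == j else 0.0)
                   - sum(h[i][k] * y[k][j] for k in range(n)))
               for i in range(n) for j in range(n))


def storage_bytes(n, matrices=4, bytes_per_double=8):
    """Memory needed for the four n x n double precision matrices."""
    return matrices * n * n * bytes_per_double


print(residual(4), residual(8), storage_bytes(230))
```

The residual grows quickly with n, which reflects the high condition number of H.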

1. Runtime on Ryzen, AMD Ryzen 5 PRO 3400G with Radeon Vega Graphics, max 3.7 GHz, as given by lscpu.

$ time xlu5o3b 230 > /dev/null
        real 0.79s
        user 0.79s
        sys 0
        swapped 0
        total space 0

Cache sizes within CPU are:

L1d cache:                       128 KiB
L1i cache:                       256 KiB
L2 cache:                        2 MiB
L3 cache:                        4 MiB

The above program needs storage for 4 matrices, each having 230×230 entries of type double (8 bytes), giving 1692800 bytes, roughly 1.6 MB.

2. Runtime on AMD FX-8120, Bulldozer, max 3.1 GHz, as given by lscpu.

$ time xlu5o3b 230 >/dev/null 
        real 1.75s
        user 1.74s
        sys 0
        swapped 0
        total space 0

Cache sizes within CPU are:

L1d cache:                       64 KiB
L1i cache:                       256 KiB
L2 cache:                        8 MiB
L3 cache:                        8 MiB

3. Runtime on Intel, Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz, max 2.6 GHz, as given by lscpu.

$ time xlu5o3b 230 > /dev/null
        real 1.68s
        user 1.67s
        sys 0
        swapped 0
        total space 0

Cache sizes within CPU are:

L1d cache:                       64 KiB
L1i cache:                       64 KiB
L2 cache:                        512 KiB
L3 cache:                        3 MiB

Apparently the Ryzen processor outperforms the Intel processor thanks to its larger caches and higher clock frequency. But even for smaller matrix sizes, e.g., n=120, the Ryzen is twice as fast.

Interestingly, the errors in the computations are different!

The AMD and Intel machines run Arch Linux with kernel version 5.9.13; gcc was version 10.2.0.

4. Runtime on Raspberry Pi 4, ARM Cortex-A72, max 1.5 GHz, as given by lscpu.

$ time xlu5 230 > /dev/null
        real 4.37s
        user 4.36s
        sys 0
        swapped 0
        total space 0

Linux 5.4.83 and GCC 10.2.0.

5. Runtime on Odroid XU4, Cortex-A7, max 2 GHz, as given by lscpu.

$ time xlu5 230 > /dev/null
        real 17.75s
        user 17.60s
        sys 0
        swapped 0
        total space 0

So the Raspberry Pi 4 is roughly four times faster than the Odroid XU4 (4.37s versus 17.75s).
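
The measured wall-clock times for n=230 can be put side by side as ratios. The short Python sketch below only restates the "real" times given above; the function name relative_runtime is made up.

```python
# "real" runtimes in seconds for n=230, copied from the measurements above.
RUNTIMES = {
    "AMD Ryzen 5 PRO 3400G": 0.79,
    "Intel Core i5-4250U": 1.68,
    "AMD FX-8120": 1.75,
    "Raspberry Pi 4": 4.37,
    "Odroid XU4": 17.75,
}


def relative_runtime(runtimes, baseline):
    """Express every runtime as a multiple of the baseline machine's runtime."""
    return {name: t / runtimes[baseline] for name, t in runtimes.items()}


for name, factor in relative_runtime(RUNTIMES, "AMD Ryzen 5 PRO 3400G").items():
    print("%-24s %5.1fx" % (name, factor))
```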