Set-Up “Let’s Encrypt” for Hiawatha Web-Server

Google announced that, starting with Chrome version 68, HTTP connections will gradually be marked as “not secure”. “Let’s Encrypt” is a free service that lets webmasters obtain certificates easily. Work on “Let’s Encrypt” started in 2014.

Setting up “Let’s Encrypt” with the Hiawatha web-server is quite easy, although there are some pitfalls. I used the Arch Linux package for Hiawatha. There is also an ArchWiki page for Hiawatha.

Another detailed description is: Let’s Encrypt with Hiawatha by Chris Wadge.

1. Unpacking and production-server setting. After installing the Arch Linux package I unpacked the file /usr/share/hiawatha/letsencrypt.tar.gz. You have to edit letsencrypt.conf in three places:

ACCOUNT_EMAIL_ADDRESS = your@mail.address
LE_CA_HOSTNAME =           # Production

I stumbled over the last variable, LE_CA_HOSTNAME. It has to point to the production “Let’s Encrypt” server. Although you can register with the testing server, you apparently cannot do anything else with it, so delete the testing-server entry. The rest of the configuration file is straightforward to change.
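
For reference, the unpacking step could look like the following sketch; the target directory /etc/hiawatha and the unpacked letsencrypt/ sub-directory are assumptions based on the Arch Linux package layout and may differ on your system:

cd /etc/hiawatha                                  # assumed location of the Hiawatha configuration
tar xzf /usr/share/hiawatha/letsencrypt.tar.gz    # assumed to unpack into a letsencrypt/ directory
vi letsencrypt/letsencrypt.conf                   # edit the e-mail address and LE_CA_HOSTNAME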

2. Configuration file. Now check your hiawatha.conf file:

Binding {
        Port = 443
        #TLScertFile = tls/hiawatha.pem
        TLScertFile = /etc/hiawatha/tls/
        Interface =
        MaxRequestSize = 2048
        TimeForRequest = 30
VirtualHost {
        Hostname = ..., borussia



Set-Up Hiawatha Web-Server

I stumbled upon the Hiawatha web-server when I read about a web-server for a houseboat by Ronald Scheckelhoff, WB8LZR. I had used Apache, thttpd, Lighttpd, NGINX, and others before. Now I use the Hiawatha web-server.

Hiawatha has three objectives, which are nicely met:

  1. Security: Hiawatha withstood the Heartbleed and Slowloris attacks
  2. Ease of use: the man-pages suffice for configuring the web-server, no extensive Googling required
  3. Lightweight on resources

The following diagram shows the number of source code lines using

wc `find . -iname \*.c -o -iname \*.h -o -iname \*akefile\* `

for each web-server.

The configuration below mostly follows the example configuration and provides Perl and PHP via CGI:

ServerId = http
ConnectionsTotal = 1000
ConnectionsPerIP = 25
SystemLogfile = /var/log/hiawatha/system.log
GarbageLogfile = /var/log/hiawatha/garbage.log

Binding {
        Port = 80
        MaxRequestSize = 1572864
        MaxUploadSize = 2047
        TimeForRequest = 90,180
}

CGIhandler = /usr/bin/perl:pl
CGIhandler = /usr/bin/php-cgi:php

Directory {
        DirectoryID = DownloadArea
        Path = /Download
        ShowIndex = yes
}

Directory {
        DirectoryID = WebPresence
        Path = /
        ExecuteCGI = yes
}

Hostname =
WebsiteRoot = /srv/http

VirtualHost {
        Hostname = ...
        WebsiteRoot = /srv/http
        FollowSymlinks = yes
        UseDirectory = WebPresence, DownloadArea
}

So I have one directory where Hiawatha shows a generated index of files that can be downloaded, and an ordinary directory where I serve HTML and PHP files. I had to change MaxRequestSize and MaxUploadSize as I sometimes upload large chunks of data.
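
After changing hiawatha.conf it is worth letting Hiawatha check the configuration before restarting. Hiawatha ships with a configuration checker called wigwam; the exact invocation may differ between versions, so treat this as a sketch:

wigwam                              # checks the Hiawatha configuration for errors
sudo systemctl restart hiawatha     # restart only if the check reported no problems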

Since the 2014 Microsoft shotgun attack I have many different DNS names, to better withstand this kind of vandalism.

Enabling GD for PHP is described here: php-gd; just uncomment extension=gd in php.ini.
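
On Arch Linux the relevant file should be /etc/php/php.ini; a one-liner like the following sketch would do it, but check the path and the exact extension line on your system first:

sed -i 's/^;extension=gd/extension=gd/' /etc/php/php.ini    # uncomment the GD extension (path assumed)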

Towards web-based delta synchronization for cloud storage systems

Very interesting article.

Some remarkable excerpts:

To isolate performance issues to the JavaScript VM, the authors rebuilt the client side of WebRsync using the Chrome native client support and C++. It’s much faster.

Replacing MD5 with SipHash reduces computation complexity by almost 5x. As a fail-safe mechanism in case of hash collisions, WebRsync+ also uses a lightweight full content hash check. If this check fails then the sync will be re-started using MD5 chunk fingerprinting instead.

The client side of WebR2sync+ is 1700 lines of JavaScript. The server side is based on node.js (about 500 loc) and a set of C processing modules (a further 1000 loc).

the morning paper

Towards web-based delta synchronization for cloud storage systems, Xiao et al., FAST’18

If you use Dropbox (or an equivalent service) to synchronise files between your Mac or PC and the cloud, then it uses an efficient delta-sync (rsync) protocol to only upload the parts of a file that have changed. If you use a web interface to synchronise the same files though, the entire file will be uploaded. This situation seems to hold across a wide range of popular services.

Given the universal presence of the web browser, why can’t we have efficient delta syncing for web clients? That’s the question Xiao et al. set out to investigate: they built an rsync implementation for the web, and found out it performed terribly. Having tried everything to improve the performance within the original rsync design parameters, they then resorted to a redesign which moved more of the heavy lifting back to…
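
To make the chunk-fingerprinting idea tangible, here is a deliberately simplified sketch using fixed-size blocks and md5sum; real rsync uses a rolling checksum so that insertions do not shift all subsequent blocks, and the file names below are made up:

split -b 8192 -d bigfile.dat chunk.        # cut the file into fixed 8 KB chunks
md5sum chunk.* > fingerprints.new          # one fingerprint per chunk
diff fingerprints.old fingerprints.new     # chunks listed here have changed and need to be uploaded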



Unix Command comm: Compare Two Files

One lesser-known Unix command is comm; it is far less known than diff. comm needs two already sorted files, FILE1 and FILE2. Its options are:

  • -1 suppress column 1 (lines unique to FILE1)
  • -2 suppress column 2 (lines unique to FILE2)
  • -3 suppress column 3 (lines that appear in both files)

For example, comm -12 F1 F2 prints all common lines in files F1 and F2.

I thought that comm had a bug, so I wrote a short Perl script to simulate the behaviour of comm. Of course, there was no bug; I had simply failed to notice that the records in the two files did not match due to white space.

#!/bin/perl -W
use strict;

use Getopt::Std;
my %opts = ('d' => 0, 's' => 0);
getopts('ds:', \%opts);         # -d: debug output, -s N: select membership bitmask N
my $debug = ($opts{'d'} != 0);
my $member = defined($opts{'s'}) ? $opts{'s'} : 0;

my ($set,$prev) = (1,"");       # $set: bit assigned to the current input file
my %H;                          # record -> bitmask of the files it occurs in

while (<>) {
        chomp;
        $prev = $ARGV if ($prev eq "");
        if ($ARGV ne $prev) {   # next input file reached: move on to the next bit
                $set *= 2;
                $prev = $ARGV;
        }
        $H{$_} |= $set;
        printf("\t>>\t%s: %s -> %d\n",$ARGV,$_,$H{$_}) if ($debug);
}

$member = 2*$set - 1 if ($member == 0);         # default: records present in all files
printf("\t>>\tmember = %d\n",$member) if ($debug);
for my $i (sort keys %H) {
        printf("%s\n",$i) if ($H{$i} == $member);
}
The above Perl script does not need sorted input files, as it stores all records of the files in memory, in a hash. It uses a bitmask as a set: the first file contributes bit 1, the second bit 2, the third bit 4, and so on. For example, mycomm -s2 F1 F2 prints only those records which are in file F2 but not in F1.
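
For illustration, here is a quick comparison with comm itself; F1 and F2 are placeholder file names and the script above is assumed to be saved as mycomm:

cat -A F1 | head -3          # make white space visible: tabs show as ^I, line ends as $
sort F1 > F1.s ; sort F2 > F2.s
comm -12 F1.s F2.s           # common lines; comm requires sorted input
./mycomm F1 F2               # same result, unsorted input accepted
./mycomm -s2 F1 F2           # records occurring only in F2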


Unix sort Issue

I wondered why Unix sort behaved strangely.

printf "A0 1\nA  1\n" | sort


A0 1
A  1

Of course, I expected A to come before A0. This was strange, as printf "A1 1\nA  1\n" | sort produced

A  1
A1 1

just as expected. Also, printf "A0\nA\n" | sort orders A before A0, as expected.

Solution: set LC_ALL=C before calling sort. In the en_US.UTF-8 locale blanks are essentially ignored in the first comparison pass, whereas the C locale compares plain byte values. So

printf "A0 1\nA  1\n" | LC_ALL=C sort


A  1
A0 1

I realized this when I called sort with the --debug flag,

printf "A0 1\nA  1\n" | sort --debug

which shows the employed locale:

sort: using ‘en_US.UTF-8’ sorting rules
A0 1
A  1

To check that my expected sort order was indeed the “right” order, I wrote the following simple Perl script for sorting, which confirmed my understanding of ASCII sorting:

#!/bin/perl -W
use strict;
my @F = <>;     # slurp
for my $i (sort @F) { print $i; }
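
Assuming the script is saved as asciisort.pl (the name is made up), the first example from above now sorts as expected:

printf "A0 1\nA  1\n" | perl asciisort.pl
A  1
A0 1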

Parallelization and CPU Cache Overflow

In the post Rewriting Perl to plain C the runtimes of the serial runs were reported. As expected, the C program was a lot faster than the Perl script. Now, running the programs in parallel showed two unexpected behaviours: (1) more parallel instances can degrade runtime, and (2) running unoptimized programs can be faster.

See also CPU Usage Time Is Dependant on Load.

In the following we use the C program siriusDynCall and the Perl script siriusDynUpro, which were described in the above-mentioned post. The program and the script read roughly 3 GB of data. Before starting the program or script, all this data had already been read into memory by using something like wc or grep.
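
Such a warm-up can be done with any command that actually reads every byte once, for example cat, grep, or wc -l; note that wc -c alone may not be enough, since GNU wc can obtain the byte count via stat without reading the file contents. The file pattern below is a placeholder:

cat * > /dev/null       # read all data files once so they end up in the page cache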

1. AMD Processor. Running 8 parallel instances, s=size=8, p=partition=1(1)8:

for i in 1 2 3 4 5 6 7 8; do time siriusDynCall -p$i -s8 * > ../resultCp$i & done
        real 50.85s
        user 50.01s
        sys 0

Merging the results with the sort command takes a negligible amount of time

sort -m -t, -k3.1 resultCp* > resultCmerged

Best results are obtained when running just s=4 instances in parallel:

$ for i in 1 2 3 4 ; do /bin/time -p siriusDynCall -p$i -s4 * > ../dyn4413c1p$i & done
        real 33.68
        user 32.48
        sys 1.18



Rewriting Perl to plain C

The Perl script was running too slowly. Rewriting it in plain C made it 20 times faster.

1. Problem statement. Analyze call-tree dependency in COBOL programs. There are 77 million lines of COBOL code in ca. 30,000 files. These 30,000 COBOL programs could potentially include 74,000 COPY-books comprising 10 million lines of additional code. COBOL COPY-books are analogous to C header-files. So in total there are around 88 million lines of COBOL code. Just for comparison: the Linux kernel has ca. 20 million lines of code.
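
Such line counts can be obtained with standard tools; the file extensions .cob and .cpy are assumptions and have to be adapted to the actual naming convention:

find . -iname '*.cob' -print0 | xargs -0 cat | wc -l    # total lines of COBOL programs (extension assumed)
find . -iname '*.cpy' -print0 | xargs -0 cat | wc -l    # total lines of COPY-books (extension assumed)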

The COBOL program analysis started with a simple Perl script of less than 200 lines, including comments. This script produced the desired dependency information.

Wading through all this COBOL code took up to 25 minutes in serial mode, and 13 minutes using 4 cores, on an HP EliteBook notebook with an Intel Skylake i7-6600U clocked at 2.8 GHz. It took 36 minutes on an AMD FX-8120 clocked at 3.1 GHz. This execution time was deemed too long to quickly see the effect of a change in the Perl script on the output. All runs were on Arch Linux 4.14.11-1 SMP PREEMPT.

2. Result. Rewriting the Perl script in C resulted in a speed improvement of a factor of 20 when run in serial mode, i.e., run times are now 110 s on one core. It runs in 32 s when using 8 cores on an AMD FX-8120. The C program uses tailor-made hashing routines.