Text Analysis using Concordance

When analyzing a longer text, especially one you wrote yourself, it helps to read it in a different way. Here this is done with a concordance.

Assume your text is provided as a PDF. Convert the PDF to text using pdftotext, which is part of the poppler package. Then replace single line breaks in the text file with spaces, so that each paragraph ends up on one line, using the C program below (called linebreak.c):

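#include <stdio.h>

/* Replace single line breaks by spaces; keep the second of two
   consecutive line breaks so that paragraphs stay separated. */
int main(int argc, char *argv[]) {
        int c, flag=0;  /* flag counts consecutive newlines */
        FILE *fp;

        if (argc >= 2) {
                if ((fp = fopen(argv[1],"rb")) == NULL) {
                        perror(argv[1]);        /* report why the file could not be opened */
                        return 1;
                }
        } else {
                fp = stdin;
        }

        while ((c = fgetc(fp)) != EOF) {
                if (c == '\n') {
                        flag += 1;
                        if (flag > 1) { putchar(c); flag = 0; } /* paragraph break */
                        else putchar(' ');      /* line break within a paragraph */
                } else {
                        flag = 0;
                        putchar(c);
                }
        }

        return 0;
}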
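If no C compiler is at hand, the line-joining step can be approximated with awk in paragraph mode. This is only a sketch of an alternative, not the program above: the function name join_paragraphs is made up for illustration, and unlike linebreak.c it drops the trailing space at the end of each paragraph.

```shell
# Join the lines of each paragraph into one line.
# RS="" puts awk into paragraph mode: records are separated by blank lines.
join_paragraphs() {
    awk 'BEGIN { RS = "" } { gsub(/\n/, " "); print }'
}
```

Assuming a (hypothetical) input file paper.pdf, a call could look like: pdftotext paper.pdf - | join_paragraphs > paper.txt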

Then generate a frequency list of single words with the Perl program below:

#!/bin/perl -W
# Print word concordances

use strict;

my (%H,@F);

while (<>) {
        chomp;
        s/\s+$//;       # rtrim
        @F = split;
        foreach my $w (@F) {
                $w =~ s/^\s+//; # ltrim
                $w =~ s/\s+$//; # rtrim
                $H{$w} += 1;
        }
}

foreach my $w (sort keys %H) {
        printf("\t%6d\t%s\n",$H{$w},$w);
}

To count all word pairs instead, replace the while loop above with

while (<>) {
        chomp;
        s/\s+$//;       # rtrim
        @F = split;
        for(my $i=0; $i<$#F; ++$i) {
                $F[$i] =~ s/^\s+//;     # ltrim
                $F[$i] =~ s/\s+$//;     # rtrim
                $F[$i+1] =~ s/^\s+//;   # ltrim
                $F[$i+1] =~ s/\s+$//;   # rtrim
                $H{$F[$i] . " " . $F[$i+1]} += 1;
        }
}

Similarly, for word triples replace the loop with

while (<>) {
        chomp;
        s/\s+$//;       # rtrim
        @F = split;
        for(my $i=0; $i+1<$#F; ++$i) {
                $F[$i] =~ s/^\s+//;     # ltrim
                $F[$i] =~ s/\s+$//;     # rtrim
                $F[$i+1] =~ s/^\s+//;   # ltrim
                $F[$i+1] =~ s/\s+$//;   # rtrim
                $F[$i+2] =~ s/^\s+//;   # ltrim
                $F[$i+2] =~ s/\s+$//;   # rtrim
                $H{$F[$i] . " " . $F[$i+1] . " " . $F[$i+2]} += 1;
        }
}

Printing concordances using Perl hashes is very simple, as one can see.
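The three loops above differ only in the number of consecutive words they join. A minimal sketch of the general case, written in awk rather than Perl and with the made-up function name ngram_counts, takes the group size as its argument. Like the Perl loops, it only joins words that are on the same input line.

```shell
# Count groups of N consecutive words per input line.
# Usage: ngram_counts N < textfile
ngram_counts() {
    awk -v n="$1" '
    {
        # Slide a window of n fields over the current line
        for (i = 1; i + n - 1 <= NF; i++) {
            key = $i
            for (j = 1; j < n; j++) key = key " " $(i + j)
            count[key]++
        }
    }
    END {
        # Same output format as the Perl program: count, tab, word group
        for (key in count) printf "\t%6d\t%s\n", count[key], key
    }'
}
```

With this sketch, ngram_counts 1, ngram_counts 2 and ngram_counts 3 would correspond to the single-word, pair and triple variants. The output order of awk's for-in loop is unspecified, so the result should be piped through sort in any case.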

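Here is an example using the man page of expect, produced by the following sequence of commands (linebreak is the compiled C program from above, word3concord is the Perl script for word triples):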

( TERM=dumb; man expect ) | linebreak | word3concord | sort -r

The truncated result is

            16  For example, the
            13  example, the following
            12  the current process.
             9  the end of
             8  using Expectk, this
             8  this option is
             8  sent to the
             8  flag causes the
             8  body is executed
             8  Expectk, this option
             8  (When using Expectk,
             7  to the current
             7  the spawn id
             7  the most recent
             7  the current process
             7  the corresponding body
             7  option is specified
             7  is specified as
             7  corresponding body is
             7  by Don Libes,
             7  be used to
             6  set for the
             6  of the current
             6  is set for
             6  is an alias

Contributing to Hugo Static

The discussion forum for Hugo contains a description: Hugo development – how to contribute code. Also see Contributing to Hugo.

1. Preparation

First set GOPATH as

export GOPATH=$HOME/tmp/H

then

cd $GOPATH

Fetch source with go get

time go get -u -v github.com/spf13/hugo

This takes around 1-2 minutes, as it has to download almost 200 MB.

Now change to the Hugo source directory and compile

cd src/github.com/spf13/hugo/
time make hugo

Compilation from scratch takes roughly 1-2 minutes. Recompiling a single file usually takes less than 10 seconds.

In the same directory, run the test cases with

time make check

which takes less than a minute.

All timings are on an AMD FX(tm)-8120 Eight-Core Processor clocked at 3.1 GHz running Linux 4.11.3, and using Go 1.8.3.

2. Fork on GitHub, git branch, and pull request

Fork https://github.com/spf13/hugo by pressing the “Fork” button on GitHub.

Move the original Git repository out of the way, clone your new fork, create a branch, add or modify files as required, then add and commit them (replace eklausme with your own GitHub user name):

cd $GOPATH/src/github.com/spf13/
mv hugo hugo.original
time git clone git@github.com:eklausme/hugo.git

cd hugo
git branch YOURNAME
git checkout YOURNAME

go fmt
git add YOURFILE
git commit

A git clone of hugo alone takes less than 10 seconds. Remember to run go fmt before git add.

Contributors are asked to provide a single commit. If you have several, squash them into one, e.g., with git rebase -i, and then update your branch with git push -f.
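The squashing step can also be done non-interactively. The following is a sketch of one alternative to marking commits as "squash" in git rebase -i, not the procedure prescribed by the Hugo project. The function name squash_last and its arguments are made up for illustration.

```shell
# Squash the last N commits of the current branch into a single commit.
# Usage: squash_last N "commit message"
squash_last() {
    n="$1"
    msg="$2"
    # Move the branch back N commits but keep all their changes staged
    git reset --soft "HEAD~${n}" || return 1
    # Record the staged changes as one new commit
    git commit --quiet -m "$msg"
}
```

Because this rewrites history, a branch that was already pushed has to be updated with a forced push afterwards, e.g., git push -f origin YOURNAME.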

Finally, press the pull-request button on GitHub.

Be prepared to wait weeks or even months before your pull request is accepted or rejected, so patience is required.

HP ePrint Obsolescence

HP (Hewlett Packard), a manufacturer of printers of various sorts (laser, ink), has unfortunately decided once again to annoy its loyal customers. HP has repeatedly updated printer firmware so that the printers no longer work with third-party ink, see for example Disabling 3rd-party ink ensures “best printing experience”. For the last few years HP has also been disabling the so-called apps (ePrint functionality): after a few years the printer can no longer connect to the web services of HP. See, for example, HP Apps Service Retired on several printers.

In my case, I bought a CM1415 in November 2011. It can no longer connect to HP web services, so I can no longer send e-mails to the printer to have them printed. In November 2014 I bought an M276nw, a printer similar to the CM1415. This model can still connect to HP web services. So it looks like HP silently disables the functionality after about six years. These web services also offer other things, like weather forecasts, news, sudokus, etc.

This chicanery makes it clear that customers should not trust cloud services, or should at least have a contingency plan in case these services stop working or their prices become ridiculous. Recently the price increase by Firebase made headlines, see Firebase Costs Increased by 7,000%!

Five-Value Theorem of Nevanlinna

In German known as Fünf-Punkte-Satz. This theorem is astounding. It says: If two meromorphic functions share five values ignoring multiplicity, then both functions are equal. Two functions, f(z) and g(z), are said to share the value a if f(z) - a = 0 and g(z) - a = 0 have the same solutions (zeros).

More precisely, suppose f(z) and g(z) are non-constant meromorphic functions and a_1, a_2, \ldots, a_5 are five distinct values (possibly including \infty). If

\displaystyle{      E(a_i,f) = E(a_i,g), \qquad 1\le i\le 5,  }

where

\displaystyle{      E(a,h) = \left\{ z \mid h(z) = a \right\},  }

then f(z) \equiv g(z).

For a generalization see Some generalizations of Nevanlinna’s five-value theorem. The above statement has been reproduced from this paper.

The identity theorem makes assumptions on points in the domain: if two functions agree on a set of points that has an accumulation point in the domain, then they are identical. The five-value theorem instead makes assumptions on values in the codomain, namely on the preimages of five values.

Taking e^z and e^{-z} as examples, one sees that these two meromorphic functions share the four values a_1=0, a_2=1, a_3=-1, a_4=\infty but are not equal. So sharing four values is not enough.
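The example can be verified directly. With k ranging over the integers,

\displaystyle{      e^z = 1 \iff z = 2\pi i k \iff e^{-z} = 1,  }

\displaystyle{      e^z = -1 \iff z = (2k+1)\pi i \iff e^{-z} = -1.  }

Both functions are entire and never vanish, so neither takes the value 0 or \infty in the complex plane:

\displaystyle{      E(0,e^z) = E(0,e^{-z}) = \emptyset, \qquad E(\infty,e^z) = E(\infty,e^{-z}) = \emptyset.  }

Hence E(a_i,e^z) = E(a_i,e^{-z}) for all four values, while e^z \not\equiv e^{-z}, since for example e^1 \ne e^{-1}.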

There is also a four-value theorem of Nevanlinna. If two meromorphic functions, f(z) and g(z), share four values counting multiplicities, then f(z) is a Möbius transformation of g(z).
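The pair e^z and e^{-z} from above is consistent with this statement. For a = \pm 1 the derivative does not vanish at the points where the value a is taken,

\displaystyle{      \frac{d}{dz}\left(e^{z} - a\right) = e^{z} = a \ne 0, \qquad \frac{d}{dz}\left(e^{-z} - a\right) = -e^{-z} = -a \ne 0,  }

so all zeros of e^z - a and e^{-z} - a are simple, and the values 1 and -1 are shared counting multiplicities. The values 0 and \infty are never taken by either function and are therefore shared trivially. And indeed, writing T for the Möbius transformation w \mapsto 1/w (the symbol T is introduced here only for illustration),

\displaystyle{      e^{-z} = T(e^{z}), \qquad T(w) = \frac{1}{w}.  }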

According to Frank and Hua: We simply say “2 CM + 2 IM implies 4 CM”. So far it is still not known whether “1 CM + 3 IM implies 4 CM”; CM meaning counting multiplicities, IM meaning ignoring multiplicities.

For a full proof there are books, which are unfortunately behind a paywall, e.g.,

  1. Gerhard Jank, Lutz Volkmann: Einführung in die Theorie der ganzen und meromorphen Funktionen mit Anwendungen auf Differentialgleichungen
  2. Lee A. Rubel, James Colliander: Entire and Meromorphic Functions
  3. Chung-Chun Yang, Hong-Xun Yi: Uniqueness Theory of Meromorphic Functions, five-value theorem proved in §3

For an introduction to complex analysis, see for example Terry Tao:

  1. 246A, Notes 0: the complex numbers
  2. 246A, Notes 1: complex differentiation
  3. 246A, Notes 2: complex integration
  4. Math 246A, Notes 3: Cauchy’s theorem and its consequences
  5. Math 246A, Notes 4: singularities of holomorphic functions
  6. 246A, Notes 5: conformal mapping, covers Picard’s great theorem
  7. 254A, Supplement 2: A little bit of complex and Fourier analysis, proves Poisson-Jensen formula for the logarithm of a meromorphic function in relation to its zeros within a disk