PHP extension seg-faulting

This post has moved to eklausmeier.goip.de/blog/2021/03-29-php-extension-seg-faulting.

Task at hand: call Cobol (i.e., GnuCobol) from PHP. I used FFI for this:

<?php
        $cbl = FFI::cdef("int phpsqrt(void); void cob_init_nomain(int,char**); int cob_tidy(void);", "/srv/http/phpsqrt.so");
        $ffj0 = FFI::cdef("double j0(double);", "libm.so.6");

        $cbl->cob_init_nomain(0,null);
        $ret = $cbl->phpsqrt();
        printf("\tret = %d<br>\n",$ret);
        echo "Before calling cob_tidy():<br>\n";
        echo "\tReturn: ", $cbl->cob_tidy(), "<br>\n";
        printf("j0(2) = %f<br>\n", $ffj0->j0(2));
?>

The Cobol program is:

000010 IDENTIFICATION DIVISION.
000020 PROGRAM-ID.   phpsqrt.
000030 AUTHOR.       Elmar Klausmeier.
000040 DATE-WRITTEN. 01-Jul-2004.
000050
000060 DATA DIVISION.
000070 WORKING-STORAGE SECTION.
000080 01 i       PIC 9(5).
000090 01 s       usage comp-2.
000100
000110 PROCEDURE DIVISION.
000120*    DISPLAY "Hello World!".
000130     PERFORM VARYING i FROM 1 BY 1 UNTIL i > 10
000140         move function sqrt(i) to s
000150*        DISPLAY i, " ", s
000160     END-PERFORM.
000170
000180     move 17 to return-code.
000190     GOBACK.
000200
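Before PHP can load phpsqrt.so via FFI, the Cobol program has to be compiled into a shared module. A minimal build command sketch, assuming the source above is stored in phpsqrt.cob (the file name is an assumption):

```shell
# Compile the Cobol source into a dynamically loadable module phpsqrt.so;
# -m tells cobc to build a module (.so) rather than an executable
cobc -m -o phpsqrt.so phpsqrt.cob
```

The resulting phpsqrt.so then goes to /srv/http/, where the PHP script above expects it.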

The configuration in the php.ini file has to be changed:

extension=ffi
ffi.enable=true

To call GnuCobol from C, you first have to call cob_init() or cob_init_nomain(), which initializes GnuCobol. I tried both initialization routines, and both resulted in PHP crashing with a segmentation fault after running the above program.

I created a bug for this: FFI crashes with segmentation fault when calling cob_init().

1. I compiled PHP 8.0.3 from source. For this I had to install the packages below:

pacman -S tidy freetds c-client

I grep’ed my current configuration:

php -i | grep "Configure Comman"
Configure Command =>  './configure'  '--srcdir=../php-8.0.3' '--config-cache' '--prefix=/usr' '--sbindir=/usr/bin' '--sysconfdir=/etc/php' '--localstatedir=/var' '--with-layout=GNU' '--with-config-file-path=/etc/php' '--with-config-file-scan-dir=/etc/php/conf.d' '--disable-rpath' '--mandir=/usr/share/man' '--enable-cgi' '--enable-fpm' '--with-fpm-systemd' '--with-fpm-acl' '--with-fpm-user=http' '--with-fpm-group=http' '--enable-embed=shared' '--enable-bcmath=shared' '--enable-calendar=shared' '--enable-dba=shared' '--enable-exif=shared' '--enable-ftp=shared' '--enable-gd=shared' '--enable-intl=shared' '--enable-mbstring' '--enable-pcntl' '--enable-shmop=shared' '--enable-soap=shared' '--enable-sockets=shared' '--enable-sysvmsg=shared' '--enable-sysvsem=shared' '--enable-sysvshm=shared' '--with-bz2=shared' '--with-curl=shared' '--with-db4=/usr' '--with-enchant=shared' '--with-external-gd' '--with-external-pcre' '--with-ffi=shared' '--with-gdbm' '--with-gettext=shared' '--with-gmp=shared' '--with-iconv=shared' '--with-imap-ssl' '--with-imap=shared' '--with-kerberos' '--with-ldap=shared' '--with-ldap-sasl' '--with-mhash' '--with-mysql-sock=/run/mysqld/mysqld.sock' '--with-mysqli=shared,mysqlnd' '--with-openssl' '--with-password-argon2' '--with-pdo-dblib=shared,/usr' '--with-pdo-mysql=shared,mysqlnd' '--with-pdo-odbc=shared,unixODBC,/usr' '--with-pdo-pgsql=shared' '--with-pdo-sqlite=shared' '--with-pgsql=shared' '--with-pspell=shared' '--with-readline' '--with-snmp=shared' '--with-sodium=shared' '--with-sqlite3=shared' '--with-tidy=shared' '--with-unixODBC=shared' '--with-xsl=shared' '--with-zip=shared' '--with-zlib'

To this I added --enable-debug. The configure command needs two minutes; make -j8 needs another two minutes.

I copied php.ini to a local directory and changed it to activate FFI. Whenever I called

$BUILD/sapi/cli/php

I had to add -c php.ini when calling an extension written by me, stored in ext/.

2. The fix for segmentation fault is actually pretty easy: Just set environment variable ZEND_DONT_UNLOAD_MODULES:

ZEND_DONT_UNLOAD_MODULES=1 $BUILD/sapi/cli/php -c php.ini -r 'test1();'

Reason for this: see valgrind output below.

3. Before I had figured out the “trick” with ZEND_DONT_UNLOAD_MODULES, I wrote a PHP extension. The extension is:

/* {{{ void test1() */
PHP_FUNCTION(test1)
{
        ZEND_PARSE_PARAMETERS_NONE();

        php_printf("test1(): The extension %s is loaded and working!\r\n", "callcob");
        cob_init(0,NULL);
}
/* }}} */

Unfortunately, running this extension resulted in:

Module compiled with build ID=API20200930,NTS
PHP    compiled with build ID=API20200930,NTS,debug
These options need to match

I solved this by appending the string ",debug" to the module entry, so that the build IDs match:

/* {{{ callcob_module_entry */
zend_module_entry callcob_module_entry = {
        STANDARD_MODULE_HEADER,
        //sizeof(zend_module_entry), ZEND_MODULE_API_NO, 1, USING_ZTS,
        "callcob",                                      /* Extension name */
        ext_functions,                                  /* zend_function_entry */
        NULL,                                                   /* PHP_MINIT - Module initialization */
        NULL,                                                   /* PHP_MSHUTDOWN - Module shutdown */
        PHP_RINIT(callcob),                     /* PHP_RINIT - Request initialization */
        NULL,                                                   /* PHP_RSHUTDOWN - Request shutdown */
        PHP_MINFO(callcob),                     /* PHP_MINFO - Module info */
        PHP_CALLCOB_VERSION,            /* Version */
        STANDARD_MODULE_PROPERTIES
        ",debug"
};
/* }}} */

I guess this is not the recommended approach.

4. Valgrind shows the following:

==37350== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==37350==  Access not within mapped region at address 0x852AD20
==37350==    at 0x852AD20: ???
==37350==    by 0x556EF7F: ??? (in /usr/lib/libc-2.33.so)
==37350==    by 0x5570DCC: getenv (in /usr/lib/libc-2.33.so)
==37350==    by 0x76BB43: module_destructor (zend_API.c:2629)
==37350==    by 0x75EE31: module_destructor_zval (zend.c:782)
==37350==    by 0x7777A1: _zend_hash_del_el_ex (zend_hash.c:1330)
==37350==    by 0x777880: _zend_hash_del_el (zend_hash.c:1353)
==37350==    by 0x779188: zend_hash_graceful_reverse_destroy (zend_hash.c:1807)
==37350==    by 0x769390: zend_destroy_modules (zend_API.c:1992)
==37350==    by 0x75F582: zend_shutdown (zend.c:1078)
==37350==    by 0x6C3F17: php_module_shutdown (main.c:2359)
==37350==    by 0x84E46D: main (php_cli.c:1351)
==37350==  If you believe this happened as a result of a stack
==37350==  overflow in your program's main thread (unlikely but
==37350==  possible), you can try to increase the size of the
==37350==  main thread stack using the --main-stacksize= flag.
==37350==  The main thread stack size used in this run was 8388608.
. . .
zsh: segmentation fault (core dumped)  valgrind $BUILD/sapi/cli/php -c $BUILD/php.ini -r 'test1();'

As shown above, the relevant code in question is Zend/zend_API.c in line 2629. This is shown below:

void module_destructor(zend_module_entry *module) /* {{{ */
{
. . .
        module->module_started=0;
        if (module->type == MODULE_TEMPORARY && module->functions) {
                zend_unregister_functions(module->functions, -1, NULL);
        }

#if HAVE_LIBDL
        if (module->handle && !getenv("ZEND_DONT_UNLOAD_MODULES")) {
                DL_UNLOAD(module->handle);
        }
#endif
}
/* }}} */

It is DL_UNLOAD, a #define for dlclose(), which actually provokes the crash.

According to the PHP Internals Book — Zend Extensions:

Here, we are loaded as a PHP extension. Look at the hooks. When hitting MSHUTDOWN(), the engine runs our MSHUTDOWN(), but it unloads us just after that ! It calls for dlclose() on our extension, look at the source code, the solution is as often located in there.

So what happens is easy, just after triggering our RSHUTDOWN(), the engine unloads our pib.so ; when it comes to call our Zend extension part shutdown(), we are not part of the process address space anymore, thus we badly crash the entire PHP process.

What is still not understood: Why does FFI not crash with those simple functions, like printf(), or sqrt()?

Performance comparison Ryzen vs. Intel vs. Bulldozer vs. ARM

This post has moved to eklausmeier.goip.de/blog/2021/02-01-performance-comparison-ryzen-vs-intel-vs-bulldozer-vs-arm.

For comparing different machines I invert the Hilbert matrix

H = \left(\begin{array}{ccccc}
  1         & {1\over2}   & {1\over3}   & \cdots & {1\over n}    \\
  {1\over2} & {1\over3}   & {1\over4}   & \cdots & {1\over n+1}  \\
  {1\over3} & {1\over4}   & {1\over5}   & \cdots & {1\over n+2}  \\
  \vdots    & \vdots      & \vdots      & \ddots & \vdots        \\
  {1\over n} & {1\over n+1} & {1\over n+2} & \cdots & {1\over2n-1}
\end{array}\right)
= \left( {\displaystyle {1\over i+j-1} } \right)_{ij}

This matrix is known to have very high condition numbers. The program xlu5.c stores four double precision matrices of dimension n. Matrices H and A store the Hilbert matrix, X is the identity matrix, and Y is the inverse of H. Finally the maximum norm of I-H\cdot H^{-1} is printed, which should be zero. These four double precision matrices occupy roughly 1.6 MB for n=230.

1. Runtime on Ryzen, AMD Ryzen 5 PRO 3400G with Radeon Vega Graphics, max 3.7 GHz, as given by lscpu.

$ time xlu5o3b 230 > /dev/null
        real 0.79s
        user 0.79s
        sys 0
        swapped 0
        total space 0

Cache sizes within CPU are:

L1d cache:                       128 KiB
L1i cache:                       256 KiB
L2 cache:                        2 MiB
L3 cache:                        4 MiB

Required storage for above program is 4 matrices, each having 230×230 entries with double (8 bytes), giving 1692800 bytes, roughly 1.6 MB.

2. Runtime on AMD FX-8120, Bulldozer, max 3.1 GHz, as given by lscpu.

$ time xlu5o3b 230 >/dev/null 
        real 1.75s
        user 1.74s
        sys 0
        swapped 0
        total space 0

Cache sizes within CPU are:

L1d cache:                       64 KiB
L1i cache:                       256 KiB
L2 cache:                        8 MiB
L3 cache:                        8 MiB

3. Runtime on Intel, Intel(R) Core(TM) i5-4250U CPU @ 1.30GHz, max 2.6 GHz, as given by lscpu.

$ time xlu5o3b 230 > /dev/null
        real 1.68s
        user 1.67s
        sys 0
        swapped 0
        total space 0

Cache sizes within CPU are:

L1d cache:                       64 KiB
L1i cache:                       64 KiB
L2 cache:                        512 KiB
L3 cache:                        3 MiB

Apparently the Ryzen processor outperforms the Intel processor due to its larger caches and higher clock frequency. But even for smaller matrix sizes, e.g., n=120, where all four matrices fit comfortably into cache on both machines, the Ryzen is two times faster.

Interestingly, the errors in the computations are different!

AMD and Intel machines run ArchLinux with kernel version 5.9.13, gcc was 10.2.0.

4. Runtime on Raspberry Pi 4, ARM Cortex-A72, max 1.5 GHz, as given by lscpu.

$ time xlu5 230 > /dev/null
        real 4.37s
        user 4.36s
        sys 0
        swapped 0
        total space 0

Linux 5.4.83 and GCC 10.2.0.

5. Runtime on Odroid XU4, Cortex-A7, max 2 GHz, as given by lscpu.

$ time xlu5 230 > /dev/null
        real 17.75s
        user 17.60s
        sys 0
        swapped 0
        total space 0

So the Raspberry Pi 4 is clearly way faster than the Odroid XU4.

Compiling Java source to binary (native)

This post has moved to eklausmeier.goip.de/blog/2020/12-19-compiling-java-source-to-binary-native.

With GraalVM you can now fully compile a Java file to a native binary. This is also called AOT, ahead-of-time compilation. Compilation is very slow, and the resulting binary is huge, as it must contain all code that might be referenced; the class file, in contrast, usually is quite small. The advantage is that the resulting binary starts way faster.

From the GraalVM web-page:

The Native Image builder or native-image is a utility that processes all classes of an application and their dependencies, including those from the JDK. It statically analyzes these data to determine which classes and methods are reachable during the application execution. Then it ahead-of-time compiles that reachable code and data to a native executable for a specific operating system and architecture. This entire process is called building an image

Assume the simple program:

public class hello {
    public static void main(String argv[]) {
        System.out.println("Hello world.");
    }
}

Compiling this simple program is quite slow:

$ time /usr/lib/jvm/java-11-graalvm/bin/native-image hello
[hello:90709]    classlist:   1,173.69 ms,  0.96 GB
[hello:90709]        (cap):     746.78 ms,  0.96 GB
[hello:90709]        setup:   2,072.73 ms,  0.96 GB
[hello:90709]     (clinit):     214.77 ms,  1.22 GB
[hello:90709]   (typeflow):   5,433.03 ms,  1.22 GB
[hello:90709]    (objects):   4,402.72 ms,  1.22 GB
[hello:90709]   (features):     281.83 ms,  1.22 GB
[hello:90709]     analysis:  10,615.01 ms,  1.22 GB
[hello:90709]     universe:     486.71 ms,  1.71 GB
[hello:90709]      (parse):   1,237.17 ms,  1.71 GB
[hello:90709]     (inline):   1,174.69 ms,  1.71 GB
[hello:90709]    (compile):   7,934.95 ms,  2.35 GB
[hello:90709]      compile:  10,857.38 ms,  2.35 GB
[hello:90709]        image:   1,052.94 ms,  2.35 GB
[hello:90709]        write:     174.08 ms,  2.35 GB
[hello:90709]      [total]:  26,598.83 ms,  2.35 GB
real 27.23s
user 145.40s
sys 0
swapped 0
total space 0

Compilation starts a huge number of threads. The resulting binary is ca. 9 MB.

In ArchLinux the native compiler is contained in native-image-jdk11-bin, which in turn needs jdk11-graalvm-bin.

Running the binary is way faster than starting the class file within the JVM. Starting the binary:

$ time ./hello
Hello world.
real 0.00s
user 0.00s
sys 0
swapped 0
total space 0

Starting the class file in the JVM:

$ time java hello
Hello world.
real 0.07s
user 0.07s
sys 0
swapped 0
total space 0

For building native images on Windows follow the instructions here: Prerequisites for Using Native Image on Windows.

Contrary to what one might initially assume, GraalVM is not automatically faster at runtime. It is clearly way faster during startup. Michael Larabel conducted a performance test on GraalVM 20.1, OpenJDK 11, OpenJDK 14.0.1, OpenJDK 15, and others. Result: sometimes GraalVM is faster, sometimes not.

Calling C from Julia

This post has moved to eklausmeier.goip.de/blog/2020/06-23-calling-c-from-julia.

Two ways to compute the error function or Bessel function in Julia.

1. Calling C. On UNIX, libm provides erf() and j0(). Calling them goes like this:

ccall(("erf","libm.so.6"),Float64,(Float64,),0.1)
ccall(("j0"),Float64,(Float64,),3)

In this case one can omit the reference to libm.so. Watch out for the funny looking (Float64,): the argument types are passed as a tuple, and a one-element tuple in Julia needs the trailing comma.

2. Using Julia. SpecialFunctions.jl provides erf() and besselj0.

import Pkg
Pkg.add("SpecialFunctions")
import SpecialFunctions
SpecialFunctions.erf(0.1)
SpecialFunctions.besselj0(3)

Splitting and anti-merging vCard files

This post has moved to eklausmeier.goip.de/blog/2020/06-09-splitting-and-anti-merging-vcard-files.

Sometimes vCard files need to be split into smaller files, or the file needs to be protected against merging in another application.

1. Splitting. The Perl script below splits the input file into as many files as required. Output files are named adr.1.vcf, adr.2.vcf, etc. You can pass the command line argument "-n" to specify the number of card records per file. Splitting a vCard file is provided in palmadrsplit on GitHub:

use Getopt::Std;

my %opts;
getopts('n:',\%opts);
my ($i,$k,$n) = (1,0,950);
$n = ( defined($opts{'n'}) ? $opts{'n'} : 950 );

open(F,">adr.$i.vcf") || die("Cannot open adr.$i.vcf for writing");
while (<>) {
        if (/BEGIN:VCARD/) {
                if (++$k % $n == 0) {   # next address record
                        close(F) || die("Cannot close adr.$i.vcf");
                        ++$i;   # next file number
                        open(F,">adr.$i.vcf") || die("Cannot open adr.$i.vcf for writing");
                }
        }
        print F $_;
}
close(F) || die("Cannot close adr.$i.vcf");

This is required for Google Contacts, as Google does not allow importing more than 1,000 records per day; see Quotas for Google Services.

2. Anti-Merge. Inhibiting the annoying merging is provided in file palmantimerge on GitHub. The overall logic is as follows: read the entire vCard file; each card, delimited by BEGIN:VCARD and END:VCARD, is put into a hashmap. Each hashmap entry is a list of vCards. The hash key is the N: entry, i.e., the concatenation of lastname and firstname. Once everything is hashed, walk through the hash. Hash entries whose list contains just one element can be output as is. Where the list contains more than one element, those entries would otherwise be merged, so their N: part is modified using the ORG: field.

use strict;
my @singleCard = ();    # all info between BEGIN:VCARD and END:VCARD
my ($name) = "";        # N: part, i.e., lastname semicolon firstname
my ($clashes,$line,$org) = (0,"","");
my %allCards = ();      # each entry is a list of single cards belonging to the same first and last name, so hash of array of array

while (<>) {
        if (/BEGIN:VCARD/) {
                ($name,@singleCard) = ("", ());
                push @singleCard, $_;
        } elsif (/END:VCARD/) {
                push @singleCard, $_;
                push @{ $allCards{$name} }, [ @singleCard ];
        } else {
                push @singleCard, $_;
                $name = $_ if (/^N:/);
        }
}

for $name (keys %allCards) {
        $clashes = $#{$allCards{$name}};
        for my $sglCrd (@{$allCards{$name}}) {
                if ($clashes == 0) {
                        for $line (@{$sglCrd}) { print $line; }
                } else {
                        $org = "";
                        for $line (@{$sglCrd}) {
                                $org = $1 if ($line =~ /^ORG:([ \-\+\w]+)/);
                        }
                        for $line (@{$sglCrd}) {
                                $line =~ s/;/ \/${org}\/;/ if ($line =~ /^N:/);
                                print $line;
                        }
                }
        }
}

Every lastname is appended with “/organization/” if the combination of firstname and lastname is not unique. For example, two records with Peter Miller in ABC-Corp and XYZ-Corp, will be written as N:Miller /ABC-Corp/;Peter and N:Miller /XYZ-Corp/;Peter.

This way Simple Mobile Tools Contacts will not merge records together which it shouldn’t. Issue #446 for this is on GitHub.

Performance Comparison Pallene vs. Lua 5.1, 5.2, 5.3, 5.4 vs. C

This post has moved to eklausmeier.goip.de/blog/2020/05-14-performance-comparison-pallene-vs-lua-5-1-5-2-5-3-5-4-vs-c.

Installing Pallene is described in the previous post: Installing Pallene Compiler. In this post we test the performance of Pallene versus C, Lua 5.4, and LuaJIT. Furthermore we benchmark different Lua versions starting with Lua 5.1 up to 5.4.

1. Array Access. I checked a similar program as in Performance Comparison C vs. Lua vs. LuaJIT vs. Java.

function lua_perf(N:integer, S:integer)
        local t:{ {a:float, b:float, f:float} } = {}

        for i = 1, N do
                t[i] = {
                        a = 0.0,
                        b = 1.0,
                        f = i * 0.25
                }
        end

        for j = 1, S-1 do
                for i = 1, N-1 do
                        t[i].a = t[i].a + t[i].b * t[i].f
                        t[i].b = t[i].b - t[i].a * t[i].f
                end
                --io_write( t[1].a )
        end
end

This program, which does no I/O at all, runs in 0.14s and is therefore two times slower than LuaJIT, which finishes in 0.07s. This is somewhat disappointing. Lua 5.4, as bundled with Pallene, needs 0.75s. So Pallene is roughly five times faster than Lua.

Installing Pallene Compiler

This post has moved to eklausmeier.goip.de/blog/2020/05-12-installing-pallene-compiler.

Pallene is a Lua-based language. In contrast to Lua, which is dynamically typed, Pallene is statically typed. A good paper on Pallene is "Pallene: A companion language for Lua", by Hugo Musso Gualandi and Roberto Ierusalimschy.

From above paper:

The compiler itself is quite conventional. After a standard parsing step, it converts the program to a high-level intermediate form and from that it emits C code, which is then fed into a C compiler such as gcc.

From “A gradually typed subset of a scripting language can be simple and efficient”:

Pallene was designed for performance, and one fundamental part of that is that its compiler generates efficient machine code. To simplify the implementation, and for portability, Pallene generates C source code instead of directly generating assembly language.

So, very generally, this idea is similar to f2c (Fortran to C), cobc (Cobol compiler), or Lush (Lisp Universal SHell).

The whole Pallene compiler is implemented in less than 7 kLines of Lua, and less than 1 kLines of C source code for the runtime.

To install Pallene compiler you need git, gcc, lua, and luarocks. Description is for Linux. MacOS is very similar.

1. Source. Fetch source code via git clone.

$ git clone https://github.com/pallene-lang/pallene.git 
Cloning into 'pallene'...
$ cd pallene

2. Rocks. Fetch required Lua rocks via luarocks command.

$ luarocks install --local --only-deps pallene-dev-1.rockspec
Missing dependencies for pallene dev-1:                                                                                                                 
   lpeglabel >= 1.5.0 (not installed)                                                                                                                   
   inspect >= 3.1.0 (not installed)                                                                                                                     
   argparse >= 0.7.0 (not installed)                                                                                                                    
   luafilesystem >= 1.7.0 (not installed)                                                                                                               
   chronos >= 0.2 (not installed)                                                                                                                       
                                                                                                                                                        
pallene dev-1 depends on lua ~> 5.3 (5.3-1 provided by VM)                                                                                              
pallene dev-1 depends on lpeglabel >= 1.5.0 (not installed)                                                                                             
Installing https://luarocks.org/lpeglabel-1.6.0-1.src.rock                                                                                              
                                                                                                                                                        
lpeglabel 1.6.0-1 depends on lua >= 5.1 (5.3-1 provided by VM)
gcc -O2 -fPIC -I/usr/include -c lpcap.c -o lpcap.o
gcc -O2 -fPIC -I/usr/include -c lpcode.c -o lpcode.o
gcc -O2 -fPIC -I/usr/include -c lpprint.c -o lpprint.o
gcc -O2 -fPIC -I/usr/include -c lptree.c -o lptree.o
gcc -O2 -fPIC -I/usr/include -c lpvm.c -o lpvm.o
gcc -shared -o lpeglabel.so lpcap.o lpcode.o lpprint.o lptree.o lpvm.o
No existing manifest. Attempting to rebuild...
lpeglabel 1.6.0-1 is now installed in /home/klm/.luarocks (license: MIT/X11) 

pallene dev-1 depends on inspect >= 3.1.0 (not installed)
Installing https://luarocks.org/inspect-3.1.1-0.src.rock

inspect 3.1.1-0 depends on lua >= 5.1 (5.3-1 provided by VM)
inspect 3.1.1-0 is now installed in /home/klm/.luarocks (license: MIT <http://opensource.org/licenses/MIT>)

pallene dev-1 depends on argparse >= 0.7.0 (not installed)
Installing https://luarocks.org/argparse-0.7.0-1.all.rock

argparse 0.7.0-1 depends on lua >= 5.1, < 5.4 (5.3-1 provided by VM)
argparse 0.7.0-1 is now installed in /home/klm/.luarocks (license: MIT)

pallene dev-1 depends on luafilesystem >= 1.7.0 (not installed)
Installing https://luarocks.org/luafilesystem-1.8.0-1.src.rock

luafilesystem 1.8.0-1 depends on lua >= 5.1 (5.3-1 provided by VM)
gcc -O2 -fPIC -I/usr/include -c src/lfs.c -o src/lfs.o
gcc -shared -o lfs.so src/lfs.o
luafilesystem 1.8.0-1 is now installed in /home/klm/.luarocks (license: MIT/X11)

pallene dev-1 depends on chronos >= 0.2 (not installed)
Installing https://luarocks.org/chronos-0.2-4.src.rock

chronos 0.2-4 depends on lua >= 5.1 (5.3-1 provided by VM)
gcc -O2 -fPIC -I/usr/include -c src/chronos.c -o src/chronos.o -I/usr/include
gcc -shared -o chronos.so src/chronos.o -L/usr/lib -Wl,-rpath,/usr/lib -lrt
chronos 0.2-4 is now installed in /home/klm/.luarocks (license: MIT/X11)

Stopping after installing dependencies for pallene dev-1

3. Environment variables. Make sure that you source the environment variables given by

luarocks path

For example:

export LUA_PATH='/usr/share/lua/5.3/?.lua;/usr/share/lua/5.3/?/init.lua;/usr/lib/lua/5.3/?.lua;/usr/lib/lua/5.3/?/init.lua;./?.lua;./?/init.lua;/home/klm/.luarocks/share/lua/5.3/?.lua;/home/klm/.luarocks/share/lua/5.3/?/init.lua'
export LUA_CPATH='/usr/lib/lua/5.3/?.so;/usr/lib/lua/5.3/loadall.so;./?.so;/home/klm/.luarocks/lib/lua/5.3/?.so'
export PATH='/home/klm/.luarocks/bin:/usr/bin:/home/klm/bin:...:.

4. Build Lua and runtime. Build Lua and the Pallene runtime (you are still in the pallene directory):

make linux-readline

Some warnings will show up for Lua, but they can be ignored for now.

5. Run compiler. Now you can run pallenec, provided you still are in the same directory, where you built pallene.

$ ./pallenec   
Usage: pallenec [-h] [--emit-c] [--emit-asm] [--compile-c]
       [--dump {parser,checker,ir,uninitialized,constant_propagation}]
       <source_file>

Error: missing argument 'source_file'

6. Run example. Now check one of the examples.

$ pallenec examples/factorial/factorial.pln
$ ./lua/src/lua -l factorial examples/factorial/main.lua 
The factorial of 5 is 120.

The most common error is not using the lua/src/lua command built by Pallene, but rather the system-wide one.

You can compile all examples and benchmarks:

for i in examples/*/*.pln; do pallenec $i; done
for i in benchmark/*/*.pln; do pallenec $i; done

Things to note in Pallene:

  1. Array indexes must start at one
  2. Pallene source code, except type definitions, record definitions, or variable definitions, must be within a function
  3. Pallene offers no goto statement. The goto statement was added in Lua 5.2.

Performance Comparison in Computing Exponential Function

This post has moved to eklausmeier.goip.de/blog/2020/05-05-performance-comparison-in-computing-exponential-function.

If your computation is dominated by exponential function evaluations, then it makes a significant difference whether you evaluate the exponential function exp() in single precision or in double precision. You can reduce your computing time by roughly 25% when moving from double precision (double) to single precision (float). Evaluation in quadruple precision is more than six times more expensive than evaluation in double precision.

Changing from double precision to single precision also halves the amount of storage needed. On x86_64 Linux float usually occupies 4 bytes, double occupies 8 bytes, and long double needs 16 bytes.

1. Result. Here are the runtime numbers of a test program.

  1. Single precision (float): 2.44s
  2. Double precision (double): 3.32s
  3. Quadruple precision (long double): 22.88s

These numbers are dependent on CPU internal scheduling, see CPU Usage Time Is Dependant on Load.

2. Test program. The test program is essentially as below:

long i, rep=1024, n=65000;
int c, precision='d';
float sf = 0;
double sd = 0;
long double sq = 0;
...
switch(precision) {
case 'd':
        while (rep-- > 0)
                for (i=0; i<n; ++i)
                        sd += exp(i % 53) - exp((i+1) % 43) - exp((i+2) % 47) - exp((i+3) % 37);
        printf("sd = %f\n",sd);
        break;
case 'f':
        while (rep-- > 0)
                for (i=0; i<n; ++i)
                        sf += expf(i % 53) - expf((i+1) % 43) - expf((i+2) % 47) - expf((i+3) % 37);
        printf("sf = %f\n",sf);
        break;
case 'q':
        while (rep-- > 0)
                for (i=0; i<n; ++i)
                        sq += expl(i % 53) - expl((i+1) % 43) - expl((i+2) % 47) - expl((i+3) % 37);
        printf("sq = %Lf\n",sq);
        break;
}

Full source code is in GitHub, file in question is called exptst.c.

3. Environment. AMD Bulldozer FX-8120, 3.1 GHz, Arch Linux 5.6.8, gcc version 9.3.0. The code was compiled with -O3 -march=native.

J-Pilot Plugin For SQLite Export

This post has moved to eklausmeier.goip.de/blog/2020/04-29-j-pilot-plugin-for-sqlite-export.

In SQL Datamodel For J-Pilot I described the SQLite datamodel. I wrote a J-Pilot plugin which can export the below entities and write them to an SQLite database file. The direction is one-way: from J-Pilot to SQLite.

  1. Address
  2. Datebook
  3. Memo
  4. To-Do
  5. Expense
  6. Various categories for above entities

Adding more entities is pretty easy. For example, if people need the Calendar Palm database exported, this can be implemented quickly. We use the usual SQLite API: sqlite3_exec() for one-shot statements, plus sqlite3_prepare(), sqlite3_bind(), sqlite3_step(), and finally sqlite3_finalize().

The general mechanics of a J-Pilot plugin are described by Judd Montgomery, the author of J-Pilot, in this document. I took the Expense/expense.c source code from the Expense plugin as a guide.

The plugin provides the following functionality:

  1. Create new database from scratch, it is called jptables.db
  2. Export above mentioned entities
  3. In debug mode you can use J-Pilot‘s search to search in the SQLite database

If you call jpilot -d then debug-mode is activated.

Installation.

  1. Compile the single source code file jpsqlite.c
  2. Copy the library (.so file) into the plugin directory ($HOME/.jpilot/plugins)
  3. Copy the datamodel SQL file jptables.sql into the plugin directory

Compilation is with below command:

gcc `pkg-config --cflags-only-I gtk+-2.0` -I <J-Pilot src dir> -s -fPIC -shared jpsqlite.c -o libjpsqlite.so -lsqlite3

For this to work you need the Pilot-Link header files and the J-Pilot (AUR) source code at hand.

Running the plugin: go to the plugins menu via the main menu or a function key (F7 in my case), then press the SQL button. All previous data in the database is completely erased, then all data is written to the database within a single transaction.

In debug mode and in debug mode only, the “J-Pilot search” also searches through all entities in the SQLite database.

The long-term goal is that SQLite is the internal data structure for J-Pilot, thereby abandoning the binary files entirely.

java.sql.SQLRecoverableException: IO Error: Connection reset by peer, Authentication lapse

This post has moved to eklausmeier.goip.de/blog/2019/01-19-java-sql-sqlrecoverableexception-io-error-connection-reset-by-peer-authentication-lapse.

I encountered the following error, when I wanted to connect to Oracle v12.2.0.1.0 database with Java 1.8.0_192-b26:

java.sql.SQLRecoverableException: IO Error: Connection reset by peer, Authentication lapse 321631 ms.

This was unexpected, as the same program ran absolutely fine on another Linux machine. The program in question is:

import java.sql.Connection;
import java.sql.SQLException;

import oracle.jdbc.pool.OracleDataSource;

public class OraSample1 {

        public static void main (String argv[]) {
                System.out.println("Starting...");
                try {
                        OracleDataSource ds = new OracleDataSource();
                        ds.setURL("jdbc:oracle:thin:@nuc:1521:orcl");
                        Connection conn=ds.getConnection("c##klm","klmOpRisk");
                        System.out.println("Connected");
                } catch (SQLException e) {
                        System.out.println(e);
                }
        }

}

Solution: Add the following property setting to the command line:

java -Djava.security.egd=file:/dev/urandom OraSample1

Also see “java.sql.SQLException: I/O Error: Connection reset” in linux server [duplicate] on Stackoverflow.
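The entropy source can also be set programmatically, provided it happens before any code initializes SecureRandom; the command-line flag remains the more reliable choice, since the property must be in place before the security provider starts up. A minimal sketch (class name EgdDemo is illustrative):

```java
public class EgdDemo {
    public static void main(String[] args) {
        // Must run before anything touches java.security's SecureRandom,
        // otherwise the default entropy source is already chosen
        System.setProperty("java.security.egd", "file:/dev/urandom");
        System.out.println(System.getProperty("java.security.egd")); // prints "file:/dev/urandom"
    }
}
```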

Passing HashMap from Java to Java Nashorn

This post has moved to eklausmeier.goip.de/blog/2018/07-22-passing-hashmap-from-java-to-java-nashorn.

Java Nashorn is the JavaScript engine shipped since Java 8. You can therefore use JavaScript wherever you have at least Java 8. Java 8 also has a standalone interpreter, called jjs.
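Whether a Nashorn engine is actually available on a given JVM can be checked by enumerating the registered script engines; a small sketch (class name ListEngines is illustrative):

```java
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

public class ListEngines {
    public static void main(String[] args) {
        // On Java 8-14 this list includes "Oracle Nashorn"; on JDK 15+ it may be empty
        for (ScriptEngineFactory f : new ScriptEngineManager().getEngineFactories())
            System.out.println(f.getEngineName() + ": " + f.getNames());
    }
}
```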

It is possible to create a Java HashMap and use this structure directly in JavaScript. Here is the code:

import java.util.*;
import java.io.*;
import javax.script.*;


public class HashMapDemo {

        public static void main(String[] args) {
                HashMap<String,Double> hm = new HashMap<String,Double>();

                hm.put("A", new Double(3434.34));
                hm.put("B", new Double(123.22));
                hm.put("C", new Double(1200.34));
                hm.put("D", new Double(99.34));
                hm.put("E", new Double(-19.34));

                for( String name: hm.keySet() )
                        System.out.println(name + ": "+ hm.get(name));

                // Increase A's balance by 1000
                double balance = ((Double)hm.get("A")).doubleValue();
                hm.put("A", new Double(balance + 1000));
                System.out.println("A's new account balance : " + hm.get("A"));

                // Call JavaScript from Java
                try {   
                        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
                        engine.eval("print('Hello World');");
                        engine.eval(new FileReader("example.js"));
                        Invocable invocable = (Invocable) engine;
                        Object result = invocable.invokeFunction("sayHello", "John Doe");
                        System.out.println(result);
                        System.out.println(result.getClass());

                        result = invocable.invokeFunction("prtHash", hm);
                        System.out.println(result);
                } catch (FileNotFoundException | NoSuchMethodException | ScriptException e) {
                        e.printStackTrace();
                        System.out.println(e);
                }

        }
}

And here is the corresponding JavaScript file example.js:

var sayHello = function(name) {
        print('Hello, ' + name + '!');
        return 'hello from javascript';
};

var prtHash = function(h) {
        print('h.A = ' + h.A);
        print('h.B = ' + h["B"]);
        print('h.C = ' + h.C);
        print('h.D = ' + h["D"]);
        print('h.E = ' + h.E);
};

Output is:

$ java HashMapDemo
A: 3434.34
B: 123.22
C: 1200.34
D: 99.34
E: -19.34
A's new account balance : 4434.34
Hello World
Hello, John Doe!
hello from javascript
class java.lang.String
h.A = 4434.34
h.B = 123.22
h.C = 1200.34
h.D = 99.34
h.E = -19.34
null

The above example uses sample code from:

  1. Riding the Nashorn: Programming JavaScript on the JVM
  2. Simple example for Java HashMap
  3. Nashorn: Run JavaScript on the JVM

Decisive was this statement from https://winterbe.com/posts/2014/04/05/java8-nashorn-tutorial/:

Java objects can be passed without losing any type information on the javascript side. Since the script runs natively on the JVM we can utilize the full power of the Java API or external libraries on nashorn.

The above program works the same if one changes HashMap<String,Double> to HashMap<String,Object> and populates it accordingly, e.g.:

                HashMap<String,Object> hm = new HashMap<String,Object>();

                hm.put("A", new Double(3434.34));
                hm.put("B", new String("Test"));
                hm.put("C", new Date(5000));
                hm.put("D", new Integer(99));
                hm.put("E", new Boolean(Boolean.TRUE));

Output from JavaScript would then be:

h.A = 4434.34
h.B = Test
h.C = Thu Jan 01 01:00:05 CET 1970
h.D = 99
h.E = true

Entries changed in JavaScript can be returned to Java. Assume the JavaScript program changes the values:

var prtHash = function(h,hret) {
        hret.U = 57;
        hret.V = "Some text";
        hret.W = false;
};

Then these changed arguments can be used back in the Java program:

HashMap<String,Object> hret = new HashMap<String,Object>();

result = invocable.invokeFunction("prtHash", hm, hret);
System.out.println(result);
System.out.println("hret.U = " + hret.get("U"));
System.out.println("hret.V = " + hret.get("V"));
System.out.println("hret.W = " + hret.get("W"));

Output is then:

hret.U = 57
hret.V = Some text
hret.W = false
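The round trip works because the HashMap is handed to the engine by reference, so mutations made inside the script are visible to the caller afterwards. The same mechanism in plain Java, with a hypothetical mutate method standing in for the JavaScript prtHash:

```java
import java.util.HashMap;
import java.util.Map;

public class MutateDemo {
    // Stands in for the JavaScript prtHash: it mutates the map it receives
    static void mutate(Map<String, Object> hret) {
        hret.put("U", 57);
        hret.put("V", "Some text");
        hret.put("W", false);
    }

    public static void main(String[] args) {
        Map<String, Object> hret = new HashMap<>();
        mutate(hret);                                    // callee sees the same map object
        System.out.println("hret.U = " + hret.get("U")); // prints "hret.U = 57"
    }
}
```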

Added 09-Dec-2020: Since JDK 15, Nashorn is no longer available; it is now necessary to use GraalVM. Here is a Migration Guide from Nashorn to GraalVM JavaScript.

Using Scooter Software Beyond Compare

This post has moved to eklausmeier.goip.de/blog/2018/05-17-using-scooter-software-beyond-compare.

Beyond Compare is a graphical file comparison tool sold by Scooter Software. Its open-source competitors are mainly vimdiff and kdiff3. Its advantage is ease of use: files can be edited instantly while comparing them, and you can diff complete directory trees.

It is written in Delphi Object Pascal; the source code is not open-source. It runs on Windows, x86 Linux, and OS X. It does not run on ARM devices such as the Raspberry Pi or Odroid, see support for arm processors – like the raspberry pi. The “Standard Edition” costs $30, the “Pro Edition” costs $60. The software is in AUR.

1. Root User Problem. When using it as the root user you must set:

export QT_GRAPHICSSYSTEM=native
bcompare

When running

DIFFPROG=bcompare pacdiff

the screen looks like this:

2. Git Usage. To use Beyond Compare with git difftool you have to do two things: First, create a symlink bc3 for bcompare:

[root /bin]# ln -s bcompare bc3

Second, add the following lines to your ~/.gitconfig file:

[diff]
        tool = bc3
[difftool]
        prompt = false
[difftool "bc3"]
        trustExitCode = true
[merge]
        tool = bc3
[mergetool "bc3"]
        trustExitCode = true

As an alternative to the above changes in the ~/.gitconfig file, use the following commands:

git config --global diff.tool bc3
git config --global difftool.bc3.trustExitCode true
git config --global merge.tool bc3
git config --global mergetool.bc3.trustExitCode true