Rewriting Perl to plain C

Perl script was running too slow. Rewriting it in C made it 20 times faster.

1. Problem statement. Analyze call-tree dependency in COBOL programs. There are 77 million lines of COBOL code in ca. 30,000 files. These 30,000 COBOL programs could potentially include 74,000 COPY-books comprising 10 million lines of additional code. COBOL COPY-books are analogous to C header-files. So in total there are around 88 million lines of COBOL code. Just for comparison: the Linux kernel has ca. 20 million lines of code.

COBOL program analysis started with a simple Perl script. This Perl script is less than 200 lines, including comments. This script produced the desired dependency information.

Wading through all this COBOL code took up to 25 minutes in serial mode, and 13 minutes using 4 cores on an HP EliteBook notebook using Intel Skylake i7-6600U clocked 2.8 GHz. It took 36 minutes on an AMD FX-8120 clocked with 3.1 GHz. This execution time was deemed too long to see any changes in the output changing something in the Perl script. All runs are on Arch Linux 4.14.11-1 SMP PREEMPT.

2. Result. Rewriting the Perl script in C resulted in a speed improvement of factor 20 when run in serial mode, i.e., run times are now 110s on one core. It runs in 32s when using 8 cores on an AMD FX-8120. C program uses taylormade hashing routines.
Continue reading

Performance of Dalvik versus native C compilation

Android application programs are usually written in the Java programming language. Java source code is compiled to bytecode, which is then interpreted by the Java virtual machine. Current virtual machines (VM for short) use two tricks to improve performance:

  1. compile bytecode to machine code during run-time (JIT: just-in-time compilation)
  2. optimize machine code according usage pattern (hotspot technology)

Despite all these tricks native code produced by C/C++ compilers is ways faster than JIT/hotspot, see for example shootout from 2008, or see The Computer Language Benchmarks Game.

Continue reading

Copy Directories with Symbolic Links via ssh

Although probably known in most circles it is worth repeating that scp by itself does not honor symbolic links. To overcome this limitation just combine tar and ssh, i.e., tar on sending side, untar on receiving side:

tar cf - /src/dir | ssh remotehost "cd /dst/dir ; tar xf -"

Usually this is even faster than using scp, as now only big chunks of data are transfered via TCP. Expect an almost twofold performance increase for larger directories which contain a couple of small files.

See also commandlinefu.

Hash functions: An empirical comparison — article by Peter Kankowski

Peter Kankowski wrote a very interesting article on hashing functions. It compares a number of current hash functions and conducts some performance benchmarks.

  1. iSCSI CRC
  2. Meiyan
  3. Murmur2
  4. XXHfast32
  5. SBox (dead link:
  6. Larson
  7. XXHstrong32
  8. Sedgewick
  9. Novak unrolled
  10. CRC-32
  11. Murmur3
  12. x65599
  13. FNV (Fowler–Noll–Vo) hash
  14. Murmur2A
  15. Fletcher
  16. Kernighan & Ritchie
  17. Paul Hsieh
  18. Bernstein
  19. x17 unrolled
  20. lookup3
  21. MaPrime2c
  22. Ramakrishna
  23. One At Time
  24. Arash Partow
  25. Weinberger
  26. Hanson

A more complicated algorithm does not necessarily mean better performance. So the classical Kernighan & Ritchie hash still performs quite well.