Angry Swarm: 2014

2014-09-27

Another day in my love affair with AWK

I consider myself a C/C++ developer. Right now I am embracing C++11 (I wanted to wait till it is actually well supported by compilers) and I am loving it.

Despite my happy relationship with C/C++ I have maintained a torrid affair with AWK for many years, which has spilled into this blog before:

Almost a year ago I concluded that MAWK is freakin' fast and GNU AWK freakin' fast as a snail
The past summer I stumbled over a bottleneck in the one-true-AWK, default for *BSD and Mac OS-X

A Matter of Accountability

So far circumstances dictated that either the script or the input data or both had to be kept secret. In this post both will be publicly available. The purpose of this post is to give people the chance to perform their own tests.

The following is required to perform the test:

The dbc2c.awk script was already part of my first post. It parses Vector DBC (Database CAN) files, an industry standard for describing a set of devices, messages and signals for the real time bus CAN (one can argue it's soft real time, it depends). It does the following things:

Parse data from 1 or more input files
Store the data in arrays, use indexes as references to describe relationships
Output the data
1. Traverse the data structure and store attributes of objects in an array
2. Read a template
3. Insert data into the template and print on stdout

Test Environment

The operating system:
FreeBSD AprilRyan.norad 10.1-BETA2 FreeBSD 10.1-BETA2 #0 r271856: Fri Sep 19 12:55:39 CEST 2014 root@AprilRyan.norad:/usr/obj/S403/amd64/usr/src/sys/S403 amd64
The compiler:
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
Target: x86_64-unknown-freebsd10.1
Thread model: posix
CPU: Core i7@2.4GHz (Haswell)
NAWK version: awk version 20121220 (FreeBSD)
MAWK version: mawk 1.3.4.20140914
GNU AWK version: GNU Awk 4.1.1, API: 1.1

Tests

With the recent changeset 219:01114669a8bf, the script switched from using array iteration (for (index in array) { … }) to creating a numbered index for each object type and iterate through them in order of creation to make sure data is output in the same order with every AWK implementation. This makes it much easier to compare and validate outputs from different flavours of AWK.

To reproduce the tests, run:

time -l awk -f scripts/dbc2c.awk -vDATE=whenever j1939_utf8.dbc | sha256

The checksum for the output should read:

9f0a105ed06ecac710c20d863d6adefa9e1154e9d3a01c681547ce1bd30890df

Here are my runtime results:

6.23 s

6.32 s

6.27 s

11.79 s

11.88 s

11.80 s

1.98 s

2.02 s

1.97 s

Memory usage (maximum resident set size):

22000 k

50688 k

26644 k

Conclusion

Once again the usual order of things establishes itself. GNU AWK wastes our time and memory while MAWK takes the winner's crown and NAWK keeps to the middle ground.

The dbc2c.awk script has been tested before and GNU AWK actually performs much better this time, 6.0 instead of 9.6 time slower than MAWK. Maybe just parsing one file instead of 3 helps or the input data produces less collisions for the hashing algorithm (AWK array indexes are always cast to string and stored in hash tables).

In any way I'd love to see some more benchmarks out there. And maybe someone bringing their favourite flavour of AWK to the table.

2014-07-03

AWK Reloaded

Last year I compared the performance of 3 AWK interpreters, NAWK, GAWK and MAWK. For the test I used 3 of my .awk scripts (available under Beerware). But the data I processed with them was confidential. Any way, NAWK won 1/3, MAWK 2/3 (with astonishing leads), GAWK was the clear looser with abysmal performance in 2/3 tests.

Recently I developed a run time interpreter for the Heidenhain NC (Numerical Control) language. The code of the script as well as the program it interprets are confidential, unfortunately. But the results are interesting nonetheless.

Tests

Test Environment

Unfortunately I cannot run the tests on the same machine as last time, it was stolen during a visit in the UK last winter.

So this time the tests are run on its replacement, an Intel Haswell Core i-7 (2 cores, 4 pipelines) at 2.4 GHz under FreeBSD 10 r267867 (amd64).

AWK Versions

nawk 20121220 (FreeBSD)
gawk 4.1.1
mawk 1.3.4

hhrti.awk [1 pt/s]

This is a test run with the aforementioned run time interpreter. There is a more in depth explanation at the end of this article.

150.43 s

149.41 s

149.29 s

67.40 s

67.86 s

67.61 s

48.97 s

47.48 s

48.02 s

Memory usage (maximum resident set size):

2864 k

4240 k

2760 k

xml.awk [100 pt/s]

This is one of the tests run last time, to confirm that the interpreters still compare similarly with the previous scripts, despite the updated test platform. That seems to be the case here.

0.16 s

0.16 s

0.16 s

0.57 s

0.57 s

0.58 s

0.02 s

0.02 s

0.02 s

Conclusions

Consistently with the previous performance tests, MAWK takes the lead. What is surprising that GAWK performs well with only 1.38 times the run time of MAWK, which is a far cry from the abysmal performance it exhibited in some of the other tests. A quick rerun of the previous tests shows the same performance gaps as before, so neither the slight version changes nor the new compiler version (clang 3.4.1) introduced a performance boost in GAWK.

The real surprise is the performance of NAWK. This is the first test case where it performs worse than GAWK, with a runtime factor of 3.0. That's a far cry from GAWK's sad >25 in the xml.awk case, but still it hints to a bottleneck in NAWK.

Differences to Previous Tests

This test is a lot less array heavy than the dbc2c.awk and xml.awk test cases. The parsing stage barely takes any time after that only small local arrays are used for temporary tokenizing. The most time consuming operation seems to be evaluating arithmetic, because whenever an operation is performed it creates a number of copy operations. Depending on the operator all following tokens need to be shifted one or two places.

Bottleneck Test

In order to verify the assumption that copies in arrays might be responsible I created a small script that performs this operation repeatedly:

BEGIN {
 TXT = "111 222 333 444 555 666 777 888 999 000"
 REPEAT = 100000
 if (ARGC > 1) {
  REPEAT = ARGV[--ARGC]
  delete ARGV[ARGC]
 }
 srand() # Seed

 # Perform the test this many times
 for (i = 1; i <= REPEAT; i++) {

  # Create an array with tokens
  len = split(TXT, a)
  a[0] = len # Store the length in index 0, this is very
             # convenient in real apps with lots of arrays

  # Test case, delete a random field until none
  # are left
  while (a[0]) {
   # Select a random entry to delete
   del = int(rand() * 65536) % a[0] + 1

   # Shift the following tokens left
   for (p = del; p < a[0]; p++) {
    a[p] = a[p + 1]
   }
   # Delete the tail
   delete a[a[0]--]
  }
 }
}

bottleneck.awk [10 pt/s]

This artificial test seems to confirm this, by reproducing the same performance pattern and amplifying the performance problem of NAWK. The script was run with 200000 repetitions.

12.24 s

12.28 s

12.17 s

2.29 s

2.29 s

2.28 s

1.68 s

1.63 s

1.69 s

Memory usage (maximum resident set size):

2488 k

3596 k

2476 k

The Heidenhain Real Time Interpreter

The Heidenhain NC language can be used to control an NC mill. I.e. access various functions of the machine, such as cooling systems, automatic tool changers and provide milling instructions. Additionally it has programming instructions that can be used to make on the fly calculations and decisions. The purpose of the interpreter is to perform arithmetic and conditional flow in advance.

The need for this arose with a program written for a research project, which is so computation heavy that it causes the machine to stutter.

The interpreter works in two stages, a code parsing stage and an evaluation stage.

Parsing

In this stage every command is stored in a one-way linked list. Additional code files may be called within a program, those are parsed after the current file has been completed and appended to the same list.

The Heidenhain NC language has several kinds of commands, most of these are pretty static, they access machine functions, or describe target coordinates or curves. These kinds of commands are what the interpreter outputs in the evaluation stage.

The other kind of commands provide arithmetic and program flow:

Variable assignments
Arithmetic expressions (really part of variable assignments)
Labels
Label calls
Conditional label calls (i.e. IF)
Program calls

The list is not complete, but it should get the idea across.

Every list entry is classified during parsing stage, and some are preprocessed. E.g. labels and subprogram entries are recorded in an associative array so they can be branched to in the evaluation stage.

Evaluation

In this stage the interpretation is performed. The program starts with an empty call stack at the first parsed code line. Each line is evaluated according to its classification.

Variable assignments are evaluated and stored in an associative array containing name, value pairs
Label calls are performed
Conditions are evaluated and either branch to a label, that is fetched from the array recorded during parsing, or continue with the next line
Calls to other programs cause a reference to the current code line to be pushed on the stack
Ends of programs cause a call back to the code line recorded on the stack, if the stack is empty the interpreter terminates
…

Every command that is not classified for special treatment receives the following default treatment:

Substitute variables with their current values
Output the command

The result is a flat NC program that does no longer contain arithmetic and conditional code. E.g:

0    BEGIN PGM mandelbrot_kante MM
1    BLK FORM 0.1 Z X+0 Y-90.0000 Z-50
2    BLK FORM 0.2 X+220.0000 Y+90.0000 Z+0
3    TOOL CALL 4 Z S32000 F8000
4    M3
5    L Z+20 FMAX
6    L X+110.0000 Y-52.2387 FMAX
7    L Z+2 FMAX
8    L Z-0.5000 F800
9    L X+110.2000 Y-52.2387 F8000
[...]
7758 L X+109.4000 Y-52.8387 F8000
7759 L X+109.4000 Y-52.6387 F8000
7760 L X+109.4000 Y-52.4387 F8000
7761 L X+109.4000 Y-52.2387 F8000
7762 L X+109.6000 Y-52.2387 F8000
7763 L X+109.8000 Y-52.2387 F8000
7764 L X+110.0000 Y-52.2387 F8000
7765 L X+110.2000 Y-52.2387 F8000
7766 L Z+50 FMAX
7767 END PGM mandelbrot_kante MM

2014-04-01

geli suspend/resume with Fulll Disk Encryption

This article has been updated 2014-04-01. Changes are marked with this background colour.

This article details my solution of the geli resume deadlock. It is the result of much fiddling and locking myself out of the file system.

The presented solution works most of the time, but it is still possible to lock up the system so far that VT-switching is no longer possible.

After my good old HP6510b notebook was stolen I decided to set up full disk encryption for its replacement. However after I set it up I faced the problem that the device would be wide open after resuming from suspend. That said I rarely reboot my system, I usually keep everything open permanently and suspend the laptop for transport or extended non-use. So the problem is quite severe.

Luckily the FreeBSD encryption solution geli(8) provides a mechanism called geli suspend that deletes the key from memory and stalls all processes trying to access the file system. Unfortunately geli resume would be one such process.

The System

So first things first, a quick overview of the system. If you ever set up full disk encryption yourself, you can probably skip ahead.

The boot partition containing the boot configuration, the kernel and its modules is not encrypted. It resides in the device ada0p2 labelled gpt/6boot. The encrypted device is ada0p4 labelled 6root. For easy maintenance and use the 6boot:/boot directory is mounted into 6root.eli:/boot (the .eli marks an attached encrypted device). Because /boot is a subdirectory in the 6boot file system, a nullfs(5) mount is required to access 6boot:/boot and mount it into 6root:/boot. To access 6boot:/boot, 6boot is mounted into /mnt/boot.

Usually mount automatically loads the required modules when invoked, but this doesn't work when the root file system doesn't contain them. So the required modules need to be loaded during the loader stage.

/boot/loader.conf

# Encrypted root file systeme
vfs.root.mountfrom="ufs:gpt/6root.eli"
geom_eli_load="YES"                     # FS crypto
aesni_load="YES"                        # Hardware AES

# Allow nullfs mounting /boot
nullfs_load="YES"
tmpfs_load="YES"

/etc/fstab

# Device           Mountpoint   FStype Options    Dump Pass
/dev/gpt/6root.eli /            ufs    rw,noatime 1    1
/dev/gpt/6boot     /mnt/boot    ufs    rw,noatime 1    1
/mnt/boot/boot     /boot        nullfs rw         0    0
/dev/gpt/6swap.eli none         swap   sw         0    0
# Temporary files
tmpfs              /tmp         tmpfs  rw         0    0
tmpfs              /var/run     tmpfs  rw         0    0

The Problem

The problem with geli suspend/resume is that calling geli resume ada0p4 deadlocks, because geli is located on the partition that is supposed to be resumed.

The Approach

The solution is quite simple. Put geli somewhere unencrypted.

To implement this several challenges need to be faced:

Challenge	Approach
Programming	Shell-scripting
Technology, avoiding file system access	Use tmpfs(5)
Usability, how to enter passphrases	Use a system console
Safety, the solution needs to be running before a suspend	Use an always on, unauthenticated console
Security, an unauthenticated interactive service is prone to abuse	Only allow password entry, no other kinds of interactive control
Safety, what about accidentally terminating the script	Ignore SIGINT

The Script

The complete script can be found at the bottom.

Constants

At the beginning of the script some read-only variables (the closest available thing to constants) are defined, mostly for convenience and to avoid typos.

#!/bin/sh
set -f

readonly gcdir="/tmp/geliconsole"
readonly dyn="/sbin/geli;/usr/bin/grep;/bin/sleep;/usr/sbin/acpiconf"
readonly static="/rescue/sh"

Bootstrapping

The script is divided into two parts, the first part is the bootstrapping section that requires file system access and creates the tmpfs with everything that is needed to resume suspended partitions.

The bootstrap is performed in a conditional block, that checks whether the script is runnig from gcdir. It ends with calling a copy of the script. The exec call means the bootstrapping process is replaced with the new call. The copy of the script will detect that it is running from the tmpfs and skip the bootstrapping:

# If this process isn't running from the tmpfs, bootstrap
if [ "${0#${gcdir}}" == "$0" ]; then
 …
 # Complete bootstrap
 exec "${gcdir}/sh" "${gcdir}/${0##*/}" "$@"
fi

Before completing the bootstrap, the tmpfs needs to be set up. Creating it is a good start:

# Create tmpfs
/bin/mkdir -p "${gcdir}"
/sbin/mount -t tmpfs tmpfs "$gcdir" || exit 1

# Copy the script before changing into gcdir, $0 might be a
# relative path
/bin/cp "$0" "${gcdir}/" || exit 1

# Enter tmpfs
cd "${gcdir}" || exit 1

The next step is to populate it with everything that is needed. I.e. all binaries required after performing the bootstrap. Two kinds of binaries are used, statically linked (see the static read-only) and dynamically linked (see the dyn read-only).

The static binaries can simply be copied into the tmpfs, the dynamically linked ones also require libraries, a list of which is provided by ldd(1).

Note the use of IFS (Input Field Separator) to split variables into multiple arguments and how subprocesses are used to limit the scope of IFS changes.

# Get shared objects
(IFS='
'
 for lib in $(IFS=';';/usr/bin/ldd -f '%p;%o\n' ${dyn}); do
  (IFS=';' ; /bin/cp ${lib})
 done
)

# Get executables
(IFS=';' ; /bin/cp ${dyn} ${static} "${gcdir}/")

The resulting tmpfs contains the binaries sh, geli, sleep, grep, acpiconf and all required libraries.

Interactive Stage

When reaching the interactive stage, the script is already run by a static shell within the tmpfs. The first order of business is to make sure the shell won't look for executables outside the tmpfs:

export PATH="./"

The next step is to trap some signals to make sure the script exits gracefully and is not terminated by pressing CTRL+C:

 
trap 'echo geliconsole: Exiting' EXIT
trap "/sbin/umount -f '${gcdir}' ; exit 0" SIGTERM
trap '' SIGINT SIGHUP

The last stage is a while-true loop that checks for suspended partitions and calls geli resume.

echo "geliconsole: Activated"
while :; do
 if geli list | grep -qFx 'State: SUSPENDED'; then
  geom="$(geli list | grep -FxB1 'State: SUSPENDED')"
  geom="${geom#Geom name: }"
  geom="${geom%%.eli*}"
  echo "geliconsole: Resume $geom"
  geli resume "$geom"
  echo .
 else
  sleep 2
 fi
done

The System Console

Because the script does not take care of grabbing the right console, it cannot simply be run from /etc/ttys. Instead it needs to be started by getty(8). To do this a new entry into /etc/gettytab is required:

#
# geliconsole
#
geliconsole|gc.9600:\
 :al=root:tc=std.9600:lo=/root/bin/geliconsole:

The entry defines a new terminal type called geliconsole with auto login.

The new terminal can now be started by the init(8) process by adding the following line to /etc/ttys:

ttyvb "/usr/libexec/getty geliconsole" xterm on  secure

With kill -HUP 1 the init process can be notified of the change.

The console should now be available on terminal 11 (CTRL+ALT+F12) and look similar to this:

FreeBSD/amd64 (AprilRyan.norad) (ttyvb)

geliconsole: Activated

Suspending

In order to automatically suspend disks update /etc/rc.suspend:

…
/usr/bin/logger -t $subsystem suspend at `/bin/date +'%Y%m%d %H:%M:%S'`
/bin/sync && /bin/sync && /bin/sync
/bin/rm -f /var/run/rc.suspend.pid
/usr/sbin/vidcontrol -s 12 < /dev/ttyv0 > /dev/ttyv0
/sbin/geli suspend -a
# The following delay may be reduced, by how much depends on the system, I am using a 1 second delay.
/tmp/geliconsole/sleep 3

if [ $subsystem = "apm" ]; then
 /usr/sbin/zzz
else
 # Notify the kernel to continue the suspend process
 /tmp/geliconsole/acpiconf -k 0
fi

exit 0

The vidcontrol command VT-switches to the geli console, before the geli command suspends all encrypted partitions. They can be recovered by pressing CTRL+ALT+F12 to enter the console and entering the passphrase there.

In order for the VT-switch to work without flaw, the automatic VT switch to console 0 needs to be turned off:

# sysctl hw.syscons.sc_no_suspend_vtswitch=1
# echo hw.syscons.sc_no_suspend_vtswitch=1 >> /etc/sysctl.conf

Desirable Improvements

For people running X, especially with a version where X breaks the console (like is currently the case with KMS support), it would be nice to enter the keywords through a screen locker.

Also it is not really necessary to run the script with root privileges. A dedicated, less privileged user account, should be created and used.

Files

/root/bin/geliconsole

#!/bin/sh
set -f

readonly gcdir="/tmp/geliconsole"
readonly dyn="/sbin/geli;/usr/bin/grep;/bin/sleep;/usr/sbin/acpiconf"
readonly static="/rescue/sh"

# If this process isn't running from the tmpfs, bootstrap
if [ "${0#${gcdir}}" == "$0" ]; then
 # Create tmpfs
 /bin/mkdir -p "${gcdir}"
 /sbin/mount -t tmpfs tmpfs "$gcdir" || exit 1

 # Copy the script before changing into gcdir, $0 might be a
 # relative path
 /bin/cp "$0" "${gcdir}/" || exit 1

 # Enter tmpfs
 cd "${gcdir}" || exit 1

 # Get shared objects
 (IFS='
'
  for lib in $(IFS=';';/usr/bin/ldd -f '%p;%o\n' ${dyn}); do
   (IFS=';' ; /bin/cp ${lib})
  done
 )

 # Get executables
 (IFS=';' ; /bin/cp ${dyn} ${static} "${gcdir}/")

 # Complete bootstrap
 exec "${gcdir}/sh" "${gcdir}/${0##*/}" "$@"
fi

export PATH="./"
 
trap 'echo geliconsole: Exiting' EXIT
trap "/sbin/umount -f '${gcdir}' ; exit 0" SIGTERM
trap '' SIGINT SIGHUP

echo "geliconsole: Activated"
while :; do
 if geli list | grep -qFx 'State: SUSPENDED'; then
  geom="$(geli list | grep -FxB1 'State: SUSPENDED')"
  geom="${geom#Geom name: }"
  geom="${geom%%.eli*}"
  echo "geliconsole: Resume $geom"
  geli resume "$geom"
  echo .
 else
  sleep 2
 fi
done