SSD Speed on Dell XPS 13 9350 with Samsung EVO 970 Plus

In continuation of the blog post SSD Speed on Dell XPS 13 9350, here are performance measurements for a Samsung EVO 970 Plus in the Dell XPS 13.

Caveat from Dell xps 13 9350 and Samsung 970 evo M.2 nvme Compatability?:

However, be aware that although the XPS 13 9350 uses a PCIe x4 configuration for its NVMe interface, the lanes are run in power saving mode, not max performance mode, and this cannot be changed by the user. 4 lanes in power saving mode is roughly equivalent to 2 lanes in max performance mode, so the result is that your sequential read and write speeds will max out at 1.8 GB/s, even though the 970 Evo can do much more than that.

Installing the new Samsung EVO 970 Plus.

Mounted in laptop:

Read speed of unencrypted disk:

Once again read speed, this time with a LUKS-encrypted disk:

ASRock DeskMini A300M with AMD Ryzen 3400G

Below are some photographs taken during assembly of the ASRock A300M with an AMD Ryzen 5 Pro 3400G processor.

The ASRock website detailing the specs of the A300M: DeskMini A300 Series.

Three notable reviews of the A300M:

  1. Anandtech has a very readable review of the A300M: The ASRock DeskMini A300 Review: An Affordable DIY AMD Ryzen mini-PC
  2. A short review from Techspot: Asrock DeskMini A300 Review.
  3. A German review with many photos during assembly: ASRock DeskMini A300 mit AMD Ryzen 5 2400G im Test.

1. CPU. Three photos of the AMD 3400G CPU:

2. Power. The power supply of the A300M will provide at most 19V x 6.32A = 120W.

3. Dimensions. The case has a volume of at most two liters, illustrated by the two milk cartons.

4. Mounting. CPU mounted on motherboard.

5. Temperature. I installed a Noctua NH-L9a-AM4 cooler. Running the AMD at full speed under full load gives the following temperatures, as reported by the sensors command:

Adapter: PCI adapter
vddgfx:           N/A
vddnb:            N/A
edge:         +85.0°C  (crit = +80.0°C, hyst =  +0.0°C)

Adapter: PCI adapter
Vcore:         1.23 V
Vsoc:          1.07 V
Tctl:         +85.2°C
Tdie:         +85.2°C
Icore:       100.00 A
Isoc:          9.00 A

Adapter: PCI adapter
Composite:    +57.9°C  (low  =  -0.1°C, high = +74.8°C)
                       (crit = +79.8°C)

Adapter: ISA adapter
in0:                   656.00 mV (min =  +0.00 V, max =  +1.74 V)
in1:                     1.86 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                     3.41 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                     3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                   328.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                   216.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                   448.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                     3.39 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                     3.30 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                     1.84 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                  256.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                  216.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                    1.86 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                    1.71 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                  272.00 mV (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                     0 RPM  (min =    0 RPM)
fan2:                  2626 RPM  (min =    0 RPM)
fan3:                     0 RPM  (min =    0 RPM)
fan4:                     0 RPM  (min =    0 RPM)
fan5:                     0 RPM  (min =    0 RPM)
SYSTIN:                 +97.0°C  (high =  +0.0°C, hyst =  +0.0°C)  sensor = thermistor
CPUTIN:                 +87.5°C  (high = +80.0°C, hyst = +75.0°C)  ALARM  sensor = thermistor
AUXTIN0:                +62.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = thermistor
AUXTIN1:                +94.0°C    sensor = thermistor
AUXTIN2:                +90.0°C    sensor = thermistor
AUXTIN3:                +86.0°C    sensor = thermistor
SMBUSMASTER 0:          +85.0°C
PCH_CHIP_TEMP:           +0.0°C
PCH_CPU_TEMP:            +0.0°C
intrusion0:            OK
intrusion1:            ALARM
beep_enable:           disabled
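To watch just the CPU temperature instead of the whole listing, the Tctl line can be picked out of the sensors output. Below is a minimal sketch. The function names tctl_from_sensors and tctl_over_limit are made up for illustration, and it assumes the reading is labelled Tctl as in the output above.

```shell
# Print the first Tctl reading (degrees Celsius, digits only) found in
# `sensors` output on stdin. Assumes a line of the form "Tctl:  +85.2°C".
tctl_from_sensors() {
    awk '$1 == "Tctl:" { gsub(/[^0-9.]/, "", $2); print $2; exit }'
}

# Exit status 0 if the Tctl reading on stdin is at or above the limit in $1,
# non-zero otherwise (also when no Tctl line is present).
tctl_over_limit() {
    tctl_from_sensors |
        awk -v limit="$1" 'BEGIN { rc = 1 } { if ($1 + 0 >= limit + 0) rc = 0 } END { exit rc }'
}

# Possible usage on a machine with lm_sensors installed:
#   sensors | tctl_from_sensors
#   sensors | tctl_over_limit 80 && echo "CPU is hot"
```

With the listing above as input, the first function prints 85.2, and a limit of 80 counts as exceeded.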

Fully loaded:

  1  [||||||||||||||||||||||||||||||||||||||||                            52.9%]   Tasks: 120, 405 thr; 5 running
  2  [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||99.4%]   Load average: 7.28 6.82 6.02
  3  [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||99.4%]   Uptime: 8 days, 07:27:51
  4  [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||94.1%]
  5  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
  6  [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||  88.2%]
  7  [|||||||||||||||||||||||||||||||||                                   44.2%]
  8  [|||||||||||||||||||||||||||||||||||||||                             50.6%]
  Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||       28.7G/60.8G]
  Swp[                                                                    0K/0K]

Added, 22-Aug-2020: I noticed that the clock of the A300M motherboard lags very significantly, so you are required to run ntpdate or timesyncd.

Parallelization and CPU Cache Overflow

In the post Rewriting Perl to plain C the runtimes of the serial runs were reported. As expected, the C program was a lot faster than the Perl script. Running the programs in parallel, however, showed two unexpected behaviours: (1) more parallelization can degrade runtime, and (2) running unoptimized programs can be faster.

See also CPU Usage Time Is Dependant on Load.

In the following we use the C program siriusDynCall and the Perl script siriusDynUpro, which were described in the above-mentioned post. The program or script reads roughly 3 GB of data. Before starting the program or script, all this data has already been read into memory by using something like wc or grep.

1. AMD Processor. Running 8 parallel instances, s=size=8, p=partition=1(1)8:

for i in 1 2 3 4 5 6 7 8; do time siriusDynCall -p$i -s8 * > ../resultCp$i & done
real 50.85s
user 50.01s
sys 0

Merging the results with the sort command takes a negligible amount of time:

sort -m -t, -k3.1 resultCp* > resultCmerged

Best results are obtained when running just s=4 instances in parallel:

$ for i in 1 2 3 4 ; do /bin/time -p siriusDynCall -p$i -s4 * > ../dyn4413c1p$i & done
real 33.68
user 32.48
sys 1.18
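The loops above start the instances in the background and time each one separately. Below is a minimal sketch of a wrapper that starts all partitions and then waits for them, so that the whole batch can be timed as one. The function name run_partitions and its argument order are assumptions for illustration. Only the -p and -s options are taken from the commands above.

```shell
# Start N instances of a program in parallel, one per partition, and wait
# until all of them have finished.
#   $1 = program, $2 = N (number of partitions), $3 = output file prefix,
#   remaining arguments = input files passed on to every instance.
run_partitions() {
    prog=$1 n=$2 prefix=$3
    shift 3
    i=1
    while [ "$i" -le "$n" ]; do
        # Instance i handles partition i of n and writes its own result file.
        "$prog" -p"$i" -s"$n" "$@" > "$prefix$i" &
        i=$((i + 1))
    done
    wait    # returns once every background instance has exited
}

# Possible usage in bash, timing the batch as a whole:
#   time run_partitions siriusDynCall 4 ../dyn4413c1p *
```

Timed this way, the wall-clock figure covers the slowest instance rather than each instance on its own.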


Filippo Mantovani: ARM for HPC

On 23-Oct-2017 Filippo Mantovani gave a talk in Darmstadt on “Mobile technology for production-ready high-performance computing systems: The path of the Mont-Blanc project”. Unfortunately I was unable to attend, but Mr. Mantovani sent me his Darmstadt Seminar slides. As his slides and documents are very interesting to people using or intending to use ARM in HPC, I copy these documents here, so they are easily available. I also copied a report on “MB3_D6.4 Report on application tuning and optimization on ARM platform”.

HP ePrint Obsolescence

HP (Hewlett Packard), manufacturer of printers of various sorts (laser, ink), has unfortunately decided once again to annoy its loyal customers. HP has repeatedly updated the firmware in its printers so that they no longer work with alternate ink, see for example Disabling 3rd-party ink ensures “best printing experience”. In recent years they have also disabled the so-called apps (ePrint functionality): after a few years the printer can no longer connect to the web-services of HP. See, for example, HP Apps Service Retired on several printers.

In my case, I bought a CM1415 in November 2011. It can no longer connect to HP web-services. Therefore I can no longer send e-mails to the printer, which then get printed. In November 2014 I bought an M276nw, a printer similar to the CM1415. This model can still connect to HP web-services. So it looks like HP silently disables functionality after six years. These web-services also offer other services, like weather forecasts, news, sudokus, etc.

This chicanery makes it clear that customers should not trust cloud-services, or should at least have a contingency plan in case these services stop working or become ridiculously expensive. Recently the price increase by Firebase made headlines, see Firebase Costs Increased by 7,000%!

Bluetooth Headphones in Arch Linux

There is a big difference between noise-cancelling headphones and classical headphones without noise cancelling, especially when you use them in a noisy environment like a plane or a large office. Inspired by a positive review of the Bose headphones by Marques Brownlee, I bought them.

Here is the review:

Pairing with my OnePlus One smartphone was completely automatic and works like a charm. No further explanation is required.

Pairing with a PC/laptop running Arch Linux needed a little more effort. Some advice in Bluetooth headset helped me a lot.

One-time configuration:

systemctl start bluetooth.service
hciconfig hci0 up piscan
pacmd list-sinks | grep index:

In bluetoothctl enter

pair xx:yy:...
trust xx:yy:...
connect xx:yy:...

I had to delete the directory below /var/lib/bluetooth. Apparently something was stored there which shouldn’t have been there.

Once the pairing works, as described above, I just use:

systemctl start bluetooth.service
hciconfig hci0 up piscan

for starting Bluetooth and making my PC visible over Bluetooth. I switch on the headphones, which normally find the PC in less than a second. Then I have to set the right sink via pacmd:

pacmd set-default-sink `pacmd list-sinks | grep index: | tail -1 | cut -d " " -f6`
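The cut in the command above counts single spaces, so it depends on how the index line is indented. Assuming pacmd prints the current default sink as "* index: N" and all others as "index: N", taking the last field of the line is less sensitive to that. Below is a minimal sketch. The function name last_sink_index is made up for illustration.

```shell
# Print the index of the last sink in `pacmd list-sinks` output read from
# stdin. Works for both "    index: 1" and "  * index: 1" lines, because
# awk splits on runs of whitespace and $NF is the last field.
last_sink_index() {
    awk '/index:/ { idx = $NF } END { if (idx != "") print idx }'
}

# Possible usage with PulseAudio running:
#   pacmd set-default-sink "$(pacmd list-sinks | last_sink_index)"
```

As in the original command, this relies on the Bluetooth sink being the last one listed.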

To check that all is well:

pacmd list-sinks | grep index:

Once paired with your Linux machine, you have to repeat set-default-sink whenever you lose the connection to it, for example by walking too far away. When the Bluetooth connection is lost, sound output apparently goes to the regular speakers of your Linux machine, so in a large office other people will hear your music. Of course, you can mute the regular loudspeakers of your Linux machine using

pacmd set-sink-volume 0 0

assuming sink #0 is the regular loudspeaker.

Added 13-Oct-2017: Arch Linux dropped hciconfig and hcitool from package bluez-utils as of version 5.44. My latest good package still containing these commands is 5.43-2, for easy reference located here: bluez-utils-5.43-2-x86_64.pkg.tar.xz. Download it and use ascii2hex.c to convert from hex to ascii/binary, because the blog host does not allow XZ files.

SSD Speed on Dell XPS 13 9350

In Hard-Disk and SSD Speed Comparisons I compared a Mushkin SSD with 60 GB against an ADATA with 128 GB against a Seagate 1 TB hard disk drive. The SSDs had roughly three times the speed of the hard disk drive, i.e., 380 MB/s vs. 134 MB/s for reading Mushkin vs. Seagate, and 297 MB/s vs. 131 MB/s for writing ADATA vs. Seagate.

I also compared USB thumb drives against the above Seagate 1 TB hard drive in Hard-Disk and USB Drive Speed Comparisons. Read speeds were comparable (100-200 MB/s), while for writing the Seagate drive was roughly 4 to 5 times faster (100 MB/s vs. 20 MB/s).

Hard drive speeds and prices in 2013 are given in Harddisk Drive Speed in MB/s. Read speeds are roughly 200 MB/s for enterprise drives.

Now the read- and write-speeds of the SSD in the Dell XPS 13 9350 are quite astonishing: up to 1.5 GB/s for reading, 532 MB/s for writing. Even if you use LUKS/dm-crypt the values are 840 MB/s for reading, and 428 MB/s for writing. Below are the measurements using gnome-disk-utility.

Without encryption:
Screenshot from 2016-07-17 14-47-25
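To put the cost of encryption into one number, the LUKS figures can be expressed as a percentage of the unencrypted ones. Below is a minimal sketch. The helper name pct_of is made up for illustration, and the "up to 1.5 GB/s" read speed is taken as 1500 MB/s.

```shell
# Print $2 as a whole-number percentage of $1.
pct_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", 100 * b / a }'
}

# Figures from the text above, in MB/s:
#   pct_of 1500 840   # reading: LUKS vs. unencrypted
#   pct_of 532 428    # writing: LUKS vs. unencrypted
```

With these figures, the encrypted disk reaches roughly 56% of the unencrypted read speed and roughly 80% of the write speed.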


Small Scale Computing

Below is a short overview of small scale computers for use in embedded computing.

  1. Intel Galileo, single core Quark/Pentium, ca. 100 USD, <15 W
  2. Intel Edison, dual core Atom Silvermont, 500 MHz, ca. 100 USD, <1 W
  3. Raspberry Pi, single core ARM, 700 MHz, ca. 35 USD, ~1 W
  4. Arduino series, e.g., Intel Quark/Cortex, 32 MHz, ca. 15 USD, ~1 W
  5. ESP8266 (spec), single core Tensilica Xtensa LX106, 80 MHz, 7 USD, 1 mW

On the other end of the spectrum there are Mini PCs.

  1. Intel NUC, quadcore Intel i5/i7, ca. 400 USD (no SSD+no RAM), 6-60 W
  2. Apple Mac Mini, Intel i5/i7, ca. 500 USD, 6-85 W
  3. Apple Mac Pro, 3.5 GHz 6-core Intel Xeon E5, ca. 4,000 USD, <450 W

Now comes the fun part: connecting many small scale computers to a cluster.

Connecting 64 Raspberry Pi running MPI and putting them all into a Lego frame, see Raspberry Pi at Southampton.

Clustering 48 PandaBoards and putting this all into an industrial trashcan, see Phoronix: Building A 96-Core Ubuntu ARM Solar-Powered Cluster.

Clustering ten Intel NUCs each with 10GB RAM: Orange Box, see The Orange Box: Cloud for the Free Man, motto: Most planes fly in clouds…this cloud flies in planes!

Mac Pros are offered in the cloud, see Dedicated Mac Pro server hosting.

Output of lstopo from hwloc

This is the output of

lstopo --of png > ~/tmp/lstopo.png

for a machine with an AMD octa-core FX 8120, Bulldozer architecture, see AMD Bulldozer CPU Architecture Overview.


One can just type lstopo, which shows the same in a separate window. lstopo is part of hwloc in Arch or hwloc in Ubuntu.

Below is the output for an Intel NUC, with 4th generation/Haswell Core i5-4250U:

See Vol 1 datasheet, Vol 2 datasheet.

Added 06-Jan-2018: Below is the output for Skylake i7-6600U in an HP EliteBook notebook: