WebP Cloud Services Blog

Hetzner AX102 Review: Why Databases Need Enterprise NVMe (vs. AX41) — The Importance of PLP and fsync Performance

Recently, we purchased several Hetzner AX102 dedicated servers and added a 15.36 TB NVMe SSD Datacenter Edition for our internal services. Seeing that there are very few reviews of this machine online, we wanted to share some information about it, hoping it might help those who are considering this configuration.

Basic Information

The machine uses an AMD Ryzen™ 9 7950X3D CPU, comes with 128 GB DDR5 ECC RAM and 2x 1.92 TB NVMe SSD Datacenter Edition (Gen4), with a base price of €104 per month (Finland region).

Additionally, we added an SSD that Hetzner calls the 15.36 TB NVMe SSD Datacenter Edition, for €130 per month.

For the Hetzner AX line, AX102 is the first “Enterprise-grade” configuration.

In previous configurations, either the memory lacked ECC (such as the AX52 and the early AX41-NVMe), or the included drives were not Datacenter Edition. Machines without ECC and Datacenter drives are fine for running stateless applications, but in a production environment it’s easy to pay a price you never should have had to pay. For example:

“Living in the Cloud for too long gives you a false sense of security—you start believing that as long as you don’t mess up the config, the hardware, filesystem, and RAM will never fail.

Moving to bare metal feels like someone is running Chaos Engineering on your infrastructure—bizarre and inexplicable issues just keep popping up.

(I have to say, trying to run production services on a combo of Consumer CPUs, Non-ECC RAM, and that Samsung MZVL22T0HBLB-00B00 is absolute torture.)”

https://x.com/n0vad3v/status/1917941496155980105

So this time, we specifically chose a configuration with full ECC + “Datacenter Edition.” After taking sufficient safety measures (namely cold and hot backups), we migrated some critical services to the AX102 to evaluate its overall stability and cost-effectiveness.

Basic Benchmark

Without further ado, let’s start with the basic tests, reusing the script from our blog’s first and only post to hit the front page of Hacker News: “The performance review of Hetzner’s CAX-line ARM64 servers and the practical experience of WebP Cloud Services on them”.

Using the test script command:

curl -sL yabs.sh | bash -s -- -i

Basic Specs

Processor  : AMD Ryzen 9 7950X3D 16-Core Processor
CPU cores  : 32 @ 4865.977 MHz
AES-NI     : ✔ Enabled
VM-x/AMD-V : ✔ Enabled
RAM        : 124.9 GiB
Swap       : 0.0 KiB
Disk       : 15.6 TiB
Distro     : Ubuntu 24.04.3 LTS
Kernel     : 6.8.0-85-generic
VM Type    : NONE
IPv4/IPv6  : ✔ Online / ✔ Online

IPv6 Network Information:
---------------------------------
ISP        : Hetzner Online GmbH
ASN        : AS24940 Hetzner Online GmbH
Location   : Helsinki, Uusimaa (18)
Country    : Finland


Geekbench 6 Benchmark Test:
---------------------------------
Test            | Value                         
                |                               
Single Core     | 2365                          
Multi Core      | 14981                         

Basic Disk Information

Counting the add-on, this machine carries three disks in total; the models are as follows:

  • 2x 1.92 TB NVMe SSD Datacenter Edition
    • Micron_7450_MTFDKCC1T9TFR
  • 15.36 TB NVMe SSD Datacenter Edition
    • MTFDKCC15T3TGP-1BK1DABYY

A quick search tells us:

The 1.92 TB one is the Micron 7450 NVMe SSD, official link: https://www.micron.com/products/storage/ssd/data-center-ssd/7450-ssd, with nominal performance as follows:

- Sequential 128KB READ: Up to 6800 MB/s
- Sequential 128KB WRITE: Up to 5600 MB/s
- Random 4KB READ: Up to 1,000,000 IOPS
- Random 4KB WRITE: Up to 400,000 IOPS

The 15.36 TB one is also Micron, but it’s the 7500 PRO, with nominal performance as follows:

- Sequential READ: Up to 7000 MB/s
- Sequential WRITE: Up to 5900 MB/s
- Random READ: Up to 1,100,000 IOPS
- Random WRITE: Up to 250,000 IOPS

Since the two 1.92 TB NVMe drives are in a RAID array, their results might not be representative, so this time we focus on the 15.36 TB NVMe. From the test script above, we obtained the following data:

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/nvme2n1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 775.69 MB/s (193.9k) | 2.69 GB/s    (42.0k)
Write      | 777.74 MB/s (194.4k) | 2.70 GB/s    (42.2k)
Total      | 1.55 GB/s   (388.3k) | 5.39 GB/s    (84.3k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 3.10 GB/s     (6.0k) | 3.10 GB/s     (3.0k)
Write      | 3.26 GB/s     (6.3k) | 3.30 GB/s     (3.2k)
Total      | 6.37 GB/s    (12.4k) | 6.41 GB/s     (6.2k)

To be honest, it looks quite impressive. To help readers compare the performance with other Hetzner machines, we selected and purchased an AX41-NVMe machine (which does not have Datacenter NVMe) and two Cloud machines. The test results are pasted below.

The machines participating in the comparison are as follows:

  • AX41-NVMe (Another dedicated server, AMD Ryzen™ 5 3600 CPU, paired with a 512 GB standard NVMe)
    • The disk is SAMSUNG MZVL2512HCJQ-00B00 (Samsung PM9A1)
  • CCX63 (The largest configuration under Dedicated Resources, featuring 48 Cores, 192GB RAM, and 960GB LocalSSD)
  • CAX41 (The largest configuration under ARM64, featuring 16 Cores, 32GB RAM, and 320GB LocalSSD)

For Cloud tests, we ran them on the / directory of the machine to test the built-in LocalSSD performance. The AX41-NVMe machine was tested on the partition where the hard drive was mounted.

AX41-NVMe

---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 509.95 MB/s (127.4k) | 1.43 GB/s    (22.4k)
Write      | 511.30 MB/s (127.8k) | 1.44 GB/s    (22.5k)
Total      | 1.02 GB/s   (255.3k) | 2.87 GB/s    (44.9k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 1.87 GB/s     (3.6k) | 2.05 GB/s     (2.0k)
Write      | 1.97 GB/s     (3.8k) | 2.19 GB/s     (2.1k)
Total      | 3.84 GB/s     (7.5k) | 4.24 GB/s     (4.1k)

As you can see, the raw NVMe disks on the dedicated servers perform quite well; the Cloud machines fare noticeably worse:

CCX63

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/sda1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 109.71 MB/s  (27.4k) | 1.00 GB/s    (15.7k)
Write      | 110.00 MB/s  (27.5k) | 1.01 GB/s    (15.8k)
Total      | 219.71 MB/s  (54.9k) | 2.02 GB/s    (31.5k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 1.33 GB/s     (2.5k) | 1.18 GB/s     (1.1k)
Write      | 1.40 GB/s     (2.7k) | 1.26 GB/s     (1.2k)
Total      | 2.73 GB/s     (5.3k) | 2.44 GB/s     (2.3k)

CAX41

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/sda1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 96.69 MB/s   (24.1k) | 579.80 MB/s   (9.0k)
Write      | 96.62 MB/s   (24.1k) | 597.04 MB/s   (9.3k)
Total      | 193.32 MB/s  (48.3k) | 1.17 GB/s    (18.3k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 816.20 MB/s   (1.5k) | 1.09 GB/s     (1.0k)
Write      | 886.02 MB/s   (1.7k) | 1.22 GB/s     (1.1k)
Total      | 1.70 GB/s     (3.3k) | 2.31 GB/s     (2.2k)

Oh, I have to mention, if you have mounted a Hetzner Volume, the result you get is like this:

fio Disk Speed Tests (Mixed R/W 50/50):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 29.70 MB/s    (7.4k) | 314.04 MB/s   (4.9k)
Write      | 29.68 MB/s    (7.4k) | 323.38 MB/s   (5.0k)
Total      | 59.38 MB/s   (14.8k) | 637.43 MB/s   (9.9k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 298.03 MB/s    (582) | 288.82 MB/s    (282)
Write      | 323.52 MB/s    (631) | 322.23 MB/s    (314)
Total      | 621.56 MB/s   (1.2k) | 611.05 MB/s    (596)

So do not run any demanding high-load applications on a Volume, or the IOWait will make you question your life choices.

Datacenter SSD?

This time we specifically mentioned the Datacenter NVMe, so what is the main difference between it and a standard NVMe?

First, in terms of macro design philosophy, the goals of the two are distinct:

  • Consumer NVMe: The design goal is burst performance. It assumes the user’s load is bursty and read-heavy/write-light (e.g., system boot, game loading). It relies heavily on a simulated SLC cache; once the cache is full, performance falls off a cliff.
  • Enterprise NVMe: The design goal is determinism and steady-state performance. It assumes the device will run at full load 24/7, valuing not just average latency but, even more so, the stability of 99.99th-percentile (P99.99) tail latency.

Therefore, we often see consumer-grade NVMe drives boasting very high (sequential) read/write speeds, and we also find that at the same capacity, enterprise-grade NVMe drives are significantly more expensive than consumer-grade ones.

Okay, at this point some readers might ask: “I don’t care about high-load average latency or tail-latency stability, and I have sufficient disaster-recovery measures. Is an enterprise NVMe unnecessary for me? Can I just use a consumer NVMe?”

Remember that tweet quoted above? I thought the same thing at the time…

This brings in another advantage of Enterprise NVMe—PLP (Power Loss Protection). Before understanding PLP, we need to look at what fsync is:

In a Linux system, when you call write to write data, the operating system, for performance reasons, does not immediately write the data to the SSD. Instead, it throws the data into the Page Cache in memory and immediately tells you “it’s written.”

This is not a problem in most scenarios. Many standard file operations do not require data to land on disk immediately; they rely on the kernel’s background writeback threads (historically pdflush, nowadays per-device work queues) to leisurely flush dirty pages, in exchange for memory-speed write performance.

For instance, Windows exposes the same idea as the “Enable write caching on the device” option in Device Manager (it is enabled by default).

But databases don’t think this way. To ensure that every piece of data is truly saved (rather than waiting for the OS to flush it eventually), databases (like MySQL flushing its redo log) will forcefully call fsync or fdatasync.

The semantics of fsync are: “I don’t care how you (OS) and the hard drive coordinate. I am blocking the thread right now until the hard drive explicitly replies to me that ’the data has been physically etched onto the medium,’ only then will I proceed to the next step.”
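The blocking behavior described above is easy to reproduce yourself. Here is a minimal Python sketch (the file path and iteration count are our own arbitrary choices) that times plain buffered writes against writes followed by os.fsync; on any drive, the fsync path is dramatically slower:

```python
import os
import time

def timed_sync_writes(path, block=b"x" * 8192, iters=100, do_fsync=True):
    """Write `iters` 8K blocks; optionally fsync after each one.
    Returns average seconds per write (+ flush) cycle."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        start = time.perf_counter()
        for _ in range(iters):
            os.write(fd, block)      # lands in the page cache, returns almost instantly
            if do_fsync:
                os.fsync(fd)         # blocks until the device reports durability
        return (time.perf_counter() - start) / iters
    finally:
        os.close(fd)

if __name__ == "__main__":
    cached = timed_sync_writes("/tmp/fsync_demo.dat", do_fsync=False)
    synced = timed_sync_writes("/tmp/fsync_demo.dat", do_fsync=True)
    print(f"buffered write: {cached * 1e6:8.1f} us/op")
    print(f"write + fsync : {synced * 1e6:8.1f} us/op")
```

The synced number, divided into one second, is roughly the maximum sync-write IOPS your drive can deliver at queue depth 1, which is exactly what the fio --fsync=1 test later in this post measures.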

For a standard consumer SSD without power loss protection (PLP), when it receives an fsync command from the host, it goes through the following steps:

  1. Force Flush: The controller must immediately program all relevant dirty data in the DRAM cache into the NAND flash.
  2. Physical Latency: Writing to NAND flash is not fast. The program time for TLC/QLC is usually between 50 µs and several hundred µs, or even longer.
  3. Return Confirmation: Only once the charge has actually been injected into the NAND cells does the controller dare to send an “ACK” back to the host.

During this process, the CPU is waiting, the database thread is waiting, and all business logic is waiting for these few hundred microseconds of physical operation. Therefore, in scenarios with frequent Sync Write (such as database transaction commits), the IOPS of consumer SSDs is tightly locked by the write latency of the physical medium, usually only achieving a few thousand IOPS.

For Enterprise SSDs, because they are equipped with PLP:

  • When an fsync command is received, the controller only needs to write the data to the onboard DRAM cache to immediately return “Success” to the OS.
  • Principle: Even if the power is suddenly pulled at this moment, the power provided by the onboard capacitors is enough for the controller to flush the dirty data from DRAM into NAND within a few milliseconds.
  • Consequence: The latency of fsync collapses to that of the DRAM write path plus protocol overhead (tens of microseconds end to end, rather than NAND programming time). This lets enterprise SSDs achieve sync-write IOPS 10x or more higher than consumer drives.
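At queue depth 1, the sync-write IOPS ceiling is simply the reciprocal of the per-fsync latency. A quick sanity check (the function name is ours) against the average sync latencies our fio runs report later in this post, roughly 5.3 ms on the AX41’s consumer drive and 91 µs on the AX102’s enterprise drive:

```python
def sync_iops_ceiling(fsync_latency_s):
    """At iodepth=1, each write must wait for the previous fsync to
    complete, so throughput is bounded by 1 / latency."""
    return 1.0 / fsync_latency_s

print(round(sync_iops_ceiling(5.3e-3)))  # consumer drive: ~190 IOPS
print(round(sync_iops_ceiling(91e-6)))   # PLP drive: ~11k IOPS
```

These ceilings land right on top of the IOPS fio actually measures below (≈187 for the AX41, ≈10.3k for the AX102): sync-write throughput really is just latency inverted.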

Here is how the difference above plays out in the databases everyone runs on their servers (who doesn’t have a database running on their server nowadays?):

To ensure the “D” (Durability) in ACID, databases usually require the Write Ahead Log (WAL) to land on the disk (e.g., MySQL’s innodb_flush_log_at_trx_commit=1).

  • Phenomenon: These writes are small-block (4K~16K) and sequential, but come with extremely frequent fsync calls.
  • Standard NVMe: Even if the nominal sequential write speed is 3000 MB/s, under frequent fsync the effective throughput can drop to tens of MB/s, causing database TPS (Transactions Per Second) to stagnate.
  • Enterprise NVMe: Relying on PLP, it easily handles high-frequency small-block sync writes, and TPS scales almost linearly with the CPU.
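The WAL pattern itself boils down to a few lines: append the log record, fsync, and only then acknowledge the commit. A toy sketch (the class and framing are ours for illustration, not any real database’s code):

```python
import os

class ToyWAL:
    """Append-only log that only acknowledges a commit once the
    record is durable, i.e. after fsync has returned."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)

    def commit(self, record: bytes) -> bool:
        # Length-prefixed record so the log can be replayed after a crash.
        os.write(self.fd, len(record).to_bytes(4, "big") + record)
        os.fsync(self.fd)   # the transaction's durability hinges on this call
        return True         # only now is it safe to tell the client "committed"

    def close(self):
        os.close(self.fd)

wal = ToyWAL("/tmp/toy_wal.log")
wal.commit(b"UPDATE accounts SET balance = balance - 100 WHERE id = 1")
wal.close()
```

With innodb_flush_log_at_trx_commit=1 (or PostgreSQL’s synchronous_commit=on), every committed transaction pays one such fsync, which is why per-fsync latency translates almost directly into TPS.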

At this point, some readers might ask: how can we feel this gap intuitively?

PGBench

To make it intuitive, we run a test with pgbench, which ships with PostgreSQL. In every case we start PostgreSQL the same way, with the compose file below (I know starting it in a container isn’t the recommended way, but this is just a test). On Hetzner Cloud it runs on the / directory; on the AX102 and AX41-NVMe machines it runs on the ext4-formatted NVMe data disk (the 15.36 TB drive, in the AX102’s case).

version: "3.8"

services:
  db:
    image: postgres:18-alpine
    restart: always
    shm_size: 1g
    environment:
      PGUSER: postgres
      POSTGRES_PASSWORD: hellopassword
      POSTGRES_DB: tester
    volumes:
      - ./db_data:/var/lib/postgresql/18/docker
    healthcheck:
      test: [ "CMD", "pg_isready", "-h", "db", "-U", "${PGUSER:-postgres}", "-d", "${POSTGRES_DB:-tester}" ]
      interval: 1s
      timeout: 3s
      retries: 60

The test commands are as follows:

createdb pgbench_test
pgbench -i -s 100 pgbench_test
pgbench -c 16 -j 4 -T 60 pgbench_test

AX102

pgbench (18.0)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 16
number of threads: 4
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 1556888
number of failed transactions: 0 (0.000%)
latency average = 0.617 ms
initial connection time = 9.362 ms
tps = 25951.831902 (without initial connection time)

AX41-NVMe

Now let’s look at the AX41-NVMe: also a dedicated server with NVMe, and nearly indistinguishable in the raw disk benchmarks above, but without a Datacenter SSD:

pgbench (18.1)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 16
number of threads: 4
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 323149
number of failed transactions: 0 (0.000%)
latency average = 2.971 ms
initial connection time = 12.635 ms
tps = 5385.896737 (without initial connection time)

You can see that as soon as an fsync-heavy scenario is involved, TPS instantly drops to roughly 1/5.

Please note that the difference here is not solely due to storage performance. Since the AX102 is equipped with a 7950X3D CPU that is significantly more powerful than the AX41’s Ryzen 3600, this also contributes to the higher TPS.

However, please pay close attention to the average latency and the subsequent fio fsync specific tests. In terms of average latency, the AX102 clocked in at 0.617 ms, compared to 2.971 ms for the AX41-NVMe.

In the fio test—a pure I/O benchmark that eliminates CPU interference—the enterprise-grade drives demonstrated an IOPS advantage of over 50x. This is the key factor in preventing database lag/stuttering.

By the way, using the same method, let’s look at the Cloud performance:

CCX63

pgbench (18.0)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 16
number of threads: 4
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 388748
number of failed transactions: 0 (0.000%)
latency average = 2.469 ms
initial connection time = 19.118 ms
tps = 6480.864372 (without initial connection time)

CAX41

pgbench (18.0)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 16
number of threads: 4
maximum number of tries: 1
duration: 60 s
number of transactions actually processed: 301215
number of failed transactions: 0 (0.000%)
latency average = 3.185 ms
initial connection time = 39.615 ms
tps = 5023.234052 (without initial connection time)

FIO

Finally, we also used fio to test disk performance with an fsync forced after every write.

Test command:

TEST_FILE=/mnt/fio_fsync_test.dat

fio --name=fsync-test --filename=$TEST_FILE --size=5G --ioengine=libaio --direct=0 --rw=write --bs=8k --numjobs=1 --iodepth=1 --fsync=1 --group_reporting

Simply put, here are the fsync IOPS figures for each machine:

  • AX102
    • min= 8306, max=10842, avg=10318.05, stdev=512.82, samples=126
  • AX41-NVMe
    • min= 146, max= 194, avg=187.15, stdev= 4.94, samples=7003
  • CCX63
    • min= 602, max= 1178, avg=822.16, stdev=115.33, samples=1594
  • CAX41
    • min= 418, max= 991, avg=727.56, stdev=93.69, samples=1801

The results are as follows (since the results are too long, they have been collapsed):

<details> <summary>Click me to expand fio results</summary>

CAX41

fio --name=fsync-test --filename=$TEST_FILE --size=5G --ioengine=libaio --direct=0 --rw=write --bs=8k --numjobs=1 --iodepth=1 --fsync=1 --group_reporting
fsync-test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.36
Starting 1 process
fsync-test: Laying out IO file (1 file / 5120MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=6064KiB/s][w=758 IOPS][eta 00m:00s]
fsync-test: (groupid=0, jobs=1): err= 0: pid=7397: Tue Nov 11 13:22:04 2025
  write: IOPS=726, BW=5815KiB/s (5955kB/s)(5120MiB/901547msec); 0 zone resets
    slat (usec): min=8, max=7282, avg=34.96, stdev=31.44
    clat (nsec): min=1120, max=3602.4k, avg=3768.39, stdev=8391.96
     lat (usec): min=9, max=7368, avg=38.73, stdev=33.14
    clat percentiles (usec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    3],
     | 30.00th=[    3], 40.00th=[    4], 50.00th=[    4], 60.00th=[    4],
     | 70.00th=[    4], 80.00th=[    5], 90.00th=[    5], 95.00th=[    6],
     | 99.00th=[   13], 99.50th=[   21], 99.90th=[   45], 99.95th=[   80],
     | 99.99th=[  277]
   bw (  KiB/s): min= 3344, max= 7935, per=100.00%, avg=5821.29, stdev=749.69, samples=1801
   iops        : min=  418, max=  991, avg=727.56, stdev=93.69, samples=1801
  lat (usec)   : 2=0.65%, 4=74.41%, 10=23.75%, 20=0.67%, 50=0.44%
  lat (usec)   : 100=0.05%, 250=0.03%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%
  fsync/fdatasync/sync_file_range:
    sync (usec): min=468, max=50797, avg=1336.17, stdev=774.61
    sync percentiles (usec):
     |  1.00th=[  717],  5.00th=[  783], 10.00th=[  832], 20.00th=[  898],
     | 30.00th=[  963], 40.00th=[ 1037], 50.00th=[ 1123], 60.00th=[ 1237],
     | 70.00th=[ 1401], 80.00th=[ 1598], 90.00th=[ 1975], 95.00th=[ 2474],
     | 99.00th=[ 4490], 99.50th=[ 5342], 99.90th=[ 8979], 99.95th=[11207],
     | 99.99th=[17957]
  cpu          : usr=1.06%, sys=8.48%, ctx=1312544, majf=0, minf=13
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,655360,0,655359 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=5815KiB/s (5955kB/s), 5815KiB/s-5815KiB/s (5955kB/s-5955kB/s), io=5120MiB (5369MB), run=901547-901547msec

Disk stats (read/write):
  sda: ios=18/1968375, sectors=528/31683384, merge=0/1321571, ticks=6/714675, in_queue=986506, util=77.33%

AX102

fio --name=fsync-test --filename=$TEST_FILE --size=5G --ioengine=libaio --direct=0 --rw=write --bs=8k --numjobs=1 --iodepth=1 --fsync=1 --group_reporting
fsync-test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.36
Starting 1 process
fsync-test: Laying out IO file (1 file / 5120MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=80.2MiB/s][w=10.3k IOPS][eta 00m:00s]
fsync-test: (groupid=0, jobs=1): err= 0: pid=2347193: Tue Nov 11 14:08:52 2025
  write: IOPS=10.3k, BW=80.6MiB/s (84.5MB/s)(5120MiB/63524msec); 0 zone resets
    slat (usec): min=3, max=847, avg= 5.86, stdev= 4.22
    clat (nsec): min=480, max=625479, avg=640.36, stdev=955.14
     lat (usec): min=4, max=848, avg= 6.50, stdev= 4.47
    clat percentiles (nsec):
     |  1.00th=[  490],  5.00th=[  502], 10.00th=[  510], 20.00th=[  510],
     | 30.00th=[  524], 40.00th=[  532], 50.00th=[  540], 60.00th=[  548],
     | 70.00th=[  588], 80.00th=[  684], 90.00th=[  876], 95.00th=[ 1032],
     | 99.00th=[ 1448], 99.50th=[ 1864], 99.90th=[ 8512], 99.95th=[11968],
     | 99.99th=[21632]
   bw (  KiB/s): min=66448, max=86736, per=100.00%, avg=82544.38, stdev=4102.53, samples=126
   iops        : min= 8306, max=10842, avg=10318.05, stdev=512.82, samples=126
  lat (nsec)   : 500=1.22%, 750=83.20%, 1000=9.37%
  lat (usec)   : 2=5.78%, 4=0.17%, 10=0.19%, 20=0.06%, 50=0.01%
  lat (usec)   : 100=0.01%, 750=0.01%
  fsync/fdatasync/sync_file_range:
    sync (usec): min=52, max=6975, avg=90.81, stdev=41.37
    sync percentiles (usec):
     |  1.00th=[   67],  5.00th=[   73], 10.00th=[   77], 20.00th=[   81],
     | 30.00th=[   84], 40.00th=[   87], 50.00th=[   89], 60.00th=[   91],
     | 70.00th=[   93], 80.00th=[   96], 90.00th=[  101], 95.00th=[  111],
     | 99.00th=[  163], 99.50th=[  200], 99.90th=[  375], 99.95th=[  570],
     | 99.99th=[ 1713]
  cpu          : usr=1.54%, sys=20.19%, ctx=1325091, majf=0, minf=16
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,655360,0,655359 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=80.6MiB/s (84.5MB/s), 80.6MiB/s-80.6MiB/s (84.5MB/s-84.5MB/s), io=5120MiB (5369MB), run=63524-63524msec

Disk stats (read/write):
  nvme2n1: ios=66675/2008316, sectors=4288160/38031992, merge=14/1356438, ticks=9377/40112, in_queue=49490, util=61.98%

CCX63

fio --name=fsync-test --filename=$TEST_FILE --size=5G --ioengine=libaio --direct=0 --rw=write --bs=8k --numjobs=1 --iodepth=1 --fsync=1 --group_reporting
fsync-test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.36
Starting 1 process
fsync-test: Laying out IO file (1 file / 5120MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=5749KiB/s][w=718 IOPS][eta 00m:00s]
fsync-test: (groupid=0, jobs=1): err= 0: pid=3616829: Tue Nov 11 13:20:52 2025
  write: IOPS=821, BW=6572KiB/s (6730kB/s)(5120MiB/797748msec); 0 zone resets
    slat (usec): min=8, max=4587, avg=26.51, stdev=27.83
    clat (nsec): min=1172, max=594986, avg=2655.72, stdev=2212.45
     lat (usec): min=10, max=4589, avg=29.16, stdev=28.18
    clat percentiles (nsec):
     |  1.00th=[ 1368],  5.00th=[ 1560], 10.00th=[ 1608], 20.00th=[ 1688],
     | 30.00th=[ 1832], 40.00th=[ 2256], 50.00th=[ 2736], 60.00th=[ 2864],
     | 70.00th=[ 2960], 80.00th=[ 3056], 90.00th=[ 3344], 95.00th=[ 3856],
     | 99.00th=[ 5408], 99.50th=[12736], 99.90th=[35072], 99.95th=[39680],
     | 99.99th=[53504]
   bw (  KiB/s): min= 4816, max= 9424, per=100.00%, avg=6578.23, stdev=922.67, samples=1594
   iops        : min=  602, max= 1178, avg=822.16, stdev=115.33, samples=1594
  lat (usec)   : 2=34.57%, 4=61.31%, 10=3.50%, 20=0.32%, 50=0.28%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
  fsync/fdatasync/sync_file_range:
    sync (usec): min=560, max=99343, avg=1188.78, stdev=407.12
    sync percentiles (usec):
     |  1.00th=[  742],  5.00th=[  799], 10.00th=[  840], 20.00th=[  898],
     | 30.00th=[  971], 40.00th=[ 1045], 50.00th=[ 1139], 60.00th=[ 1254],
     | 70.00th=[ 1369], 80.00th=[ 1434], 90.00th=[ 1516], 95.00th=[ 1614],
     | 99.00th=[ 2114], 99.50th=[ 2835], 99.90th=[ 5080], 99.95th=[ 5735],
     | 99.99th=[ 8356]
  cpu          : usr=0.72%, sys=8.52%, ctx=1314198, majf=0, minf=15
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,655360,0,655359 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=6572KiB/s (6730kB/s), 6572KiB/s-6572KiB/s (6730kB/s-6730kB/s), io=5120MiB (5369MB), run=797748-797748msec

Disk stats (read/write):
  sda: ios=0/1980548, sectors=0/33330920, merge=0/1377610, ticks=0/652120, in_queue=906250, util=78.84%

AX41-NVMe

fio --name=fsync-test --filename=$TEST_FILE --size=5G --ioengine=libaio --direct=0 --rw=write --bs=8k --numjobs=1 --iodepth=1 --fsync=1 --group_reporting
fsync-test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.36
Starting 1 process
fsync-test: Laying out IO file (1 file / 5120MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=1505KiB/s][w=188 IOPS][eta 00m:00s]  
fsync-test: (groupid=0, jobs=1): err= 0: pid=21222: Fri Nov 21 15:39:44 2025
  write: IOPS=187, BW=1497KiB/s (1533kB/s)(5120MiB/3502354msec); 0 zone resets
    slat (usec): min=8, max=1031, avg=19.54, stdev= 6.14
    clat (nsec): min=752, max=1238.4k, avg=1971.06, stdev=1724.91
     lat (usec): min=8, max=1258, avg=21.52, stdev= 6.75
    clat percentiles (nsec):
     |  1.00th=[  988],  5.00th=[ 1320], 10.00th=[ 1496], 20.00th=[ 1560],
     | 30.00th=[ 1608], 40.00th=[ 1720], 50.00th=[ 1832], 60.00th=[ 1976],
     | 70.00th=[ 2160], 80.00th=[ 2320], 90.00th=[ 2640], 95.00th=[ 2800],
     | 99.00th=[ 3504], 99.50th=[ 4048], 99.90th=[13632], 99.95th=[17024],
     | 99.99th=[22912]
   bw (  KiB/s): min= 1168, max= 1555, per=100.00%, avg=1497.67, stdev=39.50, samples=7003
   iops        : min=  146, max=  194, avg=187.15, stdev= 4.94, samples=7003
  lat (nsec)   : 1000=1.08%
  lat (usec)   : 2=59.81%, 4=38.59%, 10=0.36%, 20=0.14%, 50=0.02%
  lat (usec)   : 100=0.01%
  lat (msec)   : 2=0.01%
  fsync/fdatasync/sync_file_range:
    sync (usec): min=4950, max=43794, avg=5323.43, stdev=491.68
    sync percentiles (usec):
     |  1.00th=[ 5145],  5.00th=[ 5145], 10.00th=[ 5145], 20.00th=[ 5211],
     | 30.00th=[ 5211], 40.00th=[ 5276], 50.00th=[ 5276], 60.00th=[ 5276],
     | 70.00th=[ 5276], 80.00th=[ 5276], 90.00th=[ 5342], 95.00th=[ 5407],
     | 99.00th=[ 8356], 99.50th=[ 8717], 99.90th=[12125], 99.95th=[12387],
     | 99.99th=[12911]
  cpu          : usr=0.11%, sys=1.10%, ctx=1312291, majf=0, minf=14
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,655360,0,655359 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=1497KiB/s (1533kB/s), 1497KiB/s-1497KiB/s (1533kB/s-1533kB/s), io=5120MiB (5369MB), run=3502354-3502354msec

Disk stats (read/write):
  nvme2n1: ios=0/1966100, sectors=0/31458056, merge=0/1310775, ticks=0/3416205, in_queue=4253444, util=97.54%

</details>

Speaking of which, developers using Macs might be wondering: “My Mac also uses a consumer-grade SSD, so why does it run databases so quickly?” This brings us to a famous “semantic trap” in macOS.

Side Note: Mac Users Beware, Your fsync Might Be “Lying”

If you are developing locally on a MacBook Pro (M1/M2/M3), you might find that your local database runs incredibly snappy—sometimes even faster than on a server. However, this doesn’t mean Apple’s SSDs are “Enterprise-grade”; rather, it’s because macOS plays a game of semantics with its fsync implementation.

API Semantic Deception: fsync vs. F_FULLFSYNC

In the Linux standard, the definition of fsync(fd) is extremely strict: the call is not considered complete until the data has been written to the physical storage media.

However, on macOS (and iOS), the standard fsync system call does not guarantee that data hits the physical media. It often returns “success” as soon as the data is flushed to the drive’s DRAM cache. This is equivalent to macOS enabling a “cheat mode” for all SSDs by default.

To achieve “physical persistence safety” on macOS equivalent to Linux’s fsync, the application must call an Apple-specific fcntl command:

// Real "physical flush" on macOS
fcntl(fd, F_FULLFSYNC);
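For those working in higher-level languages, the same distinction is reachable without C. A hedged Python sketch (the F_FULLFSYNC constant only exists in the macOS build of the fcntl module, hence the fallback; the function name is ours):

```python
import fcntl
import os

def full_fsync(fd):
    """Force data to physical media.

    On macOS, plain os.fsync only guarantees the data reached the
    drive's cache, so we issue F_FULLFSYNC when it is available;
    elsewhere (e.g. Linux) os.fsync already carries that guarantee.
    """
    full = getattr(fcntl, "F_FULLFSYNC", None)
    if full is not None:            # macOS path: real physical flush
        fcntl.fcntl(fd, full)
    else:                           # Linux path: fsync is already strict
        os.fsync(fd)

fd = os.open("/tmp/fullfsync_demo.dat", os.O_WRONLY | os.O_CREAT, 0o644)
os.write(fd, b"durable")
full_fsync(fd)
os.close(fd)
```

This is the same trick behind SQLite’s fullfsync pragma and PostgreSQL’s wal_sync_method = fsync_writethrough on macOS.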

Summary and Suggestions

Through the basic fio test and the pgbench real-world comparison, the conclusion is very obvious: Nominal specs are just the tip of the iceberg; architectural differences are the monsters in the deep sea.

In standard sequential read/write tests, the AX41-NVMe’s consumer-grade drive can almost go toe-to-toe with the AX102’s enterprise-grade drive. But under real database loads, which contain massive numbers of fsync synchronous writes, the enterprise SSD, relying on the physical advantage of PLP (Power Loss Protection), beats the consumer drive by nearly 5x in TPS (≈26k vs ≈5.4k).

This precisely explains why many developers run databases happily in local development environments (usually on high-performance consumer NVMe) or on ordinary VPSes, and then, once concurrency goes up, watch IOWait go through the roof before the CPU even maxes out.

For those folks who are watching the AX102, our suggestion is:

  • If you are running stateless applications (such as Web Servers, CI/CD Runners, Compute Nodes): The ordinary AX series or even Cloud instances are completely sufficient; you do not need to pay the premium for Datacenter SSD and ECC memory.
  • If you are running stateful applications (especially databases like PostgreSQL, MySQL, Redis): For the extra few dozen Euros per month for Datacenter NVMe, you are buying not just larger capacity, but the determinism of IO latency and the safety of data landing.

After all, in a production environment, “stability” itself is the greatest performance metric. We hope this review helps you avoid detours and spend your money where it counts.

Happy Hacking!

Oh, by the way, if you are interested in Hetzner’s machines after reading this article, you can try using our link to register for Hetzner: https://hetzner.cloud/?ref=6moYBzkpMb9s

If you register through our link, you will directly receive €20 of usable credit after successful registration, and we will also receive a €10 reward, which also supports the development of our product.


The WebP Cloud Services team is a small team of three individuals from Shanghai and Malmö. Since we are not funded and have no profit pressure, we remain committed to doing what we believe is right. We strive to do our best within the scope of our resources and capabilities. We also engage in various activities without affecting the services we provide to the public, and we continuously explore novel ideas in our products.

If you find this service interesting, feel free to log in to the WebP Cloud Dashboard to experience it. If you’re curious about other magical features it offers, take a look at our WebP Cloud Services Docs. We hope everyone enjoys using it!


Discuss on Hacker News