2025-04-28

Notes on a Personal Computer Built for Calculation and AI

Introduction

Ten years on, the system first described in Personal Linux R Server in a Mini-ITX Gaming Case was getting old and slow. Time for a replacement...

Hardware

Step-by-step reference: How to Build a PC

Understanding fan specs and theory: Fan airflow and static pressure

Component List

PCPartPicker: 2025 RTX 4070 Build

Case ASUS Prime AP201 Black MicroATX Tempered Glass Front Panel (Amazon)
Power Supply Corsair SF1000 SFX Power Supply (Black) (B&H Photo)
Motherboard GIGABYTE Z790M AORUS ELITE AX ICE LGA 1700 Intel (Newegg)
Processor Intel Core i9-13900KS (B&H Photo)
CPU Cooler Cooler Master 240 Atmos White High Performance Close-Loop AIO Liquid Cooler (Amazon)
RAM Crucial Pro 96GB DDR5 RAM Kit (2x48GB) 5600MHz (Amazon)
M.2 SSD SAMSUNG 990 EVO Plus SSD 2TB, PCIe Gen 4x4, Gen 5x2 M.2 2280 (Amazon)
SATA SSD Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD
SATA Hard Drive WD Black 2TB - 7200 RPM SATA 6 Gb/s 64MB Cache 3.5 Inch WD2003FZEX
Video Card ASUS Dual GeForce RTX™ 4070 EVO OC Edition 12GB (Amazon)
120mm Case Fans 3x Noctua NF-A12x25 PWM chromax.Black.swap, Premium Quiet Fan, 4-Pin (120mm, Black) (Amazon)
92mm Case Fan Noctua NF-A9 PWM chromax.Black.swap, Premium Quiet Fan, 4-Pin (92mm, Black) (Amazon)

Update: replaced WD Black 2TB drive with another M.2 drive.

M.2 SSD WD_BLACK 8TB SN850X NVMe SSD with Heatsink - Gen4 PCIe, M.2 2280 (Amazon)

Miscellaneous Parts:

Straight SATA cables BENFEI 3 Pack SATA Cable III 6Gbps Straight HDD SDD Data Cable 18 Inch (Amazon)
PWM Fan Splitters JBtek All Black Sleeved PWM Fan Splitter Cable 1 to 2 Converter, 2 Pack (Amazon)
Motherboard Speaker 5 PCS Motherboard Speaker (Amazon)
External Speaker USB Computer Speaker for Desktop PC Laptop | Small Plug-N-Play External Speaker (Amazon)
Wireless Mouse Logitech M196 Bluetooth Wireless Mouse - Graphite (Amazon)
Windows New Windows 10 Professional USB 32/64-Bit and license key Sealed Retail box (eBay)
Cable Tape XFasten Wire Harness Tape 3/4 Inch x 50 ft, Residue-Free Cloth Electrical Felt Tape (Amazon)
Watt meter Upgraded Watt Power Meter Plug Home Electrical Usage Monitor with Backlight, Overload Protection, 7 Modes Display (Amazon)
EDID Dongle HDMI EDID Emulator Passthrough - 1920x1200 @59hz (Amazon)

Bling Accessories:

LED Fan Halos Addressable RGB Fan Halo, Airgoo 3 Packs Rainbow Fan Frame for 120mm Noctua PWM Fans (Amazon)
ARGB Fan Controller for Testing Thermalright 5V 3PIN A-RGB Fan Controller, 5V Lighting Controller (Amazon)
ARGB Cables BULIK 5V 3Pin ARGB Extension Cable, Motherboard Port to SM3P 1-to-1 Female ARGB Connector Adapter Cable (Amazon)
ARGB Light Bar Addressable RGB LED Strip for Gaming Case, 0.98ft 30LEDs Diffused Rainbow Magnetic ARGB Strip (Amazon)
Arduino-Based ARGB Controller Gelid Solutions CODI6 ARGB 6-Channel Programmable USB Controller Kit PWM LED Fan (eBay)
USB B Extension Cable NFHK USB 2.0 B Type Male to Female Extension Cable Down Angled 90 Degree 20cm for Printer Scanner Disk (Amazon)

Intel i9-13900KS CPU

Discussion of i9-139xx and i9-149xx CPUs:

Core i9-14900K ≈ i9-13900KS

At the time of purchase, the 13900KS was actually cheaper and more readily available than the 13900K.

Reviews:

CPU cooler

I opted for a smaller (240mm, 2-fan) liquid cooler. The case can actually accommodate a full-size (360mm) cooler, but that would have required moving the PSU case to a lower position, which could make it harder to install a longer graphics card and/or a 3.5" hard drive on the inside front of the case.

Review:

Installation video:

I first tried a Noctua NH-D15 cooler, but it's so big I was concerned about whether it would impinge on the video card, which has to go in the first slot next to it.

Motherboard

Memory (RAM)

I initially installed two 48GB DIMMs in the recommended slots on the motherboard, for a total of 96GB.

Once everything was working, and operating under the incorrect assumption that more would be better, I installed two more 48GB DIMMs, populating all four slots for a total of 192GB. What I failed to appreciate was how much the system would slow down as a result. The BIOS set the RAM speed at the stock value of 5600 MT/s when two DIMMs were installed; it dropped the speed to 4000 MT/s when four were installed. I tried enabling each of the XMP profiles (XMP 1 and XMP 2), but both promptly started producing memory errors when I ran a memory stress test (stressapptest):

sudo apt install stressapptest
sudo stressapptest -W -s 120

I was not interested in doing extensive tweaking and tuning to get the overclocking right, so I returned the extra DIMMs and reverted to 96GB and the stock 5600 MT/s.
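A rough back-of-the-envelope calculation shows why the speed drop matters. This is only a sketch: it assumes the usual dual-channel layout with 8 bytes moved per channel per transfer, and it gives theoretical peak bandwidth, which real workloads never quite reach. The function name is made up for illustration.

```shell
# Theoretical peak memory bandwidth in GB/s.
# Assumptions (not measured on this system): 2 memory channels,
# 8 bytes transferred per channel per transfer.
peak_bandwidth_gbs() {
    # $1 = transfer rate in MT/s, $2 = number of channels (default 2)
    awk -v mts="$1" -v ch="${2:-2}" 'BEGIN { printf "%.1f\n", mts * 8 * ch / 1000 }'
}

peak_bandwidth_gbs 5600   # two DIMMs at stock speed
peak_bandwidth_gbs 4000   # four DIMMs, speed lowered by the BIOS
```

Under these assumptions the four-DIMM configuration gives up a bit under 30% of peak bandwidth (roughly 64 GB/s versus 90 GB/s), which is likely to hurt any memory-bound workload.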

See LLM timing tests for comparative results of 96GB vs. 192GB when running actual LLMs.

Video Card (GPU)

I did initial testing of the system without a video card installed, using the Intel integrated graphics that are part of the i9-13900KS CPU. After everything else was demonstrated to be working, I unboxed the video card and installed it.

This is the one I wanted:

GeForce RTX 5090 Founders Edition
32GB VRAM

But it sold out a few seconds after it was launched. So I got this one:

ASUS Dual GeForce RTX™ 4070 EVO OC Edition 12GB GDDR6
12GB VRAM

Specification comparison here

ASUS installation instructions:

Power supply

Although the case can hold an ATX PSU, I opted for the smaller SFX format to maximize room. The Corsair SF1000 SFX could (just barely) handle an RTX 5090 (my first choice) and give plenty of headroom for anything else.

Review:

The box didn't come with a manual. I found these on the Corsair website after I had already hooked everything up, first the wrong way, then the right way:

I hooked up most of the cables I needed before installing the PSU in the case, but I found it helpful to have the following photo to refer to later on when attaching a few more.

Corsair SF1000 Outputs

PCPartPicker estimated the power requirement to be 713W (excluding LED bling) for the RTX 4070 build. The 4070 accounts for 200W of that. If it were replaced by an RTX 5090, the estimated power requirement would be 1088W, which could exceed the limit of the SF1000 if the CPU and GPU were both maxed out at the same time.
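The 1088W figure can be reproduced with simple arithmetic. A small sketch, assuming a 575W power figure for the RTX 5090 (NVIDIA's published number, not something measured here) and the SF1000's 1000W rating; the function name is hypothetical.

```shell
# Re-estimate total build power after swapping the GPU.
# $1 = current build estimate (W), $2 = current GPU (W), $3 = replacement GPU (W)
estimate_after_gpu_swap() {
    echo $(( $1 - $2 + $3 ))
}

psu_rating_w=1000                                   # Corsair SF1000 rated output
new_total_w=$(estimate_after_gpu_swap 713 200 575)  # 575 W is an assumed RTX 5090 figure
echo "Estimated draw with RTX 5090: ${new_total_w} W"
echo "Headroom against PSU rating:  $(( psu_rating_w - new_total_w )) W"
```

A negative headroom here corresponds to the worst case described above, with the CPU and GPU both maxed out at the same time.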

Possible alternate PSU if RTX 5090 installed:

Power connections

My initial mistake was assuming that the larger power cable labelled "Motherboard" was the only one I needed to hook up to the motherboard. When I did that, the motherboard powered up, but nothing else happened, and the yellow CPU warning light on the motherboard was lit.

Apparently, I'm not the only one to have been dismayed by this, fearing an expensive CPU or other failure. The solution was provided here as well: I just needed to hook up an additional one or two power cables. There was nothing wrong with any of the components.

The additional power connections are labelled ATX_12V_2X2 and ATX_12V_2X4 on the motherboard. The ATX_12V_2X4 is required. As explained here, the ATX_12V_2X2 might not be necessary in general, but given that the i9-13900KS is particularly power-hungry (as is the RTX 5090 if I ever get one), it seemed a good idea to hook that up too.

See Corsair document Which PSU cables go where?. Note that a 2x4 power cable connector can be split in half to give two 2x2 connectors.

For discussion of GPU power connections, see GPU Power Cable Guide – 6-Pin, 8-Pin, (6+2), 12-Pin PCIe.

Mass storage

The SATA SSD and the SATA HD came from the previous build.

The case came with two SATA cables with right-angle connectors, which were not useful. They can't be used with drives mounted on the case walls, hence the need to purchase additional SATA cables with straight connectors.

Fans

The case is designed to take two 120mm fans on the bottom, and has holes for mounting screws in the appropriate positions. I mounted an additional third fan, the smaller 92mm one, on the bottom with zip ties.

The standard Noctua NF-A12x25 PWM fans, in brown and tan, conveniently come with a Y-splitter cable. Stylish builders who want the all-black version are penalized twice: each fan costs two dollars more, and it doesn't come with the Y-splitter, which must be purchased separately.

ARGB elements

Noctua does not make fans that light up (because dignity), but Airgoo makes add-on LED "halos" that attach to their 120mm fans. For a good fit, I needed to slightly enlarge the halo mounting holes by running a 3/16" drill bit through them. The halos come in a set of three. They can be daisy-chained, but to run them independently, I needed the additional Bulik adapter cables.

Airgoo also makes a nice foot-long LED bar. I had to glue some rare earth magnets to the edge of the bottom fans so the bar could attach there. (The bar has weak magnetic strips along two edges.)

For immediate testing, the $8 Thermalright controller allowed for nice LED colors and pattern demos for all the available LEDs. For a permanent LED controller, the Gelid Solutions CODI6 is an Arduino-based controller for up to 6 units. It has a motherboard USB cable that allows for programming from within the system. In order for it to fit, it needed a short USB extension cable; one with a right-angle connector (in the correct direction) allows for optimum placement.


Unit          LED Count   LED Position
Light Bar     30          LED #0 is next to wire
Halo          30          LED #0 is near wire; proceeds CCW
Cooler Pump   12          LED #0 is at bottom vertex; proceeds CW
Cooler Fans   8           Point towards front of case is between LEDs 7 and 0; proceeds CW

The Arduino and LEDs get their power from a SATA-type power connector. I measured the current draw of the LEDs at various intensity levels to make sure this would be OK.

Light Bar Current (mA) at Intensity
LEDs   25%   50%   100%
R       70   100    100
G       70   100    100
B       70   100    100
RGB    125   150    150

Halo Current (mA) at Intensity
LEDs   25%   50%   100%
R       50    90    135
G       50    90    135
B       50    90    135
RGB    115   170    200
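As a quick sanity check, the measured worst-case currents can be added up and compared with what the connector can supply. This is a sketch only: it assumes all three halos are installed, and it assumes a limit of 4500 mA on the 5V rail of a SATA power connector (three pins at 1.5 A each), which should be checked against the PSU and cable actually used. The cooler pump and cooler fan LEDs were not measured, so the real total is somewhat higher.

```shell
# Sum the measured full-brightness (RGB at 100%) currents of the add-on strips.
# $1 = light bar current (mA), $2 = current per halo (mA), $3 = number of halos
led_total_ma() {
    echo $(( $1 + $2 * $3 ))
}

total_ma=$(led_total_ma 150 200 3)   # assumption: all three halos installed
limit_ma=4500                        # assumption: 3 pins x 1.5 A on the 5 V rail
echo "Measured strips at full white: ${total_ma} mA of an assumed ${limit_ma} mA"
```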

ARGB controllers

The Thermalright controller is a little $8 package that has some canned LED display colors and patterns. I could get it right away and use it for initial testing.

The CODI6 controller is more expensive, purchased via eBay from the UK. It contains a programmable Arduino controller for up to six devices, and the cables necessary for internal connection to a motherboard USB connector. Thus it can be programmed and controlled from within the main computer itself to create arbitrary, changing LED patterns.

It works as advertised, and is fun to program. The header pins that connect to the ARGB cables do not provide a secure connection, however. I found the slightest jar to a cable could cause it to come loose, which was irritating. (If I have to open the case again to move things around, I may just want to glue the cable connectors onto the board.)

Speakers

One tiny piezo speaker mounts on a motherboard header, and is needed to hear BIOS startup and error beeps. The other, a standard USB speaker, is needed to hear beeps and boops from software and operating systems.

Disk-activity LED

The motherboard came with header pins for a disk activity LED, but the case did not have one built in, so I had to add one myself. With the glass side panel facing the front of my desk, this was fine, since I could mount the additional LED where it could be seen through the panel; no drilling necessary.

The motherboard manual did not give specs for the output, so I had to make measurements. The crucial question was whether I needed a series current-limiting resistor, or not, and if so, how big.
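For reference, the usual sizing rule for a series resistor is R = (V_supply - V_forward) / I. The sketch below works it through with placeholder values (5 V at the header, a 2 V LED, 10 mA target current). None of these are the measurements from this motherboard, and many boards already include a resistor on the header, in which case none is needed. The function name is invented for illustration.

```shell
# Series resistor for an LED, using integer millivolts and milliamps
# (mV / mA = ohms).
# $1 = supply voltage (mV), $2 = LED forward voltage (mV), $3 = target current (mA)
series_resistor_ohms() {
    echo $(( ($1 - $2) / $3 ))
}

# Placeholder values, not measurements from this build:
series_resistor_ohms 5000 2000 10
```

With these placeholder numbers the result is 300 ohms; in practice the next larger standard resistor value would be the safe choice.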

Watt meter

I wanted some indicator for when I might be getting close to the limits of the power supply; probably not an issue for the RTX 4070, but quite possibly a problem for the RTX 5090, if I can get one. Therefore I hooked up an external power meter that continually shows the total wattage input to the SF1000 power supply. The SF1000 is rated at 1000W total output, and the input will always be greater than the output because efficiency is always less than 100%.

At best, the SF1000 is not expected to exceed about 93% efficiency. At 100% load, that drops to about 90% efficiency:

SF1000 efficiency graph

The above suggests that the 1000W output limit would correspond to about 1100W input as read on the wattmeter. It would be reasonable to use a safety factor and simply keep the input at 1000W or less.
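The conversion between wall input and PSU output is just a division or multiplication by the efficiency. A minimal sketch using the roughly 90% full-load efficiency quoted above; the function names are hypothetical.

```shell
# Convert between PSU output (DC) and wall input (AC) for a given efficiency.
# $1 = watts, $2 = efficiency as a percentage
input_for_output_w() {
    awk -v w="$1" -v eff="$2" 'BEGIN { printf "%.0f\n", w * 100 / eff }'
}
output_for_input_w() {
    awk -v w="$1" -v eff="$2" 'BEGIN { printf "%.0f\n", w * eff / 100 }'
}

input_for_output_w 1000 90    # wattmeter reading at the 1000 W output limit
output_for_input_w 1000 90    # PSU output when the wattmeter reads 1000 W
```

So holding the wattmeter reading at or below 1000W keeps the supply at roughly 900W output, about 10% under its rating.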

Five beeps at boot

In early testing, I sometimes heard the computer produce 5 beeps at boot, and show the CPU error light on the motherboard. This was alarming; 5 beeps is alleged to mean processor error. Did I have a damaged CPU?

After further research (e.g., here), I found that in this case the 5 beeps simply meant that a monitor had not been detected at boot. As long as I made sure that my KVM switch was set to the new computer before booting it, everything was fine.

EDID dongle

I wanted to use the new computer with a KVM switch that also had my Mac Mini attached. I noticed early on that the new box often presented only a blank screen when I switched to it after a minute or so away. If the delay was short, the computer seemed to remember my monitor configuration, otherwise it wouldn't. The problem was solved (in a suggestion that ChatGPT actually supplied) by the use of an EDID emulator dongle.

This also solved the "5 beeps" problem described above. The dongle specifies the exact resolution of my monitor, and makes it appear to the computer as if a monitor is always connected.

Microcode issues

In 2024 there were reports of problems with instability and damage to certain Intel processors. The problem was eventually confirmed by Intel and a microcode fix was released.

I had hoped that by the time I bought my CPU in January 2025 the fix would have been applied to the currently-shipping units. Output from the following command:

cat /proc/cpuinfo

gives information about each of the 32 logical processors and shows that the required microcode version (0x12b) is in place:

...
processor	: 31
vendor_id	: GenuineIntel
cpu family	: 6
model		: 183
model name	: 13th Gen Intel(R) Core(TM) i9-13900KS
stepping	: 1
microcode	: 0x12b
...
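To check every logical processor at once rather than reading the listing by eye, something like the following could be used. It is a minimal sketch: the function names are invented here, and 0x12b is the version cited above.

```shell
# Print the distinct microcode revisions reported across all logical CPUs.
# Reads /proc/cpuinfo by default; another file can be passed for testing.
microcode_versions() {
    awk -F':' '/^microcode/ { gsub(/[ \t]/, "", $2); print $2 }' "${1:-/proc/cpuinfo}" | sort -u
}

# Succeed only if every reported revision is at least the required one.
# $1 = required revision (e.g. 0x12b), $2 = optional cpuinfo file
microcode_at_least() {
    required=$(( $1 ))
    versions=$(microcode_versions "${2:-/proc/cpuinfo}")
    [ -n "$versions" ] || return 2      # no microcode lines found at all
    for v in $versions; do
        [ $(( $v )) -ge "$required" ] || return 1
    done
    return 0
}
```

On the real system, `microcode_at_least 0x12b && echo "microcode OK"` would print the message only if all logical processors report 0x12b or newer.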

Another way to get this information (and more) is via the Intel System Support Utility:

# Install these first:
sudo apt install lshw net-tools ethtool hdparm smartmontools wodim

# Then download the utility here.

# They made an error in naming the file
mv ssu_3.0.0.2_tar.gz Downloads/ssu_3.0.0.2.tar.gz

# unpack it, then run
sudo ./ssu.sh

# It writes the results to a text file with the name of computer.

Operating Systems

For initial testing, I just installed Debian, without a separate video card and with Secure Boot turned off. I intended to later install various other Linux distributions on my two larger drives, and Windows on a third separate drive that would have no other systems on it. Then I learned:

So I did a lot of erasing, reformatting, and reinstalling.

BIOS

Summary of BIOS settings

References
Secure Boot
Notes

With the Extreme setting, the mprime torture test sent CPU temp to peak of 92°C and peak input power to 490W momentarily, and then ramped down to a continuous 390W. With the Performance setting, the peak temperature was 77°C and the peak wattage was 390W.

Arduino in CODI6 ARGB controller remained powered on at all times unless ErP was enabled.

Initial Secure Boot state:

~$ mokutil --sb-state
SecureBoot disabled
Platform is in Setup Mode

BIOS steps to enable Secure Boot:

  1. Standard -> Custom
  2. Reinstall factory keys
  3. Custom -> Standard
  4. MODE = User
  5. Enable Secure Boot

After enabling:

~$ mokutil --sb-state
SecureBoot enabled
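The same check can be scripted, which is handy before running an installer that behaves differently under Secure Boot. A small sketch that relies only on the output format shown above; the function name is made up.

```shell
# Succeeds if the `mokutil --sb-state` output on stdin reports Secure Boot enabled.
secure_boot_enabled() {
    grep -q '^SecureBoot enabled'
}

# Usage on the real system:
#   mokutil --sb-state | secure_boot_enabled && echo "Secure Boot is on"
```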

Windows 11

Windows wants to go first, without any other systems installed. To decouple the Windows installation as much as possible from everything else, I installed it on the separate 500GB SSD dedicated solely to Windows.

Debian 12

Prepare installation flash drive:

Reboot to installation drive. Set up and install to 2TB drive nvme0n1. Use Standard Partitions (not LVM). Use Guided Partitioning to make use of entire drive.

The above should produce a working CLI system derived entirely from offline resources. The Debian GUI failed to start with the RTX 4070 unless an NVIDIA driver was installed separately.

Note: changing the BIOS setting for the internal graphics to anything but Enabled resulted in a non-bootable system that required resetting the motherboard BIOS to work again.

Prevent the system from asking for the CD to be reinserted when later installing software. Explanation.

su -
vi /etc/apt/sources.list
# comment out following line:

Set up static networking:

# look up name of wired interface:
ls /sys/class/net

su -
vi /etc/network/interfaces
# insert the following:
---
auto enp5s0
iface enp5s0 inet static
     address 172.16.1.20/24
     gateway 172.16.1.10
---

vi /etc/resolv.conf
# insert the following, replacing x.x.x.x with your DNS name server(s):
---
nameserver x.x.x.x
nameserver x.x.x.y
---

# bring up interface
ifup enp5s0

# test:
ping apple.com

exit

Retrieve previous ssh keys, if any

su -
mkdir -p /media/michael/data1
mount /dev/nvme1n1p2 /media/michael/data1

cp /media/michael/data1/ssh_host/* /etc/ssh/
reboot

Set up to install software from network per advice.


su -
tee /etc/apt/sources.list<<EOF
deb http://deb.debian.org/debian bookworm main contrib non-free-firmware
# deb-src http://deb.debian.org/debian bookworm main contrib non-free-firmware

deb http://deb.debian.org/debian bookworm-updates main contrib non-free-firmware
# deb-src http://deb.debian.org/debian bookworm-updates main contrib non-free-firmware

# deb http://deb.debian.org/debian bookworm-backports main contrib non-free-firmware
# deb-src http://deb.debian.org/debian bookworm-backports main contrib non-free-firmware

deb http://security.debian.org/debian-security bookworm-security main contrib non-free-firmware
# deb-src http://security.debian.org/debian-security bookworm-security main contrib non-free-firmware
EOF

Set up sudo for user.

su -
apt update
apt install sudo
usermod -aG sudo michael
exit
exit
# [login]

Install GNOME:

sudo apt install task-gnome-desktop

But make sure we don't boot into GUI yet!

sudo systemctl set-default multi-user.target

See below to install NVIDIA driver.

To switch to GUI after booting (only after installing driver):

startx

Fix issue of wired connection not being manageable in GNOME, per advice.

sudo vi /etc/NetworkManager/NetworkManager.conf
---
# change following entry to 'true':
managed=true
---

sudo service NetworkManager restart

AlmaLinux 9.5

Prepare installation flash drive using an existing Linux system:

Reboot to installation drive.

  • Set up networking.
  • Set up root and admin user passwords.
  • Set time zone.
  • Select workstation software. Select all sub-options that might be useful.

Set up and install to 8TB drive nvme1n1. Use Standard Partitions (not LVM).

Installation Partitions
Order   Path        Size
1.      /boot/efi   1024 MiB
2.      /boot       1024 MiB
3.      /           200 GiB
4.      [swap]      10 GiB

Manual partitioning

Change hostname to make clear which distro is in use:

sudo hostnamectl set-hostname aprime

ssh

If relevant files are already present on system, mount drive and copy them:

sudo cp -r /run/media/michael/data1/ssh_host ~

If relevant files are on Mac, copy via sftp:

# delete current ssh key on Mac for linux box
sudo vi ~/.ssh/known_hosts

sftp michael@prime

mkdir ssh_host
cd ssh_host
put /Users/michael/Documents/Prime\ System/ssh_host/*

sudo reboot

# delete new ssh key on Mac for linux box
sudo vi ~/.ssh/known_hosts

Use previous ssh keys, if any

su -
mkdir -p /run/media/michael/data1
mount /dev/nvme1n1p2 /run/media/michael/data1

cp /run/media/michael/data1/ssh_host/* /etc/ssh/
reboot

GNOME settings

  • Multitasking > Active Screen Edges: Off
  • Power > Power Mode: Performance
  • Power > Power Savings Options > Screen Blank: Never
  • Suspend & Power Button > Power Button Option: Power Off
  • Privacy > Screen Lock > Automatic Screen Lock: Off

Linux NVIDIA driver

Make sure the computer boots to the command line by typing:

sudo systemctl set-default multi-user.target

Reference:

If relevant files are already present on system, mount drive and copy them:

# For AlmaLinux use /run/media/michael/...
sudo cp -r /media/michael/data1/mok ~

If relevant files are on Mac, copy via sftp:

sftp michael@prime

cd ..
mkdir mok
cd mok
put /Users/michael/Documents/Prime\ System/mok/*

exit

AlmaLinux prep:

sudo dnf update

sudo dnf install epel-release
sudo dnf config-manager --enable crb
sudo dnf config-manager --set-enabled extras

sudo dnf install kernel-devel
sudo dnf install kernel-headers
sudo dnf install dkms
# sudo dnf install redhat-lsb-core	*** No match for argument: redhat-lsb-core
sudo dnf install vulkan
sudo dnf install vulkan-tools
sudo dnf install vulkan-headers
sudo dnf install vulkan-loader-devel

# this is also needed
sudo dnf install libglvnd-egl.i686

echo "blacklist nouveau" | sudo tee /etc/modprobe.d/nouveau-blacklist.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/nouveau-blacklist.conf

sudo dracut --force

sudo grub2-mkconfig -o /boot/grub2/grub.cfg

sudo reboot

Debian prep:

sudo apt install build-essential cmake git

sudo apt source linux
# fails with: E: You must put some 'deb-src' URIs in your sources.list

sudo vi /etc/apt/sources.list
# uncomment the deb-src lines
sudo apt update

sudo apt source linux
sudo apt install linux-headers-$(uname -r)

sudo dpkg --add-architecture i386
sudo apt update
sudo apt install libc6:i386

sudo apt install pkg-config libglvnd-dev


Download Latest Production Branch Version from https://www.nvidia.com/en-us/drivers/unix/:

Manual driver search: NVIDIA RTX / Quadro, NVIDIA RTX Series, NVIDIA RTX 4000 Ada Generation, Linux 64-bit, English (US)
cd Downloads/
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/570.144/NVIDIA-Linux-x86_64-570.144.run
chmod +x NVIDIA-Linux-x86_64-570.144.run

# AlmaLinux:
sudo ./NVIDIA-Linux-x86_64-570.144.run --glvnd-egl-config-path=/usr/share/glvnd/egl_vendor.d/

# Debian
#   (Make sure latest kernel headers are installed:)
sudo apt install linux-headers-$(uname -r)
sudo ./NVIDIA-Linux-x86_64-570.144.run

Multiple kernel module types are available for this system. Which would you like to use?

NVIDIA Proprietary

The target kernel has CONFIG_MODULE_SIG set, which means that it supports cryptographic signatures on kernel modules. On some systems, the kernel may refuse to load modules without a valid signature from a trusted key. This system also has UEFI Secure Boot enabled; many distributions enforce module signature verification on UEFI systems when Secure Boot is enabled. Would you like to sign the NVIDIA kernel module?

Sign the kernel module

First time on computer:

Would you like to sign the NVIDIA kernel module with an existing key pair, or would you like to generate a new one?

Generate a new key pair

The NVIDIA kernel module was successfully signed with a newly generated key pair. Would you like to delete the private signing key?

No

An X.509 certificate containing the public signing key will be installed to /usr/share/nvidia/nvidia-modsign-crt-C3A74FC1.der. The SHA1 fingerprint of this certificate is: 2E:18:9F:2F:9D:E5:A1:39:A2:14:D5:8C:8C:FE:DE:DC:2A:AF:46:C1. This certificate must be added to a key database which is trusted by your kernel in order for the kernel to be able to verify the module signature.

OK

The private signing key will be installed to /usr/share/nvidia/nvidia-modsign-key-C3A74FC1.key. After the public key is added to a key database which is trusted by your kernel, you may reuse the saved public/private key pair to sign additional kernel modules, without needing to re-enroll the public key. Please take some reasonable precautions to secure the private key: see the README for suggestions.

OK

The signed kernel module failed to load. Secure boot is enabled on this system, so this is likely because the kernel does not trust any key which is capable of verifying the module signature. Would you like to install the signed kernel module anyway? Note that if this module loading failure is due to the lack of a trusted signature, you will not be able to load the installed module until after a key that can verify the module signature is added to a key database that is trusted by the kernel. This will likely require rebooting your computer.

Install signed kernel module

Save the .key and .der files on a separate computer and/or a separate partition on the Linux computer so they can be reused on subsequent Linux installations.

After installation is complete, the following is needed:

sudo mokutil --import /usr/share/nvidia/nvidia-modsign-crt-C3A74FC1.der
sudo reboot

[michael@aprime ~]$ sudo mokutil --import /usr/share/nvidia/nvidia-modsign-crt-C3A74FC1.der
input password: 
[enter a one-time password of your choosing; it is requested again at the "Enroll MOK" screen]
sudo reboot
choose option: "Enroll MOK"
save keys

# For AlmaLinux use /run/media/michael/...
sudo cp /usr/share/nvidia/nvidia-modsign* ~/mok
sudo cp /usr/share/nvidia/nvidia-modsign* /media/michael/data1/mok


Later installations on computer:

Sign the kernel module

Would you like to sign the NVIDIA kernel module with an existing key pair, or would you like to generate a new one?

Use an existing key pair

Please provide the path to the private key:

/home/michael/mok/nvidia-modsign-key-C3A74FC1.key

Please provide the path to the public key:

/home/michael/mok/nvidia-modsign-crt-C3A74FC1.der

Install NVIDIA's 32-bit compatibility libraries?

Yes

Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if your kernel changes later.

Yes

The initramfs will likely need to be rebuilt due to the following condition(s): * Nouveau is present in the initramfs. Would you like to rebuild the initramfs?

Rebuild initramfs

Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up.

Yes

Your X configuration file has been successfully updated. Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 570.86.16) is now complete.

OK

Reboot and verify installation:

$ nvidia-smi
Wed Apr  2 18:38:10 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07             Driver Version: 570.133.07     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070        Off |   00000000:01:00.0  On |                  N/A |
|  0%   31C    P8              7W /  200W |     110MiB /  12282MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            2148      G   /usr/libexec/Xorg                        36MiB |
|    0   N/A  N/A            2223      G   /usr/bin/gnome-shell                     54MiB |
+-----------------------------------------------------------------------------------------+

After driver is successfully installed, start GUI by typing the following after logging in:

startx

Uninstall NVIDIA driver

Sometimes the driver needs to be uninstalled and reinstalled when a kernel update happens. The driver should always be uninstalled before changing to a new video card.

(Make sure the latest kernel headers are also installed before reinstalling the driver)

sudo ./NVIDIA-Linux-x86_64-570.144.run --uninstall

sudo apt install linux-headers-$(uname -r)

If you plan to no longer use the NVIDIA driver, you should make sure that no X screens are configured to use the NVIDIA X driver in your X configuration file. If you used nvidia-xconfig to configure X, it may have created a backup of your original configuration. Would you like to run `nvidia-xconfig --restore-original-backup` to attempt restoration of the original X configuration file?

Yes

Uninstallation of existing driver: NVIDIA Accelerated Graphics Driver for Linux-x86_64 (570.86.16) is complete.

OK

Build llama.cpp

Download and install CUDA Toolkit, etc., by visiting https://developer.nvidia.com/cuda-downloads and specifying desired platform (use Rocky Linux version).

Installer: Linux, x86_64, Rocky, 9, runfile (local)

Copy and run instructions.


cd ~/Downloads
wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda_12.8.1_570.124.06_linux.run
sudo sh cuda_12.8.1_570.124.06_linux.run

Accept the EULA, then install the following:


┌──────────────────────────────────────────────────────────────────────────────┐
│ CUDA Installer                                                               │
│ - [ ] Driver                                                                 │
│      [ ] 570.124.06                                                          │
│ + [X] CUDA Toolkit 12.8                                                      │
│   [X] CUDA Demo Suite 12.8                                                   │
│   [X] CUDA Documentation 12.8                                                │
│ - [ ] Kernel Objects                                                         │
│      [ ] nvidia-fs                                                           │
│   Options                                                                    │
│   Install                                                                    │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │
└──────────────────────────────────────────────────────────────────────────────┘

After installation, the following note appears:


===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-12.8/

Please make sure that
 -   PATH includes /usr/local/cuda-12.8/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-12.8/lib64, or, add /usr/local/cuda-12.8/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.8/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 570.00 is required for CUDA 12.8 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
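
The PATH and LD_LIBRARY_PATH note in the summary can be handled with a small shell snippet. This is only a sketch: prepend_once and CUDA_HOME are names made up here for illustration, and placing the snippet in ~/.bashrc is an assumption about the shell setup.

```shell
# prepend_once is a hypothetical helper (not part of CUDA): it prints LIST
# with DIR prepended, unless DIR is already one of its colon-separated entries.
prepend_once() {
  po_dir=$1
  po_list=$2
  case ":$po_list:" in
    *":$po_dir:"*) printf '%s\n' "$po_list" ;;           # already present
    *)
      if [ -n "$po_list" ]; then
        printf '%s\n' "$po_dir:$po_list"                  # prepend to existing list
      else
        printf '%s\n' "$po_dir"                           # list was empty
      fi
      ;;
  esac
}

# CUDA_HOME is an illustrative variable name; the path comes from the summary above.
CUDA_HOME=/usr/local/cuda-12.8
PATH=$(prepend_once "$CUDA_HOME/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

Because the helper skips directories that are already listed, sourcing the snippet repeatedly does not grow the variables. Afterwards, nvcc --version should report release 12.8 if the toolkit is installed where the summary says.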

llama.cpp Setup

Download llama.cpp from https://github.com/ggerganov/llama.cpp:

mkdir ~/Projects
cd ~/Projects/

git clone https://github.com/ggerganov/llama.cpp


Build (AlmaLinux):


cd ~/Projects/llama.cpp

export PATH=/usr/local/cuda-12.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:$LD_LIBRARY_PATH

sudo dnf update
sudo dnf install cmake
sudo dnf install libcurl-devel

sudo dnf install gcc-toolset-13-gcc.x86_64 gcc-toolset-13-gcc-gfortran gcc-toolset-13-gcc-c++

# enable software collection needed for llama.cpp
source scl_source enable gcc-toolset-13

cmake -B build -DGGML_CUDA=ON -DGGML_CCACHE=OFF
cmake --build build --config Release

# leave this shell so the gcc-toolset-13 environment does not carry over
exit

Mount data partition and run llama.cpp (AlmaLinux):

sudo mkdir -p /run/media/michael/data1
sudo mount /dev/nvme1n1p2 /run/media/michael/data1

~/Projects/llama.cpp/build/bin/llama-cli -m /run/media/michael/data1/models/Ministral-8B-Instruct-2410.bf16.gguf \
  -t 16 --n-gpu-layers 20

Build (Debian):


cd ~/Projects/llama.cpp

export PATH=/usr/local/cuda-12.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:$LD_LIBRARY_PATH

sudo apt update
sudo apt upgrade
sudo apt install cmake

sudo apt install curl libcurl4-openssl-dev

cmake -B build -DGGML_CUDA=ON -DGGML_CCACHE=OFF
cmake --build build --config Release

Mount data partition and run llama.cpp (Debian):

sudo mkdir -p /media/michael/data1
sudo mount /dev/nvme1n1p2 /media/michael/data1

~/Projects/llama.cpp/build/bin/llama-cli -m /media/michael/data1/models/Ministral-8B-Instruct-2410.bf16.gguf \
  -t 16 --n-gpu-layers 20

If necessary, convert Hugging Face models to .gguf files (Debian):

First time through, set up a Python virtual environment:

# https://askubuntu.com/questions/320996/how-to-make-python-program-command-execute-python-3
sudo apt install python-is-python3

sudo apt install python3-venv
python -m venv mvirte
mvirte/bin/pip install transformers torch
mvirte/bin/pip install sentencepiece

# Main argument is path to folder containing files of the form model-00001-of-00007.safetensors
mvirte/bin/python ~/Projects/llama.cpp/convert_hf_to_gguf.py /media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16 \
   --outfile  /media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16.gguf

If necessary, convert Hugging Face models to .gguf files (AlmaLinux):

First time through, set up a Python virtual environment:

# https://stackoverflow.com/questions/75608323/how-do-i-solve-error-externally-managed-environment-every-time-i-use-pip-3#75722775
python -m venv mvirte
mvirte/bin/pip install transformers torch
mvirte/bin/pip install sentencepiece

# Main argument is path to folder containing files of the form model-00001-of-00007.safetensors
mvirte/bin/python ~/Projects/llama.cpp/convert_hf_to_gguf.py /run/media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16 \
   --outfile  /run/media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16.gguf

If necessary, merge segmented .gguf files (Debian):

# (for AlmaLinux, use /run/media/michael/...)
~/Projects/llama.cpp/build/bin/llama-gguf-split --merge \
  /media/michael/data1/models/Llama-4-Scout-17B-16E-Instruct-GGUF/Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00001-of-00002.gguf \
  /media/michael/data1/models/Llama-4-Scout-17B-16E-Instruct-GGUF/Llama-4-Scout-17B-16E-Instruct-Q4_K_M.gguf

LLM timing tests

Hints from ChatGPT on suggested parameters to use for consistent timing tests:

To ensure repeatable testing with llama-cli, you can use the following command format:

~/Projects/llama.cpp/build/bin/llama-cli -m models/llama-2-13b.Q4_K_M.gguf \
       --prompt "Benchmark test." \
       --n-predict 128 \
       --threads 16 \
       --batch-size 512 \
       --log-disable \
       --repeat-penalty 1.0 \
       --temp 0.7

Explanation of Key Options:

-m models/llama-2-13b.Q4_K_M.gguf → Selects the model file.
--prompt "Benchmark test." → Keeps input consistent.
--n-predict 128 → Generates the same number of tokens on every run.
--threads 16 → Uses 16 threads for CPU processing (adjustable).
--batch-size 512 → Sets the prompt-processing batch size; tune if needed.
--log-disable → Prevents excess log output for clean results.
--repeat-penalty 1.0 & --temp 0.7 → Disables the repetition penalty and fixes the sampling temperature, so sampler settings do not vary between runs.

GPU-Specific:

To disable GPU completely, add --n-gpu-layers 0.
To test with GPU assist, set --n-gpu-layers X (e.g., 30 for a partial offload).
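
One rough way to pick a starting value for X is to divide a VRAM budget by the average size of a layer. The sketch below does that arithmetic. It is an approximation, not something llama.cpp provides: estimate_gpu_layers is a hypothetical helper, it assumes all layers are the same size, and the layer count and VRAM budget must be supplied by hand. The numbers in the example call are illustrative, not measurements from this machine.

```shell
# estimate_gpu_layers is a hypothetical helper, not part of llama.cpp.
# Arguments: model file size in MiB, number of layers in the model,
# VRAM budget in MiB (leave headroom for the KV cache and compute buffers).
estimate_gpu_layers() {
  awk -v size="$1" -v layers="$2" -v budget="$3" 'BEGIN {
    per_layer = size / layers        # assume equally sized layers
    n = int(budget / per_layer)      # whole layers that fit in the budget
    if (n > layers) n = layers       # cannot offload more layers than exist
    if (n < 0) n = 0
    print n
  }'
}

# Illustrative numbers only: a 13000 MiB file with 40 layers and a 10000 MiB budget.
estimate_gpu_layers 13000 40 10000   # prints 30
```

Treat the result as a first guess: if llama-cli runs out of GPU memory, lower the value and try again.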

Test results in tokens per second (tps), showing the effect of 4 DIMMs (192GB) vs. 2 DIMMs (96GB) and of CPU-only vs. CPU + GPU:

CPU only (tps)              192GB RAM   96GB RAM   2-DIMM Speedup
llama-2-13b.Q4_K_M.gguf          7.04       9.84              40%
llama-2-13b.Q8_0.gguf            4.03       5.71              42%
llama-2-70b.Q4_K_M.gguf          1.35       1.92              42%
Llama-2-70B-fp16_Q8_0.gguf       0.79       1.12              42%

GPU Assist (tps)            192GB RAM   96GB RAM   2-DIMM Speedup   GPU layers
llama-2-13b.Q4_K_M.gguf         50.95      51.03               0%          100
llama-2-13b.Q8_0.gguf            9.52      12.44              31%           27
llama-2-70b.Q4_K_M.gguf          1.73       2.39              38%           21
Llama-2-70B-fp16_Q8_0.gguf       0.90       1.27              41%           12

GPU Speedup                 192GB RAM   96GB RAM
llama-2-13b.Q4_K_M.gguf         7.24x      5.19x
llama-2-13b.Q8_0.gguf           2.36x      2.18x
llama-2-70b.Q4_K_M.gguf         1.28x      1.24x
Llama-2-70B-fp16_Q8_0.gguf      1.14x      1.13x