Notes on a Personal Computer Built for Calculation and AI
Introduction
Ten years later, the system described beginning in Personal Linux R Server in a Mini-ITX Gaming Case was getting old and slow. Time for a replacement...
Contents
- Hardware
- Component List
- Intel i9-13900KS CPU
- CPU cooler
- Motherboard
- Memory (RAM)
- Video Card (GPU)
- Power supply
- Power connections
- Mass storage
- Fans
- ARGB elements
- ARGB controllers
- Speakers
- Disk-activity LED
- Watt meter
- Five beeps at boot
- EDID dongle
- Microcode issues
- Operating Systems
- BIOS
- Windows 11
- Debian 12
- AlmaLinux 9.5
- ssh
- Gnome setup
- Linux NVIDIA driver
- Uninstall NVIDIA driver
- llama.cpp Setup
- LLM timing tests
Hardware
Step-by-step reference: How to Build a PC
Understanding fan specs and theory: Fan airflow and static pressure
Component List
PCPartPicker: 2025 RTX 4070 Build
| Case | ASUS Prime AP201 Black MicroATX Tempered Glass Front Panel (Amazon) |
|---|---|
| Power Supply | Corsair SF1000 SFX Power Supply (Black) (B&H Photo) |
| Motherboard | GIGABYTE Z790M AORUS ELITE AX ICE LGA 1700 Intel (Newegg) |
| Processor | Intel Core i9-13900KS (B&H Photo) |
| CPU Cooler | Cooler Master 240 Atmos White High Performance Close-Loop AIO Liquid Cooler (Amazon) |
| RAM | Crucial Pro 96GB DDR5 RAM Kit (2x48GB) 5600MHz (Amazon) |
| M.2 SSD | SAMSUNG 990 EVO Plus SSD 2TB, PCIe Gen 4x4, Gen 5x2 M.2 2280 (Amazon) |
| SATA SSD | Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSDX |
| SATA Hard Drive | WD Black 2TB - 7200 RPM SATA 6 Gb/s 64MB Cache 3.5 Inch WD2003FZEX |
| Video Card | ASUS Dual GeForce RTX™ 4070 EVO OC Edition 12GB (Amazon) |
| 120mm Case Fans | 3x Noctua NF-A12x25 PWM chromax.Black.swap, Premium Quiet Fan, 4-Pin (120mm, Black) (Amazon) |
| 92mm Case Fan | Noctua NF-A9 PWM chromax.Black.swap, Premium Quiet Fan, 4-Pin (92mm, Black) (Amazon) |
Update: replaced WD Black 2TB drive with another M.2 drive.
| M.2 SSD | WD_BLACK 8TB SN850X NVMe SSD with Heatsink - Gen4 PCIe, M.2 2280 (Amazon) |
|---|---|
Miscellaneous Parts:
| Straight SATA cables | BENFEI 3 Pack SATA Cable III 6Gbps Straight HDD SDD Data Cable 18 Inch (Amazon) |
|---|---|
| PWM Fan Splitters | JBtek All Black Sleeved PWM Fan Splitter Cable 1 to 2 Converter, 2 Pack (Amazon) |
| Motherboard Speaker | 5 PCS Motherboard Speaker (Amazon) |
| External Speaker | USB Computer Speaker for Desktop PC Laptop \| Small Plug-N-Play External Speaker (Amazon) |
| Wireless Mouse | Logitech M196 Bluetooth Wireless Mouse - Graphite (Amazon) |
| Windows | New Windows 10 Professional USB 32/64-Bit and license key Sealed Retail box (eBay) |
| Cable Tape | XFasten Wire Harness Tape 3/4 Inch x 50 ft, Residue-Free Cloth Electrical Felt Tape (Amazon) |
| Watt meter | Upgraded Watt Power Meter Plug Home Electrical Usage Monitor with Backlight, Overload Protection, 7 Modes Display (Amazon) |
| EDID Dongle | HDMI EDID Emulator Passthrough - 1920x1200 @59hz (Amazon) |
Bling Accessories:
| LED Fan Halos | Addressable RGB Fan Halo, Airgoo 3 Packs Rainbow Fan Frame for 120mm Noctua PWM Fans (Amazon) |
|---|---|
| ARGB Fan Controller for Testing | Thermalright 5V 3PIN A-RGB Fan Controller, 5V Lighting Controller (Amazon) |
| ARGB Cables | BULIK 5V 3Pin ARGB Extension Cable, Motherboard Port to SM3P 1-to-1 Female ARGB Connector Adapter Cable (Amazon) |
| ARGB Light Bar | Addressable RGB LED Strip for Gaming Case, 0.98ft 30LEDs Diffused Rainbow Magnetic ARGB Strip (Amazon) |
| Arduino-Based ARGB Controller | Gelid Solutions CODI6 ARGB 6-Channel Programmable USB Controller Kit PWM LED Fan (eBay) |
| USB B Extension Cable | NFHK USB 2.0 B Type Male to Female Extension Cable Down Angled 90 Degree 20cm for Printer Scanner Disk (Amazon) |
Intel i9-13900KS CPU
Discussion of i9-139xx and i9-149xx CPUs:
Core i9-14900K ≈ i9-13900KS
At the time of purchase, the 13900KS was actually cheaper and more readily available than the 13900K.
Reviews:
- Intel Core i9-13900KS Review: The World's First 6 GHz 320W CPU 2023-01-29
- Intel Core i9-13900KS Review - The Empire Strikes Back 2023-04-21
CPU cooler
I opted for a smaller (240mm, 2-fan) liquid cooler. The case can actually accommodate a full-size (360mm) cooler, but that would have required moving the PSU case to a lower position, which could make it harder to install a longer graphics card or a 3.5" hard drive on the inside front of the case.
Review:
Installation video:
I first tried a Noctua NH-D15 cooler, but it's so big I was concerned about whether it would impinge on the video card, which has to go in the first slot next to it.
Motherboard
- Manual here: GIGABYTE Z790M AORUS ELITE AX ICE LGA 1700 Intel
Memory (RAM)
I initially installed two 48GB DIMMs in the recommended slots on the motherboard, for a total of 96GB.
Once everything was working, and operating under the incorrect assumption that more would be better, I installed two more 48GB DIMMs, populating all four slots for a total of 192GB. What I failed to appreciate was how much the system would slow down when this was done. The BIOS set the RAM speed at the stock value of 5600 MT/s when two DIMMs were installed; it dropped the speed to 4000 MT/s when four were installed. I tried enabling each of the XMP profiles (XMP 1 and XMP 2), but both promptly started producing memory errors when I ran a memory stress test (stressapptest):
sudo apt install stressapptest
sudo stressapptest -W -s 120
I was not interested in doing extensive tweaking and tuning to get the overclocking right, so I returned the extra DIMMs and reverted to 96GB and the stock 5600 MT/s.
See LLM timing tests for comparative results of 96GB vs. 192GB when running actual LLMs.
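To put the speed drop in perspective, here is a rough sketch of the theoretical peak memory bandwidth at each setting. It assumes a dual-channel configuration with a 64-bit (8-byte) data path per channel, which is the usual arrangement on this class of desktop board; the function name `peak_bw_gbs` is only for illustration, and real-world throughput will be lower than these figures.

```shell
#!/bin/sh
# Theoretical peak memory bandwidth in GB/s.
# Assumption: 2 channels x 8 bytes per transfer (dual-channel DDR5).
peak_bw_gbs() {
    # $1 = transfer rate in MT/s
    awk -v mts="$1" 'BEGIN { printf "%.1f\n", mts * 8 * 2 / 1000 }'
}

echo "5600 MT/s: $(peak_bw_gbs 5600) GB/s"
echo "4000 MT/s: $(peak_bw_gbs 4000) GB/s"
```

By this estimate, populating all four slots gives up a bit more than a quarter of the peak bandwidth.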
Video Card (GPU)
I did initial testing of the system without a video card installed; I used the Intel internal graphics that is part of the i9-13900KS CPU. After everything else was demonstrated to be working, I unboxed the video card and installed it.
This is the one I wanted:

32GB VRAM
But it sold out a few seconds after it was launched. So I got this one:

Specification comparison here
ASUS installation instructions:
Power supply
Although the case can hold an ATX PSU, I opted for the smaller SFX format to maximize room. The Corsair SF1000 SFX could (just barely) handle an RTX 5090 (my first choice) and give plenty of headroom for anything else.
Review:
- Corsair SF1000 ATX v3.1 PSU Review 2024-06-14
The box didn't come with a manual. I found these on the Corsair website after I had already hooked everything up, first the wrong way, then the right way:
I hooked up most of the cables I needed before installing the PSU in the case, but I found it helpful to have the following photo to refer to later on when attaching a few more.
PCPartPicker estimated the power requirement to be 713W (excluding LED bling) for the RTX 4070 build. The 4070 accounts for 200W of that. If it were replaced by an RTX 5090, the estimated power requirement would be 1088W, which could exceed the limit of the SF1000 if the CPU and GPU were both maxed out at the same time.
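The arithmetic behind that estimate can be sketched as follows. The 713W and 200W figures are the PCPartPicker numbers quoted above; the 575W figure for the RTX 5090 is an assumption taken from NVIDIA's published total graphics power, and `swap_gpu_watts` is just an illustrative helper name.

```shell
#!/bin/sh
# Estimated system power after swapping the GPU.
# 713 W (build) and 200 W (RTX 4070) are from the PCPartPicker estimate.
# 575 W for the RTX 5090 is an assumed figure (published board power).
swap_gpu_watts() {
    # $1 = current build estimate, $2 = current GPU, $3 = replacement GPU
    echo $(( $1 - $2 + $3 ))
}

echo "Estimate with RTX 5090: $(swap_gpu_watts 713 200 575) W"
```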
Possible alternate PSU if RTX 5090 installed:
Power connections
My initial mistake was assuming that the larger power cable labelled "Motherboard" was the only one I needed to hook up to the motherboard. When I did that, the motherboard powered up, but nothing else happened, and the yellow CPU warning light on the motherboard was lit.
Apparently, I'm not the only one to have been dismayed by this, fearing an expensive CPU or other failure. The solution was provided here as well: I just needed to hook up an additional one or two power cables. There was nothing wrong with any of the components.
The additional power connections are labelled ATX_12V_2X2 and ATX_12V_2X4 on the motherboard. The ATX_12V_2X4 is required. As explained here, the ATX_12V_2X2 might not be necessary in general, but given that the i9-13900KS is particularly power-hungry (as is the RTX 5090 if I ever get one), it seemed a good idea to hook that up too.
See Corsair document Which PSU cables go where?. Note that a 2x4 power cable connector can be split in half to give two 2x2 connectors.
For discussion of GPU power connections, see GPU Power Cable Guide – 6-Pin, 8-Pin, (6+2), 12-Pin PCIe.
Mass storage
The SATA SSD and the SATA HD came from the previous build.
The case came with two SATA cables with right-angle connectors, which were not useful. They can't be used with drives mounted on the case walls, hence the need to purchase additional SATA cables with straight connectors.
Fans
The case is designed to take two 120mm fans on the bottom, and has holes for mounting screws in the appropriate positions. I mounted an additional third fan, the smaller 92mm one, on the bottom with zip ties.
The standard Noctua NF-A12x25 PWM fans, in brown and tan, conveniently come with a Y-splitter cable. Stylish builders who want the all-black version are penalized twice: each fan costs two dollars more, and it doesn't come with the Y-splitter, which must be purchased separately.
ARGB elements
Noctua does not make fans that light up (because dignity), but Airgoo makes add-on LED "halos" that attach to their 120mm fans. For a good fit, I needed to slightly enlarge the halo mounting holes by running a 3/16" drill bit through them. The halos come in a set of three. They can be daisy-chained, but to run them independently, I needed the additional Bulik adapter cables.
Airgoo also makes a nice foot-long LED bar. I had to glue some rare earth magnets to the edge of the bottom fans so the bar could attach there. (The bar has weak magnetic strips along two edges.)
For immediate testing, the $8 Thermalright controller allowed for nice LED colors and pattern demos for all the available LEDs. For a permanent LED controller, the Gelid Solutions CODI6 is an Arduino-based controller for up to 6 units. It has a motherboard USB cable that allows for programming from within the system. In order for it to fit, it needed a short USB extension cable; one with a right-angle connector (in the correct direction) allows for optimum placement.
| Unit | LED Count | LED Position |
|---|---|---|
| Light Bar | 30 | LED #0 is next to wire |
| Halo | 30 | LED #0 is near wire, proceeds CCW |
| Cooler Pump | 12 | LED #0 is at bottom vertex; proceeds CW |
| Cooler Fans | 8 | Point towards front of case is between LEDs 7 and 0; proceeds CW |
The Arduino and LEDs get their power from a SATA-type power connector. I measured the current draw of the LEDs at various intensity levels to make sure this would be OK.
| Light Bar Current (mA) at Intensity | |||
|---|---|---|---|
| LEDs | 25% | 50% | 100% |
| R | 70 | 100 | 100 |
| G | 70 | 100 | 100 |
| B | 70 | 100 | 100 |
| RGB | 125 | 150 | 150 |
| Halo Current (mA) at Intensity | |||
|---|---|---|---|
| LEDs | 25% | 50% | 100% |
| R | 50 | 90 | 135 |
| G | 50 | 90 | 135 |
| B | 50 | 90 | 135 |
| RGB | 115 | 170 | 200 |
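As a rough worked example, the worst-case load from the measured units can be added up from the "RGB at 100%" rows of the two tables. This sketch assumes one light bar and all three halos are lit at full white at the same time; the cooler pump and cooler fan LEDs were not measured, so they are left out of the total.

```shell
#!/bin/sh
# Worst-case current for the measured LED units, in mA.
# Values are the "RGB at 100%" rows of the tables above.
# Assumption: one light bar plus three halos, all at full white.
BAR_MA=150
HALO_MA=200
HALO_COUNT=3

total_ma=$(( BAR_MA + HALO_MA * HALO_COUNT ))
echo "Measured LED load: ${total_ma} mA"

# Corresponding power on the 5 V ARGB supply, in watts
awk -v ma="$total_ma" 'BEGIN { printf "Power at 5 V: %.2f W\n", ma * 5 / 1000 }'
```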
ARGB controllers
The Thermalright controller is a little $8 package that has some canned LED display colors and patterns. I could get it right away and use it for initial testing.
The CODI6 controller is more expensive, purchased via eBay from the UK. It contains a programmable Arduino controller for up to six devices, and the cables necessary for internal connection to a motherboard USB connector. Thus it can be programmed and controlled from within the main computer itself to create arbitrary, changing LED patterns.
It works as advertised, and is fun to program. The header pins that connect to the ARGB cables do not provide a secure connection, however. I found the slightest jar to a cable could cause it to come loose, which was irritating. (If I have to open the case again to move things around, I may just want to glue the cable connectors onto the board.)
Speakers
One tiny piezo speaker mounts on a motherboard header, and is needed to hear BIOS startup and error beeps. The other, a standard USB speaker, is needed to hear beeps and boops from software and operating systems.
Disk-activity LED
The motherboard came with header pins for a disk activity LED, but the case did not have one built in, so I had to add one myself. With the glass side panel facing the front of my desk, this was fine, since I could mount the additional LED where it could be seen through the panel; no drilling necessary.
The motherboard manual did not give specs for the output, so I had to make measurements. The crucial question was whether I needed a series current-limiting resistor, or not, and if so, how big.
- The no-load output of the motherboard connection was 3.236V.
- With a 1kΩ load resistor across the output, the voltage was exactly the same, suggesting there was no internal current-limiting resistor (although the output could be a current source).
- With a 300Ω resistor in series with a green LED I got reasonable brightness.
- Current was 2.4mA, which seems reasonable for safe operation.
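The measurements above can be cross-checked with Ohm's law. This is a sketch that assumes the header still supplies 3.236V under load (consistent with the 1kΩ test); the helper names are hypothetical, and the second helper treats the LED forward voltage as constant, which is only approximately true as the current changes.

```shell
#!/bin/sh
# Ohm's-law check of the disk-activity LED measurements.
V_SUPPLY=3.236   # volts, measured no-load output
R_SERIES=300     # ohms, series resistor
I_LED_MA=2.4     # milliamps, measured current

# Supply voltage minus the drop across the resistor = LED forward voltage
led_forward_v() {
    # $1 = supply V, $2 = resistor ohms, $3 = current in mA
    awk -v v="$1" -v r="$2" -v ma="$3" \
        'BEGIN { printf "%.3f\n", v - r * ma / 1000 }'
}

# Series resistor for a different target current (hypothetical helper)
series_r_ohms() {
    # $1 = supply V, $2 = LED forward V, $3 = target current in mA
    awk -v v="$1" -v vf="$2" -v ma="$3" \
        'BEGIN { printf "%.0f\n", (v - vf) / (ma / 1000) }'
}

vf=$(led_forward_v "$V_SUPPLY" "$R_SERIES" "$I_LED_MA")
echo "Implied LED forward voltage: ${vf} V"
echo "Resistor for 5 mA: $(series_r_ohms "$V_SUPPLY" "$vf" 5) ohms"
```

The implied forward voltage of about 2.5V is in a plausible range for a green LED, which suggests the readings are self-consistent.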
Watt meter
I wanted some indicator for when I might be getting close to the limits of the power supply; probably not an issue for the RTX 4070, but definitely might be a problem for the RTX 5090, if I can get one. Therefore I hooked up an external power meter that can continually show the total wattage input to the SF1000 power supply. The SF1000 is rated at 1000W total output, and the input will always be greater than output due to efficiency always being less than 100%.
At best, the SF1000 is not expected to exceed about 93% efficiency. At 100% load, that drops to about 90% efficiency:
The above suggests that the 1000W output limit would correspond to about 1100W input as read on the wattmeter. It would be reasonable to use a safety factor and simply keep the input at 1000W or less.
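The conversion between wattmeter reading and PSU output is just a division by efficiency. A minimal sketch, using the roughly 90% full-load efficiency mentioned above (the function names are illustrative only):

```shell
#!/bin/sh
# Wall-side (input) watts for a given DC output and efficiency.
input_watts() {
    # $1 = output watts, $2 = efficiency in percent
    awk -v out="$1" -v eff="$2" 'BEGIN { printf "%.0f\n", out * 100 / eff }'
}

# DC output watts implied by a wattmeter reading.
output_watts() {
    # $1 = wattmeter reading in watts, $2 = efficiency in percent
    awk -v in_w="$1" -v eff="$2" 'BEGIN { printf "%.0f\n", in_w * eff / 100 }'
}

echo "1000 W out at 90%: $(input_watts 1000 90) W at the wall"
echo "1000 W at the wall at 90%: $(output_watts 1000 90) W delivered"
```

Holding the wattmeter reading to 1000W therefore keeps the PSU output near 900W, about 10% below its rating.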
Five beeps at boot
In early testing, I sometimes heard the computer produce 5 beeps at boot, and show the CPU error light on the motherboard. This was alarming; 5 beeps is alleged to mean processor error. Did I have a damaged CPU?
After further research (e.g., here), I found that in this case the 5 beeps simply meant that a monitor had not been detected at boot. As long as I made sure that my KVM switch was set to the new computer before booting it, everything was fine.
EDID dongle
I wanted to use the new computer with a KVM switch that also had my Mac Mini attached. I noticed early on that the new box often presented only a blank screen when I switched to it after a minute or so away. If the delay was short, the computer seemed to remember my monitor configuration, otherwise it wouldn't. The problem was solved (in a suggestion that ChatGPT actually supplied) by the use of an EDID emulator dongle.
This also solved the "5 beeps" problem described above. The dongle specifies the exact resolution of my monitor, and makes it appear to the computer as if a monitor is always connected.
Microcode issues
In 2024 there were reports of problems with instability and damage to certain Intel processors. The problem was eventually confirmed by Intel and a microcode fix was released.
- June 2024 Guidance regarding Intel Core 13th and 14th Gen K/KF/KS instability reports
- Intel Core 13th and 14th Gen Desktop Instability Root Cause Update
I had hoped that by the time I bought my CPU in January 2025, the fix would have been applied to the currently-shipping units. Output from the following command:
cat /proc/cpuinfo
gives information about each of the 32 processors and shows that the required microcode version (0x12b) is in place:
...
processor : 31
vendor_id : GenuineIntel
cpu family : 6
model : 183
model name : 13th Gen Intel(R) Core(TM) i9-13900KS
stepping : 1
microcode : 0x12b
...
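Since the full output repeats for all 32 logical processors, a shorter check is to pull out only the distinct microcode values. This is a small sketch; `microcode_versions` is an illustrative name, and it simply filters cpuinfo-style text supplied on standard input.

```shell
#!/bin/sh
# Print the distinct microcode revisions found in cpuinfo-style text on stdin.
microcode_versions() {
    awk -F': *' '/^microcode/ { print $2 }' | sort -u
}

# On a Linux system, read the real file if it is available.
if [ -r /proc/cpuinfo ]; then
    microcode_versions < /proc/cpuinfo
fi
```

A single line of output means every logical processor reports the same revision.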
Another way to get this information (and more) is via the Intel System Support Utility:
# Install these first:
sudo apt install lshw net-tools ethtool hdparm smartmontools wodim
# Then download the utility here.
# They made an error in naming the file
mv ssu_3.0.0.2_tar.gz Downloads/ssu_3.0.0.2.tar.gz
# unpack it, then run
sudo ./ssu.sh
# It writes the results to a text file with the name of computer.
Operating Systems
For initial testing, I just installed Debian, without a separate video card and with Secure Boot turned off. I intended to later install various other Linux distributions on my two larger drives, and Windows on a third separate drive that would have no other systems on it. Then I learned:
- Windows does not like to be installed if there are any other systems installed. Windows wants to be first.
- Windows 10 does not require Secure Boot, but Windows 11 does.
- The operating systems seem to have a smoother installation if they are installed with Secure Boot on, rather than installing with Secure Boot off and trying to enable it later.
- The operating systems seem to have a smoother installation if they are installed with the video card installed, rather than installing without a video card and then trying to enable it later.
- All operating systems will initially be able to work with the video card even if Secure Boot is enabled, but some will not be able to load proprietary Nvidia drivers with Secure Boot on until signing issues have been dealt with.
So I did a lot of erasing, reformatting, and reinstalling.
BIOS
Summary of BIOS settings
- Switch to Advanced settings.
- Tweaker > Intel Default Settings: default is Extreme, changed to Performance.
- Settings > Platform Power > ErP: default is Disabled, changed to Enabled.
- Boot > Secure Boot: default is Disabled, changed to Enabled.
- Smart Fan (F6) > default is Silent, changed to Normal.
References
- Gigabyte Z790 BIOS Manual
- Intel Power Delivery Profiles
- What exactly is ErP in BIOS?
ErP Support determines whether to let the system consume less than 1W of power in S5 (shutdown) state. When the setting is enabled, the following four functions will become unavailable: PME Event Wake Up, Power On By Mouse, Power On By Keyboard, and Wake On LAN.
- 13th Generation Intel Core, Intel Core 14th Generation, Intel Core Processor (Series 1) and (Series 2), and Intel® Xeon™ E 2400 Processor Datasheet, Volume 1 of 2
Secure Boot
- https://www.linux.org/threads/should-i-enable-secure-boot-after-installation-of-linux.54247/
....There is a boot shim that gets installed during the initial installation. It checks to see if secure boot is enabled or not.
Most always, if you have secure boot disabled during install, the non-secure boot shim gets installed. If you re-enable secure boot, it likely won't work.
- What is "Platform is in setup mode" mean? SecureBoot disabled although TPM is enabled
- Debian Wiki: Secure Boot
- Debian Secure Boot: To be, or not to be, that is the question!
- Overview: How UEFI Secure Boot Works in Linux
- AlmaLinux: Secure Boot
- FreeBSD: Secure Boot
- Ubuntu Wiki: Secure Boot
- Fedora: Need help to enable Secure Boot
Notes
With the Extreme setting, the mprime torture test sent the CPU temperature to a peak of 92°C and input power to a momentary peak of 490W, which then ramped down to a continuous 390W. With the Performance setting, the peak temperature was 77°C and the peak wattage was 390W.
Arduino in CODI6 ARGB controller remained powered on at all times unless ErP was enabled.
Initial Secure Boot state:
~$ mokutil --sb-state
SecureBoot disabled
Platform is in Setup Mode
BIOS steps to enable Secure Boot:
- Standard -> Custom
- Reinstall factory keys
- Custom -> Standard
- MODE = User
- Enable Secure Boot
After enabling:
~$ mokutil --sb-state
SecureBoot enabled
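When repeating this across several installations, the two states shown above can be summarized with a small helper. This is only a sketch: `sb_state` is a hypothetical function that matches the exact strings seen in the output above, and it reports "unknown" if neither string appears.

```shell
#!/bin/sh
# Classify `mokutil --sb-state` output read from stdin.
sb_state() {
    state=unknown
    setup=no
    while IFS= read -r line; do
        case "$line" in
            *"SecureBoot enabled"*)  state=enabled ;;
            *"SecureBoot disabled"*) state=disabled ;;
            *"Setup Mode"*)          setup=yes ;;
        esac
    done
    echo "secure_boot=$state setup_mode=$setup"
}

# Run against the live system only if mokutil is installed.
if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state 2>/dev/null | sb_state
fi
```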
Windows 11
Windows wants to go first, without any other systems installed. To decouple the Windows installation as much as possible from everything else, I installed it on the separate 500GB SSD dedicated solely to Windows.
- Make sure Secure Boot is enabled.
- Wipe other system installations.
- Install Windows 10.
- Download Windows 11 Installation Assistant and update.
- Hope for the best.
Debian 12
Prepare installation flash drive:
- Go to Debian website.
- Download 64-bit PC DVD-1 iso. This is the larger complete installation, which does not require an internet connection when installing.
- Download balenaEtcher from https://etcher.balena.io
- Use balenaEtcher to burn ISO to flash drive
Reboot to installation drive. Setup and install to 2TB drive nvme0n1. Use Standard Partitions (not LVM).
Use Guided Partitioning to make use of entire drive.
- Configure the network: [choose wired interface]
- Do not configure the network at this time
- Hostname: dprime
- Partition disks: Guided - use entire disk
- Select disk: /dev/nvme0n1 2.0TB
- Partitioning scheme: All files in one partition
- Software selection: Uncheck all except SSH server, standard system utilities
The above should produce a working CLI system derived entirely from offline resources. The Debian GUI failed to start with the RTX 4070 unless an NVIDIA driver was installed separately.
Note: changing the BIOS setting for the internal graphics to anything but Enabled resulted in a non-bootable system that required resetting the motherboard BIOS to work again.
Prevent system from asking CD to be reinserted when later installing software. Explanation.
su -
vi /etc/apt/sources.list
# comment out following line:
Set up static networking:
# look up name of wired interface:
ls /sys/class/net
su -
vi /etc/network/interfaces
# insert the following:
---
auto enp5s0
iface enp5s0 inet static
address 172.16.1.20/24
gateway 172.16.1.10
---
vi /etc/resolv.conf
# insert the following, replacing x.x.x.x with your DNS name server(s):
---
nameserver x.x.x.x
nameserver x.x.x.y
---
# bring up interface
ifup enp5s0
# test:
ping apple.com
exit
Retrieve previous ssh keys, if any
su -
mkdir -p /media/michael/data1
mount /dev/nvme1n1p2 /media/michael/data1
cp /media/michael/data1/ssh_host/* /etc/ssh/
reboot
Set up to install software from network per advice.
su -
tee /etc/apt/sources.list<<EOF
deb http://deb.debian.org/debian bookworm main contrib non-free-firmware
# deb-src http://deb.debian.org/debian bookworm main contrib non-free-firmware
deb http://deb.debian.org/debian bookworm-updates main contrib non-free-firmware
# deb-src http://deb.debian.org/debian bookworm-updates main contrib non-free-firmware
# deb http://deb.debian.org/debian bookworm-backports main contrib non-free-firmware
# deb-src http://deb.debian.org/debian bookworm-backports main contrib non-free-firmware
deb http://security.debian.org/debian-security bookworm-security main contrib non-free-firmware
# deb-src http://security.debian.org/debian-security bookworm-security main contrib non-free-firmware
EOF
Set up sudo for user.
su -
apt update
apt install sudo
usermod -aG sudo michael
exit
exit
# [login]
Install GNOME:
sudo apt install task-gnome-desktop
But make sure we don't boot into GUI yet!
sudo systemctl set-default multi-user.target
See below to install NVIDIA driver.
To switch to GUI after booting (only after installing driver):
startx
Fix issue of wired connection not being manageable in GNOME, per advice.
sudo vi /etc/NetworkManager/NetworkManager.conf
---
# change following entry to 'true':
managed=true
---
sudo service NetworkManager restart
AlmaLinux 9.5
Prepare installation flash drive using an existing Linux system:
- Go to AlmaLinux website.
- Download x86-64 AlmaLinux OS 9.5 Boot ISO
- Download balenaEtcher from https://etcher.balena.io
- Use balenaEtcher to burn ISO to flash drive
Reboot to installation drive.
- Set up networking.
- Set up root and admin user passwords.
- Set time zone.
- Select workstation software. Select all sub-options that might be useful.
Setup and install to 8TB drive nvme1n1. Use Standard Partitions (not LVM).
| Installation Partitions | |||
|---|---|---|---|
| Order | Path | Size | |
| 1. | /boot/efi | 1024 MiB | |
| 2. | /boot | 1024 MiB | |
| 3. | / | 200 GiB | |
| 4. | [swap] | 10 GiB | |
Change hostname to make clear which distro is in use:
sudo hostnamectl set-hostname aprime
ssh
If relevant files are already present on system, mount drive and copy them:
sudo cp -r /run/media/michael/data1/ssh_host ~
If relevant files are on Mac, copy via sftp:
# delete current ssh key on Mac for linux box
sudo vi ~/.ssh/known_hosts
sftp michael@prime
mkdir ssh_host
cd ssh_host
put /Users/michael/Documents/Prime\ System/ssh_host/*
sudo reboot
# delete new ssh key on Mac for linux box
sudo vi ~/.ssh/known_hosts
Use previous ssh keys, if any
su -
mkdir -p /run/media/michael/data1
mount /dev/nvme1n1p2 /run/media/michael/data1
cp /run/media/michael/data1/ssh_host/* /etc/ssh/
reboot
Gnome settings
- Multitasking > Active Screen Edges: Off
- Power > Power Mode: Performance
- Power > Power Savings Options > Screen Blank: Never
- Suspend & Power Button > Power Button Option: Power Off
- Privacy > Screen Lock > Automatic Screen Lock: Off
Linux NVIDIA driver
Be sure computer boots to command line, by typing
sudo systemctl set-default multi-user.target
Reference:
- AlmaLinux Wiki - NVIDIA: Installation on 9.x Variant III
If relevant files are already present on system, mount drive and copy them:
# For AlmaLinux use /run/media/michael/...
sudo cp -r /media/michael/data1/mok ~
If relevant files are on Mac, copy via sftp:
sftp michael@prime
cd ..
mkdir mok
cd mok
put /Users/michael/Documents/Prime\ System/mok/*
exit
AlmaLinux prep:
sudo dnf update
sudo dnf install epel-release
sudo dnf config-manager --enable crb
sudo dnf config-manager --set-enabled extras
sudo dnf install kernel-devel
sudo dnf install kernel-headers
sudo dnf install dkms
# sudo dnf install redhat-lsb-core *** No match for argument: redhat-lsb-core
sudo dnf install vulkan
sudo dnf install vulkan-tools
sudo dnf install vulkan-headers
sudo dnf install vulkan-loader-devel
# this is also needed
sudo dnf install libglvnd-egl.i686
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/nouveau-blacklist.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/nouveau-blacklist.conf
sudo dracut --force
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo reboot
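After the reboot, it may be worth confirming that the blacklist took effect before running the NVIDIA installer. A minimal sketch, assuming the standard `lsmod` output format where the module name is the first column (`nouveau_loaded` is an illustrative name):

```shell
#!/bin/sh
# Report whether nouveau appears in lsmod-style output on stdin.
nouveau_loaded() {
    if awk '$1 == "nouveau" { found = 1 } END { exit !found }'; then
        echo "nouveau is loaded"
    else
        echo "nouveau is not loaded"
    fi
}

# Check the running system only if lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | nouveau_loaded
fi
```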
Debian prep:
sudo apt install build-essential cmake git
sudo apt source linux
# The command above fails with:
#   E: You must put some 'deb-src' URIs in your sources.list
sudo vi /etc/apt/sources.list
#uncomment deb-src lines
sudo apt update
sudo apt source linux
sudo apt install linux-headers-`uname -r`
sudo dpkg --add-architecture i386
sudo apt update
sudo apt install libc6:i386
sudo apt install pkg-config libglvnd-dev
Download Latest Production Branch Version from https://www.nvidia.com/en-us/drivers/unix/:
cd Downloads/
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/570.144/NVIDIA-Linux-x86_64-570.144.run
chmod +x NVIDIA-Linux-x86_64-570.144.run
# AlmaLinux:
sudo ./NVIDIA-Linux-x86_64-570.144.run --glvnd-egl-config-path=/usr/share/glvnd/egl_vendor.d/
# Debian
# (Make sure latest kernel headers are installed:)
sudo apt install linux-headers-$(uname -r)
sudo ./NVIDIA-Linux-x86_64-570.144.run
Multiple kernel module types are available for this system. Which would you like to use?
NVIDIA Proprietary
The target kernel has CONFIG_MODULE_SIG set, which means that it supports cryptographic signatures on kernel modules. On some systems, the kernel may refuse to load modules without a valid signature from a trusted key. This system also has UEFI Secure Boot enabled; many distributions enforce module signature verification on UEFI systems when Secure Boot is enabled. Would you like to sign the NVIDIA kernel module?
Sign the kernel module
First time on computer:
Would you like to sign the NVIDIA kernel module with an existing key pair, or would you like to generate a new one?
Generate a new key pair
The NVIDIA kernel module was successfully signed with a newly generated key pair. Would you like to delete the private signing key?
No
An X.509 certificate containing the public signing key will be installed to /usr/share/nvidia/nvidia-modsign-crt-C3A74FC1.der. The SHA1 fingerprint of this certificate is: 2E:18:9F:2F:9D:E5:A1:39:A2:14:D5:8C:8C:FE:DE:DC:2A:AF:46:C1. This certificate must be added to a key database which is trusted by your kernel in order for the kernel to be able to verify the module signature.
OK
The private signing key will be installed to /usr/share/nvidia/nvidia-modsign-key-C3A74FC1.key. After the public key is added to a key database which is trusted by your kernel, you may reuse the saved public/private key pair to sign additional kernel modules, without needing to re-enroll the public key. Please take some reasonable precautions to secure the private key: see the README for suggestions.
OK
The signed kernel module failed to load. Secure boot is enabled on this system, so this is likely because the kernel does not trust any key which is capable of verifying the module signature. Would you like to install the signed kernel module anyway? Note that if this module loading failure is due to the lack of a trusted signature, you will not be able to load the installed module until after a key that can verify the module signature is added to a key database that is trusted by the kernel. This will likely require rebooting your computer.
Install signed kernel module
Save the .key and .der files on a separate computer and/or separate partition on Linux computer so they can be re-used on subsequent Linux installations
AFTER installation complete, need the following:
sudo mokutil --import /usr/share/nvidia/nvidia-modsign-crt-C3A74FC1.der
sudo reboot
[michael@aprime ~]$ sudo mokutil --import /usr/share/nvidia/nvidia-modsign-crt-C3A74FC1.der
input password:
wa1mco
sudo reboot
choose option: "Enroll MOK"
save keys
# For AlmaLinux use /run/media/michael/...
sudo cp /usr/share/nvidia/nvidia-modsign* ~/mok
sudo cp /usr/share/nvidia/nvidia-modsign* /media/michael/data1/mok
Later installations on computer:
Sign the kernel module
Would you like to sign the NVIDIA kernel module with an existing key pair, or would you like to generate a new one?
Use an existing key pair
Please provide the path to the private key:
/home/michael/mok/nvidia-modsign-key-C3A74FC1.key
Please provide the path to the public key:
/home/michael/mok/nvidia-modsign-crt-C3A74FC1.der
Install NVIDIA's 32-bit compatibility libraries?
Yes
Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if your kernel changes later.
Yes
The initramfs will likely need to be rebuilt due to the following condition(s): * Nouveau is present in the initramfs. Would you like to rebuild the initramfs?
Rebuild initramfs
Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up.
Yes
Your X configuration file has been successfully updated. Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 570.86.16) is now complete.
OK
Reboot and verify installation:
$ nvidia-smi
Wed Apr 2 18:38:10 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07 Driver Version: 570.133.07 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4070 Off | 00000000:01:00.0 On | N/A |
| 0% 31C P8 7W / 200W | 110MiB / 12282MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 2148 G /usr/libexec/Xorg 36MiB |
| 0 N/A N/A 2223 G /usr/bin/gnome-shell 54MiB |
+-----------------------------------------------------------------------------------------+
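For a more compact check than the full table, `nvidia-smi` can emit selected fields as CSV. The sketch below wraps that in a small formatter; `gpu_summary` is an illustrative name, and it assumes the comma-separated layout produced by `--format=csv,noheader`.

```shell
#!/bin/sh
# Summarize one CSV line of the form "name, driver_version, memory.total".
gpu_summary() {
    awk -F', *' '{ printf "GPU: %s | driver: %s | VRAM: %s\n", $1, $2, $3 }'
}

# Query the live driver only if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version,memory.total \
               --format=csv,noheader | gpu_summary
fi
```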
After driver is successfully installed, start GUI by typing the following after logging in:
startx
Uninstall NVIDIA driver
Sometimes the driver needs to be uninstalled and reinstalled after a kernel update. The driver should always be uninstalled before changing to a new video card.
(Make sure the headers for the running kernel are installed before reinstalling the driver.)
sudo ./NVIDIA-Linux-x86_64-570.144.run --uninstall
sudo apt install linux-headers-$(uname -r)
If you plan to no longer use the NVIDIA driver, you should make sure that no X screens are configured to use the NVIDIA X driver in your X configuration file. If you used nvidia-xconfig to configure X, it may have created a backup of your original configuration. Would you like to run `nvidia-xconfig --restore-original-backup` to attempt restoration of the original X configuration file?
Yes
Uninstallation of existing driver: NVIDIA Accelerated Graphics Driver for Linux-x86_64 (570.86.16) is complete.
OK
Build llama.cpp
Download and install the CUDA Toolkit by visiting https://developer.nvidia.com/cuda-downloads and selecting the desired platform (use the Rocky Linux version).
Copy and run the instructions shown there:
cd ~/Downloads
wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda_12.8.1_570.124.06_linux.run
sudo sh cuda_12.8.1_570.124.06_linux.run
Accept the EULA, then install the following:
┌──────────────────────────────────────────────────────────────────────────────┐
│ CUDA Installer                                                               │
│ - [ ] Driver                                                                 │
│      [ ] 570.124.06                                                          │
│ + [X] CUDA Toolkit 12.8                                                      │
│   [X] CUDA Demo Suite 12.8                                                   │
│   [X] CUDA Documentation 12.8                                                │
│ - [ ] Kernel Objects                                                         │
│      [ ] nvidia-fs                                                           │
│   Options                                                                    │
│   Install                                                                    │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│                                                                              │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │
└──────────────────────────────────────────────────────────────────────────────┘
After installation, the following note appears:
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-12.8/
Please make sure that
- PATH includes /usr/local/cuda-12.8/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-12.8/lib64, or, add /usr/local/cuda-12.8/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.8/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 570.00 is required for CUDA 12.8 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run --silent --driver
Logfile is /var/log/cuda-installer.log
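The warning above gives 570.00 as the minimum driver version for CUDA 12.8. A minimal sketch of a check against the installed driver follows. It assumes GNU `sort -V` is available and that `nvidia-smi --query-gpu=driver_version` reports the version of the first GPU. The helper name `version_ge` is made up for this example.

```shell
# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="570.00"   # minimum quoted in the installer warning

if command -v nvidia-smi >/dev/null 2>&1; then
    # Prints a bare version string such as 570.133.07 (first GPU only).
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$installed" "$required"; then
        echo "driver $installed satisfies the CUDA 12.8 minimum ($required)"
    else
        echo "driver $installed is older than $required; update it before using CUDA 12.8"
    fi
else
    echo "nvidia-smi not found; is the NVIDIA driver installed?"
fi
```

With the 570.133.07 driver shown in the nvidia-smi output earlier, the check should pass.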
llama.cpp Setup
Download llama.cpp from https://github.com/ggerganov/llama.cpp:
mkdir ~/Projects
cd ~/Projects/
git clone https://github.com/ggerganov/llama.cpp
Build (AlmaLinux):
cd ~/Projects/llama.cpp
export PATH=/usr/local/cuda-12.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:$LD_LIBRARY_PATH
sudo dnf update
sudo dnf install cmake
sudo dnf install libcurl-devel
sudo dnf install gcc-toolset-13-gcc.x86_64 gcc-toolset-13-gcc-gfortran gcc-toolset-13-gcc-c++
# enable software collection needed for llama.cpp
source scl_source enable gcc-toolset-13
cmake -B build -DGGML_CUDA=ON -DGGML_CCACHE=OFF
cmake --build build --config Release
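# note: exit closes the current shell, which also discards the gcc-toolset-13 environment enabled above
exit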
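The two export lines in the build steps prepend the CUDA directories every time they run, and the LD_LIBRARY_PATH line leaves a trailing colon when the variable starts out empty. A possible alternative is a small helper that adds a directory only once. The function name `prepend_once` is hypothetical, and this is a sketch, not something the CUDA installer provides.

```shell
# prepend_once VAR DIR: put DIR at the front of the colon-separated
# variable VAR unless DIR is already listed in it.
prepend_once() {
    local var="$1" dir="$2" current
    eval "current=\${$var:-}"          # read the variable named by $var
    case ":$current:" in
        *":$dir:"*) ;;                 # already present, nothing to do
        *) export "$var=$dir${current:+:$current}" ;;
    esac
}

prepend_once PATH /usr/local/cuda-12.8/bin
prepend_once LD_LIBRARY_PATH /usr/local/cuda-12.8/lib64
```

Placed in ~/.bashrc, this would make the CUDA tools available in every new shell without growing PATH on each login.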
Mount data partition and run llama.cpp (AlmaLinux):
sudo mkdir -p /run/media/michael/data1
sudo mount /dev/nvme1n1p2 /run/media/michael/data1
~/Projects/llama.cpp/build/bin/llama-cli -m /run/media/michael/data1/models/Ministral-8B-Instruct-2410.bf16.gguf \
-t 16 --n-gpu-layers 20
Build (Debian):
cd ~/Projects/llama.cpp
export PATH=/usr/local/cuda-12.8/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:$LD_LIBRARY_PATH
sudo apt update
sudo apt upgrade
sudo apt install cmake
sudo apt install curl libcurl4-openssl-dev
cmake -B build -DGGML_CUDA=ON -DGGML_CCACHE=OFF
cmake --build build --config Release
Mount data partition and run llama.cpp (Debian):
sudo mkdir -p /media/michael/data1
sudo mount /dev/nvme1n1p2 /media/michael/data1
~/Projects/llama.cpp/build/bin/llama-cli -m /media/michael/data1/models/Ministral-8B-Instruct-2410.bf16.gguf \
-t 16 --n-gpu-layers 20
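The AlmaLinux and Debian run commands differ only in where the data partition is mounted. A small sketch that picks the mount point from the ID field of /etc/os-release, so one command line works on both systems. The function name `data_root` is invented here, and the two paths are the ones used in these notes.

```shell
# data_root [OS_RELEASE_FILE]: print the data mount point used in these
# notes for the running distribution (defaults to /etc/os-release).
data_root() {
    local file="${1:-/etc/os-release}" id
    # Source the file in a subshell so its variables do not leak out.
    id=$( . "$file" 2>/dev/null && printf '%s' "$ID" )
    case "$id" in
        almalinux) echo /run/media/michael/data1 ;;
        debian)    echo /media/michael/data1 ;;
        *)         echo "unrecognized distribution ID: '$id'" >&2; return 1 ;;
    esac
}

# Example use (commented out because it needs the model file and GPU):
# model="$(data_root)/models/Ministral-8B-Instruct-2410.bf16.gguf"
# ~/Projects/llama.cpp/build/bin/llama-cli -m "$model" -t 16 --n-gpu-layers 20
```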
If necessary, convert Hugging Face models to .gguf files (Debian):
First time through, set up a Python virtual environment:
# https://askubuntu.com/questions/320996/how-to-make-python-program-command-execute-python-3
sudo apt install python-is-python3
sudo apt install python3-venv
python -m venv mvirte
mvirte/bin/pip install transformers torch
mvirte/bin/pip install sentencepiece
# Main argument is path to folder containing files of the form model-00001-of-00007.safetensors
mvirte/bin/python ~/Projects/llama.cpp/convert_hf_to_gguf.py /media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16 \
--outfile /media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16.gguf
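Before starting a long conversion it can help to confirm that every shard was downloaded. The sketch below assumes the shards follow the model-00001-of-00007.safetensors pattern mentioned above, with five-digit counters. `check_shards` is a made-up helper, not part of llama.cpp.

```shell
# check_shards DIR: verify that model-00001-of-NNNNN.safetensors through
# model-NNNNN-of-NNNNN.safetensors all exist in DIR.
check_shards() {
    local dir="$1" first total_str total i name missing=0

    # Use the first shard's file name to learn the total shard count.
    set -- "$dir"/model-00001-of-*.safetensors
    first=$1
    if [ ! -f "$first" ]; then
        echo "no model-00001-of-*.safetensors in $dir"
        return 2
    fi
    total_str=${first##*-of-}                               # 00007.safetensors
    total_str=${total_str%.safetensors}                     # 00007
    total=$(printf '%s\n' "$total_str" | sed 's/^0*//')     # 7

    i=1
    while [ "$i" -le "$total" ]; do
        name=$(printf 'model-%05d-of-%s.safetensors' "$i" "$total_str")
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $name"
            missing=$((missing + 1))
        fi
        i=$((i + 1))
    done

    if [ "$missing" -eq 0 ]; then
        echo "all $total shards present"
    else
        return 1
    fi
}

# Example: check_shards /media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16
```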
If necessary, convert Hugging Face models to .gguf files (AlmaLinux):
First time through, set up a Python virtual environment:
# https://stackoverflow.com/questions/75608323/how-do-i-solve-error-externally-managed-environment-every-time-i-use-pip-3#75722775
python -m venv mvirte
mvirte/bin/pip install transformers torch
mvirte/bin/pip install sentencepiece
# Main argument is path to folder containing files of the form model-00001-of-00007.safetensors
mvirte/bin/python ~/Projects/llama.cpp/convert_hf_to_gguf.py /run/media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16 \
--outfile /run/media/michael/data1/models/CodeLlama-34b-Instruct-hf-BF16.gguf
If necessary, merge segmented .gguf files (Debian):
# (for AlmaLinux, use /run/media/michael/...)
~/Projects/llama.cpp/build/bin/llama-gguf-split --merge \
/media/michael/data1/models/Llama-4-Scout-17B-16E-Instruct-GGUF/Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00001-of-00002.gguf \
/media/michael/data1/models/Llama-4-Scout-17B-16E-Instruct-GGUF/Llama-4-Scout-17B-16E-Instruct-Q4_K_M.gguf
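In the merge command above, the output name is the first segment's name without its -00001-of-00002 counter. A short sketch that derives it automatically, assuming five-digit counters as in that example. `merged_name` is a hypothetical helper.

```shell
# merged_name FIRST_SEGMENT: print the path of the merged file by dropping
# the -NNNNN-of-NNNNN counter that precedes the .gguf extension.
merged_name() {
    printf '%s\n' "$1" | sed -E 's/-[0-9]{5}-of-[0-9]{5}\.gguf$/.gguf/'
}

# Example use (commented out because it needs the segment files):
# first=/media/michael/data1/models/Llama-4-Scout-17B-16E-Instruct-GGUF/Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00001-of-00002.gguf
# ~/Projects/llama.cpp/build/bin/llama-gguf-split --merge "$first" "$(merged_name "$first")"
```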
LLM timing tests
Hints from ChatGPT on suggested parameters to use for consistent timing tests:
To ensure repeatable testing with llama-cli, you can use the following command format:
~/Projects/llama.cpp/build/bin/llama-cli -m models/llama-2-13b.Q4_K_M.gguf \
--prompt "Benchmark test." \
--n-predict 128 \
--threads 16 \
--batch-size 512 \
--log-disable \
--repeat-penalty 1.0 \
--temp 0.7
Explanation of Key Options:
-m models/llama-2-13b.Q4_K_M.gguf → Selects the model file.
--prompt "Benchmark test." → Keeps input consistent.
--n-predict 128 → Ensures the same number of tokens is generated.
--threads 16 → Uses 16 threads for CPU processing (adjustable).
--batch-size 512 → Sets the prompt-processing batch size; helps performance but can be tuned if needed.
--log-disable → Prevents excess log output for clean results.
--repeat-penalty 1.0 & --temp 0.7 → Keeps the sampling settings fixed between runs (add a fixed --seed if the generated text itself must repeat).
GPU-Specific:
To disable GPU completely, add --n-gpu-layers 0.
To test with GPU assist, set --n-gpu-layers X (e.g., 30 to offload 30 layers and keep the rest on the CPU).
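A rough way to pick a starting value for --n-gpu-layers is to divide a VRAM budget by the average size of one layer. The sketch below assumes all layers are the same size and ignores the KV cache and other buffers, so the budget should be set well below the card's 12282MiB. The function name and the numbers in the example are hypothetical, not measurements.

```shell
# estimate_gpu_layers MODEL_MIB N_LAYERS VRAM_BUDGET_MIB
# Prints how many equally sized layers fit in the budget, capped at N_LAYERS.
estimate_gpu_layers() {
    local model_mib="$1" n_layers="$2" budget_mib="$3" layers
    layers=$(( budget_mib * n_layers / model_mib ))   # integer division rounds down
    if [ "$layers" -gt "$n_layers" ]; then
        layers=$n_layers
    fi
    echo "$layers"
}

# Hypothetical example: a 40000 MiB model with 80 layers and a 10000 MiB budget
estimate_gpu_layers 40000 80 10000   # prints 20
```

The result is only a first guess. If llama-cli runs out of GPU memory, lower the value and try again.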
Test results in tokens per second (tps), showing the effect of 2 DIMMs (96GB) vs. 4 DIMMs (192GB) and of CPU-only vs. CPU + GPU:
| CPU only (tps) | 192GB RAM | 96GB RAM | 2-DIMM Speedup |
|---|---|---|---|
| llama-2-13b.Q4_K_M.gguf | 7.04 | 9.84 | 40% |
| llama-2-13b.Q8_0.gguf | 4.03 | 5.71 | 42% |
| llama-2-70b.Q4_K_M.gguf | 1.35 | 1.92 | 42% |
| Llama-2-70B-fp16_Q8_0.gguf | 0.79 | 1.12 | 42% |

| GPU Assist (tps) | 192GB RAM | 96GB RAM | 2-DIMM Speedup | GPU layers |
|---|---|---|---|---|
| llama-2-13b.Q4_K_M.gguf | 50.95 | 51.03 | 0% | 100 |
| llama-2-13b.Q8_0.gguf | 9.52 | 12.44 | 31% | 27 |
| llama-2-70b.Q4_K_M.gguf | 1.73 | 2.39 | 38% | 21 |
| Llama-2-70B-fp16_Q8_0.gguf | 0.90 | 1.27 | 41% | 12 |

| GPU Speedup | 192GB RAM | 96GB RAM |
|---|---|---|
| llama-2-13b.Q4_K_M.gguf | 7.24x | 5.19x |
| llama-2-13b.Q8_0.gguf | 2.36x | 2.18x |
| llama-2-70b.Q4_K_M.gguf | 1.28x | 1.24x |
| Llama-2-70B-fp16_Q8_0.gguf | 1.14x | 1.13x |
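The speedup columns follow directly from the tps figures: the 2-DIMM speedup is the 96GB rate divided by the 192GB rate, minus one, and the GPU speedup is the GPU-assist rate divided by the CPU-only rate. A small awk-based sketch of that arithmetic, with helper names invented for this example:

```shell
# pct_gain SLOW FAST: percentage by which FAST exceeds SLOW.
pct_gain() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0f%%\n", (fast / slow - 1) * 100 }'
}

# ratio BASE NEW: NEW divided by BASE, shown as a multiplier.
ratio() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.2fx\n", new / base }'
}

pct_gain 7.04 9.84   # 13B Q4_K_M, CPU only, 192GB vs. 96GB: prints 40%
ratio 7.04 50.95     # 13B Q4_K_M, 192GB, CPU only vs. GPU assist: prints 7.24x
```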