Proxmox/Ollama: llm_benchmark

In passing

I found a tool for benchmarking LLMs: llm_benchmark (installed via pip).

I am in last place on https://llm.aidatatools.com/results-linux.php, with "llama3.1:8b" at "1.12".

 llm_benchmark run
-------Linux----------
{'id': '0', 'name': 'Quadro 4000', 'driver': '390.157', 'gpu_memory_total': '1985.0 MB',
'gpu_memory_free': '1984.0 MB', 'gpu_memory_used': '1.0 MB', 'gpu_load': '0.0%', 
'gpu_temperature': '60.0°C'}
Only one GPU card
Total memory size : 61.36 GB
cpu_info: Intel(R) Xeon(R) CPU E5-2450 v2 @ 2.50GHz
gpu_info: Quadro 4000
os_version: Ubuntu 22.04.5 LTS
ollama_version: 0.5.7
----------
LLM models file path:/usr/local/lib/python3.10/dist-packages/llm_benchmark/data/benchmark_models_16gb_ram.yml
Checking and pulling the following LLM models
phi4:14b
qwen2:7b
gemma2:9b
mistral:7b
llama3.1:8b
llava:7b
llava:13b
----------
....
----------------------------------------
Sending the following data to a remote server
-------Linux----------
{'id': '0', 'name': 'Quadro 4000', 'driver': '390.157', 'gpu_memory_total': '1985.0 MB',
 'gpu_memory_free': '1984.0 MB', 'gpu_memory_used': '1.0 MB', 'gpu_load': '0.0%', 
'gpu_temperature': '61.0°C'}
Only one GPU card
-------Linux----------
{'id': '0', 'name': 'Quadro 4000', 'driver': '390.157', 'gpu_memory_total': '1985.0 MB',
 'gpu_memory_free': '1984.0 MB', 'gpu_memory_used': '1.0 MB', 'gpu_load': '0.0%',
 'gpu_temperature': '61.0°C'}
Only one GPU card
{
    "mistral:7b": "1.40",
    "llama3.1:8b": "1.12",
    "phi4:14b": "0.76",
    "qwen2:7b": "1.31",
    "gemma2:9b": "1.03",
    "llava:7b": "1.84",
    "llava:13b": "0.73",
    "uuid": "",
    "ollama_version": "0.5.7"
}
----------
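
The scores are easier to compare once sorted. A minimal sketch, assuming the result block is fed on standard input; the function name `sort_benchmark_results` is my own and is not part of llm_benchmark:

```shell
# Hypothetical helper, not part of llm_benchmark: reads the result block on
# stdin and prints "score model" lines, highest score first.
sort_benchmark_results() {
    # Keep only entries whose value is a plain number such as "1.40";
    # this skips "uuid" (empty) and "ollama_version" (0.5.7).
    grep -E '^[[:space:]]*"[^"]+":[[:space:]]*"[0-9]+(\.[0-9]+)?"' |
        sed -E 's/^[[:space:]]*"([^"]+)":[[:space:]]*"([0-9.]+)".*/\2 \1/' |
        LC_ALL=C sort -rn   # C locale so "." is read as the decimal separator
}

sort_benchmark_results <<'EOF'
{
    "mistral:7b": "1.40",
    "llama3.1:8b": "1.12",
    "phi4:14b": "0.76",
    "qwen2:7b": "1.31",
    "gemma2:9b": "1.03",
    "llava:7b": "1.84",
    "llava:13b": "0.73",
    "uuid": "",
    "ollama_version": "0.5.7"
}
EOF
```

On this machine llava:7b comes out on top and llava:13b last. The `LC_ALL=C` matters on a system with a French locale, where `sort -n` may otherwise expect a decimal comma.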

Proxmox: Installing Ollama as an LXC container

A quick test of installing Ollama as an LXC container via a script:

bash -c "$(wget -qLO - https://github.com/tteck/Proxmox/raw/main/ct/ollama.sh)"

We'll see how it goes... at the moment my NVIDIA card (or the BIOS) does not support Proxmox passthrough.

root@balkany:~# dmesg | grep -e DMAR -e IOMMU | grep "enable"
[    0.333769] DMAR: IOMMU enabled

root@balkany:~# dmesg | grep 'remapping'
[    0.821036] DMAR-IR: Enabled IRQ remapping in xapic mode
[    0.821038] x2apic: IRQ remapping doesn't support X2APIC mode

# lspci -nn | grep 'NVIDIA'
0a:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF100GL [Quadro 4000] [10de:06dd] (rev a3)
0a:00.1 Audio device [0403]: NVIDIA Corporation GF100 High Definition Audio Controller [10de:0be5] (rev a1)

# cat /etc/default/grub | grep "GRUB_CMDLINE_LINUX_DEFAULT"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt video=vesafb:off video=efifb:off initcall_blacklist=sysfb_init"
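
Edits to /etc/default/grub only reach the kernel after the boot configuration has been regenerated (`update-grub` on a GRUB-booted host) and the host has rebooted. A small sketch for checking what the running kernel actually received; `has_kernel_param` is a hypothetical helper name, and the second argument exists only so the function can be tried on a file other than /proc/cmdline:

```shell
# Hypothetical helper: succeeds if the given parameter appears as a whole
# word in a kernel command line (file path as optional second argument).
has_kernel_param() {
    param="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # One parameter per line, then an exact, fixed-string line match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

for p in intel_iommu=on iommu=pt; do
    if has_kernel_param "$p"; then
        echo "$p: active"
    else
        echo "$p: missing from the running kernel command line"
    fi
done
```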

# efibootmgr -v
EFI variables are not supported on this system.

# cat /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

# cat /etc/modprobe.d/pve-blacklist.conf | grep nvidia
blacklist nvidiafb
blacklist nvidia

So I added this:

# cat  /etc/modprobe.d/iommu_unsafe_interrupts.conf
options vfio_iommu_type1 allow_unsafe_interrupts=1

I do have a single IOMMU group for the NVIDIA card.
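
The group listing itself is not reproduced here. One common way to obtain it is to walk /sys/kernel/iommu_groups; the sketch below does that, with `list_iommu_groups` being a hypothetical helper name and the optional directory argument added only so the function can be tried on a test tree:

```shell
# Hypothetical helper: print one "group N: device" line per PCI device.
# The optional argument is the sysfs directory to scan.
list_iommu_groups() (
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # nothing found: IOMMU probably off
        addr="${dev##*/}"                  # PCI address, e.g. 0000:0a:00.0
        group="${dev%/devices/*}"
        group="${group##*/}"               # group number
        if [ "$root" = /sys/kernel/iommu_groups ] && command -v lspci >/dev/null 2>&1; then
            # On the live system, add the lspci description of the device.
            printf 'group %s: %s\n' "$group" "$(lspci -nns "$addr")"
        else
            printf 'group %s: %s\n' "$group" "$addr"
        fi
    done
)

list_iommu_groups
```

For passthrough, the two functions of the card (0a:00.0 and 0a:00.1) should appear in the same group with no unrelated device next to them; piping the output through `grep -i nvidia` narrows it down.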

When I run the script, it ends with an error:

 
  ____  ____
  / __ \/ / /___ _____ ___  ____ _
 / / / / / / __ `/ __ `__ \/ __ `/
/ /_/ / / / /_/ / / / / / / /_/ /
\____/_/_/\__,_/_/ /_/ /_/\__,_/

Using Default Settings
Using Distribution: ubuntu
Using ubuntu Version: 22.04
Using Container Type: 1
Using Root Password: Automatic Login
Using Container ID: 114
Using Hostname: ollama
Using Disk Size: 24GB
Allocated Cores 4
Allocated Ram 4096
Using Bridge: vmbr0
Using Static IP Address: dhcp
Using Gateway IP Address: Default
Using Apt-Cacher IP Address: Default
Disable IPv6: No
Using Interface MTU Size: Default
Using DNS Search Domain: Host
Using DNS Server Address: Host
Using MAC Address: Default
Using VLAN Tag: Default
Enable Root SSH Access: No
Enable Verbose Mode: No
Creating a Ollama LXC using the above default settings
 ✓ Using datastore2 for Template Storage.
 ✓ Using datastore2 for Container Storage.
 ✓ Updated LXC Template List
 ✓ LXC Container 114 was successfully created.
 ✓ Started LXC Container
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
 //bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
 ✓ Set up Container OS
 ✓ Network Connected: 192.168.1.45 
 ✓ IPv4 Internet Connected
 ✗ IPv6 Internet Not Connected
 ✓ DNS Resolved github.com to 140.82.121.3
 ✓ Updated Container OS
 ✓ Installed Dependencies
 ✓ Installed Golang
 ✓ Set up Intel® Repositories
 ✓ Set Up Hardware Acceleration
 ✓ Installed Intel® oneAPI Base Toolkit
 / Installing Ollama (Patience)   
[ERROR] in line 23: exit code 0: while executing command "$@" > /dev/null 2>&1
The silent function has suppressed the error, run the script with verbose mode enabled, which will provide more detailed output.
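
The message says the script's silent wrapper discarded the command's output, which is why the real cause is not visible. As an illustration of the alternative (this is not the helper script's code; `run_logged`, `RUN_LOG` and the log path are hypothetical), a wrapper can keep the output in a file and show it only on failure:

```shell
# Hypothetical wrapper, not taken from the Proxmox helper script.
# Runs a command quietly, but keeps its output and prints it on failure.
run_logged() {
    log="${RUN_LOG:-/tmp/run_logged.log}"
    if "$@" >"$log" 2>&1; then
        return 0
    else
        status=$?                          # exit status of the failed command
        echo "[ERROR] exit code $status while executing: $*" >&2
        cat "$log" >&2                     # show what the command printed
        return "$status"
    fi
}

# Demonstration with a command that fails on purpose.
run_logged sh -c 'echo "installing..."; echo "disk full" >&2; exit 3' ||
    echo "command failed with status $?"
```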

Good grief.

Proxmox: Resize disk on Ubuntu 22

In passing

Went from 98 GB to 392 GB without any problem.

The first step is done through the Proxmox web UI; after that, the following commands have to be run on Ubuntu.

fdisk -l

Disk /dev/sda: 400 GiB, 429496729600 bytes, 838860800 sectors
Disk model: QEMU HARDDISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 0588156E-1871-4A3D-900F-4C8C2758E02E

Device       Start       End   Sectors  Size Type
/dev/sda1     2048      4095      2048    1M BIOS boot
/dev/sda2     4096   4198399   4194304    2G Linux filesystem
/dev/sda3  4198400 838860766 834662367  398G Linux filesystem

Disk /dev/mapper/ubuntu--vg-ubuntu--lv: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

df -h

Filesystem                         Size  Used Avail Use% Mounted on
tmpfs                              6,2G  1,2M  6,2G   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   98G   89G  4,2G  96% /
tmpfs                               31G  4,0K   31G   1% /dev/shm
tmpfs                              5,0M     0  5,0M   0% /run/lock

sudo pvdisplay

  --- Physical volume ---
  PV Name               /dev/sda3
  VG Name               ubuntu-vg
  PV Size               <398,00 GiB / not usable 16,50 KiB
  Allocatable           yes
  PE Size               4,00 MiB
  Total PE              101887
  Free PE               76287
  Allocated PE          25600
  PV UUID               kJRjOE-1iPT-CVJQ-7QyB-c8I2-ndQQ-Uzi9VE

pvresize /dev/sda3

Physical volume "/dev/sda3" changed
1 physical volume(s) resized or updated / 0 physical volume(s) not resized

sudo lvextend -l +100%FREE /dev/ubuntu-vg/ubuntu-lv

Size of logical volume ubuntu-vg/ubuntu-lv changed from 100,00 GiB (25600 extents) to <398,00 GiB (101887 extents).
Logical volume ubuntu-vg/ubuntu-lv successfully resized.

sudo lvdisplay

  --- Logical volume ---
  LV Path                /dev/ubuntu-vg/ubuntu-lv
  LV Name                ubuntu-lv
  VG Name                ubuntu-vg
  LV UUID                l8Obv4-PXVy-VEsm-db9B-5yZ8-Ybmi-010JO9
  LV Write Access        read/write
  LV Creation host, time ubuntu-server, 2024-02-08 09:41:27 +0000
  LV Status              available
  # open                 1
  LV Size                <398,00 GiB
  Current LE             101887
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

sudo resize2fs /dev/ubuntu-vg/ubuntu-lv

resize2fs 1.46.5 (30-Dec-2021)
Filesystem at /dev/ubuntu-vg/ubuntu-lv is mounted on /; on-line resizing required
old_desc_blocks = 13, new_desc_blocks = 50
The filesystem on /dev/ubuntu-vg/ubuntu-lv is now 104332288 (4k) blocks long.
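
The block count reported by resize2fs can be cross-checked against the extent count from lvdisplay, using only figures from the outputs in this post (4 KiB filesystem blocks, 4 MiB physical extents):

```shell
# Figures taken from the command outputs in this post.
fs_blocks=104332288      # resize2fs: filesystem length in 4 KiB blocks
extents=101887           # lvdisplay: Current LE
pe_mib=4                 # pvdisplay: PE Size in MiB

fs_mib=$(( fs_blocks * 4 / 1024 ))   # 4 KiB blocks -> MiB
lv_mib=$(( extents * pe_mib ))       # extents -> MiB

echo "filesystem:     ${fs_mib} MiB"
echo "logical volume: ${lv_mib} MiB"
```

Both come to 407548 MiB, just under 398 GiB, so the filesystem fills the whole logical volume. The 392G that df reports afterwards is smaller, presumably because of the filesystem's own metadata overhead.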

df -h

Filesystem                         Size  Used Avail Use% Mounted on
tmpfs                              6,2G  1,2M  6,2G   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv  392G   89G  286G  24% /
tmpfs                               31G  4,0K   31G   1% /dev/shm
tmpfs                              5,0M     0  5,0M   0% /run/lock
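
The three commands that actually change something (pvresize, lvextend, resize2fs) can be collected into one small function. This is only a sketch: `grow_root_lv` and the `DRY_RUN` variable are names I made up, it assumes device paths without spaces, and by default it only prints the commands instead of running them:

```shell
# Sketch of the resize sequence from this post. DRY_RUN=1 (the default here)
# only prints the commands; set DRY_RUN=0 and run as root to execute them.
grow_root_lv() {
    pv="$1"    # physical volume, e.g. /dev/sda3
    lv="$2"    # logical volume, e.g. /dev/ubuntu-vg/ubuntu-lv
    for cmd in "pvresize $pv" \
               "lvextend -l +100%FREE $lv" \
               "resize2fs $lv"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$cmd"
        else
            $cmd || return 1   # stop at the first failing step
        fi
    done
}

grow_root_lv /dev/sda3 /dev/ubuntu-vg/ubuntu-lv
```

lvextend also has a `-r` (`--resizefs`) option that resizes the filesystem in the same step, which would make the separate resize2fs call unnecessary.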

Proxmox/NVIDIA: Quadro 4000 in PCI passthrough mode... it works!

In passing

Configuration:

  • Proxmox: 8.3.2
  • Proxmox kernel: 6.8.12-5-pve
  • VM: Ubuntu 22.04.5 LTS
  • VM kernel: 5.15.0-130-generic

Installation:

# sudo add-apt-repository ppa:graphics-drivers/ppa --yes
# sudo apt update
# sudo apt install nvidia-cuda-toolkit
# sudo apt install nvidia-driver-390
...
reboot
...
# nvidia-smi       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.157                Driver Version: 390.157                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro 4000         Off  | 00000000:00:10.0 Off |                  N/A |
| 36%   61C   P12    N/A /  N/A |      1MiB /  1985MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The 4xx driver does not work with the "Quadro 4000".

Finally, put the packages on hold to block updates:

# dpkg-query -W --showformat='${Package} ${Status}\n' | grep -v deinstall | awk '{ print $1 }' | \
    grep -E 'nvidia.*-[0-9]+$' | \
    xargs -r -L 1 sudo apt-mark hold
libnvidia-cfg1-390 set on hold.
libnvidia-common-390 set on hold.
libnvidia-compute-390 set on hold.
libnvidia-decode-390 set on hold.
libnvidia-encode-390 set on hold.
libnvidia-extra-470 set on hold.
libnvidia-fbc1-390 set on hold.
libnvidia-gl-390 set on hold.
libnvidia-ifr1-390 set on hold.
nvidia-compute-utils-390 set on hold.
nvidia-dkms-390 set on hold.
nvidia-driver-390 set on hold.
nvidia-kernel-common-390 set on hold.
nvidia-kernel-source-390 set on hold.
nvidia-utils-390 set on hold.
xserver-xorg-video-nvidia-390 set on hold.
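
The `grep -E 'nvidia.*-[0-9]+$'` stage of that pipeline only keeps packages whose name contains "nvidia" and ends in a version number, so unversioned packages such as nvidia-cuda-toolkit are not put on hold. A small sketch that isolates the filter so it can be tried on a list of names; the function name `versioned_nvidia_packages` and the sample list are illustrative:

```shell
# The same filter as in the hold pipeline, isolated so it can be tried on
# a list of package names (one per line on stdin).
versioned_nvidia_packages() {
    grep -E 'nvidia.*-[0-9]+$'
}

# Illustrative sample: only the three names ending in -390 are printed.
printf '%s\n' nvidia-driver-390 libnvidia-gl-390 nvidia-cuda-toolkit \
    nvidia-settings xserver-xorg-video-nvidia-390 | versioned_nvidia_packages

# To review or undo the holds later (standard apt-mark subcommands):
#   apt-mark showhold
#   sudo apt-mark unhold nvidia-driver-390
```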