## Mail to Bjoern

- boltz: sample data for testing the setup; boltz apparently only needs CUDA installed
- ollama: folder on Qumulo for Docker data
- [x] nfs docker mount timing
- [x] explicit gpu
- [x] searxng openwebui
- [x] collect all URLs

## Steps

1. [x] mounting and cabling
2. [x] check BIOS settings
3. [x] set up storage (?? or do they have hardware RAID 1)
4. [x] set up iLO
5. [x] OS installation via USB stick - prepare beforehand
6. [x] ansible base install (sec, packages, docker)
7. [x] ansible compose
8. [x] ansible nfs - mount qumulo share(s)
9. [x] manual 25 Gbit/s config -> use saved netplan file
10. [x] manual NVIDIA install using the manual below (NVIDIA driver, CUDA driver and container toolkit)
11. [x] install beszel agent
12. [x] spin up containers and test them
13. [x] install [boltz](https://github.com/jwohlwend/boltz) and test it

## TODO

- [ ] (optional) remove snap
- [=] beszel reverse proxying via firewall; Sophos does not seem to be made for this
- [=] install beszel agent on all devices
- [ ] extend network diagram
- [x] write ansible playbook?
- [x] test ansible construct
- [x] prepare boot stick

## base

- Hostname: neo-srv-ai-01
- IP Address: 192.168.60.203
- Floating IP: 192.168.60.213
- iLO IP: 192.168.50.213

## ansible-roles

- [x] geerlingguy.security
- [x] geerlingguy.docker
- [x] nfs-client (mount qumulo shares)
- [ ] users (separate)
- [x] nvidia (driver) -> do manually
- [x] interfaces (25 Gbit/s NICs) -> do manually

## Manual NVIDIA driver, CUDA driver and container toolkit

### NVIDIA driver

Check if GPUs are recognized by the base OS:

```bash
sudo lspci | grep -i nvidia
```

This should print some output if it finds NVIDIA devices.


Search for required drivers for your GPUs:

```bash
sudo ubuntu-drivers devices
```

Automatically install all drivers:

```bash
sudo ubuntu-drivers autoinstall
```

Reboot the system for changes to take effect:

```bash
sudo reboot
```

Show GPU stats with:

```bash
nvidia-smi
```

### CUDA driver

**Disable Secure Boot in BIOS**

Install CUDA drivers:

```bash
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt -y install cuda-toolkit-12-8
sudo apt install -y cuda-drivers
```

### Container toolkit

Install the NVIDIA Container Toolkit:

```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install -y nvidia-container-toolkit
```

Test a simple CUDA container and run `nvidia-smi` inside it:

```bash
docker run --rm --gpus all nvidia/cuda:13.0.0-base-ubuntu24.04 nvidia-smi
```
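
As a quick sanity check after the three install steps above, a small script along these lines can report which of the involved tools are on the `PATH`. This is only a sketch and not part of the original setup: the helper names `have` and `report` are hypothetical, and `nvcc` may show up as missing even after a successful install, because the CUDA toolkit typically lives under `/usr/local/cuda/bin`, which is usually not on the default `PATH`.

```bash
#!/usr/bin/env bash
# Sketch only: `have` and `report` are hypothetical helper names.

# Return success if the given command is found on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Print "ok: <tool>" or "missing: <tool>" for one command name.
report() {
  if have "$1"; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

# Tools touched by the steps above; nvidia-ctk is installed
# alongside nvidia-container-toolkit.
for tool in lspci nvidia-smi nvcc docker nvidia-ctk; do
  report "$tool"
done

# If the driver tools are present, list each GPU with its driver version.
if have nvidia-smi; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
    || echo "nvidia-smi is installed but could not talk to the driver"
fi
```

A `missing:` line only means the command was not found on the `PATH`. It does not prove that the package is absent.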