NAS Lab - Nils Malmberg



Hardware

The Ugreen DXP6800 Pro is a NAS from the NASync range, built around an Intel i5-1235U (10 cores). I fitted it with 64 GB of DDR4 RAM and a low-profile Yeston Nvidia RTX 3050 GPU (6 GB of VRAM) in the internal PCIe x4 slot. I run the operating system provided by UGREEN, UGOS, which is based on Debian 12 (Bookworm).

This NAS is my main experimentation platform: Docker containers, local LLMs, and service hosting. The tutorials below document some of my setups.

📋 Available Tutorials

Remote Access to the NAS
Deploy Ollama + OpenWebUI on the NAS (complete Docker stack)
Fix UGOS packages to access the Nvidia GPU in Docker
Deploy OpenCode on the NAS (under construction)

Tutorial 1: Remote access to the NAS

I did not want to use the UGREENlink service for remote access, because it creates a dependency on a third-party service and routes data through UGREEN's servers. That can slow transfers down and raises data-protection concerns. It also means exposing the NAS to the Internet.
I therefore chose a more traditional setup: the NAS is reachable only on my local network. To reach it from another network, I use a WireGuard VPN server hosted directly on my Freebox. Hosting the VPN server on the box rather than on the NAS helps protect the NAS if the VPN is compromised, and gives me access to every other device on my local network.

1 Prerequisite: request a static IP address

By default, the public IP address of an Internet box is assigned dynamically, so it changes periodically. There are two options:

Request a static IP address in your box's configuration panel.
Use a dynamic DNS (DDNS) service to map a domain name to the box's current public IP address.

Option 1 is generally the simplest to set up.
In my case, I logged into my Free subscriber area and, under "My Freebox", clicked "Request a static IPv4 full-stack address" (on the Free website).

Assigning a static IP address can take some time.
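
While waiting for the static address, it can help to see how often the dynamic one actually changes. The sketch below is a minimal illustration, not part of the original setup: `record_ip_change` is a hypothetical helper that compares the current public address with the one stored in a state file. The lookup service in the usage comment (api.ipify.org) is an assumption; any "what is my IP" service would do.

```shell
#!/bin/sh
# record_ip_change <current_ip> <state_file>
# Prints "changed" or "unchanged" and stores the new address in the state file.
# The function body runs in a subshell, so its variables stay local.
record_ip_change() (
    current_ip=$1
    state_file=$2
    previous_ip=""
    [ -f "$state_file" ] && previous_ip=$(cat "$state_file")
    if [ "$current_ip" = "$previous_ip" ]; then
        echo "unchanged"
    else
        printf '%s\n' "$current_ip" > "$state_file"
        echo "changed"
    fi
)

# Example wiring (assumption: api.ipify.org returns the caller's public IP):
#   record_ip_change "$(curl -fsS https://api.ipify.org)" "$HOME/.last_public_ip"
```

Run from a scheduled job, a "changed" result is the moment a DDNS client would have to update its record, which is exactly what a static IP avoids.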

2 Activate the VPN server on your internet box

To host the VPN server on my box, I logged into mafreebox.freebox.fr, clicked "Freebox Settings", then "VPN Server", then "WireGuard", and enabled the service. I then created a new user and downloaded the configuration file. A QR code is also available to set up the VPN quickly in the WireGuard mobile app.

3 Access the NAS

Now just enable the VPN on your device, then access the NAS via its local IP address or through the UGREEN NAS application.

Tutorial 2: Deploy Ollama + OpenWebUI on the NAS (complete Docker stack)

Ollama lets you run open-source LLMs locally (without an API key). There is a desktop application for Windows and macOS, but it can also be run from the terminal. OpenWebUI provides a ChatGPT-style interface for interacting with the models available on Ollama.
Here, the choice of model is limited by what the NAS can run. Despite the 64 GB of RAM, performance is severely limited on CPU. A CPU is very good at executing complex tasks sequentially, whereas LLMs and similar large models rely essentially on matrix multiplications: billions of operations are needed for each generated token. Because these calculations are parallelizable, running the models on a GPU yields a significant performance gain.
Count on roughly 5 GB of VRAM to run a 7-billion-parameter model on GPU. With the current configuration (no GPU), you will have to settle for lighter models.
Another tutorial covers installing a graphics card to speed up these models.
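
The figure of about 5 GB for a 7-billion-parameter model can be reproduced with a back-of-the-envelope estimate: the weights take roughly (parameters × bits per weight ÷ 8) bytes, plus some headroom for the KV cache and runtime buffers. The sketch below assumes 4-bit quantized weights and a flat 1.5 GB of overhead. Both values are assumptions chosen for illustration, not numbers taken from Ollama, and `estimate_vram_gb` is a hypothetical helper.

```shell
#!/bin/sh
# estimate_vram_gb <params_in_billions> [bits_per_weight] [overhead_gb]
# Rough VRAM estimate for a quantized LLM.
# Assumptions: weights dominate memory use, and a flat overhead covers the
# KV cache and runtime buffers.
estimate_vram_gb() (
    params_b=$1
    bits=${2:-4}           # 4-bit quantization assumed by default
    overhead_gb=${3:-1.5}  # assumed headroom; long contexts need more
    # billions of parameters * bits / 8 = gigabytes of weights
    awk -v p="$params_b" -v b="$bits" -v o="$overhead_gb" \
        'BEGIN { printf "%.1f\n", p * b / 8 + o }'
)

estimate_vram_gb 7        # 7B model, 4-bit weights  -> 5.0
estimate_vram_gb 3 4 1.0  # 3B model, less headroom  -> 2.5
```

The same arithmetic shows why unquantized models are out of reach on a small card: `estimate_vram_gb 7 16 0` gives 14.0 GB for the 16-bit weights alone.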

1 Prerequisites: Enable SSH on UGOS

It is possible to use the Docker application available in UGOS, but I prefer to use the SSH terminal for more flexibility:

Configuration Panel → Terminal → SSH → Enable → Apply

Then connect from PowerShell or a terminal:

ssh your_user@nas_ip_address

Switch to root user:

sudo -i

2 Create the directory and the docker-compose.yml file

Create a dedicated folder on the shared data volume:

mkdir -p /volume1/docker/ai-stack
cd /volume1/docker/ai-stack
vim docker-compose.yml

Generate a secret key:

openssl rand -hex 32

Paste the following configuration, replacing "change_this_secret_key_please" with the generated key. Choose the CPU version if no GPU is installed; otherwise, follow the tutorial on installing a graphics card and choose the GPU version:

docker-compose.yml (CPU version)

services:

  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - OLLAMA_HOST=0.0.0.0
      - OLLAMA_ORIGINS=*
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      timeout: 10s
      retries: 3

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3000:8080"
    volumes:
      - open_webui_data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=change_this_secret_key_please
    depends_on:
      ollama:
        condition: service_healthy

volumes:
  ollama_data:
  open_webui_data:

docker-compose.yml (GPU version)

services:

  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      - OLLAMA_HOST=0.0.0.0
      - OLLAMA_ORIGINS=*
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      timeout: 10s
      retries: 3

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3000:8080"
    volumes:
      - open_webui_data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=change_this_secret_key_please
    depends_on:
      ollama:
        condition: service_healthy

volumes:
  ollama_data:
  open_webui_data:

Save the file by typing:

:wq
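
If you would rather not paste the key by hand, the placeholder can also be replaced after the file is saved. The following is a minimal sketch: `set_webui_secret` is a hypothetical helper, and it assumes the file still contains the literal placeholder "change_this_secret_key_please". It writes through a temporary file instead of using `sed -i`, whose syntax differs between GNU and BSD systems.

```shell
#!/bin/sh
# set_webui_secret <compose_file> [secret]
# Replaces the placeholder in the compose file with a secret.
# Without a second argument, a random 32-byte hex key is generated.
set_webui_secret() (
    compose_file=$1
    secret=${2:-$(openssl rand -hex 32)}
    placeholder="change_this_secret_key_please"
    tmp_file=$(mktemp) || exit 1
    # Hex output only contains [0-9a-f], so it is safe in a sed replacement.
    sed "s/$placeholder/$secret/g" "$compose_file" > "$tmp_file" &&
        cat "$tmp_file" > "$compose_file"
    status=$?
    rm -f "$tmp_file"
    exit "$status"
)

# Example, run from /volume1/docker/ai-stack:
#   set_webui_secret docker-compose.yml
```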

3 Launch the stack and access the interface

docker compose up -d

Wait a few minutes while the images download. Once the stack has started, open the interface in a browser:

http://<ip_nas>:3000/
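
Rather than refreshing the browser until the page loads, you can poll the service from the SSH session. The sketch below is one possible approach: `wait_until` is a hypothetical helper that re-runs any command until it succeeds, and the URL in the usage comment assumes the 3000:8080 port mapping from the compose file above.

```shell
#!/bin/sh
# wait_until <max_attempts> <delay_seconds> <command> [args...]
# Re-runs the command until it succeeds or the attempts are used up.
wait_until() (
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            echo "ready after $attempt attempt(s)"
            exit 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "not ready after $max_attempts attempt(s)" >&2
    exit 1
)

# Example: poll Open WebUI every 10 seconds, for up to 5 minutes.
#   wait_until 30 10 curl -fsS http://localhost:3000/
```

`docker compose ps` gives a complementary view: the ollama service should be reported as healthy before open-webui starts, because of the `depends_on` condition.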

4 First-time setup

Open in the browser:

http://<ip_nas>:3000/

Create an administrator account.

Choose a model on the Ollama website. I recommend starting with a lightweight model such as qwen3.5:0.8b, which is about 1 GB and runs well on CPU.

Download the chosen model.

Test the model.

Tutorial 3: Repair broken UGOS packages to access the Nvidia GPU in Docker

I installed an Nvidia RTX 3050 graphics card in my NAS, along with the drivers and toolkit available in the UGOS application center.

If you want to install a GPU, first check that it is compatible (see the list of supported cards). In addition, to use the internal PCIe x4 slot and fit the card inside the case, the card must be low-profile, single-slot, and draw less than 75 W. For a more powerful GPU, you can use the Thunderbolt port and run it as an eGPU.

However, despite all these precautions, the GPU cannot be used from Docker by default. It is well integrated with the AI tools for photo processing, but not with anything else. The underlying problem is a classic one: the Nvidia drivers and CUDA/cuDNN are installed on the NAS, but the NVIDIA Container Toolkit, the component that bridges Docker and the GPU drivers, is missing. Without it, Docker does not see the graphics card.

On UGOS (Debian 12), an additional difficulty comes from the local CUDA/cuDNN repositories installed by Ugreen, whose Release files are missing, which blocks apt.

1 Open an SSH connection and test the installation

Using the graphical interface (UGREEN NAS application):

Configuration panel → Terminal → SSH → Enable → Apply

Then connect from PowerShell or a terminal:

ssh your_user@nas_ip_address

Switch to root:

sudo -i

Try installing the toolkit:

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

If you see errors of the type "The repository does not have a Release file" for the local CUDA repositories, proceed to the next step.

2 Disable the broken local CUDA/cuDNN repositories

The local repositories installed by UGOS for CUDA 12.8 and cuDNN 9.10.0 have missing Release files. They need to be disabled temporarily:

sudo mv /etc/apt/sources.list.d/cuda-debian12-12-8-local.list /tmp/ 2>/dev/null || true
sudo mv /etc/apt/sources.list.d/cudnn-local-repo-debian12-9.10.0.list /tmp/ 2>/dev/null || true

Why? These local repositories point to /var/cuda-repo-*/Release files that do not exist. Apt then refuses to update, which blocks any installation.
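
The file names above are the ones found on my NAS and may differ on yours. As a rough sketch, the broken entries can be located by checking, for each local (file:) repository declared in the apt sources, whether its directory holds a Release or InRelease file. `list_broken_local_repos` is a hypothetical helper, and it assumes flat repositories declared with a trailing "/", as NVIDIA's local repositories usually are.

```shell
#!/bin/sh
# list_broken_local_repos [sources_dir]
# Prints every *.list file that references a file: repository whose directory
# has neither a Release nor an InRelease file.
# Assumption: flat repositories ("deb file:///path /"), where the Release file
# sits directly in the repository directory.
list_broken_local_repos() (
    sources_dir=${1:-/etc/apt/sources.list.d}
    for list_file in "$sources_dir"/*.list; do
        [ -f "$list_file" ] || continue
        # Extract local repository paths from lines such as:
        #   deb [signed-by=...] file:///var/cuda-repo-debian12-12-8-local /
        grep -E '^deb(-src)?[[:space:]]' "$list_file" |
            grep -oE 'file:[^[:space:]]+' |
            sed -E 's#^file:(//)?##' |
            while IFS= read -r repo_path; do
                if [ ! -f "$repo_path/Release" ] && [ ! -f "$repo_path/InRelease" ]; then
                    echo "$list_file"
                    break
                fi
            done
    done
)

# Example: show the source files that would make apt-get update fail.
#   list_broken_local_repos
```

Each file it prints is a candidate for the `mv ... /tmp/` treatment shown above.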

3 Repair the dependencies and install the toolkit

# Update the package lists
sudo apt-get update

# Fix broken dependencies (libglib2.0, libexiv2, etc.)
sudo apt --fix-broken install -y

# Install the NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit

This sequence installs libnvidia-container1, libnvidia-container-tools, nvidia-container-toolkit-base and nvidia-container-toolkit.

4 Configure Docker and verify GPU access

Configure the NVIDIA runtime for Docker:

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
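
Before running a GPU container, you can check that the configuration step took effect. To my understanding, `nvidia-ctk runtime configure` registers the runtime in /etc/docker/daemon.json. The sketch below simply looks for the runtime binary name in that file; the exact layout of the file is an assumption, and `has_nvidia_runtime` is a hypothetical helper.

```shell
#!/bin/sh
# has_nvidia_runtime [daemon_json]
# Succeeds when the Docker daemon configuration mentions the NVIDIA runtime.
# Assumption: the runtime entry references "nvidia-container-runtime".
has_nvidia_runtime() (
    daemon_json=${1:-/etc/docker/daemon.json}
    [ -f "$daemon_json" ] && grep -q 'nvidia-container-runtime' "$daemon_json"
)

if has_nvidia_runtime; then
    echo "nvidia runtime registered"
else
    echo "nvidia runtime not found in daemon.json"
fi

# Alternative view from the daemon itself, once Docker has restarted:
#   docker info --format '{{json .Runtimes}}'
```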

At this stage, a broken cuDNN repository may still be present. It is not blocking, but you can remove the warning as follows:

ls /etc/apt/sources.list.d/ | grep cudnn
sudo rm /etc/apt/sources.list.d/<file_name>

Verify that Docker can see the GPU:

docker run --rm --gpus all nvidia/cuda:12.8.0-base-ubuntu22.04 nvidia-smi

If the command displays information about your Nvidia card, everything is in order. You can now restart the Ollama stack with GPU acceleration.

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.181                Driver Version: 570.181        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050        On  |   00000000:02:00.0 Off |                  N/A |
|100%   38C    P8              5W /   70W |       0MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

5 Test GPU usage with Ollama

Launch an Ollama container with GPU acceleration:

docker run -d \
  --gpus all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  --restart unless-stopped \
  ollama/ollama

Download a lightweight model suited to 6 GB of VRAM:

docker exec -it ollama ollama pull llama3.2:3b

Test GPU usage:

docker exec -it ollama ollama run llama3.2:3b "Code the factorial function in Python."

In parallel, open a second SSH session in another terminal and run the following command:

watch -n 1 nvidia-smi

While the response is being generated, you should see the GPU resource usage change.

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.181                Driver Version: 570.181        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050        On  |   00000000:02:00.0 Off |                  N/A |
|100%   47C    P2             39W /   70W |    2782MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A          347591      C   /usr/bin/ollama                        2774MiB |
+-----------------------------------------------------------------------------------------+
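
For a more compact reading than the full table, nvidia-smi can print selected fields as CSV. The sketch below turns the "used, total" memory pair into a percentage; `vram_usage_percent` is a hypothetical helper, and the sample line reuses the values from the output above.

```shell
#!/bin/sh
# vram_usage_percent: reads "used, total" pairs in MiB on stdin (one GPU per
# line) and prints the share of VRAM in use for each GPU.
vram_usage_percent() {
    awk -F', *' 'NF >= 2 && $2 > 0 { printf "%.1f%%\n", 100 * $1 / $2 }'
}

# On the NAS (requires the Nvidia driver):
#   nvidia-smi --query-gpu=memory.used,memory.total \
#       --format=csv,noheader,nounits | vram_usage_percent

# With the values from the table above:
echo "2782, 6144" | vram_usage_percent   # prints 45.3%
```

So the 3B model occupies a little under half of the card's 6 GB, which leaves room for a larger context or a slightly bigger model.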

6 Clean up the test resources

Still in the SSH session:

docker stop ollama
docker rm ollama
docker volume rm ollama
docker ps -a | grep ollama
docker volume ls | grep ollama
docker rmi nvidia/cuda:12.8.0-base-ubuntu22.04

7 Configure Ollama

See tutorial 2.

Tutorial 4 (under construction): Install OpenCode and configure it with Ollama (local)

OpenCode is a task automation tool based on LLMs. It lets you create custom agents capable of executing complex tasks by combining different actions (executing code, making web requests, interacting with APIs, etc.). OpenCode can be used to automate a wide variety of tasks, from file management to business process automation.

OpenCode is often seen and used as an alternative to Claude Code.

1 Prerequisites

Follow the previous tutorials to configure Ollama.


Ugreen DXP6800 Pro - NAS Lab