Hosting Ollama Models Locally with Ubuntu Server: A Step-by-Step Guide

A comprehensive guide to setting up a local AI server with Ubuntu and NVIDIA GPUs.
Running AI models locally with Ollama calls for a robust server setup to get the best performance and usability. In this guide, I’ll show you how to prepare an Ubuntu server, install Ollama, and set up OpenWebUI for an interactive interface—all approachable whether or not you have deep Linux experience.
**The Foundation: Setting Up Ubuntu Server**
A solid foundation begins with Ubuntu Server installation. Let’s dive into the setup process.
**Step 1: Installing Ubuntu Server**
First, download and install Ubuntu Server on your PC. Here's how:
- Visit the Ubuntu Server download page and get the ISO file.
- Create a bootable USB drive using a tool like Rufus (for Windows) or `dd` (for Linux/Mac):

```bash
sudo dd if=/path/to/ubuntu-server.iso of=/dev/sdX bs=4M status=progress
```

(Replace `/dev/sdX` with your USB device; see the note after this list.)
- Boot from the USB drive and follow the on-screen instructions to install Ubuntu. Remember to set up a username, password, and hostname.
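A word of caution on `/dev/sdX`: `dd` overwrites whatever device you point it at, so confirm which block device is the USB stick before flashing. One quick way is `lsblk`, where USB drives show `usb` in the TRAN column:

```bash
# List block devices with size, model, and transport type;
# your USB stick should show "usb" under TRAN
lsblk -o NAME,SIZE,MODEL,TRAN
```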
**Building Blocks: Essential Updates and GPU Configuration**
**Keeping Your System Updated**
Once Ubuntu is installed, ensure your server is up-to-date with essential tools:
```bash
sudo apt update && sudo apt upgrade -y
sudo apt install build-essential dkms linux-headers-$(uname -r) software-properties-common -y
```
**Configuring NVIDIA Drivers**
If your server uses an NVIDIA GPU, install the appropriate drivers:
- Add the NVIDIA PPA:

```bash
sudo add-apt-repository ppa:graphics-drivers/ppa -y
sudo apt update
```

- Detect and install the recommended driver (560 is an example; use whichever version `ubuntu-drivers devices` recommends for your card):

```bash
ubuntu-drivers devices
sudo apt install nvidia-driver-560 -y
sudo reboot
```

- Verify the installation:

```bash
nvidia-smi
```
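If `nvidia-smi` prints its usual status table, the driver is working. For a compact check that is easy to read over SSH, its query mode can pull just the fields you care about:

```bash
# Print just the GPU name, driver version, and total VRAM as CSV
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
```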
**Getting Started with Ollama**
Ollama lets you work with advanced AI models locally. Here’s how to get started:
- Install Ollama:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

- Add models (e.g., `llama3`):

```bash
ollama pull llama3
```
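Once a model is pulled, it’s worth a quick sanity check before layering a UI on top. You can chat directly in the terminal, or hit Ollama’s local REST API, which listens on port 11434 by default (the prompt below is just an example):

```bash
# Chat interactively in the terminal (type /bye or press Ctrl+D to exit)
ollama run llama3

# Or query the local REST API; "stream": false returns one JSON object
# instead of a stream of tokens
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```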
**Enhancing Usability with OpenWebUI**
OpenWebUI provides a seamless interface for interacting with your models:
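The command below assumes Docker is already installed. If it isn’t, the version in Ubuntu’s own repositories is enough for this setup (Docker’s official repository works too):

```bash
# Install Docker from Ubuntu's repositories and start it at boot
sudo apt install docker.io -y
sudo systemctl enable --now docker
```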
- Set up OpenWebUI using Docker:

```bash
sudo docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```

- Access the WebUI through your server’s IP. With host networking, OpenWebUI listens on port 8080 by default, so browse to `http://<server-ip>:8080`.
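If the page doesn’t come up, check that the container is actually running and skim its logs (using the container name from the command above):

```bash
# Confirm the container is up
sudo docker ps --filter name=open-webui

# Show the most recent log lines from OpenWebUI
sudo docker logs --tail 50 open-webui
```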
**Testing and Troubleshooting**
**GPU Verification:** Use `nvidia-smi` to confirm GPU functionality.

**Common Errors:**
- `aplay: command not found`: install `alsa-utils` with `sudo apt install alsa-utils -y`.
- Deprecated `hwdb` errors: update packages with `sudo apt update && sudo apt full-upgrade -y`.
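Beyond `nvidia-smi` alone, it helps to confirm Ollama is actually offloading to the GPU rather than silently falling back to CPU. One rough check (on recent Ollama versions) is to watch GPU memory while a model answers a prompt, then ask Ollama where the loaded model is running:

```bash
# Terminal 1: refresh GPU stats every second; VRAM use should jump
# when a model loads
watch -n 1 nvidia-smi

# Terminal 2: load the model and send a one-off prompt
ollama run llama3 "Say hello"

# List loaded models; the output shows whether each sits on GPU or CPU
ollama ps
```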
**Optional: CUDA for Compute Workloads**
To maximize GPU compute capabilities:
- Install CUDA:

```bash
sudo apt install nvidia-cuda-toolkit -y
```

- Verify:

```bash
nvcc --version
```
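`nvcc --version` only proves the compiler is on your PATH. For an end-to-end check that the toolkit can compile and run device code against your driver, a minimal sketch is below (assuming a compatible driver/toolkit pair; the `/tmp/hello.cu` path is arbitrary):

```bash
# Write a one-kernel CUDA program, compile it, and run it on the GPU
cat > /tmp/hello.cu <<'EOF'
#include <cstdio>
__global__ void hello() { printf("Hello from the GPU\n"); }
int main() {
    hello<<<1, 1>>>();          // launch one thread on the device
    cudaDeviceSynchronize();    // wait for the kernel (and its printf)
    return 0;
}
EOF
nvcc /tmp/hello.cu -o /tmp/hello && /tmp/hello
```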
With this setup, your Ubuntu server is now optimized for hosting Ollama models and leveraging OpenWebUI. Whether for experimentation or production, this guide ensures a smooth and efficient process. Happy experimenting!