Hosting Ollama Models Locally with Ubuntu Server: A Step-by-Step Guide

A comprehensive guide to setting up a local AI server with Ubuntu and NVIDIA GPUs.
Running AI models locally with Ollama calls for a robust server setup to get the best performance and usability. In this guide, I’ll show you how to prepare an Ubuntu server, install Ollama, and set up OpenWebUI for an interactive interface—all approachable whether or not you have deep Linux experience.
**The Foundation: Setting Up Ubuntu Server**
A solid foundation begins with Ubuntu Server installation. Let’s dive into the setup process.
**Step 1: Installing Ubuntu Server**
First, download and install Ubuntu Server on your PC. Here's how:
- Visit the Ubuntu Server download page and get the ISO file.
- Create a bootable USB drive using a tool like Rufus (for Windows) or `dd` (for Linux/Mac):

```bash
sudo dd if=/path/to/ubuntu-server.iso of=/dev/sdX bs=4M status=progress
```

(Replace `/dev/sdX` with your USB device; see the note after this list.)
- Boot from the USB drive and follow the on-screen instructions to install Ubuntu. Remember to set up a username, password, and hostname.
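A word of caution on `/dev/sdX`: `dd` overwrites whatever device you point it at, so confirm which block device is the USB stick before flashing. One quick way is `lsblk`, where USB drives show `usb` in the TRAN column:

```bash
# List block devices with size, model, and transport type;
# your USB stick should show "usb" under TRAN
lsblk -o NAME,SIZE,MODEL,TRAN
```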
**Building Blocks: Essential Updates and GPU Configuration**
**Keeping Your System Updated**
Once Ubuntu is installed, ensure your server is up-to-date with essential tools:
```bash
sudo apt update && sudo apt upgrade -y
sudo apt install build-essential dkms linux-headers-$(uname -r) software-properties-common -y
```
**Configuring NVIDIA Drivers**
If your server uses an NVIDIA GPU, install the appropriate drivers:
- Add the NVIDIA PPA:

```bash
sudo add-apt-repository ppa:graphics-drivers/ppa -y
sudo apt update
```

- Detect and install the recommended driver (560 is an example; use whichever version `ubuntu-drivers devices` recommends for your card):

```bash
ubuntu-drivers devices
sudo apt install nvidia-driver-560 -y
sudo reboot
```

- Verify the installation:

```bash
nvidia-smi
```
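If `nvidia-smi` prints its usual status table, the driver is working. For a compact check that is easy to read over SSH, its query mode can pull just the fields you care about:

```bash
# Print just the GPU name, driver version, and total VRAM as CSV
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
```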
**Getting Started with Ollama**
Ollama lets you work with advanced AI models locally. Here’s how to get started:
- Install Ollama:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

- Add models (e.g., `llama3`):

```bash
ollama pull llama3
```
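Once a model is pulled, it’s worth a quick sanity check before layering a UI on top. You can chat directly in the terminal, or hit Ollama’s local REST API, which listens on port 11434 by default (the prompt below is just an example):

```bash
# Chat interactively in the terminal (type /bye or press Ctrl+D to exit)
ollama run llama3

# Or query the local REST API; "stream": false returns one JSON object
# instead of a stream of tokens
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```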
**Enhancing Usability with OpenWebUI**
OpenWebUI provides a seamless interface for interacting with your models:
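The command below assumes Docker is already installed. If it isn’t, the version in Ubuntu’s own repositories is enough for this setup (Docker’s official repository works too):

```bash
# Install Docker from Ubuntu's repositories and start it at boot
sudo apt install docker.io -y
sudo systemctl enable --now docker
```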
- Set up OpenWebUI using Docker:

```bash
sudo docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```

- Access the WebUI through your server’s IP. With host networking, OpenWebUI listens on port 8080 by default, so browse to `http://<server-ip>:8080`.
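If the page doesn’t come up, check that the container is actually running and skim its logs (using the container name from the command above):

```bash
# Confirm the container is up
sudo docker ps --filter name=open-webui

# Show the most recent log lines from OpenWebUI
sudo docker logs --tail 50 open-webui
```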
**Testing and Troubleshooting**
**GPU Verification:** Use `nvidia-smi` to confirm GPU functionality.

**Common Errors:**
- `aplay: command not found`: install `alsa-utils` with `sudo apt install alsa-utils -y`.
- Deprecated `hwdb` errors: update packages with `sudo apt update && sudo apt full-upgrade -y`.
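Beyond `nvidia-smi` alone, it helps to confirm Ollama is actually offloading to the GPU rather than silently falling back to CPU. One rough check (on recent Ollama versions) is to watch GPU memory while a model answers a prompt, then ask Ollama where the loaded model is running:

```bash
# Terminal 1: refresh GPU stats every second; VRAM use should jump
# when a model loads
watch -n 1 nvidia-smi

# Terminal 2: load the model and send a one-off prompt
ollama run llama3 "Say hello"

# List loaded models; the output shows whether each sits on GPU or CPU
ollama ps
```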
**Optional: CUDA for Compute Workloads**
To maximize GPU compute capabilities:
- Install CUDA:

```bash
sudo apt install nvidia-cuda-toolkit -y
```

- Verify:

```bash
nvcc --version
```
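`nvcc --version` only proves the compiler is on your PATH. For an end-to-end check that the toolkit can compile and run device code against your driver, a minimal sketch is below (assuming a compatible driver/toolkit pair; the `/tmp/hello.cu` path is arbitrary):

```bash
# Write a one-kernel CUDA program, compile it, and run it on the GPU
cat > /tmp/hello.cu <<'EOF'
#include <cstdio>
__global__ void hello() { printf("Hello from the GPU\n"); }
int main() {
    hello<<<1, 1>>>();          // launch one thread on the device
    cudaDeviceSynchronize();    // wait for the kernel (and its printf)
    return 0;
}
EOF
nvcc /tmp/hello.cu -o /tmp/hello && /tmp/hello
```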
With this setup, your Ubuntu server is now optimized for hosting Ollama models and leveraging OpenWebUI. Whether for experimentation or production, this guide ensures a smooth and efficient process. Happy experimenting!