Desktop launcher

llama-wrap

A small Tkinter app for building, saving, importing, and running llama-server commands with local GGUF models.

What it does

llama-wrap is not a chat UI. It is a focused wrapper for people who already use llama.cpp and want a cleaner way to launch llama-server without rebuilding the full command by hand every time.

Why it exists

Local model launches often involve long paths, context settings, GPU layer choices, cache types, ports, host settings, and one-off flags. This tool keeps those settings visible, reusable, and easier to adjust.

Model paths

Browse for model and MMProj GGUF files instead of typing long file paths.
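
A file picker of this kind can be sketched with Tkinter's standard `filedialog` module. This is only an illustration, not llama-wrap's actual code: the names `GGUF_FILETYPES`, `is_gguf`, and `pick_gguf` are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Hypothetical filter list: show GGUF files first, but allow everything.
GGUF_FILETYPES = [("GGUF files", "*.gguf"), ("All files", "*.*")]


def is_gguf(path: str) -> bool:
    """Return True when the path has a .gguf extension (case-insensitive)."""
    return Path(path).suffix.lower() == ".gguf"


def pick_gguf(title: str) -> Optional[str]:
    """Open a file dialog and return the chosen GGUF path, or None."""
    # Imported lazily so is_gguf() stays usable on machines without a display.
    from tkinter import filedialog

    path = filedialog.askopenfilename(title=title, filetypes=GGUF_FILETYPES)
    # The dialog returns an empty value when the user cancels.
    return path if path and is_gguf(path) else None
```

Calling `pick_gguf("Select model")` from a button handler would return the selected path, or `None` if the dialog was cancelled or a non-GGUF file was chosen.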

Flag controls

Edit common llama-server flags from the UI and keep uncommon options in extra args.
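
As a rough sketch of what building the command from UI settings could look like: the settings keys and the `build_command` helper below are assumptions made for illustration, and the flags are ones llama-server accepts at the time of writing. The app's own field names and flag coverage may differ.

```python
import shlex


def build_command(settings: dict, extra_args: str = "") -> list:
    """Turn a settings mapping into an argv list for llama-server."""
    cmd = ["llama-server", "--model", str(settings["model"])]
    # Hypothetical settings keys mapped to llama-server flags.
    optional = {
        "mmproj": "--mmproj",
        "ctx_size": "--ctx-size",
        "n_gpu_layers": "--n-gpu-layers",
        "cache_type_k": "--cache-type-k",
        "cache_type_v": "--cache-type-v",
        "host": "--host",
        "port": "--port",
    }
    for key, flag in optional.items():
        value = settings.get(key)
        # Emit a flag only when the user actually set a value.
        if value not in (None, ""):
            cmd += [flag, str(value)]
    # Uncommon options pass through as typed; shlex respects quoting.
    cmd += shlex.split(extra_args)
    return cmd


settings = {
    "model": "/models/example.gguf",  # placeholder path
    "ctx_size": 8192,
    "n_gpu_layers": 99,
    "port": 8080,
}
command = build_command(settings, "--threads 8")
print(shlex.join(command))
```

Keeping the command as a list rather than one string means it can be handed to `subprocess` without a shell, so paths with spaces need no manual quoting.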

Presets

Save and reload launch presets, including model paths, enabled flags, and recent commands.
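
One simple way to persist presets is a single JSON file keyed by preset name. The file layout and the `save_preset` / `load_presets` helpers below are hypothetical, shown only to make the idea concrete. llama-wrap's real storage format is not documented here.

```python
import json
import tempfile
from pathlib import Path


def load_presets(path: Path) -> dict:
    """Return all saved presets, or an empty dict if the file is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_preset(path: Path, name: str, preset: dict) -> None:
    """Add or overwrite one named preset, keeping the others."""
    presets = load_presets(path)
    presets[name] = preset
    path.write_text(json.dumps(presets, indent=2), encoding="utf-8")


# Demonstration in a temporary directory with placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    preset_file = Path(tmp) / "presets.json"
    save_preset(preset_file, "vision-8k", {
        "model": "/models/example.gguf",
        "mmproj": "/models/example-mmproj.gguf",
        "ctx_size": 8192,
        "extra_args": "--threads 8",
    })
    restored = load_presets(preset_file)["vision-8k"]
    print(restored["ctx_size"])
```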

Live output

Watch server logs and launch status, see when the model has finished loading, and get a rough VRAM estimate.
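
Streaming output and spotting a "loaded" line could look roughly like the sketch below. The `stream_output` helper and the `ready_marker` text are assumptions: the exact log line llama-server prints when a model is ready varies between versions, so the marker is a parameter. A small fake process stands in for the server so the example runs anywhere.

```python
import subprocess
import sys


def stream_output(cmd, ready_marker, on_line=print):
    """Run cmd, forward each output line, and report whether the marker appeared."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so all logs arrive in order
        text=True,
        bufsize=1,  # line-buffered
    )
    loaded = False
    for raw in proc.stdout:
        line = raw.rstrip("\n")
        on_line(line)
        # Hypothetical readiness check: a plain substring match on the log line.
        if ready_marker in line:
            loaded = True
    return loaded, proc.wait()


# Stand-in for llama-server: prints two lines and exits.
fake_server = [sys.executable, "-c", "print('loading weights'); print('model loaded')"]
seen = []
loaded, exit_code = stream_output(fake_server, "model loaded", on_line=seen.append)
print(loaded, exit_code, seen)
```

In a Tkinter app, a loop like this would normally run in a background thread and hand lines to the UI through a queue, since blocking the main loop freezes the window.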

Run it locally

Install Python 3.10 or newer, make sure llama-server is on your PATH, then run the app from source.

python llamawrap.py
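
If the app fails to start the server, the two prerequisites can be checked with a few lines of standard-library Python. The `preflight` helper is a hypothetical illustration, not part of llama-wrap.

```python
import shutil
import sys


def preflight(binary="llama-server", min_python=(3, 10)):
    """Return a list of human-readable problems; empty means ready to run."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted} or newer is required")
    # shutil.which searches PATH the same way the shell would.
    if shutil.which(binary) is None:
        problems.append(f"{binary} was not found on PATH")
    return problems


for problem in preflight():
    print("Problem:", problem)
```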