Desktop launcher

llama-wrap

A small Tkinter app for building, saving, importing, and running llama-server commands with local GGUF models.

What it does

llama-wrap is not a chat UI. It is a focused wrapper for people who already use llama.cpp and want a cleaner way to launch llama-server without rebuilding the full command by hand every time.

Why it exists

Local model launches often involve long paths, context settings, GPU layer choices, cache types, ports, host settings, and one-off flags. This tool keeps those settings visible, reusable, and easier to adjust.

Model paths

Browse for model and MMProj GGUF files instead of typing long file paths.

Flag controls

Edit common llama-server flags from the UI and keep uncommon options in extra args.
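
To illustrate how UI settings plus free-form extra args might become one command, here is a minimal sketch. It is not taken from the llama-wrap source: the function name build_command and the settings keys (model, mmproj, ctx_size, gpu_layers, host, port) are hypothetical. The flags themselves are standard llama-server options.

```python
import shlex


def build_command(settings, extra_args=""):
    """Assemble an argv list for llama-server from a settings dict.

    The settings keys are hypothetical; only the flags are real
    llama-server options.
    """
    cmd = ["llama-server", "--model", settings["model"]]

    # The multimodal projector file is optional.
    if settings.get("mmproj"):
        cmd += ["--mmproj", settings["mmproj"]]

    # Map each (assumed) setting key to its llama-server flag.
    flag_map = {
        "ctx_size": "--ctx-size",
        "gpu_layers": "--n-gpu-layers",
        "host": "--host",
        "port": "--port",
    }
    for key, flag in flag_map.items():
        value = settings.get(key)
        if value is not None:
            cmd += [flag, str(value)]

    # Uncommon options are passed through as typed, split shell-style
    # so that quoted values containing spaces survive intact.
    cmd += shlex.split(extra_args)
    return cmd


if __name__ == "__main__":
    print(build_command({"model": "model.gguf", "port": 8080}, "--threads 8"))
```

Returning a list rather than a single string means the result can be handed directly to subprocess.Popen without shell quoting concerns.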

Presets

Save and reload launch presets, including model paths, enabled flags, and recent commands.
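
One plausible way to store presets is a single JSON file keyed by preset name. The sketch below assumes that layout; the file format, the function names save_preset and load_preset, and the preset fields are all hypothetical and may differ from what llama-wrap actually writes.

```python
import json
from pathlib import Path


def save_preset(path, name, preset):
    """Add or overwrite one named preset in a JSON file (assumed layout)."""
    path = Path(path)
    # Start from the existing presets so other entries are preserved.
    presets = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    presets[name] = preset
    path.write_text(json.dumps(presets, indent=2), encoding="utf-8")


def load_preset(path, name):
    """Return the preset stored under `name`; raises KeyError if absent."""
    presets = json.loads(Path(path).read_text(encoding="utf-8"))
    return presets[name]
```

A preset here is just a dictionary, for example {"model": "model.gguf", "ctx_size": 8192, "extra_args": "--threads 8"}, so it round-trips through JSON without custom encoding.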

Live output

Watch server logs, launch status, model-loaded detection, and rough VRAM estimates.
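
Model-loaded detection can be approximated by scanning log lines for a readiness marker. The sketch below shows the idea only: the marker strings are assumptions, since the exact wording of llama-server log output varies between builds, and find_ready_line is a hypothetical helper rather than llama-wrap's own code.

```python
import re

# Assumed readiness markers. Check the log output of your own
# llama-server build and adjust these patterns to match it.
READY_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"server is listening", r"model loaded")
]


def find_ready_line(lines, patterns=READY_PATTERNS):
    """Return the index of the first line matching a readiness marker.

    `lines` can be any iterable of strings, such as the stdout of a
    running process. Returns None if no line matches.
    """
    for index, line in enumerate(lines):
        if any(pattern.search(line) for pattern in patterns):
            return index
    return None
```

Because the function accepts any iterable, it can be fed a process's stdout stream line by line, or a saved log for testing.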

Run it locally

Install Python 3.10 or newer, make sure llama-server is available in your PATH, then run the app from source.

python llamawrap.py
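
If the app fails to start the server, it helps to confirm both prerequisites first. The following is a small standalone check, not part of llama-wrap; the function name check_environment is hypothetical.

```python
import shutil
import sys


def check_environment(binary="llama-server", min_python=(3, 10)):
    """Return a list of human-readable problems; empty means all good."""
    problems = []

    # sys.version_info compares as a tuple, so (3, 12, 1, ...) >= (3, 10).
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or newer is required")

    # shutil.which searches PATH the same way the shell would.
    if shutil.which(binary) is None:
        problems.append(f"{binary} was not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```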