What it does
llama-wrap is not a chat UI. It is a focused wrapper for people who already use llama.cpp and want a cleaner way to launch llama-server without rebuilding the full command by hand every time.
Desktop launcher
A small Tkinter app for building, saving, importing, and running llama-server commands with local GGUF models.
Local model launches often involve long paths, context settings, GPU layer choices, cache types, ports, host settings, and one-off flags. This tool keeps those settings visible, reusable, and easier to adjust.
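As a rough illustration of the kind of command this replaces, the sketch below assembles a llama-server argument list from a handful of settings. The function name build_command and its defaults are hypothetical and are not taken from llama-wrap's source; the flags shown (-m, -c, -ngl, --host, --port, --mmproj) are standard llama-server options.

```python
import shlex


def build_command(model, ctx_size=4096, gpu_layers=0,
                  host="127.0.0.1", port=8080,
                  mmproj=None, extra_args=""):
    """Assemble a llama-server argument list (hypothetical helper)."""
    cmd = [
        "llama-server",
        "-m", model,              # path to the GGUF model
        "-c", str(ctx_size),      # context size in tokens
        "-ngl", str(gpu_layers),  # number of layers to offload to the GPU
        "--host", host,
        "--port", str(port),
    ]
    if mmproj:
        # Optional multimodal projector GGUF file
        cmd += ["--mmproj", mmproj]
    # Uncommon, one-off flags are passed through as a free-form string
    cmd += shlex.split(extra_args)
    return cmd


cmd = build_command("models/example.gguf", ctx_size=8192, gpu_layers=20,
                    extra_args="--threads 8")
# shlex.join quotes any argument that contains spaces
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string means it can be handed to subprocess without shell quoting problems, which matters for model paths that contain spaces.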
Browse for model and MMProj GGUF files instead of typing long file paths.
Edit common llama-server flags from the UI and keep uncommon options in extra args.
Save and reload launch presets, including model paths, enabled flags, and recent commands.
Watch server logs, launch status, model-loaded detection, and rough VRAM estimates.
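Saving and reloading a preset can be pictured as a JSON round trip. The sketch below is only an assumption about how such a preset might look: the field names (model, mmproj, flags, extra_args) and the helper names are hypothetical, and llama-wrap's actual preset format may differ.

```python
import json
from pathlib import Path


def save_preset(path, preset):
    """Write a preset dictionary to disk as indented JSON."""
    Path(path).write_text(json.dumps(preset, indent=2), encoding="utf-8")


def load_preset(path):
    """Read a preset dictionary back from a JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Hypothetical preset layout, for illustration only
example_preset = {
    "model": "models/example.gguf",
    "mmproj": None,
    "flags": {"ctx_size": 8192, "gpu_layers": 20, "port": 8080},
    "extra_args": "--threads 8",
}
```

A plain JSON file keeps presets readable and easy to edit or share by hand.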
Install Python 3.10 or newer, make sure llama-server is on your PATH, then run the app from source:
python llamawrap.py
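If the app fails to start a server, a quick check of the two prerequisites can narrow things down. This is a minimal sketch using only the standard library; the helper names find_server and check_environment are hypothetical and not part of llama-wrap.

```python
import shutil
import sys


def find_server(name="llama-server"):
    """Return the full path of the executable on PATH, or None if missing."""
    return shutil.which(name)


def check_environment():
    """Collect human-readable problems with the local setup."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is required")
    if find_server() is None:
        problems.append("llama-server was not found on PATH")
    return problems


for problem in check_environment():
    print(problem)
```

An empty result means both prerequisites named above are in place.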