Models

skillet_agent runs against any OpenAI-compatible endpoint. Three providers are wired today: OpenAI (hosted, the default) and two local OpenAI-compatible servers, LMStudio and Ollama. Models are named with the explicit <provider>/<model> convention so the runtime can route requests without guessing; see UtilsAi.parseProviderModel. The full provider set is the fixed enum UtilsAi.SUPPORTED_PROVIDERS; an unknown provider prefix throws at startup.

Picking a model

The runner looks for a model spec in this order (first hit wins):

  1. SKILLET_MODEL_RUNNER env var — global override, full <provider>/<model> form.
  2. model: in the agent’s AGENTS.md frontmatter — per-skillet, same format.
  3. openai/gpt-4.1-nano — built-in default.

For the eval grader, the equivalent var is SKILLET_MODEL_EVAL (same fallback chain, same default).
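
The lookup order above can be sketched as a small function. This is a minimal illustration, not the runner's actual code: resolveRunnerModel is a hypothetical name, and treating an empty or whitespace-only value as unset is an assumption.

```typescript
// Hypothetical helper: resolves the runner model spec in the documented order.
const DEFAULT_MODEL = "openai/gpt-4.1-nano";

function resolveRunnerModel(
  env: Record<string, string | undefined>,
  frontmatterModel?: string,
): string {
  // 1. Global override from the environment.
  const fromEnv = env["SKILLET_MODEL_RUNNER"]?.trim();
  if (fromEnv) return fromEnv;

  // 2. Per-skillet `model:` value from the AGENTS.md frontmatter.
  const fromFrontmatter = frontmatterModel?.trim();
  if (fromFrontmatter) return fromFrontmatter;

  // 3. Built-in default.
  return DEFAULT_MODEL;
}
```

Calling resolveRunnerModel(process.env, frontmatter.model) would then yield the spec to parse; the eval grader would presumably follow the same shape with SKILLET_MODEL_EVAL in step 1.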

Preflight check

When SKILLET_MODEL_RUNNER is set, chat, run, and eval_run preflight it before the first model call and exit 1 with an actionable message if the provider can’t be reached — no OPENAI_API_KEY for openai, or an unreachable local server for lmstudio / ollama. This fails fast instead of dying deep inside the first request. The probe is a GET <baseURL>/models bounded to 1.5 s; any HTTP response counts as reachable. When the variable is unset, the check is a no-op. See UtilsAi.checkModelRunnerRunnable.
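
A reachability probe of this kind might look roughly like the following. It is a sketch under stated assumptions, not the body of UtilsAi.checkModelRunnerRunnable: isReachable and the injectable fetchImpl parameter are hypothetical, and it relies on the global fetch and AbortSignal.timeout available in recent Node versions.

```typescript
// Hypothetical probe: true when GET <baseURL>/models gets any HTTP response in time.
type FetchLike = (
  url: string,
  init: { method: string; signal: AbortSignal },
) => Promise<unknown>;

async function isReachable(
  baseURL: string,
  fetchImpl: FetchLike = (url, init) => fetch(url, init),
  timeoutMs = 1500,
): Promise<boolean> {
  // Strip trailing slashes so the probe URL has exactly one separator.
  const url = `${baseURL.replace(/\/+$/, "")}/models`;
  try {
    await fetchImpl(url, { method: "GET", signal: AbortSignal.timeout(timeoutMs) });
    // Any status (200, 401, 404, ...) proves something is listening.
    return true;
  } catch {
    // Connection refused, DNS failure, or the timeout aborting the request.
    return false;
  }
}
```

Treating every HTTP status as success keeps the probe about reachability only; authentication and model-name problems still surface on the first real request.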

Model spec format

Every model spec is <provider>/<model>. The string is split on the first /; everything after it is the model id passed verbatim to the OpenAI SDK. That lets LMStudio’s own namespaced ids (like liquid/lfm2-1.2b) ride along:

| Spec | Provider | Model id sent to the SDK |
| --- | --- | --- |
| openai/gpt-4.1-nano | openai | gpt-4.1-nano |
| openai/gpt-4.1 | openai | gpt-4.1 |
| lmstudio/liquid/lfm2-1.2b | lmstudio | liquid/lfm2-1.2b |
| lmstudio/google/gemma-3-4b-it | lmstudio | google/gemma-3-4b-it |
| ollama/llama3.2 | ollama | llama3.2 |
| ollama/qwen3 | ollama | qwen3 |

Unknown providers or bare model names (no /) throw at startup with an actionable error.
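
The split-on-first-slash rule is easy to get subtly wrong with a plain split("/"), so a short sketch may help. It is illustrative only: parseSpec is a hypothetical stand-in for UtilsAi.parseProviderModel, and the error wording is invented.

```typescript
const SUPPORTED_PROVIDERS = ["openai", "lmstudio", "ollama"] as const;
type Provider = (typeof SUPPORTED_PROVIDERS)[number];

function parseSpec(spec: string): { provider: Provider; model: string } {
  // Split on the FIRST slash only, so namespaced model ids keep their own slashes.
  const slash = spec.indexOf("/");
  if (slash <= 0 || slash === spec.length - 1) {
    throw new Error(`Model spec "${spec}" must be <provider>/<model>, e.g. openai/gpt-4.1-nano`);
  }
  const provider = spec.slice(0, slash);
  const model = spec.slice(slash + 1);
  if (!(SUPPORTED_PROVIDERS as readonly string[]).includes(provider)) {
    throw new Error(
      `Unknown provider "${provider}"; expected one of: ${SUPPORTED_PROVIDERS.join(", ")}`,
    );
  }
  return { provider: provider as Provider, model };
}
```

With this shape, parseSpec("lmstudio/liquid/lfm2-1.2b") keeps liquid/lfm2-1.2b intact as the model id, while a bare name such as gpt-4.1 or an unwired prefix such as anthropic/ is rejected up front.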

Base URLs

Each provider has a default endpoint and an optional override:

| Provider | Default base URL | Override env var |
| --- | --- | --- |
| openai | OpenAI SDK default (https://api.openai.com/v1) | OPENAI_BASE_URL |
| lmstudio | http://localhost:1234/v1 | LMSTUDIO_BASE_URL |
| ollama | http://localhost:11434/v1 | OLLAMA_BASE_URL |

Set the override when you want to point at a proxy, gateway, or a remote LMStudio / Ollama host. All three vars are optional; leave them unset to use the defaults.
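
As a rough sketch, the defaults and overrides above amount to a lookup table plus an environment check. The names BASE_URLS and resolveBaseURL are hypothetical, and falling back to the default when the variable is empty is an assumption.

```typescript
type Provider = "openai" | "lmstudio" | "ollama";

// Defaults and override variables as listed in the table above.
const BASE_URLS: Record<Provider, { defaultURL: string; envVar: string }> = {
  openai: { defaultURL: "https://api.openai.com/v1", envVar: "OPENAI_BASE_URL" },
  lmstudio: { defaultURL: "http://localhost:1234/v1", envVar: "LMSTUDIO_BASE_URL" },
  ollama: { defaultURL: "http://localhost:11434/v1", envVar: "OLLAMA_BASE_URL" },
};

function resolveBaseURL(
  provider: Provider,
  env: Record<string, string | undefined>,
): string {
  const { defaultURL, envVar } = BASE_URLS[provider];
  // Use the override when it is set and non-empty, otherwise the default.
  return env[envVar]?.trim() || defaultURL;
}
```

For example, resolveBaseURL("ollama", { OLLAMA_BASE_URL: "http://gpu-box:11434/v1" }) points the runtime at a remote host (gpu-box is a made-up hostname), while resolveBaseURL("ollama", {}) stays on localhost.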

OpenAI

Set OPENAI_API_KEY in .env and you’re set. Common model choices:

| Spec | When to pick it |
| --- | --- |
| openai/gpt-4.1-nano | Default. Cheapest, fastest, fine for routing and most skills. |
| openai/gpt-4.1-mini | When the nano starts ad-libbing or missing nuance. |
| openai/gpt-4.1 | When the mini still falls short, usually on long context or complex multi-step reasoning. |

Older models work too — anything the OpenAI SDK accepts (openai/gpt-4o, openai/gpt-4o-mini, openai/gpt-3.5-turbo, etc.).

LMStudio

LMStudio runs a local OpenAI-compatible server. Start it, load a model, leave it running on the default port (1234). Then point the runtime at it:

    export SKILLET_MODEL_RUNNER=lmstudio/liquid/lfm2-1.2b
    npm run dev:chat:todo_list

Or set the model in your AGENTS.md:

    ---
    name: todo_list
    description: Todo list manager.
    model: lmstudio/liquid/lfm2-1.2b
    ---

The runtime hits http://localhost:1234/v1 (or LMSTUDIO_BASE_URL if set) and treats the response as if it were OpenAI’s. No API key required. Cost-tracking entries are still written but the dollar amounts will be zero.

Tested LMStudio models include lmstudio/liquid/lfm2-1.2b and lmstudio/google/gemma-3-4b-it. Anything LMStudio loads and serves should work — the model id after lmstudio/ is sent through unchanged.

Ollama

Ollama also exposes an OpenAI-compatible endpoint. Install it, pull a model, and start the server:

    ollama pull llama3.2
    ollama serve   # serves http://localhost:11434

Then point the runtime at it:

    export SKILLET_MODEL_RUNNER=ollama/llama3.2
    npm run dev:chat:todo_list

Or set the model in your AGENTS.md:

    ---
    name: todo_list
    description: Todo list manager.
    model: ollama/llama3.2
    ---

The runtime hits http://localhost:11434/v1 (or OLLAMA_BASE_URL if set). No API key is required: the OpenAI SDK constructor refuses to start without one, so the ollama client ships a harmless placeholder key that Ollama ignores. As with LMStudio, cost-tracking entries are still written but the dollar amounts are zero. The model id after ollama/ (e.g. llama3.2, qwen3) is sent through unchanged.
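
The key handling can be pictured as follows. This is a hedged sketch rather than the real client wiring: apiKeyFor and the placeholder value are hypothetical, giving lmstudio the same placeholder as ollama is an assumption, and so is throwing when OPENAI_API_KEY is missing.

```typescript
type Provider = "openai" | "lmstudio" | "ollama";

// Assumed placeholder; the source only says the key is harmless and ignored.
const PLACEHOLDER_KEY = "not-needed";

function apiKeyFor(
  provider: Provider,
  env: Record<string, string | undefined>,
): string {
  if (provider === "openai") {
    // The hosted API needs a real key.
    const key = env["OPENAI_API_KEY"]?.trim();
    if (!key) {
      throw new Error("OPENAI_API_KEY is not set; add it to .env or pick a local provider");
    }
    return key;
  }
  // Local servers ignore the key, but the SDK constructor still wants a non-empty one.
  return PLACEHOLDER_KEY;
}
```

The returned string would be passed as the apiKey option when constructing the OpenAI SDK client, next to the provider's baseURL.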

Streaming

Pass -s to chat or run to stream tokens. All three providers support it. The events that come out of the async generator are the same in both modes; streaming just emits text events incrementally as tokens arrive instead of in one chunk at the end.

npx tsx ./src/cli.ts chat -c ./data/skillets/todo_list.skilled_crew.yaml -s
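
To illustrate why a consumer does not need to care which mode is active, here is a toy generator and a consumer loop. Everything in it is hypothetical: the TextEvent shape, emitText, and collect are invented for the example and are not the runtime's real event types.

```typescript
// Hypothetical event shape; the real event types are not shown on this page.
type TextEvent = { type: "text"; text: string };

// Emits one event per token when streaming, or a single event with the full text.
async function* emitText(tokens: string[], stream: boolean): AsyncGenerator<TextEvent> {
  if (stream) {
    for (const token of tokens) yield { type: "text", text: token };
  } else {
    yield { type: "text", text: tokens.join("") };
  }
}

// The same consumer loop works for both modes.
async function collect(events: AsyncIterable<TextEvent>): Promise<string> {
  let out = "";
  for await (const event of events) out += event.text;
  return out;
}
```

A terminal front end would write each event.text as it arrives instead of accumulating it, which is what makes the streamed mode feel incremental.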

What’s not supported

| Provider | Status |
| --- | --- |
| Anthropic (Claude) | No native adapter. The anthropic/* provider prefix is not wired; reach Claude through an OpenAI-compatible gateway via the openai provider and OPENAI_BASE_URL. |
| Google Gemini API (generativelanguage.googleapis.com) | No native adapter. Run Google's open-weight Gemma models locally through LMStudio with lmstudio/google/<model>, or front the Gemini API with an OpenAI-compatible gateway. |

Every provider must speak the OpenAI API — there is no plugin or config-file mechanism for new backends. Adding one means editing the UtilsAi.SUPPORTED_PROVIDERS enum and the UtilsAi.PROVIDER_CLIENT_CONFIGS client wiring (the base URL, its override env var, and an optional placeholder key); see UtilsAi and UtilsAi.getOpenAiClient.
