Models
skillet_agent runs against any OpenAI-compatible endpoint. Three providers are wired today: OpenAI (the default, hosted), LMStudio, and Ollama (both local OpenAI-compatible servers). Models are named with the explicit <provider>/<model> convention so the runtime can route requests without guessing — see UtilsAi.parseProviderModel. The full provider set is the fixed enum UtilsAi.SUPPORTED_PROVIDERS; an unknown provider prefix throws at startup.
Picking a model
The runner looks for a model spec in this order (first hit wins):
1. SKILLET_MODEL_RUNNER env var (global override, full <provider>/<model> form).
2. model: in the agent’s AGENTS.md frontmatter (per-skillet, same format).
3. openai/gpt-4.1-nano (built-in default).
For the eval grader, the equivalent var is SKILLET_MODEL_EVAL (same fallback chain, same default).
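The lookup order above can be sketched as a small function. This is a minimal illustration, not the runtime's actual code: the name `resolveModelSpec`, its parameters, and the choice to treat a blank value as unset are all assumptions.

```typescript
// Hypothetical sketch of the fallback chain; names are illustrative only.
const DEFAULT_MODEL = "openai/gpt-4.1-nano";

type Env = Record<string, string | undefined>;
type ModelEnvVar = "SKILLET_MODEL_RUNNER" | "SKILLET_MODEL_EVAL";

function resolveModelSpec(
  envVarName: ModelEnvVar,
  frontmatterModel: string | undefined,
  env: Env,
): string {
  // 1. Global override from the environment (assumption: blank counts as unset).
  const fromEnv = env[envVarName]?.trim();
  if (fromEnv) return fromEnv;

  // 2. Per-skillet model from the AGENTS.md frontmatter.
  const fromFrontmatter = frontmatterModel?.trim();
  if (fromFrontmatter) return fromFrontmatter;

  // 3. Built-in default.
  return DEFAULT_MODEL;
}
```

In a Node process the `env` argument would typically be `process.env`; passing it in explicitly keeps the function easy to test.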
Preflight check
When SKILLET_MODEL_RUNNER is set, chat, run, and eval_run preflight it before the first model call and exit 1 with an actionable message if the provider can’t be reached — no OPENAI_API_KEY for openai, or an unreachable local server for lmstudio / ollama. This fails fast instead of dying deep inside the first request. The probe is a GET <baseURL>/models bounded to 1.5 s; any HTTP response counts as reachable. When the variable is unset, the check is a no-op. See UtilsAi.checkModelRunnerRunnable.
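A reachability probe of the kind described above might look roughly like the following. It is a sketch under assumptions: `isEndpointReachable` and `FetchLike` are hypothetical names, and the real UtilsAi.checkModelRunnerRunnable may be structured differently. The fetch function is injected so the probe can be exercised without a network.

```typescript
// Hypothetical sketch of the preflight probe; not the runtime's actual code.
type FetchLike = (
  url: string,
  init: { method: string; signal: AbortSignal },
) => Promise<unknown>;

async function isEndpointReachable(
  baseURL: string,
  fetchImpl: FetchLike,
  timeoutMs = 1500, // the 1.5 s bound mentioned above
): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Strip trailing slashes so the probe URL is always <baseURL>/models.
    const url = `${baseURL.replace(/\/+$/, "")}/models`;
    await fetchImpl(url, { method: "GET", signal: controller.signal });
    // Any HTTP response, including 401 or 404, means the server answered.
    return true;
  } catch {
    // Connection refused, DNS failure, or the timeout abort.
    return false;
  } finally {
    clearTimeout(timer);
  }
}
```

With Node 18 or later, the global `fetch` can be passed as `fetchImpl`, for example `await isEndpointReachable("http://localhost:1234/v1", fetch)`.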
Model spec format
Every model spec is <provider>/<model>. The string is split on the first /; everything after it is the model id passed verbatim to the OpenAI SDK. That lets LMStudio’s own namespaced ids (like liquid/lfm2-1.2b) ride along:
| Spec | Provider | Model id sent to the SDK |
|---|---|---|
| openai/gpt-4.1-nano | openai | gpt-4.1-nano |
| openai/gpt-4.1 | openai | gpt-4.1 |
| lmstudio/liquid/lfm2-1.2b | lmstudio | liquid/lfm2-1.2b |
| lmstudio/google/gemma-3-4b-it | lmstudio | google/gemma-3-4b-it |
| ollama/llama3.2 | ollama | llama3.2 |
| ollama/qwen3 | ollama | qwen3 |
Unknown providers or bare model names (no /) throw at startup with an actionable error.
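The split-on-first-slash rule can be illustrated with a short sketch. The function below borrows the name `parseProviderModel` for readability, but its signature, return shape, and error messages are assumptions; the real UtilsAi.parseProviderModel may differ.

```typescript
// Hypothetical sketch of the spec parser; shapes and messages are illustrative.
const SUPPORTED_PROVIDERS = ["openai", "lmstudio", "ollama"] as const;
type Provider = (typeof SUPPORTED_PROVIDERS)[number];

interface ParsedModelSpec {
  provider: Provider;
  model: string; // passed through to the SDK unchanged
}

function isProvider(value: string): value is Provider {
  return (SUPPORTED_PROVIDERS as readonly string[]).includes(value);
}

function parseProviderModel(spec: string): ParsedModelSpec {
  // Split on the FIRST slash only, so namespaced ids keep their own slashes.
  const slash = spec.indexOf("/");
  if (slash <= 0 || slash === spec.length - 1) {
    throw new Error(
      `Model spec "${spec}" must use the <provider>/<model> form, e.g. openai/gpt-4.1-nano`,
    );
  }
  const provider = spec.slice(0, slash);
  const model = spec.slice(slash + 1);
  if (!isProvider(provider)) {
    throw new Error(
      `Unknown provider "${provider}"; expected one of: ${SUPPORTED_PROVIDERS.join(", ")}`,
    );
  }
  return { provider, model };
}
```

For example, `parseProviderModel("lmstudio/liquid/lfm2-1.2b")` yields the provider `lmstudio` and the model id `liquid/lfm2-1.2b`, matching the third row of the table.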
Base URLs
Each provider has a default endpoint and an optional override:
| Provider | Default base URL | Override env var |
|---|---|---|
| openai | OpenAI SDK default (https://api.openai.com/v1) | OPENAI_BASE_URL |
| lmstudio | http://localhost:1234/v1 | LMSTUDIO_BASE_URL |
| ollama | http://localhost:11434/v1 | OLLAMA_BASE_URL |
Set the override when you want to point at a proxy, gateway, or a remote LMStudio / Ollama host. All three vars are optional; leave them unset to use the defaults.
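As a rough illustration of the table above, base URL resolution could be written as a lookup with an optional environment override. The names `BASE_URLS` and `resolveBaseUrl` are hypothetical, and the sketch spells out the OpenAI URL explicitly, whereas the runtime may simply leave it to the SDK default.

```typescript
// Hypothetical sketch of base URL resolution; not the runtime's actual code.
type Provider = "openai" | "lmstudio" | "ollama";
type Env = Record<string, string | undefined>;

const BASE_URLS: Record<Provider, { defaultUrl: string; overrideVar: string }> = {
  // Assumption: the OpenAI default is written out here for illustration.
  openai: { defaultUrl: "https://api.openai.com/v1", overrideVar: "OPENAI_BASE_URL" },
  lmstudio: { defaultUrl: "http://localhost:1234/v1", overrideVar: "LMSTUDIO_BASE_URL" },
  ollama: { defaultUrl: "http://localhost:11434/v1", overrideVar: "OLLAMA_BASE_URL" },
};

function resolveBaseUrl(provider: Provider, env: Env): string {
  const { defaultUrl, overrideVar } = BASE_URLS[provider];
  // Assumption: an empty or whitespace-only override is treated as unset.
  const override = env[overrideVar]?.trim();
  return override ? override : defaultUrl;
}
```

Pointing at a remote Ollama host would then only require setting OLLAMA_BASE_URL, for example to `http://gpu-box:11434/v1` (a made-up hostname).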
OpenAI
Set OPENAI_API_KEY in .env and you’re set. Common model choices:
| Spec | When to pick it |
|---|---|
| openai/gpt-4.1-nano | Default. Cheapest, fastest, fine for routing and most skills. |
| openai/gpt-4.1-mini | When the nano starts ad-libbing or missing nuance. |
| openai/gpt-4.1 | When the mini still falls short, usually on long context or complex multi-step reasoning. |
Older models work too — anything the OpenAI SDK accepts (openai/gpt-4o, openai/gpt-4o-mini, openai/gpt-3.5-turbo, etc.).
LMStudio
LMStudio runs a local OpenAI-compatible server. Start it, load a model, leave it running on the default port (1234). Then point the runtime at it:
export SKILLET_MODEL_RUNNER=lmstudio/liquid/lfm2-1.2b
npm run dev:chat:todo_list

Or set the model in your AGENTS.md:
---
name: todo_list
description: Todo list manager.
model: lmstudio/liquid/lfm2-1.2b
---

The runtime hits http://localhost:1234/v1 (or LMSTUDIO_BASE_URL if set) and treats the response as if it were OpenAI’s. No API key required. Cost-tracking entries are still written but the dollar amounts will be zero.
Tested LMStudio models include lmstudio/liquid/lfm2-1.2b and lmstudio/google/gemma-3-4b-it. Anything LMStudio loads and serves should work — the model id after lmstudio/ is sent through unchanged.
Ollama
Ollama also exposes an OpenAI-compatible endpoint. Install it, pull a model, and start the server:
ollama pull llama3.2
ollama serve # serves http://localhost:11434

Then point the runtime at it:
export SKILLET_MODEL_RUNNER=ollama/llama3.2
npm run dev:chat:todo_list

Or set the model in your AGENTS.md:
---
name: todo_list
description: Todo list manager.
model: ollama/llama3.2
---

The runtime hits http://localhost:11434/v1 (or OLLAMA_BASE_URL if set). No API key is required: the OpenAI SDK constructor refuses to start without one, so the ollama client ships a harmless placeholder key that Ollama ignores. As with LMStudio, cost-tracking entries are still written but the dollar amounts are zero. The model id after ollama/ (e.g. llama3.2, qwen3) is sent through unchanged.
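The placeholder-key arrangement can be sketched as follows. Everything here is an assumption made for illustration: the function name `apiKeyFor`, the placeholder value, and the choice to give LMStudio the same placeholder as Ollama are not confirmed by the runtime's source.

```typescript
// Hypothetical sketch of API key selection; names and values are illustrative.
type Provider = "openai" | "lmstudio" | "ollama";
type Env = Record<string, string | undefined>;

// Assumption: any non-empty string works, since local servers ignore the key.
const PLACEHOLDER_API_KEY = "not-needed";

function apiKeyFor(provider: Provider, env: Env): string {
  if (provider === "openai") {
    const key = env.OPENAI_API_KEY?.trim();
    if (!key) {
      throw new Error("OPENAI_API_KEY is not set; add it to .env to use the openai provider");
    }
    return key;
  }
  // Local providers: satisfy the SDK constructor with a harmless placeholder.
  return PLACEHOLDER_API_KEY;
}
```

The returned string would be passed as the `apiKey` option of the OpenAI SDK client, alongside the provider's `baseURL`.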
Streaming
Pass -s to chat or run to stream tokens. All three providers support it. The events that come out of the async generator are the same in both modes; streaming just emits text events incrementally as tokens arrive instead of in one chunk at the end.
npx tsx ./src/cli.ts chat -c ./data/skillets/todo_list.skilled_crew.yaml -s

What’s not supported
| Provider | Status |
|---|---|
| Anthropic (Claude) | No native adapter. The anthropic/* provider prefix is not wired; reach Claude through an OpenAI-compatible gateway via the openai provider and OPENAI_BASE_URL. |
| Google Gemini API (generativelanguage.googleapis.com) | No native adapter. Run Gemini models locally through LMStudio with lmstudio/google/<model>, or front the Gemini API with an OpenAI-compatible gateway. |
Every provider must speak the OpenAI API — there is no plugin or config-file mechanism for new backends. Adding one means editing the UtilsAi.SUPPORTED_PROVIDERS enum and the UtilsAi.PROVIDER_CLIENT_CONFIGS client wiring (the base URL, its override env var, and an optional placeholder key); see UtilsAi and UtilsAi.getOpenAiClient.