Models

skillet_agent runs against any OpenAI-compatible endpoint. Three providers are wired today: OpenAI (the default, hosted), LMStudio, and Ollama (both local OpenAI-compatible servers). Models are named with the explicit <provider>/<model> convention so the runtime can route requests without guessing — see UtilsAi.parseProviderModel. The full provider set is the fixed enum UtilsAi.SUPPORTED_PROVIDERS; an unknown provider prefix throws at startup.

Picking a model

The runner looks for a model spec in this order (first hit wins):

1. SKILLET_MODEL_RUNNER env var: global override, full <provider>/<model> form.
2. model: in the agent’s AGENTS.md frontmatter: per-skillet, same format.
3. openai/gpt-4.1-nano: built-in default.

For the eval grader, the equivalent var is SKILLET_MODEL_EVAL (same fallback chain, same default).
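
The lookup order above can be sketched as a small function. This is only an illustration, not the runtime's code: `resolveModelSpec`, its parameters, and the choice to treat an empty value as unset are assumptions, and the environment is passed in as a plain map so the example stays self-contained.

```typescript
// Hypothetical sketch of the lookup order; not the actual runtime implementation.
const DEFAULT_MODEL_SPEC = "openai/gpt-4.1-nano";

type Env = Record<string, string | undefined>;

// envVar is "SKILLET_MODEL_RUNNER" for the runner or "SKILLET_MODEL_EVAL" for the eval grader.
function resolveModelSpec(env: Env, envVar: string, frontmatterModel?: string): string {
  // 1. Global env override (assumption: blank values count as unset).
  const fromEnv = (env[envVar] || "").trim();
  if (fromEnv) return fromEnv;

  // 2. model: from the agent's AGENTS.md frontmatter.
  const fromAgent = (frontmatterModel || "").trim();
  if (fromAgent) return fromAgent;

  // 3. Built-in default.
  return DEFAULT_MODEL_SPEC;
}
```

In a real process the first argument would typically be `process.env`.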

Preflight check

When SKILLET_MODEL_RUNNER is set, chat, run, and eval_run preflight it before the first model call and exit 1 with an actionable message if the provider can’t be reached — no OPENAI_API_KEY for openai, or an unreachable local server for lmstudio / ollama. This fails fast instead of dying deep inside the first request. The probe is a GET <baseURL>/models bounded to 1.5 s; any HTTP response counts as reachable. When the variable is unset, the check is a no-op. See UtilsAi.checkModelRunnerRunnable.
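
A bounded reachability probe of this kind might look roughly like the sketch below. It is not the code behind UtilsAi.checkModelRunnerRunnable: `probeModelsEndpoint` and the `FetchLike` type are hypothetical names, and the fetch function is injected so the sketch can be exercised without a network.

```typescript
// Hypothetical sketch of a time-bounded GET <baseURL>/models probe.
type FetchLike = (
  url: string,
  init: { method: string; signal: AbortSignal },
) => Promise<unknown>;

async function probeModelsEndpoint(
  baseUrl: string,
  fetchLike: FetchLike,
  timeoutMs: number = 1500,
): Promise<boolean> {
  const controller = new AbortController();
  // Abort the request once the time budget is spent.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Strip trailing slashes so the URL never contains "//models".
    const url = `${baseUrl.replace(/\/+$/, "")}/models`;
    await fetchLike(url, { method: "GET", signal: controller.signal });
    // Any HTTP response, including 401 or 404, means the server is reachable.
    return true;
  } catch (err) {
    // Connection refused, DNS failure, or the timeout abort.
    return false;
  } finally {
    clearTimeout(timer);
  }
}
```

On Node 18 or later the global `fetch` can be passed as `fetchLike`; a caller would then print its actionable message and exit with status 1 when the probe returns `false`.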

Model spec format

Every model spec is <provider>/<model>. The string is split on the first /; everything after it is the model id passed verbatim to the OpenAI SDK. That lets LMStudio’s own namespaced ids (like liquid/lfm2-1.2b) ride along:

Spec	Provider	Model id sent to the SDK
`openai/gpt-4.1-nano`	`openai`	`gpt-4.1-nano`
`openai/gpt-4.1`	`openai`	`gpt-4.1`
`lmstudio/liquid/lfm2-1.2b`	`lmstudio`	`liquid/lfm2-1.2b`
`lmstudio/google/gemma-3-4b-it`	`lmstudio`	`google/gemma-3-4b-it`
`ollama/llama3.2`	`ollama`	`llama3.2`
`ollama/qwen3`	`ollama`	`qwen3`

Unknown providers or bare model names (no /) throw at startup with an actionable error.
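
The first-slash split can be illustrated with a short sketch. The real logic lives in UtilsAi.parseProviderModel; the function name `splitModelSpec`, the return shape, and the error wording here are assumptions made for the example.

```typescript
// Hypothetical sketch of first-slash splitting; not the actual UtilsAi code.
const SUPPORTED_PROVIDERS = ["openai", "lmstudio", "ollama"] as const;
type Provider = (typeof SUPPORTED_PROVIDERS)[number];

function splitModelSpec(spec: string): { provider: Provider; model: string } {
  const slash = spec.indexOf("/");
  // Reject bare model names and empty halves such as "openai/" or "/gpt-4.1".
  if (slash <= 0 || slash === spec.length - 1) {
    throw new Error(`Expected <provider>/<model>, got "${spec}"`);
  }
  const provider = spec.slice(0, slash);
  if ((SUPPORTED_PROVIDERS as readonly string[]).indexOf(provider) === -1) {
    throw new Error(
      `Unknown provider "${provider}"; supported: ${SUPPORTED_PROVIDERS.join(", ")}`,
    );
  }
  // Everything after the first slash is the model id, kept verbatim.
  return { provider: provider as Provider, model: spec.slice(slash + 1) };
}
```

Splitting only on the first slash is what lets `lmstudio/liquid/lfm2-1.2b` resolve to provider `lmstudio` with model id `liquid/lfm2-1.2b`.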

Base URLs

Each provider has a default endpoint and an optional override:

Provider	Default base URL	Override env var
`openai`	OpenAI SDK default (`https://api.openai.com/v1`)	`OPENAI_BASE_URL`
`lmstudio`	`http://localhost:1234/v1`	`LMSTUDIO_BASE_URL`
`ollama`	`http://localhost:11434/v1`	`OLLAMA_BASE_URL`

Set the override when you want to point at a proxy, gateway, or a remote LMStudio / Ollama host. All three vars are optional; leave them unset to use the defaults.
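
As a rough sketch of how a default and its override could be combined, the table above maps onto a small lookup. The names `BASE_URLS` and `resolveBaseUrl` are hypothetical, and treating a blank override as unset is an assumption.

```typescript
// Hypothetical sketch; the URLs and env var names come from the table above.
type Env = Record<string, string | undefined>;

const BASE_URLS = {
  openai: { defaultUrl: "https://api.openai.com/v1", overrideVar: "OPENAI_BASE_URL" },
  lmstudio: { defaultUrl: "http://localhost:1234/v1", overrideVar: "LMSTUDIO_BASE_URL" },
  ollama: { defaultUrl: "http://localhost:11434/v1", overrideVar: "OLLAMA_BASE_URL" },
} as const;

function resolveBaseUrl(provider: keyof typeof BASE_URLS, env: Env): string {
  const { defaultUrl, overrideVar } = BASE_URLS[provider];
  // Use the override only when it is set and non-blank (assumption).
  const override = (env[overrideVar] || "").trim();
  return override ? override : defaultUrl;
}
```

For example, setting `OLLAMA_BASE_URL` to the address of a remote host would redirect only the `ollama` provider and leave the other two on their defaults.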

OpenAI

Set OPENAI_API_KEY in .env and you’re ready to go. Common model choices:

Spec	When to pick it
`openai/gpt-4.1-nano`	Default. Cheapest, fastest, fine for routing and most skills.
`openai/gpt-4.1-mini`	When the nano starts ad-libbing or missing nuance.
`openai/gpt-4.1`	When the mini still falls short — usually long context or complex multi-step reasoning.

Older models work too — anything the OpenAI SDK accepts (openai/gpt-4o, openai/gpt-4o-mini, openai/gpt-3.5-turbo, etc.).

LMStudio

LMStudio runs a local OpenAI-compatible server. Start it, load a model, leave it running on the default port (1234). Then point the runtime at it:


export SKILLET_MODEL_RUNNER=lmstudio/liquid/lfm2-1.2b
npm run dev:chat:todo_list

Or set the model in your AGENTS.md:


---
name: todo_list
description: Todo list manager.
model: lmstudio/liquid/lfm2-1.2b
---

The runtime hits http://localhost:1234/v1 (or LMSTUDIO_BASE_URL if set) and treats the response as if it were OpenAI’s. No API key required. Cost-tracking entries are still written but the dollar amounts will be zero.

Tested LMStudio models include lmstudio/liquid/lfm2-1.2b and lmstudio/google/gemma-3-4b-it. Anything LMStudio loads and serves should work — the model id after lmstudio/ is sent through unchanged.

Ollama

Ollama also exposes an OpenAI-compatible endpoint. Install it, pull a model, and start the server:


ollama pull llama3.2
ollama serve          # serves http://localhost:11434

Then point the runtime at it:


export SKILLET_MODEL_RUNNER=ollama/llama3.2
npm run dev:chat:todo_list

Or set the model in your AGENTS.md:


---
name: todo_list
description: Todo list manager.
model: ollama/llama3.2
---

The runtime hits http://localhost:11434/v1 (or OLLAMA_BASE_URL if set). You don’t need to supply an API key. The OpenAI SDK constructor refuses to start without one, so the ollama client ships a harmless placeholder key that Ollama ignores. As with LMStudio, cost-tracking entries are still written but the dollar amounts are zero. The model id after ollama/ (e.g. llama3.2, qwen3) is sent through unchanged.

Streaming

Pass -s to chat or run to stream tokens. All three providers support it. The events that come out of the async generator are the same in both modes; streaming just emits text events incrementally as tokens arrive instead of in one chunk at the end.


npx skilled_crew chat -c ./data/skillets/todo_list.skilled_crew.yaml -s
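
The point that one consumer can handle both modes can be illustrated with a toy async generator. The event shape below (`{ type: "text", text }`) and every function name are hypothetical, since the runtime's real event types are not shown here.

```typescript
// Hypothetical event shape, for illustration only.
type TextEvent = { type: "text"; text: string };

// Non-streaming mode: one text event carrying the whole reply.
async function* replyWhole(): AsyncGenerator<TextEvent> {
  yield { type: "text", text: "Added milk to your list." };
}

// Streaming mode: the same reply split into incremental text events.
async function* replyStreamed(): AsyncGenerator<TextEvent> {
  for (const piece of ["Added ", "milk ", "to your list."]) {
    yield { type: "text", text: piece };
  }
}

// One consumer works for both modes because the event type is identical.
async function collectText(events: AsyncIterable<TextEvent>): Promise<string> {
  let out = "";
  for await (const ev of events) {
    if (ev.type === "text") out += ev.text;
  }
  return out;
}
```

A terminal UI would write each `text` event to the screen as it arrives instead of concatenating, but the loop is otherwise the same.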

What’s not supported

Provider	Status
Anthropic (Claude)	No native adapter. The `anthropic/*` provider prefix is not wired; reach Claude through an OpenAI-compatible gateway via the `openai` provider and `OPENAI_BASE_URL`.
Google Gemini API (`generativelanguage.googleapis.com`)	No native adapter. Run Google’s open Gemma models locally through LMStudio with `lmstudio/google/<model>`, or front the Gemini API with an OpenAI-compatible gateway.

Every provider must speak the OpenAI API — there is no plugin or config-file mechanism for new backends. Adding one means editing the UtilsAi.SUPPORTED_PROVIDERS enum and the UtilsAi.PROVIDER_CLIENT_CONFIGS client wiring (the base URL, its override env var, and an optional placeholder key); see UtilsAi and UtilsAi.getOpenAiClient.
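
To make the two touch points concrete, here is one way the provider list and the client wiring could be shaped. These are hypothetical shapes, not the actual UtilsAi definitions: the field names, the placeholder values, and the placeholder on `lmstudio` are all assumptions (the text above only confirms a placeholder key for `ollama`).

```typescript
// Hypothetical shapes for illustration; the real definitions live in UtilsAi.
const SUPPORTED_PROVIDERS = ["openai", "lmstudio", "ollama"] as const;
type Provider = (typeof SUPPORTED_PROVIDERS)[number];

interface ProviderClientConfig {
  defaultBaseUrl: string;
  baseUrlEnvVar: string;
  // Dummy key for local servers that ignore auth; omit when a real key is needed.
  placeholderApiKey?: string;
}

// Record<Provider, ...> makes the compiler reject a provider that was added
// to the list above without a matching client config here.
const PROVIDER_CLIENT_CONFIGS: Record<Provider, ProviderClientConfig> = {
  openai: {
    defaultBaseUrl: "https://api.openai.com/v1",
    baseUrlEnvVar: "OPENAI_BASE_URL",
  },
  lmstudio: {
    defaultBaseUrl: "http://localhost:1234/v1",
    baseUrlEnvVar: "LMSTUDIO_BASE_URL",
    placeholderApiKey: "lmstudio", // assumption: mirrors the ollama wiring
  },
  ollama: {
    defaultBaseUrl: "http://localhost:11434/v1",
    baseUrlEnvVar: "OLLAMA_BASE_URL",
    placeholderApiKey: "ollama", // made-up value; Ollama ignores the key
  },
};

// Pick the key handed to the SDK constructor: placeholder first, else the real key.
function apiKeyFor(
  provider: Provider,
  env: Record<string, string | undefined>,
): string | undefined {
  const placeholder = PROVIDER_CLIENT_CONFIGS[provider].placeholderApiKey;
  return placeholder !== undefined ? placeholder : env["OPENAI_API_KEY"];
}
```

With this layout, adding a backend is a two-line change to the list plus one new config entry, and a missing entry surfaces as a compile error rather than a runtime failure.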