Add scripts and documentation for local inference configuration with Ollama and mlx-omni-server

- Introduced `configure_local_inference.sh` to set `.dev.vars` automatically based on which local inference service is active (see the detection sketch below).
- Updated `start_inference_server.sh` to support both server types, Ollama and mlx-omni-server.
- Enhanced `package.json` to include new commands for starting and configuring inference servers.
- Updated the README with instructions for running local inference and adding models.
- Minor cleanup in `MessageBubble.tsx`.
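
For reference, a minimal sketch of the detection approach, assuming each service exposes an OpenAI-compatible `/v1` API on its default port and that the app reads an `OPENAI_API_ENDPOINT` value from `.dev.vars`; the ports, probe URLs, and variable name are illustrative assumptions, not the actual contents of `configure_local_inference.sh`:

```bash
#!/usr/bin/env bash
# Sketch only: probe each service's default port and write the matching
# OpenAI-compatible base URL into .dev.vars. Ports and the variable name
# are assumptions, not the repository's actual configuration.
set -euo pipefail

DEV_VARS=".dev.vars"

if curl -sf http://localhost:11434/v1/models >/dev/null; then
  # Ollama answered on its default port
  echo "OPENAI_API_ENDPOINT=http://localhost:11434/v1" > "$DEV_VARS"
  echo "Wrote $DEV_VARS for Ollama"
elif curl -sf http://localhost:10240/v1/models >/dev/null; then
  # mlx-omni-server answered (port assumed)
  echo "OPENAI_API_ENDPOINT=http://localhost:10240/v1" > "$DEV_VARS"
  echo "Wrote $DEV_VARS for mlx-omni-server"
else
  echo "No active local inference service detected" >&2
  exit 1
fi
```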
Author: geoffsee
Date: 2025-06-02 12:38:50 -04:00
Committed by: Geoff Seemueller
Parent: f2d91e2752
Commit: 9e8b427826
5 changed files with 93 additions and 32 deletions

package.json

@@ -19,6 +19,9 @@
"tail:analytics-service": "wrangler tail -c workers/analytics/wrangler-analytics.toml",
"tail:session-proxy": "wrangler tail -c workers/session-proxy/wrangler-session-proxy.toml --env production",
"openai:local": "./scripts/start_inference_server.sh",
"openai:local:mlx": "./scripts/start_inference_server.sh mlx-omni-server",
"openai:local:ollama": "./scripts/start_inference_server.sh ollama",
"openai:local:configure": "scripts/configure_local_inference.sh",
"test": "vitest run",
"test:watch": "vitest",
"test:coverage": "vitest run --coverage.enabled=true"