predict-otron-9001/server.log
geoffsee 719beb3791 (2025-08-27 21:47:31 -04:00):
- Change default server host to localhost for improved security.
- Increase default maximum tokens in CLI configuration to 256.
- Refactor and reorganize CLI.

Compiling inference-engine v0.1.0 (/Users/williamseemueller/workspace/seemueller-io/predict-otron-9000/crates/inference-engine)
warning: unused import: `Config as Config1`
 --> crates/inference-engine/src/model.rs:2:42
  |
2 | use candle_transformers::models::gemma::{Config as Config1, Model as Model1};
  |                                          ^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: unused import: `Config as Config2`
 --> crates/inference-engine/src/model.rs:3:43
  |
3 | use candle_transformers::models::gemma2::{Config as Config2, Model as Model2};
  |                                           ^^^^^^^^^^^^^^^^^

warning: unused import: `Config as Config3`
 --> crates/inference-engine/src/model.rs:4:43
  |
4 | use candle_transformers::models::gemma3::{Config as Config3, Model as Model3};
  |                                           ^^^^^^^^^^^^^^^^^

warning: unused import: `self`
  --> crates/inference-engine/src/server.rs:10:28
   |
10 | use futures_util::stream::{self, Stream};
   |                            ^^^^

warning: `inference-engine` (lib) generated 4 warnings (run `cargo fix --lib -p inference-engine` to apply 4 suggestions)
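The four warnings above all flag unused import aliases. A minimal sketch of the cleaned-up `use` statements, assuming the flagged aliases are genuinely unreferenced elsewhere in these files (only the flagged lines are visible in this log):

// crates/inference-engine/src/model.rs (sketch)
use candle_transformers::models::gemma::Model as Model1;
use candle_transformers::models::gemma2::Model as Model2;
use candle_transformers::models::gemma3::Model as Model3;

// crates/inference-engine/src/server.rs (sketch)
use futures_util::stream::Stream;

Running the suggested `cargo fix --lib -p inference-engine` should apply equivalent edits automatically.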
Compiling predict-otron-9000 v0.1.0 (/Users/williamseemueller/workspace/seemueller-io/predict-otron-9000/crates/predict-otron-9000)
Finished `release` profile [optimized] target(s) in 4.01s
Running `target/release/predict-otron-9000`
2025-08-28T01:43:11.512475Z  INFO predict_otron_9000::middleware::metrics: Performance metrics summary:
avx: false, neon: true, simd128: false, f16c: false
2025-08-28T01:43:11.512811Z  INFO hf_hub: Using token file found "/Users/williamseemueller/.cache/huggingface/token"
retrieved the files in 685.958µs
2025-08-28T01:43:12.661378Z  INFO predict_otron_9000: Unified predict-otron-9000 server listening on 127.0.0.1:8080
2025-08-28T01:43:12.661400Z  INFO predict_otron_9000: Performance metrics tracking enabled - summary logs every 60 seconds
2025-08-28T01:43:12.661403Z  INFO predict_otron_9000: Available endpoints:
2025-08-28T01:43:12.661405Z  INFO predict_otron_9000: GET / - Root endpoint from embeddings-engine
2025-08-28T01:43:12.661407Z  INFO predict_otron_9000: POST /v1/embeddings - Text embeddings
2025-08-28T01:43:12.661409Z  INFO predict_otron_9000: POST /v1/chat/completions - Chat completions
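The endpoint list describes an OpenAI-style REST surface on 127.0.0.1:8080. A hypothetical client call against POST /v1/chat/completions, assuming an OpenAI-compatible JSON body (the model id, prompt, and field values here are illustrative, not taken from this log):

// Sketch only; requires reqwest (with the "blocking" and "json" features) and serde_json.
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let body = json!({
        "model": "gemma-2b-it",  // assumed model id
        "messages": [{ "role": "user", "content": "What is two plus two?" }],
        "max_tokens": 256        // matches the CLI default noted in the commit message
    });

    let response = reqwest::blocking::Client::new()
        .post("http://127.0.0.1:8080/v1/chat/completions")
        .json(&body)
        .send()?;

    println!("{}", response.text()?);
    Ok(())
}

The POST /v1/embeddings route presumably accepts a similar OpenAI-style body, though its exact schema is not shown in this log.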
2025-08-28T01:43:19.166677Z  WARN inference_engine::server: Detected repetition pattern: ' plus' (count: 1)
2025-08-28T01:43:19.296257Z  WARN inference_engine::server: Detected repetition pattern: ' plus' (count: 2)
2025-08-28T01:43:19.424883Z  WARN inference_engine::server: Detected repetition pattern: ' plus' (count: 3)
2025-08-28T01:43:19.554508Z  WARN inference_engine::server: Detected repetition pattern: ' plus' (count: 4)
2025-08-28T01:43:19.683153Z  WARN inference_engine::server: Detected repetition pattern: ' plus' (count: 5)
2025-08-28T01:43:19.683181Z  INFO inference_engine::server: Stopping generation due to excessive repetition
2025-08-28T01:43:19.683221Z  INFO inference_engine::server: Text generation stopped: Repetition detected - stopping generation
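The last seven lines show the generation loop warning on each repeated ' plus' token and stopping once the count reaches 5. A minimal sketch of that kind of guard, assuming a simple consecutive-token counter; the actual inference_engine::server implementation is not visible in this log and may differ:

/// Tracks how many times the most recent token has repeated consecutively
/// and tells the caller to stop once a threshold is reached.
struct RepetitionGuard {
    last_token: Option<String>,
    count: usize,
    max_repeats: usize,
}

impl RepetitionGuard {
    fn new(max_repeats: usize) -> Self {
        Self { last_token: None, count: 0, max_repeats }
    }

    /// Returns true when generation should stop due to excessive repetition.
    fn observe(&mut self, token: &str) -> bool {
        if self.last_token.as_deref() == Some(token) {
            self.count += 1;
            // The real server logs through tracing; eprintln keeps the sketch dependency-free.
            eprintln!("Detected repetition pattern: '{token}' (count: {})", self.count);
        } else {
            self.last_token = Some(token.to_string());
            self.count = 0;
        }
        self.count >= self.max_repeats
    }
}

With max_repeats set to 5, this reproduces the behavior above: five warnings for ' plus', then generation stops with the "Repetition detected" message.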