Hosted API
The GQI hosted API serves the managed models and runs per-dataset fine-tuning for
you. It's what the gqi package talks to in hosted mode, and what
you call directly from any language.
Base URL: https://gqilabs.com/api
Test mode
The API is currently in test mode — payments use Stripe test cards and no real charges are made. Sign in at gqilabs.com to get a key and start experimenting.
Principles
A few decisions shape the API — worth knowing up front:
- Your data stays yours. Each fine-tune trains on the rows you send in that request and keeps only the resulting adapter — we don't store your training data.
- Models are private to your key. A tuned model is (your key, its name); different keys are fully isolated from each other.
- Predictable, not magic. Re-fitting a name replaces it (no hidden versioning), and a fine-tuned model is live the instant its job finishes, with no deploy step.
- You pay for input, not output. Billing is on input tokens; the output is a single calibrated number, so it's free.
Authentication
Every request carries a Bearer API key. Create and manage keys in your account console: key management is tied to your signed-in browser session, so keys can't be minted with a bare API call. The full key is shown once, at creation.
Send it on every call in the Authorization header, in the form Authorization: Bearer followed by your key.
Your key identifies your account and scopes every tuned model and your credit balance to you.
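As a rough illustration of the header described above, here is a minimal Python sketch that builds (but does not send) an authenticated request using only the standard library. The helper name `build_request` is hypothetical and not part of the gqi package; the base URL and the elided `gqi_sk_test_...` placeholder are copied from this page.

```python
import json
import urllib.request

BASE_URL = "https://gqilabs.com/api"  # base URL as given on this page


def build_request(api_key, method, path, body=None):
    """Build (but do not send) an authenticated request. Hypothetical helper."""
    if not api_key:
        raise ValueError("an API key is required on every call")
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # JSON bodies are sent as UTF-8 bytes with a matching Content-Type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Placeholder key, elided exactly as in the curl examples on this page.
req = build_request("gqi_sk_test_...", "GET", "/v1/models")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it runs without network access.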
Two kinds of model
- Base models: shared, pre-trained bases you can use directly or fine-tune on top of. List them with GET /v1/models:

  | tag | notes |
  |---|---|
  | t270m_general_stageB | GQI Lite: fast (the default) |
  | GQI_1B_v0 | GQI Pro: most accurate |
  | GQI_Max_v0 | GQI Max: our most capable model (preview) |

- Tuned models: your own fine-tuned models. Each is a lightweight adapter over a base, named by you and private to your key. The name is your task's identity: reuse it to keep predicting, or to retrain.
The core loop: fit → poll → predict
POST /v1/fit
Starts an asynchronous fine-tune. Pass a model name, your examples X (text)
and targets y (numbers), and optionally a base_model. Returns a job handle.
curl -X POST https://gqilabs.com/api/v1/fit \
-H "Authorization: Bearer gqi_sk_test_..." -H "Content-Type: application/json" \
-d '{
"model": "ticket-priority",
"base_model": "GQI_1B_v0",
"X": ["cannot log in, urgent!", "typo on the about page"],
"y": [9.0, 2.0]
}'
# -> { "job_id": "job_abc123", "model": "ticket-priority", "status": "queued" }
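For callers not using curl, the request body can be assembled in Python first. The sketch below is only illustrative: `build_fit_payload` is a hypothetical helper, and its client-side checks (equal lengths, at least one row) are assumptions about sensible input, not documented server rules. The field names come from the curl example above.

```python
import json


def build_fit_payload(model, X, y, base_model=None):
    """Assemble the JSON body for POST /v1/fit. Hypothetical helper."""
    # Assumed client-side sanity checks; the server's own validation may differ.
    if len(X) != len(y):
        raise ValueError(f"X has {len(X)} rows but y has {len(y)}")
    if not X:
        raise ValueError("at least one example is required")
    payload = {
        "model": model,
        "X": [str(text) for text in X],     # examples are text
        "y": [float(value) for value in y],  # targets are numbers
    }
    if base_model is not None:
        # base_model is optional; omit it to use the default base.
        payload["base_model"] = base_model
    return payload


payload = build_fit_payload(
    "ticket-priority",
    ["cannot log in, urgent!", "typo on the about page"],
    [9, 2],
    base_model="GQI_1B_v0",
)
print(json.dumps(payload, indent=2))
```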
GET /v1/jobs/{id}
Poll until the fit finishes (status goes queued → running → done).
curl https://gqilabs.com/api/v1/jobs/job_abc123 \
-H "Authorization: Bearer gqi_sk_test_..."
# -> { "status": "running", "history": [ { "step": 8, "val_loss": 0.74 }, ... ] }
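The polling step can be sketched as a small loop. This is a minimal sketch under stated assumptions: `wait_for_job` and its `fetch_status` argument are hypothetical (the caller supplies a function that performs GET /v1/jobs/{id} and returns the parsed JSON), and any status other than queued, running, or done is treated as a failure because this page documents only those three.

```python
import time


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job reports "done" and return the final job payload.

    fetch_status: caller-supplied callable, job_id -> dict (parsed job JSON).
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        status = job.get("status")
        if status == "done":
            return job
        if status not in ("queued", "running"):
            # Assumption: only queued/running/done are documented states.
            raise RuntimeError(f"job {job_id} has unexpected status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status} after {timeout}s")
        sleep(interval)


# Offline demonstration: a stand-in fetcher that replays three canned responses.
_responses = iter([{"status": "queued"}, {"status": "running"}, {"status": "done"}])
result = wait_for_job(lambda job_id: next(_responses), "job_abc123", sleep=lambda s: None)
print(result["status"])  # done
```

Injecting `sleep` and `fetch_status` keeps the loop testable without a network or real delays; in real use `fetch_status` would wrap an authenticated HTTP call.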
POST /v1/predict
Predict with your tuned model by name, or with a base_model directly. Every
prediction returns y (the calibrated number — a median over samples) and
y_std (its calibrated uncertainty — a robust spread, with sampler outliers
clipped). num_samples sets how many samples back the estimate (default 32).
# your tuned model
curl -X POST https://gqilabs.com/api/v1/predict \
-H "Authorization: Bearer gqi_sk_test_..." -H "Content-Type: application/json" \
-d '{ "model": "ticket-priority", "X": ["server down in prod!!"] }'
# -> { "y": [8.7], "y_std": [1.1], "served": "adapter" }
# a base model, no fine-tune
curl -X POST https://gqilabs.com/api/v1/predict \
-H "Authorization: Bearer gqi_sk_test_..." -H "Content-Type: application/json" \
-d '{ "base_model": "GQI_1B_v0", "X": ["..."] }'
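The two call shapes above, and the y / y_std pair that comes back, can be handled along these lines. Both helpers are hypothetical. The sketch assumes that num_samples is passed as a field in the request body and that a call names either a tuned model or a base model, not both; neither point is spelled out on this page.

```python
def build_predict_payload(X, model=None, base_model=None, num_samples=None):
    """Assemble the JSON body for POST /v1/predict. Hypothetical helper."""
    if (model is None) == (base_model is None):
        # Assumption: exactly one of the two selectors is given per call.
        raise ValueError("pass exactly one of model or base_model")
    payload = {"X": list(X)}
    if model is not None:
        payload["model"] = model
    else:
        payload["base_model"] = base_model
    if num_samples is not None:
        # Assumption: num_samples is a body field; the server default is 32.
        payload["num_samples"] = int(num_samples)
    return payload


def pair_predictions(X, response):
    """Zip each input text with its estimate y and its uncertainty y_std."""
    y, y_std = response["y"], response["y_std"]
    if not (len(X) == len(y) == len(y_std)):
        raise ValueError("response length does not match the inputs")
    return list(zip(X, y, y_std))


texts = ["server down in prod!!"]
payload = build_predict_payload(texts, model="ticket-priority", num_samples=64)
# Response body copied from the tuned-model example above.
response = {"y": [8.7], "y_std": [1.1], "served": "adapter"}
for text, y, y_std in pair_predictions(texts, response):
    print(f"{text!r}: {y} ± {y_std}")
```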
GET /v1/tuned · DELETE /v1/tuned/{name}
List or delete your tuned models.
curl https://gqilabs.com/api/v1/tuned -H "Authorization: Bearer gqi_sk_test_..."
# -> { "models": [ { "name": "ticket-priority", "base_model": "GQI_1B_v0",
# "status": "ready", "n_examples": 340, "updated": 1719... } ] }
Updating a model with new data
Re-run POST /v1/fit with the same model name to retrain it. GQI trains a
fresh adapter on exactly the data you send — so to train on more data, send
the full (cumulative) set.
We don't store your training data
Each fit trains on the rows in that request and keeps only the resulting
adapter (a few MB). You own your data — send it when you want to (re)train. A
future continue mode will warm-start from your existing adapter for cheap
incremental updates; it isn't available yet.
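Because each fit sees only the rows in its own request, the cumulative set has to live on your side. One possible arrangement, sketched here with a local JSON Lines file, is to append new rows as they arrive and rebuild the full payload before each re-fit. The file layout, the helper names, and the extra example row with target 10.0 are illustrative assumptions, not part of the API.

```python
import json
import os
import tempfile


def append_examples(path, X, y):
    """Append new (text, target) rows to a local JSON Lines store."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    with open(path, "a", encoding="utf-8") as f:
        for text, target in zip(X, y):
            f.write(json.dumps({"x": text, "y": float(target)}) + "\n")


def cumulative_fit_payload(path, model, base_model=None):
    """Read every stored row and build the body for a full re-fit of `model`."""
    X, y = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            X.append(row["x"])
            y.append(row["y"])
    payload = {"model": model, "X": X, "y": y}
    if base_model is not None:
        payload["base_model"] = base_model
    return payload


# Demonstration in a temporary directory; the third row is made-up sample data.
store = os.path.join(tempfile.mkdtemp(), "ticket-priority.jsonl")
append_examples(store, ["cannot log in, urgent!", "typo on the about page"], [9.0, 2.0])
append_examples(store, ["server down in prod!!"], [10.0])  # data arriving later
payload = cumulative_fit_payload(store, "ticket-priority", base_model="GQI_1B_v0")
print(len(payload["X"]))  # 3
```

Posting this payload to /v1/fit under the same model name would then retrain on all three rows rather than only the newest one.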
Design intent, so it's predictable:
- A tuned model is (your key, name). Same key + same name → the same model. Different keys are fully isolated.
- Re-fit replaces. Fitting a name again trains a new adapter and points the name at it (no silent versioning).
- Live immediately. Once a job is done, predict against the name right away; there is no separate deploy step.
Credits & usage
Requests meter against prepaid credits ($30 free each month, no card
required). Billing is on input tokens — the output is one number, so it's
free. See
pricing. If you run out, calls return 402; top
up in the console.
| Method & path | Purpose |
|---|---|
| GET /v1/billing/balance | Your credit balance |
| POST /v1/billing/checkout | Add credits (Stripe) → checkout URL |
| GET /v1/usage | Spend this month + recent activity |
Endpoint summary
| Method & path | Purpose |
|---|---|
| POST /v1/keys | Create an API key (shown once; console session only) |
| GET /v1/models | List base models |
| POST /v1/fit | Start a fine-tune (async) → job handle |
| GET /v1/jobs/{id} | Poll a fit job |
| POST /v1/predict | Predict with a tuned model or a base_model |
| GET /v1/tuned · DELETE /v1/tuned/{name} | List / delete your tuned models |
| GET /v1/billing/balance · GET /v1/usage · POST /v1/billing/checkout | Credits & usage |