Adaptive verification in fifteen minutes
Your production integration should converge on one call:
POST /v1/voice/verify. Enrollment is a prerequisite (covered below). BI-style telemetry dashboards live in the Lab; they stay separate from verification itself.
Obtain your API credentials
Prefer the Developer Console after registration. For ephemeral testing in this browser session, mint a key via the
Evaluation Console or POST /v1/keys/demo when your operator permits it.
Enroll users once
/create-profile + /enroll build the biometric template your later verify calls authenticate against.
Verify every risky action server-side
Call POST /v1/voice/verify — see First verification for multipart fields.
Observe adaptive learning
Read learning_progress, thresholds, and reason codes from the verify response. Optionally drill deeper via GET /v1/user-learning/:user_id inside the Evaluation Console.
curl -sS -X POST https://your-instance.com/v1/voice/verify \
  -H "Authorization: Bearer YOUR_KEY" \
  -F "user_id=alice" \
  -F "phrase=silver bridge 19" \
  -F "audio=@voice.wav"
Authenticate REST calls
Protected routes expect either x-api-key: YOUR_KEY or Authorization: Bearer YOUR_KEY. Never ship production keys inside browser bundles—the landing widget is a carve-out UX demo handled separately.
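Both header forms can be produced by a small server-side helper; a minimal sketch (the helper name is ours, not part of the API):

```javascript
// Build auth headers for server-side calls to protected routes.
// Pass bearer: false to use the x-api-key form instead.
function authHeaders(apiKey, { bearer = true } = {}) {
  return bearer
    ? { Authorization: `Bearer ${apiKey}` }
    : { "x-api-key": apiKey };
}
```

Keep the key in server configuration and attach these headers from your backend, never from browser code.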
Prime the profile before verifying
POST /v1/voice/verify rejects unknown users (profile_not_found) until the records below exist under the same API key:
POST /create-profile with { "user_id": "alice" } — allocates the local bookkeeping row.
POST /enroll with multipart user_id + WAV audio — builds the biometric template the verifier references.
curl -sS -X POST https://your-instance.com/create-profile \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id":"alice"}'

curl -sS -X POST https://your-instance.com/enroll \
  -H "Authorization: Bearer YOUR_KEY" \
  -F "user_id=alice" \
  -F "audio=@alice.wav"
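The enrollment request can also be assembled server-side in Node 18+ using the global FormData and Blob; a minimal sketch (the function name and placeholder bytes are ours):

```javascript
// Build the multipart body for POST /enroll (Node 18+ globals).
// wavBytes is a placeholder for real WAV file contents.
function enrollBody(userId, wavBytes) {
  const form = new FormData();
  form.append("user_id", userId);
  form.append("audio", new Blob([wavBytes], { type: "audio/wav" }), "voice.wav");
  return form;
}
```

Pass the result as the body of a fetch call to /enroll along with your auth header; fetch sets the multipart boundary for you.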
POST /v1/voice/verify
Multipart WAV only. Typical fields: user_id, phrase, and audio (the WAV file), as in the quickstart example above.
Returns accept / retry / reject with rich diagnostics—use this everywhere you integrate sensitive actions behind voice.
/v1 response surface
The verify route fuses multiple biometric signals into one auditable verdict.
{
"decision": "accept",
"confidence": 0.89,
"signals": {
"voice": "high",
"phrase": "high",
"liveness": "high"
},
"learning_progress": 0.42
}
Each signals.* tier is low, medium, or high. Larger payloads expose calibration, spoof labels, thresholds, retries, and explanatory insights.
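A response shaped like the example above can be routed to an application action; a minimal sketch (the action labels are ours, the decision values come from the doc):

```javascript
// Map a /v1/voice/verify response to an application action.
function routeDecision(res) {
  switch (res.decision) {
    case "accept": return "proceed";       // let the sensitive action run
    case "retry":  return "prompt_again";  // ask the user to speak again
    case "reject": return "block";         // deny and log
    default:       return "block";         // fail closed on unknown decisions
  }
}
```

Failing closed on unknown decision values keeps new server-side verdicts from silently authorizing actions.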
GET /v1/usage
Poll per-key aggregates (totals, success/reject mixes, throttle windows, reason histograms). Mirrors what the Evaluation Console surfaces for sandbox keys—and what the Developer Console reflects for SaaS-backed keys.
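A polled usage payload can feed simple derived metrics; a minimal sketch in which the field names (total, rejects) are hypothetical — check your instance's actual schema:

```javascript
// Derive a reject rate from a usage aggregate.
// Field names are illustrative, not a documented schema.
function rejectRate(usage) {
  return usage.total ? usage.rejects / usage.total : 0;
}
```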
Handle failure modes
Plan for these common responses.
{
"decision": "reject",
"confidence": 0.41,
"reason": "voice_mismatch"
}
{
"decision": "cooldown",
"reason": "cooldown",
"retryAfterMs": 3000
}
{
"error": "profile_not_found",
"message": "User must enroll before verification"
}
{
"error": "rate_limited",
"retryAfterMs": 42000,
"scope": "minute"
}
Advanced embedded SDK flow — not the default (/start-auth, /verify-voice, /verify-token)
Embedded SDK · optional
For most teams the primary path stays POST /v1/voice/verify. Use the helpers below only when you want an opinionated browser flow that emits short-lived JWTs via /verify-voice + /verify-token.
The Node.js middleware coordinates sessions/tokens while the Python microservice scores biometrics—you still authenticate every call with your API key.
SDK challenge round-trip
Six steps—the first two overlap with prerequisite enrollment documented above.
Create profile
Register the user in the system with a unique user_id. This creates a placeholder for their voice profile.
Enroll voice
Record the user reading a sentence. Submit the WAV audio to build their voice profile. Enrollment is a one-time step per user.
Start authorization session
Request a session for the user. The server returns a unique session_id and a challenge phrase. Sessions expire automatically.
User speaks the challenge
Display the challenge phrase in your UI. Record the user speaking it. Keep the audio file ready to submit.
Verify voice
Submit the audio with the session_id and user_id. The system verifies voice identity, phrase match, and liveness in real time.
Receive and validate token
On success, a voiceToken (JWT) is returned. Your backend validates this token via /verify-token to complete the authorization.
Session + enrollment endpoints
Authenticated routes accept x-api-key or Authorization: Bearer with your key. Unauthenticated routes should be disabled in production deployments (for example public key minting). Audio must be multipart/form-data WAV.
POST /create-profile: Register a new user in the system. Must be called before enrollment.
POST /enroll: Build a voice profile for the user from a recording. Submit audio as a WAV file alongside user_id.
POST /start-auth: Create an authorization session. Returns a session_id, a challenge phrase, and an expiresAt timestamp. Sessions are single-use and expire automatically.
POST /verify-voice: Verify a user's voice against the active session. Runs speaker recognition, phrase matching, and liveness detection. Returns a voiceToken on success.
POST /verify-token: Validate a voice token issued by /verify-voice. Tokens are single-use — each token is revoked after the first successful validation call.
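The single-use semantics described above can be illustrated with an in-memory revocation set; this is a sketch of the behavior, not the service's implementation:

```javascript
// Illustration of single-use token semantics: a token validates once,
// then every later attempt is refused. The Set is ours, for demonstration.
const revoked = new Set();
function validateOnce(token) {
  if (revoked.has(token)) return { valid: false }; // already consumed
  revoked.add(token);                              // revoke on first use
  return { valid: true };
}
```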
SDK usage (advanced)
The Voice Authorize SDK handles the browser-side recording, encoding, and API communication. Drop in one script tag and call a single method.
<!-- 1. Load the SDK -->
<script src="voiceauthorize.js"></script>

<!-- 2. Initialize once with your API key -->
<script>
  VoiceAuthorize.init({
    baseUrl: "https://your-instance.com",
    apiKey: "your_api_key"
  });
</script>

<!-- 3. Create a profile (once per user) -->
<script>
  await VoiceAuthorize.createProfile({ user_id: "user_123" });
</script>

<!-- 4. Enroll (once per user) -->
<script>
  await VoiceAuthorize.enroll({ user_id: "user_123", audio: wavBlob });
</script>

<!-- 5. Start auth session and get challenge -->
<script>
  const session = await VoiceAuthorize.startAuth({ user_id: "user_123" });
  showChallenge(session.challenge); // display to user
</script>

<!-- 6. Submit voice and receive token -->
<script>
  const result = await VoiceAuthorize.sendVoice({
    user_id: "user_123",
    session_id: session.session_id,
    audio: wavBlob
  });
  if (result.success) {
    authorizeAction(result.voiceToken);
  }
</script>
Embedded flow response examples
All responses are JSON. On success, /verify-voice returns a voiceToken JWT. On failure, a structured error with a machine-readable code is returned.
{
"success": true,
"decision": "accept",
"voiceToken": "eyJhbGci...",
"user_id": "user_123"
}
{
"success": false,
"decision": "reject",
"code": "voice_rejected",
"reason": "low_similarity"
}
Token verification (/verify-token) response on success:
{
"valid": true,
"user_id": "user_123",
"iat": 1746452100
}
The decision field in /verify-voice responses takes values such as accept (success) and reject (failure), as in the examples above.
Built for real-world conditions
Voice verification is the product; dashboards are optional accessories.
147 tests across 17 suites
Automated coverage across middleware, routes, and integration paths so regressions surface before your integration does.
Multi-signal verification
Voice similarity, phrase alignment, liveness, and synthetic detection are fused into a single decision.
Replay defenses
Short TTLs, single-use tokens where applicable, and liveness gates reduce replay abuse.
Structured failure feedback
reason codes such as voice_mismatch translate straight into UX retries and analytics bucketing.
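The translation from reason codes to retry UX can live in one lookup; a minimal sketch in which the message strings are ours and the codes come from the responses above:

```javascript
// Map machine-readable reason codes to user-facing retry guidance.
const reasonMessages = {
  voice_mismatch: "We couldn't match your voice. Please try again.",
  cooldown: "Too many attempts. Please wait a moment and retry.",
  profile_not_found: "Please enroll your voice before verifying."
};
function uxMessage(reason) {
  // Unknown codes fall back to a generic retry prompt.
  return reasonMessages[reason] ?? "Verification failed. Please try again.";
}
```

The same codes can double as analytics bucket keys, so UX copy and dashboards stay in sync.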
Security notes
Defense-in-depth applies to /v1 first. Items below about voice JWTs matter primarily for the optional embedded SDK path.
Single-use voice tokens
Every voiceToken is a signed JWT that is revoked immediately upon its first use. Token replay attacks are blocked at the API level.
Sessions expire automatically
Authorization sessions have a short TTL. Expired sessions are swept from memory. Users cannot replay old sessions to authorize new actions.
Replay audio protection
Liveness detection in the Python service rejects pre-recorded audio replays. Voice biometrics require live speech — a recording of the user's voice will not pass.
Multi-signal verification
Three independent checks must pass: speaker similarity, phrase transcript match, and liveness score. A single failing signal blocks authorization.
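The all-signals gate can be sketched against the signals tiers shown earlier; treating low as the failing tier is our assumption, not a documented rule:

```javascript
// One failing signal blocks authorization. We assume "low" fails
// and "medium"/"high" pass; confirm against your instance's thresholds.
function signalsPass(signals) {
  return ["voice", "phrase", "liveness"].every(
    (k) => signals[k] === "high" || signals[k] === "medium"
  );
}
```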
API key authentication
Every API call requires a server-side API key. Client-side calls should proxy through your backend. Never expose your API key in browser code.