Adaptive verification in fifteen minutes
Your production integration should converge on one call:
POST /v1/voice/verify. Enrollment is a prerequisite (covered below). BI-style telemetry dashboards live in the Lab; they stay separate from verification itself.
Obtain your API credentials
Prefer the Developer Console after registration. For ephemeral testing in this browser session, mint a key via the
Evaluation Console or POST /v1/keys/demo when your operator permits it.
Enroll users once
/create-profile + /enroll build the biometric template your later verify calls authenticate against.
Verify every risky action server-side
Call POST /v1/voice/verify — see First verification for multipart fields.
Observe adaptive learning
Read learning_progress, thresholds, and reason codes from the verify response. Optionally drill deeper via GET /v1/user-learning/:user_id inside the Evaluation Console.
curl -sS -X POST https://your-instance.com/v1/voice/verify \
  -H "Authorization: Bearer YOUR_KEY" \
  -F "user_id=alice" \
  -F "phrase=silver bridge 19" \
  -F "audio=@voice.wav"
Authenticate REST calls
Protected routes expect either x-api-key: YOUR_KEY or Authorization: Bearer YOUR_KEY. Never ship production keys inside browser bundles—the landing widget is a carve-out UX demo handled separately.
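Both header forms can be produced by a small server-side helper; a minimal sketch (the helper name is ours, not part of the API):

```javascript
// Build auth headers for server-side calls to protected routes.
// Pass bearer: false to use the x-api-key form instead.
function authHeaders(apiKey, { bearer = true } = {}) {
  return bearer
    ? { Authorization: `Bearer ${apiKey}` }
    : { "x-api-key": apiKey };
}
```

Keep the key in server configuration and attach these headers from your backend, never from browser code.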
Prime the profile before verifying
POST /v1/voice/verify rejects unknown users (profile_not_found) until the records below exist under the same API key:
POST /create-profile with { "user_id": "alice" } — allocates the local bookkeeping row.
POST /enroll with multipart user_id + WAV audio — builds the biometric template the verifier references.
curl -sS -X POST https://your-instance.com/create-profile \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id":"alice"}'

curl -sS -X POST https://your-instance.com/enroll \
  -H "Authorization: Bearer YOUR_KEY" \
  -F "user_id=alice" \
  -F "audio=@alice.wav"
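The enrollment request can also be assembled server-side in Node 18+ using the global FormData and Blob; a minimal sketch (the function name and placeholder bytes are ours):

```javascript
// Build the multipart body for POST /enroll (Node 18+ globals).
// wavBytes is a placeholder for real WAV file contents.
function enrollBody(userId, wavBytes) {
  const form = new FormData();
  form.append("user_id", userId);
  form.append("audio", new Blob([wavBytes], { type: "audio/wav" }), "voice.wav");
  return form;
}
```

Pass the result as the body of a fetch call to /enroll along with your auth header; fetch sets the multipart boundary for you.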
POST /v1/voice/verify
Multipart WAV only. Typical fields: user_id, phrase, and audio (the WAV file), as in the quickstart example above.
Returns accept / retry / reject with rich diagnostics—use this everywhere you integrate sensitive actions behind voice.
/v1 response surface
The verify route fuses multiple biometric signals into one auditable verdict.
{
"decision": "accept",
"confidence": 0.89,
"signals": {
"voice": "high",
"phrase": "high",
"liveness": "high"
},
"learning_progress": 0.42
}
Each signals.* tier is low, medium, or high. Larger payloads expose calibration, spoof labels, thresholds, retries, and explanatory insights.
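A response shaped like the example above can be routed to an application action; a minimal sketch (the action labels are ours, the decision values come from the doc):

```javascript
// Map a /v1/voice/verify response to an application action.
function routeDecision(res) {
  switch (res.decision) {
    case "accept": return "proceed";       // let the sensitive action run
    case "retry":  return "prompt_again";  // ask the user to speak again
    case "reject": return "block";         // deny and log
    default:       return "block";         // fail closed on unknown decisions
  }
}
```

Failing closed on unknown decision values keeps new server-side verdicts from silently authorizing actions.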
GET /v1/usage
Poll per-key aggregates (totals, success/reject mixes, throttle windows, reason histograms). Mirrors what the Evaluation Console surfaces for sandbox keys—and what the Developer Console reflects for SaaS-backed keys.
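A polled usage payload can feed simple derived metrics; a minimal sketch in which the field names (total, rejects) are hypothetical — check your instance's actual schema:

```javascript
// Derive a reject rate from a usage aggregate.
// Field names are illustrative, not a documented schema.
function rejectRate(usage) {
  return usage.total ? usage.rejects / usage.total : 0;
}
```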
Handle failure modes
Plan for these common responses.
{
"decision": "reject",
"confidence": 0.41,
"reason": "voice_mismatch"
}
{
"decision": "cooldown",
"reason": "cooldown",
"retryAfterMs": 3000
}
{
"error": "profile_not_found",
"message": "User must enroll before verification"
}
{
"error": "rate_limited",
"retryAfterMs": 42000,
"scope": "minute"
}
Advanced embedded SDK flow — not the default (/start-auth, /verify-voice, /verify-token)
Embedded SDK · optional
For most teams the primary path stays POST /v1/voice/verify. Use the helpers below only when you want an opinionated browser flow that emits short-lived JWTs via /verify-voice + /verify-token.
The Node.js middleware coordinates sessions/tokens while the Python microservice scores biometrics—you still authenticate every call with your API key.
SDK challenge round-trip
Six steps—the first two overlap with prerequisite enrollment documented above.
Create profile
Register the user in the system with a unique user_id. This creates a placeholder for their voice profile.
Enroll voice
Record the user reading a sentence. Submit the WAV audio to build their voice profile. Enrollment is a one-time step per user.
Start authorization session
Request a session for the user. The server returns a unique session_id and a challenge phrase. Sessions expire automatically.
User speaks the challenge
Display the challenge phrase in your UI. Record the user speaking it. Keep the audio file ready to submit.
Verify voice
Submit the audio with the session_id and user_id. The system verifies voice identity, phrase match, and liveness in real time.
Receive and validate token
On success, a voiceToken (JWT) is returned. Your backend validates this token via /verify-token to complete the authorization.
Session + enrollment endpoints
Authenticated routes accept x-api-key or Authorization: Bearer with your key. Unauthenticated routes should be disabled in production deployments (for example public key minting). Audio must be multipart/form-data WAV.
POST /create-profile: Register a new user in the system. Must be called before enrollment.
POST /enroll: Build a voice profile for the user from a recording. Submit audio as a WAV file alongside user_id.
POST /start-auth: Create an authorization session. Returns a session_id, a challenge phrase, and an expiresAt timestamp. Sessions are single-use and expire automatically.
POST /verify-voice: Verify a user's voice against the active session. Runs speaker recognition, phrase matching, and liveness detection. Returns a voiceToken on success.
POST /verify-token: Validate a voice token issued by /verify-voice. Tokens are single-use — each token is revoked after the first successful validation call.
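The single-use semantics described above can be illustrated with an in-memory revocation set; this is a sketch of the behavior, not the service's implementation:

```javascript
// Illustration of single-use token semantics: a token validates once,
// then every later attempt is refused. The Set is ours, for demonstration.
const revoked = new Set();
function validateOnce(token) {
  if (revoked.has(token)) return { valid: false }; // already consumed
  revoked.add(token);                              // revoke on first use
  return { valid: true };
}
```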
SDK usage (advanced)
The Voice Authorize SDK handles the browser-side recording, encoding, and API communication. Drop in one script tag and call a single method.
<!-- 1. Load the SDK -->
<script src="voiceauthorize.js"></script>

<!-- 2. Initialize once with your API key -->
<script>
  VoiceAuthorize.init({
    baseUrl: "https://your-instance.com",
    apiKey: "your_api_key"
  });
</script>

<!-- 3. Create a profile (once per user) -->
<script>
  await VoiceAuthorize.createProfile({ user_id: "user_123" });
</script>

<!-- 4. Enroll (once per user) -->
<script>
  await VoiceAuthorize.enroll({ user_id: "user_123", audio: wavBlob });
</script>

<!-- 5. Start auth session and get challenge -->
<script>
  const session = await VoiceAuthorize.startAuth({ user_id: "user_123" });
  showChallenge(session.challenge); // display to user
</script>

<!-- 6. Submit voice and receive token -->
<script>
  const result = await VoiceAuthorize.sendVoice({
    user_id: "user_123",
    session_id: session.session_id,
    audio: wavBlob
  });
  if (result.success) {
    authorizeAction(result.voiceToken);
  }
</script>
Embedded flow response examples
All responses are JSON. On success, /verify-voice returns a voiceToken JWT. On failure, a structured error with a machine-readable code is returned.
{
"success": true,
"decision": "accept",
"voiceToken": "eyJhbGci...",
"user_id": "user_123"
}
{
"success": false,
"decision": "reject",
"code": "voice_rejected",
"reason": "low_similarity"
}
Token verification (/verify-token) response on success:
{
"valid": true,
"user_id": "user_123",
"iat": 1746452100
}
The decision field in /verify-voice responses takes values such as accept (success) and reject (failure), as in the examples above.
Built for real-world conditions
Voice verification is the product; dashboards are optional accessories.
147 tests across 17 suites
Automated coverage across middleware, routes, and integration paths so regressions surface before your integration does.
Multi-signal verification
Voice similarity, phrase alignment, liveness, and synthetic detection are fused into a single decision.
Replay defenses
Short TTLs, single-use tokens where applicable, and liveness gates reduce replay abuse.
Structured failure feedback
reason codes such as voice_mismatch translate straight into UX retries and analytics bucketing.
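The translation from reason codes to retry UX can live in one lookup; a minimal sketch in which the message strings are ours and the codes come from the responses above:

```javascript
// Map machine-readable reason codes to user-facing retry guidance.
const reasonMessages = {
  voice_mismatch: "We couldn't match your voice. Please try again.",
  cooldown: "Too many attempts. Please wait a moment and retry.",
  profile_not_found: "Please enroll your voice before verifying."
};
function uxMessage(reason) {
  // Unknown codes fall back to a generic retry prompt.
  return reasonMessages[reason] ?? "Verification failed. Please try again.";
}
```

The same codes can double as analytics bucket keys, so UX copy and dashboards stay in sync.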
Security notes
Defense-in-depth applies to /v1 first. Items below about voice JWTs matter primarily for the optional embedded SDK path.
Single-use voice tokens
Every voiceToken is a signed JWT that is revoked immediately upon its first use. Token replay attacks are blocked at the API level.
Sessions expire automatically
Authorization sessions have a short TTL. Expired sessions are swept from memory. Users cannot replay old sessions to authorize new actions.
Replay audio protection
Liveness detection in the Python service rejects pre-recorded audio replays. Voice biometrics require live speech — a recording of the user's voice will not pass.
Multi-signal verification
Three independent checks must pass: speaker similarity, phrase transcript match, and liveness score. A single failing signal blocks authorization.
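The all-signals gate can be sketched against the signals tiers shown earlier; treating low as the failing tier is our assumption, not a documented rule:

```javascript
// One failing signal blocks authorization. We assume "low" fails
// and "medium"/"high" pass; confirm against your instance's thresholds.
function signalsPass(signals) {
  return ["voice", "phrase", "liveness"].every(
    (k) => signals[k] === "high" || signals[k] === "medium"
  );
}
```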
API key authentication
Every API call requires a server-side API key. Client-side calls should proxy through your backend. Never expose your API key in browser code.