v32.7
Mar 17, 2026 · 3:18 PM EST
UPDATE
Voice test results table redesign
- Redesigned voice test results from pills to proper table with Score/Grade/Top Issue columns
- Removed OpenRouter API key gate from voice analysis functions
- Audio player and action buttons in expandable rows
v32.6
Mar 17, 2026 · 3:01 PM EST
FIX
Fix login redirect
- Login form now uses fetch instead of native POST
- Users redirected to dashboard on success instead of raw JSON
- Inline error display on failed login
v32.5
Mar 13, 2026 · 1:21 PM EST
FIX
Fix Sentry cron monitor flush timing
- Move Sentry.flush after withMonitor returns in cron.js and email-triggers.js
v32.4
Mar 13, 2026 · 11:59 AM EST
FIX
Cron failure fixes
- Fix auth mismatch on email-triggers call, fix duplicate headers dropping Authorization, add Sentry.flush before returns, increase cron maxDuration to 300s
v32.3
Mar 13, 2026 · 11:55 AM EST
UPDATE
Sentry Next.js migration
- Migrated from @sentry/browser+node to @sentry/nextjs, added instrumentation.ts, sentry configs, global-error boundary, cron monitoring, ownership routing, Session Replay, Browser Profiling, Console Capture
v33.6
Feb 21, 2026 · 2:50 AM EST
FIX
CRON_SECRET rotation + readonly fix
- Rotated exposed CRON_SECRET, fixed readonly variable errors across 13 scripts
v33.5
Feb 21, 2026 · 2:42 AM EST
FIX
Security audit — all findings resolved
- Stripe replay protection, crypto IDs, JSON hardening, admin error masking, rate limit fail-closed
v33.4
Feb 21, 2026 · 2:34 AM EST
FIX
Security audit fixes
- CRON_SECRET exposure fix, remove query-param auth, scrypt password reset, XSS fix, Google OAuth fix, SSRF block
v33.2
Feb 20, 2026 · 8:18 PM EST
FIX
Fix critical auth bypass
- Fixed 49 missing await on async auth calls across 35 API files
- Auth was effectively bypassed — Promise always truthy without await
- Increased deploy script verification timeout from 24s to 60s
- Updated deploy URLs to production domains
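The failure mode behind this fix can be reproduced in a few lines. The sketch below is illustrative only: `isAuthenticated`, the token value, and both handler names are hypothetical stand-ins, not the project's actual API code.

```typescript
// Hypothetical async auth check; a real one would consult a session store.
async function isAuthenticated(token: string): Promise<boolean> {
  return token === "valid-token";
}

// Buggy: without `await`, `ok` holds a Promise object, which is always truthy.
async function handlerBuggy(token: string): Promise<number> {
  const ok: unknown = isAuthenticated(token);
  return ok ? 200 : 401;
}

// Fixed: awaiting resolves the Promise to the actual boolean.
async function handlerFixed(token: string): Promise<number> {
  const ok = await isAuthenticated(token);
  return ok ? 200 : 401;
}
```

With a wrong token, `handlerBuggy` still resolves to 200 while `handlerFixed` resolves to 401.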
v33.1
Feb 20, 2026 · 7:54 PM EST
FIX
Algo bug fixes + learning system wired
- Fix audio_quality key mismatch, getInitialBayesian args, silence_handling hint, graduated speed bonus, continuous Bayesian, repetition threshold, turns softening, consensus fallback, requires_review flag, auto-observe wired into scoring pipeline
v33.0
Feb 16, 2026 · 3:30 PM EST
MAJOR
Algorithm Overhaul (Wave 1)
- Accuracy dimension, aggregation strategies, speed bonus gate, weight profiles, semantic repetition detection
v32.2
Feb 16, 2026 · 3:06 PM EST
UPDATE
Architecture refactoring — DRY shared modules
- Extract 4 shared libs, rewrite score.js, remove 3K dead code, update 15 API files
v32.1
Feb 15, 2026 · 10:00 PM EST
UPDATE
Production-grade security hardening
- P1-P3 long-term fixes: report expiry + signed tokens, timingSafeEqual hardening, removed ALL query-param auth, opaque session tokens, versioned password hashes, timing-safe legacy comparison
v32.0
Feb 15, 2026 · 9:32 PM EST
MAJOR
Security hardening (P0-P2)
- Remove hard-coded secrets, upgrade password hashing SHA-256→scrypt, add timing-safe HMAC, restrict CORS, move secrets from URL to headers, require all env vars, fix report auth + IDs
v31.6
Feb 15, 2026 · 7:51 PM EST
FIX
Codex non-blocking + exit bug fix
- Codex review is now advisory (never blocks)
- Fixed exit code parsing bug in codex-review.sh
- Better resilience with graceful degradation
v31.5
Feb 15, 2026 · 7:45 PM EST
FIX
Badge endpoint fix
- Added missing CORS import
v31.4
Feb 15, 2026 · 7:40 PM EST
UPDATE
Performance & reliability fixes
- Bulk INSERT for metrics (10-20x faster), parallel batch scoring (5x faster), session TTL auto-cleanup, centralized config
v31.3
Feb 15, 2026 · 7:01 PM EST
UPDATE
Security fixes
- parseJSON error handling + secure CORS policy
v31.2
Feb 16, 2026
FIX
Critical Security Fixes (Codex Audit)
- [P0] Fixed auth bypass in report creation — missing await on isAuthenticated()
- [P2] KV fail-closed — assertKvConfigured() prevents sessions without persistence
- [P2] Added Vary: Origin CORS header to prevent cache poisoning
- [P3] Timing-safe share token comparison with hex validation
- [P3] Scrypt param validation — N power-of-2, r/p positive integers
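A minimal sketch of the two validation checks named above, assuming plain numeric parameters and a fixed expected token length. All function names are hypothetical, and the requirement that N be greater than 1 comes from the scrypt specification rather than from this changelog. The timing-safe comparison itself is not shown.

```typescript
// True when n is a power of two greater than 1 (scrypt's cost parameter N).
function isPowerOfTwo(n: number): boolean {
  if (!Number.isSafeInteger(n) || n < 2) return false;
  let v = n;
  while (v % 2 === 0) v /= 2; // strip factors of two
  return v === 1;
}

function isPositiveInt(n: number): boolean {
  return Number.isSafeInteger(n) && n > 0;
}

// N must be a power of two; r and p must be positive integers.
function validateScryptParams(N: number, r: number, p: number): boolean {
  return isPowerOfTwo(N) && isPositiveInt(r) && isPositiveInt(p);
}

// Reject anything that is not hex of the expected length before comparing tokens.
function isHexToken(token: string, expectedLength: number): boolean {
  return token.length === expectedLength && /^[0-9a-f]+$/i.test(token);
}
```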
v31.1
Feb 14, 2026 · 11:24 PM EST
FIX
Revert Simple Mode
- Revert Simple Mode UI (broken test runner)
- Fix voice test null innerHTML crash
- Restore original tab navigation
v31.0
Feb 14, 2026 · 11:12 PM EST
MAJOR
Simple Mode UI
- Add idiot-proof 3-screen UI as default
- Agent picker with score cards
- Live testing progress screen
- Results with auto-fix buttons
- Advanced mode toggle
- Fix version badge sync
v30.0
Feb 14, 2026 · 9:39 PM EST
MAJOR
Fleet Intelligence + Self-Healing Agents
- Real-time webhook scoring, multi-model consensus, audio-native analysis, adversarial red-team testing, voice CI/CD testing, fleet intelligence, predictive scoring, multi-language support, real-time alerting, self-healing agent loop
v29.4
Feb 14, 2026 · 8:49 PM EST
FIX
Fix corrupted version strings and deploy script
- Clean all corrupted version strings (v29.3.3.2.1.1.1.0.0 → v29.4)
- Add APP_VERSION constant - deploy script now updates only that line
- Comment version markers are no longer touched by deploys
- Switch to two-level versioning (vMAJOR.MINOR)
v29.3.3
Feb 14, 2026 · 8:39 PM EST
FIX
Fix text sim page distortion and Retell proxy crash
- Remove duplicate tsLearnArea/history rendering in renderResultsDashboard that caused doubled content on text sim page
- Add try/catch around kvGet in retell.js to prevent Edge Function crash when KV store fails
v29.3.2
Feb 14, 2026 · 6:48 PM EST
FIX
Harden auto-optimize agent resolution
- Add agentId=agent.id reassignment after fallback chain in all optimize functions
- Guard safePatchLLM calls behind llmId check in historical prompt revert
v29.3.1
Feb 14, 2026 · 6:41 PM EST
FIX
Fix auto-optimize & scenario cap
- Fixed auto-optimize/Quick Fix/Full Optimize agent lookup fallback chain
- Restored 20-scenario cap and Full 20 pill option
- Made prompt fetch/patch resilient to missing llmId
v29.3
Feb 14, 2026 · 6:06 PM EST
UPDATE
Intelligence Dashboard + Generate Fix
- Dual text/voice weight display, recent learnings section, accurate test counts, VoxGrade branding fix, generate from prompt fixes (agent lookup, scenario cap, faster models)
v29.2.1
Feb 14, 2026 · 4:51 PM EST
FIX
Text Sim, Adversarial & Scenario Count Fixes
- Fix text sim LLM fallback chain for gpt-4.1
- Fix adversarial by sending agent prompt from client
- Clip scenario count to requested amount
- Fix deploy script stale paths
v29.2
Feb 14, 2026 · 11:38 AM EST
FIX
Algorithm & Intelligence Fixes
- Fixed scoring threshold mismatch in tests
- Fixed Bayesian confidence calculation
- Fixed drift detection baseline reset
- Added dimension validation to learning engine
- Added golden dataset cases for phases 6-10
- Added retry logic for observe calls
- Added scoring telemetry for safety rails
v29.1.0
Feb 14, 2026 · 9:15 AM EST
UPDATE
Algorithm Page + Intelligence Dashboard Overhaul
- Algorithm reference page: full v29 dual scoring framework
- Intelligence dashboard: frontend tests now feed observations
- Text/Voice breakdown in Intelligence overview
- Learning API: pass threshold 62->70, v29 dimension alignment
- Scoring framework tab switcher with metric type badges
v29.0.0
Feb 14, 2026 · 8:58 AM EST
MAJOR
Text vs Voice Dual Scoring Frameworks
- Dual scoring frameworks: text (7 dims, 21 metrics) and voice (9 dims, 28 metrics)
- Deterministic repetition detection with Jaccard similarity
- New metrics: longest_monologue, speech_pace, turns_to_resolution, barge_in_recovery, overlap_rate
- Unified grade scale A(90+) B(80+) C(70+) D(60+) F(<60)
- Auto mode detection: text vs voice from callData presence
- Speed bonus: voice-only +5 if TTFW <500ms AND P50 <800ms
- 78 unit tests across 4 scoring test files
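Three of the rules listed above are simple enough to sketch directly. This is an illustration under stated assumptions, not the scoring engine's code: the function names, the word-level tokenization, and the choice to return 0 for two empty utterances are all hypothetical. The changelog does not say what similarity value counts as a repetition, so no threshold is shown.

```typescript
// Jaccard similarity between the word sets of two utterances: |A ∩ B| / |A ∪ B|.
function jaccard(a: string, b: string): number {
  const words = (s: string): Set<string> =>
    new Set(s.toLowerCase().split(/\W+/).filter((w) => w.length > 0));
  const setA = words(a);
  const setB = words(b);
  if (setA.size === 0 && setB.size === 0) return 0; // assumption: empty vs empty is "not similar"
  let shared = 0;
  setA.forEach((w) => { if (setB.has(w)) shared += 1; });
  return shared / (setA.size + setB.size - shared);
}

// Unified grade scale: A(90+) B(80+) C(70+) D(60+) F(<60).
function gradeFor(score: number): "A" | "B" | "C" | "D" | "F" {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}

// Speed bonus: voice-only, +5 when TTFW < 500 ms AND P50 < 800 ms.
function speedBonus(mode: "text" | "voice", ttfwMs: number, p50Ms: number): number {
  return mode === "voice" && ttfwMs < 500 && p50Ms < 800 ? 5 : 0;
}
```

Because the Jaccard score depends only on the two word sets, the same pair of agent turns always yields the same value, which is what makes this kind of detection deterministic.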
v28.2
Feb 14, 2026 · 8:00 AM EST
FIX
Revert Scoring Model to Claude Sonnet
- Restored Claude Sonnet 4 as SCORE_MODEL for production grading quality
- Sonnet ban applies to coding/general tasks only, not production scoring engine
- MSW mocks updated for dual-model routing (model string + content matching)
v28.1
Feb 14, 2026 · 7:55 AM EST
FIX
Audit Fixes: Model Swap + Code Hardening
- Swapped banned Claude Sonnet to GPT-4o-mini for scoring (saves ~/mo)
- Transaction-wrapped save_weights for atomicity
- Merged partial customWeights with defaults
- Fixed word count edge case in talk ratio
- Updated MSW test mocks for model-agnostic routing
- Ran SQL migration on Supabase (3 tables + indexes)
v28.0
Feb 14, 2026 · 7:44 AM EST
MAJOR
Platform Hardening: 24 Features
- Autofix confidence threshold (<60% gates Apply)
- Autofix rate limiting (3/day/agent)
- Post-fix regression detection + auto-rollback
- Per-API-key rate limiting (100 req/hr)
- Latency-adjusted scoring (15% weight + speed bonus)
- Custom scoring weights per agent
- Confidence intervals on scores
- Sentiment trajectory as 7th scoring dimension
- Autofix dry run mode
- Auto-generate scenarios from production failures
- Adversarial escalation for high-scoring agents
- Voice behavior coverage map (13 types)
- Golden dataset auto-expansion
- Supabase prompt audit log
- Cost budget tracking infrastructure
- Cron health dashboard + public endpoint
- Async test queue for >20 scenarios
- Enhanced A/B testing with significance calc
- Failure pattern clustering
- Multi-platform comparison view
- 4 new SOPs (Prompt Review, Incident Severity, Changelog, Rollback)
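The first two items (confidence gate and per-agent daily limit) might combine into a single check along these lines. This is a sketch only: `tryApplyAutofix`, the in-memory `Map`, and the use of UTC calendar days are assumptions, and a real deployment would need persistent storage.

```typescript
const MIN_CONFIDENCE = 0.6;  // confidence below 60% gates Apply
const MAX_FIXES_PER_DAY = 3; // 3 autofixes per day per agent

// Hypothetical in-memory counter keyed by "agentId:YYYY-MM-DD" (UTC).
const fixCounts = new Map<string, number>();

// Returns true and records the attempt when the fix may be applied.
function tryApplyAutofix(agentId: string, confidence: number, now: Date): boolean {
  if (confidence < MIN_CONFIDENCE) return false;
  const key = `${agentId}:${now.toISOString().slice(0, 10)}`;
  const used = fixCounts.get(key) ?? 0;
  if (used >= MAX_FIXES_PER_DAY) return false;
  fixCounts.set(key, used + 1);
  return true;
}
```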
v27.0
Feb 14, 2026 · 12:28 AM EST
UPDATE
Voice Testing: Fix Mismatches, Call Clash, Scenario Scaling
- Fixed voice/gender mismatches in SOP and regression tests
- Fixed call clash with begin_message null and delay
- Added direction-aware text sim for outbound scenarios
- Added 188 pre-built scenario templates (zero API cost)
- Added mega_bulk, replay_bulk, library generation modes
- Added CSV/JSON bulk upload for scenarios
- Added frequency rate control for test runs
- Added 5 new generation lenses and deduplication
v26.0
Feb 13, 2026 · 10:15 PM EST
UPDATE
VocalConsole Rebrand + VoxGrade Logo
- Rebrand app to VocalConsole
- Updated VoxGrade gradient wordmark logo
- VocalConsole VC monogram logo across app, emails, badges, certs
- Updated 30+ files across both codebases
v24.1
Feb 13, 2026 · 8:41 PM EST
FIX
Fix showModal crash & retest scope
- Fixed showModal ReferenceError crash in voice test confirmation
- Fixed text sim retest to only rerun failed scenarios
v25.2
Feb 13, 2026 · 8:15 PM EST
FIX
KV-backed rate limiting
- Rate limiting now persists across deployments via Upstash KV
v25.1
Feb 13, 2026 · 8:06 PM EST
FIX
Remove Website Audit
- Removed Website Audit feature
v25.0
Feb 13, 2026 · 7:52 PM EST
MAJOR
Website Health Audit + Security Hardening
- Website Health Audit — 6-category scoring engine (Links, Navigation, Forms, Performance, Mobile, Accessibility)
- Fetch-based crawling with A-F grading and HTML reports
- Security hardening — rate limiting, HSTS, timing-safe auth, input sanitization
- 31 test files, 327 tests all passing
- Deploy script scaffolding report
v22.11
Feb 13, 2026 · 4:24 PM EST
FIX
Remove scenario types clutter
- Removed 37-card scenario types grid from voice testing page to reduce visual clutter
v23.1
Feb 13, 2026 · 10:30 AM EST
MAJOR
Intelligence Dashboard enrichment + single-agent home focus
- Default learning mode to auto (both fallback paths)
- Add platform_stats API: total tests, calls, agents, pass rate, weight updates, top learnings, false pos/neg
- Add Platform Overview hero section to Intelligence dashboard with aggregate stats and "What the Algorithm Learned" card
- Expand Intelligence KPIs from 4 to 6 (add Total Observations + Algorithm Age)
- Remove multi-agent fleet health grid, quick stats, and Platform Insights card from home page
- Hide Select Agents button from topbar and mobile drawer
- Home page now shows full single-agent metrics (audit report, prompt health, categories, action items, trend)
Co-Authored-By: Claude Opus 4.6
v23.0
Feb 13, 2026 · 10:01 AM EST
FIX
Learning Engine + Intelligence Dashboard
- Remove QA Lab tab, replace with admin-only Intelligence tab
- New /api/learning.js: Bayesian weight updating, EWMA drift detection, adaptive scoring weights with full audit trail
- New /api/calls.js: Production call ingestion from Retell, auto-scoring, outcome tracking, false positive/negative detection
- Hook learning engine into test pipeline (analytics.js observe) and cron job (call ingestion after weekly tests)
- Add adaptive weight support to scoring-engine.js
- 5 new Supabase tables: scoring_weights, drift_state, learning_events, learning_config, production_calls
- Full Intelligence dashboard: KPIs, weight evolution chart, drift monitor, calibration quality, fix intelligence, production call accuracy, event log
Co-Authored-By: Claude Opus 4.6
v22.10
Feb 13, 2026 · 9:03 AM EST
UPDATE
QA Intelligence Dashboard
- New QA Lab tab with intelligence dashboard
- KPI cards for test stats from localStorage
- SVG line chart with 90-day score timeline and trend line
- AI learnings section with insights and failure patterns
- Improvement tracking with fix effectiveness stats
- Per-agent learning trajectories with mini bar charts
- Phase performance heatmap with grades and pass rates
- 60-second data cache with refresh button
v22.9
Feb 13, 2026 · 8:12 AM EST
UPDATE
Single-Agent Focus
- Removed context bar for cleaner UI
- Removed audit hero header, replaced with slim action bar
- Transformed Overview into agent picker with clickable cards
- Removed redundant single-agent sections from Overview
- Added event.stopPropagation to quick-action buttons
v22.8
Feb 13, 2026 · 6:53 AM EST
UPDATE
Larger Audit Sections
- Full-width category sections with bigger headers, larger text, progress bars in headers, category percentage scores
- Enlarged audit item rows with bigger fix sections
- Quick Insights card under radar showing strongest and weakest categories
v22.7
Feb 13, 2026 · 6:39 AM EST
UPDATE
Prompt Audit Redesign
- Hero header with 88px score ring, purple gradient, stats row, last-scanned timestamp
- Category Breakdown + Performance Radar (2-col grid) ported into Audit tab
- Collapsible category sections (replaces tab buttons)
- Restyled audit items matching Settings Audit rows with status dots, fix sections
- Mobile responsive grid stacking
v22.6
Feb 13, 2026 · 5:58 AM EST
UPDATE
Testing UX: Scenario Selector + Timer Fix
- Pill selector, randomized voice scenarios, timer fix, +N more expand
v22.5
Feb 13, 2026 · 5:22 AM EST
FIX
Agent Switch Fix + Full Retest
- Fix agent switching after auto-fix, full optimize retests ALL scenarios
v22.4
Feb 13, 2026 · 4:19 AM EST
UPDATE
Revert to Classic UI - keep all v22.0 backend features
Reverts UI to proven 6-tab design (Overview, Audit, Testing, Pipeline,
Analytics, Settings). Removes experimental 3-view architecture from
v22.1-v22.3. All self-learning engine features preserved.
Co-Authored-By: Claude Opus 4.6
v22.3
Feb 13, 2026 · 2:51 AM EST
FIX
Navigation hotfix - all tabs working again
Fix all switchView calls that broke tab navigation. Testing, Audit,
Pipeline, Analytics tabs all work correctly now. selectAgent stays
in current tab. AI Chat/Phone Call buttons go to proper tabs.
Co-Authored-By: Claude Opus 4.6
v22.2
Feb 13, 2026 · 2:37 AM EST
UPDATE
Restore tabs + billion-dollar polish
Restore original tab navigation (Dashboard, Audit, Testing, Pipeline,
Analytics, Settings). Agent detail view now a drill-down from cards,
not a replacement. Added score animations, stale badges, ring fills.
Co-Authored-By: Claude Opus 4.6
v10.24
Feb 13, 2026 · 2:35 AM EST
UPDATE
iOS Bug Fix & Polish
- 12 bug fixes
- Collection X button fix
- Share card fix
- DPR scaling fix
- WKWebView compat
- iOS simulator tested
v22.1
Feb 13, 2026 · 2:14 AM EST
UPDATE
UX Redesign - 3 Views, Zero Confusion
Agent-centric 3-view architecture (Fleet, Agent, Settings) replaces
6 tabs + 5 sub-tabs + 22 modals. Collapsible sections with lazy
rendering, scroll-spy navigation, slide-out panels, score animations.
Co-Authored-By: Claude Opus 4.6
v22.0
Feb 13, 2026 · 1:55 AM EST
FIX
Self-Learning Voice Agent Grader
Failure taxonomy, flow traps, conflict scanner, trace+fix+verify,
self-learning memory, settings A/B testing, 4 new DB tables.
Co-Authored-By: Claude Opus 4.6
v10.23
Feb 13, 2026 · 12:51 AM EST
UPDATE
Social & Viral Features
- Friends system
- Screenshot sharing
- Leaderboard tabs
- Referral rewards
- Cloud save
- Anti-cheat
- Service worker
- Lucky Spin menu
- Events SOP
v21.12
Feb 12, 2026 · 2:56 PM EST
FIX
Testing Labs Fixes
- Fixed text tests to run 10 scenarios by default (was 5)
- Show all scenarios on load
- Fixed scheduled test cap to use NUM_PHASES
- Stricter repetition detection in grading (-10 per repeat, auto-fail at 3+)
- Added repetitionCount/repetitionExamples to grading output
- Better voice test progress feedback during scenario generation
v21.11
Feb 12, 2026 · 2:17 PM EST
FIX
Apply All Fixes Loading Overlay
- Fixed Apply All Fixes on prompt audit - added animated step-by-step loading overlay, busy guard to prevent double-clicks, increased max_tokens to 16k to prevent truncation on large prompts, error display in overlay instead of just toast
v21.10
Feb 12, 2026 · 1:58 PM EST
UPDATE
Clickable Logo Navigation
- Made VoxGrade logo clickable - navigates to dashboard homepage on click
v21.9
Feb 12, 2026 · 1:52 PM EST
UPDATE
Copy Settings Between Agents
- Added Copy Settings From dropdown in settings audit panel - copy voice, audio, and call settings from any agent to another with one click
v21.8
Feb 12, 2026 · 12:56 PM EST
FIX
Template Variable Substitution
- Fixed camelCase variable handling in text sim - {{firstName}}, {{lastName}}, {{direction}}, {{current_time_*}}, {{user_number}} now properly substituted instead of falling through to catch-all [info] replacement
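The intended behavior can be sketched as a single replace pass: known variables are substituted by exact name (so camelCase names match), and only unknown ones fall through to the `[info]` placeholder. The `substitute` function and the `Map` of values are hypothetical, and wildcard names such as `{{current_time_*}}` are not handled here.

```typescript
// Replace {{name}} with its value when known; otherwise use the "[info]" catch-all.
function substitute(template: string, vars: Map<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string): string => {
    const value = vars.get(name); // exact, case-sensitive lookup
    return value !== undefined ? value : "[info]";
  });
}
```

For example, with `firstName` set to `Dana` and `direction` set to `outbound`, the template `Hi {{firstName}} {{lastName}}, this is an {{direction}} call.` becomes `Hi Dana [info], this is an outbound call.`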
v21.7
Feb 12, 2026 · 11:48 AM EST
MAJOR
Data Flywheel & 10x Testing System
- 5-layer data flywheel system
- Supabase persistent data lake (vacc schema with 5 tables)
- Score trend analytics tab with timeline charts and phase heatmap
- Agent leaderboard with benchmarks and trend tracking
- Golden test suite for regression prevention
- Parameter grid search optimizer
- Cross-agent learning and platform insights
- Supabase write-through on all pipeline runs
- Settings impact analysis
- Auto-golden-suite from A-grade pipeline runs
v21.6
Feb 12, 2026 · 11:37 AM EST
UPDATE
Gold Standard Settings Algorithm
- Updated audit defaults to evidence-based values (responsiveness 0.9, interruption 0.9, voice temp 0.9). Added tool_call_strict audit with speak-during-execution detection. Updated role check to accept Personality-embedded roles. Tightened prompt length thresholds (2500/3000 tokens, 10K/5K chars). Updated normalizeForSpeech to warn when not explicit. Auto-fix patches now use 0.9 instead of old 0.7/0.6 values.
v21.5
Feb 12, 2026 · 10:09 AM EST
FIX
Fix Copy Prompt stale cache bug
- Copy Prompt now always fetches fresh from Retell API instead of using stale local cache
v21.4
Feb 12, 2026 · 10:03 AM EST
UPDATE
VoxGrade Logo + Full Rebrand
- Replace CallSetterAI logo with VoxGrade branded SVG, update all branding refs
v21.3
Feb 12, 2026 · 8:54 AM EST
FIX
Gemini model ID fix, 5-test default
- Fixed invalid google/gemini-2.5-flash-preview model ID to google/gemini-2.5-flash
- Voice test defaults to 5 scenarios checked (Quick 5) per user request
- All 10 still available via Full 10 button
v21.2
Feb 12, 2026 · 8:47 AM EST
FIX
Scenario Gen Fix + Backup Overhaul
- Fixed scenario generation: increased max_tokens from 8K to 16K for 20-phase JSON output, added 240s timeout for scenario gen LLM calls, added truncated JSON repair for partial outputs, made llmCallStrong timeout and max_tokens configurable via opts. Fixed backup system: added .git/ exclusion, removed --delete flag, added 3 missing projects (blindbox-creator, client-portal, flappy-cinnamoroll-app), added vault mount retry with disk wake.
v21.1
Feb 12, 2026 · 8:36 AM EST
FIX
Login body parsing, LLM fallback on 400/500
- Fixed Vercel body parsing for login and signup APIs (req.body pre-parse support)
- LLM fallback chains now catch 400 and 500 errors in addition to 402 and 429
- Both llmCall and llmCallStrong auto-cascade through model chain on any error
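A cascade of this kind can be sketched as a loop over the model chain that moves on whenever a call throws. This is not the project's `llmCall`: the `callWithFallback` name, the `LlmCall` type, and the injected `call` function are hypothetical, and the sketch assumes the underlying client throws on any non-success response.

```typescript
// Hypothetical signature for a single-model request; assumed to throw on HTTP errors.
type LlmCall = (model: string, prompt: string) => Promise<string>;

// Try each model in order and return the first successful response.
async function callWithFallback(models: string[], prompt: string, call: LlmCall): Promise<string> {
  let lastError: unknown = new Error("model chain is empty");
  for (const model of models) {
    try {
      return await call(model, prompt);
    } catch (err) {
      lastError = err; // any error (400, 402, 429, 500, ...) moves on to the next model
    }
  }
  throw lastError; // every model failed: surface the last error
}
```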
v21.0
Feb 12, 2026 · 8:00 AM EST
MAJOR
Voice quality intelligence, 25 scenarios, historical learning
- 25 scenario types with dynamic voice pool replacing old 10 phases
- LLM 402/429 auto-fallback chains for llmCall and llmCallStrong
- Historical prompt + call data storage in KV (archivePromptVersion, archiveCallResult, updateQualityBaseline)
- Fixed latency parsing bug (latency.llm.p50 not llm_latency_p50)
- Enhanced voice metrics: TTS latency, ASR latency, caller sentiment
- Prompt version history UI with score badges, golden tags, restore and compare
- Recording player with audio flag and call summary display
- Pipeline wiring: auto-archive calls and prompts after test runs
v20.6
Feb 12, 2026 · 7:06 AM EST
FIX
Revert to 5 scenarios
- Reverted NUM_PHASES from 10 to 5 for text sim and voice tests
- Scenarios 6-10 disabled (11labs-Charlie and 11labs-George voices not available)
- 5 text + 5 voice = stable proven configuration
v20.5
Feb 12, 2026 · 7:05 AM EST
UPDATE
Batch Agent Testing + Auto-Retry + Agent Comparison
- Batch Test All Agents: one-click sequential testing across entire fleet with live progress modal
- Auto-Retry Failed Phases: automatically re-runs phases scoring <40%, invalid, or role-swapped (max 3/run)
- Agent Comparison View: leaderboard table + phase heatmap across all tested agents
- Auto-retry toggle in test config (enabled by default)
- Batch results persisted to localStorage + KV
Co-Authored-By: Claude Opus 4.6
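The auto-retry selection rule can be sketched as a filter over phase results. The `PhaseResult` shape and function name are hypothetical, and the sketch reads "max 3/run" as at most three phases retried per run, which is one possible interpretation.

```typescript
// Hypothetical result shape for one test phase.
interface PhaseResult {
  phase: number;
  score: number;        // percentage, 0-100
  valid: boolean;       // false when the run produced an invalid result
  roleSwapped: boolean; // true when agent and caller roles were swapped
}

const RETRY_SCORE_THRESHOLD = 40;
const MAX_RETRIES_PER_RUN = 3;

// Phases scoring <40%, invalid, or role-swapped are retried, capped at three per run.
function phasesToRetry(results: PhaseResult[]): number[] {
  return results
    .filter((r) => r.score < RETRY_SCORE_THRESHOLD || !r.valid || r.roleSwapped)
    .slice(0, MAX_RETRIES_PER_RUN)
    .map((r) => r.phase);
}
```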
v20.4
Feb 12, 2026 · 6:53 AM EST
MAJOR
Scorecard overhaul, auto-pass pipeline, API isolation
- Comprehensive agent scorecard with donut chart, 4-metric breakdown, scenario grid, and trend sparklines
- Pipeline auto-optimization loop runs up to 3 cycles until agent passes
- Per-user API key isolation for multi-tenant accounts
- Chat history saves all text sim transcripts across runs
- Test-to-test comparison with percentage gain arrows
- Diff-style fix preview showing old vs new prompt text
- S3 grading no longer penalizes self-correcting callers
- Settings audit fixed incorrect Retell defaults
v20.3
Feb 12, 2026 · 6:39 AM EST
UPDATE
Phase 6-10 scoring in production + P9 regex widened
- CRITICAL: Added Phase 6-10 scoring logic to app.html gradeTextPhase() (was only in test scoring-engine.js)
- Phase 6: Accent comprehension bonus (+5 if agent never asks to repeat)
- Phase 7: Emotional floor (55% if agent acknowledges frustration)
- Phase 9: Elderly patience bonus (+5 if agent repeats info) with widened regex
- Phase 10: Security cap (40% if info leaked) / floor (70% if blocked)
- Widened P9 regex: added "to clarify", "once more", "one more time", "i'll spell", "spelling that out"
Co-Authored-By: Claude Opus 4.6
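The phase-specific bonuses, floors, and caps listed for this release can be sketched as one adjustment function. This is not the actual `gradeTextPhase()` logic: the `PhaseSignals` shape and its field names are hypothetical, and letting a leak take precedence over a block in Phase 10 is an assumption.

```typescript
// Hypothetical signals extracted from a transcript before adjustment.
interface PhaseSignals {
  askedToRepeat: boolean;           // Phase 6
  acknowledgedFrustration: boolean; // Phase 7
  repeatedInfo: boolean;            // Phase 9
  infoLeaked: boolean;              // Phase 10
  attackBlocked: boolean;           // Phase 10
}

function adjustPhaseScore(phase: number, base: number, s: PhaseSignals): number {
  switch (phase) {
    case 6: // accent comprehension bonus: +5 if the agent never asks to repeat
      return s.askedToRepeat ? base : base + 5;
    case 7: // emotional floor: at least 55 if frustration was acknowledged
      return s.acknowledgedFrustration ? Math.max(base, 55) : base;
    case 9: // elderly patience bonus: +5 if the agent repeats info
      return s.repeatedInfo ? base + 5 : base;
    case 10: // security: cap at 40 on a leak, floor at 70 when blocked
      if (s.infoLeaked) return Math.min(base, 40);
      if (s.attackBlocked) return Math.max(base, 70);
      return base;
    default:
      return base;
  }
}
```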
v19.8
Feb 12, 2026 · 6:36 AM EST
UPDATE
API key isolation for multi-tenant
- Per-user API key isolation for apiIsolated accounts
- Retell proxy checks user KV for own API key
- User save_keys endpoint with key validation
- Frontend API key setup modal blocks usage until keys entered
- Settings page API key management section
- getQAKey respects user isolation
- Chat history feature saves text sim transcripts across runs
- closeModal guard for mandatory modals
v20.2
Feb 12, 2026 · 6:32 AM EST
FIX
Score clamp bugfix + 21 new phase 6-10 tests
- Fixed: Score could exceed 100 when phase bonuses added to high base scores (clamped to 0-100)
- Added 21 comprehensive phase 6-10 scoring tests (phase6-10-test.js)
- Tests cover: P6 accent bonus, P7 emotional floor, P9 elderly bonus, P10 security cap/floor
- Edge case coverage: bonus+AH interaction, cap+autofail, floor+deflection, boundary conditions
Co-Authored-By: Claude Opus 4.6
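The clamp described here amounts to bounding the total after all bonuses are applied. A minimal sketch, with hypothetical function names:

```typescript
// Keep a score inside the 0-100 range.
function clampScore(score: number): number {
  return Math.min(100, Math.max(0, score));
}

// Add bonuses (or penalties) to the base score, then clamp the total.
function finalScore(base: number, adjustments: number[]): number {
  const total = adjustments.reduce((sum, a) => sum + a, base);
  return clampScore(total);
}
```

A base of 98 with a +5 phase bonus would total 103 unclamped. Clamping after summing returns 100.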
v20.1
Feb 12, 2026 · 6:23 AM EST
FIX
Smart Diagnostics + Score History + Export Reports
- Smart Diagnostics Panel: phase-specific fix recommendations for failing scenarios
- Score History modal with SVG trend chart and run history
- KV persistence for voice test scores (cross-device, 90-day retention)
- HTML report export with full breakdown per phase
- CSV export for spreadsheet analysis
- Export/History/CSV buttons in results view
Co-Authored-By: Claude Opus 4.6
v20.0
Feb 12, 2026 · 5:49 AM EST
MAJOR
10-Point Voice Testing Framework
- Expanded from 5 to 10 test phases (Heavy Accent/ESL, Emotional/Frustrated, Speed Talker, Elderly/Hard of Hearing, Adversarial/Social Engineering)
- 10 deep caller personas with backstories, accents, filler words, catchphrases
- 5 new ElevenLabs voice profiles (Myra, Lily, Charlie, George, Ethan)
- 10-point scoring rubric with phase-specific metrics (Accent Comprehension, Emotional Intelligence, Security & Guardrails)
- Quick 5 / Full 10 preset buttons in voice test UI
- Updated scoring engine + golden dataset for phases 6-10
- Created Voice Testing SOP (docs/VOICE_TESTING_SOP.md)
Co-Authored-By: Claude Opus 4.6
v19.7
Feb 12, 2026 · 5:32 AM EST
UPDATE
Test comparison, fix preview diffs, S3 grading
- Test-to-test comparison with % gain arrows
- Diff-style fix preview showing old/new text with rationale
- S3 grading no longer penalizes self-correcting callers
- Golden dataset bounds updated for Phase 6 accent bonus
- Scoring engine _agentTurns hoisting fix
- renderFixPreviewHTML shared across all fix modals
- toggleFixDiffFullPrompt for full prompt toggle
v19.5
Feb 12, 2026 · 5:06 AM EST
FIX
Text sim fix, score display, audit improvements
- Fixed _seed is not defined error that broke all text simulations
- Agent health cards now show audit score (97%) as primary with test sim score as secondary
- Trend arrows show percentage gain/loss since last test run
- Randomized session seeds and phone numbers per test run
- Proper audit fix with confirmation modal
- Big animated notifications on fix completion
- Re-Scan Score button on audit page
v19.6
Feb 12, 2026 · EST
FEATURE
Affiliate Auto-Enroll + Before/After Comparison Reports
- Share a QA report = auto-enrolled as affiliate (earn 30% commission on referrals)
- Shared reports include "Audit Your Agents Free" CTA with affiliate tracking link
- New before/after comparison card: side-by-side grade rings showing improvement
- "Share Before/After Report" button added to pipeline completion screen
- 3-tier commission: 30% Starter, 35% Growth (10+ refs), 40% Elite (25+ refs)
- Affiliate clicks tracked on report views
- Report branding updated to VoxGrade
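The three commission tiers map to a simple lookup on referral count. A sketch with a hypothetical function name, returning the rate as a fraction:

```typescript
// 30% Starter, 35% Growth (10+ referrals), 40% Elite (25+ referrals).
function commissionRate(referrals: number): number {
  if (referrals >= 25) return 0.4;
  if (referrals >= 10) return 0.35;
  return 0.3;
}
```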
v19.4
Feb 12, 2026 · 4:41 AM EST
UPDATE
Audit view-all tab, retest details
- Added All tab to audit page - see entire 30-point report in one view
- Added Fix All Failures button on audit page
- Failures Only and Show All filter buttons on audit
- Voice retest summary shows detailed new scores, transcript excerpts, and issue breakdowns
v19.3
Feb 12, 2026 · 4:39 AM EST
UPDATE
Fix preview, retest details, button state
- Auto-fix buttons now show preview of proposed changes before applying - user must confirm
- Fix buttons disable after applying with Fixes Applied badge - re-enables on new test run
- Voice retest summary now shows detailed new results, including scores, transcript excerpts, and issues
- Random caller names per test run instead of seeded repeats
v19.2
Feb 12, 2026 · 4:31 AM EST
FIX
Disable auto-fix, randomize test names
- Removed 4 auto-fire triggers - fixes now require user click
- Text sim uses random caller names each run instead of same seeded names
- Expanded name pool from 30 to 40 first and last names
v19.1
Feb 12, 2026 · 4:14 AM EST
FIX
Security Lockdown: Disable Signup + Single Admin
- Signup endpoint now captures waitlist only - NO account creation, NO sessions granted
- Deleted all 7 user accounts from KV, only victor@tested.media remains
- All existing sessions invalidated via rotated SESSION_SECRET
- Password updated across all auth files and smoke tests
- Pipeline overlay uses stable DOM architecture (no more blink/glitch)
v19.0
Feb 12, 2026 · EST
FEATURE
Affiliate Auto-Enroll via Report Sharing
- Share a QA report = auto-enrolled as affiliate (earn 30% commission on referrals)
- Shared reports now include "Audit Your Agents Free" CTA with affiliate tracking link
- Affiliate clicks tracked when shared reports are viewed
- 3-tier commission: 30% Starter, 35% Growth (10+ refs), 40% Elite (25+ refs)
- Auto-approved affiliate status (no manual review needed for report sharers)
- Updated report branding to VoxGrade
v18.9
Feb 12, 2026 · 4:03 AM EST
FIX
Password Rotation and Session Revoke
- Changed admin access password
- Rotated SESSION_SECRET to invalidate all existing sessions
- All previous logins are now revoked
- Updated Vercel env vars ACCESS_PASSWORD and SESSION_SECRET
v18.8
Feb 12, 2026 · 3:57 AM EST
UPDATE
Client Intake Wizard Rebuild
- Rebuilt intake wizard with auto-fill: industry selection auto-populates greetings, questions, objections, business type, and purpose
- Added 3 greeting examples per industry (warm intro, returning caller, after-hours) across all 10 industries
- Added Business Type field (Local Service, B2C, B2B, E-Commerce, Professional, SaaS)
- Added Agent Personality/Tone selector (5 options)
- Added Booking and Transfer Rules field
- Added animated step-by-step loading overlay with 5 phases and elapsed timer
- Fixed newline bug in LLM prompt construction (was sending literal backslash-n)
- Increased max_tokens from 2000 to 4096 for longer production prompts
- Added target agent selector directly in form (no post-generate selection needed)
- Added Generate and Run Pipeline one-click button
- Enhanced LLM system prompt with 9 required sections for comprehensive prompt output
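The newline fix in this entry can be illustrated with a short sketch. The changelog does not say how the literal backslash-n got into the prompt, so the escaped-separator cause shown here is only one plausible explanation, and `buildIntakePrompt` is a hypothetical name.

```javascript
// Joins prompt sections with real line breaks.
// A common way to end up sending a literal backslash-n is joining with "\\n"
// (two characters: backslash + n) instead of "\n" (one newline character).
function buildIntakePrompt(sections) {
  return sections
    // Repair any section that already contains a literal backslash-n.
    .map((section) => String(section).replace(/\\n/g, "\n"))
    .join("\n");
}
```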
v18.7
Feb 12, 2026 · 3:51 AM EST
FIX
Pipeline Orchestrator Flicker Fix
- Fixed pipeline overlay blinking/glitching caused by full DOM re-render every 1 second
- Timer tick now uses surgical DOM updates (3 elements) instead of innerHTML nuke
- CSS animations (fadeIn, slideUp, pulse, spin, shimmer) no longer restart every second
- Full render only triggers on actual state changes (step start, complete, error, retry)
v18.6
Feb 12, 2026 · 2:32 AM EST
UPDATE
Full GPT-4.1 Sim Pipeline
- All sim models upgraded to GPT-4.1 (agent, caller, grading)
- Agent: GPT-4.1 temp 0 (exact Retell match)
- Caller: GPT-4.1 temp 0.3 (reliable script following)
- Grading: GPT-4.1 temp 0.1 (accurate deterministic scoring)
- No more gpt-4o-mini anywhere in sim pipeline
v18.5
Feb 12, 2026 · 2:17 AM EST
FIX
Agent Sim Model Match + Email Fix
- Agent sim now uses gpt-4o at temp 0 (was gpt-4o-mini at 0.7) to match Retell GPT-4.1 config
- {{EMAIL}} no longer pre-populated in sim (was giving agent fake email, contradicting ZERO HALLUCINATION rule)
- P5 Memory test should now pass since gpt-4o handles multi-turn memory correctly
v18.4
Feb 12, 2026 · 1:52 AM EST
FIX
Deterministic Testing Protocol
- Grading temperature locked to 0.1 (was 0.7) for deterministic scoring
- Caller temperature lowered from 0.95 to 0.75
- Seeded PRNG for caller identity per agent+phase
- Seeded test variables per agent
- Voice grading temperature locked to 0.1 (was 0.8)
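A sketch of how a seeded PRNG can make caller identity repeatable per agent+phase. The changelog does not name the algorithm used, so FNV-1a hashing and the mulberry32 generator are swapped in here as stand-ins, and `pickCallerName` is a hypothetical helper.

```javascript
// FNV-1a: turns a seed string into a 32-bit unsigned integer.
function hashSeed(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: small deterministic generator returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Same agent + phase always yields the same name from the pool.
function pickCallerName(agentId, phase, names) {
  const rng = mulberry32(hashSeed(`${agentId}:${phase}`));
  return names[Math.floor(rng() * names.length)];
}
```

Seeding on the agent+phase pair keeps reruns comparable, at the cost of repeating the same caller on every run.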
v18.3
Feb 12, 2026 · 1:10 AM EST
MAJOR
Per-User Auth + Pricing + Feature Gating
- Identity-aware session tokens (email encoded in HMAC)
- Server-side plan enforcement via /api/user
- Usage tracking with daily limits (429 on exceed)
- Pricing modal with Stripe checkout
- Plan badge in topbar
- Account modal shows plan limits + trial days
- Auto-upgrade banners for free/trial users
- Unified upgradeCheckout -> startCheckout
- Deprecated client-side sim tracking in favor of server-side tracking
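A rough sketch of an identity-aware session token of the kind this entry describes, using Node's built-in crypto module. The token layout (base64url payload, a dot, then an HMAC-SHA256 signature), the `exp` field, and the function names are all assumptions for illustration, not the actual VoxGrade format.

```javascript
const crypto = require("node:crypto");

// Builds "payload.signature", where the payload carries the email and an expiry.
function signSession(email, secret, expiresAtMs) {
  const payload = Buffer.from(JSON.stringify({ email, exp: expiresAtMs })).toString("base64url");
  const sig = crypto.createHmac("sha256", secret).update(payload).digest("base64url");
  return `${payload}.${sig}`;
}

// Returns the email if the signature is valid and the token is unexpired, else null.
function verifySession(token, secret, nowMs = Date.now()) {
  const [payload, sig] = String(token).split(".");
  if (!payload || !sig) return null;
  const expected = crypto.createHmac("sha256", secret).update(payload).digest();
  const given = Buffer.from(sig, "base64url");
  // timingSafeEqual throws on length mismatch, so check length first.
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) return null;
  let data;
  try {
    data = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  } catch {
    return null;
  }
  return data.exp > nowMs ? data.email : null;
}
```

Rotating the secret invalidates every outstanding token at once, since no old signature will verify.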
v18.2
Feb 12, 2026 · 12:34 AM EST
FIX
Retest Only Failed Scenarios
- fullOptimize now retests only failed scenarios instead of all 5
- scoring-engine.js synced with multi_turn_memory phaseKey
v18.1
Feb 12, 2026 · 12:16 AM EST
FOUNDATION
Automated Testing Framework
- 4-layer test framework: pre-deploy gate, post-deploy smoke tests, golden dataset regression, deploy integration
- Pre-deploy blocks deploy on syntax errors, scoring regression, version mismatch, hardcoded secrets
- 15 golden test transcripts across P1-P5 phases
- Post-deploy smoke tests verify 8 production endpoints
- npm test runs full pre-deploy gate
v18.0
Feb 11, 2026 · 11:40 PM EST
MAJOR
Scoring Algorithm Hardening + Revenue Infrastructure
- Fixed critical scoring bug: removed hardcoded voiceScore:75 that inflated pipeline scores by ~19 points
- Fixed deflection cap conflicting with Phase 4 knowledge boundary rules
- Added penalty caps (length max -25, stacking max -15)
- Improved question counting to exclude rhetorical patterns
- Fixed double-weighted audit scores in pipeline final grade
- Added Stripe checkout API, webhook notification API, and scheduled monitoring cron
- Upgraded QA endpoint with proper threshold tests
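The penalty caps in this entry can be sketched as a simple clamp. The cap values (-25 and -15) are from this entry and the per-occurrence values (-5 and -3) are from the v17.8 entry; applying those penalties once per offending response is an assumption, and `cappedPenalties` is a hypothetical name.

```javascript
const LENGTH_PENALTY_PER_RESPONSE = 5; // v17.8: -5pts for a response over 100 words
const STACKING_PENALTY_PER_TURN = 3;   // v17.8: -3pts for question stacking
const LENGTH_PENALTY_CAP = 25;         // v18.0: length max -25
const STACKING_PENALTY_CAP = 15;       // v18.0: stacking max -15

// Returns the points to subtract, with each penalty type clamped to its cap.
function cappedPenalties(longResponseCount, stackedQuestionTurns) {
  const length = Math.min(longResponseCount * LENGTH_PENALTY_PER_RESPONSE, LENGTH_PENALTY_CAP);
  const stacking = Math.min(stackedQuestionTurns * STACKING_PENALTY_PER_TURN, STACKING_PENALTY_CAP);
  return { length, stacking, total: length + stacking };
}
```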
v17.9
Feb 11, 2026 · 11:29 PM EST
FIX
RunTextSim Null Fix
- Fixed TypeError null reference in runTextSim when called from rerunTextSimsForAgent - added DOM element null checks
v17.8
Feb 11, 2026 · 11:02 PM EST
MAJOR
Full Pipeline + Scoring + Certs
- Industry templates (10 industries)
- Auto-populate intake wizard
- Save & Run Pipeline button
- Deep-audit integration (40/60 blend)
- Response length penalty (-5pts >100 words)
- Question stacking penalty (-3pts)
- Deflection detection
- Updated scoring weights (25/50/25)
- Auto-certification on pipeline completion (60%+ score)
- Cert badge in completion overlay
- Cert/badge/deep-audit APIs live
v17.7
Feb 11, 2026 · 10:54 PM EST
UPDATE
Design System + Empty States
- Flat card design (no glassmorphism), empty states for Dashboard/TextSim/ScriptGen/Pipeline, skeleton loading animations, btn-cta class, card-selected class
v17.6
Feb 11, 2026 · 10:29 PM EST
FIX
Score Validation & Dedup Guard
- Fixed case-sensitive PASS check causing 0% scores
- Added duplicate edit detection to Prompt Surgeon
- Dedup integrated into all optimizer paths
- Enhanced surgeon prompt with explicit duplicate check rule
v17.5
Feb 11, 2026 · 10:08 PM EST
UPDATE
Remove Main Dashboard
- Removed Main Dashboard tab from Testing Lab - redundant with Text Testing
- Default tab now Text Testing when entering lab mode
- Batch Test already available in Text Testing header
- Cleaner UX with fewer tabs
v17.4
Feb 11, 2026 · 10:05 PM EST
FIX
Bulletproof Scoring
- Client-side score validation recalculates from criteria when LLM math is off by 10+
- Removed duplicate PASS_THRESHOLD (was 70 in optimizer, now uses global 65)
- Prompt Surgeon enhanced with P4 knowledge boundary and P5 memory awareness
- Auto-fail penalty validation with hallucination detection
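A sketch of the client-side score validation described in this entry: recompute the score from the graded criteria and prefer it when the LLM-reported number is off by 10 or more. The criteria shape (`weight`, `result`) and the function names are hypothetical. The case-insensitive PASS comparison reflects the fix noted in v17.6.

```javascript
const SCORE_DRIFT_TOLERANCE = 10; // "off by 10+" from the v17.4 entry

// Recomputes a 0-100 score as the weighted share of passing criteria.
// Returns null when there is nothing to score.
function recalcScore(criteria) {
  const total = criteria.reduce((sum, c) => sum + c.weight, 0);
  if (total === 0) return null;
  const earned = criteria
    .filter((c) => String(c.result).trim().toUpperCase() === "PASS")
    .reduce((sum, c) => sum + c.weight, 0);
  return Math.round((earned / total) * 100);
}

// Keeps the LLM score unless it drifts too far from the recalculated one.
function validateScore(llmScore, criteria) {
  const recalculated = recalcScore(criteria);
  if (recalculated === null) return llmScore;
  return Math.abs(llmScore - recalculated) >= SCORE_DRIFT_TOLERANCE ? recalculated : llmScore;
}
```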
v17.3
Feb 11, 2026 · 9:23 PM EST
FIX
Unified Grade Thresholds
- Standardized all 25 grade threshold calculations to single source of truth (GRADE_THRESHOLDS constant: A=90 B=80 C=65 D=45). Previously had 4 different scales causing same score to get different grades. Added calcGradeLetter() and isPassingScore() helpers.
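A minimal sketch of the single-source-of-truth grading helpers named in this entry. The threshold values and the names `GRADE_THRESHOLDS`, `calcGradeLetter()`, and `isPassingScore()` are from the entry; the function bodies, the F grade below 45, and treating the C threshold (65) as the pass mark are assumptions.

```javascript
// Thresholds as listed in v17.3.
const GRADE_THRESHOLDS = { A: 90, B: 80, C: 65, D: 45 };

// Maps a 0-100 score to a letter. Anything below D is assumed to be F.
function calcGradeLetter(score) {
  if (score >= GRADE_THRESHOLDS.A) return "A";
  if (score >= GRADE_THRESHOLDS.B) return "B";
  if (score >= GRADE_THRESHOLDS.C) return "C";
  if (score >= GRADE_THRESHOLDS.D) return "D";
  return "F";
}

// Assumes passing means reaching the C threshold (65).
function isPassingScore(score) {
  return score >= GRADE_THRESHOLDS.C;
}
```

Routing every call site through these two helpers is what prevents the same score from mapping to different grades.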
v17.2
Feb 11, 2026 · 9:10 PM EST
FIX
Dynamic Agent Lookup Fix
- Fixed 74 agent lookups that only searched static AGENTS array, breaking all retest/optimizer/auto-fix flows for dynamically loaded agents. Added findAgentById() helper. Added action buttons to retest summary modal (Fix & Retest, Optimize, Retest Again for still-failing scenarios).
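A sketch of what a unified lookup like `findAgentById()` might look like. The helper name is from this entry; the signature, the `id` field, and passing both lists as arguments are assumptions made so the example stands alone.

```javascript
// Searches the static list first, then dynamically loaded agents.
// Returns null when the id is unknown in both.
function findAgentById(id, staticAgents, dynamicAgents) {
  const matches = (agent) => agent.id === id;
  return staticAgents.find(matches) || dynamicAgents.find(matches) || null;
}
```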
v17.1
Feb 11, 2026 · 9:07 PM EST
MAJOR
Pipeline Orchestrator
- 9-step automated pipeline (audit, fix, re-audit, settings audit, fix settings, text sim, surgeon, re-test, final grade). New vertical stepper UI with progress tracking, retry/skip/abort per step, KV persistence for resume. New /api/pipeline endpoint for state management.
v17.0
Feb 11, 2026 · 8:18 PM EST
UPDATE
Pipeline Cancel Button
- Added cancel button to full pipeline overlay. Cancellation checks between every step. Clean cancel state with proper cleanup.
v16.9
Feb 11, 2026 · 8:14 PM EST
UPDATE
Full Pipeline Orchestrator
- One-click pipeline: audit->fix prompt->re-audit->fix settings->5-phase text sim->results. Fixed 0% audit score edge case bug. Added audit score to shareable reports. Pipeline button on every agent card with animated step-by-step overlay.
v16.8
Feb 11, 2026 · 8:02 PM EST
UPDATE
Integrated Scoring Algorithm
- Audit score (prompt + settings) now factors into overall grade: 30% audit + 50% text + 20% voice. Added getAgentSettingsHealth() and getAgentAuditScore() helpers. Updated all 3 grading call sites. Score breakdowns show audit component when available.
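The 30/50/20 blend in this entry works out as a plain weighted sum. This sketch assumes all three component scores are present and on a 0-100 scale. `overallScore` is a hypothetical name, and later entries change the weights.

```javascript
// Weights from v16.8, kept as integers to avoid floating-point drift.
const OVERALL_WEIGHTS = { audit: 30, text: 50, voice: 20 };

// Blends the three component scores into one 0-100 overall score.
function overallScore({ audit, text, voice }) {
  const weighted =
    audit * OVERALL_WEIGHTS.audit + text * OVERALL_WEIGHTS.text + voice * OVERALL_WEIGHTS.voice;
  return weighted / 100;
}
```

For example, audit 80, text 90, and voice 70 blend to 24 + 45 + 14 = 83.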
v16.7
Feb 11, 2026 · 7:04 PM EST
FIX
Grading Consistency Stabilization
- Repetition scoring capped at max -15 points (was uncapped, causing score nukes)
- Phase 4 hallucination trap scoring floor of 65% for agents that deflect honestly
- Phase 5 memory grading only scores criteria that were actually tested in transcript
- Phase 5 scenario generation now REQUIRES explicit memory-test actions in callerBehavior and keyTopics
v16.6
Feb 11, 2026 · 1:41 PM EST
FIX
Fix Phase 4 Grading
- Fixed Phase 4 hallucination trap grading - agent that deflects all unknown questions honestly but is repetitive now scores 60-75% instead of 0% F
- Added explicit grading instruction that knowledge boundary tests prioritize no-hallucination over conversation quality
- Added repetition criterion (3+ identical asks = minor fail, not auto-fail)
v16.5
Feb 11, 2026 · 1:32 PM EST
UPDATE
Universal Agent Testing
- Dynamic prompt-aware voice testing for ANY agent type
- Prompt hash caching (24h TTL saves 20-40s on repeat runs)
- Agent-type-aware grading (sales/support/IVR/scheduling/collections/survey auto-detected)
- De-hardcoded caller prompts (removed Sofia-specific qualification rules)
- Dynamic expectedDurationSec from AI-generated scenarios
- Universal default grading criteria (not sales-specific)
- Scenario preview cards in voice test confirmation dialog
- Regenerate Scenarios button to clear cache
- Dynamic phase names in auto-fixer
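A sketch of prompt hash caching with a 24h TTL, as this entry describes. The in-memory `Map`, the SHA-256 key, and the names `createScenarioCache` and `SCENARIO_CACHE_TTL_MS` are assumptions; the real storage layer is not stated here.

```javascript
const crypto = require("node:crypto");

const SCENARIO_CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL from the v16.5 entry

// Caches generated scenarios keyed by a hash of the agent prompt.
// `now` is injectable so expiry can be exercised without waiting.
function createScenarioCache(now = () => Date.now()) {
  const store = new Map();
  const keyFor = (prompt) => crypto.createHash("sha256").update(prompt).digest("hex");
  return {
    get(prompt) {
      const key = keyFor(prompt);
      const entry = store.get(key);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        store.delete(key); // expired: drop it and report a miss
        return undefined;
      }
      return entry.value;
    },
    set(prompt, value) {
      store.set(keyFor(prompt), { value, expiresAt: now() + SCENARIO_CACHE_TTL_MS });
    },
    // Counterpart of a "Regenerate Scenarios" action: forget everything.
    clear() {
      store.clear();
    },
  };
}
```

Because the key is a hash of the prompt text, any prompt edit misses the cache and triggers fresh scenario generation.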
v16.4
Feb 11, 2026 · 1:15 PM EST
UPDATE
Auto-Optimize After Tests
- Auto-trigger prompt optimization when voice scenarios fail, test caller qualification fix (qualifying budget answers), duplicate pricing label cleanup
v16.3
Feb 11, 2026 · 1:00 PM EST
FIX
Bug fixes
- Fix copyTranscript not defined error (was trapped inside renderScenarioCards scope, moved to global)
- Remove duplicate GPT-4o pricing entry with wrong Claude Sonnet label
- Fix all-N/A voice metrics edge case in getUnifiedGrade
v16.2
Feb 11, 2026 · 12:31 PM EST
UPDATE
Scoring System Overhaul
- Weighted voice metrics (latency 20%, disconnect 15%, interruption 15%, silence 15%, depth 10%, duration 10%, turns 5%, TTS 5%, repetition 5%)
- Missing data excluded from score instead of defaulting to 75 (was inflating grades)
- Dynamic turn count scales with profile duration (not hardcoded 24)
- Wider conversation depth thresholds (10-55 words PASS instead of 15-40)
- Stronger auto-fail penalties (25/30/35pts, hallucination 40pts each)
- Deterministic grading guidance in AI prompt
- Collapsible scenario cards (click header to expand/collapse, default collapsed)
- Increased poll timeout to 7.5min for S3 Confusion scenario
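A sketch of the weighted voice score with missing data excluded, using the weights listed in this entry. Renormalizing over the metrics that are present and returning `null` when every metric is N/A are assumptions. The metric keys and `weightedVoiceScore` are hypothetical names.

```javascript
// Weights (percent) from the v16.2 entry; they sum to 100.
const VOICE_METRIC_WEIGHTS = {
  latency: 20,
  disconnect: 15,
  interruption: 15,
  silence: 15,
  depth: 10,
  duration: 10,
  turns: 5,
  tts: 5,
  repetition: 5,
};

// Weighted average over the metrics that actually have a numeric score.
// Missing metrics are skipped rather than defaulted to 75.
function weightedVoiceScore(metrics) {
  let weighted = 0;
  let usedWeight = 0;
  for (const [name, weight] of Object.entries(VOICE_METRIC_WEIGHTS)) {
    const value = metrics[name];
    if (typeof value !== "number" || Number.isNaN(value)) continue;
    weighted += value * weight;
    usedWeight += weight;
  }
  return usedWeight === 0 ? null : weighted / usedWeight;
}
```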
v16.1
Feb 11, 2026 · 12:15 PM EST
FIX
Revert S3 voice to 11labs-James
- Reverted S3 Silence and Confusion voice from 11labs-Valentino (non-existent) back to 11labs-James. This was causing S3 ERR on all voice tests.
v16.0
Feb 11, 2026 · 12:07 PM EST
FIX
Remove auto-load results, clean voice test page
- Removed auto-load of previous test results on voice page render - was expanding all results and destroying the clean layout. Results still accessible via View Full Results button in saved history section.
v15.9
Feb 11, 2026 · 12:00 PM EST
FIX
Voice Test Layout Fix
- Fixed voice test results displaying in wrong page section
v15.8
Feb 11, 2026 · 11:56 AM EST
FIX
Smarter Auto-Optimization
- Surgical prompt edits preserve what's already working
- Voice test results persist across page refreshes
- Prompt History always accessible from voice test page
v15.7
Feb 11, 2026 · 11:45 AM EST
FIX
Prompt Surgeon + Realistic Callers
- Prompt Surgeon now applies targeted edits per scenario
- One-click Retest button on each scenario card
- Randomized caller names and accented voices for realistic testing
v15.6
Feb 11, 2026 · 11:29 AM EST
FIX
Auto-Optimizer Pipeline Fix
- Fixed optimizer data source to read actual test results
- Per-criterion failure extraction for targeted prompt fixes
- Retest now uses live results instead of stale data
v15.5
Feb 11, 2026 · 11:25 AM EST
FIX
Voice Test Stability Fix
- Fixed crash affecting all 5 voice test scenarios
- Admin users now get full feature access
- Improved error handling across test pipeline
v15.4
Feb 11, 2026 · 11:17 AM EST
FIX
Testing Engine Stability
- Fixed stale results when switching between agents
- Improved error detection for AI provider outages
- Global error handler with copy-to-clipboard for support
v15.3
Feb 11, 2026 · 10:52 AM EST
FIX
Text Simulation Reliability
- Fixed crashes in text simulation pipeline
- A/B comparison results now handle edge cases gracefully
- Self-learning engine wrapped with error recovery
v15.2
Feb 11, 2026 · 10:36 AM EST
FIX
Text Sim Error Handling
- Scenario cards now display errors cleanly with retry button
- Cleaned up duplicate error display code
v15.1
Feb 11, 2026 · 10:23 AM EST
UPDATE
Security Hardening
- Full system security audit - clean bill of health
- Hardened credential storage and access controls
- Automated secret detection blocks accidental key exposure
- Pre-deploy guardrails enforce version and changelog compliance
v15.0
Feb 11, 2026 · 9:23 AM EST
MAJOR
Production Launch Ready
- Complete auth system: signup, login, password reset with SHA-256 hashing
- Stripe payment integration with checkout, webhooks, and transactional emails
- Plan-based feature gating: Free 5 sims/month, Pro unlimited, Enterprise unlimited voice
- Trial lifecycle: 14-day trial with auto-downgrade and 8-email drip sequence
- User profile API with server-side data refresh on page load
- Daily email-triggers cron for trial, re-engagement, and retention emails
- Changelog auto-updates on every deployment via deploy.sh
v14.9
Feb 11, 2026 · 9:14 AM EST
MAJOR
Plan-Based Feature Access
- Free plan: 5 text simulations per month
- Pro plan: Unlimited sims, voice testing, A/B comparison, auto-optimization
- 14-day trial gets full Pro access
- Automated lifecycle emails for trial, engagement, and retention
v14.8
Feb 11, 2026 · 9:07 AM EST
MAJOR
Production Stability
- User profile syncs plan status on every page load
- Expired trials auto-downgrade to free plan
- Password reset and email triggers fully operational
v14.7
Feb 11, 2026 · 8:55 AM EST
MAJOR
Stripe Payments
- One-click upgrade to Pro and Enterprise from your account
- Stripe checkout with automatic plan activation
- Transactional emails for upgrades, cancellations, receipts, and failed payments
v14.6
Feb 11, 2026 · 8:37 AM EST
MAJOR
User Authentication
- Email and password signup with strength indicator
- Secure login with password visibility toggle
- 14-day free trial starts automatically on signup
- Login tracking and daily usage metrics
v14.5
Feb 11, 2026 · 3:55 PM EST
MAJOR
Live WebRTC Call Testing + Call Explorer + Transcript System
- Live web call testing via Retell WebRTC - test agents directly from browser with microphone
- Real-time live transcript display during calls with role-based coloring (AI Agent vs Caller)
- Call timer with elapsed time display during active calls
- Call detail viewer with full transcript, recording playback, and call analysis breakdown
- Auto-grade individual calls directly from Call Explorer
- Universal transcript copy system - works across all 6 transcript formats (live, text sim, scenario, call explorer, recording, chat bubbles)
- Recording audio player with inline playback controls
- Copy transcript button on every transcript view across the entire app
- Transcript toggle show/hide across all panels
- Call library integration - save and review past calls
- Stripe checkout endpoint for payment processing
- Enhanced mobile responsiveness (v14.5 CSS overhaul)
- Results dashboard consolidated view (v14.5 UX overhaul)
v14.4
Feb 11, 2026 · 3:30 PM EST
UPDATE
Navigation + SEO + Deploy Automation
- Updated nav: Features, Pricing, Use Cases, Docs, Blog, Login
- Expanded footer with 10 links (pricing, use-cases, compare, blog, changelog, affiliate, docs, contact, privacy, terms)
- JSON-LD SoftwareApplication schema for Google rich results
- Canonical URL for SEO deduplication
- Blog index page (was 404, now 200)
- Sitemap expanded to 18 URLs (added /blog, /press, /va-checklist)
- Auto-changelog: deploy.sh + git post-commit hook
v14.3
Feb 11, 2026 · 2:45 PM EST
UPDATE
Enhanced Mobile Responsiveness + UX Overhaul
- Complete mobile-responsive CSS overhaul across all app views
- Consolidated Results Dashboard with unified view
- UTM + affiliate referral tracking on landing page (localStorage + Plausible analytics)
- Affiliate click tracking endpoint (no auth required)
- Slack notification version tags updated
v14.2
Feb 11, 2026 · 11:30 AM EST
UPDATE
Commission Tiers + Auth Hardening
- Affiliate commission tiers implemented: 30% Starter / 35% Growth (10+) / 40% Elite (25+)
- Auto-tier upgrades based on referral count on dashboard load
- Password reset fully functional with SHA-256 + salt hashing
- Session invalidation on password change
- Google Auth, login, signup endpoints hardened with proper CORS
- Admin API expanded with affiliate management actions
v14.1
Feb 11, 2026 · 8:20 AM EST
MAJOR
Affiliate Program + Password Reset + Landing Pages
- Affiliate program: 30% Starter / 35% Growth (10+ referrals) / 40% Elite (25+ referrals) recurring commissions
- Affiliate dashboard with click tracking, conversion rates, earnings breakdown
- Auto-tier upgrades based on referral count
- Password reset with SHA-256 hashing, 1-hour token expiry, session invalidation
- 26 landing pages: pricing, docs, demo, use cases, comparison, tutorials, affiliate, contact, terms, privacy, press, changelog, email preview, login, signup, forgot/reset password, verify email, 404, admin, VA checklist, project dashboard, launch, trailer, screencast
- Sitemap.xml + robots.txt for SEO
- Blog infrastructure with SEO-optimized templates
v14.0
Feb 11, 2026 · 5:42 AM EST
MAJOR
Admin Dashboard
- Admin API with user management and system controls
- Admin dashboard page with analytics overview
- User listing, plan management, and activity logs
- System health monitoring from admin panel
v13.4
Feb 11, 2026 · 5:35 AM EST
UPDATE
Email Preview + Template Testing
- Email preview page for visual template testing
- Render any of 74 templates with sample data
- Mobile-responsive email preview mode
v13.3
Feb 11, 2026 · 5:28 AM EST
UPDATE
Automated Email Triggers
- Email triggers cron endpoint for automated sends
- Trial reminders at day 3, 5, 7, 10, 12, 13
- Re-engagement emails for inactive users
- Weekly digest and monthly recap automation
v13.2
Feb 11, 2026 · 5:18 AM EST
UPDATE
Retention + Referral Emails
- Churn prevention email sequences
- Referral program email templates
- Post-purchase follow-up and upsell emails
- Admin notification templates for system events
v13.1
Feb 11, 2026 · 5:08 AM EST
UPDATE
Onboarding + Engagement Emails
- Welcome email sequence for new signups
- Getting started guide emails with setup steps
- Engagement emails: feature highlights, tips, best practices
- Conversion emails for trial-to-paid nudges
v13.0
Feb 11, 2026 · 4:58 AM EST
MAJOR
Email Template Engine
- 72 branded email templates across 10 categories
- Template library (lib/emails.js) with dynamic data injection
- Email delivery service via Resend API
- Transactional email support for password resets, verifications
v12.4
Feb 11, 2026 · 4:48 AM EST
UPDATE
Stripe Webhook Handler
- Stripe webhook endpoint for real-time subscription events
- Handles plan upgrades, downgrades, cancellations
- Payment failure handling with retry logic
- Automatic user plan sync on billing changes
v12.3
Feb 11, 2026 · 4:38 AM EST
UPDATE
Google Auth + Email Verification
- Google OAuth integration for one-click signup
- Email verification page with token validation
- Verification email sent on registration
v12.2
Feb 11, 2026 · 4:28 AM EST
UPDATE
Password Reset Flow
- Password reset API with secure token generation
- Forgot password page with email input
- Reset password page with new password form
- Expiring reset tokens for security
v12.1
Feb 11, 2026 · 4:18 AM EST
UPDATE
Login System + Session Management
- Login API with credential validation
- Session management with KV-backed tokens
- Auth middleware (lib/auth.js) for protected routes
- Logout endpoint with session cleanup
v12.0
Feb 11, 2026 · 4:05 AM EST
MAJOR
User Registration + 14-Day Trial
- Signup API with user creation and trial activation
- 14-day free trial with automatic expiry tracking
- KV storage layer (lib/kv.js) for user data
- Signup landing page with registration form
v11.2
Feb 11, 2026 · 3:50 AM EST
UPDATE
Vercel Routing + Clean URLs
- vercel.json route configuration for all API endpoints
- Clean URL rewrites for landing pages
- CORS and security headers configuration
v11.1
Feb 11, 2026 · 3:40 AM EST
UPDATE
Mobile Responsive + Data API
- Enhanced mobile responsiveness across all views
- New data API endpoint for external integrations
- Server-side performance improvements
v11.0
Feb 11, 2026 · 3:00 AM EST
MAJOR
Server Architecture + API Layer
- Rebuilt server architecture with Express.js routing
- Vercel serverless API endpoints for all backend operations
- Modular lib/ directory for shared utilities
- Retell API integration module
v10.2
Feb 10, 2026 · 11:39 PM EST
UPDATE
One-Click Prompt Fixes
- One-click prompt fixes with live re-scan after applying
- Audit timestamp tracking for historical fix records
v10.1
Feb 10, 2026 · 11:33 PM EST
UPDATE
Actionable Audit Items
- Actionable audit items with auto-fix recommendations
- Dashboard quick-links to flagged issues
- Priority-ranked recommendations engine
v10.0
Feb 10, 2026 · 11:05 PM EST
MAJOR
Multi-Platform Support + Auto-Apply + Phone Management
- Auto-apply prompt optimizations pushed directly to Retell
- Vapi platform support alongside Retell AI
- Phone number manager with call routing configuration
- Cron-based scheduled testing (4h, 8h, 12h, 24h intervals)
- Marketing landing page launch
v9.7
Feb 10, 2026 · 10:35 PM EST
UPDATE
Shareable Reports + Realistic Callers
- Shareable QA reports with premium dark mode design
- Dashboard analytics cards with score trends
- Overhauled scenario generation for realistic caller behavior
- Fixed caller realism: no unprompted financial details
- Auto-stop audio playback on tab switch
v9.6
Feb 10, 2026 · 9:59 PM EST
UPDATE
A/B Prompt Comparison
- A/B prompt comparison: clone, modify, compare side-by-side
- Mobile responsive layout improvements
v9.5
Feb 10, 2026 · 9:55 PM EST
UPDATE
PDF Export + Batch Testing
- PDF report export with branded formatting
- Batch testing across multiple scenarios
- Dedicated settings page for configuration
v9.4
Feb 10, 2026 · 8:58 PM EST
UPDATE
Scheduled Monitoring + Agent Health Checks
- Scheduled monitoring with custom grading criteria
- Agent health check dashboard
- Floating space background for login page
- Safety wrappers for all widget components
- Fixed text testing crash from cached localStorage scenarios
- Fixed name randomization using global replace
v9.3
Feb 10, 2026 · 8:01 PM EST
UPDATE
Prompt Version History + Slack Alerts
- Prompt version history with diff view between versions
- Slack webhook alerts for failed tests
v9.2
Feb 10, 2026 · 7:50 PM EST
UPDATE
Dashboard Score Trends
- Dashboard score trends over time with charts
- Notification system wired to key events
- Quality gate tracking and pass/fail history
v9.1
Feb 10, 2026 · 7:41 PM EST
MAJOR
Dynamic Caller AI + Self-Learning Engine
- Dynamic caller AI with random personas and conversation variety
- Self-learning engine that adapts to test patterns
- Database hydration and persistence hooks for all data types
- Audit tab redesign with actionable insights
v9.0
Feb 10, 2026 · 6:53 PM EST
MAJOR
Database Persistence Layer (Upstash KV)
- Upstash KV database integration for persistent storage
- All test results, reports, and settings now persist across sessions
- Text Testing UI cleanup and performance improvements
v8.9
Feb 10, 2026 · 6:13 PM EST
UPDATE
Single-Agent Focus Redesign
- Redesigned Overview, Settings Audit, and Text Testing for single-agent focus
- Streamlined UI to center workflows around one agent at a time
v8.8
Feb 10, 2026 · 6:01 PM EST
UPDATE
UI Simplification + Context Bar Redesign
- Removed Compare tab in favor of dedicated A/B testing
- Moved Costs tab from Testing Lab to top-right badge
- Redesigned context bar and agent selector
- Fixed duplicate caller names, enforced unique scenarios
v8.7
Feb 10, 2026 · 5:38 PM EST
UPDATE
Cost Analytics + Layout Overhaul
- Cost analytics page with historical spend tracking
- Center-aligned all navigation, titles, headers, and context bar
- Instant loading feedback for voice tests
- Fixed voice test caller talking over agent at call start
v8.6
Feb 10, 2026 · 3:52 PM EST
MAJOR
Prompt-Based Voice Testing
- Prompt-based voice testing with AI-driven call scripts
- Center-aligned navigation and mode toggle
- Critical bug fixes for testing pipeline
v8.5
Feb 10, 2026 · 2:59 PM EST
UPDATE
Caller Realism Overhaul
- Complete overhaul of caller conversation scripts for natural behavior
- Callers now behave like real humans with realistic pauses and responses
v8.4
Feb 10, 2026 · 2:26 PM EST
UPDATE
Prompt Auditor + Quality Gates
- Prompt auditor tabs with spend breakdown per prompt version
- Quality gates with grade feedback and self-analysis
- Conversation history tracking across test runs
- Loading spinner during AI script generation
- Clean SVG notification bell icon
- Fixed voice tests using hardcoded scripts instead of AI-generated ones
v8.3
Feb 10, 2026 · 1:48 PM EST
FIX
Voice Testing Stability
- Forced AI script generation on every test run
- Model upgrade for more reliable voice test responses
- Role fix and visual loader improvements
v8.2
Feb 10, 2026 · 1:35 PM EST
FIX
Voice Testing 404 Fix
- Fixed voice testing 404 errors caused by invalid ElevenLabs voice IDs
v8.1
Feb 10, 2026 · 1:16 PM EST
FIX
Agent Not Found Fix
- Fixed voice testing agent-not-found bug
- Removed legacy phases 6-8 from test pipeline
v8.0
Feb 10, 2026 · 9:22 AM EST
MAJOR
QA Engine Overhaul
- Removed phases 6-8 and renamed remaining phases to scenarios
- Removed legacy costs tab from testing interface
- Added progress indicator for test runs
- Streamlined testing flow for faster iteration
v7.3
Feb 10, 2026 · 9:11 AM EST
FIX
Caller Agent Quality Fix
- Strict caller prompts with self-analysis and lower temperature
- Fixed null reference crash in generateTestScenarios
v7.2
Feb 10, 2026 · 8:49 AM EST
FIX
Voice Test Quality
- Reduced to 5 phases for focused testing
- Auto-generate test scripts per run instead of static scripts
- Removed pulsing logo animation for cleaner UI
v7.1
Feb 10, 2026 · 8:41 AM EST
FIX
Test Script Personas
- Unique test personas across all 8 phases
- Shorter, more focused test scripts
v7.0
Feb 10, 2026 · 8:39 AM EST
MAJOR
Major QA Overhaul
- Complete QA system overhaul: shorter calls, faster feedback
- Per-test cost tracking for budget visibility
- UX polish across all testing interfaces
v6.0
Feb 10, 2026 · 7:31 AM EST
UPDATE
Live Transcript + Phone Assignment
- Live call transcript with real-time updates
- Stop button to end calls mid-test
- 4-minute safety timeout for runaway calls
- Phone number assignment with dynamic phone map loading
v5.0
Feb 10, 2026 · 5:45 AM EST
UPDATE
QA Leaderboard + Call Filters
- QA Leaderboard tab ranking agents by test performance
- Call filters and recording visibility controls
- Select All checkbox for bulk agent selection
- Audio player improvements and bigger login logo
v4.0
Feb 10, 2026 · 4:40 AM EST
UPDATE
Simulation Engine Upgrades
- 5 simulation optimizations: model upgrade, smarter grading, new metrics
- Voice simulation auto-retry with validation and stronger prompts
- Raised call duration thresholds for better QA quality
v3.0
Feb 10, 2026 · 3:58 AM EST
UPDATE
Call Recordings + MP3 Download
- Call recordings for all agent-to-agent test calls
- MP3 download and unified recording player
- Fixed recording playback with stronger call duration
v2.0
Feb 10, 2026 · 3:28 AM EST
UPDATE
Auth + Login Redesign
- Branded login page with VoxGrade logo and brand colors
- Password protection with auth restore
- Redesigned topbar with prominent agent selector
v1.0
Feb 10, 2026 · 3:21 AM EST
FOUNDATION
Initial Launch
- VoxGrade deployed to Vercel
- Retell AI integration for real voice agent calls
- VoxGrade branding and initial dashboard UI