v21.7
Feb 12, 2026 · 11:48 AM EST
MAJOR
Data Flywheel & 10x Testing System
- 5-layer data flywheel system
- Supabase persistent data lake (vacc schema with 5 tables)
- Score trend analytics tab with timeline charts and phase heatmap
- Agent leaderboard with benchmarks and trend tracking
- Golden test suite for regression prevention
- Parameter grid search optimizer
- Cross-agent learning and platform insights
- Supabase write-through on all pipeline runs
- Settings impact analysis
- Auto-golden-suite from A-grade pipeline runs
v21.6
Feb 12, 2026 · 11:37 AM EST
UPDATE
Gold Standard Settings Algorithm
- Updated audit defaults to evidence-based values (responsiveness 0.9, interruption 0.9, voice temp 0.9)
- Added tool_call_strict audit with speak-during-execution detection
- Updated role check to accept Personality-embedded roles
- Tightened prompt length thresholds (2500/3000 tokens, 10K/5K chars)
- Updated normalizeForSpeech to warn when not explicit
- Auto-fix patches now use 0.9 instead of old 0.7/0.6 values
v21.5
Feb 12, 2026 · 10:09 AM EST
FIX
Fix Copy Prompt stale cache bug
- Copy Prompt now always fetches fresh from Retell API instead of using stale local cache
v21.4
Feb 12, 2026 · 10:03 AM EST
UPDATE
VoxGrade Logo + Full Rebrand
- Replace CallSetterAI logo with VoxGrade branded SVG, update all branding refs
v21.3
Feb 12, 2026 · 8:54 AM EST
FIX
Gemini model ID fix, 5-test default
- Fixed invalid google/gemini-2.5-flash-preview model ID to google/gemini-2.5-flash
- Voice test defaults to 5 scenarios checked (Quick 5) per user request
- All 10 still available via Full 10 button
v21.2
Feb 12, 2026 · 8:47 AM EST
FIX
Scenario Gen Fix + Backup Overhaul
- Fixed scenario generation: increased max_tokens from 8K to 16K for 20-phase JSON output
- Added 240s timeout for scenario gen LLM calls
- Added truncated JSON repair for partial outputs
- Made llmCallStrong timeout and max_tokens configurable via opts
- Fixed backup system: added .git/ exclusion, removed --delete flag, added 3 missing projects (blindbox-creator, client-portal, flappy-cinnamoroll-app), added vault mount retry with disk wake
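The truncated JSON repair mentioned above is not described further in this changelog. As a rough sketch of one common approach (the function name `repairTruncatedJson` and its exact behavior are assumptions, not the shipped code), the output can be scanned for unclosed strings, objects, and arrays, which are then closed before parsing:

```javascript
// Hypothetical helper: closes whatever the LLM output left open, then parses.
// Returns the parsed value, or null when the repaired text is still invalid
// (for example, output cut off right after a key's colon).
function repairTruncatedJson(text) {
  const closers = []; // stack of closing brackets still owed
  let inString = false;
  let escaped = false;

  for (const ch of text) {
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }

  let repaired = text;
  if (inString) repaired += '"';              // close a dangling string
  repaired = repaired.replace(/,\s*$/, "");   // drop a trailing comma
  repaired += closers.reverse().join("");     // close containers, innermost first

  try {
    return JSON.parse(repaired);
  } catch {
    return null;
  }
}
```

A repair like this only recovers the portion of the output that had already arrived, so a caller would still want to check that the expected number of phases is present.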
v21.1
Feb 12, 2026 · 8:36 AM EST
FIX
Login body parsing, LLM fallback on 400/500
- Fixed Vercel body parsing for login and signup APIs (req.body pre-parse support)
- LLM fallback chains now catch 400 and 500 errors in addition to 402 and 429
- Both llmCall and llmCallStrong auto-cascade through model chain on any error
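The cascade described above can be pictured roughly as follows. This is a minimal sketch, not the actual llmCall/llmCallStrong code: `callWithFallback` and the `callModel` callback are hypothetical names, and the real chain may log or classify errors differently.

```javascript
// Hypothetical helper: tries each model in order and returns the first success.
// Any thrown error (402, 429, 400, 500, network failure, ...) moves on to the next model.
async function callWithFallback(models, callModel) {
  const errors = [];
  for (const model of models) {
    try {
      const result = await callModel(model);
      return { model, result };
    } catch (err) {
      errors.push(`${model}: ${err.message}`);
    }
  }
  // Every model in the chain failed; surface all collected reasons.
  throw new Error(`All models failed: ${errors.join("; ")}`);
}
```

Catching every error keeps the chain simple, at the cost of also retrying on failures that another model cannot fix, such as a malformed request.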
v21.0
Feb 12, 2026 · 8:00 AM EST
MAJOR
Voice quality intelligence, 25 scenarios, historical learning
- 25 scenario types with dynamic voice pool replacing old 10 phases
- LLM 402/429 auto-fallback chains for llmCall and llmCallStrong
- Historical prompt + call data storage in KV (archivePromptVersion, archiveCallResult, updateQualityBaseline)
- Fixed latency parsing bug (latency.llm.p50 not llm_latency_p50)
- Enhanced voice metrics: TTS latency, ASR latency, caller sentiment
- Prompt version history UI with score badges, golden tags, restore and compare
- Recording player with audio flag and call summary display
- Pipeline wiring: auto-archive calls and prompts after test runs
v20.6
Feb 12, 2026 · 7:06 AM EST
FIX
Revert to 5 scenarios
- Reverted NUM_PHASES from 10 to 5 for text sim and voice tests
- Scenarios 6-10 disabled (11labs-Charlie and 11labs-George voices not available)
- 5 text + 5 voice = stable proven configuration
v20.5
Feb 12, 2026 · 7:05 AM EST
UPDATE
Batch Agent Testing + Auto-Retry + Agent Comparison
- Batch Test All Agents: one-click sequential testing across entire fleet with live progress modal
- Auto-Retry Failed Phases: automatically re-runs phases scoring <40%, invalid, or role-swapped (max 3/run)
- Agent Comparison View: leaderboard table + phase heatmap across all tested agents
- Auto-retry toggle in test config (enabled by default)
- Batch results persisted to localStorage + KV
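The auto-retry rule above can be sketched as a simple filter. The result shape (`phase`, `score`, `invalid`, `roleSwapped`) is hypothetical, and reading "max 3/run" as "at most three phases retried per run" is an assumption.

```javascript
const RETRY_SCORE_THRESHOLD = 40; // phases scoring below 40% are re-run
const MAX_RETRIES_PER_RUN = 3;    // assumed meaning of "max 3/run"

// `results` is a hypothetical array of { phase, score, invalid, roleSwapped }.
// Returns the phase numbers that qualify for a retry, capped per run.
function selectPhasesToRetry(results) {
  return results
    .filter((r) => r.invalid || r.roleSwapped || r.score < RETRY_SCORE_THRESHOLD)
    .slice(0, MAX_RETRIES_PER_RUN)
    .map((r) => r.phase);
}
```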
v20.4
Feb 12, 2026 · 6:53 AM EST
MAJOR
Scorecard overhaul, auto-pass pipeline, API isolation
- Comprehensive agent scorecard with donut chart, 4-metric breakdown, scenario grid, and trend sparklines
- Pipeline auto-optimization loop runs up to 3 cycles until agent passes
- Per-user API key isolation for multi-tenant accounts
- Chat history saves all text sim transcripts across runs
- Test-to-test comparison with percentage gain arrows
- Diff-style fix preview showing old vs new prompt text
- S3 grading no longer penalizes self-correcting callers
- Settings audit fixed incorrect Retell defaults
v20.3
Feb 12, 2026 · 6:39 AM EST
UPDATE
Phase 6-10 scoring in production + P9 regex widened
- CRITICAL: Added Phase 6-10 scoring logic to app.html gradeTextPhase() (was only in test scoring-engine.js)
- Phase 6: Accent comprehension bonus (+5 if agent never asks to repeat)
- Phase 7: Emotional floor (55% if agent acknowledges frustration)
- Phase 9: Elderly patience bonus (+5 if agent repeats info) with widened regex
- Phase 10: Security cap (40% if info leaked) / floor (70% if blocked)
- Widened P9 regex: added "to clarify", "once more", "one more time", "i'll spell", "spelling that out"
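The phase 6-10 rules listed above combine bonuses, floors, and caps. The sketch below shows one way they could be applied; it is not the gradeTextPhase() source, and the flag names are hypothetical. The final clamp follows the 0-100 clamping noted in the v20.2 entry.

```javascript
// Keeps a score inside the 0-100 range.
function clamp(score) {
  return Math.min(100, Math.max(0, score));
}

// `flags` is a hypothetical object of booleans derived from the transcript.
function applyPhaseAdjustments(phase, baseScore, flags = {}) {
  let score = baseScore;
  switch (phase) {
    case 6: // accent comprehension bonus
      if (flags.neverAskedToRepeat) score += 5;
      break;
    case 7: // emotional floor
      if (flags.acknowledgedFrustration) score = Math.max(score, 55);
      break;
    case 9: // elderly patience bonus
      if (flags.repeatedInfo) score += 5;
      break;
    case 10: // security cap takes priority over the floor
      if (flags.infoLeaked) score = Math.min(score, 40);
      else if (flags.attackBlocked) score = Math.max(score, 70);
      break;
  }
  return clamp(score);
}
```

For example, a Phase 6 base score of 98 with the bonus ends at 100 rather than 103, and a Phase 10 run that leaks information is held to 40 regardless of its base score.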
v19.8
Feb 12, 2026 · 6:36 AM EST
UPDATE
API key isolation for multi-tenant
- Per-user API key isolation for apiIsolated accounts
- Retell proxy checks user KV for own API key
- User save_keys endpoint with key validation
- Frontend API key setup modal blocks usage until keys entered
- Settings page API key management section
- getQAKey respects user isolation
- Chat history feature saves text sim transcripts across runs
- closeModal guard for mandatory modals
v20.2
Feb 12, 2026 · 6:32 AM EST
FIX
Score clamp bugfix + 21 new phase 6-10 tests
- Fixed: Score could exceed 100 when phase bonuses added to high base scores (clamped to 0-100)
- Added 21 comprehensive phase 6-10 scoring tests (phase6-10-test.js)
- Tests cover: P6 accent bonus, P7 emotional floor, P9 elderly bonus, P10 security cap/floor
- Edge case coverage: bonus+AH interaction, cap+autofail, floor+deflection, boundary conditions
v20.1
Feb 12, 2026 · 6:23 AM EST
FIX
Smart Diagnostics + Score History + Export Reports
- Smart Diagnostics Panel: phase-specific fix recommendations for failing scenarios
- Score History modal with SVG trend chart and run history
- KV persistence for voice test scores (cross-device, 90-day retention)
- HTML report export with full breakdown per phase
- CSV export for spreadsheet analysis
- Export/History/CSV buttons in results view
v20.0
Feb 12, 2026 · 5:49 AM EST
MAJOR
10-Point Voice Testing Framework
- Expanded from 5 to 10 test phases (Heavy Accent/ESL, Emotional/Frustrated, Speed Talker, Elderly/Hard of Hearing, Adversarial/Social Engineering)
- 10 deep caller personas with backstories, accents, filler words, catchphrases
- 5 new ElevenLabs voice profiles (Myra, Lily, Charlie, George, Ethan)
- 10-point scoring rubric with phase-specific metrics (Accent Comprehension, Emotional Intelligence, Security & Guardrails)
- Quick 5 / Full 10 preset buttons in voice test UI
- Updated scoring engine + golden dataset for phases 6-10
- Created Voice Testing SOP (docs/VOICE_TESTING_SOP.md)
v19.7
Feb 12, 2026 · 5:32 AM EST
UPDATE
Test comparison, fix preview diffs, S3 grading
- Test-to-test comparison with % gain arrows
- Diff-style fix preview showing old/new text with rationale
- S3 grading no longer penalizes self-correcting callers
- Golden dataset bounds updated for Phase 6 accent bonus
- Scoring engine _agentTurns hoisting fix
- renderFixPreviewHTML shared across all fix modals
- toggleFixDiffFullPrompt for full prompt toggle
v19.5
Feb 12, 2026 · 5:06 AM EST
FIX
Text sim fix, score display, audit improvements
- Fixed _seed is not defined error that broke all text simulations
- Agent health cards now show audit score (97%) as primary with test sim score as secondary
- Trend arrows show percentage gain/loss since last test run
- Randomized session seeds and phone numbers per test run
- Proper audit fix with confirmation modal
- Big animated notifications on fix completion
- Re-Scan Score button on audit page
v19.6
Feb 12, 2026 · EST
FEATURE
Affiliate Auto-Enroll + Before/After Comparison Reports
- Share a QA report = auto-enrolled as affiliate (earn 30% commission on referrals)
- Shared reports include "Audit Your Agents Free" CTA with affiliate tracking link
- New before/after comparison card: side-by-side grade rings showing improvement
- "Share Before/After Report" button added to pipeline completion screen
- 3-tier commission: 30% Starter, 35% Growth (10+ refs), 40% Elite (25+ refs)
- Affiliate clicks tracked on report views
- Report branding updated to VoxGrade
v19.4
Feb 12, 2026 · 4:41 AM EST
UPDATE
Audit view-all tab, retest details
- Added All tab to audit page - see entire 30-point report in one view
- Added Fix All Failures button on audit page
- Failures Only and Show All filter buttons on audit
- Voice retest summary shows detailed new scores, transcript excerpts, and issue breakdowns
v19.3
Feb 12, 2026 · 4:39 AM EST
UPDATE
Fix preview, retest details, button state
- Auto-fix buttons now show preview of proposed changes before applying - user must confirm
- Fix buttons disable after applying with Fixes Applied badge - re-enables on new test run
- Voice retest summary now shows detailed new results, including scores, transcript excerpts, and issues
- Random caller names per test run instead of seeded repeats
v19.2
Feb 12, 2026 · 4:31 AM EST
FIX
Disable auto-fix, randomize test names
- Removed 4 auto-fire triggers - fixes now require user click
- Text sim uses random caller names each run instead of same seeded names
- Expanded name pool from 30 to 40 first and last names
v19.1
Feb 12, 2026 · 4:14 AM EST
FIX
Security Lockdown: Disable Signup + Single Admin
- Signup endpoint now captures waitlist only - NO account creation, NO sessions granted
- Deleted all 7 user accounts from KV, only victor@tested.media remains
- All existing sessions invalidated via rotated SESSION_SECRET
- Password updated across all auth files and smoke tests
- Pipeline overlay uses stable DOM architecture (no more blink/glitch)
v19.0
Feb 12, 2026 · EST
FEATURE
Affiliate Auto-Enroll via Report Sharing
- Share a QA report = auto-enrolled as affiliate (earn 30% commission on referrals)
- Shared reports now include "Audit Your Agents Free" CTA with affiliate tracking link
- Affiliate clicks tracked when shared reports are viewed
- 3-tier commission: 30% Starter, 35% Growth (10+ refs), 40% Elite (25+ refs)
- Auto-approved affiliate status (no manual review needed for report sharers)
- Updated report branding to VoxGrade
v18.9
Feb 12, 2026 · 4:03 AM EST
FIX
Password Rotation and Session Revoke
- Changed admin access password
- Rotated SESSION_SECRET to invalidate all existing sessions
- All previous logins are now revoked
- Updated Vercel env vars ACCESS_PASSWORD and SESSION_SECRET
v18.8
Feb 12, 2026 · 3:57 AM EST
UPDATE
Client Intake Wizard Rebuild
- Rebuilt intake wizard with auto-fill: industry selection auto-populates greetings, questions, objections, business type, and purpose
- Added 3 greeting examples per industry (warm intro, returning caller, after-hours) across all 10 industries
- Added Business Type field (Local Service, B2C, B2B, E-Commerce, Professional, SaaS)
- Added Agent Personality/Tone selector (5 options)
- Added Booking and Transfer Rules field
- Added animated step-by-step loading overlay with 5 phases and elapsed timer
- Fixed newline bug in LLM prompt construction (was sending literal backslash-n)
- Increased max_tokens from 2000 to 4096 for longer production prompts
- Added target agent selector directly in form (no post-generate selection needed)
- Added Generate and Run Pipeline one-click button
- Enhanced LLM system prompt with 9 required sections for comprehensive prompt output
v18.7
Feb 12, 2026 · 3:51 AM EST
FIX
Pipeline Orchestrator Flicker Fix
- Fixed pipeline overlay blinking/glitching caused by full DOM re-render every 1 second
- Timer tick now uses surgical DOM updates (3 elements) instead of innerHTML nuke
- CSS animations (fadeIn, slideUp, pulse, spin, shimmer) no longer restart every second
- Full render only triggers on actual state changes (step start, complete, error, retry)
v18.6
Feb 12, 2026 · 2:32 AM EST
UPDATE
Full GPT-4.1 Sim Pipeline
- All sim models upgraded to GPT-4.1 (agent, caller, grading)
- Agent: GPT-4.1 temp 0 (exact Retell match)
- Caller: GPT-4.1 temp 0.3 (reliable script following)
- Grading: GPT-4.1 temp 0.1 (accurate deterministic scoring)
- No more gpt-4o-mini anywhere in sim pipeline
v18.5
Feb 12, 2026 · 2:17 AM EST
FIX
Agent Sim Model Match + Email Fix
- Agent sim now uses gpt-4o at temp 0 (was gpt-4o-mini at 0.7) to match Retell GPT-4.1 config
- {{EMAIL}} no longer pre-populated in sim (was giving agent fake email, contradicting ZERO HALLUCINATION rule)
- P5 Memory test should now pass since gpt-4o handles multi-turn memory correctly
v18.4
Feb 12, 2026 · 1:52 AM EST
FIX
Deterministic Testing Protocol
- Grading temperature locked to 0.1 (was 0.7) for deterministic scoring
- Caller temperature lowered from 0.95 to 0.75
- Seeded PRNG for caller identity per agent+phase
- Seeded test variables per agent
- Voice grading temperature locked to 0.1 (was 0.8)
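A seeded PRNG keyed on agent and phase makes the same caller identity come out on every run. The changelog does not say which generator is used, so the sketch below swaps in two well-known small algorithms (FNV-1a for hashing, mulberry32 for generation); the name pool and function names are illustrative assumptions.

```javascript
// FNV-1a 32-bit hash: turns a string key into a numeric seed.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: small deterministic PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed;
  return function () {
    a |= 0;
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical name pool; the real pool is not listed in this changelog.
const CALLER_NAMES = ["Alex", "Jordan", "Priya", "Mateo", "Hannah"];

// Same agent + phase always yields the same caller name.
function pickCallerName(agentId, phase) {
  const rng = mulberry32(fnv1a(`${agentId}:${phase}`));
  return CALLER_NAMES[Math.floor(rng() * CALLER_NAMES.length)];
}
```

Calling `pickCallerName("agent_1", 3)` twice returns the same name, which is what makes reruns comparable.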
v18.3
Feb 12, 2026 · 1:10 AM EST
MAJOR
Per-User Auth + Pricing + Feature Gating
- Identity-aware session tokens (email encoded in HMAC)
- Server-side plan enforcement via /api/user
- Usage tracking with daily limits (429 on exceed)
- Pricing modal with Stripe checkout
- Plan badge in topbar
- Account modal shows plan limits + trial days
- Auto-upgrade banners for free/trial users
- Unified upgradeCheckout -> startCheckout
- Deprecated client-side sim tracking in favor of server-side tracking
v18.2
Feb 12, 2026 · 12:34 AM EST
FIX
Retest Only Failed Scenarios
- fullOptimize now retests only failed scenarios instead of all 5
- scoring-engine.js synced with multi_turn_memory phaseKey
v18.1
Feb 12, 2026 · 12:16 AM EST
FOUNDATION
Automated Testing Framework
- 4-layer test framework: pre-deploy gate, post-deploy smoke tests, golden dataset regression, deploy integration
- Pre-deploy blocks deploy on syntax errors, scoring regression, version mismatch, hardcoded secrets
- 15 golden test transcripts across P1-P5 phases
- Post-deploy smoke tests verify 8 production endpoints
- npm test runs full pre-deploy gate
v18.0
Feb 11, 2026 · 11:40 PM EST
MAJOR
Scoring Algorithm Hardening + Revenue Infrastructure
- Fixed critical scoring bug: removed hardcoded voiceScore:75 that inflated pipeline scores by ~19 points
- Fixed deflection cap conflicting with Phase 4 knowledge boundary rules
- Added penalty caps (length max -25, stacking max -15)
- Improved question counting to exclude rhetorical patterns
- Fixed double-weighted audit scores in pipeline final grade
- Added Stripe checkout API, webhook notification API, scheduled monitoring cron
- Upgraded QA endpoint with proper threshold tests
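The penalty caps above bound how much a single run can lose to length and question stacking. A minimal sketch, assuming penalties accrue per occurrence at the v17.8 rates (-5 per response over 100 words, -3 per stacked question); the per-occurrence model and the function name are assumptions:

```javascript
const LENGTH_PENALTY = 5;        // points per over-long response (v17.8)
const STACKING_PENALTY = 3;      // points per stacked question (v17.8)
const LENGTH_PENALTY_CAP = 25;   // length max -25
const STACKING_PENALTY_CAP = 15; // stacking max -15

// Returns the total points to subtract, with each penalty type capped separately.
function totalPenalty(longResponses, stackedQuestions) {
  const length = Math.min(longResponses * LENGTH_PENALTY, LENGTH_PENALTY_CAP);
  const stacking = Math.min(stackedQuestions * STACKING_PENALTY, STACKING_PENALTY_CAP);
  return length + stacking;
}
```

Under these assumptions, ten long responses and ten stacked questions cost 40 points in total instead of 80.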
v17.9
Feb 11, 2026 · 11:29 PM EST
FIX
RunTextSim Null Fix
- Fixed TypeError null reference in runTextSim when called from rerunTextSimsForAgent - added DOM element null checks
v17.8
Feb 11, 2026 · 11:02 PM EST
MAJOR
Full Pipeline + Scoring + Certs
- Industry templates (10 industries)
- Auto-populate intake wizard
- Save & Run Pipeline button
- Deep-audit integration (40/60 blend)
- Response length penalty (-5pts >100 words)
- Question stacking penalty (-3pts)
- Deflection detection
- Updated scoring weights (25/50/25)
- Auto-certification on pipeline completion (60%+ score)
- Cert badge in completion overlay
- Cert/badge/deep-audit APIs live
v17.7
Feb 11, 2026 · 10:54 PM EST
UPDATE
Design System + Empty States
- Flat card design (no glassmorphism), empty states for Dashboard/TextSim/ScriptGen/Pipeline, skeleton loading animations, btn-cta class, card-selected class
v17.6
Feb 11, 2026 · 10:29 PM EST
FIX
Score Validation & Dedup Guard
- Fixed case-sensitive PASS check causing 0% scores
- Added duplicate edit detection to Prompt Surgeon
- Dedup integrated into all optimizer paths
- Enhanced surgeon prompt with explicit duplicate check rule
v17.5
Feb 11, 2026 · 10:08 PM EST
UPDATE
Remove Main Dashboard
- Removed Main Dashboard tab from Testing Lab - redundant with Text Testing
- Default tab now Text Testing when entering lab mode
- Batch Test already available in Text Testing header
- Cleaner UX with fewer tabs
v17.4
Feb 11, 2026 · 10:05 PM EST
FIX
Bulletproof Scoring
- Client-side score validation recalculates from criteria when LLM math is off by 10+
- Removed duplicate PASS_THRESHOLD (was 70 in optimizer, now uses global 65)
- Prompt Surgeon enhanced with P4 knowledge boundary and P5 memory awareness
- Auto-fail penalty validation with hallucination detection
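The client-side validation above guards against LLM arithmetic errors by recomputing the score from the graded criteria. A rough sketch, assuming each criterion carries `earned` and `max` points (a hypothetical shape) and that a difference of 10 or more triggers the override:

```javascript
const MAX_SCORE_DRIFT = 10; // "off by 10+" threshold

// `criteria` is a hypothetical array of { earned, max } point pairs.
// Returns the LLM's score unless it drifts too far from the recalculated one.
function validateScore(llmScore, criteria) {
  const max = criteria.reduce((sum, c) => sum + c.max, 0);
  if (max === 0) return llmScore; // nothing to recalculate from
  const earned = criteria.reduce((sum, c) => sum + c.earned, 0);
  const recalculated = Math.round((earned / max) * 100);
  return Math.abs(llmScore - recalculated) >= MAX_SCORE_DRIFT ? recalculated : llmScore;
}
```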
v17.3
Feb 11, 2026 · 9:23 PM EST
FIX
Unified Grade Thresholds
- Standardized all 25 grade threshold calculations to single source of truth (GRADE_THRESHOLDS constant: A=90 B=80 C=65 D=45). Previously had 4 different scales causing same score to get different grades. Added calcGradeLetter() and isPassingScore() helpers.
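A single threshold table with two helpers might look like the sketch below. The constant values come from the entry above; treating bounds as inclusive, grading anything under D as F, and passing at the C threshold (65, matching the global PASS_THRESHOLD noted in v17.4) are assumptions.

```javascript
const GRADE_THRESHOLDS = { A: 90, B: 80, C: 65, D: 45 };

// Maps a 0-100 score to a letter; below D is assumed to be F.
function calcGradeLetter(score) {
  if (score >= GRADE_THRESHOLDS.A) return "A";
  if (score >= GRADE_THRESHOLDS.B) return "B";
  if (score >= GRADE_THRESHOLDS.C) return "C";
  if (score >= GRADE_THRESHOLDS.D) return "D";
  return "F";
}

// Assumed rule: a score passes when it reaches the C threshold.
function isPassingScore(score) {
  return score >= GRADE_THRESHOLDS.C;
}
```

Routing every grade lookup through helpers like these is what prevents the same score from receiving different letters in different views.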
v17.2
Feb 11, 2026 · 9:10 PM EST
FIX
Dynamic Agent Lookup Fix
- Fixed 74 agent lookups that only searched static AGENTS array, breaking all retest/optimizer/auto-fix flows for dynamically loaded agents. Added findAgentById() helper. Added action buttons to retest summary modal (Fix & Retest, Optimize, Retest Again for still-failing scenarios).
v17.1
Feb 11, 2026 · 9:07 PM EST
MAJOR
Pipeline Orchestrator
- 9-step automated pipeline (audit, fix, re-audit, settings audit, fix settings, text sim, surgeon, re-test, final grade). New vertical stepper UI with progress tracking, retry/skip/abort per step, KV persistence for resume. New /api/pipeline endpoint for state management.
v17.0
Feb 11, 2026 · 8:18 PM EST
UPDATE
Pipeline Cancel Button
- Added cancel button to full pipeline overlay. Cancellation checks between every step. Clean cancel state with proper cleanup.
v16.9
Feb 11, 2026 · 8:14 PM EST
UPDATE
Full Pipeline Orchestrator
- One-click pipeline: audit->fix prompt->re-audit->fix settings->5-phase text sim->results. Fixed 0% audit score edge case bug. Added audit score to shareable reports. Pipeline button on every agent card with animated step-by-step overlay.
v16.8
Feb 11, 2026 · 8:02 PM EST
UPDATE
Integrated Scoring Algorithm
- Audit score (prompt + settings) now factors into overall grade: 30% audit + 50% text + 20% voice. Added getAgentSettingsHealth() and getAgentAuditScore() helpers. Updated all 3 grading call sites. Score breakdowns show audit component when available.
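As a worked example of the 30/50/20 blend above, the sketch below combines three component scores. It assumes all three components are present; how the real code handles a missing audit or voice score is not stated here, and `overallScore` is a hypothetical name.

```javascript
// Weights as integer percentages to avoid floating-point drift.
const SCORE_WEIGHTS = { audit: 30, text: 50, voice: 20 };

// Blends three 0-100 component scores into one overall score.
function overallScore({ audit, text, voice }) {
  const total =
    audit * SCORE_WEIGHTS.audit +
    text * SCORE_WEIGHTS.text +
    voice * SCORE_WEIGHTS.voice;
  return Math.round(total / 100);
}
```

With audit 90, text 70, and voice 80, the blend is 27 + 35 + 16 = 78.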
v16.7
Feb 11, 2026 · 7:04 PM EST
FIX
Grading Consistency Stabilization
- Repetition scoring capped at max -15 points (was uncapped, causing score nukes)
- Phase 4 hallucination trap scoring floor of 65% for agents that deflect honestly
- Phase 5 memory grading only scores criteria that were actually tested in transcript
- Phase 5 scenario generation now REQUIRES explicit memory-test actions in callerBehavior and keyTopics
v16.6
Feb 11, 2026 · 1:41 PM EST
FIX
Fix Phase 4 Grading
- Fixed Phase 4 hallucination trap grading: an agent that deflects all unknown questions honestly but is repetitive now scores 60-75% instead of 0% F
- Added explicit grading instruction that knowledge boundary tests prioritize no-hallucination over conversation quality
- Added repetition criterion (3+ identical asks = minor fail, not auto-fail)
v16.5
Feb 11, 2026 · 1:32 PM EST
UPDATE
Universal Agent Testing
- Dynamic prompt-aware voice testing for ANY agent type
- Prompt hash caching (24h TTL saves 20-40s on repeat runs)
- Agent-type-aware grading (sales/support/IVR/scheduling/collections/survey auto-detected)
- De-hardcoded caller prompts (removed Sofia-specific qualification rules)
- Dynamic expectedDurationSec from AI-generated scenarios
- Universal default grading criteria (not sales-specific)
- Scenario preview cards in voice test confirmation dialog
- Regenerate Scenarios button to clear cache
- Dynamic phase names in auto-fixer
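Prompt hash caching with a 24h TTL can be sketched as a small keyed store with expiry. This is an in-memory illustration only: the real storage backend is not described here, `createScenarioCache` is a hypothetical name, and the prompt hash is assumed to be computed by the caller.

```javascript
const SCENARIO_TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL

// `now` is injectable so expiry can be tested without waiting.
function createScenarioCache(now = Date.now) {
  const entries = new Map();
  return {
    // Returns cached scenarios for this prompt hash, or undefined if missing or expired.
    get(promptHash) {
      const hit = entries.get(promptHash);
      if (!hit) return undefined;
      if (now() - hit.storedAt >= SCENARIO_TTL_MS) {
        entries.delete(promptHash);
        return undefined;
      }
      return hit.scenarios;
    },
    set(promptHash, scenarios) {
      entries.set(promptHash, { scenarios, storedAt: now() });
    },
    // What a "Regenerate Scenarios" action would call.
    clear() {
      entries.clear();
    },
  };
}
```

Keying on a hash of the prompt means an edited prompt misses the cache automatically, while an unchanged prompt skips scenario generation on repeat runs.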
v16.4
Feb 11, 2026 · 1:15 PM EST
UPDATE
Auto-Optimize After Tests
- Auto-trigger prompt optimization when voice scenarios fail
- Test caller qualification fix (qualifying budget answers)
- Duplicate pricing label cleanup
v16.3
Feb 11, 2026 · 1:00 PM EST
FIX
Bug fixes
- Fix copyTranscript not defined error (was trapped inside renderScenarioCards scope, moved to global)
- Remove duplicate GPT-4o pricing entry with wrong Claude Sonnet label
- Fix all-N/A voice metrics edge case in getUnifiedGrade
v16.2
Feb 11, 2026 · 12:31 PM EST
UPDATE
Scoring System Overhaul
- Weighted voice metrics (latency 20%, disconnect 15%, interruption 15%, silence 15%, depth 10%, duration 10%, turns 5%, TTS 5%, repetition 5%)
- Missing data excluded from score instead of defaulting to 75 (was inflating grades)
- Dynamic turn count scales with profile duration (not hardcoded 24)
- Wider conversation depth thresholds (10-55 words PASS instead of 15-40)
- Stronger auto-fail penalties (25/30/35pts, hallucination 40pts each)
- Deterministic grading guidance in AI prompt
- Collapsible scenario cards (click header to expand/collapse, default collapsed)
- Increased poll timeout to 7.5min for S3 Confusion scenario
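The weighted voice score above, with missing data excluded rather than defaulted to 75, amounts to a weighted average over only the metrics that were actually measured. A minimal sketch under that reading; the metric key names are assumptions, and returning null when nothing was measured is one possible way to handle the all-N/A case.

```javascript
// Weights from the entry above (they sum to 100).
const VOICE_METRIC_WEIGHTS = {
  latency: 20, disconnect: 15, interruption: 15, silence: 15,
  depth: 10, duration: 10, turns: 5, tts: 5, repetition: 5,
};

// `metrics` maps metric names to 0-100 scores; missing or non-numeric entries are skipped.
function weightedVoiceScore(metrics) {
  let weighted = 0;
  let totalWeight = 0;
  for (const [name, weight] of Object.entries(VOICE_METRIC_WEIGHTS)) {
    const value = metrics[name];
    if (typeof value !== "number" || Number.isNaN(value)) continue;
    weighted += weight * value;
    totalWeight += weight;
  }
  // Renormalize by the weights actually used; null means no data at all.
  return totalWeight === 0 ? null : weighted / totalWeight;
}
```

With only latency (80) and disconnect (100) available, the score is (20 × 80 + 15 × 100) / 35, about 88.6, instead of being pulled toward a default by the seven missing metrics.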
v16.1
Feb 11, 2026 · 12:15 PM EST
FIX
Revert S3 voice to 11labs-James
- Reverted S3 Silence and Confusion voice from 11labs-Valentino (non-existent) back to 11labs-James. This was causing S3 ERR on all voice tests.
v16.0
Feb 11, 2026 · 12:07 PM EST
FIX
Remove auto-load results, clean voice test page
- Removed auto-load of previous test results on voice page render - was expanding all results and destroying the clean layout. Results still accessible via View Full Results button in saved history section.
v15.9
Feb 11, 2026 · 12:00 PM EST
FIX
Voice Test Layout Fix
- Fixed voice test results displaying in wrong page section
v15.8
Feb 11, 2026 · 11:56 AM EST
FIX
Smarter Auto-Optimization
- Surgical prompt edits preserve what's already working
- Voice test results persist across page refreshes
- Prompt History always accessible from voice test page
v15.7
Feb 11, 2026 · 11:45 AM EST
FIX
Prompt Surgeon + Realistic Callers
- Prompt Surgeon now applies targeted edits per scenario
- One-click Retest button on each scenario card
- Randomized caller names and accented voices for realistic testing
v15.6
Feb 11, 2026 · 11:29 AM EST
FIX
Auto-Optimizer Pipeline Fix
- Fixed optimizer data source to read actual test results
- Per-criterion failure extraction for targeted prompt fixes
- Retest now uses live results instead of stale data
v15.5
Feb 11, 2026 · 11:25 AM EST
FIX
Voice Test Stability Fix
- Fixed crash affecting all 5 voice test scenarios
- Admin users now get full feature access
- Improved error handling across test pipeline
v15.4
Feb 11, 2026 · 11:17 AM EST
FIX
Testing Engine Stability
- Fixed stale results when switching between agents
- Improved error detection for AI provider outages
- Global error handler with copy-to-clipboard for support
v15.3
Feb 11, 2026 · 10:52 AM EST
FIX
Text Simulation Reliability
- Fixed crashes in text simulation pipeline
- A/B comparison results now handle edge cases gracefully
- Self-learning engine wrapped with error recovery
v15.2
Feb 11, 2026 · 10:36 AM EST
FIX
Text Sim Error Handling
- Scenario cards now display errors cleanly with retry button
- Cleaned up duplicate error display code
v15.1
Feb 11, 2026 · 10:23 AM EST
UPDATE
Security Hardening
- Full system security audit - clean bill of health
- Hardened credential storage and access controls
- Automated secret detection blocks accidental key exposure
- Pre-deploy guardrails enforce version and changelog compliance
v15.0
Feb 11, 2026 · 9:23 AM EST
MAJOR
Production Launch Ready
- Complete auth system: signup, login, password reset with SHA-256 hashing
- Stripe payment integration with checkout, webhooks, and transactional emails
- Plan-based feature gating: Free 5 sims/month, Pro unlimited, Enterprise unlimited voice
- Trial lifecycle: 14-day trial with auto-downgrade and 8-email drip sequence
- User profile API with server-side data refresh on page load
- Daily email-triggers cron for trial, re-engagement, and retention emails
- Changelog auto-updates on every deployment via deploy.sh
v14.9
Feb 11, 2026 · 9:14 AM EST
MAJOR
Plan-Based Feature Access
- Free plan: 5 text simulations per month
- Pro plan: Unlimited sims, voice testing, A/B comparison, auto-optimization
- 14-day trial gets full Pro access
- Automated lifecycle emails for trial, engagement, and retention
v14.8
Feb 11, 2026 · 9:07 AM EST
MAJOR
Production Stability
- User profile syncs plan status on every page load
- Expired trials auto-downgrade to free plan
- Password reset and email triggers fully operational
v14.7
Feb 11, 2026 · 8:55 AM EST
MAJOR
Stripe Payments
- One-click upgrade to Pro and Enterprise from your account
- Stripe checkout with automatic plan activation
- Transactional emails for upgrades, cancellations, receipts, and failed payments
v14.6
Feb 11, 2026 · 8:37 AM EST
MAJOR
User Authentication
- Email and password signup with strength indicator
- Secure login with password visibility toggle
- 14-day free trial starts automatically on signup
- Login tracking and daily usage metrics
v14.5
Feb 11, 2026 · 3:55 PM EST
MAJOR
Live WebRTC Call Testing + Call Explorer + Transcript System
- Live web call testing via Retell WebRTC - test agents directly from browser with microphone
- Real-time live transcript display during calls with role-based coloring (AI Agent vs Caller)
- Call timer with elapsed time display during active calls
- Call detail viewer with full transcript, recording playback, and call analysis breakdown
- Auto-grade individual calls directly from Call Explorer
- Universal transcript copy system - works across all 6 transcript formats (live, text sim, scenario, call explorer, recording, chat bubbles)
- Recording audio player with inline playback controls
- Copy transcript button on every transcript view across the entire app
- Transcript toggle show/hide across all panels
- Call library integration - save and review past calls
- Stripe checkout endpoint for payment processing
- Enhanced mobile responsiveness (v14.5 CSS overhaul)
- Results dashboard consolidated view (v14.5 UX overhaul)
v14.4
Feb 11, 2026 · 3:30 PM EST
UPDATE
Navigation + SEO + Deploy Automation
- Updated nav: Features, Pricing, Use Cases, Docs, Blog, Login
- Expanded footer with 10 links (pricing, use-cases, compare, blog, changelog, affiliate, docs, contact, privacy, terms)
- JSON-LD SoftwareApplication schema for Google rich results
- Canonical URL for SEO deduplication
- Blog index page (was 404, now 200)
- Sitemap expanded to 18 URLs (added /blog, /press, /va-checklist)
- Auto-changelog: deploy.sh + git post-commit hook
v14.3
Feb 11, 2026 · 2:45 PM EST
UPDATE
Enhanced Mobile Responsiveness + UX Overhaul
- Complete mobile-responsive CSS overhaul across all app views
- Consolidated Results Dashboard with unified view
- UTM + affiliate referral tracking on landing page (localStorage + Plausible analytics)
- Affiliate click tracking endpoint (no auth required)
- Slack notification version tags updated
v14.2
Feb 11, 2026 · 11:30 AM EST
UPDATE
Commission Tiers + Auth Hardening
- Affiliate commission tiers implemented: 30% Starter / 35% Growth (10+) / 40% Elite (25+)
- Auto-tier upgrades based on referral count on dashboard load
- Password reset fully functional with SHA-256 + salt hashing
- Session invalidation on password change
- Google Auth, login, signup endpoints hardened with proper CORS
- Admin API expanded with affiliate management actions
v14.1
Feb 11, 2026 · 8:20 AM EST
MAJOR
Affiliate Program + Password Reset + Landing Pages
- Affiliate program: 30% Starter / 35% Growth (10+ referrals) / 40% Elite (25+ referrals) recurring commissions
- Affiliate dashboard with click tracking, conversion rates, earnings breakdown
- Auto-tier upgrades based on referral count
- Password reset with SHA-256 hashing, 1-hour token expiry, session invalidation
- 26 landing pages: pricing, docs, demo, use cases, comparison, tutorials, affiliate, contact, terms, privacy, press, changelog, email preview, login, signup, forgot/reset password, verify email, 404, admin, VA checklist, project dashboard, launch, trailer, screencast
- Sitemap.xml + robots.txt for SEO
- Blog infrastructure with SEO-optimized templates
v14.0
Feb 11, 2026 · 5:42 AM EST
MAJOR
Admin Dashboard
- Admin API with user management and system controls
- Admin dashboard page with analytics overview
- User listing, plan management, and activity logs
- System health monitoring from admin panel
v13.4
Feb 11, 2026 · 5:35 AM EST
UPDATE
Email Preview + Template Testing
- Email preview page for visual template testing
- Render any of 74 templates with sample data
- Mobile-responsive email preview mode
v13.3
Feb 11, 2026 · 5:28 AM EST
UPDATE
Automated Email Triggers
- Email triggers cron endpoint for automated sends
- Trial reminders at day 3, 5, 7, 10, 12, 13
- Re-engagement emails for inactive users
- Weekly digest and monthly recap automation
v13.2
Feb 11, 2026 · 5:18 AM EST
UPDATE
Retention + Referral Emails
- Churn prevention email sequences
- Referral program email templates
- Post-purchase follow-up and upsell emails
- Admin notification templates for system events
v13.1
Feb 11, 2026 · 5:08 AM EST
UPDATE
Onboarding + Engagement Emails
- Welcome email sequence for new signups
- Getting started guide emails with setup steps
- Engagement emails: feature highlights, tips, best practices
- Conversion emails for trial-to-paid nudges
v13.0
Feb 11, 2026 · 4:58 AM EST
MAJOR
Email Template Engine
- 72 branded email templates across 10 categories
- Template library (lib/emails.js) with dynamic data injection
- Email delivery service via Resend API
- Transactional email support for password resets, verifications
v12.4
Feb 11, 2026 · 4:48 AM EST
UPDATE
Stripe Webhook Handler
- Stripe webhook endpoint for real-time subscription events
- Handles plan upgrades, downgrades, cancellations
- Payment failure handling with retry logic
- Automatic user plan sync on billing changes
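One way to structure a handler like this is to map each Stripe event type to an internal action. The sketch below is an illustration under assumptions: the event type strings and object fields are standard Stripe ones, but the action names and the `planActionFor` function are invented here, and signature verification is left out.

```javascript
// Hypothetical sketch: maps an already-verified Stripe event to an internal action.
// The action names ("syncPlan", ...) are invented for illustration.
function planActionFor(event) {
  const obj = event.data.object;
  switch (event.type) {
    case "customer.subscription.updated": // upgrades and downgrades
      return { action: "syncPlan", customer: obj.customer, status: obj.status };
    case "customer.subscription.deleted": // cancellations
      return { action: "downgradeToFree", customer: obj.customer };
    case "invoice.payment_failed": // failed charges; Stripe retries on its own schedule
      return { action: "flagPaymentFailure", customer: obj.customer, attempt: obj.attempt_count };
    default:
      return { action: "ignore" };
  }
}
```

Keeping the mapping a pure function makes it easy to unit-test without calling Stripe.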
v12.3
Feb 11, 2026 · 4:38 AM EST
UPDATE
Google Auth + Email Verification
- Google OAuth integration for one-click signup
- Email verification page with token validation
- Verification email sent on registration
v12.2
Feb 11, 2026 · 4:28 AM EST
UPDATE
Password Reset Flow
- Password reset API with secure token generation
- Forgot password page with email input
- Reset password page with new password form
- Expiring reset tokens for security
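A common pattern for expiring reset tokens is to email a random token and store only its hash with an expiry time. The sketch below assumes that pattern. The 1-hour lifetime and SHA-256 are mentioned in a later release entry, but applying SHA-256 to the token (rather than elsewhere in the flow) is an assumption, as are all function and field names.

```javascript
const crypto = require("node:crypto");

const RESET_TTL_MS = 60 * 60 * 1000; // 1-hour expiry

// Creates a token to email to the user and a record to store server-side.
function createResetToken(nowMs = Date.now()) {
  const token = crypto.randomBytes(32).toString("hex");
  // Only the hash is stored, so a leaked store does not expose usable tokens.
  const tokenHash = crypto.createHash("sha256").update(token).digest("hex");
  return { token, record: { tokenHash, expiresAt: nowMs + RESET_TTL_MS } };
}

// Checks expiry first, then compares hashes in constant time.
function isResetTokenValid(token, record, nowMs = Date.now()) {
  if (nowMs > record.expiresAt) return false;
  const candidate = crypto.createHash("sha256").update(token).digest();
  const stored = Buffer.from(record.tokenHash, "hex");
  return candidate.length === stored.length && crypto.timingSafeEqual(candidate, stored);
}
```

The stored record should be deleted after a successful reset so the token cannot be reused.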
v12.1
Feb 11, 2026 · 4:18 AM EST
UPDATE
Login System + Session Management
- Login API with credential validation
- Session management with KV-backed tokens
- Auth middleware (lib/auth.js) for protected routes
- Logout endpoint with session cleanup
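The pieces above fit together roughly as shown below. This is a sketch only: an in-memory `Map` stands in for the KV store, and the function names, key format, and bearer-token header are assumptions, since the real lib/auth.js and lib/kv.js interfaces are not shown here.

```javascript
const crypto = require("node:crypto");

// In-memory Map standing in for the KV store; the real lib/kv.js API is not shown.
const kv = new Map();

// Login: issue a random session token and store it under a session key.
function createSession(userId) {
  const token = crypto.randomBytes(32).toString("hex");
  kv.set(`session:${token}`, { userId });
  return token;
}

// Logout: remove the session so the token stops working.
function destroySession(token) {
  return kv.delete(`session:${token}`);
}

// Express-style middleware: rejects requests without a valid bearer token.
function requireAuth(req, res, next) {
  const header = req.headers.authorization || "";
  const token = header.startsWith("Bearer ") ? header.slice(7) : "";
  const session = kv.get(`session:${token}`);
  if (!session) return res.status(401).json({ error: "Unauthorized" });
  req.userId = session.userId;
  next();
}
```

A real KV-backed version would make these calls asynchronous and set a time-to-live on each session key.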
v12.0
Feb 11, 2026 · 4:05 AM EST
MAJOR
User Registration + 14-Day Trial
- Signup API with user creation and trial activation
- 14-day free trial with automatic expiry tracking
- KV storage layer (lib/kv.js) for user data
- Signup landing page with registration form
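Automatic expiry tracking can be as simple as storing an end timestamp at signup and comparing against it later. The sketch below assumes that approach. The field and function names are hypothetical, and only the 14-day length comes from this entry.

```javascript
const TRIAL_DAYS = 14;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Fields a signup handler might store on the new user (names are assumed).
function startTrial(nowMs = Date.now()) {
  return { plan: "trial", trialEndsAt: nowMs + TRIAL_DAYS * MS_PER_DAY };
}

// Derives expiry state from the stored end timestamp.
function trialStatus(user, nowMs = Date.now()) {
  const msLeft = user.trialEndsAt - nowMs;
  return { expired: msLeft <= 0, daysLeft: Math.max(0, Math.ceil(msLeft / MS_PER_DAY)) };
}
```

Storing the end timestamp rather than a countdown means no background job is needed to keep the value current.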
v11.2
Feb 11, 2026 · 3:50 AM EST
UPDATE
Vercel Routing + Clean URLs
- vercel.json route configuration for all API endpoints
- Clean URL rewrites for landing pages
- CORS and security headers configuration
v11.1
Feb 11, 2026 · 3:40 AM EST
UPDATE
Mobile Responsive + Data API
- Enhanced mobile responsiveness across all views
- New data API endpoint for external integrations
- Server-side performance improvements
v11.0
Feb 11, 2026 · 3:00 AM EST
MAJOR
Server Architecture + API Layer
- Rebuilt server architecture with Express.js routing
- Vercel serverless API endpoints for all backend operations
- Modular lib/ directory for shared utilities
- Retell API integration module
v10.2
Feb 10, 2026 · 11:39 PM EST
UPDATE
One-Click Prompt Fixes
- One-click prompt fixes with live re-scan after applying
- Audit timestamp tracking for historical fix records
v10.1
Feb 10, 2026 · 11:33 PM EST
UPDATE
Actionable Audit Items
- Actionable audit items with auto-fix recommendations
- Dashboard quick-links to flagged issues
- Priority-ranked recommendations engine
v10.0
Feb 10, 2026 · 11:05 PM EST
MAJOR
Multi-Platform Support + Auto-Apply + Phone Management
- Auto-apply prompt optimizations pushed directly to Retell
- Vapi platform support alongside Retell AI
- Phone number manager with call routing configuration
- Cron-based scheduled testing (4h, 8h, 12h, 24h intervals)
- Marketing landing page launch
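The four scheduling intervals listed above translate directly into standard 5-field cron expressions. The sketch below shows one possible mapping. The `cronFor` helper and the choice to run on the hour are assumptions, not the project's actual configuration.

```javascript
// Interval choices from the changelog; the strings are standard 5-field cron expressions.
const CRON_BY_INTERVAL_HOURS = {
  4: "0 */4 * * *",   // minute 0 of every 4th hour
  8: "0 */8 * * *",   // minute 0 of every 8th hour
  12: "0 */12 * * *", // 00:00 and 12:00
  24: "0 0 * * *",    // once a day at midnight
};

// Returns the cron expression for a supported interval, or throws.
function cronFor(hours) {
  const expr = CRON_BY_INTERVAL_HOURS[hours];
  if (!expr) throw new RangeError(`Unsupported interval: ${hours}h`);
  return expr;
}
```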
v9.7
Feb 10, 2026 · 10:35 PM EST
UPDATE
Shareable Reports + Realistic Callers
- Shareable QA reports with premium dark mode design
- Dashboard analytics cards with score trends
- Overhauled scenario generation for realistic caller behavior
- Fixed caller realism: callers no longer volunteer financial details unprompted
- Auto-stop audio playback on tab switch
v9.6
Feb 10, 2026 · 9:59 PM EST
UPDATE
A/B Prompt Comparison
- A/B prompt comparison: clone, modify, compare side-by-side
- Mobile responsive layout improvements
v9.5
Feb 10, 2026 · 9:55 PM EST
UPDATE
PDF Export + Batch Testing
- PDF report export with branded formatting
- Batch testing across multiple scenarios
- Dedicated settings page for configuration
v9.4
Feb 10, 2026 · 8:58 PM EST
UPDATE
Scheduled Monitoring + Agent Health Checks
- Scheduled monitoring with custom grading criteria
- Agent health check dashboard
- Floating space background for login page
- Safety wrappers for all widget components
- Fixed text testing crash from cached localStorage scenarios
- Fixed name randomization by using a global replace, so every occurrence of a name is substituted
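The global-replace fix in this release matches a common JavaScript pitfall: `String.prototype.replace` with a string pattern changes only the first match. The sketch below illustrates the difference. The `{{NAME}}` placeholder and the sample script are made up, as the original code is not shown.

```javascript
// Sample script with an assumed {{NAME}} placeholder appearing twice.
const script = "Hi, this is {{NAME}}. Again, {{NAME}} calling about my account.";

// A string pattern replaces only the first match.
const firstOnly = script.replace("{{NAME}}", "Dana");

// A regex with the g flag (or String.prototype.replaceAll) replaces every match.
const everyMatch = script.replace(/\{\{NAME\}\}/g, "Dana");

console.log(firstOnly);
console.log(everyMatch);
```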
v9.3
Feb 10, 2026 · 8:01 PM EST
UPDATE
Prompt Version History + Slack Alerts
- Prompt version history with diff view between versions
- Slack webhook alerts for failed tests
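Slack incoming webhooks accept a JSON body with a `text` field, so a failed-test alert might be sent roughly as below. This is a sketch under assumptions: the shape of `result`, the 100-point scale, and both function names are hypothetical, and the webhook URL would come from configuration.

```javascript
// Builds the JSON payload for a Slack incoming webhook.
// The fields on `result` are assumed for illustration.
function buildSlackAlert(result) {
  return {
    text: `Test failed for agent ${result.agent}: "${result.scenario}" scored ${result.score}/100`,
  };
}

// Posts the alert using the global fetch available in Node 18+.
async function sendSlackAlert(webhookUrl, result) {
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildSlackAlert(result)),
  });
  if (!res.ok) throw new Error(`Slack webhook returned ${res.status}`);
}
```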
v9.2
Feb 10, 2026 · 7:50 PM EST
UPDATE
Dashboard Score Trends
- Dashboard score trends over time with charts
- Notification system wired to key events
- Quality gate tracking and pass/fail history
v9.1
Feb 10, 2026 · 7:41 PM EST
MAJOR
Dynamic Caller AI + Self-Learning Engine
- Dynamic caller AI with random personas and conversation variety
- Self-learning engine that adapts to test patterns
- Database hydration and persistence hooks for all data types
- Audit tab redesign with actionable insights
v9.0
Feb 10, 2026 · 6:53 PM EST
MAJOR
Database Persistence Layer (Upstash KV)
- Upstash KV database integration for persistent storage
- All test results, reports, and settings now persist across sessions
- Text Testing UI cleanup and performance improvements
v8.9
Feb 10, 2026 · 6:13 PM EST
UPDATE
Single-Agent Focus Redesign
- Redesigned Overview, Settings Audit, and Text Testing for single-agent focus
- Streamlined UI to center workflows around one agent at a time
v8.8
Feb 10, 2026 · 6:01 PM EST
UPDATE
UI Simplification + Context Bar Redesign
- Removed Compare tab in favor of dedicated A/B testing
- Moved Costs tab from Testing Lab to top-right badge
- Redesigned context bar and agent selector
- Fixed duplicate caller names, enforced unique scenarios
v8.7
Feb 10, 2026 · 5:38 PM EST
UPDATE
Cost Analytics + Layout Overhaul
- Cost analytics page with historical spend tracking
- Center-aligned all navigation, titles, headers, and context bar
- Instant loading feedback for voice tests
- Fixed voice test caller talking over agent at call start
v8.6
Feb 10, 2026 · 3:52 PM EST
MAJOR
Prompt-Based Voice Testing
- Prompt-based voice testing with AI-driven call scripts
- Center-aligned navigation and mode toggle
- Critical bug fixes for testing pipeline
v8.5
Feb 10, 2026 · 2:59 PM EST
UPDATE
Caller Realism Overhaul
- Complete overhaul of caller conversation scripts for natural behavior
- Callers now behave like real humans with realistic pauses and responses
v8.4
Feb 10, 2026 · 2:26 PM EST
UPDATE
Prompt Auditor + Quality Gates
- Prompt auditor tabs with spend breakdown per prompt version
- Quality gates with grade feedback and self-analysis
- Conversation history tracking across test runs
- Loading spinner during AI script generation
- Clean SVG notification bell icon
- Fixed voice tests that used hardcoded scripts instead of AI-generated ones
v8.3
Feb 10, 2026 · 1:48 PM EST
FIX
Voice Testing Stability
- Forced AI script generation on every test run
- Model upgrade for more reliable voice test responses
- Role fix and visual loader improvements
v8.2
Feb 10, 2026 · 1:35 PM EST
FIX
Voice Testing 404 Fix
- Fixed voice testing 404 errors caused by invalid ElevenLabs voice IDs
v8.1
Feb 10, 2026 · 1:16 PM EST
FIX
Agent Not Found Fix
- Fixed voice testing agent-not-found bug
- Removed legacy phases 6-8 from test pipeline
v8.0
Feb 10, 2026 · 9:22 AM EST
MAJOR
QA Engine Overhaul
- Removed phases 6-8 and renamed remaining phases to scenarios
- Removed legacy costs tab from testing interface
- Added progress indicator for test runs
- Streamlined testing flow for faster iteration
v7.3
Feb 10, 2026 · 9:11 AM EST
FIX
Caller Agent Quality Fix
- Strict caller prompts with self-analysis and lower temperature
- Fixed null reference crash in generateTestScenarios
v7.2
Feb 10, 2026 · 8:49 AM EST
FIX
Voice Test Quality
- Reduced to 5 phases for focused testing
- Auto-generate test scripts per run instead of static scripts
- Removed pulsing logo animation for cleaner UI
v7.1
Feb 10, 2026 · 8:41 AM EST
FIX
Test Script Personas
- Unique test personas across all 8 phases
- Shorter, more focused test scripts
v7.0
Feb 10, 2026 · 8:39 AM EST
MAJOR
Major QA Overhaul
- Complete QA system overhaul: shorter calls, faster feedback
- Per-test cost tracking for budget visibility
- UX polish across all testing interfaces
v6.0
Feb 10, 2026 · 7:31 AM EST
UPDATE
Live Transcript + Phone Assignment
- Live call transcript with real-time updates
- Stop button to end calls mid-test
- 4-minute safety timeout for runaway calls
- Phone number assignment with dynamic phone map loading
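A safety timeout like the 4-minute limit above is typically a timer that force-stops the call unless it is cancelled first. The sketch below assumes that design. `guardCall` and the `stopCall` callback are hypothetical names, and the real stop logic is not shown in this changelog.

```javascript
const MAX_CALL_MS = 4 * 60 * 1000; // 4-minute safety limit

// Starts a timer that invokes stopCall if the call is still running at the limit.
// Returns a cancel function to call when the call ends normally or is stopped by the user.
function guardCall(stopCall, limitMs = MAX_CALL_MS) {
  const timer = setTimeout(stopCall, limitMs);
  return () => clearTimeout(timer);
}
```

The returned cancel function can be shared with a manual stop button, so both paths clear the same timer.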
v5.0
Feb 10, 2026 · 5:45 AM EST
UPDATE
QA Leaderboard + Call Filters
- QA Leaderboard tab ranking agents by test performance
- Call filters and recording visibility controls
- Select All checkbox for bulk agent selection
- Audio player improvements and bigger login logo
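A leaderboard that ranks agents by test performance could be computed as in the sketch below. Ranking by average score is an assumption, as are the `rankAgents` name and the `{ agent, score }` result shape. The changelog does not say which metric is used.

```javascript
// Groups test results by agent and ranks agents by average score, highest first.
function rankAgents(results) {
  const totals = new Map();
  for (const { agent, score } of results) {
    const t = totals.get(agent) || { sum: 0, count: 0 };
    t.sum += score;
    t.count += 1;
    totals.set(agent, t);
  }
  return [...totals]
    .map(([agent, t]) => ({ agent, average: t.sum / t.count }))
    .sort((a, b) => b.average - a.average);
}
```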
v4.0
Feb 10, 2026 · 4:40 AM EST
UPDATE
Simulation Engine Upgrades
- 5 simulation optimizations: model upgrade, smarter grading, new metrics
- Voice simulation auto-retry with validation and stronger prompts
- Raised call duration thresholds for better QA quality
v3.0
Feb 10, 2026 · 3:58 AM EST
UPDATE
Call Recordings + MP3 Download
- Call recordings for all agent-to-agent test calls
- MP3 download and unified recording player
- Fixed recording playback by enforcing a longer minimum call duration
v2.0
Feb 10, 2026 · 3:28 AM EST
UPDATE
Auth + Login Redesign
- Branded login page with VoxGrade logo and brand colors
- Password protection with auth restore
- Redesigned topbar with prominent agent selector
v1.0
Feb 10, 2026 · 3:21 AM EST
FOUNDATION
Initial Launch
- VoxGrade deployed to Vercel
- Retell AI integration for real voice agent calls
- VoxGrade branding and initial dashboard UI