
Changelog

Version history and release notes for Intervu.


v1.2.1

Bug fix for microphone transcription with whisperx-asr-service backend.

Bug Fixes

STT Response Parsing

  • Fixed microphone transcription returning empty text when using whisperx-asr-service (/asr endpoint) with speaker diarization enabled
  • The /asr endpoint returns transcription as a segments array even when diarize=false, but the handler was only checking for json.text as a string
  • Now properly extracts text from segments array when json.text is not a string
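
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual handler: the `AsrResponse` and `AsrSegment` shapes and the `extractTranscript` name are assumptions, and the real /asr payload may carry more fields.

```typescript
// Hypothetical response shapes; only `text` and `segments` are taken from the notes above.
interface AsrSegment {
  text?: string;
}

interface AsrResponse {
  text?: unknown;
  segments?: AsrSegment[];
}

// Returns the transcript as one string, whichever shape the backend used.
function extractTranscript(json: AsrResponse): string {
  // OpenAI-compatible backends return the transcript as a plain string.
  if (typeof json.text === "string") {
    return json.text.trim();
  }
  // Otherwise join the per-segment text, skipping empty or missing entries.
  if (Array.isArray(json.segments)) {
    return json.segments
      .map((segment) => (typeof segment.text === "string" ? segment.text.trim() : ""))
      .filter((part) => part.length > 0)
      .join(" ");
  }
  return "";
}
```

Returning an empty string when neither shape is present keeps the caller's existing "empty text" handling in one place.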

v1.2.0

PGVector context retrieval and dedicated chat window.

Features

PGVector Context Retrieval (Phase 1)

  • Per-source text/vector mode toggle for resume, Q&A bank, and company context
  • Pre-fetch vector context in triggerGeneration() before buildMessages(); cached in Zustand keyed by query
  • New main-process modules: pgvector.ts (pool cache, cosine-distance search, SQL-identifier validation), embeddings.ts (OpenAI-compatible /v1/embeddings), context-loader.ts (shared by auto-answer and chat)
  • Test Connection button per source — reports row count and dimension mismatch against live embedding
  • Graceful fallback to text content when vector retrieval returns nothing
  • Connection pools closed on will-quit
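
As a rough sketch of how SQL-identifier validation and a cosine-distance search fit together: the function names, the `content`/`embedding` column names, and the identifier pattern below are assumptions for illustration, not the contents of pgvector.ts. The `<=>` operator is pgvector's cosine-distance operator.

```typescript
// Conservative pattern (assumption): letters, digits, underscores; no leading digit.
const IDENTIFIER_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Table and column names cannot be bound as query parameters, so they are validated instead.
function assertSqlIdentifier(name: string): string {
  // PostgreSQL truncates identifiers longer than 63 bytes, so reject them outright.
  if (!IDENTIFIER_PATTERN.test(name) || name.length > 63) {
    throw new Error(`Invalid SQL identifier: ${name}`);
  }
  return name;
}

interface SearchQuery {
  text: string;
  values: [string, number];
}

// Builds a parameterized query; the embedding and limit travel as bound values.
function buildCosineSearch(
  table: string,
  contentColumn: string,
  embeddingColumn: string,
  queryEmbedding: number[],
  limit: number
): SearchQuery {
  const t = assertSqlIdentifier(table);
  const c = assertSqlIdentifier(contentColumn);
  const e = assertSqlIdentifier(embeddingColumn);
  return {
    text:
      `SELECT "${c}" AS content, "${e}" <=> $1::vector AS distance ` +
      `FROM "${t}" ORDER BY distance ASC LIMIT $2`,
    // pgvector accepts a vector literal of the form '[0.1,0.2,...]'.
    values: [`[${queryEmbedding.join(",")}]`, limit],
  };
}
```

The resulting `text` and `values` could be handed to a PostgreSQL client's parameterized query call; smaller distances mean closer matches.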

Chat Window (Phase 2)

  • Dedicated chat BrowserWindow opened via title-bar button, routed by #/chat hash
  • Step-by-step processing visibility: thinking → fetching/loaded vector or text → generating, with detail lines and color coding
  • Streams tokens to the AI bubble via chat-token IPC; separate chatAbortController from auto-answer
  • Chat history persisted to userData/chat-history.json; 50-message context cap with "not in context" indicator for older messages
  • Transcript snapshot pushed from main window to main-process cache so the chat window can attach live context

PGVector Self-Indexing

  • "Index from text" button per source — chunks the text field, embeds each chunk, and writes to the configured PGVector table
  • Paragraph-aware chunker (1500 char target, 200 char overlap sliding-window fallback for oversized paragraphs)
  • Auto-creates the vector extension and target table if missing; probes and enforces dim consistency across chunks
  • Dim-mismatch safety check against existing table before truncate+insert
  • Live progress via pgvector-index-progress IPC, stamped per fieldKey so multiple fields don't cross-contaminate state
  • Embeddings module now supports both OpenAI-compatible (/v1/embeddings) and Ollama-native (/api/embeddings) endpoints with auto URL resolution
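
The paragraph-aware chunker described above might look something like this. It is a sketch under stated assumptions: the `chunkText` name, the blank-line paragraph split, and the way small paragraphs are packed together are illustrative choices; only the 1500-character target and the 200-character overlap come from the notes.

```typescript
// Splits text into chunks near `targetSize` characters, keeping paragraphs intact when possible.
function chunkText(text: string, targetSize = 1500, overlap = 200): string[] {
  if (overlap >= targetSize) {
    throw new Error("overlap must be smaller than targetSize");
  }
  // Assumption: paragraphs are separated by one or more blank lines.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";
  const flush = (): void => {
    if (current.length > 0) {
      chunks.push(current);
      current = "";
    }
  };

  for (const paragraph of paragraphs) {
    if (paragraph.length > targetSize) {
      // Oversized paragraph: fall back to a sliding window with overlap.
      flush();
      const step = targetSize - overlap;
      for (let start = 0; start < paragraph.length; start += step) {
        chunks.push(paragraph.slice(start, start + targetSize));
        if (start + targetSize >= paragraph.length) break;
      }
      continue;
    }
    // Pack whole paragraphs together until the next one would exceed the target.
    const candidate = current.length === 0 ? paragraph : `${current}\n\n${paragraph}`;
    if (candidate.length > targetSize) {
      flush();
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  flush();
  return chunks;
}
```

With the defaults, a single 3000-character paragraph yields windows starting at offsets 0, 1300, and 2600, so consecutive chunks share 200 characters.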

Test Connection UX

  • Missing target table no longer errors — reports "Connected — table does not exist yet" so users know the credentials work and the table will be created on first index
  • Successful indexing clears any stale "relation does not exist" test message

Chat Settings

  • New chatMaxTokens setting in Advanced section — independent of llmMaxTokens (which is tuned for terse auto-answers)
  • Default 0 = unlimited: when left at 0, the max_tokens field is omitted from the request so the LLM server's own default applies
  • Chat mode now uses its own system prompt (ignores the "answer concisely" base prompt) and instructs the LLM to give complete, non-truncated answers
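
A minimal sketch of the "0 means omit" rule, assuming an OpenAI-style chat completions body. The `buildChatRequest` helper and its types are hypothetical; only the chatMaxTokens and max_tokens names come from the notes above.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
  max_tokens?: number;
}

// Builds the request body; max_tokens is only set when a positive cap is configured.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  chatMaxTokens: number
): ChatRequestBody {
  const body: ChatRequestBody = { model, messages, stream: true };
  if (chatMaxTokens > 0) {
    body.max_tokens = chatMaxTokens;
  }
  // With chatMaxTokens = 0 the key is absent, so the server's own default applies.
  return body;
}
```

Leaving the key out entirely, rather than sending 0 or null, avoids depending on how a given server interprets those values.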

Session Export

  • Export button in title bar (between Clear and Settings)
  • Export transcript and answers as Markdown file with date stamp
  • Copy to clipboard option for quick sharing
  • Markdown format includes date header, transcript with speaker labels, and Q&A with confidence/key points
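
For illustration, an export with the elements listed above could be assembled along these lines. The exact headings and layout here are assumptions, as are the `buildSessionMarkdown` name and the entry types; the real export may be formatted differently.

```typescript
interface TranscriptEntry {
  speaker: string;
  text: string;
}

interface AnswerEntry {
  question: string;
  answer: string;
  confidence?: number; // 0-100
  keyPoints?: string[];
}

// Builds a Markdown document: date header, labeled transcript, then Q&A.
function buildSessionMarkdown(
  dateStamp: string,
  transcript: TranscriptEntry[],
  answers: AnswerEntry[]
): string {
  const lines: string[] = [`# Intervu Session (${dateStamp})`, "", "## Transcript", ""];
  for (const entry of transcript) {
    lines.push(`**${entry.speaker}:** ${entry.text}`);
  }
  lines.push("", "## Q&A", "");
  for (const item of answers) {
    lines.push(`### ${item.question}`, "", item.answer, "");
    if (item.confidence !== undefined) {
      lines.push(`Confidence: ${item.confidence}/100`, "");
    }
    if (item.keyPoints !== undefined && item.keyPoints.length > 0) {
      for (const point of item.keyPoints) {
        lines.push(`- ${point}`);
      }
      lines.push("");
    }
  }
  return lines.join("\n");
}
```

The same string can serve both the file export and the copy-to-clipboard option.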

Resizable Settings Dialog

  • Settings dialog can be resized by dragging the bottom-right corner
  • Dialog size is persisted to localStorage and restored on next open
  • Minimum dimensions enforced (480x400) for usability

Known Issues

  • Chat replies still get cut off mid-sentence on long outputs. Despite chatMaxTokens: 0 (unlimited) and an explicit "do not truncate" system prompt, longer chat replies terminate early. Root cause is suspected to be server-side — most likely Ollama num_ctx exhaustion (default 2048) once system prompt + vector chunks + transcript exceed the context window, or an OpenWebUI per-user generation cap that overrides the client request. Workaround: raise num_ctx to 8192+ in your Ollama Modelfile or OpenWebUI model advanced params. To be debugged in a follow-up.

v1.1.0

Speaker diarization support for multi-participant interviews.

Features

Speaker Diarization (Phase 1)

  • Added sttDiarization toggle in Settings with on-enable validation — backend must support diarize=true
  • System audio stream is diarized into per-speaker segments (SPEAKER_00, SPEAKER_01, etc.)
  • Client-side speaker embedding matching (256-dim cosine similarity, weighted moving average) for cross-chunk voice consistency
  • Speakr-proven matching algorithm: 30% new / 70% existing weighted update + ambiguity check (5% gap threshold)
  • Click-to-rename speaker labels in transcript pane (persisted as speakerMap in settings)
  • Extended color palette covering 10+ unique speakers
  • Runtime hard error when backend doesn't support diarization — transcription pauses with actionable message
  • Min speakers setting (2-6) in Settings
  • Full backward compatibility — sttDiarization: false (default) changes nothing
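
The embedding matching described above can be sketched as follows. This is an illustrative reading, not the shipped algorithm: the function names, the 0.7 minimum similarity, and treating the 5% gap as an absolute difference of 0.05 are all assumptions. Only the cosine similarity, the 30/70 weighted update, and the ambiguity check come from the notes.

```typescript
type Embedding = number[];

// Cosine similarity between two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: Embedding, b: Embedding): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Weighted moving average: 30% new embedding, 70% existing profile.
function blendProfile(existing: Embedding, incoming: Embedding, newWeight = 0.3): Embedding {
  return existing.map((value, i) => (1 - newWeight) * value + newWeight * (incoming[i] ?? 0));
}

// Returns the matched speaker id, or null when there is no confident match.
function matchSpeaker(
  profiles: Record<string, Embedding>,
  incoming: Embedding,
  minSimilarity = 0.7, // assumed threshold
  ambiguityGap = 0.05 // assumed interpretation of the 5% gap
): string | null {
  let bestId: string | null = null;
  let best = -Infinity;
  let second = -Infinity;
  for (const id of Object.keys(profiles)) {
    const profile = profiles[id];
    if (profile === undefined) continue;
    const score = cosineSimilarity(profile, incoming);
    if (score > best) {
      second = best;
      best = score;
      bestId = id;
    } else if (score > second) {
      second = score;
    }
  }
  if (bestId === null || best < minSimilarity) return null;
  // Ambiguity check: refuse to match when the top two candidates are too close.
  if (best - second < ambiguityGap) return null;
  const current = profiles[bestId];
  if (current !== undefined) {
    profiles[bestId] = blendProfile(current, incoming);
  }
  return bestId;
}
```

A null result would typically lead the caller to register a new speaker profile or keep the backend's raw SPEAKER_XX label for that chunk.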

LLM Speaker-Aware Prompting (Phase 2)

  • System prompt now includes multi-speaker context with detected speaker names when diarization is active
  • All non-'you' messages get [DisplayName]: prefixes using speakerMap display names
  • Auto-trigger now fires on any non-'you' speaker (was broken — only checked for 'interviewer')
  • Simple mode question detection finds latest non-'you' entry for card title (was only looking for 'interviewer')
  • Advanced mode extractor uses display names from speakerMap instead of raw SPEAKER_XX labels
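
A small sketch of the prefixing and trigger rules above. The `TranscriptEntry` shape and both function names are hypothetical; the 'you' label, the [DisplayName]: prefix, and speakerMap come from the notes.

```typescript
interface TranscriptEntry {
  speaker: string; // "you" or a raw label such as "SPEAKER_00"
  text: string;
}

// Formats one transcript entry for the LLM prompt.
function formatForPrompt(entry: TranscriptEntry, speakerMap: Record<string, string>): string {
  if (entry.speaker === "you") {
    return entry.text;
  }
  // Fall back to the raw label when the speaker has not been renamed.
  const displayName = speakerMap[entry.speaker] ?? entry.speaker;
  return `[${displayName}]: ${entry.text}`;
}

// Auto-trigger fires on any speaker other than the user, not only 'interviewer'.
function shouldAutoTrigger(entry: TranscriptEntry): boolean {
  return entry.speaker !== "you";
}
```

Checking for "not 'you'" rather than a specific label is what lets the same logic work with both the legacy 'interviewer' label and diarized SPEAKER_XX labels.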

Speaker Monitor Panel

  • Popover panel in title bar showing all detected speakers (visible when diarization is enabled)
  • Per-speaker metadata: first seen, last spoke (relative time), message count
  • Last 3 messages displayed per speaker
  • Click-to-rename speaker names directly in monitor
  • "Clear speaker data" button resets speaker names and embeddings
  • Shared speaker utilities extracted to lib/speaker-utils.ts

AI-Assisted Speaker Naming (Phase 3)

  • LLM analyzes transcript content for name mentions ("Hi Anna", "John, what do you think?")
  • Suggestions appear as sparkle (✨) hints next to unnamed SPEAKER_XX labels in transcript
  • Speaker monitor shows suggestions with accept (✓) and dismiss (✗) buttons
  • Uses extractor LLM when advanced mode is enabled, otherwise main LLM
  • Debounced analysis: waits 8 seconds after last transcript update, requires 5+ non-'you' entries
  • Dismissed suggestions are remembered per session to avoid re-suggesting
  • Accepted suggestions automatically move to speakerMap and persist to disk
  • New IPC channel: extract-speaker-names (non-streaming, temperature 0, max_tokens 300)
  • New store fields: speakerNameSuggestions, setSpeakerNameSuggestions, acceptSpeakerSuggestion, dismissSpeakerSuggestion

Chunk Duration Tuning & Speaker Profile Persistence (Phase 4)

  • Audio chunk duration auto-enforced to minimum 6 seconds when diarization is ON for better speaker separation
  • Chunk duration input locked in settings with explanatory hint when diarization is enabled
  • New "Remember Speakers" toggle in diarization settings — saves voice profiles (embeddings) to speaker-profiles.json
  • Saved profiles are restored on next listening start, enabling cross-session speaker recognition
  • "Clear speaker data" also clears saved profiles from disk
  • New IPC channels: save-speaker-profiles, load-speaker-profiles, clear-speaker-profiles
  • New setting: speakerProfilePersistence (default: false)

Clear Speaker Data Fix

  • "Clear transcript & answers" now also clears speaker names (speakerMap) and in-memory embeddings
  • "Clear all" also clears speaker data
  • New "Clear speaker data" option in clear menu (resets speaker names + embeddings only)

Dual STT Backend Support

  • Auto-detects API format based on endpoint URL: /asr (whisperx-asr-service) vs /v1/audio/transcriptions (OpenAI-compatible)
  • whisperx-asr-service (learnedmachine/whisperx-asr-service): recommended for diarization — provides speaker embeddings for cross-chunk tracking. Docker setup at docker/whisperx-asr/
  • WhisperX API Server (aldrickb/whisperx-api-server): OpenAI-compatible. Docker setup at docker/whisperx/
  • Retry-with-fallback on 500 errors (retries without embeddings if server crashes on NaN values)
  • Validation scripts for both backends
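
URL-based format detection along the lines described above could be sketched like this. The `detectSttFormat` name and the choice to default to the OpenAI-compatible format for unrecognized paths are assumptions.

```typescript
type SttApiFormat = "whisperx-asr" | "openai";

// Picks the request/response format from the endpoint path.
function detectSttFormat(endpointUrl: string): SttApiFormat {
  // Ignore any query string or fragment before inspecting the path.
  const path = endpointUrl.split(/[?#]/)[0] ?? "";
  if (/\/asr\/?$/.test(path)) {
    return "whisperx-asr";
  }
  // Assumption: anything else, including /v1/audio/transcriptions, is treated as OpenAI-compatible.
  return "openai";
}
```

For example, `detectSttFormat("http://localhost:9000/asr?diarize=true")` selects the whisperx-asr-service format, while a URL ending in /v1/audio/transcriptions selects the OpenAI-compatible one.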

v1.0.2

Settings page quality-of-life improvements and new context fields.

Features

Q&A Bank & Company Context

  • Added Q&A Bank field for pre-prepared interview answers and preferences
  • Added Company / Role Context field for company info, job description, and team details
  • Both fields are injected into the LLM system message alongside the resume

Settings Page Improvements

  • Moved System Prompt next to LLM settings (mirrors Extractor System Prompt positioning)
  • Fixed-height textareas with internal scrolling to prevent page bloat
  • "Edit in new window" button on all textareas — opens a dedicated Electron window for comfortable long-text editing

v1.0.1

Bug fix release for macOS packaged app.

Bug Fixes

FFmpeg Not Found in Packaged App

  • Packaged .app bundles don't inherit the shell's PATH, so spawn('ffmpeg') failed with ENOENT even when FFmpeg was installed via Homebrew
  • Added automatic probing of well-known macOS paths (/opt/homebrew/bin/ffmpeg for Apple Silicon, /usr/local/bin/ffmpeg for Intel)
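
The probing step might be sketched as below. The `resolveFfmpegPath` name, the injected `exists` callback, and the fallback to the bare command name are assumptions for illustration; the two candidate paths are the ones listed above.

```typescript
// Well-known Homebrew install locations: Apple Silicon first, then Intel.
const MACOS_FFMPEG_CANDIDATES: string[] = [
  "/opt/homebrew/bin/ffmpeg",
  "/usr/local/bin/ffmpeg",
];

// Returns the first candidate that exists, or "ffmpeg" to fall back to a PATH lookup.
// The existence check is injected so the function stays testable without touching the disk.
function resolveFfmpegPath(
  exists: (path: string) => boolean,
  candidates: string[] = MACOS_FFMPEG_CANDIDATES
): string {
  for (const candidate of candidates) {
    if (exists(candidate)) {
      return candidate;
    }
  }
  return "ffmpeg";
}
```

In an Electron main process, `existsSync` from `node:fs` could be passed as the `exists` argument, and the returned path handed to `spawn`.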

Silent Audio in Packaged App

  • The packaged app was ad-hoc signed with hardened runtime but lacked the com.apple.security.device.audio-input entitlement
  • macOS silently provides a zero-byte audio stream when this entitlement is missing — FFmpeg appeared to capture audio but it was all silence
  • Added custom entitlements plists with audio input access for both the parent app and child processes (FFmpeg)

Invisible Window After Launch

  • The inherited entitlements for child processes were missing com.apple.security.cs.allow-jit, which Electron's renderer process requires for V8/Chromium
  • Without JIT, the renderer couldn't execute JavaScript, resulting in a blank/invisible window despite the app running (dock icon visible)
  • Added JIT entitlement to the inherited entitlements plist

Improvements

  • Improved microphone permission logging on macOS — now logs whether access was granted or denied
  • Added .DS_Store to .gitignore

v1.0.0

Initial release with core functionality.

Features

Real-Time Audio Capture

  • Capture system audio and microphone simultaneously
  • Speaker attribution (interviewer vs. you)
  • Audio level meters in title bar
  • Silent audio filtering (RMS threshold)
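
An RMS-based silence gate can be sketched in a few lines. The 0.01 threshold and the assumption that samples are normalized to the range [-1, 1] are illustrative; the app's actual threshold is not stated here.

```typescript
// Root mean square of a block of samples (assumed normalized to [-1, 1]).
function rms(samples: number[]): number {
  if (samples.length === 0) return 0;
  let sumOfSquares = 0;
  for (const sample of samples) {
    sumOfSquares += sample * sample;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

// A chunk counts as silent when its RMS level falls below the threshold.
function isSilent(samples: number[], threshold = 0.01): boolean {
  return rms(samples) < threshold;
}
```

Silent chunks can then be dropped before they reach the STT endpoint, which also reduces the chance of hallucinated phrases on empty audio.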

Speech-to-Text Integration

  • OpenAI-compatible STT endpoint support
  • Real-time transcription with speaker labels
  • Hallucination phrase filtering
  • Configurable audio chunk duration

AI-Powered Answers

  • Real-time answer generation
  • Streaming response display
  • Resume-based personalization
  • System prompt customization

Answer Quality Feedback

  • Thumbs up/down rating system
  • In-context learning from ratings
  • Local rating storage
  • Clear ratings option

Advanced Mode

  • Dual-LLM question extraction
  • Intelligent answer queue
  • Custom extractor prompts
  • Queue mode for multiple answers

Contextual Answer Tips

  • Key points bullet extraction
  • Confidence score (0-100)
  • Structured answer format
  • Toggle in advanced settings

Audio Configuration

  • Device selection for system audio and microphone
  • VB-Cable integration for system capture (Windows)
  • BlackHole integration for system capture (macOS)
  • Audio routing with SteelSeries GG and Voicemeeter (Windows)
  • Multi-output device setup (macOS)
  • Device refresh and validation
  • Level meter display

Cross-Platform Support

  • Windows: DirectShow audio capture, bundled FFmpeg, installer and portable builds
  • macOS: AVFoundation audio capture, manual FFmpeg install, DMG and ZIP builds
  • Platform-aware device enumeration and error messages
  • Microphone permission handling on macOS

Application

  • App icon and logo for all platforms
  • Floating overlay window
  • Always-on-top mode
  • Compact dark theme

Settings Management

  • Persistent settings storage
  • STT/LLM endpoint configuration
  • Resume and system prompt editor
  • Test connections for STT and LLM

Technical

  • FFmpeg bundled download (Windows)
  • FFmpeg manual install (macOS via Homebrew)
  • IPC communication for CORS bypass
  • File-based logging
  • Error notifications
  • macOS quarantine workaround support

Development History

2026-04-09 — Updates

macOS Support

  • AVFoundation-based audio device enumeration
  • Platform-aware FFmpeg arguments (-f avfoundation on Mac, -f dshow on Windows)
  • Microphone permission request on macOS startup
  • Mac build targets (DMG/ZIP for x64 and arm64)
  • Dock icon and app name for macOS dev mode
  • Help text adapted per platform in settings dialog
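
The platform-aware input arguments can be sketched as follows. The `buildInputArgs` helper is hypothetical and covers only the input side; the platform strings match Node's `process.platform` values, and the device syntax follows FFmpeg's avfoundation and dshow input formats.

```typescript
type SupportedPlatform = "darwin" | "win32";

// Builds the FFmpeg input arguments for an audio capture device.
function buildInputArgs(platform: SupportedPlatform, device: string): string[] {
  if (platform === "darwin") {
    // avfoundation takes "[video]:[audio]"; a leading colon selects audio only.
    return ["-f", "avfoundation", "-i", `:${device}`];
  }
  // dshow selects an audio device by name with the audio= prefix.
  return ["-f", "dshow", "-i", `audio=${device}`];
}
```

Returning an argument array rather than a single string keeps device names with spaces intact when they are passed to `spawn`.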

Application Branding

  • Added app icon and logo for all platforms
  • Logo displayed in title bar
  • Icons bundled for Windows, macOS, and Linux builds

Windows Build Improvements

  • Configured NSIS installer build
  • Added portable executable build
  • Build outputs now go to dist/ directory

2026-04-08 — v1.0.0 Release

  • Initial public release
  • Core features implemented
  • Windows support
  • Documentation released

Pre-Release Development

Audio System

  • Implemented FFmpeg-based audio capture
  • Added DirectShow device enumeration (Windows)
  • Added AVFoundation device enumeration (macOS)
  • Created audio preview with level metering
  • Integrated silence gate filtering

STT Pipeline

  • Built audio chunk queue system
  • Implemented Whisper transcription
  • Added hallucination filtering
  • Created speaker attribution

LLM Integration

  • Developed streaming response handler
  • Implemented answer queue system
  • Created rating feedback loop
  • Added contextual tips parsing

State Management

  • Zustand store implementation
  • Settings persistence
  • Transcript history
  • Q&A history

Known Issues

Current Limitations

| Issue | Status | Workaround |
|---|---|---|
| macOS requires manual FFmpeg install | By design | brew install ffmpeg |
| macOS app not code-signed | Planned | xattr -d com.apple.quarantine /Applications/Intervu.app |
| VB-Cable alone blocks system audio on Windows | Documented | Use SteelSeries GG or Voicemeeter for audio routing |
| Linux support not tested | Planned | Windows or macOS only for now |
| No offline mode | Planned | Requires STT and LLM running |
| Cannot record sessions | Planned | Use screen recording |

Upcoming Features

For planned features, see Roadmap.

Next Release Focus

  • Code signing for macOS and Windows
  • Keyboard shortcuts
  • Session export
  • Answer length control

Version Format

Intervu uses Semantic Versioning:

  • Major (X.0.0): Breaking changes
  • Minor (1.X.0): New features
  • Patch (1.0.X): Bug fixes
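
As a small illustration of the scheme above, the bump type between two releases can be derived from their version strings. The `parseVersion` and `classifyBump` helpers are hypothetical and handle only plain X.Y.Z versions with an optional leading "v".

```typescript
type Bump = "major" | "minor" | "patch" | "none";

// Parses "1.2.1" or "v1.2.1" into [major, minor, patch].
function parseVersion(version: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (match === null) {
    throw new Error(`Not a plain semantic version: ${version}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Reports which component changed first when going from one version to another.
function classifyBump(from: string, to: string): Bump {
  const a = parseVersion(from);
  const b = parseVersion(to);
  if (b[0] !== a[0]) return "major";
  if (b[1] !== a[1]) return "minor";
  if (b[2] !== a[2]) return "patch";
  return "none";
}
```

Going from v1.1.0 to v1.2.0 is classified as a minor bump (new features), and v1.2.0 to v1.2.1 as a patch bump (bug fixes).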

Download

Get the latest version from GitHub Releases.

Windows

  • Intervu-Setup-x.x.x.exe — Installer
  • Intervu-x.x.x-portable.exe — Portable (no installation required)

macOS

  • Intervu-x.x.x-arm64.dmg — Apple Silicon (M1/M2/M3/M4)
  • Intervu-x.x.x.dmg — Intel Macs
  • Intervu-x.x.x-arm64-mac.zip — Apple Silicon (portable)
  • Intervu-x.x.x-mac.zip — Intel Macs (portable)

macOS Users

The macOS app is not code-signed. On first launch, remove the quarantine attribute:

bash
xattr -d com.apple.quarantine /Applications/Intervu.app

Code signing will be added in a future release.

Verify Installation

After installing, check your version:

  1. Open Intervu
  2. Click Settings (gear icon)
  3. Scroll to the bottom
  4. The version number is displayed there

Migration Guide

From Pre-Release

If you used a pre-release version:

  1. Settings may not transfer — Reconfigure endpoints
  2. Old logs may have issues — Clear logs folder
  3. Ratings are preserved — No action needed

Full Reset

Windows:

powershell
# Close Intervu
# Delete settings
Remove-Item -Recurse -Force "$env:APPDATA\intervu"
# Restart Intervu
# Reconfigure from scratch

macOS:

bash
# Close Intervu
# Delete settings
rm -rf ~/Library/Application\ Support/intervu
# Restart Intervu
# Reconfigure from scratch

Feedback

Found a bug? Have a suggestion?

  1. Check existing issues on GitHub
  2. Open a new issue with details
  3. Include logs from Settings → Open Logs Folder

For the complete changelog, see GitHub Releases.

Made with ❤️ by Aldrick Bonaobra