Changelog

Version history and release notes for Intervu.

v1.2.1

Bug fix for microphone transcription with whisperx-asr-service backend.

Bug Fixes

STT Response Parsing

Fixed microphone transcription returning empty text when using whisperx-asr-service (/asr endpoint) with speaker diarization enabled
The /asr endpoint returns transcription as a segments array even when diarize=false, but the handler was only checking for json.text as a string
Now properly extracts text from segments array when json.text is not a string
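
The fix above can be sketched roughly as follows. This is a minimal illustration, not the actual handler: the `AsrResponse` and `AsrSegment` types and the `extractTranscript` name are hypothetical, and the per-segment `text` field is an assumption about the /asr response shape.

```typescript
// Hypothetical response shapes. Only `text` and `segments` come from the
// changelog entry; the per-segment `text` field is an assumption.
interface AsrSegment {
  text?: string;
}

interface AsrResponse {
  text?: unknown;
  segments?: AsrSegment[];
}

// Return the transcript from either response shape.
function extractTranscript(json: AsrResponse): string {
  // OpenAI-compatible shape: a plain string under `text`.
  if (typeof json.text === "string") {
    return json.text.trim();
  }
  // Segment-based shape: join the text of each non-empty segment.
  if (Array.isArray(json.segments)) {
    return json.segments
      .map((segment) => (typeof segment.text === "string" ? segment.text.trim() : ""))
      .filter((part) => part.length > 0)
      .join(" ");
  }
  // Neither shape matched: treat as an empty transcription.
  return "";
}
```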

v1.2.0

PGVector context retrieval and dedicated chat window.

Features

PGVector Context Retrieval (Phase 1)

Per-source text/vector mode toggle for resume, Q&A bank, and company context
Pre-fetch vector context in triggerGeneration() before buildMessages(); cached in Zustand keyed by query
New main-process modules: pgvector.ts (pool cache, cosine-distance search, SQL-identifier validation), embeddings.ts (OpenAI-compatible /v1/embeddings), context-loader.ts (shared by auto-answer and chat)
Test Connection button per source — reports row count and dimension mismatch against live embedding
Graceful fallback to text content when vector retrieval returns nothing
Connection pools closed on will-quit
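
As an illustration of the identifier validation and cosine-distance search mentioned above, here is a sketch that only builds the query. The function names, the column parameters, and the `{ text, values }` result shape are assumptions, not the contents of pgvector.ts. The `<=>` operator is pgvector's cosine-distance operator.

```typescript
// Accept only plain identifiers so table and column names can be interpolated safely.
const IDENTIFIER_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

function assertSqlIdentifier(value: string): string {
  if (!IDENTIFIER_PATTERN.test(value)) {
    throw new Error(`Invalid SQL identifier: ${value}`);
  }
  return value;
}

interface VectorQuery {
  text: string;
  values: [string, number];
}

// Build a parameterized nearest-neighbour query ordered by cosine distance.
function buildSearchQuery(
  table: string,
  contentColumn: string,
  embeddingColumn: string,
  queryEmbedding: number[],
  limit: number,
): VectorQuery {
  const t = assertSqlIdentifier(table);
  const c = assertSqlIdentifier(contentColumn);
  const e = assertSqlIdentifier(embeddingColumn);
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error("limit must be a positive integer");
  }
  if (!queryEmbedding.every((v) => Number.isFinite(v))) {
    throw new Error("embedding must contain only finite numbers");
  }
  // pgvector accepts the text form "[v1,v2,...]" cast to vector.
  const vectorLiteral = `[${queryEmbedding.join(",")}]`;
  return {
    text:
      `SELECT ${c} AS content, ${e} <=> $1::vector AS distance ` +
      `FROM ${t} ORDER BY ${e} <=> $1::vector LIMIT $2`,
    values: [vectorLiteral, limit],
  };
}
```

Identifiers cannot be passed as bind parameters, which is why they are validated and interpolated while the embedding and limit stay parameterized.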

Chat Window (Phase 2)

Dedicated chat BrowserWindow opened via title-bar button, routed by #/chat hash
Step-by-step processing visibility: thinking → fetching/loaded vector or text → generating, with detail lines and color coding
Streams tokens to the AI bubble via chat-token IPC; separate chatAbortController from auto-answer
Chat history persisted to userData/chat-history.json; 50-message context cap with "not in context" indicator for older messages
Transcript snapshot pushed from main window to main-process cache so the chat window can attach live context
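
The 50-message context cap could look something like this. The `ChatMessage` type and `splitForContext` name are hypothetical; the sketch only shows how older messages might be separated so the UI can mark them as "not in context".

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface ContextSplit {
  outOfContext: ChatMessage[]; // shown in the UI but not sent to the LLM
  inContext: ChatMessage[]; // the most recent messages, sent to the LLM
}

const CONTEXT_CAP = 50;

// Keep only the newest `cap` messages in the LLM context.
function splitForContext(history: ChatMessage[], cap: number = CONTEXT_CAP): ContextSplit {
  const cutoff = Math.max(0, history.length - cap);
  return {
    outOfContext: history.slice(0, cutoff),
    inContext: history.slice(cutoff),
  };
}
```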

PGVector Self-Indexing

"Index from text" button per source — chunks the text field, embeds each chunk, and writes to the configured PGVector table
Paragraph-aware chunker (1500 char target, 200 char overlap sliding-window fallback for oversized paragraphs)
Auto-creates the vector extension and target table if missing; probes and enforces dim consistency across chunks
Dim-mismatch safety check against existing table before truncate+insert
Live progress via pgvector-index-progress IPC, stamped per fieldKey so multiple fields don't cross-contaminate state
Embeddings module now supports both OpenAI-compatible (/v1/embeddings) and Ollama-native (/api/embeddings) endpoints with auto URL resolution
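
A rough sketch of the paragraph-aware chunking described above, using the stated 1500-character target and 200-character overlap. The function names and the exact packing rule (greedy, paragraphs joined by a blank line) are assumptions rather than the shipped implementation.

```typescript
const TARGET_CHARS = 1500;
const OVERLAP_CHARS = 200;

// Split one oversized paragraph into fixed-size windows that overlap.
function slidingWindows(text: string, size: number, overlap: number): string[] {
  if (size <= overlap) {
    throw new Error("window size must be larger than the overlap");
  }
  const windows: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    windows.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return windows;
}

// Pack whole paragraphs into chunks of at most `target` characters.
function chunkText(
  text: string,
  target: number = TARGET_CHARS,
  overlap: number = OVERLAP_CHARS,
): string[] {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    if (paragraph.length > target) {
      // Oversized paragraph: flush what we have, then fall back to windows.
      if (current) {
        chunks.push(current);
        current = "";
      }
      chunks.push(...slidingWindows(paragraph, target, overlap));
      continue;
    }
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length > target) {
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```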

Test Connection UX

Missing target table no longer errors — reports "Connected — table does not exist yet" so users know the credentials work and the table will be created on first index
Successful indexing clears any stale "relation does not exist" test message

Chat Settings

New chatMaxTokens setting in Advanced section — independent of llmMaxTokens (which is tuned for terse auto-answers)
Default 0 = unlimited: when unset, the max_tokens field is omitted from the request so the LLM server's own default applies
Chat mode now uses its own system prompt (ignores the "answer concisely" base prompt) and instructs the LLM to give complete, non-truncated answers
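
The "0 = unlimited" behavior can be illustrated as below. `buildChatRequest` and the `ChatRequestBody` type are hypothetical names; the body fields follow the usual OpenAI-compatible chat completions shape, which is assumed here.

```typescript
interface ChatRequestMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatRequestMessage[];
  stream: boolean;
  max_tokens?: number;
}

// Omit max_tokens entirely when chatMaxTokens is 0 so the server default applies.
function buildChatRequest(
  model: string,
  messages: ChatRequestMessage[],
  chatMaxTokens: number,
): ChatRequestBody {
  const body: ChatRequestBody = { model, messages, stream: true };
  if (chatMaxTokens > 0) {
    body.max_tokens = chatMaxTokens;
  }
  return body;
}
```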

Session Export

Export button in title bar (between Clear and Settings)
Export transcript and answers as Markdown file with date stamp
Copy to clipboard option for quick sharing
Markdown format includes date header, transcript with speaker labels, and Q&A with confidence/key points

Resizable Settings Dialog

Settings dialog can be resized by dragging the bottom-right corner
Dialog size is persisted to localStorage and restored on next open
Minimum dimensions enforced (480x400) for usability

Known Issues

Chat replies still get cut off mid-sentence on long outputs. Despite chatMaxTokens: 0 (unlimited) and an explicit "do not truncate" system prompt, longer chat replies terminate early. Root cause is suspected to be server-side — most likely Ollama num_ctx exhaustion (default 2048) once system prompt + vector chunks + transcript exceed the context window, or an OpenWebUI per-user generation cap that overrides the client request. Workaround: raise num_ctx to 8192+ in your Ollama Modelfile or OpenWebUI model advanced params. To be debugged in a follow-up.

v1.1.0

Speaker diarization support for multi-participant interviews.

Features

Speaker Diarization (Phase 1)

Added sttDiarization toggle in Settings with on-enable validation — backend must support diarize=true
System audio stream is diarized into per-speaker segments (SPEAKER_00, SPEAKER_01, etc.)
Client-side speaker embedding matching (256-dim cosine similarity, weighted moving average) for cross-chunk voice consistency
Speakr-proven matching algorithm: 30% new / 70% existing weighted update + ambiguity check (5% gap threshold)
Click-to-rename speaker labels in transcript pane (persisted as speakerMap in settings)
Extended color palette for 10+ unique speakers
Runtime hard error when backend doesn't support diarization — transcription pauses with actionable message
Min speakers setting (2-6) in Settings
Full backward compatibility — sttDiarization: false (default) changes nothing
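
The matching step described above (cosine similarity, 30% new / 70% existing update, 5% ambiguity gap) might be sketched like this. The `MATCH_THRESHOLD` value, treating the 5% gap as an absolute similarity difference, and returning `null` for ambiguous matches are all assumptions; the changelog does not specify them.

```typescript
interface SpeakerProfile {
  id: string;
  embedding: number[];
}

const NEW_WEIGHT = 0.3;
const EXISTING_WEIGHT = 0.7;
const AMBIGUITY_GAP = 0.05;
const MATCH_THRESHOLD = 0.7; // hypothetical, not stated in the changelog

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const size = Math.min(a.length, b.length);
  for (let i = 0; i < size; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Return the matched profile id and blend the new embedding into it,
// or null when no profile matches clearly enough.
function matchAndUpdate(profiles: SpeakerProfile[], embedding: number[]): string | null {
  let best: SpeakerProfile | null = null;
  let bestScore = -Infinity;
  let secondScore = -Infinity;
  for (const profile of profiles) {
    const score = cosineSimilarity(profile.embedding, embedding);
    if (score > bestScore) {
      secondScore = bestScore;
      bestScore = score;
      best = profile;
    } else if (score > secondScore) {
      secondScore = score;
    }
  }
  if (best === null || bestScore < MATCH_THRESHOLD) return null;
  // Ambiguity check: the runner-up is too close to the best match.
  if (bestScore - secondScore < AMBIGUITY_GAP) return null;
  // Weighted moving average: 70% existing profile, 30% new embedding.
  best.embedding = best.embedding.map(
    (value, i) => EXISTING_WEIGHT * value + NEW_WEIGHT * (embedding[i] ?? 0),
  );
  return best.id;
}
```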

LLM Speaker-Aware Prompting (Phase 2)

System prompt now includes multi-speaker context with detected speaker names when diarization is active
All non-'you' messages get [DisplayName]: prefixes using speakerMap display names
Auto-trigger now fires on any non-'you' speaker (was broken — only checked for 'interviewer')
Simple mode question detection finds latest non-'you' entry for card title (was only looking for 'interviewer')
Advanced mode extractor uses display names from speakerMap instead of raw SPEAKER_XX labels

Speaker Monitor Panel

Popover panel in title bar showing all detected speakers (visible when diarization is enabled)
Per-speaker metadata: first seen, last spoke (relative time), message count
Last 3 messages displayed per speaker
Click-to-rename speaker names directly in monitor
"Clear speaker data" button resets speaker names and embeddings
Shared speaker utilities extracted to lib/speaker-utils.ts

AI-Assisted Speaker Naming (Phase 3)

LLM analyzes transcript content for name mentions ("Hi Anna", "John, what do you think?")
Suggestions appear as sparkle (✨) hints next to unnamed SPEAKER_XX labels in transcript
Speaker monitor shows suggestions with accept (✓) and dismiss (✗) buttons
Uses extractor LLM when advanced mode is enabled, otherwise main LLM
Debounced analysis: waits 8 seconds after last transcript update, requires 5+ non-'you' entries
Dismissed suggestions are remembered per session to avoid re-suggesting
Accepted suggestions automatically move to speakerMap and persist to disk
New IPC channel: extract-speaker-names (non-streaming, temperature 0, max_tokens 300)
New store fields: speakerNameSuggestions, setSpeakerNameSuggestions, acceptSpeakerSuggestion, dismissSpeakerSuggestion
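
The debounce rule above (8 seconds after the last transcript update, at least 5 non-'you' entries) could be sketched as follows. The `TranscriptEntry` shape and the scheduler API are hypothetical; only the two thresholds come from the changelog.

```typescript
interface TranscriptEntry {
  speaker: string;
  text: string;
}

const DEBOUNCE_MS = 8000;
const MIN_OTHER_ENTRIES = 5;

// Analysis is only worthwhile once enough non-'you' entries exist.
function hasEnoughEntries(entries: TranscriptEntry[]): boolean {
  return entries.filter((entry) => entry.speaker !== "you").length >= MIN_OTHER_ENTRIES;
}

// Restart the timer on every transcript update; run `analyze` once it settles.
function createNameAnalysisScheduler(
  analyze: (entries: TranscriptEntry[]) => void,
  delayMs: number = DEBOUNCE_MS,
) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return {
    onTranscriptUpdate(entries: TranscriptEntry[]): void {
      if (timer !== undefined) clearTimeout(timer);
      timer = undefined;
      if (!hasEnoughEntries(entries)) return;
      timer = setTimeout(() => {
        timer = undefined;
        analyze(entries);
      }, delayMs);
    },
    cancel(): void {
      if (timer !== undefined) clearTimeout(timer);
      timer = undefined;
    },
  };
}
```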

Chunk Duration Tuning & Speaker Profile Persistence (Phase 4)

Audio chunk duration auto-enforced to minimum 6 seconds when diarization is ON for better speaker separation
Chunk duration input locked in settings with explanatory hint when diarization is enabled
New "Remember Speakers" toggle in diarization settings — saves voice profiles (embeddings) to speaker-profiles.json
Saved profiles are restored on next listening start, enabling cross-session speaker recognition
"Clear speaker data" also clears saved profiles from disk
New IPC channels: save-speaker-profiles, load-speaker-profiles, clear-speaker-profiles
New setting: speakerProfilePersistence (default: false)

Clear Speaker Data Fix

"Clear transcript & answers" now also clears speaker names (speakerMap) and in-memory embeddings
"Clear all" also clears speaker data
New "Clear speaker data" option in clear menu (resets speaker names + embeddings only)

Dual STT Backend Support

Auto-detects API format based on endpoint URL: /asr (whisperx-asr-service) vs /v1/audio/transcriptions (OpenAI-compatible)
whisperx-asr-service (learnedmachine/whisperx-asr-service): recommended for diarization — provides speaker embeddings for cross-chunk tracking. Docker setup at docker/whisperx-asr/
WhisperX API Server (aldrickb/whisperx-api-server): OpenAI-compatible. Docker setup at docker/whisperx/
Retry-with-fallback on 500 errors (retries without embeddings if server crashes on NaN values)
Validation scripts for both backends
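
URL-based format detection might look roughly like this. The `SttBackend` type and `detectSttBackend` name are hypothetical, and defaulting to the OpenAI-compatible format for any path other than /asr is an assumption.

```typescript
type SttBackend = "whisperx-asr" | "openai-compatible";

// Decide the request/response format from the endpoint path.
function detectSttBackend(endpointUrl: string): SttBackend {
  // Ignore any query string or fragment before inspecting the path.
  const withoutQuery = endpointUrl.split(/[?#]/, 1)[0] ?? "";
  // A path ending in /asr (with or without a trailing slash) means whisperx-asr-service.
  if (/\/asr\/*$/.test(withoutQuery)) {
    return "whisperx-asr";
  }
  // Anything else, including /v1/audio/transcriptions, is treated as OpenAI-compatible.
  return "openai-compatible";
}
```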

v1.0.2

Settings page quality-of-life improvements and new context fields.

Features

Q&A Bank & Company Context

Added Q&A Bank field for pre-prepared interview answers and preferences
Added Company / Role Context field for company info, job description, and team details
Both fields are injected into the LLM system message alongside the resume

Settings Page Improvements

Moved System Prompt next to LLM settings (mirrors Extractor System Prompt positioning)
Fixed-height textareas with internal scrolling to prevent page bloat
"Edit in new window" button on all textareas — opens a dedicated Electron window for comfortable long-text editing

v1.0.1

Bug fix release for macOS packaged app.

Bug Fixes

FFmpeg Not Found in Packaged App

Packaged .app bundles don't inherit the shell's PATH, so spawn('ffmpeg') failed with ENOENT even when FFmpeg was installed via Homebrew
Added automatic probing of well-known macOS paths (/opt/homebrew/bin/ffmpeg for Apple Silicon, /usr/local/bin/ffmpeg for Intel)
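
A sketch of the path probing, under the assumption that the app falls back to a bare `ffmpeg` (PATH lookup) when neither location exists. The function name is hypothetical, and the existence check is injected so the example stays self-contained.

```typescript
// Well-known Homebrew install locations, Apple Silicon first, then Intel.
const MACOS_FFMPEG_CANDIDATES: string[] = [
  "/opt/homebrew/bin/ffmpeg",
  "/usr/local/bin/ffmpeg",
];

// Return the first candidate that exists, or "ffmpeg" to defer to PATH.
function resolveFfmpegPath(
  exists: (path: string) => boolean,
  candidates: string[] = MACOS_FFMPEG_CANDIDATES,
): string {
  for (const candidate of candidates) {
    if (exists(candidate)) {
      return candidate;
    }
  }
  return "ffmpeg";
}
```

In an Electron main process, `exists` would typically be Node's `fs.existsSync`, and the result would be passed to `spawn` in place of the bare command name.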

Silent Audio in Packaged App

The packaged app was ad-hoc signed with hardened runtime but lacked the com.apple.security.device.audio-input entitlement
macOS silently provides a zero-byte audio stream when this entitlement is missing — FFmpeg appeared to capture audio but it was all silence
Added custom entitlements plists with audio input access for both the parent app and child processes (FFmpeg)

Invisible Window After Launch

The inherited entitlements for child processes were missing com.apple.security.cs.allow-jit, which Electron's renderer process requires for V8/Chromium
Without JIT, the renderer couldn't execute JavaScript, resulting in a blank/invisible window despite the app running (dock icon visible)
Added JIT entitlement to the inherited entitlements plist

Improvements

Improved microphone permission logging on macOS — now logs whether access was granted or denied
Added .DS_Store to .gitignore

v1.0.0

Initial release with core functionality.

Features

Real-Time Audio Capture

Capture system audio and microphone simultaneously
Speaker attribution (interviewer vs. you)
Audio level meters in title bar
Silent audio filtering (RMS threshold)
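
RMS-based silence filtering can be illustrated as below. The threshold value and function names are hypothetical, and samples are assumed to be normalized floats in the range -1 to 1.

```typescript
const SILENCE_THRESHOLD = 0.01; // hypothetical default, not the app's actual value

// Root mean square of a block of audio samples.
function rms(samples: ArrayLike<number>): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    const sample = samples[i] ?? 0;
    sumSquares += sample * sample;
  }
  return Math.sqrt(sumSquares / samples.length);
}

// A chunk is skipped when its RMS level falls below the threshold.
function isSilent(samples: ArrayLike<number>, threshold: number = SILENCE_THRESHOLD): boolean {
  return rms(samples) < threshold;
}
```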

Speech-to-Text Integration

OpenAI-compatible STT endpoint support
Real-time transcription with speaker labels
Hallucination phrase filtering
Configurable audio chunk duration

AI-Powered Answers

Real-time answer generation
Streaming response display
Resume-based personalization
System prompt customization

Answer Quality Feedback

Thumbs up/down rating system
In-context learning from ratings
Local rating storage
Clear ratings option

Advanced Mode

Dual-LLM question extraction
Intelligent answer queue
Custom extractor prompts
Queue mode for multiple answers

Contextual Answer Tips

Key points bullet extraction
Confidence score (0-100)
Structured answer format
Toggle in advanced settings

Audio Configuration

Device selection for system audio and microphone
VB-Cable integration for system capture (Windows)
BlackHole integration for system capture (macOS)
Audio routing with SteelSeries GG and Voicemeeter (Windows)
Multi-output device setup (macOS)
Device refresh and validation
Level meter display

Cross-Platform Support

Windows: DirectShow audio capture, bundled FFmpeg, installer and portable builds
macOS: AVFoundation audio capture, manual FFmpeg install, DMG and ZIP builds
Platform-aware device enumeration and error messages
Microphone permission handling on macOS

Application

App icon and logo for all platforms
Floating overlay window
Always-on-top mode
Compact dark theme

Settings Management

Persistent settings storage
STT/LLM endpoint configuration
Resume and system prompt editor
Test connections for STT and LLM

Technical

FFmpeg bundled download (Windows)
FFmpeg manual install (macOS via Homebrew)
IPC communication for CORS bypass
File-based logging
Error notifications
macOS quarantine workaround support

Development History

2026-04-09 — Updates

macOS Support

AVFoundation-based audio device enumeration
Platform-aware FFmpeg arguments (-f avfoundation on Mac, -f dshow on Windows)
Microphone permission request on macOS startup
Mac build targets (DMG/ZIP for x64 and arm64)
Dock icon and app name for macOS dev mode
Help text adapted per platform in settings dialog
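
The platform-aware input arguments could be assembled along these lines. `buildInputArgs` is a hypothetical name; the `-f avfoundation` and `-f dshow` flags come from the entry above, and the device syntax follows FFmpeg's documented input formats for those two devices.

```typescript
type CapturePlatform = "darwin" | "win32";

// Build the FFmpeg input arguments for one audio capture device.
function buildInputArgs(platform: CapturePlatform, device: string): string[] {
  if (platform === "darwin") {
    // AVFoundation inputs are "<video>:<audio>"; a leading colon selects audio only.
    return ["-f", "avfoundation", "-i", `:${device}`];
  }
  // DirectShow audio devices are addressed by name with an "audio=" prefix.
  return ["-f", "dshow", "-i", `audio=${device}`];
}
```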

Application Branding

Added app icon and logo for all platforms
Logo displayed in title bar
Icons bundled for Windows, macOS, and Linux builds

Windows Build Improvements

Configured NSIS installer build
Added portable executable build
Build outputs now go to dist/ directory

2026-04-08 — v1.0.0 Release

Initial public release
Core features implemented
Windows support
Documentation released

Pre-Release Development

Audio System

Implemented FFmpeg-based audio capture
Added DirectShow device enumeration (Windows)
Added AVFoundation device enumeration (macOS)
Created audio preview with level metering
Integrated silence gate filtering

STT Pipeline

Built audio chunk queue system
Implemented Whisper transcription
Added hallucination filtering
Created speaker attribution

LLM Integration

Developed streaming response handler
Implemented answer queue system
Created rating feedback loop
Added contextual tips parsing

State Management

Zustand store implementation
Settings persistence
Transcript history
Q&A history

Known Issues

Current Limitations

Issue	Status	Workaround
macOS requires manual FFmpeg install	By design	`brew install ffmpeg`
macOS app not code-signed	Planned	`xattr -d com.apple.quarantine /Applications/Intervu.app`
VB-Cable alone blocks system audio on Windows	Documented	Use SteelSeries GG or Voicemeeter for audio routing
Linux support not tested	Planned	Windows or macOS only for now
No offline mode	Planned	Requires STT and LLM running
Cannot record sessions	Planned	Use screen recording

Upcoming Features

For planned features, see Roadmap.

Next Release Focus

Code signing for macOS and Windows
Keyboard shortcuts
Session export
Answer length control

Version Format

Intervu uses Semantic Versioning:

Major (X.0.0): Breaking changes
Minor (1.X.0): New features
Patch (1.0.X): Bug fixes
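
As a small illustration of the scheme, the sketch below classifies the change between two version strings. The helper names are hypothetical and pre-release or build suffixes are not handled.

```typescript
type Bump = "major" | "minor" | "patch" | "none";

// Parse "v1.2.3" or "1.2.3" into numeric parts.
function parseVersion(version: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`Not a MAJOR.MINOR.PATCH version: ${version}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Report which component changed first, reading left to right.
function classifyBump(from: string, to: string): Bump {
  const [fromMajor, fromMinor, fromPatch] = parseVersion(from);
  const [toMajor, toMinor, toPatch] = parseVersion(to);
  if (toMajor !== fromMajor) return "major";
  if (toMinor !== fromMinor) return "minor";
  if (toPatch !== fromPatch) return "patch";
  return "none";
}
```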

Download

Get the latest version from GitHub Releases.

Windows

Intervu-Setup-x.x.x.exe — Installer
Intervu-x.x.x-portable.exe — Portable (no installation required)

macOS

Intervu-x.x.x-arm64.dmg — Apple Silicon (M1/M2/M3/M4)
Intervu-x.x.x.dmg — Intel Macs
Intervu-x.x.x-arm64-mac.zip — Apple Silicon (portable)
Intervu-x.x.x-mac.zip — Intel Macs (portable)

macOS Users

The macOS app is not code-signed. On first launch, remove the quarantine attribute:

bash

xattr -d com.apple.quarantine /Applications/Intervu.app

Code signing will be added in a future release.

Verify Installation

After installing, check your version:

Open Intervu
Click Settings (gear icon)
Scroll to the bottom
The version number is displayed there

Migration Guide

From Pre-Release

If you used a pre-release version:

Settings may not transfer — Reconfigure endpoints
Old logs may have issues — Clear logs folder
Ratings are preserved — No action needed

Full Reset

Windows:

powershell

# Close Intervu
# Delete settings
Remove-Item -Recurse -Force "$env:APPDATA\intervu"
# Restart Intervu
# Reconfigure from scratch

macOS:

bash

# Close Intervu
# Delete settings
rm -rf ~/Library/Application\ Support/intervu
# Restart Intervu
# Reconfigure from scratch

Feedback

Found a bug? Have a suggestion?

Check existing issues on GitHub
Open a new issue with details
Include logs from Settings → Open Logs Folder

For the complete changelog, see GitHub Releases.
