Changelog
Version history and release notes for Intervu.
v1.2.1
Bug fix for microphone transcription with whisperx-asr-service backend.
Bug Fixes
STT Response Parsing
- Fixed microphone transcription returning empty text when using whisperx-asr-service (`/asr` endpoint) with speaker diarization enabled
- The `/asr` endpoint returns transcription as a `segments` array even when `diarize=false`, but the handler was only checking for `json.text` as a string
- Now extracts text from the `segments` array when `json.text` is not a string
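A minimal sketch of the fallback described in this fix, assuming each segment carries its own `text` field. The `SttResponse` shape and the `extractTranscript` name are hypothetical and are not taken from the Intervu source.

```typescript
// Hypothetical response shape: either a top-level `text` string or a
// `segments` array whose items each carry their own `text`.
interface SttSegment {
  text?: string;
}

interface SttResponse {
  text?: unknown;
  segments?: SttSegment[];
}

function extractTranscript(json: SttResponse): string {
  // Prefer the top-level string when the backend provides one.
  if (typeof json.text === "string") {
    return json.text.trim();
  }
  // Otherwise join the per-segment text, skipping empty pieces.
  if (Array.isArray(json.segments)) {
    return json.segments
      .map((s) => (typeof s.text === "string" ? s.text.trim() : ""))
      .filter((t) => t.length > 0)
      .join(" ");
  }
  return "";
}
```

Returning an empty string when neither field is usable keeps the caller's existing "no speech" handling unchanged.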
v1.2.0
PGVector context retrieval and dedicated chat window.
Features
PGVector Context Retrieval (Phase 1)
- Per-source text/vector mode toggle for resume, Q&A bank, and company context
- Pre-fetch vector context in `triggerGeneration()` before `buildMessages()`; cached in Zustand keyed by query
- New main-process modules: `pgvector.ts` (pool cache, cosine-distance search, SQL-identifier validation), `embeddings.ts` (OpenAI-compatible `/v1/embeddings`), `context-loader.ts` (shared by auto-answer and chat)
- Test Connection button per source: reports row count and dimension mismatch against live embedding
- Graceful fallback to text content when vector retrieval returns nothing
- Connection pools closed on `will-quit`
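The identifier validation and cosine-distance search mentioned above could be sketched as follows. This is an illustration under assumptions, not the contents of `pgvector.ts`: the function names, the column names, and the query shape are hypothetical. The `<=>` operator is pgvector's cosine-distance operator.

```typescript
// Accept only plain identifiers so table and column names can be
// interpolated into SQL text without risk of injection.
const IDENTIFIER_RE = /^[A-Za-z_][A-Za-z0-9_]*$/;

function assertSqlIdentifier(name: string): string {
  if (!IDENTIFIER_RE.test(name)) {
    throw new Error(`Invalid SQL identifier: ${name}`);
  }
  return name;
}

// pgvector accepts a vector as a text literal such as '[0.1,0.2]'.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

interface SearchQuery {
  text: string;
  values: [string, number];
}

// Builds a parameterized nearest-neighbour query (hypothetical schema).
function buildCosineSearch(
  table: string,
  contentCol: string,
  vectorCol: string,
  embedding: number[],
  limit: number
): SearchQuery {
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error("limit must be a positive integer");
  }
  const t = assertSqlIdentifier(table);
  const c = assertSqlIdentifier(contentCol);
  const v = assertSqlIdentifier(vectorCol);
  return {
    text:
      `SELECT ${c} AS content, ${v} <=> $1::vector AS distance ` +
      `FROM ${t} ORDER BY ${v} <=> $1::vector LIMIT $2`,
    values: [toVectorLiteral(embedding), limit],
  };
}
```

Identifiers cannot be passed as bind parameters, which is why they are validated and interpolated while the vector and the limit stay parameterized.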
Chat Window (Phase 2)
- Dedicated chat BrowserWindow opened via title-bar button, routed by `#/chat` hash
- Step-by-step processing visibility: thinking → fetching/loaded vector or text → generating, with detail lines and color coding
- Streams tokens to the AI bubble via `chat-token` IPC; separate `chatAbortController` from auto-answer
- Chat history persisted to `userData/chat-history.json`; 50-message context cap with "not in context" indicator for older messages
- Transcript snapshot pushed from main window to main-process cache so the chat window can attach live context
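The 50-message context cap can be pictured as a split of the persisted history into the messages sent to the LLM and the older ones that only receive the "not in context" indicator. The sketch below assumes a simple message shape; `ChatMessage` and `splitChatContext` are hypothetical names.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

const CHAT_CONTEXT_CAP = 50;

// Keeps the newest `cap` messages in context; everything older is
// returned separately so the UI can mark it as not in context.
function splitChatContext(
  history: ChatMessage[],
  cap: number = CHAT_CONTEXT_CAP
): { outOfContext: ChatMessage[]; inContext: ChatMessage[] } {
  const cutoff = Math.max(0, history.length - cap);
  return {
    outOfContext: history.slice(0, cutoff),
    inContext: history.slice(cutoff),
  };
}
```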
PGVector Self-Indexing
- "Index from text" button per source — chunks the text field, embeds each chunk, and writes to the configured PGVector table
- Paragraph-aware chunker (1500 char target, 200 char overlap sliding-window fallback for oversized paragraphs)
- Auto-creates the `vector` extension and target table if missing; probes and enforces dim consistency across chunks
- Dim-mismatch safety check against existing table before truncate+insert
- Live progress via `pgvector-index-progress` IPC, stamped per `fieldKey` so multiple fields don't cross-contaminate state
- Embeddings module now supports both OpenAI-compatible (`/v1/embeddings`) and Ollama-native (`/api/embeddings`) endpoints with auto URL resolution
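A paragraph-aware chunker with a sliding-window fallback, as described in the list above, might look roughly like this. It is a sketch under assumptions: paragraphs are taken to be separated by blank lines, and the names `chunkText` and `slidingWindow` are hypothetical. Only the 1500/200 defaults come from the release notes.

```typescript
// Splits an oversized paragraph into fixed-size windows that overlap.
function slidingWindow(text: string, size: number, overlap: number): string[] {
  const step = size - overlap;
  const out: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    out.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return out;
}

function chunkText(text: string, target = 1500, overlap = 200): string[] {
  if (overlap >= target) {
    throw new Error("overlap must be smaller than target");
  }
  // Assumption: paragraphs are separated by one or more blank lines.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const para of paragraphs) {
    if (para.length > target) {
      // Flush what has accumulated, then window the oversized paragraph.
      if (current) {
        chunks.push(current);
        current = "";
      }
      chunks.push(...slidingWindow(para, target, overlap));
      continue;
    }
    const candidate = current ? current + "\n\n" + para : para;
    if (candidate.length > target) {
      chunks.push(current);
      current = para;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Packing whole paragraphs first keeps related sentences in the same embedding; the overlap only applies where a single paragraph has to be cut.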
Test Connection UX
- Missing target table no longer errors — reports "Connected — table does not exist yet" so users know the credentials work and the table will be created on first index
- Successful indexing clears any stale "relation does not exist" test message
Chat Settings
- New `chatMaxTokens` setting in Advanced section, independent of `llmMaxTokens` (which is tuned for terse auto-answers)
- Default 0 = unlimited: when unset, the `max_tokens` field is omitted from the request so the LLM server's own default applies
- Chat mode now uses its own system prompt (ignores the "answer concisely" base prompt) and instructs the LLM to give complete, non-truncated answers
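The "0 = unlimited" behaviour amounts to leaving `max_tokens` out of the request body rather than sending a zero. A small sketch, assuming an OpenAI-compatible chat completions body; `buildChatRequest` is a hypothetical helper name.

```typescript
interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  stream: boolean;
  max_tokens?: number;
}

function buildChatRequest(
  model: string,
  messages: { role: string; content: string }[],
  chatMaxTokens: number
): ChatRequestBody {
  const body: ChatRequestBody = { model, messages, stream: true };
  // 0 (or any non-positive value) means "unlimited": omit the field so
  // the LLM server falls back to its own default.
  if (chatMaxTokens > 0) {
    body.max_tokens = chatMaxTokens;
  }
  return body;
}
```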
Session Export
- Export button in title bar (between Clear and Settings)
- Export transcript and answers as Markdown file with date stamp
- Copy to clipboard option for quick sharing
- Markdown format includes date header, transcript with speaker labels, and Q&A with confidence/key points
Resizable Settings Dialog
- Settings dialog can be resized by dragging the bottom-right corner
- Dialog size is persisted to localStorage and restored on next open
- Minimum dimensions enforced (480x400) for usability
Known Issues
- Chat replies still get cut off mid-sentence on long outputs. Despite `chatMaxTokens: 0` (unlimited) and an explicit "do not truncate" system prompt, longer chat replies terminate early. The root cause is suspected to be server-side: most likely Ollama `num_ctx` exhaustion (default 2048) once system prompt + vector chunks + transcript exceed the context window, or an OpenWebUI per-user generation cap that overrides the client request. Workaround: raise `num_ctx` to 8192+ in your Ollama Modelfile or OpenWebUI model advanced params. To be debugged in a follow-up.
v1.1.0
Speaker diarization support for multi-participant interviews.
Features
Speaker Diarization (Phase 1)
- Added `sttDiarization` toggle in Settings with on-enable validation (backend must support `diarize=true`)
- System audio stream is diarized into per-speaker segments (`SPEAKER_00`, `SPEAKER_01`, etc.)
- Client-side speaker embedding matching (256-dim cosine similarity, weighted moving average) for cross-chunk voice consistency
- Speakr-proven matching algorithm: 30% new / 70% existing weighted update + ambiguity check (5% gap threshold)
- Click-to-rename speaker labels in transcript pane (persisted as `speakerMap` in settings)
- Extended color palette for 10+ unique speakers
- Runtime hard error when backend doesn't support diarization: transcription pauses with actionable message
- Min speakers setting (2-6) in Settings
- Full backward compatibility: `sttDiarization: false` (default) changes nothing
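The matching steps listed above (cosine similarity, a 30/70 weighted update, and an ambiguity check) can be sketched as follows. This is an illustration only: the release notes do not say how the 5% gap is measured, so the sketch treats it as an absolute similarity difference of 0.05, and the `minSimilarity` threshold of 0.7 is a made-up placeholder. All function and type names are hypothetical.

```typescript
interface SpeakerProfile {
  id: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Weighted moving average: 30% new embedding, 70% existing profile.
function blendEmbedding(existing: number[], incoming: number[], newWeight = 0.3): number[] {
  return existing.map((v, i) => (1 - newWeight) * v + newWeight * incoming[i]);
}

// Returns the matched profile id, or null when there is no confident match.
function matchSpeaker(
  profiles: SpeakerProfile[],
  embedding: number[],
  minSimilarity = 0.7, // hypothetical threshold, not from the source
  ambiguityGap = 0.05 // assumed reading of the "5% gap threshold"
): string | null {
  let bestId: string | null = null;
  let best = -Infinity;
  let second = -Infinity;
  for (const profile of profiles) {
    const score = cosineSimilarity(profile.embedding, embedding);
    if (score > best) {
      second = best;
      best = score;
      bestId = profile.id;
    } else if (score > second) {
      second = score;
    }
  }
  if (bestId === null || best < minSimilarity) return null;
  // Two profiles scoring almost equally is treated as ambiguous.
  if (best - second < ambiguityGap) return null;
  return bestId;
}
```

On a confident match the stored profile would be replaced by `blendEmbedding(profile.embedding, embedding)`, so a voice drifts slowly rather than jumping to the latest chunk.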
LLM Speaker-Aware Prompting (Phase 2)
- System prompt now includes multi-speaker context with detected speaker names when diarization is active
- All non-'you' messages get `[DisplayName]:` prefixes using speakerMap display names
- Auto-trigger now fires on any non-'you' speaker (was broken: only checked for 'interviewer')
- Simple mode question detection finds latest non-'you' entry for card title (was only looking for 'interviewer')
- Advanced mode extractor uses display names from speakerMap instead of raw SPEAKER_XX labels
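As a rough illustration of the prefixing and trigger rules above, the sketch below assumes transcript entries carry a `speaker` label and that 'you' messages are passed through without a prefix. The entry shape and the helper names are hypothetical.

```typescript
interface PromptEntry {
  speaker: string; // "you" or a raw label such as "SPEAKER_00"
  text: string;
}

// Falls back to the raw label when no display name has been assigned.
function displayNameFor(speaker: string, speakerMap: Record<string, string>): string {
  return speakerMap[speaker] || speaker;
}

function formatEntryForLlm(entry: PromptEntry, speakerMap: Record<string, string>): string {
  if (entry.speaker === "you") {
    return entry.text;
  }
  return `[${displayNameFor(entry.speaker, speakerMap)}]: ${entry.text}`;
}

// Any speaker other than 'you' can trigger answer generation.
function shouldAutoTrigger(entry: PromptEntry): boolean {
  return entry.speaker !== "you";
}
```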
Speaker Monitor Panel
- Popover panel in title bar showing all detected speakers (visible when diarization is enabled)
- Per-speaker metadata: first seen, last spoke (relative time), message count
- Last 3 messages displayed per speaker
- Click-to-rename speaker names directly in monitor
- "Clear speaker data" button resets speaker names and embeddings
- Shared speaker utilities extracted to `lib/speaker-utils.ts`
AI-Assisted Speaker Naming (Phase 3)
- LLM analyzes transcript content for name mentions ("Hi Anna", "John, what do you think?")
- Suggestions appear as sparkle (✨) hints next to unnamed SPEAKER_XX labels in transcript
- Speaker monitor shows suggestions with accept (✓) and dismiss (✗) buttons
- Uses extractor LLM when advanced mode is enabled, otherwise main LLM
- Debounced analysis: waits 8 seconds after last transcript update, requires 5+ non-'you' entries
- Dismissed suggestions are remembered per session to avoid re-suggesting
- Accepted suggestions automatically move to `speakerMap` and persist to disk
- New IPC channel: `extract-speaker-names` (non-streaming, temperature 0, max_tokens 300)
- New store fields: `speakerNameSuggestions`, `setSpeakerNameSuggestions`, `acceptSpeakerSuggestion`, `dismissSpeakerSuggestion`
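The gating rules for name analysis (8 seconds of quiet, at least 5 non-'you' entries, and no repeat of dismissed suggestions) can be expressed as two small pure functions. This is a sketch with hypothetical names; it assumes dismissals are tracked per speaker label, which the release notes do not specify.

```typescript
const NAME_ANALYSIS_DEBOUNCE_MS = 8000;
const NAME_ANALYSIS_MIN_ENTRIES = 5;

// `entrySpeakers` holds the speaker label of each transcript entry.
function shouldAnalyzeSpeakerNames(
  entrySpeakers: string[],
  lastUpdateMs: number,
  nowMs: number
): boolean {
  const otherSpeakers = entrySpeakers.filter((s) => s !== "you").length;
  const quietLongEnough = nowMs - lastUpdateMs >= NAME_ANALYSIS_DEBOUNCE_MS;
  return quietLongEnough && otherSpeakers >= NAME_ANALYSIS_MIN_ENTRIES;
}

// Drops suggestions for labels the user already dismissed this session.
function filterDismissed(
  suggestions: Record<string, string>,
  dismissedLabels: string[]
): Record<string, string> {
  const kept: Record<string, string> = {};
  for (const label of Object.keys(suggestions)) {
    if (dismissedLabels.indexOf(label) === -1) {
      kept[label] = suggestions[label];
    }
  }
  return kept;
}
```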
Chunk Duration Tuning & Speaker Profile Persistence (Phase 4)
- Audio chunk duration auto-enforced to minimum 6 seconds when diarization is ON for better speaker separation
- Chunk duration input locked in settings with explanatory hint when diarization is enabled
- New "Remember Speakers" toggle in diarization settings: saves voice profiles (embeddings) to `speaker-profiles.json`
- Saved profiles are restored on next listening start, enabling cross-session speaker recognition
- "Clear speaker data" also clears saved profiles from disk
- New IPC channels: `save-speaker-profiles`, `load-speaker-profiles`, `clear-speaker-profiles`
- New setting: `speakerProfilePersistence` (default: false)
Clear Speaker Data Fix
- "Clear transcript & answers" now also clears speaker names (`speakerMap`) and in-memory embeddings
- "Clear all" also clears speaker data
- New "Clear speaker data" option in clear menu (resets speaker names + embeddings only)
Dual STT Backend Support
- Auto-detects API format based on endpoint URL: `/asr` (whisperx-asr-service) vs `/v1/audio/transcriptions` (OpenAI-compatible)
- whisperx-asr-service (`learnedmachine/whisperx-asr-service`): recommended for diarization because it provides speaker embeddings for cross-chunk tracking. Docker setup at `docker/whisperx-asr/`
- WhisperX API Server (`aldrickb/whisperx-api-server`): OpenAI-compatible. Docker setup at `docker/whisperx/`
- Retry-with-fallback on 500 errors (retries without embeddings if server crashes on NaN values)
- Validation scripts for both backends
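Endpoint-based detection of the API format could be as simple as inspecting the URL path. The sketch below is an assumption about how such a check might work; in particular, treating every URL that does not end in `/asr` as OpenAI-compatible is a guess, and `detectSttFormat` is a hypothetical name.

```typescript
type SttApiFormat = "whisperx-asr" | "openai";

function detectSttFormat(endpoint: string): SttApiFormat {
  // Ignore any query string and trailing slashes before matching.
  const path = endpoint.split("?")[0].replace(/\/+$/, "");
  if (path.endsWith("/asr")) {
    return "whisperx-asr";
  }
  // Assumption: anything else, including /v1/audio/transcriptions,
  // is handled as an OpenAI-compatible endpoint.
  return "openai";
}
```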
v1.0.2
Settings page quality-of-life improvements and new context fields.
Features
Q&A Bank & Company Context
- Added Q&A Bank field for pre-prepared interview answers and preferences
- Added Company / Role Context field for company info, job description, and team details
- Both fields are injected into the LLM system message alongside the resume
Settings Page Improvements
- Moved System Prompt next to LLM settings (mirrors Extractor System Prompt positioning)
- Fixed-height textareas with internal scrolling to prevent page bloat
- "Edit in new window" button on all textareas — opens a dedicated Electron window for comfortable long-text editing
v1.0.1
Bug fix release for macOS packaged app.
Bug Fixes
FFmpeg Not Found in Packaged App
- Packaged `.app` bundles don't inherit the shell's PATH, so `spawn('ffmpeg')` failed with ENOENT even when FFmpeg was installed via Homebrew
- Added automatic probing of well-known macOS paths (`/opt/homebrew/bin/ffmpeg` for Apple Silicon, `/usr/local/bin/ffmpeg` for Intel)
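The probing step can be sketched as a lookup over the two Homebrew locations named above, with a fallback to the bare command name. The existence check is injected so the sketch stays self-contained; in an Electron main process it would typically be `fs.existsSync`. `resolveFfmpegPath` is a hypothetical name.

```typescript
const MAC_FFMPEG_CANDIDATES = [
  "/opt/homebrew/bin/ffmpeg", // Homebrew on Apple Silicon
  "/usr/local/bin/ffmpeg", // Homebrew on Intel
];

function resolveFfmpegPath(
  exists: (path: string) => boolean,
  candidates: string[] = MAC_FFMPEG_CANDIDATES
): string {
  for (const candidate of candidates) {
    if (exists(candidate)) {
      return candidate;
    }
  }
  // Fall back to a PATH lookup, which still works in dev mode.
  return "ffmpeg";
}
```

The returned value is then passed to `spawn` in place of the bare `'ffmpeg'` string.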
Silent Audio in Packaged App
- The packaged app was ad-hoc signed with hardened runtime but lacked the `com.apple.security.device.audio-input` entitlement
- macOS silently provides a zero-byte audio stream when this entitlement is missing, so FFmpeg appeared to capture audio but it was all silence
- Added custom entitlements plists with audio input access for both the parent app and child processes (FFmpeg)
Invisible Window After Launch
- The inherited entitlements for child processes were missing `com.apple.security.cs.allow-jit`, which Electron's renderer process requires for V8/Chromium
- Without JIT, the renderer couldn't execute JavaScript, resulting in a blank/invisible window despite the app running (dock icon visible)
- Added JIT entitlement to the inherited entitlements plist
Improvements
- Improved microphone permission logging on macOS — now logs whether access was granted or denied
- Added `.DS_Store` to `.gitignore`
v1.0.0
Initial release with core functionality.
Features
Real-Time Audio Capture
- Capture system audio and microphone simultaneously
- Speaker attribution (interviewer vs. you)
- Audio level meters in title bar
- Silent audio filtering (RMS threshold)
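An RMS-based silence gate can be illustrated with a few lines of code. This is a generic sketch assuming normalized floating-point samples in the range -1 to 1; the 0.01 threshold is a placeholder, not the value Intervu uses.

```typescript
// Root mean square of a block of PCM samples.
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// A chunk below the threshold is treated as silence and skipped.
function isSilent(samples: Float32Array, threshold = 0.01): boolean {
  return rms(samples) < threshold;
}
```

Skipping silent chunks before they reach the STT backend also reduces the hallucinated phrases that Whisper-style models tend to produce on empty audio.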
Speech-to-Text Integration
- OpenAI-compatible STT endpoint support
- Real-time transcription with speaker labels
- Hallucination phrase filtering
- Configurable audio chunk duration
AI-Powered Answers
- Real-time answer generation
- Streaming response display
- Resume-based personalization
- System prompt customization
Answer Quality Feedback
- Thumbs up/down rating system
- In-context learning from ratings
- Local rating storage
- Clear ratings option
Advanced Mode
- Dual-LLM question extraction
- Intelligent answer queue
- Custom extractor prompts
- Queue mode for multiple answers
Contextual Answer Tips
- Key points bullet extraction
- Confidence score (0-100)
- Structured answer format
- Toggle in advanced settings
Audio Configuration
- Device selection for system audio and microphone
- VB-Cable integration for system capture (Windows)
- BlackHole integration for system capture (macOS)
- Audio routing with SteelSeries GG and Voicemeeter (Windows)
- Multi-output device setup (macOS)
- Device refresh and validation
- Level meter display
Cross-Platform Support
- Windows: DirectShow audio capture, bundled FFmpeg, installer and portable builds
- macOS: AVFoundation audio capture, manual FFmpeg install, DMG and ZIP builds
- Platform-aware device enumeration and error messages
- Microphone permission handling on macOS
Application
- App icon and logo for all platforms
- Floating overlay window
- Always-on-top mode
- Compact dark theme
Settings Management
- Persistent settings storage
- STT/LLM endpoint configuration
- Resume and system prompt editor
- Test connections for STT and LLM
Technical
- FFmpeg bundled download (Windows)
- FFmpeg manual install (macOS via Homebrew)
- IPC communication for CORS bypass
- File-based logging
- Error notifications
- macOS quarantine workaround support
Development History
2026-04-09 — Updates
macOS Support
- AVFoundation-based audio device enumeration
- Platform-aware FFmpeg arguments (`-f avfoundation` on Mac, `-f dshow` on Windows)
- Microphone permission request on macOS startup
- Mac build targets (DMG/ZIP for x64 and arm64)
- Dock icon and app name for macOS dev mode
- Help text adapted per platform in settings dialog
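The platform-aware input arguments mentioned above might be assembled like this. The sketch relies on FFmpeg's documented input syntax for the two devices: AVFoundation takes `video:audio` (an empty video part selects audio only) and DirectShow takes `audio=<device name>`. The function name and the way the device is passed in are hypothetical.

```typescript
type CapturePlatform = "darwin" | "win32";

function buildCaptureInputArgs(platform: CapturePlatform, device: string): string[] {
  if (platform === "darwin") {
    // AVFoundation: ":<audio device index or name>" means audio only.
    return ["-f", "avfoundation", "-i", `:${device}`];
  }
  // DirectShow: audio devices are selected by their friendly name.
  return ["-f", "dshow", "-i", `audio=${device}`];
}
```

Because the arguments are passed to `spawn` as an array, device names containing spaces need no extra shell quoting.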
Application Branding
- Added app icon and logo for all platforms
- Logo displayed in title bar
- Icons bundled for Windows, macOS, and Linux builds
Windows Build Improvements
- Configured NSIS installer build
- Added portable executable build
- Build outputs now go to the `dist/` directory
2026-04-08 — v1.0.0 Release
- Initial public release
- Core features implemented
- Windows support
- Documentation released
Pre-Release Development
Audio System
- Implemented FFmpeg-based audio capture
- Added DirectShow device enumeration (Windows)
- Added AVFoundation device enumeration (macOS)
- Created audio preview with level metering
- Integrated silence gate filtering
STT Pipeline
- Built audio chunk queue system
- Implemented Whisper transcription
- Added hallucination filtering
- Created speaker attribution
LLM Integration
- Developed streaming response handler
- Implemented answer queue system
- Created rating feedback loop
- Added contextual tips parsing
State Management
- Zustand store implementation
- Settings persistence
- Transcript history
- Q&A history
Known Issues
Current Limitations
| Issue | Status | Workaround |
|---|---|---|
| macOS requires manual FFmpeg install | By design | brew install ffmpeg |
| macOS app not code-signed | Planned | xattr -d com.apple.quarantine /Applications/Intervu.app |
| VB-Cable alone blocks system audio on Windows | Documented | Use SteelSeries GG or Voicemeeter for audio routing |
| Linux support not tested | Planned | Windows or macOS only for now |
| No offline mode | Planned | Requires STT and LLM running |
| Cannot record sessions | Planned | Use screen recording |
Upcoming Features
For planned features, see Roadmap.
Next Release Focus
- Code signing for macOS and Windows
- Keyboard shortcuts
- Session export
- Answer length control
Version Format
Intervu uses Semantic Versioning:
- Major (X.0.0): Breaking changes
- Minor (1.X.0): New features
- Patch (1.0.X): Bug fixes
Download
Get the latest version from GitHub Releases.
Windows
- `Intervu-Setup-x.x.x.exe`: Installer
- `Intervu-x.x.x-portable.exe`: Portable (no installation required)
macOS
- `Intervu-x.x.x-arm64.dmg`: Apple Silicon (M1/M2/M3/M4)
- `Intervu-x.x.x.dmg`: Intel Macs
- `Intervu-x.x.x-arm64-mac.zip`: Apple Silicon (portable)
- `Intervu-x.x.x-mac.zip`: Intel Macs (portable)
macOS Users
The macOS app is not code-signed. On first launch, remove the quarantine attribute:
xattr -d com.apple.quarantine /Applications/Intervu.app
Code signing will be added in a future release.
Verify Installation
After installing, check your version:
- Open Intervu
- Click Settings (gear icon)
- Scroll to the bottom
- The version number is displayed there
Migration Guide
From Pre-Release
If you used a pre-release version:
- Settings may not transfer — Reconfigure endpoints
- Old logs may have issues — Clear logs folder
- Ratings are preserved — No action needed
Full Reset
Windows:
rem Close Intervu
rem Delete settings
rmdir /s /q "%APPDATA%\intervu"
rem Restart Intervu
rem Reconfigure from scratch
macOS:
# Close Intervu
# Delete settings
rm -rf ~/Library/Application\ Support/intervu
# Restart Intervu
# Reconfigure from scratch
Feedback
Found a bug? Have a suggestion?
- Check existing issues on GitHub
- Open a new issue with details
- Include logs from Settings → Open Logs Folder
For the complete changelog, see GitHub Releases.