
Advanced Mode

Use a second LLM to extract complete questions before generating answers.

Overview

Advanced Mode adds an intelligent question extraction step:

Mode     | Process
---------|--------
Basic    | Interviewer speaks → Generate answer immediately
Advanced | Interviewer speaks → Extract questions → Generate answers for complete questions only

Why Use Advanced Mode?

  • Reduces noise: Doesn't generate answers for casual conversation
  • Better context: Waits for complete questions
  • More relevant: Filters out statements that aren't questions

Enabling Advanced Mode

  1. Open Settings (gear icon)
  2. Scroll to Advanced Settings
  3. Toggle Advanced Mode on
  4. Configure the extractor LLM (can use same endpoint as main LLM)

Configuration

Setting            | Description                  | Default
-------------------|------------------------------|------------------
Advanced Mode      | Enable/disable               | Off
Extractor Endpoint | LLM for question extraction  | Same as main LLM
Extractor Model    | Model to use                 | Same as main LLM
Extractor API Key  | API key if required          | Empty
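The fallback behavior in the table (extractor settings defaulting to the main LLM) can be sketched as follows. This is illustrative only: the key names are made up for the example and are not the app's actual setting keys.

```python
def resolve_extractor_config(settings: dict) -> dict:
    """Return the effective extractor config, falling back to the main LLM.

    Hypothetical setting keys, used only to illustrate the defaults table.
    """
    return {
        # Blank/missing extractor fields fall back to the main LLM values.
        "endpoint": settings.get("extractor_endpoint") or settings["main_endpoint"],
        "model": settings.get("extractor_model") or settings["main_model"],
        # API key simply defaults to empty when the endpoint doesn't need one.
        "api_key": settings.get("extractor_api_key", ""),
    }

# With no extractor overrides, everything falls back to the main LLM:
cfg = resolve_extractor_config({
    "main_endpoint": "http://localhost:11434/v1/chat/completions",
    "main_model": "llama3.2:3b",
})
print(cfg["model"])  # llama3.2:3b
```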

How It Works

Basic Mode Flow

Interviewer speaks → Transcript captured → Wait for debounce (2s) → Generate answer

Advanced Mode Flow

Interviewer speaks → Transcript captured → Wait for debounce (2s) → Extractor LLM analyzes transcript → Identify complete questions → Generate answer for each question
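The Advanced Mode flow above can be sketched as a single pipeline step. This is a minimal illustration, not the app's actual code; the two callables stand in for the extractor LLM call and the main LLM call.

```python
import time

DEBOUNCE_SECONDS = 2  # matches the default debounce in the flow above

def advanced_mode_step(transcript, last_speech_time, extract_fn, answer_fn):
    """One pass of the Advanced Mode pipeline (illustrative sketch).

    extract_fn: returns a list of complete questions found in the transcript
    answer_fn:  generates an answer for a single question
    """
    # Debounce: give the interviewer time to finish the thought.
    if time.time() - last_speech_time < DEBOUNCE_SECONDS:
        return []  # still within the debounce window; do nothing yet

    questions = extract_fn(transcript)        # extractor LLM call
    return [answer_fn(q) for q in questions]  # main LLM call per question
```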

Question Extraction

The extractor LLM:

  1. Receives recent transcript entries
  2. Identifies interviewer's speech
  3. Detects complete questions
  4. Returns question text
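Assuming the extractor endpoint speaks the OpenAI-compatible chat completions format used elsewhere on this page, a request covering steps 1–4 might be assembled like this. The exact message layout the app uses is an assumption for illustration:

```python
def build_extraction_request(transcript_entries, system_prompt, model="llama3.2:3b"):
    """Assemble an OpenAI-compatible /v1/chat/completions payload that asks
    the extractor model to pull complete questions out of the transcript.

    The system/user message split here is an assumption, not the app's
    exact format.
    """
    transcript_text = "\n".join(transcript_entries)  # recent entries, in order
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": transcript_text},
        ],
    }
```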

Example extraction:

Transcript:
[Interviewer]: So I see you worked at TechCorp... tell me about your role there. What were your main responsibilities?

Extracted Question:
"What were your main responsibilities at TechCorp?"

Use Cases

When to Use Advanced Mode

Good for:

  • Verbose interviewers who ramble
  • Conversational interviews with back-and-forth
  • When you only want answers for actual questions
  • Reducing noise in the answer panel

Not needed for:

  • Direct, concise interviewers
  • Technical interviews with short questions
  • Fast-paced interviews

Queue Mode Integration

Advanced Mode works together with Queue Mode to handle multiple questions:

Queue Mode Off (Default)

  • Only the most recent question is answered
  • Previous questions are cancelled

Queue Mode On

  • All extracted questions are queued
  • Each question gets an answer in order
  • Cards show queue position
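The difference between the two toggles can be sketched as a simple selection function (illustrative only, not the app's code):

```python
def select_questions(extracted, queue_mode):
    """Pick which extracted questions get answered.

    Queue Mode off: only the newest question survives; earlier ones are
    cancelled. Queue Mode on: every question is answered, in order.
    """
    if not extracted:
        return []
    if queue_mode:
        return list(extracted)  # answer every question, in order
    return [extracted[-1]]      # keep only the most recent; drop the rest

print(select_questions(["Q1?", "Q2?", "Q3?"], queue_mode=False))  # ['Q3?']
```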

Extractor LLM Configuration

Using Same LLM

You can use the same endpoint for both extraction and answering:

Extractor Endpoint: http://localhost:11434/v1/chat/completions
Extractor Model: llama3.2:3b

This is the default and simplest setup.

Using Different LLM

For better extraction, you may use a different model:

Main LLM: llama3.2:3b (for answers)
Extractor LLM: phi3:mini (for question detection)

Why different models?

  • Question extraction needs speed, not creativity
  • Smaller models work well for detection
  • Frees up resources for main answer generation

Troubleshooting

No Questions Extracted

Causes:

  • Interviewer hasn't asked a complete question yet
  • Extractor model isn't detecting questions
  • Transcript is empty

Solutions:

  • Wait for the interviewer to finish speaking
  • Verify that the extractor endpoint is reachable
  • Test with a clear question: "What is your experience with React?"

Too Many Questions Extracted

Causes:

  • Every statement is being treated as a question
  • Model is too aggressive in detection

Solutions:

  • Add stricter rules to the extractor system prompt
  • Use a different model
  • Check whether the hallucination phrases need updating

Slow Performance

Causes:

  • Two LLM calls instead of one
  • Large extractor model

Solutions:

  • Use a smaller model for extraction (e.g., phi3:mini)
  • Ensure GPU acceleration is enabled
  • Check network latency

System Prompt for Extractor

The default extractor prompt:

You are a question extraction assistant for a live interview transcript.
The transcript is captured in real-time using short audio chunks, so
interviewer questions may be split across multiple consecutive entries.
Your job is to identify complete, unanswered interview questions.

You can customize this in Settings → Advanced → Extractor System Prompt.

Customization Examples

Strict Question Detection

Only extract questions that:
1. End with a question mark
2. Start with a question word
3. Are complete sentences
Ignore statements, commands, and incomplete thoughts.

Include Follow-ups

Identify both direct questions and implied follow-up questions.
For example, "That's interesting..." implies "Tell me more."

Performance Tips

Optimize Extractor Model

```bash
# Use a fast, small model
ollama pull phi3:mini

# Or use the same model if resources allow
# llama3.2:3b works fine for both
```

Reduce Redundant Calls

  • Increase debounce seconds to wait for complete thoughts
  • Use Queue mode to batch question extraction


Made with ❤️ by Aldrick Bonaobra