# Advanced Mode
Use a second LLM to extract complete questions before generating answers.
## Overview
Advanced Mode adds an intelligent question extraction step:
| Mode | Process |
|---|---|
| Basic | Interviewer speaks → Generate answer immediately |
| Advanced | Interviewer speaks → Extract questions → Generate answers for complete questions only |
## Why Use Advanced Mode?
- Reduces noise: Doesn't generate answers for casual conversation
- Better context: Waits for complete questions
- More relevant: Filters out statements that aren't questions
## Enabling Advanced Mode

1. Open Settings (gear icon)
2. Scroll to Advanced Settings
3. Toggle Advanced Mode on
4. Configure the extractor LLM (can use the same endpoint as the main LLM)
## Configuration
| Setting | Description | Default |
|---|---|---|
| Advanced Mode | Enable/disable | Off |
| Extractor Endpoint | LLM for question extraction | Same as main LLM |
| Extractor Model | Model to use | Same as main LLM |
| Extractor API Key | API key if required | Empty |
## How It Works
### Basic Mode Flow

```
Interviewer speaks
  ↓
Transcript captured
  ↓
Wait for debounce (2s)
  ↓
Generate answer
```

### Advanced Mode Flow

```
Interviewer speaks
  ↓
Transcript captured
  ↓
Wait for debounce (2s)
  ↓
Extractor LLM analyzes transcript
  ↓
Identify complete questions
  ↓
Generate answer for each question
```

### Question Extraction
The extractor LLM:
- Receives recent transcript entries
- Identifies interviewer's speech
- Detects complete questions
- Returns question text
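The extra routing step can be pictured as follows: instead of sending every debounced transcript chunk straight to the answer LLM, the chunk first passes through the extractor, and only the questions it returns get answered. This is an illustrative sketch, not the app's actual code; `extract_questions` and `generate_answer` are hypothetical stand-ins for the two LLM calls.

```python
from typing import Callable, Dict, List

def advanced_mode_step(
    transcript: List[str],
    extract_questions: Callable[[List[str]], List[str]],
    generate_answer: Callable[[str], str],
) -> Dict[str, str]:
    """Run one debounced cycle: extract complete questions from the
    transcript, then answer each one. Non-questions produce nothing."""
    questions = extract_questions(transcript)
    # Only complete questions ever reach the answer LLM.
    return {q: generate_answer(q) for q in questions}

# Stub "LLMs" for illustration: treat lines ending in "?" as questions.
fake_extractor = lambda lines: [l for l in lines if l.rstrip().endswith("?")]
fake_answerer = lambda q: f"(answer to: {q})"

answers = advanced_mode_step(
    ["So I see you worked at TechCorp...",
     "What were your main responsibilities?"],
    fake_extractor,
    fake_answerer,
)
# The statement is filtered out; only the question gets an answer card.
```

The same structure explains why casual conversation produces no answer cards: the extractor simply returns an empty list for it.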
Example extraction:

```
Transcript:
[Interviewer]: So I see you worked at TechCorp... tell me about your role there. What were your main responsibilities?

Extracted Question:
"What were your main responsibilities at TechCorp?"
```

## Use Cases
### When to Use Advanced Mode
✅ Good for:
- Verbose interviewers who ramble
- Conversational interviews with back-and-forth
- When you only want answers for actual questions
- Reducing noise in the answer panel
❌ Not needed for:
- Direct, concise interviewers
- Technical interviews with short questions
- Fast-paced interviews
## Queue Mode Integration
Advanced Mode works with Queue Mode for multiple questions:
### Queue Mode Off (Default)
- Only the most recent question is answered
- Previous questions are cancelled
### Queue Mode On
- All extracted questions are queued
- Each question gets an answer in order
- Cards show queue position
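The two behaviors above can be sketched as a small dispatcher. This is an illustration of the described semantics only; the class and method names are hypothetical, not the app's internals.

```python
from collections import deque

class QuestionDispatcher:
    """Queue Mode off: keep only the newest question, cancelling the rest.
    Queue Mode on: answer every extracted question in arrival order."""

    def __init__(self, queue_mode: bool):
        self.queue_mode = queue_mode
        self.pending = deque()

    def submit(self, questions):
        if self.queue_mode:
            self.pending.extend(questions)          # queue all, in order
        else:
            self.pending.clear()                    # cancel older questions
            if questions:
                self.pending.append(questions[-1])  # keep only the latest

    def drain(self):
        answered = list(self.pending)
        self.pending.clear()
        return answered

off = QuestionDispatcher(queue_mode=False)
off.submit(["Q1?", "Q2?", "Q3?"])
# Queue off: only the most recent question survives.

on = QuestionDispatcher(queue_mode=True)
on.submit(["Q1?", "Q2?"])
on.submit(["Q3?"])
# Queue on: all three questions are answered in order.
```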
## Extractor LLM Configuration
### Using Same LLM
You can use the same endpoint for both extraction and answering:

```
Extractor Endpoint: http://localhost:11434/v1/chat/completions
Extractor Model: llama3.2:3b
```

This is the default and simplest setup.
### Using Different LLM
For better extraction, you may use a different model:

```
Main LLM: llama3.2:3b (for answers)
Extractor LLM: phi3:mini (for question detection)
```

Why different models?
- Question extraction needs speed, not creativity
- Smaller models work well for detection
- Frees up resources for main answer generation
## Troubleshooting
### No Questions Extracted
Causes:
- Interviewer hasn't asked a complete question yet
- Extractor model isn't detecting questions
- Transcript is empty
Solutions:
- Wait for interviewer to finish speaking
- Check extractor endpoint is working
- Test with a clear question: "What is your experience with React?"
### Too Many Questions Extracted
Causes:
- Every statement is being treated as a question
- Model is too aggressive in detection
Solutions:
- Add explicit rules to the extractor system prompt
- Use a different model
- Check if hallucination phrases need updating
### Slow Performance
Causes:
- Two LLM calls instead of one
- Large extractor model
Solutions:
- Use a smaller model for extraction (phi3:mini)
- Ensure GPU acceleration
- Check network latency
## System Prompt for Extractor
The default extractor prompt:

```
You are a question extraction assistant for a live interview transcript.
The transcript is captured in real-time using short audio chunks, so
interviewer questions may be split across multiple consecutive entries.
Your job is to identify complete, unanswered interview questions.
```

You can customize this in Settings → Advanced → Extractor System Prompt.
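Under the hood, an extraction call is an ordinary chat completion: the extractor system prompt plus the recent transcript entries. A sketch of how such a request body could be assembled for an OpenAI-compatible endpoint like the Ollama one shown earlier — the field names follow the standard chat-completions schema, but the function and the transcript formatting are illustrative assumptions, not the app's actual code.

```python
import json

# Abbreviated version of the default extractor prompt shown above.
EXTRACTOR_PROMPT = (
    "You are a question extraction assistant for a live interview "
    "transcript. Your job is to identify complete, unanswered "
    "interview questions."
)

def build_extraction_request(model: str, transcript_entries: list) -> dict:
    """Assemble a chat-completions payload for the extractor LLM."""
    transcript = "\n".join(transcript_entries)
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": EXTRACTOR_PROMPT},
            {"role": "user", "content": transcript},
        ],
    }

payload = build_extraction_request(
    "phi3:mini",
    ["[Interviewer]: Tell me about your role there.",
     "[Interviewer]: What were your main responsibilities?"],
)
body = json.dumps(payload)  # POST this to .../v1/chat/completions
```

Customizing the system prompt in Settings changes only the `system` message; the transcript is still passed as the `user` message.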
## Customization Examples
### Strict Question Detection
```
Only extract questions that:
1. End with a question mark
2. Start with a question word
3. Are complete sentences

Ignore statements, commands, and incomplete thoughts.
```

### Include Follow-ups
```
Identify both direct questions and implied follow-up questions.
For example, "That's interesting..." implies "Tell me more."
```

## Performance Tips
### Optimize Extractor Model
```bash
# Use a fast, small model
ollama pull phi3:mini

# Or use the same model if resources allow
# llama3.2:3b works fine for both
```

### Reduce Redundant Calls
- Increase debounce seconds to wait for complete thoughts
- Use Queue mode to batch question extraction
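The debounce setting works like this: the extractor fires only once no new transcript entry has arrived for the configured number of seconds, so raising it trades latency for fewer (and better-formed) extraction calls. A minimal sketch of that rule, with timestamps passed in explicitly for testability — the class is illustrative, not the app's implementation.

```python
class Debouncer:
    """Fire only after `wait` seconds of silence since the last entry."""

    def __init__(self, wait: float):
        self.wait = wait
        self.last_entry_at = None

    def on_entry(self, now: float):
        # Every new transcript entry resets the silence timer.
        self.last_entry_at = now

    def should_fire(self, now: float) -> bool:
        if self.last_entry_at is None:
            return False
        return (now - self.last_entry_at) >= self.wait

d = Debouncer(wait=2.0)
d.on_entry(now=10.0)
d.on_entry(now=11.0)   # interviewer still talking: timer resets
# At t=12.0 only 1s of silence has passed, so the extractor waits.
# At t=13.5 there has been 2.5s of silence, so extraction can run.
```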
## Next Steps
- Basic Mode — Default workflow
- Contextual Tips — Key points extraction