
PGVector Context Retrieval

Connect Intervu to PostgreSQL with the pgvector extension for semantic search over your interview preparation materials.

Overview

PGVector integration allows Intervu to perform semantic similarity searches over your indexed documents. Instead of sending your entire resume or Q&A bank to the LLM, only the most relevant chunks are retrieved based on the current conversation context.

Benefits

  • Efficient context usage — Only relevant sections are sent to the LLM
  • Large document support — Index documents that exceed context limits
  • Cross-source retrieval — Search across resume, Q&A bank, and company context simultaneously
  • Configurable similarity thresholds — Tune retrieval precision to your needs

Prerequisites

PostgreSQL with pgvector

You need a PostgreSQL database with the pgvector extension installed.

Option 1: Docker (Recommended)

```bash
docker run -d \
  --name pgvector \
  -e POSTGRES_PASSWORD=yourpassword \
  -e POSTGRES_DB=intervu \
  -p 5432:5432 \
  pgvector/pgvector:pg16
```

Option 2: Existing PostgreSQL

With the pgvector package already installed on the server, enable the extension in your database:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

Embedding Endpoint

Intervu needs an embedding endpoint to vectorize your text. Configure this in Settings:

| Setting | Description |
|---|---|
| Embedding Endpoint | OpenAI-compatible /v1/embeddings or Ollama-native /api/embeddings |
| Embedding Model | Model to use (e.g., text-embedding-3-small, nomic-embed-text) |
| Embedding API Key | Optional, for authenticated endpoints |
| Embedding Dimensions | Optional override (0 = use model default) |

OpenWebUI + Ollama

For OpenWebUI with Ollama backends, use the Ollama-native path (e.g., https://your-server.com/ollama/api/embeddings) instead of OpenWebUI's /api/v1/embeddings, which may return vectors with a different dimension.
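
The two endpoint styles expect different request bodies, which is a common source of confusion. The sketch below shows the payload shapes as the OpenAI and Ollama APIs define them. The helper names (`build_embedding_request`, `parse_embedding`) and the URL-suffix check are illustrative assumptions, not Intervu's actual code.

```python
import json


def build_embedding_request(endpoint, model, text, api_key=None):
    """Return (headers, body) for one embeddings call.

    Assumption: the payload style is inferred from the URL suffix.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Bearer auth is what OpenAI-compatible servers typically expect.
        headers["Authorization"] = f"Bearer {api_key}"
    if endpoint.rstrip("/").endswith("/api/embeddings"):
        # Ollama-native: the text goes in "prompt".
        body = {"model": model, "prompt": text}
    else:
        # OpenAI-compatible: the text goes in "input".
        body = {"model": model, "input": text}
    return headers, json.dumps(body)


def parse_embedding(response):
    """Extract the vector from a decoded JSON response of either style."""
    if "embedding" in response:
        return response["embedding"]          # Ollama-native
    return response["data"][0]["embedding"]   # OpenAI-compatible
```

Comparing `len(parse_embedding(...))` across two endpoints for the same model is a quick way to spot the dimension difference described above.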


Configuration

Per-Source Settings

Each context source (Resume, Q&A Bank, Company Context) has its own PGVector configuration:

  1. Mode Toggle — Switch between Text and PGVector modes
  2. Connection Settings — Database credentials and table name

Connection Fields

| Field | Description |
|---|---|
| Host | PostgreSQL server address |
| Port | Database port (default: 5432) |
| Database | Database name |
| User | Database username |
| Password | Database password |
| Table | Target table for this source |
| SSL | Enable SSL connection |
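
To check the same credentials outside Intervu (for example with `psql`), the fields above can be combined into a standard PostgreSQL connection URI. This is a minimal sketch: the function name is hypothetical, and mapping the SSL toggle to `sslmode=require` is an assumption about what the toggle does.

```python
from urllib.parse import quote


def build_postgres_uri(host, database, user, password, port=5432, ssl=False):
    """Combine the connection fields into a libpq-style URI."""
    # Percent-encode credentials so characters like '@' or spaces stay valid.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    uri = f"postgresql://{auth}@{host}:{port}/{quote(database, safe='')}"
    if ssl:
        # Assumption: the SSL toggle corresponds to sslmode=require.
        uri += "?sslmode=require"
    return uri
```

The resulting string can be passed to `psql` directly. The Table field is not part of the URI because it is used in queries, not in the connection.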

Test Connection

Click "Test Connection" to verify your settings. The test reports:

  • Connection success/failure
  • Row count in the target table
  • Dimension mismatch warnings (if the table exists)

Table Not Found

A "table does not exist" message means the connection works but the table hasn't been created yet. Use "Index from text" to create and populate it.


Indexing Your Content

Index from Text

Click "Index from text" to populate your PGVector table from the text field:

  1. Intervu chunks your text into segments (target: 1500 chars, overlap: 200 chars)
  2. Each chunk is embedded using your configured embedding endpoint
  3. Chunks are inserted into the specified table

The table is auto-created with this schema:

```sql
CREATE TABLE your_table (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding VECTOR(dimension)
);
```
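
In the schema above, `dimension` is a placeholder for the embedding size (for example 768 for nomic-embed-text). The sketch below fills in that placeholder and shows the `[x,y,z]` text format that pgvector accepts for vector values. The helper names and the identifier check are illustrative assumptions, not Intervu internals.

```python
import re


def create_table_sql(table, dimension):
    """Render the schema for a concrete table name and dimension."""
    # Table names cannot be bound as query parameters, so validate them.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsafe table name: {table!r}")
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    return (
        f"CREATE TABLE {table} (\n"
        "  id SERIAL PRIMARY KEY,\n"
        "  content TEXT,\n"
        f"  embedding VECTOR({dimension})\n"
        ");"
    )


def to_vector_literal(values):
    """Format a list of numbers as a pgvector text literal."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"
```

For example, `to_vector_literal([1, 0.25, -2])` yields `[1.0,0.25,-2.0]`, which can be cast with `::vector` in an INSERT statement.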

Chunking Strategy

  • Paragraph-aware — Respects paragraph boundaries when possible
  • Sliding window — Falls back to overlapping windows for long paragraphs
  • Dimension consistency — All chunks use the same embedding dimension
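
The chunking behavior described above can be sketched as follows, using the documented target of 1500 characters and overlap of 200. This is an approximation under stated assumptions: paragraphs are taken to be separated by blank lines, overlap is applied only inside the sliding-window fallback, and the function names are hypothetical.

```python
def sliding_windows(text, target=1500, overlap=200):
    """Split one long string into overlapping fixed-size windows."""
    if overlap >= target:
        raise ValueError("overlap must be smaller than target")
    step = target - overlap
    windows, start = [], 0
    while start < len(text):
        windows.append(text[start:start + target])
        if start + target >= len(text):
            break  # the last window reached the end of the text
        start += step
    return windows


def chunk_text(text, target=1500, overlap=200):
    """Pack whole paragraphs into chunks; window any oversized paragraph."""
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if len(para) > target:
            # Flush what we have, then fall back to sliding windows.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(sliding_windows(para, target, overlap))
        elif current and len(current) + 2 + len(para) > target:
            # Adding this paragraph would exceed the target: start a new chunk.
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs whole where possible means a retrieved chunk usually contains a complete thought, while the overlap keeps sentences that straddle a window boundary retrievable from either side.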

Re-indexing

Re-indexing truncates the existing table and inserts fresh chunks. A dimension mismatch check prevents data corruption.
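
One way to picture re-indexing is as a validate-then-replace sequence: every new embedding is checked against the table's dimension before anything is deleted, so a failed check leaves existing rows untouched. The sketch below is an assumption about that ordering, not Intervu's actual implementation. In particular, it simply refuses to proceed on a mismatch, whereas the application may resolve it differently. The `%s` placeholders follow the psycopg parameter style.

```python
def check_dimension(table_dimension, embedding):
    """Raise if the embedding size differs from the table's vector size."""
    if table_dimension is not None and table_dimension != len(embedding):
        raise ValueError(
            f"Table has dimension {table_dimension} "
            f"but embedding produces {len(embedding)}"
        )


def plan_reindex(table, table_dimension, rows):
    """Return (sql, params) pairs for a re-index.

    rows is a list of (content, embedding) tuples; table is assumed to be
    an already validated identifier.
    """
    # Validate everything first so nothing is truncated on a mismatch.
    for _, embedding in rows:
        check_dimension(table_dimension, embedding)
    statements = [(f"TRUNCATE TABLE {table} RESTART IDENTITY;", ())]
    for content, embedding in rows:
        literal = "[" + ",".join(str(float(v)) for v in embedding) + "]"
        statements.append((
            f"INSERT INTO {table} (content, embedding) VALUES (%s, %s::vector);",
            (content, literal),
        ))
    return statements
```

Running the returned statements inside a single transaction would keep the table from being left empty if an insert fails partway through.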


Retrieval Settings

Configure retrieval behavior in Settings → Advanced:

| Setting | Default | Description |
|---|---|---|
| Retrieval Top-K | 5 | Number of chunks to retrieve per source |
| Min Similarity | 0 | Minimum cosine similarity (0-1) for retrieved chunks |

A higher Top-K supplies more context to the LLM at the cost of more tokens. A higher Min Similarity returns fewer chunks, but the ones returned are more closely related to the query.
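
To make the interaction between the two settings concrete, here is a small in-memory sketch of cosine similarity with a threshold and a top-K cut. In practice the database does this work. The function names and the toy two-dimensional vectors are purely illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def select_chunks(query_vec, rows, top_k=5, min_similarity=0.0):
    """Keep chunks at or above the threshold, best first, at most top_k."""
    scored = [(cosine_similarity(query_vec, emb), content)
              for content, emb in rows]
    kept = [item for item in scored if item[0] >= min_similarity]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:top_k]


rows = [("same", [1, 0]), ("diagonal", [1, 1]), ("orthogonal", [0, 1])]
loose = select_chunks([1, 0], rows, top_k=2)              # threshold 0
strict = select_chunks([1, 0], rows, min_similarity=0.8)  # high precision
```

Here `loose` keeps the two best matches regardless of score, while `strict` keeps only the chunk whose similarity clears 0.8.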


How It Works

Auto-Answer Mode

When PGVector is enabled for any source:

  1. User asks a question (transcript entry)
  2. Question is embedded via the embedding endpoint
  3. Cosine similarity search retrieves top-K chunks from each enabled source
  4. Retrieved chunks are injected into the LLM system prompt
  5. LLM generates an answer using the relevant context
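
Step 3 can be expressed as a single pgvector query. The `<=>` operator returns cosine distance, so `1 - distance` gives cosine similarity. The sketch below builds such a query with psycopg-style `%s` placeholders. The function name and exact SQL are assumptions for illustration, and the query Intervu actually runs may differ.

```python
def build_similarity_query(table, query_literal, top_k=5, min_similarity=0.0):
    """Return (sql, params) for a top-K cosine similarity search.

    table is assumed to be an already validated identifier;
    query_literal is the query embedding as pgvector text, e.g. "[0.1,0.2]".
    """
    sql = (
        "SELECT content, 1 - (embedding <=> %s::vector) AS similarity\n"
        f"FROM {table}\n"
        "WHERE 1 - (embedding <=> %s::vector) >= %s\n"
        "ORDER BY embedding <=> %s::vector\n"
        "LIMIT %s;"
    )
    # The query vector is bound three times: select, filter, and ordering.
    params = (query_literal, query_literal, min_similarity, query_literal, top_k)
    return sql, params
```

Ordering by distance ascending is equivalent to ordering by similarity descending, and it is the form that pgvector's approximate indexes can accelerate.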

Chat Window

The chat window uses the same retrieval pipeline:

  1. Your message is embedded
  2. Relevant chunks are fetched from all PGVector sources
  3. Chunks appear as "Loaded vector context" in the processing UI
  4. LLM receives the retrieved context along with your message
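
As a rough illustration of the final step, the retrieved chunks from each source can be grouped and appended to the prompt. The layout below (the section labels, the `---` separator, the "Relevant context" heading) is entirely hypothetical. Intervu's real prompt format is not documented here.

```python
def build_system_prompt(base_prompt, chunks_by_source):
    """Append retrieved chunks, grouped by source, to a base prompt."""
    sections = []
    for source, chunks in chunks_by_source.items():
        if not chunks:
            continue  # skip sources that returned nothing
        body = "\n---\n".join(chunks)
        sections.append(f"[{source}]\n{body}")
    if not sections:
        return base_prompt
    return base_prompt + "\n\nRelevant context:\n\n" + "\n\n".join(sections)
```

Grouping by source lets the model tell a resume excerpt apart from a Q&A bank entry when both are retrieved for the same question.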

Fallback Behavior

If PGVector retrieval returns no results (below similarity threshold or empty table), Intervu falls back to the text content for that source. This ensures answers are always generated even if vector search fails.
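
The fallback rule amounts to a simple per-source decision, sketched below. The function name and the `retrieve` callable are hypothetical, and treating a raised exception the same as an empty result is an assumption that goes slightly beyond what is stated above.

```python
def context_for_source(retrieve, text_content):
    """Return (context, origin) for one source.

    retrieve is a zero-argument callable returning a list of chunk strings.
    """
    try:
        chunks = retrieve()
    except Exception:
        # Assumption: a failed search is handled like an empty result.
        chunks = []
    if chunks:
        return "\n\n".join(chunks), "pgvector"
    # Nothing cleared the threshold (or the table is empty): use the text field.
    return text_content, "text"
```

One consequence worth noting: with a high Min Similarity, the fallback can send the full text content more often, which trades retrieval precision for token usage.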


Best Practices

Table Organization

  • Use separate tables for each source (resume, qa_bank, company_context)
  • Use descriptive table names to avoid confusion
  • Consider separate schemas for different interview prep projects

Chunk Size

  • 1500 characters works well for most documents
  • Smaller chunks = more precise retrieval but more database rows
  • Larger chunks = more context per result but less targeted

Embedding Model Selection

| Model | Dimensions | Best For |
|---|---|---|
| text-embedding-3-small | 1536 | General purpose, OpenAI |
| text-embedding-3-large | 3072 | Higher precision |
| nomic-embed-text | 768 | Local Ollama, efficient |
| all-minilm | 384 | Lightweight, fast |

Similarity Threshold

  • 0.0 — Retrieve anything (may include irrelevant results)
  • 0.5 — Moderate filtering (balanced precision/recall)
  • 0.7+ — High precision (only very similar chunks)

Troubleshooting

Connection Refused

  • Verify PostgreSQL is running
  • Check host/port settings
  • Ensure firewall allows connections

Dimension Mismatch

Error: Table has dimension 768 but embedding produces 1536

Solution: Re-index the table with your current embedding model, or switch to a model matching the table's dimension.

No Results Retrieved

  • Lower the Min Similarity threshold
  • Check that your table has data (Test Connection shows row count)
  • Verify embedding endpoint is working (Test Embedding button)
  • Ensure the query relates to your indexed content

Indexing Fails

  • Check embedding endpoint logs
  • Verify API key if required
  • Ensure the model name is correct
  • Check network connectivity

Example Setup

Quick Start with Docker

```bash
# 1. Start PostgreSQL with pgvector
docker run -d --name pgvector \
  -e POSTGRES_PASSWORD=mypassword \
  -e POSTGRES_DB=intervu \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# 2. In Intervu Settings, configure embedding:
#    Endpoint: http://localhost:11434/api/embeddings (Ollama)
#    Model: nomic-embed-text

# 3. For each source (Resume, Q&A Bank, Company):
#    - Toggle to PGVector mode
#    - Host: localhost, Port: 5432, Database: intervu
#    - User: postgres, Password: mypassword
#    - Table: resume_vectors (unique per source)
#    - Click "Test Connection"

# 4. Paste your content in the text field

# 5. Click "Index from text"

# 6. Start your interview; context will be retrieved automatically
```

Security Considerations

  • Database credentials are stored locally in plain text
  • Use strong passwords and restricted database users
  • Consider SSL for remote databases
  • Embedding API keys are also stored locally

Made with ❤️ by Aldrick Bonaobra