<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Untitled Publication]]></title><description><![CDATA[Untitled Publication]]></description><link>https://blogs.ummerfarooq.dev</link><generator>RSS for Node</generator><lastBuildDate>Sun, 26 Apr 2026 11:15:12 GMT</lastBuildDate><atom:link href="https://blogs.ummerfarooq.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Qdrant 101]]></title><description><![CDATA[Introduction
Qdrant is an open-source vector database designed for storing, searching, and managing high-dimensional vectors. It's particularly useful for AI applications like semantic search, recommendation systems, and RAG (Retrieval-Augmented Gene...]]></description><link>https://blogs.ummerfarooq.dev/qdrant-101</link><guid isPermaLink="true">https://blogs.ummerfarooq.dev/qdrant-101</guid><category><![CDATA[qdrant]]></category><category><![CDATA[vector database]]></category><dc:creator><![CDATA[Ummer Farooq]]></dc:creator><pubDate>Thu, 21 Aug 2025 06:41:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755758453536/5b35ce87-bd10-465d-bce5-74c7e42388c6.png" length="0" type="image/png"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Qdrant is an open-source vector database designed for storing, searching, and managing high-dimensional vectors. It's particularly useful for AI applications like semantic search, recommendation systems, and RAG (Retrieval-Augmented Generation) implementations.</p>
<h2 id="heading-why-qdrant">Why Qdrant?</h2>
<ul>
<li><p><strong>High Performance</strong>: Written in Rust for optimal speed</p>
</li>
<li><p><strong>Scalability</strong>: Handles millions of vectors efficiently</p>
</li>
<li><p><strong>Rich Filtering</strong>: Complex metadata filtering capabilities</p>
</li>
<li><p><strong>Multiple Metrics</strong>: Support for various similarity metrics</p>
</li>
<li><p><strong>Easy Integration</strong>: REST API and multiple language clients</p>
</li>
</ul>
<h2 id="heading-key-terminologies">Key Terminologies</h2>
<h3 id="heading-core-concepts">Core Concepts</h3>
<p><strong>Vector</strong>: A high-dimensional numerical representation of data (text, images, etc.)</p>
<pre><code class="lang-python"><span class="hljs-comment"># Example vector (384 dimensions)</span>
vector = [<span class="hljs-number">0.1</span>, <span class="hljs-number">-0.2</span>, <span class="hljs-number">0.3</span>, ..., <span class="hljs-number">0.5</span>]
</code></pre>
<p><strong>Embedding</strong>: The process of converting raw data into vector representations</p>
<pre><code class="lang-python"><span class="hljs-comment"># Text to vector embedding</span>
text = <span class="hljs-string">"Artificial intelligence is transforming healthcare"</span>
<span class="hljs-comment"># After embedding: [0.023, -0.156, 0.891, ...]</span>
</code></pre>
<p><strong>Point</strong>: A vector combined with an optional payload (metadata) and unique ID</p>
<pre><code class="lang-python">point = {
    <span class="hljs-string">"id"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-string">"vector"</span>: [<span class="hljs-number">0.1</span>, <span class="hljs-number">0.2</span>, <span class="hljs-number">0.3</span>, ...],  <span class="hljs-comment"># 1536-dim vector</span>
    <span class="hljs-string">"payload"</span>: {
        <span class="hljs-string">"title"</span>: <span class="hljs-string">"AI in Medical Diagnosis"</span>, 
        <span class="hljs-string">"category"</span>: <span class="hljs-string">"healthcare"</span>,
        <span class="hljs-string">"confidence"</span>: <span class="hljs-number">0.95</span>
    }
}
</code></pre>
<p><strong>Collection</strong>: A named set of vectors with the same dimensionality and distance metric</p>
<pre><code class="lang-python"><span class="hljs-comment"># Collections are like tables in traditional databases</span>
collection_name = <span class="hljs-string">"documents"</span>
</code></pre>
<p><strong>Payload</strong>: Metadata associated with a vector for filtering and additional information</p>
<pre><code class="lang-python">payload = {
    <span class="hljs-string">"title"</span>: <span class="hljs-string">"AI in Healthcare"</span>,
    <span class="hljs-string">"author"</span>: <span class="hljs-string">"John Doe"</span>,
    <span class="hljs-string">"published_date"</span>: <span class="hljs-string">"2024-01-15"</span>,
    <span class="hljs-string">"tags"</span>: [<span class="hljs-string">"AI"</span>, <span class="hljs-string">"healthcare"</span>, <span class="hljs-string">"machine learning"</span>]
}
</code></pre>
<p><strong>Distance Metric</strong>: Method to measure similarity between vectors</p>
<ul>
<li><p><strong>Cosine Distance</strong>: Measures angle between vectors</p>
<ul>
<li><p>Range: 0 (identical) to 2 (opposite)</p>
</li>
<li><p>Best for: Text embeddings, when magnitude doesn't matter</p>
</li>
<li><p>Use case: Document similarity, semantic search</p>
</li>
</ul>
</li>
<li><p><strong>Euclidean Distance</strong>: Measures straight-line distance in space</p>
<ul>
<li><p>Range: 0 (identical) to ∞</p>
</li>
<li><p>Best for: When both direction and magnitude matter</p>
</li>
<li><p>Use case: Image features, spatial data</p>
</li>
</ul>
</li>
<li><p><strong>Dot Product</strong>: Measures alignment and magnitude</p>
<ul>
<li><p>Range: -∞ to +∞ (higher = more similar)</p>
</li>
<li><p>Best for: When you want to consider vector magnitude</p>
</li>
<li><p>Use case: Recommendation systems with user preferences</p>
</li>
</ul>
</li>
</ul>
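<p>To make these trade-offs concrete, here is a small standalone sketch (plain <code>numpy</code>, no Qdrant involved; the vectors are made up) comparing the three metrics on two vectors that point in the same direction but differ in magnitude:</p>
<pre><code class="lang-python">import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction, twice the magnitude

# Cosine distance: 1 - cos(angle); magnitude is ignored
cosine_dist = 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Euclidean distance: straight-line distance; magnitude matters
euclidean_dist = np.linalg.norm(a - b)

# Dot product: rewards both alignment and magnitude
dot = np.dot(a, b)

print(cosine_dist)     # ~0, "identical" by direction
print(euclidean_dist)  # ~3.742, clearly separated in space
print(dot)             # 28.0
</code></pre>
<p>Cosine sees the two vectors as the same document said twice as loudly; Euclidean distance and dot product both register the difference. The right metric therefore depends on whether magnitude carries meaning in your embeddings.</p>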
<hr />
<h2 id="heading-advanced-concepts">Advanced Concepts</h2>
<p><strong>Quantization</strong>: Technique to reduce memory usage by compressing vectors</p>
<ul>
<li><p><strong>Why</strong>: Reduces memory footprint by 2-8x, enables larger datasets</p>
</li>
<li><p><strong>Trade-off</strong>: Slight accuracy loss for significant memory savings</p>
</li>
<li><p><strong>Types</strong>: Scalar (INT8), Product (PQ), Binary</p>
</li>
</ul>
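<p>The memory claims above are easy to sanity-check with back-of-envelope arithmetic (the corpus size here is illustrative, and index/payload overhead is ignored):</p>
<pre><code class="lang-python"># Raw vector storage only
num_vectors = 1_000_000
dims = 1536  # OpenAI ada-002

float32_bytes = num_vectors * dims * 4  # 4 bytes per float32 component
int8_bytes = num_vectors * dims * 1     # 1 byte per component after INT8 scalar quantization

print(f"float32: {float32_bytes / 1024**3:.2f} GiB")  # 5.72 GiB
print(f"int8:    {int8_bytes / 1024**3:.2f} GiB")     # 1.43 GiB
</code></pre>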
<p><strong>Indexing</strong>: Data structure optimization for faster search</p>
<ul>
<li><p><strong>HNSW</strong>: Hierarchical Navigable Small World graphs</p>
</li>
<li><p><strong>Purpose</strong>: Trade memory for search speed</p>
</li>
<li><p><strong>Configuration</strong>: Affects build time, memory usage, and search accuracy</p>
</li>
</ul>
<p><strong>Sharding</strong>: Distributing data across multiple nodes</p>
<ul>
<li><p><strong>Horizontal scaling</strong>: Split collection across multiple machines</p>
</li>
<li><p><strong>Load balancing</strong>: Distribute query load</p>
</li>
<li><p><strong>Fault tolerance</strong>: Replicas ensure availability</p>
</li>
</ul>
<p><strong>Replication</strong>: Creating copies for fault tolerance</p>
<ul>
<li><p><strong>Data safety</strong>: Multiple copies prevent data loss</p>
</li>
<li><p><strong>Read scaling</strong>: Distribute read queries across replicas</p>
</li>
<li><p><strong>Consistency</strong>: Strong or eventual consistency models</p>
</li>
</ul>
<hr />
<h2 id="heading-installation-amp-setup">Installation &amp; Setup</h2>
<h3 id="heading-docker-installation-recommended">Docker Installation (Recommended)</h3>
<p><strong>Basic Setup</strong></p>
<pre><code class="lang-bash"><span class="hljs-comment"># Basic setup</span>
docker run -p 6333:6333 qdrant/qdrant

<span class="hljs-comment"># With persistent storage</span>
docker run -p 6333:6333 -v $(<span class="hljs-built_in">pwd</span>)/storage:/qdrant/storage qdrant/qdrant
</code></pre>
<p><strong>Docker Compose for Production</strong></p>
<pre><code class="lang-yaml"><span class="hljs-comment"># docker-compose.yml</span>
version: <span class="hljs-string">'3.7'</span>
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - <span class="hljs-string">"6333:6333"</span>
    volumes:
      - qdrant_storage:/qdrant/storage
    environment:
      - QDRANT__SERVICE__HTTP_PORT=6333
      - QDRANT__STORAGE__STORAGE_PATH=/qdrant/storage
    restart: unless-stopped

volumes:
  qdrant_storage:
</code></pre>
<h3 id="heading-python-client-installation">Python Client Installation</h3>
<pre><code class="lang-bash">pip install qdrant-client openai python-dotenv
</code></pre>
<h3 id="heading-basic-connection-setup">Basic Connection Setup</h3>
<pre><code class="lang-python">import os
from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

# Setup (openai&gt;=1.0 client interface)
# Set OPENAI_API_KEY in your environment, or load it from a .env file with python-dotenv
openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Connect to Qdrant
client = QdrantClient("localhost", port=6333)

# Generate an OpenAI embedding
def get_embedding(text):
    response = openai_client.embeddings.create(
        input=text,
        model="text-embedding-ada-002"
    )
    return response.data[0].embedding

# Test
test_embedding = get_embedding("Hello world")
print(f"Embedding dimension: {len(test_embedding)}")  # Should be 1536
</code></pre>
<hr />
<h2 id="heading-core-concepts-1">Core Concepts</h2>
<h3 id="heading-vector-similarity">Vector Similarity</h3>
<p>Vector similarity is the foundation of semantic search. Let's explore how it works:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">cosine_similarity</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-keyword">return</span> np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

<span class="hljs-comment"># Similar medical texts</span>
text1 = <span class="hljs-string">"AI is revolutionizing medical diagnosis"</span>
text2 = <span class="hljs-string">"Artificial intelligence transforms healthcare"</span>
text3 = <span class="hljs-string">"The weather is sunny today"</span>

vec1 = get_embedding(text1)
vec2 = get_embedding(text2)
vec3 = get_embedding(text3)

print(<span class="hljs-string">f"Medical similarity: <span class="hljs-subst">{cosine_similarity(vec1, vec2):<span class="hljs-number">.3</span>f}</span>"</span>)  <span class="hljs-comment"># High</span>
print(<span class="hljs-string">f"Different topic: <span class="hljs-subst">{cosine_similarity(vec1, vec3):<span class="hljs-number">.3</span>f}</span>"</span>)    <span class="hljs-comment"># Low</span>
</code></pre>
<hr />
<h2 id="heading-basic-operations">Basic Operations</h2>
<h3 id="heading-1-create-collection">1. Create Collection</h3>
<pre><code class="lang-python"><span class="hljs-comment"># Create collection for OpenAI embeddings</span>
client.create_collection(
    collection_name=<span class="hljs-string">"documents"</span>,
    vectors_config=VectorParams(
        size=<span class="hljs-number">1536</span>,  <span class="hljs-comment"># OpenAI ada-002 dimension</span>
        distance=Distance.COSINE
    )
)
</code></pre>
<h3 id="heading-2-insert-documents">2. Insert Documents</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client.models <span class="hljs-keyword">import</span> PointStruct
<span class="hljs-keyword">import</span> uuid

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">insert_document</span>(<span class="hljs-params">title, content, category</span>):</span>
    <span class="hljs-comment"># Generate embedding</span>
    text = <span class="hljs-string">f"<span class="hljs-subst">{title}</span>. <span class="hljs-subst">{content}</span>"</span>
    vector = get_embedding(text)

    <span class="hljs-comment"># Create point</span>
    point = PointStruct(
        id=str(uuid.uuid4()),
        vector=vector,
        payload={
            <span class="hljs-string">"title"</span>: title,
            <span class="hljs-string">"content"</span>: content,
            <span class="hljs-string">"category"</span>: category,
            <span class="hljs-string">"word_count"</span>: len(content.split())
        }
    )

    <span class="hljs-comment"># Insert</span>
    client.upsert(
        collection_name=<span class="hljs-string">"documents"</span>,
        points=[point]
    )
    <span class="hljs-keyword">return</span> point.id

<span class="hljs-comment"># Example usage</span>
doc_id = insert_document(
    <span class="hljs-string">"AI in Healthcare"</span>, 
    <span class="hljs-string">"Machine learning is transforming medical diagnosis..."</span>,
    <span class="hljs-string">"technology"</span>
)
</code></pre>
<h3 id="heading-3-advanced-search-with-openai-embeddings">3. Advanced Search with OpenAI Embeddings</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">search_documents</span>(<span class="hljs-params">query, limit=<span class="hljs-number">5</span></span>):</span>
    <span class="hljs-comment"># Generate query embedding</span>
    query_vector = get_embedding(query)

    <span class="hljs-comment"># Search</span>
    results = client.search(
        collection_name=<span class="hljs-string">"documents"</span>,
        query_vector=query_vector,
        limit=limit,
        with_payload=<span class="hljs-literal">True</span>
    )

    <span class="hljs-comment"># Format results</span>
    <span class="hljs-keyword">return</span> [{
        <span class="hljs-string">"title"</span>: r.payload[<span class="hljs-string">"title"</span>],
        <span class="hljs-string">"score"</span>: r.score,
        <span class="hljs-string">"category"</span>: r.payload[<span class="hljs-string">"category"</span>]
    } <span class="hljs-keyword">for</span> r <span class="hljs-keyword">in</span> results]

<span class="hljs-comment"># Search example</span>
results = search_documents(<span class="hljs-string">"artificial intelligence medicine"</span>)
<span class="hljs-keyword">for</span> result <span class="hljs-keyword">in</span> results:
    print(<span class="hljs-string">f"<span class="hljs-subst">{result[<span class="hljs-string">'title'</span>]}</span> - Score: <span class="hljs-subst">{result[<span class="hljs-string">'score'</span>]:<span class="hljs-number">.3</span>f}</span>"</span>)
</code></pre>
<h3 id="heading-4-filter-search">4. Filter Search</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client.models <span class="hljs-keyword">import</span> Filter, FieldCondition, MatchValue

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">search_with_filter</span>(<span class="hljs-params">query, category, limit=<span class="hljs-number">5</span></span>):</span>
    query_vector = get_embedding(query)

    category_filter = Filter(
        must=[
            FieldCondition(
                key=<span class="hljs-string">"category"</span>,
                match=MatchValue(value=category)
            )
        ]
    )

    results = client.search(
        collection_name=<span class="hljs-string">"documents"</span>,
        query_vector=query_vector,
        query_filter=category_filter,
        limit=limit
    )

    <span class="hljs-keyword">return</span> results

<span class="hljs-comment"># Search only in technology category</span>
tech_results = search_with_filter(<span class="hljs-string">"machine learning"</span>, <span class="hljs-string">"technology"</span>)
</code></pre>
<hr />
<h2 id="heading-performance-optimization">Performance Optimization</h2>
<h3 id="heading-quantization">Quantization</h3>
<p><strong>Why Quantization?</strong> Quantization reduces memory usage by representing vectors with lower precision. This is crucial when dealing with millions of vectors.</p>
<p><strong>Memory Savings:</strong></p>
<ul>
<li><p><strong>Scalar (INT8)</strong>: General purpose, 75% memory reduction, 1-3% accuracy loss</p>
</li>
<li><p><strong>Product</strong>: Maximum compression, 87-95% reduction, 5-15% accuracy loss</p>
</li>
<li><p><strong>Binary</strong>: Extreme compression, 96% reduction, 20-40% accuracy loss</p>
</li>
</ul>
<pre><code class="lang-python">from qdrant_client import models

# Apply INT8 scalar quantization (recommended for most cases)
client.update_collection(
    collection_name="documents",
    quantization_config=models.ScalarQuantization(
        scalar=models.ScalarQuantizationConfig(
            type=models.ScalarType.INT8,
            quantile=0.99,
            always_ram=True
        )
    )
)
</code></pre>
<h3 id="heading-indexing-optimization">Indexing Optimization</h3>
<pre><code class="lang-python"><span class="hljs-comment"># Optimize for search speed</span>
client.update_collection(
    collection_name=<span class="hljs-string">"documents"</span>,
    hnsw_config=models.HnswConfigDiff(
        m=<span class="hljs-number">32</span>,              <span class="hljs-comment"># Higher = better quality, more memory</span>
        ef_construct=<span class="hljs-number">200</span>,  <span class="hljs-comment"># Higher = better quality, slower build</span>
        full_scan_threshold=<span class="hljs-number">10000</span>
    )
)
</code></pre>
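<p>Build-time parameters like <code>ef_construct</code> are fixed per collection, but recall can also be traded for latency per query via <code>hnsw_ef</code> in <code>SearchParams</code>. The sketch below uses an in-memory client and a made-up point so it runs standalone; against a real server you would reuse your existing <code>client</code>:</p>
<pre><code class="lang-python">from qdrant_client import QdrantClient, models
from qdrant_client.models import Distance, VectorParams, PointStruct

# In-memory client so the snippet is self-contained
demo = QdrantClient(":memory:")
demo.create_collection(
    collection_name="demo",
    vectors_config=VectorParams(size=3, distance=Distance.COSINE)
)
demo.upsert(
    collection_name="demo",
    points=[PointStruct(id=1, vector=[0.1, 0.2, 0.3], payload={"title": "doc"})]
)

# hnsw_ef is the search-time beam width: higher = better recall, slower queries
hits = demo.search(
    collection_name="demo",
    query_vector=[0.1, 0.2, 0.3],
    search_params=models.SearchParams(hnsw_ef=128),
    limit=1
)
print(hits[0].id)  # 1
</code></pre>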
<hr />
<h2 id="heading-advanced-operations">Advanced Operations</h2>
<h3 id="heading-complex-filtering">Complex Filtering</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client.models <span class="hljs-keyword">import</span> Range, MatchAny

<span class="hljs-comment"># Complex filter: technology OR healthcare, with substantial content</span>
complex_filter = Filter(
    must=[
        FieldCondition(key=<span class="hljs-string">"category"</span>, match=MatchAny(any=[<span class="hljs-string">"technology"</span>, <span class="hljs-string">"healthcare"</span>])),
        FieldCondition(key=<span class="hljs-string">"word_count"</span>, range=Range(gte=<span class="hljs-number">100</span>))
    ],
    must_not=[
        FieldCondition(key=<span class="hljs-string">"status"</span>, match=MatchValue(value=<span class="hljs-string">"draft"</span>))
    ]
)

results = client.search(
    collection_name=<span class="hljs-string">"documents"</span>,
    query_vector=query_vector,
    query_filter=complex_filter,
    limit=<span class="hljs-number">10</span>
)
</code></pre>
<h3 id="heading-batch-operations">Batch Operations</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">batch_insert_documents</span>(<span class="hljs-params">documents, batch_size=<span class="hljs-number">100</span></span>):</span>
    points = []

    <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> documents:
        text = <span class="hljs-string">f"<span class="hljs-subst">{doc[<span class="hljs-string">'title'</span>]}</span>. <span class="hljs-subst">{doc[<span class="hljs-string">'content'</span>]}</span>"</span>
        vector = get_embedding(text)

        points.append(PointStruct(
            id=str(uuid.uuid4()),
            vector=vector,
            payload=doc
        ))

    <span class="hljs-comment"># Insert in batches</span>
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(<span class="hljs-number">0</span>, len(points), batch_size):
        batch = points[i:i + batch_size]
        client.upsert(collection_name=<span class="hljs-string">"documents"</span>, points=batch)
        print(<span class="hljs-string">f"Inserted batch <span class="hljs-subst">{i//batch_size + <span class="hljs-number">1</span>}</span>"</span>)

<span class="hljs-comment"># Usage</span>
sample_docs = [
    {<span class="hljs-string">"title"</span>: <span class="hljs-string">"Doc 1"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Content 1..."</span>, <span class="hljs-string">"category"</span>: <span class="hljs-string">"tech"</span>},
    {<span class="hljs-string">"title"</span>: <span class="hljs-string">"Doc 2"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">"Content 2..."</span>, <span class="hljs-string">"category"</span>: <span class="hljs-string">"health"</span>}
]
batch_insert_documents(sample_docs)
</code></pre>
<hr />
<h2 id="heading-best-practices">Best Practices</h2>
<h3 id="heading-1-embedding-generation">1. Embedding Generation</h3>
<pre><code class="lang-python">from openai import OpenAI

class EmbeddingManager:
    def __init__(self):
        self.model = "text-embedding-ada-002"
        self.client = OpenAI()  # openai&gt;=1.0 client; reads OPENAI_API_KEY from the environment

    def preprocess_text(self, text):
        # Clean and normalize text
        text = text.strip()
        text = ' '.join(text.split())  # Normalize whitespace

        # Truncate if too long (8191 tokens max for ada-002)
        max_chars = 8191 * 4  # ~4 chars per token
        if len(text) &gt; max_chars:
            text = text[:max_chars]

        return text

    def get_embedding(self, text):
        processed_text = self.preprocess_text(text)

        response = self.client.embeddings.create(
            input=processed_text,
            model=self.model
        )
        return response.data[0].embedding

    def optimize_for_search(self, title, content):
        # Weight the title more heavily by repeating it
        return f"{title}. {title}. {content}"
</code></pre>
<h3 id="heading-2-error-handling">2. Error Handling</h3>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> time
<span class="hljs-keyword">from</span> functools <span class="hljs-keyword">import</span> wraps

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">retry_on_failure</span>(<span class="hljs-params">max_retries=<span class="hljs-number">3</span></span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">decorator</span>(<span class="hljs-params">func</span>):</span>
<span class="hljs-meta">        @wraps(func)</span>
        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">wrapper</span>(<span class="hljs-params">*args, **kwargs</span>):</span>
            <span class="hljs-keyword">for</span> attempt <span class="hljs-keyword">in</span> range(max_retries):
                <span class="hljs-keyword">try</span>:
                    <span class="hljs-keyword">return</span> func(*args, **kwargs)
                <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
                    <span class="hljs-keyword">if</span> attempt == max_retries - <span class="hljs-number">1</span>:
                        <span class="hljs-keyword">raise</span> e
                    wait_time = <span class="hljs-number">2</span> ** attempt
                    print(<span class="hljs-string">f"Attempt <span class="hljs-subst">{attempt + <span class="hljs-number">1</span>}</span> failed, retrying in <span class="hljs-subst">{wait_time}</span>s..."</span>)
                    time.sleep(wait_time)
        <span class="hljs-keyword">return</span> wrapper
    <span class="hljs-keyword">return</span> decorator

<span class="hljs-meta">@retry_on_failure(max_retries=3)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">safe_upsert</span>(<span class="hljs-params">collection_name, points</span>):</span>
    <span class="hljs-keyword">return</span> client.upsert(collection_name=collection_name, points=points)
</code></pre>
<h3 id="heading-3-collection-design">3. Collection Design</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_optimized_collection</span>(<span class="hljs-params">name, expected_size</span>):</span>
    <span class="hljs-keyword">if</span> expected_size &lt; <span class="hljs-number">10000</span>:
        <span class="hljs-comment"># Small collection - no quantization needed</span>
        config = VectorParams(size=<span class="hljs-number">1536</span>, distance=Distance.COSINE)
        quantization = <span class="hljs-literal">None</span>
    <span class="hljs-keyword">elif</span> expected_size &lt; <span class="hljs-number">1000000</span>:
        <span class="hljs-comment"># Medium collection - use scalar quantization</span>
        config = VectorParams(size=<span class="hljs-number">1536</span>, distance=Distance.COSINE)
        quantization = ScalarQuantization(
            scalar=models.ScalarQuantizationConfig(
                type=models.ScalarType.INT8,
                quantile=<span class="hljs-number">0.99</span>
            )
        )
    <span class="hljs-keyword">else</span>:
        <span class="hljs-comment"># Large collection - aggressive optimization</span>
        config = VectorParams(
            size=<span class="hljs-number">1536</span>, 
            distance=Distance.COSINE,
            hnsw_config=models.HnswConfigDiff(m=<span class="hljs-number">16</span>, ef_construct=<span class="hljs-number">100</span>)
        )
        quantization = models.ProductQuantization(
            product=models.ProductQuantizationConfig(
                compression=models.CompressionRatio.X8
            )
        )

    client.create_collection(
        collection_name=name,
        vectors_config=config,
        quantization_config=quantization
    )
</code></pre>
<hr />
<h2 id="heading-production-considerations">Production Considerations</h2>
<h3 id="heading-health-monitoring">Health Monitoring</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">health_check</span>():</span>
    <span class="hljs-keyword">try</span>:
        collections = client.get_collections()

        <span class="hljs-keyword">for</span> collection <span class="hljs-keyword">in</span> collections.collections:
            info = client.get_collection(collection.name)
            print(<span class="hljs-string">f"Collection: <span class="hljs-subst">{collection.name}</span>"</span>)
            print(<span class="hljs-string">f"  Points: <span class="hljs-subst">{info.points_count}</span>"</span>)
            print(<span class="hljs-string">f"  Indexed: <span class="hljs-subst">{info.indexed_vectors_count}</span>"</span>)

            <span class="hljs-keyword">if</span> info.vectors_count != info.indexed_vectors_count:
                print(<span class="hljs-string">f"  Warning: Indexing behind!"</span>)

        <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        print(<span class="hljs-string">f"Health check failed: <span class="hljs-subst">{e}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
</code></pre>
<h3 id="heading-backup-strategy">Backup Strategy</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">backup_collection</span>(<span class="hljs-params">collection_name, backup_file</span>):</span>
    all_points = []
    offset = <span class="hljs-literal">None</span>

    <span class="hljs-keyword">while</span> <span class="hljs-literal">True</span>:
        result = client.scroll(
            collection_name=collection_name,
            offset=offset,
            limit=<span class="hljs-number">1000</span>,
            with_payload=<span class="hljs-literal">True</span>,
            with_vectors=<span class="hljs-literal">True</span>
        )

        points, next_offset = result
        all_points.extend([{
            <span class="hljs-string">"id"</span>: p.id,
            <span class="hljs-string">"vector"</span>: p.vector,
            <span class="hljs-string">"payload"</span>: p.payload
        } <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> points])

        <span class="hljs-keyword">if</span> next_offset <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
            <span class="hljs-keyword">break</span>
        offset = next_offset

    <span class="hljs-keyword">import</span> json
    <span class="hljs-keyword">with</span> open(backup_file, <span class="hljs-string">'w'</span>) <span class="hljs-keyword">as</span> f:
        json.dump(all_points, f)

    print(<span class="hljs-string">f"Backed up <span class="hljs-subst">{len(all_points)}</span> points"</span>)
</code></pre>
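<p>A matching restore helper is just the reverse: read the dump and re-upsert in batches. This sketch assumes the target collection already exists with the right vector size, and reuses the <code>client</code> from the setup section:</p>
<pre><code class="lang-python">import json

from qdrant_client.models import PointStruct

def restore_collection(collection_name, backup_file, batch_size=1000):
    with open(backup_file) as f:
        records = json.load(f)

    points = [
        PointStruct(id=r["id"], vector=r["vector"], payload=r["payload"])
        for r in records
    ]

    # Re-insert in batches to keep request sizes bounded
    for i in range(0, len(points), batch_size):
        client.upsert(collection_name=collection_name, points=points[i:i + batch_size])

    print(f"Restored {len(points)} points")
</code></pre>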
]]></content:encoded></item><item><title><![CDATA[Why Dense and Sparse Vectors Work Better Together: A Beginner's Guide to Multi-Vector Collections]]></title><description><![CDATA[Imagine you're looking for a book in a massive library. You might search by the exact title (keyword search) or describe what the book is about (semantic search). Sometimes you need both approaches to find exactly what you're looking for. This is pre...]]></description><link>https://blogs.ummerfarooq.dev/why-dense-and-sparse-vectors-work-better-together-a-beginners-guide-to-multi-vector-collections</link><guid isPermaLink="true">https://blogs.ummerfarooq.dev/why-dense-and-sparse-vectors-work-better-together-a-beginners-guide-to-multi-vector-collections</guid><category><![CDATA[qdrant]]></category><category><![CDATA[embedding]]></category><category><![CDATA[similarity search]]></category><category><![CDATA[semantic search]]></category><dc:creator><![CDATA[Ummer Farooq]]></dc:creator><pubDate>Thu, 14 Aug 2025 12:22:05 GMT</pubDate><content:encoded><![CDATA[<p>Imagine you're looking for a book in a massive library. You might search by the exact title (keyword search) or describe what the book is about (semantic search). Sometimes you need both approaches to find exactly what you're looking for. This is precisely why we use multi-vector collections in vector databases!</p>
<h2 id="heading-the-problem-with-the-single-vector-approach">The Problem with the Single Vector Approach</h2>
<p>Let's start with a real-world example to understand the limitation:</p>
<p><strong>Document</strong>: "The neural network model exhibits overfitting behaviour on the training dataset"</p>
<p><strong>User Query 1</strong>: "overfitting neural network"<br /><strong>User Query 2</strong>: "AI model performing poorly on training data"</p>
<p>With a single dense vector approach:</p>
<ul>
<li><p>Query 1 might not match well because dense vectors focus on the overall meaning</p>
</li>
<li><p>Query 2 might miss the document because it doesn't contain exact terms like "AI" or "performing poorly"</p>
</li>
</ul>
<p>This is where the magic of combining different vector types comes in!</p>
<h2 id="heading-what-are-multi-vector-collections">What Are Multi-Vector Collections?</h2>
<p>Multi-vector collections allow you to store multiple different representations of the same content within a single collection. Think of it as having different "lenses" through which you can view and search your data:</p>
<ul>
<li><p><strong>Dense Vectors</strong>: Understand meaning and context</p>
</li>
<li><p><strong>Sparse Vectors</strong>: Focus on exact keywords and terms</p>
</li>
<li><p><strong>Hybrid Approach</strong>: Combines both for superior search results</p>
</li>
</ul>
<pre><code class="lang-python"><span class="hljs-comment"># Example of multi-vector structure</span>
document_vectors = {
    <span class="hljs-string">"semantic"</span>: [<span class="hljs-number">0.123</span>, <span class="hljs-number">-0.456</span>, <span class="hljs-number">0.789</span>, ...],      <span class="hljs-comment"># Dense vector (1536 dims)</span>
    <span class="hljs-string">"keywords"</span>: [<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.67</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0.45</span>, <span class="hljs-number">0</span>, ...],    <span class="hljs-comment"># Sparse vector (5000 dims)</span>
    <span class="hljs-string">"metadata"</span>: {<span class="hljs-string">"title"</span>: <span class="hljs-string">"ML Paper"</span>, <span class="hljs-string">"author"</span>: <span class="hljs-string">"John Doe"</span>}
}
</code></pre>
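<p>In practice, a mostly-zero list like the <code>"keywords"</code> vector above is rarely stored as a full array; it is kept as index/value pairs. A minimal, dependency-free sketch of that conversion (the function name is illustrative):</p>
<pre><code class="lang-python">def to_sparse(dense):
    """Convert a mostly-zero dense list into (indices, values) pairs."""
    indices = [i for i, v in enumerate(dense) if v != 0]
    values = [dense[i] for i in indices]
    return indices, values

# Only the non-zero keyword weights survive
print(to_sparse([0, 0, 0.67, 0, 0.45, 0]))  # ([2, 4], [0.67, 0.45])
</code></pre>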
<h2 id="heading-why-do-we-need-a-multi-vector-approach">Why Do We Need a Multi-Vector Approach?</h2>
<h3 id="heading-1-complementary-strengths">1. <strong>Complementary Strengths</strong></h3>
<p><strong>Dense Vectors Excel At:</strong></p>
<ul>
<li><p>Understanding synonyms ("car" ≈ "automobile")</p>
</li>
<li><p>Capturing context and meaning</p>
</li>
<li><p>Finding conceptually similar content</p>
</li>
<li><p>Handling paraphrases and different ways of expressing ideas</p>
</li>
</ul>
<p><strong>Sparse Vectors Excel At:</strong></p>
<ul>
<li><p>Exact keyword matching</p>
</li>
<li><p>Finding specific terms or phrases</p>
</li>
<li><p>Technical terminology searches</p>
</li>
<li><p>Proper nouns and unique identifiers</p>
</li>
</ul>
<h3 id="heading-2-real-world-search-scenarios">2. <strong>Real-World Search Scenarios</strong></h3>
<p>Consider an e-commerce product search:</p>
<pre><code class="lang-plaintext"># Product: "Apple MacBook Pro 16-inch M2 laptop computer"

# User searches: "16 inch Apple laptop"
# Dense vector: Understands "laptop" ≈ "computer" ≈ "MacBook"
# Sparse vector: Matches exact terms "16", "inch", "Apple", "laptop"
# Combined: Perfect match!

# User searches: "portable workstation for developers"  
# Dense vector: Connects "portable workstation" with "laptop computer"
# Sparse vector: Might miss due to different terminology
# Combined: Dense carries the weight, sparse provides precision
</code></pre>
<h3 id="heading-3-improved-retrieval-quality">3. <strong>Improved Retrieval Quality</strong></h3>
<p>In published benchmarks, hybrid search (dense + sparse) typically achieves:</p>
<ul>
<li><p><strong>20-40% better recall</strong> than dense alone</p>
</li>
<li><p><strong>15-30% better precision</strong> than sparse alone</p>
</li>
<li><p>More robust results across different query types</p>
</li>
</ul>
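<p>A popular way to merge the two ranked lists without tuning score weights is Reciprocal Rank Fusion (RRF), which combines results using only each document's rank. A minimal sketch (k=60 is the commonly used default constant):</p>
<pre><code class="lang-python">def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of doc ids by summing 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_hits = ["doc_a", "doc_b", "doc_c"]   # ranked by semantic similarity
sparse_hits = ["doc_b", "doc_d", "doc_a"]  # ranked by keyword overlap
print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']
</code></pre>
<p>Documents that appear high in both lists float to the top, even though the raw cosine and keyword scores are never compared directly.</p>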
<hr />
<h2 id="heading-types-of-sparse-vector-creation-methods">Types of Sparse Vector Creation Methods</h2>
<h3 id="heading-1-tf-idf-term-frequency-inverse-document-frequency">1. <strong>TF-IDF (Term Frequency-Inverse Document Frequency)</strong></h3>
<p>The classic statistical approach that weighs terms by their importance.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> sklearn.feature_extraction.text <span class="hljs-keyword">import</span> TfidfVectorizer

<span class="hljs-comment"># Simple TF-IDF example</span>
documents = [
    <span class="hljs-string">"machine learning algorithms"</span>,
    <span class="hljs-string">"deep learning neural networks"</span>, 
    <span class="hljs-string">"artificial intelligence applications"</span>
]

vectorizer = TfidfVectorizer(max_features=<span class="hljs-number">1000</span>)
tfidf_matrix = vectorizer.fit_transform(documents)

<span class="hljs-comment"># For new document</span>
new_doc = <span class="hljs-string">"machine learning models"</span>
sparse_vector = vectorizer.transform([new_doc]).toarray()[<span class="hljs-number">0</span>]
print(<span class="hljs-string">f"Sparse vector shape: <span class="hljs-subst">{sparse_vector.shape}</span>"</span>)  <span class="hljs-comment"># (9,) here; max_features only caps larger vocabularies</span>
print(<span class="hljs-string">f"Non-zero elements: <span class="hljs-subst">{(sparse_vector != <span class="hljs-number">0</span>).sum()}</span>"</span>)  <span class="hljs-comment"># only a few are non-zero</span>
</code></pre>
<p><strong>When to use TF-IDF:</strong></p>
<ul>
<li><p>General-purpose keyword matching</p>
</li>
<li><p>When you have a well-defined vocabulary</p>
</li>
<li><p>Documents with clear term boundaries</p>
</li>
</ul>
<h3 id="heading-2-bm25-best-matching-25">2. <strong>BM25 (Best Matching 25)</strong></h3>
<p>An improved version of TF-IDF that handles document length better.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> rank_bm25 <span class="hljs-keyword">import</span> BM25Okapi
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

<span class="hljs-comment"># BM25 implementation</span>
documents = [
    <span class="hljs-string">"machine learning algorithms for data science"</span>,
    <span class="hljs-string">"deep learning and neural network architectures"</span>,
    <span class="hljs-string">"natural language processing with transformers"</span>
]

<span class="hljs-comment"># Tokenize documents</span>
tokenized_docs = [doc.split() <span class="hljs-keyword">for</span> doc <span class="hljs-keyword">in</span> documents]
bm25 = BM25Okapi(tokenized_docs)

<span class="hljs-comment"># Create sparse vector for query</span>
query = <span class="hljs-string">"machine learning data"</span>
query_tokens = query.split()

<span class="hljs-comment"># Get BM25 scores for all documents</span>
scores = bm25.get_scores(query_tokens)
print(<span class="hljs-string">f"BM25 scores: <span class="hljs-subst">{scores}</span>"</span>)

<span class="hljs-comment"># Convert a query to a sparse vector using the model's learned IDF weights</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_bm25_sparse_vector</span>(<span class="hljs-params">query_tokens, bm25_model, vocab</span>):</span>
    <span class="hljs-comment"># vocab maps each term to a fixed dimension index, e.g. {"machine": 0, ...}</span>
    sparse_vector = np.zeros(len(vocab))
    <span class="hljs-keyword">for</span> token <span class="hljs-keyword">in</span> query_tokens:
        <span class="hljs-keyword">if</span> token <span class="hljs-keyword">in</span> vocab:
            sparse_vector[vocab[token]] = bm25_model.idf.get(token, <span class="hljs-number">0.0</span>)
    <span class="hljs-keyword">return</span> sparse_vector
</code></pre>
<p><strong>When to use BM25:</strong></p>
<ul>
<li><p>Document retrieval systems</p>
</li>
<li><p>When document length varies significantly</p>
</li>
<li><p>Search engines and information retrieval</p>
</li>
</ul>
<h3 id="heading-3-splade-sparse-lexical-and-expansion">3. <strong>SPLADE (Sparse Lexical and Expansion)</strong></h3>
<p>A neural approach that learns to create sparse vectors.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoTokenizer, AutoModelForMaskedLM
<span class="hljs-keyword">import</span> torch
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

<span class="hljs-comment"># SPLADE creates learned sparse vectors</span>
model_name = <span class="hljs-string">"naver/splade-cocondenser-ensembledistil"</span>
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_splade_vector</span>(<span class="hljs-params">text</span>):</span>
    inputs = tokenizer(text, return_tensors=<span class="hljs-string">"pt"</span>, truncation=<span class="hljs-literal">True</span>, max_length=<span class="hljs-number">512</span>)

    <span class="hljs-keyword">with</span> torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits

    <span class="hljs-comment"># SPLADE aggregation: log-saturated ReLU, max-pooled over the token dimension</span>
    sparse_vector = torch.max(torch.log1p(torch.relu(logits)), dim=<span class="hljs-number">1</span>).values.squeeze().cpu().numpy()

    <span class="hljs-comment"># Keep only top-k dimensions (for sparsity)</span>
    top_k = <span class="hljs-number">100</span>
    top_indices = np.argsort(sparse_vector)[-top_k:]
    final_sparse = np.zeros_like(sparse_vector)
    final_sparse[top_indices] = sparse_vector[top_indices]

    <span class="hljs-keyword">return</span> final_sparse

<span class="hljs-comment"># Usage</span>
text = <span class="hljs-string">"machine learning model optimization"</span>
splade_vector = create_splade_vector(text)
print(<span class="hljs-string">f"SPLADE vector sparsity: <span class="hljs-subst">{(splade_vector == <span class="hljs-number">0</span>).sum() / len(splade_vector):<span class="hljs-number">.2</span>%}</span>"</span>)
</code></pre>
<p><strong>When to use SPLADE:</strong></p>
<ul>
<li><p>When you need learned sparse representations</p>
</li>
<li><p>Complex domain-specific terminology</p>
</li>
<li><p>When you can afford the computational cost</p>
</li>
</ul>
<h3 id="heading-4-colbert-contextualized-late-interaction">4. <strong>ColBERT (Contextualized Late Interaction)</strong></h3>
<p>Creates multiple vectors per document for fine-grained matching.</p>
<pre><code class="lang-python"><span class="hljs-comment"># ColBERT conceptual approach</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">colbert_sparse_simulation</span>(<span class="hljs-params">text, model</span>):</span>
    <span class="hljs-comment"># ColBERT creates a vector for each token</span>
    tokens = text.split()
    token_vectors = []

    <span class="hljs-keyword">for</span> token <span class="hljs-keyword">in</span> tokens:
        <span class="hljs-comment"># Each token gets its own contextualized vector</span>
        token_embedding = model.encode(token)  <span class="hljs-comment"># Simplified</span>
        token_vectors.append(token_embedding)

    <span class="hljs-keyword">return</span> token_vectors

<span class="hljs-comment"># This creates multiple sparse-like representations</span>
<span class="hljs-comment"># that can be stored and searched efficiently</span>
</code></pre>
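<p>At query time, ColBERT scores documents with "late interaction": every query-token vector is compared against every document-token vector, and the best match per query token is summed (the MaxSim operator). A toy sketch with hand-made 2-dimensional embeddings:</p>
<pre><code class="lang-python">def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_vectors, doc_vectors):
    """Sum, over query tokens, of the best match among document tokens."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)

query_vecs = [[1.0, 0.0], [0.0, 1.0]]            # two query tokens
doc_vecs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # three document tokens
score = maxsim_score(query_vecs, doc_vecs)       # best matches: 0.9 and 0.8
</code></pre>
<p>Because each query token independently picks its best document token, MaxSim keeps token-level precision while still using contextual embeddings.</p>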
<hr />
<h2 id="heading-practical-implementation-with-qdrant">Practical Implementation with Qdrant</h2>
<p>Here's how to implement multi-vector collections in your setup:</p>
<h3 id="heading-1-collection-setup">1. <strong>Collection Setup</strong></h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> qdrant_client <span class="hljs-keyword">import</span> QdrantClient
<span class="hljs-keyword">from</span> qdrant_client.models <span class="hljs-keyword">import</span> VectorParams, Distance

client = QdrantClient(<span class="hljs-string">"localhost"</span>, port=<span class="hljs-number">6333</span>)

<span class="hljs-comment"># Create collection with multiple vectors</span>
client.create_collection(
    collection_name=<span class="hljs-string">"hybrid_search"</span>,
    vectors_config={
        <span class="hljs-string">"semantic"</span>: VectorParams(size=<span class="hljs-number">1536</span>, distance=Distance.COSINE),  <span class="hljs-comment"># OpenAI</span>
        <span class="hljs-string">"keywords"</span>: VectorParams(size=<span class="hljs-number">5000</span>, distance=Distance.DOT),     <span class="hljs-comment"># TF-IDF</span>
        <span class="hljs-string">"bm25"</span>: VectorParams(size=<span class="hljs-number">3000</span>, distance=Distance.DOT),         <span class="hljs-comment"># BM25</span>
    }
)
</code></pre>
<h3 id="heading-2-document-processing-pipeline">2. <strong>Document Processing Pipeline</strong></h3>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> time
<span class="hljs-keyword">import</span> openai
<span class="hljs-keyword">from</span> sklearn.feature_extraction.text <span class="hljs-keyword">import</span> TfidfVectorizer

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">MultiVectorProcessor</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        self.tfidf = TfidfVectorizer(max_features=<span class="hljs-number">5000</span>)
        self.openai_client = openai.Client()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_dense_vector</span>(<span class="hljs-params">self, text</span>):</span>
        <span class="hljs-string">"""Create semantic dense vector using OpenAI"""</span>
        response = self.openai_client.embeddings.create(
            input=text,
            model=<span class="hljs-string">"text-embedding-ada-002"</span>
        )
        <span class="hljs-keyword">return</span> response.data[<span class="hljs-number">0</span>].embedding

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_sparse_vector</span>(<span class="hljs-params">self, text, method=<span class="hljs-string">"tfidf"</span></span>):</span>
        <span class="hljs-string">"""Create sparse vector using specified method.

        Note: fit self.tfidf on your corpus (self.tfidf.fit(corpus))
        before the first call, or transform() will raise an error.
        """</span>
        <span class="hljs-keyword">if</span> method == <span class="hljs-string">"tfidf"</span>:
            <span class="hljs-keyword">return</span> self.tfidf.transform([text]).toarray()[<span class="hljs-number">0</span>].tolist()
        <span class="hljs-comment"># Add other methods as needed</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">process_document</span>(<span class="hljs-params">self, text, doc_id</span>):</span>
        <span class="hljs-string">"""Process single document into multi-vector format"""</span>
        <span class="hljs-keyword">return</span> {
            <span class="hljs-string">"id"</span>: doc_id,
            <span class="hljs-string">"vectors"</span>: {
                <span class="hljs-string">"semantic"</span>: self.create_dense_vector(text),
                <span class="hljs-string">"keywords"</span>: self.create_sparse_vector(text, <span class="hljs-string">"tfidf"</span>),
            },
            <span class="hljs-string">"payload"</span>: {
                <span class="hljs-string">"text"</span>: text,
                <span class="hljs-string">"processed_at"</span>: time.time()
            }
        }
</code></pre>
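<p>The processed documents still need to be written into Qdrant. A minimal sketch of upserting a point that carries both named vectors (toy 4-dimensional vectors and the in-process <code>:memory:</code> client are used here so the snippet runs without a server; swap in your real sizes and host):</p>
<pre><code class="lang-python">from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

client = QdrantClient(":memory:")  # in-process mode, handy for experiments

client.create_collection(
    collection_name="hybrid_search_demo",
    vectors_config={
        "semantic": VectorParams(size=4, distance=Distance.COSINE),
        "keywords": VectorParams(size=4, distance=Distance.DOT),
    },
)

# One point carrying both named vectors plus its payload
client.upsert(
    collection_name="hybrid_search_demo",
    points=[PointStruct(
        id=1,
        vector={"semantic": [0.1, 0.2, 0.3, 0.4],
                "keywords": [0.0, 0.0, 0.7, 0.0]},
        payload={"text": "example document"},
    )],
)

print(client.count("hybrid_search_demo").count)  # 1
</code></pre>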
<h3 id="heading-3-hybrid-search-implementation">3. <strong>Hybrid Search Implementation</strong></h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">hybrid_search</span>(<span class="hljs-params">query, collection_name, weights=None</span>):</span>
    <span class="hljs-string">"""
    Perform hybrid search combining multiple vector types
    """</span>
    <span class="hljs-keyword">if</span> weights <span class="hljs-keyword">is</span> <span class="hljs-literal">None</span>:
        weights = {<span class="hljs-string">"semantic"</span>: <span class="hljs-number">0.7</span>, <span class="hljs-string">"keywords"</span>: <span class="hljs-number">0.3</span>}

    processor = MultiVectorProcessor()

    <span class="hljs-comment"># Create query vectors</span>
    query_semantic = processor.create_dense_vector(query)
    query_keywords = processor.create_sparse_vector(query)

    <span class="hljs-comment"># Search with semantic vector</span>
    semantic_results = client.search(
        collection_name=collection_name,
        query_vector=(<span class="hljs-string">"semantic"</span>, query_semantic),
        limit=<span class="hljs-number">20</span>
    )

    <span class="hljs-comment"># Search with keyword vector</span>
    keyword_results = client.search(
        collection_name=collection_name,
        query_vector=(<span class="hljs-string">"keywords"</span>, query_keywords),
        limit=<span class="hljs-number">20</span>
    )

    <span class="hljs-comment"># Combine results with weighted scoring</span>
    combined_results = combine_and_rerank(
        semantic_results, keyword_results, weights
    )

    <span class="hljs-keyword">return</span> combined_results[:<span class="hljs-number">10</span>]  <span class="hljs-comment"># Top 10 results</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">combine_and_rerank</span>(<span class="hljs-params">semantic_results, keyword_results, weights</span>):</span>
    <span class="hljs-string">"""Combine and rerank results from different vector searches"""</span>
    result_scores = {}

    <span class="hljs-comment"># Score semantic results</span>
    <span class="hljs-keyword">for</span> result <span class="hljs-keyword">in</span> semantic_results:
        doc_id = result.id
        result_scores[doc_id] = result_scores.get(doc_id, <span class="hljs-number">0</span>) + \
                               (result.score * weights[<span class="hljs-string">"semantic"</span>])

    <span class="hljs-comment"># Score keyword results  </span>
    <span class="hljs-keyword">for</span> result <span class="hljs-keyword">in</span> keyword_results:
        doc_id = result.id
        result_scores[doc_id] = result_scores.get(doc_id, <span class="hljs-number">0</span>) + \
                               (result.score * weights[<span class="hljs-string">"keywords"</span>])

    <span class="hljs-comment"># Sort by combined score</span>
    sorted_results = sorted(result_scores.items(), 
                          key=<span class="hljs-keyword">lambda</span> x: x[<span class="hljs-number">1</span>], reverse=<span class="hljs-literal">True</span>)
    <span class="hljs-keyword">return</span> sorted_results
</code></pre>
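<p>One caveat with the weighted combination above: cosine and dot-product scores live on different scales, so it usually pays to normalise each result list before mixing. A minimal min-max sketch:</p>
<pre><code class="lang-python">def minmax_normalize(scores):
    """Rescale raw scores into [0, 1] so lists from different metrics are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # all scores equal: avoid division by zero
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

print(minmax_normalize([2.0, 4.0, 8.0]))  # [0.0, 0.333..., 1.0]
</code></pre>
<p>Apply it to the semantic and keyword score lists separately before multiplying by the weights.</p>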
<hr />
<h2 id="heading-when-to-use-a-multi-vector-approach">When to Use a Multi-Vector Approach</h2>
<h3 id="heading-ideal-use-cases"><strong>Ideal Use Cases:</strong></h3>
<ol>
<li><p><strong>E-commerce Search</strong>: Product catalogues need both exact matches and semantic understanding</p>
</li>
<li><p><strong>Legal Document Retrieval</strong>: Exact legal terms + conceptual case law matching</p>
</li>
<li><p><strong>Academic Paper Search</strong>: Technical keywords + research concept similarity</p>
</li>
<li><p><strong>Customer Support</strong>: FAQ systems benefit from keyword precision + intent understanding</p>
</li>
<li><p><strong>Enterprise Search</strong>: Internal documents with domain-specific terminology</p>
</li>
</ol>
]]></content:encoded></item><item><title><![CDATA[A Guide to Document Chunking and Vector Search]]></title><description><![CDATA[Introduction: Why Traditional Search Falls Short
Imagine you're searching through a massive company knowledge base for information about "machine learning best practices." Traditional keyword search might return hundreds of documents, but you end up ...]]></description><link>https://blogs.ummerfarooq.dev/a-guide-to-document-chunking-and-vector-search</link><guid isPermaLink="true">https://blogs.ummerfarooq.dev/a-guide-to-document-chunking-and-vector-search</guid><category><![CDATA[qdrant]]></category><category><![CDATA[semantic search]]></category><category><![CDATA[langchain]]></category><category><![CDATA[chunking]]></category><dc:creator><![CDATA[Ummer Farooq]]></dc:creator><pubDate>Thu, 14 Aug 2025 11:17:59 GMT</pubDate><content:encoded><![CDATA[<h2 id="heading-introduction-why-traditional-search-falls-short">Introduction: Why Traditional Search Falls Short</h2>
<p>Imagine you're searching through a massive company knowledge base for information about "machine learning best practices." Traditional keyword search might return hundreds of documents, but you end up scrolling through irrelevant results because:</p>
<ul>
<li><p>The term "machine learning" appears in random sentences throughout documents</p>
</li>
<li><p>You get the entire 50-page document when you need a specific section</p>
</li>
<li><p>Important documents are buried because they use synonyms like "AI" or "artificial intelligence"</p>
</li>
</ul>
<p>This is where <strong>intelligent document chunking</strong> and <strong>vector search</strong> come to the rescue. Instead of treating documents like black boxes, we break them down intelligently and search through them using AI that understands meaning, not just keywords.</p>
<p>But here's the thing: there are multiple ways to approach this problem, each with its strengths and use cases. Let's examine the primary strategies that production systems employ today.</p>
<h2 id="heading-the-three-main-approaches-explained">The Three Main Approaches Explained</h2>
<h3 id="heading-1-fixed-size-chunking-the-simple-approach">1. Fixed-Size Chunking: The Simple Approach</h3>
<p><strong>What it is</strong>: Cut documents into equal-sized pieces, like slicing bread.</p>
<p><strong>How it works</strong>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain.text_splitter <span class="hljs-keyword">import</span> CharacterTextSplitter

<span class="hljs-comment"># Simple character-based chunking</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">simple_chunking</span>(<span class="hljs-params">document, chunk_size=<span class="hljs-number">500</span>, chunk_overlap=<span class="hljs-number">50</span></span>):</span>
    text_splitter = CharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
        separator=<span class="hljs-string">"\n"</span>
    )
    chunks = text_splitter.split_text(document)
    <span class="hljs-keyword">return</span> chunks

<span class="hljs-comment"># Example</span>
document = <span class="hljs-string">"Machine learning is revolutionizing healthcare..."</span>
chunks = simple_chunking(document, <span class="hljs-number">200</span>)
<span class="hljs-comment"># Result: ["Machine learning is revolutionizing healthcare by enabling...", </span>
<span class="hljs-comment">#          "...doctors to diagnose diseases faster. Recent studies show..."]</span>
</code></pre>
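<p>Under the hood this is just a sliding window: each step advances by <code>chunk_size - chunk_overlap</code> characters, so consecutive chunks share some context. A dependency-free sketch of the same mechanics:</p>
<pre><code class="lang-python">def sliding_window_chunks(text, chunk_size=200, chunk_overlap=50):
    """Cut text into fixed windows, re-reading chunk_overlap characters each step."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
</code></pre>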
<p><strong>Real-world example</strong>: Netflix might use this approach for subtitles or movie descriptions where the content is relatively uniform.</p>
<p><strong>Pros</strong>:</p>
<ul>
<li><p>✅ Simple to implement</p>
</li>
<li><p>✅ Predictable memory usage</p>
</li>
<li><p>✅ Works well for uniform content (novels, articles)</p>
</li>
</ul>
<p><strong>Cons</strong>:</p>
<ul>
<li><p>❌ Cuts through sentences mid-thought</p>
</li>
<li><p>❌ Loses document structure</p>
</li>
<li><p>❌ Poor for complex documents</p>
</li>
</ul>
<h3 id="heading-2-semantic-chunking-the-smart-approach">2. Semantic Chunking: The Smart Approach</h3>
<p><strong>What it is</strong>: Split documents based on meaning and structure, like organising a library by topics.</p>
<p><strong>How it works</strong>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain.text_splitter <span class="hljs-keyword">import</span> RecursiveCharacterTextSplitter
<span class="hljs-keyword">from</span> langchain.text_splitter <span class="hljs-keyword">import</span> MarkdownHeaderTextSplitter
<span class="hljs-keyword">import</span> re

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">semantic_chunking_with_langchain</span>(<span class="hljs-params">document, doc_type=<span class="hljs-string">"markdown"</span></span>):</span>
    chunks = []

    <span class="hljs-keyword">if</span> doc_type == <span class="hljs-string">"markdown"</span>:
        <span class="hljs-comment"># For markdown documents, split by headers</span>
        headers_to_split_on = [
            (<span class="hljs-string">"#"</span>, <span class="hljs-string">"Header 1"</span>),
            (<span class="hljs-string">"##"</span>, <span class="hljs-string">"Header 2"</span>),
            (<span class="hljs-string">"###"</span>, <span class="hljs-string">"Header 3"</span>),
        ]

        markdown_splitter = MarkdownHeaderTextSplitter(
            headers_to_split_on=headers_to_split_on
        )
        md_header_splits = markdown_splitter.split_text(document)

        <span class="hljs-comment"># Further split large sections</span>
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=<span class="hljs-number">500</span>,
            chunk_overlap=<span class="hljs-number">50</span>
        )

        <span class="hljs-keyword">for</span> header_chunk <span class="hljs-keyword">in</span> md_header_splits:
            chunk_type = detect_section_type(header_chunk.page_content, header_chunk.metadata)

            <span class="hljs-comment"># Split large chunks further while preserving structure</span>
            <span class="hljs-keyword">if</span> len(header_chunk.page_content) &gt; <span class="hljs-number">800</span>:
                sub_chunks = text_splitter.split_text(header_chunk.page_content)
                <span class="hljs-keyword">for</span> i, sub_chunk <span class="hljs-keyword">in</span> enumerate(sub_chunks):
                    chunks.append({
                        <span class="hljs-string">'content'</span>: sub_chunk,
                        <span class="hljs-string">'type'</span>: chunk_type,
                        <span class="hljs-string">'metadata'</span>: {
                            **header_chunk.metadata,
                            <span class="hljs-string">'sub_chunk_index'</span>: i,
                            <span class="hljs-string">'is_split_chunk'</span>: <span class="hljs-literal">True</span>
                        }
                    })
            <span class="hljs-keyword">else</span>:
                chunks.append({
                    <span class="hljs-string">'content'</span>: header_chunk.page_content,
                    <span class="hljs-string">'type'</span>: chunk_type,
                    <span class="hljs-string">'metadata'</span>: header_chunk.metadata
                })

    <span class="hljs-keyword">else</span>:
        <span class="hljs-comment"># For plain text, use recursive splitting with custom separators</span>
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=<span class="hljs-number">500</span>,
            chunk_overlap=<span class="hljs-number">50</span>,
            separators=[<span class="hljs-string">"\n\n\n"</span>, <span class="hljs-string">"\n\n"</span>, <span class="hljs-string">"\n"</span>, <span class="hljs-string">"."</span>, <span class="hljs-string">"!"</span>, <span class="hljs-string">"?"</span>, <span class="hljs-string">" "</span>, <span class="hljs-string">""</span>]
        )

        raw_chunks = text_splitter.split_text(document)

        <span class="hljs-keyword">for</span> i, chunk <span class="hljs-keyword">in</span> enumerate(raw_chunks):
            chunk_type = detect_section_type(chunk)
            chunks.append({
                <span class="hljs-string">'content'</span>: chunk,
                <span class="hljs-string">'type'</span>: chunk_type,
                <span class="hljs-string">'metadata'</span>: {<span class="hljs-string">'chunk_index'</span>: i}
            })

    <span class="hljs-keyword">return</span> chunks

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">detect_section_type</span>(<span class="hljs-params">text, existing_metadata=None</span>):</span>
    text_lower = text.lower()

    <span class="hljs-comment"># Use existing header metadata if available</span>
    <span class="hljs-keyword">if</span> existing_metadata:
        <span class="hljs-keyword">for</span> key, value <span class="hljs-keyword">in</span> existing_metadata.items():
            <span class="hljs-keyword">if</span> <span class="hljs-string">'header'</span> <span class="hljs-keyword">in</span> key.lower():
                header_text = value.lower()
                <span class="hljs-keyword">if</span> any(keyword <span class="hljs-keyword">in</span> header_text <span class="hljs-keyword">for</span> keyword <span class="hljs-keyword">in</span> [<span class="hljs-string">'summary'</span>, <span class="hljs-string">'abstract'</span>, <span class="hljs-string">'overview'</span>]):
                    <span class="hljs-keyword">return</span> <span class="hljs-string">'summary'</span>
                <span class="hljs-keyword">elif</span> any(keyword <span class="hljs-keyword">in</span> header_text <span class="hljs-keyword">for</span> keyword <span class="hljs-keyword">in</span> [<span class="hljs-string">'conclusion'</span>, <span class="hljs-string">'results'</span>]):
                    <span class="hljs-keyword">return</span> <span class="hljs-string">'conclusion'</span>
                <span class="hljs-keyword">elif</span> any(keyword <span class="hljs-keyword">in</span> header_text <span class="hljs-keyword">for</span> keyword <span class="hljs-keyword">in</span> [<span class="hljs-string">'introduction'</span>, <span class="hljs-string">'background'</span>]):
                    <span class="hljs-keyword">return</span> <span class="hljs-string">'introduction'</span>

    <span class="hljs-comment"># Fallback to content-based detection</span>
    <span class="hljs-keyword">if</span> len(text) &lt; <span class="hljs-number">100</span> <span class="hljs-keyword">and</span> <span class="hljs-string">':'</span> <span class="hljs-keyword">in</span> text:
        <span class="hljs-keyword">return</span> <span class="hljs-string">'title'</span>
    <span class="hljs-keyword">elif</span> any(keyword <span class="hljs-keyword">in</span> text_lower <span class="hljs-keyword">for</span> keyword <span class="hljs-keyword">in</span> [<span class="hljs-string">'summary'</span>, <span class="hljs-string">'abstract'</span>, <span class="hljs-string">'overview'</span>]):
        <span class="hljs-keyword">return</span> <span class="hljs-string">'summary'</span>
    <span class="hljs-keyword">elif</span> any(keyword <span class="hljs-keyword">in</span> text_lower <span class="hljs-keyword">for</span> keyword <span class="hljs-keyword">in</span> [<span class="hljs-string">'conclusion'</span>, <span class="hljs-string">'in conclusion'</span>, <span class="hljs-string">'to conclude'</span>]):
        <span class="hljs-keyword">return</span> <span class="hljs-string">'conclusion'</span>
    <span class="hljs-keyword">elif</span> any(keyword <span class="hljs-keyword">in</span> text_lower <span class="hljs-keyword">for</span> keyword <span class="hljs-keyword">in</span> [<span class="hljs-string">'introduction'</span>, <span class="hljs-string">'background'</span>]):
        <span class="hljs-keyword">return</span> <span class="hljs-string">'introduction'</span>
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-string">'content'</span>
</code></pre>
<p><strong>Real-world example</strong>: A legal firm's document system where lawyers need to quickly find case summaries, detailed legal reasoning, or final judgments.</p>
<p><strong>Pros</strong>:</p>
<ul>
<li><p>✅ Preserves document structure</p>
</li>
<li><p>✅ Enables targeted search (search only in summaries)</p>
</li>
<li><p>✅ Better context preservation</p>
</li>
<li><p>✅ Widely used in production</p>
</li>
</ul>
<p><strong>Cons</strong>:</p>
<ul>
<li><p>❌ More complex to implement</p>
</li>
<li><p>❌ Requires understanding of document structure</p>
</li>
<li><p>❌ May create uneven chunk sizes</p>
</li>
</ul>
<h3 id="heading-3-multi-vector-collections-the-advanced-approach">3. Multi-Vector Collections: The Advanced Approach</h3>
<p><strong>What it is</strong>: Create multiple different representations of the same content, like having multiple indexes for the same library book.</p>
<p><strong>How it works</strong>:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">multi_vector_approach</span>(<span class="hljs-params">document</span>):</span>
    <span class="hljs-comment"># Same document, multiple representations</span>
    representations = {}

    <span class="hljs-comment"># Semantic representation (for meaning)</span>
    representations[<span class="hljs-string">'semantic'</span>] = create_embedding(document, model=<span class="hljs-string">'semantic'</span>)

    <span class="hljs-comment"># Keyword representation (for exact matches)</span>
    keyword_enhanced = extract_keywords(document) + document
    representations[<span class="hljs-string">'keyword'</span>] = create_embedding(keyword_enhanced, model=<span class="hljs-string">'keyword'</span>)

    <span class="hljs-comment"># Summary representation (for high-level concepts)</span>
    summary = generate_summary(document)
    representations[<span class="hljs-string">'summary'</span>] = create_embedding(summary, model=<span class="hljs-string">'large'</span>)

    <span class="hljs-keyword">return</span> {
        <span class="hljs-string">'document'</span>: document,
        <span class="hljs-string">'vectors'</span>: representations
    }

<span class="hljs-comment"># When searching, you can choose which representation to use</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">search_with_strategy</span>(<span class="hljs-params">query, search_type=<span class="hljs-string">'semantic'</span></span>):</span>
    <span class="hljs-keyword">if</span> search_type == <span class="hljs-string">'semantic'</span>:
        <span class="hljs-keyword">return</span> search_vector(query, vector_type=<span class="hljs-string">'semantic'</span>)
    <span class="hljs-keyword">elif</span> search_type == <span class="hljs-string">'keyword'</span>:
        <span class="hljs-keyword">return</span> search_vector(query, vector_type=<span class="hljs-string">'keyword'</span>)
    <span class="hljs-keyword">elif</span> search_type == <span class="hljs-string">'conceptual'</span>:
        <span class="hljs-keyword">return</span> search_vector(query, vector_type=<span class="hljs-string">'summary'</span>)
</code></pre>
<p><strong>Real-world example</strong>: A research platform where the same paper needs to be findable by exact technical terms, general concepts, and semantic similarity.</p>
<p><strong>Pros</strong>:</p>
<ul>
<li><p>✅ Multiple search strategies for the same content</p>
</li>
<li><p>✅ Can combine different AI models</p>
</li>
<li><p>✅ Handles diverse query types well</p>
</li>
</ul>
<p><strong>Cons</strong>:</p>
<ul>
<li><p>❌ Much more complex and expensive</p>
</li>
<li><p>❌ Requires multiple embedding API calls</p>
</li>
<li><p>❌ Higher storage costs</p>
</li>
<li><p>❌ Less commonly used in production</p>
</li>
</ul>
<hr />
<h2 id="heading-real-world-use-cases-which-approach-when">Real-World Use Cases: Which Approach When?</h2>
<h3 id="heading-e-commerce-platform-product-search">E-commerce Platform: Product Search</h3>
<p><strong>Scenario</strong>: Customers search for products using various terms</p>
<p><strong>Best approach</strong>: <strong>Semantic chunking</strong> with product attribute separation</p>
<pre><code class="lang-plaintext">Product chunks:
- Title: "iPhone 15 Pro Max"
- Features: "6.7-inch display, A17 Pro chip, titanium design"
- Reviews: "Great camera quality, excellent battery life"
- Specifications: "256GB storage, 5G connectivity"
</code></pre>
<p><strong>Why</strong>: Customers might want to search specifically in reviews ("battery life") or specifications ("storage"), making targeted search valuable.</p>
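<p>The attribute split above can be produced with a small helper that tags each chunk with a chunk_type payload, which is what makes the targeted searches possible. A minimal sketch (the product record and field names are illustrative, not a fixed schema):</p>
<pre><code class="lang-python"># Hypothetical product record; the field names are illustrative only
product = {
    "id": "sku-123",
    "title": "iPhone 15 Pro Max",
    "features": "6.7-inch display, A17 Pro chip, titanium design",
    "reviews": "Great camera quality, excellent battery life",
    "specifications": "256GB storage, 5G connectivity",
}

def product_to_chunks(product):
    """Give each attribute its own chunk, tagged with a chunk_type payload
    so searches can later be filtered to e.g. only reviews or specifications."""
    chunks = []
    for field in ("title", "features", "reviews", "specifications"):
        text = product.get(field)
        if text:
            chunks.append({
                "content": text,
                "payload": {"chunk_type": field, "parent_doc_id": product["id"]},
            })
    return chunks

chunks = product_to_chunks(product)  # one chunk per attribute, each independently embeddable
</code></pre>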
<h3 id="heading-legal-document-management">Legal Document Management</h3>
<p><strong>Scenario</strong>: Lawyers need to find specific information in thousands of legal documents</p>
<p><strong>Best approach</strong>: <strong>Semantic chunking</strong> with legal document structure</p>
<pre><code class="lang-plaintext">Legal document chunks:
- Case summary: "Plaintiff vs. Defendant regarding contract dispute"
- Facts: "On January 15, 2023, the parties entered into agreement..."
- Legal reasoning: "Under contract law precedent established in..."
- Judgment: "The court finds in favor of plaintiff and awards..."
</code></pre>
<p><strong>Why</strong>: Legal professionals have specific information needs - they might want only case summaries for quick review or only judgments for precedent research.</p>
<h3 id="heading-customer-support-knowledge-base">Customer Support Knowledge Base</h3>
<p><strong>Scenario</strong>: Support agents need quick answers to customer questions</p>
<p><strong>Best approach</strong>: <strong>Multi-vector collections</strong> for diverse query handling</p>
<pre><code class="lang-plaintext">Same article about "Password Reset" gets multiple representations:
- Semantic vector: Understands "I can't log in" → password reset
- Keyword vector: Finds exact matches for "forgot password"
- Summary vector: Matches high-level concepts like "account access issues"
</code></pre>
<p><strong>Why</strong>: Customer questions come in many forms - some use exact terminology, others describe problems in natural language.</p>
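<p>In Qdrant, this pattern maps onto named vectors: one collection can hold several vectors per point, and you pick which one to search at query time. A minimal sketch using an in-memory client and toy 4-dimensional placeholder vectors (real embeddings would come from your embedding models):</p>
<pre><code class="lang-python">from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, NamedVector, PointStruct

# Local in-memory instance for illustration; point at your server in production
client = QdrantClient(":memory:")

# One collection, several named vectors per point -- Qdrant's native way
# to store multiple representations of the same document
client.create_collection(
    collection_name="support_articles",
    vectors_config={
        "semantic": VectorParams(size=4, distance=Distance.COSINE),
        "keyword": VectorParams(size=4, distance=Distance.COSINE),
    },
)

# The vector values below are toy placeholders, not real embeddings
client.upsert(
    collection_name="support_articles",
    points=[PointStruct(
        id=1,
        vector={"semantic": [0.1, 0.2, 0.3, 0.4], "keyword": [0.4, 0.3, 0.2, 0.1]},
        payload={"title": "Password Reset"},
    )],
)

# Choose which representation to search at query time
hits = client.search(
    collection_name="support_articles",
    query_vector=NamedVector(name="semantic", vector=[0.1, 0.2, 0.3, 0.4]),
    limit=1,
)
</code></pre>
<p>Compared with running separate collections per representation, named vectors keep all views of a document under one point, so payload and lifecycle management stay in one place.</p>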
<h3 id="heading-academic-research-platform">Academic Research Platform</h3>
<p><strong>Scenario</strong>: Researchers search through millions of scientific papers</p>
<p><strong>Best approach</strong>: <strong>Semantic chunking</strong> with academic paper structure</p>
<pre><code class="lang-plaintext">Research paper chunks:
- Abstract: High-level research summary
- Introduction: Problem background and motivation
- Methodology: How the research was conducted
- Results: What was discovered
- Conclusion: Implications and future work
</code></pre>
<p><strong>Why</strong>: Researchers have different needs - some want quick overviews (abstracts), others need implementation details (methodology).</p>
<hr />
<h2 id="heading-best-practices-and-common-pitfalls">Best Practices and Common Pitfalls</h2>
<h3 id="heading-dos">Do's ✅</h3>
<ol>
<li><p><strong>Start simple</strong>: Begin with semantic chunking before considering multi-vector</p>
</li>
<li><p><strong>Test with real queries</strong>: Use actual user queries to evaluate effectiveness</p>
</li>
<li><p><strong>Monitor chunk sizes</strong>: Aim for 200-800 tokens per chunk for most embedding models</p>
</li>
<li><p><strong>Preserve context</strong>: Include some overlap between chunks to maintain context</p>
</li>
<li><p><strong>Use metadata effectively</strong>: Store document source, creation date, author, etc.</p>
</li>
</ol>
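<p>The overlap recommendation above can be implemented with a simple sliding window. A minimal sketch that counts words as a stand-in for tokens (swap in a real tokenizer, such as your embedding model's, for accurate sizing):</p>
<pre><code class="lang-python">def chunk_with_overlap(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks where each chunk repeats the last
    `overlap` words of the previous one, so context survives chunk boundaries.
    Word counts stand in for tokens here; use a real tokenizer in production."""
    words = text.split()
    step = chunk_size - overlap
    # Stop once a chunk reaches the end of the text; max(..., 1) handles short inputs
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[s:s + chunk_size]) for s in starts]

# 500 words with chunk_size=200 and overlap=50 yields 3 chunks,
# each sharing its first 50 words with the previous chunk's tail
chunks = chunk_with_overlap(" ".join(str(i) for i in range(500)))
</code></pre>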
<h3 id="heading-donts">Don'ts ❌</h3>
<ol>
<li><p><strong>Don't over-engineer</strong>: Multi-vector isn't always better than good semantic chunking</p>
</li>
<li><p><strong>Don't ignore document structure</strong>: Fixed chunking often loses important context</p>
</li>
<li><p><strong>Don't forget evaluation</strong>: Measure search quality with real user scenarios</p>
</li>
<li><p><strong>Don't chunk too small</strong>: Very small chunks lose context</p>
</li>
<li><p><strong>Don't chunk too large</strong>: Very large chunks dilute specific information</p>
</li>
</ol>
<h2 id="heading-common-pitfalls-and-how-to-avoid-them">Common Pitfalls and How to Avoid Them</h2>
<h3 id="heading-pitfall-1-more-vectors-better-results">Pitfall 1: "More Vectors = Better Results"</h3>
<p><strong>Problem</strong>: Assuming multi-vector always outperforms simpler approaches</p>
<p><strong>Solution</strong>: Start with semantic chunking and only add complexity if you have specific use cases that require it</p>
<h3 id="heading-pitfall-2-ignoring-document-structure">Pitfall 2: Ignoring Document Structure</h3>
<p><strong>Problem</strong>: Using fixed chunking on structured documents like research papers</p>
<p><strong>Solution</strong>: Analyze your document types and chunk according to their natural structure</p>
<h3 id="heading-pitfall-3-not-testing-with-real-queries">Pitfall 3: Not Testing with Real Queries</h3>
<p><strong>Problem</strong>: Optimizing for theoretical scenarios instead of actual user needs</p>
<p><strong>Solution</strong>: Collect real user queries and evaluate your chunking strategy against them</p>
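<p>Evaluation can start as simply as measuring recall@k: the fraction of collected user queries whose expected document shows up in the top-k results. A minimal sketch (the search function, queries, and document IDs here are toy stand-ins for your real system):</p>
<pre><code class="lang-python">def recall_at_k(search_fn, labeled_queries, k=5):
    """Fraction of queries whose expected document id appears in the
    top-k results returned by search_fn(query, k)."""
    hits = 0
    for query, expected_doc_id in labeled_queries.items():
        if expected_doc_id in search_fn(query, k):
            hits += 1
    return hits / len(labeled_queries)

# Toy stand-in for a vector search, just to show the evaluation loop
fake_index = {"reset my password": "kb-42", "billing question": "kb-7"}
def fake_search(query, k):
    return [fake_index.get(query, "kb-0")]

score = recall_at_k(
    fake_search,
    {"reset my password": "kb-42", "billing question": "kb-7", "cancel account": "kb-13"},
    k=1,
)
</code></pre>
<p>Running this before and after a chunking change gives a concrete number to compare, instead of judging search quality by a handful of hand-picked queries.</p>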
<hr />
<h3 id="heading-sample-code">Sample code</h3>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">import</span> uuid
<span class="hljs-keyword">import</span> logging
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Dict, Any, Optional
<span class="hljs-keyword">from</span> dataclasses <span class="hljs-keyword">import</span> dataclass
<span class="hljs-keyword">import</span> openai
<span class="hljs-keyword">from</span> qdrant_client <span class="hljs-keyword">import</span> QdrantClient
<span class="hljs-keyword">from</span> qdrant_client.models <span class="hljs-keyword">import</span> Distance, VectorParams, PointStruct, Filter, FieldCondition, MatchValue, MatchAny
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

load_dotenv()

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


<span class="hljs-meta">@dataclass</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">DocumentChunk</span>:</span>
    <span class="hljs-string">"""Represents a chunk of a document with its metadata"""</span>
    chunk_id: str
    content: str
    chunk_type: str  <span class="hljs-comment"># 'summary', 'paragraph', 'title', 'conclusion', etc.</span>
    parent_doc_id: str
    chunk_index: int
    metadata: Dict[str, Any]


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">OpenAIEmbeddingService</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, model: str = <span class="hljs-string">"text-embedding-3-small"</span></span>):</span>
        self.client = openai.OpenAI(api_key=os.getenv(<span class="hljs-string">'OPENAI_API_KEY'</span>))
        self.model = model
        self.embedding_dimension = <span class="hljs-number">1536</span> <span class="hljs-keyword">if</span> <span class="hljs-string">"3-small"</span> <span class="hljs-keyword">in</span> model <span class="hljs-keyword">else</span> <span class="hljs-number">3072</span>
        logger.info(<span class="hljs-string">f"Initialized OpenAI embedding service with model: <span class="hljs-subst">{model}</span>"</span>)

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_embeddings</span>(<span class="hljs-params">self, texts: List[str]</span>) -&gt; List[List[float]]:</span>
        <span class="hljs-keyword">try</span>:
            logger.info(<span class="hljs-string">f"Creating embeddings for <span class="hljs-subst">{len(texts)}</span> texts"</span>)
            response = self.client.embeddings.create(
                model=self.model,
                input=texts,
                encoding_format=<span class="hljs-string">"float"</span>
            )

            embeddings = [embedding.embedding <span class="hljs-keyword">for</span> embedding <span class="hljs-keyword">in</span> response.data]
            logger.info(<span class="hljs-string">f"Successfully created <span class="hljs-subst">{len(embeddings)}</span> embeddings"</span>)
            <span class="hljs-keyword">return</span> embeddings

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Failed to create embeddings: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">create_single_embedding</span>(<span class="hljs-params">self, text: str</span>) -&gt; List[float]:</span>
        <span class="hljs-string">"""Create embedding for a single text"""</span>
        <span class="hljs-keyword">return</span> self.create_embeddings([text])[<span class="hljs-number">0</span>]


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">QdrantCollection</span>:</span>
    <span class="hljs-string">"""
    Multi-vector collection implementation using Qdrant.
    Each document is split into multiple chunks, with each chunk getting its own vector.
    """</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self,
                 collection_name: str,
                 qdrant_client: QdrantClient,
                 embedding_service: OpenAIEmbeddingService</span>):</span>
        self.collection_name = collection_name
        self.qdrant_client = qdrant_client
        self.embedding_service = embedding_service

        <span class="hljs-comment"># Create the collection if it doesn't exist</span>
        self._ensure_collection_exists()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_ensure_collection_exists</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-string">"""Create the Qdrant collection for multi-vector storage"""</span>
        <span class="hljs-keyword">try</span>:
            <span class="hljs-comment"># Check if collection exists</span>
            collections = self.qdrant_client.get_collections()
            existing_names = [col.name <span class="hljs-keyword">for</span> col <span class="hljs-keyword">in</span> collections.collections]

            <span class="hljs-keyword">if</span> self.collection_name <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> existing_names:
                logger.info(<span class="hljs-string">f"Creating new collection: <span class="hljs-subst">{self.collection_name}</span>"</span>)

                <span class="hljs-comment"># Create collection with vector configuration</span>
                self.qdrant_client.create_collection(
                    collection_name=self.collection_name,
                    vectors_config=VectorParams(
                        size=self.embedding_service.embedding_dimension,
                        distance=Distance.COSINE
                    )
                )
                logger.info(<span class="hljs-string">f"✅ Collection '<span class="hljs-subst">{self.collection_name}</span>' created successfully"</span>)
            <span class="hljs-keyword">else</span>:
                logger.info(<span class="hljs-string">f"✅ Collection '<span class="hljs-subst">{self.collection_name}</span>' already exists"</span>)

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Failed to create/verify collection: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">_create_document_chunks</span>(<span class="hljs-params">self, doc_id: str, content: str, metadata: Dict</span>) -&gt; List[DocumentChunk]:</span>
        <span class="hljs-string">"""
        Split document into multiple chunks of different types.
        This is where the 'multi-vector' concept comes into play.
        """</span>
        chunks = []
        lines = content.strip().split(<span class="hljs-string">'\n'</span>)
        paragraphs = [p.strip() <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> content.split(<span class="hljs-string">'\n\n'</span>) <span class="hljs-keyword">if</span> p.strip()]

        <span class="hljs-comment"># 1. Title chunk (first non-empty line if it looks like a title)</span>
        <span class="hljs-keyword">if</span> lines <span class="hljs-keyword">and</span> len(lines[<span class="hljs-number">0</span>].strip()) &lt; <span class="hljs-number">100</span>:
            title_chunk = DocumentChunk(
                chunk_id=<span class="hljs-string">f"<span class="hljs-subst">{doc_id}</span>_title"</span>,
                content=lines[<span class="hljs-number">0</span>].strip(),
                chunk_type=<span class="hljs-string">"title"</span>,
                parent_doc_id=doc_id,
                chunk_index=<span class="hljs-number">0</span>,
                metadata={**metadata, <span class="hljs-string">"is_title"</span>: <span class="hljs-literal">True</span>}
            )
            chunks.append(title_chunk)

        <span class="hljs-comment"># 2. Summary chunk (first paragraph as summary)</span>
        <span class="hljs-keyword">if</span> paragraphs:
            summary_content = paragraphs[<span class="hljs-number">0</span>]
            <span class="hljs-keyword">if</span> len(summary_content) &gt; <span class="hljs-number">50</span>:  <span class="hljs-comment"># Only if substantial</span>
                summary_chunk = DocumentChunk(
                    chunk_id=<span class="hljs-string">f"<span class="hljs-subst">{doc_id}</span>_summary"</span>,
                    content=<span class="hljs-string">f"Summary: <span class="hljs-subst">{summary_content}</span>"</span>,
                    chunk_type=<span class="hljs-string">"summary"</span>,
                    parent_doc_id=doc_id,
                    chunk_index=<span class="hljs-number">1</span>,
                    metadata={**metadata, <span class="hljs-string">"is_summary"</span>: <span class="hljs-literal">True</span>}
                )
                chunks.append(summary_chunk)

        <span class="hljs-comment"># 3. Individual paragraph chunks</span>
        <span class="hljs-keyword">for</span> i, paragraph <span class="hljs-keyword">in</span> enumerate(paragraphs):
            <span class="hljs-keyword">if</span> len(paragraph) &gt; <span class="hljs-number">100</span>:  <span class="hljs-comment"># Only substantial paragraphs</span>
                para_chunk = DocumentChunk(
                    chunk_id=<span class="hljs-string">f"<span class="hljs-subst">{doc_id}</span>_para_<span class="hljs-subst">{i}</span>"</span>,
                    content=paragraph,
                    chunk_type=<span class="hljs-string">"paragraph"</span>,
                    parent_doc_id=doc_id,
                    chunk_index=i + <span class="hljs-number">2</span>,  <span class="hljs-comment"># After title and summary</span>
                    metadata={**metadata, <span class="hljs-string">"paragraph_number"</span>: i}
                )
                chunks.append(para_chunk)

        <span class="hljs-comment"># 4. Conclusion chunk (last paragraph if it contains conclusion keywords)</span>
        <span class="hljs-keyword">if</span> len(paragraphs) &gt; <span class="hljs-number">1</span>:
            last_para = paragraphs[<span class="hljs-number">-1</span>].lower()
            conclusion_keywords = [<span class="hljs-string">'conclusion'</span>, <span class="hljs-string">'summary'</span>, <span class="hljs-string">'in conclusion'</span>, <span class="hljs-string">'to summarize'</span>, <span class="hljs-string">'finally'</span>]
            <span class="hljs-keyword">if</span> any(keyword <span class="hljs-keyword">in</span> last_para <span class="hljs-keyword">for</span> keyword <span class="hljs-keyword">in</span> conclusion_keywords):
                conclusion_chunk = DocumentChunk(
                    chunk_id=<span class="hljs-string">f"<span class="hljs-subst">{doc_id}</span>_conclusion"</span>,
                    content=paragraphs[<span class="hljs-number">-1</span>],
                    chunk_type=<span class="hljs-string">"conclusion"</span>,
                    parent_doc_id=doc_id,
                    chunk_index=len(chunks) + <span class="hljs-number">1</span>,
                    metadata={**metadata, <span class="hljs-string">"is_conclusion"</span>: <span class="hljs-literal">True</span>}
                )
                chunks.append(conclusion_chunk)

        logger.info(<span class="hljs-string">f"Created <span class="hljs-subst">{len(chunks)}</span> chunks for document '<span class="hljs-subst">{doc_id}</span>'"</span>)
        <span class="hljs-keyword">return</span> chunks

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">add_document</span>(<span class="hljs-params">self, doc_id: str, content: str, metadata: Dict = None</span>) -&gt; Dict[str, Any]:</span>
        <span class="hljs-string">"""
        Add a document to the multi-vector collection.
        This demonstrates the core workflow of multi-vector storage.
        """</span>
        metadata = metadata <span class="hljs-keyword">or</span> {}

        <span class="hljs-keyword">try</span>:
            logger.info(<span class="hljs-string">f"Adding document '<span class="hljs-subst">{doc_id}</span>' to collection"</span>)

            <span class="hljs-comment"># Step 1: Create multiple chunks from the document</span>
            chunks = self._create_document_chunks(doc_id, content, metadata)

            <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> chunks:
                <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"No valid chunks created from document"</span>)

            <span class="hljs-comment"># Step 2: Generate embeddings for all chunks</span>
            chunk_contents = [chunk.content <span class="hljs-keyword">for</span> chunk <span class="hljs-keyword">in</span> chunks]
            embeddings = self.embedding_service.create_embeddings(chunk_contents)

            <span class="hljs-comment"># Step 3: Create Qdrant points for each chunk</span>
            points = []
            <span class="hljs-keyword">for</span> chunk, embedding <span class="hljs-keyword">in</span> zip(chunks, embeddings):
                <span class="hljs-comment"># Prepare payload with chunk metadata</span>
                payload = {
                    <span class="hljs-string">"chunk_id"</span>: chunk.chunk_id,
                    <span class="hljs-string">"content"</span>: chunk.content,
                    <span class="hljs-string">"chunk_type"</span>: chunk.chunk_type,
                    <span class="hljs-string">"parent_doc_id"</span>: chunk.parent_doc_id,
                    <span class="hljs-string">"chunk_index"</span>: chunk.chunk_index,
                    **chunk.metadata  <span class="hljs-comment"># Include all custom metadata</span>
                }

                <span class="hljs-comment"># Create point</span>
                point = PointStruct(
                    id=str(uuid.uuid4()),  <span class="hljs-comment"># Unique point ID</span>
                    vector=embedding,
                    payload=payload
                )
                points.append(point)

            <span class="hljs-comment"># Step 4: Upload points to Qdrant</span>
            self.qdrant_client.upsert(
                collection_name=self.collection_name,
                points=points
            )

            result = {
                <span class="hljs-string">"success"</span>: <span class="hljs-literal">True</span>,
                <span class="hljs-string">"doc_id"</span>: doc_id,
                <span class="hljs-string">"chunks_created"</span>: len(chunks),
                <span class="hljs-string">"chunk_types"</span>: [chunk.chunk_type <span class="hljs-keyword">for</span> chunk <span class="hljs-keyword">in</span> chunks],
                <span class="hljs-string">"points_uploaded"</span>: len(points)
            }

            logger.info(<span class="hljs-string">f"✅ Successfully added document '<span class="hljs-subst">{doc_id}</span>' with <span class="hljs-subst">{len(chunks)}</span> chunks"</span>)
            <span class="hljs-keyword">return</span> result

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Failed to add document '<span class="hljs-subst">{doc_id}</span>': <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">search</span>(<span class="hljs-params">self,
               query: str,
               limit: int = <span class="hljs-number">5</span>,
               chunk_types: Optional[List[str]] = None,
               doc_id_filter: Optional[str] = None</span>) -&gt; List[Dict[str, Any]]:</span>
        <span class="hljs-string">"""
        Search the multi-vector collection with optional filtering.
        This demonstrates the key advantage of multi-vector collections.
        """</span>
        <span class="hljs-keyword">try</span>:
            logger.info(<span class="hljs-string">f"Searching for: '<span class="hljs-subst">{query}</span>' with limit=<span class="hljs-subst">{limit}</span>"</span>)

            <span class="hljs-comment"># Step 1: Create query embedding</span>
            query_embedding = self.embedding_service.create_single_embedding(query)

            <span class="hljs-comment"># Step 2: Build filter conditions</span>
            filter_conditions = []

            <span class="hljs-keyword">if</span> chunk_types:
                <span class="hljs-comment"># Filter by chunk types</span>
                filter_conditions.append(
                    FieldCondition(
                        key=<span class="hljs-string">"chunk_type"</span>,
                        <span class="hljs-comment"># MatchValue matches a single value; MatchAny matches any of several</span>
                        match=MatchValue(value=chunk_types[<span class="hljs-number">0</span>]) <span class="hljs-keyword">if</span> len(chunk_types) == <span class="hljs-number">1</span> <span class="hljs-keyword">else</span> MatchAny(any=chunk_types)
                    )
                )
                logger.info(<span class="hljs-string">f"Filtering by chunk types: <span class="hljs-subst">{chunk_types}</span>"</span>)

            <span class="hljs-keyword">if</span> doc_id_filter:
                <span class="hljs-comment"># Filter by specific document</span>
                filter_conditions.append(
                    FieldCondition(
                        key=<span class="hljs-string">"parent_doc_id"</span>,
                        match=MatchValue(value=doc_id_filter)
                    )
                )
                logger.info(<span class="hljs-string">f"Filtering by document ID: <span class="hljs-subst">{doc_id_filter}</span>"</span>)

            <span class="hljs-comment"># Combine filters</span>
            search_filter = Filter(must=filter_conditions) <span class="hljs-keyword">if</span> filter_conditions <span class="hljs-keyword">else</span> <span class="hljs-literal">None</span>

            <span class="hljs-comment"># Step 3: Perform vector search</span>
            search_results = self.qdrant_client.search(
                collection_name=self.collection_name,
                query_vector=query_embedding,
                query_filter=search_filter,
                limit=limit,
                with_payload=<span class="hljs-literal">True</span>,
                with_vectors=<span class="hljs-literal">False</span>  <span class="hljs-comment"># Don't return vectors to save bandwidth</span>
            )

            <span class="hljs-comment"># Step 4: Format results</span>
            formatted_results = []
            <span class="hljs-keyword">for</span> result <span class="hljs-keyword">in</span> search_results:
                formatted_result = {
                    <span class="hljs-string">"score"</span>: result.score,
                    <span class="hljs-string">"chunk_id"</span>: result.payload.get(<span class="hljs-string">"chunk_id"</span>),
                    <span class="hljs-string">"content"</span>: result.payload.get(<span class="hljs-string">"content"</span>),
                    <span class="hljs-string">"chunk_type"</span>: result.payload.get(<span class="hljs-string">"chunk_type"</span>),
                    <span class="hljs-string">"parent_doc_id"</span>: result.payload.get(<span class="hljs-string">"parent_doc_id"</span>),
                    <span class="hljs-string">"chunk_index"</span>: result.payload.get(<span class="hljs-string">"chunk_index"</span>),
                    <span class="hljs-string">"metadata"</span>: {k: v <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> result.payload.items()
                                 <span class="hljs-keyword">if</span> k <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> [<span class="hljs-string">"chunk_id"</span>, <span class="hljs-string">"content"</span>, <span class="hljs-string">"chunk_type"</span>, <span class="hljs-string">"parent_doc_id"</span>, <span class="hljs-string">"chunk_index"</span>]}
                }
                formatted_results.append(formatted_result)

            logger.info(<span class="hljs-string">f"Found <span class="hljs-subst">{len(formatted_results)}</span> results"</span>)
            <span class="hljs-keyword">return</span> formatted_results

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Search failed: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_document_chunks</span>(<span class="hljs-params">self, doc_id: str</span>) -&gt; List[Dict[str, Any]]:</span>
        <span class="hljs-string">"""Retrieve all chunks for a specific document"""</span>
        <span class="hljs-keyword">try</span>:
            logger.info(<span class="hljs-string">f"Retrieving chunks for document: <span class="hljs-subst">{doc_id}</span>"</span>)

            <span class="hljs-comment"># Search with document filter</span>
            filter_condition = Filter(
                must=[
                    FieldCondition(
                        key=<span class="hljs-string">"parent_doc_id"</span>,
                        match=MatchValue(value=doc_id)
                    )
                ]
            )

            results = self.qdrant_client.scroll(
                collection_name=self.collection_name,
                scroll_filter=filter_condition,
                limit=<span class="hljs-number">100</span>,  <span class="hljs-comment"># Adjust based on expected chunks per document</span>
                with_payload=<span class="hljs-literal">True</span>,
                with_vectors=<span class="hljs-literal">False</span>
            )

            chunks = []
            <span class="hljs-keyword">for</span> point <span class="hljs-keyword">in</span> results[<span class="hljs-number">0</span>]:  <span class="hljs-comment"># results is a tuple (points, next_page_offset)</span>
                chunk_info = {
                    <span class="hljs-string">"chunk_id"</span>: point.payload.get(<span class="hljs-string">"chunk_id"</span>),
                    <span class="hljs-string">"content"</span>: point.payload.get(<span class="hljs-string">"content"</span>),
                    <span class="hljs-string">"chunk_type"</span>: point.payload.get(<span class="hljs-string">"chunk_type"</span>),
                    <span class="hljs-string">"chunk_index"</span>: point.payload.get(<span class="hljs-string">"chunk_index"</span>),
                    <span class="hljs-string">"metadata"</span>: {k: v <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> point.payload.items()
                                 <span class="hljs-keyword">if</span> k <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> [<span class="hljs-string">"chunk_id"</span>, <span class="hljs-string">"content"</span>, <span class="hljs-string">"chunk_type"</span>, <span class="hljs-string">"parent_doc_id"</span>, <span class="hljs-string">"chunk_index"</span>]}
                }
                chunks.append(chunk_info)

            <span class="hljs-comment"># Sort by chunk index</span>
            chunks.sort(key=<span class="hljs-keyword">lambda</span> x: x.get(<span class="hljs-string">"chunk_index"</span>, <span class="hljs-number">0</span>))

            logger.info(<span class="hljs-string">f"Retrieved <span class="hljs-subst">{len(chunks)}</span> chunks for document '<span class="hljs-subst">{doc_id}</span>'"</span>)
            <span class="hljs-keyword">return</span> chunks

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Failed to retrieve chunks for document '<span class="hljs-subst">{doc_id}</span>': <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_collection_stats</span>(<span class="hljs-params">self</span>) -&gt; Dict[str, Any]:</span>
        <span class="hljs-string">"""Get statistics about the collection"""</span>
        <span class="hljs-keyword">try</span>:
            collection_info = self.qdrant_client.get_collection(self.collection_name)

            <span class="hljs-comment"># Get chunk type distribution</span>
            chunk_types_result = self.qdrant_client.scroll(
                collection_name=self.collection_name,
                limit=<span class="hljs-number">1000</span>,  <span class="hljs-comment"># Adjust based on your collection size</span>
                with_payload=<span class="hljs-literal">True</span>,
                with_vectors=<span class="hljs-literal">False</span>
            )

            chunk_type_counts = {}
            document_counts = {}

            <span class="hljs-keyword">for</span> point <span class="hljs-keyword">in</span> chunk_types_result[<span class="hljs-number">0</span>]:
                chunk_type = point.payload.get(<span class="hljs-string">"chunk_type"</span>, <span class="hljs-string">"unknown"</span>)
                doc_id = point.payload.get(<span class="hljs-string">"parent_doc_id"</span>, <span class="hljs-string">"unknown"</span>)

                chunk_type_counts[chunk_type] = chunk_type_counts.get(chunk_type, <span class="hljs-number">0</span>) + <span class="hljs-number">1</span>
                document_counts[doc_id] = document_counts.get(doc_id, <span class="hljs-number">0</span>) + <span class="hljs-number">1</span>

            stats = {
                <span class="hljs-string">"collection_name"</span>: self.collection_name,
                <span class="hljs-string">"total_points"</span>: collection_info.points_count,
                <span class="hljs-string">"vector_size"</span>: collection_info.config.params.vectors.size,
                <span class="hljs-string">"distance_metric"</span>: collection_info.config.params.vectors.distance.value,
                <span class="hljs-string">"total_documents"</span>: len(document_counts),
                <span class="hljs-string">"chunk_type_distribution"</span>: chunk_type_counts,
                <span class="hljs-string">"avg_chunks_per_document"</span>: round(collection_info.points_count / len(document_counts),
                                                 <span class="hljs-number">2</span>) <span class="hljs-keyword">if</span> document_counts <span class="hljs-keyword">else</span> <span class="hljs-number">0</span>
            }

            <span class="hljs-keyword">return</span> stats

        <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
            logger.error(<span class="hljs-string">f"Failed to get collection stats: <span class="hljs-subst">{str(e)}</span>"</span>)
            <span class="hljs-keyword">raise</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">main</span>():</span>
    print(<span class="hljs-string">"=== Semantic search ===\n"</span>)

    <span class="hljs-keyword">try</span>:
        print(<span class="hljs-string">"1. Initializing services..."</span>)
        embedding_service = OpenAIEmbeddingService()

        <span class="hljs-keyword">from</span> qdrant_client <span class="hljs-keyword">import</span> QdrantClient
        qdrant_client = QdrantClient(url=<span class="hljs-string">"http://localhost:6333"</span>)

        collection = QdrantCollection(
            collection_name=<span class="hljs-string">"semantic_search_demo"</span>,
            qdrant_client=qdrant_client,
            embedding_service=embedding_service
        )
        print(<span class="hljs-string">"✅ Services initialized\n"</span>)

        print(<span class="hljs-string">"2. Adding documents to collection..."</span>)
        sample_docs = {
            <span class="hljs-string">"ai_overview"</span>: <span class="hljs-string">"""Artificial Intelligence: An Overview
            Artificial Intelligence (AI) represents one of the most transformative technologies of our time, fundamentally changing how we interact with machines and process information.
            AI encompasses machine learning, natural language processing, computer vision, and robotics. These technologies enable computers to perform tasks that typically require human intelligence.
            The applications are vast: from autonomous vehicles and medical diagnosis to financial trading and content recommendation systems.
            As AI continues to evolve, it presents both tremendous opportunities and significant challenges that society must carefully navigate."""</span>,

            <span class="hljs-string">"machine_learning"</span>: <span class="hljs-string">"""Machine Learning Fundamentals
            Machine learning is a subset of artificial intelligence that enables systems to automatically learn and improve from experience without being explicitly programmed.
            There are three primary types of machine learning: supervised learning uses labeled training data, unsupervised learning finds patterns in unlabeled data, and reinforcement learning learns through interaction with an environment.
            Common algorithms include linear regression, decision trees, neural networks, and support vector machines. Each has strengths for different types of problems.
            In conclusion, machine learning forms the backbone of modern AI applications and continues to drive innovation across industries."""</span>
        }

        <span class="hljs-keyword">for</span> doc_id, content <span class="hljs-keyword">in</span> sample_docs.items():
            result = collection.add_document(
                doc_id=doc_id,
                content=content,
                metadata={<span class="hljs-string">"topic"</span>: <span class="hljs-string">"artificial_intelligence"</span>, <span class="hljs-string">"language"</span>: <span class="hljs-string">"english"</span>}
            )
            print(<span class="hljs-string">f"   Added '<span class="hljs-subst">{doc_id}</span>': <span class="hljs-subst">{result[<span class="hljs-string">'chunks_created'</span>]}</span> chunks (<span class="hljs-subst">{<span class="hljs-string">', '</span>.join(result[<span class="hljs-string">'chunk_types'</span>])}</span>)"</span>)
        print()

        print(<span class="hljs-string">"3. Demonstrating search capabilities...\n"</span>)

        print(<span class="hljs-string">"🔍 General search for 'machine learning applications':"</span>)
        results = collection.search(<span class="hljs-string">"machine learning applications"</span>, limit=<span class="hljs-number">3</span>)
        <span class="hljs-keyword">for</span> i, result <span class="hljs-keyword">in</span> enumerate(results, <span class="hljs-number">1</span>):
            print(<span class="hljs-string">f"   <span class="hljs-subst">{i}</span>. [<span class="hljs-subst">{result[<span class="hljs-string">'chunk_type'</span>]}</span>] Score: <span class="hljs-subst">{result[<span class="hljs-string">'score'</span>]:<span class="hljs-number">.3</span>f}</span>"</span>)
            print(<span class="hljs-string">f"      From: <span class="hljs-subst">{result[<span class="hljs-string">'parent_doc_id'</span>]}</span>"</span>)
            print(<span class="hljs-string">f"      Content: <span class="hljs-subst">{result[<span class="hljs-string">'content'</span>][:<span class="hljs-number">80</span>]}</span>..."</span>)
        print()

        print(<span class="hljs-string">"🔍 Search only in summaries for 'artificial intelligence':"</span>)
        results = collection.search(<span class="hljs-string">"artificial intelligence"</span>, limit=<span class="hljs-number">2</span>, chunk_types=[<span class="hljs-string">'summary'</span>])
        <span class="hljs-keyword">for</span> i, result <span class="hljs-keyword">in</span> enumerate(results, <span class="hljs-number">1</span>):
            print(<span class="hljs-string">f"   <span class="hljs-subst">{i}</span>. [<span class="hljs-subst">{result[<span class="hljs-string">'chunk_type'</span>]}</span>] Score: <span class="hljs-subst">{result[<span class="hljs-string">'score'</span>]:<span class="hljs-number">.3</span>f}</span>"</span>)
            print(<span class="hljs-string">f"      Content: <span class="hljs-subst">{result[<span class="hljs-string">'content'</span>][:<span class="hljs-number">100</span>]}</span>..."</span>)
        print()

        print(<span class="hljs-string">"🔍 Search only in titles for 'machine learning':"</span>)
        results = collection.search(<span class="hljs-string">"machine learning"</span>, limit=<span class="hljs-number">2</span>, chunk_types=[<span class="hljs-string">'title'</span>])
        <span class="hljs-keyword">for</span> i, result <span class="hljs-keyword">in</span> enumerate(results, <span class="hljs-number">1</span>):
            print(<span class="hljs-string">f"   <span class="hljs-subst">{i}</span>. [<span class="hljs-subst">{result[<span class="hljs-string">'chunk_type'</span>]}</span>] Score: <span class="hljs-subst">{result[<span class="hljs-string">'score'</span>]:<span class="hljs-number">.3</span>f}</span>"</span>)
            print(<span class="hljs-string">f"      Content: <span class="hljs-subst">{result[<span class="hljs-string">'content'</span>]}</span>"</span>)
        print()

        print(<span class="hljs-string">"🔍 Search within specific document for 'algorithms':"</span>)
        results = collection.search(<span class="hljs-string">"algorithms"</span>, limit=<span class="hljs-number">3</span>, doc_id_filter=<span class="hljs-string">"machine_learning"</span>)
        <span class="hljs-keyword">for</span> i, result <span class="hljs-keyword">in</span> enumerate(results, <span class="hljs-number">1</span>):
            print(<span class="hljs-string">f"   <span class="hljs-subst">{i}</span>. [<span class="hljs-subst">{result[<span class="hljs-string">'chunk_type'</span>]}</span>] Score: <span class="hljs-subst">{result[<span class="hljs-string">'score'</span>]:<span class="hljs-number">.3</span>f}</span>"</span>)
            print(<span class="hljs-string">f"      Content: <span class="hljs-subst">{result[<span class="hljs-string">'content'</span>][:<span class="hljs-number">80</span>]}</span>..."</span>)
        print()

        print(<span class="hljs-string">"4. Document structure analysis...\n"</span>)
        <span class="hljs-keyword">for</span> doc_id <span class="hljs-keyword">in</span> sample_docs.keys():
            print(<span class="hljs-string">f"📄 Document: <span class="hljs-subst">{doc_id}</span>"</span>)
            chunks = collection.get_document_chunks(doc_id)
            <span class="hljs-keyword">for</span> chunk <span class="hljs-keyword">in</span> chunks:
                print(<span class="hljs-string">f"   <span class="hljs-subst">{chunk[<span class="hljs-string">'chunk_index'</span>]}</span>. [<span class="hljs-subst">{chunk[<span class="hljs-string">'chunk_type'</span>]}</span>] <span class="hljs-subst">{chunk[<span class="hljs-string">'content'</span>][:<span class="hljs-number">60</span>]}</span>..."</span>)
            print()

        print(<span class="hljs-string">"5. Collection statistics..."</span>)
        stats = collection.get_collection_stats()
        print(<span class="hljs-string">f"   Collection: <span class="hljs-subst">{stats[<span class="hljs-string">'collection_name'</span>]}</span>"</span>)
        print(<span class="hljs-string">f"   Total points: <span class="hljs-subst">{stats[<span class="hljs-string">'total_points'</span>]}</span>"</span>)
        print(<span class="hljs-string">f"   Total documents: <span class="hljs-subst">{stats[<span class="hljs-string">'total_documents'</span>]}</span>"</span>)
        print(<span class="hljs-string">f"   Avg chunks per document: <span class="hljs-subst">{stats[<span class="hljs-string">'avg_chunks_per_document'</span>]}</span>"</span>)
        print(<span class="hljs-string">f"   Chunk type distribution: <span class="hljs-subst">{stats[<span class="hljs-string">'chunk_type_distribution'</span>]}</span>"</span>)
        print()

        print(<span class="hljs-string">"✅ Semantic search demonstration completed successfully!"</span>)

    <span class="hljs-keyword">except</span> Exception <span class="hljs-keyword">as</span> e:
        logger.error(<span class="hljs-string">f"Demo failed: <span class="hljs-subst">{str(e)}</span>"</span>)
        print(<span class="hljs-string">f"❌ Demo failed: <span class="hljs-subst">{str(e)}</span>"</span>)


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    main()
</code></pre>
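<p>One caveat worth flagging: <code>scroll</code> returns a single page (the fixed <code>limit=100</code> and <code>limit=1000</code> calls above), so large collections would be truncated. A minimal pagination sketch that follows <code>next_page_offset</code> until exhaustion (same <code>scroll</code> signature as above; the helper name is my own):</p>

```python
# Sketch: drain every page from Qdrant's scroll API instead of relying on a
# single fixed-limit call. Assumes the qdrant-client scroll() signature used
# above, which returns a (points, next_page_offset) tuple.
def scroll_all(client, collection_name, scroll_filter=None, page_size=100):
    all_points = []
    offset = None
    while True:
        points, offset = client.scroll(
            collection_name=collection_name,
            scroll_filter=scroll_filter,
            limit=page_size,
            offset=offset,
            with_payload=True,
            with_vectors=False,
        )
        all_points.extend(points)
        if offset is None:  # no more pages left
            break
    return all_points
```

<p>Swapping this in for the bare <code>scroll</code> calls in <code>get_document_chunks</code> and <code>get_collection_stats</code> would make the chunk counts and statistics exact regardless of collection size.</p>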
]]></content:encoded></item><item><title><![CDATA[Kubernetes/Helm-charts commonly used commands]]></title><description><![CDATA[Uninstall Helm Release
helm uninstall <release> --namespace <namespace>

#example
helm uninstall qdrant --namespace qdrant

Delete PVCs (Persistent Volume Claims)
kubectl delete pvc --all -n <namespace>

#example
kubectl delete pvc --all -n qdrant

D...]]></description><link>https://blogs.ummerfarooq.dev/kuberneteshelm-charts-commonly-used-commands</link><guid isPermaLink="true">https://blogs.ummerfarooq.dev/kuberneteshelm-charts-commonly-used-commands</guid><category><![CDATA[helm chart]]></category><category><![CDATA[kubectl]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Helm]]></category><dc:creator><![CDATA[Ummer Farooq]]></dc:creator><pubDate>Tue, 05 Aug 2025 10:30:30 GMT</pubDate><content:encoded><![CDATA[<h3 id="heading-uninstall-helm-release">Uninstall Helm Release</h3>
<pre><code class="lang-bash">helm uninstall &lt;release&gt; --namespace &lt;namespace&gt;

<span class="hljs-comment">#example</span>
helm uninstall qdrant --namespace qdrant
</code></pre>
<h3 id="heading-delete-pvcs-persistent-volume-claims">Delete PVCs (Persistent Volume Claims)</h3>
<pre><code class="lang-bash">kubectl delete pvc --all -n &lt;namespace&gt;

<span class="hljs-comment">#example</span>
kubectl delete pvc --all -n qdrant
</code></pre>
<h3 id="heading-delete-namespace-optional-full-cleanup">Delete Namespace (Optional – full cleanup)</h3>
<pre><code class="lang-bash">kubectl delete namespace &lt;namespace&gt;

<span class="hljs-comment">#example</span>
kubectl delete namespace qdrant
</code></pre>
<h3 id="heading-check-pod-status">Check Pod Status</h3>
<pre><code class="lang-bash">kubectl get pods -n qdrant
</code></pre>
<h3 id="heading-check-events">Check Events</h3>
<pre><code class="lang-bash">kubectl get events -n qdrant --sort-by=.metadata.creationTimestamp
</code></pre>
<h3 id="heading-checking-nodeports">Checking NodePorts</h3>
<pre><code class="lang-bash">kubectl get svc qdrant -n qdrant-cluster
</code></pre>
<h3 id="heading-check-pod-endpoints">Check Pod Endpoints</h3>
<pre><code class="lang-bash">kubectl get endpoints -n qdrant-cluster qdrant
</code></pre>
]]></content:encoded></item><item><title><![CDATA[WSL Networking and Port Forwarding]]></title><description><![CDATA[Considering vLLM is running inside WSL, and we are trying to expose it to the Windows network.
Step 1: Confirm WSL Server Is Listening on 0.0.0.0
In your docker-compose or command, make sure vLLM is not bound to 127.0.0.1. It should be bound to all i...]]></description><link>https://blogs.ummerfarooq.dev/wsl-networking-and-port-forwarding</link><guid isPermaLink="true">https://blogs.ummerfarooq.dev/wsl-networking-and-port-forwarding</guid><category><![CDATA[WSL]]></category><dc:creator><![CDATA[Ummer Farooq]]></dc:creator><pubDate>Mon, 04 Aug 2025 08:18:46 GMT</pubDate><content:encoded><![CDATA[<p><strong>Considering vLLM is running inside WSL, and we are trying to expose it to the Windows network.</strong></p>
<h3 id="heading-step-1-confirm-wsl-server-is-listening-on-0000"><strong>Step 1: Confirm WSL Server Is Listening on 0.0.0.0</strong></h3>
<p>In your docker-compose or command, make sure vLLM is not bound to 127.0.0.1. It should be bound to all interfaces (0.0.0.0):</p>
<p><code>python3 -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --port 8000</code></p>
<p>Also, confirm with:</p>
<p><code>netstat -tuln | grep 8000</code></p>
<p>Should show:</p>
<p><code>tcp        0      0 0.0.0.0:8000      0.0.0.0:*       LISTEN</code></p>
<h3 id="heading-step-2-identify-the-wsl-ip"><strong>Step 2: Identify the WSL IP</strong></h3>
<p>Inside WSL, run:</p>
<p><code>ip addr | grep inet</code></p>
<p>You’ll see something like inet 172.22.x.x — that's the internal IP of your WSL instance. <strong>But this is not accessible from the outside.</strong></p>
<h3 id="heading-step-3-set-up-port-forwarding-from-windows-host-to-wsl2"><strong>Step 3: Set up Port Forwarding (From Windows Host to WSL2)</strong></h3>
<p>Since WSL2 is on a different virtual network, you need to <strong>forward ports</strong> from the Windows host to WSL.</p>
<p>Use <strong>PowerShell as Administrator</strong> on Windows and run:</p>
<p><code>netsh interface portproxy add v4tov4 listenaddress=0.0.0.0 listenport=8000 connectaddress=WSL-IP connectport=8000</code></p>
<p>Replace WSL-IP with the IP from Step 2. You can also use <code>localhost</code> if you're sure the service is bound to 0.0.0.0 in WSL.</p>

<p>Example:</p>
<p><code>netsh interface portproxy add v4tov4 listenaddress=0.0.0.0 listenport=8000 connectaddress=172.22.64.1 connectport=8000</code></p>
<p>Then enable the firewall rule:</p>
<p><code>netsh advfirewall firewall add rule name="vLLM Port 8000" dir=in action=allow protocol=TCP localport=8000</code></p>
<h3 id="heading-step-4-use-your-windows-machines-private-ip"><strong>Step 4: Use Your Windows Machine’s Private IP</strong></h3>
<p>From another machine on your LAN, access your vLLM server using:</p>
<p><code>http://&lt;your-windows-private-ip&gt;:8000</code></p>
<p>You can find this by running on Windows CMD:</p>
<p><code>ipconfig</code></p>
<p>Look for the IPv4 Address under your active network adapter (e.g., 192.168.x.x).</p>
<h3 id="heading-step-5-verify-from-another-device"><strong>Step 5: Verify from Another Device</strong></h3>
<p>Try this in your browser or with curl:</p>
<p><code>curl http://&lt;windows-ip&gt;:8000/v1/models</code></p>
<p>You should get a response from the vLLM server.</p>
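<p>If curl isn’t available on the other device, a small Python check works just as well (a sketch: the host and port below are placeholders for your Windows private IP and forwarded port):</p>

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (placeholder IP): port_open("192.168.1.50", 8000)
```

<p>A <code>False</code> here usually means either the portproxy rule or the firewall rule from Step 3 is missing.</p>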
<h2 id="heading-optional-clean-up-forwarding-rules"><strong>(Optional) Clean Up Forwarding Rules</strong></h2>
<p>To remove the proxy:</p>
<p><code>netsh interface portproxy delete v4tov4 listenport=8000 listenaddress=0.0.0.0</code></p>
<p><strong>⚠️ Notes</strong></p>
<ul>
<li><p>If you're using Docker inside WSL2, ensure Docker is binding to 0.0.0.0, or use Docker’s ports section in docker-compose.</p>
</li>
<li><p>Windows Defender Firewall can block incoming traffic. Ensure the port is allowed.</p>
</li>
<li><p>WSL2 still doesn’t support host bridging, so this proxying is a stable workaround.</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[QDrant Multi-node Cluster Deployment on AWS EC2 with Helm Charts]]></title><description><![CDATA[Prerequisites

AWS Account with appropriate permissions

Basic knowledge of Kubernetes and Helm

SSH key pair for EC2 access


Phase 1: AWS Infrastructure Setup
Step 1: Create VPC and Networking

Create VPC

Go to AWS Console → VPC → Create VPC

Name...]]></description><link>https://blogs.ummerfarooq.dev/qdrant-multi-node-cluster-deployment-on-aws-ec2-with-helm-charts</link><guid isPermaLink="true">https://blogs.ummerfarooq.dev/qdrant-multi-node-cluster-deployment-on-aws-ec2-with-helm-charts</guid><category><![CDATA[AWS]]></category><category><![CDATA[qdrant]]></category><category><![CDATA[distributed system]]></category><category><![CDATA[vector database]]></category><category><![CDATA[Multi-Node Cluster]]></category><dc:creator><![CDATA[Ummer Farooq]]></dc:creator><pubDate>Mon, 04 Aug 2025 07:06:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1754291134555/c2d5a770-86b2-4e99-b526-0465d6462b86.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li><p>AWS Account with appropriate permissions</p>
</li>
<li><p>Basic knowledge of Kubernetes and Helm</p>
</li>
<li><p>SSH key pair for EC2 access</p>
</li>
</ul>
<h2 id="heading-phase-1-aws-infrastructure-setup">Phase 1: AWS Infrastructure Setup</h2>
<h3 id="heading-step-1-create-vpc-and-networking">Step 1: Create VPC and Networking</h3>
<ol>
<li><p><strong>Create VPC</strong></p>
<ul>
<li><p>Go to AWS Console → VPC → Create VPC</p>
</li>
<li><p>Name: <code>qdrant-vpc</code></p>
</li>
<li><p>IPv4 CIDR: <code>10.0.0.0/16</code></p>
</li>
<li><p>Enable DNS hostnames and DNS resolution</p>
</li>
</ul>
</li>
<li><p><strong>Create Subnets</strong></p>
<ul>
<li><p>Create 3 private subnets in different AZs:</p>
<ul>
<li><p><code>qdrant-subnet-1a</code>: <code>10.0.1.0/24</code> (ap-south-1a)</p>
</li>
<li><p><code>qdrant-subnet-1b</code>: <code>10.0.2.0/24</code> (ap-south-1b)</p>
</li>
<li><p><code>qdrant-subnet-1c</code>: <code>10.0.3.0/24</code> (ap-south-1c)</p>
</li>
</ul>
</li>
<li><p>Create 1 public subnet for NAT Gateway:</p>
<ul>
<li><code>qdrant-public-subnet</code>: <code>10.0.100.0/24</code> (ap-south-1a)</li>
</ul>
</li>
</ul>
</li>
<li><p><strong>Create Internet Gateway</strong></p>
<ul>
<li><p>Name: <code>qdrant-igw</code></p>
</li>
<li><p>Attach to <code>qdrant-vpc</code></p>
</li>
</ul>
</li>
<li><p><strong>Create NAT Gateway</strong></p>
<ul>
<li><p>Place in <code>qdrant-public-subnet</code></p>
</li>
<li><p>Allocate Elastic IP</p>
</li>
</ul>
</li>
<li><p><strong>Configure Route Tables</strong></p>
<ul>
<li><p>Public Route Table:</p>
<ul>
<li>Route: <code>0.0.0.0/0</code> → Internet Gateway</li>
</ul>
</li>
<li><p>Private Route Table:</p>
<ul>
<li><p>Route: <code>0.0.0.0/0</code> → NAT Gateway</p>
</li>
<li><p>Associate with all private subnets</p>
</li>
</ul>
</li>
</ul>
</li>
</ol>
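<p>Before creating the subnets, it’s worth a quick offline sanity check that all four CIDRs sit inside the VPC range and don’t overlap each other; Python’s standard <code>ipaddress</code> module is enough:</p>

```python
import ipaddress

# The VPC and subnet CIDRs from Step 1
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = [ipaddress.ip_network(c) for c in
           ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24", "10.0.100.0/24"]]

# Every subnet must fall inside the VPC range...
assert all(s.subnet_of(vpc) for s in subnets)
# ...and no two subnets may overlap
assert not any(a.overlaps(b) for i, a in enumerate(subnets)
               for b in subnets[i + 1:])
print("CIDR layout OK")
```

<p>The same check catches typos early if you later change the CIDR plan, before AWS rejects the subnet creation.</p>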
<h3 id="heading-step-2-security-groups">Step 2: Security Groups</h3>
<ol>
<li><p><strong>Create Security Group:</strong> <code>qdrant-cluster-sg</code></p>
<ul>
<li><p>VPC: <code>qdrant-vpc</code></p>
</li>
<li><p>Inbound Rules:</p>
<ul>
<li><p>SSH: Port 22 (Source: Your IP)</p>
</li>
<li><p>Kubernetes API: Port 6443 (Source: Security Group itself)</p>
</li>
<li><p>QDrant HTTP: Port 6333 (Source: Security Group itself)</p>
</li>
<li><p>QDrant gRPC: Port 6334 (Source: Security Group itself)</p>
</li>
<li><p>Etcd: Ports 2379-2380 (Source: Security Group itself)</p>
</li>
<li><p>Kubelet: Port 10250 (Source: Security Group itself)</p>
</li>
<li><p>NodePort Range: Ports 30000-32767 (Source: Security Group itself)</p>
</li>
<li><p>All Traffic: All ports (Source: Security Group itself)</p>
</li>
</ul>
</li>
<li><p>Outbound Rules: All traffic to 0.0.0.0/0</p>
</li>
</ul>
</li>
</ol>
<h3 id="heading-step-3-iam-roles-and-policies">Step 3: IAM Roles and Policies</h3>
<ol>
<li><p><strong>Create IAM Role:</strong> <code>qdrant-node-role</code></p>
<ul>
<li><p>Trusted entity: EC2</p>
</li>
<li><p>Attach policies:</p>
<ul>
<li><p><code>AmazonEC2FullAccess</code></p>
</li>
<li><p><code>AmazonEBSCSIDriverPolicy</code></p>
</li>
</ul>
</li>
<li><p>Create custom policy <code>QDrantEBSPolicy</code>:</p>
</li>
</ul>
</li>
</ol>
<pre><code class="lang-json">{
    <span class="hljs-attr">"Version"</span>: <span class="hljs-string">"2012-10-17"</span>,
    <span class="hljs-attr">"Statement"</span>: [
        {
            <span class="hljs-attr">"Effect"</span>: <span class="hljs-string">"Allow"</span>,
            <span class="hljs-attr">"Action"</span>: [
                <span class="hljs-string">"ec2:AttachVolume"</span>,
                <span class="hljs-string">"ec2:DetachVolume"</span>,
                <span class="hljs-string">"ec2:DescribeVolumes"</span>,
                <span class="hljs-string">"ec2:DescribeInstances"</span>,
                <span class="hljs-string">"ec2:CreateVolume"</span>,
                <span class="hljs-string">"ec2:DeleteVolume"</span>,
                <span class="hljs-string">"ec2:CreateSnapshot"</span>,
                <span class="hljs-string">"ec2:DeleteSnapshot"</span>,
                <span class="hljs-string">"ec2:DescribeSnapshots"</span>,
                <span class="hljs-string">"ec2:CreateTags"</span>
            ],
            <span class="hljs-attr">"Resource"</span>: <span class="hljs-string">"*"</span>
        }
    ]
}
</code></pre>
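<p>A malformed JSON body is the most common reason the console rejects a custom policy, so it can save a round trip to lint it locally first (a sketch with a trimmed copy of the statement above):</p>

```python
import json

# Lint the custom policy locally before pasting it into the console.
# The Action list here is trimmed for brevity; use the full list from above.
policy_text = """
{
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow",
         "Action": ["ec2:AttachVolume", "ec2:DetachVolume", "ec2:DescribeVolumes"],
         "Resource": "*"}
    ]
}
"""
policy = json.loads(policy_text)  # raises ValueError on malformed JSON
assert policy["Version"] == "2012-10-17"
print(f"{len(policy['Statement'][0]['Action'])} actions allowed")
```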
<ol start="2">
<li><p><strong>Create Instance Profile</strong></p>
<ul>
<li><p>Name: <code>qdrant-instance-profile</code></p>
</li>
<li><p>Add role: <code>qdrant-node-role</code></p>
</li>
</ul>
</li>
</ol>
<h2 id="heading-phase-2-ec2-instances-setup">Phase 2: EC2 Instances Setup</h2>
<h3 id="heading-step-4-launch-ec2-instances">Step 4: Launch EC2 Instances</h3>
<p>Launch 3 EC2 instances with the following specifications:</p>
<p><strong>Instance Configuration:</strong></p>
<ul>
<li><p>AMI: Ubuntu 22.04 LTS</p>
</li>
<li><p>Instance Type: <code>t3.medium</code> (minimum) or <code>t3.large</code> (recommended)</p>
</li>
<li><p>Key Pair: Your SSH key</p>
</li>
<li><p>VPC: <code>qdrant-vpc</code></p>
</li>
<li><p>Subnets: Place each instance in different subnets</p>
</li>
<li><p>Security Group: <code>qdrant-cluster-sg</code></p>
</li>
<li><p>IAM Role: <code>qdrant-instance-profile</code></p>
</li>
<li><p>Storage: 20GB gp3 root volume + 50GB gp3 data volume for each instance</p>
</li>
</ul>
<p><strong>Instance Names:</strong></p>
<ul>
<li><p><code>qdrant-master-1</code> (in qdrant-subnet-1a)</p>
</li>
<li><p><code>qdrant-worker-1</code> (in qdrant-subnet-1b)</p>
</li>
<li><p><code>qdrant-worker-2</code> (in qdrant-subnet-1c)</p>
</li>
</ul>
<h3 id="heading-step-5-create-additional-ebs-volumes">Step 5: Create Additional EBS Volumes</h3>
<p>For each instance, create additional EBS volumes for persistent storage:</p>
<ol>
<li><p>Go to EC2 → Volumes → Create Volume</p>
</li>
<li><p>Create 3 volumes (one per instance):</p>
<ul>
<li><p>Volume Type: gp3</p>
</li>
<li><p>Size: 50GB each</p>
</li>
<li><p>Availability Zone: Match instance AZ</p>
</li>
<li><p>Tags: Name = <code>qdrant-data-volume-{1,2,3}</code></p>
</li>
</ul>
</li>
<li><p>Attach each volume to corresponding instance</p>
</li>
</ol>
<h2 id="heading-phase-3-kubernetes-cluster-setup">Phase 3: Kubernetes Cluster Setup</h2>
<h3 id="heading-step-6-install-prerequisites-on-all-nodes">Step 6: Install Prerequisites on All Nodes</h3>
<p>SSH into each instance and run:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>
<span class="hljs-comment"># Update system</span>
sudo apt update &amp;&amp; sudo apt upgrade -y

<span class="hljs-comment"># Install Docker</span>
sudo apt install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
<span class="hljs-built_in">echo</span> <span class="hljs-string">"deb [arch=<span class="hljs-subst">$(dpkg --print-architecture)</span> signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu <span class="hljs-subst">$(lsb_release -cs)</span> stable"</span> | sudo tee /etc/apt/sources.list.d/docker.list &gt; /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io

<span class="hljs-comment"># Configure Docker</span>
sudo usermod -aG docker <span class="hljs-variable">$USER</span>
sudo systemctl <span class="hljs-built_in">enable</span> docker
sudo systemctl start docker

<span class="hljs-comment"># Install kubeadm, kubelet, kubectl</span>
<span class="hljs-comment"># Note: the legacy apt.kubernetes.io repository is deprecated; use pkgs.k8s.io</span>
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
<span class="hljs-built_in">echo</span> <span class="hljs-string">"deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /"</span> | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

<span class="hljs-comment"># Configure containerd</span>
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i <span class="hljs-string">'s/SystemdCgroup = false/SystemdCgroup = true/'</span> /etc/containerd/config.toml
sudo systemctl restart containerd

<span class="hljs-comment"># Disable swap</span>
sudo swapoff -a
sudo sed -i <span class="hljs-string">'/ swap / s/^\(.*\)$/#\1/g'</span> /etc/fstab

<span class="hljs-comment"># Load kernel modules</span>
sudo modprobe br_netfilter
<span class="hljs-built_in">echo</span> <span class="hljs-string">'br_netfilter'</span> | sudo tee /etc/modules-load.d/k8s.conf

<span class="hljs-comment"># Configure sysctl</span>
sudo tee /etc/sysctl.d/k8s.conf &lt;&lt;EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
</code></pre>
<h3 id="heading-step-7-initialize-master-node">Step 7: Initialize Master Node</h3>
<p>On the master node (<code>qdrant-master-1</code>):</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Initialize cluster</span>
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=&lt;MASTER_PRIVATE_IP&gt;

<span class="hljs-comment"># Configure kubectl</span>
mkdir -p <span class="hljs-variable">$HOME</span>/.kube
sudo cp -i /etc/kubernetes/admin.conf <span class="hljs-variable">$HOME</span>/.kube/config
sudo chown $(id -u):$(id -g) <span class="hljs-variable">$HOME</span>/.kube/config

<span class="hljs-comment"># Install Calico CNI</span>
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml

<span class="hljs-comment"># Generate join command (save this output)</span>
kubeadm token create --print-join-command
</code></pre>
<h3 id="heading-step-8-join-worker-nodes">Step 8: Join Worker Nodes</h3>
<p>On both worker nodes, run the join command generated in the previous step:</p>
<pre><code class="lang-bash">sudo kubeadm join &lt;MASTER_IP&gt;:6443 --token &lt;TOKEN&gt; --discovery-token-ca-cert-hash &lt;HASH&gt;
</code></pre>
<h3 id="heading-step-9-verify-cluster">Step 9: Verify Cluster</h3>
<p>On master node:</p>
<pre><code class="lang-bash">kubectl get nodes
kubectl get pods -A
</code></pre>
<h2 id="heading-phase-4-storage-setup">Phase 4: Storage Setup</h2>
<h3 id="heading-step-10-install-ebs-csi-driver">Step 10: Install EBS CSI Driver</h3>
<p>The driver authenticates to AWS through the instance IAM role, so the worker nodes' role must allow EBS volume operations (see Troubleshooting if volumes fail to provision).</p>
<pre><code class="lang-bash"><span class="hljs-comment"># Install EBS CSI Driver</span>
kubectl apply -k <span class="hljs-string">"github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.23"</span>

<span class="hljs-comment"># Verify installation</span>
kubectl get pods -n kube-system | grep ebs-csi
</code></pre>
<h3 id="heading-step-11-create-storage-class">Step 11: Create Storage Class</h3>
<p>Create <code>ebs-storageclass.yaml</code>:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">storage.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">StorageClass</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">ebs-gp3</span>
<span class="hljs-attr">provisioner:</span> <span class="hljs-string">ebs.csi.aws.com</span>
<span class="hljs-attr">parameters:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">gp3</span>
  <span class="hljs-attr">iops:</span> <span class="hljs-string">"3000"</span>
  <span class="hljs-attr">throughput:</span> <span class="hljs-string">"125"</span>
  <span class="hljs-attr">encrypted:</span> <span class="hljs-string">"true"</span>
<span class="hljs-attr">volumeBindingMode:</span> <span class="hljs-string">WaitForFirstConsumer</span>
<span class="hljs-attr">allowVolumeExpansion:</span> <span class="hljs-literal">true</span>
<span class="hljs-attr">reclaimPolicy:</span> <span class="hljs-string">Retain</span>
</code></pre>
<p>Apply the storage class:</p>
<pre><code class="lang-bash">kubectl apply -f ebs-storageclass.yaml
</code></pre>
<h2 id="heading-phase-5-helm-and-qdrant-deployment">Phase 5: Helm and Qdrant Deployment</h2>
<h3 id="heading-step-12-install-helm">Step 12: Install Helm</h3>
<p>On master node:</p>
<pre><code class="lang-bash">curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg &gt; /dev/null
<span class="hljs-built_in">echo</span> <span class="hljs-string">"deb [arch=<span class="hljs-subst">$(dpkg --print-architecture)</span> signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main"</span> | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt update
sudo apt install helm
</code></pre>
<h3 id="heading-step-13-add-qdrant-helm-repository">Step 13: Add the Qdrant Helm Repository</h3>
<pre><code class="lang-bash">helm repo add qdrant https://qdrant.github.io/qdrant-helm
helm repo update
</code></pre>
<h3 id="heading-step-14-create-qdrant-values-file">Step 14: Create the Qdrant Values File</h3>
<p>Create <code>qdrant-values.yaml</code>:</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># QDrant Cluster Configuration</span>
<span class="hljs-attr">replicaCount:</span> <span class="hljs-number">3</span>

<span class="hljs-attr">image:</span>
  <span class="hljs-attr">repository:</span> <span class="hljs-string">qdrant/qdrant</span>
  <span class="hljs-attr">tag:</span> <span class="hljs-string">"v1.7.4"</span>
  <span class="hljs-attr">pullPolicy:</span> <span class="hljs-string">IfNotPresent</span>

<span class="hljs-comment"># Service configuration</span>
<span class="hljs-attr">service:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">NodePort</span>
  <span class="hljs-attr">httpPort:</span> <span class="hljs-number">6333</span>
  <span class="hljs-attr">grpcPort:</span> <span class="hljs-number">6334</span>
  <span class="hljs-attr">httpNodePort:</span> <span class="hljs-number">30333</span>
  <span class="hljs-attr">grpcNodePort:</span> <span class="hljs-number">30334</span>

<span class="hljs-comment"># Persistent storage</span>
<span class="hljs-attr">persistence:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">storageClass:</span> <span class="hljs-string">"ebs-gp3"</span>
  <span class="hljs-attr">size:</span> <span class="hljs-string">50Gi</span>
  <span class="hljs-attr">accessMode:</span> <span class="hljs-string">ReadWriteOnce</span>

<span class="hljs-comment"># Resource limits</span>
<span class="hljs-attr">resources:</span>
  <span class="hljs-attr">limits:</span>
    <span class="hljs-attr">cpu:</span> <span class="hljs-string">1000m</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">2Gi</span>
  <span class="hljs-attr">requests:</span>
    <span class="hljs-attr">cpu:</span> <span class="hljs-string">500m</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">1Gi</span>

<span class="hljs-comment"># Pod disruption budget</span>
<span class="hljs-attr">podDisruptionBudget:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">minAvailable:</span> <span class="hljs-number">2</span>

<span class="hljs-comment"># Anti-affinity to spread pods across nodes</span>
<span class="hljs-attr">affinity:</span>
  <span class="hljs-attr">podAntiAffinity:</span>
    <span class="hljs-attr">preferredDuringSchedulingIgnoredDuringExecution:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">weight:</span> <span class="hljs-number">100</span>
      <span class="hljs-attr">podAffinityTerm:</span>
        <span class="hljs-attr">labelSelector:</span>
          <span class="hljs-attr">matchExpressions:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">key:</span> <span class="hljs-string">app.kubernetes.io/name</span>
            <span class="hljs-attr">operator:</span> <span class="hljs-string">In</span>
            <span class="hljs-attr">values:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-string">qdrant</span>
        <span class="hljs-attr">topologyKey:</span> <span class="hljs-string">kubernetes.io/hostname</span>

<span class="hljs-comment"># QDrant specific configuration</span>
<span class="hljs-attr">config:</span>
  <span class="hljs-attr">cluster:</span>
    <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">p2p:</span>
      <span class="hljs-attr">port:</span> <span class="hljs-number">6335</span>
  <span class="hljs-attr">service:</span>
    <span class="hljs-comment"># Replace these placeholders with strong secrets; in production, prefer injecting keys from a Kubernetes Secret rather than committing them to the values file</span>
    <span class="hljs-attr">api_key:</span> <span class="hljs-string">"your_secret_master_api_key_here"</span>
    <span class="hljs-attr">read_only_api_key:</span> <span class="hljs-string">"your_secret_read_only_api_key_here"</span>
    <span class="hljs-attr">http_port:</span> <span class="hljs-number">6333</span>
    <span class="hljs-attr">grpc_port:</span> <span class="hljs-number">6334</span>
  <span class="hljs-attr">storage:</span>
    <span class="hljs-attr">storage_path:</span> <span class="hljs-string">"/qdrant/storage"</span>
    <span class="hljs-attr">snapshots_path:</span> <span class="hljs-string">"/qdrant/snapshots"</span>
    <span class="hljs-attr">on_disk_payload:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">log_level:</span> <span class="hljs-string">"INFO"</span>

<span class="hljs-comment"># Environment variables for clustering</span>
<span class="hljs-attr">env:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">QDRANT__CLUSTER__ENABLED</span>
    <span class="hljs-attr">value:</span> <span class="hljs-string">"true"</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">QDRANT__CLUSTER__P2P__PORT</span>
    <span class="hljs-attr">value:</span> <span class="hljs-string">"6335"</span>

<span class="hljs-comment"># Security context</span>
<span class="hljs-attr">securityContext:</span>
  <span class="hljs-attr">runAsNonRoot:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">runAsUser:</span> <span class="hljs-number">1000</span>
  <span class="hljs-attr">fsGroup:</span> <span class="hljs-number">1000</span>

<span class="hljs-comment"># Node selector to ensure pods are scheduled on our nodes</span>
<span class="hljs-attr">nodeSelector:</span> {}

<span class="hljs-comment"># Tolerations</span>
<span class="hljs-attr">tolerations:</span> []
</code></pre>
<h3 id="heading-step-15-deploy-qdrant-cluster">Step 15: Deploy the Qdrant Cluster</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Create namespace</span>
kubectl create namespace qdrant

<span class="hljs-comment"># Deploy QDrant</span>
helm install qdrant qdrant/qdrant \
  --namespace qdrant \
  --values qdrant-values.yaml \
  --<span class="hljs-built_in">wait</span>

<span class="hljs-comment"># Verify deployment</span>
kubectl get pods -n qdrant
kubectl get pvc -n qdrant
kubectl get svc -n qdrant
</code></pre>
<h3 id="heading-step-16-create-load-balancer-service-optional">Step 16: Create Load Balancer Service (Optional)</h3>
<p>For external access, create <code>qdrant-lb.yaml</code>. Note that on a self-managed kubeadm cluster, a <code>LoadBalancer</code> service only receives an external address if the AWS cloud controller manager (or AWS Load Balancer Controller) is installed; otherwise it stays in <code>Pending</code> and you should use the NodePort service from the Helm values instead.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">qdrant-loadbalancer</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">qdrant</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">LoadBalancer</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app.kubernetes.io/name:</span> <span class="hljs-string">qdrant</span>
  <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">http</span>
      <span class="hljs-attr">port:</span> <span class="hljs-number">6333</span>
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">6333</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">grpc</span>
      <span class="hljs-attr">port:</span> <span class="hljs-number">6334</span>
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">6334</span>
</code></pre>
<p>Apply the load balancer:</p>
<pre><code class="lang-bash">kubectl apply -f qdrant-lb.yaml
</code></pre>
<h2 id="heading-phase-6-verification-and-testing">Phase 6: Verification and Testing</h2>
<h3 id="heading-step-17-verify-cluster-status">Step 17: Verify Cluster Status</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Check pods</span>
kubectl get pods -n qdrant -o wide

<span class="hljs-comment"># Check persistent volumes</span>
kubectl get pv
kubectl get pvc -n qdrant

<span class="hljs-comment"># Check services</span>
kubectl get svc -n qdrant

<span class="hljs-comment"># Check logs</span>
kubectl logs -n qdrant -l app.kubernetes.io/name=qdrant

<span class="hljs-comment"># Port forward for testing (run in background)</span>
kubectl port-forward -n qdrant svc/qdrant 6333:6333 &amp;
</code></pre>
<h3 id="heading-step-18-test-qdrant-api">Step 18: Test the Qdrant API</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># The values file configures an API key, so every request must send it</span>
API_KEY=<span class="hljs-string">"your_secret_master_api_key_here"</span>

<span class="hljs-comment"># Test cluster info</span>
curl -X GET <span class="hljs-string">"http://localhost:6333/cluster"</span> -H <span class="hljs-string">"api-key: <span class="hljs-variable">$API_KEY</span>"</span>

<span class="hljs-comment"># List collections</span>
curl -X GET <span class="hljs-string">"http://localhost:6333/collections"</span> -H <span class="hljs-string">"api-key: <span class="hljs-variable">$API_KEY</span>"</span>

<span class="hljs-comment"># Create a test collection</span>
curl -X PUT <span class="hljs-string">"http://localhost:6333/collections/test_collection"</span> \
  -H <span class="hljs-string">"api-key: <span class="hljs-variable">$API_KEY</span>"</span> \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -d <span class="hljs-string">'{
    "vectors": {
      "size": 100,
      "distance": "Cosine"
    }
  }'</span>
</code></pre>
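<p>Beyond creating a collection, an insert-and-search round trip confirms the cluster actually serves queries. The point ID, payload, and 100-dimensional dummy vector below are arbitrary examples sized to match the collection, and the header assumes the API key from the values file:</p>
<pre><code class="lang-bash">API_KEY="your_secret_master_api_key_here"

# Build a 100-dimensional dummy vector matching the collection's vector size
VEC=$(python3 -c "print([0.1]*100)")

# Upsert one test point
curl -X PUT "http://localhost:6333/collections/test_collection/points" \
  -H "api-key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"points\": [{\"id\": 1, \"vector\": $VEC, \"payload\": {\"tag\": \"demo\"}}]}"

# Search for its nearest neighbour; the stored point should come back first
curl -X POST "http://localhost:6333/collections/test_collection/points/search" \
  -H "api-key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"vector\": $VEC, \"limit\": 1}"
</code></pre>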
<h2 id="heading-phase-7-monitoring-and-maintenance">Phase 7: Monitoring and Maintenance</h2>
<h3 id="heading-step-19-set-up-basic-monitoring">Step 19: Set Up Basic Monitoring</h3>
<p>Create <code>monitoring-values.yaml</code> for Prometheus (optional):</p>
<pre><code class="lang-yaml"><span class="hljs-attr">prometheus:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">serviceMonitor:</span>
    <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">namespace:</span> <span class="hljs-string">qdrant</span>
</code></pre>
<h3 id="heading-step-20-backup-strategy">Step 20: Backup Strategy</h3>
<p>Create backup script <code>backup-qdrant.sh</code>:</p>
<pre><code class="lang-bash"><span class="hljs-meta">#!/bin/bash</span>
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
API_KEY=<span class="hljs-string">"your_secret_master_api_key_here"</span>

<span class="hljs-comment"># Create a full snapshot on each pod via the API (the api-key header is required</span>
<span class="hljs-comment"># because the cluster is configured with one)</span>
<span class="hljs-keyword">for</span> pod <span class="hljs-keyword">in</span> $(kubectl get pods -n qdrant -l app.kubernetes.io/name=qdrant -o jsonpath=<span class="hljs-string">'{.items[*].metadata.name}'</span>); <span class="hljs-keyword">do</span>
  kubectl <span class="hljs-built_in">exec</span> -n qdrant <span class="hljs-variable">$pod</span> -- curl -s -X POST -H <span class="hljs-string">"api-key: <span class="hljs-variable">$API_KEY</span>"</span> <span class="hljs-string">"http://localhost:6333/snapshots"</span>
<span class="hljs-keyword">done</span>

<span class="hljs-comment"># Archive the snapshots from the first pod's persistent volume and copy them locally</span>
kubectl <span class="hljs-built_in">exec</span> -n qdrant qdrant-0 -- tar -czf /tmp/qdrant-backup-<span class="hljs-variable">$TIMESTAMP</span>.tar.gz /qdrant/snapshots
kubectl cp qdrant/qdrant-0:/tmp/qdrant-backup-<span class="hljs-variable">$TIMESTAMP</span>.tar.gz ./qdrant-backup-<span class="hljs-variable">$TIMESTAMP</span>.tar.gz
</code></pre>
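<p>A matching restore is sketched below. It assumes a collection-level snapshot (created with <code>POST /collections/&lt;name&gt;/snapshots</code>) rather than the full-storage snapshot above, since collection snapshots can be recovered through the API without restarting the pod; the archive and snapshot names are placeholders:</p>
<pre><code class="lang-bash"># Copy the archive back into the pod and unpack it over the snapshots directory
kubectl cp ./qdrant-backup-&lt;TIMESTAMP&gt;.tar.gz qdrant/qdrant-0:/tmp/
kubectl exec -n qdrant qdrant-0 -- tar -xzf /tmp/qdrant-backup-&lt;TIMESTAMP&gt;.tar.gz -C /

# Recover a collection from a snapshot file on the pod's local disk
kubectl exec -n qdrant qdrant-0 -- curl -X PUT \
  -H "api-key: your_secret_master_api_key_here" \
  -H "Content-Type: application/json" \
  "http://localhost:6333/collections/test_collection/snapshots/recover" \
  -d '{"location": "file:///qdrant/snapshots/test_collection/&lt;SNAPSHOT_NAME&gt;.snapshot"}'
</code></pre>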
<h2 id="heading-troubleshooting">Troubleshooting</h2>
<h3 id="heading-common-issues-and-solutions">Common Issues and Solutions</h3>
<ol>
<li><p><strong>Pods stuck in Pending state:</strong></p>
<ul>
<li><p>Check node resources: <code>kubectl describe nodes</code></p>
</li>
<li><p>Check PVC status: <code>kubectl get pvc -n qdrant</code></p>
</li>
<li><p>Verify EBS CSI driver: <code>kubectl get pods -n kube-system | grep ebs-csi</code></p>
</li>
</ul>
</li>
<li><p><strong>Storage issues:</strong></p>
<ul>
<li><p>Verify IAM permissions for EBS operations</p>
</li>
<li><p>Check storage class: <code>kubectl get storageclass</code></p>
</li>
<li><p>Review EBS volume attachments in AWS Console</p>
</li>
</ul>
</li>
<li><p><strong>Network connectivity issues:</strong></p>
<ul>
<li><p>Verify security group rules</p>
</li>
<li><p>Check Calico pod status: <code>kubectl get pods -n kube-system | grep calico</code></p>
</li>
<li><p>Test pod-to-pod connectivity</p>
</li>
</ul>
</li>
<li><p><strong>Qdrant cluster formation issues:</strong></p>
<ul>
<li><p>Check cluster configuration in pod logs</p>
</li>
<li><p>Verify p2p port accessibility between pods</p>
</li>
<li><p>Query the Qdrant cluster status endpoint (<code>GET /cluster</code>) on each pod</p>
</li>
</ul>
</li>
</ol>
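<p>For the network checks above, a throwaway pod makes pod-to-pod connectivity easy to probe; the pod name and ports below are examples matching this deployment:</p>
<pre><code class="lang-bash"># IP of the first Qdrant pod
POD_IP=$(kubectl get pod -n qdrant qdrant-0 -o jsonpath='{.status.podIP}')

# Probe the HTTP, gRPC, and p2p ports from a temporary busybox pod
kubectl run netcheck --rm -it --image=busybox --restart=Never -- \
  sh -c "nc -zv $POD_IP 6333 &amp;&amp; nc -zv $POD_IP 6334 &amp;&amp; nc -zv $POD_IP 6335"
</code></pre>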
<h3 id="heading-maintenance-commands">Maintenance Commands</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Scale cluster</span>
helm upgrade qdrant qdrant/qdrant --namespace qdrant --<span class="hljs-built_in">set</span> replicaCount=5 --values qdrant-values.yaml

<span class="hljs-comment"># Update QDrant version</span>
helm upgrade qdrant qdrant/qdrant --namespace qdrant --<span class="hljs-built_in">set</span> image.tag=v1.8.0 --values qdrant-values.yaml

<span class="hljs-comment"># Run the backup script from Step 20</span>
./backup-qdrant.sh
</code></pre>
<h2 id="heading-security-considerations">Security Considerations</h2>
<ol>
<li><p><strong>Network Security:</strong></p>
<ul>
<li><p>Use private subnets for all worker nodes</p>
</li>
<li><p>Restrict security group access to minimum required ports</p>
</li>
<li><p>Consider using AWS PrivateLink for internal communication</p>
</li>
</ul>
</li>
<li><p><strong>Storage Security:</strong></p>
<ul>
<li><p>Enable EBS encryption</p>
</li>
<li><p>Use IAM roles with least privilege</p>
</li>
<li><p>Regular backup testing and restoration procedures</p>
</li>
</ul>
</li>
<li><p><strong>Access Control:</strong></p>
<ul>
<li><p>Implement RBAC in Kubernetes</p>
</li>
<li><p>Use network policies to restrict pod communication</p>
</li>
<li><p>Enable audit logging</p>
</li>
</ul>
</li>
</ol>
<p>This deployment provides a production-ready Qdrant cluster with high availability, persistent storage, and proper AWS integration.</p>
]]></content:encoded></item><item><title><![CDATA[Step-by-Step Guide to Linking an SSH Key with Your GitHub Repositories]]></title><description><![CDATA[1. Generate a New SSH Key
Run the following command to generate a new SSH key specifically for GitHub:
ssh-keygen -t ed25519 -C "your_email@example.com" -f ~/.ssh/id_github

Explanation:

-t ed25519: Specifies the key type.

-C "your_email@example.co...]]></description><link>https://blogs.ummerfarooq.dev/step-by-step-guide-to-linking-an-ssh-key-with-your-github-repositories</link><guid isPermaLink="true">https://blogs.ummerfarooq.dev/step-by-step-guide-to-linking-an-ssh-key-with-your-github-repositories</guid><category><![CDATA[GitHub]]></category><category><![CDATA[ssh-keys]]></category><category><![CDATA[ssh-keygen]]></category><dc:creator><![CDATA[Ummer Farooq]]></dc:creator><pubDate>Thu, 19 Dec 2024 08:42:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1734597564045/7aea6b14-f69f-40b3-86fb-b2cc5023d9f0.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-1-generate-a-new-ssh-key">1. <strong>Generate a New SSH Key</strong></h3>
<p>Run the following command to generate a new SSH key specifically for GitHub:</p>
<pre><code class="lang-bash">ssh-keygen -t ed25519 -C <span class="hljs-string">"your_email@example.com"</span> -f ~/.ssh/id_github
</code></pre>
<p>Explanation:</p>
<ul>
<li><p><code>-t ed25519</code>: Specifies the key type.</p>
</li>
<li><p><code>-C "your_email@example.com"</code>: Adds a comment to the key (typically your email).</p>
</li>
<li><p><code>-f ~/.ssh/id_github</code>: Specifies the file name for the new key.</p>
</li>
</ul>
<p>Because <code>-f</code> already sets the file location, the only prompt during generation is for a passphrase (optional, but recommended for added security).</p>
<h3 id="heading-2-verify-the-key-files">2. <strong>Verify the Key Files</strong></h3>
<p>Once the SSH key is created, confirm the existence of the key files using:</p>
<pre><code class="lang-plaintext">ls -l ~/.ssh/id_github*
</code></pre>
<p>You should see:</p>
<ul>
<li><p><strong>~/.ssh/id_github</strong>: The private key (keep this secure and never share it).</p>
</li>
<li><p><strong>~/.ssh/id_github.pub</strong>: The public key (this will be added to GitHub).</p>
</li>
</ul>
<h3 id="heading-3-add-the-key-to-your-ssh-agent">3. <strong>Add the Key to Your SSH Agent</strong></h3>
<p>Add the private key to your SSH agent to manage it easily:</p>
<ol>
<li><p>Start the SSH agent:</p>
<pre><code class="lang-bash"> <span class="hljs-built_in">eval</span> <span class="hljs-string">"<span class="hljs-subst">$(ssh-agent -s)</span>"</span>
</code></pre>
</li>
<li><p>Add the key to the agent:</p>
<pre><code class="lang-bash"> ssh-add ~/.ssh/id_github
</code></pre>
</li>
</ol>
<h3 id="heading-4-copy-the-public-key-to-your-clipboard">4. Copy the Public Key to Your Clipboard</h3>
<p>To add the key to GitHub, first print the public key and copy its output:</p>
<pre><code class="lang-bash">cat ~/.ssh/id_github.pub
</code></pre>
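<p>The <code>cat</code> output can be copied manually, or piped straight to the clipboard with your platform's clipboard tool:</p>
<pre><code class="lang-bash"># macOS
pbcopy &lt; ~/.ssh/id_github.pub

# Linux with X11 (requires xclip)
xclip -selection clipboard &lt; ~/.ssh/id_github.pub

# Windows (Git Bash)
clip &lt; ~/.ssh/id_github.pub
</code></pre>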
<h3 id="heading-5-add-the-ssh-key-to-your-github-account">5. Add the SSH Key to Your GitHub Account</h3>
<ol>
<li><p>Log in to your GitHub account.</p>
</li>
<li><p>Navigate to <strong>Settings &gt; SSH and GPG keys</strong>.</p>
</li>
<li><p>Click <strong>New SSH key</strong>.</p>
</li>
<li><p>Paste your public key into the provided field.</p>
</li>
<li><p>Add a title to identify this key (e.g., "Server").</p>
</li>
<li><p>Click <strong>Add SSH key</strong>.</p>
</li>
</ol>
<h3 id="heading-6-test-the-ssh-connection">6. Test the SSH Connection</h3>
<p>Verify the connection to GitHub using:</p>
<pre><code class="lang-bash">ssh -i ~/.ssh/id_github -T git@github.com
</code></pre>
<p>If successful, you should see a message like:</p>
<pre><code class="lang-plaintext">Hi &lt;username&gt;! You've successfully authenticated, but GitHub does not provide shell access.
</code></pre>
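<p>Because the key uses a non-default filename, Git only finds it while the agent holds it. An entry in <code>~/.ssh/config</code> makes the association permanent, so neither <code>-i</code> nor the agent is required afterwards:</p>
<pre><code class="lang-plaintext">Host github.com
  HostName github.com
  User git
  IdentityFile ~/.ssh/id_github
  IdentitiesOnly yes
</code></pre>
<p>With this in place, <code>ssh -T git@github.com</code> works without the <code>-i</code> flag.</p>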
<h3 id="heading-7-using-the-ssh-key-for-repository-management">7. Using the SSH Key for Repository Management</h3>
<h4 id="heading-clone-a-repository">Clone a Repository</h4>
<p>Use the SSH URL to clone a repository:</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> git@github.com:&lt;username&gt;/&lt;repository&gt;.git
</code></pre>
<h4 id="heading-push-changes">Push Changes</h4>
<p>When pushing changes, Git will automatically use the SSH key for authentication:</p>
<pre><code class="lang-bash">git push origin main
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>By setting up an SSH key with GitHub, you’ve simplified your authentication process and secured your repository management workflow. No more password prompts—just smooth, secure Git operations!</p>
]]></content:encoded></item></channel></rss>