Qdrant Multi-Node Cluster Deployment on AWS EC2 with Helm Charts

Prerequisites
- AWS account with appropriate permissions
- Basic knowledge of Kubernetes and Helm
- SSH key pair for EC2 access
Phase 1: AWS Infrastructure Setup
Step 1: Create VPC and Networking
Create VPC
Go to AWS Console → VPC → Create VPC
- Name: qdrant-vpc
- IPv4 CIDR: 10.0.0.0/16
- Enable DNS hostnames and DNS resolution
Create Subnets
Create 3 private subnets in different AZs:
- qdrant-subnet-1a: 10.0.1.0/24 (ap-south-1a)
- qdrant-subnet-1b: 10.0.2.0/24 (ap-south-1b)
- qdrant-subnet-1c: 10.0.3.0/24 (ap-south-1c)
Create 1 public subnet for the NAT Gateway:
- qdrant-public-subnet: 10.0.100.0/24 (ap-south-1a)
Create Internet Gateway
- Name: qdrant-igw
- Attach to qdrant-vpc
Create NAT Gateway
- Place in qdrant-public-subnet
- Allocate an Elastic IP
Configure Route Tables
Public Route Table:
- Route: 0.0.0.0/0 → Internet Gateway
Private Route Table:
- Route: 0.0.0.0/0 → NAT Gateway
- Associate with all private subnets
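If you prefer scripting this setup, the same networking can be created with the AWS CLI. A minimal sketch (the region ap-south-1 and the shell variable names are assumptions; the NAT gateway and route tables follow the same pattern with create-nat-gateway, create-route-table, and create-route):
#!/bin/bash
# Create the VPC and enable DNS support/hostnames
VPC_ID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=qdrant-vpc}]' \
  --query 'Vpc.VpcId' --output text)
aws ec2 modify-vpc-attribute --vpc-id "$VPC_ID" --enable-dns-support '{"Value":true}'
aws ec2 modify-vpc-attribute --vpc-id "$VPC_ID" --enable-dns-hostnames '{"Value":true}'
# One private subnet shown; repeat for 10.0.2.0/24 (1b), 10.0.3.0/24 (1c), and the public subnet
SUBNET_1A=$(aws ec2 create-subnet --vpc-id "$VPC_ID" --cidr-block 10.0.1.0/24 \
  --availability-zone ap-south-1a --query 'Subnet.SubnetId' --output text)
# Internet gateway for the public subnet
IGW_ID=$(aws ec2 create-internet-gateway \
  --query 'InternetGateway.InternetGatewayId' --output text)
aws ec2 attach-internet-gateway --internet-gateway-id "$IGW_ID" --vpc-id "$VPC_ID"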
Step 2: Security Groups
Create Security Group: qdrant-cluster-sg
VPC: qdrant-vpc
Inbound Rules:
- SSH: Port 22 (Source: Your IP)
- Kubernetes API: Port 6443 (Source: Security Group itself)
- Qdrant HTTP: Port 6333 (Source: Security Group itself)
- Qdrant gRPC: Port 6334 (Source: Security Group itself)
- Qdrant P2P: Port 6335 (Source: Security Group itself; required for cluster formation, see the p2p port in the values file)
- etcd: Ports 2379-2380 (Source: Security Group itself)
- Kubelet: Port 10250 (Source: Security Group itself)
- NodePort Range: Ports 30000-32767 (Source: Security Group itself)
- All Traffic: All ports (Source: Security Group itself). This rule makes the itemized rules above redundant; it is convenient during initial setup, but drop it once the cluster is working.
Outbound Rules: All traffic to 0.0.0.0/0
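The same rules can be scripted; a sketch that assumes $VPC_ID from the previous step and that you substitute your own IP for the placeholder:
SG_ID=$(aws ec2 create-security-group --group-name qdrant-cluster-sg \
  --description "Qdrant cluster" --vpc-id "$VPC_ID" --query 'GroupId' --output text)
# SSH from your IP only
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
  --protocol tcp --port 22 --cidr <YOUR_IP>/32
# Intra-cluster traffic: all ports, source = the security group itself
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
  --protocol -1 --source-group "$SG_ID"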
Step 3: IAM Roles and Policies
Create IAM Role: qdrant-node-role
- Trusted entity: EC2
Attach policies:
- AmazonEC2FullAccess
- AmazonEBSCSIDriverPolicy
Create custom policy QDrantEBSPolicy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AttachVolume",
        "ec2:DetachVolume",
        "ec2:DescribeVolumes",
        "ec2:DescribeInstances",
        "ec2:CreateVolume",
        "ec2:DeleteVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags"
      ],
      "Resource": "*"
    }
  ]
}
Create Instance Profile
- Name: qdrant-instance-profile
- Add role: qdrant-node-role
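A CLI sketch of the same role and instance profile (it assumes you saved the JSON policy above as qdrant-ebs-policy.json; the trust policy file is one you create yourself):
cat > ec2-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "ec2.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}
EOF
aws iam create-role --role-name qdrant-node-role \
  --assume-role-policy-document file://ec2-trust.json
aws iam attach-role-policy --role-name qdrant-node-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
aws iam put-role-policy --role-name qdrant-node-role \
  --policy-name QDrantEBSPolicy --policy-document file://qdrant-ebs-policy.json
aws iam create-instance-profile --instance-profile-name qdrant-instance-profile
aws iam add-role-to-instance-profile --instance-profile-name qdrant-instance-profile \
  --role-name qdrant-node-role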
Phase 2: EC2 Instances Setup
Step 4: Launch EC2 Instances
Launch 3 EC2 instances with the following specifications:
Instance Configuration:
- AMI: Ubuntu 22.04 LTS
- Instance Type: t3.medium (minimum) or t3.large (recommended)
- Key Pair: Your SSH key
- VPC: qdrant-vpc
- Subnets: Place each instance in a different subnet
- Security Group: qdrant-cluster-sg
- IAM Role: qdrant-instance-profile
- Storage: 20GB gp3 root volume (the 50GB gp3 data volumes are created separately in Step 5)
Instance Names:
- qdrant-master-1 (in qdrant-subnet-1a)
- qdrant-worker-1 (in qdrant-subnet-1b)
- qdrant-worker-2 (in qdrant-subnet-1c)
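Launching one of the three instances from the CLI looks roughly like this (the AMI ID and key pair name are placeholders you must fill in; $SUBNET_1A and $SG_ID come from the earlier sketches):
aws ec2 run-instances \
  --image-id <UBUNTU_22_04_AMI_ID> \
  --instance-type t3.medium \
  --key-name <YOUR_KEY_PAIR> \
  --subnet-id "$SUBNET_1A" \
  --security-group-ids "$SG_ID" \
  --iam-instance-profile Name=qdrant-instance-profile \
  --block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=20,VolumeType=gp3}' \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=qdrant-master-1}]'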
Step 5: Create Additional EBS Volumes
For each instance, create an additional EBS volume for persistent storage:
Go to EC2 → Volumes → Create Volume
Create 3 volumes (one per instance):
- Volume Type: gp3
- Size: 50GB each
- Availability Zone: Match the instance's AZ
- Tags: Name = qdrant-data-volume-{1,2,3}
Attach each volume to its corresponding instance.
Note: the Helm deployment in Phase 5 provisions its own volumes dynamically through the EBS CSI driver, so these attached volumes act as node-local capacity (for example, for snapshots or scratch space) rather than as the Qdrant data volumes themselves.
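The equivalent CLI calls for one volume (the instance ID is a placeholder; on Nitro instances the device shows up internally as /dev/nvme1n1 regardless of the name given here):
VOL_ID=$(aws ec2 create-volume --volume-type gp3 --size 50 \
  --availability-zone ap-south-1a \
  --tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=qdrant-data-volume-1}]' \
  --query 'VolumeId' --output text)
aws ec2 attach-volume --volume-id "$VOL_ID" --instance-id <INSTANCE_ID> --device /dev/sdf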
Phase 3: Kubernetes Cluster Setup
Step 6: Install Prerequisites on All Nodes
SSH into each instance and run:
#!/bin/bash
# Update system
sudo apt update && sudo apt upgrade -y
# Install Docker
sudo apt install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io
# Configure Docker
sudo usermod -aG docker $USER
sudo systemctl enable docker
sudo systemctl start docker
# Install kubeadm, kubelet, kubectl
# (the legacy apt.kubernetes.io repository has been shut down; use the pkgs.k8s.io
# community repository instead, pinning the minor version you want; v1.29 shown here)
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
# Configure containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
# Disable swap
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Load kernel modules required by containerd and kubeadm
sudo modprobe overlay
sudo modprobe br_netfilter
printf 'overlay\nbr_netfilter\n' | sudo tee /etc/modules-load.d/k8s.conf
# Configure sysctl
sudo tee /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
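Before moving on, it is worth confirming on each node that the script left things in the expected state; a few quick checks:
# Container runtime and Kubernetes tooling are installed and active
systemctl is-active containerd        # expect: active
kubeadm version -o short              # expect: v1.29.x (or the version you pinned)
# Swap is off and IP forwarding is on
swapon --show                         # expect: no output
sysctl net.ipv4.ip_forward            # expect: net.ipv4.ip_forward = 1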
Step 7: Initialize Master Node
On the master node (qdrant-master-1):
# Initialize cluster
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=<MASTER_PRIVATE_IP>
# Configure kubectl
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Install Calico CNI
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml
# Generate join command (save this output)
kubeadm token create --print-join-command
Step 8: Join Worker Nodes
On both worker nodes, run the join command produced in the previous step:
sudo kubeadm join <MASTER_IP>:6443 --token <TOKEN> --discovery-token-ca-cert-hash sha256:<HASH>
Step 9: Verify Cluster
On master node:
kubectl get nodes
kubectl get pods -A
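One scheduling note: kubeadm taints the control-plane node by default, so the three Qdrant replicas will land on just the two workers. If you want qdrant-master-1 to run workloads too (a reasonable choice for a small cluster like this, though not ideal for a production control plane), remove the taint:
kubectl taint nodes qdrant-master-1 node-role.kubernetes.io/control-plane-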
Phase 4: Storage Setup
Step 10: Install EBS CSI Driver
# Install EBS CSI Driver
kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.23"
# Verify installation
kubectl get pods -n kube-system | grep ebs-csi
Step 11: Create Storage Class
Create ebs-storageclass.yaml:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Retain
Apply the storage class:
kubectl apply -f ebs-storageclass.yaml
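Because the storage class uses WaitForFirstConsumer, a PVC alone will sit in Pending until a pod claims it. A throwaway pod-plus-claim pair (the names here are illustrative) verifies that dynamic provisioning works end to end:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-test-claim
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-gp3
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: ebs-test-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "echo ok > /data/ok && sleep 3600"]
      volumeMounts:
        - mountPath: /data
          name: data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: ebs-test-claim
EOF
kubectl get pvc ebs-test-claim   # should reach Bound once the pod is scheduled
kubectl delete pod ebs-test-pod && kubectl delete pvc ebs-test-claim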
Phase 5: Helm and Qdrant Deployment
Step 12: Install Helm
On master node:
curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt update
sudo apt install helm
Step 13: Add the Qdrant Helm Repository
helm repo add qdrant https://qdrant.github.io/qdrant-helm
helm repo update
Step 14: Create the Qdrant Values File
Create qdrant-values.yaml:
# Qdrant cluster configuration
replicaCount: 3

image:
  repository: qdrant/qdrant
  tag: "v1.7.4"
  pullPolicy: IfNotPresent

# Service configuration
service:
  type: NodePort
  httpPort: 6333
  grpcPort: 6334
  httpNodePort: 30333
  grpcNodePort: 30334

# Persistent storage
persistence:
  enabled: true
  storageClass: "ebs-gp3"
  size: 50Gi
  accessMode: ReadWriteOnce

# Resource limits
resources:
  limits:
    cpu: 1000m
    memory: 2Gi
  requests:
    cpu: 500m
    memory: 1Gi

# Pod disruption budget
podDisruptionBudget:
  enabled: true
  minAvailable: 2

# Anti-affinity to spread pods across nodes
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - qdrant
          topologyKey: kubernetes.io/hostname

# Qdrant-specific configuration
config:
  cluster:
    enabled: true
    p2p:
      port: 6335
  service:
    api_key: "your_secret_master_api_key_here"
    read_only_api_key: "your_secret_read_only_api_key_here"
    http_port: 6333
    grpc_port: 6334
  storage:
    storage_path: "/qdrant/storage"
    snapshots_path: "/qdrant/snapshots"
    on_disk_payload: true
  log_level: "INFO"

# Environment variables for clustering
env:
  - name: QDRANT__CLUSTER__ENABLED
    value: "true"
  - name: QDRANT__CLUSTER__P2P__PORT
    value: "6335"

# Security context
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000

# Node selector to ensure pods are scheduled on our nodes
nodeSelector: {}

# Tolerations
tolerations: []
Step 15: Deploy the Qdrant Cluster
# Create namespace
kubectl create namespace qdrant
# Deploy QDrant
helm install qdrant qdrant/qdrant \
--namespace qdrant \
--values qdrant-values.yaml \
--wait
# Verify deployment
kubectl get pods -n qdrant
kubectl get pvc -n qdrant
kubectl get svc -n qdrant
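Once the pods are Running, a quick way to confirm that the three replicas actually formed one cluster is to query the cluster endpoint from inside a pod (this assumes curl is available in the Qdrant image, as the backup script in Step 20 also does):
kubectl exec -n qdrant qdrant-0 -- \
  curl -s -H "api-key: your_secret_master_api_key_here" http://localhost:6333/cluster
# Expect "enabled": true and three entries under "peers"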
Step 16: Create Load Balancer Service (Optional)
For external access, create qdrant-lb.yaml. Note that on a self-managed kubeadm cluster, a Service of type LoadBalancer only gets an AWS ELB if the AWS cloud controller manager (or the AWS Load Balancer Controller) is installed; without it, the service stays in Pending and you should use the NodePort service from the values file instead.
apiVersion: v1
kind: Service
metadata:
  name: qdrant-loadbalancer
  namespace: qdrant
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: qdrant
  ports:
    - name: http
      port: 6333
      targetPort: 6333
    - name: grpc
      port: 6334
      targetPort: 6334
Apply the load balancer:
kubectl apply -f qdrant-lb.yaml
Phase 6: Verification and Testing
Step 17: Verify Cluster Status
# Check pods
kubectl get pods -n qdrant -o wide
# Check persistent volumes
kubectl get pv
kubectl get pvc -n qdrant
# Check services
kubectl get svc -n qdrant
# Check logs
kubectl logs -n qdrant -l app.kubernetes.io/name=qdrant
# Port forward for testing (run in background)
kubectl port-forward -n qdrant svc/qdrant 6333:6333 &
Step 18: Test the Qdrant API
Because qdrant-values.yaml sets service.api_key, every request must send the api-key header:
# Test cluster info
curl -X GET "http://localhost:6333/cluster" -H "api-key: your_secret_master_api_key_here"
# List collections
curl -X GET "http://localhost:6333/collections" -H "api-key: your_secret_master_api_key_here"
# Create a test collection
curl -X PUT "http://localhost:6333/collections/test_collection" \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret_master_api_key_here" \
  -d '{
    "vectors": {
      "size": 100,
      "distance": "Cosine"
    }
  }'
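To go one step further than creating an empty collection, the sketch below upserts a single 100-dimensional point and searches for it. The inline python3 call is just a convenient way to print a 100-element JSON array (Ubuntu ships python3 by default); the IDs and payload values are arbitrary:
VEC=$(python3 -c 'print([0.01]*100)')
# Upsert one point and wait for it to be persisted
curl -X PUT "http://localhost:6333/collections/test_collection/points?wait=true" \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret_master_api_key_here" \
  -d "{\"points\": [{\"id\": 1, \"vector\": $VEC, \"payload\": {\"tag\": \"smoke-test\"}}]}"
# Search with the same vector; the point above should come back as the top hit
curl -X POST "http://localhost:6333/collections/test_collection/points/search" \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret_master_api_key_here" \
  -d "{\"vector\": $VEC, \"limit\": 1}"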
Phase 7: Monitoring and Maintenance
Step 19: Set Up Basic Monitoring
Create monitoring-values.yaml for Prometheus (optional; the exact keys depend on your chart version, so check its values.yaml):
prometheus:
  enabled: true
serviceMonitor:
  enabled: true
  namespace: qdrant
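Independent of the chart's monitoring keys, Qdrant itself exposes a Prometheus scrape target at /metrics, which you can sanity-check through the same port-forward used earlier:
kubectl port-forward -n qdrant svc/qdrant 6333:6333 &
curl -H "api-key: your_secret_master_api_key_here" http://localhost:6333/metrics | head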
Step 20: Backup Strategy
Create a backup script, backup-qdrant.sh (the api-key header must match the key set in qdrant-values.yaml):
#!/bin/bash
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backup/qdrant_$TIMESTAMP"
mkdir -p "$BACKUP_DIR"
# Create a full snapshot on every pod via the API
for pod in $(kubectl get pods -n qdrant -l app.kubernetes.io/name=qdrant -o jsonpath='{.items[*].metadata.name}'); do
  kubectl exec -n qdrant "$pod" -- curl -s -X POST \
    -H "api-key: your_secret_master_api_key_here" "http://localhost:6333/snapshots"
done
# Copy the snapshots from the first pod's persistent volume
kubectl exec -n qdrant qdrant-0 -- tar -czf /tmp/qdrant-backup-$TIMESTAMP.tar.gz /qdrant/snapshots
kubectl cp qdrant/qdrant-0:/tmp/qdrant-backup-$TIMESTAMP.tar.gz "$BACKUP_DIR/qdrant-backup-$TIMESTAMP.tar.gz"
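Since the archive only lands on the machine running kubectl, consider shipping it off-host as a final step, for example to S3 (the bucket name is a placeholder):
aws s3 cp "$BACKUP_DIR/qdrant-backup-$TIMESTAMP.tar.gz" "s3://<YOUR_BACKUP_BUCKET>/qdrant/"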
Troubleshooting
Common Issues and Solutions
Pods stuck in Pending state:
- Check node resources: kubectl describe nodes
- Check PVC status: kubectl get pvc -n qdrant
- Verify the EBS CSI driver: kubectl get pods -n kube-system | grep ebs-csi
Storage issues:
- Verify IAM permissions for EBS operations
- Check the storage class: kubectl get storageclass
- Review EBS volume attachments in the AWS Console
Network connectivity issues:
- Verify security group rules
- Check Calico pod status: kubectl get pods -n kube-system | grep calico
- Test pod-to-pod connectivity (see the sketch after this list)
Qdrant cluster formation issues:
- Check the cluster configuration in pod logs
- Verify p2p port (6335) accessibility between pods
- Review the Qdrant cluster API endpoint (GET /cluster)
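For the pod-to-pod connectivity check mentioned above, a disposable busybox pod works well. The headless service name qdrant-headless is an assumption based on common chart conventions, so confirm it with kubectl get svc -n qdrant first:
kubectl run net-test --rm -it --image=busybox --restart=Never -n qdrant -- \
  sh -c 'nslookup qdrant-headless && \
         wget -qO- --header "api-key: your_secret_master_api_key_here" \
         http://qdrant-0.qdrant-headless:6333/collections'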
Maintenance Commands
# Scale the cluster
helm upgrade qdrant qdrant/qdrant --namespace qdrant --values qdrant-values.yaml --set replicaCount=5
# Update the Qdrant version
helm upgrade qdrant qdrant/qdrant --namespace qdrant --values qdrant-values.yaml --set image.tag=v1.8.0
# Run a backup using the script from Step 20
./backup-qdrant.sh
Security Considerations
Network Security:
- Use private subnets for all worker nodes
- Restrict security group access to the minimum required ports
- Consider using AWS PrivateLink for internal communication
Storage Security:
- Enable EBS encryption
- Use IAM roles with least privilege
- Test backup and restore procedures regularly
Access Control:
- Implement RBAC in Kubernetes
- Use network policies to restrict pod communication
- Enable audit logging
This deployment yields a highly available Qdrant cluster with persistent storage and proper AWS integration; apply the security hardening above before treating it as production-ready.
