<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://smaht.ai/feed.xml" rel="self" type="application/atom+xml" /><link href="https://smaht.ai/" rel="alternate" type="text/html" /><updated>2026-03-08T16:10:59+00:00</updated><id>https://smaht.ai/feed.xml</id><title type="html">Smaht.ai</title><subtitle>Smaht.ai is a community of experienced AI engineers and entrepreneurs who are passionate about building products and building companies.
</subtitle><author><name>Smaht.ai</name><email>hello@smaht.ai</email></author><entry><title type="html">Self-Hosted MongoDB on Kubernetes with Atlas Search (mongot)</title><link href="https://smaht.ai/tech/kubernetes/devops/mongodb/self-hosted-mongodb-kubernetes-atlas-search/" rel="alternate" type="text/html" title="Self-Hosted MongoDB on Kubernetes with Atlas Search (mongot)" /><published>2026-03-08T00:00:00+00:00</published><updated>2026-03-08T00:00:00+00:00</updated><id>https://smaht.ai/tech/kubernetes/devops/mongodb/self-hosted-mongodb-kubernetes-atlas-search</id><content type="html" xml:base="https://smaht.ai/tech/kubernetes/devops/mongodb/self-hosted-mongodb-kubernetes-atlas-search/"><![CDATA[<p>For air-gapped environments, on-premises clusters, or any deployment where MongoDB Atlas is not an option, you can run a production-grade MongoDB replica set with optional <strong>Atlas Search</strong> (full-text and vector indexes) entirely inside Kubernetes. This post describes the <a href="https://github.com/analytiq-hub/analytiq-charts"><code class="language-plaintext highlighter-rouge">mongodb-atlas-local</code></a> Helm chart and the operational details we learned running it on EKS and elsewhere.</p>

<p>If you’re new to Kubernetes, the <a href="/tech/kubernetes/devops/kubernetes-for-docker-users-primer/">Kubernetes for Docker Users primer</a> covers Pods, Deployments, Services, PVCs, and Helm basics. For packaging and GitOps, see <a href="/tech/kubernetes/devops/kubernetes-packaging-helm-gitops/">Kubernetes Packaging and Deployment</a>.</p>

<h2 id="why-not-bitnami">Why not Bitnami?</h2>

<p>The obvious choice for an in-cluster MongoDB is the Bitnami chart, which is widely used and simple to install. The problem is <strong>vector search</strong>. Applications that need semantic search or Atlas-style indexes require the <code class="language-plaintext highlighter-rouge">mongot</code> process — a sidecar that runs alongside <code class="language-plaintext highlighter-rouge">mongod</code> and handles full-text and vector indexes. Bitnami deploys a plain community MongoDB without <code class="language-plaintext highlighter-rouge">mongot</code>, so Atlas Search is simply not available.</p>

<p>The only supported path to <code class="language-plaintext highlighter-rouge">mongot</code> in a self-hosted environment is the <a href="https://github.com/mongodb/mongodb-kubernetes-operator">MongoDB Kubernetes Operator</a>, which introduces the <code class="language-plaintext highlighter-rouge">MongoDBCommunity</code> and <code class="language-plaintext highlighter-rouge">MongoDBSearch</code> custom resources. The operator manages the StatefulSet, replica set initialization, user creation, and TLS — and, when <code class="language-plaintext highlighter-rouge">MongoDBSearch</code> is enabled, injects the <code class="language-plaintext highlighter-rouge">mongot</code> sidecar with the right configuration.</p>

<p>Our chart wraps the operator’s CRDs with sensible defaults and a single <code class="language-plaintext highlighter-rouge">helm upgrade --install</code> interface, so operators don’t need to understand the operator’s internals to get a working cluster. You can run MongoDB with or without search; if you don’t need vector or full-text search, you can disable the <code class="language-plaintext highlighter-rouge">mongot</code> sidecar and save resources.</p>

<h2 id="two-phase-install">Two-phase install</h2>

<p><code class="language-plaintext highlighter-rouge">mongot</code> requires a running, authenticated replica set to connect to — it cannot start on a fresh cluster. The install therefore happens in two phases:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Phase 1: bring up the replica set without search</span>
helm upgrade <span class="nt">--install</span> mongodb oci://ghcr.io/analytiq-hub/mongodb-atlas-local <span class="se">\</span>
  <span class="nt">--version</span> 2.0.1 <span class="nt">--namespace</span> mongodb <span class="se">\</span>
  <span class="nt">--set</span> mongodb.adminPassword<span class="o">=</span><span class="s2">"..."</span> <span class="se">\</span>
  <span class="nt">--set</span> mongodb.appUser.password<span class="o">=</span><span class="s2">"..."</span> <span class="se">\</span>
  <span class="nt">--set</span> search.enabled<span class="o">=</span><span class="nb">false</span>

<span class="c"># Wait for replica set Ready</span>
kubectl <span class="nb">wait</span> <span class="nt">--for</span><span class="o">=</span><span class="nv">condition</span><span class="o">=</span>ready pod <span class="nt">-l</span> <span class="nv">app</span><span class="o">=</span>mongodb-mongodb-atlas-local <span class="se">\</span>
  <span class="nt">-n</span> mongodb <span class="nt">--timeout</span><span class="o">=</span>300s

<span class="c"># Phase 2: enable search</span>
helm upgrade mongodb oci://ghcr.io/analytiq-hub/mongodb-atlas-local <span class="se">\</span>
  <span class="nt">--version</span> 2.0.1 <span class="nt">--namespace</span> mongodb <span class="nt">--reuse-values</span> <span class="se">\</span>
  <span class="nt">--set</span> search.enabled<span class="o">=</span><span class="nb">true</span>
</code></pre></div></div>

<p>Attempting a single-phase install with <code class="language-plaintext highlighter-rouge">search.enabled=true</code> results in <code class="language-plaintext highlighter-rouge">mongot</code> crash-looping because the replica set isn’t ready to accept its connection.</p>
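<p>Once both phases are up, you can confirm that <code class="language-plaintext highlighter-rouge">mongot</code> is serving index commands by creating a search index from <code class="language-plaintext highlighter-rouge">mongosh</code>. A minimal sketch: the service name, database, collection, and embedding dimensions below are illustrative and depend on your release name and values.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Assumes the operator's default &lt;release&gt;-svc service name; adjust to your values
mongosh "mongodb://admin:$ADMIN_PASSWORD@mongodb-mongodb-atlas-local-svc.mongodb.svc.cluster.local:27017/?authSource=admin" --eval '
  db.getSiblingDB("app").docs.createSearchIndex(
    "vector_index",
    "vectorSearch",
    { fields: [ { type: "vector", path: "embedding", numDimensions: 1536, similarity: "cosine" } ] }
  )'
</code></pre></div></div>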

<h2 id="node-sizing-for-stateful-workloads">Node sizing for stateful workloads</h2>

<p>Adding MongoDB changes the cluster sizing arithmetic considerably. Each replica pod runs two containers: <code class="language-plaintext highlighter-rouge">mongod</code> (250m CPU, 400Mi) and <code class="language-plaintext highlighter-rouge">mongodb-agent</code> (250m CPU, 400Mi), plus a <code class="language-plaintext highlighter-rouge">mongot</code> sidecar (250m CPU, 250Mi) when search is enabled. A 3-replica set with search therefore requests ~2.25 vCPU and ~3.15 Gi of memory, on top of whatever other workloads you run.</p>

<p>The scheduler must fit the entire pod on one node. On a cluster with two <code class="language-plaintext highlighter-rouge">t3.medium</code> nodes (2 vCPU / 4 Gi each), if existing workloads already consume ~1.7 vCPU in requests, there may be ~2.2 vCPU free across both nodes — but never more than ~740m on a single node. A MongoDB pod that needs ~750m CPU cannot be scheduled. Adding a third node (or sizing nodes with enough headroom) resolves it.</p>

<p>The practical lesson: <strong>account for stateful pods when sizing the initial node group</strong>, or ensure the autoscaler can provision new nodes quickly enough not to block workloads.</p>
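<p>To see where the headroom actually is before installing, compare summed requests against allocatable capacity per node (both commands are standard; <code class="language-plaintext highlighter-rouge">kubectl top</code> requires metrics-server):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Per-node view of summed requests vs. allocatable capacity
kubectl describe nodes | grep -A 8 "Allocated resources"

# Live usage, if metrics-server is installed
kubectl top nodes
</code></pre></div></div>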

<h2 id="ebs-csi-driver-and-the-gp2-trap-eks">EBS CSI Driver and the gp2 trap (EKS)</h2>

<p>When we added MongoDB to an EKS cluster, PVCs sat in <code class="language-plaintext highlighter-rouge">Pending</code> indefinitely with the error:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>no persistent volumes available for this claim and no storage class is set
</code></pre></div></div>

<p>EKS creates a <code class="language-plaintext highlighter-rouge">gp2</code> StorageClass by default, but it has two problems. First, it is not marked as the default class — PVCs with an empty <code class="language-plaintext highlighter-rouge">storageClassName</code> get no provisioner assigned. Second, and more importantly, <code class="language-plaintext highlighter-rouge">gp2</code> uses the legacy in-tree <code class="language-plaintext highlighter-rouge">kubernetes.io/aws-ebs</code> provisioner, which was removed in Kubernetes 1.27. On EKS 1.35, it is simply gone.</p>
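<p>Two commands make this failure mode obvious; the PVC name below is a placeholder:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Is any StorageClass marked (default)?
kubectl get storageclass

# The Events section explains why the claim is stuck
kubectl describe pvc &lt;pvc-name&gt; -n mongodb
</code></pre></div></div>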

<p>The fix is to create a <code class="language-plaintext highlighter-rouge">gp3</code> StorageClass backed by the EBS CSI driver (<code class="language-plaintext highlighter-rouge">ebs.csi.aws.com</code>) and mark it as the cluster default:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">apiVersion</span><span class="pi">:</span> <span class="s">storage.k8s.io/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">StorageClass</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">gp3</span>
  <span class="na">annotations</span><span class="pi">:</span>
    <span class="na">storageclass.kubernetes.io/is-default-class</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
<span class="na">provisioner</span><span class="pi">:</span> <span class="s">ebs.csi.aws.com</span>
<span class="na">volumeBindingMode</span><span class="pi">:</span> <span class="s">WaitForFirstConsumer</span>
<span class="na">allowVolumeExpansion</span><span class="pi">:</span> <span class="no">true</span>
<span class="na">parameters</span><span class="pi">:</span>
  <span class="na">type</span><span class="pi">:</span> <span class="s">gp3</span>
  <span class="na">encrypted</span><span class="pi">:</span> <span class="s2">"</span><span class="s">true"</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">WaitForFirstConsumer</code> is important — it delays EBS volume creation until the pod is actually scheduled to a node, which ensures the volume is created in the correct availability zone. <code class="language-plaintext highlighter-rouge">allowVolumeExpansion: true</code> enables online resizing without pod restarts.</p>
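<p>If <code class="language-plaintext highlighter-rouge">gp2</code> still carries the default annotation, two classes end up marked default and provisioning becomes ambiguous. A quick imperative fix (the manifest file name is whatever you saved the StorageClass above as):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl apply -f gp3-storageclass.yaml
kubectl patch storageclass gp2 -p \
  '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
</code></pre></div></div>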

<p>Provision this StorageClass (and the EBS CSI driver) via Terraform or your preferred IaC so new clusters get it automatically.</p>

<h2 id="summary">Summary</h2>

<table>
  <thead>
    <tr>
      <th>Topic</th>
      <th>Takeaway</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Chart</strong></td>
      <td><code class="language-plaintext highlighter-rouge">mongodb-atlas-local</code> on <a href="https://github.com/analytiq-hub/analytiq-charts">analytiq-charts</a> — replica set + optional <code class="language-plaintext highlighter-rouge">mongot</code> for Atlas Search</td>
    </tr>
    <tr>
      <td><strong>Install</strong></td>
      <td>Two-phase: bring up replica set with <code class="language-plaintext highlighter-rouge">search.enabled=false</code>, then enable search</td>
    </tr>
    <tr>
      <td><strong>Sizing</strong></td>
      <td>Reserve enough CPU/memory per node for the full MongoDB pod; scheduler places whole pod on one node</td>
    </tr>
    <tr>
      <td><strong>EKS storage</strong></td>
      <td>Use a <code class="language-plaintext highlighter-rouge">gp3</code> StorageClass with <code class="language-plaintext highlighter-rouge">ebs.csi.aws.com</code>; don’t rely on the default <code class="language-plaintext highlighter-rouge">gp2</code></td>
    </tr>
  </tbody>
</table>

<p>We use this chart for <a href="https://docrouter.ai">Doc Router</a> and other applications that need MongoDB with vector search. For the full Doc Router deployment story (Helm chart, workers, CI/CD, multi-cloud), see <a href="/tech/kubernetes/devops/docrouter/deploying-doc-router-on-kubernetes/">Deploying Doc Router on Kubernetes</a>.</p>

<hr />

<p><em>Andrei Radulescu-Banu is the founder of <a href="https://docrouter.ai">DocRouter.AI</a> (document processing with LLMs) and <a href="https://sigagent.ai">SigAgent.AI</a> (Claude Agent monitoring). His company <a href="https://analytiqhub.com">AnalytiqHub.com</a> provides consulting services for cloud and AI engineering.</em></p>]]></content><author><name>Andrei Radulescu-Banu</name></author><category term="tech" /><category term="kubernetes" /><category term="devops" /><category term="mongodb" /><summary type="html"><![CDATA[Run a production-grade MongoDB replica set with optional Atlas Search (vector and full-text) inside Kubernetes — for air-gapped, on-prem, or any environment where Atlas isn't an option.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://smaht.ai/assets/images/self-hosted-mongodb-kubernetes-atlas-search-splash.png" /><media:content medium="image" url="https://smaht.ai/assets/images/self-hosted-mongodb-kubernetes-atlas-search-splash.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Deploying Doc Router on Kubernetes: From Docker Compose to EKS and Digital Ocean</title><link href="https://smaht.ai/tech/kubernetes/devops/docrouter/deploying-doc-router-on-kubernetes/" rel="alternate" type="text/html" title="Deploying Doc Router on Kubernetes: From Docker Compose to EKS and Digital Ocean" /><published>2026-03-07T00:00:00+00:00</published><updated>2026-03-07T00:00:00+00:00</updated><id>https://smaht.ai/tech/kubernetes/devops/docrouter/deploying-doc-router-on-kubernetes</id><content type="html" xml:base="https://smaht.ai/tech/kubernetes/devops/docrouter/deploying-doc-router-on-kubernetes/"><![CDATA[<p>We recently added production-grade Kubernetes support to Doc Router. This post walks through the key decisions and challenges we encountered along the way.</p>

<p>If you’re new to Kubernetes, start with <a href="/tech/kubernetes/devops/kubernetes-for-docker-users-primer/">Kubernetes for Docker Users: A Practical Primer</a>, which covers the core concepts — Pods, Deployments, Services, Namespaces, Secrets, PVCs, Helm, and Kind — before diving into this post. For packaging and GitOps (Kustomize, Helm, Flux), see <a href="/tech/kubernetes/devops/kubernetes-packaging-helm-gitops/">Kubernetes Packaging and Deployment</a>.</p>

<h2 id="why-kubernetes">Why Kubernetes?</h2>

<p>Doc Router was originally deployed using Docker Compose, which worked well for single-node setups. As we started onboarding enterprise customers with availability and scalability requirements, we needed:</p>

<ul>
  <li><strong>Horizontal scaling</strong> — multiple replicas behind a load balancer</li>
  <li><strong>Automated failover</strong> — pods restarted on failure without manual intervention</li>
  <li><strong>Rolling deployments</strong> — zero-downtime upgrades</li>
  <li><strong>Resource isolation</strong> — CPU and memory limits per component</li>
</ul>

<h2 id="architecture">Architecture</h2>

<p>The production deployment consists of two main workloads:</p>

<ul>
  <li><strong>Frontend</strong> — Next.js server (SSR + API routes via NextAuth)</li>
  <li><strong>Backend</strong> — FastAPI application with embedded background workers</li>
</ul>

<p>Both run as Kubernetes Deployments behind a shared nginx ingress with TLS terminated by cert-manager (Let’s Encrypt).</p>

<p>MongoDB can run outside the cluster (MongoDB Atlas) or in-cluster via our <a href="https://github.com/analytiq-hub/analytiq-charts"><code class="language-plaintext highlighter-rouge">mongodb-atlas-local</code></a> Helm chart — see <a href="/tech/kubernetes/devops/mongodb/self-hosted-mongodb-kubernetes-atlas-search/">Self-Hosted MongoDB on Kubernetes with Atlas Search</a> for the install guide. AWS S3 remains an external dependency.</p>

<h2 id="helm-chart">Helm Chart</h2>

<p>We packaged the deployment as a Helm chart (<code class="language-plaintext highlighter-rouge">deploy/charts/doc-router</code>) published to GitHub Container Registry (ghcr.io) as an OCI artifact. The chart is versioned independently of the Docker images, so we can update deployment configuration without rebuilding the application.</p>

<p>Key design decisions:</p>

<ul>
  <li><strong>Single <code class="language-plaintext highlighter-rouge">values.yaml</code></strong> with sensible defaults — operators override only what differs per cluster</li>
  <li><strong>ConfigMap for non-secret config</strong> — <code class="language-plaintext highlighter-rouge">NEXTAUTH_URL</code>, <code class="language-plaintext highlighter-rouge">FASTAPI_ROOT_PATH</code>, worker count, S3 bucket</li>
  <li><strong>Kubernetes Secret for credentials</strong> — MongoDB URI, API keys, NextAuth secret — created by the deploy script, never stored in the chart</li>
  <li><strong>Ingress host derived from <code class="language-plaintext highlighter-rouge">APP_HOST</code></strong> — a single variable drives the entire URL configuration</li>
</ul>
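<p>A per-cluster override file then stays very small. A hypothetical excerpt (key names illustrate the conventions above, not the chart's exact schema):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># values-prod.yaml (illustrative)
appHost: app.example.com        # drives ingress host and all derived URLs
backend:
  replicaCount: 2
config:
  s3Bucket: doc-router-prod
</code></pre></div></div>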

<h2 id="choosing-a-container-registry">Choosing a Container Registry</h2>

<p>We evaluated two natural options: <strong>Amazon ECR</strong> (since we’re already on AWS/EKS) and <strong>GitHub Container Registry (ghcr.io)</strong> (since our source is on GitHub).</p>

<p><strong>ECR</strong> has one significant operational advantage for EKS: nodes authenticate via IAM role, so there is no image pull secret to manage. Costs are low — $0.10 per GB-month stored, with no data transfer charge for pulls within the same AWS region. However, ECR is tightly coupled to AWS. A second deployment on Digital Ocean or a customer’s on-premises cluster would need separate registry credentials and mirroring, making it a poor fit for a multi-cloud or self-hosted product.</p>

<p><strong>ghcr.io</strong> is cloud-neutral — any cluster anywhere can pull images with a single token. It integrates naturally with GitHub Actions (the <code class="language-plaintext highlighter-rouge">GITHUB_TOKEN</code> secret already has <code class="language-plaintext highlighter-rouge">packages: write</code> permission), so publishing images is zero-configuration. The chart package also appears directly on the repository’s GitHub page alongside the source code and releases, which is the right home for an open-source project.</p>

<p>The catch: ghcr.io packages are <strong>private by default</strong> for organizations, and GitHub’s free tier includes only 500 MB storage and 1 GB transfer per month. For clusters that pull large images repeatedly, those limits are reached quickly. Making packages public eliminates the cost entirely, but requires an organization admin to enable public package creation in the org settings — it is disabled by default.</p>

<p>We chose ghcr.io and made our packages public. The images contain no secrets — only application code — so public visibility is appropriate and keeps infrastructure simple. Clusters pull anonymously with no credentials required.</p>

<p>For customers who need private images (for example, an enterprise build with proprietary integrations), the <code class="language-plaintext highlighter-rouge">REGISTRY_PROVIDER</code> variable in the overlay <code class="language-plaintext highlighter-rouge">.env</code> file can be switched to <code class="language-plaintext highlighter-rouge">aws</code> or <code class="language-plaintext highlighter-rouge">do</code> to use ECR or Digital Ocean Container Registry instead, with registry login handled automatically by the deploy scripts.</p>

<h2 id="merging-workers-into-fastapi">Merging Workers into FastAPI</h2>

<p>The original architecture ran the background workers (OCR, LLM, KB indexing, webhooks) as a separate process alongside uvicorn. In Kubernetes, this meant each backend pod ran two Python processes, consuming ~375 MB of memory.</p>

<p>We merged the workers into the FastAPI lifespan using <code class="language-plaintext highlighter-rouge">asyncio.create_task</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">@</span><span class="n">asynccontextmanager</span>
<span class="k">async</span> <span class="k">def</span> <span class="nf">lifespan</span><span class="p">(</span><span class="n">app</span><span class="p">):</span>
    <span class="c1"># startup
</span>    <span class="n">worker_tasks</span> <span class="o">=</span> <span class="n">start_workers</span><span class="p">(</span><span class="n">n_workers</span><span class="p">)</span>
    <span class="k">yield</span>
    <span class="c1"># shutdown
</span>    <span class="k">for</span> <span class="n">task</span> <span class="ow">in</span> <span class="n">worker_tasks</span><span class="p">:</span>
        <span class="n">task</span><span class="p">.</span><span class="n">cancel</span><span class="p">()</span>
    <span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">gather</span><span class="p">(</span><span class="o">*</span><span class="n">worker_tasks</span><span class="p">,</span> <span class="n">return_exceptions</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre></div></div>

<p>This halved per-pod memory usage (~190 MB) and eliminated the process management overhead. The workers share the same event loop as the API, which is safe because all worker I/O is already async.</p>

<h2 id="worker-polling-optimization">Worker Polling Optimization</h2>

<p>With multiple replicas, each pod runs a full set of worker coroutines polling MongoDB queues. At idle with 4 workers per pod, that was ~80 MongoDB queries per second cluster-wide.</p>

<p>We implemented exponential backoff with shared state across parallel workers:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">_queue_idle_sleep</span><span class="p">:</span> <span class="nb">dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">float</span><span class="p">]</span> <span class="o">=</span> <span class="p">{}</span>  <span class="c1"># shared across all workers on a queue
</span>
<span class="c1"># on idle: back off
</span><span class="n">sleep</span> <span class="o">=</span> <span class="n">_queue_idle_sleep</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">"ocr"</span><span class="p">,</span> <span class="n">POLL_MIN_SLEEP</span><span class="p">)</span>
<span class="k">await</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">sleep</span><span class="p">(</span><span class="n">sleep</span><span class="p">)</span>
<span class="n">_queue_idle_sleep</span><span class="p">[</span><span class="s">"ocr"</span><span class="p">]</span> <span class="o">=</span> <span class="nb">min</span><span class="p">(</span><span class="n">sleep</span> <span class="o">*</span> <span class="mi">2</span><span class="p">,</span> <span class="n">POLL_MAX_SLEEP</span><span class="p">)</span>

<span class="c1"># on message found: reset for all workers on this queue
</span><span class="n">_queue_idle_sleep</span><span class="p">[</span><span class="s">"ocr"</span><span class="p">]</span> <span class="o">=</span> <span class="n">POLL_MIN_SLEEP</span>
</code></pre></div></div>

<p>This reduces idle polling to near-zero while keeping response latency low when work arrives.</p>

<h2 id="graceful-shutdown">Graceful Shutdown</h2>

<p>When Kubernetes scales down a pod (HPA scale-in or rolling update), it sends SIGTERM. We needed in-flight jobs to be marked as failed rather than silently abandoned.</p>

<p>Since workers are asyncio tasks, cancellation arrives as <code class="language-plaintext highlighter-rouge">asyncio.CancelledError</code> — a <code class="language-plaintext highlighter-rouge">BaseException</code>, not caught by <code class="language-plaintext highlighter-rouge">except Exception</code>. We added explicit handling in each worker:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">try</span><span class="p">:</span>
    <span class="k">await</span> <span class="n">ad</span><span class="p">.</span><span class="n">msg_handlers</span><span class="p">.</span><span class="n">process_ocr_msg</span><span class="p">(</span><span class="n">analytiq_client</span><span class="p">,</span> <span class="n">msg</span><span class="p">)</span>
<span class="k">except</span> <span class="n">asyncio</span><span class="p">.</span><span class="n">CancelledError</span><span class="p">:</span>
    <span class="n">logger</span><span class="p">.</span><span class="n">warning</span><span class="p">(</span><span class="sa">f</span><span class="s">"Worker cancelled mid-flight on msg </span><span class="si">{</span><span class="n">msg</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'_id'</span><span class="p">)</span><span class="si">}</span><span class="s">, marking failed"</span><span class="p">)</span>
    <span class="k">await</span> <span class="n">ad</span><span class="p">.</span><span class="n">queue</span><span class="p">.</span><span class="n">delete_msg</span><span class="p">(</span><span class="n">analytiq_client</span><span class="p">,</span> <span class="s">"ocr"</span><span class="p">,</span> <span class="nb">str</span><span class="p">(</span><span class="n">msg</span><span class="p">[</span><span class="s">"_id"</span><span class="p">]),</span> <span class="n">status</span><span class="o">=</span><span class="s">"failed"</span><span class="p">)</span>
    <span class="k">raise</span>  <span class="c1"># allow the task to actually cancel
</span></code></pre></div></div>

<p>The failed job can then be retried on another pod.</p>
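<p>For the handler to run at all, the pod needs enough grace time between SIGTERM and SIGKILL. The relevant knob on the Deployment's pod spec (the value here is illustrative):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spec:
  template:
    spec:
      # Kubernetes default is 30s; allow time to mark in-flight jobs failed
      terminationGracePeriodSeconds: 60
</code></pre></div></div>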

<h2 id="database-migrations-as-a-helm-pre-upgrade-hook">Database Migrations as a Helm Pre-Upgrade Hook</h2>

<p>Running database migrations safely in a multi-replica environment requires that migrations complete before any new application code starts serving traffic. In Docker Compose this is handled by startup ordering, but in Kubernetes rolling updates, new pods can start before old ones are gone — with no guarantee about migration timing.</p>

<p>We solved this with a Helm hook Job that runs <code class="language-plaintext highlighter-rouge">migrate.py</code> using the same backend image, annotated to execute before the upgrade rolls out:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">annotations</span><span class="pi">:</span>
  <span class="s2">"</span><span class="s">helm.sh/hook"</span><span class="err">:</span> <span class="s">pre-upgrade,pre-rollback</span>
  <span class="s">"helm.sh/hook-weight"</span><span class="err">:</span> <span class="s2">"</span><span class="s">-5"</span>
  <span class="s2">"</span><span class="s">helm.sh/hook-delete-policy"</span><span class="err">:</span> <span class="s">hook-succeeded,before-hook-creation</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">pre-upgrade</code> hook ensures migrations run and complete successfully before Helm touches any Deployment. If the migration Job fails, Helm aborts the upgrade entirely — the old version keeps running. <code class="language-plaintext highlighter-rouge">hook-delete-policy: hook-succeeded</code> cleans up the completed Job automatically, keeping the namespace tidy. The <code class="language-plaintext highlighter-rouge">before-hook-creation</code> policy ensures the old Job is removed if a previous run left one behind.</p>

<p>One subtlety: at pre-upgrade time, the ConfigMap has not yet been updated by Helm (hooks run before regular resources). The migration Job therefore mounts only the Secret — which contains <code class="language-plaintext highlighter-rouge">MONGODB_URI</code> — and not the ConfigMap:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">envFrom</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">secretRef</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">doc-router-secrets</span>
<span class="c1"># ConfigMap intentionally omitted — not yet updated at hook time</span>
</code></pre></div></div>

<p>This means <code class="language-plaintext highlighter-rouge">migrate.py</code> must be written to need only the database connection string, with no dependency on application config values.</p>
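<p>A hypothetical sketch of what that constraint looks like in practice (module and collection names are illustrative, not Doc Router's actual migration code):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># migrate.py -- reads only MONGODB_URI, injected from the Secret
import os

from pymongo import MongoClient


def main():
    # Assumes the URI names a default database, e.g. mongodb://.../docrouter
    client = MongoClient(os.environ["MONGODB_URI"])
    db = client.get_default_database()
    # Example migration step: ensure an index exists before new pods roll out
    db["documents"].create_index("workspace_id")


if __name__ == "__main__":
    main()
</code></pre></div></div>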

<p>The result is a safe, atomic upgrade sequence: <strong>migrate → roll out new pods → terminate old pods</strong> — with automatic rollback if the migration fails.</p>

<h2 id="hpa-tuning">HPA Tuning</h2>

<p>We configured Horizontal Pod Autoscaler on the backend with both CPU and memory targets:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">metrics</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">type</span><span class="pi">:</span> <span class="s">Resource</span>
  <span class="na">resource</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">cpu</span>
    <span class="na">target</span><span class="pi">:</span>
      <span class="na">type</span><span class="pi">:</span> <span class="s">Utilization</span>
      <span class="na">averageUtilization</span><span class="pi">:</span> <span class="m">80</span>
<span class="pi">-</span> <span class="na">type</span><span class="pi">:</span> <span class="s">Resource</span>
  <span class="na">resource</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">memory</span>
    <span class="na">target</span><span class="pi">:</span>
      <span class="na">type</span><span class="pi">:</span> <span class="s">Utilization</span>
      <span class="na">averageUtilization</span><span class="pi">:</span> <span class="m">80</span>
</code></pre></div></div>

<p>A subtle issue: HPA scale-down uses <code class="language-plaintext highlighter-rouge">ceil(currentReplicas × currentUtil / targetUtil)</code>. With 5 pods at 72% memory utilization against an 80% target, <code class="language-plaintext highlighter-rouge">ceil(5 × 72/80) = ceil(4.5) = 5</code> — the ceiling arithmetic created a deadlock where the cluster could never scale below 5 pods.</p>

<p>The fix was increasing the memory request from 512 Mi to 768 Mi. After the worker merge reduced actual usage to ~190 MB, utilization dropped to ~25% — well below the threshold — and the cluster scaled back down to the minimum of 2 replicas.</p>
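<p>The same formula confirms the fix: at ~25% utilization against the 80% target, <code class="language-plaintext highlighter-rouge">ceil(5 × 25/80) = ceil(1.57) = 2</code>, so the HPA can finally step down toward the minimum.</p>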

<h2 id="environment-configuration">Environment Configuration</h2>

<p>Next.js <code class="language-plaintext highlighter-rouge">NEXT_PUBLIC_*</code> variables are baked into the browser bundle at build time, not injected at runtime. This caused a subtle bug: our local <code class="language-plaintext highlighter-rouge">.env.local</code> file set <code class="language-plaintext highlighter-rouge">NEXT_PUBLIC_FASTAPI_FRONTEND_URL=http://127.0.0.1:8000</code>. Because <code class="language-plaintext highlighter-rouge">.env.local</code> wasn’t listed in <code class="language-plaintext highlighter-rouge">.dockerignore</code>, it was copied into the Docker build context and read by Next.js during <code class="language-plaintext highlighter-rouge">npm run build</code> — silently overriding the intended production value and baking the localhost URL into every image.</p>

<p>We fixed this in two steps:</p>

<ol>
  <li>
    <p><strong>Exclude all <code class="language-plaintext highlighter-rouge">.env.*</code> files from the Docker build context</strong> by adding <code class="language-plaintext highlighter-rouge">**/.env.*</code> to <code class="language-plaintext highlighter-rouge">.dockerignore</code>, so local development env files can never leak into images.</p>
  </li>
  <li>
    <p><strong>Remove <code class="language-plaintext highlighter-rouge">NEXT_PUBLIC_FASTAPI_FRONTEND_URL</code> entirely.</strong> Rather than baking an absolute URL into the bundle, the frontend now always calls <code class="language-plaintext highlighter-rouge">/fastapi</code> — a relative path that works from any hostname. Next.js rewrites proxy <code class="language-plaintext highlighter-rouge">/fastapi/:path*</code> to the backend service URL at the server layer:</p>
  </li>
</ol>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// next.config.mjs</span>
<span class="k">async</span> <span class="nx">rewrites</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">return</span> <span class="p">[{</span>
    <span class="na">source</span><span class="p">:</span> <span class="dl">'</span><span class="s1">/fastapi/:path*</span><span class="dl">'</span><span class="p">,</span>
    <span class="na">destination</span><span class="p">:</span> <span class="s2">`</span><span class="p">${</span><span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">FASTAPI_BACKEND_URL</span><span class="p">}</span><span class="s2">/fastapi/:path*`</span><span class="p">,</span>
  <span class="p">}];</span>
<span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">FASTAPI_BACKEND_URL</code> is a server-side runtime variable (not <code class="language-plaintext highlighter-rouge">NEXT_PUBLIC_</code>) pointing to the in-cluster backend service (<code class="language-plaintext highlighter-rouge">http://backend.&lt;namespace&gt;.svc.cluster.local:8000</code>). It is never exposed to the browser. The result is a truly environment-agnostic frontend image that requires no rebuild when moving between clusters.</p>

<h2 id="cicd-pipeline">CI/CD Pipeline</h2>

<h3 id="structure">Structure</h3>

<p>We use four GitHub Actions workflows:</p>

<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">backend-tests.yml</code></strong> — runs Python tests against a local MongoDB Atlas instance (with vector search via <code class="language-plaintext highlighter-rouge">mongodb-atlas-local</code>) plus TypeScript tests. Triggered by <code class="language-plaintext highlighter-rouge">workflow_call</code> or <code class="language-plaintext highlighter-rouge">workflow_dispatch</code>.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">frontend-build.yml</code></strong> — runs <code class="language-plaintext highlighter-rouge">npm run build</code> for the Next.js frontend. Also triggered by <code class="language-plaintext highlighter-rouge">workflow_call</code> or <code class="language-plaintext highlighter-rouge">workflow_dispatch</code>.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">ci.yml</code></strong> — runs both test workflows on every pull request to <code class="language-plaintext highlighter-rouge">main</code>.</li>
  <li><strong><code class="language-plaintext highlighter-rouge">release.yml</code></strong> — triggered on semver tags (<code class="language-plaintext highlighter-rouge">v[0-9]*.[0-9]*.[0-9]*</code>). Runs both test workflows first, then builds and pushes Docker images if they pass.</li>
</ul>

<h3 id="why-semver-tags-not-branch-pushes">Why semver tags, not branch pushes</h3>

<p>An early version of the pipeline ran tests on every push to <code class="language-plaintext highlighter-rouge">main</code> and triggered builds from there. This caused two problems:</p>

<ol>
  <li><strong>Tests ran twice per release</strong> — once on the branch push, once triggered by the tag.</li>
  <li><strong>The tag trigger didn’t wait for tests</strong> — if a tag was pushed immediately after a commit, the build could race ahead of a still-running test run.</li>
</ol>

<p>The current design avoids both: <code class="language-plaintext highlighter-rouge">release.yml</code> is only triggered by a semver tag, and the <code class="language-plaintext highlighter-rouge">build-push</code> job declares <code class="language-plaintext highlighter-rouge">needs: [test-backend, test-frontend]</code>, so Docker images are never built unless all tests pass on that exact commit. Tests run exactly once per release.</p>
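<p>The wiring is compact. A trimmed sketch of <code class="language-plaintext highlighter-rouge">release.yml</code> (job names match the description above; the build steps are elided):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>on:
  push:
    tags: ['v[0-9]*.[0-9]*.[0-9]*']

jobs:
  test-backend:
    uses: ./.github/workflows/backend-tests.yml
  test-frontend:
    uses: ./.github/workflows/frontend-build.yml
  build-push:
    needs: [test-backend, test-frontend]   # images only build if both pass
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # docker build / push steps go here
</code></pre></div></div>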

<p>The <code class="language-plaintext highlighter-rouge">ci.yml</code> workflow handles the PR gate separately — developers get test feedback on their branch without triggering a build.</p>

<h3 id="reusable-test-workflows">Reusable test workflows</h3>

<p>Making the test workflows <code class="language-plaintext highlighter-rouge">workflow_call</code>-able (rather than duplicating the job definitions in both <code class="language-plaintext highlighter-rouge">ci.yml</code> and <code class="language-plaintext highlighter-rouge">release.yml</code>) keeps the test logic in one place. Both workflows call the same definitions; any change to the test steps is automatically reflected in both gates.</p>

<p><code class="language-plaintext highlighter-rouge">workflow_dispatch</code> is kept on each test workflow so that individual test suites can be re-run manually from the GitHub Actions UI without needing to push a commit or tag.</p>

<h3 id="image-tagging">Image tagging</h3>

<p>The build step computes image tags from the git tag:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TAG="${{ github.ref_name }}"          # e.g. v27.0.1-rc2 or v27.0.1
FRONTEND_TAGS="${FRONTEND}:${TAG}"
# :latest only for stable releases (no pre-release suffix)
if [[ "$TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
  FRONTEND_TAGS="${FRONTEND_TAGS},${FRONTEND}:latest"
fi
</code></pre></div></div>

<p>Release candidates (<code class="language-plaintext highlighter-rouge">v27.0.1-rc2</code>) get a versioned tag only. Stable releases (<code class="language-plaintext highlighter-rouge">v27.0.1</code>) also update <code class="language-plaintext highlighter-rouge">:latest</code>. This means a cluster running <code class="language-plaintext highlighter-rouge">:latest</code> auto-updates on the next <code class="language-plaintext highlighter-rouge">helm upgrade</code>, while a cluster pinned to a specific tag is unaffected.</p>

<h3 id="helm-chart-publishing-is-manual">Helm chart publishing is manual</h3>

<p>The Helm chart is published separately with <code class="language-plaintext highlighter-rouge">./deploy/scripts/publish-chart.sh &lt;overlay&gt;</code>. We kept this manual for two reasons: the chart version is independent of the app version (you might push 10 image releases without any chart changes), and publishing the chart is a deliberate operator action — it should not happen automatically on every tag.</p>

<h2 id="egress-ips-and-external-service-whitelisting">Egress IPs and External Service Whitelisting</h2>

<p>A practical difference between EKS and DOKS emerged when connecting to MongoDB Atlas, which requires IP whitelisting for all incoming connections.</p>

<p><strong>On EKS</strong>, the cluster’s private node group sits behind a single NAT gateway. All outbound traffic from every pod — regardless of which node it runs on — exits through one stable public IP. Adding that single IP to MongoDB Atlas’s allowlist is all that’s needed, and the IP never changes when nodes are replaced or the cluster scales.</p>

<p><strong>On DOKS</strong>, there is no NAT gateway by default. Each node is assigned its own public IP, and pods reach the internet directly through the node they’re scheduled on. This means:</p>

<ul>
  <li>There is no single egress IP — the source address MongoDB sees depends on which node the backend pod happens to be running on.</li>
  <li>With two nodes, you need two IPs in the allowlist. With autoscaling, new nodes get new IPs, and the allowlist breaks until you add them.</li>
</ul>

<p>For a fixed-size dev cluster, the workaround is to whitelist all current node IPs. For a production DOKS cluster with autoscaling, the correct solution is to route all cluster egress through a <strong>NAT gateway</strong> with a single stable public IP. This adds roughly $12/month but is the only reliable option when the external service requires a static source address.</p>

<p>For our dev cluster (<code class="language-plaintext highlighter-rouge">doc-router-dev</code>), we whitelist the two node IPs directly. For production DOKS deployments, a managed NAT gateway is required.</p>
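<p>To see the IPs that need whitelisting on a DOKS cluster:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># EXTERNAL-IP column shows each node's public egress address
kubectl get nodes -o wide
</code></pre></div></div>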

<h2 id="overlay-based-deploy-scripts">Overlay-based Deploy Scripts</h2>

<p>Rather than a one-size-fits-all deploy script, we use an overlay pattern:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.env              # shared defaults (local dev values)
.env.eks-test     # overrides for the test EKS cluster
.env.eks-prod     # overrides for production
</code></pre></div></div>

<p>The deploy scripts (<code class="language-plaintext highlighter-rouge">k8s-deploy.sh</code>, <code class="language-plaintext highlighter-rouge">build-push.sh</code>) accept an overlay name and source both files, with the overlay taking precedence. A single variable — <code class="language-plaintext highlighter-rouge">APP_HOST</code> — drives all URL configuration, making it straightforward to add a new environment. <code class="language-plaintext highlighter-rouge">k8s-deploy.sh</code> is idempotent — it uses <code class="language-plaintext highlighter-rouge">helm upgrade --install</code> and handles both fresh installs and rolling updates without any distinction.</p>
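<p>The sourcing logic is a few lines of shell. A hypothetical sketch (variable names and the <code class="language-plaintext highlighter-rouge">--set</code> key follow the conventions above, not the actual script):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
# k8s-deploy.sh &lt;overlay&gt; -- illustrative excerpt
set -euo pipefail
OVERLAY="$1"

set -a                       # export everything sourced below
source .env                  # shared defaults
source ".env.${OVERLAY}"     # overlay overrides, e.g. .env.eks-prod
set +a

helm upgrade --install doc-router oci://ghcr.io/analytiq-hub/doc-router \
  --namespace doc-router --create-namespace \
  --set appHost="${APP_HOST}"
</code></pre></div></div>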

<h2 id="whats-next">What’s Next</h2>

<ul>
  <li><strong>On-premises distribution</strong> — Helm chart and images are public on ghcr.io; self-hosted MongoDB is available via the <a href="https://github.com/analytiq-hub/analytiq-charts"><code class="language-plaintext highlighter-rouge">mongodb-atlas-local</code></a> chart (see <a href="/tech/kubernetes/devops/mongodb/self-hosted-mongodb-kubernetes-atlas-search/">Self-Hosted MongoDB on Kubernetes with Atlas Search</a>); documentation for a one-command on-prem install is the next step</li>
  <li><strong>Offline license keys</strong> — JWT-based licenses signed with a private key, verified against a public key baked into the image, for air-gapped installations</li>
  <li><strong>Multi-cloud support</strong> — Digital Ocean Kubernetes is now supported alongside EKS; Azure Kubernetes Service support is planned</li>
</ul>

<hr />

<p><em>Andrei Radulescu-Banu is the founder of <a href="https://docrouter.ai">DocRouter.AI</a> (document processing with LLMs) and <a href="https://sigagent.ai">SigAgent.AI</a> (Claude Agent monitoring). His company <a href="https://analytiqhub.com">AnalytiqHub.com</a> provides consulting services for cloud and AI engineering.</em></p>]]></content><author><name>Andrei Radulescu-Banu</name></author><category term="tech" /><category term="kubernetes" /><category term="devops" /><category term="docrouter" /><summary type="html"><![CDATA[Production-grade Kubernetes support for Doc Router: key decisions, Helm chart, worker merging, graceful shutdown, and multi-cloud deployment.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://smaht.ai/assets/images/deploying-doc-router-kubernetes-splash.png" /><media:content medium="image" url="https://smaht.ai/assets/images/deploying-doc-router-kubernetes-splash.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Kubernetes Packaging and Deployment: Kustomize, Helm, and GitOps</title><link href="https://smaht.ai/tech/kubernetes/devops/kubernetes-packaging-helm-gitops/" rel="alternate" type="text/html" title="Kubernetes Packaging and Deployment: Kustomize, Helm, and GitOps" /><published>2026-03-06T00:00:00+00:00</published><updated>2026-03-06T00:00:00+00:00</updated><id>https://smaht.ai/tech/kubernetes/devops/kubernetes-packaging-helm-gitops</id><content type="html" xml:base="https://smaht.ai/tech/kubernetes/devops/kubernetes-packaging-helm-gitops/"><![CDATA[<p>This is the second part of the Kubernetes primer series. The <a href="/tech/kubernetes/devops/kubernetes-for-docker-users-primer/">first part</a> covered the core building blocks — Pods, Deployments, Services, Secrets, PVCs, and Helm basics. This part goes deeper into the two dominant approaches to packaging Kubernetes manifests, and then introduces GitOps as an alternative to running deploy scripts manually.</p>

<hr />

<h2 id="the-manifest-problem">The manifest problem</h2>

<p>A real Kubernetes application needs dozens of YAML files: Deployments, Services, ConfigMaps, Secrets, Ingress rules, HorizontalPodAutoscalers, PodDisruptionBudgets. Writing them by hand is feasible once, but the moment you need the same app running in three environments — local, staging, production — you face a choice:</p>

<ul>
  <li><strong>Copy the files for each environment</strong> and keep them in sync manually (fragile)</li>
  <li><strong>Use a tool that handles the variation</strong> for you</li>
</ul>

<p>Two tools dominate: <strong>Kustomize</strong> and <strong>Helm</strong>. They solve the same problem differently, and many projects use both — Helm for third-party software, Kustomize for their own app.</p>

<hr />

<h2 id="kustomize--layered-yaml-patches">Kustomize — layered YAML patches</h2>

<p>Kustomize ships with <code class="language-plaintext highlighter-rouge">kubectl</code> (no install needed) and works with plain YAML. The idea is a <strong>base + overlays</strong> structure:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>manifests/
  base/
    deployment.yaml      # canonical deployment
    service.yaml
    kustomization.yaml   # lists the resources
  overlays/
    dev/
      kustomization.yaml # patches for dev
      patch-replicas.yaml
    prod/
      kustomization.yaml # patches for prod
      patch-replicas.yaml
      patch-resources.yaml
</code></pre></div></div>

<p>The base defines the resource once. Each overlay patches only what differs. A typical patch looks like:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># overlays/prod/patch-replicas.yaml</span>
<span class="na">apiVersion</span><span class="pi">:</span> <span class="s">apps/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Deployment</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">backend</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">replicas</span><span class="pi">:</span> <span class="m">4</span>       <span class="c1"># override base value of 2</span>
</code></pre></div></div>

<p>To deploy the prod overlay:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kubectl apply <span class="nt">-k</span> overlays/prod/
</code></pre></div></div>

<p>Kustomize merges the base YAML with all patches before sending anything to the API server. You always see plain, readable YAML — there is no templating language to learn, and the output is predictable.</p>
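<p>Because the merge is deterministic, you can render it locally before applying:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Print the fully merged YAML without touching the cluster
kubectl kustomize overlays/prod/
</code></pre></div></div>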

<h3 id="variable-substitution">Variable substitution</h3>

<p>For values that vary by environment (hostnames, image tags, resource sizes), Kustomize offers variable substitution via <code class="language-plaintext highlighter-rouge">vars</code>: a variable is bound to a field of a ConfigMap or Secret and injected into the manifests wherever it is referenced:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># kustomization.yaml</span>
<span class="na">configurations</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s">var-references.yaml</span>
<span class="na">vars</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">APP_DOMAIN</span>
    <span class="na">objref</span><span class="pi">:</span>
      <span class="na">kind</span><span class="pi">:</span> <span class="s">ConfigMap</span>
      <span class="na">name</span><span class="pi">:</span> <span class="s">project-values</span>
      <span class="na">apiVersion</span><span class="pi">:</span> <span class="s">v1</span>
    <span class="na">fieldref</span><span class="pi">:</span>
      <span class="na">fieldpath</span><span class="pi">:</span> <span class="s">data.domain</span>
</code></pre></div></div>

<p>This is less flexible than Helm’s full templating but keeps the YAML closer to what Kubernetes actually receives.</p>

<h3 id="what-kustomize-does-not-do">What Kustomize does not do</h3>

<p>Kustomize has no concept of a release, no revision history, and no built-in rollback. If you apply a broken overlay, you must fix it and reapply, or manually apply a previous version. For the same reason, there is no <code class="language-plaintext highlighter-rouge">--atomic</code> safety net — if a deployment fails mid-rollout, you notice from <code class="language-plaintext highlighter-rouge">kubectl</code> output, not from the packaging tool.</p>
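<p>The closest manual fallback is the per-workload rollout history that Kubernetes itself keeps (names here are illustrative):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Roll a single Deployment back to its previous pod template
kubectl rollout undo deployment/backend -n my-app
</code></pre></div></div>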

<hr />

<h2 id="helm--templated-packages">Helm — templated packages</h2>

<p>Helm wraps Kubernetes YAML in a full templating engine (Go templates) and adds lifecycle management on top. A chart is a directory:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>doc-router/
  Chart.yaml          # name, version, appVersion
  values.yaml         # default values
  templates/
    deployment.yaml   # Go template
    service.yaml
    ingress.yaml
    _helpers.tpl      # reusable template fragments
</code></pre></div></div>

<p>A template looks like:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># templates/deployment.yaml
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: backend
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          resources:
            requests:
              cpu: {{ .Values.resources.requests.cpu }}
</code></pre></div></div>

<p>To install with custom values:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm upgrade <span class="nt">--install</span> doc-router ./doc-router <span class="se">\</span>
  <span class="nt">--set</span> <span class="nv">replicaCount</span><span class="o">=</span>4 <span class="se">\</span>
  <span class="nt">--set</span> image.tag<span class="o">=</span>v1.2.3
</code></pre></div></div>

<p>Or via an override file:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm upgrade <span class="nt">--install</span> doc-router ./doc-router <span class="nt">-f</span> values-prod.yaml
</code></pre></div></div>

<h3 id="release-history-and-rollback">Release history and rollback</h3>

<p>Helm records every install and upgrade as a numbered revision in the cluster. You can inspect history and roll back:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm <span class="nb">history </span>doc-router <span class="nt">-n</span> doc-router
helm rollback doc-router 2 <span class="nt">-n</span> doc-router   <span class="c"># back to revision 2</span>
</code></pre></div></div>

<p>With <code class="language-plaintext highlighter-rouge">--atomic</code>, a failed upgrade automatically triggers a rollback — the old version keeps running uninterrupted.</p>
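<p>In practice that means adding one flag to the upgrade command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm upgrade --install doc-router ./doc-router --atomic --timeout 5m
</code></pre></div></div>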

<h3 id="publishing-charts-as-oci-artifacts">Publishing charts as OCI artifacts</h3>

<p>A packaged chart can be pushed to any OCI-compatible registry (ghcr.io, ECR, Docker Hub) and pulled from anywhere:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm push doc-router-0.3.7.tgz oci://ghcr.io/analytiq-hub
helm upgrade <span class="nt">--install</span> doc-router oci://ghcr.io/analytiq-hub/doc-router <span class="nt">--version</span> 0.3.7
</code></pre></div></div>

<p>This means a customer cluster can install your app with a single command, pulling both the chart and images from the same registry, with no Git access required.</p>

<hr />

<h2 id="kustomize-vs-helm--when-to-use-each">Kustomize vs Helm — when to use each</h2>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>Kustomize</th>
      <th>Helm</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Learning curve</td>
      <td>Low — just YAML</td>
      <td>Higher — Go templates + chart structure</td>
    </tr>
    <tr>
      <td>Flexibility</td>
      <td>Patches and substitutions</td>
      <td>Full templating, conditionals, loops</td>
    </tr>
    <tr>
      <td>Release history</td>
      <td>None</td>
      <td>Built-in, per-revision</td>
    </tr>
    <tr>
      <td>Rollback</td>
      <td>Manual</td>
      <td><code class="language-plaintext highlighter-rouge">helm rollback</code></td>
    </tr>
    <tr>
      <td>Failure safety</td>
      <td>None</td>
      <td><code class="language-plaintext highlighter-rouge">--atomic</code> auto-rollback</td>
    </tr>
    <tr>
      <td>Publishing</td>
      <td>OCI artifact via Flux</td>
      <td><code class="language-plaintext highlighter-rouge">helm push</code> to any OCI registry</td>
    </tr>
    <tr>
      <td>Best for</td>
      <td>Your own first-party manifests</td>
      <td>Distributable packages, third-party software</td>
    </tr>
  </tbody>
</table>

<p>In practice many projects use both: Helm for installing third-party dependencies (ingress-nginx, cert-manager, MongoDB operator), and Kustomize for their own application manifests. The two are compatible — a Kustomize overlay can reference a Helm chart as a generator.</p>
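<p>A sketch of that combination, using Kustomize's built-in Helm generator (requires <code class="language-plaintext highlighter-rouge">kustomize build --enable-helm</code>; chart coordinates are illustrative):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code># kustomization.yaml
resources:
  - ../base                  # your own first-party manifests
helmCharts:
  - name: ingress-nginx
    repo: https://kubernetes.github.io/ingress-nginx
    version: 4.11.3
    releaseName: ingress-nginx
    namespace: ingress-nginx
</code></pre></div></div>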

<hr />

<h2 id="gitops--the-cluster-manages-itself">GitOps — the cluster manages itself</h2>

<p>Both Kustomize and Helm, as described so far, are <strong>imperative</strong>: a human (or a CI job) runs a command that pushes changes into the cluster. GitOps flips this model.</p>

<p>In GitOps, the desired cluster state is declared in a Git repository (or an OCI artifact registry). A controller running <em>inside</em> the cluster continuously watches that source and reconciles actual state to match it. No one runs <code class="language-plaintext highlighter-rouge">helm upgrade</code> — the cluster pulls its own updates.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Developer pushes to Git / CI pushes OCI artifact
         ↓
  Source of truth updated
         ↓
  In-cluster controller detects drift
         ↓
  Controller applies the diff
         ↓
  Cluster matches desired state
</code></pre></div></div>

<p>The key property: <strong>the cluster self-heals</strong>. If someone manually deletes a Deployment or edits a ConfigMap, the controller notices the drift and reverts it within seconds. The Git repo (or OCI artifact) is always the authoritative source.</p>
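<p>Reconciliation can also be triggered on demand, which is handy when testing (the Kustomization name matches the examples below):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Fetch the latest source revision and apply it immediately
flux reconcile kustomization my-app --with-source

# Inspect sync status of everything Flux manages
flux get kustomizations
</code></pre></div></div>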

<hr />

<h2 id="flux--a-gitops-controller">Flux — a GitOps controller</h2>

<p><strong>Flux</strong> is one of the two dominant GitOps controllers (the other is Argo CD). It runs as a set of controllers in the cluster and watches sources:</p>

<h3 id="sources">Sources</h3>

<p>Flux can watch:</p>
<ul>
  <li><strong>Git repositories</strong> — on every push, Flux reconciles the cluster</li>
  <li><strong>OCI artifact registries</strong> — on every <code class="language-plaintext highlighter-rouge">flux push artifact</code>, Flux pulls and applies</li>
  <li><strong>Helm repositories</strong> — for managing Helm releases declaratively</li>
</ul>
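<p>For the OCI path, a CI job publishes manifests with <code class="language-plaintext highlighter-rouge">flux push artifact</code>, for example (registry path illustrative):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>flux push artifact oci://ghcr.io/analytiq-hub/my-app-manifests:latest \
  --path=./manifests \
  --source="$(git config --get remote.origin.url)" \
  --revision="$(git rev-parse HEAD)"
</code></pre></div></div>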

<h3 id="core-resources">Core resources</h3>

<p><strong>GitRepository / OCIRepository</strong> — defines where Flux watches:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">apiVersion</span><span class="pi">:</span> <span class="s">source.toolkit.fluxcd.io/v1beta2</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">OCIRepository</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">my-app</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">flux-system</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">interval</span><span class="pi">:</span> <span class="s">1m</span>
  <span class="na">url</span><span class="pi">:</span> <span class="s">oci://123456789.dkr.ecr.us-east-1.amazonaws.com/my-app-manifests</span>
  <span class="na">ref</span><span class="pi">:</span>
    <span class="na">tag</span><span class="pi">:</span> <span class="s">latest</span>
</code></pre></div></div>

<p><strong>Kustomization</strong> — tells Flux what to apply from the source:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">apiVersion</span><span class="pi">:</span> <span class="s">kustomize.toolkit.fluxcd.io/v1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">Kustomization</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">my-app</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">flux-system</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">interval</span><span class="pi">:</span> <span class="s">5m</span>
  <span class="na">sourceRef</span><span class="pi">:</span>
    <span class="na">kind</span><span class="pi">:</span> <span class="s">OCIRepository</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">my-app</span>
  <span class="na">path</span><span class="pi">:</span> <span class="s">./manifests/kubernetes/overlays/prod</span>
  <span class="na">prune</span><span class="pi">:</span> <span class="no">true</span>      <span class="c1"># delete resources removed from source</span>
  <span class="na">healthChecks</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">apiVersion</span><span class="pi">:</span> <span class="s">apps/v1</span>
      <span class="na">kind</span><span class="pi">:</span> <span class="s">Deployment</span>
      <span class="na">name</span><span class="pi">:</span> <span class="s">backend</span>
      <span class="na">namespace</span><span class="pi">:</span> <span class="s">my-app</span>
</code></pre></div></div>

<p><strong>HelmRelease</strong> — manages a Helm release declaratively:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">apiVersion</span><span class="pi">:</span> <span class="s">helm.toolkit.fluxcd.io/v2beta1</span>
<span class="na">kind</span><span class="pi">:</span> <span class="s">HelmRelease</span>
<span class="na">metadata</span><span class="pi">:</span>
  <span class="na">name</span><span class="pi">:</span> <span class="s">ingress-nginx</span>
  <span class="na">namespace</span><span class="pi">:</span> <span class="s">flux-system</span>
<span class="na">spec</span><span class="pi">:</span>
  <span class="na">interval</span><span class="pi">:</span> <span class="s">1h</span>
  <span class="na">chart</span><span class="pi">:</span>
    <span class="na">spec</span><span class="pi">:</span>
      <span class="na">chart</span><span class="pi">:</span> <span class="s">ingress-nginx</span>
      <span class="na">version</span><span class="pi">:</span> <span class="s2">"</span><span class="s">4.11.3"</span>
      <span class="na">sourceRef</span><span class="pi">:</span>
        <span class="na">kind</span><span class="pi">:</span> <span class="s">HelmRepository</span>
        <span class="na">name</span><span class="pi">:</span> <span class="s">ingress-nginx</span>
  <span class="na">values</span><span class="pi">:</span>
    <span class="na">controller</span><span class="pi">:</span>
      <span class="na">replicaCount</span><span class="pi">:</span> <span class="m">2</span>
</code></pre></div></div>
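<p>The <code class="language-plaintext highlighter-rouge">sourceRef</code> above points at a <strong>HelmRepository</strong>, which is declared separately so multiple releases can share it:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 1h
  url: https://kubernetes.github.io/ingress-nginx
</code></pre></div></div>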

<h3 id="cicd-with-flux">CI/CD with Flux</h3>

<p>A typical Flux-based pipeline looks like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1. Developer opens a PR
2. CI runs tests
3. PR merged to main
4. CI builds Docker image → pushes to ECR
5. CI packages Kustomize manifests as OCI artifact → flux push artifact → ECR
6. Flux detects new artifact version
7. Flux applies manifests to cluster
8. Cluster rolls out new Deployment
</code></pre></div></div>

<p>Steps 6–8 happen automatically, inside the cluster, with no deploy script and no human intervention.</p>
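<p>Step 5 is a single CLI call. A sketch, reusing the registry URL and tag from the <code class="language-plaintext highlighter-rouge">OCIRepository</code> example above:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>flux push artifact oci://123456789.dkr.ecr.us-east-1.amazonaws.com/my-app-manifests:latest \
  --path=./manifests \
  --source="$(git config --get remote.origin.url)" \
  --revision="$(git rev-parse HEAD)"
</code></pre></div></div>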

<h3 id="flux-vs-running-deploy-scripts">Flux vs running deploy scripts</h3>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>Shell script (<code class="language-plaintext highlighter-rouge">helm upgrade</code>)</th>
      <th>Flux GitOps</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Who initiates deploy</td>
      <td>Human or CI job</td>
      <td>Cluster controller</td>
    </tr>
    <tr>
      <td>Drift detection</td>
      <td>None — manual kubectl needed</td>
      <td>Continuous — auto-reverts</td>
    </tr>
    <tr>
      <td>Audit trail</td>
      <td>CI logs</td>
      <td>Git history + Flux events</td>
    </tr>
    <tr>
      <td>Rollback</td>
      <td><code class="language-plaintext highlighter-rouge">helm rollback</code></td>
      <td>Revert commit, Flux reconciles</td>
    </tr>
    <tr>
      <td>Complexity</td>
      <td>Low — just a shell script</td>
      <td>Higher — Flux controllers + CRDs</td>
    </tr>
    <tr>
      <td>Air-gapped / on-prem</td>
      <td>Simple</td>
      <td>Requires Flux + registry access</td>
    </tr>
  </tbody>
</table>

<p>GitOps is the right choice for teams with multiple people deploying to shared clusters, or for production environments where drift must be detected and prevented. For a small team or a self-hosted product where simplicity matters, shell scripts with <code class="language-plaintext highlighter-rouge">helm upgrade --install</code> are easier to understand, debug, and hand off to a customer.</p>

<hr />

<h2 id="summary">Summary</h2>

<table>
  <thead>
    <tr>
      <th>Tool</th>
      <th>Role</th>
      <th>Key strength</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Kustomize</strong></td>
      <td>Overlay-based YAML patching</td>
      <td>Plain YAML, no templates, built into kubectl</td>
    </tr>
    <tr>
      <td><strong>Helm</strong></td>
      <td>Templated package manager</td>
      <td>Release history, rollback, publishable charts</td>
    </tr>
    <tr>
      <td><strong>Flux</strong></td>
      <td>GitOps controller</td>
      <td>Self-healing cluster, drift detection, no manual deploys</td>
    </tr>
    <tr>
      <td><strong>Argo CD</strong></td>
      <td>GitOps controller (alternative to Flux)</td>
      <td>Web UI, application health visualisation</td>
    </tr>
  </tbody>
</table>

<p>A mature production setup typically combines all three layers: Kustomize or Helm for defining manifests, Flux or Argo CD for reconciling them, and a CI pipeline that produces the artifacts both consume.</p>

<p><strong>Next:</strong> <a href="/tech/kubernetes/devops/docrouter/deploying-doc-router-on-kubernetes/">Deploying Doc Router on Kubernetes</a> walks through a real application deployment (Helm chart, workers, CI/CD, EKS and Digital Ocean). If you need in-cluster MongoDB with vector search, see <a href="/tech/kubernetes/devops/mongodb/self-hosted-mongodb-kubernetes-atlas-search/">Self-Hosted MongoDB on Kubernetes with Atlas Search</a>.</p>

<hr />

<p><em>Andrei Radulescu-Banu is the founder of <a href="https://docrouter.ai">DocRouter.AI</a> (document processing with LLMs) and <a href="https://sigagent.ai">SigAgent.AI</a> (Claude Agent monitoring). His company <a href="https://analytiqhub.com">AnalytiqHub.com</a> provides consulting services for cloud and AI engineering.</em></p>]]></content><author><name>Andrei Radulescu-Banu</name></author><category term="tech" /><category term="kubernetes" /><category term="devops" /><summary type="html"><![CDATA[The second part of the Kubernetes primer series: Kustomize, Helm, and GitOps with Flux — packaging manifests and letting the cluster manage itself.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://smaht.ai/assets/images/kubernetes-packaging-helm-gitops-splash.png" /><media:content medium="image" url="https://smaht.ai/assets/images/kubernetes-packaging-helm-gitops-splash.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Kubernetes for Docker Users: A Practical Primer</title><link href="https://smaht.ai/tech/kubernetes/devops/kubernetes-for-docker-users-primer/" rel="alternate" type="text/html" title="Kubernetes for Docker Users: A Practical Primer" /><published>2026-03-05T00:00:00+00:00</published><updated>2026-03-05T00:00:00+00:00</updated><id>https://smaht.ai/tech/kubernetes/devops/kubernetes-for-docker-users-primer</id><content type="html" xml:base="https://smaht.ai/tech/kubernetes/devops/kubernetes-for-docker-users-primer/"><![CDATA[<p>If you’ve used Docker Compose, you already understand the core idea: define your services, wire them together with a network, and let the runtime manage the processes. Kubernetes takes that same idea and extends it to run across a cluster of machines, with built-in handling for failures, scaling, and upgrades.</p>

<p>Here’s how the key concepts map across.</p>

<h2 id="from-containers-to-pods">From containers to Pods</h2>

<p>In Docker Compose, the unit of work is a container. In Kubernetes, it is a <strong>Pod</strong> — a group of one or more containers that always run together on the same machine and share a network namespace. Most Pods contain a single container, but some use sidecars: a main process plus a helper (a log shipper, a proxy, or in our case the <code class="language-plaintext highlighter-rouge">mongot</code> search process alongside <code class="language-plaintext highlighter-rouge">mongod</code>).</p>

<p>Pods are ephemeral. When a Pod dies, Kubernetes replaces it with a new one — possibly on a different machine, with a new IP address. You never SSH into a Pod or rely on its IP being stable.</p>

<h2 id="deployments--the-equivalent-of-a-compose-service">Deployments — the equivalent of a Compose service</h2>

<p>A <strong>Deployment</strong> tells Kubernetes: “keep N replicas of this Pod running at all times.” If a Pod crashes, the Deployment controller starts a replacement. If you push a new image, it performs a rolling update — starting new Pods before terminating old ones so traffic is never interrupted.</p>

<p>In Docker Compose terms, a Deployment is your <code class="language-plaintext highlighter-rouge">service:</code> block plus restart policies and rolling update logic built in.</p>
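<p>A minimal Deployment, with an illustrative image and port:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: doc-router
spec:
  replicas: 2                 # keep 2 Pods running at all times
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
        - name: backend
          image: ghcr.io/example/backend:v1.0.0   # illustrative image
          ports:
            - containerPort: 8000
</code></pre></div></div>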

<h2 id="services--stable-internal-addresses">Services — stable internal addresses</h2>

<p>Because Pod IPs change on every restart, Kubernetes introduces <strong>Services</strong>: stable DNS names and virtual IPs that front a group of Pods. A Service named <code class="language-plaintext highlighter-rouge">backend</code> in the <code class="language-plaintext highlighter-rouge">doc-router</code> namespace is reachable at <code class="language-plaintext highlighter-rouge">backend.doc-router.svc.cluster.local</code> from anywhere in the cluster, regardless of how many backend Pods exist or where they are running.</p>

<p>This replaces the automatic DNS that Docker Compose sets up between containers on the same network.</p>
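<p>The Service that fronts those Pods is just a label selector and a port mapping:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: v1
kind: Service
metadata:
  name: backend
  namespace: doc-router
spec:
  selector:
    app: backend              # matches the Deployment's Pod labels
  ports:
    - port: 8000              # port the Service exposes
      targetPort: 8000        # port the container listens on
</code></pre></div></div>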

<h2 id="namespaces--isolation-within-a-cluster">Namespaces — isolation within a cluster</h2>

<p>A <strong>Namespace</strong> is a logical partition of the cluster. Resources in different namespaces don’t collide even if they share a name. A typical setup uses separate namespaces for each concern: <code class="language-plaintext highlighter-rouge">doc-router</code> for the application, <code class="language-plaintext highlighter-rouge">mongodb</code> for the database, <code class="language-plaintext highlighter-rouge">ingress-nginx</code> for the load balancer, <code class="language-plaintext highlighter-rouge">cert-manager</code> for TLS certificates.</p>

<p>In Docker Compose terms, a namespace is roughly equivalent to a separate Compose project — distinct networks and name scopes.</p>

<h2 id="configmaps-and-secrets--environment-variables-at-scale">ConfigMaps and Secrets — environment variables at scale</h2>

<p>Docker Compose lets you set <code class="language-plaintext highlighter-rouge">environment:</code> variables inline or via an <code class="language-plaintext highlighter-rouge">.env</code> file. Kubernetes separates non-sensitive config from sensitive config:</p>

<ul>
  <li><strong>ConfigMap</strong> — key-value pairs mounted as environment variables or files. Used for things like <code class="language-plaintext highlighter-rouge">FASTAPI_ROOT_PATH</code>, worker count, S3 bucket name.</li>
  <li><strong>Secret</strong> — base64-encoded values stored (optionally encrypted at rest) separately from your app manifests. Used for database URIs, API keys, and auth secrets. Pods reference Secrets by name; the values are injected at runtime, never baked into the image.</li>
</ul>
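<p>A minimal sketch of the pair (keys and values here are illustrative, not an actual chart’s):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: v1
kind: ConfigMap
metadata:
  name: backend-config
data:
  FASTAPI_ROOT_PATH: /fastapi
  N_WORKERS: "4"
---
apiVersion: v1
kind: Secret
metadata:
  name: backend-secrets
type: Opaque
stringData:                   # written as plaintext, stored base64-encoded
  MONGODB_URI: mongodb://mongodb.mongodb.svc.cluster.local:27017
</code></pre></div></div>

<p>A Pod can then pull both in with <code class="language-plaintext highlighter-rouge">envFrom</code>, so the Deployment manifest never contains the sensitive values themselves.</p>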

<h2 id="persistentvolumeclaims--durable-storage">PersistentVolumeClaims — durable storage</h2>

<p>Docker Compose uses named volumes (backed by the local filesystem) to persist data across container restarts. Kubernetes uses <strong>PersistentVolumeClaims (PVCs)</strong>: a request for a piece of storage of a given size and access mode. The cluster fulfils the claim by provisioning a real volume — an EBS disk on AWS, a DO Block Storage volume on Digital Ocean — and mounting it into the Pod.</p>

<p>PVCs survive Pod restarts and rescheduling. If a database Pod moves to a different node, the volume is detached and reattached automatically. Storage is provisioned dynamically by a <strong>StorageClass</strong>, which specifies the provisioner (e.g. <code class="language-plaintext highlighter-rouge">ebs.csi.aws.com</code> on EKS) and volume type.</p>
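<p>A claim is only a few lines of YAML; the names below are illustrative:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-data
  namespace: mongodb
spec:
  accessModes:
    - ReadWriteOnce           # mountable by one node at a time
  storageClassName: gp3       # illustrative; fulfilled by the EBS CSI provisioner on EKS
  resources:
    requests:
      storage: 10Gi
</code></pre></div></div>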

<h2 id="ingress-and-the-load-balancer">Ingress and the load balancer</h2>

<p>In Docker Compose you typically expose one port from one container. In Kubernetes, multiple Services need to be reachable from the outside under different paths or hostnames, all through a single external IP.</p>

<p><strong>ingress-nginx</strong> is a Kubernetes controller that runs an nginx reverse proxy inside the cluster. When deployed on EKS, it automatically provisions an AWS Network Load Balancer with a stable public IP. You define <strong>Ingress</strong> rules — “route <code class="language-plaintext highlighter-rouge">/fastapi</code> to the backend Service, everything else to the frontend Service” — and ingress-nginx handles the routing. On a new cluster, the load balancer is the only resource with a public IP; everything else is internal.</p>

<h2 id="cert-manager--automatic-tls">cert-manager — automatic TLS</h2>

<p>cert-manager is a Kubernetes controller that watches Ingress resources and automatically requests TLS certificates from Let’s Encrypt. When you annotate an Ingress with <code class="language-plaintext highlighter-rouge">cert-manager.io/cluster-issuer: letsencrypt-prod</code>, cert-manager handles the ACME challenge, obtains the certificate, stores it in a Secret, and renews it before it expires. You never touch a certificate manually.</p>
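<p>Putting the last two sections together: a single Ingress that routes <code class="language-plaintext highlighter-rouge">/fastapi</code> to the backend, sends everything else to the frontend, and asks cert-manager for a certificate. The hostname, ports, and Secret name are illustrative:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: doc-router
  namespace: doc-router
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts: [example.com]
      secretName: doc-router-tls   # cert-manager creates and renews this Secret
  rules:
    - host: example.com
      http:
        paths:
          - path: /fastapi
            pathType: Prefix
            backend:
              service:
                name: backend
                port:
                  number: 8000
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 3000
</code></pre></div></div>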

<h2 id="helm--packaging-it-all-together">Helm — packaging it all together</h2>

<p>Kubernetes resources are defined as YAML files. A real application needs dozens of them: Deployments, Services, ConfigMaps, Secrets, Ingress rules, PodDisruptionBudgets. <strong>Helm</strong> is the package manager for Kubernetes — it bundles all those YAML files into a <strong>chart</strong>, parameterises them with a <code class="language-plaintext highlighter-rouge">values.yaml</code> file, and installs or upgrades the whole bundle with a single command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm upgrade <span class="nt">--install</span> doc-router oci://ghcr.io/analytiq-hub/doc-router <span class="se">\</span>
  <span class="nt">--namespace</span> doc-router <span class="nt">--set</span> <span class="nv">appHost</span><span class="o">=</span>example.com ...
</code></pre></div></div>

<p>A chart can be published as an OCI artifact to any container registry alongside the Docker images.</p>

<p>If Docker Compose is a <code class="language-plaintext highlighter-rouge">docker run</code> wrapper, Helm is closer to an apt package: versioned, reproducible, and upgradeable.</p>

<h3 id="how-helm-applies-changes">How Helm applies changes</h3>

<p>Every time you run <code class="language-plaintext highlighter-rouge">helm upgrade</code>, Helm compares the new rendered YAML against what it last applied and sends only the diff to the Kubernetes API — resources that haven’t changed are left untouched. Helm records each upgrade as a numbered <strong>revision</strong>, stored as a Secret in the cluster:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>helm <span class="nb">history </span>doc-router <span class="nt">-n</span> doc-router
REVISION  STATUS     CHART           APP VERSION  DESCRIPTION
1         superseded doc-router-0.3.5  v27.0.0    Install <span class="nb">complete
</span>2         superseded doc-router-0.3.6  v27.0.1    Upgrade <span class="nb">complete
</span>3         deployed   doc-router-0.3.7  v27.0.2    Upgrade <span class="nb">complete</span>
</code></pre></div></div>

<h3 id="rolling-back-to-a-known-good-state">Rolling back to a known-good state</h3>

<p>If an upgrade goes wrong, rolling back to the previous revision is a single command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm rollback doc-router <span class="nt">-n</span> doc-router        <span class="c"># rolls back to revision 2</span>
helm rollback doc-router 1 <span class="nt">-n</span> doc-router      <span class="c"># rolls back to a specific revision</span>
</code></pre></div></div>

<p>Helm re-applies the exact YAML from that revision — the same image tags, the same config values — so the cluster returns to the state that last worked. Using <code class="language-plaintext highlighter-rouge">--atomic</code> during an upgrade makes this automatic: if the new Pods don’t become healthy within the timeout, Helm rolls back on its own without any manual intervention.</p>
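<p>In practice that means two extra flags on the install command shown earlier (the timeout value is illustrative):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helm upgrade --install doc-router oci://ghcr.io/analytiq-hub/doc-router \
  --namespace doc-router --atomic --timeout 10m
</code></pre></div></div>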

<h3 id="zero-downtime-rolling-updates">Zero-downtime rolling updates</h3>

<p>When Helm upgrades a Deployment with a new image, Kubernetes does not restart all Pods at once. It uses a <strong>rolling update</strong> strategy controlled by two parameters:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">strategy</span><span class="pi">:</span>
  <span class="na">type</span><span class="pi">:</span> <span class="s">RollingUpdate</span>
  <span class="na">rollingUpdate</span><span class="pi">:</span>
    <span class="na">maxUnavailable</span><span class="pi">:</span> <span class="m">0</span>   <span class="c1"># never take a pod down before a new one is ready</span>
    <span class="na">maxSurge</span><span class="pi">:</span> <span class="m">1</span>         <span class="c1"># allow one extra pod above the desired count during the rollout</span>
</code></pre></div></div>

<p>With <code class="language-plaintext highlighter-rouge">maxUnavailable: 0</code>, Kubernetes starts a new Pod with the new image first. Only after that Pod passes its readiness probe — meaning it is actually serving traffic — does Kubernetes terminate one of the old Pods. This continues one Pod at a time until all replicas are on the new version. At no point does the number of healthy Pods drop below the desired count.</p>

<p>The result: an upgrade from <code class="language-plaintext highlighter-rouge">v27.0.1</code> to <code class="language-plaintext highlighter-rouge">v27.0.2</code> with two replicas proceeds as:</p>

<ol>
  <li>Start new Pod (v27.0.2) — 2 old + 1 new running</li>
  <li>New Pod passes readiness check</li>
  <li>Terminate one old Pod — 1 old + 1 new running</li>
  <li>Start second new Pod — 1 old + 2 new running</li>
  <li>Second new Pod passes readiness — terminate last old Pod</li>
  <li>Rollout complete — 2 new Pods running, zero downtime</li>
</ol>

<p>If the new Pod fails its readiness check at step 2, the rollout pauses. No old Pods have been terminated, so the old version continues serving 100% of traffic. With <code class="language-plaintext highlighter-rouge">--atomic</code>, Helm then rolls the release back automatically.</p>

<h2 id="running-kubernetes-locally-with-kind">Running Kubernetes locally with Kind</h2>

<p>Before deploying to a real cluster, it’s useful to test locally using <strong>Kind</strong> (Kubernetes in Docker). Kind runs an entire Kubernetes cluster — control plane and worker nodes — as Docker containers on your laptop. There is no cloud provider, no load balancer, and no cloud volumes; Kind uses your local filesystem for storage and <code class="language-plaintext highlighter-rouge">NodePort</code> services for external access.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./deploy/scripts/setup-kind.sh   <span class="c"># creates the Kind cluster</span>
./deploy/scripts/deploy-kind.sh  <span class="c"># installs the Helm chart locally</span>
</code></pre></div></div>

<p>The same chart that runs on EKS runs on Kind, with a different <code class="language-plaintext highlighter-rouge">values-kind.yaml</code> override file. This lets you iterate on chart changes without incurring cloud costs or waiting for node provisioning.</p>
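<p>Kind’s cluster shape is itself declared in YAML. A sketch (port numbers illustrative) that maps a NodePort to your laptop, making a Service reachable at <code class="language-plaintext highlighter-rouge">localhost:8080</code>:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30080   # NodePort of a Service inside the cluster
        hostPort: 8080         # forwarded to localhost:8080 on the host
</code></pre></div></div>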

<h2 id="summary">Summary</h2>

<table>
  <thead>
    <tr>
      <th>Docker Compose concept</th>
      <th>Kubernetes equivalent</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Container</td>
      <td>Pod (usually 1 container, sometimes with sidecars)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">service:</code> block</td>
      <td>Deployment + Service</td>
    </tr>
    <tr>
      <td>Container DNS (service name)</td>
      <td>Service DNS (<code class="language-plaintext highlighter-rouge">name.namespace.svc.cluster.local</code>)</td>
    </tr>
    <tr>
      <td>Compose project</td>
      <td>Namespace</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">environment:</code> / <code class="language-plaintext highlighter-rouge">.env</code></td>
      <td>ConfigMap (non-secret) + Secret (sensitive)</td>
    </tr>
    <tr>
      <td>Named volume</td>
      <td>PersistentVolumeClaim + StorageClass</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">ports:</code> expose</td>
      <td>Ingress + LoadBalancer Service</td>
    </tr>
    <tr>
      <td>Manual TLS</td>
      <td>cert-manager (automatic Let’s Encrypt)</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">docker-compose.yml</code></td>
      <td>Helm chart (<code class="language-plaintext highlighter-rouge">values.yaml</code> + templates)</td>
    </tr>
    <tr>
      <td>Local Docker</td>
      <td>Kind (Kubernetes in Docker)</td>
    </tr>
  </tbody>
</table>

<p><strong>Next:</strong> <a href="/tech/kubernetes/devops/kubernetes-packaging-helm-gitops/">Kubernetes Packaging and Deployment: Kustomize, Helm, and GitOps</a> goes deeper into packaging manifests and GitOps with Flux.</p>

<hr />

<p><em>Andrei Radulescu-Banu is the founder of <a href="https://docrouter.ai">DocRouter.AI</a> (document processing with LLMs) and <a href="https://sigagent.ai">SigAgent.AI</a> (Claude Agent monitoring). His company <a href="https://analytiqhub.com">AnalytiqHub.com</a> provides consulting services for cloud and AI engineering.</em></p>]]></content><author><name>Andrei Radulescu-Banu</name></author><category term="tech" /><category term="kubernetes" /><category term="devops" /><summary type="html"><![CDATA[If you've used Docker Compose, you already understand the core idea. Kubernetes takes that same idea and extends it to run across a cluster of machines, with built-in handling for failures, scaling, and upgrades.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://smaht.ai/assets/images/kubernetes-docker-users-primer-splash.png" /><media:content medium="image" url="https://smaht.ai/assets/images/kubernetes-docker-users-primer-splash.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Three SaaS Strategies for an Age of AI</title><link href="https://smaht.ai/business/saas/ai/strategy/three-saas-strategies-for-an-age-of-ai/" rel="alternate" type="text/html" title="Three SaaS Strategies for an Age of AI" /><published>2026-02-27T00:00:00+00:00</published><updated>2026-02-27T00:00:00+00:00</updated><id>https://smaht.ai/business/saas/ai/strategy/three-saas-strategies-for-an-age-of-ai</id><content type="html" xml:base="https://smaht.ai/business/saas/ai/strategy/three-saas-strategies-for-an-age-of-ai/"><![CDATA[<p><img src="/assets/images/three-musketeers.png" alt="Three SaaS Strategies for an Age of AI" /></p>

<p>How can you structure a SaaS business so that AI makes your business more valuable, rather than just blowing it up?</p>

<p>These three strategies match AI to different parts of a product life cycle - launching, growing, or harvesting profits.</p>

<h2 id="launch-to-ai-buyers"><strong>Launch to AI Buyers</strong></h2>

<p>SaaS companies are losing human seats. On the other hand, they are expanding sales to AI.</p>

<p>AI is a rapidly growing channel. It works like this: People ask their AI to do something. Then the AI finds, acquires, and uses services. Coding agents work aggressively to find and embed services. Other types of AI buyers include task-focused agents, chatbots and search engines, and personal assistants like Claude Cowork and OpenClaw.</p>

<p>You can sell your existing product to AI buyers. However, you will want to try some new packaging. For example, humans have spent 30 years rejecting micropayments, preferring to buy chunky subscriptions that require fewer purchase decisions. AI agents use “X402” to make a stream of small payments, averaging around 20 cents per transaction.</p>

<p>Have you built an AI channel?</p>

<ul>
  <li>Can AI find and recommend your product?</li>
  <li>Can it use your product?</li>
  <li>Can it onboard its human to use your product?</li>
  <li>Can it buy your product?</li>
</ul>

<h2 id="grow-with-increased-velocity"><strong>Grow with Increased Velocity</strong></h2>

<p>Is the product actively competing for new customers in an existing market?</p>

<p>In the age of AI, you will be less sure about what features, benefits, and channels will win. When you see that something is working, you want to grab share. A basic starting point is to maximize velocity with AI.</p>

<p>Your programmers have increased their velocity by using AI coders. Now it is your turn to automate more of the steps in product delivery. AI can look at data and interview customers to figure out what use cases to focus on. It can write change requests and feature requests. It can write code (we knew that). It can review code. It can deploy and monitor deployments. It can encourage each customer to expand usage. It can pull their data from competing systems. It can’t do all of these things with high quality today. It will be surprisingly good at all of them by the end of 2026.</p>

<p>You will free up time as you add individual agents and tools for each step. You will get a sudden surge of velocity when the steps link together into a “software factory”.</p>

<p>Then you will be ready to beat the competition when you see an opportunity.</p>

<h2 id="increase-profitability-for-mature-products"><strong>Increase Profitability for Mature Products</strong></h2>

<p>Does the product have a bombproof customer base that renews every year and just wants the product to work?</p>

<p>Many SaaS products end up in this situation. The customers want continuity and reliability. As a new product developer, I find it agonizing to say this, but those customers do NOT want you to bother them about a lot of new stuff.</p>

<p>You can run this business with six people, and a bunch of AI assistants. Then it will be more profitable. The Silicon Valley guys are talking about how AI assistants can power a billion dollar company run by a single person. That will be awesome because one person is both agile (no meetings about HR or change management) and cheap. More practically, we can catch up with <a href="https://www.saastr.com/a-big-year-for-saastr-ai-what-we-got-done-in-2025/">SaaStr</a> to enjoy “Eight-figure revenue with single-digit headcount.”</p>

<p>The classic startup team is:</p>

<ul>
  <li>A “hustler” who makes sales, gets resources, and organizes. This is often a CEO role</li>
  <li>A “hipster” who handles marketing and customer experience</li>
  <li>A “hacker” who builds out the technical aspects of a product</li>
</ul>

<p>You can run a business with these three roles. Because your deliverable is continuity, each person will need a backup and successor. You end up with six people.</p>

<p>Lock these guys down, pay them well, and find good backups. You are delivering confidence in the continuity and reliability of your product and team.</p>

<h2 id="next-steps"><strong>Next Steps</strong></h2>

<p>This article illustrates three different ways to use AI:</p>

<ul>
  <li>As a customer for new launches. I am working on an AI channel grader that will figure out where a product is effectively selling to AI, and where we need to improve the packaging.</li>
  <li>As a worker in a software factory that increases production velocity. I am working on adding software factory capabilities to our TIPL project launcher.</li>
  <li>As an assistant for humans that increases revenue per employee. You can fill this slot with your own priorities.</li>
</ul>

<p>Follow here for updates and ideas.</p>]]></content><author><name>Andy Singleton</name></author><category term="business" /><category term="saas" /><category term="ai" /><category term="strategy" /><summary type="html"><![CDATA[How can you structure a SaaS business so that AI makes your business more valuable, rather than just blowing it up?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://smaht.ai/assets/images/three-musketeers.png" /><media:content medium="image" url="https://smaht.ai/assets/images/three-musketeers.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How To Create Document Workflows With Temporal And DocRouter.AI</title><link href="https://smaht.ai/tech/programming/ai/tutorials/how-to-create-document-workflows-with-temporal-and-docrouter-ai/" rel="alternate" type="text/html" title="How To Create Document Workflows With Temporal And DocRouter.AI" /><published>2025-12-25T00:00:00+00:00</published><updated>2025-12-25T00:00:00+00:00</updated><id>https://smaht.ai/tech/programming/ai/tutorials/how-to-create-document-workflows-with-temporal-and-docrouter-ai</id><content type="html" xml:base="https://smaht.ai/tech/programming/ai/tutorials/how-to-create-document-workflows-with-temporal-and-docrouter-ai/"><![CDATA[<p>🚀 Just spent the last few days building a powerful multi-step document processing pipeline — and it handles 200+ page medical records like a champ!</p>

<p>Single-prompt tools like DocRouter.AI shine for ~20-25 page docs… but what about massive collated files with labs, facesheets, insurance cards, and multiple patients mixed together? → One prompt = impossible.</p>

<p><strong>Enter the solution: Temporal + DocRouter.AI in a smart, scalable workflow.</strong></p>

<p>This post describes a real-world implementation that uses <a href="https://temporal.io/">Temporal</a> to orchestrate document processing workflows with <a href="http://docrouter.ai">DocRouter.AI</a>, solving the challenge of processing massive medical records through intelligent multi-step orchestration.</p>

<p>The implementation is available at <a href="https://github.com/analytiq-hub/doc-router-temporal/blob/blog_post_dec_2025">doc-router-temporal</a> and processes medical documents containing hundreds of pages, extracting patient names, dates of birth, and medical insurance information.</p>

<h2 id="the-problem-massive-medical-records-need-smart-orchestration">The Problem: Massive Medical Records Need Smart Orchestration</h2>

<p>Medical records often come as massive collated files containing 200+ pages with:</p>
<ul>
  <li>Lab results and test reports</li>
  <li>Patient facesheets with demographics</li>
  <li>Insurance cards and coverage details</li>
  <li>Clinical notes and progress reports</li>
  <li>Multiple patients’ information mixed together</li>
</ul>

<p><strong>The challenge</strong>: These documents are too large to process in a single LLM prompt due to token limits (typically 128K-200K tokens). One prompt = impossible for comprehensive extraction.</p>

<p><strong>The solution</strong>: A multi-step workflow that intelligently orchestrates the process:</p>

<ol>
  <li><strong>Split</strong>: Break the massive PDF into individual pages</li>
  <li><strong>Classify</strong>: Identify each page’s type and which patient it belongs to</li>
  <li><strong>Group</strong>: Intelligently group pages by patient</li>
  <li><strong>Extract</strong>: Process each patient’s page bundle for precise, targeted extraction</li>
</ol>

<h2 id="why-this-pattern-rocks">Why This Pattern Rocks</h2>

<p>This Temporal + DocRouter.AI combination delivers powerful advantages:</p>

<p>✅ <strong>Constant memory usage</strong> — scales effortlessly to 1,000+ pages without running out of resources</p>

<p>✅ <strong>Super general pattern</strong> → split → classify → group → process per group → works for any document type</p>

<p>✅ <strong>Fully durable &amp; retry-safe</strong> thanks to Temporal’s built-in resilience</p>

<p>✅ <strong>Built lightning-fast</strong> in just a couple of days using AI tools</p>

<p>✅ <strong>Parallel processing</strong> — handles multiple patients simultaneously while maintaining order</p>

<p>✅ <strong>Production-ready</strong> with automatic error handling, timeouts, and state management</p>

<div data-excalidraw="/assets/excalidraw/document_processing_solution.excalidraw" class="excalidraw-container">
  <div class="loading-placeholder">Loading diagram...</div>
</div>
<div style="text-align: center; margin-top: 1rem;">
  <a href="/excalidraw-edit?file=/assets/excalidraw/document_processing_solution.excalidraw" target="_blank" style="color: #2563eb; text-decoration: none; font-weight: 500;">
    📝 Edit in Excalidraw
  </a>
</div>

<h2 id="the-smart-workflow-in-action">The Smart Workflow in Action</h2>

<p>Here’s how the Temporal + DocRouter.AI workflow processes massive medical records:</p>

<p>🔹 <strong>Temporal splits</strong> the 200+ page PDF into individual pages</p>

<p>🔹 <strong>Uploads them one by one</strong> to DocRouter.AI for processing</p>

<p>🔹 <strong>DocRouter classifies each page</strong> → identifies patient name + document type (lab results, insurance card, facesheet, etc.)</p>

<p>🔹 <strong>Temporal intelligently groups pages by patient</strong> using fuzzy name matching and DOB correlation</p>

<p>🔹 <strong>Sends each patient’s page bundle back to DocRouter.AI</strong> for precise, targeted extraction</p>

<p>🔹 <strong>Temporal aggregates everything</strong> → clean, complete per-patient results</p>

<h2 id="technical-implementation">Technical Implementation</h2>

<h2 id="why-temporal">Why Temporal?</h2>

<p><a href="https://temporal.io/">Temporal</a> provides durable workflow orchestration that’s perfect for this use case. Unlike traditional approaches (queues, background jobs, or simple scripts), Temporal handles:</p>

<ul>
  <li><strong>Durable execution</strong>: Resumes from crashes during 200-page processing</li>
  <li><strong>Parallel processing</strong>: Processes multiple pages simultaneously while maintaining order</li>
  <li><strong>Error handling</strong>: Automatic retries for API rate limits and network issues</li>
  <li><strong>State management</strong>: Tracks processed pages and identified patients</li>
  <li><strong>Long-running workflows</strong>: Handles processes that take minutes to hours</li>
</ul>

<p>Temporal’s architecture is built around two key concepts: <strong>Workflows</strong> (orchestration logic) and <strong>Activities</strong> (actual work). The diagram below illustrates how these components work together:</p>

<div data-excalidraw="/assets/excalidraw/temporal_workflows_activities.excalidraw" class="excalidraw-container">
  <div class="loading-placeholder">Loading diagram...</div>
</div>
<div style="text-align: center; margin-top: 1rem;">
  <a href="/excalidraw-edit?file=/assets/excalidraw/temporal_workflows_activities.excalidraw" target="_blank" style="color: #2563eb; text-decoration: none; font-weight: 500;">
    📝 Edit in Excalidraw
  </a>
</div>

<h2 id="the-workflow-implementation">The Workflow Implementation</h2>

<p>The implementation uses a hierarchical workflow structure with two main workflows:</p>

<ol>
  <li><strong>Classify and Group PDF Pages</strong> (<code class="language-plaintext highlighter-rouge">ClassifyAndGroupPDFPagesWorkflow</code>): Chunks the PDF, classifies each page, and groups pages by patient</li>
  <li><strong>Extract Insurance Information</strong> (<code class="language-plaintext highlighter-rouge">ClassifyGroupAndExtractInsuranceWorkflow</code>): Creates patient-specific PDFs and extracts insurance card data</li>
</ol>

<p>The main workflow (<code class="language-plaintext highlighter-rouge">ClassifyGroupAndExtractInsuranceWorkflow</code>) orchestrates the entire process:</p>

<div data-excalidraw="/assets/excalidraw/temporal_docrouter_workflow.excalidraw" class="excalidraw-container">
  <div class="loading-placeholder">Loading diagram...</div>
</div>
<div style="text-align: center; margin-top: 1rem;">
  <a href="/excalidraw-edit?file=/assets/excalidraw/temporal_docrouter_workflow.excalidraw" target="_blank" style="color: #2563eb; text-decoration: none; font-weight: 500;">
    📝 Edit in Excalidraw
  </a>
</div>

<style>
.excalidraw-container {
  width: 100%;
  border: 2px solid #e0e0e0;
  border-radius: 8px;
  box-shadow: 0 2px 8px rgba(0,0,0,0.1);
  background: white;
  display: block;
  margin: 2rem 0;
  min-height: 400px;
}

.excalidraw-container svg {
  width: 100%;
  height: auto;
  display: block;
  margin: 0;
}

.loading-placeholder {
  padding: 2rem;
  text-align: center;
  color: #666;
}
</style>

<script type="module" src="/assets/js/excalidraw/render-excalidraw.js"></script>

<h2 id="creating-schemas-and-prompts-with-claude-agent">Creating Schemas and Prompts with Claude Agent</h2>

<p>Before building the Temporal workflow, we created the extraction schemas and prompts using the <strong>Claude Agent for DocRouter.AI</strong> (an MCP server at <a href="https://github.com/analytiq-hub/doc-router/tree/main/packages/typescript/mcp"><code class="language-plaintext highlighter-rouge">doc-router/packages/typescript/mcp</code></a>).</p>

<p>The Claude Agent allows Claude Code to create extraction schemas and prompts. For example, you can prompt: <em>“Create a schema for extracting patient information from medical record pages”</em> and it will validate, create, and test the schema automatically.</p>

<p>The diagram below illustrates how DocRouter.AI operations work and how they integrate with Temporal workflows:</p>

<div data-excalidraw="/assets/excalidraw/docrouter_operations.excalidraw" class="excalidraw-container">
  <div class="loading-placeholder">Loading diagram...</div>
</div>
<div style="text-align: center; margin-top: 1rem;">
  <a href="/excalidraw-edit?file=/assets/excalidraw/docrouter_operations.excalidraw" target="_blank" style="color: #2563eb; text-decoration: none; font-weight: 500;">
    📝 Edit in Excalidraw
  </a>
</div>

<p><strong>Key distinction</strong>: DocRouter.AI implements <strong>discrete operations</strong> (single prompt-and-schema processing per document), while Temporal implements the <strong>workflow orchestration</strong> (chunking, grouping, uploading pages for classification, and uploading chunks for extraction).</p>

<p>For this implementation, we created:</p>
<ul>
  <li><strong><code class="language-plaintext highlighter-rouge">medical_page_classifier</code></strong>: Classifies pages as labs, facesheets, insurance cards, clinical notes, or other document types</li>
  <li><strong><code class="language-plaintext highlighter-rouge">insurance_card</code></strong>: Extracts insurance card information from patient pages</li>
</ul>

<h2 id="workflow-implementation">Workflow Implementation</h2>

<p>The main workflow (<code class="language-plaintext highlighter-rouge">ClassifyGroupAndExtractInsuranceWorkflow</code>) orchestrates the entire process. Complete implementation: <a href="https://github.com/analytiq-hub/doc-router-temporal/blob/blog_post_dec_2025/workflows/classify_group_and_extract_insurance.py"><code class="language-plaintext highlighter-rouge">workflows/classify_group_and_extract_insurance.py</code></a>.</p>
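<p>In outline, the hierarchy looks roughly like this. This is a simplified sketch, not the linked file’s exact code: the workflow and activity names follow the repo, but the return shapes and the timeout are assumptions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import asyncio
from datetime import timedelta

from temporalio import workflow


@workflow.defn
class ClassifyGroupAndExtractInsuranceWorkflow:
    @workflow.run
    async def run(self, pdf_path: str) -&gt; dict:
        # Step 1: child workflow chunks the PDF, classifies every page,
        # and returns pages grouped per patient (shape assumed here).
        patient_groups = await workflow.execute_child_workflow(
            "ClassifyAndGroupPDFPagesWorkflow", pdf_path,
        )
        # Step 2: per patient, an activity builds that patient's PDF,
        # uploads it to DocRouter.AI, and polls for the extraction.
        # asyncio.gather processes the patients in parallel.
        tasks = {
            patient: workflow.execute_activity(
                "create_and_upload_patient_pdf",
                args=[pdf_path, pages],
                start_to_close_timeout=timedelta(minutes=10),
            )
            for patient, pages in patient_groups.items()
        }
        extracted = await asyncio.gather(*tasks.values())
        return dict(zip(tasks.keys(), extracted))
</code></pre></div></div>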

<h3 id="step-1-classify-and-group-pages">Step 1: Classify and Group Pages</h3>

<p>The workflow calls <code class="language-plaintext highlighter-rouge">ClassifyAndGroupPDFPagesWorkflow</code> to:</p>
<ol>
  <li><strong>Chunk the PDF</strong> into individual pages</li>
  <li><strong>Classify each page</strong> using DocRouter.AI</li>
  <li><strong>Group pages by patient</strong> using name and DOB matching</li>
</ol>

<p>The grouping logic (<a href="https://github.com/analytiq-hub/doc-router-temporal/blob/blog_post_dec_2025/activities/group_classification_results.py"><code class="language-plaintext highlighter-rouge">activities/group_classification_results.py</code></a>) includes name normalization, DOB parsing, and fuzzy matching with Levenshtein distance to handle typos and variations.</p>
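<p>In a stripped-down sketch (the field names are assumed; the linked activity has the real logic):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def levenshtein(a: str, b: str) -&gt; int:
    """Edit distance via the classic two-row dynamic program."""
    if len(a) &lt; len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def normalize(name: str) -&gt; str:
    """Lowercase and collapse whitespace before comparing names."""
    return " ".join(name.lower().split())


def same_patient(p1: dict, p2: dict, max_edits: int = 2) -&gt; bool:
    """Pages belong to the same patient when DOBs agree and the
    normalized names differ by at most max_edits characters."""
    return (p1["dob"] == p2["dob"] and
            levenshtein(normalize(p1["name"]), normalize(p2["name"])) &lt;= max_edits)
</code></pre></div></div>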

<h3 id="step-2-extract-insurance-information">Step 2: Extract Insurance Information</h3>

<p>For each patient group, the workflow:</p>
<ol>
  <li><strong>Creates patient-specific PDFs</strong> with only that patient’s pages (<a href="https://github.com/analytiq-hub/doc-router-temporal/blob/blog_post_dec_2025/activities/create_and_upload_patient_pdf.py"><code class="language-plaintext highlighter-rouge">activities/create_and_upload_patient_pdf.py</code></a>)</li>
  <li><strong>Uploads them to DocRouter.AI</strong> for insurance card extraction</li>
  <li><strong>Polls for completion</strong> and retrieves results</li>
</ol>

<p>To avoid passing large binary data through Temporal, PDFs are read from disk and uploaded directly to DocRouter.AI.</p>

<h2 id="creating-temporal-workflows-with-cursor">Creating Temporal Workflows with Cursor</h2>

<p>The Temporal workflow was developed in <strong>Cursor</strong> using natural language prompts. Cursor’s AI understood the codebase context and Temporal patterns, enabling rapid development without deep workflow expertise.</p>

<p><strong>Key benefits:</strong></p>
<ul>
  <li>Context awareness across multiple files and existing activities</li>
  <li>Automatic Temporal pattern suggestions (activities, workflows, child workflows)</li>
  <li>Natural language refactoring and error handling implementation</li>
</ul>

<p><strong>Example development prompts:</strong></p>

<p><em>“Create a workflow that processes each patient’s pages into separate PDFs, uploads them with insurance_card tag, waits for completion, then retrieves insurance extraction results.”</em></p>

<p><em>“Add fuzzy name matching to group pages with names differing by up to 2 letters using Levenshtein distance.”</em></p>

<p><em>“Handle edge cases where medical records contain individual patient names vs. multiple patient summaries.”</em></p>

<p>Cursor handled the complex Temporal implementation, error handling, and performance optimizations, resulting in production-ready code in just a few hours.</p>

<h2 id="key-implementation-details">Key Implementation Details</h2>

<h3 id="design-decisions">Design Decisions</h3>

<ul>
  <li><strong>Avoid large data transfer</strong>: PDFs are read from disk and uploaded directly to DocRouter.AI, not passed through Temporal</li>
  <li><strong>Parallel processing</strong>: Multiple patients processed concurrently with status polling</li>
  <li><strong>Error handling</strong>: Retry logic, graceful degradation, and timeout handling</li>
  <li><strong>State management</strong>: Only document IDs and metadata flow through Temporal to keep history efficient</li>
</ul>

<h2 id="results">Results</h2>

<p>The implementation successfully processes massive medical record documents with hundreds of pages, extracting patient names, dates of birth, and medical insurance information. It handles large documents (200+ pages), parallel patient processing, error recovery, and long-running operations.</p>

<h3 id="running-the-workflow">Running the Workflow</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Start the Temporal worker</span>
python worker.py

<span class="c"># In another terminal, run the client</span>
python client_classify_group_and_extract_insurance.py &lt;path_to_pdf&gt;
</code></pre></div></div>

<p>See the <a href="https://github.com/analytiq-hub/doc-router-temporal/blob/blog_post_dec_2025/README.md">README</a> and <a href="https://github.com/analytiq-hub/doc-router-temporal/blob/blog_post_dec_2025/client_classify_group_and_extract_insurance.py">client script</a> for details.</p>

<p>The workflow returns JSON with file name, page classifications, schedule pages, and patient data with insurance information.</p>

<h2 id="were-in-a-new-era">We’re in a New Era</h2>

<p><strong>What used to take months of engineering can now be shipped in days.</strong></p>

<p>This Temporal + DocRouter.AI pipeline was built end-to-end with a Claude Code-based agent (for prompts and schemas) and Cursor (for the Temporal workflows). I barely knew Temporal before starting — it didn’t matter. AI tools let me iterate fast, prototype, and refine the logic in record time.</p>

<p>The result: reliable, scalable document processing with durable workflows, parallel processing, and rapid schema iteration. The implementation took just 2 days to build and handles 200+ page medical records like a champ.</p>

<p>If you’re building AI-powered document workflows (especially in healthcare), this combo is 🔥.</p>

<p>Code available at <a href="https://github.com/analytiq-hub/doc-router-temporal/tree/blog_post_dec_2025">doc-router-temporal</a>.</p>

<ul>
  <li><a href="https://docs.temporal.io/">Temporal Documentation</a></li>
  <li><a href="https://docrouter.ai/docs/quick-start">DocRouter.AI Documentation</a></li>
  <li><a href="https://docrouter.ai/docs/mcp">DocRouter.AI MCP Server</a></li>
</ul>]]></content><author><name>Andrei Radulescu-Banu</name></author><category term="tech" /><category term="programming" /><category term="ai" /><category term="tutorials" /><summary type="html"><![CDATA[🚀 Just spent the last few days building a powerful multi-step document processing pipeline — and it handles 200+ page medical records like a champ!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://smaht.ai/assets/images/temporal_docrouter_workflows.svg" /><media:content medium="image" url="https://smaht.ai/assets/images/temporal_docrouter_workflows.svg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Announcing Analytiq Pages Theme: A Modern Jekyll Theme with Tailwind CSS</title><link href="https://smaht.ai/webdev/jekyll/tailwind/github-pages/theme/release/announcing-analytiq-pages-theme/" rel="alternate" type="text/html" title="Announcing Analytiq Pages Theme: A Modern Jekyll Theme with Tailwind CSS" /><published>2025-11-29T00:00:00+00:00</published><updated>2025-11-29T00:00:00+00:00</updated><id>https://smaht.ai/webdev/jekyll/tailwind/github-pages/theme/release/announcing-analytiq-pages-theme</id><content type="html" xml:base="https://smaht.ai/webdev/jekyll/tailwind/github-pages/theme/release/announcing-analytiq-pages-theme/"><![CDATA[<p>🎉 <strong>We’re excited to announce the release of Analytiq Pages Theme v0.1.6</strong> - a modern, feature-rich Jekyll theme that transforms our Analytiq Pages approach into a reusable, professional-grade solution for building beautiful company websites.</p>

<h2 id="the-evolution-from-method-to-theme">The Evolution: From Method to Theme</h2>

<p>Analytiq Pages started as a methodology for building company websites using Jekyll, GitHub Pages, and Tailwind CSS. Today, we’re proud to release it as <strong>Analytiq Pages Theme</strong> - a fully packaged Jekyll theme that makes this powerful combination accessible to everyone.</p>

<div class="grid md:grid-cols-2 gap-8 my-8">
  <div class="bg-white rounded-lg shadow-lg p-6 border border-gray-200">
    <div class="w-12 h-12 bg-blue-600 rounded-lg flex items-center justify-center mb-4">
      <svg class="w-6 h-6 text-white" fill="currentColor" viewBox="0 0 24 24">
        <path d="M12 2L2 7l10 5 10-5-10-5zM2 17l10 5 10-5M2 12l10 5 10-5" />
      </svg>
    </div>
    <h3 class="text-xl font-semibold text-gray-900 mb-3">Before: Analytiq Pages</h3>
    <p class="text-gray-600">A methodology requiring manual setup of Jekyll, Tailwind, and custom configurations for each site.</p>
  </div>

  <div class="bg-white rounded-lg shadow-lg p-6 border border-gray-200">
    <div class="w-12 h-12 bg-green-600 rounded-lg flex items-center justify-center mb-4">
      <svg class="w-6 h-6 text-white" fill="currentColor" viewBox="0 0 24 24">
        <path d="M12 2l3.09 6.26L22 9.27l-5 4.87 1.18 6.88L12 17.77l-6.18 3.25L7 14.14 2 9.27l6.91-1.01L12 2z" />
      </svg>
    </div>
    <h3 class="text-xl font-semibold text-gray-900 mb-3">Now: Analytiq Pages Theme</h3>
    <p class="text-gray-600">A complete, ready-to-use Jekyll theme with all features pre-configured and professionally designed.</p>
  </div>
</div>

<h2 id="-whats-new-in-analytiq-pages-theme">✨ What’s New in Analytiq Pages Theme</h2>

<h3 id="-advanced-features">🚀 Advanced Features</h3>

<ul>
  <li><strong>Tailwind CSS Integration</strong>: Modern, responsive design with utility-first styling</li>
  <li><strong>Enhanced Syntax Highlighting</strong>: Beautiful code blocks with copy functionality using highlight.js</li>
  <li><strong>Interactive Diagrams</strong>: Full Excalidraw integration for creating and embedding technical diagrams</li>
  <li><strong>Professional Blog Layouts</strong>: Complete blog system with sidebar, pagination, and category support</li>
  <li><strong>Responsive Navigation</strong>: Mobile-first navigation with dropdown menus and hamburger menu</li>
  <li><strong>Dark Theme Support</strong>: Built-in dark mode (Minima skin) for modern aesthetics</li>
</ul>

<h3 id="-developer-experience">🛠 Developer Experience</h3>

<ul>
  <li><strong>Three Customization Hooks</strong>: Override <code class="language-plaintext highlighter-rouge">custom-head.html</code>, <code class="language-plaintext highlighter-rouge">custom-header.html</code>, and <code class="language-plaintext highlighter-rouge">custom-footer.html</code> for site-specific modifications</li>
  <li><strong>Reusable Components</strong>: Pre-built Tailwind components (alerts, buttons, cards)</li>
  <li><strong>SEO Optimized</strong>: Integrated jekyll-seo-tag for better search engine visibility</li>
  <li><strong>PDF Embedding</strong>: Native support for embedding PDF documents</li>
  <li><strong>RSS Feed Generation</strong>: Automatic blog feed generation with jekyll-feed</li>
</ul>

<h3 id="-content-features">🎨 Content Features</h3>

<ul>
  <li><strong>MathJax Support</strong>: Render mathematical equations in your content</li>
  <li><strong>Multiple Layouts</strong>: Specialized layouts for homepages, blog posts, documentation, and more</li>
  <li><strong>Excalidraw Editor</strong>: Built-in diagram editor accessible at <code class="language-plaintext highlighter-rouge">/excalidraw-edit</code></li>
  <li><strong>Smart Embeds</strong>: Flexible diagram embedding with static, interactive, and link modes</li>
</ul>

<h2 id="installation-get-started-in-minutes">Installation: Get Started in Minutes</h2>

<h3 id="option-1-quick-start-with-existing-site">Option 1: Quick Start with Existing Site</h3>

<p>Add to your Jekyll site’s <code class="language-plaintext highlighter-rouge">Gemfile</code>:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">gem</span> <span class="s2">"analytiq-pages-theme"</span><span class="p">,</span> <span class="ss">git: </span><span class="s2">"https://github.com/analytiq-hub/analytiq-pages-theme"</span>
</code></pre></div></div>

<p>Or for a specific stable version:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="n">gem</span> <span class="s2">"analytiq-pages-theme"</span><span class="p">,</span> <span class="ss">git: </span><span class="s2">"https://github.com/analytiq-hub/analytiq-pages-theme"</span><span class="p">,</span> <span class="ss">tag: </span><span class="s2">"v0.1.6"</span>
</code></pre></div></div>

<p>Update your <code class="language-plaintext highlighter-rouge">_config.yml</code>:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">theme</span><span class="pi">:</span> <span class="s">analytiq-pages-theme</span>
</code></pre></div></div>

<p>Install and serve:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bundle <span class="nb">install
</span>bundle <span class="nb">exec </span>jekyll serve
</code></pre></div></div>

<h3 id="option-2-new-site-from-scratch">Option 2: New Site from Scratch</h3>

<p>The simplest way to get started:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Create new Jekyll site</span>
jekyll new my-company-site
<span class="nb">cd </span>my-company-site

<span class="c"># Replace minima theme with analytiq-pages-theme in Gemfile</span>
<span class="nb">sed</span> <span class="nt">-i</span> <span class="s1">'s/gem "minima".*/gem "analytiq-pages-theme", git: "https:\/\/github.com\/analytiq-hub\/analytiq-pages-theme", tag: "v0.1.6"/'</span> Gemfile

<span class="c"># Configure theme in _config.yml</span>
<span class="nb">sed</span> <span class="nt">-i</span> <span class="s1">'s/^theme: .*/theme: analytiq-pages-theme/'</span> _config.yml

<span class="c"># Install and serve</span>
bundle <span class="nb">install
</span>bundle <span class="nb">exec </span>jekyll serve
</code></pre></div></div>

<p>Or if you prefer manual editing:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Create new Jekyll site</span>
jekyll new my-company-site
<span class="nb">cd </span>my-company-site

<span class="c"># Edit Gemfile: replace the minima line with:</span>
<span class="c"># gem "analytiq-pages-theme", git: "https://github.com/analytiq-hub/analytiq-pages-theme", tag: "v0.1.6"</span>

<span class="c"># Edit _config.yml: replace the theme line with:</span>
<span class="c"># theme: analytiq-pages-theme</span>

<span class="c"># Install and serve</span>
bundle <span class="nb">install
</span>bundle <span class="nb">exec </span>jekyll serve
</code></pre></div></div>

<p>Visit <code class="language-plaintext highlighter-rouge">http://localhost:4000</code> to see your new site!</p>

<h3 id="option-3-local-installation-alternative">Option 3: Local Installation (Alternative)</h3>

<p>If you encounter repository access issues, you can install the theme locally:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Download the theme release</span>
curl <span class="nt">-L</span> https://github.com/analytiq-hub/analytiq-pages-theme/archive/refs/tags/v0.1.6.zip <span class="nt">-o</span> theme.zip
unzip theme.zip
<span class="nb">mkdir</span> <span class="nt">-p</span> _themes
<span class="nb">mv </span>analytiq-pages-theme-0.1.6 _themes/analytiq-pages-theme

<span class="c"># Or clone locally if you have access</span>
git clone https://github.com/analytiq-hub/analytiq-pages-theme.git _themes/analytiq-pages-theme
<span class="nb">cd </span>_themes/analytiq-pages-theme <span class="o">&amp;&amp;</span> git checkout v0.1.6

<span class="c"># Reference the local copy from the Gemfile (Jekyll resolves themes as gems,</span>
<span class="c"># so _config.yml keeps "theme: analytiq-pages-theme"):</span>
<span class="c"># gem "analytiq-pages-theme", path: "_themes/analytiq-pages-theme"</span>
</code></pre></div></div>

<h2 id="key-improvements-over-manual-setup">Key Improvements Over Manual Setup</h2>

<h3 id="before-analytiq-pages-method">Before (Analytiq Pages Method)</h3>
<ul>
  <li>Manual Tailwind CSS configuration</li>
  <li>Custom Jekyll setup for each project</li>
  <li>Repeated configuration of syntax highlighting</li>
  <li>Manual Excalidraw integration setup</li>
  <li>No standardized component library</li>
</ul>

<h3 id="after-analytiq-pages-theme">After (Analytiq Pages Theme)</h3>
<ul>
  <li><strong>One-line installation</strong>: <code class="language-plaintext highlighter-rouge">theme: analytiq-pages-theme</code></li>
  <li><strong>Pre-configured features</strong>: Everything works out of the box</li>
  <li><strong>Professional components</strong>: Reusable Tailwind components included</li>
  <li><strong>Advanced integrations</strong>: Excalidraw, MathJax, PDF embeds ready to use</li>
  <li><strong>Consistent experience</strong>: Standardized layouts and styling across sites</li>
</ul>

<h2 id="showcase-real-world-examples">Showcase: Real-World Examples</h2>

<p>The theme powers several professional websites:</p>

<ul>
  <li><strong><a href="https://analytiqhub.com">Analytiq Hub</a></strong> - Business intelligence and analytics platform</li>
  <li><strong><a href="https://docrouter.ai">DocRouter.AI</a></strong> - AI-powered document routing solution</li>
  <li><strong><a href="https://sigagent.ai">SigAgent.AI</a></strong> - Signature analysis and automation platform</li>
  <li><strong><a href="https://bitdribble.github.io">Bitdribble</a></strong> - Technology consulting and development</li>
</ul>

<h2 id="migration-guide-upgrading-from-analytiq-pages">Migration Guide: Upgrading from Analytiq Pages</h2>

<p>If you’re currently using the Analytiq Pages methodology, migration is straightforward:</p>

<ol>
  <li><strong>Add the theme</strong> to your Gemfile and <code class="language-plaintext highlighter-rouge">_config.yml</code></li>
  <li><strong>Remove manual Tailwind configuration</strong> (now handled by the theme)</li>
  <li><strong>Update custom includes</strong> to use the new hook system</li>
  <li><strong>Migrate Excalidraw files</strong> to <code class="language-plaintext highlighter-rouge">assets/excalidraw/</code> directory</li>
</ol>

<p>Your existing content and configuration will continue to work seamlessly.</p>

<h2 id="technical-architecture">Technical Architecture</h2>

<p>The theme is built with modern web standards:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>analytiq-pages-theme/
├── _layouts/           # 5 specialized layouts
├── _includes/          # 16+ reusable components
├── assets/
│   ├── css/           # Tailwind + custom styles
│   └── js/            # Pagination, Excalidraw renderer
├── _config.yml        # Default configuration
└── analytiq-pages-theme.gemspec
</code></pre></div></div>

<p><strong>Dependencies:</strong></p>
<ul>
  <li>Jekyll &gt;= 3.9, &lt; 5.0 (supports both GitHub Pages and Jekyll 4.x)</li>
  <li>jekyll-feed ~&gt; 0.12</li>
  <li>jekyll-seo-tag ~&gt; 2.6</li>
  <li>jekyll-pdf-embed ~&gt; 1.1</li>
</ul>

<h2 id="why-choose-analytiq-pages-theme">Why Choose Analytiq Pages Theme?</h2>

<h3 id="for-startups--small-businesses">For Startups &amp; Small Businesses</h3>
<ul>
  <li><strong>Zero hosting costs</strong> with GitHub Pages</li>
  <li><strong>Professional appearance</strong> without design costs</li>
  <li><strong>Content-first approach</strong> with Markdown simplicity</li>
  <li><strong>Scalable foundation</strong> that grows with your business</li>
</ul>

<h3 id="for-agencies--consultants">For Agencies &amp; Consultants</h3>
<ul>
  <li><strong>Rapid deployment</strong> for client websites</li>
  <li><strong>Consistent branding</strong> across projects</li>
  <li><strong>Advanced features</strong> for technical content</li>
  <li><strong>Easy customization</strong> for client-specific needs</li>
</ul>

<h3 id="for-enterprise-teams">For Enterprise Teams</h3>
<ul>
  <li><strong>Git-based workflows</strong> for version control and collaboration</li>
  <li><strong>Security compliance</strong> with GitHub’s enterprise infrastructure</li>
  <li><strong>SEO optimization</strong> built-in</li>
  <li><strong>Extensible architecture</strong> for custom requirements</li>
</ul>

<h2 id="troubleshooting">Troubleshooting</h2>

<h3 id="jekyll-version-compatibility">Jekyll Version Compatibility</h3>

<p>The theme supports both Jekyll 3.9+ (GitHub Pages) and Jekyll 4.x (modern installations). If you encounter version conflicts, pin one of these in your <code class="language-plaintext highlighter-rouge">Gemfile</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># For GitHub Pages compatibility (Jekyll 3.x)</span>
gem <span class="s2">"github-pages"</span>, group: :jekyll_plugins

<span class="c"># For modern Jekyll 4.x installations</span>
gem <span class="s2">"jekyll"</span>, <span class="s2">"~&gt; 4.3"</span>
</code></pre></div></div>

<p>The theme will work with either version automatically.</p>

<h3 id="bundle-install-issues">Bundle Install Issues</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Clear bundle cache</span>
bundle cache clean <span class="nt">--force</span>

<span class="c"># Clear bundler git cache</span>
<span class="nb">rm</span> <span class="nt">-rf</span> ~/.local/share/gem/ruby/cache/bundler/git/

<span class="c"># Try installing again</span>
bundle <span class="nb">install</span>
</code></pre></div></div>

<h3 id="theme-not-loading">Theme Not Loading</h3>

<ul>
  <li>Verify <code class="language-plaintext highlighter-rouge">_config.yml</code> has <code class="language-plaintext highlighter-rouge">theme: analytiq-pages-theme</code></li>
  <li>Clear Jekyll cache: <code class="language-plaintext highlighter-rouge">rm -rf _site .jekyll-cache</code></li>
  <li>Rebuild: <code class="language-plaintext highlighter-rouge">bundle exec jekyll build</code></li>
</ul>

<h2 id="getting-help--contributing">Getting Help &amp; Contributing</h2>

<ul>
  <li><strong>Documentation</strong>: Comprehensive README at <a href="https://github.com/analytiq-hub/analytiq-pages-theme">analytiq-pages-theme</a></li>
  <li><strong>Issues &amp; Support</strong>: GitHub Issues for bug reports and feature requests</li>
  <li><strong>Contributing</strong>: Pull requests welcome for theme improvements</li>
  <li><strong>Migration Support</strong>: Contact us for help upgrading from manual Analytiq Pages setups</li>
</ul>

<h2 id="how-this-fits-into-your-stack">How This Fits Into Your Stack</h2>

<p>Analytiq Pages Theme transforms our proven methodology into a professional, reusable solution that makes building beautiful company websites accessible to everyone. Whether you’re launching a startup, building client sites, or managing enterprise web presence, this theme delivers the perfect balance of simplicity and power.</p>

<p>Ready to upgrade your web presence? Try Analytiq Pages Theme today!</p>

<hr />

<p><em>This theme powers the very website you’re reading now. Experience the Analytiq Pages Theme in action and see the <a href="https://github.com/analytiq-hub/analytiq-hub.github.io">source code</a> for implementation examples.</em></p>

<p><em>📢 <a href="https://www.linkedin.com/feed/update/urn:li:activity:7367581674697629697/">Join the discussion on LinkedIn</a> about modern Jekyll themes and web development workflows.</em></p>]]></content><author><name>Andrei Radulescu-Banu</name></author><category term="webdev" /><category term="jekyll" /><category term="tailwind" /><category term="github-pages" /><category term="theme" /><category term="release" /><summary type="html"><![CDATA[🎉 We’re excited to announce the release of Analytiq Pages Theme v0.1.6 - a modern, feature-rich Jekyll theme that transforms our Analytiq Pages approach into a reusable, professional-grade solution for building beautiful company websites.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://smaht.ai/assets/images/announcing_analytiq_pages_theme.png" /><media:content medium="image" url="https://smaht.ai/assets/images/announcing_analytiq_pages_theme.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The ‘One Codebase, Many Products’ Playbook</title><link href="https://smaht.ai/ai/webdev/programming/one-codebase-many-products-playbook/" rel="alternate" type="text/html" title="The ‘One Codebase, Many Products’ Playbook" /><published>2025-11-27T00:00:00+00:00</published><updated>2025-11-27T00:00:00+00:00</updated><id>https://smaht.ai/ai/webdev/programming/one-codebase-many-products-playbook</id><content type="html" xml:base="https://smaht.ai/ai/webdev/programming/one-codebase-many-products-playbook/"><![CDATA[<p>In the AI era, velocity matters. Building each AI product from scratch—authentication, billing, observability, LLM routing—takes months. What if you could reuse the same codebase for multiple products?</p>

<p>This playbook shows how we built one open-source AI SaaS framework that powers <a href="https://sigagent.ai">SigAgent.AI</a> (real-time AI agent monitoring), <a href="https://docrouter.ai">DocRouter.AI</a> (smart document understanding), and client consulting portals. The same infrastructure, different AI workflows.</p>

<hr />

<h2 id="the-problem-infrastructure-is-commodity">The Problem: Infrastructure is Commodity</h2>

<p>When we launched DocRouter.AI, we spent three months building the same infrastructure every AI product needs:</p>

<ul>
  <li><strong>Authentication</strong>: NextAuth for user sessions, OAuth providers, role-based access</li>
  <li><strong>Billing</strong>: Stripe integration for subscriptions, credit packs, usage tracking</li>
  <li><strong>AI Layer</strong>: LiteLLM for LLM routing, error handling, cost tracking</li>
  <li><strong>Observability</strong>: OpenTelemetry for tracing AI workflows, debugging failures</li>
  <li><strong>Data Storage</strong>: MongoDB for user data, usage logs, analytics</li>
</ul>

<p>When we built SigAgent.AI, we cloned DocRouter’s codebase. In three weeks, it was live with full Stripe integration, authentication, and monitoring—90% code reuse.</p>

<p><strong>Key Insight</strong>: AI SaaS infrastructure is commodity. Differentiation lies in AI workflows, not plumbing.</p>
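
<p>To make “commodity” concrete, here is a minimal sketch of the AI layer: one LiteLLM helper that routes to any supported provider. The model names and the <code class="language-plaintext highlighter-rouge">ask</code> helper are illustrative assumptions, not our production code:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Minimal sketch: one helper routes to any provider LiteLLM supports.
# Model names here are examples; swap in whatever your account can reach.
from litellm import completion

def ask(prompt: str, model: str = "gpt-4o-mini") -&gt; str:
    """Send a single-turn prompt through LiteLLM and return the text reply."""
    response = completion(
        model=model,  # e.g. "claude-3-5-sonnet-20240620"
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# The same call shape works across providers, which is exactly what
# makes the routing layer a commodity:
print(ask("Summarize this invoice in one sentence."))
</code></pre></div></div>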

<hr />

<h2 id="the-solution-modular-reusable-stack">The Solution: Modular, Reusable Stack</h2>

<p>Our platform follows three principles:</p>

<h3 id="1-shared-core-custom-workflows">1. <strong>Shared Core, Custom Workflows</strong></h3>

<p>The core provides:</p>
<ul>
  <li><strong>Frontend</strong>: Next.js with NextAuth and Tailwind CSS</li>
  <li><strong>Backend</strong>: FastAPI with MongoDB</li>
  <li><strong>AI Layer</strong>: LiteLLM for multi-provider LLM APIs</li>
  <li><strong>Observability</strong>: OpenTelemetry integration</li>
  <li><strong>Billing</strong>: Stripe for subscriptions, credit packs, usage-based invoicing</li>
</ul>

<p>Each product adds specialized workflows:</p>
<ul>
  <li><strong>DocRouter.AI</strong>: Document parsing, field extraction, validation</li>
  <li><strong>SigAgent.AI</strong>: Trace ingestion, anomaly detection, performance analytics</li>
  <li><strong>Consulting Portals</strong>: Lab automation, custom reporting, enterprise integrations</li>
</ul>

<h4 id="architecture-comparison">Architecture Comparison</h4>

<p>Here’s how the two products differ architecturally while sharing the same foundation. These diagrams are rendered directly from Excalidraw files:</p>

<div class="architecture-comparison">
  <div class="arch-diagram">
    <div data-excalidraw="/assets/excalidraw/sig_agent_architecture.excalidraw" class="excalidraw-container">
      <div class="loading-placeholder">Loading diagram...</div>
    </div>
    <div class="arch-label">
      <a href="/excalidraw-edit?file=/assets/excalidraw/sig_agent_architecture.excalidraw" target="_blank" style="color: #2563eb; text-decoration: none; font-weight: 500;">
        📝 Edit in Excalidraw
      </a>
    </div>
  </div>
  <div class="arch-diagram">
    <div data-excalidraw="/assets/excalidraw/doc_router_architecture.excalidraw" class="excalidraw-container">
      <div class="loading-placeholder">Loading diagram...</div>
    </div>
    <div class="arch-label">
      <a href="/excalidraw-edit?file=/assets/excalidraw/doc_router_architecture.excalidraw" target="_blank" style="color: #2563eb; text-decoration: none; font-weight: 500;">
        📝 Edit in Excalidraw
      </a>
    </div>
  </div>
</div>

<style>
.architecture-comparison {
  display: grid;
  grid-template-columns: 1fr;
  gap: 3rem;
  margin: 2rem 0;
}

.arch-diagram {
  cursor: pointer;
  transition: transform 0.2s;
  text-align: center;
}

.arch-diagram:hover {
  transform: scale(1.02);
}

.excalidraw-container {
  width: 100%;
  border: 2px solid #e0e0e0;
  border-radius: 8px;
  box-shadow: 0 2px 8px rgba(0,0,0,0.1);
  background: white;
  display: block;
  overflow: hidden;
  padding: 0;
}

.excalidraw-container svg {
  width: 100%;
  height: auto;
  display: block;
  margin: 0;
}

.loading-placeholder {
  color: #666;
  font-size: 0.9rem;
  padding: 2rem;
}

.arch-label {
  margin-top: 0.5rem;
  font-size: 0.9rem;
  color: #666;
  min-height: 2.5rem;
  display: flex;
  align-items: center;
  justify-content: center;
}

.arch-label a:hover {
  text-decoration: underline;
}

@media (max-width: 768px) {
  .architecture-comparison {
    grid-template-columns: 1fr;
  }
}
</style>

<script type="module" src="/assets/js/excalidraw/render-excalidraw.js"></script>

<p>Both architectures share:</p>
<ul>
  <li><strong>Next.js</strong> frontend with <strong>NextAuth</strong> authentication</li>
  <li><strong>FastAPI</strong> backend integrated with <strong>Stripe</strong> for payments</li>
  <li><strong>MongoDB</strong> for data persistence</li>
  <li><strong>REST APIs, Python &amp; TypeScript SDKs</strong> for programmatic access</li>
  <li><strong>MCP Server</strong> and <strong>Claude agent</strong> integrations</li>
</ul>

<p>The key difference is in the specialized routes and data models:</p>
<ul>
  <li><strong>SigAgent</strong> adds telemetry, traces, and OpenTelemetry endpoints</li>
  <li><strong>DocRouter</strong> adds documents, OCR, forms, schemas, and prompts</li>
</ul>

<h3 id="2-vibe-coded-branding">2. <strong>Vibe-Coded Branding</strong></h3>

<p>Products are forked and branded directly in source code—colors, logos, messaging, domains. No abstraction layers:</p>

<ul>
  <li><strong>Fast Iteration</strong>: Clone repo, search-replace branding, update Tailwind colors</li>
  <li><strong>Full Control</strong>: Every pixel customizable</li>
  <li><strong>Stripe Integration</strong>: Product-specific metadata tags (<code class="language-plaintext highlighter-rouge">product=sig_agent</code>, <code class="language-plaintext highlighter-rouge">product=doc_router</code>)</li>
</ul>

<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// SigAgent.AI branding in Layout.tsx</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">metadata</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">title</span><span class="p">:</span> <span class="dl">'</span><span class="s1">SigAgent.AI</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Real-time AI agent monitoring and telemetry...</span><span class="dl">'</span><span class="p">,</span>
<span class="p">};</span>

<span class="p">&lt;</span><span class="nt">header</span> <span class="na">className</span><span class="p">=</span><span class="s">"bg-blue-600 border-b border-blue-700"</span><span class="p">&gt;</span>
  <span class="p">&lt;</span><span class="nc">Link</span> <span class="na">href</span><span class="p">=</span><span class="s">"/"</span> <span class="na">className</span><span class="p">=</span><span class="s">"text-xl font-semibold text-white"</span><span class="p">&gt;</span>
    SigAgent.AI
  <span class="p">&lt;/</span><span class="nc">Link</span><span class="p">&gt;</span>
<span class="p">&lt;/</span><span class="nt">header</span><span class="p">&gt;</span>

<span class="c1">// DocRouter.AI branding (same file, different values)</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">metadata</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">title</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Smart Document Router</span><span class="dl">'</span><span class="p">,</span>
  <span class="na">description</span><span class="p">:</span> <span class="dl">'</span><span class="s1">AI-powered document understanding...</span><span class="dl">'</span><span class="p">,</span>
<span class="p">};</span>

<span class="p">&lt;</span><span class="nt">header</span> <span class="na">className</span><span class="p">=</span><span class="s">"bg-green-600 border-b border-green-700"</span><span class="p">&gt;</span>
  <span class="p">&lt;</span><span class="nc">Link</span> <span class="na">href</span><span class="p">=</span><span class="s">"/"</span> <span class="na">className</span><span class="p">=</span><span class="s">"text-xl font-semibold text-white"</span><span class="p">&gt;</span>
    <span class="p">&lt;</span><span class="nt">span</span> <span class="na">className</span><span class="p">=</span><span class="s">"block sm:hidden"</span><span class="p">&gt;</span>DocRouter.AI<span class="p">&lt;/</span><span class="nt">span</span><span class="p">&gt;</span>
    <span class="p">&lt;</span><span class="nt">span</span> <span class="na">className</span><span class="p">=</span><span class="s">"hidden sm:block"</span><span class="p">&gt;</span>Smart Document Router<span class="p">&lt;/</span><span class="nt">span</span><span class="p">&gt;</span>
  <span class="p">&lt;/</span><span class="nc">Link</span><span class="p">&gt;</span>
<span class="p">&lt;/</span><span class="nt">header</span><span class="p">&gt;</span>
</code></pre></div></div>

<p><strong>Why it works</strong>: Vibe coding trades abstraction for speed. Need a new product? Fork, customize, ship.</p>
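
<p>The billing side of a fork is just as mechanical. A hedged sketch with the <code class="language-plaintext highlighter-rouge">stripe</code> Python library, following the metadata convention above (product names, amounts, and the key are illustrative):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import stripe

stripe.api_key = "sk_test_..."  # illustrative test key

# Same billing code in every fork; only the metadata tag changes per product.
product = stripe.Product.create(
    name="SigAgent.AI Team",
    metadata={"product": "sig_agent"},  # "doc_router" in the DocRouter fork
)

price = stripe.Price.create(
    product=product.id,
    unit_amount=10000,  # $100/month tier, in cents
    currency="usd",
    recurring={"interval": "month"},
    metadata={"product": "sig_agent"},
)

# Revenue can then be sliced per product in reporting by filtering on metadata.
</code></pre></div></div>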

<h3 id="3-open-core-closed-workflows">3. <strong>Open Core, Closed Workflows</strong></h3>

<ul>
  <li><strong>Open-source core</strong>: Apache license enables community contributions, transparency for enterprise buyers</li>
  <li><strong>Closed workflows</strong>: AI logic (SigAgent’s anomaly detection, DocRouter’s extraction) remains proprietary IP</li>
  <li><strong>Hybrid advantage</strong>: Open plumbing attracts contributors, closed AI preserves competitive moats</li>
</ul>

<hr />

<h2 id="real-world-applications">Real-World Applications</h2>

<h3 id="docrouterai-the-foundation-3-months-to-develop">DocRouter.AI: The Foundation (3 months to develop)</h3>

<p>Our first product extracts structured data from documents using LLMs. We built the full infrastructure from scratch in 3 months.</p>

<p><strong>Monetization Model</strong>:</p>
<ul>
  <li><strong>Free Tier</strong>: 100 Service Processing Units (SPUs)</li>
  <li><strong>Individual/Team</strong>: $250/$1,000/month with SPU allowances</li>
  <li><strong>Credit Packs</strong>: A-la-carte SPUs for usage spikes</li>
  <li><strong>Enterprise</strong>: Custom contracts with outcome-based pricing</li>
</ul>

<p><strong>Key Lesson</strong>: Treat billing as infrastructure, not a feature. Build once, reuse everywhere.</p>
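
<p>“Billing as infrastructure” in code terms: usage metering is one shared primitive that every workflow calls. A minimal sketch of an atomic SPU debit in MongoDB (the collection and field names are assumptions, not DocRouter’s actual schema):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # illustrative URI
accounts = client["billing"]["accounts"]

def debit_spus(account_id: str, cost: int) -&gt; bool:
    """Atomically deduct `cost` SPUs; refuse if the balance would go negative."""
    result = accounts.find_one_and_update(
        {"_id": account_id, "spu_balance": {"$gte": cost}},
        {"$inc": {"spu_balance": -cost}},
    )
    return result is not None  # None means insufficient balance (or no account)

# Every product calls the same primitive before running an AI workflow:
if not debit_spus("acct_123", cost=5):
    raise RuntimeError("Out of SPUs: prompt the user to buy a credit pack")
</code></pre></div></div>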

<h3 id="sigagentai-the-clone-3-weeks">SigAgent.AI: The Clone (3 weeks)</h3>

<p>Real-time AI agent monitoring using OpenTelemetry traces. 90% code reuse:</p>

<p><strong>Same Infrastructure</strong>:</p>
<ul>
  <li>NextAuth authentication with Google/GitHub OAuth</li>
  <li>Stripe billing with product-specific metadata (<code class="language-plaintext highlighter-rouge">product=sig_agent</code>)</li>
  <li>OpenTelemetry for trace analysis</li>
</ul>

<p><strong>New AI Logic</strong>: Trace anomaly detection replaces document processing</p>
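
<p>To give the flavor of that logic, here is a toy latency check over span durations. The real detection is part of the closed workflows noted above; the median rule and the numbers below are purely illustrative:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from statistics import median

def flag_anomalies(durations_ms: list[float], factor: float = 3.0) -&gt; list[int]:
    """Flag spans that take more than `factor` times the median duration."""
    if not durations_ms:
        return []
    baseline = median(durations_ms)
    return [i for i, d in enumerate(durations_ms) if d &gt; factor * baseline]

# e.g. OpenTelemetry span durations (ms) for one operation; the last one hangs
spans = [120.0, 135.0, 128.0, 119.0, 131.0, 2450.0]
print(flag_anomalies(spans))  # -&gt; [5]
</code></pre></div></div>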

<p><strong>Pricing</strong>: $25/$100/month (scaled down from DocRouter’s enterprise focus)</p>

<h3 id="client-consulting-portals-3-weeks">Client Consulting Portals (3 weeks)</h3>

<p>When clients need custom AI portals, we fork and customize:</p>

<p><strong>Process</strong>:</p>
<ol>
  <li>Clone repository and rebrand via source code changes</li>
  <li>Add client-specific AI workflows (lab automation, custom reporting)</li>
  <li>Deploy with pre-configured Kubernetes + Terraform</li>
</ol>

<p><strong>Example</strong>: A lab platform client got an AI portal monitoring their Claude coding copilot and OpenAI chat agents, with automated workflow validation.</p>

<p><strong>Team</strong>: Product manager (10h/week) + AI architect (20h/week)</p>

<p><strong>Result</strong>: Monetization-ready portal reusing 95% of existing infrastructure.</p>

<hr />

<h2 id="why-this-works-key-lessons">Why This Works: Key Lessons</h2>

<div class="lessons-container">

<div class="lesson-card lesson-odd">
<div class="lesson-header">
<span class="lesson-num">1</span>
<span class="lesson-title">Infrastructure is Commodity, Workflows are Unique</span>
</div>
<ul class="lesson-list">
<li>Every AI product needs auth, billing, observability</li>
<li>Building these repeatedly wastes time</li>
<li>Standardize the core to focus on AI logic and UI—the real differentiators</li>
</ul>
</div>

<div class="lesson-card lesson-even">
<div class="lesson-header">
<span class="lesson-num">2</span>
<span class="lesson-title">Vibe Coding Beats Configuration Complexity</span>
</div>
<ul class="lesson-list">
<li>Over-engineered config systems slow development</li>
<li>Fork repositories and customize directly in source code</li>
<li>Full control without abstraction overhead</li>
</ul>
</div>

<div class="lesson-card lesson-odd">
<div class="lesson-header">
<span class="lesson-num">3</span>
<span class="lesson-title">Open Core + Closed Workflows = Perfect Balance</span>
</div>
<ul class="lesson-list">
<li>Open-source infrastructure attracts contributors and builds trust</li>
<li>Closed AI workflows preserve competitive advantages</li>
</ul>
</div>

<div class="lesson-card lesson-even">
<div class="lesson-header">
<span class="lesson-num">4</span>
<span class="lesson-title">Speed Compounds in AI</span>
</div>
<ul class="lesson-list">
<li>Launching SigAgent in 3 weeks (vs. 3 months) enabled earlier revenue</li>
<li>Faster iteration and market advantage</li>
<li>Velocity is a multiplier in AI's fast-moving landscape</li>
</ul>
</div>

</div>

<style>
.lessons-container {
  margin: 1.5rem 0;
  border-radius: 8px;
  overflow: hidden;
  border: 1px solid #e2e8f0;
}

.lesson-card {
  padding: 1.5rem;
}

.lesson-header {
  display: flex;
  align-items: center;
  gap: 1rem;
  margin-bottom: 1rem;
}

.lesson-num {
  display: flex;
  align-items: center;
  justify-content: center;
  width: 2rem;
  height: 2rem;
  min-width: 2rem;
  background-color: #3b82f6;
  color: #fff;
  border-radius: 50%;
  font-size: 1rem;
  font-weight: 700;
}

.lesson-title {
  font-size: 1.15rem;
  font-weight: 600;
  color: #1e293b;
  line-height: 1.3;
  transition: color 0.2s ease;
}

.lesson-title:hover {
  color: #2563eb;
}

.lesson-list {
  margin: 0 0 0 3rem;
  padding: 0;
  list-style: disc;
  color: #475569;
}

.lesson-list li {
  margin-bottom: 0.4rem;
  line-height: 1.5;
}

.lesson-list li:last-child {
  margin-bottom: 0;
}

.lesson-odd {
  background-color: #f8fafc;
}

.lesson-even {
  background-color: #ffffff;
}
</style>

<hr />

<h2 id="your-implementation-playbook">Your Implementation Playbook</h2>

<p><img src="/assets/images/implementation_playbook.svg" alt="Implementation Playbook - 6 steps from identifying commodities to shipping products" style="width: 100%; max-width: 900px; margin: 2rem auto; display: block;" /></p>

<ol>
  <li><strong>Identify Commodities</strong>: Auth, billing, observability, LLM routing are table stakes</li>
  <li><strong>Build Modular Core</strong>: Invest upfront in reusable infrastructure</li>
  <li><strong>Vibe Code Branding</strong>: Fork repos, search-replace strings, customize in source</li>
  <li><strong>Encapsulate UI + AI Logic</strong>: Keep specialized UI and AI workflows separate from infrastructure (see the sketch after this list)</li>
  <li><strong>Infrastructure-ize Billing</strong>: Wire Stripe from day one for turnkey monetization</li>
  <li><strong>Open Plumbing, Close AI</strong>: Share infrastructure, protect unique UI + AI logic</li>
</ol>
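
<p>Steps 2 and 4 are mostly a routing decision in practice. A hedged FastAPI sketch of how a shared core and one product-specific module compose (module and route names are invented for illustration):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from fastapi import APIRouter, FastAPI

# --- shared core: identical in every fork --------------------------------
auth = APIRouter(prefix="/auth", tags=["core"])
billing = APIRouter(prefix="/billing", tags=["core"])

@auth.get("/me")
def me() -&gt; dict:
    return {"user": "demo"}  # NextAuth-backed session lookup in a real app

# --- product workflow: the only part that differs per fork ---------------
traces = APIRouter(prefix="/traces", tags=["sigagent"])

@traces.post("/ingest")
def ingest(span: dict) -&gt; dict:
    return {"accepted": True}  # anomaly detection would hook in here

app = FastAPI(title="SigAgent.AI")  # "Smart Document Router" in that fork
app.include_router(auth)
app.include_router(billing)
app.include_router(traces)          # DocRouter mounts /documents instead
</code></pre></div></div>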

<hr />

<h2 id="the-open-source-framework">The Open-Source Framework</h2>

<p>We’ve packaged this approach into an open-source framework:</p>

<p><img src="/assets/images/open_source_framework.svg" alt="Open-Source Framework - Reusable AI SaaS Infrastructure with Next.js, FastAPI, MongoDB, Stripe, LiteLLM, and OpenTelemetry" style="width: 100%; max-width: 900px; margin: 2rem auto; display: block;" /></p>

<p><strong>Tech Stack</strong>: Next.js, FastAPI, MongoDB, Stripe, LiteLLM, OpenTelemetry</p>

<p><strong>Features</strong>:</p>
<ul>
  <li>Authentication with NextAuth</li>
  <li>Stripe billing with usage metering</li>
  <li>OpenTelemetry observability (see the sketch below)</li>
  <li>Multi-tenant support</li>
  <li>Pre-built templates for document AI, agent monitoring, chat portals</li>
</ul>
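
<p>Wiring the observability feature, for example, takes a few lines once the core exists. A sketch assuming the standard <code class="language-plaintext highlighter-rouge">opentelemetry-instrumentation-fastapi</code> package, with a console exporter to keep it self-contained:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from fastapi import FastAPI
from opentelemetry import trace
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Console exporter keeps the sketch self-contained; production would use OTLP.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

app = FastAPI()
FastAPIInstrumentor.instrument_app(app)  # every request now emits spans
</code></pre></div></div>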

<p><strong>Documentation</strong>: Deployment guides for AWS or Kubernetes</p>

<p><strong>Why Open Source?</strong> Every AI builder faces infrastructure challenges. By sharing the plumbing, we raise the ecosystem’s bar and differentiate on AI workflows.</p>

<p><strong>Proven Results</strong>:</p>
<ul>
  <li><strong>DocRouter.AI</strong>: 3 months (built infrastructure)</li>
  <li><strong>SigAgent.AI</strong>: 3 weeks (90% reuse)</li>
  <li><strong>Client Portals</strong>: 3 weeks (95% reuse, custom workflows)</li>
</ul>

<p>Ready to build? Start with commodity infrastructure, encapsulate unique AI logic, ship fast. Velocity wins in AI.</p>

<p>Interested? <a href="https://analytiqhub.com/contact">Contact Analytiq Hub</a> or follow <a href="https://sigagent.ai">SigAgent.AI</a> and <a href="https://docrouter.ai">DocRouter.AI</a>.</p>

<hr />

<h2 id="related-posts">Related Posts</h2>

<ul>
  <li><a href="https://analytiqhub.com/tech/programming/ai/tutorials/how-we-integrated-stripe-into-docrouter-ai/">How I Built a Reusable AI Monetization Platform with Stripe</a></li>
  <li><a href="https://analytiqhub.com/ai/programming/tutorials/how-to-train-your-ai-agent/">How To Train Your AI Agent</a></li>
  <li><a href="https://analytiqhub.com/talks/#an-ai-backbone-for-document-processing">DocRouter.AI: An AI Backbone for Document Processing</a></li>
  <li><a href="https://analytiqhub.com/talks/#sigagentai---tracing-claude-agents">SigAgent.AI - Tracing Claude Agents</a></li>
</ul>

<hr />

<p><em>Subscribe to our <a href="/feed.xml">RSS feed</a> for more on building AI SaaS products.</em></p>]]></content><author><name>Smaht.ai</name><email>hello@smaht.ai</email></author><category term="ai" /><category term="webdev" /><category term="programming" /><summary type="html"><![CDATA[In the AI era, velocity matters. Building each AI product from scratch—authentication, billing, observability, LLM routing—takes months. What if you could reuse the same codebase for multiple products?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://smaht.ai/assets/images/blog/one-codebase-many-products.svg" /><media:content medium="image" url="https://smaht.ai/assets/images/blog/one-codebase-many-products.svg" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>