<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>RAG - Tag - Shengxu · Cloud Architecture &amp; DevOps</title><link>https://shengxu.pages.dev/en/tags/rag/</link><description>Cloud architecture &amp; DevOps notes by Shengxu: Kubernetes, Cilium, observability, LLM infra, AI agents.</description><generator>Hugo 0.153.2 &amp; FixIt v0.4.0-alpha.3-20251225101113-8ffb9a95</generator><language>en</language><lastBuildDate>Wed, 04 Feb 2026 10:00:00 +0800</lastBuildDate><atom:link href="https://shengxu.pages.dev/en/tags/rag/index.xml" rel="self" type="application/rss+xml"/><item><title>Practical · Building a Memory-Enabled AI Writing Partner (Part 3): Security Architecture (RAG Protection, Fact Guard, and BYOK)</title><link>https://shengxu.pages.dev/en/posts/fantasy-novel-agent-security/</link><pubDate>Wed, 04 Feb 2026 10:00:00 +0800</pubDate><guid>https://shengxu.pages.dev/en/posts/fantasy-novel-agent-security/</guid><category domain="https://shengxu.pages.dev/en/categories/ai/">AI</category><category domain="https://shengxu.pages.dev/en/categories/security/">Security</category><category domain="https://shengxu.pages.dev/en/categories/devops/">DevOps</category><category domain="https://shengxu.pages.dev/en/categories/observability/">Observability</category><description>&lt;p&gt;In the previous 2.5 articles, I&amp;rsquo;ve already laid out the backbone of &lt;a href="https://shengxu.pages.dev/posts/fantasy-novel-agent-architecture-evolution/"&gt;FantasyNovelAgent&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://shengxu.pages.dev/posts/fantasy-novel-agent-architecture-evolution/"&gt;Building a Memory-Enabled AI Writing Partner (Part 1): Multi-Agent Architecture Evolution&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://shengxu.pages.dev/posts/fantasy-novel-agent-database-evolution/"&gt;Building a Memory-Enabled AI Writing Partner (Part 2): Database Evolution&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://shengxu.pages.dev/posts/fantasy-novel-agent-retrieval-evolution/"&gt;Building a Memory-Enabled AI Writing Partner (Part 3): Retrieval System Evolution&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This article dives deep into the most overlooked yet critical aspect of AI systems: &lt;strong&gt;Security&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;If you&amp;rsquo;re thinking, &amp;ldquo;I&amp;rsquo;m just writing a novel, what security issues could there be?&amp;rdquo;, consider this:&lt;/p&gt;</description></item><item><title>Practical Guide: Building a Memory-Enabled AI Writing Partner (Kun) – Retrieval System (Vector Search, Hybrid Search &amp; Cloud Deployment)</title><link>https://shengxu.pages.dev/en/posts/fantasy-novel-agent-retrieval-evolution/</link><pubDate>Wed, 28 Jan 2026 10:30:00 +0800</pubDate><guid>https://shengxu.pages.dev/en/posts/fantasy-novel-agent-retrieval-evolution/</guid><category domain="https://shengxu.pages.dev/en/categories/ai/">AI</category><category domain="https://shengxu.pages.dev/en/categories/devops/">DevOps</category><description>&lt;blockquote&gt;
&lt;p&gt;In &amp;ldquo;&lt;a href="https://shengxu.pages.dev/posts/fantasy-novel-agent-architecture-evolution/"&gt;Practical · Building a Memory-Enabled AI Writing Partner (Part 1): Multi-Agent Architecture Evolution&lt;/a&gt;&amp;rdquo;, I clarified how multiple agents collaborate and how memory is chained together. In &amp;ldquo;&lt;a href="https://shengxu.pages.dev/posts/fantasy-novel-agent-database-evolution/"&gt;Practical · Building a Memory-Enabled AI Writing Partner (Part 2): Database Evolution (From JSON to Single Database to Relational Tables)&lt;/a&gt;&amp;rdquo;, I reviewed the evolution of the &amp;ldquo;fact layer&amp;rdquo; from JSON to &lt;a href="https://shengxu.pages.dev/posts/fantasy-novel-agent-database-evolution/"&gt;SQLite&lt;/a&gt; and then to relational tables.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;However, when the text length reaches hundreds of thousands of words, what truly determines the experience is often not &amp;ldquo;whether the data exists,&amp;rdquo; but &amp;ldquo;whether I can retrieve it&amp;rdquo;: exact lookup (did it appear or not), structured filtering (who belongs to whom), and semantic association (is it similar, is it the same atmosphere) must all work simultaneously. So I added a clear &amp;ldquo;index layer&amp;rdquo; to &lt;a href="https://shengxu.pages.dev/posts/fantasy-novel-agent-architecture-evolution/"&gt;FantasyNovelAgent&lt;/a&gt; and expanded retrieval from &amp;ldquo;chapters&amp;rdquo; to the &amp;ldquo;full knowledge graph.&amp;rdquo;&lt;/p&gt;</description></item><item><title>Hands-On: Building an Automated AI Semantic Search with Cloudflare Vectorize and Gemini</title><link>https://shengxu.pages.dev/en/posts/building-ai-search-with-cloudflare-and-gemini/</link><pubDate>Fri, 23 Jan 2026 15:30:00 +0800</pubDate><guid>https://shengxu.pages.dev/en/posts/building-ai-search-with-cloudflare-and-gemini/</guid><category domain="https://shengxu.pages.dev/en/categories/ai/">AI</category><category domain="https://shengxu.pages.dev/en/categories/devops/">DevOps</category><description>&lt;p&gt;In 2026, adding AI search to a personal blog is nothing new. But achieving it with &lt;strong&gt;zero cost&lt;/strong&gt;, &lt;strong&gt;full automation&lt;/strong&gt;, and &lt;strong&gt;high performance&lt;/strong&gt; remains a technical topic worth exploring.&lt;/p&gt;
&lt;p&gt;This article breaks down the technical architecture behind this site&amp;rsquo;s AI Search feature, showing how to combine &lt;strong&gt;Cloudflare Workers&lt;/strong&gt;, &lt;strong&gt;Vectorize&lt;/strong&gt;, &lt;strong&gt;D1&lt;/strong&gt;, and &lt;strong&gt;Google Gemini&lt;/strong&gt; to build a closed-loop RAG (Retrieval-Augmented Generation) system.&lt;/p&gt;
&lt;h2 class="heading-element" id="1-core-architecture-design"&gt;&lt;span&gt;1. Core Architecture Design&lt;/span&gt;
 &lt;a href="#1-core-architecture-design" class="heading-mark"&gt;
 &lt;svg class="octicon octicon-link" viewBox="0 0 16 16" version="1.1" width="16" height="16" aria-hidden="true"&gt;&lt;path d="m7.775 3.275 1.25-1.25a3.5 3.5 0 1 1 4.95 4.95l-2.5 2.5a3.5 3.5 0 0 1-4.95 0 .751.751 0 0 1 .018-1.042.751.751 0 0 1 1.042-.018 1.998 1.998 0 0 0 2.83 0l2.5-2.5a2.002 2.002 0 0 0-2.83-2.83l-1.25 1.25a.751.751 0 0 1-1.042-.018.751.751 0 0 1-.018-1.042Zm-4.69 9.64a1.998 1.998 0 0 0 2.83 0l1.25-1.25a.751.751 0 0 1 1.042.018.751.751 0 0 1 .018 1.042l-1.25 1.25a3.5 3.5 0 1 1-4.95-4.95l2.5-2.5a3.5 3.5 0 0 1 4.95 0 .751.751 0 0 1-.018 1.042.751.751 0 0 1-1.042.018 1.998 1.998 0 0 0-2.83 0l-2.5 2.5a1.998 1.998 0 0 0 0 2.83Z"&gt;&lt;/path&gt;&lt;/svg&gt;
 &lt;/a&gt;
&lt;/h2&gt;&lt;p&gt;Our goal is a fully automated workflow: &lt;strong&gt;write and deploy&lt;/strong&gt;. The author only needs to push Markdown articles; everything else—vector generation, index updates, frontend deployment—is automated.&lt;/p&gt;</description></item></channel></rss>