For over 15 years, we’ve trusted Cloudflare. Their ever-free tier grants you the world’s fastest DNS without surveillance capitalism. They sell domains at cost. Their free compute tier is the most generous in this galaxy. They’ve earned trust through action, not marketing.
Today, we’re honored to announce that Divinci AI has been accepted into Cloudflare Workers Launchpad Cohort #6—joining 25 other innovative startups building the future on Cloudflare’s edge computing platform.
Why This Matters: Migration to Cloudflare
Joining the Workers Launchpad marks the beginning of our complete infrastructure migration to Cloudflare. Our architecture will cascade through resilience layers:
Primary: Eco-Colo → Secondary: Cloudflare → Tertiary: GCP → Quaternary: AWS
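One way to read the cascade above is as an ordered fallback list: a request goes to the first tier that answers and only falls through when a tier fails. Here is a minimal TypeScript sketch of that idea. The `Tier` type and `callWithFallback` function are hypothetical names for illustration, not our production routing code.

```typescript
// Hypothetical model of a resilience cascade: try each tier in priority order.
type Tier<T> = { name: string; call: () => Promise<T> };

async function callWithFallback<T>(tiers: Tier<T>[]): Promise<{ tier: string; value: T }> {
  const errors: string[] = [];
  for (const tier of tiers) {
    try {
      // The first tier that resolves wins; later tiers are never contacted.
      return { tier: tier.name, value: await tier.call() };
    } catch (err) {
      // Record the failure and fall through to the next tier.
      errors.push(`${tier.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all tiers failed (${errors.join("; ")})`);
}
```

In practice the tier list would be ordered Eco-Colo, Cloudflare, GCP, AWS, and each `call` would wrap a request to that provider.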
We estimate that when we scale, most compute will run on Cloudflare Workers. Why? Because their pricing structure enables something rare in tech: profitable altruism.
Cloudflare’s economics allow us to expand our budget for supporting non-profits, causes, and organizations making the world better. Specifically, we’re committing resources toward Universal Basic Income research and advocacy (see our UBI blog post).
When infrastructure costs drop, mission-driven work becomes possible.
The Edge Computing Revolution
Traditional cloud computing follows a centralized model: your data travels thousands of miles to a distant data center, gets processed, then returns. For AI applications requiring real-time intelligence, this creates unacceptable latency.
Cloudflare’s edge network changes the equation entirely:
- 330+ cities globally: Your code runs milliseconds from every internet user
- 298% faster than AWS Lambda: Cloudflare Workers outperform traditional serverless by nearly 3x [1]
- Zero cold starts: V8 isolates eliminate the container startup penalty
- Sub-100ms global latency: Achieving the responsiveness threshold for real-time AI
For Retrieval-Augmented Generation (RAG) systems—where every millisecond compounds through retrieval, embedding, ranking, and generation—edge deployment is transformative.
Global edge deployment: AI at the speed of thought
RAG at the Edge: Why It’s Game-Changing
Retrieval-Augmented Generation emerged in 2024 as the dominant strategy for grounding LLMs in authoritative, up-to-date knowledge. Over 1,200 RAG papers appeared on arXiv in 2024 alone, a 12x increase from 2023 [2].
Traditional RAG architecture suffers from latency accumulation:
- Embedding generation: 50-200ms
- Vector search: 20-100ms (regional database)
- Context retrieval: 50-150ms (object storage)
- LLM generation: 200-800ms
- Network round-trips: 100-400ms (multi-region)
Total latency: 420-1,650ms for a single query.
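The total is simply the sum of the per-stage ranges, since the stages run one after another. The short sketch below reproduces that arithmetic. The `StageRange` type and `totalLatency` helper are illustrative names, and the numbers are copied from the list above.

```typescript
// Per-stage latency ranges in milliseconds, taken from the list above.
type StageRange = { stage: string; minMs: number; maxMs: number };

const traditionalRag: StageRange[] = [
  { stage: "embedding generation", minMs: 50, maxMs: 200 },
  { stage: "vector search", minMs: 20, maxMs: 100 },
  { stage: "context retrieval", minMs: 50, maxMs: 150 },
  { stage: "LLM generation", minMs: 200, maxMs: 800 },
  { stage: "network round-trips", minMs: 100, maxMs: 400 },
];

// Sequential stages add up: sum the minimums and the maximums separately.
function totalLatency(stages: StageRange[]): { minMs: number; maxMs: number } {
  let minMs = 0;
  let maxMs = 0;
  for (const s of stages) {
    minMs += s.minMs;
    maxMs += s.maxMs;
  }
  return { minMs, maxMs };
}

const total = totalLatency(traditionalRag); // { minMs: 420, maxMs: 1650 }
```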
Edge-based RAG collapses these timelines:
- Embedding, search, retrieval, and generation happen in the same data center
- Document chunks stored at the edge (D1 + Vectorize)
- Network overhead reduced by 60-80%
- Achievable total latency: 100-400ms
This isn’t a minor optimization—it’s the difference between usable and frustrating.
Our Cloudflare-Powered Architecture
We’ve architected Divinci’s platform around Cloudflare’s Developer Platform, creating a sophisticated, globally distributed AI infrastructure:
Cloudflare Workers & Workflows
The backbone of our system. Workers power our serverless compute layer, and because they run in V8 isolates rather than containers, they avoid the container cold start penalty. Cloudflare Workflows orchestrate our multi-step RAG pipelines:
- Document ingestion and chunking
- Embedding generation (Workers AI)
- Vector storage and indexing (Vectorize)
- Query-time retrieval and ranking
- Context synthesis and LLM generation
- Response streaming to users
Each step executes at the edge, minimizing latency and maximizing throughput.
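To make the orchestration idea concrete, here is a hedged sketch of the ingestion half of such a pipeline. Cloudflare Workflows hands the `run()` method a step object with a `step.do(name, callback)` method. The `StepRunner` interface below is a local stand-in that mirrors only that shape, so the sketch runs anywhere. The `RagStages` interface and the step names are assumptions for illustration.

```typescript
// Local stand-in for the step object a Workflow receives; it mirrors only
// the step.do(name, callback) shape and is not the real binding type.
interface StepRunner {
  do<T>(name: string, callback: () => Promise<T>): Promise<T>;
}

// Hypothetical stage implementations, injected so they can be swapped or faked.
interface RagStages {
  chunk(document: string): Promise<string[]>;
  embed(chunks: string[]): Promise<number[][]>;
  store(chunks: string[], vectors: number[][]): Promise<number>;
}

// Ingestion: chunk -> embed -> store, each wrapped in its own named step.
// Returns whatever the store stage reports (here, a count of stored vectors).
async function ingestDocument(
  step: StepRunner,
  stages: RagStages,
  document: string,
): Promise<number> {
  const chunks = await step.do("chunk document", () => stages.chunk(document));
  const vectors = await step.do("generate embeddings", () => stages.embed(chunks));
  return step.do("store vectors", () => stages.store(chunks, vectors));
}
```

Keeping each stage in its own named step is the design choice that matters here: a durable step runner can then retry or resume at the stage that failed instead of restarting the whole pipeline.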
D1: Distributed SQL at the Edge
We use Cloudflare D1 to store structured RAG metadata and chunk references. D1’s edge-based architecture ensures document chunks are geographically close to users, reducing retrieval latency by 60-80% compared to regional databases.
Diagram: D1 distributed SQL database architecture showing global replication and edge-based query execution for RAG chunk storage.
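As a rough illustration of the metadata lookup, the sketch below resolves a list of chunk IDs to rows through D1's `prepare(...).bind(...).all()` call chain. The `D1Like` interfaces declare only the subset of the binding used here, so the code type-checks outside a Worker. The `chunks` table and its columns are a hypothetical schema, not our actual one.

```typescript
// Minimal subset of the D1 binding surface used below, declared locally.
interface D1StatementLike {
  bind(...values: unknown[]): D1StatementLike;
  all<T>(): Promise<{ results: T[] }>;
}
interface D1Like {
  prepare(query: string): D1StatementLike;
}

// Hypothetical row shape for a `chunks` table; the schema is an assumption.
interface ChunkRow {
  id: string;
  document_id: string;
  r2_key: string;
  position: number;
}

// Resolve chunk IDs (for example, from a vector search) to metadata rows.
async function getChunks(db: D1Like, ids: string[]): Promise<ChunkRow[]> {
  if (ids.length === 0) return []; // avoid an empty IN () clause
  // One "?" placeholder per ID keeps the query parameterized.
  const placeholders = ids.map(() => "?").join(", ");
  const { results } = await db
    .prepare(
      `SELECT id, document_id, r2_key, position FROM chunks WHERE id IN (${placeholders})`,
    )
    .bind(...ids)
    .all<ChunkRow>();
  return results;
}
```

Inside a Worker, `db` would be the D1 binding from the environment (for example `env.DB`, where the binding name is whatever the project configures).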
Vectorize: Semantic Search at Scale
Cloudflare Vectorize enables lightning-fast semantic search across millions of document embeddings. Our vector search completes in 20-50ms globally—fast enough for real-time retrieval in multi-hop reasoning chains.
Vectorize’s distributed index architecture means:
- Global consistency: Updates propagate within seconds
- Local queries: Vector search executes at the nearest edge location
- Infinite scale: No database sharding or index partitioning required
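A query-time search typically has two calls: embed the question, then ask the index for its nearest neighbors. The sketch below shows that flow under a few assumptions. The two interfaces declare only the parts of the Workers AI `run` and Vectorize `query` calls used here. The embedding model ID is one entry from the Workers AI catalog, chosen for illustration. The `minScore` filter is an optional addition, not something Vectorize requires.

```typescript
// Locally declared subsets of the Workers AI and Vectorize bindings.
interface EmbeddingModelLike {
  run(model: string, input: { text: string[] }): Promise<{ data: number[][] }>;
}
interface VectorMatch {
  id: string;
  score: number;
}
interface VectorIndexLike {
  query(vector: number[], options: { topK: number }): Promise<{ matches: VectorMatch[] }>;
}

// Embed the question, fetch the topK nearest chunks, drop weak matches.
async function semanticSearch(
  ai: EmbeddingModelLike,
  index: VectorIndexLike,
  question: string,
  topK = 5,
  minScore = 0, // illustrative threshold; tune per index and distance metric
): Promise<VectorMatch[]> {
  const embedding = await ai.run("@cf/baai/bge-base-en-v1.5", { text: [question] });
  const { matches } = await index.query(embedding.data[0], { topK });
  return matches.filter((m) => m.score >= minScore);
}
```

The returned IDs can then be resolved to chunk text and metadata in a separate lookup.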
Workers AI: Open-Source Models at the Edge
We integrate Cloudflare Workers AI to provide access to open-source models including:
- Llama 3.1 (8B, 70B, 405B): General-purpose reasoning and generation
- Mistral 7B / Mixtral 8x7B: Fast inference for structured outputs
- CodeLlama: Code understanding and generation
- BGE embeddings: Multilingual text embeddings
This gives enterprise customers sovereignty over their AI stack—no vendor lock-in to proprietary models.
Diagram: Workers AI model catalog showing open-source LLMs available for edge inference.
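For the generation step, a grounded answer usually means passing the retrieved chunks to the model alongside the question. The following sketch assumes the chat-style `messages` input that Workers AI text models accept. The `TextModelLike` interface is a local subset of the binding, and the prompt wording and default model ID are illustrative choices only.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}
// Locally declared subset of the Workers AI binding for text generation.
interface TextModelLike {
  run(model: string, input: { messages: ChatMessage[] }): Promise<{ response?: string }>;
}

// Number the chunks so the model can refer back to its sources.
function buildMessages(question: string, chunks: string[]): ChatMessage[] {
  const context = chunks.map((c, i) => `[${i + 1}] ${c}`).join("\n\n");
  return [
    {
      role: "system",
      content:
        "Answer using only the numbered context. If it is insufficient, say so.\n\n" + context,
    },
    { role: "user", content: question },
  ];
}

async function generateAnswer(
  ai: TextModelLike,
  question: string,
  chunks: string[],
  model = "@cf/meta/llama-3.1-8b-instruct", // illustrative default
): Promise<string> {
  const result = await ai.run(model, { messages: buildMessages(question, chunks) });
  return result.response ?? "";
}
```

Because the model ID is just a parameter, swapping one open-source model for another does not change the calling code.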
R2: Zero-Egress Object Storage
Cloudflare R2 handles our media and document storage needs:
- Audio/video processing pipelines
- User-uploaded files and documents
- RAG corpus storage (PDFs, presentations, spreadsheets)
- Model weights and artifacts
R2's zero egress fees mean we can serve petabytes of data without the bandwidth tax imposed by AWS S3. For egress-heavy workloads, this alone can save enterprises 80-90% on storage costs.
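How much zero egress saves depends on how much data leaves the bucket. The back-of-the-envelope model below makes that explicit. The per-GB prices are assumed list prices for illustration only and may be out of date. Request and operation charges are ignored, and the helper names are hypothetical.

```typescript
interface Pricing {
  storagePerGbMonth: number; // USD per GB stored per month
  egressPerGb: number; // USD per GB transferred out
}

// Assumed prices, for illustration only; check the current pricing pages.
const assumedS3: Pricing = { storagePerGbMonth: 0.023, egressPerGb: 0.09 };
const assumedR2: Pricing = { storagePerGbMonth: 0.015, egressPerGb: 0 };

function monthlyCost(p: Pricing, storedGb: number, egressGb: number): number {
  return p.storagePerGbMonth * storedGb + p.egressPerGb * egressGb;
}

// Fraction saved by the alternative relative to the baseline (0.25 = 25%).
function savingsFraction(baseline: number, alternative: number): number {
  return baseline === 0 ? 0 : 1 - alternative / baseline;
}

// Scenario: 1,000 GB stored and 1,000 GB served per month.
const s3Cost = monthlyCost(assumedS3, 1000, 1000);
const r2Cost = monthlyCost(assumedR2, 1000, 1000);
const saved = savingsFraction(s3Cost, r2Cost); // roughly 0.87 under these assumptions
```

Under the same assumed prices, a bucket that serves no data at all saves only about 35%, so the headline figure applies to workloads where egress dominates the bill.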
API Shield: Zero-Trust Security
As we scale our public API, Cloudflare API Shield provides:
- Schema validation: Ensure requests match OpenAPI specifications
- Rate limiting: Protect against abuse and DDoS
- JWT validation: Verify authentication tokens at the edge
- Mutual TLS: Certificate-based client authentication
Security enforcement happens before requests reach our Workers—filtering malicious traffic at Cloudflare’s scale.
Durable Objects: Stateful Edge Computing
For real-time collaboration features (shared workspaces, live cursors, collaborative editing), we use Cloudflare Durable Objects:
- Strong consistency: Single-writer semantics for conflict-free state
- WebSocket support: Persistent connections for real-time updates
- Global coordination: Distributed locks and consensus at the edge
Durable Objects enable Google Docs-style collaboration without centralized servers.
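The single-writer idea can be modeled in a few lines: one object owns the document, applies edits in arrival order, and broadcasts each new version to every connected client. The sketch below is a plain TypeScript model of that pattern, not the Durable Objects API. `DocumentRoom`, `SocketLike`, and `Edit` are hypothetical names. In a real Durable Object, the platform is what guarantees that all requests for a given ID reach the same instance.

```typescript
// Anything that can receive a message; a WebSocket satisfies this shape.
interface SocketLike {
  send(data: string): void;
}
type Edit = { position: number; insert: string };

// One instance per document: the only place where state is ever mutated.
class DocumentRoom {
  private text = "";
  private version = 0;
  private sockets = new Set<SocketLike>();

  // New clients receive the current snapshot before any further edits.
  join(socket: SocketLike): void {
    this.sockets.add(socket);
    socket.send(JSON.stringify({ type: "snapshot", version: this.version, text: this.text }));
  }

  leave(socket: SocketLike): void {
    this.sockets.delete(socket);
  }

  // Edits are applied one at a time, so every client sees the same order.
  applyEdit(edit: Edit): number {
    const at = Math.max(0, Math.min(edit.position, this.text.length));
    this.text = this.text.slice(0, at) + edit.insert + this.text.slice(at);
    this.version += 1;
    const message = JSON.stringify({ type: "edit", version: this.version, text: this.text });
    this.sockets.forEach((s) => s.send(message));
    return this.version;
  }
}
```

Broadcasting the full text on every edit keeps the sketch short. A production system would more likely send deltas.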
The Workers Launchpad Program: What We’re Gaining
Cloudflare’s Workers Launchpad isn’t just credits—it’s a comprehensive accelerator program:
Financial Support:
- Up to $250,000 in cloud credits (one year)
- Eliminates infrastructure costs during critical growth phase
- Enables experimentation without budget constraints
Technical Resources:
- Bootcamp sessions with Cloudflare engineering teams
- Early access to beta products and features
- Design support for architecture optimization
- Direct access to product teams for feedback and bug reports
Network & Growth:
- VC introductions to Cloudflare’s investor network
- Partnership opportunities with Cloudflare’s enterprise customers
- Co-marketing potential with Cloudflare’s brand
Proven Track Record: Since launching in 2022, Workers Launchpad has supported 145 startups from 23 countries. Notable alumni include:
- Nefeli Networks: Acquired by Cloudflare (2024)
- Outerbase: Acquired by Cloudflare (2025)
- Companies now processing billions of monthly requests on Workers
Nearly 1/3 of Cohort #5 were led by female founders—evidence of Cloudflare’s commitment to diverse entrepreneurship.
What This Means for Our Customers
For enterprises using Divinci AI, joining Workers Launchpad translates to tangible benefits:
Performance:
- Low-latency AI responses globally: Edge computing eliminates regional bottlenecks, targeting 100-400ms end-to-end for RAG queries
- 99.99% uptime SLA: Cloudflare’s network reliability becomes ours
- Infinite scale: No capacity planning—Workers auto-scale to billions of requests
Privacy & Compliance:
- Data residency: Process data at the edge closest to users
- Zero-knowledge architecture: Cloudflare can’t decrypt customer data
- GDPR/CCPA compliance: Built-in privacy controls and data retention policies
Innovation:
- Beta access: Test cutting-edge features before public release
- Custom integrations: Deeper Cloudflare product integration
- Rapid deployment: Ship new features without infrastructure blockers
Economics:
- Lower costs: Cloudflare’s pricing passes through to customers
- Predictable billing: No surprise egress charges or regional surcharges
- Value reinvestment: Savings redirected to product R&D and customer support
The Road Ahead: Building in Public
Over the coming months, we’ll be documenting our infrastructure migration and lessons learned:
Upcoming deep-dives:
- RAG at the edge: Architecture patterns and performance benchmarks
- D1 for vector metadata: Scaling distributed SQL for AI workloads
- Workflows orchestration: Building multi-step AI pipelines
- Cost analysis: Cloudflare vs. AWS/GCP for AI infrastructure
- Real-world latency: P50/P95/P99 metrics from production traffic
We believe in building in public and sharing knowledge. If you’re building on Cloudflare Workers or exploring edge computing for AI, we’d love to collaborate.
Innovation through the ages: Building the future with timeless principles
Join Us on This Journey
We’re incredibly excited about this partnership and the opportunities ahead. As we build the future of AI-powered enterprise collaboration, Cloudflare’s platform will remain at the heart of our infrastructure—enabling us to deliver exceptional experiences to teams worldwide.
Want to see it in action?
- Request a demo to explore Divinci AI’s platform
- Follow our blog for technical deep-dives and updates
- Join our community to discuss edge AI architecture
Building on Cloudflare Workers?
If you’re exploring edge computing for AI/ML workloads, we’d love to share lessons learned. Reach out at hello@divinci.ai.
About Workers Launchpad
The Cloudflare Workers Launchpad is Cloudflare’s startup accelerator program, providing funding, technical support, and go-to-market resources to companies building on the Workers platform.
Since 2022, the program has supported 145 startups across 23 countries, with two companies acquired by Cloudflare and dozens processing billions of monthly requests.
Learn more about Cohort #6 and participating companies.
[1] Serverless Performance: Cloudflare Workers, Lambda and Lambda@Edge - Cloudflare Engineering Blog (2024)
[2] The Rise and Evolution of RAG in 2024: A Year in Review - RAGFlow Research (2024)