
Divinci AI Joins Cloudflare Workers Launchpad Cohort #6: Building AI at the Speed of Light

Divinci AI has been accepted into Cloudflare's Workers Launchpad Cohort #6. Discover how we're leveraging edge computing to deliver RAG-powered AI with sub-100ms latency globally, and why Cloudflare's platform is reshaping the future of enterprise AI.

For over 15 years, we’ve trusted Cloudflare. Their always-free tier gives you the world’s fastest public DNS resolver without surveillance capitalism. They sell domains at cost. Their free compute tier is the most generous in this galaxy. They’ve earned trust through action, not marketing.

Today, we’re honored to announce that Divinci AI has been accepted into Cloudflare Workers Launchpad Cohort #6—joining 25 other innovative startups building the future on Cloudflare’s edge computing platform.

Why This Matters: Migration to Cloudflare

Joining the Workers Launchpad marks the beginning of our complete infrastructure migration to Cloudflare. Our architecture will cascade through resilience layers:

  • Primary: Eco-Colo
  • Secondary: Cloudflare
  • Tertiary: GCP
  • Quaternary: AWS

We estimate that, at scale, most of our compute will run on Cloudflare Workers. Why? Because their pricing structure enables something rare in tech: profitable altruism.

Cloudflare’s economics allow us to expand our budget for supporting non-profits, causes, and organizations making the world better. Specifically, we’re committing resources toward Universal Basic Income research and advocacy (see our UBI blog post).

When infrastructure costs drop, mission-driven work becomes possible.

The Edge Computing Revolution

Traditional cloud computing follows a centralized model: your data travels thousands of miles to a distant data center, gets processed, then returns. For AI applications requiring real-time intelligence, this creates unacceptable latency.

Cloudflare’s edge network changes the equation entirely:

  • 330+ cities globally: Your code runs milliseconds from every internet user
  • 298% faster than AWS Lambda: Cloudflare Workers outperform traditional serverless by nearly 3x
  • Zero cold starts: V8 isolates eliminate the container startup penalty
  • Sub-100ms global latency: Achieving the responsiveness threshold for real-time AI

For Retrieval-Augmented Generation (RAG) systems—where every millisecond compounds through retrieval, embedding, ranking, and generation—edge deployment is transformative.

Global edge deployment: AI at the speed of thought

RAG at the Edge: Why It’s Game-Changing

By 2024, Retrieval-Augmented Generation had emerged as the dominant strategy for grounding LLMs in authoritative, up-to-date knowledge. Over 1,200 RAG papers appeared on arXiv in 2024 alone, a 12x increase over 2023.

Traditional RAG architecture suffers from latency accumulation:

  1. Embedding generation: 50-200ms
  2. Vector search: 20-100ms (regional database)
  3. Context retrieval: 50-150ms (object storage)
  4. LLM generation: 200-800ms
  5. Network round-trips: 100-400ms (multi-region)

Total latency: 420-1,650ms for a single query.
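As a sanity check, the per-stage ranges above can be summed programmatically. The numbers below are the illustrative estimates from this post, not production measurements:

```typescript
// Illustrative per-stage latency ranges (ms) for a traditional RAG query,
// taken from the estimates above.
type Range = { min: number; max: number };

const stages: Record<string, Range> = {
  embedding: { min: 50, max: 200 },
  vectorSearch: { min: 20, max: 100 },
  contextRetrieval: { min: 50, max: 150 },
  llmGeneration: { min: 200, max: 800 },
  networkRoundTrips: { min: 100, max: 400 },
};

// The pipeline is sequential, so total latency is the sum of every stage.
const total = Object.values(stages).reduce(
  (acc, r) => ({ min: acc.min + r.min, max: acc.max + r.max }),
  { min: 0, max: 0 },
);

console.log(`Total: ${total.min}-${total.max}ms`); // Total: 420-1650ms
```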

Edge-based RAG collapses these timelines:

  • Embedding, search, retrieval, and generation happen in the same data center
  • Document chunks stored at the edge (D1 + Vectorize)
  • Network overhead reduced by 60-80%
  • Achievable total latency: 100-400ms

This isn’t a minor optimization—it’s the difference between usable and frustrating.

Our Cloudflare-Powered Architecture

We’ve architected Divinci’s platform around Cloudflare’s Developer Platform, creating a sophisticated, globally distributed AI infrastructure:

Cloudflare Workers & Workflows

The backbone of our system. Workers power our serverless compute layer with sub-millisecond cold start times. Cloudflare Workflows orchestrate our multi-step RAG pipelines:

  1. Document ingestion and chunking
  2. Embedding generation (Workers AI)
  3. Vector storage and indexing (Vectorize)
  4. Query-time retrieval and ranking
  5. Context synthesis and LLM generation
  6. Response streaming to users

Each step executes at the edge, minimizing latency and maximizing throughput.
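A hedged sketch of how those six steps might map onto the `step.do` primitive that Cloudflare Workflows provides. The step names, helper logic, and the `localStep` runner are our illustration; a real entrypoint would extend `WorkflowEntrypoint` from `cloudflare:workers` and call Workers AI and Vectorize bindings instead of the placeholders:

```typescript
// Minimal stand-in for the Workflows step API: step.do(name, fn) runs a
// named, durably-checkpointed step. In production this comes from the
// Workflows runtime, not from us.
interface Step {
  do<T>(name: string, fn: () => Promise<T>): Promise<T>;
}

// Hypothetical pipeline mirroring the six steps listed above.
async function ragPipeline(step: Step, doc: string, query: string): Promise<string> {
  const chunks = await step.do("chunk", async () =>
    doc.match(/.{1,512}/gs) ?? [],
  );
  const embeddings = await step.do("embed", async () =>
    chunks.map((c) => [c.length]), // placeholder for Workers AI embeddings
  );
  await step.do("index", async () => embeddings.length); // placeholder for a Vectorize upsert
  const top = await step.do("retrieve", async () => chunks.slice(0, 3));
  const context = await step.do("synthesize", async () => top.join("\n"));
  return step.do("generate", async () =>
    `answer for "${query}" using ${context.length} chars of context`,
  );
}

// Trivial in-memory Step implementation for local experimentation.
const localStep: Step = { do: (_name, fn) => fn() };
```

Because each stage is a named step, the runtime can retry or resume individual stages without re-running the whole pipeline.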

D1: Distributed SQL at the Edge

We use Cloudflare D1 to store structured RAG metadata and chunk references. D1’s edge-based architecture ensures document chunks are geographically close to users, reducing retrieval latency by 60-80% compared to regional databases.

D1 Distributed Database Architecture

Diagram: D1 distributed SQL database architecture showing global replication and edge-based query execution for RAG chunk storage.
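To make the D1 role concrete, here is a sketch of what a chunk-metadata layer might look like. The table, columns, and helper are our illustration rather than a published schema; the `D1Like` interface mirrors the `prepare`/`bind`/`all` shape of a standard D1 binding so the example stands alone:

```typescript
// Hypothetical D1 schema for RAG chunk metadata. The vectors themselves live
// in Vectorize; D1 stores the text and provenance, keyed by vector ID.
const SCHEMA = `
  CREATE TABLE IF NOT EXISTS chunks (
    vector_id TEXT PRIMARY KEY,  -- matches the Vectorize vector ID
    doc_id    TEXT NOT NULL,
    ord       INTEGER NOT NULL,  -- chunk position within the document
    text      TEXT NOT NULL
  );
`;

// The slice of D1's API this sketch uses, so it type-checks without the
// Workers runtime.
interface D1Like {
  prepare(sql: string): {
    bind(...args: unknown[]): { all(): Promise<{ results: unknown[] }> };
  };
}

// After Vectorize returns the top-K vector IDs, hydrate the chunk text from D1.
async function hydrateChunks(db: D1Like, vectorIds: string[]) {
  const placeholders = vectorIds.map(() => "?").join(", ");
  const stmt = db.prepare(
    `SELECT vector_id, doc_id, text FROM chunks WHERE vector_id IN (${placeholders})`,
  );
  return (await stmt.bind(...vectorIds).all()).results;
}
```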

Vectorize: Semantic Search at Scale

Cloudflare Vectorize enables lightning-fast semantic search across millions of document embeddings. Our vector search completes in 20-50ms globally—fast enough for real-time retrieval in multi-hop reasoning chains.

Vectorize’s distributed index architecture means:

  • Global consistency: Updates propagate within seconds
  • Local queries: Vector search executes at the nearest edge location
  • Infinite scale: No database sharding or index partitioning required
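A sketch of what a query against a Vectorize-style index looks like. To keep it self-contained we back the interface with a tiny in-memory cosine-similarity index rather than the real service, so the scoring behavior here is illustrative only:

```typescript
// The slice of the Vectorize query API this sketch relies on.
interface VectorIndex {
  query(
    vector: number[],
    opts: { topK: number },
  ): Promise<{ matches: { id: string; score: number }[] }>;
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// In-memory stand-in so the sketch runs without the Workers runtime.
function inMemoryIndex(vectors: Record<string, number[]>): VectorIndex {
  return {
    async query(vector, { topK }) {
      const matches = Object.entries(vectors)
        .map(([id, v]) => ({ id, score: cosine(vector, v) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, topK);
      return { matches };
    },
  };
}
```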

Workers AI: Open-Source Models at the Edge

We integrate Cloudflare Workers AI to provide access to open-source models including:

  • Llama 3.1 (8B, 70B, 405B): General-purpose reasoning and generation
  • Mistral 7B/8x7B: Fast inference for structured outputs
  • CodeLlama: Code understanding and generation
  • BGE embeddings: Multilingual text embeddings

This gives enterprise customers sovereignty over their AI stack—no vendor lock-in to proprietary models.

Workers AI Open Source Models

Diagram: Workers AI model catalog showing open-source LLMs available for edge inference.
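Calling one of these models from a Worker is a single `env.AI.run` call. The model IDs below follow the Workers AI catalog naming, but treat the exact IDs and response shapes as assumptions to verify against the current catalog; `AiLike` stubs the binding so the sketch stands alone:

```typescript
// The slice of the Workers AI binding this sketch uses.
interface AiLike {
  run(model: string, input: Record<string, unknown>): Promise<Record<string, unknown>>;
}

// Hypothetical helper: embed a query, then generate a grounded answer,
// both at the edge.
async function answerAtEdge(ai: AiLike, query: string, context: string) {
  // BGE embedding model (ID per the Workers AI catalog; verify before use).
  const emb = await ai.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
  // Llama 3.1 8B for generation.
  const gen = await ai.run("@cf/meta/llama-3.1-8b-instruct", {
    messages: [
      { role: "system", content: `Answer using only this context:\n${context}` },
      { role: "user", content: query },
    ],
  });
  return { embedding: emb, answer: gen };
}
```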

R2: Zero-Egress Object Storage

Cloudflare R2 handles our media and document storage needs:

  • Audio/video processing pipelines
  • User-uploaded files and documents
  • RAG corpus storage (PDFs, presentations, spreadsheets)
  • Model weights and artifacts

R2’s zero egress fees mean we can serve petabytes of data without the bandwidth tax imposed by AWS S3. For egress-heavy workloads, this alone can cut combined storage and transfer costs by 80-90%.
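A sketch of corpus storage on an R2-style bucket. The key scheme is our illustration, and `R2Like` covers only the `put`/`get` slice of the binding so the example runs anywhere:

```typescript
// The slice of the R2 binding this sketch uses for corpus storage.
interface R2Like {
  put(key: string, value: string): Promise<unknown>;
  get(key: string): Promise<{ text(): Promise<string> } | null>;
}

// Store and retrieve a corpus document. Keys are namespaced per tenant;
// the `corpus/<tenant>/<doc>` scheme is illustrative, not prescribed.
async function roundTrip(bucket: R2Like, tenant: string, docId: string, body: string) {
  const key = `corpus/${tenant}/${docId}`;
  await bucket.put(key, body);
  const obj = await bucket.get(key);
  return obj ? obj.text() : null;
}
```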

API Shield: Zero-Trust Security

As we scale our public API, Cloudflare API Shield provides:

  • Schema validation: Ensure requests match OpenAPI specifications
  • Rate limiting: Protect against abuse and DDoS
  • JWT validation: Verify authentication tokens at the edge
  • Mutual TLS: Certificate-based client authentication

Security enforcement happens before requests reach our Workers—filtering malicious traffic at Cloudflare’s scale.

Durable Objects: Stateful Edge Computing

For real-time collaboration features (shared workspaces, live cursors, collaborative editing), we use Cloudflare Durable Objects:

  • Strong consistency: Single-writer semantics for conflict-free state
  • WebSocket support: Persistent connections for real-time updates
  • Global coordination: Distributed locks and consensus at the edge

Durable Objects enable Google Docs-style collaboration without centralized servers.
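A minimal sketch of a per-workspace Durable Object. The shape follows the Durable Objects pattern (one instance per workspace ID, all writes serialized through it), but the session bookkeeping is our illustration; a real class would also accept WebSocket upgrades and persist state via the object's storage API:

```typescript
// Each workspace maps to exactly one object instance, so every write to the
// shared document state is serialized: single-writer, conflict-free.
class WorkspaceObject {
  private sessions = new Set<string>();
  private revision = 0;

  // A collaborator connects and learns the current document revision.
  join(sessionId: string): number {
    this.sessions.add(sessionId);
    return this.revision;
  }

  // Apply an edit; single-writer semantics make ordering trivial.
  applyEdit(): number {
    return ++this.revision;
  }

  // A collaborator disconnects; returns the remaining session count.
  leave(sessionId: string): number {
    this.sessions.delete(sessionId);
    return this.sessions.size;
  }
}
```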

The Workers Launchpad Program: What We’re Gaining

Cloudflare’s Workers Launchpad isn’t just credits—it’s a comprehensive accelerator program:

Financial Support:

  • Up to $250,000 in cloud credits (one year)
  • Eliminates infrastructure costs during critical growth phase
  • Enables experimentation without budget constraints

Technical Resources:

  • Bootcamp sessions with Cloudflare engineering teams
  • Early access to beta products and features
  • Design support for architecture optimization
  • Direct access to product teams for feedback and bug reports

Network & Growth:

  • VC introductions to Cloudflare’s investor network
  • Partnership opportunities with Cloudflare’s enterprise customers
  • Co-marketing potential with Cloudflare’s brand

Proven Track Record: Since launching in 2022, Workers Launchpad has supported 145 startups from 23 countries. Notable alumni include:

  • Nefeli Networks: Acquired by Cloudflare (2024)
  • Outerbase: Acquired by Cloudflare (2024)
  • Companies now processing billions of monthly requests on Workers

Nearly one-third of Cohort #5 companies were led by women, evidence of Cloudflare’s commitment to diverse entrepreneurship.

What This Means for Our Customers

For enterprises using Divinci AI, joining Workers Launchpad translates to tangible benefits:

Performance:

  • Sub-100ms AI responses globally: Edge computing eliminates regional bottlenecks
  • 99.99% uptime SLA: Cloudflare’s network reliability becomes ours
  • Infinite scale: No capacity planning—Workers auto-scale to billions of requests

Privacy & Compliance:

  • Data residency: Process data at the edge closest to users
  • Zero-knowledge architecture: Cloudflare can’t decrypt customer data
  • GDPR/CCPA compliance: Built-in privacy controls and data retention policies

Innovation:

  • Beta access: Test cutting-edge features before public release
  • Custom integrations: Deeper Cloudflare product integration
  • Rapid deployment: Ship new features without infrastructure blockers

Economics:

  • Lower costs: Cloudflare’s pricing passes through to customers
  • Predictable billing: No surprise egress charges or regional surcharges
  • Value reinvestment: Savings redirected to product R&D and customer support

The Road Ahead: Building in Public

Over the coming months, we’ll be documenting our infrastructure migration and lessons learned:

Upcoming deep-dives:

  • RAG at the edge: Architecture patterns and performance benchmarks
  • D1 for vector metadata: Scaling distributed SQL for AI workloads
  • Workflows orchestration: Building multi-step AI pipelines
  • Cost analysis: Cloudflare vs. AWS/GCP for AI infrastructure
  • Real-world latency: P50/P95/P99 metrics from production traffic

We believe in building in public and sharing knowledge. If you’re building on Cloudflare Workers or exploring edge computing for AI, we’d love to collaborate.

Innovation through the ages: Building the future with timeless principles

Join Us on This Journey

We’re incredibly excited about this partnership and the opportunities ahead. As we build the future of AI-powered enterprise collaboration, Cloudflare’s platform will remain at the heart of our infrastructure—enabling us to deliver exceptional experiences to teams worldwide.


Building on Cloudflare Workers?

If you’re exploring edge computing for AI/ML workloads, we’d love to share lessons learned. Reach out at hello@divinci.ai.


About Workers Launchpad

The Cloudflare Workers Launchpad is Cloudflare’s startup accelerator program, providing funding, technical support, and go-to-market resources to companies building on the Workers platform.

Since 2022, the program has supported 145 startups across 23 countries, with two companies acquired by Cloudflare and dozens processing billions of monthly requests.

Learn more about Cohort #6 and participating companies.


Ready to Build Your Custom AI Solution?

Discover how Divinci AI can help you implement RAG systems, automate quality assurance, and streamline your AI development process.

Get Started Today