Apple Intelligence Agentic RAG

Unleash Apple Intelligence.

An agentic reasoning loop built around Apple's 3B foundation model. Private Cloud Compute routing, native Vision OCR, and on-device Metal SIMD4 vector embeddings. Built to bypass token limits and answer exhaustively.

App Capabilities

Manual Model Override

Route each query to the on-device 3B Core model, the 20B Advanced model, or Apple's Private Cloud Compute when the context window is too large to handle locally.
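One way to picture the routing decision is as a token-budget check. The sketch below is an illustration in Python, not the app's Swift code. It uses the 4K local and 32K Private Cloud Compute context figures quoted in the v4.0 changelog entry; the exact limits (4,096 and 32,768), the four-characters-per-token heuristic, and the function and target names are all assumptions.

```python
# Hypothetical limits: the changelog says "4K" local and "32K" PCC,
# the exact token counts below are assumed.
LOCAL_CONTEXT_TOKENS = 4_096
PCC_CONTEXT_TOKENS = 32_768


def estimate_tokens(text: str) -> int:
    """Rough size estimate: about 4 characters per token (not a real tokenizer)."""
    return max(1, len(text) // 4)


def choose_target(prompt: str, context_chunks: list[str]) -> str:
    """Pick an execution target from the estimated size of prompt plus context."""
    total = estimate_tokens(prompt) + sum(estimate_tokens(c) for c in context_chunks)
    if total <= LOCAL_CONTEXT_TOKENS:
        return "on-device"
    if total <= PCC_CONTEXT_TOKENS:
        return "private-cloud-compute"
    # Too large for either window: the caller has to split the work.
    return "split"
```

A manual override would simply bypass `choose_target` and use the target the user picked.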

Universal Apple Silicon

Seamlessly synced and optimized across iOS 27 and macOS Golden Gate. A true native Swift engine built for iPhone, iPad, and Mac.

Citations & Verification

Hallucinations are blocked via Verification Gates, and every answer is injected with interactive citations that link directly to the source document.

Multi-Modal Ingestion

A native multi-modal pipeline that automatically extracts text and generates embeddings from complex PDFs, DOCX files, and raw visual data via the Vision framework.

RAPTOR Agentic Loops

Stitches together multiple reasoning sessions utilizing the RAPTOR Summary Router to overcome token limits and synthesize exhaustive answers.
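RAPTOR-style retrieval builds a tree of summaries so that a small context window can still cover a large corpus. The sketch below illustrates the idea in Python and is not the app's RAPTOR Summary Router. It groups adjacent chunks by a fixed fanout (the published RAPTOR method clusters chunks by embedding instead), and `toy_summarize` is a hypothetical stand-in for a model call.

```python
from typing import Callable


def build_summary_tree(
    chunks: list[str],
    summarize: Callable[[list[str]], str],
    fanout: int = 3,
) -> list[list[str]]:
    """Return tree levels: level 0 is the raw chunks, the last level is one root summary."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fanout < 2:
        raise ValueError("fanout must be at least 2 so each level shrinks")
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        # Summarize each group of `fanout` neighbours into one parent node.
        parents = [
            summarize(current[i:i + fanout])
            for i in range(0, len(current), fanout)
        ]
        levels.append(parents)
    return levels


def toy_summarize(texts: list[str]) -> str:
    """Placeholder summarizer: keeps the first sentence of each input."""
    return " / ".join(t.split(".")[0] for t in texts)
```

Each call to `summarize` sees at most `fanout` nodes, so no single model session has to hold the whole corpus. A query can then be answered from whichever level fits the available context.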

Native Local RAG

Powered by SQLite FTS5 and accelerated by threadgroup-level Metal pipelines across the CPU, GPU, and Neural Engine (ANE).
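For readers unfamiliar with FTS5, the keyword half of a local RAG store can be sketched with Python's built-in `sqlite3` module. This is an illustration only: the table name `chunks`, its columns, and the helper names are hypothetical, and it assumes the SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_index(docs: list[tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table and load (source, body) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(source, body)")
    conn.executemany("INSERT INTO chunks (source, body) VALUES (?, ?)", docs)
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 3) -> list[tuple[str, str]]:
    """Return (source, body) rows matching the query, best BM25 match first."""
    rows = conn.execute(
        # bm25() returns lower values for better matches, so sort ascending.
        "SELECT source, body FROM chunks WHERE chunks MATCH ? "
        "ORDER BY bm25(chunks) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

Keeping the source name next to each chunk is what makes it possible to attach a citation to every retrieved passage.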

Real-World Execution

See It In Action.

Watch the raw power of the 29-step retrieval pipeline running natively on Apple Silicon. No smoke, no mirrors—just brutal engineering truth.

OpenIntelligence iOS Demo 1
OpenIntelligence iOS Demo 2
OpenIntelligence iOS Demo 3
OpenIntelligence macOS Demo
View More on YouTube

On-Device Technical Trace

How It Works Under the Hood.

The agentic RAG orchestration relies on a 23-step query loop. Here is a live simulation of the engine processing local documents directly against the foundation model.

[Interactive simulation: Apple Foundation Model Agentic Pipeline, OpenIntelligence Quality Modes. Readouts: Pipeline Strategy, Execution Target, Engine Latency (ms), Throughput / Rate, Platt Confidence (%). Selecting a node in the architectural diagram shows what the step does, why it matters, how it works (implementation details, Swift API methods, algorithmic settings), a runtime log trace, and the engine implementation source.]

Understanding each phase of on-device RAG is critical to optimizing memory, latency, and context limitations.

Public Roadmap

What's Next

This roadmap is synced automatically from our Notion database, giving you full transparency into what is currently being built and what is completed.

🔜 To Do

🔨 In Progress

Completed

Product History

Version Changelog

Read the official version release log for OpenIntelligence, focusing on user-facing capabilities and system architectural shifts.

v4.4 Release — Evidence Threads

June 2026

Approved by Apple App Store review. Introduces persistent, iCloud-synchronized research chat sessions and a slide-out sidebar, and resolves Swift 6 compiler concurrency warnings.

  • iCloud Evidence Threads: Persistent conversation sessions saved locally and synced automatically via iCloud Drive.
  • Sidebar Navigation: Slide-out menu panel for switching, managing, and deleting active chat sessions.
  • Swift 6 Concurrency: Resolved actor isolation warnings inside key orchestrators.
  • Entitlements Alignment: Hard limit of 1,000 document uploads for Pro tiers; Annual subscription adjusted to $29.99/year.

v4.3 & v4.3.1 — AFM 3 Suite & Deadlock Fixes

June 2026

Delivered support for third-generation Apple Foundation Models and Siri Screen Awareness AppIntents, and resolved critical MainActor deadlocks during iCloud file locking.

  • AFM 3 Architecture Routing: Real-time local vs. cloud routing between 3B Core, 20B Advanced, and Cloud Pro models.
  • Siri Screen Awareness: Siri background ingestion allows screen files/URLs to load natively into RAG libraries.
  • Image Playground Integration: Bound native macOS/iOS Image Playground (ADM 3) APIs for visual summary rendering.
  • Thread Concurrency Fixes: Offloaded synchronous operations and File Coordinator locks to background tasks.

v4.2 Release — Liquid Glass Telemetry HUD

June 2026

Rebuilt the RAG telemetry HUD with glassmorphic layouts, hardware sensory haptics, and resizable metrics bottom sheets.

  • Liquid Glass HUD: Telemetry overlay styled with .ultraThinMaterial glassmorphism.
  • Dynamic Verification Gates: Adaptive UI paths tracking 4, 8, or 12 verification gates depending on the query mode.
  • Resizable Bottom Sheet: Fluid iOS sheet that can be manually expanded or collapsed.

v4.0 & v4.1 — Apple Intelligence Milestone

June 2026

Major milestone release integrating native iOS 26+ FoundationModels APIs, Core AI embeddings, and Metal-accelerated vector search.

  • FoundationModels Integration: Migrated LLM orchestration to Apple's native system APIs.
  • Private Cloud Compute Routing: Local 4K context boundaries automatically scale to secure PCC enclaves (32K tokens).
  • Core AI Native Embeddings: Integrated CoreAISentenceEmbeddingProvider for zero-copy memory layouts on devices running OS version 27 or later.
  • Metal GPU Vector Search: Threadgroup-level Metal acceleration for 4x faster vector search.
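The vector search named in the last bullet boils down to scoring a query embedding against every stored embedding and keeping the best matches. The sketch below shows that brute-force baseline in plain Python; it is not the app's Metal kernel, and the function names and the tiny 4-dimensional vectors are illustrative assumptions. A GPU implementation would compute the same dot products in parallel across threadgroups.

```python
import heapq
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[tuple[float, str]]:
    """Return the k most similar entries as (score, doc_id), best first."""
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in index.items())
    return heapq.nlargest(k, scored)
```

Because every score is independent of the others, this loop parallelizes cleanly, which is why vector search is a natural fit for GPU acceleration.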

The App Ecosystem

Three sites, three jobs.

Fascinaiting is the showcase. If you need support, privacy policies, or legal terms for any of my iOS apps, head over to Gunzino. My personal portfolio lives at Gunnarguy.

The App Showcase

Fascinaiting.me

The direct landing page and technical hub for OpenIntelligence.

You are here ->

App Support Hub

Gunzino.me

Official publisher site for support routes, privacy policies, and review-facing legal infrastructure for all 4 iOS apps.

Go to Support Hub ->

The Creator

Gunnarguy.me

Personal builder portfolio, high-agency creator identity, and professional technical anchor.

View Portfolio ->