Experimental Phase: Expect rapid iteration and sweeping changes as we refine the core applications and infrastructure.

What is Ariata

Ariata is your personal AI agent that ingests your digital life—from calendar events and locations to health metrics and screen time—constructing a coherent, queryable timeline. Unlike cloud services that monetize your data, Ariata runs on your infrastructure, ensuring complete privacy and control. Your data is incredibly valuable—companies build trillion-dollar empires on it. Ariata lets you reclaim that value for yourself:
  • Train personal AI on YOUR data, not theirs
  • Life logging and memory augmentation for perfect recall
  • Health and productivity optimization through pattern recognition
  • Build a queryable life archive of your entire digital existence
  • Generate insights for self-improvement from your actual behavior
  • See what data companies collect and take back control

Quick Start

Get Ariata running in under 2 minutes (requires Docker and Docker Compose; see Prerequisites below):

1. Clone and setup

# Clone the repository
git clone https://github.com/ariata-os/ariata
cd ariata

# Start the entire stack
make dev

2. Access the dashboard

# Open the dashboard
open http://localhost:3000

3. Configure data sources

Navigate to Settings → Sources in the web UI to connect your data sources.
The system will automatically:
  • Initialize PostgreSQL with PostGIS and pgvector extensions
  • Set up MinIO for object storage
  • Start Redis for task queuing
  • Launch the SvelteKit web application
  • Spin up Celery workers for background processing
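Once the stack is up, you can sanity-check that the database extensions landed by connecting with any PostgreSQL client (see Your Data, Your Database below) and running a quick catalog query; note that pgvector registers itself under the extension name 'vector':

-- Confirm PostGIS and pgvector are installed
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('postgis', 'vector');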

Your Data, Your Database

Unlike cloud services that lock away your data, Ariata gives you direct PostgreSQL access. Query your life with SQL, build custom analytics, or export everything—it’s your database.
import psycopg2
import pandas as pd

conn = psycopg2.connect(
    "postgresql://readonly_user:secure_pass@your-server:5432/ariata"
)

# Query your heart rate during meetings
df = pd.read_sql("""
    SELECT h.timestamp, h.heart_rate as bpm, c.summary as meeting
    FROM stream_ios_healthkit h
    JOIN stream_google_calendar c 
        ON h.timestamp BETWEEN c.start_time AND c.end_time
    WHERE h.heart_rate IS NOT NULL
""", conn)
Manage credentials at /settings/database in your Ariata UI—create read-only users for analysis or full access for integrations. Works with any PostgreSQL client: TablePlus, DBeaver, Jupyter notebooks, or your favorite BI tool.
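If you prefer to provision access in SQL rather than through the UI, a minimal read-only role looks like the sketch below (the role name and password are placeholders, matching the connection string above):

-- Create a read-only role for analysis (placeholder credentials)
CREATE ROLE readonly_user LOGIN PASSWORD 'secure_pass';
GRANT CONNECT ON DATABASE ariata TO readonly_user;
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
-- Cover stream tables created after this point as well
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly_user;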

Data Sources

Source | Stream       | Description
-------|--------------|----------------------------------------------------------
Google | Calendar     | Calendar events and meetings
iOS    | HealthKit    | Health metrics (heart rate, steps, sleep, workouts, HRV)
iOS    | Location     | GPS coordinates, speed, and altitude
iOS    | Microphone   | Audio levels and transcription
Mac    | Applications | App usage and focus tracking
Notion | Pages        | Page and database content
Strava | Activities   | Workouts and performance data
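Streams become most useful when you join them. The sketch below places each recent calendar event at a rough location. The stream_google_calendar columns match the Python example above, but the stream_ios_location column names (timestamp, latitude, longitude) are assumptions based on the description in this table, so check your actual schema first:

-- Sketch: approximate location (centroid of GPS points) for each recent meeting
SELECT c.summary AS meeting,
       c.start_time,
       AVG(l.latitude)  AS approx_lat,   -- assumed column name
       AVG(l.longitude) AS approx_lon    -- assumed column name
FROM stream_google_calendar c
JOIN stream_ios_location l
  ON l.timestamp BETWEEN c.start_time AND c.end_time
WHERE c.start_time >= NOW() - INTERVAL '7 days'
GROUP BY c.summary, c.start_time
ORDER BY c.start_time;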

Self-Hosting & Networking

Recommended: Tailscale Setup (5 Minutes)

Tailscale creates a secure, private network between your devices. Your Ariata instance stays completely private while remaining accessible from all your devices.

1. Install Tailscale on your server

curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

2. Note your Tailscale IP

After login, note your IP (e.g., 100.64.1.5)

3. Install Tailscale on your devices

4. Update your configuration

# Update your .env file:
PUBLIC_IP=100.64.1.5  # Your Tailscale IP
FRONTEND_URL=http://100.64.1.5:3000

# Restart Ariata:
make restart

5. Access from any device

open http://100.64.1.5:3000
# Or use MagicDNS: http://your-machine.your-tailnet.ts.net:3000

Why Tailscale?
  • Zero exposed ports - servers aren’t on the public internet
  • E2E encrypted WireGuard protocol
  • Works behind firewalls, NAT, cellular networks
  • Free tier includes 100 devices and 3 users

Architecture

Ariata follows a stream-based ELT (Extract, Load, Transform) architecture: data is extracted from sources, loaded into stream tables at full fidelity, and transformed by background workers.

Key Components

  • Sources: External services and devices that provide data (Google, iOS, Mac, etc.)
  • Streams: Time-series data tables with full-fidelity storage
  • Processing: Celery workers that handle data ingestion and transformation
  • Storage: PostgreSQL for metadata, MinIO for raw data objects
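To make the Streams component concrete, a stream table in this kind of architecture generally has the shape sketched below. This is illustrative only; the column names are hypothetical rather than Ariata's actual schema (see Database Schema below for the real table names):

-- Illustrative shape of a stream_* time-series table (hypothetical columns)
CREATE TABLE stream_example (
    id         BIGSERIAL PRIMARY KEY,
    source_id  UUID NOT NULL,          -- which source instance produced the row
    timestamp  TIMESTAMPTZ NOT NULL,   -- when the observation occurred
    payload    JSONB NOT NULL          -- full-fidelity raw data
);
CREATE INDEX ON stream_example (source_id, timestamp);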

Tech Stack

  • Backend: Python, Celery, FastAPI, PostgreSQL (PostGIS/pgvector), Redis, MinIO
  • Frontend: SvelteKit, TypeScript, TailwindCSS
  • Mobile: Swift/SwiftUI (iOS/macOS)
  • ML/AI: PELT change detection, HDBSCAN clustering, Vector embeddings

Development

Prerequisites

  • Docker & Docker Compose (v2.0+)
  • 8GB RAM minimum, 16GB recommended
  • 20GB free disk space

Commands

make dev              # Start development environment
make stop             # Stop all services
make clean            # Clean up containers and volumes
make logs             # View application logs
make db-studio        # Open Drizzle Studio for database inspection
make test             # Run test suite
make format           # Format code with Biome
make lint             # Lint codebase

Project Structure

ariata/
├── apps/                      # User-facing applications
│   ├── web/                   # SvelteKit dashboard
│   ├── ios/                   # Native iOS app
│   ├── mac/                   # Native macOS agent
│   └── oauth-proxy/           # OAuth proxy for services
├── sources/                   # Data pipeline logic
│   ├── base/                  # Shared infrastructure
│   ├── google/                # Google service integrations
│   ├── ios/                   # iOS data sources
│   ├── mac/                   # macOS data sources
│   ├── notion/                # Notion integration
│   └── _generated_registry.yaml # Source/stream registry
└── scripts/                   # Utility scripts

Database Schema

The schema follows an ELT pipeline pattern:
  • source_configs: Global catalog of source types
  • sources: Active source instances (e.g., “My iPhone”)
  • stream_configs: Global catalog of stream types
  • streams: Active stream instances with settings
  • stream_*: Time-series data tables (e.g., stream_google_calendar)
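To see which stream tables exist in your own instance, a standard information_schema query works from any PostgreSQL client (this assumes the default public schema):

-- List every stream_* table currently present
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_name LIKE 'stream\_%' ESCAPE '\'
ORDER BY table_name;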

Development Tips

Example Queries

-- Analyze heart rate patterns during work hours
SELECT 
  EXTRACT(HOUR FROM timestamp) as hour,
  AVG(heart_rate) as avg_hr,
  MIN(heart_rate) as min_hr,
  MAX(heart_rate) as max_hr,
  COUNT(*) as readings
FROM stream_ios_healthkit
WHERE timestamp >= NOW() - INTERVAL '30 days'
  AND EXTRACT(DOW FROM timestamp) BETWEEN 1 AND 5
  AND heart_rate IS NOT NULL
GROUP BY hour
ORDER BY hour;
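
-- Another example, this time against the calendar stream; it only touches
-- the start_time and end_time columns already used in the Python example above.
-- When do your meetings start, and how long do they run on average?
SELECT
  EXTRACT(HOUR FROM start_time) AS hour,
  COUNT(*) AS meetings,
  ROUND(AVG(EXTRACT(EPOCH FROM (end_time - start_time)) / 60)) AS avg_minutes
FROM stream_google_calendar
WHERE start_time >= NOW() - INTERVAL '90 days'
GROUP BY hour
ORDER BY meetings DESC;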

Contributing

We believe that only an open-source solution to personal data management can truly respect user privacy while covering the long tail of data sources. We welcome contributions in several areas:

How to Contribute

  1. Code Contributions: Implement new data sources, improve existing ones, or enhance the core platform
  2. Architecture Reviews: Share expertise on iOS/Swift, distributed systems, or data processing
  3. Documentation: Help others understand and use Ariata effectively
  4. Bug Reports: Find something broken? Let us know!

To get started:

1. Fork and clone

git clone https://github.com/ariata-os/ariata
cd ariata

2. Create a feature branch

git checkout -b feature/your-feature-name

3. Make changes and test

make test

4. Submit a pull request

Push your changes and open a PR on GitHub

License

Ariata uses a dual-license model:
  • MIT License: Core functionality and most components
  • Elastic License 2.0 (ELv2): Certain enterprise components
You can: self-host, modify, extend, and use Ariata for personal or commercial purposes. You cannot: offer Ariata as a hosted service or remove or circumvent its license key functionality.

Support

Questions, bug reports, or feature requests? Open an issue on the GitHub repository.

Your data should work for you, not against you.