Architecture
Ilum is a modular data lakehouse platform built on Kubernetes. It combines Apache Spark, Trino, and DuckDB into a unified multi-engine architecture, managed through a single control plane (ilum-core) and deployed entirely via Helm charts. Ilum follows an "all-in-one but optional" philosophy - every component beyond ilum-core can be enabled or disabled independently, allowing deployments to range from a lightweight development setup to a full enterprise data platform.
The platform is built around several key architectural principles:
- Decoupled compute and storage - processing engines and data storage scale independently
- Stateless core - ilum-core holds no in-memory state, enabling horizontal scaling and crash recovery
- Open standards - OpenAPI REST, JDBC/ODBC, Apache Iceberg/Delta/Hudi table formats, OpenLineage for lineage
- Kubernetes-native - all workloads run as pods with native resource management, scheduling, and observability. Compatible with any CNCF-compliant distribution including Red Hat OpenShift, EKS, AKS, GKE, k3s, Rancher, and bare-metal Kubernetes
Core Components
┌─────────────┐
│ ilum-ui │
│ (React) │
└──────┬──────┘
│ REST API
┌──────────┐ ┌──────▼──────┐ ┌──────────────┐
│ ilum-cli │────▶ ilum-core ◀────│ MongoDB │
└──────────┘ └──┬───┬───┬──┘ └──────────────┘
│ │ │
┌────────────┘ │ └────────────┐
│ │ │
┌────────▼───────┐ ┌──────▼─────┐ ┌───────▼────────┐
│ Kafka / gRPC │ │ Catalog │ │ Object Storage │
│ (messaging) │ │ Layer │ │ (MinIO / S3) │
└────────┬───────┘  └─────┬──────┘  └───────┬────────┘
│ │ │
┌────────▼───────────────▼────────────────▼────────┐
│ Compute Clusters │
│ Spark · Trino · DuckDB · SQL │
└──────────────────────────────────────────────────┘
- ilum-core - The central control plane. Manages cluster lifecycle, job scheduling, session management, and REST API exposure (OpenAPI 3.0). Stateless by design - all persistent state is stored in MongoDB, making ilum-core horizontally scalable and crash-recoverable.
- ilum-ui - React-based web console for managing clusters, submitting jobs, browsing SQL results, configuring security, and monitoring workloads. Communicates exclusively with ilum-core via REST API.
- ilum-cli - Command-line interface for scripting and automation. Supports all ilum-core REST API operations, including job submission, cluster management, and configuration. Useful for CI/CD pipelines and headless environments.
- Kubernetes Cluster - The primary execution environment. Spark jobs run as Kubernetes pods with configurable executor counts, resource limits, and node affinities. Ilum manages the full pod lifecycle.
- Object Storage - S3-compatible storage (MinIO, Ceph, RustFS, AWS S3, GCS, Azure Blob) serves as the persistent data layer. All table data, job artifacts, and Spark event logs are stored here, decoupled from compute.
- MongoDB - Internal metadata store for job history, cluster configuration, user accounts, session state, and operational data. Supports replica sets for high availability.
- Messaging Layer (Kafka / gRPC) - Handles communication between ilum-core and running Spark jobs. See Communication Types for details.
- Catalog Layer - Persistent metadata services (Hive Metastore, Nessie, Unity Catalog, DuckLake) that enable SQL access across all engines. See Data Catalog Layer for details.
Integrated Modules
All modules below are optional Helm-deployed components that share authentication, networking, and catalog configuration with ilum-core:
| Category | Modules |
|---|---|
| Orchestration | Airflow, Kestra, n8n |
| Notebooks | JupyterLab, JupyterHub, Zeppelin |
| BI & Visualization | Superset, Streamlit |
| ML & AI | MLflow, AI Data Analyst |
| Data Catalogs | Hive Metastore, Nessie, Unity Catalog, DuckLake |
| Observability | Grafana, Prometheus, Loki, Marquez |
| Version Control | Gitea |
Each module is enabled via a single Helm flag and automatically inherits cluster networking, LDAP/OAuth2 authentication, and catalog connection settings. See Overview for detailed feature descriptions.
Workflows
Batch Job Workflow
- A user submits a Spark job via the ilum-ui, REST API, or ilum-cli, specifying the job JAR/Python file, Spark configuration, and target cluster.
- ilum-core validates the request, schedules the job, and creates a Spark driver pod on the target Kubernetes cluster.
- The Spark driver provisions executor pods according to the configured parallelism. Executors scale horizontally across cluster nodes.
- The job executes, reading from and writing to object storage through the catalog layer.
- Results and execution metadata are returned to ilum-core and stored in MongoDB. Users view results in the UI or fetch them via the REST API.
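The submission step above can be sketched in code. Note that the endpoint path, service address, and payload field names below are illustrative assumptions, not the documented ilum-core API schema - consult the OpenAPI specification exposed by ilum-core for the real contract.

```python
# Sketch of step 1: assembling a batch-job submission for the
# ilum-core REST API. Field names and the endpoint are assumptions.
import json

ILUM_CORE_URL = "http://ilum-core:9888"  # assumed in-cluster service address


def build_job_submission(cluster_id, artifact, job_class, spark_config=None):
    """Build the kind of request body described in steps 1-2:
    job artifact, Spark configuration, and target cluster."""
    if not artifact.endswith((".jar", ".py")):
        raise ValueError("job artifact must be a JAR or Python file")
    return {
        "clusterId": cluster_id,
        "jobClass": job_class,
        "artifact": artifact,
        "config": spark_config or {},
    }


payload = build_job_submission(
    cluster_id="default",
    artifact="s3a://ilum-files/jobs/etl-job.jar",
    job_class="com.example.EtlJob",
    spark_config={"spark.executor.instances": "4"},
)

# The payload would then be POSTed to ilum-core, e.g.:
#   requests.post(f"{ILUM_CORE_URL}/api/v1/job/submit", json=payload)
print(json.dumps(payload, indent=2))
```

ilum-core would respond with a job identifier that can later be polled for results, per step 5.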
Interactive Session Workflow
- A user creates an interactive session (via UI, API, or CLI), which launches a long-running Spark application pod.
- The session remains alive, ready to accept code execution requests without Spark initialization overhead.
- Users submit code snippets (Scala, Python, SQL) via REST API. Each snippet executes within the existing Spark context and returns results immediately.
- Multiple users can share a session through Code Groups - shared Spark contexts that enable collaborative analysis while isolating variable namespaces.
- The session terminates on explicit shutdown or after a configurable idle timeout.
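The Code Groups model above - one long-lived shared context, per-user variable isolation - can be simulated in a few lines of plain Python. This is a behavioural sketch only; real ilum sessions execute snippets inside a long-running Spark application pod.

```python
# In-process simulation of a shared interactive session: the context
# is shared, but each user's variables live in an isolated namespace.
class InteractiveSession:
    def __init__(self):
        # Stands in for the long-lived Spark context (step 1).
        self.shared_context = {"spark_app_id": "app-001"}
        # Per-user namespaces, as in Code Groups (step 4).
        self.namespaces = {}

    def execute(self, user, snippet):
        """Run a snippet in the user's namespace; the shared context
        is visible to everyone, but variables are not shared."""
        ns = self.namespaces.setdefault(user, {"ctx": self.shared_context})
        exec(snippet, ns)
        return ns.get("result")


session = InteractiveSession()
session.execute("alice", "x = 10\nresult = x * 2")
session.execute("bob", "result = 'x' in globals()")  # bob cannot see alice's x
```

Because the namespace persists between calls, alice's later snippets can still reference `x` - mirroring how session snippets reuse the warm Spark context without re-initialization.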
For step-by-step guides, see Run a Spark Job and Interactive Jobs.
Communication Types
Ilum supports two forms of communication between Spark jobs and ilum-core: Apache Kafka and gRPC.
Apache Kafka Communication
Ilum's integration with Apache Kafka provides reliable, scalable communication and supports all Ilum features, including high availability (HA) and horizontal scaling. All event exchanges take place over topics that are created automatically on the Apache Kafka brokers.
gRPC Communication (default)
Alternatively, gRPC can be used for communication. This option simplifies deployment by removing the need for Apache Kafka during installation. gRPC establishes direct connections between ilum-core and Spark jobs, eliminating the requirement for a separate message broker. However, gRPC does not support high availability (HA) for ilum-core in the current implementation: while ilum-core can be scaled out, existing Spark jobs will keep communicating with the same ilum-core instances.
Comparison
| Feature | Apache Kafka | gRPC |
|---|---|---|
| High Availability | Yes - ilum-core replicas share state via topics | No - direct point-to-point connections |
| Scalability | Horizontally scalable with partitioned topics | Limited to single ilum-core affinity |
| Deployment Complexity | Requires Kafka cluster (3+ brokers recommended) | Zero additional infrastructure |
| Recommended For | Production, multi-replica, HA environments | Development, testing, single-node setups |
Start with gRPC for development and testing. Switch to Kafka when deploying to production or enabling HA. See Production Deployment for HA configuration.
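The comparison implies a deployment-time constraint: running more than one ilum-core replica requires Kafka as the communication layer. The helper below is an illustrative validation sketch of that rule, not part of ilum itself.

```python
# Encode the constraint from the comparison table: gRPC keeps
# point-to-point connections to a single ilum-core, so HA
# (replicas > 1) requires Kafka. Hypothetical helper for illustration.
def validate_communication(comm_type, core_replicas):
    if comm_type not in ("kafka", "grpc"):
        raise ValueError(f"unknown communication type: {comm_type}")
    if comm_type == "grpc" and core_replicas > 1:
        # Extra replicas would not take over existing Spark jobs.
        raise ValueError("HA (replicas > 1) requires Kafka communication")
    return comm_type


validate_communication("grpc", 1)   # fine for development
validate_communication("kafka", 3)  # production HA
```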
Multi-Engine Query Architecture
Ilum provides three query engines through a unified SQL gateway. This architecture allows users to submit SQL queries via standard JDBC/ODBC interfaces and have them routed to the most appropriate engine.
┌──────────────────────┐
│ JDBC / ODBC │ BI tools, CLI, applications
│ Clients │
└─────────┬────────────┘
│ Thrift Binary Protocol
┌─────────▼────────────┐
│ │ Session management, engine routing,
│ SQL Gateway │ authentication, connection pooling
└──┬──────┬─────────┬──┘
│ │ │
┌──▼──┐ ┌─▼───┐ ┌──▼───┐
│Spark│ │Trino│ │DuckDB│ Engine selection per session
│ SQL │ │ │ │ │
└──┬──┘ └──┬──┘ └──┬───┘
│ │ │
┌──▼───────▼───────▼───┐
│ Catalog Layer │ Hive Metastore / Nessie
├──────────────────────┤
│ Storage Layer │ MinIO / S3 / GCS / HDFS
└──────────────────────┘
Spark SQL
On-demand sessions with DAG-based parallel execution. Spark SQL is the most versatile engine - it handles ETL, batch processing, ML pipelines, and complex analytical queries. Executors are provisioned dynamically via Kubernetes and can scale from zero to hundreds of pods. Best suited for heavy transformations, large-scale joins, and workloads that benefit from distributed shuffle.
Trino
Always-on MPP (Massively Parallel Processing) engine with a coordinator-worker topology. Trino uses pipelined execution to deliver sub-second interactive query latency on large datasets. Workers remain running for instant query response. Best suited for interactive analytics, dashboarding, and ad-hoc exploration.
DuckDB
Embedded analytical engine running inside the ilum-core process. DuckDB provides zero-overhead local query execution with no pod provisioning or network latency. Best suited for lightweight analytics, small dataset queries, and rapid prototyping.
Engine Selection Guide
| Use Case | Recommended Engine |
|---|---|
| ETL / large-scale batch processing | Spark SQL |
| Interactive dashboards and BI queries | Trino |
| Complex ML pipelines | Spark SQL |
| Ad-hoc exploration on large datasets | Trino |
| Quick queries on small datasets | DuckDB |
| Streaming ingestion (Structured Streaming) | Spark SQL |
All three engines access the same data through the shared catalog layer and object storage, enabling users to choose the right engine per workload without data movement. See SQL Viewer for engine configuration and Performance for optimization details.
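The selection guide can be expressed as a simple routing rule. This is an illustrative sketch of per-workload engine choice; in the actual gateway the engine is selected per session, and the workload labels below are invented for the example.

```python
# Routing table mirroring the engine selection guide above.
ENGINE_BY_WORKLOAD = {
    "etl": "spark",          # large-scale batch / ETL
    "ml_pipeline": "spark",  # complex ML pipelines
    "streaming": "spark",    # Structured Streaming ingestion
    "dashboard": "trino",    # interactive BI queries
    "adhoc_large": "trino",  # ad-hoc exploration on large datasets
    "small_query": "duckdb", # quick queries on small datasets
}


def pick_engine(workload):
    """Return the recommended engine for a workload label."""
    try:
        return ENGINE_BY_WORKLOAD[workload]
    except KeyError:
        raise ValueError(f"unknown workload type: {workload}") from None


pick_engine("dashboard")  # Trino for interactive dashboards
```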
Data Catalog Layer
Data catalogs provide the persistent metadata layer that enables SQL access, schema management, and multi-engine data sharing. When a Spark job or Trino query creates a table, the catalog records its schema, location, and format so that any engine can access it later.
Ilum supports four catalog implementations:
Hive Metastore (default)
PostgreSQL-backed metadata service, automatically deployed and configured by Helm. Compatible with Spark, Trino, and Superset out of the box. Provides the broadest ecosystem compatibility and is the recommended default for most deployments.
Nessie
Git-like catalog for Apache Iceberg tables. Supports branching, tagging, and merging of table metadata - enabling CI/CD workflows for data. Create a branch, experiment with schema changes or data transformations, and merge only when validated.
Unity Catalog
Three-level namespace model (catalog, schema, table) with governance-focused access control. Provides a familiar structure for organizations migrating from Databricks or requiring fine-grained catalog-level permissions.
DuckLake
Lightweight embedded catalog optimized for DuckDB. Stores metadata directly in a DuckDB database file, ideal for local development and single-engine DuckDB workloads.
Catalog Integration
Spark jobs launched through ilum are automatically configured with catalog connection parameters at startup - no manual configuration required. Trino and Superset also receive catalog configuration via Helm values.
Table Format Support
| Catalog | Delta Lake | Apache Iceberg | Apache Hudi | Parquet |
|---|---|---|---|---|
| Hive Metastore | Yes | Yes | Yes | Yes |
| Nessie | No | Yes | No | No |
| Unity Catalog | Yes | Yes | No | Yes |
| DuckLake | No | No | No | Yes |
See Data Catalogs for setup details and Ilum Table for the unified table abstraction.
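The support matrix above can be encoded as data and used as a pre-flight check before creating a table in a given catalog. This is a sketch; the dictionary simply mirrors the documentation table.

```python
# Catalog -> table-format support, transcribed from the matrix above.
SUPPORT = {
    "hive":     {"delta": True,  "iceberg": True,  "hudi": True,  "parquet": True},
    "nessie":   {"delta": False, "iceberg": True,  "hudi": False, "parquet": False},
    "unity":    {"delta": True,  "iceberg": True,  "hudi": False, "parquet": True},
    "ducklake": {"delta": False, "iceberg": False, "hudi": False, "parquet": True},
}


def check_format(catalog, table_format):
    """Raise if the catalog cannot manage tables of this format."""
    if not SUPPORT.get(catalog, {}).get(table_format, False):
        raise ValueError(f"{catalog} does not support {table_format} tables")


check_format("nessie", "iceberg")  # ok: Nessie is Iceberg-only
```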
Cluster Types
Ilum simplifies Spark cluster configuration; once configured, a cluster can be used for different jobs, regardless of their type or quantity. Ilum currently supports three cluster types: Kubernetes, Yarn, and Local.
During initial startup, Ilum automatically creates a default cluster using the same Kubernetes cluster that Ilum is installed on. If this default cluster is accidentally deleted, you can easily recreate it by restarting the ilum-core pod.
Kubernetes Cluster
Ilum's primary focus is on facilitating smooth integration between Spark and Kubernetes; it simplifies configuring and running Spark applications on Kubernetes. To connect to an existing Kubernetes cluster, users provide the standard configuration information, such as the Kubernetes API URL and authentication parameters. Ilum supports both user/password- and certificate-based authentication. Multiple Kubernetes clusters can be managed by Ilum, provided they are reachable - enabling a central hub for managing many Spark environments from a single location.
Ilum is compatible with any CNCF-compliant Kubernetes distribution:
- Managed cloud: Google Kubernetes Engine (GKE), Amazon EKS, Azure AKS, DigitalOcean Kubernetes
- Enterprise: Red Hat OpenShift / OKD
- Lightweight: k3s, Rancher, MicroK8s
- Bare metal: Self-managed Kubernetes installations
- Local development: Minikube, k3d, Docker Desktop
Yarn Cluster
Ilum also supports Apache Yarn clusters, which can be easily configured using the Yarn configuration files found in the Yarn installation.
Local Cluster
The Local cluster type runs Spark applications wherever ilum-core is deployed: either inside the ilum-core container when deployed on Docker/Kubernetes, or on the host machine when deployed without an orchestrator. This cluster type is suitable for testing purposes due to its resource limitations.
Comparison
| Feature | Kubernetes | Yarn | Local |
|---|---|---|---|
| Production Ready | Yes | Yes | No (testing only) |
| High Availability | Yes | Yes (via YARN RM HA) | No |
| Auto-Scaling | Yes (K8s autoscaler) | Yes (YARN node managers) | No |
| Multi-Cluster | Yes | Yes | N/A |
| Dynamic Executor Allocation | Yes | Yes | Limited |
A single Ilum instance can manage multiple clusters of different types simultaneously, enabling hybrid K8s + Yarn deployments - useful for organizations migrating from Hadoop to Kubernetes. See Clusters and Storages for configuration details.
Decoupled Compute-Storage Architecture
Ilum follows a decoupled compute-storage architecture where processing engines and data storage are independent, horizontally scalable layers. This separation is fundamental to ilum's design and enables independent scaling, multi-engine access, and cost-efficient resource utilization.
┌─────────────────────────────────────────────────────────────┐
│ Compute Layer │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Spark │ │ Trino │ │ DuckDB │ Engines │
│ │ Executors│ │ Workers │ │(embedded)│ scale │
│ └────┬─────┘ └─────┬────┘ └──────┬───┘ independently │
│ │ │ │ │
├───────┼──────────────┼──────────────┼───────────────────────┤
│ │ Catalog Layer (Hive Metastore / Nessie) │
├───────┼──────────────┼──────────────┼───────────────────────┤
│ │ │ │ │
│ ┌────▼──────────────▼──────────────▼─────┐ │
│ │ Storage Layer │ │
│ │ MinIO / S3 / GCS / WASBS / HDFS │ Storage │
│ │ (Iceberg / Delta / Hudi / Parquet) │ scales │
│ └────────────────────────────────────────┘ independently │
└─────────────────────────────────────────────────────────────┘
Benefits of this architecture:
- Independent scaling: Add more Spark executors or Trino workers without provisioning additional storage, and vice versa
- Multi-engine access: Multiple compute engines can read from and write to the same datasets concurrently, mediated by table formats (Iceberg, Delta) that provide ACID guarantees
- Cost efficiency: Compute resources can be released when not in use (e.g., Spark dynamic allocation, auto-pause) while data remains persistently available in object storage. Scale compute to zero during off-hours - storage costs persist, but expensive compute does not
- Engine flexibility: Choose the right engine for each workload - Spark for ETL, Trino for interactive analytics, DuckDB for lightweight queries - all accessing the same data
- Catalog-mediated consistency: The catalog layer ensures all engines see a consistent view of table metadata, enabling safe concurrent reads and writes across Spark, Trino, and DuckDB
Scalability
ilum-core is designed with scalability in mind. Being fully stateless, ilum-core can completely restore its current state after a crash, making it easy to scale up or down based on load requirements.
Ilum supports multiple scaling dimensions:
- ilum-core horizontal scaling - Deploy multiple stateless replicas behind a load balancer. Requires Kafka for inter-replica coordination
- Spark dynamic executor allocation - Executors scale up and down automatically based on workload. Idle executors are released after a configurable timeout
- Kubernetes HPA - Horizontal Pod Autoscaler can manage always-on services (Trino workers, ilum-core replicas) based on CPU/memory metrics
- Cluster autoscaler - Node-level scaling via Kubernetes Cluster Autoscaler or cloud provider auto-scaling groups, triggered when pending pods cannot be scheduled
- Resource quotas - Kubernetes namespaces and ilum resource controls enforce per-tenant resource limits, preventing any single workload from monopolizing cluster resources
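Spark dynamic executor allocation (the second bullet above) is driven by a handful of standard Spark properties. The sketch below shows a typical configuration as a plain dictionary of Spark settings; the values are examples, not ilum defaults.

```python
# Standard Spark properties enabling dynamic executor allocation.
# On Kubernetes, shuffle tracking replaces the external shuffle service.
dynamic_allocation_conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "0",   # scale to zero when idle
    "spark.dynamicAllocation.maxExecutors": "20",  # upper bound per job
    "spark.dynamicAllocation.executorIdleTimeout": "60s",  # release idle pods
    "spark.dynamicAllocation.shuffleTracking.enabled": "true",
}

for key, value in sorted(dynamic_allocation_conf.items()):
    print(f"{key}={value}")
```

Such settings would be passed as the Spark configuration of a job or session, letting executors scale up under load and be released after the configured idle timeout.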
See Resource Control and Production Deployment for configuration details.
High Availability
ilum-core and its required components support High Availability (HA) deployments. An HA deployment necessitates the use of Apache Kafka as the communication type, as gRPC does not support HA.
Recommended HA Configuration
| Component | Minimum Replicas | Notes |
|---|---|---|
| ilum-core | 3 | Stateless; requires Kafka for HA coordination |
| MongoDB | 3 | Replica set with automatic failover |
| Apache Kafka | 3 | KRaft quorum or ZooKeeper-based |
| MinIO | 4 | Erasure coding for data durability |
Failure Domain Mitigation
- Pod anti-affinity - Spread replicas of each component across different nodes to survive single-node failures
- Namespace isolation - Deploy ilum infrastructure (Kafka, MongoDB, MinIO) in a dedicated namespace separated from user workloads
- Zone-aware scheduling - In multi-zone clusters, use topology spread constraints to distribute pods across availability zones
See Production Deployment for the full HA deployment guide.
Observability Architecture
Ilum provides three observability pillars - metrics, logs, and lineage - through optional Helm-deployed modules.
Metrics
Spark exposes execution metrics via the built-in PrometheusServlet. Ilum deploys PodMonitor resources that Prometheus scrapes automatically. Pre-configured Grafana dashboards visualize executor utilization, job duration, shuffle I/O, and GC pressure.
Spark Executor → PrometheusServlet → PodMonitor → Prometheus → Grafana
Logs
Container logs from all Spark driver and executor pods flow through the standard Kubernetes logging pipeline. When Loki is enabled, a Promtail DaemonSet collects logs from each node and ships them to Loki for centralized LogQL querying.
Container stdout → Promtail DaemonSet → Loki → LogQL queries
Data Lineage
The OpenLineage Spark Listener captures table-level and column-level lineage events during job execution. These events are sent to the Marquez API, which stores them and exposes lineage graphs through the ilum UI (ERD and directed graph views).
Spark Job → OpenLineage Listener → Marquez API → Lineage UI
Spark History Server
For deep post-execution analysis, Spark History Server reads event logs from object storage and provides detailed DAG visualizations, stage breakdowns, and task-level metrics.
All observability components are optional and enabled via Helm flags. See Monitoring for metrics and log configuration, and Data Lineage for lineage setup.
Security
Ilum provides a comprehensive security architecture covering authentication, authorization, data access control, and network security.
Authentication
Ilum supports multiple authentication methods:
- Internal LDAP - Embedded OpenLDAP server deployed via Helm for standalone deployments
- External LDAP/AD - Connect to existing Active Directory or LDAP directories
- OAuth2/OIDC - Integration via ORY Hydra supporting Okta, Azure AD, Google, and Keycloak as identity providers
Authorization
- RBAC - Role-based access control with two modes: unrestricted (development - all users have full access) and restricted (production - permissions are enforced per role)
- ABAC - Attribute-based access control using data classification tags for fine-grained policy decisions
Data Access Control
- Row-level filters - Restrict query results based on user attributes (e.g., region, department)
- Column-level masking - Redact or hash sensitive columns (PII, financial data) based on user roles
- Hierarchical privileges - Grant access at catalog, schema, table, or column level
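The two data-access controls above can be illustrated on plain Python rows. Real enforcement happens inside the query engines; the attribute names and the hashing choice here are assumptions made for the example.

```python
# Illustrative sketches of row-level filtering and column-level masking.
import hashlib


def row_filter(rows, user_attrs):
    """Row-level filter: keep only rows matching the user's region."""
    return [r for r in rows if r["region"] == user_attrs["region"]]


def mask_column(rows, column, user_roles, allowed_role="pii_reader"):
    """Column-level masking: hash a sensitive column unless the user
    holds the allowing role."""
    if allowed_role in user_roles:
        return rows
    return [
        {**r, column: hashlib.sha256(str(r[column]).encode()).hexdigest()[:12]}
        for r in rows
    ]


rows = [
    {"region": "EU", "email": "a@example.com"},
    {"region": "US", "email": "b@example.com"},
]
visible = row_filter(rows, {"region": "EU"})            # only EU rows
masked = mask_column(visible, "email", user_roles={"analyst"})  # email hashed
```

An analyst without the `pii_reader` role sees only EU rows with hashed emails; a user holding the role sees the column in the clear.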
Network Security
- TLS/mTLS - Encrypted communication between ilum-core, Spark jobs, and integrated services
- Kubernetes Network Policies - Restrict pod-to-pod communication to authorized paths
Secrets Management
- Kubernetes Secrets - Native secret storage for credentials, certificates, and API keys
- External vault integration - Support for mounting secrets from external KMS providers
ilum as Identity Provider
Ilum can act as an OAuth2 provider for integrated services. When enabled, Airflow, Superset, Grafana, Gitea, and MinIO authenticate users through ilum's OAuth2 endpoints - providing single sign-on across the entire platform.
See Security for the full security guide, Data Access Control for row/column policies, and OAuth2 for OIDC configuration.