Overview

Ilum is a comprehensive Apache Spark management platform for Kubernetes and Yarn clusters. This page provides a technical overview of the platform's architecture, capabilities, and integration ecosystem.

Platform Architecture

Core Components

Ilum consists of two primary components:

  • ilum-core: Backend service providing job orchestration, cluster management, and REST API endpoints; it communicates with Spark jobs over gRPC or Kafka
  • ilum-ui: Web-based interface for cluster monitoring, job submission, and resource visualization

The platform supports both Python (PySpark) and Scala programming languages, with native integration for Spark SQL, Spark Streaming, and MLlib frameworks.

Communication Layer

Ilum supports two communication patterns between Spark jobs and ilum-core:

Apache Kafka (recommended for production):

  • Enables High Availability (HA) and horizontal scaling
  • All event exchanges occur via automatically created topics
  • Supports distributed ilum-core instances

gRPC (default, simpler deployment):

  • Direct connections between ilum-core and Spark jobs
  • No message broker required
  • Does not support HA for ilum-core (jobs maintain connections to specific instances)

Supported Cluster Types

Kubernetes Clusters: Native CRD-based Spark application deployment with pod lifecycle management. Supports GKE, EKS, AKS, and on-premise deployments with multi-cluster management from a central control plane.

Yarn Clusters: Apache Hadoop Yarn integration for hybrid architectures, configured using Yarn configuration files.

Local Clusters: Runs Spark applications where ilum-core is deployed, suitable for development and testing.

Job Management

Spark Job Types

Batch Jobs: Traditional Spark applications submitted for one-time execution via:

  • Ilum UI with visual job configuration
  • REST API for programmatic submission
  • spark-submit integration for existing workflows

Interactive Jobs (Ilum Groups): Long-running Spark sessions that execute code immediately without initialization overhead. Multiple users can share the same Spark context by pointing to the same job ID.

Code Groups: Web-based code execution environment allowing direct Spark code writing and execution from the Ilum UI.

Key benefits:

  • Effortless job management: create, delete, clone, stop, and resume with single-click operations
  • Comprehensive monitoring: track logs, CPU/memory usage, stages, and task structures
  • Session reusability: avoid repeated Spark session creation
  • Automated configuration: seamless integration with all enabled tools
  • Orchestration support: schedule-based execution and REST API for advanced workflows

Learn more: Get Started, Run Spark Job, Run Interactive Job, Run Code Group

REST API for Spark Microservices

Ilum exposes Spark functionality through RESTful endpoints, enabling data applications where Spark computations are triggered by HTTP requests:

# Submit Spark job
POST /api/v1/jobs

# Query interactive session
POST /api/v1/sessions/{id}/execute

# Monitor job status
GET /api/v1/jobs/{id}/status

Use cases include:

  • Real-time feature engineering for ML models
  • On-demand data transformations via API
  • Streaming analytics with REST-based controls
  • Jupyter notebook execution through HTTP interface

Documentation: API Reference, API Playground
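As a rough illustration of calling the endpoints listed above from a client, the following sketch builds (but does not send) HTTP requests using only the Python standard library. The base URL and every payload field are assumptions made for the example; consult the API Reference for the actual request schema.

```python
import json
import urllib.request

# Assumption: hypothetical address of ilum-core; replace with your own.
BASE_URL = "http://localhost:9888"


def build_request(method, path, payload=None):
    """Return an unsent urllib Request for the given endpoint path."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def submit_job(payload):
    # POST /api/v1/jobs (the payload structure is hypothetical)
    return build_request("POST", "/api/v1/jobs", payload)


def execute_in_session(session_id, payload):
    # POST /api/v1/sessions/{id}/execute
    return build_request("POST", f"/api/v1/sessions/{session_id}/execute", payload)


def job_status(job_id):
    # GET /api/v1/jobs/{id}/status
    return build_request("GET", f"/api/v1/jobs/{job_id}/status")
```

A request built this way can be sent with `urllib.request.urlopen(request)` once the base URL and any required authentication headers match your deployment.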

Job Orchestration

Built-in Scheduler: Create cron-based schedules for launching Spark applications directly from the UI. See Schedule documentation.
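To make the idea of a cron-based schedule concrete, here is a minimal sketch of how a five-field cron expression can be checked against a point in time. It assumes the classic Unix layout (minute, hour, day of month, month, day of week); whether Ilum uses this exact dialect is not stated here, so treat the format as an assumption. The sketch also simplifies real cron semantics by requiring every field to match.

```python
from datetime import datetime


def _field_matches(field, value, low, high):
    """Check one cron field. Supports *, */n, a-b, a-b/n, lists and plain numbers."""
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expression, moment):
    """Return True if the five-field expression fires at the given datetime."""
    minute, hour, day, month, weekday = expression.split()
    return (
        _field_matches(minute, moment.minute, 0, 59)
        and _field_matches(hour, moment.hour, 0, 23)
        and _field_matches(day, moment.day, 1, 31)
        and _field_matches(month, moment.month, 1, 12)
        # cron counts Sunday as 0, while isoweekday() counts Sunday as 7
        and _field_matches(weekday, moment.isoweekday() % 7, 0, 6)
    )
```

For example, `cron_matches("0 2 * * *", some_datetime)` is true only at 02:00, which is the kind of expression one might use for a nightly batch job.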

External Orchestration: Integrate with enterprise workflow tools.

Multi-Cluster Management

Centralized Control Plane

Manage heterogeneous Spark clusters from a single interface:

Supported Environments:

  • Cloud clusters: GKE, EKS, AKS with auto-scaling
  • On-premise: Bare metal Kubernetes or Hadoop Yarn
  • Hybrid: Mixed cloud and on-premise for data sovereignty

Capabilities:

  • One-time certificate setup replaces complex kubeconfig management
  • UI-driven job deployment eliminates kubectl/spark-submit commands
  • Independent resource quotas and security policies per cluster
  • Centralized monitoring and job scheduling

Benefits over traditional approaches:

  • No distributed kubeconfig files or certificate management
  • Simplified access control (grant cluster access once)
  • Automatic certificate updates through Ilum
  • Visual job management without console commands

Learn more: Clusters and Storages, Create a Local Cluster, Create a Kubernetes Cluster

Centralized Storage Management

Configure storage solutions once in Ilum, and all Spark jobs automatically receive authentication:

Supported Storage Types: S3, GCS (Google Cloud Storage), WASBS (Azure Blob Storage), HDFS

Benefits:

  • Unified access across all storage systems from a single interface
  • Automatic Spark parameter configuration for each storage
  • Eliminate per-job storage configuration
  • Multi-region storage support for latency reduction

Example: MinIO storage automatically configures these Spark parameters for all jobs:

spark.hadoop.fs.s3a.endpoint = http://ilum-minio:9000
spark.hadoop.fs.s3a.access.key = minioadmin
spark.hadoop.fs.s3a.secret.key = minioadmin
spark.hadoop.fs.s3a.path.style.access = true
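
For comparison, this is roughly what setting the same parameters by hand would involve when storage is not configured centrally. The sketch below derives the four S3A properties from a storage definition and renders them as `--conf` arguments; the function names are hypothetical and are not part of Ilum.

```python
def s3a_spark_conf(endpoint, access_key, secret_key, path_style=True):
    """Build the S3A Spark properties for one S3-compatible storage."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Spark expects lowercase "true"/"false" strings for boolean properties
        "spark.hadoop.fs.s3a.path.style.access": str(path_style).lower(),
    }


def to_submit_args(conf):
    """Render a property dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Values taken from the MinIO example above
minio_conf = s3a_spark_conf("http://ilum-minio:9000", "minioadmin", "minioadmin")
```

Calling `to_submit_args(minio_conf)` yields the argument list that would otherwise have to be repeated for every job.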

Learn more: Create Storage, File Explorer

Data Lakehouse Platform

Table Formats

Ilum simplifies integration of modern data formats with ACID compliance, schema evolution, and time travel capabilities:

Supported Formats:

  • Delta Lake: ACID transactions with time travel and schema evolution
  • Apache Iceberg: Partition evolution and hidden partitioning for analytics
  • Apache Hudi: Record-level upserts and incremental data processing
  • Apache Paimon: Streaming lakehouse for real-time data

Ilum Tables: Unified API abstraction allowing you to use Delta, Hudi, and Iceberg with identical code. Learn more: Ilum Tables

Metadata Catalogs

Hive Metastore: Centralized metadata management compatible with Spark, Presto, and Trino. Organizes raw and processed data into SQL-like tables in persistent storage, enabling easy querying through Spark SQL.

Project Nessie: Transactional catalog inspired by Git for Apache Iceberg tables, providing version control for data.

Integration: Pre-configured Spark jobs automatically connect to enabled catalogs. Learn more: Table Explorer, Catalogs

Data Exploration and Visualization

Interactive SQL Querying

Ilum SQL (Kyuubi-based): Execute SQL queries directly in the UI on Hive Metastore tables. Provides streamlined data interaction without complex setup. Learn more: SQL Viewer

Trino Integration: High-performance distributed SQL query engine for interactive analytics, typically faster than Spark for exploratory queries.

Table Explorer

Advanced data exploration tool providing:

  • Visual data sampling and exploration
  • Chart building with mathematical functions
  • In-depth analysis with aggregations, filtering, and transformations
  • All Hive Metastore tables browsable from a single interface

Learn more: Table Explorer

Data Lineage

Visualize data flows and transformations across your entire platform:

Capabilities:

  • Job and dataset relationship visualization
  • Column-level lineage tracking
  • Automatic capture via OpenLineage integration
  • ERD ↔ Lineage toggle for schema and runtime views
  • Search across tables, columns, and jobs

Implementation: Ilum integrates Marquez with OpenLineage listeners automatically configured for all jobs. This provides comprehensive visibility into data movement and enables rapid issue tracing.
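
For readers who want a sense of what an OpenLineage listener configuration looks like on a plain Spark job, the following sketch assembles the usual properties of the OpenLineage Spark integration. The exact values Ilum injects are not shown on this page, so the Marquez URL and namespace below are hypothetical placeholders.

```python
def openlineage_spark_conf(marquez_url, namespace):
    """Spark properties that attach the OpenLineage listener to a job."""
    return {
        # Listener class shipped with the OpenLineage Spark integration
        "spark.extraListeners": "io.openlineage.spark.agent.OpenLineageSparkListener",
        # Send lineage events over HTTP to a Marquez-compatible backend
        "spark.openlineage.transport.type": "http",
        "spark.openlineage.transport.url": marquez_url,
        # Logical namespace under which jobs appear in the lineage graph
        "spark.openlineage.namespace": namespace,
    }


# Hypothetical endpoint and namespace, for illustration only
lineage_conf = openlineage_spark_conf("http://marquez:5000", "analytics")
```

With Ilum these settings are applied automatically, so a dictionary like this is only needed when running Spark outside the platform.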

Learn more: Data Lineage

Development Environments

Notebooks

Ilum integrates production-ready notebook environments with Spark:

JupyterLab: Modern web-based IDE for single-user workflows with Git integration and Spark Magic.

JupyterHub: Multi-user orchestrator providing:

  • Enterprise authentication (LDAP/SSO via Ilum)
  • User isolation with per-user JupyterLab workspaces
  • Centralized resource management on Kubernetes
  • Built-in version control via Gitea

Apache Zeppelin: Multi-language notebook emphasizing Spark analytics with flexible visualization.

All environments connect to Ilum via ilum-livy-proxy, binding Spark sessions to Ilum Groups and enabling interactive execution without manual configuration.

Learn more: Notebooks, Notebook Usage

Spark Connect

Spark Connect provides a client-server architecture that allows remote Spark job execution:

Benefits:

  • Client-server isolation prevents crashes
  • Independent version upgrades
  • Secure remote cluster access
  • IDE and notebook connectivity without full Spark installation

Ilum deploys Spark Connect servers as standard jobs, accessible through pod names, IPs, or Kubernetes services.
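
A client reaches such a server through a `sc://` connection string. The sketch below builds one from a host and port; the service name in the example is hypothetical, and 15002 is the default Spark Connect port, which a given deployment may override.

```python
def spark_connect_url(host, port=15002, **params):
    """Build a Spark Connect connection string such as sc://host:15002/;use_ssl=true."""
    url = f"sc://{host}:{port}"
    if params:
        # Optional parameters follow "/;" and are separated by semicolons
        url += "/;" + ";".join(f"{key}={value}" for key, value in sorted(params.items()))
    return url


# Hypothetical Kubernetes service name exposing the Spark Connect job
url = spark_connect_url("spark-connect-svc")
```

With PySpark 3.4 or later and the Spark Connect client installed, the resulting string can be passed to `SparkSession.builder.remote(url).getOrCreate()`.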

Learn more: Spark Connect

MLOps and Data Science

Data Science Platform

Ilum provides an end-to-end platform streamlining the entire ML lifecycle:

Pre-configured Environments:

  • Direct Spark and Trino connectivity
  • Catalog access to Delta, Iceberg, Hudi tables
  • Comprehensive ML libraries (scikit-learn, XGBoost, PyTorch, TensorFlow)
  • Starter templates for common ML scenarios

Model Development:

  • Integrated experiment tracking via MLflow
  • Feature engineering pipelines with Spark ML
  • Automated training and inference pipelines
  • Version control and collaborative development

Production Deployment:

  • Model registry with lifecycle management
  • Auto-scaling endpoints with monitoring
  • A/B testing support
  • Scheduled retraining jobs

Learn more: Data Science Platform

MLflow Integration

Complete ML lifecycle management:

  • Experiment tracking with parameters, metrics, and artifacts
  • Model registry with version control
  • Lifecycle stage transitions (development → staging → production)
  • Direct integration with Ilum deployment pipelines

Configuration: Enable with --set mlflow.enabled=true in Helm deployment.

Learn more: MLflow

AI Data Analyst

An LLM-powered AI assistant with retrieval capabilities:

Capabilities:

  • Natural language to SQL query generation
  • Context-aware responses leveraging workspace metadata
  • Interactive query editing and re-execution
  • Chart creation and pipeline integration via MCP server
  • Continuous learning from user feedback

Implementation: Deployed as a dedicated service with OAuth2 authentication, a RAG system over platform metadata, and integration with Table Explorer and SQL Viewer.

Learn more: AI Data Analyst

Business Intelligence Integration

Apache Superset

Open-source data visualization platform included in Ilum:

  • Pre-configured connections to Ilum SQL
  • Interactive dashboards and reports
  • Multiple chart types and customization
  • Free alternative to Tableau/PowerBI

Configuration: Enable with --set superset.enabled=true. Learn more: Superset

Tableau and PowerBI

External BI tool connectivity via JDBC:

  • Kyuubi JDBC driver for Ilum SQL connectivity
  • Direct access to all Hive Metastore tables
  • Load balancer exposure for external access
  • Automatic Spark configuration
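
As an illustration of the JDBC connectivity described above, the following sketch assembles a connection URL of the kind BI tools expect. Kyuubi speaks the HiveServer2 protocol, so the `jdbc:hive2://` scheme is used; the host is hypothetical, and 10009 is Kyuubi's default port, which may differ behind a load balancer.

```python
def kyuubi_jdbc_url(host, port=10009, database="default"):
    """Build a HiveServer2-style JDBC URL for a Kyuubi endpoint."""
    return f"jdbc:hive2://{host}:{port}/{database}"


# Hypothetical externally exposed hostname
jdbc_url = kyuubi_jdbc_url("ilum-sql.example.com")
```

The resulting URL is what would be entered in the connection dialog of Tableau or PowerBI together with the Kyuubi or Hive JDBC driver.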

Learn more: Tableau Integration

Monitoring and Observability

Centralized Monitoring

Spark History Server: Comprehensive job monitoring accessible from Ilum UI:

  • Job timeline and stage metrics
  • Executor resource utilization
  • CPU and memory usage tracking
  • Task-level performance details

Event logs are stored on the default Ilum storage and are accessible across multi-cluster deployments.

Metrics Collection:

  • Prometheus + Grafana: Kube Prometheus stack with pre-configured dashboards for Spark metrics
  • Graphite: Push-based metrics ideal for multi-cluster environments
  • All Ilum jobs automatically configured to push metrics

Log Aggregation:

  • Loki + Promtail: Centralized log gathering and querying
  • Efficient log management across entire infrastructure
  • Query capabilities suited to distributed environments

Learn more: Monitoring, Clusters and Storages

Security

Authentication and Authorization

Built-in RBAC: Role-based access control with user and group management through Ilum UI.

LDAP/Active Directory: Enterprise directory service integration for centralized user management. Learn more: LDAP

OAuth2/OIDC: Integration with external identity providers:

  • Keycloak, Okta, Azure AD, Google, GitLab
  • PKCE flow support for secure public clients
  • Automatic user provisioning from JWT tokens
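
To clarify what the PKCE flow involves on the client side, here is a minimal sketch of generating a code verifier and its S256 code challenge as defined in RFC 7636. It is independent of Ilum and of any particular identity provider; the function names are illustrative.

```python
import base64
import hashlib
import secrets


def _b64url(raw):
    """URL-safe base64 without padding, as required by RFC 7636."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_verifier(num_bytes=32):
    """Random high-entropy verifier; 32 bytes encode to 43 characters."""
    return _b64url(secrets.token_bytes(num_bytes))


def make_code_challenge(verifier):
    """S256 challenge: base64url(SHA-256(verifier))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
```

The client sends the challenge with the authorization request and the original verifier with the token request, which lets the provider confirm that both came from the same public client.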

Learn more: OAuth2 Security

Identity Provider Mode: Deploy Ilum as an OAuth2 provider for other services (Airflow, Superset, Grafana, Gitea, MinIO).

Learn more: OAuth Provider

RBAC Security Modes

Unrestricted Mode (default): Cluster-wide permissions for simplified deployment in development environments.

Restricted Mode: Namespace-scoped permissions implementing the principle of least privilege:

  • Enhanced security for production
  • Namespace isolation and reduced attack surface
  • Compliance-ready configuration
  • Limited to deployment namespace only

Learn more: RBAC Security Modes

Network Security

  • TLS/mTLS for inter-service communication
  • Kubernetes Network Policies for pod-to-pod restrictions
  • Certificate-based encryption
  • Egress controls

Learn more: Internal Security, Data Access

Production Deployment

Deployment Architecture

Ilum follows a modular architecture separating core Spark execution from optional data platform features:

Base Installation (8-12GB RAM, 6 CPU):

  • Spark 3.x/4.x job orchestration
  • Jupyter notebook integration
  • REST API

Enterprise Modules (18GB RAM, 12 CPU with metadata/lineage):

  • Hive Metastore for centralized metadata
  • SQL Viewer for interactive queries
  • Data Lineage via OpenLineage
  • Monitoring with Grafana dashboards
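
The sizing figures above can be turned into a quick pre-installation check. The sketch below is only an illustration: the profile names are hypothetical, and the base profile uses the upper bound of the stated 8-12GB range to stay conservative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    ram_gb: int
    cpus: int


# Figures taken from the text; "base" uses the upper RAM bound (12 GB)
PROFILES = {
    "base": Requirement(ram_gb=12, cpus=6),
    "enterprise": Requirement(ram_gb=18, cpus=12),
}


def fits(profile, available_ram_gb, available_cpus):
    """Return True if the available resources cover the chosen profile."""
    need = PROFILES[profile]
    return available_ram_gb >= need.ram_gb and available_cpus >= need.cpus
```

For example, `fits("enterprise", 16, 12)` is false because 16 GB falls short of the 18 GB stated for the enterprise modules.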

Namespace Separation Strategy (recommended for production):

  • Separate namespaces for MongoDB, Kafka, MinIO, PostgreSQL
  • Enhanced security through namespace-level isolation
  • Independent resource quotas and scaling
  • Simplified maintenance and upgrades

Helm-based Installation

# Add Ilum repository
helm repo add ilum https://charts.ilum.cloud
helm repo update

# Data Platform deployment (recommended)
helm install ilum ilum/ilum \
--set ilum-hive-metastore.enabled=true \
--set ilum-core.metastore.enabled=true \
--set ilum-core.metastore.type=hive \
--set ilum-sql.enabled=true \
--set ilum-core.sql.enabled=true \
--set global.lineage.enabled=true

# Basic Spark Platform deployment
helm install ilum ilum/ilum
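
When installations are scripted, the list of `--set` flags is easier to maintain as data. The following sketch builds the Helm command from a dictionary of values; the helper name is hypothetical, and the flags are the ones from the Data Platform deployment above.

```python
import shlex


def helm_install_command(release, chart, values=None):
    """Return the helm install command as an argument list."""
    args = ["helm", "install", release, chart]
    for key, value in (values or {}).items():
        # Helm expects lowercase true/false for boolean values
        rendered = str(value).lower() if isinstance(value, bool) else str(value)
        args += ["--set", f"{key}={rendered}"]
    return args


DATA_PLATFORM_VALUES = {
    "ilum-hive-metastore.enabled": True,
    "ilum-core.metastore.enabled": True,
    "ilum-core.metastore.type": "hive",
    "ilum-sql.enabled": True,
    "ilum-core.sql.enabled": True,
    "global.lineage.enabled": True,
}

command = helm_install_command("ilum", "ilum/ilum", DATA_PLATFORM_VALUES)
# Print the command for review before running it
print(shlex.join(command))
```

The argument list can be passed to `subprocess.run(command, check=True)` once the printed command has been reviewed.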

Configuration Management: All components configurable via Helm values. Use module selector for custom integration stacks.

Learn more: Production Deployment, Get Started

High Availability

ilum-core is designed for stateless operation:

  • Automatic state recovery after crashes
  • Horizontal scaling based on load
  • HA support with Kafka communication
  • MongoDB, Kafka, MinIO, and PostgreSQL support HA deployments

Learn more: Architecture

Use Cases

Ilum addresses diverse big data scenarios across industries:

Hadoop Migration: Simplify the transition from Hadoop/HDFS to Kubernetes with Yarn compatibility and incremental migration support.

Real-Time ML Interaction: Deploy ML models on Spark clusters with REST API for real-time predictions (e.g., e-commerce personalization).

Automated Machine Learning: Programmatically submit training, testing, and refinement jobs with interactive Jupyter integration.

Fraud Detection: Real-time transaction processing with ML algorithms accessible via REST API for immediate alerts.

Network Optimization: Predict network outages through ML models for proactive maintenance in telecommunications.

Learn more: Use Cases, Transaction Use Case

Additional Resources

Next Steps

  1. Installation: Follow the Get Started guide for initial deployment
  2. First Job: Submit your first Spark job
  3. Learning: Take the official Ilum course
  4. Production: Review production deployment best practices