Introduction
BDB Platform 11.0 delivers a unified Kinetic Semantic Layer and
platform-native Satellite Apps. With this release, the platform is AI-ready, governed by
default, and extensible to operational use cases well beyond analytics.
The release is organized around four themes:
- Kinetic Semantic Layer: Business Objects (ontology),
multi-step actions, aligned services, a centralized metric store, data quality, and
end-to-end lineage — the governed foundation for zero-hallucination AI Data Agents
and for Satellite Apps.
- Satellite Apps: Operational applications generated
by BDB's coding agent, with the Kinetic Semantic Layer as their structural backbone.
- Modernized Data Engineering: A native Data
Lakehouse, Ray-based distributed compute, a refreshed base platform (Python 3.12
and updated runtimes), pipeline triggers, offset mapping, and on-demand
Jobs-as-API.
- Developer Assist and Observability: The BDB Assist
Agent, embedded across Jobs, Data Science Lab notebooks, and Data Quality
authoring, together with a Proactive Monitoring Dashboard built for fast
troubleshooting across jobs, pipelines, and platform components.
Key Highlights
1. BDB Kinetic Semantic Layer
A unified semantic and action layer combining ontology,
governed operations, metrics, data quality, and lineage — purpose-built to ground AI
Data Agents and Satellite Apps in business concepts rather than raw physical schemas.
- Business Objects (Ontology): Model enterprise
concepts — Customer, Order, Policy, Asset — as typed objects with properties,
derived measures, and relationships.
- Multi-Step Actions: Define governed, multi-step
operations (e.g., approve policy, close order, onboard customer) on top of Business
Objects. Actions are callable by Data Agents, Satellite Apps, and external consumers
through a consistent contract.
- Aligned Services: Business-aligned service
definitions mapped to the ontology, giving every consumer a single, governed way to
invoke enterprise operations.
- Centralized Metric Store: Define measures and
dimensions once on Business Objects; reuse them across dashboards, reports, agents,
notebooks, and APIs.
- Data Quality: Rule-based quality checks at both
the Catalog (asset) and Data Center (datastore and table) levels; scores surface at
the point of use.
- End-to-End Lineage: Dataset-to-dataset,
job-to-dataset, and metric-to-source tracing across pipelines, Spark jobs, and
Business Objects.
- Zero-Hallucination Agents: Data Agents reason over
Business Objects, metrics, and governed actions — not raw schemas — eliminating the
hallucinated joins and misread columns typical of text-to-SQL approaches.
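As a rough illustration of metric-to-source tracing, the sketch below walks a small lineage graph upstream from a metric to the datasets it ultimately depends on. The graph contents, the node names, and the `trace_sources` helper are all hypothetical and are not part of the BDB API; the platform captures lineage automatically, and this only shows the kind of traversal that impact analysis relies on.

```python
from collections import deque

# Hypothetical lineage edges: each node maps to the nodes it is derived from.
UPSTREAM = {
    "metric:total_revenue": ["bo:Order"],
    "bo:Order": ["dataset:orders_clean"],
    "dataset:orders_clean": ["job:spark_clean_orders"],
    "job:spark_clean_orders": ["dataset:orders_raw", "dataset:fx_rates"],
    "dataset:orders_raw": [],
    "dataset:fx_rates": [],
}


def trace_sources(node, upstream=UPSTREAM):
    """Return the sorted root nodes (nodes with no upstream) reachable from `node`."""
    seen, roots = {node}, set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        parents = upstream.get(current, [])
        if not parents:
            roots.add(current)  # nothing upstream: this is a source
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(roots)


sources = trace_sources("metric:total_revenue")
```

Here `sources` holds the two raw datasets behind the metric. Reversing the edge direction would give the downstream view used to assess which metrics a dataset change affects.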
2. Satellite Apps
Operational applications generated by BDB's coding agent and
deployed directly inside the platform — with the Kinetic Semantic Layer as their
structural backbone. One surface for data, apps, and agents.
- Agent-Generated: Apps are produced by a coding
agent from business requirements rather than hand-coded module by module,
compressing build timelines from months to days.
- Kinetic Semantic Layer Backbone: Satellite Apps
consume Business Objects, metrics, and governed actions directly — no bolted-on data
access, and no drift between app logic and the semantic model.
- Governed by Default: Inherits platform
authentication, workspace entitlements, and Catalog-level access policies.
- Unified Surface: Operational tools and data
products ship alongside the data they consume. No external app hosting, no separate
identity story.
3. Native Data Lakehouse
A native Data Lakehouse built on Apache Hudi — databases and
tables are provisioned through the platform UI, inherit the full governance and
connectivity surface of Data Center, and are natively addressable from Data Engineering
and Data Science Lab.
- Apache Hudi Foundation: ACID transactions, schema
evolution, upserts and deletes, and time-travel queries on an open table format.
- UI-Driven Provisioning: Create and manage
Lakehouse databases and tables directly from the platform interface — no external
admin console, no hand-written DDL.
- Data Center-Native: Inherits the full set of Data
Center capabilities: connectivity, lifecycle management, governance, access policies,
and data quality rules.
- Cross-Module Access: First-class support in Data
Engineering (Data Lakehouse Reader and Writer pipeline components, Spark Job tasks)
and in Data Science Lab notebooks.
Additional Highlights
- Improved Data Catalog: The Catalog has been
upgraded into the Kinetic Semantic Layer surface — ontology, stewardship,
classification, entitlement, and quality unified into a single governance layer.
- BDB Proactive Monitoring Dashboard: A real-time
troubleshooting surface across jobs, pipelines, and platform components —
structured diagnostic logs, lifecycle events, and failure classifications in a
single view.
- BDB Assist Agent: AI-guided authoring and
troubleshooting embedded across Jobs, Data Science Lab notebooks, and Data Quality
rule configuration.
- Pipeline Triggers and Offset Mapping:
Event-driven pipelines with state-preserving component replacement.
- Ray Job Support: Distributed Python workloads
across Data Engineering and Data Science Lab.
- On-Demand Job as API: Authenticated REST
endpoints with configurable concurrency (parallel, queue, or reject).
- Delta Sharing in DS Lab: Secure, open-protocol
access to external Delta datasets from notebooks.
- Data Agent Mobile App: Dedicated iOS and Android
client for on-the-go agent interactions.
- Security Hardening: Penetration-test remediations
and vulnerability fixes across Catalog and DS Lab.
Module-Specific Updates
This section details the changes within each module.
Module 1: Platform Core
- New Features:
- BDB Proactive Monitoring Dashboard: A
centralized, real-time troubleshooting surface for jobs, pipelines, and
platform components. Consumes structured diagnostic logs, lifecycle events,
and failure classifications from components such as the Pipeline Component
Monitor — enabling faster root-cause analysis and earlier anomaly detection.
- Enhancements:
- Widget UI revamp: Refreshed visuals,
typography, and interaction patterns for a cleaner dashboard composition.
- Penetration-test remediations: Findings
from the latest pen-test cycle have been remediated.
Module 2: Semantic Layer and Data Catalog
The Data Catalog is now a full Semantic Layer — combining
ontology, metrics, quality, stewardship, classification, and entitlement in a single
governance surface.
- New Features:
- Business Objects (Ontology): Typed objects
with properties, relationships, and derived measures, mapped to physical
datasets and models.
- Centralized Metric Store: A single source
of truth for measures and dimensions across every consumer.
- Data Quality in Catalog: Define, execute,
and monitor quality rules against catalog assets; results surface as trust
scores on asset pages.
- End-to-End Lineage: Automated capture
across datasets, pipelines, Spark jobs, and Business Objects for impact
analysis and audit.
- Entitlement in Catalog: Asset-level access
control aligned with the platform-wide entitlement framework.
- Data Stewardship: Assign stewards to
catalog assets for ownership, curation, and quality accountability.
- Semantic Analysis: Automated
classification and relationship inference across cataloged assets,
accelerating Business Object modeling and discovery.
- Data Classification: Tag assets as Public,
Internal, Confidential, or PII. Tags propagate through discovery,
entitlement, and quality workflows.
- Enhancements:
- UI revamp: Aligned with the new platform
design system.
- Python runtime relocated to the Platform
Layer: Modular architecture, simpler service management, and a
cleaner upgrade path.
- Refined permission model: More granular
role-based controls.
- Python 3.12 upgrade: Latest language
features, improved performance, and updated ecosystem compatibility.
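To make the idea of Business Objects with a centralized metric store more concrete, here is a minimal sketch of a typed object that carries its own properties, relationships, and a metric defined once. The `BusinessObject` and `Metric` classes, their fields, and the sample records are assumptions made for illustration; they do not reflect the platform's actual modeling API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Metric:
    """A measure defined once and evaluated over a Business Object's records."""
    name: str
    compute: Callable[[List[dict]], float]


@dataclass
class BusinessObject:
    """A typed business concept with properties, relationships, and metrics."""
    name: str
    properties: Dict[str, type]
    relationships: Dict[str, str] = field(default_factory=dict)
    metrics: Dict[str, Metric] = field(default_factory=dict)

    def define_metric(self, metric: Metric) -> None:
        # Refuse a second definition so every consumer shares one meaning.
        if metric.name in self.metrics:
            raise ValueError(f"metric {metric.name!r} is already defined")
        self.metrics[metric.name] = metric

    def evaluate(self, metric_name: str, records: List[dict]) -> float:
        return self.metrics[metric_name].compute(records)


order = BusinessObject(
    name="Order",
    properties={"order_id": str, "amount": float, "status": str},
    relationships={"customer": "Customer"},
)
order.define_metric(
    Metric(
        "total_closed_amount",
        lambda rows: sum(r["amount"] for r in rows if r["status"] == "closed"),
    )
)

records = [
    {"order_id": "A1", "amount": 120.0, "status": "closed"},
    {"order_id": "A2", "amount": 80.0, "status": "open"},
    {"order_id": "A3", "amount": 50.0, "status": "closed"},
]
# A dashboard, a notebook, and an agent would all call this same definition.
total = order.evaluate("total_closed_amount", records)
```

The design point is that the metric lives on the object rather than in each consumer, so a change to its definition happens in exactly one place.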
Module 3: Data Center
- New Features:
- Native Data Lakehouse: Unified storage for
structured, semi-structured, and unstructured data, with ACID tables, schema
evolution, and time-travel queries. Accessible through the new Data Lakehouse
Reader and Writer pipeline components and from Spark Job tasks.
- Satellite Apps: A first-class deployable
entity for platform-native applications, managed directly from the Data
Center.
- Dynamic Form Framework:
Configuration-driven, runtime-rendered forms — faster onboarding of new
connectors and workflow variants.
- Enhancements:
- Data Quality Monitoring: The monitoring
subsystem has been extended with new rules for completeness, uniqueness,
range, and referential integrity checks; table-level scope; a streamlined UI
for rule configuration and lifecycle management; and AI-assisted rule
authoring via the BDB Assist Agent.
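The four rule families named above (completeness, uniqueness, range, and referential integrity) can be sketched as simple scoring functions over rows. This is only an illustrative approximation: the function names, the scoring as a fraction between 0 and 1, and the sample data are assumptions, not the platform's rule engine or its score formula.

```python
def _non_null(rows, column):
    """Collect the non-null values of `column`."""
    return [r[column] for r in rows if r.get(column) is not None]


def completeness(rows, column):
    """Fraction of rows where `column` is present and not None."""
    if not rows:
        return 1.0
    return sum(r.get(column) is not None for r in rows) / len(rows)


def uniqueness(rows, column):
    """Fraction of non-null values in `column` that are distinct."""
    values = _non_null(rows, column)
    return len(set(values)) / len(values) if values else 1.0


def in_range(rows, column, low, high):
    """Fraction of non-null values in `column` within [low, high]."""
    values = _non_null(rows, column)
    return sum(low <= v <= high for v in values) / len(values) if values else 1.0


def referential_integrity(rows, column, parent_keys):
    """Fraction of non-null foreign keys that exist in `parent_keys`."""
    values = _non_null(rows, column)
    return sum(v in parent_keys for v in values) / len(values) if values else 1.0


# Hypothetical table-level data with one defect per rule family.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 40},
    {"order_id": 2, "customer_id": "C2", "amount": -5},
    {"order_id": 2, "customer_id": None, "amount": 10},
    {"order_id": 4, "customer_id": "C9", "amount": 25},
]
customers = {"C1", "C2"}

scores = {
    "customer_id completeness": completeness(orders, "customer_id"),
    "order_id uniqueness": uniqueness(orders, "order_id"),
    "amount range": in_range(orders, "amount", 0, 1000),
    "customer_id integrity": referential_integrity(orders, "customer_id", customers),
}
```

Each score is the share of values that pass the rule, which is one plausible way for a rule result to feed a per-table quality score.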
Module 4: Data Pipeline
- New Features:
- Pipeline Trigger: Event- and
dependency-driven pipeline execution — supporting time schedules, upstream
data availability, and external event signals.
- Workload Isolation via Kubernetes Taints and Tolerations:
Pipeline workloads can now be scheduled onto dedicated nodes, providing predictable
performance and clean cost attribution. The capability is supported uniformly across
Jobs, Data Pipeline, and Data Science Lab orchestration; see Module 5: Jobs for the
full description.
- New Components:
- Data Lakehouse Reader and Writer: Native
pipeline components with consistent schema and transaction semantics.
- Python Schema Validator: Validates incoming
data against JSON schemas in Strict or Permissive mode. Non-conforming
records are routed to a Bad Records event.
- Enhancements:
- Component Monitor log enrichment:
Structured logs with lifecycle events, offset positions, and failure
classifications — feeding the Proactive Monitoring Dashboard.
- API Ingestion optimization: The
intermediate internal topic and the API Reader deployment have been removed;
data now flows directly to the output stage. This eliminates data duplication,
removes the extra deployment and storage requirements, and reduces
infrastructure cost while improving throughput.
- Offset Mapping: State-preserving component
replacement for streaming and message-based sources. New components resume
from previously deployed offsets — no full reprocessing or duplicate
generation during upgrades.
- Reduced system dependencies: The system
pod log-manager dependency has been removed, lowering runtime overhead,
improving resilience, and simplifying deployment.
- Repository access via Platform Layer
service: Unified access layer; components no longer interact
directly with MongoDB.
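The Offset Mapping behavior described above can be pictured with a small in-memory model: a replacement component inherits the committed offsets of the component it replaces, so it resumes where the old one stopped instead of replaying the source. The `OffsetStore` class, the component IDs, and the list standing in for a message log are hypothetical; how the platform actually persists and maps offsets is not specified in these notes.

```python
class OffsetStore:
    """In-memory stand-in for wherever committed offsets are persisted."""

    def __init__(self):
        # component_id -> {(topic, partition): next offset to read}
        self._offsets = {}

    def commit(self, component_id, topic, partition, next_offset):
        self._offsets.setdefault(component_id, {})[(topic, partition)] = next_offset

    def map_offsets(self, old_component_id, new_component_id):
        """Copy the old component's committed offsets to its replacement."""
        self._offsets[new_component_id] = dict(self._offsets.get(old_component_id, {}))

    def position(self, component_id, topic, partition):
        """Offset to resume from; 0 means start from the beginning."""
        return self._offsets.get(component_id, {}).get((topic, partition), 0)


def consume(store, component_id, topic, partition, log):
    """Read every message from the stored position onward, then commit."""
    start = store.position(component_id, topic, partition)
    batch = log[start:]
    store.commit(component_id, topic, partition, start + len(batch))
    return batch


log = ["m0", "m1", "m2"]
store = OffsetStore()
first = consume(store, "reader-v1", "orders", 0, log)     # reads m0, m1, m2

log += ["m3", "m4"]                                        # new data arrives
store.map_offsets("reader-v1", "reader-v2")                # replace the component
resumed = consume(store, "reader-v2", "orders", 0, log)    # reads only m3, m4
unmapped = consume(store, "reader-v3", "orders", 0, log)   # no mapping: full replay
```

The contrast between `resumed` and `unmapped` is the point: without the mapping step, the replacement would reprocess all five messages and produce duplicates downstream.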
Module 5: Jobs
- New Features:
- BDB Assist Agent (in Jobs): AI-guided job
authoring, configuration validation, and troubleshooting recommendations.
The agent is also available on the platform's other development surfaces: see
Module 6: Data Science Lab for notebook-based assistance and Module 3: Data
Center for Data Quality rule authoring.
- Ray Job support: Distributed Python
workloads on managed Ray infrastructure — training, tuning, and parallel
transforms.
- Spark Job — Data Lakehouse Reader and Writer
tasks: Large-scale Spark reads and writes against the Lakehouse.
- On-Demand Job as API: Expose jobs as REST
endpoints with Client ID / Client Secret authentication and per-client rate
limits.
- Concurrency policy for on-demand jobs:
Configurable handling of simultaneous invocations — parallel execution,
queuing, or rejection.
- Workload Isolation via Kubernetes Taints and Tolerations:
Now supported across Jobs, Data Pipeline, and Data Science Lab orchestration.
Specific workloads can be pinned to dedicated infrastructure, for example GPU nodes
for ML training, high-memory nodes for large Spark jobs, or tenant-specific node
pools for regulated workloads. This delivers predictable performance for critical
workloads (no resource contention from noisy neighbors), clean cost attribution by
line of business or use case, and the workload separation typically required for
compliance, multi-tenancy, and chargeback scenarios.
- Spark lineage: End-to-end visibility of
Spark data flow, integrated with the Catalog's unified lineage view.
- RBAC for Spark: Fine-grained permissions
across Spark jobs, clusters, and related artifacts.
- Enhancements:
- Job Trigger enhancements: More reliable
trigger evaluation with expanded event-driven execution scenarios and
multi-job dependency linking (e.g., Job D triggers based on the combined
outcomes of Jobs B and C).
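The three concurrency options for on-demand jobs (parallel execution, queuing, or rejection) can be sketched as a small state model. The `OnDemandJob` class, the `Policy` enum, and the returned status strings are assumptions made for illustration; the notes do not describe the real endpoint's responses or how a policy is configured.

```python
from collections import deque
from enum import Enum


class Policy(Enum):
    PARALLEL = "parallel"
    QUEUE = "queue"
    REJECT = "reject"


class OnDemandJob:
    """Tracks running and waiting invocations under one concurrency policy."""

    def __init__(self, policy):
        self.policy = policy
        self.running = set()
        self.waiting = deque()

    def invoke(self, request_id):
        """Return 'started', 'queued', or 'rejected' for a new invocation."""
        if not self.running or self.policy is Policy.PARALLEL:
            self.running.add(request_id)
            return "started"
        if self.policy is Policy.QUEUE:
            self.waiting.append(request_id)
            return "queued"
        return "rejected"

    def finish(self, request_id):
        """Mark a run complete and start the next queued request, if any."""
        self.running.discard(request_id)
        if self.waiting and not self.running:
            next_id = self.waiting.popleft()
            self.running.add(next_id)
            return next_id
        return None


# Send two overlapping requests under each policy and record the outcome.
outcomes = {}
for policy in Policy:
    job = OnDemandJob(policy)
    outcomes[policy.value] = [job.invoke("r1"), job.invoke("r2")]
```

Under this model the second overlapping request starts immediately with the parallel policy, waits its turn with the queue policy, and is refused with the reject policy.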
Module 6: Data Science Lab
- New Features:
- UV-based virtual environments: UV — a
modern, Rust-based Python package manager — is now used in both the UI and
backend, delivering faster dependency resolution and notebook start-up.
- Delta Sharing: Consume shared Delta
datasets from external providers directly in notebooks, with no duplication.
- Azure Blob connectivity: Native reads and
writes against Azure Blob Storage; required JARs are bundled in the runtime.
- CatBoost and XGBoost support: Native
support in the DS Notebook runtime and in Explainability Dashboards.
- Widgets integration: Inline controls for
parameter tuning, data exploration, and visualization within notebooks.
- Ray Job execution: Run distributed Python
workloads on the platform's Ray infrastructure directly from DS Lab
workspaces.
- BDB Assist Agent (in Notebooks):
AI-guided notebook authoring, code completion, and troubleshooting
recommendations, available alongside UV and Ray within DS Lab.
- Workload Isolation via Kubernetes Taints and Tolerations:
DS Lab orchestration can now schedule notebook and training workloads onto dedicated nodes —
for example, pinning GPU-intensive training to GPU nodes. See Module 5: Jobs for the full description.
- Enhancements:
- Hardened persistent notebook saving:
Reliable saves across project state transitions, so work is not lost in
long-running sessions.
- Notebook UI refinements: Cleaner
navigation, improved cell controls, and greater visual consistency.
- UI logging for critical errors: Frontend
errors are now captured and transmitted for faster diagnosis.
- Angular v16 upgrade: Improved performance
and modernized build tooling.
- Python 3.12 upgrade: Latest language
features and library compatibility.
- PySpark library updates: Azure, GCS, S3,
Kafka, and model-version dependencies updated to the latest supported
versions.
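For readers less familiar with the mechanism behind workload isolation, the sketch below models how Kubernetes decides whether a pod's tolerations permit it to run on a tainted node. The matching rules follow standard Kubernetes semantics, but the taint key and value, the node pool, and the two workloads are hypothetical, and the notes do not say where these settings are exposed in the platform UI.

```python
def tolerates(toleration, taint):
    """True if a single toleration matches a single taint (Kubernetes rules)."""
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False  # an empty effect on the toleration matches any effect
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key with Exists matches every taint key.
        return key == "" or key == taint["key"]
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod_tolerations, node_taints):
    """A pod may land on a node only if every hard taint is tolerated."""
    hard = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations) for taint in hard)


# Hypothetical GPU node pool; the key and value names are illustrative only.
gpu_node_taints = [
    {"key": "workload", "value": "gpu-training", "effect": "NoSchedule"}
]
training_tolerations = [
    {"key": "workload", "operator": "Equal", "value": "gpu-training", "effect": "NoSchedule"}
]
notebook_tolerations = []  # an ordinary notebook declares no tolerations

training_ok = can_schedule(training_tolerations, gpu_node_taints)
notebook_ok = can_schedule(notebook_tolerations, gpu_node_taints)
```

In plain Kubernetes, a toleration only permits placement on tainted nodes. Steering a workload onto those nodes usually also needs a node selector or node affinity, so a dedicated pool is typically configured with both.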
Module 7: Data Agent
- New Features:
- Semantic-layer-grounded agents: Agents
reason over Business Objects, governed metrics, classification tags, and
quality signals — not raw schemas.
- Agent Evaluation Framework: Repeatable
assessment of agent performance, response quality, and behavioral
consistency.
- Data Agent mobile application: Dedicated
iOS and Android client with conversational interaction, platform-standard
secure authentication, push notifications, graceful handling of offline
conditions, and workspace awareness consistent with the web experience.
- Enhancements:
- Design system alignment: The Data Agent
interface has been aligned with the current platform design system.