BDB Release Notes 11.0

Introduction

BDB Platform 11.0 delivers a unified Kinetic Semantic Layer and platform-native Satellite Apps. With this release, the platform is AI-ready, governed by default, and extensible to operational use cases well beyond analytics.

The release is organized around four themes:

Kinetic Semantic Layer: Business Objects (ontology), multi-step actions, aligned services, a centralized metric store, data quality, and end-to-end lineage — the governed foundation for zero-hallucination AI Data Agents and for Satellite Apps.
Satellite Apps: Operational applications generated by BDB's coding agent, with the Kinetic Semantic Layer as their structural backbone.
Modernized Data Engineering: A native Data Lakehouse, Ray-based distributed compute, and a refreshed base platform — Python 3.12, updated runtimes, plus pipeline triggers, offset mapping, and on-demand Jobs-as-API.
Developer Assist and Observability: The BDB Assist Agent, embedded across Jobs, Data Science Lab notebooks, and Data Quality authoring, together with a Proactive Monitoring Dashboard built for fast troubleshooting across jobs, pipelines, and platform components.

Key Highlights

1. BDB Kinetic Semantic Layer

A unified semantic and action layer combining ontology, governed operations, metrics, data quality, and lineage — purpose-built to ground AI Data Agents and Satellite Apps in business concepts rather than raw physical schemas.

Business Objects (Ontology): Model enterprise concepts — Customer, Order, Policy, Asset — as typed objects with properties, derived measures, and relationships.
Multi-Step Actions: Define governed, multi-step operations (e.g., approve policy, close order, onboard customer) on top of Business Objects. Actions are callable by Data Agents, Satellite Apps, and external consumers through a consistent contract.
Aligned Services: Business-aligned service definitions mapped to the ontology, giving every consumer a single, governed way to invoke enterprise operations.
Centralized Metric Store: Define measures and dimensions once on Business Objects; reuse them across dashboards, reports, agents, notebooks, and APIs.
Data Quality: Rule-based quality checks at both the Catalog (asset) and Data Center (datastore and table) levels; scores surface at the point of use.
End-to-End Lineage: Dataset-to-dataset, job-to-dataset, and metric-to-source tracing across pipelines, Spark jobs, and Business Objects.
Zero-Hallucination Agents: Data Agents reason over Business Objects, metrics, and governed actions — not raw schemas — eliminating the hallucinated joins and misread columns typical of text-to-SQL approaches.
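
As an illustration of the define-once, reuse-everywhere idea behind the Centralized Metric Store, the following is a minimal Python sketch and not the BDB API. The names MetricStore, Measure, and order_revenue, and the sample order rows, are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable

Row = Dict[str, Any]


@dataclass(frozen=True)
class Measure:
    """A named measure with a single, shared computation (hypothetical type)."""
    name: str
    compute: Callable[[Iterable[Row]], float]


class MetricStore:
    """Holds each measure definition exactly once, keyed by name."""

    def __init__(self) -> None:
        self._measures: Dict[str, Measure] = {}

    def define(self, measure: Measure) -> None:
        # Refuse a second definition so consumers cannot drift apart.
        if measure.name in self._measures:
            raise ValueError(f"measure {measure.name!r} is already defined")
        self._measures[measure.name] = measure

    def evaluate(self, name: str, rows: Iterable[Row]) -> float:
        return self._measures[name].compute(rows)


def closed_order_revenue(rows: Iterable[Row]) -> float:
    # Assumed business rule for the example: only closed orders count.
    return sum(float(r["amount"]) for r in rows if r["status"] == "closed")


store = MetricStore()
store.define(Measure("order_revenue", closed_order_revenue))

orders = [
    {"order_id": 1, "amount": 120.0, "status": "closed"},
    {"order_id": 2, "amount": 80.0, "status": "open"},
    {"order_id": 3, "amount": 50.0, "status": "closed"},
]

# Two different consumers resolve the same definition by name.
dashboard_value = store.evaluate("order_revenue", orders)
agent_value = store.evaluate("order_revenue", orders)
```

Because both consumers resolve the measure by name from one registry, a change to the definition reaches every consumer at once, and a second, conflicting definition is rejected.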

2. Satellite Apps

Operational applications generated by BDB's coding agent and deployed directly inside the platform — with the Kinetic Semantic Layer as their structural backbone. One surface for data, apps, and agents.

Agent-Generated: Apps are produced by a coding agent from business requirements rather than hand-coded module by module, compressing build timelines from months to days.
Kinetic Semantic Layer Backbone: Satellite Apps consume Business Objects, metrics, and governed actions directly — no bolted-on data access, and no drift between app logic and the semantic model.
Governed by Default: Satellite Apps inherit platform authentication, workspace entitlements, and Catalog-level access policies.
Unified Surface: Operational tools and data products ship alongside the data they consume. No external app hosting, no separate identity story.

3. Native Data Lakehouse

A native Data Lakehouse built on Apache Hudi — databases and tables are provisioned through the platform UI, inherit the full governance and connectivity surface of Data Center, and are natively addressable from Data Engineering and Data Science Lab.

Apache Hudi Foundation: ACID transactions, schema evolution, upserts and deletes, and time-travel queries on an open table format.
UI-Driven Provisioning: Create and manage Lakehouse databases and tables directly from the platform interface — no external admin console, no hand-written DDL.
Data Center-Native: Inherits the full set of Data Center capabilities: connectivity, lifecycle management, governance, access policies, and data quality rules.
Cross-Module Access: First-class support in Data Engineering (Data Lakehouse Reader and Writer pipeline components, Spark Job tasks) and in Data Science Lab notebooks.

Additional Highlights

Improved Data Catalog: The Catalog has been upgraded into the Kinetic Semantic Layer surface — ontology, stewardship, classification, entitlement, and quality unified into a single governance layer.
BDB Proactive Monitoring Dashboard: A real-time troubleshooting surface across jobs, pipelines, and platform components — structured diagnostic logs, lifecycle events, and failure classifications in a single view.
BDB Assist Agent: AI-guided authoring and troubleshooting embedded across Jobs, Data Science Lab notebooks, and Data Quality rule configuration.
Pipeline Triggers and Offset Mapping: Event-driven pipelines with state-preserving component replacement.
Ray Job Support: Distributed Python workloads across Data Engineering and Data Science Lab.
On-Demand Job as API: Authenticated REST endpoints with configurable concurrency (parallel, queue, or reject).
Delta Sharing in DS Lab: Secure, open-protocol access to external Delta datasets from notebooks.
Data Agent Mobile App: Dedicated iOS and Android client for on-the-go agent interactions.
Security Hardening: Penetration-test remediations and vulnerability fixes across Catalog and DS Lab.

Module-Specific Updates

This section details the changes within each module.

Module 1: Platform Core

New Features:
- BDB Proactive Monitoring Dashboard: A centralized, real-time troubleshooting surface for jobs, pipelines, and platform components. Consumes structured diagnostic logs, lifecycle events, and failure classifications from components such as the Pipeline Component Monitor — enabling faster root-cause analysis and earlier anomaly detection.
Enhancements:
- Widget UI revamp: Refreshed visuals, typography, and interaction patterns for cleaner dashboard composition.
- Penetration-test remediations: Findings from the latest pen-test cycle have been remediated.

Module 2: Semantic Layer and Data Catalog

The Data Catalog is now a full Semantic Layer — combining ontology, metrics, quality, stewardship, classification, and entitlement in a single governance surface.

New Features:
- Business Objects (Ontology): Typed objects with properties, relationships, and derived measures, mapped to physical datasets and models.
- Centralized Metric Store: A single source of truth for measures and dimensions across every consumer.
- Data Quality in Catalog: Define, execute, and monitor quality rules against catalog assets; results surface as trust scores on asset pages.
- End-to-End Lineage: Automated capture across datasets, pipelines, Spark jobs, and Business Objects for impact analysis and audit.
- Entitlement in Catalog: Asset-level access control aligned with the platform-wide entitlement framework.
- Data Stewardship: Assign stewards to catalog assets for ownership, curation, and quality accountability.
- Semantic Analysis: Automated classification and relationship inference across cataloged assets, accelerating Business Object modeling and discovery.
- Data Classification: Tag assets as Public, Internal, Confidential, or PII. Tags propagate through discovery, entitlement, and quality workflows.
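
To show how captured lineage can support impact analysis, here is a minimal Python sketch that walks a lineage graph downstream from a changed asset. The graph contents, the node naming scheme (dataset:, job:, business_object:, metric:), and the downstream function are assumptions made for illustration; they do not describe how the Catalog stores or exposes lineage.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage edges: each key feeds the nodes listed as its value.
LINEAGE: Dict[str, List[str]] = {
    "dataset:raw_orders": ["job:clean_orders"],
    "job:clean_orders": ["dataset:orders"],
    "dataset:orders": ["business_object:Order", "job:daily_rollup"],
    "job:daily_rollup": ["dataset:order_summary"],
    "business_object:Order": ["metric:order_revenue"],
}


def downstream(node: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Return every asset reachable from `node` (breadth-first traversal)."""
    seen: Set[str] = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in seen:  # guards against revisiting shared nodes
                seen.add(child)
                queue.append(child)
    return seen


# Which assets are affected if dataset:orders changes?
impacted = downstream("dataset:orders", LINEAGE)
```

Running the same traversal over reversed edges would answer the audit question instead: which sources a given metric ultimately depends on.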
Enhancements:
- UI revamp: Aligned with the new platform design system.
- Python runtime relocated to the Platform Layer: Modular architecture, simpler service management, and a cleaner upgrade path.
- Refined permission model: More granular role-based controls.
- Python 3.12 upgrade: Latest language features, improved performance, and updated ecosystem compatibility.

Module 3: Data Center

New Features:
- Native Data Lakehouse: Unified storage for structured, semi-structured, and unstructured data, with ACID tables, schema evolution, and time-travel queries. Accessible through the new Data Lakehouse Reader and Writer pipeline components and from Spark Job tasks.
- Satellite Apps: A first-class deployable entity for platform-native applications, managed directly from the Data Center.
- Dynamic Form Framework: Configuration-driven, runtime-rendered forms — faster onboarding of new connectors and workflow variants.
Enhancements:
- Data Quality Monitoring: The monitoring subsystem has been extended with new rules for completeness, uniqueness, range, and referential integrity checks; table-level scope; a streamlined UI for rule configuration and lifecycle management; and AI-assisted rule authoring via the BDB Assist Agent.
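
The four rule families named above (completeness, uniqueness, range, and referential integrity) can be sketched in plain Python as follows. This is an illustrative sketch only: the function names, the 0-to-1 scoring convention, and the sample rows are assumptions, not the rule definitions or scoring used by the platform.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def _non_null(rows: List[Row], column: str) -> List[Any]:
    return [r.get(column) for r in rows if r.get(column) is not None]


def completeness(rows: List[Row], column: str) -> float:
    """Share of rows in which the column is populated."""
    if not rows:
        return 1.0
    return len(_non_null(rows, column)) / len(rows)


def uniqueness(rows: List[Row], column: str) -> float:
    """Distinct non-null values divided by all non-null values."""
    values = _non_null(rows, column)
    return len(set(values)) / len(values) if values else 1.0


def in_range(rows: List[Row], column: str, low: float, high: float) -> float:
    """Share of non-null values that fall inside [low, high]."""
    values = _non_null(rows, column)
    if not values:
        return 1.0
    return sum(1 for v in values if low <= v <= high) / len(values)


def referential_integrity(rows: List[Row], column: str, parent_keys: Set[Any]) -> float:
    """Share of non-null foreign keys that exist in the parent key set."""
    values = _non_null(rows, column)
    if not values:
        return 1.0
    return sum(1 for v in values if v in parent_keys) / len(values)


# Hypothetical sample data with one defect per rule.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 120.0},
    {"order_id": 2, "customer_id": "C2", "amount": -5.0},
    {"order_id": 2, "customer_id": None, "amount": 40.0},
    {"order_id": 4, "customer_id": "C9", "amount": 75.0},
]
customers = {"C1", "C2", "C3"}

scores = {
    "completeness(customer_id)": completeness(orders, "customer_id"),
    "uniqueness(order_id)": uniqueness(orders, "order_id"),
    "range(amount)": in_range(orders, "amount", 0, 1000),
    "referential(customer_id)": referential_integrity(orders, "customer_id", customers),
}
```

Each check returns a fraction rather than a pass/fail flag, which is one plausible way for results to roll up into a table-level score.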

Module 4: Data Pipeline

New Features:
- Pipeline Trigger: Event- and dependency-driven pipeline execution — supporting time schedules, upstream data availability, and external event signals.
- Workload Isolation via Kubernetes Taints and Tolerations: Pipeline workloads can now be scheduled onto dedicated nodes, providing predictable performance and clean cost attribution. The capability is supported uniformly across Jobs, Data Pipeline, and Data Science Lab orchestration; see Module 5: Jobs for the full description.
New Components:
- Data Lakehouse Reader and Writer: Native pipeline components with consistent schema and transaction semantics.
- Python Schema Validator: Validates incoming data against JSON schemas in Strict or Permissive mode. Non-conforming records are routed to a Bad Records event.
Enhancements:
- Component Monitor log enrichment: Structured logs with lifecycle events, offset positions, and failure classifications — feeding the Proactive Monitoring Dashboard.
- API Ingestion optimization: The intermediate internal topic and the API Reader deployment have been removed; data now flows directly to the output stage. This eliminates a redundant copy of the data, removes the associated deployment and storage requirements, and reduces infrastructure cost while improving throughput.
- Offset Mapping: State-preserving component replacement for streaming and message-based sources. New components resume from previously deployed offsets — no full reprocessing or duplicate generation during upgrades.
- Reduced system dependencies: The system pod log-manager dependency has been removed, lowering runtime overhead, improving resilience, and simplifying deployment.
- Repository access via Platform Layer service: Unified access layer; components no longer interact directly with MongoDB.
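
The idea behind Offset Mapping, a replacement component resuming where its predecessor stopped, can be illustrated with a small in-memory sketch. It is a conceptual model only: OffsetStore, map_offsets, consume, and the component and topic names are hypothetical, and the sketch says nothing about how the platform persists or transfers offsets.

```python
from typing import Dict, List, Optional, Tuple

Partition = Tuple[str, int]  # (topic, partition number)


class OffsetStore:
    """Tracks the next offset to read, per component and partition."""

    def __init__(self) -> None:
        self._committed: Dict[str, Dict[Partition, int]] = {}

    def commit(self, component_id: str, partition: Partition, next_offset: int) -> None:
        self._committed.setdefault(component_id, {})[partition] = next_offset

    def position(self, component_id: str, partition: Partition) -> int:
        # A component with no history starts from the beginning.
        return self._committed.get(component_id, {}).get(partition, 0)

    def map_offsets(self, old_component: str, new_component: str) -> None:
        """Copy the old component's positions to its replacement."""
        self._committed[new_component] = dict(self._committed.get(old_component, {}))


def consume(store: OffsetStore, component_id: str, partition: Partition,
            log: List[str], limit: Optional[int] = None) -> List[str]:
    """Read from the committed position, then commit the new position."""
    start = store.position(component_id, partition)
    end = len(log) if limit is None else min(len(log), start + limit)
    batch = log[start:end]
    store.commit(component_id, partition, end)
    return batch


log = ["e0", "e1", "e2", "e3", "e4", "e5"]
partition = ("orders", 0)
store = OffsetStore()

# The original component processes the first four events.
first_batch = consume(store, "reader_v1", partition, log, limit=4)

# The component is replaced; its offsets are mapped to the new one.
store.map_offsets("reader_v1", "reader_v2")
resumed_batch = consume(store, "reader_v2", partition, log)

# Without mapping, a fresh component would start over from offset 0.
unmapped_batch = consume(store, "reader_v3", partition, log)
```

The contrast between resumed_batch and unmapped_batch is the point: with mapping, only the unread events are processed; without it, everything is read again.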

Module 5: Jobs

New Features:
- BDB Assist Agent (in Jobs): AI-guided job authoring, configuration validation, and troubleshooting recommendations. Available across the platform's development surfaces — see Data Science Lab for notebook-based assistance and Data Center for Data Quality rule authoring.
- Ray Job support: Distributed Python workloads on managed Ray infrastructure — training, tuning, and parallel transforms.
- Spark Job — Data Lakehouse Reader and Writer tasks: Large-scale Spark reads and writes against the Lakehouse.
- On-Demand Job as API: Expose jobs as REST endpoints with Client ID / Client Secret authentication and per-client rate limits.
- Concurrency policy for on-demand jobs: Configurable handling of simultaneous invocations — parallel execution, queuing, or rejection.
- Workload Isolation via Kubernetes Taints and Tolerations: Now supported across Jobs, Data Pipeline, and Data Science Lab orchestration. Specific workloads can be pinned to dedicated infrastructure, for example GPU nodes for ML training, high-memory nodes for large Spark jobs, or tenant-specific node pools for regulated workloads. This delivers predictable performance for critical workloads (no resource contention from noisy neighbors), clean cost attribution by line of business or use case, and the workload separation typically required for compliance, multi-tenancy, and chargeback scenarios.
- Spark lineage: End-to-end visibility of Spark data flow, integrated with the Catalog's unified lineage view.
- RBAC for Spark: Fine-grained permissions across Spark jobs, clusters, and related artifacts.
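
The three concurrency policies for on-demand jobs (parallel, queue, reject) can be modelled with a small state machine. This is a sketch under stated assumptions: the OnDemandJob class, its invoke and finish methods, and the returned status strings are hypothetical, and the model assumes at most one running instance under the queue and reject policies.

```python
from collections import deque
from enum import Enum
from typing import Deque, Dict, List


class Policy(Enum):
    PARALLEL = "parallel"
    QUEUE = "queue"
    REJECT = "reject"


class OnDemandJob:
    """Toy model of how simultaneous invocations are handled."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.running: List[str] = []
        self.waiting: Deque[str] = deque()

    def invoke(self, request_id: str) -> str:
        # An idle job always starts; PARALLEL starts regardless of load.
        if not self.running or self.policy is Policy.PARALLEL:
            self.running.append(request_id)
            return "started"
        if self.policy is Policy.QUEUE:
            self.waiting.append(request_id)
            return "queued"
        return "rejected"

    def finish(self, request_id: str) -> None:
        self.running.remove(request_id)
        # Under QUEUE, the oldest waiting request starts next.
        if self.policy is Policy.QUEUE and not self.running and self.waiting:
            self.running.append(self.waiting.popleft())


# Send two overlapping requests to a job under each policy.
results: Dict[str, List[str]] = {}
for policy in Policy:
    job = OnDemandJob(policy)
    results[policy.value] = [job.invoke("r1"), job.invoke("r2")]

# Under QUEUE, the second request starts once the first completes.
queued_job = OnDemandJob(Policy.QUEUE)
queued_job.invoke("r1")
queued_job.invoke("r2")
queued_job.finish("r1")
```

The choice between the three is a trade-off: parallel favors latency, queue favors ordering and bounded resource use, and reject pushes retry handling to the caller.
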
Enhancements:
- Job Trigger enhancements: More reliable trigger evaluation with expanded event-driven execution scenarios and multi-job dependency linking (e.g., Job D triggers based on the combined outcomes of Jobs B and C).
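
Multi-job dependency linking can be pictured as a predicate over upstream outcomes. The sketch below assumes that Job D should run only when both Job B and Job C have succeeded; the release notes do not specify which combinations are supported, and the function name, job identifiers, and status strings are hypothetical.

```python
from typing import Mapping


def should_trigger(required: Mapping[str, str], outcomes: Mapping[str, str]) -> bool:
    """True when every upstream job has reported its required outcome.

    A job that has not reported yet is treated as not satisfied.
    """
    return all(outcomes.get(job) == status for job, status in required.items())


# ASSUMPTION: Job D requires both upstream jobs to succeed.
job_d_requires = {"job_b": "succeeded", "job_c": "succeeded"}

both_done = should_trigger(job_d_requires, {"job_b": "succeeded", "job_c": "succeeded"})
one_failed = should_trigger(job_d_requires, {"job_b": "succeeded", "job_c": "failed"})
one_pending = should_trigger(job_d_requires, {"job_b": "succeeded"})
```

Expressing the requirement as data rather than code means a different combination, such as running a cleanup job when either upstream job fails, would need only a different rule, not a different evaluator.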

Module 6: Data Science Lab

New Features:
- UV-based virtual environments: UV — a modern, Rust-based Python package manager — is now used in both the UI and backend, delivering faster dependency resolution and notebook start-up.
- Delta Sharing: Consume shared Delta datasets from external providers directly in notebooks, with no duplication.
- Azure Blob connectivity: Native reads and writes against Azure Blob Storage; required JARs are bundled in the runtime.
- CatBoost and XGBoost support: Native support in the DS Notebook runtime and in Explainability Dashboards.
- Widgets integration: Inline controls for parameter tuning, data exploration, and visualization within notebooks.
- Ray Job execution: Run distributed Python workloads on the platform's Ray infrastructure directly from DS Lab workspaces.
- BDB Assist Agent (in Notebooks): AI-guided notebook authoring, code completion, and troubleshooting recommendations, available alongside UV and Ray within DS Lab.
- Workload Isolation via Kubernetes Taints and Tolerations: DS Lab orchestration can now schedule notebook and training workloads onto dedicated nodes — for example, pinning GPU-intensive training to GPU nodes. See Module 5: Jobs for the full description.
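
For readers less familiar with the underlying Kubernetes mechanism, the sketch below reproduces the standard taint and toleration matching rules in plain Python. The matching logic follows Kubernetes semantics; the taint key workload, the value gpu-training, and the function names are hypothetical and are not taken from the platform's configuration.

```python
from typing import Dict, List

Taint = Dict[str, str]       # fields: key, value, effect
Toleration = Dict[str, str]  # fields: key, operator, value, effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Apply the Kubernetes rules for matching one toleration to one taint."""
    # An empty effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key combined with Exists matches every taint key.
        return key == "" or key == taint["key"]
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def can_schedule(tolerations: List[Toleration], taints: List[Taint]) -> bool:
    """A pod may be placed only if every hard taint is tolerated."""
    # PreferNoSchedule is a soft preference and never blocks placement.
    blocking = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in blocking)


# Hypothetical GPU node pool reserved for training workloads.
gpu_node_taints = [{"key": "workload", "value": "gpu-training", "effect": "NoSchedule"}]

training_tolerations = [
    {"key": "workload", "operator": "Equal", "value": "gpu-training", "effect": "NoSchedule"}
]
notebook_tolerations: List[Toleration] = []

training_allowed = can_schedule(training_tolerations, gpu_node_taints)
notebook_allowed = can_schedule(notebook_tolerations, gpu_node_taints)
```

In Kubernetes, a toleration only permits placement on a tainted node; steering a workload to that node additionally requires a node selector or node affinity, so the two are normally configured together.
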
Enhancements:
- Hardened persistent notebook saving: Saves are reliable across project state transitions, so work is not lost in long-running sessions.
- Notebook UI refinements: Cleaner navigation, improved cell controls, and greater visual consistency.
- UI logging for critical errors: Frontend errors are now captured and transmitted for faster diagnosis.
- Angular v16 upgrade: Improved performance and modernized build tooling.
- Python 3.12 upgrade: Latest language features and library compatibility.
- PySpark library updates: Azure, GCS, S3, Kafka, and model-version dependencies updated to the latest supported versions.

Module 7: Data Agent

New Features:
- Semantic-layer-grounded agents: Agents reason over Business Objects, governed metrics, classification tags, and quality signals — not raw schemas.
- Agent Evaluation Framework: Repeatable assessment of agent performance, response quality, and behavioral consistency.
- Data Agent mobile application: Dedicated iOS and Android client with conversational interaction, platform-standard secure authentication, push notifications, graceful handling of offline conditions, and workspace awareness consistent with the web experience.
Enhancements:
- Design system alignment: The Data Agent interface has been aligned with the current platform design system.

General Improvements & Bug Fixes (Cross-Module)

Unified repository access layer: All MongoDB access is now routed through the Platform Layer service, providing a unified access model and stronger governance.
Reduced runtime dependencies: The system pod log-manager dependency has been removed from core Data Engineering.
Security posture hardening: Penetration-test remediations and targeted vulnerability fixes across Catalog and DS Lab.
Observability foundation: Structured logs and Component Monitor enrichment underpin the Proactive Monitoring Dashboard.
