Data Mart

Project Overview

Build a modular and scalable Data Mart within Gaio DataOS that organizes operational data by domain (Customers, Orders, Products, and Order Items) and enables descriptive, operational, and predictive analytics.


Development Stages

1. Validation and Process Trigger

  • The project starts with an automatic data update check.

  • An SQL task creates the control table tmp_tb_control_data_update.

  • Flow:

    • tmp_tb_control_data_update: checks for updates

    • tb_control_data_update: stores execution status

    • If new data is detected, the main extraction flow is triggered automatically.
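The check-then-trigger pattern above can be sketched in Python. In Gaio itself this logic lives in SQL and conditional tasks; the function and field names below are illustrative assumptions, not the platform's API.

```python
def has_new_data(last_loaded_at, source_max_updated_at):
    """Compare the control table's last-load timestamp with the source's
    most recent update; True means the extraction flow should run."""
    return source_max_updated_at > last_loaded_at

def run_control_check(control, source_max_updated_at, extract):
    """control mimics one row of tb_control_data_update, e.g.
    {'last_loaded_at': ..., 'status': ...}; extract is the main flow."""
    if has_new_data(control["last_loaded_at"], source_max_updated_at):
        control["status"] = "running"
        extract()                                   # main extraction flow
        control["last_loaded_at"] = source_max_updated_at
        control["status"] = "done"
    else:
        control["status"] = "skipped"               # no new data; do nothing
    return control
```

The control row doubles as the execution log: its status field records whether the last run extracted data or was skipped.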


2. Data Extraction

  • Data is extracted from database sources, CSV files, or APIs connected to Gaio.

  • Uses Builder tasks for:

    • Data transformation

    • Handling nulls and errors

    • Standardizing formats

  • Intermediate results are stored in temporary tables (tmp_).

  • Final consolidation and versioning are done in final tables (tb_).
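The transformation steps above (null handling, error handling, format standardization, staging for audit) can be sketched as follows. Field names are hypothetical; the point is the clean-or-quarantine pattern before data reaches a tmp_ table.

```python
def clean_record(raw):
    """Standardize one extracted record before staging it in a tmp_ table."""
    cleaned = {}
    cleaned["id"] = int(raw["id"])                        # enforce type
    cleaned["name"] = (raw.get("name") or "").strip().title()
    cleaned["email"] = (raw.get("email") or "unknown").lower()
    return cleaned

def stage(records):
    """Records that fail cleaning are kept aside for audit; the rest
    are returned as the contents of the tmp_ staging table."""
    staged, errors = [], []
    for raw in records:
        try:
            staged.append(clean_record(raw))
        except (KeyError, ValueError, TypeError) as exc:
            errors.append((raw, str(exc)))                # audit trail
    return staged, errors
```

Keeping rejected rows alongside their error messages is what makes the tmp_ layer auditable rather than a silent filter.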


3. Domain-Based Structure

Each domain follows this modular structure:

Source → Staging Builder → Temporary Table (`tmp_`) → Final Builder → Final Table (`tb_`)
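This modular pipeline can be expressed as a simple composition, with each table modeled as a list of rows. The function below is a sketch of the pattern, not Gaio's execution engine.

```python
def run_domain(source_rows, staging_builder, final_builder):
    """Generic domain pipeline:
    Source -> Staging Builder -> tmp_ table -> Final Builder -> tb_ table."""
    tmp_table = [staging_builder(row) for row in source_rows]   # tmp_<domain>
    tb_table = [final_builder(row) for row in tmp_table]        # tb_<domain>
    return tb_table
```

Because every domain follows the same shape, any domain can be rerun independently by calling its pipeline alone.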

3.1 Customers Domain

  • Source: PostgreSQL (customers)

  • Steps:

    • Filter out invalid data

    • Normalize and enrich information

    • Build tmp_customers, then publish to tb_customers
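A sketch of the Customers staging steps, assuming illustrative column names (customer_id, email, city) and a simple validity rule; the real schema and filters live in the Gaio Builder tasks:

```python
def valid_customer(row):
    """Filter step: drop rows missing an id or email (illustrative rule)."""
    return bool(row.get("customer_id")) and bool(row.get("email"))

def enrich_customer(row):
    """Normalize casing and derive an email domain for later segmentation."""
    email = row["email"].strip().lower()
    return {
        "customer_id": row["customer_id"],
        "email": email,
        "email_domain": email.split("@")[-1],
        "city": (row.get("city") or "unknown").title(),
    }

def build_tmp_customers(source_rows):
    """Staging Builder output destined for tmp_customers."""
    return [enrich_customer(r) for r in source_rows if valid_customer(r)]
```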


3.2 Orders Domain

  • Source: PostgreSQL (orders)

  • Steps:

    • Calculate totals, shipping fees, estimated delivery

    • Status and failure normalization

    • tmp_orders → tb_orders
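The Orders calculations can be sketched as below. The free-shipping threshold, fee amount, and column names are assumptions for illustration; only the pattern (derive totals, normalize status) comes from the steps above.

```python
def build_order(row):
    """Compute totals and fees for one order and normalize its status.
    row["items"] is assumed to be a list of (quantity, unit_price) pairs."""
    item_total = sum(qty * price for qty, price in row["items"])
    shipping = 0.0 if item_total >= 100 else 9.90       # assumed fee rule
    return {
        "order_id": row["order_id"],
        "item_total": round(item_total, 2),
        "shipping_fee": shipping,
        "order_total": round(item_total + shipping, 2),
        "status": (row.get("status") or "unknown").lower(),  # normalize
    }
```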


3.3 Products Domain

  • Source: PostgreSQL (products)

  • Steps:

    • Enrich with category and availability

    • Add flags for discontinued products

    • tmp_products → tb_products
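The Products enrichment can be sketched as a lookup plus derived flags; the mapping and field names are hypothetical:

```python
def build_product(row, categories):
    """Enrich a product with its category name and availability flags.
    categories maps category_id -> category name (assumed lookup table)."""
    return {
        "product_id": row["product_id"],
        "category": categories.get(row.get("category_id"), "uncategorized"),
        "available": row.get("stock", 0) > 0,
        "discontinued": row.get("status") == "discontinued",
    }
```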


3.4 Order Items Domain

  • Source: PostgreSQL (order_items)

  • Steps:

    • Clean up nulls and data types

    • Calculate taxes, discounts, unit values

    • tmp_order_items → tb_order_items
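The Order Items steps (null/type cleanup, then tax and discount math) can be sketched as below. The tax rate, discount, and column names are assumed parameters, not values from the project:

```python
def build_order_item(row, tax_rate=0.08, discount=0.0):
    """Clean types and derive discount, tax, and line total for one item."""
    qty = int(row.get("quantity") or 0)                 # null -> 0
    unit_price = float(row.get("unit_price") or 0.0)    # null -> 0.0
    gross = qty * unit_price
    discount_amount = round(gross * discount, 2)
    taxable = gross - discount_amount
    tax = round(taxable * tax_rate, 2)
    return {
        "order_id": row["order_id"],
        "product_id": row["product_id"],
        "quantity": qty,
        "unit_price": unit_price,
        "discount_amount": discount_amount,
        "tax_amount": tax,
        "line_total": round(taxable + tax, 2),
    }
```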


Technologies Used

  • Gaio DataOS as the core platform

  • Visual ETL with Builder, Form, Content, and SQL tasks

  • Temporary Tables (tmp_) for staging and audit

  • Final Tables (tb_) for downstream use

  • Dynamic Parameters for context handling and automation

  • Conditional logic to manage execution flow

  • Ready for integration with dashboards, AI, and automation flows


Expected Outcomes

  • Solid and auditable data processing pipelines

  • Ability to rerun specific domains independently

  • Accelerated dashboard development

  • Reusable components for future projects

  • Improved data trust and governance for business users

  • Ready-to-use structure for forecasting, clustering, and churn models

