Data Mart

Project Overview
Build a modular, scalable Data Mart within Gaio DataOS that organizes operational data by domain (Customers, Orders, Products, and Order Items) and enables descriptive, operational, and predictive analytics.
Development Stages
1. Validation and Process Trigger

The project starts with an automatic data-update check. An SQL task creates the control table `tmp_tb_control_data_update`.
Flow:
- `tmp_tb_control_data_update`: checks for updates
- `tb_control_data_update`: stores the execution status

If new data is detected, the main extraction flow is triggered automatically.
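As a rough sketch, the update check can be read as "compare the latest timestamp seen in the source against the last recorded load." The SQLite stand-in below is illustrative only; the column names (`source_max_date`, `last_loaded_at`) are assumptions, not Gaio's actual schema.

```python
import sqlite3

# Illustrative stand-in for the control-table check; everything beyond the
# two table names from the document is an assumed, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb_control_data_update (last_loaded_at TEXT);
    INSERT INTO tb_control_data_update VALUES ('2024-01-01');

    CREATE TABLE tmp_tb_control_data_update (source_max_date TEXT);
    INSERT INTO tmp_tb_control_data_update VALUES ('2024-02-15');
""")

def has_new_data(conn):
    """True when the source holds data newer than the last recorded load."""
    (src,) = conn.execute(
        "SELECT MAX(source_max_date) FROM tmp_tb_control_data_update").fetchone()
    (last,) = conn.execute(
        "SELECT MAX(last_loaded_at) FROM tb_control_data_update").fetchone()
    return src > last

if has_new_data(conn):
    # Record the run and trigger the main extraction flow.
    conn.execute("INSERT INTO tb_control_data_update VALUES ('2024-02-15')")
print(has_new_data(conn))  # -> False: the status table is now up to date
```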
2. Data Extraction

Data is extracted from database sources, CSV files, or APIs connected to Gaio.
Builder tasks handle:
- Data transformation
- Handling nulls and errors
- Standardizing formats

Intermediate results are stored in temporary tables (`tmp_`). Final consolidation and versioning are done in final tables (`tb_`).
3. Domain-Based Structure
Each domain follows this modular structure:
Source → Staging Builder → Temporary Table (`tmp_`) → Final Builder → Final Table (`tb_`)
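This pipeline shape can be sketched in plain SQL, with SQLite standing in for the actual source and the Builder tasks; `src_data` and its columns are assumed names for illustration.

```python
import sqlite3

# Source -> Staging Builder -> tmp_ -> Final Builder -> tb_, as a minimal
# sketch; the table tb_/tmp_ prefixes follow the document's convention.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_data (id INTEGER, value TEXT)")
conn.executemany("INSERT INTO src_data VALUES (?, ?)",
                 [(1, "  foo "), (2, None), (3, "bar")])

# Staging Builder: clean into a temporary table.
conn.executescript("""
    CREATE TEMP TABLE tmp_data AS
    SELECT id, TRIM(value) AS value
    FROM src_data
    WHERE value IS NOT NULL;      -- handle nulls at the staging step
""")

# Final Builder: consolidate into the final table for downstream use.
conn.executescript("CREATE TABLE tb_data AS SELECT * FROM tmp_data;")

print(conn.execute("SELECT id, value FROM tb_data ORDER BY id").fetchall())
# -> [(1, 'foo'), (3, 'bar')]
```

The same two-step shape repeats in every domain below, which is what makes each domain independently rerunnable.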
3.1 Customers Domain

Source: PostgreSQL (`customers`)
Steps:
- Filter out invalid data
- Normalize and enrich information
- Build `tmp_customers`, then publish to `tb_customers`
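The Customers steps might look like the following, with SQLite standing in for PostgreSQL; the `email`/`city` columns and the `UNKNOWN` fallback are illustrative assumptions.

```python
import sqlite3

# Customers domain sketch: filter invalid rows, normalize, enrich.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (1, " ANA@MAIL.COM ", "sao paulo"),
    (2, None,             "rio"),        # invalid: no email
    (3, "bruno@mail.com", None),
])
conn.executescript("""
    CREATE TEMP TABLE tmp_customers AS
    SELECT id,
           LOWER(TRIM(email))               AS email,   -- normalize
           COALESCE(UPPER(city), 'UNKNOWN') AS city     -- enrich / fill nulls
    FROM customers
    WHERE email IS NOT NULL;                            -- filter invalid rows
    CREATE TABLE tb_customers AS SELECT * FROM tmp_customers;
""")
print(conn.execute("SELECT * FROM tb_customers ORDER BY id").fetchall())
# -> [(1, 'ana@mail.com', 'SAO PAULO'), (3, 'bruno@mail.com', 'UNKNOWN')]
```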
3.2 Orders Domain

Source: PostgreSQL (`orders`)
Steps:
- Calculate totals, shipping fees, and estimated delivery dates
- Normalize statuses and failures
- `tmp_orders` → `tb_orders`
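A sketch of the Orders calculations, under assumed column names and an assumed 7-day delivery estimate (neither is stated by the project):

```python
import sqlite3

# Orders domain sketch; the 7-day delivery window and the columns
# amount/shipping_fee/order_date/status are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders
    (id INTEGER, amount REAL, shipping_fee REAL, order_date TEXT, status TEXT)""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", [
    (1, 100.0, 12.5, "2024-03-01", "Shipped"),
    (2,  50.0, None, "2024-03-02", "canceled"),
])
conn.executescript("""
    CREATE TEMP TABLE tmp_orders AS
    SELECT id,
           amount + COALESCE(shipping_fee, 0) AS total,
           DATE(order_date, '+7 days')        AS estimated_delivery,
           UPPER(status)                      AS status   -- normalize status
    FROM orders;
    CREATE TABLE tb_orders AS SELECT * FROM tmp_orders;
""")
print(conn.execute(
    "SELECT total, estimated_delivery, status FROM tb_orders ORDER BY id").fetchall())
# -> [(112.5, '2024-03-08', 'SHIPPED'), (50.0, '2024-03-09', 'CANCELED')]
```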
3.3 Products Domain

Source: PostgreSQL (`products`)
Steps:
- Enrich with category and availability
- Add flags for discontinued products
- `tmp_products` → `tb_products`
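The Products enrichment could be sketched like this; the `stock` and `discontinued_at` columns are assumptions made for illustration.

```python
import sqlite3

# Products domain sketch: category fallback, availability and
# discontinued flags; source columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE products
    (id INTEGER, name TEXT, category TEXT, stock INTEGER, discontinued_at TEXT)""")
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", [
    (1, "Mouse",   "peripherals", 10, None),
    (2, "Old Fax", None,           0, "2020-06-01"),
])
conn.executescript("""
    CREATE TEMP TABLE tmp_products AS
    SELECT id, name,
           COALESCE(category, 'uncategorized')                     AS category,
           CASE WHEN stock > 0 THEN 1 ELSE 0 END                   AS is_available,
           CASE WHEN discontinued_at IS NOT NULL THEN 1 ELSE 0 END AS is_discontinued
    FROM products;
    CREATE TABLE tb_products AS SELECT * FROM tmp_products;
""")
print(conn.execute(
    "SELECT name, category, is_available, is_discontinued FROM tb_products"
).fetchall())
# -> [('Mouse', 'peripherals', 1, 0), ('Old Fax', 'uncategorized', 0, 1)]
```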
3.4 Order Items Domain

Source: PostgreSQL (`order_items`)
Steps:
- Clean up nulls and data types
- Calculate taxes, discounts, and unit values
- `tmp_order_items` → `tb_order_items`
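A sketch of the Order Items cleanup and calculations; the 8% tax rate, the text-typed source columns, and the default quantity of 1 are all illustrative assumptions.

```python
import sqlite3

# Order Items domain sketch: cast text columns, fill nulls, then derive
# a total with discount and an assumed 8% tax applied.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE order_items
    (order_id INTEGER, qty TEXT, unit_price TEXT, discount REAL)""")
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?, ?)", [
    (1, "2",  "10.00", 0.1),
    (1, None, "5.00",  None),   # nulls to clean up
])
conn.executescript("""
    CREATE TEMP TABLE tmp_order_items AS
    SELECT order_id,
           CAST(COALESCE(qty, '1') AS INTEGER) AS qty,          -- fix type + null
           CAST(unit_price AS REAL)            AS unit_price,   -- fix type
           COALESCE(discount, 0.0)             AS discount,
           ROUND(CAST(unit_price AS REAL)
                 * CAST(COALESCE(qty, '1') AS INTEGER)
                 * (1 - COALESCE(discount, 0.0))
                 * 1.08, 2)                    AS total_with_tax
    FROM order_items;
    CREATE TABLE tb_order_items AS SELECT * FROM tmp_order_items;
""")
print(conn.execute(
    "SELECT qty, unit_price, discount, total_with_tax FROM tb_order_items"
).fetchall())
# -> [(2, 10.0, 0.1, 19.44), (1, 5.0, 0.0, 5.4)]
```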
Technologies Used
- Gaio DataOS as the core platform
- Visual ETL with Builder, Form, Content, and SQL tasks
- Temporary tables (`tmp_`) for staging and audit
- Final tables (`tb_`) for downstream use
- Dynamic parameters for context handling and automation
- Conditional logic to manage execution flow
- Ready for integration with dashboards, AI, and automation flows
Expected Outcomes
- Solid and auditable data processing pipelines
- Ability to rerun specific domains independently
- Accelerated dashboard development
- Reusable components for future projects
- Improved data trust and governance for business users
- Ready-to-use structure for forecasting, clustering, and churn models