Gaio DataOS

Sample


Last updated 3 days ago

The Sample task in Gaio DataOS extracts a random subset of rows from a table, sized either as a percentage of the source or as a fixed row count. It is useful for testing, validation, initial visualizations, and preprocessing in Machine Learning workflows.

This task can only be used when a table is selected in the flow.


How to Use the Sample Task


1. Add the Sample Task to Your Flow

  • In the Studio, go to the Tasks panel.

  • Under the Analytics section, select the Sample task.


2. Configure the Main Fields

  • Task label (optional): a name for this task within your flow (default: sample).

  • Result table: the name of the output table that will contain the sampled data (e.g., sample_sample).


3. Choose the Sampling Type

You can choose between two options:

Percentage

  • Allows you to define the percentage of rows to be sampled from the original table.

  • You can adjust the slider or manually input the value.

  • Example: 0.7 (70%) → returns 70% of the rows from the source table.

Rows

  • Allows you to define a fixed number of rows to extract as a sample.

  • Example: 1,000 → the output table will contain exactly 1,000 randomly selected rows.
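
The difference between the two sampling types can be sketched in plain Python. This is only an illustration of the semantics described above, not Gaio DataOS's implementation: the function names, the `seed` argument, and the rounding of the percentage to a whole number of rows are all assumptions made for the example.

```python
import random


def sample_by_percentage(rows, fraction, seed=None):
    """Return a random subset holding about `fraction` of the rows.

    Hypothetical helper: `fraction` is a value between 0 and 1
    (0.7 means 70%). Rounding to a whole row count is an assumption.
    """
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    k = round(len(rows) * fraction)
    return rng.sample(rows, k)


def sample_by_rows(rows, n, seed=None):
    """Return `n` randomly selected rows, without repeats.

    Hypothetical helper: capping `n` at the table size is an assumption.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    rng = random.Random(seed)
    k = min(n, len(rows))
    return rng.sample(rows, k)


# A made-up source table with 1,000 rows, each row a dict of columns.
source_table = [{"id": i, "value": i * 2} for i in range(1000)]

percentage_sample = sample_by_percentage(source_table, 0.7, seed=42)
rows_sample = sample_by_rows(source_table, 100, seed=42)

print(len(percentage_sample), len(rows_sample))
```

With Percentage, the size of the result scales with the source table; with Rows, it stays fixed no matter how large the source grows.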


4. Save and Execute

  • Once you’ve configured the sample type and value, click Save.

  • Run the flow. A new table is generated based on the selected sample configuration.


Best Practices

  • Use the Sample task to:

    • Reduce dataset size during development or dashboard previews.

    • Create smaller datasets for training ML models.

    • Test queries and transformations without processing the full dataset.

  • Combine with other tasks like AutoML, Cluster, or Scoring to streamline your experimentation and modeling.

All columns from the source table are present in the sampled table. Only the number of rows is smaller.
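
A quick way to confirm this property on your own data is to compare the column sets and row counts of the two tables. The sketch below uses plain Python with made-up rows; `compare_tables` is a hypothetical helper for illustration and is not part of Gaio DataOS.

```python
import random


def compare_tables(source_rows, sampled_rows):
    """Summarize how a sampled table relates to its source.

    Hypothetical helper: each table is a list of dicts, one dict per row.
    """
    source_columns = set(source_rows[0].keys()) if source_rows else set()
    same_columns = all(set(row.keys()) == source_columns for row in sampled_rows)
    return {
        "same_columns": same_columns,
        "source_row_count": len(source_rows),
        "sampled_row_count": len(sampled_rows),
    }


# Made-up source table and a 10-row random sample of it.
source_table = [
    {"id": i, "region": "north", "amount": i * 1.5} for i in range(50)
]
sampled_table = random.Random(0).sample(source_table, 10)

print(compare_tables(source_table, sampled_table))
```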