Sample

The Sample task in Gaio DataOS allows you to extract a subset of data from a table in a simple and controlled way. This functionality is ideal for testing, validation, initial visualizations, or preprocessing in Machine Learning workflows.
This task can only be used when a table is selected in the flow.
How to Use the Sample Task
1. Add the Sample Task to Your Flow
In the Studio, go to the Tasks panel.
Under the Analytics section, select the Sample task.
2. Configure the Main Fields
Task label: (optional) a name for this task within your flow (default: sample).
Result table: name of the output table that will contain the sampled data (e.g., sample_sample).
3. Choose the Sampling Type
You can choose between two options:
Percentage
Allows you to define the percentage of rows to be sampled from the original table.
You can adjust the slider or manually input the value.
Example: 0.7 (70%) → returns 70% of the rows from the source table.
Rows
Allows you to define a fixed number of rows to extract as a sample.
Example: 1,000 → the output table will contain exactly 1,000 randomly selected rows.
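The difference between the two sampling types can be illustrated outside Gaio with a short Python sketch. This is not how Gaio DataOS implements the task: the function names sample_by_percentage and sample_by_rows, the rounding rule, the seed parameter, and the behavior when more rows are requested than exist are all assumptions made for illustration.

```python
import random

def sample_by_percentage(rows, fraction, seed=None):
    """Return about `fraction` of the rows, drawn at random without replacement."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    # Assumption: the row count is rounded to the nearest whole number.
    k = round(len(rows) * fraction)
    return random.Random(seed).sample(rows, k)

def sample_by_rows(rows, n, seed=None):
    """Return `n` rows drawn at random without replacement."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Assumption: asking for more rows than exist returns the whole table.
    k = min(n, len(rows))
    return random.Random(seed).sample(rows, k)

# Hypothetical source table with 10,000 rows.
source = [{"id": i, "value": i * 10} for i in range(10_000)]

pct_sample = sample_by_percentage(source, 0.7, seed=42)  # Percentage: 0.7
row_sample = sample_by_rows(source, 1_000, seed=42)      # Rows: 1,000

print(len(pct_sample), len(row_sample))
```

With a Percentage value the size of the result scales with the source table, while a Rows value gives a fixed size regardless of how large the source is.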
4. Save and Execute
Once you’ve configured the sample type and value, click Save.
Run the flow — a new table will be generated based on the selected sample configuration.
Best Practices
Use the Sample task to:
Reduce dataset size during development or dashboard previews.
Create smaller datasets for training ML models.
Test queries and transformations without processing the full dataset.
Combine with other tasks like AutoML, Cluster, or Scoring to streamline your experimentation and modeling.
All columns from the source table are present in the result table; only the number of rows is smaller.
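If you export both tables and want to confirm this property yourself, a check along the following lines may help. It is a minimal sketch that assumes each table has been loaded as a list of dictionaries; the helper name same_columns_fewer_rows and the example column names are hypothetical and not part of Gaio DataOS.

```python
import random

def same_columns_fewer_rows(source_rows, sample_rows):
    """Check that a sample keeps every source column and has no more rows."""
    source_cols = set(source_rows[0].keys()) if source_rows else set()
    # An empty sample has no rows to inspect, so treat its columns as matching.
    sample_cols = set(sample_rows[0].keys()) if sample_rows else source_cols
    return source_cols == sample_cols and len(sample_rows) <= len(source_rows)

# Hypothetical source table and a 50-row random sample of it.
source = [
    {"customer_id": i, "region": "N" if i % 2 else "S", "revenue": i * 1.5}
    for i in range(200)
]
sample = random.Random(0).sample(source, 50)

print(same_columns_fewer_rows(source, sample))
```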