Sample

The Sample task in Gaio DataOS allows you to extract a subset of data from a table in a simple and controlled way. This functionality is ideal for testing, validation, initial visualizations, or preprocessing in Machine Learning workflows.


How to Use the Sample Task


1. Add the Sample Task to Your Flow

  • In the Studio, go to the Tasks panel.

  • Under the Analytics section, select the Sample task.


2. Configure the Main Fields

  • Task label: (optional) Provide a name for this task within your flow (default: sample).

  • Result table: Name of the output table that will contain the sampled data (e.g., sample_sample).


3. Choose the Sampling Type

You can choose between two options:

Percentage

  • Allows you to define the percentage of rows to be sampled from the original table.

  • You can adjust the slider or manually input the value.

  • Example: 0.7 (70%) → returns 70% of the rows from the source table.

Rows

  • Allows you to define a fixed number of rows to extract as a sample.

  • Example: 1,000 → the output table will contain exactly 1,000 randomly selected rows.
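
To make the difference between the two sampling types concrete, here is a minimal Python sketch of the same idea outside Gaio DataOS. It is an illustration only, not the task's actual implementation: the function name sample_rows, its parameters, the rounding of the percentage to a whole row count, and the cap at the table size are all assumptions made for this example.

```python
import random


def sample_rows(rows, *, fraction=None, count=None, seed=None):
    """Return a random subset of rows, sized by fraction (0 to 1) or by a fixed count.

    Hypothetical helper for illustration; not part of Gaio DataOS.
    """
    if (fraction is None) == (count is None):
        raise ValueError("pass exactly one of fraction or count")

    rng = random.Random(seed)  # a seed makes the sample reproducible

    if fraction is not None:
        if not 0 <= fraction <= 1:
            raise ValueError("fraction must be between 0 and 1")
        # Assumption: the percentage is converted to a rounded row count.
        count = round(len(rows) * fraction)

    # Assumption: never request more rows than the table holds.
    count = min(count, len(rows))

    # Sampling without replacement, so no row appears twice.
    return rng.sample(rows, count)


# A stand-in source table with 10,000 rows.
table = [{"id": i} for i in range(10_000)]

by_percentage = sample_rows(table, fraction=0.7, seed=42)  # "Percentage" mode
by_rows = sample_rows(table, count=1_000, seed=42)         # "Rows" mode

print(len(by_percentage), len(by_rows))
```

In this sketch the Percentage mode scales with the size of the source table, while the Rows mode produces the same output size regardless of how large the source is.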


4. Save and Execute

  • Once you’ve configured the sample type and value, click Save.

  • Run the flow. A new table is generated based on the selected sample configuration.


Best Practices

  • Use the Sample task to:

    • Reduce dataset size during development or dashboard previews.

    • Create smaller datasets for training ML models.

    • Test queries and transformations without processing the full dataset.

  • Combine with other tasks like AutoML, Cluster, or Scoring to streamline your experimentation and modeling.
