Gaio DataOS

Association Rules



Popular on e-commerce sites, Association Rules (also called Basket Analysis) identify relationships between products. The technique makes it possible to suggest, on a specific product page, products that other customers purchased together with it.

The technique has many other applications, such as detecting fraud in tenders (associations between participating companies) and identifying affinities between people.

Gaio uses the mlxtend library to perform the calculations.


How to Configure the Association Rules Task


1. Open the Association Rules Task

  • Click the table you want to analyze. It must contain at least two columns: one identifying the transaction and one identifying the item.

  • In the Studio, go to the Tasks panel.

  • Under the Analytics section, select Association Rules.


2. Fill in the Required Fields

  • Task label: (optional) Name for identifying this step in your flow.

  • Result table: Name of the output table that will store the association rules (e.g., basket_association_rules).

  • Source table: Automatically populated with the selected table (e.g., association_rules).

  • Minimum Support: The minimum share of transactions in which an itemset must appear to be considered (e.g., 0.2 = 20%). Support is the number of transactions in which the products appear together divided by the total number of transactions.

  • Minimum Confidence: The minimum confidence for a rule to be accepted (e.g., 0.8 = 80%). Confidence answers the question: given that product A was sold, what is the probability that product B was also sold?

  • Column ID: Column holding the unique ID of the transaction or user (e.g., order_id).

  • Category: Column containing the item, product, or event to be analyzed (e.g., product_name).

Once these fields are configured, run the task to search for associations.
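To make the two thresholds concrete, the sketch below computes support and confidence for one candidate rule in plain Python. The baskets, the helper functions `support` and `confidence`, and the constants are made up for illustration; this is not Gaio's internal code, only the arithmetic that the Minimum Support and Minimum Confidence fields describe.

```python
# Toy baskets: each set holds the items of one transaction (made-up data).
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter"},
    {"bread", "butter", "jam"},
]

def support(itemset, baskets):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """support(A and B) / support(A): how often B appears when A does."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

MIN_SUPPORT = 0.2      # plays the role of the Minimum Support field
MIN_CONFIDENCE = 0.8   # plays the role of the Minimum Confidence field

pair_support = support({"bread", "butter"}, baskets)
rule_confidence = confidence({"bread"}, {"butter"}, baskets)

# A rule is kept only if it clears both thresholds.
rule_kept = pair_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
print(pair_support, rule_confidence, rule_kept)
```

In this toy data, bread and butter appear together in 3 of 5 baskets (support 0.6), and butter appears in 3 of the 4 baskets that contain bread (confidence 0.75). The rule clears the support threshold but falls short of a 0.8 confidence threshold, so it would be discarded.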


3. What This Task Does

  • Applies association rule mining algorithms (like Apriori).

  • Generates "if-then" style rules that highlight relationships between items.

  • Produces a table containing the most relevant combinations, based on support, confidence, and lift.
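The steps above can be sketched in a few lines of plain Python. This is a simplified, Apriori-style illustration under stated assumptions, not the code Gaio runs: the function names `frequent_itemsets` and `mine_rules`, the toy baskets, and the thresholds are all hypothetical.

```python
from itertools import combinations

def frequent_itemsets(baskets, min_support):
    """Level-wise (Apriori-style) search. Returns {frozenset: support}."""
    n = len(baskets)
    items = sorted({item for basket in baskets for item in basket})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates:
        # Count each candidate and keep those that reach the threshold.
        level = {}
        for cand in candidates:
            sup = sum(cand <= basket for basket in baskets) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        # Join survivors into candidates one item larger, then prune any
        # candidate that has an infrequent subset (the Apriori property).
        size += 1
        joined = {a | b for a in level for b in level if len(a | b) == size}
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent

def mine_rules(frequent, min_confidence):
    """Split each frequent itemset into 'if antecedent then consequent' rules."""
    found = []
    for itemset, sup in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, k)):
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    consequent = itemset - antecedent
                    found.append((set(antecedent), set(consequent), sup, conf))
    return found

# Made-up transactions for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter"},
    {"bread", "butter", "jam"},
]

itemsets = frequent_itemsets(baskets, min_support=0.3)
found_rules = mine_rules(itemsets, min_confidence=0.7)
for antecedent, consequent, sup, conf in found_rules:
    print(sorted(antecedent), "->", sorted(consequent), round(sup, 2), round(conf, 2))
```

The pruning step is what keeps the search tractable: an itemset can only be frequent if all of its subsets are frequent, so most larger combinations are never counted.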


Example Use Case

Given a dataset with:

  • order_id → transaction identifier

  • product_name → purchased items
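With a table in this shape, each transaction spans several rows, one per item. A minimal sketch of how such rows group into baskets before any rule mining happens, using invented sample rows (the order numbers and products are hypothetical):

```python
from collections import defaultdict

# Hypothetical rows from the source table: (order_id, product_name).
rows = [
    (1001, "bread"), (1001, "butter"),
    (1002, "bread"), (1002, "milk"), (1002, "butter"),
    (1003, "milk"),
]

# Group the rows so that each order_id maps to the set of its products.
baskets = defaultdict(set)
for order_id, product_name in rows:
    baskets[order_id].add(product_name)  # repeated items in an order collapse

for order_id, items in baskets.items():
    print(order_id, sorted(items))
```

Orders with a single item, such as 1003 here, cannot contribute to any rule on their own, which is why datasets with several items per transaction work best.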


4. Results

Running the task generates a table containing the associations. Each row represents one association that meets the two configured thresholds (minimum support and minimum confidence).

  • antecedents: The "if" side of the rule. One or more "products" that, when purchased, increase the probability that the products in consequents are also purchased.

  • consequents: The "then" side of the rule. The "products" whose sales are boosted when the antecedents are sold.

The columns with relationship statistics are:

| Indicator | Formula | Range |
| --- | --- | --- |
| support(A->B) | support(A+B) | 0 to 1 |
| confidence(A->B) | support(A+B) / support(A) | 0 to 1 |
| lift(A->B) | confidence(A->B) / support(B) | 0 to infinity |
| leverage(A->B) | support(A->B) - support(A)*support(B) | -1 to 1 |
| conviction(A->B) | [1 - support(B)] / [1 - confidence(A->B)] | 0 to infinity |
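The formulas in the table can be checked by hand for a single rule. The sketch below is an illustration only: `rule_metrics` is a hypothetical helper, and the three support values are invented inputs rather than output from Gaio.

```python
import math

def rule_metrics(support_a, support_b, support_ab):
    """Compute the table's indicators for a rule A -> B from three supports."""
    confidence = support_ab / support_a
    lift = confidence / support_b
    leverage = support_ab - support_a * support_b
    # Conviction divides by (1 - confidence), so a rule that always holds
    # (confidence of 1) is reported as infinite.
    if confidence >= 1:
        conviction = math.inf
    else:
        conviction = (1 - support_b) / (1 - confidence)
    return {
        "support": support_ab,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }

# Invented inputs: A is in 40% of transactions, B in 50%, both in 30%.
metrics = rule_metrics(support_a=0.4, support_b=0.5, support_ab=0.3)
for name, value in metrics.items():
    print(name, round(value, 4))
```

With these inputs the rule has confidence 0.75, lift 1.5, leverage 0.1, and conviction 2.0. A lift above 1 and a positive leverage both indicate that A and B occur together more often than they would if they were independent; a lift below 1 or a negative leverage indicates the opposite.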


Best Practices

  • Use datasets that contain multiple items per transaction (e.g., shopping carts, bundles, user actions).

  • Apply filters or segmentation before running the task to refine rule generation.

  • Visualize results using charts or networks based on rule confidence, support, or lift.
