Association Rules
Association Rules, also known as Basket Analysis, identify relationships between products and are widely used on e-commerce sites. The technique makes it possible to recommend, on a specific product page, products that other customers purchased together with the one being viewed.
The same technique has many other applications, such as identifying fraud in tenders (associations between participating companies) or identifying affinities between people.
Gaio uses to perform the calculations.
To identify associations in your data, click on a table that contains at least two columns: one identifying the transaction and one identifying the item.
In the Studio, go to the Tasks panel.
Under the Analytics section, select Association Rules.
Task label: (optional) Name for identifying this step in your flow.
Result table: Name of the output table that will store the generated association rules (e.g., basket_association_rules).
Source table: Automatically populated with the selected table (e.g., association_rules).
Minimum Support: The minimum share of transactions in which an itemset must appear to be considered (e.g., 0.2 = 20%). For a pair of products, it is the number of transactions containing both products divided by the total number of transactions.
Minimum Confidence: The minimum confidence level for a rule to be accepted (e.g., 0.8 = 80%). Confidence answers the question: given that product A was sold, what is the probability that B was also sold?
Column ID: Column representing the unique ID of the transaction or user (e.g., order_id).
Category: Column containing the item, product, or event to be analyzed (e.g., product_name).
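To make the two thresholds concrete, the following is a minimal sketch in plain Python that computes support and confidence by hand. The toy transactions and the helper names `support` and `confidence` are assumptions made for illustration; they are not part of Gaio.

```python
# Hypothetical toy data: each set holds the items of one transaction (one order_id).
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter"},
    {"bread", "butter", "jam"},
]


def support(itemset, transactions):
    """Share of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support(A and B) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


sup_ab = support({"bread", "butter"}, transactions)
conf_ab = confidence({"bread"}, {"butter"}, transactions)
```

In this toy data, bread and butter appear together in 3 of 5 transactions (support 0.6), and butter appears in 3 of the 4 transactions that contain bread (confidence 0.75). With a minimum support of 0.2 and a minimum confidence of 0.8, the rule bread -> butter would pass the support threshold but be rejected on confidence.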
Having configured this information, the task can be run to search for associations.
Applies association rule mining algorithms (like Apriori).
Generates "if-then" style rules that highlight relationships between items.
Produces a table containing the most relevant combinations, based on support, confidence, and lift.
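As an illustration of what an Apriori-style search does, here is a minimal sketch in plain Python. It is not Gaio's implementation; the function names `apriori` and `generate_rules` and the toy data are assumptions made for this example, and a production library would be far more efficient.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def sup(candidate):
        return sum(1 for t in transactions if candidate <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = sup(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = sup(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    found = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(itemset, r):
                ante = frozenset(ante)
                conf = s / frequent[ante]  # subsets of frequent sets are frequent
                if conf >= min_confidence:
                    found.append((ante, itemset - ante, s, conf))
    return found


# Hypothetical toy data.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter"},
    {"bread", "butter", "jam"},
]
frequent = apriori(transactions, min_support=0.4)
found = generate_rules(frequent, min_confidence=0.7)
```

The pruning step is what keeps the search tractable: an itemset is only counted if all of its smaller subsets already met the minimum support, so infrequent combinations are discarded before they are expanded.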
Given a dataset with:
order_id → transaction identifier
product_name → purchased items
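A table in this layout has one row per item purchased, so rows that share an order_id belong to the same basket. The sketch below shows that grouping in plain Python; the sample rows and the helper name `to_baskets` are hypothetical and only illustrate the expected input shape.

```python
from collections import defaultdict

# Hypothetical rows in (order_id, product_name) form, one row per purchased item.
rows = [
    (1001, "bread"),
    (1001, "butter"),
    (1002, "bread"),
    (1002, "milk"),
    (1002, "butter"),
    (1003, "milk"),
]


def to_baskets(rows):
    """Group item rows by transaction ID into one set of items per transaction."""
    baskets = defaultdict(set)
    for order_id, product_name in rows:
        baskets[order_id].add(product_name)
    return dict(baskets)


baskets = to_baskets(rows)
```

Here the six rows collapse into three baskets, and order 1003 contains a single item, so it cannot contribute to any rule between two products.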
Running the task generates a table containing the associations. Each row represents one rule that satisfies both configured criteria (minimum support and minimum confidence).
antecedents: One or more "products" that, when "purchased", increase the probability that the products in consequents are also purchased.
consequents: The "products" whose purchase becomes more likely when the antecedents are sold.
The columns with relationship statistics are:
| Indicator | Formula | Range |
| --- | --- | --- |
| support(A->B) | support(A+B) | 0 to 1 |
| confidence(A->B) | support(A+B) / support(A) | 0 to 1 |
| lift(A->B) | confidence(A->B) / support(B) | 0 to infinity |
| leverage(A->B) | support(A->B) – support(A)*support(B) | -1 to 1 |
| conviction(A->B) | [1 – support(B)] / [1 – confidence(A->B)] | 0 to infinity |
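The formulas above can be checked with a short calculation. The sketch below derives each indicator from three support values; the function name `rule_metrics` and the input numbers are assumptions chosen for illustration, not output from Gaio.

```python
import math


def rule_metrics(support_a, support_b, support_ab):
    """Compute rule indicators for A -> B from the three support values."""
    confidence = support_ab / support_a
    lift = confidence / support_b
    leverage = support_ab - support_a * support_b
    # Conviction is undefined when confidence is exactly 1; report it as infinity.
    if confidence == 1:
        conviction = math.inf
    else:
        conviction = (1 - support_b) / (1 - confidence)
    return {
        "support": support_ab,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Hypothetical supports: A in 50% of transactions, B in 40%, both in 30%.
metrics = rule_metrics(support_a=0.5, support_b=0.4, support_ab=0.3)
```

With these inputs, confidence is 0.6, lift is 1.5, leverage is 0.1, and conviction is 1.5. A lift above 1 and a positive leverage both indicate that A and B occur together more often than they would if they were independent.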
Use datasets that contain multiple items per transaction (e.g., shopping carts, bundles, user actions).
Apply filters or segmentation before running the task to refine rule generation.
Visualize results using charts or networks based on rule confidence, support, or lift.