Home Database Mysql Tutorial How to use MySQL database for association rule mining?

How to use MySQL database for association rule mining?

Jul 12, 2023 pm 08:06 PM
Instructions mysql database Association rule mining

How to use MySQL database for association rule mining?

Introduction:
Association rule mining is a data mining technology used to discover the correlation between items in the data set. MySQL is a widely used relational database management system with powerful data processing and query functions. This article will introduce how to use MySQL database for association rule mining, including data preparation, association rule mining algorithm, SQL statement implementation, and code examples.

1. Data preparation
Before mining association rules, you first need to prepare a suitable data set. The data set is the basis for association rule mining, which contains the transactions and item sets that need to be mined. In MySQL, data sets can be stored by creating data tables. For example, assuming we want to mine association rules in shopping basket data, we can create a data table named "transactions" to store each customer's shopping records, where each record contains multiple purchases by a customer.

CREATE TABLE transactions (
customer_id INT,
item_id INT
);

Then insert the shopping basket data into the data table:

INSERT INTO transactions (customer_id, item_id) VALUES
(1, 101),
(1, 102),
(1, 103),
(2, 101),
(2, 104),
(3, 102),
(3, 105),
(4, 101),
(4, 103),
(4, 104);

2. Association rule mining algorithm
Common association rule mining algorithms include Apriori algorithm and FP-Growth algorithm. The Apriori algorithm is an iterative algorithm based on candidate sets that discovers frequent item sets and association rules by gradually generating candidate sets and calculating support thresholds. FP-Growth algorithm is a prefix tree-based algorithm that can efficiently mine frequent item sets and association rules. In MySQL, we can use SQL statements to implement these two algorithms.

3. SQL statement implementation

  1. Apriori algorithm
    The Apriori algorithm includes two steps: frequent item set generation and association rule generation. First, generate frequent item sets through the following SQL statement:

SELECT item_id, COUNT(*) AS support
FROM transactions
GROUP BY item_id
HAVING support >= min_support;

Among them, "item_id" is the item in the item set, "support" is the support of the item set, and "min_support" is the set minimum support threshold. This SQL statement will return frequent itemsets that meet the minimum support requirements.

Then, generate association rules through the following SQL statement:

SELECT t1.item_id AS antecedent, t2.item_id AS consequent,

   COUNT(*) / (SELECT COUNT(*) FROM transactions) AS confidence
Copy after login
Copy after login
Copy after login
Copy after login

FROM transactions AS t1, transactions AS t2
WHERE t1.item_id != t2.item_id
GROUP BY t1.item_id, t2.item_id
HAVING confidence >= min_confidence;

Among them, "antecedent" is the preceding item of the rule , "consequent" is the consequent of the rule, "confidence" is the confidence of the rule, and "min_confidence" is the minimum confidence threshold set. This SQL statement will return association rules that meet the minimum confidence requirements.

  1. FP-Growth algorithm
    FP-Growth algorithm mines frequent item sets and association rules by building a prefix tree. In MySQL, the FP-Growth algorithm can be implemented using temporary tables and user-defined variables.

First, create a temporary table to store the frequent itemset of the item:

CREATE TEMPORARY TABLE frequent_items (
item_id INT,
support INT
);

Then, generate frequent itemsets through the following SQL statement:

INSERT INTO frequent_items
SELECT item_id, COUNT(*) AS support
FROM transactions
GROUP BY item_id
HAVING support >= min_support;

Next, create a user-defined variable to store the set of frequent items:

SET @frequent_items = '';

Then, Generate association rules through the following SQL statement:

SELECT t1.item_id AS antecedent, t2.item_id AS consequent,

   COUNT(*) / (SELECT COUNT(*) FROM transactions) AS confidence
Copy after login
Copy after login
Copy after login
Copy after login

FROM transactions AS t1, transactions AS t2
WHERE t1.item_id ! = t2.item_id
AND FIND_IN_SET(t1.item_id, @frequent_items) > 0
AND FIND_IN_SET(t2.item_id, @frequent_items) > 0
GROUP BY t1.item_id, t2.item_id
HAVING confidence >= min_confidence;

Finally, update the user-defined variables through the following SQL statement:

SET @frequent_items = (SELECT GROUP_CONCAT(item_id) FROM frequent_items);

4. Code Example
The following is a code example for association rule mining using MySQL database:

--Create data table
CREATE TABLE transactions (
customer_id INT,
item_id INT
);

--Insert shopping basket data
INSERT INTO transactions (customer_id, item_id) VALUES
(1, 101),
(1, 102),
(1, 103),
(2, 101),
(2, 104),
(3, 102),
(3, 105),
(4, 101) ,
(4, 103),
(4, 104);

-- Apriori algorithm
-- Generate frequent item sets
SELECT item_id, COUNT(*) AS support
FROM transactions
GROUP BY item_id
HAVING support >= 2;

-- Generate association rules
SELECT t1.item_id AS antecedent, t2.item_id AS consequent,

   COUNT(*) / (SELECT COUNT(*) FROM transactions) AS confidence
Copy after login
Copy after login
Copy after login
Copy after login

FROM transactions AS t1, transactions AS t2
WHERE t1.item_id != t2.item_id
GROUP BY t1.item_id, t2.item_id
HAVING confidence >= 0.5;

-- FP-Growth algorithm
-- Create temporary table
CREATE TEMPORARY TABLE frequent_items (
item_id INT,
support INT
);

-- Generate frequent itemsets
INSERT INTO frequent_items
SELECT item_id, COUNT(*) AS support
FROM transactions
GROUP BY item_id
HAVING support >= 2;

--Create user-defined variables
SET @frequent_items = '';

--Generate association rules
SELECT t1.item_id AS antecedent, t2.item_id AS consequent,

   COUNT(*) / (SELECT COUNT(*) FROM transactions) AS confidence
Copy after login
Copy after login
Copy after login
Copy after login

FROM transactions AS t1, transactions AS t2
WHERE t1.item_id != t2.item_id
AND FIND_IN_SET(t1.item_id, @frequent_items) > 0
AND FIND_IN_SET(t2.item_id, @frequent_items) > 0
GROUP BY t1.item_id, t2.item_id
HAVING confidence >= 0.5;

Conclusion:
Through the introduction of this article, we understand how to use MySQL Database for association rule mining. Both the Apriori algorithm and the FP-Growth algorithm can be implemented through SQL statements. I hope this article will be helpful to you when using MySQL for association rule mining.

The above is the detailed content of How to use MySQL database for association rule mining?. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Repo: How To Revive Teammates
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island Adventure: How To Get Giant Seeds
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How to use DirectX repair tool? Detailed usage of DirectX repair tool How to use DirectX repair tool? Detailed usage of DirectX repair tool Mar 15, 2024 am 08:31 AM

How to use DirectX repair tool? Detailed usage of DirectX repair tool

Introduction to HTTP 525 status code: explore its definition and application Introduction to HTTP 525 status code: explore its definition and application Feb 18, 2024 pm 10:12 PM

Introduction to HTTP 525 status code: explore its definition and application

How to use Baidu Netdisk-How to use Baidu Netdisk How to use Baidu Netdisk-How to use Baidu Netdisk Mar 04, 2024 pm 09:28 PM

How to use Baidu Netdisk-How to use Baidu Netdisk

Learn to copy and paste quickly Learn to copy and paste quickly Feb 18, 2024 pm 03:25 PM

Learn to copy and paste quickly

What is the KMS activation tool? How to use the KMS activation tool? How to use KMS activation tool? What is the KMS activation tool? How to use the KMS activation tool? How to use KMS activation tool? Mar 18, 2024 am 11:07 AM

What is the KMS activation tool? How to use the KMS activation tool? How to use KMS activation tool?

How to use Xiaoma win7 activation tool - How to use Xiaoma win7 activation tool How to use Xiaoma win7 activation tool - How to use Xiaoma win7 activation tool Mar 04, 2024 pm 06:16 PM

How to use Xiaoma win7 activation tool - How to use Xiaoma win7 activation tool

What is PyCharm? Function introduction and detailed explanation of usage What is PyCharm? Function introduction and detailed explanation of usage Feb 20, 2024 am 09:21 AM

What is PyCharm? Function introduction and detailed explanation of usage

How to correctly use the win10 command prompt for automatic repair operations How to correctly use the win10 command prompt for automatic repair operations Dec 30, 2023 pm 03:17 PM

How to correctly use the win10 command prompt for automatic repair operations

See all articles