How can I Calculate String Similarity Percentage in MySQL using Levenshtein Distance?

Patricia Arquette

Release: 2024-12-13 05:48:12

Computing String Similarity in MySQL

Comparing text strings for similarity is a common requirement in database systems such as MySQL. This article shows how to calculate a similarity percentage between two strings with two user-defined stored functions built on the Levenshtein distance.

Calculating String Similarity Using Levenshtein Distance

The Levenshtein distance is the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into another. A smaller distance means the strings resemble each other more closely, so the distance has to be normalized and inverted to obtain a similarity score.

MySQL has no built-in LEVENSHTEIN() function, so the distance is computed by a user-defined stored function (see MySQL Implementation). Given that distance, the similarity percentage follows from this formula:

Similarity Percentage = (1 - (Levenshtein Distance / Length of Longest String)) * 100
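
As a worked instance of the formula, take the textbook pair 'kitten' and 'sitting' (chosen here for illustration, not taken from the article). Turning one into the other takes 3 edits (substitute k with s, substitute e with i, append g), and the longer string has 7 characters. The arithmetic can be checked in MySQL without any user-defined function:

```sql
-- distance = 3 and longest length = 7 are worked out by hand for 'kitten' / 'sitting'
SELECT ROUND((1 - 3 / 7) * 100) AS similarity_percentage;  -- 57
```

Rounding to a whole number is an assumption here; it matches the integer return type that the article declares for LEVENSHTEIN_RATIO().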

MySQL Implementation

To implement this approach in MySQL, create the following two functions:

LEVENSHTEIN() Function:

CREATE FUNCTION `LEVENSHTEIN`(s1 TEXT, s2 TEXT) RETURNS INT
DETERMINISTIC
BEGIN
    # ... Function implementation ...
END;
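
The article elides the function body. One widely circulated way to fill it in, shown here as a sketch rather than the author's code, keeps only two rows of the edit-distance matrix and packs each cell into one byte of a VARBINARY value. That packing is why this version narrows the parameters to VARCHAR(255), an assumption that departs from the TEXT signature above. The DELIMITER lines are directives for the mysql command-line client; other clients may not need them.

```sql
DELIMITER $$

-- Sketch only: inputs are limited to 255 characters because each
-- matrix cell is stored in a single byte.
CREATE FUNCTION LEVENSHTEIN(s1 VARCHAR(255), s2 VARCHAR(255)) RETURNS INT
DETERMINISTIC
BEGIN
    DECLARE s1_len, s2_len, i, j, c, c_temp, cost INT;
    DECLARE s1_char VARCHAR(1);
    DECLARE cv0, cv1 VARBINARY(256);  -- current row and previous row

    IF s1 IS NULL OR s2 IS NULL THEN
        RETURN NULL;
    END IF;

    SET s1_len = CHAR_LENGTH(s1), s2_len = CHAR_LENGTH(s2);
    IF s1_len = 0 THEN
        RETURN s2_len;
    ELSEIF s2_len = 0 THEN
        RETURN s1_len;
    END IF;

    -- Row 0: building the first j characters of s2 from nothing costs j insertions.
    SET cv1 = 0x00, j = 1;
    WHILE j <= s2_len DO
        SET cv1 = CONCAT(cv1, UNHEX(LPAD(HEX(j), 2, '0'))), j = j + 1;
    END WHILE;

    SET i = 1;
    WHILE i <= s1_len DO
        SET s1_char = SUBSTRING(s1, i, 1);
        -- Column 0 of row i: deleting the first i characters of s1 costs i.
        SET c = i, cv0 = UNHEX(LPAD(HEX(i), 2, '0')), j = 1;
        WHILE j <= s2_len DO
            -- Insertion: cell to the left + 1.
            SET c = c + 1;
            -- The comparison follows the collation of the arguments.
            IF s1_char = SUBSTRING(s2, j, 1) THEN SET cost = 0; ELSE SET cost = 1; END IF;
            -- Substitution or match: diagonal cell + cost.
            SET c_temp = CONV(HEX(SUBSTRING(cv1, j, 1)), 16, 10) + cost;
            IF c > c_temp THEN SET c = c_temp; END IF;
            -- Deletion: cell above + 1.
            SET c_temp = CONV(HEX(SUBSTRING(cv1, j + 1, 1)), 16, 10) + 1;
            IF c > c_temp THEN SET c = c_temp; END IF;
            SET cv0 = CONCAT(cv0, UNHEX(LPAD(HEX(c), 2, '0'))), j = j + 1;
        END WHILE;
        SET cv1 = cv0, i = i + 1;
    END WHILE;

    RETURN c;
END$$

DELIMITER ;
```

Because characters are compared with =, the result depends on the collation in effect: under a case-insensitive collation 'S' and 's' count as the same character, while a binary collation treats them as different.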

LEVENSHTEIN_RATIO() Function:

CREATE FUNCTION `LEVENSHTEIN_RATIO`(s1 TEXT, s2 TEXT) RETURNS INT
DETERMINISTIC
BEGIN
    # ... Function implementation ...
END;
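
The body is elided here as well. A minimal sketch that applies the similarity formula directly might look like the following. It assumes a LEVENSHTEIN() function already exists, uses VARCHAR(255) parameters (an assumption, in case that function limits its inputs to 255 characters), and treats two empty strings as 100% similar; none of these choices is confirmed by the article.

```sql
DELIMITER $$

-- Sketch only: relies on an existing LEVENSHTEIN(s1, s2) stored function.
CREATE FUNCTION LEVENSHTEIN_RATIO(s1 VARCHAR(255), s2 VARCHAR(255)) RETURNS INT
DETERMINISTIC
BEGIN
    DECLARE max_len INT;

    -- Length of the longer string, counted in characters rather than bytes.
    SET max_len = GREATEST(CHAR_LENGTH(s1), CHAR_LENGTH(s2));

    -- Two empty strings are identical; returning early also avoids dividing by zero.
    IF max_len = 0 THEN
        RETURN 100;
    END IF;

    -- Similarity Percentage = (1 - distance / longest length) * 100, rounded.
    RETURN ROUND((1 - LEVENSHTEIN(s1, s2) / max_len) * 100);
END$$

DELIMITER ;
```

CHAR_LENGTH() is used instead of LENGTH() so that multi-byte characters count once each. If either argument is NULL, the function returns NULL.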

Example Usage

Consider the following two strings:

SET @a = 'Welcome to Stack Overflow';
SET @b = 'Hello to stack overflow';

The query to calculate the similarity percentage between @a and @b would be:

SELECT LEVENSHTEIN_RATIO(@a, @b) AS SimilarityPercentage;

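The exact value depends on the function bodies (elided above) and on whether the comparison is case-sensitive. Working the formula by hand: the longer string has 25 characters, and a case-sensitive comparison needs 6 edits, so the result is (1 - 6 / 25) * 100 = 76. Under a case-insensitive collation, "Stack"/"stack" and "Overflow"/"overflow" match, leaving 4 edits and a result of 84.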
