How can I Calculate String Similarity Percentage in MySQL using Levenshtein Distance?

Patricia Arquette
Release: 2024-12-13 05:48:12

Computing String Similarity in MySQL

Comparing text strings for similarity is a common requirement in database systems like MySQL. This article shows how to calculate a similarity percentage between two strings using user-defined MySQL functions based on the Levenshtein distance.

Calculating String Similarity Using Levenshtein Distance

The Levenshtein distance is the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into another. A smaller distance means the strings resemble each other more closely, so a similarity score derived from it rises as the distance falls.

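MySQL does not ship a built-in LEVENSHTEIN() function; it has to be created as a stored function, as outlined under MySQL Implementation. Once such a function returns the Levenshtein distance between two strings, the similarity percentage can be obtained with the following formula: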

Similarity Percentage = (1 - (Levenshtein Distance / Length of Longest String)) * 100
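As a quick check of the formula, the arithmetic can be run with built-in functions only. The sketch below assumes the well-known pair 'kitten' and 'sitting', whose Levenshtein distance is 3 (two substitutions and one insertion), and hard-codes that distance rather than computing it. The column alias similarity_pct is illustrative.

```sql
-- The distance (3) is hard-coded as an assumption; no LEVENSHTEIN() function is needed here.
SELECT ROUND(
    (1 - 3 / GREATEST(CHAR_LENGTH('kitten'), CHAR_LENGTH('sitting'))) * 100
) AS similarity_pct;
-- (1 - 3/7) * 100 = 57.14..., which rounds to 57
```

CHAR_LENGTH() counts characters rather than bytes, which keeps the denominator consistent with an edit distance measured in characters when the strings contain multi-byte text.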

MySQL Implementation

To implement this approach in MySQL, create the following two functions:

LEVENSHTEIN() Function:

CREATE FUNCTION `LEVENSHTEIN`(s1 TEXT, s2 TEXT) RETURNS INT(11)
DETERMINISTIC
BEGIN
    # ... Function implementation ...
END;
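The function body is elided above. Purely as an illustration, and not necessarily the implementation the article had in mind, the following is a minimal sketch of the classic two-row dynamic-programming algorithm. MySQL stored functions have no arrays, so this sketch assumes each row can be stored as a string of 6-digit, zero-padded cells; the variable names and that encoding are choices made here. The DELIMITER lines are needed in the mysql command-line client so that the semicolons inside the body do not end the statement early.

```sql
DELIMITER $$

CREATE FUNCTION LEVENSHTEIN(s1 TEXT, s2 TEXT) RETURNS INT
DETERMINISTIC
BEGIN
    DECLARE len1 INT DEFAULT CHAR_LENGTH(s1);
    DECLARE len2 INT DEFAULT CHAR_LENGTH(s2);
    DECLARE i, j, cost, cell, diag, up INT DEFAULT 0;
    -- Each row is a string of 6-digit zero-padded cells (an encoding assumed for this sketch).
    DECLARE prev_row, cur_row LONGTEXT DEFAULT '';

    IF s1 IS NULL OR s2 IS NULL THEN
        RETURN NULL;
    END IF;

    -- Row 0: turning the empty prefix of s1 into the first j characters of s2 costs j insertions.
    WHILE j <= len2 DO
        SET prev_row = CONCAT(prev_row, LPAD(j, 6, '0'));
        SET j = j + 1;
    END WHILE;

    SET i = 1;
    WHILE i <= len1 DO
        SET cur_row = LPAD(i, 6, '0');  -- cell 0: i deletions
        SET cell = i;                   -- value of the cell to the left
        SET j = 1;
        WHILE j <= len2 DO
            -- Character comparison follows the collation of the parameters.
            SET cost = IF(SUBSTRING(s1, i, 1) = SUBSTRING(s2, j, 1), 0, 1);
            SET diag = CAST(SUBSTRING(prev_row, (j - 1) * 6 + 1, 6) AS UNSIGNED);
            SET up   = CAST(SUBSTRING(prev_row, j * 6 + 1, 6) AS UNSIGNED);
            -- Cheapest of insertion (left), deletion (up), and substitution or match (diagonal).
            SET cell = LEAST(cell + 1, up + 1, diag + cost);
            SET cur_row = CONCAT(cur_row, LPAD(cell, 6, '0'));
            SET j = j + 1;
        END WHILE;
        SET prev_row = cur_row;
        SET i = i + 1;
    END WHILE;

    -- The last cell of the final row holds the distance.
    RETURN CAST(SUBSTRING(prev_row, len2 * 6 + 1, 6) AS UNSIGNED);
END$$

DELIMITER ;
```

With this sketch, SELECT LEVENSHTEIN('kitten', 'sitting'); should return 3. The string-based row keeps the code short but is slow on long inputs, so it is better suited to short values such as names or titles.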

LEVENSHTEIN_RATIO() Function:

CREATE FUNCTION `LEVENSHTEIN_RATIO`(s1 TEXT, s2 TEXT) RETURNS INT(11)
DETERMINISTIC
BEGIN
    # ... Function implementation ...
END;
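This body is elided as well. One way it could be filled in, assuming a LEVENSHTEIN(s1, s2) function that returns the edit distance already exists, is to apply the similarity formula directly. The handling of NULL and of two empty strings below is an assumption of this sketch, not something the article specifies.

```sql
DELIMITER $$

CREATE FUNCTION LEVENSHTEIN_RATIO(s1 TEXT, s2 TEXT) RETURNS INT
DETERMINISTIC
BEGIN
    -- GREATEST() returns NULL if either length is NULL.
    DECLARE max_len INT DEFAULT GREATEST(CHAR_LENGTH(s1), CHAR_LENGTH(s2));

    IF max_len IS NULL THEN
        RETURN NULL;   -- assumption: NULL input gives NULL output
    END IF;
    IF max_len = 0 THEN
        RETURN 100;    -- assumption: two empty strings count as identical
    END IF;

    -- (1 - distance / length of longest string) * 100, rounded to a whole percent
    RETURN ROUND((1 - LEVENSHTEIN(s1, s2) / max_len) * 100);
END$$

DELIMITER ;
```

Returning an INT discards the fractional part of the percentage; a DECIMAL return type would preserve it if finer ranking is needed.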

Example Usage

Consider the following two example strings:

SET @a = "Welcome to Stack Overflow";
SET @b = "Hello to stack overflow";

The query to calculate the similarity percentage between @a and @b would be:

SELECT LEVENSHTEIN_RATIO(@a, @b) AS SimilarityPercentage;

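The value returned depends on how characters are compared. The longer string has 25 characters. Under a case-insensitive collation, the Levenshtein distance is 4 (the edits that turn "Welcome" into "Hello"), so the formula gives (1 - 4/25) * 100 = 84, indicating an 84% similarity between the two strings. Under a case-sensitive comparison, the differing capitals in "Stack" and "Overflow" raise the distance to 6 and the result drops to 76.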
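The same function can also score a table column against a search term. The sketch below is illustrative only: the customers table, its name column, the @needle variable, and the threshold of 80 are all hypothetical and should be adapted to the actual schema.

```sql
-- Hypothetical table, column, and threshold; adjust to your schema.
SET @needle = 'Jon Smith';

SELECT scored.name, scored.similarity
FROM (
    SELECT name, LEVENSHTEIN_RATIO(name, @needle) AS similarity
    FROM customers
) AS scored
WHERE scored.similarity >= 80
ORDER BY scored.similarity DESC
LIMIT 10;
```

Because the function must be evaluated for every row, a query like this cannot use an index on the column and scans the whole table, so it is best applied to small tables or to a set of rows already narrowed down by other conditions.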

source:php.cn