Table of Contents
Introducing the
Use full-text index
Test full text index
两种全文索引
总结
Home Database Mysql Tutorial What is mysql full text index

What is mysql full text index

Apr 23, 2023 pm 07:03 PM
mysql

In mysql, full-text indexing is a technology to find any information in the entire book or entire article stored in the database. Most of the queries we need can be completed through numerical comparison, range filtering, etc. However, if you want to filter the query through keyword matching, you need a query based on similarity instead of the original precise numerical comparison; and Full-text indexing is designed for this scenario.

What is mysql full text index

The operating environment of this tutorial: windows7 system, mysql8 version, Dell G3 computer.

Introducing the


## concept

Full-text index (Full-Text Search) is a technology to find any information in the entire book or entire article stored in the database. It can obtain information about chapters, sections, paragraphs, sentences, words, etc. in the full text as needed, and can also perform various statistics and analysis. Full-text indexing is generally implemented through inverted indexing.

Most of the queries we need can be completed through numerical comparison, range filtering, etc. However, if you want to filter queries through keyword matching, you need to query based on similarity instead of the original precise numerical comparison. Full-text indexing is designed for this scenario.

You may say, you can use like % to achieve fuzzy matching, why do you need full-text indexing? like % is suitable when the text is relatively small, but it is unimaginable for retrieval of a large amount of text data. Full-text indexing can be N times faster than like % in the face of a large amount of data. The speed is not an order of magnitude, but full-text indexing may have accuracy issues.

You may not have paid attention to full-text indexing, but you should be familiar with at least one full-text indexing technology: various search engines. Although the index objects of search engines are extremely large amounts of data, and usually there is not a relational database behind them, the basic principles of full-text indexing are the same.

Version support

Before we begin, let’s talk about the full-text index version, storage engine, and data type support

    Before MySQL 5.6, only the MyISAM storage engine supports full-text index;
  1. In MySQL 5.6 and later versions, both MyISAM and InnoDB storage engines support full-text index;
  2. Only fields Full-text indexes can be built only if the data types are char, varchar, text and their series.
When testing or using full-text index, you must first check whether your MySQL version, storage engine and data type support full-text index.

Operation of full-text index


The operations of the index can be found in any search, but I will repeat them here.


Create

    Create a full-text index when creating a table
  1. create table fulltext_test (
        id int(11) NOT NULL AUTO_INCREMENT,
        content text NOT NULL,
        tag varchar(255),    PRIMARY KEY (id),
        FULLTEXT KEY content_tag_fulltext(content,tag)  // 创建联合全文索引列
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
    Create a full-text index on an existing table
  1. create fulltext index content_tag_fulltext    on fulltext_test(content,tag);
    Create a full-text index through the SQL statement ALTER TABLE
  1. alter table fulltext_test    add fulltext index content_tag_fulltext(content,tag);

Modify

Modify O, delete and rebuild directly.

Delete

    Use DROP INDEX directly to delete the full-text index
  1. drop index content_tag_fulltext    on fulltext_test;
    Through SQL Statement ALTER TABLE deletes full-text index
  1. alter table fulltext_test    drop index content_tag_fulltext;

Use full-text index


Different from commonly used fuzzy matching like %, full-text index has its own syntax Format, use match and against keywords, such as


select * from fulltext_test 
    where match(content,tag) against('xxx xxx');

Note: The columns specified in the match() function must be exactly the same as the columns specified in the full-text index, otherwise it will An error is reported and the full-text index cannot be used. This is because the full-text index does not record which column the keyword comes from. If you want to use a full-text index for a column, create a separate full-text index for that column.

Test full text index


Add test data

Yes With the above knowledge, you can test the full-text index.

First create the test table and insert the test data

create table test (
    id int(11) unsigned not null auto_increment,
    content text not null,    primary key(id),
    fulltext key content_index(content)
) engine=MyISAM default charset=utf8;insert into test (content) values ('a'),('b'),('c');insert into test (content) values ('aa'),('bb'),('cc');insert into test (content) values ('aaa'),('bbb'),('ccc');insert into test (content) values ('aaaa'),('bbbb'),('cccc');

Execute the following query according to the syntax of the full-text index

select * from test where match(content) against('a');select * from test where match(content) against('aa');select * from test where match(content) against('aaa');

According to our inertial thinking, 4 records should be displayed. Yes, but the result is that there is no record. Only when the following query is executed,

select * from test where match(content) against('aaaa');

will find the

aaaa record.

Why? There are many reasons for this problem, the most common of which is

Minimum search length. In addition, when using full-text index, there must be at least 4 records in the test table, otherwise, unexpected results will occur.

The full-text index in MySQL has two variables, the minimum search length and the maximum search length. Words whose length is less than the minimum search length and greater than the maximum search length will not be indexed. In layman's terms, if you want to use full-text index search for a word, the length of the word must be within the range of the above two variables.

The default values ​​of these two can be viewed using the following command

show variables like '%ft%';

可以看到这两个变量在 MyISAM 和 InnoDB 两种存储引擎下的变量名和默认值

// MyISAM
ft_min_word_len = 4;
ft_max_word_len = 84;

// InnoDB
innodb_ft_min_token_size = 3;
innodb_ft_max_token_size = 84;

可以看到最小搜索长度 MyISAM 引擎下默认是 4,InnoDB 引擎下是 3,也即,MySQL 的全文索引只会对长度大于等于 4 或者 3 的词语建立索引,而刚刚搜索的只有 aaaa 的长度大于等于 4。

配置最小搜索长度

全文索引的相关参数都无法进行动态修改,必须通过修改 MySQL 的配置文件来完成。修改最小搜索长度的值为 1,首先打开 MySQL 的配置文件 /etc/my.cnf,在 [mysqld] 的下面追加以下内容

[mysqld]innodb_ft_min_token_size = 1ft_min_word_len = 1

然后重启 MySQL 服务器,并修复全文索引。注意,修改完参数以后,一定要修复下索引,不然参数不会生效。

两种修复方式,可以使用下面的命令修复

repair table test quick;

或者直接删掉重新建立索引,再次执行上面的查询,a、aa、aaa 就都可以查出来了。

但是,这里还有一个问题,搜索关键字 a 时,为什么 aa、aaa、aaaa 没有出现结果中,讲这个问题之前,先说说两种全文索引。

两种全文索引


自然语言的全文索引

默认情况下,或者使用 in natural language mode 修饰符时,match() 函数对文本集合执行自然语言搜索,上面的例子都是自然语言的全文索引。

自然语言搜索引擎将计算每一个文档对象和查询的相关度。这里,相关度是基于匹配的关键词的个数,以及关键词在文档中出现的次数。在整个索引中出现次数越少的词语,匹配时的相关度就越高。相反,非常常见的单词将不会被搜索,如果一个词语的在超过 50% 的记录中都出现了,那么自然语言的搜索将不会搜索这类词语。上面提到的,测试表中必须有 4 条以上的记录,就是这个原因。

这个机制也比较好理解,比如说,一个数据表存储的是一篇篇的文章,文章中的常见词、语气词等等,出现的肯定比较多,搜索这些词语就没什么意义了,需要搜索的是那些文章中有特殊意义的词,这样才能把文章区分开。

布尔全文索引

在布尔搜索中,我们可以在查询中自定义某个被搜索的词语的相关性,当编写一个布尔搜索查询时,可以通过一些前缀修饰符来定制搜索。

MySQL 内置的修饰符,上面查询最小搜索长度时,搜索结果 ft_boolean_syntax 变量的值就是内置的修饰符,下面简单解释几个,更多修饰符的作用可以查手册

  • + 必须包含该词
  • - 必须不包含该词
  • > 提高该词的相关性,查询的结果靠前
  • < 降低该词的相关性,查询的结果靠后
  • (*)星号 通配符,只能接在词后面

对于上面提到的问题,可以使用布尔全文索引查询来解决,使用下面的命令,a、aa、aaa、aaaa 就都被查询出来了。

select * test where match(content) against(&#39;a*&#39; in boolean mode);

总结


好了,差不多写完了,又到了总结的时候。

MySQL 的全文索引最开始仅支持英语,因为英语的词与词之间有空格,使用空格作为分词的分隔符是很方便的。亚洲文字,比如汉语、日语、汉语等,是没有空格的,这就造成了一定的限制。不过 MySQL 5.7.6 开始,引入了一个 ngram 全文分析器来解决这个问题,并且对 MyISAM 和 InnoDB 引擎都有效。

事实上,MyISAM 存储引擎对全文索引的支持有很多的限制,例如表级别锁对性能的影响、数据文件的崩溃、崩溃后的恢复等,这使得 MyISAM 的全文索引对于很多的应用场景并不适合。所以,多数情况下的建议是使用别的解决方案,例如 Sphinx、Lucene 等等第三方的插件,亦或是使用 InnoDB 存储引擎的全文索引。

几个注意点

  1. 使用全文索引前,搞清楚版本支持情况;
  2. 全文索引比 like + % 快 N 倍,但是可能存在精度问题;
  3. 如果需要全文索引的是大量数据,建议先添加数据,再创建索引;
  4. 对于中文,可以使用 MySQL 5.7.6 之后的版本,或者第三方插件。

【相关推荐:mysql视频教程

The above is the detailed content of What is mysql full text index. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undress AI Tool

Undress AI Tool

Undress images for free

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

PHP Tutorial
1503
276
How to use PHP to develop a Q&A community platform Detailed explanation of PHP interactive community monetization model How to use PHP to develop a Q&A community platform Detailed explanation of PHP interactive community monetization model Jul 23, 2025 pm 07:21 PM

1. The first choice for the Laravel MySQL Vue/React combination in the PHP development question and answer community is the first choice for Laravel MySQL Vue/React combination, due to its maturity in the ecosystem and high development efficiency; 2. High performance requires dependence on cache (Redis), database optimization, CDN and asynchronous queues; 3. Security must be done with input filtering, CSRF protection, HTTPS, password encryption and permission control; 4. Money optional advertising, member subscription, rewards, commissions, knowledge payment and other models, the core is to match community tone and user needs.

How to set environment variables in PHP environment Description of adding PHP running environment variables How to set environment variables in PHP environment Description of adding PHP running environment variables Jul 25, 2025 pm 08:33 PM

There are three main ways to set environment variables in PHP: 1. Global configuration through php.ini; 2. Passed through a web server (such as SetEnv of Apache or fastcgi_param of Nginx); 3. Use putenv() function in PHP scripts. Among them, php.ini is suitable for global and infrequently changing configurations, web server configuration is suitable for scenarios that need to be isolated, and putenv() is suitable for temporary variables. Persistence policies include configuration files (such as php.ini or web server configuration), .env files are loaded with dotenv library, and dynamic injection of variables in CI/CD processes. Security management sensitive information should be avoided hard-coded, and it is recommended to use.en

How to use PHP to develop product recommendation module PHP recommendation algorithm and user behavior analysis How to use PHP to develop product recommendation module PHP recommendation algorithm and user behavior analysis Jul 23, 2025 pm 07:00 PM

To collect user behavior data, you need to record browsing, search, purchase and other information into the database through PHP, and clean and analyze it to explore interest preferences; 2. The selection of recommendation algorithms should be determined based on data characteristics: based on content, collaborative filtering, rules or mixed recommendations; 3. Collaborative filtering can be implemented in PHP to calculate user cosine similarity, select K nearest neighbors, weighted prediction scores and recommend high-scoring products; 4. Performance evaluation uses accuracy, recall, F1 value and CTR, conversion rate and verify the effect through A/B tests; 5. Cold start problems can be alleviated through product attributes, user registration information, popular recommendations and expert evaluations; 6. Performance optimization methods include cached recommendation results, asynchronous processing, distributed computing and SQL query optimization, thereby improving recommendation efficiency and user experience.

How to build an online customer service robot with PHP. PHP intelligent customer service implementation technology How to build an online customer service robot with PHP. PHP intelligent customer service implementation technology Jul 25, 2025 pm 06:57 PM

PHP plays the role of connector and brain center in intelligent customer service, responsible for connecting front-end input, database storage and external AI services; 2. When implementing it, it is necessary to build a multi-layer architecture: the front-end receives user messages, the PHP back-end preprocesses and routes requests, first matches the local knowledge base, and misses, call external AI services such as OpenAI or Dialogflow to obtain intelligent reply; 3. Session management is written to MySQL and other databases by PHP to ensure context continuity; 4. Integrated AI services need to use Guzzle to send HTTP requests, safely store APIKeys, and do a good job of error handling and response analysis; 5. Database design must include sessions, messages, knowledge bases, and user tables, reasonably build indexes, ensure security and performance, and support robot memory

How to develop AI intelligent form system with PHP PHP intelligent form design and analysis How to develop AI intelligent form system with PHP PHP intelligent form design and analysis Jul 25, 2025 pm 05:54 PM

When choosing a suitable PHP framework, you need to consider comprehensively according to project needs: Laravel is suitable for rapid development and provides EloquentORM and Blade template engines, which are convenient for database operation and dynamic form rendering; Symfony is more flexible and suitable for complex systems; CodeIgniter is lightweight and suitable for simple applications with high performance requirements. 2. To ensure the accuracy of AI models, we need to start with high-quality data training, reasonable selection of evaluation indicators (such as accuracy, recall, F1 value), regular performance evaluation and model tuning, and ensure code quality through unit testing and integration testing, while continuously monitoring the input data to prevent data drift. 3. Many measures are required to protect user privacy: encrypt and store sensitive data (such as AES

How to make PHP container support automatic construction? Continuously integrated CI configuration method of PHP environment How to make PHP container support automatic construction? Continuously integrated CI configuration method of PHP environment Jul 25, 2025 pm 08:54 PM

To enable PHP containers to support automatic construction, the core lies in configuring the continuous integration (CI) process. 1. Use Dockerfile to define the PHP environment, including basic image, extension installation, dependency management and permission settings; 2. Configure CI/CD tools such as GitLabCI, and define the build, test and deployment stages through the .gitlab-ci.yml file to achieve automatic construction, testing and deployment; 3. Integrate test frameworks such as PHPUnit to ensure that tests are automatically run after code changes; 4. Use automated deployment strategies such as Kubernetes to define deployment configuration through the deployment.yaml file; 5. Optimize Dockerfile and adopt multi-stage construction

How to use PHP combined with AI to analyze video content PHP intelligent video tag generation How to use PHP combined with AI to analyze video content PHP intelligent video tag generation Jul 25, 2025 pm 06:15 PM

The core idea of PHP combining AI for video content analysis is to let PHP serve as the backend "glue", first upload video to cloud storage, and then call AI services (such as Google CloudVideoAI, etc.) for asynchronous analysis; 2. PHP parses the JSON results, extract people, objects, scenes, voice and other information to generate intelligent tags and store them in the database; 3. The advantage is to use PHP's mature web ecosystem to quickly integrate AI capabilities, which is suitable for projects with existing PHP systems to efficiently implement; 4. Common challenges include large file processing (directly transmitted to cloud storage with pre-signed URLs), asynchronous tasks (introducing message queues), cost control (on-demand analysis, budget monitoring) and result optimization (label standardization); 5. Smart tags significantly improve visual

How to build an independent PHP task container environment. How to configure the container for running PHP timed scripts How to build an independent PHP task container environment. How to configure the container for running PHP timed scripts Jul 25, 2025 pm 07:27 PM

Building an independent PHP task container environment can be implemented through Docker. The specific steps are as follows: 1. Install Docker and DockerCompose as the basis; 2. Create an independent directory to store Dockerfile and crontab files; 3. Write Dockerfile to define the PHPCLI environment and install cron and necessary extensions; 4. Write a crontab file to define timing tasks; 5. Write a docker-compose.yml mount script directory and configure environment variables; 6. Start the container and verify the log. Compared with performing timing tasks in web containers, independent containers have the advantages of resource isolation, pure environment, strong stability, and easy expansion. To ensure logging and error capture

See all articles