How to use wildcards to match strings in Python

WBOY
2023-05-06 12:13:06

Use wildcards to match strings:

  • Use the fnmatch.filter() method to get the strings matching the pattern from the list.

  • Use the fnmatch.fnmatch() method to check whether a string matches a pattern.

import fnmatch

a_list = ['fql.txt', 'jiyik.txt', 'com.csv']

pattern = '*.txt'
filtered_list = fnmatch.filter(a_list, pattern)
print(filtered_list)  # ['fql.txt', 'jiyik.txt']

If you would rather use regular expressions, skip ahead to the regular expression section further down.

The fnmatch.filter() method accepts an iterable and a pattern and returns a new list containing only the elements of the iterable that match the pattern.

The pattern in the example matches any string that ends with .txt: the asterisk stands for any run of characters, including none.

The pattern in the example contains only one wildcard, but you can use as many wildcards as you want.
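As a minimal sketch of a pattern with several wildcards, the example below combines two asterisks and two question marks. The file names are made up for illustration and do not come from the original example.

```python
import fnmatch

# Hypothetical file names, invented for this illustration.
names = (
    'report_2023_q1.csv',
    'report_2024_q2.csv',
    'notes_2023.txt',
    'report_final.csv',
)

# '*' may appear more than once, and can be mixed with '?'.
pattern = '*_20??_*.csv'

# fnmatch.filter() accepts any iterable (a tuple here) and returns a list.
matched = fnmatch.filter(names, pattern)
print(matched)  # ['report_2023_q1.csv', 'report_2024_q2.csv']
```

Each ? stands for exactly one character, so 20?? matches a four-character token such as 2023 or 2024.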

Note that the asterisk * matches everything (zero or more characters).

If you want to match exactly one character, use the question mark ? instead of the asterisk *.

  • * matches everything (zero or more characters)

  • ? matches any single character

  • [sequence] matches any character in the sequence

  • [!sequence] matches any character not in the sequence
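The bracket forms in the list above can be sketched as follows. The file names are hypothetical and chosen only to show which entries each pattern selects.

```python
import fnmatch

# Hypothetical file names, invented for this illustration.
names = ['log1.txt', 'log2.txt', 'log9.txt', 'loga.txt']

# [12] matches a single character that is either '1' or '2'.
in_set = fnmatch.filter(names, 'log[12].txt')

# [!12] matches a single character that is neither '1' nor '2'.
not_in_set = fnmatch.filter(names, 'log[!12].txt')

# A range such as [0-9] is also allowed inside the brackets.
in_range = fnmatch.filter(names, 'log[0-9].txt')

print(in_set)      # ['log1.txt', 'log2.txt']
print(not_in_set)  # ['log9.txt', 'loga.txt']
print(in_range)    # ['log1.txt', 'log2.txt', 'log9.txt']
```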

Here is an example of using a question mark to match any single character.

import fnmatch

a_list = ['abc', 'abz', 'abxyz']

pattern = 'ab?'
filtered_list = fnmatch.filter(a_list, pattern)
print(filtered_list)  # ['abc', 'abz']

This pattern matches strings that consist of ab followed by exactly one more character, which is why 'abxyz' is left out.

If you want to use wildcards to check whether a string matches a pattern, use the fnmatch.fnmatch() method.

import fnmatch

a_string = '2023_jiyik.txt'
pattern = '2023*.txt'

matches_pattern = fnmatch.fnmatch(a_string, pattern)
print(matches_pattern)  # True

if matches_pattern:
    # this branch runs
    print('The string matches the pattern')
else:
    print('The string does NOT match the pattern')

The pattern starts with 2023, followed by any number of characters (including none), and ends with .txt.

The fnmatch.fnmatch() method accepts a string and a pattern as parameters. If the string matches the pattern, the method returns True; otherwise it returns False. Replace the asterisk * with the question mark ? if you want to match exactly one character instead of any number of characters.
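One detail worth checking is that the asterisk can also match nothing at all, while the question mark always needs exactly one character. A small sketch, using made-up file names:

```python
import fnmatch

# '*' matches zero or more characters, so it can match an empty run.
star_on_empty = fnmatch.fnmatch('2023.txt', '2023*.txt')

# '?' must consume exactly one character, so an empty run fails.
question_on_empty = fnmatch.fnmatch('2023.txt', '2023?.txt')

# With one character between '2023' and '.txt', '?' succeeds.
question_on_one = fnmatch.fnmatch('2023a.txt', '2023?.txt')

print(star_on_empty)      # True
print(question_on_empty)  # False
print(question_on_one)    # True
```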

Alternatively, we can use regular expressions.

Use regular expressions to match strings using wildcards

Use wildcards to match strings:

Use the re.match() method to check whether a string matches a given pattern. In the regular expression, use .* in place of the * wildcard.

import re

a_list = ['2023_fql.txt', '2023_jiyik.txt', '2023_com.csv']

regex = re.compile(r'2023_.*\.txt')

list_of_matches = [
    item for item in a_list
    if regex.match(item)
]

print(list_of_matches)  # ['2023_fql.txt', '2023_jiyik.txt']

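The re.compile() method compiles a regular expression pattern into a pattern object, which can then be matched through its match() or search() methods.

When the same pattern is applied many times, this is tidier and can be slightly faster than passing the pattern string to re.match() or re.search() on every call, because the compiled object is created once and reused. The re module also caches recently compiled patterns, so the difference is usually small.

The regular expression starts with 2023_.

The .* in the regular expression acts as a wildcard that matches any run of zero or more characters.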

  • Dot . matches any character except a newline character.

  • The asterisk * matches the preceding regular expression (dot .) zero or more times.

We use the backslash \ character to escape the dot . in the extension because, as we saw above, the dot . has a special meaning in regular expressions. In other words, the backslash makes the dot . match a literal dot.
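To see why the escape matters, the sketch below compares an unescaped and an escaped dot against a made-up name that has no dot before txt. It also shows re.escape(), which adds the backslashes for you when the literal text is not known in advance.

```python
import re

# Hypothetical name with an underscore where the dot would normally be.
name = '2023_fql_txt'

unescaped = re.compile(r'2023_.*.txt')   # last '.' matches ANY character
escaped = re.compile(r'2023_.*\.txt')    # '\.' matches only a literal dot

matches_unescaped = bool(unescaped.match(name))
matches_escaped = bool(escaped.match(name))

print(matches_unescaped)  # True, the '_' is accepted in place of the dot
print(matches_escaped)    # False, there is no literal '.' before 'txt'

# re.escape() escapes special characters in a literal string.
escaped_suffix = re.escape('.txt')
print(escaped_suffix)     # \.txt
```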

We use a list comprehension to iterate over the list of strings.

List comprehensions are used to perform an operation on each element or to select the subset of elements that meet a condition.

On each iteration, match() checks whether the current string matches the pattern, starting from the beginning of the string.

The re.match() method returns a match object if the regular expression matches at the beginning of the string.

If the string does not match the regular expression pattern, the match() method returns None.

The new list contains only the strings in the original list that match the pattern.
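Because match() only anchors the pattern at the start of the string, trailing text after .txt is still accepted. If the whole string has to match, as it does with fnmatch, one option is fullmatch(). A minimal sketch with hypothetical names:

```python
import re

regex = re.compile(r'2023_.*\.txt')

# Hypothetical names, invented for this illustration.
names = ['2023_fql.txt', '2023_fql.txt.bak', 'old_2023_fql.txt']

# match() checks from the start of the string but ignores what follows.
prefix_matches = [name for name in names if regex.match(name)]

# fullmatch() requires the pattern to cover the entire string.
full_matches = [name for name in names if regex.fullmatch(name)]

print(prefix_matches)  # ['2023_fql.txt', '2023_fql.txt.bak']
print(full_matches)    # ['2023_fql.txt']
```

Adding $ to the end of the pattern is another common way to get a similar effect with match().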

If you only want to match a single character, remove the asterisk * after the dot . in the regular expression.

import re

a_list = ['2023_a.txt', '2023_bcde.txt', '2023_z.txt']

regex = re.compile(r'2023_.\.txt')

list_of_matches = [
    item for item in a_list
    if regex.match(item)
]

print(list_of_matches)  # ['2023_a.txt', '2023_z.txt']

The dot . matches any character except a newline.

Because the first dot . is not escaped, the regular expression matches any string that starts with 2023_, followed by any single character, followed by .txt.

If you need help reading or writing regular expressions, refer to the regular expression syntax section of the official Python documentation for the re module.

It contains a list of all special characters and many useful examples.

If you want to use regular expressions to check whether a single string matches a pattern, you can call the re.match() method directly.

import re

a_string = '2023_fql.txt'

matches_pattern = bool(re.match(r'2023_.*\.txt', a_string))
print(matches_pattern)  # True

if matches_pattern:
    # this branch runs
    print('The string matches the pattern')
else:
    print('The string does NOT match the pattern')

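If the string matches the pattern, the re.match() method returns a match object; if it does not, the method returns None.

We use the bool() class to convert the result to a boolean.

If you want the wildcard to stand for a single character, remove the asterisk *.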

import re

a_string = '2023_ABC.txt'

matches_pattern = bool(re.match(r'2023_.\.txt', a_string))
print(matches_pattern)  # False

if matches_pattern:
    print('The string matches the pattern')
else:
    # this branch runs
    print('The string does NOT match the pattern')

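Note that the dot . without a backslash prefix matches any single character, while the dot . prefixed with a backslash \ is treated as a literal dot.

The string in the example does not match the pattern, so the matches_pattern variable stores False.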
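The two approaches are related: the standard library's fnmatch.translate() function converts a shell-style wildcard pattern into a regular expression string. The sketch below uses it on made-up names. The exact text of the generated expression differs between Python versions, so it is not printed here.

```python
import fnmatch
import re

# Hypothetical names, invented for this illustration.
names = ['2023_fql.txt', '2023_jiyik.txt', '2023_com.csv', '2023_fql.txt.bak']

# Convert the wildcard pattern to regular expression source, then compile it.
regex_source = fnmatch.translate('2023_*.txt')
regex = re.compile(regex_source)

# The translated expression is anchored at the end, so match() behaves
# like a whole-string match here.
matches = [name for name in names if regex.match(name)]
print(matches)  # ['2023_fql.txt', '2023_jiyik.txt']
```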

Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn to have it deleted.