I am trying to trim Unicode spaces such as this character, and I was able to do it using this solution. The problem with this solution is that it only trims at the start and end of the string; it does not remove Unicode spaces between normal characters. For example, this string uses a thin space:
$string = "\u{2009}test\u{2009}string\u{2009}";
echo preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $string); // outputs: test string (the thin space between the words is still there)
I know only a little about regular expressions, so I don't know how to change my expression to solve this problem.
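To remove all Unicode whitespace and control characters at the beginning and end of a string, and to remove all Unicode whitespace and control characters except regular spaces anywhere within the string, you can use preg_replace with the pattern ^[\pZ\pC]+|[\pZ\pC]+$|(?! )[\pZ\pC] and the u modifier, or a shorter second variant built on [^\S ].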
See Regular Expression Demo #1 and Regular Expression Demo #2.
Details
^[\pZ\pC]+ - one or more whitespace or control characters at the beginning of the string
| - or
[\pZ\pC]+$ - one or more whitespace or control characters at the end of the string
| - or
(?! )[\pZ\pC] - any whitespace or control character other than a regular space, anywhere within the string
[^\S ] - any whitespace character except a regular space (\x20)

If you also need to "exclude" common line break characters, replace (?! )[\pZ\pC] with (?![ \r\n])[\pZ\pC] (suggested by @MonkeyZeus); in the second regex, this means you need to use [^\S \r\n].

See the PHP demo.
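As a minimal sketch of how the alternatives listed above fit together in one call: the function names trimUnicodeSpaces and trimUnicodeSpacesKeepNewlines are hypothetical, the "\u{2009}" escape assumes PHP 7 or later, and falling back to the input when preg_replace fails is an assumption, not part of the answer.

```php
<?php
// Hypothetical helper: strips Unicode separators/control characters at both
// ends, and removes every such character inside the string except U+0020.
function trimUnicodeSpaces(string $s): string
{
    // ^[\pZ\pC]+     leading separators or control characters
    // [\pZ\pC]+$     trailing separators or control characters
    // (?! )[\pZ\pC]  any single separator/control character that is not a regular space
    $result = preg_replace('~^[\pZ\pC]+|[\pZ\pC]+$|(?! )[\pZ\pC]~u', '', $s);

    // preg_replace returns null on failure (for example, invalid UTF-8 input);
    // returning the input unchanged in that case is an assumption of this sketch.
    return $result ?? $s;
}

// Hypothetical variant that also keeps \r and \n inside the string.
function trimUnicodeSpacesKeepNewlines(string $s): string
{
    $result = preg_replace('~^[\pZ\pC]+|[\pZ\pC]+$|(?![ \r\n])[\pZ\pC]~u', '', $s);
    return $result ?? $s;
}

$input = "\u{2009}test\u{2009}string one\u{2009}";
echo trimUnicodeSpaces($input), PHP_EOL; // prints "teststring one"
```

Note that the inner thin space is removed outright, so "test" and "string" end up joined, while the regular space before "one" is kept.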
Unicode spaces such as \u{2009} can cause problems in various places. So I would replace all Unicode spaces with regular spaces and then apply trim().
Note: The strings in the comments were rendered with debug::writeUni($string, $trimString);, which is implemented in this class.
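The replace-then-trim idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name normalizeAndTrim is hypothetical, \pZ is assumed to be an adequate definition of "Unicode spaces", the "\u{2009}" escape requires PHP 7 or later, and the sketch does not use the debug class mentioned in the note.

```php
<?php
// Hypothetical helper: turn every Unicode separator into a regular space,
// then let the built-in trim() strip the ends.
function normalizeAndTrim(string $s): string
{
    // \pZ matches space, line and paragraph separators (e.g. U+2009, U+00A0).
    $normalized = preg_replace('~\pZ~u', ' ', $s);

    // preg_replace returns null on failure (for example, invalid UTF-8 input);
    // falling back to the original string is an assumption of this sketch.
    return trim($normalized ?? $s);
}

$input = "\u{2009}test\u{2009}string\u{2009}";
echo normalizeAndTrim($input), PHP_EOL; // prints "test string"
```

Unlike removing the characters, this keeps the words separated, because the inner thin space becomes a regular space.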