This article covers JavaScript strings: the commonly used basic methods, special characters, and how strings are represented internally. I hope it will be helpful to everyone.
Strings are an important data type in every programming language. Follow along to learn more about JavaScript strings!
A string is a sequence of characters. If you have studied C or Java, you know that a character can be a type of its own. JavaScript, however, has no separate character type, only strings of length 1. JavaScript strings always use UTF-16 encoding internally, regardless of the encoding the source file is written in.
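As a minimal sketch of the point above (the variable names are illustrative and not from the original article), indexing a string yields another string of length 1 rather than a value of a separate character type:

```javascript
// Illustrative names; JavaScript has no separate character type.
const word = 'hello';
const firstChar = word[0];

console.log(firstChar);          // h
console.log(typeof firstChar);   // string
console.log(firstChar.length);   // 1

// The internal UTF-16 code unit is the same however the literal is written.
console.log('z'.charCodeAt(0) === '\u007A'.charCodeAt(0)); // true
```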
There are three ways to write strings: single quotes, double quotes, and backticks.
let single = 'abcdefg'; // single quotes
let double = "asdfghj"; // double quotes
let backti = `zxcvbnm`; // backticks
Single and double quotes are equivalent; JavaScript makes no distinction between them.
String formatting
Backticks allow us to format strings elegantly with ${...} instead of string concatenation.
let str = `I'm ${Math.round(18.5)} years old.`;
console.log(str);
Code execution result:
I'm 19 years old.
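To make the contrast with concatenation concrete, here is a small sketch that builds the same string both ways. The sample values (`userName`, `items`, `price`) are hypothetical and only serve the illustration:

```javascript
// Hypothetical sample values.
const userName = 'Ada';
const items = 3;
const price = 4.5;

// Concatenation needs explicit + operators and careful spacing.
const viaConcat = 'Hello, ' + userName + '! Total: ' + (items * price).toFixed(2);

// A template literal embeds any expression inside ${...}.
const viaTemplate = `Hello, ${userName}! Total: ${(items * price).toFixed(2)}`;

console.log(viaTemplate);               // Hello, Ada! Total: 13.50
console.log(viaConcat === viaTemplate); // true
```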
Multi-line string
Backticks also allow a string to span several lines, which is very useful when writing multi-line strings.
let ques = `Is the author handsome?
A. Very handsome;
B. So handsome;
C. Super handsome;`;
console.log(ques);
Code execution result:
Is the author handsome?
A. Very handsome;
B. So handsome;
C. Super handsome;
This may not look like much, but it cannot be done with single or double quotes. To get the same result with them, you can write:
let ques = 'Is the author handsome?\nA. Very handsome;\nB. So handsome;\nC. Super handsome;';
console.log(ques);
The code above contains the special character \n, the most common special character in everyday programming.
\n, also known as the "newline character", lets single-quoted and double-quoted strings produce multi-line output. When the engine outputs a string and encounters \n, it continues on a new line, which yields a multi-line string.
Although \n looks like two characters, it occupies only one character position. This is because \ is the escape character in strings, and a character preceded by the escape character becomes a special character.
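A short sketch can confirm that an escape sequence occupies a single position and that a backtick string with a real line break equals its \n counterpart. The variable names are illustrative only:

```javascript
// "\n" is written with two keystrokes but stored as one character.
const newline = '\n';
console.log(newline.length);            // 1

const twoLines = 'first\nsecond';
console.log(twoLines.length);           // 12 (5 + 1 + 6)
console.log(twoLines.split('\n'));      // [ 'first', 'second' ]

// Escaping the backslash itself gives two real characters: \ and n.
const literalBackslashN = '\\n';
console.log(literalBackslashN.length);  // 2

// A line break inside backticks is the same newline character.
const viaBackticks = `A
B`;
console.log(viaBackticks === 'A\nB');   // true
```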
Special character list
Special character | Description |
---|---|
\n | Newline character; starts a new line of output text. |
\r | Carriage return; moves the cursor to the beginning of the line. Windows uses \r\n for a line break: the cursor first returns to the beginning of the line and then moves down to the next line. Other systems use just \n. |
\' \" | Single and double quotes. Because quotes delimit strings, they must be escaped when used inside a string delimited by the same kind of quote. |
\\ | Backslash. Because \ is itself the escape character, it must be escaped to output a literal \. |
\b \f \v | Backspace, form feed, vertical tab; rarely used today. |
\xXX | The character whose Unicode code is the hexadecimal value XX; for example, \x7A is z (the hexadecimal Unicode code of z is 7A). |
\uXXXX | The character whose Unicode code is the hexadecimal value XXXX; for example, \u00A9 is ©. |
\u{X...X} (1-6 hexadecimal digits) | The Unicode symbol with the given UTF-32 code X...X. |
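The escapes in the table can be tried directly. The following is a minimal sketch; the constant names and the sample path are made up for illustration:

```javascript
const hexEscape = '\x7A';            // two hex digits
const unicodeEscape = '\u00A9';      // four hex digits
const codePointEscape = '\u{1F602}'; // one to six hex digits
const quoted = 'It\'s a "quote"';    // escaped single quote inside single quotes
const windowsPath = 'C:\\temp';      // escaped backslash (hypothetical path)

console.log(hexEscape);              // z
console.log(unicodeEscape);          // ©
console.log(codePointEscape);        // 😂
console.log(codePointEscape.length); // 2, stored as two UTF-16 code units
console.log(quoted);                 // It's a "quote"
console.log(windowsPath);            // C:\temp
```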
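Method | Description | Arguments |
---|---|---|
.slice(start,end) | [start,end) | May be negative |
.substring(start,end) | [start,end) | Negative values are treated as 0 |
.substr(start,len) | The substring of length len starting at start | start may be negative |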
With so many methods it is easy to get stuck choosing. We recommend simply remembering .slice(), which is more flexible than the other two.
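The differences in the table are easiest to see side by side. This is a small sketch with an arbitrary sample string; note that substr() is considered a legacy feature, which is one more reason to prefer slice():

```javascript
const text = 'JavaScript';

// slice: [start, end), negative indexes count from the end.
console.log(text.slice(0, 4));       // Java
console.log(text.slice(-6));         // Script
console.log(text.slice(-6, -3));     // Scr

// substring: [start, end), negatives become 0 and swapped arguments are reordered.
console.log(text.substring(-3, 4));  // Java
console.log(text.substring(4, 0));   // Java

// substr: start plus length; start may be negative.
console.log(text.substr(4, 3));      // Scr
console.log(text.substr(-6, 3));     // Scr
```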
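As mentioned earlier, strings are compared in lexicographic order. Every character is backed by a numeric code, and the ASCII table is an important reference.
For example:
console.log('a' > 'Z'); // true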
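Comparing characters is essentially comparing the codes that represent them. JavaScript encodes strings in UTF-16, where each code unit is 16 bits wide. To see what is actually being compared, use .codePointAt(idx) to get a character's code: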
console.log('a'.codePointAt(0)); // 97
console.log('Z'.codePointAt(0)); // 90
Code execution result:
97
90
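To illustrate how such a comparison proceeds, here is a minimal sketch of a helper that mimics what the < and > operators do with strings. `compareByCodeUnits` is a hypothetical name, not a built-in:

```javascript
// Hypothetical helper: compares two strings code unit by code unit,
// the way the < and > operators do. Returns -1, 0 or 1.
function compareByCodeUnits(a, b) {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    const diff = a.charCodeAt(i) - b.charCodeAt(i);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  // All shared positions match: the shorter string sorts first.
  return Math.sign(a.length - b.length);
}

console.log(compareByCodeUnits('a', 'Z'));         // 1  (97 > 90)
console.log(compareByCodeUnits('apple', 'apply')); // -1 (e is 101, y is 121)
console.log(compareByCodeUnits('app', 'apple'));   // -1 (prefix sorts first)
console.log(compareByCodeUnits('abc', 'abc'));     // 0
```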
String.fromCodePoint(code) converts a code back into a character:
console.log(String.fromCodePoint(97));
console.log(String.fromCodePoint(90));
Code execution result:
a
Z
The same can be done with the \u escape, as follows:
console.log('\u005a'); // Z, 005a is 90 in hexadecimal
console.log('\u0061'); // a, 0061 is 97 in hexadecimal
Now let's explore the characters whose codes lie in the range [65, 220]:
let str = '';
for (let i = 65; i <= 220; i++) {
  str += String.fromCodePoint(i);
}
console.log(str);
Run the code yourself to see the full output.
.localeCompare()
Based on the internationalization standard ECMA-402, JavaScript implements a special method, .localeCompare(), for comparing strings. It is called as str1.localeCompare(str2):
1. If str1 < str2, it returns a negative number;
2. If str1 > str2, it returns a positive number;
3. If str1 == str2, it returns 0.
For example:
console.log("abc".localeCompare('def')); // -1
Why not just use the comparison operators?
Because letters can have variant forms. For example, á is a variant of a:
console.log('á' < 'z'); // false
Although it is still an a, it compares greater than z!
In this case we need the .localeCompare() method:
console.log('á'.localeCompare('z')); // -1
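A common place where this matters is sorting. The sketch below sorts a made-up word list twice: once with the default code-unit order and once with .localeCompare(). The explicit 'en' locale is an assumption for the example, and the exact locale-aware order can depend on the engine's internationalization data:

```javascript
// '\u00e1rbol' is "árbol", written with an escape to keep the source ASCII-only.
const words = ['zebra', '\u00e1rbol', 'apple', 'Zoo'];

// Default sort compares UTF-16 code units: Z (90) < a (97) < z (122) < á (225).
const byCodeUnits = [...words].sort();
console.log(byCodeUnits); // [ 'Zoo', 'apple', 'zebra', 'árbol' ]

// Locale-aware sort treats á as a variant of a, so "árbol" sorts before "zebra".
const byLocale = [...words].sort((a, b) => a.localeCompare(b, 'en'));
console.log(byLocale);
```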
str.trim() removes whitespace from both ends of a string; str.trimStart() and str.trimEnd() remove whitespace from the start and the end, respectively.
let str = " 999 ";
console.log(str.trim()); // 999
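Whitespace here means more than the space character: tabs and line breaks are trimmed as well. A small sketch with a hypothetical input value, printed through JSON.stringify so the invisible characters show up:

```javascript
// Hypothetical input containing a tab, spaces and a newline.
const rawInput = '\t  999 \n';

const trimmedBoth = rawInput.trim();
const trimmedStart = rawInput.trimStart();
const trimmedEnd = rawInput.trimEnd();

console.log(JSON.stringify(trimmedBoth));   // "999"
console.log(JSON.stringify(trimmedStart));  // "999 \n"
console.log(JSON.stringify(trimmedEnd));    // "\t  999"

// Strings are immutable, so the original value is untouched.
console.log(JSON.stringify(rawInput));      // "\t  999 \n"
```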
str.repeat(n) repeats the string n times.
let str = '6';
console.log(str.repeat(3)); // 666
str.replace(substr, newstr) replaces the first occurrence of a substring; str.replaceAll() replaces every occurrence.
let str = '9+9';
console.log(str.replace('9', '6')); // 6+9
console.log(str.replaceAll('9', '6')); // 6+6
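On engines that predate replaceAll(), the same effect can be had in other ways. The following sketch shows two common alternatives; the variable names are illustrative:

```javascript
const expr = '9+9';

// replace() with a string pattern only touches the first match.
const firstOnly = expr.replace('9', '6');
console.log(firstOnly);      // 6+9

// A regular expression with the g flag replaces every match.
const viaRegex = expr.replace(/9/g, '6');
console.log(viaRegex);       // 6+6

// Splitting on the substring and joining with the replacement also works.
const viaSplitJoin = expr.split('9').join('6');
console.log(viaSplitJoin);   // 6+6

// None of these modify the original string.
console.log(expr);           // 9+9
```

When replaceAll() is given a regular expression, that expression must carry the g flag; otherwise a TypeError is thrown.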
There are many other methods; consult the reference manual to learn more.
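JavaScript encodes strings in UTF-16, using two bytes (16 bits) per character. However, 16 bits can represent only 65536 characters. That easily covers common characters, but it is not enough for rare characters (such as rare Chinese characters), emoji, rare mathematical symbols, and so on.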
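In such cases an extension is needed: special characters are represented with more bits (32 bits, that is, two 16-bit code units). For example:
console.log('😀'.length); // 2
console.log('😂'.length); // 2
Code execution result:
2
2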
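As a result, we cannot process such characters with the ordinary methods. What happens if we output each of their 16-bit halves separately?
console.log('😀'[0]);
console.log('😀'[1]);
Code execution result: each call prints one half of the pair, which is typically displayed as an unrecognizable character such as �.
As you can see, the individual halves cannot be recognized on their own.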
Fortunately, String.fromCodePoint() and .codePointAt() can handle this case, because both were added to the language relatively recently. Older versions of JavaScript only offered String.fromCharCode() and .charCodeAt() for converting between codes and characters, and those do not work for such special characters.
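The difference between the two pairs of methods can be checked with a single emoji. This is a small sketch; `face` is an illustrative name, and the character is written as an escape to avoid encoding problems:

```javascript
const face = '\u{1F602}'; // 😂

// charCodeAt sees only one 16-bit code unit; codePointAt reads the whole character.
console.log(face.charCodeAt(0).toString(16));   // d83d
console.log(face.codePointAt(0).toString(16));  // 1f602

// fromCodePoint accepts the full code; fromCharCode truncates it to 16 bits.
console.log(String.fromCodePoint(0x1f602) === face);        // true
console.log(String.fromCharCode(0x1f602) === face);         // false
console.log(String.fromCharCode(0xd83d, 0xde02) === face);  // true
```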
We can detect a special character by checking the range of its code. If a code lies in 0xd800~0xdbff, it is the first half of a 32-bit character, and the second half should lie in 0xdc00~0xdfff.
For example:
console.log('😀'.charCodeAt(0).toString(16)); // d83d
console.log('😂'.charCodeAt(1).toString(16)); // de02
Code execution result:
d83d
de02
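The range check described above can be turned into a few helpers. This is only a sketch under the stated ranges; `isHighSurrogate`, `isLowSurrogate`, `decodeSurrogatePair` and `countCodePoints` are hypothetical names, not built-ins:

```javascript
// Hypothetical helpers based on the two ranges described above.
const isHighSurrogate = (unit) => unit >= 0xd800 && unit <= 0xdbff;
const isLowSurrogate = (unit) => unit >= 0xdc00 && unit <= 0xdfff;

// Combine the two halves of a pair into a single code point.
function decodeSurrogatePair(high, low) {
  return (high - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
}

// Count characters (code points) rather than 16-bit code units.
function countCodePoints(str) {
  let count = 0;
  for (let i = 0; i < str.length; i++) {
    const next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
    if (isHighSurrogate(str.charCodeAt(i)) && isLowSurrogate(next)) {
      i++; // skip the second half of the pair
    }
    count++;
  }
  return count;
}

const sample = 'a\u{1F602}b'; // "a😂b"
console.log(sample.length);                                     // 4
console.log(countCodePoints(sample));                           // 3
console.log([...sample].length);                                // 3
console.log(decodeSurrogatePair(0xd83d, 0xde02).toString(16));  // 1f602
```

In everyday code the spread syntax or a for...of loop is usually enough, since string iteration already walks code points rather than code units.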
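Many languages have variants built on a base letter. For example, the letter a is the base character of àáâäãåā. Not every such variant has its own code in UTF-16, because there are too many possible combinations.
To support all variant combinations, a single variant character can also be represented by several Unicode characters. In code, we can write a special character as a base character followed by a "decoration" mark: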
console.log('a\u0307'); // ȧ
console.log('a\u0308'); // ä
console.log('a\u0309'); // ả
console.log('E\u0307'); // Ė
console.log('E\u0308'); // Ë
console.log('E\u0309'); // Ẻ
Code execution result:
ȧ
ä
ả
Ė
Ë
Ẻ
A base letter can also carry several decorations, for example:
console.log('E\u0307\u0323'); // Ẹ̇
console.log('E\u0323\u0307'); // Ẹ̇
Code execution result:
Ẹ̇
Ẹ̇
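This raises a problem: with several decorations, the marks can appear in different orders while the displayed character looks the same.
If we compare the two representations directly, we get a misleading result:
let e1 = 'E\u0307\u0323';
let e2 = 'E\u0323\u0307';
console.log(`${e1}==${e2} is ${e1 == e2}`);
Code execution result:
Ẹ̇==Ẹ̇ is false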
To solve this, there is a Unicode normalization algorithm that converts a string to a common form. It is implemented by str.normalize():
let e1 = 'E\u0307\u0323';
let e2 = 'E\u0323\u0307';
console.log(`${e1}==${e2} is ${e1.normalize() == e2.normalize()}`);
Code execution result:
Ẹ̇==Ẹ̇ is true
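Normalization also reconciles a precomposed character with its base-plus-mark spelling. The sketch below uses the standard 'NFC' and 'NFD' form names accepted by normalize(); `sameText` is a hypothetical helper introduced only for this example:

```javascript
const composed = '\u00e9';     // é as a single code point
const decomposed = 'e\u0301';  // e followed by a combining acute accent

console.log(composed === decomposed);                   // false
console.log(composed.length, decomposed.length);        // 1 2

// NFC composes where possible; NFD decomposes.
console.log(decomposed.normalize('NFC') === composed);  // true
console.log(composed.normalize('NFD') === decomposed);  // true

// Hypothetical helper: compare strings after normalizing both (default form is NFC).
function sameText(a, b) {
  return a.normalize() === b.normalize();
}

console.log(sameText('E\u0307\u0323', 'E\u0323\u0307')); // true
```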