objective-c - How do I read the first 3 characters of a file with mixed Chinese and English text in C, Objective-C, or C++?
黄舟 2017-04-17 11:42:25

Contents of 1.txt: 你好a,我是千叶!
Expected result: 你好a

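#include <stdio.h>

int main(void)
{
    char x[1000];
    FILE *fp = fopen("1.txt", "r");
    if (fp == NULL)
        return 1;
    /* length = 7 gives the correct result for the current 1.txt, but if 1.txt
       becomes all Chinese, the third character gets truncated. How should I handle this? */
    size_t n = fread(x, sizeof(char), 7, fp);
    x[n] = '\0';  /* fread does not NUL-terminate the buffer */
    printf("%s", x);
    fclose(fp);
    return 0;
}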

====================================================================================

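My scenario: the file is fairly large, so I would rather not load the whole thing into an NSData or NSString. I want to read part of the data with NSData and then convert it to an NSString, and that is where I ran into Chinese characters being cut off. After reading everyone's answers, I realize the question may be ill-posed: file offsets are counted in bytes and take no account of the file's character encoding.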
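For the chunked-reading scenario described here, one possible way to cope with byte offsets is to trim each chunk back to the last complete character before converting it, and carry the leftover bytes into the next chunk. The following is only a sketch and assumes the file is UTF-8 (which the 7-byte length of 你好a suggests); the function name `utf8_complete_len` is made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: returns the length of the longest prefix of
 * buf[0..len) that does not end in the middle of a UTF-8 sequence.
 * Assumes the data is otherwise valid UTF-8. */
size_t utf8_complete_len(const unsigned char *buf, size_t len)
{
    size_t i = len;
    size_t back = 0;

    /* Step back over trailing continuation bytes (10xxxxxx), at most 3. */
    while (i > 0 && back < 3 && (buf[i - 1] & 0xC0) == 0x80) {
        i--;
        back++;
    }
    if (i == 0)
        return 0;  /* empty buffer, or nothing but continuation bytes */

    /* buf[i - 1] should now be the lead byte of the last sequence. */
    unsigned char lead = buf[i - 1];
    size_t need;
    if (lead < 0x80)                need = 1;
    else if ((lead & 0xE0) == 0xC0) need = 2;
    else if ((lead & 0xF0) == 0xE0) need = 3;
    else if ((lead & 0xF8) == 0xF0) need = 4;
    else return len;  /* not a valid lead byte: leave it to the decoder */

    size_t have = len - (i - 1);  /* bytes present for the last sequence */
    return have >= need ? len : i - 1;
}
```

After each read, only the first `utf8_complete_len(buf, n)` bytes would be converted to a string; the remaining bytes would be moved to the front of the buffer and completed by the next read.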

I previously asked this under the Objective-C tag and nobody answered, so I wanted to see whether it can be solved in C. Original question: http://segmentfault.com/q/1010000002530834?_ea=128095


All replies (4)
Peter_Zhu

Here is one approach:

  1. To read the file correctly, you must know its character encoding.
  2. Create an NSString object. NSString has the initializer initWithData:encoding:, and NSData has the class method dataWithContentsOfFile:.
  3. Once step 2 has produced a valid object, call NSString's substringWithRange: method to extract the substring.

Hope this helps, OP.
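The same idea (know the encoding first, then cut by characters rather than bytes) can be sketched in plain C without loading the whole file. This is an illustration only: it assumes the file is UTF-8, it checks lead bytes but does not validate continuation bytes, and the names `utf8_seq_len` and `read_utf8_chars` are hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Number of bytes in a UTF-8 sequence, judged from its first byte;
 * 0 if the byte cannot start a sequence. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Reads up to nchars UTF-8 characters from fp into out (capacity cap),
 * NUL-terminates out, and returns the number of characters stored. */
size_t read_utf8_chars(FILE *fp, size_t nchars, char *out, size_t cap)
{
    size_t used = 0, count = 0;

    if (cap == 0)
        return 0;
    while (count < nchars) {
        int c = fgetc(fp);
        if (c == EOF)
            break;
        size_t n = utf8_seq_len((unsigned char)c);
        if (n == 0 || used + n + 1 > cap)
            break;  /* invalid lead byte, or no room left for the NUL */
        out[used] = (char)c;
        /* Read the rest of the sequence; give up on a short read. */
        if (n > 1 && fread(out + used + 1, 1, n - 1, fp) != n - 1)
            break;
        used += n;
        count++;
    }
    out[used] = '\0';
    return count;
}
```

With the question's file, calling `read_utf8_chars(fp, 3, x, sizeof x)` in place of the fixed-length `fread` would stop on a character boundary whether the first three characters are Chinese, English, or a mix.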

左手右手慢动作

The key point is that under ANSI encoding (GBK on a Simplified Chinese system), a Chinese character occupies two bytes and an English character occupies one byte.

So for your example:

// 1.txt
你好a,我是千叶!
^^^^^
// "你好a": count them, that's 5 bytes.

So if you want to extract "你好a", use:

fread(x, sizeof(char), 5, fp);
x[5] = '\0';       // fread does not NUL-terminate the buffer
printf("%s\n", x); // prints "你好a"

If it is all in Chinese, for example:

// 1.txt
你好啊,我是千叶!
^^^^^^
// three Chinese characters are 6 bytes

So for all-Chinese text, read an even number of bytes to make sure no character is truncated:

fread(x, sizeof(char), 6, fp);
x[6] = '\0';       // fread does not NUL-terminate the buffer
printf("%s\n", x); // prints "你好啊"
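The byte counts above (5 and 6) can also be computed instead of counted by hand. A rough sketch, assuming the data really is GBK, where a byte in the range 0x81 to 0xFE starts a two-byte character and anything else is a single byte; the function name `gbk_prefix_bytes` is invented for this example and the trail byte is not validated.

```c
#include <stddef.h>

/* Hypothetical helper: returns how many bytes the first nchars characters
 * of a GBK buffer occupy. Stops early if the buffer runs out, including
 * when it ends in the middle of a two-byte character. */
size_t gbk_prefix_bytes(const unsigned char *buf, size_t len, size_t nchars)
{
    size_t pos = 0;

    while (nchars > 0 && pos < len) {
        /* A lead byte in 0x81..0xFE means a two-byte character. */
        size_t n = (buf[pos] >= 0x81 && buf[pos] <= 0xFE) ? 2 : 1;
        if (pos + n > len)
            break;  /* incomplete character at the end of the buffer */
        pos += n;
        nchars--;
    }
    return pos;
}
```

One way to use it would be to read a few more bytes than needed (6 is enough for three GBK characters), call `gbk_prefix_bytes(buf, n, 3)`, and terminate the string at the returned offset.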

伊谢尔伦

This depends on the encoding. If the encoding is not known, any software is likely to produce garbled characters when reading the file.

Ty80

...I’m not sure if I’m talking about the same thing as you...
It's just a matter of handling Chinese characters. You can read a length of 6 up front (Chinese or English, 6 is always enough), convert that to an NSString, then call substringToIndex:3 to take the first three characters, and you're done?
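The "read a bit more than you need, then cut at three characters" idea could look roughly like this in C. This is a sketch under the assumption that the file is UTF-8, where one character can take up to 4 bytes, so a 12-byte read is the safe size for three characters (6 bytes covers the GBK case). The name `utf8_prefix_bytes` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: returns the byte length of the first nchars complete
 * UTF-8 characters in buf[0..len). Stops at an invalid lead byte or at a
 * character that is cut off by the end of the buffer. */
size_t utf8_prefix_bytes(const unsigned char *buf, size_t len, size_t nchars)
{
    size_t pos = 0;

    while (nchars > 0 && pos < len) {
        unsigned char b = buf[pos];
        size_t n;
        if (b < 0x80)                n = 1;
        else if ((b & 0xE0) == 0xC0) n = 2;
        else if ((b & 0xF0) == 0xE0) n = 3;
        else if ((b & 0xF8) == 0xF0) n = 4;
        else break;                  /* not a valid lead byte */
        if (pos + n > len)
            break;                   /* last character is incomplete */
        pos += n;
        nchars--;
    }
    return pos;
}
```

In use, something like `got = fread(buf, 1, 12, fp)` followed by `buf[utf8_prefix_bytes(buf, got, 3)] = '\0'` would leave exactly the first three characters in `buf`, assuming `buf` holds at least 13 bytes.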
