Unicode File Handling in the Standard C++ Library
Opening files with the C++ standard library can be particularly challenging in Windows applications when the filenames contain Unicode characters. In this context, "Unicode" usually means UTF-8.
The C++ standard library is not Unicode-aware: char and wchar_t are not required to hold any Unicode encoding. On Windows, wchar_t holds UTF-16, but the standard library has no direct support for UTF-8 filenames (on Windows, char strings are not treated as Unicode).
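To make the gap between the two encodings concrete, here is a minimal sketch of converting UTF-8 bytes into UTF-16 code units, which is the kind of step a UTF-8 filename needs before it can reach a wide-character API on Windows. The function name utf8_to_utf16 is hypothetical and not part of the standard library, and char16_t is used instead of wchar_t so the sketch behaves the same on every platform. It performs only basic validation (it does not reject overlong encodings or encoded surrogates), so treat it as an illustration rather than a production converter.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a standard library function): decode a UTF-8
// byte string into UTF-16 code units. Only basic validation is performed.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary plane: encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

For example, the two UTF-8 bytes 0xC3 0xA9 (the letter é) become the single UTF-16 code unit 0x00E9, while a four-byte sequence such as 0xF0 0x9F 0x98 0x80 becomes a surrogate pair.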
Microsoft's implementation of the standard library (the one shipped with MSVC) provides a file stream constructor that accepts a const wchar_t* filename, so a stream can be opened as follows:
wchar_t const name[] = L"filename.txt";
std::fstream file(name);
However, this overload is not specified by the C++11 standard, which guarantees only the char-based versions. It is also absent from other standard library implementations, such as GCC's libstdc++ for MinGW(-w64), as of g++ 4.8.x.
Note that platform differences affect how encodings are interpreted: char on Windows is not UTF-8, and wchar_t need not be UTF-16 on other operating systems. Portability is therefore a real concern. Opening a stream from a wchar_t filename is not defined by the standard, and specifying the filename in char is problematic because the encoding of char strings varies between operating systems.
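One way around this on newer toolchains, which goes beyond the C++11 situation described above, is the std::filesystem library added in C++17: file streams gained constructors taking a std::filesystem::path, and a path can be built from UTF-8 input. The following is a minimal sketch under that assumption. The helper names path_from_utf8, write_text and read_text are hypothetical, and how well this works on a given Windows toolchain depends on its standard library implementation.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: build a path from UTF-8 bytes held in a std::string.
fs::path path_from_utf8(const std::string& utf8) {
#if defined(__cpp_lib_char8_t)
    // C++20 and later: fs::path treats a char8_t sequence as UTF-8.
    return fs::path(std::u8string(utf8.begin(), utf8.end()));
#else
    // C++17: u8path interprets its argument as UTF-8
    // (it is deprecated from C++20 onward).
    return fs::u8path(utf8);
#endif
}

// Hypothetical helper: write text to the file at p; returns false on failure.
bool write_text(const fs::path& p, const std::string& text) {
    std::ofstream out(p, std::ios::binary);  // C++17 overload taking fs::path
    out << text;
    out.flush();
    return static_cast<bool>(out);
}

// Hypothetical helper: read the whole file at p (empty string if it cannot be opened).
std::string read_text(const fs::path& p) {
    std::ifstream in(p, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A call such as write_text(path_from_utf8("caf\xC3\xA9.txt"), "hello") keeps the filename in UTF-8 in the source and leaves the conversion to the platform's native path encoding to std::filesystem::path, rather than relying on a vendor-specific wchar_t overload.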