I'm working on a cross-platform project using Qt. On Windows, I want to pass Unicode characters (for instance, a file path that contains Chinese characters) as arguments when launching the application from the command line, and then use these arguments to create a QCoreApplication.
For various reasons, I need to use CommandLineToArgvW to get the argument list, like this:
LPWSTR * argvW = CommandLineToArgvW( GetCommandLineW(), &argc );
I understand that on modern Windows, LPWSTR is actually wchar_t*, which is 16 bits wide and uses UTF-16 encoding.
However, the QCoreApplication constructor only takes char*, not wchar_t*.
So the question is: how can I safely convert the LPWSTR strings returned by CommandLineToArgvW() to char* without losing the Unicode content (i.e. so that the Chinese characters are still Chinese characters)?
I've tried many different ways without success:
1:
std::string const argvString = boost::locale::conv::utf_to_utf<char>( argvW[0] );
2:
int res;
char buf[0x400];
char* pbuf = buf;
boost::shared_ptr<char[]> shared_pbuf;
// pcs is the wide (UTF-16) string to convert, presumably one of the argvW[i] entries
res = WideCharToMultiByte(CP_UTF8, 0, pcs, -1, buf, sizeof(buf), NULL, NULL);
3: Convert to QString first, then convert to UTF-8.
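For reference, all three attempts perform the same UTF-16 to UTF-8 transcoding. A minimal hand-written sketch of that transcoding in portable C++ could look like the following. It is only an illustration, not what Boost, Qt, or WideCharToMultiByte do internally. The name utf16ToUtf8 is hypothetical, char16_t is used instead of wchar_t because wchar_t is 16 bits wide only on Windows, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Qt, Boost, or the Windows API):
// encodes a UTF-16 string as UTF-8.
std::string utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine it with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;  // assumption: unpaired surrogate becomes U+FFFD
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;      // stray low surrogate
        }

        // Emit the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Each Chinese character in the Basic Multilingual Plane becomes three bytes in the result, so the char* data is longer than the wide string but carries the same text.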
EDIT:
Problem solved. The UTF-16 to UTF-8 conversion actually works fine with all three approaches. In Visual Studio, to view a UTF-8 string correctly in the debugger, you need to append the s8 format specifier to the watched variable name (see: https://msdn.microsoft.com/en-us/library/75w45ekt.aspx). I overlooked this, which made me think my string conversion was wrong.
The real issue is that when calling QCoreApplication::arguments(), the returned strings are constructed by QString::fromLocal8Bit(), which causes encoding problems on Windows when the command-line arguments contain Unicode characters. The workaround: whenever you need the command-line arguments on Windows, call the Windows API CommandLineToArgvW() and convert the 16-bit UTF-16 wchar_t* (LPWSTR) strings to 8-bit UTF-8 char* (using any of the three approaches above).
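A rough sketch of this workaround, assuming the WideCharToMultiByte route: the helper name utf8Arguments is hypothetical, and the non-Windows branch simply assumes that argv is already in the desired 8-bit encoding. The first WideCharToMultiByte call asks for the required buffer size, so no fixed-size buffer is needed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>  // CommandLineToArgvW
#endif

// Hypothetical helper: returns the command-line arguments as UTF-8 strings.
std::vector<std::string> utf8Arguments(int argc, char** argv)
{
    std::vector<std::string> args;
#ifdef _WIN32
    (void)argc;
    (void)argv;  // ignored on Windows; the wide command line is used instead
    int wideCount = 0;
    LPWSTR* wideArgs = CommandLineToArgvW(GetCommandLineW(), &wideCount);
    if (wideArgs == NULL)
        return args;  // the caller may inspect GetLastError()
    for (int i = 0; i < wideCount; ++i) {
        // First call: required size in bytes, including the terminating NUL.
        int size = WideCharToMultiByte(CP_UTF8, 0, wideArgs[i], -1,
                                       NULL, 0, NULL, NULL);
        std::string utf8;
        if (size > 0) {
            utf8.resize(static_cast<std::size_t>(size));
            // Second call: perform the actual conversion into the buffer.
            WideCharToMultiByte(CP_UTF8, 0, wideArgs[i], -1,
                                &utf8[0], size, NULL, NULL);
            utf8.resize(static_cast<std::size_t>(size) - 1);  // drop the NUL
        }
        args.push_back(utf8);  // an empty string marks a failed conversion
    }
    LocalFree(wideArgs);  // CommandLineToArgvW allocates one block for all entries
#else
    // Assumption: on other platforms argv is already in the desired encoding.
    for (int i = 0; i < argc; ++i)
        args.push_back(argv[i]);
#endif
    return args;
}
```

On the Qt side, each returned string could then be turned into a QString with QString::fromUtf8() instead of going through QCoreApplication::arguments().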
QCoreApplication is successful? That is, you say that you want "the Chinese characters are still Chinese characters". So how do you tell that they no longer are? Show us the code that, given an appropriate conversion function, you would expect to work. - Nicol Bolas
CommandLineToArgvW for you, unless you pass modified arguments to the QCoreApplication constructor. It does not state what exactly "modified" means, but presumably the intent is to just work for ordinary code that just blindly forwards the main arguments, but honor the client code's wish if there is any difference. See doc.qt.io/qt-5/qcoreapplication.html#arguments - Cheers and hth. - Alf
WideCharToMultiByte(CP_UTF8, ... is the canonical way under Windows. You say it "fails". What's the return value, and what's the GetLastError() after that? - dxiv