We need a concrete understanding of character conversions among Windows codepage encodings, UCS2 (or UTF16), and UTF8. With a solid grasp of the following concepts, conversions among these encodings become much more manageable.
6. When we use the MultiByteToWideChar and WideCharToMultiByte functions to convert to and from a Windows codepage, we have to tell them which codepage to use. Both functions take a codepage identifier as their first argument: either an explicit identifier, or CP_ACP for the system default ANSI codepage. To avoid hardcoding an identifier, we first find the system default locale with the GetSystemDefaultLocaleName function, and then set it as the locale of our process with the setlocale or _wsetlocale function.
Note that the locale set with setlocale governs the C runtime conversion functions (such as mbstowcs and wcstombs); it does not replace the codepage argument of MultiByteToWideChar or WideCharToMultiByte. So before we convert to or from a codepage, we either set the locale of our process (or thread) and use the C runtime functions, or we explicitly pass a codepage identifier to the Windows functions. Otherwise the conversion will not succeed.
7. For example, to convert Korean codepage text to UCS2, we use the MultiByteToWideChar function, because a codepage encoding is a multibyte encoding and UCS2 is a wide character encoding. Since the codepage identifier for Korean is 949, we can call
MultiByteToWideChar(949, ...), where 949 identifies the codepage of the source of the conversion. The target of this conversion is always the wide character encoding (UCS2) on Windows, so we need not concern ourselves with the codepage of the target when calling MultiByteToWideChar. Note that codepage identifiers apply only to multibyte strings.
8. To convert UCS2 to Korean codepage text, we use the WideCharToMultiByte function, because the source of this conversion is in the wide character encoding and the target is in a multibyte encoding (such as a codepage or UTF7/UTF8). Since the codepage identifier for Korean is 949, we can call
WideCharToMultiByte(949, ...), where 949 identifies the codepage of the target of the conversion. The source of this conversion is always the wide character encoding (UCS2) on Windows. Again, the codepage identifier applies to the multibyte string.
9. I happen to use codepage 949 (Korean) on my machine. If I hardcode the Korean codepage in my program, it will fail on machines where the default codepage is not 949. Instead, I query the default system locale with the GetSystemDefaultLocaleName() function and set that locale for my program with the setlocale() function, so that it also works on machines whose system locale differs from mine.