A library for converting between multibyte strings (std::string) and wide strings (std::wstring) (supports SJIS, UTF-8, and UTF-16; SJIS⇔UTF-8 conversion is also possible)



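When writing Windows applications in C++, you sometimes need to convert between multibyte strings (std::string) and wide strings (std::wstring). I wrote a header-only library that performs these conversions using the Windows API functions MultiByteToWideChar() and WideCharToMultiByte(). Save the strconv.h listed at the end of this article and use it; it is C++ only and cannot be used from C. The library also provides functions that convert between Shift JIS and UTF-8 (multibyte to multibyte) by going through a wide string internally.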

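[Added 2020/12/15]
To make message output easier, I added the format() function (the counterpart of C's sprintf() or fprintf()). See "Usage example 2 (converting the encoding of a message and printing it)" for how to use it. I expect formatA(), which formats and then converts the output to ANSI (Shift JIS on Japanese Windows), to be the more useful of the two.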

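[Added 2020/12/24]
I added a sample titled "Usage example 3 (converting the encoding of a message and printing it: std::cout-compatible, type-safe approach)". It runs with strconv.h v1.7.0 or later. Because it is compatible with std::cout, it should be easy to use even if you find the sprintf-style format specifiers of Usage example 2 difficult.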

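The library is written to work with both C++11-and-later compilers and pre-C++11 (C++98, C++03) compilers; you do not need to care about the compiler version when using it.
This library is used by many companies and developers.
Compilation and operation have been confirmed with the 32-bit and 64-bit versions of Visual C++ and MinGW.
The attached strconv.h is dual-licensed under the MIT License or Public Domain (Unlicense). See the license section at the end of this article for details. For Public Domain (Unlicense), see also https://ja.wikipedia.org/wiki/Unlicense.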
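
As a rough illustration of how one header can serve both language levels, the sketch below mirrors the pattern used in strconv.h: with C++11 or later the result string is filled in place, and otherwise a std::vector buffer is used. The names fill_from_c_api and copy_via_c_api are hypothetical, and std::memcpy merely stands in for the Windows conversion call. Note that Visual C++ reports __cplusplus as 199711L unless /Zc:__cplusplus is given, so the pre-C++11 branch may be the one that gets compiled there.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Stand-in for a C API that writes n chars into a caller-supplied buffer
// (hypothetical; in strconv.h this role is played by the Windows conversion APIs).
static void fill_from_c_api(char *out, const char *src, std::size_t n)
{
  std::memcpy(out, src, n);
}

static std::string copy_via_c_api(const std::string &src)
{
  std::size_t n = src.size();
#if __cplusplus >= 201103L
  // C++11 and later guarantee contiguous std::string storage,
  // so the C API can write directly into the result.
  std::string result(n, '\0');
  if (n)
    fill_from_c_api(&result[0], src.data(), n);
  return result;
#else
  // Before C++11 that guarantee is absent, so go through a std::vector.
  std::vector<char> buffer(n);
  if (n)
    fill_from_c_api(&buffer[0], src.data(), n);
  return std::string(buffer.begin(), buffer.end());
#endif
}
```

Both branches return the same string; only the intermediate buffer differs.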

Usage example 1 (using the library as-is)

main1.cpp

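#include <windows.h>
#include <cstdio>
#include <string>
#include "strconv.h"

int main(int argc, char **argv)
{
  // UTF-8 source string (with C++20, use U8("あいう") from strconv.h instead of u8"...")
  std::string utf8_str = u8"あいう";
  // length() returns size_t, so cast it to match the %u format specifier
  printf("utf8_str.length()=%u\n", (unsigned)utf8_str.length());
  std::wstring wide_str = utf8_to_wide(utf8_str); // UTF-8 -> UTF-16
  printf("wide_str.length()=%u\n", (unsigned)wide_str.length());
  std::string sjis_str = wide_to_sjis(wide_str); // UTF-16 -> Shift JIS
  printf("sjis_str.length()=%u\n", (unsigned)sjis_str.length());
  printf("sjis_str.c_str()=%s\n", sjis_str.c_str());
  return 0;
}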

Output

>main1.exe
utf8_str.length()=9
wide_str.length()=3
sjis_str.length()=6
sjis_str.c_str()=あいう
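
The three lengths differ because length() counts code units, not characters: each of あ, い, う takes 3 bytes in UTF-8, one 16-bit wchar_t on Windows, and 2 bytes in Shift JIS. The following portable sketch reproduces the first two numbers without the Windows API; utf16_units_for_utf8 is a hypothetical helper, not part of strconv.h, and it assumes well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Counts how many UTF-16 code units a well-formed UTF-8 string needs.
// Assumes valid UTF-8 input; no error checking is done.
static std::size_t utf16_units_for_utf8(const std::string &utf8)
{
  std::size_t units = 0;
  for (std::size_t i = 0; i < utf8.size(); i++)
  {
    unsigned char c = static_cast<unsigned char>(utf8[i]);
    if ((c & 0xC0) == 0x80)
      continue; // continuation byte: belongs to the previous character
    // A 4-byte sequence (lead byte >= 0xF0) needs a surrogate pair.
    units += (c >= 0xF0) ? 2 : 1;
  }
  return units;
}
```

For the UTF-8 bytes of あいう (E3 81 82 E3 81 84 E3 81 86), the string size is 9 and the helper returns 3.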

Usage example 2 (converting the encoding of a message and printing it)

Unlike Usage example 3, formatA supports only Shift JIS on Japanese Windows; after running chcp 65001 the output is garbled. To display Korean or other text with format (that is, to support chcp 65001), write:
unicode_ostream aout(std::cout, GetConsoleOutputCP()); aout << format(u8"漢字=%s", u8"한자") << std::endl;
When combining it with unicode_ostream, use format(), not formatA().

main2.cpp

#include "strconv.h"
#include <iostream>
#include <string>

int main(void)
{
    std::string nameUtf8 = u8"太郎";
    int age = 15;
    // The following three lines are equivalent.
    std::cout << utf8_to_ansi(format(u8"ハロー、私の名前は %s。 年は %d だ!", nameUtf8.c_str(), age)) << std::endl;
    std::cout << formatA(u8"ハロー、私の名前は %s。 年は %d だ!", nameUtf8.c_str(), age) << std::endl;
    formatA(std::cout, u8"ハロー、私の名前は %s。 年は %d だ!\n", nameUtf8.c_str(), age);
    std::wstring nameWide = L"花子";
    age = 23;
    // The following three lines are equivalent.
    std::cout << wide_to_ansi(format(L"ハロー、私の名前は %s。 年は %d だ!", nameWide.c_str(), age)) << std::endl;
    std::cout << formatA(L"ハロー、私の名前は %s。 年は %d だ!", nameWide.c_str(), age) << std::endl;
    formatA(std::cout, L"ハロー、私の名前は %s。 年は %d だ!\n", nameWide.c_str(), age);
    return 0;
}

Output (on Japanese Windows, the output is in Shift JIS)

>main2.exe
ハロー、私の名前は 太郎。 年は 15 だ!
ハロー、私の名前は 太郎。 年は 15 だ!
ハロー、私の名前は 太郎。 年は 15 だ!
ハロー、私の名前は 花子。 年は 23 だ!
ハロー、私の名前は 花子。 年は 23 だ!
ハロー、私の名前は 花子。 年は 23 だ!

Usage example 3 (converting the encoding of a message and printing it: std::cout-compatible, type-safe approach)

main3.cpp

#include "strconv.h"
#include <iostream>
#include <iomanip>
#include <cmath>
#include <ctime> // std::time, std::localtime

using namespace std;

class CMyClass
{
    time_t t;
    int count;

public:
    CMyClass()
    {
        this->t = std::time(nullptr);
        this->count = 0;
    }
    int CountUp()
    {
        return ++this->count;
    }
    friend unicode_ostream &operator<<(unicode_ostream &stream, const CMyClass &value);
};

unicode_ostream &operator<<(unicode_ostream &stream, const CMyClass &value)
{
    struct tm *ptm = std::localtime(&value.t);
    stream << u8"[作成日時=" << std::put_time(ptm, "%Y-%m-%d %H:%M:%S");
    stream << u8"、カウント=" << value.count << "]";
    return stream;
}

int main()
{
#ifdef ANSI_ONLY
    unicode_ostream aout(cout);
#else
    unicode_ostream aout(cout, GetConsoleOutputCP()); // with chcp 65001, "한자" is displayed instead of being garbled into "??"
#endif
    CMyClass mc;
    mc.CountUp();
    mc.CountUp();
    aout << mc << endl;
    double pi = 4 * atan(1.0);
    aout << u8"π(1)=" << pi << endl;
    aout << u8"π(2)=" << format("%.2f", pi) << endl;

    aout << 1 << u8" char*漢字=한자 " << std::string(u8" string漢字=한자 ") << 1.2345 << endl;
    aout << 2 << L" wchar_t*漢字=한자 " << std::wstring(L" wstring漢字=한자 ") << 1.2345 << endl;

    double A = 100;
    double B = 2001.5251;

    // Format settings (A): setbase(16) can be used instead of hex
    aout << hex << left << showbase << nouppercase;
    // Actual output (A)
    aout << (long long)A << endl;

    // Format settings (B): dec can be used instead of setbase(10)
    aout << setbase(10) << right << setw(15)
         << setfill('_') << showpos
         << fixed << setprecision(2);
    // Actual output (B)
    aout << B << endl;

    return 0;
}

When trying chcp 65001, use Command Prompt or Windows Terminal, because the plain PowerShell console does not display kanji or Korean. PowerShell inside Windows Terminal displays kanji and Korean correctly even after chcp 65001.

Output

>chcp 932
>main3.exe
[作成日時=2020-12-25 21:17:18、カウント=2]
π(1)=3.14159
π(2)=3.14
1 char*漢字=??  string漢字=?? 1.2345
2 wchar_t*漢字=??  wstring漢字=?? 1.2345
0x64
_______+2001.53

>chcp 65001
>main3.exe
[作成日時=2020-12-25 21:20:44、カウント=2]
π(1)=3.14159
π(2)=3.14
1 char*漢字=한자  string漢字=한자 1.2345
2 wchar_t*漢字=한자  wstring漢字=한자 1.2345
0x64
_______+2001.53
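
The idea behind unicode_ostream in this example can be sketched without Windows: format each value into a temporary std::ostringstream, convert the resulting text, and forward it to the real stream. The class converting_ostream below is a hypothetical, simplified stand-in in which an arbitrary conversion function replaces utf8_to_cp(); unlike the real class, it does not pass formatting manipulators such as hex or setw through to the wrapped stream. It needs C++11 for std::function.

```cpp
#include <functional>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper: formats each value to text, passes the text through
// a conversion function, and forwards the result to the wrapped stream.
class converting_ostream
{
  std::ostream *m_out;
  std::function<std::string(const std::string &)> m_convert;

public:
  converting_ostream(std::ostream &out,
                     std::function<std::string(const std::string &)> convert)
      : m_out(&out), m_convert(convert) {}

  template <typename T>
  converting_ostream &operator<<(const T &x)
  {
    std::ostringstream oss;
    oss << x;                         // format with the usual operator<<
    (*m_out) << m_convert(oss.str()); // convert, then forward
    return *this;
  }
  // Manipulators such as std::endl are forwarded untouched.
  converting_ostream &operator<<(std::ostream &(*pf)(std::ostream &))
  {
    (*m_out) << pf;
    return *this;
  }
};
```

With a conversion that wraps text in brackets, writing "abc", then 12, then std::endl sends [abc][12] followed by a newline to the wrapped stream.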

Function list

std::wstring ansi_to_wide(const std::string &s)

Converts from the encoding of the system locale's default code page to a wide string. When the system locale is Japan/Japanese, the result is the same as sjis_to_wide().

std::string wide_to_ansi(const std::wstring &s)

Converts from a wide string to the encoding of the system locale's default code page. When the system locale is Japan/Japanese, the result is the same as wide_to_sjis().

std::wstring sjis_to_wide(const std::string &s)

Converts from Shift JIS (code page 932) to a wide string.

std::string wide_to_sjis(const std::wstring &s)

Converts from a wide string to Shift JIS (code page 932).

std::wstring utf8_to_wide(const std::string &s)

Converts from UTF-8 to a wide string.

std::string wide_to_utf8(const std::wstring &s)

Converts from a wide string to UTF-8.

std::string ansi_to_utf8(const std::string &s)

Converts from the encoding of the system locale's default code page to UTF-8. When the system locale is Japan/Japanese, the result is the same as sjis_to_utf8().

std::string utf8_to_ansi(const std::string &s)

Converts from UTF-8 to the encoding of the system locale's default code page. When the system locale is Japan/Japanese, the result is the same as utf8_to_sjis().

std::string sjis_to_utf8(const std::string &s)

Converts from Shift JIS (code page 932) to UTF-8.

std::string utf8_to_sjis(const std::string &s)

Converts from UTF-8 to Shift JIS (code page 932).

std::string format(const char *format, ...)

Produces a std::string according to printf() format specifiers, much like sprintf(). (It mainly expects UTF-8 input, but it can also be used with Shift JIS.)

std::wstring format(const wchar_t *format, ...)

Produces a std::wstring according to wprintf() format specifiers, like a wide version of sprintf(). (This overload is for wide strings.)

std::string formatA(const char *format, ...)

Converts the output of the format() function above to ANSI (Shift JIS on Japanese Windows) and returns it. The result can be written to std::cout as-is. (The input is expected to be UTF-8, because utf8_to_ansi() is applied to it.)

    // The following three lines are equivalent.
    std::cout << utf8_to_ansi(format(u8"ハロー、私の名前は %s。 年は %d だ!", nameUtf8.c_str(), age)) << std::endl;
    std::cout << formatA(u8"ハロー、私の名前は %s。 年は %d だ!", nameUtf8.c_str(), age) << std::endl;
    formatA(std::cout, u8"ハロー、私の名前は %s。 年は %d だ!\n", nameUtf8.c_str(), age);

std::string formatA(const wchar_t *format, ...)

Converts the output of the format() function above to ANSI (Shift JIS on Japanese Windows) and returns it. The result can be written to std::cout as-is. (This overload takes a wide string as input.)

    // The following three lines are equivalent.
    std::cout << wide_to_ansi(format(L"ハロー、私の名前は %s。 年は %d だ!", nameWide.c_str(), age)) << std::endl;
    std::cout << formatA(L"ハロー、私の名前は %s。 年は %d だ!", nameWide.c_str(), age) << std::endl;
    formatA(std::cout, L"ハロー、私の名前は %s。 年は %d だ!\n", nameWide.c_str(), age);
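
A sprintf-like function that returns a std::string, as format() does, is commonly built in two passes: measure the required length, then write into a buffer of that size. The sketch below shows this with the standard std::vsnprintf and va_copy (C++11); it is not the code in strconv.h, which relies on _vsnprintf and _vsnwprintf, and format_sketch is a hypothetical name.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical sprintf-like helper returning std::string (needs C++11 for va_copy).
static std::string format_sketch(const char *fmt, ...)
{
  va_list args;
  va_start(args, fmt);
  va_list args_copy;
  va_copy(args_copy, args); // the measuring pass consumes args
  int len = std::vsnprintf(nullptr, 0, fmt, args); // pass 1: measure
  va_end(args);
  if (len < 0)
  {
    va_end(args_copy);
    return std::string(); // formatting error
  }
  std::vector<char> buffer(static_cast<std::size_t>(len) + 1);
  std::vsnprintf(&buffer[0], buffer.size(), fmt, args_copy); // pass 2: write
  va_end(args_copy);
  return std::string(&buffer[0], static_cast<std::size_t>(len));
}
```

For example, format_sketch("%s is %d", "Taro", 15) returns "Taro is 15".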

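Attached file (strconv.h)

2020/12/15 06:26: I revised strconv.h for the first time in a while. The change adds a sprintf()-like format() function (a classic approach) to support message output, which had been requested often. Since the output of format() is converted to the ANSI code page in most cases, I considered doing that in the same step, but for now decided to leave the encoding conversion as a separate (manual) step. Addendum: I also added formatA(), which goes all the way to the ANSI conversion in one call, because it looked convenient.
The code (strconv.h) is also available at https://github.com/javacommons/strconv. The latest code is pasted in this article.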

strconv.h

/* strconv.h v1.8.10               */
/* Last Modified: 2021/08/30 21:53 */
#ifndef STRCONV_H
#define STRCONV_H

#include <windows.h>
#include <cstdarg> // va_list, va_start, va_end
#include <string>
#include <vector>
#include <iostream>
#include <sstream>

#if __cplusplus >= 201103L && !defined(STRCONV_CPP98)
static inline std::wstring cp_to_wide(const std::string &s, UINT codepage)
{
  int in_length = (int)s.length();
  int out_length = MultiByteToWideChar(codepage, 0, s.c_str(), in_length, 0, 0);
  std::wstring result(out_length, L'\0');
  if (out_length)
    MultiByteToWideChar(codepage, 0, s.c_str(), in_length, &result[0], out_length);
  return result;
}
static inline std::string wide_to_cp(const std::wstring &s, UINT codepage)
{
  int in_length = (int)s.length();
  int out_length = WideCharToMultiByte(codepage, 0, s.c_str(), in_length, 0, 0, 0, 0);
  std::string result(out_length, '\0');
  if (out_length)
    WideCharToMultiByte(codepage, 0, s.c_str(), in_length, &result[0], out_length, 0, 0);
  return result;
}
#else /* __cplusplus < 201103L */
static inline std::wstring cp_to_wide(const std::string &s, UINT codepage)
{
  int in_length = (int)s.length();
  int out_length = MultiByteToWideChar(codepage, 0, s.c_str(), in_length, 0, 0);
  std::vector<wchar_t> buffer(out_length);
  if (out_length)
    MultiByteToWideChar(codepage, 0, s.c_str(), in_length, &buffer[0], out_length);
  std::wstring result(buffer.begin(), buffer.end());
  return result;
}
static inline std::string wide_to_cp(const std::wstring &s, UINT codepage)
{
  int in_length = (int)s.length();
  int out_length = WideCharToMultiByte(codepage, 0, s.c_str(), in_length, 0, 0, 0, 0);
  std::vector<char> buffer(out_length);
  if (out_length)
    WideCharToMultiByte(codepage, 0, s.c_str(), in_length, &buffer[0], out_length, 0, 0);
  std::string result(buffer.begin(), buffer.end());
  return result;
}
#endif

static inline std::string cp_to_utf8(const std::string &s, UINT codepage)
{
  if (codepage == CP_UTF8)
    return s;
  std::wstring wide = cp_to_wide(s, codepage);
  return wide_to_cp(wide, CP_UTF8);
}
static inline std::string utf8_to_cp(const std::string &s, UINT codepage)
{
  if (codepage == CP_UTF8)
    return s;
  std::wstring wide = cp_to_wide(s, CP_UTF8);
  return wide_to_cp(wide, codepage);
}

static inline std::wstring ansi_to_wide(const std::string &s)
{
  return cp_to_wide(s, CP_ACP);
}
static inline std::string wide_to_ansi(const std::wstring &s)
{
  return wide_to_cp(s, CP_ACP);
}

static inline std::wstring sjis_to_wide(const std::string &s)
{
  return cp_to_wide(s, 932);
}
static inline std::string wide_to_sjis(const std::wstring &s)
{
  return wide_to_cp(s, 932);
}

static inline std::wstring utf8_to_wide(const std::string &s)
{
  return cp_to_wide(s, CP_UTF8);
}
static inline std::string wide_to_utf8(const std::wstring &s)
{
  return wide_to_cp(s, CP_UTF8);
}

static inline std::string ansi_to_utf8(const std::string &s)
{
  return cp_to_utf8(s, CP_ACP);
}
static inline std::string utf8_to_ansi(const std::string &s)
{
  return utf8_to_cp(s, CP_ACP);
}

static inline std::string sjis_to_utf8(const std::string &s)
{
  return cp_to_utf8(s, 932);
}
static inline std::string utf8_to_sjis(const std::string &s)
{
  return utf8_to_cp(s, 932);
}

#ifdef __cpp_char8_t
static inline std::u8string utf8_to_char8(const std::string &s)
{
  return std::u8string(s.begin(), s.end());
}
static inline std::string char8_to_utf8(const std::u8string &s)
{
  return std::string(s.begin(), s.end());
}

static inline std::wstring char8_to_wide(const std::u8string &s)
{
  return cp_to_wide(char8_to_utf8(s), CP_UTF8);
}
static inline std::u8string wide_to_char8(const std::wstring &s)
{
  return utf8_to_char8(wide_to_cp(s, CP_UTF8));
}

static inline std::u8string cp_to_char8(const std::string &s, UINT codepage)
{
  return utf8_to_char8(cp_to_utf8(s, codepage));
}
static inline std::string char8_to_cp(const std::u8string &s, UINT codepage)
{
  return utf8_to_cp(char8_to_utf8(s), codepage);
}

static inline std::u8string ansi_to_char8(const std::string &s)
{
  return cp_to_char8(s, CP_ACP);
}
static inline std::string char8_to_ansi(const std::u8string &s)
{
  return char8_to_cp(s, CP_ACP);
}

static inline std::u8string sjis_to_char8(const std::string &s)
{
  return cp_to_char8(s, 932);
}
static inline std::string char8_to_sjis(const std::u8string &s)
{
  return char8_to_cp(s, 932);
}
#endif

#if defined(_MSC_VER)
#pragma warning(push)
#pragma warning(disable : 4996)
#endif

static inline std::wstring vformat(const wchar_t *format, va_list args)
{
  int len = _vsnwprintf(0, 0, format, args);
  if (len < 0)
    return L"";
  std::vector<wchar_t> buffer(len + 1);
  len = _vsnwprintf(&buffer[0], len, format, args);
  if (len < 0)
    return L"";
  buffer[len] = L'\0';
  return &buffer[0];
}
static inline std::string vformat(const char *format, va_list args)
{
  int len = _vsnprintf(0, 0, format, args);
  if (len < 0)
    return "";
  std::vector<char> buffer(len + 1);
  len = _vsnprintf(&buffer[0], len, format, args);
  if (len < 0)
    return "";
  buffer[len] = '\0';
  return &buffer[0];
}
#ifdef __cpp_char8_t
static inline std::u8string vformat(const char8_t *format, va_list args)
{
  int len = _vsnprintf(0, 0, (const char *)format, args);
  if (len < 0)
    return u8"";
  std::vector<char> buffer(len + 1);
  len = _vsnprintf(&buffer[0], len, (const char *)format, args);
  if (len < 0)
    return u8"";
  buffer[len] = '\0';
  return (char8_t *)&buffer[0];
}
#endif

#if defined(_MSC_VER)
#pragma warning(pop)
#endif

static inline std::wstring format(const wchar_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::wstring s = vformat(format, args);
  va_end(args);
  return s;
}
static inline std::string format(const char *format, ...)
{
  va_list args;
  va_start(args, format);
  std::string s = vformat(format, args);
  va_end(args);
  return s;
}
#ifdef __cpp_char8_t
static inline std::u8string format(const char8_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::u8string s = vformat(format, args);
  va_end(args);
  return s;
}
#endif

static inline void format(std::ostream &ostrm, const wchar_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::wstring s = vformat(format, args);
  va_end(args);
  ostrm << wide_to_utf8(s) << std::flush;
}
static inline void format(std::ostream &ostrm, const char *format, ...)
{
  va_list args;
  va_start(args, format);
  std::string s = vformat(format, args);
  va_end(args);
  ostrm << s << std::flush;
}
#ifdef __cpp_char8_t
static inline void format(std::ostream &ostrm, const char8_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::u8string s = vformat(format, args);
  va_end(args);
  ostrm << char8_to_utf8(s) << std::flush;
}
#endif

static inline std::string formatA(const wchar_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::wstring s = vformat(format, args);
  va_end(args);
  return wide_to_ansi(s);
}
static inline std::string formatA(const char *format, ...)
{
  va_list args;
  va_start(args, format);
  std::string s = vformat(format, args);
  va_end(args);
  return utf8_to_ansi(s);
}
#ifdef __cpp_char8_t
static inline std::string formatA(const char8_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::u8string s = vformat(format, args);
  va_end(args);
  return char8_to_ansi(s);
}
#endif

static inline void formatA(std::ostream &ostrm, const wchar_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::wstring s = vformat(format, args);
  va_end(args);
  ostrm << wide_to_ansi(s) << std::flush;
}
static inline void formatA(std::ostream &ostrm, const char *format, ...)
{
  va_list args;
  va_start(args, format);
  std::string s = vformat(format, args);
  va_end(args);
  ostrm << utf8_to_ansi(s) << std::flush;
}
#ifdef __cpp_char8_t
static inline void formatA(std::ostream &ostrm, const char8_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::u8string s = vformat(format, args);
  va_end(args);
  ostrm << char8_to_ansi(s) << std::flush;
}
#endif

static inline void dbgmsg(const wchar_t *title, const wchar_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::wstring s = vformat(format, args);
  va_end(args);
  MessageBoxW(0, s.c_str(), title, MB_OK);
}
static inline void dbgmsg(const char *title, const char *format, ...)
{
  va_list args;
  va_start(args, format);
  std::string s = vformat(format, args);
  va_end(args);
  MessageBoxW(0, utf8_to_wide(s).c_str(), utf8_to_wide(title).c_str(), MB_OK);
}
#ifdef __cpp_char8_t
static inline void dbgmsg(const char8_t *title, const char8_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::u8string s = vformat(format, args);
  va_end(args);
  MessageBoxW(0, char8_to_wide(s).c_str(), char8_to_wide(title).c_str(), MB_OK);
}
#endif

static inline HANDLE handle_for_ostream(std::ostream &ostrm)
{
  if (&ostrm == &std::cout)
  {
    return GetStdHandle(STD_OUTPUT_HANDLE);
  }
  else if (&ostrm == &std::cerr)
  {
    return GetStdHandle(STD_ERROR_HANDLE);
  }
  return INVALID_HANDLE_VALUE;
}
static inline void dbgout(std::ostream &ostrm, const wchar_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::wstring ws = vformat(format, args);
  va_end(args);
  HANDLE h = handle_for_ostream(ostrm);
  if (h == INVALID_HANDLE_VALUE)
  {
    return;
  }
  DWORD dwNumberOfCharsWrite;
  if (GetFileType(h) != FILE_TYPE_CHAR)
  {
    std::string s = wide_to_cp(ws, GetConsoleOutputCP());
    WriteFile(h, s.c_str(), (DWORD)s.size(), &dwNumberOfCharsWrite, NULL);
  }
  else
  {
    WriteConsoleW(h,
                  ws.c_str(),
                  (DWORD)ws.size(),
                  &dwNumberOfCharsWrite,
                  NULL);
  }
}
static inline void dbgout(std::ostream &ostrm, const char *format, ...)
{
  va_list args;
  va_start(args, format);
  std::string s = vformat(format, args);
  va_end(args);
  HANDLE h = handle_for_ostream(ostrm);
  if (h == INVALID_HANDLE_VALUE)
  {
    return;
  }
  DWORD dwNumberOfCharsWrite;
  if (GetFileType(h) != FILE_TYPE_CHAR)
  {
    s = utf8_to_cp(s, GetConsoleOutputCP());
    WriteFile(h, s.c_str(), (DWORD)s.size(), &dwNumberOfCharsWrite, NULL);
  }
  else
  {
    std::wstring ws = utf8_to_wide(s);
    WriteConsoleW(h,
                  ws.c_str(),
                  (DWORD)ws.size(),
                  &dwNumberOfCharsWrite,
                  NULL);
  }
}
#ifdef __cpp_char8_t
static inline void dbgout(std::ostream &ostrm, const char8_t *format, ...)
{
  va_list args;
  va_start(args, format);
  std::u8string s = vformat(format, args);
  va_end(args);
  HANDLE h = handle_for_ostream(ostrm);
  if (h == INVALID_HANDLE_VALUE)
  {
    return;
  }
  DWORD dwNumberOfCharsWrite;
  if (GetFileType(h) != FILE_TYPE_CHAR)
  {
    std::string str = char8_to_cp(s, GetConsoleOutputCP());
    WriteFile(h, (const char *)str.c_str(), (DWORD)str.size(), &dwNumberOfCharsWrite, NULL);
  }
  else
  {
    std::wstring ws = char8_to_wide(s);
    WriteConsoleW(h,
                  ws.c_str(),
                  (DWORD)ws.size(),
                  &dwNumberOfCharsWrite,
                  NULL);
  }
}
#endif

class unicode_ostream
{
private:
  std::ostream *m_ostrm;
  UINT m_target_cp;
  bool is_ascii(const std::string &s)
  {
    for (std::size_t i = 0; i < s.size(); i++)
    {
      unsigned char c = (unsigned char)s[i];
      if (c > 0x7f)
        return false;
    }
    return true;
  }

public:
  unicode_ostream(std::ostream &ostrm, UINT target_cp = CP_ACP) : m_ostrm(&ostrm), m_target_cp(target_cp) {}
  std::ostream &stream() { return *m_ostrm; }
  void stream(std::ostream &ostrm) { m_ostrm = &ostrm; }
  UINT target_cp() { return m_target_cp; }
  void target_cp(UINT cp) { m_target_cp = cp; }
  template <typename T>
  unicode_ostream &operator<<(const T &x)
  {
    std::ostringstream oss;
    oss << x;
    std::string output = oss.str();
    if (is_ascii(output))
    {
      (*m_ostrm) << x;
    }
    else
    {
      (*m_ostrm) << utf8_to_cp(output, m_target_cp);
    }
    return *this;
  }
  unicode_ostream &operator<<(const std::wstring &x)
  {
    (*m_ostrm) << wide_to_cp(x, m_target_cp);
    return *this;
  }
  unicode_ostream &operator<<(const wchar_t *x)
  {
    (*m_ostrm) << wide_to_cp(x, m_target_cp);
    return *this;
  }
  unicode_ostream &operator<<(const std::string &x)
  {
    (*m_ostrm) << utf8_to_cp(x, m_target_cp);
    return *this;
  }
  unicode_ostream &operator<<(const char *x)
  {
    (*m_ostrm) << utf8_to_cp(x, m_target_cp);
    return *this;
  }
#ifdef __cpp_char8_t
  unicode_ostream &operator<<(const std::u8string &x)
  {
    (*m_ostrm) << char8_to_cp(x, m_target_cp);
    return *this;
  }
  unicode_ostream &operator<<(const char8_t *x)
  {
    (*m_ostrm) << char8_to_cp(x, m_target_cp);
    return *this;
  }
#endif
  unicode_ostream &operator<<(std::ostream &(*pf)(std::ostream &)) // For manipulators...
  {
    (*m_ostrm) << pf;
    return *this;
  }
  unicode_ostream &operator<<(std::basic_ios<char> &(*pf)(std::basic_ios<char> &)) // For manipulators...
  {
    (*m_ostrm) << pf;
    return *this;
  }
};

#define U8(X) ((const char *)u8##X)
#define WIDE(X) (L##X)

#endif /* STRCONV_H */
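
The U8 macro defined at the end of the header is useful because, from C++20 on, a u8"..." literal has type const char8_t[] and can no longer initialize a std::string directly. The sketch below shows the same cast under the hypothetical name U8_SKETCH, using universal character names so that the result does not depend on the source file encoding.

```cpp
#include <string>

// Same shape as the U8 macro in strconv.h, under a hypothetical name.
#define U8_SKETCH(X) ((const char *)u8##X)

// "\u3042\u3044\u3046" is あいう written with universal character names.
static std::string make_utf8_aiu()
{
  return U8_SKETCH("\u3042\u3044\u3046");
}
```

The returned string holds the nine UTF-8 bytes of あいう whether the code is compiled as C++17 or C++20.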

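Supporting code pages not covered by the function list

By passing a code page from the table below as the UINT codepage argument of the following functions, you can handle a wide range of encodings (including Japanese EUC and JIS). A reasonable approach is to write wrapper functions around them, using the actual definitions of the functions in the function list as a reference. If you find it convenient, you may append such functions to the end of strconv.h. The sjis_to_*() and *_to_sjis() functions, which pass 932 as UINT codepage, are good models; 932 appears in the code page table below as the Japanese Shift JIS entry.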

std::wstring cp_to_wide(const std::string &s, UINT codepage)
std::string wide_to_cp(const std::wstring &s, UINT codepage)
std::string cp_to_utf8(const std::string &s, UINT codepage)
std::string utf8_to_cp(const std::string &s, UINT codepage)

(Code page list)

Code page	Description
37	IBM EBCDIC (US-Canada)
437	OEM United States
500	IBM EBCDIC (International)
708	Arabic (ASMO 708)
720	Arabic (DOS)
737	Greek (DOS)
775	Baltic (DOS)
850	Western European (DOS)
852	Central European (DOS)
855	OEM Cyrillic
857	Turkish (DOS)
858	OEM Multilingual Latin I
860	Portuguese (DOS)
861	Icelandic (DOS)
862	Hebrew (DOS)
863	French Canadian (DOS)
864	Arabic (864)
865	Nordic (DOS)
866	Cyrillic (DOS)
869	Greek, Modern (DOS)
870	IBM EBCDIC (Multilingual Latin 2)
874	Thai (Windows)
875	IBM EBCDIC (Greek Modern)
932	Japanese (Shift JIS)
936	Chinese Simplified (GB2312)
949	Korean
950	Chinese Traditional (Big5)
1026	IBM EBCDIC (Turkish Latin 5)
1047	IBM Latin-1
1140	IBM EBCDIC (US-Canada-Europe)
1141	IBM EBCDIC (Germany-Europe)
1142	IBM EBCDIC (Denmark-Norway-Europe)
1143	IBM EBCDIC (Finland-Sweden-Europe)
1144	IBM EBCDIC (Italy-Europe)
1145	IBM EBCDIC (Spain-Europe)
1146	IBM EBCDIC (UK-Europe)
1147	IBM EBCDIC (France-Europe)
1148	IBM EBCDIC (International-Europe)
1149	IBM EBCDIC (Icelandic-Europe)
1200	Unicode
1201	Unicode (Big-Endian)
1250	Central European (Windows)
1251	Cyrillic (Windows)
1252	Western European (Windows)
1253	Greek (Windows)
1254	Turkish (Windows)
1255	Hebrew (Windows)
1256	Arabic (Windows)
1257	Baltic (Windows)
1258	Vietnamese (Windows)
1361	Korean (Johab)
10000	Western European (Mac)
10001	Japanese (Mac)
10002	Chinese Traditional (Mac)
10003	Korean (Mac)
10004	Arabic (Mac)
10005	Hebrew (Mac)
10006	Greek (Mac)
10007	Cyrillic (Mac)
10008	Chinese Simplified (Mac)
10010	Romanian (Mac)
10017	Ukrainian (Mac)
10021	Thai (Mac)
10029	Central European (Mac)
10079	Icelandic (Mac)
10081	Turkish (Mac)
10082	Croatian (Mac)
20000	Chinese Traditional (CNS)
20001	TCA Taiwan
20002	Chinese Traditional (Eten)
20003	IBM5550 Taiwan
20004	TeleText Taiwan
20005	Wang Taiwan
20105	Western European (IA5)
20106	German (IA5)
20107	Swedish (IA5)
20108	Norwegian (IA5)
20127	US-ASCII
20261	T.61
20269	ISO-6937
20273	IBM EBCDIC (Germany)
20277	IBM EBCDIC (Denmark-Norway)
20278	IBM EBCDIC (Finland-Sweden)
20280	IBM EBCDIC (Italy)
20284	IBM EBCDIC (Spain)
20285	IBM EBCDIC (UK)
20290	IBM EBCDIC (Japanese Katakana)
20297	IBM EBCDIC (France)
20420	IBM EBCDIC (Arabic)
20423	IBM EBCDIC (Greek)
20424	IBM EBCDIC (Hebrew)
20833	IBM EBCDIC (Korean Extended)
20838	IBM EBCDIC (Thai)
20866	Cyrillic (KOI8-R)
20871	IBM EBCDIC (Icelandic)
20880	IBM EBCDIC (Cyrillic - Russian)
20905	IBM EBCDIC (Turkish)
20924	IBM Latin-1
20932	Japanese (JIS 0208-1990 and 0212-1990)
20936	Chinese Simplified (GB2312-80)
20949	Korean Wansung
21025	IBM EBCDIC (Cyrillic Serbian - Bulgarian)
21027	Extended alphabet lowercase
21866	Cyrillic (KOI8-U)
28591	Western European (ISO)
28592	Central European (ISO)
28593	Latin 3 (ISO)
28594	Baltic (ISO)
28595	Cyrillic (ISO)
28596	Arabic (ISO)
28597	Greek (ISO)
28598	Hebrew (ISO-Visual)
28599	Turkish (ISO)
28603	Lithuanian (ISO)
28605	Latin 9 (ISO)
29001	Europa
38598	Hebrew (ISO-Logical)
50000	User Defined
50001	Auto-Select
50220	Japanese (JIS)
50221	Japanese (JIS, 1-byte Katakana allowed)
50222	Japanese (JIS, 1-byte Katakana allowed - SO/SI)
50225	Korean (ISO)
50227	Chinese Simplified (ISO-2022)
50229	Chinese Traditional (ISO-2022)
50930	IBM EBCDIC (Japanese and Japanese Katakana)
50931	IBM EBCDIC (Japanese and US-Canada)
50932	Japanese (Auto-Select)
50933	IBM EBCDIC (Korean and Korean Extended)
50935	IBM EBCDIC (Chinese Simplified)
50936	Chinese Simplified (Auto-Select)
50937	IBM EBCDIC (Chinese Traditional)
50939	IBM EBCDIC (Japanese and Japanese-Latin)
50949	Korean (Auto-Select)
50950	Chinese Traditional (Auto-Select)
51251	Cyrillic (Auto-Select)
51253	Greek (Auto-Select)
51256	Arabic (Auto-Select)
51932	Japanese (EUC)
51936	Chinese Simplified (EUC)
51949	Korean (EUC)
52936	Chinese Simplified (HZ)
54936	Chinese Simplified (GB18030)
57002	ISCII Devanagari
57003	ISCII Bengali
57004	ISCII Tamil
57005	ISCII Telugu
57006	ISCII Assamese
57007	ISCII Oriya
57008	ISCII Kannada
57009	ISCII Malayalam
57010	ISCII Gujarati
57011	ISCII Punjabi
65000	Unicode (UTF-7)
65001	Unicode (UTF-8)

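License of this source code (MIT License or Public Domain)

For Public Domain (Unlicense), see also https://ja.wikipedia.org/wiki/Unlicense.

The Unlicense is a license very close to the public domain, with an emphasis on opposing copyright law. It was first proposed on January 1, 2010 (Public Domain Day). The Unlicense provides a means of waiving rights as public domain, and can also be seen as a very permissive license that requires no attribution. In 2015, GitHub reported that about 100,000 projects, or 2 percent of the nearly 5 million licensed projects on github.com, had adopted the Unlicense.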

------------------------------------------------------------------------------
This software is available under 2 licenses -- choose whichever you prefer.
------------------------------------------------------------------------------
ALTERNATIVE A - MIT License
Copyright (c) 2019-2021 JavaCommons
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
------------------------------------------------------------------------------
ALTERNATIVE B - Public Domain (www.unlicense.org)
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
software, either in source code form or as a compiled binary, for any purpose,
commercial or non-commercial, and by any means.
In jurisdictions that recognize copyright laws, the author or authors of this
software dedicate any and all copyright interest in the software to the public
domain. We make this dedication for the benefit of the public at large and to
the detriment of our heirs and successors. We intend this dedication to be an
overt act of relinquishment in perpetuity of all present and future rights to
this software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-----------------------------------------------------------------------------

Author And Source

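Source: the original article, "A library for converting between multibyte strings (std::string) and wide strings (std::wstring) (supports SJIS, UTF-8, and UTF-16; SJIS⇔UTF-8 conversion is also possible)", is at https://qiita.com/javacommons/items/9ea0c8fd43b61b01a8da

Author attribution: information about the original author is available at the original URL. Copyright belongs to the original author.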
