Java methods for converting between strings and Unicode escapes

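Put simply, Unicode encoding identifies each character with a number. In Java, a char is a 16-bit value (one UTF-16 code unit), and it is usually written as four hexadecimal digits rather than sixteen binary digits.
Examples:
1) The Unicode form of the Chinese string "你好" is \u4f60\u597d;
2) The Unicode form of the English string "ab" is \u0061\u0062.
Here \u marks what follows as a Unicode code, and the four hexadecimal digits after it are the code of the corresponding character.
Unicode codes are widely used in J2EE projects, and Java supports them well. Internationalization, for example, is a classic use of Unicode.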
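As an illustration of the internationalization case, here is a minimal sketch showing that `java.util.Properties` decodes \uXXXX escapes when it loads text. The class name `PropertiesEscapeDemo` and the key `greeting` are hypothetical and only serve the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesEscapeDemo {
    // Loads one hypothetical key/value pair whose value is written with Unicode escapes.
    static String loadGreeting() {
        Properties props = new Properties();
        try {
            // The doubled backslashes keep the escapes literal in the Java string,
            // so Properties.load (not the compiler) is what decodes them.
            props.load(new StringReader("greeting=\\u4f60\\u597d"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) {
        String greeting = loadGreeting();
        System.out.println(greeting.length());                  // 2
        System.out.println((int) greeting.charAt(0) == 0x4f60); // true
    }
}
```

The loaded value holds two real characters, not twelve escape characters, which is why escaped resource files can carry non-ASCII text.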
So what exactly is the Unicode encoding rule, and how can it be implemented in a program?
1. The Unicode encoding rule

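In this form, each character is written as four hexadecimal digits. The rule is: take the high 8 bits and the low 8 bits of one character (char) separately and convert each to hexadecimal.
If a converted value is shorter than two digits, pad it with a leading 0. Then append the low-byte digits to the high-byte digits and put \u in front.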
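The rule above can be sketched for a single char as follows. The class and method names (`HighLowByteDemo`, `twoHexDigits`, `escape`) are made up for this illustration.

```java
class HighLowByteDemo {
    // Formats one byte value (0..255) as exactly two lowercase hex digits.
    static String twoHexDigits(int b) {
        String hex = Integer.toHexString(b);
        return hex.length() == 1 ? "0" + hex : hex; // pad in front, not behind
    }

    // Builds the six-character escape for a single char.
    static String escape(char c) {
        int high = c >>> 8;  // upper 8 bits
        int low = c & 0xFF;  // lower 8 bits
        return "\\u" + twoHexDigits(high) + twoHexDigits(low);
    }

    public static void main(String[] args) {
        System.out.println(escape('a'));      // high byte 0x00, low byte 0x61
        System.out.println(escape((char) 0x597d)); // high byte 0x59, low byte 0x7d
    }
}
```

For 'a' the high byte is 0, so it is the leading-zero padding that turns "0" + "61" into the four digits 0061.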

2. Conversion code

1) String to Unicode

/**
 * Converts a string to Unicode escapes.
 * @param str the string to convert
 * @return the Unicode-escaped string
 */
public String convert(String str)
{
    str = (str == null ? "" : str);
    String tmp;
    StringBuilder sb = new StringBuilder(str.length() * 6);
    char c;
    int i, j;
    for (i = 0; i < str.length(); i++)
    {
        c = str.charAt(i);
        sb.append("\\u");
        j = (c >>> 8); // take the high 8 bits
        tmp = Integer.toHexString(j);
        if (tmp.length() == 1)
            sb.append("0"); // pad to two digits with a leading zero
        sb.append(tmp);
        j = (c & 0xFF); // take the low 8 bits
        tmp = Integer.toHexString(j);
        if (tmp.length() == 1)
            sb.append("0");
        sb.append(tmp);
    }
    return sb.toString();
}
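One consequence of converting char by char: a character outside the Basic Multilingual Plane is stored as two chars (a surrogate pair), so it comes out as two escapes. The sketch below is only an illustration; `SurrogateEscapeDemo` and `codeUnitsHex` are hypothetical names.

```java
class SurrogateEscapeDemo {
    // Returns the hex value of every UTF-16 code unit in s, separated by spaces.
    static String codeUnitsHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the Basic Multilingual Plane.
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println(emoji.length());                          // 2
        System.out.println(codeUnitsHex(emoji));                     // d83d de00
        System.out.println(emoji.codePointCount(0, emoji.length())); // 1
    }
}
```

Decoding the two escapes in order restores the same pair, so the round trip still works.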
2) Unicode to string: perform the steps above in reverse

/**
 * Converts a Unicode-escaped string back to a normal string.
 * @param str the escaped string to convert
 * @return the normal string
 */
public String revert(String str)
{
    str = (str == null ? "" : str);
    if (str.indexOf("\\u") == -1) // not Unicode-escaped: return as is
        return str;
    StringBuilder sb = new StringBuilder(str.length() / 6);
    // Assumes the whole input is a sequence of six-character escape groups.
    for (int i = 0; i + 6 <= str.length(); i += 6)
    {
        String value = str.substring(i + 2, i + 6); // the four hex digits
        int c = 0;
        for (int j = 0; j < value.length(); j++)
        {
            // Character.digit accepts 0-9, a-f and A-F
            c = c * 16 + Character.digit(value.charAt(j), 16);
        }
        sb.append((char) c);
    }
    return sb.toString();
}
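The method above expects its input to be nothing but six-character escape groups. For input that mixes plain text and escapes, one possible alternative is a regular-expression decoder. This is a sketch of a different technique, not the author's method, and `EscapeDecoder` and `decode` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDecoder {
    // Matches a backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String decode(String s) {
        if (s == null) {
            return "";
        }
        Matcher m = ESCAPE.matcher(s);
        StringBuffer out = new StringBuffer(s.length());
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps a decoded '$' or backslash from being
            // interpreted as a group reference.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("ab\\u0063\\u0064")); // abcd
        System.out.println(decode("no escapes here"));  // no escapes here
    }
}
```

Text that does not match the pattern is copied through unchanged, so a string without escapes comes back as it went in.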
Method 2:

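After Java is installed, the JDK's bin directory contains native2ascii.exe (in JDK 8 and earlier), which performs this kind of conversion, but the same thing can also be done in Java code.
Java code for string to Unicode: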

/**
 * String to Unicode
 */
public static String string2Unicode(String string) {

    StringBuffer unicode = new StringBuffer();

    for (int i = 0; i < string.length(); i++) {

        // take each character
        char c = string.charAt(i);

        // convert to Unicode; toHexString does not zero-pad, so a value
        // below 0x1000 produces fewer than four hex digits
        unicode.append("\\u" + Integer.toHexString(c));
    }

    return unicode.toString();
}
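Because `Integer.toHexString` does not pad, string2Unicode writes ':' with only two hex digits, which is not a valid four-digit Java escape. If fixed-width output is wanted, a padded variant might look like the following sketch. `PaddedEscaper` and `string2UnicodePadded` are hypothetical names.

```java
class PaddedEscaper {
    // Escapes every char as backslash-u followed by exactly four hex digits.
    static String string2UnicodePadded(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 6);
        for (int i = 0; i < s.length(); i++) {
            // %04x left-pads the hex value with zeros to four digits
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(string2UnicodePadded(":w")); // twelve characters: two padded escapes
    }
}
```

Output in this form is still accepted by a decoder that splits on \u, and it can also be read by tools that require four-digit escapes.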
Java code for Unicode to string:

/**
 * Unicode to string
 */
public static String unicode2String(String unicode) {

    StringBuffer string = new StringBuffer();

    // hex[0] is whatever precedes the first escape (empty here), so start at 1
    String[] hex = unicode.split("\\\\u");

    for (int i = 1; i < hex.length; i++) {

        // parse each hexadecimal code
        int data = Integer.parseInt(hex[i], 16);

        // append to string
        string.append((char) data);
    }

    return string.toString();
}
Java test code:

public static void main(String[] args) {
    String test = "最代码网站地址:www.zuidaima.com";
 
    String unicode = string2Unicode(test);
     
    String string = unicode2String(unicode) ;
     
    System.out.println(unicode);
     
    System.out.println(string);
 
}
Output:
\u6700\u4ee3\u7801\u7f51\u7ad9\u5730\u5740\u3a\u77\u77\u77\u2e\u7a\u75\u69\u64\u61\u69\u6d\u61\u2e\u63\u6f\u6d
最代码网站地址:www.zuidaima.com
