Approaches to getting the current document's encoding


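Recently I needed to detect the current document's encoding automatically, and ran into the problem described here.
My first idea was to read it from document.charset, but I hit various pitfalls, so I looked up several ways of getting a document's encoding.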

First:
document.charset
Sets or gets the character set used to encode the object.
The charset attribute can be set on A, LINK, and SCRIPT elements, but it can only be read back if it has been set.

Second:
document.defaultCharset
Gets the default character encoding of the current locale.

Gecko (Firefox) did not support either of the two properties above at the time of writing.
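Because support differs between engines, it can help to check which of these properties a browser actually exposes before relying on one. The following is only a minimal sketch: availableCharsetProperties is a hypothetical helper name, not something from the original post, and it takes the document as a parameter so that it can also be tried with plain objects.

```javascript
// Hypothetical helper (an assumption for illustration): returns the names of
// the charset-related properties that hold a non-empty string on `doc`.
function availableCharsetProperties(doc) {
    var names = ["charset", "defaultCharset", "characterSet"];
    var found = [];
    for (var i = 0; i < names.length; i++) {
        var value = doc[names[i]];
        // An unsupported property reads as undefined, so it is skipped here.
        if (typeof value === "string" && value !== "") {
            found.push(names[i]);
        }
    }
    return found;
}

// In a browser, call it with the real document object.
if (typeof document !== "undefined") {
    console.log(availableCharsetProperties(document));
}
```

On an engine that only implements document.characterSet, this would return a list containing just that one name.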

Third:
document.characterSet
This is the character encoding used to render the current page. Its value is not necessarily the page's correct encoding, because the user can choose to render the page with a different encoding.
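To see whether the rendering encoding differs from what the page declares, one option is to compare document.characterSet with the charset attribute of the meta tag. This is a rough sketch under stated assumptions: compareDeclaredAndEffective is a hypothetical name, it only looks at the `<meta charset="...">` form (not an HTTP header or an http-equiv meta tag), and it compares the labels as plain case-insensitive strings, so aliases such as "utf8" would be reported as a mismatch.

```javascript
// Hypothetical helper (an assumption for illustration): compares the encoding
// declared via <meta charset> with the encoding the page is rendered in.
function compareDeclaredAndEffective(doc) {
    var meta = doc.querySelector("meta[charset]");
    var declared = meta ? meta.getAttribute("charset") : null;
    var effective = doc.characterSet || null;
    return {
        declared: declared,
        effective: effective,
        // Plain string comparison; encoding aliases are not resolved.
        matches: declared !== null && effective !== null &&
            declared.toLowerCase() === effective.toLowerCase()
    };
}

// In a browser, call it with the real document object.
if (typeof document !== "undefined") {
    console.log(compareDeclaredAndEffective(document));
}
```

If matches is false, either the page has no `<meta charset>` tag or it is being rendered in an encoding other than the declared one.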

Summary:
document.charset is the encoding that has been set for the document.
document.defaultCharset is the default encoding of the locale.
document.characterSet is the encoding finally used to render the document.

Note: the following page was used for testing:

<!DOCTYPE HTML>
<html>
<head>
<meta charset="utf-8">
<!--meta content="text/html;charset=gbk"-->
<script>
// Show all three properties at once, one per line.
alert("document.charset=>" + document.charset +
    "\ndocument.defaultCharset=>" + document.defaultCharset +
    "\ndocument.characterSet=>" + document.characterSet);
</script>
</head>
<body>
</body>
</html>

Results in each browser:

Property                  Chrome   FF          Opera   Safari       IE/6
document.charset          UTF-8    undefined   UTF-8   UTF-8        utf-8/utf-8
document.defaultCharset   GBK      undefined   GBK     ISO-8859-1   gb2312/gb2312
document.characterSet     UTF-8    UTF-8       UTF-8   UTF-8        utf-8/undefined

Therefore, the current document's encoding can be obtained as follows.

// Use UTF-8 if the document is UTF-8; otherwise fall back to GBK.
var fileCharset = document.charset || document.characterSet || document.defaultCharset || "";
var charset;

// Compare case-insensitively, since browsers report "utf-8" or "UTF-8".
if (fileCharset.toLowerCase() === "utf-8") {
    charset = fileCharset;
} else {
    charset = "GBK";
}

alert(charset);
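The same fallback chain can be wrapped in a function so that it is reusable and can be tried outside a browser. This is a minimal sketch, not a definitive implementation: detectCharset and its fallback parameter are hypothetical names introduced here, and unlike the snippet above it always returns the normalized label "UTF-8" rather than whatever casing the browser reported.

```javascript
// Hypothetical wrapper (an assumption for illustration) around the fallback
// chain: charset, then characterSet, then defaultCharset.
function detectCharset(doc, fallback) {
    var value = doc.charset || doc.characterSet || doc.defaultCharset || "";
    // Anything that is not UTF-8 is mapped to the caller's fallback encoding.
    return value.toLowerCase() === "utf-8" ? "UTF-8" : fallback;
}

// In a browser, call it with the real document object.
if (typeof document !== "undefined") {
    console.log(detectCharset(document, "GBK"));
}
```

Passing the document in as an argument keeps the function independent of the global object, which makes the browser differences listed in the table easy to simulate with plain objects.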

Reference: https://developer.mozilla.org/zh-CN/docs/DOM/document.characterSet
