Obtaining the Ceph CRUSH Map, and an Introduction to CRUSH Maps
When you create a configuration file and deploy Ceph with mkcephfs, Ceph generates a default CRUSH map during configuration. The default CRUSH map is fine for a Ceph sandbox environment. When deploying a large data cluster, however, you should put serious thought into developing a custom CRUSH map, because it helps you manage the Ceph cluster, improve performance, and keep your data safe.
For example, if an OSD fails, a custom CRUSH map can tell you the physical data center, room, row, and rack of the host holding the failed OSD, which is exactly what you need for on-site support or hardware replacement.
Likewise, CRUSH can help you pinpoint faults faster. For example, if all OSDs in a particular rack fail at the same time, the fault most likely lies in a network switch, or in the power supply of the rack or the switch, rather than in the OSDs themselves.
A custom CRUSH map can also help you locate the physical sites where Ceph stores redundant copies of data when the placement groups associated with a failed host are in a degraded state.
1. Obtain the binary CRUSH map file
ceph osd getcrushmap-o {compiled-crushmap-filename}
# ceph osd getcrushmap -o crushmap.map
2. Decompile the binary file into a text file
crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}
# crushtool -d crushmap.map -o crushmap.txt
3. View the CRUSH map
# vim crushmap.txt
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
# buckets
host node2 {
    id -2        # do not change unnecessarily
    # weight 0.046
    alg straw
    hash 0    # rjenkins1
    item osd.0 weight 0.018
    item osd.5 weight 0.018
    item osd.6 weight 0.009
}
host node3 {
    id -3        # do not change unnecessarily
    # weight 0.046
    alg straw
    hash 0    # rjenkins1
    item osd.1 weight 0.018
    item osd.7 weight 0.018
    item osd.8 weight 0.009
}
host node1 {
    id -4        # do not change unnecessarily
    # weight 0.046
    alg straw
    hash 0    # rjenkins1
    item osd.2 weight 0.018
    item osd.3 weight 0.018
    item osd.4 weight 0.009
}
root default {
    id -1        # do not change unnecessarily
    # weight 0.137
    alg straw
    hash 0    # rjenkins1
    item node2 weight 0.046
    item node3 weight 0.046
    item node1 weight 0.046
}
# rules
rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
# end crush map
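After editing the decompiled text file, the reverse of steps 1–2 pushes the map back into the cluster. A minimal sketch of that workflow, assuming the files from this article (`crushmap-new.map` is an arbitrary output file name):

```shell
# Compile the edited text map back into binary form
crushtool -c crushmap.txt -o crushmap-new.map

# Optionally dry-run the map: show which OSDs rule 0 would pick
# for sample inputs, before touching the cluster
crushtool -i crushmap-new.map --test --show-mappings --rule 0 --num-rep 3

# Inject the compiled map into the running cluster
ceph osd setcrushmap -i crushmap-new.map
```

These commands require `crushtool` and a reachable cluster, so treat the dry-run step as the safety net before `setcrushmap`.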
You can also display the CRUSH tree with the following command:
# ceph osd crush tree
[
    {
        "id": -1,
        "name": "default",
        "type": "root",
        "type_id": 10,
        "items": [
            {
                "id": -2,
                "name": "node2",
                "type": "host",
                "type_id": 1,
                "items": [
                    { "id": 0, "name": "osd.0", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 5, "name": "osd.5", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 6, "name": "osd.6", "type": "osd", "type_id": 0, "crush_weight": 0.008789, "depth": 2 }
                ]
            },
            {
                "id": -3,
                "name": "node3",
                "type": "host",
                "type_id": 1,
                "items": [
                    { "id": 1, "name": "osd.1", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 7, "name": "osd.7", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 8, "name": "osd.8", "type": "osd", "type_id": 0, "crush_weight": 0.008789, "depth": 2 }
                ]
            },
            {
                "id": -4,
                "name": "node1",
                "type": "host",
                "type_id": 1,
                "items": [
                    { "id": 2, "name": "osd.2", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 3, "name": "osd.3", "type": "osd", "type_id": 0, "crush_weight": 0.018494, "depth": 2 },
                    { "id": 4, "name": "osd.4", "type": "osd", "type_id": 0, "crush_weight": 0.008789, "depth": 2 }
                ]
            }
        ]
    }
]
Use the following command to display the devices, buckets, and rulesets information:
# ceph osd crush dump
{
    "devices": [
        { "id": 0, "name": "osd.0" },
        { "id": 1, "name": "osd.1" },
        { "id": 2, "name": "osd.2" },
        { "id": 3, "name": "osd.3" },
        { "id": 4, "name": "osd.4" },
        { "id": 5, "name": "osd.5" },
        { "id": 6, "name": "osd.6" },
        { "id": 7, "name": "osd.7" },
        { "id": 8, "name": "osd.8" }
    ],
    "types": [
        { "type_id": 0, "name": "osd" },
        { "type_id": 1, "name": "host" },
        { "type_id": 2, "name": "chassis" },
        { "type_id": 3, "name": "rack" },
        { "type_id": 4, "name": "row" },
        { "type_id": 5, "name": "pdu" },
        { "type_id": 6, "name": "pod" },
        { "type_id": 7, "name": "room" },
        { "type_id": 8, "name": "datacenter" },
        { "type_id": 9, "name": "region" },
        { "type_id": 10, "name": "root" }
    ],
    "buckets": [
        {
            "id": -1,
            "name": "default",
            "type_id": 10,
            "type_name": "root",
            "weight": 9000,
            "alg": "straw",
            "hash": "rjenkins1",
            "items": [
                { "id": -2, "weight": 3000, "pos": 0 },
                { "id": -3, "weight": 3000, "pos": 1 },
                { "id": -4, "weight": 3000, "pos": 2 }
            ]
        },
        {
            "id": -2,
            "name": "node2",
            "type_id": 1,
            "type_name": "host",
            "weight": 3000,
            "alg": "straw",
            "hash": "rjenkins1",
            "items": [
                { "id": 0, "weight": 1212, "pos": 0 },
                { "id": 5, "weight": 1212, "pos": 1 },
                { "id": 6, "weight": 576, "pos": 2 }
            ]
        },
        {
            "id": -3,
            "name": "node3",
            "type_id": 1,
            "type_name": "host",
            "weight": 3000,
            "alg": "straw",
            "hash": "rjenkins1",
            "items": [
                { "id": 1, "weight": 1212, "pos": 0 },
                { "id": 7, "weight": 1212, "pos": 1 },
                { "id": 8, "weight": 576, "pos": 2 }
            ]
        },
        {
            "id": -4,
            "name": "node1",
            "type_id": 1,
            "type_name": "host",
            "weight": 3000,
            "alg": "straw",
            "hash": "rjenkins1",
            "items": [
                { "id": 2, "weight": 1212, "pos": 0 },
                { "id": 3, "weight": 1212, "pos": 1 },
                { "id": 4, "weight": 576, "pos": 2 }
            ]
        }
    ],
    "rules": [
        {
            "rule_id": 0,
            "rule_name": "replicated_ruleset",
            "ruleset": 0,
            "type": 1,
            "min_size": 1,
            "max_size": 10,
            "steps": [
                { "op": "take", "item": -1, "item_name": "default" },
                { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
                { "op": "emit" }
            ]
        }
    ],
    "tunables": {
        "choose_local_tries": 0,
        "choose_local_fallback_tries": 0,
        "choose_total_tries": 50,
        "chooseleaf_descend_once": 1,
        "chooseleaf_vary_r": 0,
        "straw_calc_version": 1,
        "allowed_bucket_algs": 22,
        "profile": "unknown",
        "optimal_tunables": 0,
        "legacy_tunables": 0,
        "require_feature_tunables": 1,
        "require_feature_tunables2": 1,
        "require_feature_tunables3": 0,
        "has_v2_rules": 0,
        "has_v3_rules": 0,
        "has_v4_buckets": 0
    }
}
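Note that the dump reports weights as 16.16 fixed-point integers, while the decompiled text map shows decimals: dividing a dump weight by 65536 recovers the decimal value. A quick sanity check with plain awk (not a Ceph tool), using the weights above:

```shell
# `ceph osd crush dump` stores weights as 16.16 fixed-point integers;
# dividing by 65536 recovers the decimal weights of the decompiled map.
awk 'BEGIN {
    printf "%.6f\n", 1212 / 65536   # item osd.0: 0.018 in the text map
    printf "%.6f\n",  576 / 65536   # item osd.6: 0.009 in the text map
    printf "%.6f\n", 3000 / 65536   # host node2: 0.046 in the text map
}'
```

This is why `crush_weight` appears as 0.018494 and 0.008789 in the `ceph osd crush tree` output while the text map rounds them to 0.018 and 0.009.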
Introduction to CRUSH Maps
For a detailed description of CRUSH maps, see the official CRUSH Maps documentation for the structure and rules of CRUSH maps in Ceph, and refer to the algorithm paper CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data.
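As a rough intuition for the `alg straw` buckets shown above: each item draws a pseudo-random "straw" whose length is scaled by the item's weight, and the item with the longest straw wins, so heavier items win proportionally more often. A toy, deterministic illustration in awk, using host node2's weights; the multiply-and-mod hash here is a stand-in, not Ceph's real rjenkins1 hash:

```shell
awk 'BEGIN {
    # weights of the three OSDs in host node2 (from the map above)
    split("osd.0 osd.5 osd.6", name, " ")
    w[1] = 0.018; w[2] = 0.018; w[3] = 0.009
    pg = 7                    # placement-group id being placed (arbitrary)
    best = -1
    for (i = 1; i <= 3; i++) {
        h = (pg * 2654435761 + i * 40503) % 65536   # toy hash, NOT rjenkins1
        draw = (h / 65536) * w[i]                   # straw length scaled by weight
        if (draw > best) { best = draw; chosen = name[i] }
    }
    print chosen    # the item with the longest straw
}'
```

With `pg = 7` this picks `osd.0`; varying `pg` moves the choice between items roughly in proportion to their weights, which is the property CRUSH relies on for balanced placement.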
TODO
References
[1] CRUSH Maps (translated)
[2] CRUSH Maps
[3] CRUSH algorithm: CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data