Chapter 4: Developing MapReduce Applications


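4.1 System parameter configuration
A property marked "final" in the configuration cannot be overridden: once a resource declares a value final, no resource loaded afterwards can change it.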
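
The effect of a final property can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions: FinalPropertySketch is a hypothetical stand-in, not Hadoop's Configuration class, and the property name example.buffer.size is invented for the example.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Simplified stand-in for a layered configuration; not Hadoop's Configuration class.
final class FinalPropertySketch {
    private final Map<String, String> values = new LinkedHashMap<>();
    private final Set<String> finalKeys = new HashSet<>();

    // Loads one property, as if it were read from a configuration resource.
    // Returns false when the key was already marked final and the new value is ignored.
    boolean load(String key, String value, boolean isFinal) {
        if (finalKeys.contains(key)) {
            return false; // an earlier resource locked this key
        }
        values.put(key, value);
        if (isFinal) {
            finalKeys.add(key);
        }
        return true;
    }

    String get(String key) {
        return values.get(key);
    }

    public static void main(String[] args) {
        FinalPropertySketch conf = new FinalPropertySketch();
        // A site-wide resource sets the value and marks it final.
        conf.load("example.buffer.size", "4096", true);
        // A resource loaded later tries to override it and is ignored.
        boolean applied = conf.load("example.buffer.size", "65536", false);
        System.out.println(applied + " " + conf.get("example.buffer.size")); // prints "false 4096"
    }
}
```

The model keeps the first value and rejects the later one, which is the behavior the note above describes for final properties.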

4.2 Development environment configuration
Hadoop has three run modes: standalone (local) mode, pseudo-distributed mode, and fully distributed mode.

4.3 Writing MapReduce programs

4.4 Local testing

4.5 Running MapReduce programs

4.6 Web user interface

4.7 Performance tuning

4.8 MapReduce workflow
1. The setup function

  /**
   * Called once at the beginning of the task.
   */
  protected void setup(Context context
                       ) throws IOException, InterruptedException {
    // NOTHING
  }

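setup is called exactly once, after the task starts and before any data is processed. By contrast, map is called once for each key/value pair in the input split, and reduce is called once for each key.
2. The cleanup function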

  /**
   * Called once at the end of the task.
   */
  protected void cleanup(Context context
                         ) throws IOException, InterruptedException {
    // NOTHING
  }

cleanup is called once, before the task is torn down.
3. The run function

  /**
   * Expert users can override this method for more complete control over the
   * execution of the Mapper.
   * @param context
   * @throws IOException
   */
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
      map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
  }

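run is the function that drives the task: it calls setup once, then map for each key/value pair, and finally cleanup.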
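
The call order in run can be reproduced without Hadoop on the classpath. The following is a minimal sketch: LifecycleSketch is a hypothetical plain-Java model, and a plain Iterator of strings stands in for the Context that supplies key/value pairs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java model of the Mapper lifecycle; it does not use Hadoop classes.
final class LifecycleSketch {
    // Records every lifecycle call in the order it happens.
    final List<String> log = new ArrayList<>();

    void setup() {
        log.add("setup");
    }

    void map(String record) {
        log.add("map:" + record);
    }

    void cleanup() {
        log.add("cleanup");
    }

    // Mirrors the shape of Mapper.run: setup, one map call per record, cleanup.
    void run(Iterator<String> records) {
        setup();
        while (records.hasNext()) {
            map(records.next());
        }
        cleanup();
    }

    public static void main(String[] args) {
        LifecycleSketch task = new LifecycleSketch();
        task.run(Arrays.asList("a", "b", "c").iterator());
        System.out.println(task.log); // prints [setup, map:a, map:b, map:c, cleanup]
    }
}
```

Because setup and cleanup each run once per task rather than once per record, they are the natural place for work that is expensive to repeat, such as opening a resource or loading a lookup table.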

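Globally shared data in a MapReduce job
1. Reading and writing HDFS files
When several Map and Reduce tasks write to the same file, earlier data gets overwritten, and the I/O consumes resources.
2. Configuring job properties
Set a property with set() on the Configuration class and read it in the task with get(). This approach is poorly suited to sharing large amounts of data.
3. DistributedCache
DistributedCache is a read-only facility that MapReduce provides for making cache files available to applications.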
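
Approaches 2 and 3 can be contrasted in one small model. This is a sketch under assumptions, not Hadoop code: java.util.Properties stands in for Configuration, a local temporary file stands in for a file distributed through DistributedCache, and the class name SharedDataSketch and the property name example.min.word.length are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;

// Plain-Java model of a task that receives shared data in two ways.
// Properties stands in for Configuration and a temp file for a cached file (assumptions).
final class SharedDataSketch {
    // Hypothetical property name, used only in this example.
    static final String MIN_LENGTH_KEY = "example.min.word.length";

    private int minLength;
    private final Set<String> stopWords = new HashSet<>();

    // Runs once per task: read the small value from the job properties
    // and load the larger read-only side file into memory.
    void setup(Properties conf, Path cachedFile) {
        minLength = Integer.parseInt(conf.getProperty(MIN_LENGTH_KEY, "1"));
        try {
            for (String line : Files.readAllLines(cachedFile, StandardCharsets.UTF_8)) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    stopWords.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs once per record: keep words that are long enough and not stop words.
    boolean map(String word) {
        return word.length() >= minLength && !stopWords.contains(word);
    }

    // Driver side: set the property, create the side file, then run one task.
    static List<String> demo(List<String> words) {
        Properties conf = new Properties();
        conf.setProperty(MIN_LENGTH_KEY, "3");
        try {
            Path cachedFile = Files.createTempFile("stopwords", ".txt");
            try {
                Files.write(cachedFile, Arrays.asList("the", "and"), StandardCharsets.UTF_8);
                SharedDataSketch task = new SharedDataSketch();
                task.setup(conf, cachedFile);
                List<String> kept = new ArrayList<>();
                for (String w : words) {
                    if (task.map(w)) {
                        kept.add(w);
                    }
                }
                return kept;
            } finally {
                Files.deleteIfExists(cachedFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(Arrays.asList("the", "quick", "ox", "and", "fox"))); // prints [quick, fox]
    }
}
```

The split reflects the trade-off in the list above: a single number fits comfortably in a job property, while a word list that may grow large is better shipped as a read-only file and loaded once in setup.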
