Using logstash-input-file

The best way to learn a plugin is to get it running first. Start by creating a text file with the following content:
[sqczm@sqczm first]$ pwd
/opt/logstash-6.7.1/demo/first
[sqczm@sqczm first]$ more users.txt 
name: zhangsan, age: 21, addr: "     "
name: lisi, age:20,addr:"  "
name:wangwu,age:19,addr:"beijing"

Next, write the Logstash configuration file:
[sqczm@sqczm first]$ pwd
/opt/logstash-6.7.1/demo/first
[sqczm@sqczm first]$ more first.conf 
input {
    file {
        path => ["/opt/logstash-6.7.1/demo/first/users.txt"]
    }
}
filter {
    
}
output {
    stdout {}
}

Finally, start Logstash:
[sqczm@sqczm logstash-6.7.1]$ pwd
/opt/logstash-6.7.1
[sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
Sending Logstash logs to /opt/logstash-6.7.1/logs which is now configured via log4j2.properties
[2019-04-20T16:18:32,057][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2019-04-20T16:18:32,083][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.7.1"}
[2019-04-20T16:18:40,628][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2019-04-20T16:18:41,060][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/opt/logstash-6.7.1/data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3", :path=>["/opt/logstash-6.7.1/demo/first/users.txt"]}
[2019-04-20T16:18:41,112][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x2ed54826 run>"}
[2019-04-20T16:18:41,202][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-04-20T16:18:41,248][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2019-04-20T16:18:41,658][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

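WTF, why is none of the text printed to the console? Let's check the official documentation first. It lists a start_position parameter, described as follows (quoted from the official logstash-input-file documentation):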
start_position
  • Value can be any of: beginning, end
  • Default value is “end”

  • Choose where Logstash starts initially reading files: at the beginning or at the end. The default behavior treats files like live streams and thus starts at the end. If you have old data you want to import, set this to beginning.
    This option only modifies “first contact” situations where a file is new and not seen before, i.e. files that don’t have a current position recorded in a sincedb file read by Logstash. If a file has already been seen before, this option has no effect and the position recorded in the sincedb file will be used.
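If the parameter is not set, the file is read from the end by default, so only lines appended after startup are emitted. That explains why nothing was printed to the console just now. Following the official description, set it to "beginning". The modified configuration looks like this: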
    [sqczm@sqczm logstash-6.7.1]$ pwd
    /opt/logstash-6.7.1
    [sqczm@sqczm logstash-6.7.1]$ more demo/first/first.conf 
    input {
        file {
            path => ["/opt/logstash-6.7.1/demo/first/users.txt"]
            start_position => "beginning"
        }
    }
    filter {
        
    }
    output {
        stdout {}
    }
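The quoted description of start_position can be modeled in a few lines. The following is only a rough sketch of the documented behavior, not the plugin's actual implementation; the function name initial_offset and the 100-byte file size are made up for illustration.

```python
from typing import Optional


def initial_offset(file_size: int,
                   start_position: str = "end",
                   recorded_offset: Optional[int] = None) -> int:
    """Return the byte offset where reading would start (simplified model)."""
    if start_position not in ("beginning", "end"):
        raise ValueError("start_position must be 'beginning' or 'end'")
    # A file that was seen before: the position recorded in the sincedb
    # wins and start_position has no effect.
    if recorded_offset is not None:
        return recorded_offset
    # "First contact": start at the top or at the current end of the file.
    return 0 if start_position == "beginning" else file_size


size = 100  # hypothetical file size in bytes
print(initial_offset(size))                                     # 100: default "end", old lines skipped
print(initial_offset(size, "beginning"))                        # 0: new file, read everything
print(initial_offset(size, "beginning", recorded_offset=100))   # 100: already seen, option ignored
```

The third call is the case to watch for: once a position has been recorded for a file, changing start_position alone makes no difference.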
    

With the change made, start Logstash again:
    [sqczm@sqczm logstash-6.7.1]$ pwd
    /opt/logstash-6.7.1
    [sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
    Sending Logstash logs to /opt/logstash-6.7.1/logs which is now configured via log4j2.properties
    [2019-04-20T16:31:36,250][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
    [2019-04-20T16:31:36,274][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.7.1"}
    [2019-04-20T16:31:44,536][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
    [2019-04-20T16:31:44,864][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/opt/logstash-6.7.1/data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3", :path=>["/opt/logstash-6.7.1/demo/first/users.txt"]}
    [2019-04-20T16:31:44,915][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x5e479e9a run>"}
    [2019-04-20T16:31:45,008][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
    [2019-04-20T16:31:45,022][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
    [2019-04-20T16:31:45,443][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
    

WTF, I waited until the flowers wilted and the file contents still were not printed. Is a configuration parameter missing? Back to the official documentation:
    Tracking of current position in watched files
    The plugin keeps track of the current position in each file by recording it in a separate file named sincedb. This makes it possible to stop and restart Logstash and have it pick up where it left off without missing the lines that were added to the file while Logstash was stopped.
    By default, the sincedb file is placed in the data directory of Logstash with a filename based on the filename patterns being watched (i.e. the path option). Thus, changing the filename patterns will result in a new sincedb file being used and any existing current position state will be lost. If you change your patterns with any frequency it might make sense to explicitly choose a sincedb path with the sincedb_path option.
    A different sincedb_path must be used for each input. Using the same path will cause issues. The read checkpoints for each input must be stored in a different path so the information does not override.
    Files are tracked via an identifier. This identifier is made up of the inode, major device number and minor device number. In windows, a different identifier is taken from a kernel32 API call.
    Sincedb records can now be expired meaning that read positions of older files will not be remembered after a certain time period. File systems may need to reuse inodes for new content. Ideally, we would not use the read position of old content, but we have no reliable way to detect that inode reuse has occurred. This is more relevant to Read mode where a great many files are tracked in the sincedb. Bear in mind though, if a record has expired, a previously seen file will be read again.
    Sincedb files are text files with four (< v5.0.0), five or six columns:
  • The inode number (or equivalent).
  • The major device number of the file system (or equivalent).
  • The minor device number of the file system (or equivalent).
  • The current byte offset within the file.
  • The last active timestamp (a floating point number)
  • The last known path that this record was matched to (for old sincedb records converted to the new format, this is blank).

  • On non-Windows systems you can obtain the inode number of a file with e.g. ls -li.
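To make the column layout above concrete, here is a small Python sketch that splits one sincedb record into the six documented fields and computes the inode/major/minor identifier of a file with os.stat. It assumes the columns are separated by single spaces, which the quoted text does not state explicitly; the names SincedbRecord, parse_sincedb_line and file_identifier are hypothetical helpers, and the sample record uses made-up numbers.

```python
import os
from typing import NamedTuple, Optional


class SincedbRecord(NamedTuple):
    inode: int
    major: int
    minor: int
    offset: int
    last_active: Optional[float]  # absent in four-column records
    path: Optional[str]           # absent in four- and five-column records


def parse_sincedb_line(line: str) -> SincedbRecord:
    # The path is the last column and may contain spaces,
    # so split at most five times (assumption: space-separated columns).
    parts = line.strip().split(" ", 5)
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 columns, got {len(parts)}")
    inode, major, minor, offset = (int(p) for p in parts[:4])
    last_active = float(parts[4]) if len(parts) > 4 else None
    path = parts[5] if len(parts) > 5 else None
    return SincedbRecord(inode, major, minor, offset, last_active, path)


def file_identifier(path: str) -> tuple:
    # Non-Windows only: the same inode that `ls -li` prints,
    # plus the major and minor device numbers.
    st = os.stat(path)
    return (st.st_ino, os.major(st.st_dev), os.minor(st.st_dev))


# Made-up record for illustration
sample = "2621441 0 64768 131 1555750668.91 /opt/logstash-6.7.1/demo/first/users.txt"
rec = parse_sincedb_line(sample)
print(rec.offset, rec.path)  # 131 /opt/logstash-6.7.1/demo/first/users.txt
```

Comparing the first three fields of a record with file_identifier(path) shows which watched file the record belongs to, and the offset field shows how far it has been read.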
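According to the documentation, the sincedb file records the read position of every watched file, so that after a restart Logstash does not have to read the files from the beginning again. That is why start_position had no effect: the first run had already recorded a position for users.txt. What we need to do, then, is delete this file. The documentation says it lives under the data directory, so let's look for it. After searching for ages, it seems the file does not exist. Actually, we were looking the wrong way: it is a hidden file (its name starts with a dot), so plain ls does not show it. Run the following commands to find and delete it: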
    [sqczm@sqczm logstash-6.7.1]$ pwd
    /opt/logstash-6.7.1
    [sqczm@sqczm logstash-6.7.1]$ ls data/plugins/inputs/file/
    [sqczm@sqczm logstash-6.7.1]$ ls -a data/plugins/inputs/file/
    .  ..  .sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3
    [sqczm@sqczm logstash-6.7.1]$ rm -rf data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3 
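The same lookup can be done from a script. The sketch below shows why a plain wildcard misses the file and how to list it anyway; find_sincedb_files is a hypothetical helper, and the demo runs in a temporary directory rather than a real Logstash data directory.

```python
import glob
import os
import tempfile


def find_sincedb_files(directory: str) -> list:
    """Return the names of sincedb files in a directory, dot-files included."""
    # os.listdir returns hidden entries too (everything except "." and "..").
    return sorted(
        name for name in os.listdir(directory)
        if name.startswith(".sincedb_")
    )


# Demo in a throwaway directory
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, ".sincedb_example"), "w").close()
    print(glob.glob(os.path.join(demo_dir, "*")))  # []  ("*" skips dot-files, like plain ls)
    print(find_sincedb_files(demo_dir))            # ['.sincedb_example']
```

Pointing find_sincedb_files at data/plugins/inputs/file/ under the Logstash home would list the same file that ls -a revealed.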
    

After deleting it, start Logstash again:
    [sqczm@sqczm logstash-6.7.1]$ pwd
    /opt/logstash-6.7.1
    [sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
    
    Sending Logstash logs to /opt/logstash-6.7.1/logs which is now configured via log4j2.properties
    [2019-04-20T16:57:38,915][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
    [2019-04-20T16:57:38,939][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"6.7.1"}
    [2019-04-20T16:57:47,643][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
    [2019-04-20T16:57:48,093][INFO ][logstash.inputs.file     ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/opt/logstash-6.7.1/data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3", :path=>["/opt/logstash-6.7.1/demo/first/users.txt"]}
    [2019-04-20T16:57:48,145][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0xc6ca077 run>"}
    [2019-04-20T16:57:48,233][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
    [2019-04-20T16:57:48,251][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
    [2019-04-20T16:57:48,693][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
    /opt/logstash-6.7.1/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
    {
        "@timestamp" => 2019-04-20T08:57:48.917Z,
          "@version" => "1",
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "host" => "sqczm",
           "message" => "name: lisi, age:20,addr:\"  \""
    }
    {
        "@timestamp" => 2019-04-20T08:57:48.886Z,
          "@version" => "1",
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "host" => "sqczm",
           "message" => "name: zhangsan, age: 21, addr: \"     \""
    }
    {
        "@timestamp" => 2019-04-20T08:57:48.918Z,
          "@version" => "1",
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "host" => "sqczm",
           "message" => "name:wangwu,age:19,addr:\"beijing\""
    }
    

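What a relief, the text content finally shows up. Now let's improve on it. You may have noticed along the way that my made-up data was meant to be JSON, so let's keep modifying the configuration file and set the codec to json: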
    [sqczm@sqczm logstash-6.7.1]$ pwd
    /opt/logstash-6.7.1
    [sqczm@sqczm logstash-6.7.1]$ more demo/first/first.conf 
    input {
        file {
            path => ["/opt/logstash-6.7.1/demo/first/users.txt"]
            start_position => "beginning"
            codec => "json"
        }
    }
    filter {
        
    }
    output {
        stdout {}
    }
    

After the change, delete the sincedb file and start Logstash again:
    [sqczm@sqczm logstash-6.7.1]$ pwd
    /opt/logstash-6.7.1
    [sqczm@sqczm logstash-6.7.1]$ rm -rf data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3 
    [sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
    ……      ……
    {
              "tags" => [
            [0] "_jsonparsefailure"
        ],
          "@version" => "1",
        "@timestamp" => 2019-04-20T11:48:45.377Z,
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "host" => "sqczm",
           "message" => "name: lisi, age:20,addr:\"  \""
    }
    {
              "tags" => [
            [0] "_jsonparsefailure"
        ],
          "@version" => "1",
        "@timestamp" => 2019-04-20T11:48:45.332Z,
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "host" => "sqczm",
           "message" => "name: zhangsan, age: 21, addr: \"     \""
    }
    {
              "tags" => [
            [0] "_jsonparsefailure"
        ],
          "@version" => "1",
        "@timestamp" => 2019-04-20T11:48:45.381Z,
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "host" => "sqczm",
           "message" => "name:wangwu,age:19,addr:\"beijing\""
    }
    

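That result is a letdown. The tags field carries the error _jsonparsefailure, meaning the JSON could not be parsed. Only then did I notice that the data I had written is not valid JSON at all, so I quickly fixed it: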
    [sqczm@sqczm logstash-6.7.1]$ pwd
    /opt/logstash-6.7.1
    [sqczm@sqczm logstash-6.7.1]$ more demo/first/users.txt 
    {"name": "zhangsan", "age": 21, "addr": "     "}
    {"name": "lisi", "age":20,"addr":"  "}
    {"name":"wangwu","age":19,"addr":"beijing"}
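The difference between the two versions of users.txt can be checked outside Logstash. The sketch below uses Python's standard json module to imitate the visible effect of the codec on a single line; it is not the codec's actual implementation, and parse_line is a hypothetical helper.

```python
import json


def parse_line(line: str) -> dict:
    """Parse one line as JSON; on failure keep the raw text and tag it (simplified)."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return {"message": line, "tags": ["_jsonparsefailure"]}


old_line = 'name:wangwu,age:19,addr:"beijing"'            # original, unquoted keys, no braces
new_line = '{"name":"wangwu","age":19,"addr":"beijing"}'  # corrected

print(parse_line(old_line))  # {'message': 'name:wangwu,age:19,addr:"beijing"', 'tags': ['_jsonparsefailure']}
print(parse_line(new_line))  # {'name': 'wangwu', 'age': 19, 'addr': 'beijing'}
```

Each line must be a complete JSON object on its own, with keys and string values in double quotes, before the fields come out as separate keys.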
    

With the data fixed, delete the sincedb file and restart:
    [sqczm@sqczm logstash-6.7.1]$ pwd
    /opt/logstash-6.7.1
    [sqczm@sqczm logstash-6.7.1]$ rm -rf data/plugins/inputs/file/.sincedb_ccdcb2b886f0094c5a7fa2ddbbd759e3 
    [sqczm@sqczm logstash-6.7.1]$ bin/logstash -f /opt/logstash-6.7.1/demo/first/first.conf
    ……      ……
    {
          "@version" => "1",
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "name" => "zhangsan",
        "@timestamp" => 2019-04-20T11:54:55.419Z,
               "age" => 21,
              "addr" => "     ",
              "host" => "sqczm"
    }
    {
          "@version" => "1",
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "name" => "lisi",
        "@timestamp" => 2019-04-20T11:54:55.460Z,
               "age" => 20,
              "addr" => "  ",
              "host" => "sqczm"
    }
    {
          "@version" => "1",
              "path" => "/opt/logstash-6.7.1/demo/first/users.txt",
              "name" => "wangwu",
        "@timestamp" => 2019-04-20T11:54:55.462Z,
               "age" => 19,
              "addr" => "beijing",
              "host" => "sqczm"
    }
    

That wraps up this logstash-input-file example. The plugin's other options can be explored by reading the official documentation and experimenting with them.