Elasticsearch > Logstash > Reading TSV files


Configure the pipeline config file (logstash.conf) as follows.

Method 1

Set the column names explicitly with columns.

logstash.conf

input {
  file {
    path => ["/usr/share/logstash/data-path/*"]
    start_position => "beginning"
  }
}


filter {
  csv{
    separator => "	" # a literal tab character goes between the quotes; do not write `\t`
    columns => ["column1", "column2", "column3"] # must match the column names and column count of the file's first line
    skip_header => false
    skip_empty_rows => true
    skip_empty_columns => true
    remove_field => "message"
  }
}

output {
  stdout {codec => rubydebug }
}

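Method 2

Instead of listing the column names, set autodetect_column_names to true so that they are detected automatically.
Note: the order in which a file's lines are processed is not guaranteed when multiple pipeline workers are running, so an additional setting is required in logstash.yml.

logstash.conf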

input {
  file {
    path => ["/usr/share/logstash/data-path/*"]
    start_position => "beginning"
  }
}


filter {
  csv{
    separator => "	" # a literal tab character goes between the quotes; do not write `\t`
    autodetect_column_names => true
    skip_header => false
    skip_empty_rows => true
    skip_empty_columns => true
    remove_field => "message"
  }
}

output {
  stdout {codec => rubydebug }
}

logstash.yml

pipeline.workers: 1

separator

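For TSV, enter an actual tab character as the separator.
Do not write \t: by default Logstash does not interpret escape sequences in config strings, so it would not be read as a tab.

separator => "	"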
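
If typing a literal tab is awkward in your editor, one possible alternative is Logstash's config.support_escapes setting, which makes quoted config strings process escape sequences such as \t. The original article did not try this, so treat the following as a sketch of that documented setting and not as the author's method.

```yaml
# logstash.yml
# Assumption: not part of the original setup. With this enabled, quoted
# strings in pipeline configs process escapes, so "\t" becomes a tab.
config.support_escapes: true
```

With that in place, separator => "\t" in the csv filter should be read as a tab. The setting is off by default, which is why \t on its own does not work.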

autodetect_column_names (automatic column detection)

autodetect_column_names => true

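This option is supposed to treat the first line as the header row and set the column names from it automatically.

In practice the behavior varied from run to run: sometimes the first line was recognized as the column names, and sometimes the second line was.

The issues below explained it. In short, multiple pipeline workers process the file's lines in parallel, so events are not necessarily handled in the order the lines appear in the file.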

Autodetect_column_names take header from second row · Issue #67 · logstash-plugins/logstash-filter-csv
autodetect_column_names does not work with multiple worker threads · Issue #65 · logstash-plugins/logstash-filter-csv

On closer reading, the official manual below says the same thing (I had overlooked it).
It says to set the number of pipeline workers to 1.

Csv filter plugin | Logstash Reference [7.10] | Elastic

Logstash pipeline workers must be set to 1 for this option to work.

So either set this in logstash.yml, or simply specify columns explicitly, which seems the safer choice.

logstash.yml

pipeline.workers: 1
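
A value set in logstash.yml acts as the default for every pipeline. If only the TSV pipeline needs in-order processing, the same setting can be given per pipeline in pipelines.yml. A minimal sketch, in which the pipeline id and config path are hypothetical placeholders and not taken from the original setup:

```yaml
# pipelines.yml
# Hypothetical id and path; adjust to your installation.
- pipeline.id: tsv
  path.config: "/usr/share/logstash/pipeline/logstash.conf"
  # A single worker, so lines pass through the csv filter in order.
  pipeline.workers: 1
```

Note that Logstash ignores pipelines.yml when it is started with -f or -e. In that case the -w 1 (--pipeline.workers 1) command-line option limits the workers instead.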

References

https://gist.github.com/carrotsword/1824c1fe79d1cc3270ba17e615388faa#file-logstash-conf
https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html

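Author And Source

More information on this topic (Elasticsearch > Logstash > Reading TSV files) is available in the original article: https://qiita.com/sugasaki/items/984c1dfea56890b3ff59

Attribution: information about the original author is available at the original URL. Copyright belongs to the original author.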
