Elasticsearch + Apache on CentOS 7


Purpose

There was never a really suitable Apache log analysis tool, so each time a failure occurred the investigation took a long time.
This is a study exercise: how do you analyze Apache logs in an ELK environment, and can it be put to practical use?
Real-time synchronization with a production server is out of scope this time.
⇒ Production logs are copied over manually and then analyzed.

Working environment

Item      Value
VM        VirtualBox 6.1
Host OS   Win10 Professional
Guest OS  CentOS Linux release 7.8.2003 (Core)
Guest IP  192.168.56.10

Environment setup

Reference sites

Reference 1 was used as the primary guide.

  1. ELK(Elasticsearch+Logstash+Kibana) Apacheのグラフ化まで
  2. Elastic Stack 7 : Elasticsearch インストール

Install Apache

yum install -y httpd
systemctl enable httpd
systemctl start httpd
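As a quick sanity check (a sketch; the paths are the CentOS 7 defaults), you can hit the server once and confirm an entry lands in the access log:

```shell
# Probe httpd once; "000" means no HTTP response was received at all.
status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 http://localhost/) || status="000"
echo "httpd returned HTTP ${status}"

# Each request should append a line to the access log.
tail -n 1 /var/log/httpd/access_log 2>/dev/null || echo "(no access_log yet)"
```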

Install ELK

cat > /etc/yum.repos.d/elasticsearch.repo <<EOF
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF

yum install -y java-1.8.0-openjdk-devel
yum install -y elasticsearch kibana logstash
# Add a plugin so Elasticsearch can handle Japanese text
/usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-kuromoji 
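To confirm the plugin took effect (a minimal sketch; assumes Elasticsearch is already running and reachable on localhost:9200), run a Japanese string through the kuromoji analyzer via the _analyze API:

```shell
# Tokenize a Japanese phrase with the kuromoji analyzer; fall back to a
# message when Elasticsearch is not reachable.
resp=$(curl -s --max-time 2 -H 'Content-Type: application/json' \
  -X POST 'http://localhost:9200/_analyze?pretty' \
  -d '{"analyzer": "kuromoji", "text": "関西国際空港"}') || resp="(Elasticsearch not reachable)"
echo "$resp"
```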

Configure Elasticsearch

cd /etc/elasticsearch/
cp -p elasticsearch.yml elasticsearch.yml.org
vi elasticsearch.yml   
----------------------------------
#network.host: 192.168.0.1
network.host: 0.0.0.0
discovery.type: single-node   # (see note below)
----------------------------------

(Note) Without this setting, the error below occurred at startup.
The setting was added after consulting the site below:
https://papalagi.org/blog/archives/437

[1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured

Configure Kibana

cd /etc/kibana/
cp -p kibana.yml kibana.yml.org
vi kibana.yml
----------------------------------
#server.host: "localhost"
server.host: "0.0.0.0"

#i18n.locale: "en"
i18n.locale: "ja-JP"
----------------------------------

Setting i18n.locale to "ja-JP" switches the UI to Japanese, but with it enabled some Kibana pages failed to render due to errors.
(This is just a study environment, so it was left as-is.)

Verify startup

systemctl daemon-reload
# Confirm that Elasticsearch starts
systemctl restart elasticsearch
systemctl status elasticsearch

Autostart configuration

systemctl start kibana
systemctl start logstash
systemctl enable elasticsearch
systemctl enable kibana
systemctl enable logstash

Connection check

Service        URL
Elasticsearch  http://<server IP>:9200/
Kibana         http://<server IP>:5601/
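A quick way to confirm both services answer (a sketch using the guest IP from this setup; adjust the host as needed):

```shell
# Probe Elasticsearch (9200) and Kibana (5601) on the guest IP.
check_ports() {
  for port in 9200 5601; do
    if curl -s --max-time 2 -o /dev/null "http://192.168.56.10:${port}/"; then
      echo "port ${port}: responding"
    else
      echo "port ${port}: no response"
    fi
  done
}
result=$(check_ports)
echo "$result"
```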

Loading Apache logs

Create the Logstash config file

# The config file can be placed anywhere; here it goes under /etc/logstash.
cd /etc/logstash
vi apache_import.conf
----------------------------------
input {
  stdin { }
}

filter {
  grok {
#     https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns
#     Apache's default combined log format would use COMBINEDAPACHELOG:
#     match => { "message" => "%{COMBINEDAPACHELOG}" }
      match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    locale => "en"
  }
  mutate {
    replace => { "type" => "apache_access" }
  }
}

output {
  # stdout { codec => rubydebug }
  # elasticsearch { host => '172.17.4.199' } (from the reference site; fails because the option is now "hosts")
  elasticsearch { hosts => '192.168.56.10' }
}
----------------------------------
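For reference, %{HTTPD_COMMONLOG} matches Apache's common log format and extracts fields such as clientip, timestamp, request, and response. As a rough plain-shell illustration of what gets pulled out of one line (the sample line below is made up):

```shell
# A made-up access_log line in Apache common log format.
line='192.168.56.1 - - [13/May/2020:10:15:30 +0900] "GET /index.html HTTP/1.1" 200 4897'

clientip=$(echo "$line" | awk '{print $1}')             # -> grok field "clientip"
timestamp=$(echo "$line" | sed 's/.*\[\(.*\)\].*/\1/')  # -> "timestamp" (parsed by the date filter)
request=$(echo "$line" | sed 's/.*"\(.*\)".*/\1/')      # -> method, path, and protocol
status=$(echo "$line" | awk '{print $(NF-1)}')          # -> "response"

echo "clientip=$clientip timestamp=$timestamp status=$status"
echo "request=$request"
```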

Importing into Elasticsearch

/usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/apache_import.conf < /etc/httpd/logs/access_log 
[root@centos7 logstash]# /usr/share/logstash/bin/logstash --path.settings /etc/logstash -f apache_import.conf < /etc/httpd/logs/access_log 
Sending Logstash logs to /var/log/logstash which is now configured via log4j2.properties
[2020-05-16T00:43:30,750][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-05-16T00:43:30,878][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.7.0"}
[2020-05-16T00:43:33,376][INFO ][org.reflections.Reflections] Reflections took 52 ms to scan 1 urls, producing 21 keys and 41 values 
[2020-05-16T00:43:34,491][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://192.168.56.10:9200/]}}
[2020-05-16T00:43:34,917][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://192.168.56.10:9200/"}
[2020-05-16T00:43:36,135][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-05-16T00:43:36,141][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-05-16T00:43:37,592][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//192.168.56.10"]}
[2020-05-16T00:43:37,754][INFO ][logstash.outputs.elasticsearch][main] Using default mapping template
[2020-05-16T00:43:38,003][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1, "index.lifecycle.name"=>"logstash-policy", "index.lifecycle.rollover_alias"=>"logstash"}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-05-16T00:43:38,202][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-05-16T00:43:38,207][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/etc/logstash/apache_import.conf"], :thread=>"#<Thread:0x23adbb83 run>"}
[2020-05-16T00:43:38,213][INFO ][logstash.outputs.elasticsearch][main] Installing elasticsearch template to _template/logstash
[2020-05-16T00:43:39,821][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-05-16T00:43:39,960][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-05-16T00:43:40,845][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-05-16T00:43:43,213][INFO ][logstash.outputs.elasticsearch][main] Creating rollover alias <logstash-{now/d}-000001>
[2020-05-16T00:43:46,277][INFO ][logstash.outputs.elasticsearch][main] Installing ILM policy {"policy"=>{"phases"=>{"hot"=>{"actions"=>{"rollover"=>{"max_size"=>"50gb", "max_age"=>"30d"}}}}}} to _ilm/policy/logstash-policy
[2020-05-16T00:43:56,232][INFO ][logstash.runner          ] Logstash shut down.
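After the run finishes, a document count against the logstash-* indices is a quick way to confirm the import landed (a sketch; assumes Elasticsearch on the guest IP from this setup):

```shell
# Count documents in the logstash-* indices created by the import.
count=$(curl -s --max-time 2 'http://192.168.56.10:9200/logstash-*/_count?pretty') \
  || count='(Elasticsearch not reachable)'
echo "$count"
```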

Improved version
The input file is specified in the conf file and geoip is used (enabling geoip increases the import time considerably).
Reading via the file input did not work (cause unknown). With the redirect invocation above, the log could be imported even with geoip enabled.

[root@localhost logstash]# cat /etc/logstash/apache_import.conf
input {
#  stdin { }
  file {
    path => "/root/logs/ssl_access_log.2020-05-13"   
  }
}

filter {
  grok {
#     https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns
#     match => { "message" => "%{COMBINEDAPACHELOG}" }
      match => { "message" => "%{HTTPD_COMMONLOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    locale => "en"
  }
  mutate {
    replace => { "type" => "apache_access" }
  }
  geoip {
    source => ["clientip"]
  }
}

output {
  # stdout { codec => rubydebug }
  # elasticsearch { host => '172.17.4.199' } (from the reference site; fails because the option is now "hosts")
  elasticsearch { hosts => '192.168.56.10' }
}

/usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/apache_import.conf
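A likely cause of the file-input failure noted above: by default, Logstash's file input tails the file (it starts reading at the end and only picks up newly appended lines), and it persists its read position in a sincedb file, so an already-complete log never gets read. A sketch of an input block that forces a full read (these are standard file-input options, but this variant was not verified in this environment):

```
input {
  file {
    path => "/root/logs/ssl_access_log.2020-05-13"
    start_position => "beginning"   # read from the top instead of tailing
    sincedb_path => "/dev/null"     # do not persist the read position between runs
  }
}
```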