Are you sick of going into a Kubernetes cluster to look at the logs of an application? Do you want a clear overview of the access logs over several pods? Use these tools for a fluent log experience.

We are going to use fluent-bit, fluentd, elasticsearch and kibana as tools to ultimately visualise our logs. The following diagram, from this post on fluent-bit and fluentd, shows the flow of the data.

[Figure: flow chart of the data through fluent-bit and fluentd]

For more in-depth information on these tools, take a look at their own docs.

Data collection

We start with an example app that will generate logs. We will use a Java API from a previous post on deploying with Helm. For the complete code for this section take a look at this branch here. The app has a logback.xml in which we specify that logs are written to a file (see the relevant part here).

<appender name="javaLog" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <encoder>
       <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} %-5level --- [%thread] %logger{36} : %msg%n</pattern>
    </encoder>
    <file>/var/log/java.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
        <fileNamePattern>java.log.%d{yyyy-MM-dd}.%i</fileNamePattern>
        <maxFileSize>250MB</maxFileSize>
        <maxHistory>15</maxHistory>
        <totalSizeCap>500MB</totalSizeCap>
    </rollingPolicy>
</appender>

We give the pattern in which the logs are written and the file where they are stored in the container. To clean up old logs we rotate the log files as specified in the rollingPolicy. For more information on logback see for example here.
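
To get a feeling for what a line in /var/log/java.log looks like, here is a rough Python imitation of the logback pattern above. It is only a sketch: the function name `format_log_line` and the sample values are made up for illustration, and the logger name shortening of `%logger{36}` is not reproduced.

```python
from datetime import datetime


def format_log_line(timestamp, level, thread, logger, message):
    """Imitate '%d{yyyy-MM-dd HH:mm:ss.SSS} %-5level --- [%thread] %logger{36} : %msg%n'."""
    # %d{... SSS}: date with milliseconds (strftime only has microseconds, so we trim).
    date = timestamp.strftime("%Y-%m-%d %H:%M:%S.") + f"{timestamp.microsecond // 1000:03d}"
    # %-5level: level left-aligned and padded to 5 characters.
    return f"{date} {level:<5} --- [{thread}] {logger} : {message}\n"


# Hypothetical sample values, not taken from the real application.
line = format_log_line(
    datetime(2020, 6, 25, 10, 52, 29, 123000),
    "INFO", "main", "com.example.Application", "Started application",
)
print(line, end="")
# prints: 2020-06-25 10:52:29.123 INFO  --- [main] com.example.Application : Started application
```

Every log statement starts with a timestamp, which is what the multiline parsing later in this post relies on.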

We include this configuration by adding it to our docker image and referring to it in the entrypoint.

FROM openjdk:11-jre-slim

ADD target/java-spring-api-1.0.0-SNAPSHOT.jar /srv/app/app.jar
COPY logback.xml /srv/app

WORKDIR /srv/app

ENTRYPOINT [ "java", "-Dlogging.config=file:/srv/app/logback.xml", "-jar", "app.jar" ]

Now we can build and push our new Java application image to Docker Hub.

> docker build -t sybrenbolandit/java-spring-api:1.1.0 .
> docker push sybrenbolandit/java-spring-api:1.1.0

To send the logs somewhere else we are going to use a fluent-bit sidecar container. This means adding it to our Kubernetes Deployment so every pod of the Java application also has a fluent-bit container.

...
containers:
    - name: java-spring-api
    ...
    - name: fluent-bit
      image: fluent/fluent-bit
      resources:
        requests:
          cpu: 5m
          memory: 10Mi
        limits:
          cpu: 50m
          memory: 60Mi
      volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/

Please note that this sidecar container consumes very little memory and CPU. Its only function is to pick up the logs in the /var/log directory (remember we write the logs to /var/log/java.log) and send them to the data aggregation (see next section).

The configuration of the fluent-bit container is found in a separate ConfigMap. Here we configure which files are picked up and where they are sent. Furthermore, we want multiple lines that belong together, such as a stack trace, to be packaged as one log statement.

...
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush             1
        Log_Level         info
        Daemon            off
        Parsers_File      parsers.conf
    [INPUT]
        Name              tail
        Tag               java_log
        Path              /var/log/java.log
        Multiline         On
        Parser_Firstline  java_multiline
        DB                /var/log/flb.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10
    [FILTER]
        Name              record_modifier
        Match             *
        Record            hostname ${HOSTNAME}
        Record            environment {{ required "A valid .Values.environment entry required!" .Values.environment }}
        Record            app {{ include "java-spring-api.fullname" . }}
    [OUTPUT]
        Name              forward
        Match             *
        Host              fluentd
        Port              24224
  parsers.conf: |
    [PARSER]
        Name              java_multiline
        Format            regex
        Regex             /^(?<date>[0-9]+-[0-9]+-[0-9]+\s+[0-9]+:[0-9]+:[0-9]+\.[0-9]+)\s+(?<loglevel>\S+)\s+---\s+\[(?<thread>.*?)\]\s+(?<logger>\S+)\s+:\s+(?<message>.*)$/

In the [FILTER] section we add some values that are shared by the whole deployment and environment.
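
To make the multiline handling and the [FILTER] step a bit more tangible, here is a simplified Python model of what happens to the lines of java.log. It is a sketch under assumptions, not fluent-bit's actual code: the regex is a Python spelling of the first-line parser limited to the fields the logback pattern writes (date, level, thread, logger, message), and the function names and sample values are hypothetical.

```python
import re

# Python spelling of the first-line regex; named groups are written (?P<name>...) here.
FIRST_LINE = re.compile(
    r"^(?P<date>\d+-\d+-\d+\s+\d+:\d+:\d+\.\d+)\s+"
    r"(?P<loglevel>\S+)\s+---\s+"
    r"\[(?P<thread>.*?)\]\s+(?P<logger>\S+)\s+:\s+(?P<message>.*)$"
)


def group_multiline(lines):
    """Start a new record on each line matching FIRST_LINE, append other lines to the current one."""
    records = []
    for line in lines:
        match = FIRST_LINE.match(line)
        if match:
            records.append(match.groupdict())
        elif records:
            # Continuation line (e.g. part of a stack trace): glue it to the last record.
            records[-1]["message"] += "\n" + line
        # Lines before the first match are simply dropped in this sketch.
    return records


def add_shared_fields(record, hostname, environment, app):
    """Rough equivalent of the record_modifier filter: add fixed fields to every record."""
    return {**record, "hostname": hostname, "environment": environment, "app": app}


# Hypothetical sample input: one statement with a stack trace, followed by a normal one.
sample = [
    "2020-06-25 10:52:29.123 ERROR --- [main] c.e.Application : Request failed",
    "java.lang.IllegalStateException: boom",
    "\tat com.example.Application.main(Application.java:10)",
    "2020-06-25 10:52:30.456 INFO  --- [main] c.e.Application : Recovered",
]
records = [
    add_shared_fields(r, hostname="java-spring-api-abc12", environment="test", app="java-spring-api")
    for r in group_multiline(sample)
]
print(len(records))  # prints: 2
```

The four input lines end up as two records: the stack trace lines are folded into the message of the ERROR statement, and both records carry the shared hostname, environment and app fields.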

Data aggregation

All log statements are collected in fluentd. We will not go into the definitions of the Deployment, Service or HorizontalPodAutoscaler for fluentd. For these concepts take a look at this previous post on Kubernetes deployments, and this one on Kubernetes Autoscaling.

The complete code for this post is found here on github. We examine the Dockerfile because it extends the fluentd base image with the elasticsearch plugin. The plugin is required because Elasticsearch is the data storage that we will use in this post.

FROM fluent/fluentd:v1.9.1-1.0

# Use root account to use apk
USER root

RUN apk add --no-cache --update --virtual .build-deps \
        sudo build-base ruby-dev \
 && sudo gem install fluent-plugin-elasticsearch \
 && sudo gem sources --clear-all \
 && apk del .build-deps \
 && rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem

COPY fluent.conf /fluentd/etc/

USER fluent

We can build and push this image to our favorite docker registry.

> cd docker/fluentd
> docker build -t sybrenbolandit/fluentd:1.0.0 .
> docker push sybrenbolandit/fluentd:1.0.0

Next up is the configuration of fluentd. Note that it has to match the configuration of fluent-bit in the previous section. The input comes in on port 24224, which is the port the fluent-bit forward output sends to. The log statements are also tagged by fluent-bit with java_log, and this is exactly what fluentd matches on in the following.

...
data:
  fluent.conf: |
    <source>
      @type forward
      bind 0.0.0.0
      port 24224
    </source>

    <match java_log>
      @type elasticsearch
      host elasticsearch-elasticsearch-master
      port 9200
      index_name fluentd
      type_name fluentd
      <buffer>
          @type memory
          flush_thread_count 4
          flush_interval 3s
          chunk_limit_size 5m
          chunk_limit_records 500
      </buffer>
    </match>
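
The buffer section decides how records are batched before they are written to Elasticsearch. The following is a strongly simplified Python model of two of those settings, chunk_limit_records and flush_interval. It is not how fluentd is implemented: the class name `ChunkBuffer` and the `flush` callback are invented for illustration, and the size limit and the flush threads are left out.

```python
class ChunkBuffer:
    """Toy model: flush when the chunk is full or when the flush interval has passed."""

    def __init__(self, flush, chunk_limit_records=500, flush_interval=3.0):
        self.flush = flush                      # callback that receives a list of records
        self.chunk_limit_records = chunk_limit_records
        self.flush_interval = flush_interval    # seconds
        self.chunk = []
        self.last_flush = 0.0

    def add(self, record, now):
        self.chunk.append(record)
        if len(self.chunk) >= self.chunk_limit_records:
            self._flush(now)                    # chunk is full

    def tick(self, now):
        # Called periodically; flushes a non-empty chunk once the interval has elapsed.
        if self.chunk and now - self.last_flush >= self.flush_interval:
            self._flush(now)

    def _flush(self, now):
        self.flush(self.chunk)
        self.chunk = []
        self.last_flush = now


batches = []
buffer = ChunkBuffer(flush=batches.append)
for i in range(501):
    buffer.add({"n": i}, now=1.0)   # the 500th record fills the chunk and triggers a flush
buffer.tick(now=2.0)                # only 1 second since the last flush: nothing happens
buffer.tick(now=4.5)                # 3.5 seconds since the last flush: remaining record is sent
print([len(batch) for batch in batches])  # prints: [500, 1]
```

With these values a busy application sends batches of at most 500 records, while a quiet one still sees its logs arrive within a few seconds.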

Are you still with me? Let’s deploy that fluentd!

> helm install kubernetes-monitoring . \
    --namespace sybrenbolandit \
    --values environments/test/values.yaml

Storage and Visualization

This part is not the focus of this post, but we would like a fully functional example. Let's install the elasticsearch and kibana helm charts. (This is pretty CPU-intensive.)

> helm repo add bitnami https://charts.bitnami.com/bitnami
> helm -n sybrenbolandit install elasticsearch bitnami/elasticsearch
> helm -n sybrenbolandit install kibana bitnami/kibana \
          --set elasticsearch.hosts[0]=elasticsearch-elasticsearch-master,elasticsearch.port=9200

Note that the elasticsearch host and port are precisely the host and port we defined for fluentd to send the logs to.

Now that everything is set up we need to generate some logs. The easiest way is to restart the java-spring-api pod. Now let's expose our new Kibana locally.

> kubectl -n sybrenbolandit port-forward svc/kibana 5601:5601

Now we can go to http://localhost:5601 and get the Kibana UI.

To actually see our logs we need to create an index pattern. Click on the menu in the top left and go to Discover. Here you are asked to create an index pattern and a suggestion is made for fluentd.

Finish the wizard with fluentd as input. Now navigate to Discover again and you will see the logs we generated!
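
If you prefer to check outside Kibana that documents arrive in the fluentd index, something like the following Python sketch could be used. It assumes Elasticsearch is reachable without authentication on http://localhost:9200 (for example through a port-forward that is not part of this post). The helper names and the query are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, index, query, size=5):
    """Build a URL for the Elasticsearch _search endpoint using the q and size parameters."""
    params = urlencode({"q": query, "size": size})
    return f"{base_url}/{index}/_search?{params}"


def fetch_hits(url):
    """Return the stored documents (_source) of the hits. Needs a reachable Elasticsearch."""
    with urlopen(url, timeout=10) as response:
        body = json.load(response)
    return [hit["_source"] for hit in body["hits"]["hits"]]


# Assumed address; 'fluentd' is the index_name from the fluentd configuration.
url = build_search_url("http://localhost:9200", "fluentd", "loglevel:ERROR")
print(url)
# prints: http://localhost:9200/fluentd/_search?q=loglevel%3AERROR&size=5
```

Calling `fetch_hits(url)` against a running cluster should return the same records that show up in Discover, including the hostname, environment and app fields added by fluent-bit.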

Note that there is a lot more we can do with Elasticsearch and Kibana but this is out of scope for this post.

Hopefully you are now able to collect, aggregate, store and visualise your logs in a fluent way. Happy fluenting!