Problem:

You want send container logs from self-managed kubernetes cluster pods to AWS S3 bucket using Fluent Bit. Actually this is possible now because starting Oct 2020, Fluent Bit supports AWS S3 as a destination to route container logs. In this article we will learn how to setup Fluent Bit to send logs to S3 bucket.

 

Solution:

In order to send records into Amazon S3, follow these steps-

1. Create a ClusterRole (cluster-role.yaml). Set your namespace name.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
  namespace: fluentbit2S3
rules:
- apiGroups: ["", "extensions", "apps"]
  resources: ["deployments", "pods", "namespaces"]
  verbs: ["get", "list", "watch"]

 

2. Create ServiceAccount (service-account.yaml). Set your serviceaccount and namespace name.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: fluentbit2S3

 

3. Create ClusterRoleBinding (cluster-role-binding.yaml). Set your serviceaccount and namespace name.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
  namespace: fluentbit2S3
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: fluentbit2S3

 

4. Create Configmap for Fluent Bit (config-map.yaml). Update namespace and S3 bucket name.

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit
  namespace: fluentbit2S3
  labels:
    app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Parsers_File parsers.conf

    @INCLUDE input-*.conf
    @INCLUDE output-*.conf

    @INCLUDE filters.conf
  
  parsers.conf: |
    [PARSER]
        Name        s3_logs
        Format      json
        Time_Key    requestReceivedTimestamp
        Time_Format %Y-%m-%dT%H:%M:%S.%LZ
        Time_Keep   On
    
    [PARSER]
        Name        kubernetes-tag
        Format      regex
        Regex       (?<namespace_name>.+)\.(?<pod_name>.+)\.(?<container_name>.+)


  input-s3logs.conf: |
    [INPUT]
        Name              tail
        Tag               kube.<namespace_name>.<pod_name>.<container_name>
        Tag_Regex         (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
        Path              /var/log/containers/*.log
        Exclude_Path      /var/log/containers/*_fluentbit2S3_*.log
        Parser            s3_logs
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10


  output-s3.conf: |
    [OUTPUT]
        Name s3
        Match *
        bucket /fluentbit2S3/
        region ap-southeast-2
        use_put_object On
        s3_key_format /$TAG[1]/$TAG[3]/%Y/%m/%d/
        s3_key_format_tag_delimiters . 
        total_file_size 5M
        upload_timeout 1m
    
    [OUTPUT]
        Name  stdout
        Match *

  filters.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
        Regex_Parser        kubernetes-tag
        Labels              Off
        Annotations         Off

 

5. Create Daemonset file (daemon-set.yaml). Update namespace name, secretKeyRef name.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: fluentbit2S3
  labels:
    k8s-app: fluent-bit
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      app: fluent-bit
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: fluent-bit
        k8s-app: fluent-bit
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - env:
        - name: AWS_REGION
          value: ap-southeast-2
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: fluentbit-s3-key
              key: AWS_ACCESS_KEY_ID
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: fluentbit-s3-key
              key: AWS_SECRET_ACCESS_KEY
        image: 906394416424.dkr.ecr.ap-southeast-2.amazonaws.com/aws-for-fluent-bit:2.8.0
        imagePullPolicy: Always
        name: fluent-bit
        volumeMounts:
        - name: fluentbitconfigvol
          mountPath: /etc/fluent-bit/conf/
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        command: ["/fluent-bit/bin/fluent-bit"]
        args: ["-c", "/etc/fluent-bit/conf/fluent-bit.conf"]
      imagePullSecrets:
      - name: aws-registry
      volumes:
      - name: fluentbitconfigvol
        configMap:
          name: fluent-bit
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

 

6. Create a secret into Pod as environment variable. Set the name of the secret in daemonset file.

kubectl create secret generic fluentbit-s3-key --from-literal=AWS_ACCESS_KEY_ID=<REPLACE THE KAY ID HERE> --from-literal=AWS_SECRET_ACCESS_KEY=<REPLACE THE ACCESS KEY STRING HERE>

 

Now run this Fluent Bit DaemonSet on Kubernetes cluster. It will start sending the container logs to S3 bucket.

Note: Follow this doc for more options to configure AS S3 plugin – https://docs.fluentbit.io/manual/pipeline/outputs/s3#getting-started