Introduction to Beats: Collect Data from Anywhere & Level Up Your Elastic Stack 📡
Logstash did a very impressive job transforming our logs into documents that allowed us to understand and visualize how our applications are behaving. But that power comes at a price: Logstash can require quite a bit of memory and CPU to run, so making it also collect the data from various sources is not the most efficient move.
Elastic offered a solution to this concern and introduced Beats: lightweight shippers for forwarding and centralizing data. A Beat is installed as an agent on your servers to capture all sorts of operational data, like logs or network packets. Beats are great at gathering data and work efficiently with a large number of files. They can also handle backpressure (when Logstash is busy) and ensure that no data is lost during such periods.
And to be clear, Logstash can do most of what Beats does. So, why use Beats instead?
1. Lightweight Data Shipping: Beats is designed to be lightweight and requires fewer resources than Logstash. This makes it ideal for forwarding logs from a machine with limited resources.
2. Backpressure-Sensitive Protocol: Beats communicates with Logstash using a backpressure-sensitive protocol, which ensures that Beats doesn't overload Logstash by sending too much data at once. If Logstash is busy, Beats slows down its read rate. Logstash doesn't have this capability on its own.
3. At-Least-Once Delivery: Beats keeps track of the read offset in the files and ensures the at-least-once delivery of events. If Logstash goes down, Beats will remember where it left off when Logstash comes back online.
4. File Rotation and Wildcards: Beats can handle log rotation and wildcards in file paths, which makes it easier to collect logs from many different files.
5. Multiline Events: Beats can handle multiline events (like stack traces) on the client side before shipping them to Logstash (a filebeat.yml sketch illustrating this, together with the wildcard paths from the previous point, follows this list).
6. Distributed Architecture: Beats can be installed on every application server, which allows it to fetch logs locally and then send them to Logstash or Elasticsearch. This distributed architecture can be more scalable and resilient than having Logstash fetch logs from all your servers.
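As a taste of points 4 and 5, here is a minimal filebeat.yml sketch, assuming a hypothetical application that writes rotated logs under /var/log/myapp (the path and pattern are illustrative, not part of our setup):

filebeat.inputs:
- type: log
  paths:
    - /var/log/myapp/app-*.log   # the wildcard also matches rotated files
  multiline:
    pattern: '^[[:space:]]'      # indented continuation lines (stack-trace frames)...
    negate: false
    match: after                 # ...are appended to the preceding event

With this in place, a Java stack trace arrives in Logstash as a single event instead of dozens of one-line fragments.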
So, after pointing out why Beats can be a more efficient way to gather data, let's see how to use it in this simple example.
Configuring Beats ⚙️
In order to run Beats, you have to prepare a YAML file (filebeat.yml) for its configuration. It plays the same role as the pipeline configuration file we prepared for Logstash in the previous post.
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /logs/*.log

output.logstash:
  hosts: ["logstash:5044"]
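As a side note, Logstash in the middle is optional: if you don't need its parsing, Filebeat can ship straight to Elasticsearch by swapping the output block for something like the following (hostname taken from our Compose setup). In this post we keep Logstash, since we still want its grok filter.

output.elasticsearch:
  hosts: ["elasticsearch:9200"]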
And here is the logstash.conf pipeline it feeds into, now listening on the Beats port instead of reading files itself:

input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    target => "@timestamp"
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "logstash_index"
  }
}
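For reference, %{COMBINEDAPACHELOG} expects Apache's combined log format, so a made-up entry like this one would parse cleanly into client IP, timestamp, request, status code, and user-agent fields:

127.0.0.1 - frank [10/Apr/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "http://example.com/start.html" "Mozilla/5.0"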
Running Beats 🏃
As in all of my previous Elastic tools examples (Elasticsearch, Kibana, and Logstash), I will be using a docker-compose.yml file to run my ELK stack, and my Beats container along with it.
But I will not pull the Filebeat image directly from Docker Hub. Rather, I will build my own image, because when I tried the stock image I had read/write permission problems, which is a common issue when running Docker on Windows.
So, I will prepare a Dockerfile first for my Beats image; then, after it builds, I will include the resulting image in my docker-compose.yml.
Please note that you might not need to do this: you should first try to include the stock image in your docker-compose.yml, and if your Beats container doesn't run and keeps exiting, then you can try making a Dockerfile such as the one I will show you now.
# Use the official Filebeat image from the Elastic Docker registry
FROM docker.elastic.co/beats/filebeat:8.13.0
# Copy the Filebeat configuration file from the local directory into the container
COPY filebeat.yml /usr/share/filebeat/filebeat.yml
# Change the ownership of the configuration file to root:filebeat
USER root
RUN chown root:filebeat /usr/share/filebeat/filebeat.yml
# Switch back to the filebeat user
USER filebeat
Now, run docker build -t my-filebeat . in the directory of the Dockerfile and wait until the image is built successfully.
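If you want a quick sanity check of the configuration baked into the image, the Beats images pass their arguments straight to the filebeat binary, so running docker run --rm my-filebeat test config should print Config OK (filebeat test config is a built-in subcommand; the exact entrypoint behavior can vary between image versions).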
The next step is to run the docker-compose.yml file. But first, let's add the Beats image we built (my-filebeat), remove the logs volume from the previous post's Logstash section, and add it to the Beats section. It should end up like this:
version: '3.7'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.13.0
    environment:
      - node.name=elasticsearch
      - cluster.name=es-docker-cluster
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "xpack.security.enabled=false"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
  kibana:
    image: docker.elastic.co/kibana/kibana:8.13.0
    ports:
      - 5601:5601
    environment:
      ELASTICSEARCH_URL: http://elasticsearch:9200
    depends_on:
      - elasticsearch
  logstash:
    image: docker.elastic.co/logstash/logstash:8.13.0
    ports:
      - 5044:5044
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    command: logstash -f /usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch
  filebeat:
    image: my-filebeat
    volumes:
      - ./logs:/logs
    depends_on:
      - logstash

volumes:
  esdata1:
    driver: local

Perfect! Now let's run docker compose up in the directory of this docker-compose.yml file and see what happens.
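If the stack starts but no documents show up later, docker compose logs -f filebeat is a handy way to check whether Filebeat found the log files and managed to connect to logstash:5044.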
Collect Everything 🌪️
Before adding logs to the log files, I will open Kibana to check the number of docs in the logstash_index.
Our index is empty as expected. Let's now add logs to the 2 log files and see what happens.
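If you're following along, appending a line in Apache's combined format (this one is made up) to either file is enough for Filebeat to notice, for example:

echo '192.168.1.10 - - [10/Apr/2024:14:02:11 +0000] "POST /login HTTP/1.1" 302 512 "-" "curl/8.0"' >> logs/logfile.log

Any line matching %{COMBINEDAPACHELOG} will be parsed into structured fields by our Logstash filter.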
Okay, now Beats should be monitoring these 2 files; it will ship the newly added logs to Logstash, and Logstash will do its magic.
And as expected, the 8 logs from the 2 different files were added to logstash_index in Elasticsearch. But what if we need to tell the logs apart depending on their source?
Visualize Logs 📊
Let's click on Discover to see the documents that were created.
So, let's click on Visualize to see how many came from logfile-2.log and how many from logfile.log. Filebeat records each event's source file in the log.file.path field, so splitting the visualization on that field tells the two files apart.
In conclusion, the Elastic Stack is ever-growing, and it provides many more features and details than I have shown you in this or the previous posts.
I encourage you to build on whatever knowledge you've gained from any of my posts and dig deeper into the different capabilities that the Elastic stack offers to achieve anything that your system needs or might need in the future.