For those out there searching for “What is syslog?”, this post has the answers.
Simply put, syslog handles a very important task—collecting events—and is present in almost all systems and peripherals out there. It’s the standard used to collect events from an ever-growing number of devices. Syslog is often associated with Ubuntu and servers, but it’s certainly much more than that. The protocol is also used by your printer, router, phone, and *nix OS. If you can name a device, it probably uses syslog or one of its versions and variants.
Storing events is only part of the job; an even more critical part of logging is being able to check the messages, because visibility is what makes them useful. This post is about syslog, and later in this article I’ll cover in detail everything related to this essential protocol: format, log levels, transmission, and visibility.
What It Does
First, just a tiny bit of a history lesson before we dig in. Syslog has been around for quite some time. It dates back to the ’80s (courtesy of one Eric Allman), and its adoption was immediate. As I mentioned above, it’s the first thing that comes to mind when I hear “Linux,” but its usage has extended way beyond that. Syslog is now present on network devices too, routers in particular.
If you feel nerdy enough to take a look at a deeper level, the syslog protocol standard is documented in RFC 5424.
Syslog is how a network collects events. The information about those events can include:
- Access logging
- Wrong password login attempts
- Anomalies in the system functioning
- Hardware errors
- Software errors
- OS messaging
Its advantages don’t stop there. A common use case nowadays involves cybersecurity. Say that one of your servers has been compromised. The attacker can easily erase the server’s logs. But if you have syslog configured and a different server receiving all of the events as they happen, you can reconstruct a timeline of the attack and respond to it better.
Another use case you might relate to involves auditing. Timestamping events and tracking severity levels are vital advantages that make this protocol essential when auditing a network and its responses to different situations.
The timestamp is, of course, a crucial part of the logged event. But it’s not all that a syslog message can carry. We’ll discuss below how that looks, with the different formats and the several log levels available.
A typical syslog message has the following elements:
- Header
- Structured data
- Message
Reading the specification, we learn that within the header are several parts:
- Priority: discussed below
- Version: the version of the syslog protocol in use; might come in handy for processing
- Timestamp: full timestamp of the event
- Hostname: hostname of the machine originating the message; the standard recommends using an FQDN in this field
- Application: identifies the application or device originating the message
- Process ID: good old PID, which, along with the application name, identifies the originating event
- Message ID: the type of message being sent, sometimes associated with the protocol being triggered; TCPIN or TCPOUT are good examples
What follows is the structured data (SD) part. I won’t get too deep into what it does, but it’s metadata about the message itself. The SD field isn’t required, but if it’s not present, the NILVALUE character (“-”) should be sent in its place. The specification, however, warns us about a possible conflict with collectors if the SD is malformed.
Last but not least, the final element of the syslog event is the message. It should be encoded in UTF-8 and is normally free-form text that’s easy to read (one can hope), providing context for what’s going on. The message, along with the context provided by the SD and the header information, should identify an event, tying it to an element in the network, the running application, or even the server transmitting the data.
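To make the layout concrete, here’s a minimal Python sketch that splits one of the example messages from the RFC 5424 spec into the fields described above. The regex is a deliberate simplification for illustration: it assumes a well-formed message and doesn’t handle every corner of the grammar.

```python
import re

# One of the example messages from the RFC 5424 spec (BOM omitted for brevity).
sample = ("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com "
          "su - ID47 - 'su root' failed for lonvick on /dev/pts/8")

# Simplified parser: header fields are space-separated; "-" is the NILVALUE.
pattern = re.compile(
    r"<(?P<pri>\d{1,3})>"        # priority
    r"(?P<version>\d+) "
    r"(?P<timestamp>\S+) "
    r"(?P<hostname>\S+) "
    r"(?P<app>\S+) "
    r"(?P<procid>\S+) "
    r"(?P<msgid>\S+) "
    r"(?P<sd>-|\[.*\]) "         # structured data, or "-" when absent
    r"(?P<msg>.*)"               # free-form message
)

fields = pattern.match(sample).groupdict()
print(fields["hostname"])   # mymachine.example.com
print(fields["msg"])        # 'su root' failed for lonvick on /dev/pts/8
```

In this sample both the process ID and the structured data are NILVALUEs, which is common for simple originators.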
At this point, it’s worth mentioning that although it’s standardized, syslog isn’t as consistent as it should be. The formatting can vary depending on the developer, the manufacturer, the system, and so forth. While many of the messages will be perfectly readable (in human terms), some applications may not care about that and will change the formatting. A potential cause might be an old version of rsyslog or syslog-ng. A nonstandard format might make it hard to curate and process the messages, for which you’ll need a specialized tool. More on that below.
Every syslog message transmitted includes the severity level of that message. The following table shows the standardized log levels available in the protocol.
| Level | Keyword | Description |
|---|---|---|
| 0 | Emergency | The system is unusable (also referred to as a panic condition) |
| 1 | Alert | Action must be taken immediately |
| 2 | Critical | Critical conditions |
| 3 | Error | Error conditions |
| 4 | Warning | Warning conditions |
| 5 | Notice | Normal but significant conditions |
| 6 | Informational | Informational messages |
| 7 | Debug | Debug-level messages |
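The severity level and the priority field from the header are linked: the priority is a single number that packs the facility (what kind of process sent the message) together with the severity, as PRI = facility × 8 + severity. A small Python sketch of that arithmetic:

```python
# PRI packs facility and severity into one number: PRI = facility * 8 + severity.

def encode_pri(facility: int, severity: int) -> int:
    return facility * 8 + severity

def decode_pri(pri: int) -> tuple[int, int]:
    return pri // 8, pri % 8

# Facility 4 (security/auth) at severity 2 (Critical) gives the "<34>"
# seen at the start of many example messages.
print(encode_pri(4, 2))   # 34

# Decoding 165: facility 20 (local4), severity 5 (Notice).
print(decode_pri(165))    # (20, 5)
```

This is why a collector can filter by severity without any extra metadata: it just takes the PRI modulo 8.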
A limitation of UDP (used by syslog by default) is that it doesn’t confirm receipt to the originator, which means that packet loss can be a problem. This raises concerns about syslog’s ability to adequately check and collect the logs. As we learned previously, syslog is software agnostic, which means that your syslog server can collect data from a number of different origins (servers, *nix implementations, routers, IoT devices, and so on), which can be overwhelming.
In order to transmit those messages, syslog operates on UDP port 514, so remember to keep that port open for messages to go through. If you’re looking for alternatives to UDP (remember that UDP is, by nature, unreliable and doesn’t provide flow control, retransmission, or connection tracking), TCP can also handle syslog transmissions via the same port 514. If you want to take security one step further, rsyslog can transmit the messages using TLS, effectively securing your messages as they travel over the wire.
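If your application runs on Python, the standard library can already ship its logs to a collector over UDP. A minimal sketch, assuming a collector listening on the default port (the address here is a placeholder; swap in your own server):

```python
import logging
import logging.handlers

# Point this at your own collector; UDP port 514 is the syslog default.
# Pass socktype=socket.SOCK_STREAM to use TCP instead.
handler = logging.handlers.SysLogHandler(address=("localhost", 514))

logger = logging.getLogger("myapp")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Python's WARNING level maps to syslog severity 4 (Warning).
logger.warning("disk usage above 90%")
```

Because this is UDP, the call returns immediately whether or not anything is listening, which is exactly the delivery caveat discussed above.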
As far as packet size goes, a lot of information out there will tell you to keep it under 1K. In fact, the first version of the spec (from 2001) suggested as much. However, the newer revision (from 2009) sets no upper limit, leaving it up to each implementation to limit the size of the message. It’s something you have to work out with your collector and transport layer implementations, but it’s no longer a protocol-level limit.
Once the data is centralized in one location, you can visualize the information via graphs and diagrams. However, as I’ve mentioned a few times now, the amount of data being collected by the server can be astonishing. The best friend of anyone trying to work all of this out is a tool that can filter and interpret the sheer number of messages flowing through. You’re more than welcome to try for yourself. But, you’ll spend a lot of time navigating through endless timestamps and hardware events. Being able to sort through an infinite number of lines and events quickly is particularly important when your infrastructure is under attack. If you’re going through a DevOps incident, you’ll wish you had an automated tool in your corner.
If you’re willing to add more tools to your skill set, check out XPLG’s syslog server. It’s not just capable of streaming events from any source out there. This tool will alert you in real time if custom data rules are met. Deploying it will only take five minutes. And if you’re not ready to commit, XPLG’s Free Forever pricing tier has you covered.
Ultimately, the need to implement syslog should be clear. There’s plenty more to it than what we can cover in a single post. What’s really important, though, is to understand that as your company operates and grows, the need to keep an eye on the important data is imperative. You don’t want to be in a situation where you’d say to yourself, “If only I had that logged.”
Guillermo Salazar is the author of this post. Guillermo is a solutions architect with over 10 years of experience across a number of different industries. While his experience is based mostly in the web environment, he’s recently started to expand his horizons to data science and cybersecurity.