Web server protection: Logs and web server security
This article on logs and web server security continues the Infosec Skills series on web server protection. While many active and passive defenses can be employed to secure a web server and mitigate the risk of an attack against it, one of the most powerful methods involves understanding and using web server logs. The web server log is, quite simply, a guest book or sign-in sheet that records visitors to your organization’s website, including some basic information about them.
In the event of a security incident, one must remember that cyber attackers almost always leave a trace of their work; the difficulty is knowing where to look and what to look for. Logs, therefore, are often the best first place to look.
What are web server logs?
Web server logs capture a range of data about the requests handled by the web server on your network. By default, these log files are often written to a text file in the Common Log Format, and they can be customized to collect a range of information that passes through your web server.
While this will be covered in more detail later in this article, some of the data that can be collected, stored, and analyzed for incident remediation include: client IP addresses, user agent strings, date, time, server name, server IP and services running, among many others.
The log can also capture requests from other computers that request data from the web server and internal actions completed by the server itself, such as updates. With this information, you can see who is visiting your website, where they go within it and what types of actions they are taking.
Types of logs
A web server’s access log captures information about the requests coming into the server and, by default, is enabled on both Apache and Microsoft IIS servers. As mentioned previously, this information can include the specific pages viewed, the browser used and the client IP address, but also information such as how long the web server took to process the request and send the response. However, the types of information captured and the information generated in each entry can be customized to aid in log analysis.
In an Apache web server, by default, an access log entry is composed of the following components:
LogFormat: [host IP] [identity] [user] [date] [request] [status code] [bytes]
Additional fields, such as the referrer and the user agent string (which identifies the browser type), as well as cookies and transfer sizes, can also be captured, if configured to do so.
Altogether, an example log entry could look like the following.
127.0.0.0 - john [12/Dec/2012:17:23:17 -0700] "GET /apache_pb.gif HTTP/2.0" 200 3262 "http://www.apache.org.html" "Chrome/7.08 [en] (Win10)"
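To make the field layout concrete, here is a minimal Python sketch that splits an entry like the one above into named fields. The regular expression and the function name `parse_line` are illustrative assumptions, not part of Apache; the pattern assumes the combined format (the Common Log Format fields followed by a quoted referrer and user agent) and does not handle escaped quotes inside a field.

```python
import re

# Assumed pattern for the "combined" layout: host, identity, user, date,
# request, status code and bytes, optionally followed by the quoted
# referrer and user agent. Escaped quotes inside a field are not handled.
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<identity>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED_RE.match(line)
    return match.groupdict() if match else None


# Sample entry modeled on the article's example.
sample = (
    '127.0.0.0 - john [12/Dec/2012:17:23:17 -0700] '
    '"GET /apache_pb.gif HTTP/2.0" 200 3262 '
    '"http://www.apache.org.html" "Chrome/7.08 [en] (Win10)"'
)
entry = parse_line(sample)
print(entry["host"], entry["user"], entry["status"], entry["request"])
```

Once entries are dictionaries rather than raw strings, filtering by status code, client IP or user agent becomes a simple comparison.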
The location of the access log and the content of each entry are configured with the CustomLog directive, with the format itself defined by the LogFormat directive. A web server log can also be configured to conditionally record or exclude entries based on characteristics of the client request or on environment variables. These criteria can be set with the SetEnvIf directive in the web server configuration.
However, it should be noted that web server logs do have some limitations. For example, log entries on their own cannot distinguish true, unique user traffic from robot (bot) traffic. This means that web crawlers or other bots that visit and interact with the web server have their activity logged alongside unique user traffic, which can be difficult to separate out when analyzing large amounts of data.
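One rough way to reduce this noise is to flag entries whose user agent string announces itself as a crawler. The sketch below is only a heuristic under stated assumptions: the keyword list `BOT_HINTS` and the function `looks_like_bot` are hypothetical, the list is not exhaustive, and because the user agent is self-reported, a malicious client can simply claim to be a normal browser.

```python
# Assumed, non-exhaustive substrings that well-behaved crawlers tend to
# include in their user agent strings.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")


def looks_like_bot(user_agent):
    """Return True if the user agent contains a known crawler hint."""
    lowered = user_agent.lower()
    return any(hint in lowered for hint in BOT_HINTS)


agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
]
for agent in agents:
    print(looks_like_bot(agent), agent)
```

A check like this only separates self-declared bots from everything else; it cannot prove that the remaining traffic comes from human visitors.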
Similarly, some organizations and internet service providers also use proxies to send traffic from their network to the network that hosts the web server. Because of this, an IP address may appear multiple times in the web server log, but it is actually just the IP range of the proxy server that is used to pass the request to the web server from a host serviced by the proxy. Therefore, multiple users could use the same IP address when visiting a website, which could make it difficult to trace activity to a unique user.
While breaking down each component of the Access Log is not in the scope of this article, the Apache Software Foundation and Microsoft Support are great resources for more detailed information.
Finally, error logs capture information about the errors that may be generated when the web server cannot process a request. This could be information about missing files or broken links, hung services or diagnostic information about the web server itself. The error log is of more use for troubleshooting the web server, but if an attacker probes for vulnerable web services in a way that generates errors, this information can be used by security professionals. On Apache, the error log is typically found in error.log on Windows or in error_log on Unix.
Logs & web server security
The biggest benefit of web server logs is the simplicity and consistency with which information is generated for each web request, which makes the logs a relatively easy source for initial triage in case of a security incident. In such an event, web server logs from across an enterprise network can be pooled together with those from application and database servers to identify two things: first, the vector of attack and second, the nature of the traffic that traversed the web server.
However, as can be imagined, whether in the case of an advanced persistent threat that maintains access to a network over a long period of time or a short-term, destructive attack, there can be a lot of log data to review. In either case, security professionals can use manual or advanced log reviews in an attempt to identify potential anomalies, such as:
- Malicious requests or inputs from hosts
- Unusual spikes in web access trends
- Changes in usual traffic proxies or referrers
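As an illustration of the second item, the following Python sketch counts requests per hour and flags hours that stand well above the typical level. The function names, the use of the median as a baseline and the `factor` of 3 are assumptions chosen for the example; a real review would tune them to the site's normal traffic pattern.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def hourly_counts(timestamps):
    """Count requests per hour from Apache-style timestamps."""
    counts = Counter()
    for stamp in timestamps:
        moment = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        counts[moment.strftime("%Y-%m-%d %H:00")] += 1
    return counts


def spike_hours(counts, factor=3.0):
    """Return hours whose request count exceeds factor times the median.

    Hours with no traffic do not appear in counts, so the baseline only
    reflects hours that saw at least one request.
    """
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(hour for hour, n in counts.items() if n > factor * baseline)


# Illustrative data: two quiet hours followed by a burst.
timestamps = (
    ["12/Dec/2012:10:05:00 -0700"] * 2
    + ["12/Dec/2012:11:15:00 -0700"] * 2
    + ["12/Dec/2012:12:30:00 -0700"] * 10
)
print(spike_hours(hourly_counts(timestamps)))
```

With the sample data, only the 12:00 hour is reported, since its ten requests exceed three times the median of two.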
In the wake of an attack, but prior to any reviews, it is best practice to take the affected web server(s) offline and isolate them so as to preserve the state of the system while the logs are reviewed. Next, it is recommended to take a forensic copy of the server so reviews can be conducted without affecting the original equipment. Finally, depending on the number of web servers present on the network or potentially involved in an incident, the time period in question and the number of log entries, organizations can then choose to employ manual or advanced log reviews.
By now you know that log files provide you with a precise view into the behavior of a server as well as critical information about how, when and to whom a web server is presenting information. In the event of a security incident, such as when databases are dumped, websites defaced or files removed, a web server log may be the only piece of evidence about what happened.
To begin any investigation, one needs to identify what types of evidence to look for. While it can vary by the type of attack, common anomalies to look for include:
- Attempts to access hidden or unusual files
- Brute-force attempts at administrative services
- Activity occurring at unusual times or toward uncommon services
- Attempts to perform remote code execution or SQL injection
- Attempts to use file inclusion or cross-site scripting
- General network reconnaissance, such as port scans
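To show how one of these anomalies might be surfaced, the sketch below counts failed authentication responses per client IP as a rough indicator of brute-force attempts. It assumes entries have already been parsed into (IP, path, status) tuples; the function name, the choice of status codes 401 and 403 and the `threshold` of 5 are illustrative assumptions. Note that many login forms answer a failed attempt with status 200, in which case this check would miss it.

```python
from collections import Counter


def suspected_brute_force(entries, threshold=5):
    """Return {ip: failures} for clients with many 401/403 responses."""
    failures = Counter(
        ip for ip, _path, status in entries if status in (401, 403)
    )
    return {ip: n for ip, n in failures.items() if n >= threshold}


# Illustrative, already-parsed entries: (client IP, path, status code).
entries = [("203.0.113.5", "/admin/login", 401)] * 6 + [
    ("198.51.100.7", "/index.html", 200),
    ("198.51.100.7", "/admin/login", 401),
]
print(suspected_brute_force(entries))
```

Here only 203.0.113.5 is reported, because a single failed login from 198.51.100.7 stays below the threshold.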
As you can imagine, checking every single web server log entry can be impractical and time-consuming. Instead, security professionals commonly employ a number of different tools that range in complexity and level of automation.
Where there are not many logs to review, or if the evidence needed is very specific, logs can be reviewed manually at the command line using tools such as “grep.”
For example, if there was an indication that a SQL injection occurred via a web server, grep can be used to search for the keyword “union” in the logged request URLs. A match such as “union select 1,2,3,4,5” in the web address may indicate an attempted SQL injection, and the matching entry also lists the IP address from which the request was made.
Similarly, keywords such as “/etc/passwd” can be searched to find an attacker attempting a local file inclusion, while a search for specific files or users probed can return similar results to continue the investigation.
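The same keyword searches can be sketched in Python for cases where the matches need further processing. The patterns in `SIGNATURES` and the sample log lines are illustrative assumptions; they cover only the plain-text forms of the keywords discussed above and would miss payloads that are URL-encoded or otherwise obfuscated.

```python
import re

# Assumed signatures for the two searches discussed above.
SIGNATURES = {
    "sql_injection": re.compile(r"union(?:\s|\+|%20)+select", re.IGNORECASE),
    "file_inclusion": re.compile(r"/etc/passwd|\.\./", re.IGNORECASE),
}


def scan(lines):
    """Return (client_ip, label) pairs for lines matching a signature."""
    hits = []
    for line in lines:
        client_ip = line.split(" ", 1)[0]  # first field of the entry
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits.append((client_ip, label))
    return hits


# Illustrative log lines, not taken from a real server.
lines = [
    '203.0.113.5 - - [12/Dec/2012:17:23:17 -0700] '
    '"GET /item.php?id=1+union+select+1,2,3,4,5 HTTP/1.1" 200 512',
    '198.51.100.7 - - [12/Dec/2012:17:24:02 -0700] '
    '"GET /view.php?page=../../etc/passwd HTTP/1.1" 404 210',
    '192.0.2.10 - - [12/Dec/2012:17:25:40 -0700] '
    '"GET /index.html HTTP/1.1" 200 1024',
]
for client_ip, label in scan(lines):
    print(client_ip, label)
```

Each hit pairs the suspicious pattern with the requesting IP address, which is the same information a grep match would expose in the raw line.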
Log File Navigator or MySQL (filtering or summarizing)
If there are too many log entries for a manual review, other tools such as Log File Navigator or MySQL can assist in filtering or summarizing large amounts of data. Log File Navigator is an open-source, advanced web server log viewer that is controlled at the command line, available for Linux and macOS. Log File Navigator, or LNAV, is installed on a local machine, not the web server itself, and supports a wide range of default log formats ranging from CLF to sudo, syslog and more.
In addition to presenting data at the log entry level, LNAV or other SQL-enabled tools like it can present data using SQL queries and statements that search for information within the entries. Another advantage is that LNAV supports syntax highlighting, so particular statements or keywords are highlighted across the log file for quicker triage.
For example, LNAV can be used to look for common terms such as .php requests, “password” file requests, error statuses such as “denied,” “granted” or “failed,” or the use of “su” permissions.
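To give a feel for this kind of SQL-based summary without depending on any particular tool, the sketch below loads a few parsed entries into an in-memory SQLite database from Python and counts failed .php requests per client. The table name `requests`, its columns and the sample rows are hypothetical; this is not LNAV's own schema or query syntax.

```python
import sqlite3

# Illustrative, already-parsed entries: (client IP, path, status code).
rows = [
    ("203.0.113.5", "/admin.php", 403),
    ("203.0.113.5", "/login.php", 401),
    ("203.0.113.5", "/config.php", 404),
    ("198.51.100.7", "/index.html", 200),
    ("198.51.100.7", "/search.php", 200),
    ("192.0.2.10", "/login.php", 401),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (ip TEXT, path TEXT, status INTEGER)")
conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", rows)

# Count .php requests that ended in a client or server error, per IP.
summary = conn.execute(
    """
    SELECT ip, COUNT(*) AS failed
    FROM requests
    WHERE path LIKE '%.php' AND status >= 400
    GROUP BY ip
    ORDER BY failed DESC, ip
    """
).fetchall()
conn.close()

for ip, failed in summary:
    print(ip, failed)
```

Grouping and ordering in SQL turns thousands of raw entries into a short ranked list, which is the main appeal of this class of tools for triage.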
The final group of tools includes those that are tuned to be large-scale log analyzers. Microsoft’s LogParser and Scalp, an open-source tool for Apache logs, are common resources used by security professionals and web administrators. Both tools are free to use and run from the command line. LogParser, for instance, supports a variety of log formats and uses SQL-like statements to process log files quickly with a robust number of built-in features, while Scalp can automatically search log files for common signs of security issues and call them out for faster triage.
Bringing it all together
Web server logs are only one tool that security professionals can use to mitigate the risk of attacks and respond when one does occur. While the practice of reviewing logs has evolved as more and more tools have become available to the security professional, ultimately logs can only reveal part of the total picture that describes how a cyberattack occurred, what may have been affected by it and who carried it out.
However, armed with results from manual and automated log reviews, security professionals can trace back IP addresses, identify which security holes were exploited and the types of information probed and possibly stolen so more advanced incident response activities can continue.
- Apache HTTP Server Version 2.5, Apache
- Log Parser 2.2, Microsoft
- Using Logs to Investigate – SQL Injection Attack Example, Acunetix