In today’s ever-changing business landscape, companies that operate on a software-driven model are positioned to be the most successful. These businesses recognize the power of transforming the enormous volumes of data generated by digital operations into real-time insights that propel further success. The ability to do this in real time, all the time, across multiple functional disciplines, lies at the heart of continuous intelligence.
To manage and monitor infrastructure effectively, a web admin needs clear and transparent information about the types of activity going on within their servers. Server logs provide a documented footprint of all traffic and errors that occur within an environment. Apache has two main log files: the error log and the access log.
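As a rough illustration of what a single access log entry contains, the sketch below parses one line in Apache’s Common Log Format with Python. The regular expression, the `parse_access_line` name, and the dictionary keys are illustrative assumptions rather than anything Apache provides, and a server configured with a custom `LogFormat` would need a different pattern. The sample line is the example entry used in the Apache HTTP Server documentation.

```python
import re
from datetime import datetime

# Common Log Format: %h %l %u %t "%r" %>s %b
# (assumed here; a custom LogFormat needs a different pattern)
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_line(line):
    """Return a dict of fields for one access log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Apache writes "-" when no response body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
entry = parse_access_line(sample)
print(entry["host"], entry["request"], entry["status"], entry["size"])
```

Returning `None` for lines that do not match keeps the parser tolerant of truncated or differently formatted entries, which is usually preferable to raising in the middle of a large log file.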
If you’re reading this article, you probably already use Apache and are familiar with its features and functionality. Users of Apache, or of any technology, should also be aware of the importance of monitoring application logs to ensure that applications are running optimally, and of using those logs to identify problems before they affect the consumers of a site.
Apache, an open source web server that dates back to the mid-1990s, is one of the oldest web servers still in widespread use today. It remains one of the most popular web server platforms out there (even though NGINX has been slowly eroding its market share for a while). This means that there is a good chance that you’ll find yourself working with Apache logs at some point, if you’re not already. Below, I explain how logging in Apache works, where to find logs, what to look for in them and which tools can help you work with them. For this post, I’ll focus on Apache error logs. Some of these tips apply to Apache access logs as well.
When it comes to IT operations, there are two main categories of Apache log analytics: error monitoring and server optimization. In the first part of this series, we discussed the former. In this article, we’ll introduce several types of Apache server optimization.

As in Part I, all of these articles provide a hands-on walkthrough of Sumo Logic’s Apache log analytics capabilities. You should walk away with the confidence to identify server configuration issues or malicious clients in your own Apache log data.

Understand Your Apache Traffic

The first step towards gaining actionable insights into your Apache server operations is to examine basic traffic metrics like the number of requests and the total bytes served. Sumo Logic includes many built-in dashboards to monitor these metrics in real time.

Pivoting hits and traffic volume against other fields in your Apache logs helps you find all sorts of web optimization opportunities. For example, the above chart tells you that only one of your servers experienced a traffic spike. After you know which server to look at, you might continue your analysis by examining hits against URL or referrer to identify potential caching issues or hotlinked content, respectively.

Learn more about analyzing Apache traffic ›

Identify and Block Malicious Robots

Compared to grep’ing your log files, identifying suspicious traffic with an Apache log analyzer is a breeze. A powerful query language lets you search for all sorts of client behavior in your Apache logs, and visualizations provide an at-a-glance view of potential bots.

In addition to examining traffic from known bots like Googlebot, Sumo Logic also provides several queries designed to help you identify unknown bots.
For example, the above chart identifies two suspicious IP addresses by analyzing the request frequency of every client in your Apache logs.

By quickly identifying misbehaving bots, scraping behavior, and potential DoS attacks, Sumo Logic makes sure your Apache infrastructure is serving your customers as efficiently as possible.

Learn more about identifying bots ›

Optimize Apache Response Times

Both traffic and robot analysis are concerned with improving the performance of your web applications. Apache lets you record response times in a custom access log format, which gives you much more visibility into your system performance. For example, the following area chart clearly indicates that one of your servers is overloaded.

Comparing response time to traffic volume, URL, and server quickly identifies the root cause of any performance issues. And the faster you can catch performance issues, the less impact they’ll have on your users.

Learn more about optimizing response times ›

Summary

Analyzing robots, traffic, and response times isn’t typically as mission-critical as the topics we discussed in the first part of this series. Fixing performance bottlenecks can, however, provide a much better user experience.

The goal of Apache log analytics is to be able to say “I know what’s wrong and we’re already fixing it” whenever your customers call you up about a problem with your website. Instead of SSH’ing into all your servers and inspecting individual log files, all you have to do is glance at your dashboards to figure out where you should start your debugging.

We hope that these hands-on walkthroughs have provided a glimpse into the potential insights you can glean from your own Apache log data.
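The request-frequency analysis described in the robots section can be approximated offline with a few lines of Python. This is a minimal sketch and not Sumo Logic’s query: it assumes each log line starts with the client IP (as in Apache’s Common Log Format), counts requests over whatever batch of lines it is given rather than over a time window, and flags clients whose count exceeds an arbitrary multiple of the median. The function names, the `factor` threshold, and the sample lines are all hypothetical.

```python
import re
from collections import Counter
from statistics import median

# Assumes the client address is the first whitespace-delimited field (%h).
CLIENT_PATTERN = re.compile(r'^(\S+) ')


def requests_per_client(lines):
    """Count how many log lines each client address produced."""
    counts = Counter()
    for line in lines:
        match = CLIENT_PATTERN.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def suspicious_clients(counts, factor=10):
    """Return clients whose request count exceeds factor times the median count."""
    if not counts:
        return []
    threshold = factor * median(counts.values())
    return sorted(ip for ip, hits in counts.items() if hits > threshold)


# Made-up sample data using documentation-reserved IP ranges.
sample = (
    ['192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.1" 200 512'] * 2
    + ['198.51.100.7 - - [10/Oct/2000:13:55:40 -0700] "GET /about HTTP/1.1" 200 1024'] * 3
    + ['203.0.113.99 - - [10/Oct/2000:13:55:41 -0700] "GET /login HTTP/1.1" 200 256'] * 60
)

counts = requests_per_client(sample)
flagged = suspicious_clients(counts, factor=10)
print(flagged)
```

A median-based threshold is used here because a single very noisy client would inflate a mean and partly hide itself; any real deployment would need to tune the threshold and group requests by time interval.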
It’s hard to understand the benefits of an Apache log analyzer without actually using one to explore your own logs. So, we’ve created a hands-on walkthrough of Sumo Logic’s Apache log analytics capabilities. If you’ve never used a dedicated log analyzer, this series will revolutionize your outlook on monitoring an Apache infrastructure.

Getting a Handle on Serious Errors

Some aspects of Apache log analytics involve optional optimizations, but gaining visibility into your servers’ critical errors is an absolute necessity. By providing a powerful query language and built-in visualizations, Sumo Logic provides instant insight into your Apache error logs.

This lets you quickly filter log messages by their error level, identify trends in error reasons, determine if malicious client IPs are behind serious errors, and monitor important server events in real time.

Learn more about analyzing critical Apache errors ›

Optimizing Status Code Errors

Sifting through access logs to find 400- and 500-level errors is a pain for any system administrator. In the worst case, you’re directly grep’ing your access log file. In the best case, you’re piping your logs into a database so you can query it with SQL (but even that probably took a whole lot of finagling).

Either way, it’s almost impossible to identify real-time trends in status code errors without a way to aggregate and visualize results. Sumo Logic dashboards make it easy to monitor 404 errors, identify 404 URLs and referrers, and even set dynamic thresholds for what constitutes an “abnormal” amount of 500-level errors.

Learn more about analyzing status code errors ›

Keeping Track of All Your Servers

As a data structure, Apache logs are pretty simple. But, when you have a hundred servers generating millions of log messages, getting to the root cause of an issue is no trivial task.
It’s not until you try aggregating logs from dozens of servers that you begin to see the true benefits of a dedicated log analysis tool.

Sumo Logic ensures an automated, reliable collection process and puts all of your logs in one place. This means you can query logs from hundreds or even thousands of servers in a single interface and find correlations across clusters. And, thanks to our multi-tenant cloud, operations on terabytes of log data are fast.

Learn more about monitoring multiple Apache servers ›

Summary

If you’re not asking these kinds of questions of your log data, you’re ignoring valuable insights. Apache log analytics doesn’t just reduce MTTR and increase uptime; it ensures your IT infrastructure is living up to its full potential.

Error monitoring is only one facet of Apache log analytics. There’s a whole other class of insights you can find in your log data, including optimizing web resources, identifying misbehaving bots, and speeding up Apache response times. Stay tuned for the second half of the Introduction to Apache Log Analytics.
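To make the status code analysis from this article concrete, here is a minimal Python sketch of the “grep it yourself” baseline: it tallies which URLs return 404 and computes the share of requests that end in a 500-level error. It assumes access log lines in Apache’s Common or Combined Log Format; the regular expression, the `summarize_errors` name, and the sample lines are hypothetical and are not taken from Sumo Logic.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it,
# e.g. "GET /index.html HTTP/1.1" 200
REQUEST_STATUS = re.compile(r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3})\s')


def summarize_errors(lines):
    """Return (Counter of 404 URLs, fraction of requests with a 5xx status)."""
    not_found = Counter()
    total = 0
    server_errors = 0
    for line in lines:
        match = REQUEST_STATUS.search(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        total += 1
        status = int(match.group("status"))
        if status == 404:
            not_found[match.group("url")] += 1
        elif 500 <= status <= 599:
            server_errors += 1
    rate = server_errors / total if total else 0.0
    return not_found, rate


# Made-up sample entries for illustration only.
sample = [
    '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.10 - - [10/Oct/2000:13:55:37 -0700] "GET /missing.png HTTP/1.1" 404 209',
    '192.0.2.11 - - [10/Oct/2000:13:55:38 -0700] "GET /missing.png HTTP/1.1" 404 209',
    '192.0.2.12 - - [10/Oct/2000:13:55:39 -0700] "POST /checkout HTTP/1.1" 500 530',
]

not_found, error_rate = summarize_errors(sample)
print(not_found.most_common(5), error_rate)
```

A script like this works for one file on one server; the aggregation across many servers and the real-time thresholds discussed above are exactly what it does not provide.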