It’s often useful to be able to perform queries across multiple log files. A great example is when dealing with IIS log files. I recently had a situation in which I needed to know the average response times and number of requests received for a couple of IIS web-servers in order to troubleshoot an issue we were experiencing. I needed to know the results over a large time period, so this meant collating data across a large number of IIS log files. Fortunately I had a call open with Microsoft and the support engineer I was dealing with told me about the Microsoft Log Parser, which was perfect for this job.
The tool allows you to group together a set of log files in a single location and run queries across them using a SQL-like language. It can be used with any type of file, but comes into its own when used with IIS log files.
Here are some examples of queries I performed using the tool:
Gets the number of hits per hour, grouped by hour, and writes them to a CSV file:
LogParser.exe "SELECT TO_LOCALTIME(QUANTIZE(TO_TIMESTAMP(date, time),3600)) AS Hours, COUNT(*) AS Hits INTO ReqPerhour.csv FROM u_ex*.log GROUP BY Hours ORDER BY Hours " -i:W3C -o:csv
Gets information about response times for the 20 slowest URLs and writes them to a CSV file:
LogParser.exe "SELECT TOP 20 cs-uri-stem, COUNT(*) AS TotalRequest, MAX(time-taken) AS MaxTime, AVG(time-taken) AS AvgTime INTO avg.csv FROM u_ex*.log GROUP BY cs-uri-stem ORDER BY TotalRequest DESC" -i: IISW3C -o:csv
Gets information about response times for the 25 slowest URLs, filtered between a start and end time, and writes them to a CSV file:
LogParser.exe "SELECT TOP 25 cs-uri-stem, COUNT(*) AS TotalRequest, MAX(time-taken) AS MaxTime, AVG(time-taken) AS AvgTime INTO BeforeavgProb.csv FROM u_ex*.log WHERE TO_TIME(time) BETWEEN TIMESTAMP('00:00:00','hh:mm:ss') AND TIMESTAMP('14:00:00','hh:mm:ss') GROUP BY cs-uri-stem ORDER BY TotalRequest DESC" -i: IISW3C -o:csv
Gets all request URLs that resulted in a 500 error:
LogParser.exe "SELECT cs-uri-stem AS FileName FROM u_ex*.log WHERE sc-Status = 500" -i:W3C -rtp:-1
Gets all request URLs that took more than a second to return:
LogParser.exe "SELECT cs-uri-stem AS FileName, time-taken AS Time from u_ex*.log WHERE time-taken > 1000" -i:W3C -rtp:-1
This beast (which I’ve not tested!) appears to bring back a summary of status codes for requests:
LogParser.exe "SELECT to_lowercase(cs-uri-stem) AS URI, SUM([_200]) AS [200s], SUM([_304]) AS [304s], SUM([_302]) AS [302s], SUM([_404]) AS [404s], SUM([_301]) AS [301s], SUM([_500]) AS [500s],SUM([_501]) AS [501s],SUM([_403]) AS [403s],SUM([_206]) AS [206s],SUM([_406]) AS [406s],SUM([_400]) AS [400s], sub(count(*),add([200s],[206s])) as Failures USING CASE TO_STRING(scstatus) WHEN '200' THEN 1 ELSE 0 END AS [_200], CASE TO_STRING(sc-status) WHEN '304' THEN 1 ELSE 0 END AS [_304], CASE TO_STRING(sc-status) WHEN '302' THEN 1 ELSE 0 END AS [_302], CASE TO_STRING(sc-status) WHEN '404' THEN 1 ELSE 0 END AS [_404], CASE TO_STRING(sc-status) WHEN '301' THEN 1 ELSE 0 END AS [_301], CASE TO_STRING(sc-status) WHEN '500' THEN 1 ELSE 0 END AS [_500], CASE TO_STRING(sc-status) WHEN '501' THEN 1 ELSE 0 END AS [_501], CASE TO_STRING(sc-status) WHEN '403' THEN 1 ELSE 0 END AS [_403], CASE TO_STRING(sc-status) WHEN '206' THEN 1 ELSE 0 END AS [_206], CASE TO_STRING(sc-status) WHEN '406' THEN 1 ELSE 0 END AS [_406], CASE TO_STRING(scstatus) WHEN '400' THEN 1 ELSE 0 END AS [_400] FROM ex*.log GROUP BY URI ORDER BY Failures DESC"
The possibilities are endless. Here are some articles I have been recommended which demonstrate alternative queries, starting with two that deal specifically with IIS logs:
IIS Log Parser examples 1
IIS Log Parser examples 2
Here is an example involving more generic text files:
Reading large text files with Log Parser
…and finally, here is some generic information about the Log Parser from Microsoft:
Generic information on Log Parser
Happy parsing!