Ruby library to parse Apache/NGINX server log files using regular expressions.
MIT License
ServerLogParser provides a high-level Ruby library for parsing Apache-style server log files (Common Log Format, with or without virtual hosts, and Combined Log Format) as written by Apache, Nginx and others.
It is a fork of ApacheLogRegex, which was in turn a port of the Apache::LogRegex 1.4 Perl module, where much of the regex logic comes from.
gem install server_log_parser
require 'server_log_parser'
parser = ServerLogParser::Parser.new(ServerLogParser::COMBINED_VIRTUAL_HOST)
# or:
# parser = ServerLogParser::Parser.new('%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"')
File.foreach('/var/log/apache/access.log') do |line|
  parsed = parser.parse(line)
  # {
  #   '%h' => '212.74.15.68',
  #   '%l' => '-',
  #   '%u' => '-',
  #   '%t' => '[23/Jan/2004:11:36:20 +0000]',
  #   '%r' => 'GET /images/previous.png HTTP/1.1',
  #   '%>s' => '200',
  #   '%b' => '2607',
  #   '%{Referer}i' => 'http://peterhi.dyndns.org/bandwidth/index.html',
  #   '%{User-Agent}i' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021202'
  # }
end
`ServerLogParser#parse` will silently ignore errors, but if you'd prefer, `ServerLogParser#parse!` will raise a `ParseError` exception.
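To make the idea concrete, here is a minimal, self-contained sketch of how a format string could be turned into a regular expression, with a quiet `parse` and a raising `parse!`. It is not the gem's implementation: `SketchParser` and `SketchParseError` are hypothetical names, only plain, quoted and `%t` fields are covered, and returning `nil` on a mismatch is an assumption about what "silently ignore" means.

```ruby
# Hypothetical stand-in for illustration only; not the gem's code.
class SketchParseError < StandardError; end

class SketchParser
  QUOTED = /"((?:[^"\\]|\\.)*)"/.source # quoted field, allows escaped quotes
  TIME   = /(\[[^\]]+\])/.source        # %t contains a space, so match [..] as a unit
  BARE   = /(\S+)/.source               # any other field: a run of non-whitespace

  attr_reader :names, :regexp

  def initialize(format)
    @names = []
    pattern = format.strip.split(/\s+/).map do |token|
      # A directive may be wrapped in \" ... \" (or plain " ... ").
      quoted = token.match(/\A\\?"(.*?)\\?"\z/)
      name = quoted ? quoted[1] : token
      @names << name
      if quoted then QUOTED
      elsif name == '%t' then TIME
      else BARE
      end
    end.join(' ')
    @regexp = /\A#{pattern}\s*\z/
  end

  # Assumed behaviour: nil when the line does not match.
  def parse(line)
    match = @regexp.match(line)
    return nil unless match

    @names.zip(match.captures).to_h
  end

  # Same as parse, but raises instead of returning nil.
  def parse!(line)
    parse(line) || raise(SketchParseError, "line does not match format: #{line.inspect}")
  end
end

parser = SketchParser.new('%h %l %u %t \"%r\" %>s %b')
line = '212.74.15.68 - - [23/Jan/2004:11:36:20 +0000] "GET /images/previous.png HTTP/1.1" 200 2607'
record = parser.parse(line)
puts record['%r']
```

The bang/non-bang pair follows the usual Ruby convention: use the quiet variant when malformed lines can simply be skipped, and the raising variant when a mismatch should stop processing.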
File.foreach('/var/log/apache/access.log') do |line|
  parsed = parser.handle(line)
  # {
  #   '%h' => '212.74.15.68',
  #   '%l' => nil,
  #   '%u' => nil,
  #   '%t' => DateTime.new(2004, 1, 23, 11, 36, 20, '+0'),
  #   '%r' => {"method" => "GET", "resource" => "/images/previous.png", "protocol" => "HTTP/1.1"},
  #   '%>s' => 200,
  #   '%b' => 2607,
  #   '%{Referer}i' => 'http://peterhi.dyndns.org/bandwidth/index.html',
  #   '%{User-Agent}i' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021202'
  # }
end
Apache log files use `-` to mean that no data is present; `handle` replaces these with `nil`, like the `%l` and `%u` values above. The request (`%r`) is split into a nested hash with the keys `method`, `resource` and `protocol`.
The following fields are stored as `Integer`: `%B`, `%b`, `%k`, `%p`, `%{format}p`, `%P`, `%{format}P`, `%s`, `%>s`, `%I`, `%O`.
The following fields are stored as `Float`: `%D`, `%T`.
The following fields are stored as `DateTime`: `%t`.
Note: `%{format}t` is stored as `String` currently.
The field `%r` is special, see above.
All other fields are stored as `String`.
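The conversion rules above can be sketched with the standard library alone. This is an illustrative approximation, not the gem's code: `SketchConvert` and its `convert` method are hypothetical names, and the `strptime` pattern assumes the default Apache `%t` layout shown in the examples.

```ruby
require 'date'

# Hypothetical helper for illustration only; not part of the gem.
module SketchConvert
  INTEGER_FIELDS = %w[%B %b %k %p %P %s %>s %I %O].freeze
  FLOAT_FIELDS   = %w[%D %T].freeze
  PORT_OR_PID    = /\A%\{[^}]+\}[pP]\z/ # %{format}p and %{format}P

  def self.convert(name, value)
    return nil if value == '-' # "-" means no data was logged

    case name
    when *INTEGER_FIELDS, PORT_OR_PID
      Integer(value, 10)
    when *FLOAT_FIELDS
      Float(value)
    when '%t'
      # Assumes the default layout, e.g. [23/Jan/2004:11:36:20 +0000]
      DateTime.strptime(value, '[%d/%b/%Y:%H:%M:%S %z]')
    when '%r'
      method, resource, protocol = value.split(' ', 3)
      { 'method' => method, 'resource' => resource, 'protocol' => protocol }
    else
      value # everything else stays a String
    end
  end
end

raw = {
  '%h' => '212.74.15.68',
  '%l' => '-',
  '%t' => '[23/Jan/2004:11:36:20 +0000]',
  '%r' => 'GET /images/previous.png HTTP/1.1',
  '%>s' => '200',
  '%b' => '2607'
}
typed = raw.map { |name, value| [name, SketchConvert.convert(name, value)] }.to_h
p typed
```

Using `Integer(value, 10)` and `Float(value)` rather than `to_i` and `to_f` makes malformed numbers raise instead of silently becoming zero; whether the gem makes the same choice is not stated in this README.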
`ServerLogParser#handle` will silently ignore errors, but if you'd prefer, `ServerLogParser#handle!` will raise a `ParseError` exception.
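Because handled records carry typed values, they can be aggregated without further conversion. The sketch below totals response bytes per status code; it runs on hand-written hashes shaped like the `handle` output above (the values are made up for illustration), and `sketch_bytes_by_status` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper for illustration only; not part of the gem.
# Sums '%b' (bytes) per '%>s' (status); a nil '%b' counts as 0 bytes.
def sketch_bytes_by_status(records)
  totals = Hash.new(0)
  records.each do |record|
    totals[record['%>s']] += record['%b'].to_i # nil.to_i is 0
  end
  totals
end

# Made-up sample records in the shape of the handle output.
records = [
  { '%>s' => 200, '%b' => 2607 },
  { '%>s' => 200, '%b' => nil },
  { '%>s' => 404, '%b' => 512 }
]
p sketch_bytes_by_status(records)
```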
The log format is specified using one of the following (rather verbose) constants:
Name | Constant | Apache Format |
---|---|---|
Common Log Format | `ServerLogParser::COMMON_LOG_FORMAT` | `%h %l %u %t \"%r\" %>s %b` |
Common Log Format with virtual hosts | `ServerLogParser::COMMON_LOG_FORMAT_VIRTUAL_HOST` | `%v %h %l %u %t \"%r\" %>s %b` |
Combined | `ServerLogParser::COMBINED` | `%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"` |
Combined with virtual hosts | `ServerLogParser::COMBINED_VIRTUAL_HOST` | `%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"` |
Alexander Kurakin
https://github.com/kuraga/server_log_parser/issues
MIT