Developing Filters
******************

Filters are tricky. They need to:

* work with a variety of the versions of the software that generates
  the logs;

* work with the range of logging configuration options available in
  the software;

* work with multiple operating systems;

* not make assumptions about the log format beyond what the software
  guarantees (e.g. do not assume a username doesn't contain spaces and
  use \S+ unless you've checked the source code);

* account for how future versions of the software will log messages
  (e.g. guess what would happen to the log message if different
  authentication types are added);

* not be susceptible to DoS vulnerabilities (see Filter Security
  below); and

* match intended log lines only.

Please follow the steps from Filter Test Cases to Developing Filter
Regular Expressions and submit a GitHub pull request (PR) afterwards.
If you get stuck, you can push your unfinished changes and still
submit a PR: describe what you have done and what the hurdle is, and
we'll attempt to help (the PR is updated automatically with any
further commits you push to complete it).


Filter Test Cases
=================


Purpose
-------

Start by finding the log messages that the application generates
related to some form of authentication failure. If you are adding to
an existing filter, think about whether the log messages are of a
similar importance and purpose to those the filter already matches.
If you were a user of Fail2Ban, and did a package update of Fail2Ban
that started matching new log messages, would anything unexpected
happen?  Would the bantime/findtime for the jail be appropriate for
the new log messages?  If not, the new messages probably belong in a
separate filter definition; for example, the exim filter targets
authentication failures while exim-spam targets log messages related
to spam.

Even if it is a new filter you may consider separating the log
messages into different filters based on purpose.


Cause
-----

Are some of the log lines a result of the same action? For example, is
a PAM failure log message, followed by an application specific failure
message the result of the same user/script action?  If you add regular
expressions for both you would end up with two failures for a single
action. Therefore, select the most appropriate log message and
document the other log message(s) with a test case that does not match
them and a description of why you chose one over another.

With the selected log lines, consider what action has caused those log
messages and whether they could have been generated by accident. Could
the log message occur as the first step of the application asking for
authentication? Could the log messages occur often? If any of these
are true, make a note of this in the jail.conf example that you
provide.


Samples
-------

It is important to include log file samples so any future change in
the regular expression will still work with the log lines you have
identified.

The sample log messages are provided in a file under
testcases/files/logs/ with the same name as the corresponding filter
(but without the .conf extension). Each log line should be preceded
by a line of failJSON metadata directly above it, so that the log
line is checked by the test suite. If there is any specific
information about the log message, such as version or an application
configuration option that is needed for the message to occur, include
this in a comment (line beginning with #) above the failJSON metadata.

Log samples should ideally include one, and definitely not more than
3, examples of log messages of the same form. If log messages differ
between versions of the application, samples that show this are
encouraged.

Also attempt to inject an IP into the application (e.g. by specifying
it as a username) to see whether Fail2Ban picks up the IP from user
input rather than the true origin. See the Filter Security section and
the top example in testcases/files/logs/apache-auth as to how to do
this. Once you have discovered that this is possible, correct the
regex so it doesn't match and provide this as a test case with
"match": false (see failJSON below).

If the mechanism to create the log message isn't obvious, provide a
configuration and/or sample scripts in
testcases/files/config/{filtername} and reference these in the
comments above the log line.


FailJSON metadata
-----------------

failJSON metadata is a comment immediately above the log message. It
looks like:

   # failJSON: { "time": "2013-06-10T10:10:59", "match": true , "host": "93.184.216.119" }

Time should match the time of the log message. It is in the specific
format Year-Month-Day'T'Hour:Minute:Second.  If your log message does
not include a year, like the example below, the year should be listed
as 2005 if the date is before Sun Aug 14 10am UTC, and 2004 if
afterwards. Here is an example failJSON line preceding a sample log
line:

   # failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" }
   Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543

The "host" in failJSON should contain the IP or domain that should be
blocked.

For long lines that you do not want to be matched (e.g. from log
injection attacks) and any log lines to be excluded (see "Cause"
section above), set "match": false in the failJSON and describe the
reason in the comment above.
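
The pairing of failJSON metadata with the log line below it can be
sketched in a few lines of Python. This is only an illustration of
the file layout described above, not the code the test suite uses;
the read_samples helper is a hypothetical name introduced here.

```python
import json

# One metadata/log-line pair, copied from the example above.
SAMPLE = """\
# failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" }
Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543
"""

PREFIX = "# failJSON:"


def read_samples(text):
    """Yield (metadata, log_line) pairs (hypothetical helper)."""
    pending = None
    for line in text.splitlines():
        if line.startswith(PREFIX):
            # The rest of the comment line is plain JSON.
            pending = json.loads(line[len(PREFIX):])
        elif pending is not None:
            # The line directly below the metadata is the log line.
            yield pending, line
            pending = None


pairs = list(read_samples(SAMPLE))
metadata, log_line = pairs[0]
print(metadata["host"], metadata["match"])
```

Because the metadata is ordinary JSON, "match" must be written as
lowercase true or false.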

After developing regexes, the following command will test all failJSON
metadata against the log lines in all sample log files:

   ./fail2ban-testcases testSampleRegex


Developing Filter Regular Expressions
=====================================


Date/Time
---------

At the moment, Fail2Ban depends on log lines having time stamps.
That is why, before starting to develop a failregex, you should check
whether your log line format is known to Fail2Ban.  Copy the time
component from the log line, append an IP address, and test with the
following command:

   ./fail2ban-regex "2013-09-19 02:46:12 1.2.3.4" "<HOST>"

The output of this command should contain something like:

   Date template hits:
   |- [# of hits] date format
   |  [1] Year-Month-Day Hour:Minute:Second

Ensure that the template description matches the time/date elements in
your log line time stamp.  If there is no matching format, a date
template needs to be added to server/datedetector.py.  Ensure that a
new template is added in an order such that more specific matches
occur first and that there is no confusion between a Day and a Month.
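
The Day/Month confusion is easy to reproduce outside Fail2Ban. The
sketch below uses Python's datetime.strptime, not Fail2Ban's date
detector, and a made-up time stamp, to show that one string can be
parsed two ways when both fields are 12 or less.

```python
from datetime import datetime

# Hypothetical time stamp in which day and month are both <= 12.
stamp = "03/04/2013 02:46:12"

day_first = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
month_first = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")

print(day_first.date())    # 2013-04-03
print(month_first.date())  # 2013-03-04
```

Both formats accept the string, so only knowledge of the application's
actual format can decide which template is the right one.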


Filter file
-----------

The filter is specified in a config/filter.d/{filtername}.conf file.
A filter file can have the sections INCLUDES (optional) and Definition
as follows:

   [INCLUDES]

   before = common.conf

   after = filtername.local

   [Definition]

   failregex = ....

   ignoreregex = ....

This is also documented in the man page jail.conf (section 5). Other
definitions can be added and used through string interpolation to
make failregexes more readable and maintainable (see
http://docs.python.org/2.7/library/configparser.html).


General rules
-------------

Use "before" if you need to include a common set of rules, like syslog
or if there is a common set of regexes for multiple filters.

Use "after" if you wish to allow the user to override a set of
customisations of the current filter. This file doesn't need to exist.

Try to avoid using ignoreregex, mainly for performance reasons. The
case where you would use it is when avoiding it would leave you with
an unreadable failregex.


Syslog
------

If your application logs to syslog you can take advantage of log line
prefix definitions present in common.conf.  So as a base use:

   [INCLUDES]

   before = common.conf

   [Definition]

   _daemon = app

   failregex = ^%(__prefix_line)s

In this example common.conf defines __prefix_line which also contains
the _daemon name (in syslog terms the service) you have just
specified. _daemon can also be a regex.

For example, to capture the following line _daemon should be set to
"dovecot":

   Dec 12 11:19:11 dunnart dovecot: pop3-login: Aborted login (tried to use disabled plaintext auth): rip=190.210.136.21, lip=113.212.99.193

and then "^%(__prefix_line)s" would match "Dec 12 11:19:11 dunnart
dovecot: ". Note it matches the trailing space(s) as well.


Substitutions (AKA string interpolations)
-----------------------------------------

We have used string interpolations in the above examples.  They are
useful for making the regexes more readable, for reusing generic
patterns in multiple failregex lines, and for delegating the
definition of regex parts to specific filters or even to the user.
The general principle is that the value of a _name variable replaces
occurrences of %(_name)s within the same section, or anywhere in the
config file if it is defined in the [DEFAULT] section.
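
The substitution principle can be tried with Python's standard
configparser module. This is a sketch only: Fail2Ban reads its
configuration through its own code, and the _fail_words variable and
the log message are hypothetical names chosen for illustration.

```python
import configparser

# Hypothetical filter-like config: _daemon lives in [DEFAULT],
# _fail_words in the same section as the failregex that uses it.
CONFIG = """\
[DEFAULT]
_daemon = app

[Definition]
_fail_words = (?:bad|wrong) password
failregex = ^%(_daemon)s: %(_fail_words)s from <HOST>$
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)

# get() returns the value with every %(name)s replaced.
expanded = parser.get("Definition", "failregex")
print(expanded)  # ^app: (?:bad|wrong) password from <HOST>$
```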


Regular Expressions
-------------------

Regular expressions (failregex, ignoreregex) assume that the date/time
has been removed from the log line (this is just how fail2ban works
internally ATM).

If the format is like '<date...> error 1.2.3.4 is evil' then, once the
date is removed, the literal <> remains at the start of the line and
must be matched, so the regex should be similar to '^<> error <HOST>
is evil$', using <HOST> where the IP/domain name appears in the log
line.

The following general rules apply to regular expressions:

* ensure regexes start with a ^ and are as restrictive as possible.
  E.g. do not use .* if \d+ is sufficient;

* use functionality of Python regexes defined in the standard Python
  re library http://docs.python.org/2/library/re.html;

* make regular expressions readable (as much as possible). E.g.
  (?:...) represents a non-capturing regex but (...) is more readable,
  thus preferred.
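
As a rough illustration of the first rule, the sketch below compares
.* with \d+ for a field that the application always writes as a
number. The log messages are invented for the example, and HOST is a
simplified IPv4-only stand-in for the <HOST> tag, not its real
expansion.

```python
import re

# Simplified IPv4-only stand-in for <HOST> (assumption for this sketch).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

loose = re.compile(r"^login failed, attempt .* from " + HOST + "$")
strict = re.compile(r"^login failed, attempt \d+ from " + HOST + "$")

# Hypothetical log lines with the date/time already removed.
intended = "login failed, attempt 3 from 192.0.2.10"
unrelated = "login failed, attempt to reset password from 192.0.2.10"

print(bool(loose.match(intended)), bool(strict.match(intended)))    # True True
print(bool(loose.match(unrelated)), bool(strict.match(unrelated)))  # True False
```

The loose form also accepts a different message that merely shares a
prefix, which is the kind of unintended match the rule is meant to
prevent.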

If you have only a basic knowledge of regular expressions we advise
reading http://docs.python.org/2/library/re.html first.  It doesn't
take long and will remind you, for example, which characters you need
to escape and which you don't.


Developing/testing a regex
--------------------------

You can develop a regex in a file or using command line depending on
your preference. You can also use samples you have already created in
the test cases or test them one at a time.

The general tool for testing Fail2Ban regexes is fail2ban-regex. To
see how to use it run:

   ./fail2ban-regex --help

Take note of -l heavydebug / -l debug and -v as they might be very
useful.

Tip:

  Take a look at the source code of the application you are developing
  failregex for. You may see optional or extra log messages, or parts
  thereof, that need to form part of your regex.  It may also reveal
  how some parts are constrained and how formats differ depending on
  configuration or less common usages.

Tip:

  For looking through source code - http://sourcecodebrowser.com/ . It
  has call graphs and can browse different versions.

Tip:

  Some applications log spaces at the end. If you are not sure add
  \s*$ as the end part of the regex.
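
The effect of the trailing \s*$ can be checked with Python's re
module. The log message below is hypothetical and HOST is a
simplified IPv4-only stand-in for the <HOST> tag.

```python
import re

# Simplified IPv4-only stand-in for <HOST> (assumption for this sketch).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# Hypothetical log line that ends with two spaces.
line = "authentication failure from 192.0.2.10  "

exact = re.compile(r"^authentication failure from " + HOST + r"$")
tolerant = re.compile(r"^authentication failure from " + HOST + r"\s*$")

print(exact.match(line))                     # None
print(tolerant.match(line).group("host"))    # 192.0.2.10
```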

If your regex is not matching, http://www.debuggex.com/?flavor=python
can help to tune it.  fail2ban-regex -D ...  will present Debuggex
URLs for the regexes and sample log files that you pass into it.

In general, when using regex debuggers for generating fail2ban
filters:

* use the regex from the ./fail2ban-regex output (to ensure all
  substitutions are done);

* replace <HOST> with (?&.ipv4);

* make sure that the regex type is set to Python;

* for the test data, put your log output with the date/time removed.

When you have fixed the regex put it back into your filter file.

Please spread the good word about Debuggex - Serge Toarca is kindly
continuing its free availability to Open Source developers.


Finishing up
------------

If you've added a new filter, add a new entry in config/jail.conf. The
theory here is that a user will create a jail.local containing a
[filtername] section with "enabled = true" to enable your jail.

So more specifically in the [filter] section in jail.conf:

* ensure that you have "enabled = false" (users will enable as
  needed);

* use "filter =" set to your filter name;

* use a typical action to disable ports associated with the
  application;

* set "logpath" to the usual location of application log file;

* if the default findtime or bantime isn't appropriate to the filter,
  specify more appropriate choices (possibly with a brief comment
  line).

Submit a GitHub pull request (see "Pull Requests" above) for
github.com/fail2ban/fail2ban containing your great work.


Filter Security
===============

Poor filter regular expressions are susceptible to DoS attacks.

When a remote user has the ability to introduce text that would match
a filter's failregex, with the inserted text matching the <HOST> part,
they have the ability to deny any host they choose.

So the <HOST> part must be anchored on text generated by the
application, and not the user, to an extent sufficient to prevent the
user inserting the entire text matching this or any other failregex.

Ideally a filter regex should anchor at the beginning and at the end
of the log line. However, as more applications write
application-generated text at the beginning of the line than at the
end, anchoring the beginning is more important. If the log file used
by the application is shared with other applications, like system
logs, ensure the other applications that use that log file do not log
user generated text at the beginning of the line, or, if they do,
ensure the regexes of the filter are sufficient to mitigate the risk
of insertion.


Examples of poor filters
------------------------

1. Too restrictive

We find a log message:

   Apr-07-13 07:08:36 Invalid command fial2ban from 1.2.3.4

We make a failregex:

   ^Invalid command \S+ from <HOST>

Now think evil. The user issues the command 'blah from 1.2.3.44'.

The program diligently logs:

   Apr-07-13 07:08:36 Invalid command blah from 1.2.3.44 from 1.2.3.4

And fail2ban matches 1.2.3.44 as the IP to ban. A DoS attack was
successful.

The fix here is that the command can be anything so .* is appropriate:

   ^Invalid command .* from <HOST>

Here the .* will match until the end of the string. The regex engine
then realises it has more to match, i.e. "from <HOST>", and goes back
until it finds this. Then it will ban 1.2.3.4 correctly. Since the
<HOST> is always at the end, end the regex with a $:

   ^Invalid command .* from <HOST>$

Note if we'd just had the expression:

   ^Invalid command \S+ from <HOST>$

Then, provided the user put a space in their command, they would
never have been banned.
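
The three variants above can be compared with Python's re module.
This is a sketch rather than a substitute for fail2ban-regex: HOST is
a simplified IPv4-only stand-in for the <HOST> tag, and matched_host
is a hypothetical helper introduced here.

```python
import re

# Simplified IPv4-only stand-in for <HOST> (assumption for this sketch;
# the real expansion is broader).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"


def matched_host(failregex, line):
    """Return the address captured by <HOST>, or None (hypothetical helper)."""
    found = re.search(failregex.replace("<HOST>", HOST), line)
    return found.group("host") if found else None


# Log lines with the date/time already removed.
injected = "Invalid command blah from 1.2.3.44 from 1.2.3.4"
spaced = "Invalid command two words from 1.2.3.4"

print(matched_host(r"^Invalid command \S+ from <HOST>", injected))   # 1.2.3.44
print(matched_host(r"^Invalid command .* from <HOST>$", injected))   # 1.2.3.4
print(matched_host(r"^Invalid command \S+ from <HOST>$", spaced))    # None
print(matched_host(r"^Invalid command .* from <HOST>$", spaced))     # 1.2.3.4
```

Only the .* form anchored with $ reports the true origin for both
lines.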

2. Unanchored regex can match other user injected data

From the Apache vulnerability CVE-2013-2178 ( original ref:
https://vndh.net/note:fail2ban-089-denial-service ).

An example bad regex for Apache:

   failregex = [[]client <HOST>[]] user .* not found

The user can issue a GET request such as:

   GET /[client%20192.168.0.1]%20user%20root%20not%20found HTTP/1.0
   Host: remote.site

Now the log line will be:

   [Sat Jun 01 02:17:42 2013] [error] [client 192.168.33.1] File does not exist: /srv/http/site/[client 192.168.0.1] user root not found

This log line doesn't match the other expressions, but it does match
the above regex at the injected text, so Fail2Ban blocks 192.168.0.1,
an address chosen by the HTTP requester: a denial of service.
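
Which address the unanchored pattern picks up can be checked with
Python's re module. In this sketch HOST is a simplified IPv4-only
stand-in for the <HOST> tag, the pattern is written with \[ and \] in
place of [[] and []] (the same meaning), the leading time stamp is
assumed to have been removed already, and the anchored variant is a
hypothetical comparison, not the fix that was shipped.

```python
import re

# Simplified IPv4-only stand-in for <HOST> (assumption for this sketch).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"


def matched_host(failregex, line):
    """Return the address captured by <HOST>, or None (hypothetical helper)."""
    found = re.search(failregex.replace("<HOST>", HOST), line)
    return found.group("host") if found else None


# The bad regex from above, with the brackets escaped instead of
# wrapped in character classes.
unanchored = r"\[client <HOST>\] user .* not found"
# Hypothetical anchored variant, for comparison only.
anchored = r"^\s*\[error\] \[client <HOST>\] user .* not found$"

# The log line from above without its time stamp.
line = (" [error] [client 192.168.33.1] File does not exist: "
        "/srv/http/site/[client 192.168.0.1] user root not found")

print(matched_host(unanchored, line))  # 192.168.0.1
print(matched_host(anchored, line))    # None
```

The unanchored search is free to start inside the requested path, so
it reports the address that the requester typed into the URL.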

3. Over greedy pattern matching

From: https://github.com/fail2ban/fail2ban/pull/426

An example ssh log (simplified):

   Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser remoteuser

As we assume the username can include anything, including spaces, it's
prudent to put .* here. The remote user can also be anything, so
let's not make assumptions again:

   failregex = ^%(__prefix_line)sFailed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$

So this works. The problem arises if the remote user name, matched by
the final .*, is set by the attacker to 'from 1.2.3.4'. The resultant
log line is:

   Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4

Testing with:

   fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'

Tip:

  I've removed the bit that matches __prefix_line from the regex and
  log.

Shows:

   1) [1] ^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
      1.2.3.4  Sun Sep 29 17:15:02 2013

It should have matched 127.0.0.1. The first greedy .* of the regex
matched until the end of the string. There was no "from <HOST>" left
to match, so the regex engine worked backwards from the end of the
string until this was matched.

The result was that 1.2.3.4 was matched, injected by the user, and the
wrong IP was banned.

The solution here is to make the first .* non-greedy with .*?. Here it
matches as little as required and the fail2ban-regex tool shows the
output:

   fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$'

   1) [1] ^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$
      127.0.0.1  Sun Sep 29 17:15:02 2013

So the general case here is a log line that contains:

   (fixed_data_1)<HOST>(fixed_data_2)(user_injectable_data)

Where the regex that matches fixed_data_1 is greedy and consumes the
entire string before backtracking, and user_injectable_data can
contain text that the regex then accepts as <HOST>.
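
The greedy and non-greedy behaviour can also be reproduced with
Python's re module. As in the fail2ban-regex runs above, the
__prefix_line part is left out. HOST is a simplified IPv4-only
stand-in for the <HOST> tag and matched_host is a hypothetical
helper.

```python
import re

# Simplified IPv4-only stand-in for <HOST> (assumption for this sketch).
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"


def matched_host(failregex, line):
    """Return the address captured by <HOST>, or None (hypothetical helper)."""
    found = re.search(failregex.replace("<HOST>", HOST), line)
    return found.group("host") if found else None


greedy = r"^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$"
lazy = r"^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$"

# The injected log line with the date/time already removed.
line = (" Failed password for user from 127.0.0.1 port 20000 ssh1:"
        " ruser from 1.2.3.4")

print(matched_host(greedy, line))  # 1.2.3.4
print(matched_host(lazy, line))    # 127.0.0.1
```

The two patterns differ by a single ?, which is why a sample log
line with "match": false for the injected case is worth keeping in
the test files.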


Another case
------------

ref: https://www.debuggex.com/r/CtAbeKMa2sDBEfA2/0

A webserver logs the following without URL escaping:

   [error] 2865#0: *66647 user "xyz" was not found in "/file", client: 1.2.3.1, server: www.host.com, request: "GET ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host", host: "www.myhost.com"

regex:

   failregex = ^ \[error\] \d+#\d+: \*\d+ user "\S+":? (?:password mismatch|was not found in ".*"), client: <HOST>, server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+"

The .* matches to the end of the string, finds that it can't continue
to match ", client ... and so backtracks from the end until it finds
the text injected through the web URL:

   ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host

In this case there is a fixed host: "www.myhost.com" at the end so the
solution is to anchor the regex at the end with a $.

If this wasn't the case then the first .* would need to be written so
that it cannot capture beyond <HOST>.

4. Application generates two identical log messages with different
   meanings

If the application generates the following two messages under
different circumstances:

   client <IP>: authentication failed
   client <USER>: authentication failed

Then it's obvious that a regex of "^client <HOST>: authentication
failed$" will still cause problems if the user can trigger the second
log message with a <USER> of 123.1.1.1.

Here there's nothing to do except request/change the application so it
logs messages differently.
