
Fixing the ipt_recent Netfilter Module

I experienced some strange behavior with my ipt_recent netfilter rules after an uptime of about 25 days: the rules started to block much too early. After rebooting the machine I was able to reproduce the problem for five minutes. This clearly indicated a problem with jiffies (Linux initializes jiffies so that the first rollover happens five minutes after booting).
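The five-minute window can be illustrated with a small Python sketch of the 32-bit arithmetic. It is only a model, not kernel code: it assumes HZ=1000 and an initial jiffies value of -300*HZ truncated to 32 bits (which is what makes the first rollover happen five minutes after boot), and the function name is made up for illustration.

```python
# Sketch: 32-bit jiffies during the first minutes after boot.
# Assumptions: HZ = 1000, initial value of -300*HZ truncated to 32 bits.
HZ = 1000
WORD = 2**32             # number of values in a 32-bit unsigned long
LONG_MAX = 2**31 - 1


def jiffies_at(uptime_seconds, hz=HZ):
    """Return the modelled 32-bit jiffies value after the given uptime."""
    initial = (-300 * hz) % WORD
    return (initial + uptime_seconds * hz) % WORD


# Show the counter just before and just after the first rollover.
for uptime in (0, 299, 300, 301):
    value = jiffies_at(uptime)
    print(uptime, value, value > LONG_MAX)
```

In this model jiffies stays above LONG_MAX until the counter wraps at 300 seconds of uptime, which matches the window in which the problem could be reproduced.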

A closer look at ipt_recent.c revealed that the time tests did not work as intended if one of the last hits was more than LONG_MAX jiffies ago, or if the list of last hits contained empty slots and jiffies was greater than LONG_MAX.

To fix this, I replaced jiffies with seconds since ’00:00:00 1970-01-01 UTC’. I have sent the patch to linux-kernel and netfilter-devel. The patch also includes some 64-bit fixes.

May 12th, 2005: The patch has been added to Linux 2.6.12-rc4-mm1

September 8th, 2005: Please note that only the 64-bit parts of my patch have made it into 2.6.12. I’m working on an updated fix for the time comparison problems which will hopefully get accepted for 2.6.14 or later.

September 12th, 2005: These issues have CAN numbers now: CAN-2005-2872 and CAN-2005-2873 (which supersede CAN-2005-2802)

July 10th, 2006: The jiffies issue is fixed in the vanilla kernel now. Also note that 2.6.18 will contain a rewrite of ipt_recent.c.

This article by Jürgen Kreileder is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

50 Comments


micah said

CAN-2005-2802 has been rejected, it’s only CAN-2005-2872 and CAN-2005-2873.

Is there an update to this? It seems like Dave Miller didn’t like the proposed fix, and thinks that the only solution is a complete rewrite of ipt_recent? This is somewhat annoying because this seemed like the best solution to this problem I’ve seen so far, and to find that the kernel needs a module re-written before this can be useful is disheartening.

The situation hasn’t changed.

I’ll post an updated patch for just the jiffies problem in the next few days. And, maybe, I’ll start a rewrite when I have some free days.

micah said

I think I’ve noticed this problem — I’ve found myself blacklisted even when only issuing one ssh connection. I’ve even been blacklisted when already connected via ssh.

FX said

Thanks so much for writing about this!

Is the time required for the jiffies rollover predictable?

Would rebooting the server once per week avoid this problem?

Sure. Jiffies are an unsigned long value. That means it’s a 32-bit integer on 32-bit Linux architectures and a 64-bit integer on 64-bit Linux architectures.

The kernel initializes jiffies to a value that makes the first rollover happen 5 minutes after boot. This is done to catch rollover problems.

After that, rollovers happen every ULONG_MAX / HZ seconds.

The ipt_recent problem shows up earlier: It happens after LONG_MAX / HZ seconds. E.g., on a 32-bit system with HZ=1000 that means after (2³¹-1)s/1000 = 2147483.647s ≈ 24.855 days.

On 64-bit systems you’re safe. On 32-bit kernels you can either use a kernel with a lower HZ setting (e.g. 100 or 250, which gives you about 249 or 99 days before getting problems) or use my patch (I really should upload an updated version).
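The figures above can be rechecked with a few lines of Python. This is just the arithmetic from this reply for a 32-bit unsigned long; the helper name `days` is made up for the example.

```python
# Sketch: onset of the ipt_recent problem and the jiffies rollover period
# on a 32-bit kernel, for the HZ values mentioned above.
LONG_MAX = 2**31 - 1
ULONG_MAX = 2**32 - 1
SECONDS_PER_DAY = 86400


def days(jiffies, hz):
    """Convert a number of jiffies to days for a given HZ."""
    return jiffies / hz / SECONDS_PER_DAY


for hz in (100, 250, 1000):
    print(f"HZ={hz:4d}: problem after {days(LONG_MAX, hz):7.2f} days, "
          f"rollover every {days(ULONG_MAX, hz):7.2f} days")
```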

FX said

I think I have a workaround for this problem. Please let me know if this would work:

hourly cron script to cleanup SSHA file in ipt_recent (script pseudocode):

get current jiffies by parsing /proc/interrupts
if jiffies close to rollover then
    issue iptables commands to wipe out entries in /proc/net/ipt_recent
else
    for each entry in /proc/net/ipt_recent/SSHA
        age = current jiffies - last_seen
        if age > MAX_AGE
            issue iptables commands to clear this entry from ipt_recent/SSHA
        endif
    endfor
endif

Hm, I don’t think that will work, because the jiffies rollover isn’t the problem: it’s the difference between current jiffies and last_seen. That would be fixable with slight modifications to your script. But then there’s still the problem with empty slots when jiffies > LONG_MAX.
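One possible reading of "slight modifications" is to compute the age as an unsigned 32-bit difference, so that the result stays correct when jiffies has wrapped between last_seen and now. The Python sketch below assumes 32-bit jiffies; the function names and `max_age_jiffies` are hypothetical, and it does nothing about the empty-slot problem.

```python
# Sketch: wrap-safe age of an entry, assuming 32-bit jiffies.
WORD = 2**32


def age_in_jiffies(now, last_seen):
    """Unsigned 32-bit difference; stays correct across one counter wrap."""
    return (now - last_seen) % WORD


def is_expired(now, last_seen, max_age_jiffies):
    """True if the entry was last seen more than max_age_jiffies ago."""
    return age_in_jiffies(now, last_seen) > max_age_jiffies


# last_seen was recorded shortly before the wrap, now is shortly after it.
print(age_in_jiffies(1000, 4294967000))
```

A plain `now - last_seen` on the raw numbers would be negative in this example, which is why the modulo step matters.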

FX said

Thanks so much for the extra info! It was very helpful.

What are empty slots and how do we detect them?

Each per-IP entry in an ipt_recent list has ip_pkt_list_tot slots (default is 20) for recording the time of arrival of the last seen packets. The slots are initialized with time 0, i.e. an empty slot is a slot which doesn’t record the time of arrival of any packet.

The proc interface for ipt_recent just shows non-empty slots (i.e. only slots with a value different from zero), but the hit-count calculation looks at all slots unconditionally. The hit-count algorithm compares the stored time with current jiffies using time_before_eq(). time_before_eq(ul, 0UL) returns true for ul values greater than LONG_MAX + 1.

For ipt_recent’s hit-count algorithm this means that as soon as jiffies (plus the time value given in the rule) reaches LONG_MAX + 2, every empty slot counts as a hit!
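The effect can be reproduced with a small Python model of the signed comparison on 32-bit values. This is a sketch under assumptions, not a copy of ipt_recent.c: `count_hits`, the slot count of 8, and the way the rule's time window is added to the stored stamp are modelled on the description above.

```python
# Sketch: why empty (zero) slots count as hits once jiffies > LONG_MAX.
WORD = 2**32
LONG_MAX = 2**31 - 1
HZ = 1000  # assumed


def to_long(x):
    """Reinterpret a 32-bit unsigned value as a signed long."""
    x %= WORD
    return x - WORD if x > LONG_MAX else x


def time_before_eq(a, b):
    """Model of the kernel macro: true if a is at or before b."""
    return to_long(b - a) >= 0


def count_hits(slots, now, window):
    """Count slots whose stamp plus window is not before now (hypothetical)."""
    return sum(time_before_eq(now, (stamp + window) % WORD) for stamp in slots)


window = 60 * HZ

# 10 seconds after boot: jiffies is still above LONG_MAX.
now = (-300 * HZ + 10 * HZ) % WORD
slots = [(now - HZ) % WORD] + [0] * 7     # one real hit, seven empty slots
print(count_hits(slots, now, window))      # every slot is counted

# Well after the first rollover: jiffies is small again.
now = 1_000_000
slots = [now - HZ] + [0] * 7
print(count_hits(slots, now, window))      # only the real hit is counted
```

With one real hit and seven empty slots, the model counts eight hits while jiffies is above LONG_MAX and one hit after the rollover.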

FX said

By HZ, do you mean interrupt frequency of the timer?

For example, on a 3.0 GHz CPU with 100 timer interrupts per second, the correct value is 100, correct? Jiffies appear to get incremented 100 times per second.

The value depends on the kernel: older kernels used 100 on x86, more recent kernels used 1000, and for the last few versions you have been able to choose between 100, 250, and 1000 while configuring the kernel.

micah said

The netfilter team is looking for a maintainer for ipt_recent. If nobody maintains it, they intend to mark it EXPERIMENTAL or BROKEN in Kconfig in 2.6.16: http://lists.netfilter.org/pipermail/netfilter-devel/2005-December/022696.html

Juergen, would you consider maintaining it? You seem to have a very good grasp of the issues.

FX said

Here is another person’s approach to fixing this problem:

http://www.kd.cz/~martin/kernel-recent/readme

These are his patches using 64 bit counters for kernels 2.6.9 and 2.6.14:

http://www.kd.cz/~martin/kernel-recent/

Claims to have worked well on 2.6.9 since February 2005.

HansHonk said

Does CAN-2005-2872 also affect 2.4-based kernels? At least the following Debian advisory says so:
http://www.debian.org/security/2005/dsa-921

Yes, the code wasn’t 64-bit safe until 2.6.12.

Sam Liddicott said

I mention these patches and enhancements here so that they do not get lost.

http://article.gmane.org/gmane.comp.security.firewalls.netfilter.devel/14931/match=ipt+recent

Sam

Moritz said

Several people seem to suggest that the code would need a cleanup or partial rewrite to be maintainable. But, in my opinion, it would really be very nice to have this functionality at the firewall level (even though others seem to think completely differently about this).

Jürgen, do you think you will take this on at some point? Or do you consider this module a lost cause?

Moritz, Patrick McHardy already did a rewrite for the forthcoming 2.6.18.

Moritz said

Just to complete this story:

http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.18

commit 404bdbfd242cb99ca0e9d3eb5fbb5bcd54123081
Author: Patrick McHardy
Date: Mon May 29 18:21:34 2006 -0700

[NETFILTER]: recent match: replace by rewritten version

Replace the unmaintainable ipt_recent match by a rewritten version that
should be fully compatible.

Manj said

There is no test case or proof of concept for the ipt_recent fix to show the effect of the jiffies changes made in the patch.

It would be easier to accept the patch if anyone could provide a test case.

Manj, please note that the problem is fixed in current Linux kernels.

The test case for the problem is actually pretty simple: just write an ipt_recent rule that blocks after receiving a specific number of packets within 60 seconds on some port. The first jiffies rollover happens five minutes after booting, so there’s enough time to recreate the described problem by telnetting to the port. With a broken version of ipt_recent, the rule will hit after receiving the first packet. With a fixed version it won’t hit until the number of packets you’ve specified is reached.

Manj said

Juergen, I’m using the linux-2.6.11 kernel and hence I need to fix the jiffies problem in it. It would be helpful if I could get a patch for only the jiffies problem, because I don’t want to get into backporting the ipt_recent.c code from the 2.6.18 kernel.

The patch should work fine with 2.6.11:

$ cd linux-2.6.11
$ patch -NEp1 < ~/ipt_recent-fix.patch
patching file include/linux/netfilter_ipv4/ipt_recent.h
patching file net/ipv4/netfilter/ipt_recent.c

Let me know if you need some help with the iptables rules.
(If you’re using shorewall, I’ve posted two rules which use ipt_recent in another article.)

Manj said

Juergen, thank you very much for the information. If you could give me the iptables rules to reproduce the bug on the linux-2.6.11 kernel, it would be very helpful.

Here are some simple test rules:

iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 1234 -m recent --update --seconds 60 --hitcount 6 --name foo -j DROP
iptables -A INPUT -p tcp --dport 1234 -m recent --set --name foo -j ACCEPT

Then, within the first five minutes after booting, try telnetting to port 1234. With a working version of ipt_recent, the rules won’t block unless you connect more than six times within 60 seconds. With a broken version, the second connection attempt will be blocked.

Manj said

Juergen, thank you very much for the iptables rules, but I’m getting the following error:

“unable to connect to remote host”

My /etc/hosts contains the following IP addresses:
127.0.0.1
43.88.102.48

Hence I ran telnet as follows:
# telnet 43.88.102.48 1234

and I got the error.

That’s the expected result: there’s nothing listening on port 1234. For this test, the immediate error “unable to connect to remote host” means that no packets were DROPed.
If you repeat the telnet command rapidly enough, it will hang at some point. That means the ipt_recent rule kicked in and DROPed the connection request.

Of course you can also use a port with some listener (e.g. port 80 if you have web server running) for the test. But using a port with no listener is just as good in this case.
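As an alternative to repeating telnet by hand, the same check could be scripted. The Python sketch below is an assumption-laden stand-in for the telnet test, not part of the original setup: `probe`, `first_dropped` and `run_test` are made-up names, and treating a connect timeout as "dropped" is only a heuristic (a timeout can have other causes).

```python
# Sketch: repeat TCP connection attempts and report the first one that hangs.
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"        # something is listening and accepted
    except ConnectionRefusedError:
        return "refused"         # packet got through, nothing is listening
    except socket.timeout:
        return "dropped"         # no answer at all; assumed to be a DROP


def first_dropped(outcomes):
    """Return the 1-based number of the first dropped attempt, or None."""
    for attempt, outcome in enumerate(outcomes, start=1):
        if outcome == "dropped":
            return attempt
    return None


def run_test(host, port, attempts=10):
    """Probe the port several times in a row and print a summary."""
    outcomes = [probe(host, port) for _ in range(attempts)]
    print(outcomes, "first dropped attempt:", first_dropped(outcomes))
    return outcomes
```

Calling `run_test("43.88.102.48", 1234)` from another machine within 60 seconds would then be expected to report attempt 7 as the first dropped one on a working ipt_recent, and attempt 2 on a broken one while the conditions from the article hold.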

Manj said

Juergen, to repeat the telnet command rapidly enough, I’ve executed the following shell script, which telnets to port 1234 1000 times.

#!/bin/sh
# Try to connect to port 1234 one thousand times in a row.

i=1
j=1000
# -le (not -ne) so that the loop body really runs 1000 times
while [ "$i" -le "$j" ]
do
    telnet 43.88.102.48 1234
    i=$((i + 1))
done

I got the same messages “unable to connect to remote host” 1000 times.

But what I’ve noticed is that the system did not hang, even after completing 1000 iterations, on both the patched and the unpatched kernel.

My system is a 32-bit system and I’ve applied the patch ipt_recent.patch, which also contains the 64-bit fixes, as you said. It would be helpful if you could tell me a clear method to reproduce the bug.

Please post the output from “iptables -L”.

The expected result for your script would be:

  • Patched kernel: the 7th telnet call hangs (not the machine, just the command)
  • Unpatched kernel: When one of the conditions mentioned in the article is true, the 2nd call hangs. Otherwise it behaves like the patched kernel.

I’ll post a more detailed test setup later. (Might take till tomorrow evening, I need some sleep tonight :-)

Manj said

Juergen, please find the output from “iptables -L” below. I’ve used iptables-1.3.6 in user space.

For unpatched kernel:
---------------------
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

For patched kernel:
-------------------
Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

There is no difference in the “iptables -L” output between the unpatched and the patched kernels. I hope you can give me a good solution.

OK, looks like something went wrong when you entered the three iptables commands I’ve posted above.

After entering the rules you should see:

$ iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED 
DROP       tcp  --  anywhere             anywhere            tcp dpt:1234 recent: UPDATE seconds: 60 hit_count: 6 name: foo side: source 
ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:1234 recent: SET name: foo side: source 

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Manj said

Juergen, please find the output from “iptables -L” after entering the three iptables commands that you’ve posted. One thing I can see is that the 3rd line shows “ACCEPT 0”, but according to you it should be “ACCEPT all”.

Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT 0 -- anywhere anywhere state RELATED,ESTABLISHED
DROP tcp -- anywhere anywhere tcp dpt:1234 recent: UPDATE seconds: 60 hit_count: 6 name: foo side: source
ACCEPT tcp -- anywhere anywhere tcp dpt:1234 recent: SET name: foo side: source

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

That’s OK, 0 == all.

Did you try running your script again with that setup?

Manj said

Juergen, I did the following after entering your iptables rules:

1. Immediately after the setup I executed the script.
2. After the setup I rebooted the system and executed the script.

There is no difference in the output :(

Please test whether your iptables setup works at all. Just use a simple DROP rule.
(Note that the rules are not persistent, you have to set them up again after each boot.)

Manj said

Juergen, I just tested my iptables setup by running the following rule:

# iptables -P OUTPUT DROP

After this, every command fails with an error. This is the output for ls:

# ls
RPC: sendmsg returned error 1
nfs: RPC call returned error 1
ash: ls: Operation not permitted

So I feel there is no problem in my iptables setup.

Manj said

Juergen, can you tell me some way to reproduce the problem? It would be very helpful to me.

I’ll post a more detailed test setup later.

Manj said

Thank you very much, I’ll be waiting for your post.

Manj said

Hi Juergen,

There is good news. Just now I was able to reproduce the bug on my i386 box. The results are as follows.

Unpatched kernel: when telnetting to port 1234 for the second time,
the ipt_recent rule kicked in and DROPed the connection request.

Patched kernel: when telnetting to port 1234 for the 7th time,
the ipt_recent rule kicked in and DROPed the connection request. The first 6 connections were refused, since our hit_count is 6.

Thank you very much for the patch and the iptables rules.

Jose Lima said

Hi,

I’m running redhat EL 4 with kernel 2.6.9-22.ELsmp

I tried rebooting and doing the test within 5 minutes; the rules seem to be working correctly as far as I can tell. Is there any other way to test, or am I safe?
Also, can someone post how to actually apply the patch? Do I have to recompile the kernel, or is this something applied when I compile iptables?

thanks for your time

BR

J

Manj said

Jose,

The patch that is provided by Juergen is to fix the problem in
the kernel, hence you have to apply the patch to your kernel
and then rebuild the kernel.

To apply the patch refer to Juergen’s comment above, on
October 15th, 2006 at 2:06 pm

[…] Then I read this bug report. It seems I have to patch the ipt_recent source… Actually, I think this is the wrong solution anyway, and I need to disable hosts at the sshd level after a few failed password attempts. […]

Sei said

Does Patrick McHardy’s rewrite really fix this more_than_LONG_MAX_jiffies problem?
It still has a time_after() comparison of unsigned long variables.

Yeah, the rewrite fixes the problem. I reviewed it months ago (I can’t remember why it is OK off the top of my head though — but it is OK!).

Sei said

Hi Juergen,

My term “this more_than_LONG_MAX_jiffies problem” was rather unclear.
The ipt_recent code before his rewrite had several problems related to jiffies values.
With the rewrite, a problem

> if the list of last hits contained empty slots and jiffies is greater than LONG_MAX

seems to be fixed, but another one

> if one of the last hits was more than LONG_MAX jiffies ago

does not appear to be fixed yet.
Do you mean the latter is fixed too?

Sei said

(continued) On i386 arch, a hit at 25 days after the last hit will result in a wrong match because ipt_recent_match() compares two unsigned long variables “t” and “e->stamps[i]” by time_after(), I believe.
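The arithmetic behind this concern can be sketched in Python with a model of time_after() on 32-bit values. The names `t` and `stamp` follow the comment; HZ=1000 and the sample stamp are assumptions, and the sketch only shows how the signed comparison behaves, not whether the rewritten module is actually affected.

```python
# Sketch: signed jiffies comparison when two stamps are > LONG_MAX apart.
WORD = 2**32
LONG_MAX = 2**31 - 1
HZ = 1000  # assumed
SECONDS_PER_DAY = 86400


def to_long(x):
    """Reinterpret a 32-bit unsigned value as a signed long."""
    x %= WORD
    return x - WORD if x > LONG_MAX else x


def time_after(a, b):
    """Model of the kernel macro: true if a is after b."""
    return to_long(b - a) < 0


stamp = 5000  # hypothetical time of the last hit, in jiffies

# 24 days later the old stamp is still recognised as lying in the past.
t = (stamp + 24 * SECONDS_PER_DAY * HZ) % WORD
print(time_after(t, stamp))

# 25 days later the difference exceeds LONG_MAX and the comparison flips.
t = (stamp + 25 * SECONDS_PER_DAY * HZ) % WORD
print(time_after(t, stamp))
```

In this model the second comparison no longer treats the old stamp as being in the past, which is the behaviour the comment describes for a hit 25 days after the last one.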

[…] and it turns out there is a serious bug in the ipt_recent module for netfilter that makes ipt_recent completely useless. It was found […]

siva said

Please tell me whether the uptime keeps counting indefinitely,
or whether it will roll over after 497 days…

Linux uses 64-bit jiffies now, you can get higher uptimes :)