I have experienced some strange behavior with my ipt_recent netfilter rules after an uptime of about 25 days. The rules started to block far too early. After rebooting the machine I was able to reproduce the problem for five minutes. This clearly indicated a problem with jiffies (Linux initializes jiffies so that the first roll-over happens five minutes after booting).
A closer look at ipt_recent.c revealed that the time tests did not work as intended if one of the last hits was more than LONG_MAX jiffies ago or if the list of last hits contained empty slots and jiffies was greater than LONG_MAX.
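A minimal sketch of why the comparison misbehaves, assuming a 32-bit `unsigned long` and a wrap-safe comparison that is implemented as a signed difference. The Python helper names are illustrative models, not kernel code:

```python
BITS = 32                          # assumption: 32-bit unsigned long
MASK = (1 << BITS) - 1
LONG_MAX = (1 << (BITS - 1)) - 1

def as_long(x):
    """Reinterpret a 32-bit unsigned value as a signed long."""
    x &= MASK
    return x - (1 << BITS) if x > LONG_MAX else x

def time_before_eq(a, b):
    """Model of time_before_eq(a, b): a is not later than b (signed difference)."""
    return as_long(b - a) >= 0

now = 0xFFFFF000                             # an arbitrary jiffies value
recent = (now - 1000) & MASK                 # a hit 1000 jiffies ago
after_wrap = (now + 0x2000) & MASK           # a later time, past the rollover
stale = (now - (LONG_MAX + 10)) & MASK       # a hit more than LONG_MAX jiffies ago

print(time_before_eq(recent, now))       # True: ordering is correct
print(time_before_eq(now, after_wrap))   # True: still correct across the rollover
print(time_before_eq(stale, now))        # False: the old hit appears to lie in the future
```

The comparison survives a rollover as long as the two values are less than LONG_MAX jiffies apart. Beyond that distance the sign of the difference flips and the ordering is reported the wrong way round.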
To fix this, I replaced jiffies with seconds since ’00:00:00 1970-01-01 UTC’. I have sent the patch to linux-kernel and netfilter-devel. The patch also includes some 64-bit fixes.
May 12th, 2005: The patch has been added to Linux 2.6.12-rc4-mm1
September 8th, 2005: Please note that only the 64-bit parts of my patch have made it into 2.6.12. I’m working on an updated fix for the time comparison problems which will hopefully get accepted for 2.6.14 or later.
September 12th, 2005: These issues have CAN numbers now: CAN-2005-2872 and CAN-2005-2873 (which supersede CAN-2005-2802)
July 10th, 2006: The jiffies issue is fixed in the vanilla kernel now. Also note that 2.6.18 will contain a rewrite of ipt_recent.c.
CAN-2005-2802 has been rejected, it’s only CAN-2005-2872 and CAN-2005-2873.
Is there an update to this? It seems like Dave Miller didn’t like the proposed fix, and thinks that the only solution is a complete rewrite of ipt_recent? This is somewhat annoying because this seemed like the best solution to this problem I’ve seen so far, and to find that the kernel needs a module re-written before this can be useful is disheartening.
The situation hasn’t changed.
I’ll post an updated patch for just the jiffies problem in the next few days. And, maybe, I’ll start a rewrite when I have some free days.
I think I’ve noticed this problem — I’ve found myself blacklisted even when only issuing one ssh connection. I’ve even been blacklisted when already connected via ssh.
Thanks so much for writing about this!
Is the time required for the jiffies rollover predictable?
Would rebooting the server once per week avoid this problem?
Sure. Jiffies are an unsigned long value. That means it’s a 32-bit integer on 32-bit Linux architectures and a 64-bit integer on 64-bit Linux architectures.
The kernel initializes jiffies to a value that makes the first rollover happen 5 minutes after boot. This is done to catch rollover problems.
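As a rough sketch of what that initial value looks like, assuming a 32-bit kernel with HZ=1000 and a start value of `-300 * HZ` truncated to 32 bits (which is how I understand `INITIAL_JIFFIES` in `include/linux/jiffies.h`; treat the exact definition as an assumption):

```python
HZ = 1000                      # assumption: tick rate of the kernel in question
WRAP = 1 << 32                 # assumption: 32-bit unsigned long
LONG_MAX = (1 << 31) - 1

# Model of a counter that starts 300 seconds' worth of ticks below the wrap point.
initial_jiffies = (-300 * HZ) % WRAP

seconds_until_wrap = (WRAP - initial_jiffies) / HZ

print(seconds_until_wrap)              # 300.0
print(initial_jiffies > LONG_MAX)      # True
```

Under these assumptions the counter spends the first five minutes of uptime above LONG_MAX, which fits the observation that the ipt_recent problem can be reproduced right after a reboot.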
After that, rollovers happen every ULONG_MAX / HZ seconds.
The ipt_recent problem shows up earlier: It happens after LONG_MAX / HZ seconds. E.g., on a 32-bit system with HZ=1000 that means after (2³¹-1)/1000 s = 2147483.647 s ≈ 24.855 days.
On 64-bit systems you’re safe. On 32-bit kernels you can either use a kernel with a lower HZ setting (e.g. 100 or 250, which gives you about 249 or 99 days before getting problems) or use my patch (I really should upload an updated version).
I think I have a workaround for this problem. Please let me know if this would work:
hourly cron script to cleanup SSHA file in ipt_recent (script pseudocode):
Hm, I don’t think that will work because the jiffies rollover isn’t the problem, it’s the difference between current jiffies and last_seen. That would be fixable with slight modifications to your script. But then there’s still the problem with empty slots when jiffies > LONG_MAX.
Thanks so much for the extra info! It was very helpful.
What are empty slots and how do we detect them?
Each per-IP entry in an ipt_recent list has ip_pkt_list_tot slots (default is 256) for recording the time of arrival of last seen packets. The slots are initialized with time 0, i.e. an empty slot is a slot which doesn’t record the time of arrival of any packet.
The proc interface for ipt_recent just shows non-empty slots (i.e. only slots with a value different from zero) but the hit-count calculation looks at all slots unconditionally. The hit-count algorithm compares the stored time with current jiffies using time_before_eq(). time_before_eq(ul, 0UL) returns true for ul values greater than LONG_MAX + 1.
For ipt_recent’s hit-count algorithm this means that as soon as jiffies (plus the time value given in the rule) reaches LONG_MAX + 2 every empty slot counts as a hit!
By HZ, do you mean interrupt frequency of the timer?
For example, on a 3.0 GHz CPU with 100 interrupts per second for the timer, the correct value is 100, correct? Jiffies appear to get incremented 100 times per second.
The value depends on the kernel: Older kernels used 100 on x86, more recent kernels used 1000, and for the last few versions you have been able to choose between 100, 250, and 1000 while configuring the kernel.
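For those three HZ settings, the LONG_MAX / HZ figure quoted earlier in this thread can be worked out with a few lines. This is only arithmetic for a 32-bit kernel, and the function name is made up for the example:

```python
LONG_MAX_32 = 2**31 - 1        # LONG_MAX on a 32-bit kernel
SECONDS_PER_DAY = 86400

def days_until_trouble(hz):
    """Days it takes jiffies to advance by LONG_MAX at the given tick rate."""
    return LONG_MAX_32 / hz / SECONDS_PER_DAY

for hz in (100, 250, 1000):
    print(f"HZ={hz}: {days_until_trouble(hz):.2f} days")
```

This gives roughly 248.6, 99.4 and 24.9 days, matching the "about 249 or 99 days" and "24.855 days" mentioned above.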
The netfilter team is looking for a maintainer for ipt_recent. If no one steps up to maintain it, they intend to mark it EXPERIMENTAL or BROKEN in Kconfig for 2.6.16: http://lists.netfilter.org/pipermail/netfilter-devel/2005-December/022696.html
Juergen, would you consider maintaining it? You seem to have a very good grasp of the issues.
Here is another person’s approach to fixing this problem:
http://www.kd.cz/~martin/kernel-recent/readme
These are his patches using 64 bit counters for kernels 2.6.9 and 2.6.14:
http://www.kd.cz/~martin/kernel-recent/
Claims to have worked well on 2.6.9 since February 2005.
Does CAN-2005-2872 also affect 2.4-based kernels? At least the following Debian Advisory says so:
http://www.debian.org/security/2005/dsa-921
Yes, the code wasn’t 64-bit safe until 2.6.12.
I’m mentioning these patches and enhancements here so that they do not get lost.
http://article.gmane.org/gmane.comp.security.firewalls.netfilter.devel/14931/match=ipt+recent
Sam
Several people seem to suggest that the code would need a cleanup or partial rewrite to be maintainable. But, in my opinion, it would really be very nice to have this functionality at the firewall level (even though others seem to think completely differently about this).
Jürgen, do you think you will take this on at some point? Or do you consider this module a lost cause?
Moritz, Patrick McHardy already did a rewrite for the forthcoming 2.6.18.
Just to complete this story:
http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.18
commit 404bdbfd242cb99ca0e9d3eb5fbb5bcd54123081
Author: Patrick McHardy
Date: Mon May 29 18:21:34 2006 -0700
[NETFILTER]: recent match: replace by rewritten version
Replace the unmaintainable ipt_recent match by a rewritten version that
should be fully compatible.
There is no test case or proof of concept for the ipt_recent fix to show the effect of the jiffies changes made in the patch.
It would be easier to accept the patch if anyone could provide a test case.
Manj, please note that the problem is fixed in current Linux kernels.
The test case for the problem is actually pretty simple: just write an ipt_recent rule that blocks after receiving a specific number of packets within 60 seconds on some port. The first jiffies rollover happens five minutes after booting, so there’s enough time to recreate the described problem by telnetting to the port. With a broken version of ipt_recent, the rule will hit after receiving the first packet. With a fixed version it won’t hit until the number of packets you’ve specified is reached.
Juergen, I’m using the linux-2.6.11 kernel and hence I need to fix the jiffies problem in it. It would be helpful if I could get a patch for only the jiffies problem because I don’t want to get into backporting the ipt_recent.c code from the 2.6.18 kernel.
The patch should work fine with 2.6.11:
Let me know if you need some help with the iptable rules.
(If you’re using shorewall, I’ve posted two rules which use ipt_recent in another article.)
Juergen, thank you very much for the information. If you could give me the iptables rules to reproduce the bug on the linux-2.6.11 kernel it would be very helpful.
Here are some simple test rules:
Then, within the first five minutes after booting, try telnetting to port 1234. With a working version of ipt_recent, the rules won’t block unless you connect more than six times within 60 seconds. With a broken version, the second connect will be blocked.
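To illustrate why a broken version blocks on the second connect, here is a small simulation of the hit count. It is modelled loosely on the description of empty slots earlier in this thread and is not the actual ipt_recent.c code. The slot count, HZ value and the `skip_empty` guard are assumptions made for the example (the patch discussed in the article takes a different route and stores seconds since the epoch instead of jiffies):

```python
BITS = 32                          # assumption: 32-bit unsigned long
MASK = (1 << BITS) - 1
LONG_MAX = (1 << (BITS - 1)) - 1
HZ = 1000                          # assumption
SLOTS = 20                         # hypothetical slot count, for illustration only

def as_long(x):
    """Reinterpret a 32-bit unsigned value as a signed long."""
    x &= MASK
    return x - (1 << BITS) if x > LONG_MAX else x

def time_before_eq(a, b):
    """Model of time_before_eq(a, b) as a signed difference."""
    return as_long(b - a) >= 0

def count_hits(slots, now, seconds, skip_empty):
    """Count slots whose timestamp lies within `seconds` of `now`."""
    window = seconds * HZ
    hits = 0
    for stamp in slots:
        if skip_empty and stamp == 0:
            continue               # hypothetical guard: ignore empty slots
        if time_before_eq(now, (stamp + window) & MASK):
            hits += 1
    return hits

# Two minutes after boot, jiffies is still above LONG_MAX.
now = (-300 * HZ + 120 * HZ) & MASK
slots = [0] * SLOTS
slots[0] = now                     # the first connection has just been recorded

broken = count_hits(slots, now, seconds=60, skip_empty=False)
guarded = count_hits(slots, now, seconds=60, skip_empty=True)
print(broken, guarded)             # 20 1
```

With a hit count of 6 in the rule, the unguarded count of 20 is already over the limit after a single recorded packet, so the next connection attempt is dropped. The guarded count of 1 stays below the limit.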
Juergen, thank you very much for the iptables rules, but I’m getting the following error:
“unable to connect to remote host”
My /etc/hosts contains the following IP addresses:
127.0.0.1
43.88.102.48
Hence I did a telnet as follows
# telnet 43.88.102.48 1234
and I got the error.
That’s the expected result, there’s nothing listening on port 1234. For this test, the immediate error “unable to connect to remote host” means that no packets were DROPed.
If you repeat the telnet command rapidly enough, it will hang at some point. That means the ipt_recent rule kicked in and DROPed the connection request.
Of course you can also use a port with some listener (e.g. port 80 if you have a web server running) for the test. But using a port with no listener is just as good in this case.
Juergen, to repeat the telnet command rapidly enough, I’ve executed the following shell script, which telnets to port 1234 1000 times.
#!/bin/sh
# Try to connect to port 1234 a total of 1000 times.
i=1
j=1000
while [ "$i" -le "$j" ]
do
    telnet 43.88.102.48 1234
    i=$((i + 1))
done
I got the same message “unable to connect to remote host” 1000 times.
But what I’ve noticed is that the system did not hang even after completing 1000 iterations, on both the bug-fixed and the bug-not-fixed kernels.
My system is a 32-bit system and I’ve applied the patch ipt_recent.patch, which also contains the 64-bit fixes, as you said. It would be helpful if you could tell me a clear method to reproduce the bug.
Please post the output from “iptables -L”.
The expected result for your script would be:
I’ll post a more detailed test setup later. (Might take till tomorrow
evening, I need some sleep tonight :-)
Juergen, please find the output from “iptables -L” below. I’ve used iptables-1.3.6 in user space.
For unpatched kernel:
————————-
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
For patched kernel:
———————-
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
There is no difference in the “iptables -L” output between the unpatched and the patched kernels. I hope you can give me a good solution.
OK, looks like something went wrong when you entered the three iptables commands I’ve posted above.
After entering the rules you should see:
Juergen, please find the output from “iptables -L” after entering the three iptables commands that you’ve posted. One thing that I can see is that the 3rd line shows “ACCEPT 0” but according to you it should be “ACCEPT all”.
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT 0 -- anywhere anywhere state RELATED,ESTABLISHED
DROP tcp -- anywhere anywhere tcp dpt:1234 recent: UPDATE seconds: 60 hit_count: 6 name: foo side: source
ACCEPT tcp -- anywhere anywhere tcp dpt:1234 recent: SET name: foo side: source
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
That’s OK, 0 == all.
Did you try running your script again with that setup?
Juergen, I did the following after running your iptables rules:
1. Immediately after the setup I executed the script.
2. After the setup I rebooted the system and executed the script.
There is no difference in the output :(
Please test whether your iptables setup works at all. Just use a simple DROP rule.
(Note that the rules are not persistent, you have to set them up again after each boot.)
Juergen, I just tested my iptables by running the following rule
# iptables -P OUTPUT DROP
After this, any command execution gives an error; this is the output for ls.
# ls
RPC: sendmsg returned error 1
nfs: RPC call returned error 1
ash: ls: Operation not permitted
So I feel there is no problem with my iptables setup.
Juergen, can you tell me some way to reproduce the problem? It would be very helpful to me.
I’ll post a more detailed test setup later.
Thank you very much, I’ll be waiting for your post.
Hi Juergen,
There is good news. Just now I was able to reproduce the bug on my i386 box. The results are as follows.
Unpatched kernel: when telnetting to port 1234 for the second time, the ipt_recent rule kicked in and DROPed the connection request.
Patched kernel: when telnetting to port 1234 for the 7th time.
The first 6 connections were refused, since our hit_count is 6, and on the 7th the ipt_recent rule kicked in and DROPed the connection request.
Thank you very much for the patch and the iptables rules.
Hi,
I’m running redhat EL 4 with kernel 2.6.9-22.ELsmp
I tried rebooting and doing the test within 5 minutes, and the rules seem to be working correctly as far as I can tell. Is there any other way to test? Or am I safe?
Also, can someone post how to actually apply the patch? Do I have to recompile the kernel? Or is this something applied when I compile iptables?
thanks for your time
BR
J
Jose,
The patch that is provided by Juergen is to fix the problem in
the kernel, hence you have to apply the patch to your kernel
and then rebuild the kernel.
To apply the patch refer to Juergen’s comment above, on
October 15th, 2006 at 2:06 pm
[…] Then I read this bug report. It seems I have to patch the ipt_recent source… Actually, I think this is the wrong solution anyway, and I need to disable hosts at the sshd level after a few failed password attempts. […]
Does Patrick McHardy’s rewrite really fix this more_than_LONG_MAX_jiffies problem?
It still has a time_after() comparison of unsigned long variables.
Yeah, the rewrite fixes the problem. I reviewed it months ago (I can’t remember why it is OK off the top of my head though — but it is OK!).
Hi Juergen,
My term “this more_than_LONG_MAX_jiffies problem” was rather unclear.
The ipt_recent code before his rewrite had several problems related to jiffies values.
With the rewrite, a problem
> if the list of last hits contained empty slots and jiffies is greater than LONG_MAX
seems to be fixed, but another one
> if one of the last hits was more than LONG_MAX jiffies ago
does not look fixed yet.
Do you mean the latter is fixed too?
(continued) On i386 arch, a hit at 25 days after the last hit will result in a wrong match because ipt_recent_match() compares two unsigned long variables “t” and “e->stamps[i]” by time_after(), I believe.
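The concern raised in this comment can be sketched as follows. The shape of the check (a window start `t` compared against each stored stamp with `time_after()`) is taken from the comment itself, not from the kernel source, and the 32-bit width, HZ value and function names are assumptions for the example:

```python
BITS = 32                          # assumption: i386, 32-bit unsigned long
MASK = (1 << BITS) - 1
LONG_MAX = (1 << (BITS - 1)) - 1
HZ = 1000                          # assumption

def as_long(x):
    """Reinterpret a 32-bit unsigned value as a signed long."""
    x &= MASK
    return x - (1 << BITS) if x > LONG_MAX else x

def time_after(a, b):
    """Model of time_after(a, b): a is later than b (signed difference)."""
    return as_long(b - a) < 0

def is_counted(stamp, now, seconds):
    """A stamp counts as a hit unless the window start t is after it."""
    t = (now - seconds * HZ) & MASK
    return not time_after(t, stamp)

now = 0x90000000                                   # an arbitrary jiffies value
one_hour_ago = (now - 3600 * HZ) & MASK
days_26_ago = (now - 26 * 86400 * HZ) & MASK       # more than LONG_MAX jiffies back

print(is_counted(one_hour_ago, now, seconds=60))   # False: outside the 60 s window
print(is_counted(days_26_ago, now, seconds=60))    # True: stale stamp is counted
```

In this model a stamp that is an hour old is correctly ignored, while one that is 26 days old falls on the wrong side of the signed comparison and is treated as a recent hit. Whether the rewritten module can actually reach this state is exactly the question asked above.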
[…] and it turns out the ipt_recent module for netfilter has a nasty bug that makes ipt_recent completely useless. It was found […]
Please tell me whether uptime keeps counting indefinitely,
or whether it rolls over after 497 days.
Linux uses 64-bit jiffies now, you can get higher uptimes :)
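For reference, the 497-day figure is what a 32-bit tick counter gives at HZ=100. A quick back-of-the-envelope check, with the 64-bit case computed for HZ=1000 as an assumed example:

```python
SECONDS_PER_DAY = 86400

def days_until_rollover(bits, hz):
    """Days until a tick counter of the given width wraps at tick rate hz."""
    return 2**bits / hz / SECONDS_PER_DAY

days_32 = days_until_rollover(32, 100)
years_64 = days_until_rollover(64, 1000) / 365.25

print(round(days_32, 1))           # 497.1
print(f"{years_64:.3g} years")
```

A 64-bit counter at 1000 ticks per second takes hundreds of millions of years to wrap, so uptime rollover is no longer a practical concern.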