- Bug
- Resolution: Done
- rhel-9.4
- kernel-5.14.0-427.42.1.el9_4
- Important
- rhel-net-firewall
- ssg_networking
- x86_64
What were you trying to do that didn't work?
We created a repro program that uses the same nfqueue library as felix (our policy program). The library is written by a kernel developer involved in netfilter development. We created a RHEL 9 VM in GCP with kernel Linux rh-9-4 5.14.0-427.33.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 16 10:56:24 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux (slightly newer than the one used on-site). The program is broken on that kernel, while it works well on my Ubuntu 23.04 machine with a 6.5 kernel. Here is how it works:
Create an iptables rule to send pings from the local host to 1.2.3.4 to nfqueue 100.
sudo iptables -I OUTPUT -p icmp -d 1.2.3.4 -j NFQUEUE --queue-num 100
- The program reads from the queue, prints the mark of the packet (no mark) and sets mark 0x1 and issues verdict REPEAT.
- The packet should be re-injected into iptables and evaluated again - since it is the same packet it matches our iptables rule and is sent to the program again.
- The program should see the mark 0x1 and issue verdict DROP
Here is the program:
----------------------
package main

import (
	"context"
	"fmt"
	"time"

	nfqueue "github.com/florianl/go-nfqueue"
	"github.com/mdlayher/netlink"
)

func main() {
	// Send outgoing pings to nfqueue queue 100
	// # sudo iptables -I OUTPUT -p icmp -j NFQUEUE --queue-num 100

	// Set configuration options for nfqueue
	config := nfqueue.Config{
		NfQueue:      100,
		MaxPacketLen: 0xFFFF,
		MaxQueueLen:  0xFF,
		Copymode:     nfqueue.NfQnlCopyPacket,
		WriteTimeout: 15 * time.Millisecond,
	}

	nf, err := nfqueue.Open(&config)
	if err != nil {
		fmt.Println("could not open nfqueue socket:", err)
		return
	}
	defer nf.Close()

	// Avoid receiving ENOBUFS errors.
	if err := nf.SetOption(netlink.NoENOBUFS, true); err != nil {
		fmt.Printf("failed to set netlink option %v: %v\n",
			netlink.NoENOBUFS, err)
		return
	}

	ctx := context.Background()

	fn := func(a nfqueue.Attribute) int {
		id := *a.PacketID
		// Print the id and mark of the nfqueue packet
		if a.Mark != nil {
			fmt.Printf("[%d]\t0x%08x\n", id, *a.Mark)
			// A packet that already carries mark 0x1 has been seen before: drop it
			if (*a.Mark)&uint32(0x1) != 0 {
				fmt.Println("Dropping seen")
				nf.SetVerdict(id, nfqueue.NfDrop)
				return 0
			}
		} else {
			fmt.Printf("[%d]\tno mark\n", id)
		}
		// First sighting: set mark 0x1 and ask the kernel to re-evaluate the packet
		fmt.Println("Set makr 0x1 and repeat")
		nf.SetVerdictWithMark(id, nfqueue.NfRepeat, 0x1)
		return 0
	}

	// Register your function to listen on nfqueue queue 100
	err = nf.RegisterWithErrorFunc(ctx, fn, func(e error) int {
		// Print the error passed to the callback, not the outer err
		fmt.Println(e)
		return -1
	})
	if err != nil {
		fmt.Println(err)
		return
	}

	// Block till the context expires
	<-ctx.Done()
}
Expected behavior
===============
When things work as expected, the program should see each packet twice. When things work badly, the program sees each packet just once and never sees a packet with the mark 0x1.
Note that if the rule's destination is an existing, reachable IP, then when things work there is no connectivity (the ping is dropped). When things do not work, the packet goes through and there is a ping response.
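The two behaviours described above can be sketched as a small toy model. This is not kernel code and does not use nfqueue: `decide` mirrors the callback logic of the test program, and the `honorRepeat` flag is a hypothetical simplification standing in for "the kernel re-queues the packet after a REPEAT verdict" versus "the packet leaves without being re-queued".

```go
package main

import "fmt"

// verdict is a stand-in for the two netfilter verdicts the test program issues.
type verdict int

const (
	drop verdict = iota
	repeat
)

// decide mirrors the test program's callback: a packet that already carries
// mark 0x1 is dropped, anything else gets mark 0x1 and verdict REPEAT.
func decide(mark uint32) (verdict, uint32) {
	if mark&0x1 != 0 {
		return drop, mark
	}
	return repeat, 0x1
}

// simulate pushes `pings` packets through the model and returns the lines the
// test program would print. honorRepeat=false models the broken behaviour
// (hypothetical simplification: the packet is never queued a second time).
func simulate(pings int, honorRepeat bool) []string {
	var out []string
	id := 0
	for p := 0; p < pings; p++ {
		mark := uint32(0)
		for {
			id++
			if mark == 0 {
				out = append(out, fmt.Sprintf("[%d] no mark", id))
			} else {
				out = append(out, fmt.Sprintf("[%d] 0x%08x", id, mark))
			}
			v, newMark := decide(mark)
			if v == drop {
				out = append(out, "Dropping seen")
				break
			}
			out = append(out, "Set mark 0x1 and repeat")
			if !honorRepeat {
				break // packet is not re-queued; the program never sees it again
			}
			mark = newMark
		}
	}
	return out
}

func main() {
	fmt.Println("REPEAT honored:")
	for _, line := range simulate(2, true) {
		fmt.Println(" ", line)
	}
	fmt.Println("REPEAT not honored:")
	for _, line := range simulate(2, false) {
		fmt.Println(" ", line)
	}
}
```

With `honorRepeat` set to true, every packet shows up twice in the log; with it set to false, only unmarked packets appear, which is the pattern reported for the affected kernel.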
The source of the program and a binary are also attached.
Bad output from the test program:
[1] no mark
Set makr 0x1 and repeat
[2] no mark
Set makr 0x1 and repeat
[3] no mark
Set makr 0x1 and repeat
[4] no mark
Set makr 0x1 and repeat
[5] no mark
Set makr 0x1 and repeat
[6] no mark
Set makr 0x1 and repeat
Good output from the test program:
[1] no mark
Set makr 0x1 and repeat
[2] 0x00000001
Dropping seen
[3] no mark
Set makr 0x1 and repeat
[4] 0x00000001
Dropping seen
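To tell the two outputs apart without reading them by eye, a small checker along these lines could be used. It is only a sketch: the function name `classify` and the "good"/"bad" labels are made up for illustration, and the criterion is the one stated in this report (a working kernel eventually shows a packet carrying mark 0x1, a broken one never does).

```go
package main

import (
	"fmt"
	"strings"
)

// classify scans the test program's output. Packet lines start with "[id]";
// a packet line either says "no mark" or shows the mark in hex.
// It returns "good" if at least one marked packet was seen, "bad" if packets
// were seen but none carried a mark, and "no packets" otherwise.
func classify(output string) string {
	packets, marked := 0, 0
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "[") {
			continue // skip the "Set mark ..." and "Dropping seen" lines
		}
		packets++
		if !strings.Contains(line, "no mark") {
			marked++
		}
	}
	switch {
	case packets == 0:
		return "no packets"
	case marked == 0:
		return "bad"
	default:
		return "good"
	}
}

func main() {
	bad := "[1]\tno mark\nSet mark 0x1 and repeat\n[2]\tno mark\nSet mark 0x1 and repeat\n"
	good := "[1]\tno mark\nSet mark 0x1 and repeat\n[2]\t0x00000001\nDropping seen\n"
	fmt.Println("first log:", classify(bad))
	fmt.Println("second log:", classify(good))
}
```

The input can be captured by redirecting the test program's standard output to a file while pinging 1.2.3.4.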
What is the impact of this issue to you?
Our customer needs to release a project
Please provide the package NVR for which the bug is seen:
We can reproduce the issue in kernel:
5.14.0-427.13.1.el9_4.x86_64 <<---- issue ! RHEL 9
5.14.0-446.el9.x86_64 <<----------- issue is fixed
$ git log --oneline --grep="netfilter: conntrack: convert nf_conntrack_update to netfilter verdicts" origin/main
e3b4118d5495 netfilter: nfnetlink_queue: un-break NF_REPEAT <<-------
$ git tag --contains=e3b4118d5495 | head -1
kernel-5.14.0-446.el9
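When checking many hosts, the build number can be pulled out of the `uname -r` string and compared with the tag reported above. The sketch below assumes the release string has the form `5.14.0-<build>...`; the helper names are hypothetical. It only tells whether a kernel is at or after the `kernel-5.14.0-446.el9` tag. A lower build number does not prove the bug is present, because fixes can be backported to z-stream kernels, and that is not modeled here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fixTagBuild is the build number of the first tag containing the fix,
// taken from the `git tag --contains` output in this report.
const fixTagBuild = 446

// buildNumber extracts the first numeric component after the '-' in a
// kernel release string such as "5.14.0-427.13.1.el9_4.x86_64" (-> 427).
func buildNumber(release string) (int, error) {
	_, rest, ok := strings.Cut(release, "-")
	if !ok {
		return 0, fmt.Errorf("no '-' in release string %q", release)
	}
	digits, _, _ := strings.Cut(rest, ".")
	return strconv.Atoi(digits)
}

// atOrAfterFixTag reports whether the release is at or after build 446.
// Assumption: z-stream backports are NOT considered.
func atOrAfterFixTag(release string) (bool, error) {
	n, err := buildNumber(release)
	if err != nil {
		return false, err
	}
	return n >= fixTagBuild, nil
}

func main() {
	for _, r := range []string{
		"5.14.0-427.13.1.el9_4.x86_64",
		"5.14.0-446.el9.x86_64",
	} {
		ok, err := atOrAfterFixTag(r)
		if err != nil {
			fmt.Println(r, "error:", err)
			continue
		}
		fmt.Printf("%s at or after fix tag: %v\n", r, ok)
	}
}
```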
How reproducible is this bug?:
100% reproducible
Steps to reproduce
I wrote up detailed output on this case for different RHEL versions. I am not pasting it here because it loses the formatting I used to give a clear picture. See comment #18, and comment #21 for the same steps using the DEV kernel.
https://gss--c.vf.force.com/apex/Case_View?srPos=0&srKp=500&id=5006R0000212kDB&sfdc.override=1
Expected results
The issue is fixed on the dev kernel 5.14.0-446.el9.x86_64.
- ./nfqueue_app
[1] no mark
Set mark 0x1 and repeat
[2] 0x00000001
Dropping seen
[3] no mark
Set mark 0x1 and repeat
[4] 0x00000001
Dropping seen
[5] no mark
Set mark 0x1 and repeat
[6] 0x00000001
Dropping seen
[7] no mark
Set mark 0x1 and repeat
[8] 0x00000001
Dropping seen
Actual results
10.- Testing
- ./nfqueue_app
[1] no mark
Set mark 0x1 and repeat
[2] no mark
Set mark 0x1 and repeat
[3] no mark
Set mark 0x1 and repeat
[4] no mark