Comments

Many thanks, Florian, for puzzling this out! Our nightly run still failed, but dnf still only "saw" 2.37-6.fc38 on the mirrors. I'll let you know tomorrow morning. But I'm sure it'll be good, as you tested it on the very thing.

Sorry @fweimer, indeed these VMs are offline by default, to make sure none of our tests depends on something outside. Please apply this local hack:

--- test/common/testlib.py
+++ test/common/testlib.py
@@ -1290,7 +1290,7 @@ class MachineCase(unittest.TestCase):
                 if cleanup:
                     self.addCleanup(network.kill)
                 self.network = network
-            networking = self.network.host(restrict=restrict, forward=forward or {})
+            networking = self.network.host(restrict=False, forward=forward or {})
             machine = machine_class(verbose=opts.trace, networking=networking, image=image, **kwargs)
             if opts.fetch and not os.path.exists(machine.image_file):
                 machine.pull(machine.image_file)

As this needs a client machine which is talking to a configured FreeIPA, and thus at least two VMs which talk to each other, this is unfortunately quite involved. If you have a FreeIPA setup, you could just run the ipa-getkeytab command. From scratch, here is how you can reproduce the cockpit test. Note that this is safe -- the tests don't run as root, don't create/change permanent files on the host (only temp dirs/files), and don't change the qemu/libvirt config (all transient VMs with socket networking).

You can do this in our development toolbox, so that you don't have to install all the nodejs, libvirt and QEMU packages (if you already have them installed, you can skip this):

toolbox create --image quay.io/cockpit/tasks -c cockpit
toolbox enter cockpit

Then check out cockpit and build a test image:

git clone https://github.com/cockpit-project/cockpit
cd cockpit
test/image-prepare -q fedora-38

This image does not include updates-testing yet. Now run a FreeIPA test:

test/verify/check-system-realms TestKerberos.testNegotiate

This ought to succeed. If not, and it's not obvious why (like, missing libvirt packages or so), please ping me here or on Slack, I'm happy to assist.

Now install the glibc update into the VM:

bots/image-customize -v -r 'dnf upgrade --enablerepo=updates-testing -y --refresh --advisory=FEDORA-2023-7f0a294b1a >&2' fedora-38

Now run the test again, but this time with the -s option, which will make it "sit" on a test failure without cleaning up the VMs:

test/verify/check-system-realms TestKerberos.testNegotiate -s

This should fail with this assertion error, and give you some information on how to log in via SSH:

ssh -p 2201 -i bots/machine/identity root@127.0.0.2

Move that terminal to the side -- as soon as you press enter, it'll continue, i.e. clean up all the test VMs.

Inside the test VM, you can now reproduce the crash:

# ipa-getkeytab -p HTTP/x0.cockpit.lan -k /etc/cockpit/krb5.keytab
Keytab successfully retrieved and stored in: /etc/cockpit/krb5.keytab
k5_mutex_lock: Received error 22 (Invalid argument)
ipa-getkeytab: ../../include/k5-thread.h:376: k5_mutex_lock: Assertion `r == 0' failed.
Aborted (core dumped)
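
For reading the log above: the number in "Received error 22" is an errno value, and the text in parentheses is its message. A minimal sketch for decoding such a line with the Python standard library (the `decode_errno` helper is hypothetical, it is not part of krb5 or cockpit):

```python
import errno
import os
import re

# Line copied from the ipa-getkeytab output above.
LINE = "k5_mutex_lock: Received error 22 (Invalid argument)"


def decode_errno(line):
    """Return (number, symbolic name, message) for a 'Received error N' line."""
    match = re.search(r"Received error (\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numbers to names such as 'EINVAL'.
    return code, errno.errorcode.get(code, "unknown"), os.strerror(code)


# On Linux this prints: (22, 'EINVAL', 'Invalid argument')
print(decode_errno(LINE))
```

So the lock call reported EINVAL, and the `r == 0` assertion turned that into an abort. This only decodes the number; it does not say why the mutex was rejected.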

This is a Fedora 38 cloud image with some extra packages installed, so you can install debug symbols, run gdb, etc.
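
If you want to replay the steps above in one go, here is a rough sketch that just strings the same commands together. It assumes you are inside the cockpit checkout (and the toolbox, if you use it); the `reproduction_steps` and `run_all` helpers are made up for this illustration and are not part of cockpit's tooling. By default it only prints the commands.

```python
import shlex
import subprocess

# Values taken from the instructions above.
IMAGE = "fedora-38"
ADVISORY = "FEDORA-2023-7f0a294b1a"
TEST = "TestKerberos.testNegotiate"


def reproduction_steps(image=IMAGE, advisory=ADVISORY, test=TEST):
    """Return (argv, must_succeed) pairs in the order described above."""
    upgrade = (
        "dnf upgrade --enablerepo=updates-testing -y --refresh "
        f"--advisory={advisory} >&2"
    )
    return [
        (["test/image-prepare", "-q", image], True),
        (["test/verify/check-system-realms", test], True),
        (["bots/image-customize", "-v", "-r", upgrade, image], True),
        # Expected to fail with the update; -s keeps the VMs around.
        (["test/verify/check-system-realms", test, "-s"], False),
    ]


def run_all(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, must_succeed in steps:
        print("+", shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=must_succeed)


if __name__ == "__main__":
    run_all(reproduction_steps())
```

Call `run_all(reproduction_steps(), dry_run=False)` to actually execute the steps. The last one will sit and wait on the failure, so run it in a terminal you can leave open.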

Works for me.

Works for me.

Works for me.

Works for me.

Works for me.

Crashes confirmed with the cockpit tests: at least ipa-getkeytab and iscsid now crash.

The journal for ipa-getkeytab crash shows the crash in krb5int_key_delete(), and the test output confirms the assertion:

ipa-getkeytab: ../../include/k5-thread.h:376: k5_mutex_lock: Assertion `r == 0' failed.

The journal for the iscsid crash doesn't even get that far; it crashes right at the beginning:

systemd-coredump[34279]: Process 34270 (iscsid) of user 0 dumped core.    

Module libpcre2-8.so.0 from rpm pcre2-10.42-1.fc38.1.x86_64    
Module liblz4.so.1 from rpm lz4-1.9.4-2.fc38.x86_64    
Module libcap.so.2 from rpm libcap-2.48-6.fc38.x86_64    
Module liblzma.so.5 from rpm xz-5.4.1-1.fc38.x86_64    
Module libzstd.so.1 from rpm zstd-1.5.5-1.fc38.x86_64    
Module libselinux.so.1 from rpm libselinux-3.5-1.fc38.x86_64    
Module libblkid.so.1 from rpm util-linux-2.38.1-4.fc38.x86_64    
Module libz.so.1 from rpm zlib-1.2.13-3.fc38.x86_64    
Module libopeniscsiusr.so.0.2.0 from rpm iscsi-initiator-utils-6.2.1.4-10.git2a8f9d8.fc38.x86_64    
Module libsystemd.so.0 from rpm systemd-253.10-1.fc38.x86_64    
Module libkmod.so.2 from rpm kmod-30-4.fc38.x86_64    
Module libmount.so.1 from rpm util-linux-2.38.1-4.fc38.x86_64    
Module libcrypto.so.3 from rpm openssl-3.0.9-2.fc38.x86_64    
Module libisns.so.0 from rpm isns-utils-0.101-6.fc38.x86_64                             
Module iscsid from rpm iscsi-initiator-utils-6.2.1.4-10.git2a8f9d8.fc38.x86_64                             
Stack trace of thread 34270:                             
#0  0x00007fc7ef527324 __poll (libc.so.6 + 0x105324)                             
#1  0x0000558cec9fcbc5 main (iscsid + 0x8bc5)                             
#2  0x00007fc7ef449b8a __libc_start_call_main (libc.so.6 + 0x27b8a)                             
#3  0x00007fc7ef449c4b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x27c4b)                             
#4  0x0000558cec9fd805 _start (iscsid + 0x9805)                             
ELF object binary architecture: AMD x86-64                             
systemd[1]: iscsid.service: Main process exited, code=dumped, status=6/ABRT                    
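
To compare traces like the one above across runs, it can help to pull the frames out of the journal text. A small sketch, assuming the frame layout shown in this log (`#N 0xADDR symbol (module + 0xOFFSET)`); the `parse_frames` helper is hypothetical and not a systemd or cockpit tool:

```python
import re

# Matches lines such as: "#0  0x00007fc7ef527324 __poll (libc.so.6 + 0x105324)"
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<address>[0-9a-f]+)\s+(?P<symbol>\S+)\s+"
    r"\((?P<module>\S+) \+ 0x(?P<offset>[0-9a-f]+)\)"
)

# Frames copied from the iscsid coredump above.
SAMPLE = """\
#0  0x00007fc7ef527324 __poll (libc.so.6 + 0x105324)
#1  0x0000558cec9fcbc5 main (iscsid + 0x8bc5)
#2  0x00007fc7ef449b8a __libc_start_call_main (libc.so.6 + 0x27b8a)
#3  0x00007fc7ef449c4b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x27c4b)
#4  0x0000558cec9fd805 _start (iscsid + 0x9805)
"""


def parse_frames(text):
    """Return one dict per stack frame found in the given journal text."""
    frames = []
    for match in FRAME_RE.finditer(text):
        frames.append(
            {
                "index": int(match.group("index")),
                "symbol": match.group("symbol"),
                "module": match.group("module"),
                "offset": int(match.group("offset"), 16),
            }
        )
    return frames


for frame in parse_frames(SAMPLE):
    print(frame["index"], frame["module"], frame["symbol"], hex(frame["offset"]))
```

The module-relative offsets are the useful part here, since the absolute addresses change between runs.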

Works!

works fine.

Works fine.

I verified that this update fixes FIPS mode. Thanks!

BZ#2229127 gnutls-3.8.1 is available
BZ#2235589 gnutls fails in FIPS mode: Error in GnuTLS initialization: Error while performing self checks.

I pinged Adam about the OpenQA failures. They could be related to mirror lag and the selinux-policy update as well, but I have some trouble interpreting at least one of the failures. https://openqa.fedoraproject.org/tests/2036536#step/role_deploy_domain_controller/11 is clearly just temporary and needs a retry.

Note: Rawhide is still waiting for FEDORA-2023-2ae7eaf74a, i.e. selinux-policy 38.22-1. dnf still sees 38.21, and that's what makes the test fail. Checking again tomorrow.

Cockpit's tests found at least one regression: https://bugzilla.redhat.com/show_bug.cgi?id=2223568

I also found https://bugzilla.redhat.com/show_bug.cgi?id=2223571 which may be due to the new systemd version (which landed at the same time), or also a regression.

Again, please don't rush SELinux updates through -proposed; that's not enough time to catch regressions with tests.

This update triggers two SELinux AVCs, see https://bugzilla.redhat.com/show_bug.cgi?id=2223571 . Our Cockpit tests also don't actually exercise networkd much, so other than the rejection messages nothing fails. But it may for users who actually use networkd to configure the network.

Tested in a clean environment, updates and works fine.