Comments

Looks like one of the tests runs against rawhide:

+ lorax --product=Fedora --version=rawhide --release=rawhide --volid=Fedora-rawhide-test --repo /etc/yum.repos.d/fedora-cisco-openh264.repo --repo /etc/yum.repos.d/fedora.repo --repo /etc/yum.repos.d/fedora-updates.repo --repo /etc/yum.repos.d/fedora-updates-testing.repo --repo /etc/yum.repos.d/tag-repository.repo --repo /etc/yum.repos.d/test-artifacts.repo --isfinal --nomacboot /var/tmp/lorax-fedora-iso/
2024-11-22 19:21:52,127: selinux is enabled and in Enforcing mode
2024-11-22 19:21:52,128: Using platform:f40 for module_platform_id
2024-11-22 19:21:52,132: Using repos: fedora-cisco-openh264, fedora, updates, test-artifacts, testing-farm-tag-repository
2024-11-22 19:21:52,132: Fetching metadata...
2024-11-22 19:22:38,110: selinux is enabled and in Enforcing mode
2024-11-22 19:22:38,110: checking for root privileges
2024-11-22 19:22:38,110: checking dnf base object
2024-11-22 19:22:38,111: setting up build architecture
2024-11-22 19:22:38,111: setting up build parameters
2024-11-22 19:22:38,111: Using templatedir /usr/share/lorax/templates.d/99-generic
2024-11-22 19:23:12,992: got release: fedora-release
2024-11-22 19:23:12,993: installing runtime packages
2024-11-22 19:23:13,024: running runtime-install.tmpl
2024-11-22 19:23:13,032: installpkg: anaconda-install-img-deps>=40.15 expands to anaconda-install-img-deps-42.14-1.fc42.x86_64
2024-11-22 19:23:13,050: installpkg: *-firmware expands to amd-gpu-firmware-20241110-1.fc42.noarch,amd-ucode-firmware-20241110-1.fc42.noarch,atheros-firmware-20241110-1.fc42.noarch,atmel-firmware-1.3-33.fc41.noarch,brcmfmac-firmware-20241110-1.fc42.noarch,cirrus-audio-firmware-20241110-1.fc42.noarch,crust-firmware-0.6-3.fc42.noarch,dvb-firmware-20241110-1.fc42.noarch,intel-audio-firmware-20241110-1.fc42.noarch,intel-gpu-firmware-20241110-1.fc42.noarch,intel-vsc-firmware-20241110-1.fc42.noarch,iwlegacy-firmware-20241110-1.fc42.noarch,iwlwifi-dvm-firmware-20241110-1.fc42.noarch,iwlwifi-mvm-firmware-20241110-1.fc42.noarch,libertas-firmware-20241110-1.fc42.noarch,linux-firmware-20241110-1.fc42.noarch,mt7xxx-firmware-20241110-1.fc42.noarch,nvidia-gpu-firmware-20241110-1.fc42.noarch,nxpwireless-firmware-20241110-1.fc42.noarch,qed-firmware-20241110-1.fc42.noarch,realtek-firmware-20241110-1.fc42.noarch,tiwilink-firmware-20241110-1.fc42.noarch,zd1211-firmware-1.5-16.fc41.noarch
2024-11-22 19:23:13,051: installpkg: grub2-tools-efi>=1:2.06-67 expands to grub2-tools-efi-1:2.12-13.fc42.x86_64
2024-11-22 19:23:13,052: installpkg: grub2-efi-x64-cdboot>=1:2.06-67 expands to grub2-efi-x64-cdboot-1:2.12-13.fc42.x86_64
2024-11-22 19:23:13,053: installpkg: grub2-efi-ia32-cdboot>=1:2.06-67 expands to grub2-efi-ia32-cdboot-1:2.12-13.fc42.x86_64
2024-11-22 19:23:13,053: installpkg: grub2-tools>=1:2.06-67 expands to grub2-tools-1:2.12-13.fc42.x86_64
2024-11-22 19:23:13,054: installpkg: grub2-tools-minimal>=1:2.06-67 expands to grub2-tools-minimal-1:2.12-13.fc42.x86_64
2024-11-22 19:23:13,054: installpkg: grub2-tools-extra>=1:2.06-67 expands to grub2-tools-extra-1:2.12-13.fc42.x86_64
2024-11-22 19:23:13,054: installpkg: grub2-pc-modules>=1:2.06-67 expands to grub2-pc-modules-1:2.12-13.fc42.noarch
2024-11-22 19:23:13,066: Checking dependencies
2024-11-22 19:24:52,888: Dependency check failed: Debug data written to "/var/ARTIFACTS/work-build-isoqzmwvq3p/plans/build-iso/tree/debugdata"
Problem: package python3-pyatspi-2.46.1-5.fc41.noarch requires python(abi) = 3.13, but none of the providers can be installed
  - python3-3.13.0-1.fc42.i686 has inferior architecture
  - cannot install both python3-3.12.7-1.fc40.x86_64 and python3-3.13.0-1.fc42.x86_64
  - package python3-dnf-4.21.1-1.fc40.noarch requires python(abi) = 3.12, but none of the providers can be installed
  - package dnf-4.21.1-1.fc40.noarch requires python3-dnf = 4.21.1-1.fc40, but none of the providers can be installed
  - cannot install the best candidate for the job
  - conflicting requests

Not sure if this is expected to work, because rawhide has DNF 5.

This needs an update to the lorax templates. This change will have to be applied to f40: https://gitlab.com/redhat/centos-stream/rpms/lorax-templates-rhel/-/merge_requests/66

@markec It's a left-over from the previous version, not the “Running post-uninstall scriptlet: glibc-gconv-extra-0:2.40-3.fc41.x86_64” part.

That was what I meant by incompatible with libglvnd. Thanks for these amazing tests!

I've got a new build that hopefully fixes this: glibc-2.40.9000-13.fc42

Upstream discussion will happen here: [PATCH 1/2] Revert "elf: Run constructors on cyclic recursive dlopen (bug 31986)"

(I thought I had given negative karma before, but maybe I can't do that for my own updates?)

Update is incompatible with libglvnd.

The kernel.s390x scratch build failure appears to be an infrastructure issue:

$ git clone -n https://src.fedoraproject.org/rpms/kernel.git /var/lib/mock/f40-build-side-86585-49916704-5958020/root/chroot_tmpdir/scmroot/kernel
Cloning into '/var/lib/mock/f40-build-side-86585-49916704-5958020/root/chroot_tmpdir/scmroot/kernel'...
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output

https://kojipkgs.fedoraproject.org//work/tasks/4215/115504215/checkout.log

The fedora-ci.koji-build.tier0.functional failure is not diagnosable because the URL is broken.

Hmph, I see it. I misinterpreted the nature of the ns-slapd bug, and the upstream workaround I pushed does not actually work around it. Hmmph. I guess I'll need another fix.

Before the workaround, glibc had this loop:

 220       while (cmp (run_ptr, tmp_ptr, arg) < 0)
 221         tmp_ptr -= size;

https://sourceware.org/git/?p=glibc.git;a=blob;f=stdlib/qsort.c;h=ad110e8a892a66e1fc90f850b828e1a2d09e2ac5;hb=HEAD#l220

The loop is known to terminate if the comparison function is correct because eventually, run_ptr == tmp_ptr, and cmp must return zero. If that never happens, we eventually run into non-allocated memory regions. The only access to that memory is from the cmp function here, not from the qsort implementation, so that crash will happen in the comparison callback.
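To make the failure mode concrete, here is a simplified model of that backward scan. It is a sketch, not the glibc code: it works on plain int arrays, assumes the first element is the smallest one in the array (so a correct comparison stops there at the latest), and adds an explicit lower bound so the demo itself never reads outside the array. The names cmp_ok, cmp_broken and scan_back are made up for illustration, and cmp_broken is only a guess at the general shape of a comparison function that never returns zero.

```c
#include <stddef.h>

/* Well-behaved three-way comparison on ints. */
static int
cmp_ok (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Hypothetical broken comparison: reports "less than" even for equal
   elements, so it never returns zero. */
static int
cmp_broken (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return x <= y ? -1 : 1;
}

/* Backward scan in the style of the loop quoted above, for the element
   at run_index (which must be at least 1).  Returns the number of steps
   taken, or -1 if an unguarded loop would have stepped below base. */
static long
scan_back (const int *base, size_t run_index,
           int (*cmp) (const void *, const void *))
{
  const int *run_ptr = base + run_index;
  const int *tmp_ptr = run_ptr - 1;
  long steps = 0;

  while (cmp (run_ptr, tmp_ptr) < 0)
    {
      if (tmp_ptr == base)
        return -1;              /* next step would leave the array */
      tmp_ptr--;
      steps++;
    }
  return steps;
}
```

With the array {1, 2, 3, 1} and run_index 3, cmp_ok stops the scan at the first element, while cmp_broken keeps reporting "less than" and the scan hits the guard. Without the guard, the next cmp call would read memory in front of the array, which matches a crash inside the comparison callback.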

Thanks. The comparison function can never return zero: https://github.com/389ds/389-ds-base/blob/main/ldap/servers/plugins/cos/cos_cache.c#L2933

This is clearly a 389-ds-base bug. The old qsort implementation in glibc did not tickle it because it rarely called the comparison function with equal pointer arguments. We already worked around similar application problems in other places in the new implementation; we can probably do it in the insertion sort phase as well.
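For anyone who wants to check their own comparison functions for this class of bug, a rough sketch of a consistency check follows. It is not taken from glibc or 389-ds-base, and every name in it is hypothetical. It only tests two properties over a sample array: an element compares equal to itself, and swapping the arguments flips the sign of the result.

```c
#include <stddef.h>

/* Hypothetical comparison that never returns zero. */
static int
cmp_never_zero (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return x < y ? -1 : 1;
}

/* Three-way comparison that returns zero for equal keys. */
static int
cmp_three_way (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  return (x > y) - (x < y);
}

/* Returns 1 if cmp is reflexive and antisymmetric on every pair of
   elements in the sample array, 0 otherwise. */
static int
comparator_is_consistent (const void *base, size_t nmemb, size_t size,
                          int (*cmp) (const void *, const void *))
{
  const char *p = base;

  for (size_t i = 0; i < nmemb; i++)
    {
      /* An element must compare equal to itself. */
      if (cmp (p + i * size, p + i * size) != 0)
        return 0;

      for (size_t j = i + 1; j < nmemb; j++)
        {
          int ab = cmp (p + i * size, p + j * size);
          int ba = cmp (p + j * size, p + i * size);
          /* The two results must have opposite signs, or both be zero. */
          if ((ab < 0) != (ba > 0) || (ab > 0) != (ba < 0))
            return 0;
        }
    }
  return 1;
}
```

On a small array such as {3, 1, 2, 1}, cmp_three_way passes and cmp_never_zero fails on the very first self-comparison. The check is quadratic in the array size, so it is only meant for test suites, not for production code paths.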

With the new approach:

# ipa-getkeytab -p HTTP/x0.cockpit.lan -k /etc/cockpit/krb5.keytab 
Keytab successfully retrieved and stored in: /etc/cockpit/krb5.keytab

I'll do another build, so that the AnyConnect users can test it as well.

But I think I see what's wrong with the current ELF destructor ordering approach. I'll experiment with something else.

This is a Fedora 38 cloud image with some extra packages installed, so you can install debug symbols, run gdb, etc.

Thank you. I got to this point and could reproduce the assert, but the VM with ipa-getkeytab does not have a default route. Any idea how to fix that? DHCP assigns 172.27.0.2 for the eth0 interface, but no default route.

This backtrace is more interesting:

Stack trace of thread 1959:
#0  0x00007f81c8ab0884 __pthread_kill_implementation (libc.so.6 + 0x8e884)
#1  0x00007f81c8a5fafe raise (libc.so.6 + 0x3dafe)
#2  0x00007f81c8a4887f abort (libc.so.6 + 0x2687f)
#3  0x00007f81c8a4879b __assert_fail_base.cold (libc.so.6 + 0x2679b)
#4  0x00007f81c8a58187 __assert_fail (libc.so.6 + 0x36187)
#5  0x00007f81c9030323 krb5int_key_delete (libkrb5support.so.0 + 0x6323)
#6  0x00007f81c86f0e8b gssint_mechglue_fini (libgssapi_krb5.so.2 + 0xee8b)
#7  0x00007f81c91f50f2 _dl_call_fini (ld-linux-x86-64.so.2 + 0x10f2)
#8  0x00007f81c91f8e5e _dl_fini (ld-linux-x86-64.so.2 + 0x4e5e)
#9  0x00007f81c8a621e6 __run_exit_handlers (libc.so.6 + 0x401e6)
#10 0x00007f81c8a6232e exit (libc.so.6 + 0x4032e)
#11 0x00005622a14583bc main (ipa-getkeytab + 0x63bc)
#12 0x00007f81c8a49b8a __libc_start_call_main (libc.so.6 + 0x27b8a)
#13 0x00007f81c8a49c4b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x27c4b)
#14 0x00005622a1459bb5 _start (ipa-getkeytab + 0x7bb5)

It has _dl_fini in it, so it's very likely it's caused by the changes in this update.

@adamwill @martinpitt How can I create a VM (or set of VMs) that reproduces this issue? Thanks.

@adamwill How can we reproduce this in an environment where we can run the failing process under a debugger, or with certain environment variables configured? Thanks.

The bz699724 test was added recently and is apparently still under development, so I'm not particularly worried about it. It still needs porting to Python 3.

@adamwill Which failure specifically worries you? I have trouble finding it in the results.