I revoked this update because it was queued for stable, but it seems to break openQA's ability to launch Fedora from a console - see https://openqa.fedoraproject.org/tests/762385#step/server_cockpit_default/14 . This is vital to how we do several openQA tests. I'm not sure what the problem is yet beyond the apparently cut-off error message "ldn't load XPCOM.xinit: connection t" we see at the console, but it is failing consistently for this update and working consistently without it.
This update changes the soname of libdns (in bind-libs-lite), which renders bind-dyndb-ldap uninstallable (that is what causes the failures in the openQA tests). The update should at least include a rebuild of bind-dyndb-ldap, but really you should not change library APIs/ABIs in stable release updates at all unless it's necessary.
I don't really see any benefit to keeping the restriction in Rawhide. People are going to want to run Firefox add-ons. I mean, the two most prominent people using Rawhide on the desktop are probably me and @kevin and we both want to run Firefox add-ons. :)
All the restriction will do is cause people to set the entire system-wide policy to LEGACY (which reduces their security in all sorts of other areas) or try and hand-craft an exception for this specific case and possibly get it wrong and break stuff. There's no benefit for anyone there.
If Mozilla re-signs add-ons, great. Until then our default policy should accept SHA-1 signatures for Firefox add-ons, using as fine-grained a policy exception as possible for that purpose.
Aha - so, significantly, this seems to be specific to the domain name I used for this test. I just tweaked the staging instance to use test.openqa.fedoraproject.org instead of domain.local as the domain, and the tests pass with that change. So it's likely that the issue here is specific to using
It still seems like an incorrect behaviour change somewhere, but less of a big deal.
For the record, I think we do have this issue in Rawhide also. I can't tell for the simple "deploy directly on Rawhide" tests, as server deployment fails in that case (so the client tests never reach the point where this bug would happen), but I think we're seeing it on the upgrade tests. On the upgrade tests, we deploy server + client on F32 or F33, then upgrade the server to Rawhide, then upgrade the clients to Rawhide and run the client tests. In that scenario, the server deploys and upgrades apparently successfully, and from the logs it is working OK after the upgrade...but the client tests, after upgrade to Rawhide, cannot resolve ipa001.domain.local (they fail when trying to browse to it in Firefox, to access the FreeIPA web UI). That looks a lot like the same bug.
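For anyone wanting to check the same symptom outside of Firefox, here is a minimal sketch of a resolution check. It is not what the openQA tests actually run; the `can_resolve` helper and its `resolver` parameter are hypothetical names I made up for illustration, and the only real API used is Python's `socket.getaddrinfo`.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if the system resolver returns at least one address.

    `resolver` is a hypothetical injection point so the check can be
    exercised without a network; by default it is socket.getaddrinfo.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # "Name or service not known" and similar lookup failures land here.
        return False


def _always_fails(hostname, port):
    # Stand-in resolver that mimics the failure seen on the client.
    raise socket.gaierror(socket.EAI_NONAME, "Name or service not known")


def _always_succeeds(hostname, port):
    # Stand-in resolver returning one getaddrinfo-style result tuple.
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("172.16.2.100", 0))]


failed_lookup = can_resolve("ipa001.domain.local", resolver=_always_fails)
good_lookup = can_resolve("ipa001.domain.local", resolver=_always_succeeds)
```

On an affected client, calling `can_resolve("ipa001.domain.local")` with the default resolver would be expected to return False, if the failure really is in name resolution as the logs suggest. The address in the stand-in resolver is made up.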
"if we should just require LEGACY crypto policies if you want to load add-ons"
No, that would be completely unacceptable. We can't just start breaking people's add-ons in a stable release update and telling them to do cryptic stuff to fix them. After the update Firefox add-ons should work as they did before, with no manual intervention required. If the policy is too far ahead of the real world the policy needs to be adjusted.
We should probably unpush this update until this is figured out, also.
This seems to break add-on install in Firefox. Note the "Installation aborted because the add-on appears to be corrupt" error. I don't think this is an issue with the add-on itself, because the same test passed on other earlier and later F33 updates. It really seems to be an issue caused by the newer NSS.
Hum, so it looks like the problem here actually is on the client end. If I hack up the test so the server uses the updated NetworkManager but the client uses the current stable one (1.26.4-1.fc33), it works.
I also checked the client logs from the previous run against the server logs with the named logging enabled. This is where the client fails:
Dec 10 22:25:34 client003.domain.local realmd: Using 'r552.902' operation for method 'Discover' invocation on 'org.freedesktop.realmd.Provider' interface
Dec 10 22:25:34 client003.domain.local realmd: Registered cancellable for operation 'r552.902'
Dec 10 22:25:34 client003.domain.local realmd: * Resolving: _ldap._tcp.ipa001.domain.local
Dec 10 22:25:34 client003.domain.local realmd: * Resolving: _ldap._tcp.ipa001.domain.local
Dec 10 22:25:34 client003.domain.local realmd: Resolving ipa001.domain.local failed: Temporarily unable to resolve “_kerberos._udp.ipa001.domain.local”
Dec 10 22:25:34 client003.domain.local realmd: Temporarily unable to resolve “_ldap._tcp.ipa001.domain.local”
Dec 10 22:25:34 client003.domain.local realmd: * Resolving: ipa001.domain.local
Dec 10 22:25:34 client003.domain.local realmd: * Resolving: ipa001.domain.local
Dec 10 22:25:34 client003.domain.local realmd: Resolving ipa001.domain.local failed: Temporarily unable to resolve “_kerberos._tcp.ipa001.domain.local”
Dec 10 22:25:34 client003.domain.local realmd: Error resolving “ipa001.domain.local”: Name or service not known
Dec 10 22:25:34 client003.domain.local realmd: * No results: ipa001.domain.local
Dec 10 22:25:34 client003.domain.local realmd: * No results: ipa001.domain.local
but there is nothing at all in the server named logs at the corresponding time; they go straight from :25:15 to :26:05 (the difference in hours is just local time vs. UTC):
10-Dec-2020 17:25:15.930 client @0x7f5e14008cc0 172.16.2.102#59589 (dl.fedoraproject.org): endrequest
10-Dec-2020 17:26:05.462 client @0x7f5e14060c10 172.16.2.102#39716: UDP request
so that matches up. It seems like, with the updated NM, this request somehow never makes it off the client box and hits the server; it's failing entirely on the client end somehow.
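To make the timestamp comparison above explicit, here is a rough sketch that lines up the client failure time with the gap in the named log. The 5-hour offset is an assumption inferred from the two sets of timestamps (22:25 on the client vs. 17:25 on the server); the helper names are made up for illustration, and the year is supplied by hand because the client log lines don't carry one.

```python
from datetime import datetime, timedelta

# Assumption: the server's named log is 5 hours behind the client's
# journal timestamps (local time vs. UTC), inferred from the excerpts.
ASSUMED_OFFSET = timedelta(hours=5)


def parse_client_ts(text, year=2020):
    """Parse a syslog-style stamp such as 'Dec 10 22:25:34' (no year)."""
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def parse_named_ts(text):
    """Parse a named-style stamp such as '10-Dec-2020 17:25:15.930'."""
    return datetime.strptime(text, "%d-%b-%Y %H:%M:%S.%f")


def falls_in_gap(client_ts, before, after, offset=ASSUMED_OFFSET):
    """True if the client event, shifted to server time, lies strictly
    between two consecutive server log entries."""
    shifted = client_ts - offset
    return before < shifted < after


client_failure = parse_client_ts("Dec 10 22:25:34")
gap_start = parse_named_ts("10-Dec-2020 17:25:15.930")
gap_end = parse_named_ts("10-Dec-2020 17:26:05.462")

in_gap = falls_in_gap(client_failure, gap_start, gap_end)
print(in_gap)
```

With the timestamps quoted above this reports that the client's failed lookup falls inside the window where the server logged nothing, which is the basis for saying the request never reached the server.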
https://openqa.stg.fedoraproject.org/tests/983692/file/role_deploy_domain_controller-varnamed.tar.gz has the contents of /var/named from the server after a test where we ran rndc querylog on after deploying the server (on instructions from @abbra).