daemon dumps core on unexpected disconnection #571

Closed
opened 2024-11-06 16:38:20 +00:00 by catvayor · 3 comments
Member

Describe the bug

My nix-daemon crashed while building my system.

Steps To Reproduce

I don't remember exactly what I was doing when the coredump occurred; I only noticed it later while reading my systemd journal.

nix --version output

nix (Lix, like Nix) 2.91.1

coredump message

       PID: 71944 (nix-daemon)
       UID: 0 (root)
       GID: 0 (root)
    Signal: 6 (ABRT)
 Timestamp: Fri 2024-11-01 11:20:22 CET (5 days ago)

Command Line: nix-daemon 71942
Executable: /nix/store/n99cjrvzmw28v6vy6nbkcp0niwvlhwnb-lix-2.91.1/bin/nix
Control Group: /system.slice/nix-daemon.service
Unit: nix-daemon.service
Slice: system.slice
Boot ID: aa4f83a64d494119aa2e2247befbbebf
Machine ID: 9498cd0c1ff74b9eac48ce3bc55bcd99
Hostname: kat-probook
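Storage: /var/lib/systemd/coredump/core.nix-daemon.0.aa4f83a64d494119aa2e2247befbbebf.71944.1730456422000000.zst (present) // you can download it here: https://katvayor.net/core.nix-daemon.0.aa4f83a64d494119aa2e2247befbbebf.71944.1730456422000000.zst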
Size on Disk: 1.3M
Message: Process 71944 (nix-daemon) of user 0 dumped core.

            Module libkeyutils.so.1 without build-id.
            Module libkrb5support.so.0 without build-id.
            Module libcom_err.so.3 without build-id.
            Module libk5crypto.so.3 without build-id.
            Module libkrb5.so.3 without build-id.
            Module libunistring.so.5 without build-id.
            Module libattr.so.1 without build-id.
            Module libaws-c-common.so.1 without build-id.
            Module libaws-checksums.so.1.0.0 without build-id.
            Module libaws-c-sdkutils.so.1.0.0 without build-id.
            Module libaws-c-cal.so.1.0.0 without build-id.
            Module libaws-c-compression.so.1.0.0 without build-id.
            Module libs2n.so.1 without build-id.
            Module libaws-c-io.so.1.0.0 without build-id.
            Module libaws-c-http.so.1.0.0 without build-id.
            Module libaws-c-auth.so.1.0.0 without build-id.
            Module libaws-c-s3.so.0unstable without build-id.
            Module libaws-c-event-stream.so.1.0.0 without build-id.
            Module libaws-c-mqtt.so.1.0.0 without build-id.
            Module libaws-crt-cpp.so without build-id.
            Module libgssapi_krb5.so.2 without build-id.
            Module libpsl.so.5 without build-id.
            Module libssh2.so.1 without build-id.
            Module libidn2.so.0 without build-id.
            Module libnghttp2.so.14 without build-id.
            Module libbrotlicommon.so.1 without build-id.
            Module libxml2.so.2 without build-id.
            Module libz.so.1 without build-id.
            Module libbz2.so.1 without build-id.
            Module libzstd.so.1 without build-id.
            Module liblzma.so.5 without build-id.
            Module libacl.so.1 without build-id.
            Module libaws-cpp-sdk-transfer.so without build-id.
            Module libaws-cpp-sdk-s3.so without build-id.
            Module libaws-cpp-sdk-core.so without build-id.
            Module libseccomp.so.2 without build-id.
            Module libbrotlienc.so.1 without build-id.
            Module libbrotlidec.so.1 without build-id.
            Module libarchive.so.13 without build-id.
            Module libcpuid.so.17 without build-id.
            Module liblowdown.so.1 without build-id.
            Module libeditline.so.1 without build-id.
            Module libgcc_s.so.1 without build-id.
            Module libstdc++.so.6 without build-id.
            Stack trace of thread 71944:
            #0  0x00007fb90d299a9c __pthread_kill_implementation (libc.so.6 + 0x92a9c)
            #1  0x00007fb90d247576 raise (libc.so.6 + 0x40576)
            #2  0x00007fb90d22f935 abort (libc.so.6 + 0x28935)
            #3  0x00007fb90d4acc2b _ZN9__gnu_cxx27__verbose_terminate_handlerEv.cold (libstdc++.so.6 + 0xacc2b)
            #4  0x00007fb90d4bc20a _ZN10__cxxabiv111__terminateEPFvvE (libstdc++.so.6 + 0xbc20a)
            #5  0x00007fb90d4bb289 __cxa_call_terminate (libstdc++.so.6 + 0xbb289)
            #6  0x00007fb90d4bb996 __gxx_personality_v0 (libstdc++.so.6 + 0xbb996)
            #7  0x00007fb90defea59 _Unwind_RaiseException_Phase2 (libgcc_s.so.1 + 0x1aa59)
            #8  0x00007fb90deff181 _Unwind_RaiseException (libgcc_s.so.1 + 0x1b181)
            #9  0x00007fb90d4bc4ba __cxa_throw (libstdc++.so.6 + 0xbc4ba)
            #10 0x00007fb90e26b461 _ZN3nix8FdSource14readUnbufferedEPcm.cold (liblixutil.so + 0x48461)
            #11 0x00007fb90e2d32d2 _ZN3nix14BufferedSource4readEPcm (liblixutil.so + 0xb02d2)
            #12 0x00007fb90e2d5858 _ZN3nix6SourceclEPcm (liblixutil.so + 0xb2858)
            #13 0x00007fb90dccf59a _ZN3nix7readNumIjEET_RNS_6SourceE (liblixstore.so + 0xcf59a)
            #14 0x00007fb90dccf65f _ZN3nix12FramedSourceD2Ev (liblixstore.so + 0xcf65f)
            #15 0x00007fb90dc81b69 _ZZN3nix6daemonL9performOpEPNS0_12TunnelLoggerENS_3refINS_5StoreEEENS_11TrustedFlagENS0_13RecursiveFlagEjRNS_6SourceERNS_12BufferedSinkENS_11WorkerProto2OpEENKUlvE_clEv.cold (liblixstore.so + 0x81b69)
            #16 0x00007fb90dd04a01 _ZN3nix6daemonL9performOpEPNS0_12TunnelLoggerENS_3refINS_5StoreEEENS_11TrustedFlagENS0_13RecursiveFlagEjRNS_6SourceERNS_12BufferedSinkENS_11WorkerProto2OpE (liblixstore.so + 0x104a01)
            #17 0x00007fb90dd0654e _ZN3nix6daemon17processConnectionENS_3refINS_5StoreEEERNS_8FdSourceERNS_6FdSinkENS_11TrustedFlagENS0_13RecursiveFlagE (liblixstore.so + 0x10654e)
            #18 0x000000000049266b _ZNSt17_Function_handlerIFvvEZL10daemonLoopSt8optionalIN3nix11TrustedFlagEEEUlvE_E9_M_invokeERKSt9_Any_data.lto_priv.0 (nix + 0x9266b)
            #19 0x00007fb90e2d8385 _ZNSt17_Function_handlerIFvvEZN3nix12startProcessESt8functionIS0_ERKNS1_14ProcessOptionsEEUlvE_E9_M_invokeERKSt9_Any_data (liblixutil.so + 0xb5385)
            #20 0x00007fb90e2d82a1 _ZN3nix12startProcessESt8functionIFvvEERKNS_14ProcessOptionsE (liblixutil.so + 0xb52a1)
            #21 0x000000000048cefa _ZL10daemonLoopSt8optionalIN3nix11TrustedFlagEE (nix + 0x8cefa)
            #22 0x000000000048dcb3 _ZL15main_nix_daemoniPPc.lto_priv.0 (nix + 0x8dcb3)
            #23 0x00000000004e6af2 _ZN3nix11mainWrappedEiPPc (nix + 0xe6af2)
            #24 0x00007fb90df2fdf7 _ZN3nix16handleExceptionsERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt8functionIFvvEE (liblixmain.so + 0x26df7)
            #25 0x000000000046b1b4 main (nix + 0x6b1b4)
            #26 0x00007fb90d23127e __libc_start_call_main (libc.so.6 + 0x2a27e)
            #27 0x00007fb90d231339 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x2a339)
            #28 0x000000000046f2e5 _start (nix + 0x6f2e5)
            ELF object binary architecture: AMD x86-64
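
The mangled frame names above are easier to read once demangled. The following is a minimal sketch, assuming a GCC or Clang toolchain that provides `abi::__cxa_demangle`; the helper name `demangle` is made up for illustration and is not part of Lix. Applied to frame #14, it shows that the throwing read is reached from the `nix::FramedSource` destructor.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (Itanium C++ ABI, GCC/Clang)
#include <cstdlib>    // std::free
#include <memory>
#include <string>

// Hypothetical helper: demangle one symbol from the trace.
// Suffixes such as ".cold" or ".lto_priv.0" are added by the compiler and are
// not part of the mangled name, so everything from the first '.' is dropped.
// Anything that is not a mangled C++ name is returned unchanged.
std::string demangle(const std::string& symbol) {
    if (symbol.rfind("_Z", 0) != 0)
        return symbol;  // plain C symbol such as "raise" or "abort"
    const std::string base = symbol.substr(0, symbol.find('.'));
    int status = 0;
    std::unique_ptr<char, void (*)(void*)> out(
        abi::__cxa_demangle(base.c_str(), nullptr, nullptr, &status), std::free);
    return (status == 0 && out) ? std::string(out.get()) : symbol;
}
```

For example, `demangle("_ZN3nix12FramedSourceD2Ev")` yields `nix::FramedSource::~FramedSource()`, and frame #10 demangles to `nix::FdSource::readUnbuffered(char*, unsigned long)`.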
jade changed title from nix-deamon coredumped to daemon dumps core on unexpected disconnection 2024-11-10 02:12:03 +00:00
Owner

I'm pretty sure we have other reports of this bug. Basically, the daemon handles surprise disconnections poorly and winds up dumping core because it throws while another exception is already being handled.

Dumping core here is a bug, but it is relatively harmless.
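
To illustrate the failure mode described above, here is a minimal sketch with made-up types (`BrokenSource`, `DrainOnDestroy`, and `handleOp` are hypothetical and not Lix code). A destructor that reads from a dead connection during stack unwinding throws a second exception; if that exception escaped the destructor, the C++ runtime would call `std::terminate`, which aborts the process with SIGABRT. Catching inside the destructor is one common way to avoid that, though it is not necessarily what the actual fix does.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a connection whose peer has disappeared:
// every read fails.
struct BrokenSource {
    int reads = 0;
    void read() {
        ++reads;
        throw std::runtime_error("unexpected end-of-file");
    }
};

// Hypothetical wrapper that tries to drain leftover input when destroyed.
struct DrainOnDestroy {
    BrokenSource& from;
    explicit DrainOnDestroy(BrokenSource& s) : from(s) {}
    ~DrainOnDestroy() {
        try {
            from.read();  // throws, because the peer is gone
        } catch (...) {
            // Swallow the error. Letting it escape a destructor (implicitly
            // noexcept, and here also running during unwinding) would end in
            // std::terminate and a core dump.
        }
    }
};

// Simulates an operation that fails while the wrapper is alive.
// Returns the message of the exception that reaches the handler.
std::string handleOp(BrokenSource& conn) {
    try {
        DrainOnDestroy framed(conn);
        throw std::runtime_error("interrupted by the user");
    } catch (const std::exception& e) {
        return e.what();
    }
}
```

With the guard in place, the original error propagates to the handler and the process keeps running; removing the `try`/`catch` from the destructor turns the same call into an abort.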

Owner

aha yes okay there was a reason i was familiar with this: #123 (comment), https://git.lix.systems/lix-project/lix/issues/123#issuecomment-1641

let's keep this report open because we should fix the double throw in the future.

Owner

I suspect this might have been fixed by https://gerrit.lix.systems/c/lix/+/2086 or similar exception handling work I did in 2.92-dev?

The reproducer is to take a broken client that aborts the connection (a 2.90 client works for this) and then run the example from #123 against a newer daemon. The 2.91 daemon crashes; the 2.92-dev daemon no longer does.

lix2/example » sudo ../outputs/out/bin/nix-daemon --option enable-core-dumps true --daemon
accepted connection from pid 731668, user jade (trusted)
unexpected Nix daemon error: error: interrupted by the user
^C%

lix2/example » sudo ./result-291/bin/nix-daemon --option enable-core-dumps true --daemon
accepted connection from pid 738234, user jade (trusted)
terminate called after throwing an instance of 'nix::EndOfFile'
  what():  error: unexpected end-of-file
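
For context on where an end-of-file error like the one above comes from: on POSIX systems, `read` returns 0 once the peer has closed a stream socket, and the reading side has to turn that into an error. The sketch below shows this with a socket pair standing in for the daemon connection. The `EndOfFile` type and both function names are made up for illustration and only mimic the shape of the real error.

```cpp
#include <stdexcept>
#include <sys/socket.h>  // socketpair
#include <unistd.h>      // read, close

// Hypothetical stand-in for an end-of-file error type.
struct EndOfFile : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Read one byte, reporting a closed peer as EndOfFile.
char readByte(int fd) {
    char c = 0;
    const ssize_t n = ::read(fd, &c, 1);
    if (n == 0)
        throw EndOfFile("unexpected end-of-file");  // peer closed the stream
    if (n < 0)
        throw std::runtime_error("read failed");
    return c;
}

// Returns true if reading after the peer closed its end raises EndOfFile.
bool peerCloseYieldsEof() {
    int fds[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        throw std::runtime_error("socketpair failed");
    ::close(fds[1]);  // the "client" goes away without sending anything
    bool eof = false;
    try {
        readByte(fds[0]);
    } catch (const EndOfFile&) {
        eof = true;
    }
    ::close(fds[0]);
    return eof;
}
```

Calling `peerCloseYieldsEof()` returns `true` on Linux and other POSIX systems, which matches the situation a daemon is in when a client aborts mid-operation.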
jade closed this issue 2024-11-13 00:08:51 +00:00
Reference: lix-project/lix#571