  1. OpenShift Service Mesh
  2. OSSM-5962

MTT: TestSSL test case is failing against the 2.5 SM proxy (Segmentation fault in the proxy container)


    • Type: Bug
    • Resolution: Done
    • Priority: Blocker
    • None
    • OSSM 2.5.0
    • Customer Impact
    • None

      With the latest 2.5 proxy build, the TestSSL test case is failing.

      The command

      ./testssl/testssl.sh -P -6 productpage:9080 || true 

      in the testssl pod fails with:

       Using "OpenSSL 1.1.1g FIPS  21 Apr 2020" [~85 ciphers]
       on testssl-84789b6c48-ljpxf:/usr/bin/openssl
       (built: "Mar 25 16:46:53 2021", platform: "linux-x86_64")
      
      
       Start 2024-02-20 09:12:24        -->> 172.30.44.247:9080 (productpage) <<--
      
       rDNS (172.30.44.247):   productpage.bookinfo.svc.cluster.local.
      ./testssl/testssl.sh: connect: Connection refused
      ./testssl/testssl.sh: line 10326: /dev/tcp/172.30.44.247/9080: Connection refused
       Oops: TCP connect problem
      
      Unable to open a socket to 172.30.44.247:9080. ./testssl/testssl.sh: connect: Connection refused
      ./testssl/testssl.sh: line 10326: /dev/tcp/172.30.44.247/9080: Connection refused
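
      The failure above occurs at the plain TCP connect step, before any TLS handshake. A minimal sketch of an equivalent probe, assuming Python is available in the client pod (the helper name `can_connect` is hypothetical and not part of testssl.sh):

      ```python
      import socket


      def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
          """Return True if a plain TCP connection to host:port succeeds."""
          try:
              # create_connection resolves the name and tries each address in turn.
              with socket.create_connection((host, port), timeout=timeout):
                  return True
          except OSError:
              # Covers connection refused, timeouts, and name resolution failures.
              return False
      ```

      Calling `can_connect("productpage", 9080)` repeatedly could help show when the port stops accepting connections and when it comes back.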
      

      ref.:  https://master-jenkins-csb-servicemesh.apps.ocp-c1.prod.psi.redhat.com/job/umb-listeners/job/servicemesh-iib-listener/297/testReport/(root)/pkg_tests_ossm/v2_5_TestSSL_2/

      update:
      I also noticed that when I run that command from the `testssl` pod, the `istio-proxy` container in the `productpage` pod is restarted.

      2024-02-20T09:32:07.727518Z	info	cache	returned workload trust anchor from cache	ttl=23h59m59.272485027s
      [2024-02-20T09:33:45.512Z] "- - -" 0 NR filter_chain_not_found - "-" 0 0 8 - "-" "-" "-" "-" "-" - - 10.128.2.146:9080 10.129.3.147:53968 - -
      2024-02-20T09:33:45.731817Z	critical	envoy backtrace external/envoy/source/server/backtrace.h:104	Caught Segmentation fault, suspect faulting address 0x0	thread=28
      2024-02-20T09:33:45.731857Z	critical	envoy backtrace external/envoy/source/server/backtrace.h:91	Backtrace (use tools/stack_decode.py to get line numbers):	thread=28
      2024-02-20T09:33:45.731860Z	critical	envoy backtrace external/envoy/source/server/backtrace.h:92	Envoy version: ae3bbc4313b45af63777a2588388796d74221cfd/1.26.8-dev/OSSM 2.5.0-1/RELEASE/OpenSSL	thread=28
      2024-02-20T09:33:45.732123Z	critical	envoy backtrace external/envoy/source/server/backtrace.h:96	#0: __restore_rt [0x7f5e9fec9cf0]	thread=28
      2024-02-20T09:33:45.743869Z	critical	envoy backtrace external/envoy/source/server/backtrace.h:96	#1: Envoy::Extensions::TransportSockets::Tls::TlsContext::isCipherEnabled() [0x55ec4937e91a]	thread=28
      2024-02-20T09:33:45.755545Z	critical	envoy backtrace external/envoy/source/server/backtrace.h:96	#2: Envoy::Extensions::TransportSockets::Tls::ServerContextImpl::isClientEcdsaCapable() [0x55ec4937e8cf]	thread=28
      2024-02-20T09:33:45.766882Z	critical	envoy backtrace external/envoy/source/server/backtrace.h:96	#3: Envoy::Extensions::TransportSockets::Tls::ServerContextImpl::selectTlsContext() [0x55ec4937f03c]	thread=28
      2024-02-20T09:33:45.766986Z	critical	envoy backtrace external/envoy/source/server/backtrace.h:98	#4: [0x7f5ea0db186e]thread=28
      ConnectionImpl 0x55ec4e578340, connecting_: 0, bind_error_: 0, state(): Open, read_buffer_limit_: 1048576
      socket_: 
        ListenSocketImpl 0x55ec4dfebb80, transport_protocol_: tls
        connection_info_provider_: 
          ConnectionInfoSetterImpl 0x55ec4e55ca60, remote_address_: 10.129.3.147:53974, direct_remote_address_: 10.129.3.147:53974, local_address_: 10.128.2.146:9080, server_name_: productpage
      2024-02-20T09:33:46.811366Z	info	ads	ADS: "@" productpage-v1-7c5c65566c-l54hv.bookinfo-2 terminated
      2024-02-20T09:33:46.811461Z	info	ads	ADS: "@" productpage-v1-7c5c65566c-l54hv.bookinfo-1 terminated
      2024-02-20T09:33:46.811764Z	error	Envoy exited with error: signal: segmentation fault (core dumped)
      2024-02-20T09:33:46.811895Z	error	error serving tap http server: http: Server closed
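
      To compare crashes across runs, the backtrace frames can be pulled out of the proxy log. The following is a rough sketch that assumes the `envoy backtrace` line layout shown above (`#N: symbol [address]`); `Frame` and `parse_backtrace` are hypothetical names. It does not resolve line numbers, for which the log itself points to `tools/stack_decode.py`.

      ```python
      import re
      from typing import List, NamedTuple, Optional

      # Matches "#1: Some::symbol() [0x55ec...]" and "#4: [0x7f5e...]" (no symbol).
      FRAME_RE = re.compile(r"#(\d+): (?:(.+?) )?\[(0x[0-9a-f]+)\]")


      class Frame(NamedTuple):
          index: int
          symbol: Optional[str]  # None when the frame has no symbol name
          address: str


      def parse_backtrace(log_text: str) -> List[Frame]:
          """Extract backtrace frames from istio-proxy log text."""
          frames = []
          for line in log_text.splitlines():
              # Only consider lines emitted by Envoy's backtrace logger.
              if "envoy backtrace" not in line:
                  continue
              match = FRAME_RE.search(line)
              if match:
                  frames.append(
                      Frame(int(match.group(1)), match.group(2), match.group(3))
                  )
          return frames
      ```

      Applied to the log above, the first symbolized frame after the signal handler is `TlsContext::isCipherEnabled()`, called from `ServerContextImpl::isClientEcdsaCapable()`.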
      

      update2:
      When I removed the additional TLS config from the SMCP, the script was able to connect and started showing some info:

      Testing server preferences 
      
       Has server cipher order?     yes (OK) -- TLS 1.3 and below
       Negotiated protocol          TLSv1.3
       Negotiated cipher            TLS_AES_256_GCM_SHA384, 253 bit ECDH (X25519)
       Cipher order Oops: openssl s_client connect problem
      ./testssl.sh: connect: Connection refused
      

      until the `Segmentation fault` crashes the whole container.
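
      One way to confirm that the sidecar restarted after a run is to read the container's `restartCount` from the pod status. The sketch below assumes the pod JSON has been obtained separately, for example with `oc get pod <pod> -n bookinfo -o json`; the helper name `container_restarts` is hypothetical.

      ```python
      import json
      from typing import Optional


      def container_restarts(pod_json: str, container: str) -> Optional[int]:
          """Return restartCount for the named container, or None if not found."""
          pod = json.loads(pod_json)
          # containerStatuses may be missing while the pod is still being scheduled.
          for status in pod.get("status", {}).get("containerStatuses", []):
              if status.get("name") == container:
                  return status.get("restartCount")
          return None
      ```

      Comparing `container_restarts(pod_json, "istio-proxy")` before and after running testssl.sh would show whether the crash restarted the container.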

        1. testssl.log
          3 kB
          Praneeth Bajjuri

            mkralik@redhat.com Matej Kralik