Red Hat OpenStack Services on OpenShift / OSPRH-18308

MariaDB error 1020 "Record has changed since last read" not properly handled by oslo.db

    • Type: Bug
    • Resolution: Done
    • Priority: Normal
    • 2025.2 (Flamingo)
    • 2025.2 (Flamingo)
    • python-oslo-db
    • None
    • False
    • False
    • ?
    • rhos-ops-platform-services-pidone
    • None
    • Sprint 3
    • 1
    • Moderate

      To Reproduce

      Steps to reproduce the behavior:

      1. Configure an OpenStack service (Nova, Keystone, etc.) to use MariaDB as its database backend
      2. Set the MariaDB isolation level to REPEATABLE-READ
      3. Execute concurrent UPDATE operations on the same table rows within transactions
      4. Observe that MariaDB error 1020 is raised but not properly categorized by oslo.db
         The error is passed through as a generic OperationalError:
         sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1020, "Record has changed since last read in table '...'")

      Expected behavior

      The MariaDB error 1020 should be detected and wrapped in a specific oslo.db exception class (e.g., DBConsistencyError) that clearly indicates this is a transient consistency validation failure that can be resolved by retrying the transaction. This would allow OpenStack services to implement appropriate retry logic instead of treating it as a generic operational error.
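
The retry behavior described above could look roughly like the following sketch. It is only an illustration under stated assumptions: `DBError` and `DBConsistencyError` are local stand-ins defined here (they are not imported from oslo.db), and `retry_on_consistency_error` is a hypothetical helper name, not an existing oslo.db API.

```python
import functools
import time


class DBError(Exception):
    """Local stand-in for a base database exception (not the oslo.db class)."""


class DBConsistencyError(DBError):
    """Hypothetical: a transient consistency failure that is safe to retry."""


def retry_on_consistency_error(max_retries=3, delay=0.0):
    """Re-run the wrapped callable when it raises DBConsistencyError.

    The decorator name and its parameters are assumptions for illustration.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBConsistencyError:
                    if attempt == max_retries:
                        raise  # retries exhausted, surface the error
                    # Exponential backoff between attempts.
                    time.sleep(delay * (2 ** attempt))
        return wrapper
    return decorator
```

A service would wrap the whole transaction function, so that each retry starts a fresh transaction with a fresh snapshot rather than repeating a single statement.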

      Screenshots

      N/A - This is a database error handling issue

      Device Info (please complete the following information):

      • Database: MariaDB (various versions supporting REPEATABLE-READ isolation)
      • OS Version: Various Linux distributions running OpenStack
      • Python Version: Python 3.x
      • SQLAlchemy Version: 1.4+ / 2.0+
      • oslo.db Version: 17.3.0 and earlier (any version lacking MariaDB 1020 error handling)

      Bug impact

      This bug impacts OpenStack deployments using MariaDB as the database backend under REPEATABLE-READ isolation level:

      • High: Services like Nova and Keystone experience unhandled OperationalErrors during concurrent operations
      • Medium: Lack of proper error categorization prevents implementation of appropriate retry mechanisms
      • Low: Generic error handling makes troubleshooting more difficult for operators

      The issue becomes more prominent in high-concurrency environments, where multiple transactions attempt to modify the same data simultaneously.

      Known workaround

      Currently, the workaround is to set `innodb_snapshot_isolation = OFF`. Applications can also catch the generic OperationalError and inspect it manually to detect MariaDB error 1020, but this requires service-specific code rather than leveraging oslo.db's centralized error handling.
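
As a sketch of the second workaround, a service could check the driver error code rather than parse the message text. This assumes the SQLAlchemy exception exposes the driver exception as `.orig` and that the driver exception carries `(code, message)` in `args`, as in the traceback quoted above. The function and constant names are hypothetical.

```python
# MariaDB/MySQL error code for "Record has changed since last read".
MARIADB_RECORD_CHANGED = 1020


def is_record_changed_error(exc: BaseException) -> bool:
    """Return True if exc looks like MariaDB error 1020.

    Hypothetical helper: it relies only on attribute access, so it works on
    a wrapped exception (with .orig) or on a bare driver exception.
    """
    # Prefer the wrapped driver exception; fall back to the exception itself.
    orig = getattr(exc, "orig", None) or exc
    args = getattr(orig, "args", ())
    return bool(args) and args[0] == MARIADB_RECORD_CHANGED
```

A caller would use this inside an `except OperationalError` block and re-raise anything the helper does not recognize. Every service would have to repeat that code, which is the duplication the report asks oslo.db to remove.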

      Additional context

      This is related to MariaDB's new transaction validation model under REPEATABLE-READ isolation level. The error occurs when MariaDB detects that a row has changed since it was last read in the current transaction, which is a legitimate consistency check but should be treated as a retryable condition.
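
To illustrate why retrying resolves the condition, here is a deliberately simplified in-memory model of the check. It is a toy under loud assumptions: it is not MariaDB's implementation, every class and method name is invented for illustration, and each update commits immediately to keep the example short.

```python
class RecordChangedError(Exception):
    """Toy stand-in for MariaDB error 1020 (not a real driver exception)."""


class ToyTable:
    """In-memory table whose rows remember the commit that last changed them."""

    def __init__(self):
        self.commit_counter = 0
        self.rows = {}  # key -> (value, commit number of last change)

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, table):
        self.table = table
        # The snapshot is the last commit visible when the transaction starts.
        self.snapshot = table.commit_counter

    def update_and_commit(self, key, value):
        _, changed_at = self.table.rows.get(key, (None, 0))
        if changed_at > self.snapshot:
            # The row was changed by a commit this transaction cannot see.
            raise RecordChangedError(f"Record has changed since last read: {key!r}")
        self.table.commit_counter += 1
        self.table.rows[key] = (value, self.table.commit_counter)


table = ToyTable()
t1, t2 = table.begin(), table.begin()
t1.update_and_commit("quota", 10)      # first writer succeeds
try:
    t2.update_and_commit("quota", 20)  # t2's snapshot predates t1's commit
except RecordChangedError:
    # A fresh transaction sees the new row version, so the retry succeeds.
    table.begin().update_and_commit("quota", 20)
```

In this model the second writer fails only because its snapshot is stale, which is why the report treats the error as retryable.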

              hberaud Hervé Beraud
              rhos-dfg-pidone