Cost Management / COST-4118

Hive: Incorrect file size

    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • 2023Q3

      Sentry Issue:

      Bread Crumbs:

      trino.exceptions.TrinoExternalError: TrinoExternalError(type=EXTERNAL, name=HIVE_CANNOT_OPEN_SPLIT, message="Error opening Hive split s3a://hccm-prod-s3/data/parquet/daily/7281533/AWS/raw/source=065de94b-66ff-4f38-9e76-2083f82af801/year=2023/month=08/2023-08-17_29_4_daily_0.parquet (offset=0, length=41426): Incorrect file size (41426) for file (end of stream not reached): s3a://hccm-prod-s3/data/parquet/daily/7281533/AWS/raw/source=065de94b-66ff-4f38-9e76-2083f82af801/year=2023/month=08/2023-08-17_29_4_daily_0.parquet", query_id=20230818_180748_86270_dwhwd)

      Initial Research: Amazon Docs

      Key Point:
      """
      This message can occur when a file has changed between query planning and query execution. It usually occurs when a file on Amazon S3 is replaced in-place (for example, a PUT is performed on a key where an object already exists). Athena does not support deleting or replacing the contents of a file when a query is running. To avoid this error, schedule jobs that overwrite or delete files at times when queries do not run, or only write data to new files or partitions.
      """
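The "only write data to new files" advice quoted above can be sketched as follows. This is a minimal illustration, not how the pipeline currently names its files: `unique_parquet_key` and its arguments are hypothetical, and the idea is simply that every write gets a key that has never existed, so a running query never sees an object change underneath it.

```python
import uuid
from datetime import datetime, timezone


def unique_parquet_key(prefix: str, base_name: str) -> str:
    """Build an S3 object key that has not been used before.

    Hypothetical helper: a UTC timestamp plus a random token is appended
    so that a re-run writes a new object instead of replacing one in place.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    token = uuid.uuid4().hex[:8]
    return f"{prefix.rstrip('/')}/{base_name}_{stamp}_{token}.parquet"
```

With this approach the superseded files still have to be cleaned up, and per the quoted docs that cleanup should also happen when no query is reading them.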

      My findings:
      This error occurs when something attempts to replace a file while a query is executing. I do think this means that the file wasn't replaced, though.
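If the cause is a file changing between query planning and execution, re-running the query after a short pause would plan against the new file. A rough sketch of such a retry is below. It is an assumption, not something the ticket confirms as a fix: `run_with_retry` and `is_split_open_error` are hypothetical names, and the check assumes the exception either exposes an `error_name` attribute or carries the error name in its message text, as the Sentry message above does.

```python
import time


def is_split_open_error(exc: Exception) -> bool:
    """Return True if the exception looks like HIVE_CANNOT_OPEN_SPLIT."""
    # Assumption: the exception may expose `error_name`; otherwise fall back
    # to searching the message text.
    name = getattr(exc, "error_name", None)
    return name == "HIVE_CANNOT_OPEN_SPLIT" or "HIVE_CANNOT_OPEN_SPLIT" in str(exc)


def run_with_retry(run_query, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call run_query(), retrying only on split-open errors."""
    for attempt in range(1, attempts + 1):
        try:
            return run_query()
        except Exception as exc:
            # Re-raise unrelated errors immediately, and give up after the
            # final attempt.
            if attempt == attempts or not is_split_open_error(exc):
                raise
            sleep(delay_seconds)
```

A retry only hides the symptom; if the error persists across attempts, the file itself is probably inconsistent and needs a closer look.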

      Additional Notes
      The errors appear to have started on August 18th, but the ones we saw today are all related to the same account. The timing could indicate that this is related to the daily archive; however, if that were the cause, I would expect it to affect more than one account. It could also just be a timing issue. Further investigation is needed, including watching whether this error keeps recurring.
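To check whether the errors really are limited to one account, the Sentry messages can be grouped by the identifiers in the S3 path. The sketch below is a triage aid under assumptions: it treats the path segment after `daily/` as the account and the `source=` value as the source UUID, which matches the single message in this ticket but is not confirmed for other paths. `count_by_account` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed layout: .../data/parquet/daily/<account>/<provider>/raw/source=<uuid>/...
SPLIT_PATH = re.compile(
    r"Error opening Hive split s3a://(?P<bucket>[^/]+)/data/parquet/daily/"
    r"(?P<account>[^/]+)/(?P<provider>[^/]+)/raw/source=(?P<source>[^/]+)/"
)


def count_by_account(messages):
    """Count split-open errors per (account, source) pair."""
    counts = Counter()
    for message in messages:
        match = SPLIT_PATH.search(message)
        if match:  # Skip messages that do not follow the assumed layout.
            counts[(match["account"], match["source"])] += 1
    return counts
```

If the resulting counter has a single key over several days, that supports the single-account observation; multiple keys would point back toward a pipeline-wide cause such as the daily archive.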

              rhn-support-lcouzens Luke Couzens
              myersco Cody Myers
              Eva Šebestová Eva Šebestová