Details

    • Type: Feature Request
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.0.0.Alpha4
    • Fix Version/s: None
    • Component/s: Clustered Executor
    • Labels: None

      Description

      Map/Reduce tasks should collect statistics during execution and return them to the user, to help determine the optimal settings for the task. Here are some thoughts on useful statistics:

      • Final status - completed, failed, cancelled, etc.
      • Duration - overall, per node, or per phase (map, reduce, combine, collate)
      • Number of nodes participating in the task
      • Keys in the intermediate cache
      • Keys in the result map
      • Node-specific statistics:
        • Status of node - completed, failed, cancelled, etc.
        • Number of keys processed
        • Max size of collector
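
      To make the list above concrete, here is a minimal sketch of a statistics holder that a task could return to the caller. Every name in it (TaskStatus, Phase, NodeStatistics, MapReduceTaskStatistics and their methods) is hypothetical and not part of any existing API. It also assumes that phases run back to back, so the overall duration is taken as the sum of the phase durations.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// All names below are hypothetical; none are part of an existing API.
enum TaskStatus { COMPLETED, FAILED, CANCELLED }

enum Phase { MAP, COMBINE, REDUCE, COLLATE }

/** Per-node statistics: node status, keys processed, max collector size. */
final class NodeStatistics {
    final String nodeName;
    final TaskStatus status;
    final long keysProcessed;
    final long maxCollectorSize;

    NodeStatistics(String nodeName, TaskStatus status,
                   long keysProcessed, long maxCollectorSize) {
        this.nodeName = nodeName;
        this.status = status;
        this.keysProcessed = keysProcessed;
        this.maxCollectorSize = maxCollectorSize;
    }
}

/** Task-wide statistics handed back to the caller after execution. */
final class MapReduceTaskStatistics {
    private final TaskStatus finalStatus;
    private final long intermediateKeys;
    private final long resultKeys;
    private final Map<Phase, Duration> phaseDurations = new EnumMap<>(Phase.class);
    private final List<NodeStatistics> nodes = new ArrayList<>();

    MapReduceTaskStatistics(TaskStatus finalStatus, long intermediateKeys, long resultKeys) {
        this.finalStatus = finalStatus;
        this.intermediateKeys = intermediateKeys;
        this.resultKeys = resultKeys;
    }

    // Accumulates elapsed time per phase; repeated calls for a phase add up.
    void recordPhase(Phase phase, Duration elapsed) {
        phaseDurations.merge(phase, elapsed, Duration::plus);
    }

    void addNode(NodeStatistics node) {
        nodes.add(node);
    }

    TaskStatus finalStatus() { return finalStatus; }
    long intermediateKeys() { return intermediateKeys; }
    long resultKeys() { return resultKeys; }
    int participatingNodes() { return nodes.size(); }

    Duration phaseDuration(Phase phase) {
        return phaseDurations.getOrDefault(phase, Duration.ZERO);
    }

    // Assumption: phases do not overlap, so overall = sum of phase durations.
    Duration totalDuration() {
        return phaseDurations.values().stream().reduce(Duration.ZERO, Duration::plus);
    }

    long totalKeysProcessed() {
        return nodes.stream().mapToLong(n -> n.keysProcessed).sum();
    }
}
```

      A caller could compare phaseDuration(Phase.MAP) against phaseDuration(Phase.REDUCE), or look at the largest maxCollectorSize across nodes, when deciding how to tune a task. If phases can overlap in practice, the overall duration would need to be measured separately rather than summed.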

      Here are the built-in counters that are reported by Hadoop:
      https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-8/counters

      People

      • Assignee: Unassigned
      • Reporter: Alan Field (afield)
      • Votes: 0
      • Watchers: 2
