OpenShift Hive / HIVE-2540

Cache remote clients (for machinepool controller)


    • Type: Story
    • Resolution: Won't Do
    • Priority: Minor

      See thread and thread for background.

      Our MachinePool controller, like all our controllers, builds a kube client to talk to the spoke cluster being reconciled. We do not cache these clients today. The main reasons:
      1. It's Hard™. Two aspects of that off the top:

      • If the client goes stale, how do we distinguish that from other errors so we can rebuild it?
      • We need to not leak cache entries. Possible solutions include adding a finalizer to the ClusterDeployment (CD) so we can delete the cache entry for that CD before letting it be garbage collected.

      2. It's expensive. Each client is biggish, and we support hundreds to thousands of spokes per hive.

      (See HIVE-2399 where we're trying to do something like this. We haven't managed to get it working properly yet.)
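To make the two "hard" aspects above concrete, here is a minimal sketch of what such a cache might look like. It is not what HIVE-2399 does. Everything in it is hypothetical: the names `ClientCache`, `ErrStale`, `Get`, `Evict` and `Do`, and especially the assumption that a stale client can be recognised as a distinct error at all, which is exactly the open question. The client type is left generic so the sketch stays independent of any real kube client.

```go
package main

import (
	"errors"
	"sync"
)

// ErrStale is a hypothetical sentinel meaning "this client is no longer usable".
// Mapping real transport or auth errors onto it is the hard part and is not shown.
var ErrStale = errors.New("remote client is stale")

// ClientCache holds one remote client per ClusterDeployment key (e.g. "namespace/name").
// C is whatever client type the controller uses.
type ClientCache[C any] struct {
	mu      sync.Mutex
	clients map[string]C
	build   func(cdKey string) (C, error)
}

// NewClientCache returns an empty cache that uses build to create clients on a miss.
func NewClientCache[C any](build func(cdKey string) (C, error)) *ClientCache[C] {
	return &ClientCache[C]{clients: make(map[string]C), build: build}
}

// Get returns the cached client for cdKey, building and storing one on a miss.
// The lock is held while building, which serializes builds; acceptable for a sketch.
func (c *ClientCache[C]) Get(cdKey string) (C, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cl, ok := c.clients[cdKey]; ok {
		return cl, nil
	}
	cl, err := c.build(cdKey)
	if err != nil {
		var zero C
		return zero, err
	}
	c.clients[cdKey] = cl
	return cl, nil
}

// Evict drops the entry for cdKey. A CD finalizer handler would call this
// before removing the finalizer, so entries do not outlive their CD.
func (c *ClientCache[C]) Evict(cdKey string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.clients, cdKey)
}

// Len reports how many clients are currently cached.
func (c *ClientCache[C]) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.clients)
}

// Do runs fn with the cached client. If fn reports ErrStale, the entry is
// evicted, the client is rebuilt once, and fn is retried with the new client.
func (c *ClientCache[C]) Do(cdKey string, fn func(C) error) error {
	cl, err := c.Get(cdKey)
	if err != nil {
		return err
	}
	if err = fn(cl); !errors.Is(err, ErrStale) {
		return err
	}
	c.Evict(cdKey)
	if cl, err = c.Get(cdKey); err != nil {
		return err
	}
	return fn(cl)
}
```

Note that this does nothing about the cost concern: with thousands of spokes the map still holds thousands of biggish clients unless some size bound or idle expiry is added on top.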

      Assuming we did manage to cache clients locally, the next step would be to replace some of the places where we're hard polling spokes with Watch()es via those clients. I don't know how expensive those Watch()es are when they're spread across hundreds-to-thousands of clients vs just the one (to the local KAS) we're using today. I also don't know if we could rely on them for all the use cases where we're currently polling. For example, the point of the unreachable controller is to update status/labels when the remote cluster is... unreachable. Does a Watch() pop when the remote KAS breaks? Dunno.
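Since it's unclear whether a Watch() would reliably signal a broken remote KAS, one defensive option is to not trust the watch alone: treat both "stream closed" and "stream went quiet" as triggers to fall back to an explicit reachability probe. The sketch below illustrates that shape with a plain channel standing in for a watch's result channel. `Event`, `Outcome` and `Drain` are hypothetical names, and nothing here asserts how a real watch behaves when the spoke goes away.

```go
package main

import "time"

// Event is a stand-in for a watch event from the spoke (hypothetical shape).
type Event struct {
	Kind string
	Name string
}

// Outcome says why Drain stopped consuming events.
type Outcome int

const (
	WatchClosed Outcome = iota // the event channel was closed
	WatchIdle                  // no event arrived within the idle window
	Cancelled                  // the caller asked us to stop
)

// Drain passes events to handle until the channel closes, stays silent for
// longer than idle, or done is closed. It deliberately draws no conclusion
// about reachability; the caller is expected to probe the spoke on
// WatchClosed or WatchIdle and decide what to do.
func Drain(done <-chan struct{}, events <-chan Event, idle time.Duration, handle func(Event)) Outcome {
	timer := time.NewTimer(idle)
	defer timer.Stop()
	for {
		select {
		case ev, ok := <-events:
			if !ok {
				return WatchClosed
			}
			handle(ev)
			// Restart the idle window after every event.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(idle)
		case <-timer.C:
			return WatchIdle
		case <-done:
			return Cancelled
		}
	}
}
```

In a controller, `done` would typically be a context's Done() channel. The idle window effectively reintroduces a (much slower) poll, so whether this is a net win across hundreds to thousands of spokes is still the open cost question.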

      Compare and contrast with HIVE-2539, which seeks to reduce network traffic by reducing the number of objects retrieved per reconcile, as opposed to reducing the number of times we need to request the same objects.

              Assignee: Unassigned
              Reporter: Eric Fried (efried.openshift)