OpenShift Top Level Product Strategy
OCPPLAN-7575

Users should be able to create a separate PIOPS volume for etcd


      Copied from https://bugzilla.redhat.com/show_bug.cgi?id=1706228:

      Greg Blomquist 2019-05-03 22:45:04 CEST
      Spawned from a discussion in https://bugzilla.redhat.com/show_bug.cgi?id=1703581 .

      Users should know how to create a separate PIOPS volume specifically for etcd.

      This will likely start as documentation, and possibly move to install-time configuration if it is useful enough.

      RHEL Program Management 2019-05-03 22:45:09 CEST
      Flags: pm_ack+
      Rule Engine Rule: OSE-pm-ack
      Flags: devel_ack+
      Rule Engine Rule: OSE-devel-ack
      Flags: qa_ack+
      Rule Engine Rule: OSE-qa-ack
      Greg Blomquist 2019-05-03 22:45:50 CEST
      Assignee: vigoyal@redhat.com → scuppett@redhat.com
      PrivateComment 1 Stephen Cuppett 2019-05-06 16:26:24 CEST
      RED HAT CONFIDENTIAL
      The default gp2 volume on the masters is 360 iops. Is that enough for most OSD clusters?

      If not, is a 10 GB, 500 iops disk enough for the current set of clusters in OSD?

      CC: crawford@redhat.com, jeder@redhat.com, nmalik@redhat.com
      Link ID: CoreOS Jira SREP-1171
      Flags: needinfo?(jeder@redhat.com)
      Stephen Cuppett 2019-05-06 16:35:36 CEST
      Link ID: CoreOS Jira CORS-1078
      Stephen Cuppett 2019-05-06 16:41:50 CEST
      Link ID: CoreOS Jira CORS-828
      PrivateComment 2 Jeremy Eder 2019-05-06 17:05:40 CEST
      RED HAT CONFIDENTIAL
      The most important thing is to make it configurable at install time.

      For OSD we will use io1 type, with 1000 iops as a default.

      It's something that needs monitoring and alerts. These are things we will add to OSD (or the product when that makes sense).
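
      As an illustration of the kind of alert this implies (a minimal sketch only, not a shipped OSD/OCP rule; the rule name, namespace, and the 10ms/15m thresholds are placeholder assumptions), a PrometheusRule over the upstream etcd_disk_wal_fsync_duration_seconds metric could look roughly like:

          apiVersion: monitoring.coreos.com/v1
          kind: PrometheusRule
          metadata:
            name: etcd-disk-performance        # hypothetical name
            namespace: openshift-etcd
          spec:
            groups:
            - name: etcd-disk
              rules:
              - alert: EtcdSlowWALFsync
                # p99 WAL fsync latency above 10ms, sustained for 15 minutes,
                # suggests the backing volume is under-provisioned for etcd.
                expr: |
                  histogram_quantile(0.99,
                    rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])) > 0.01
                for: 15m
                labels:
                  severity: warning

      A rule along these lines (plus an equivalent one over etcd_disk_backend_commit_duration_seconds) is what would page SRE when the chosen volume type/IOPS turns out to be too small for the workload.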

      Flags: needinfo?(jeder@redhat.com)
      PrivateComment 3 Stephen Cuppett 2019-05-06 17:47:00 CEST
      This is currently possible (changing the root volume for the master):

      https://github.com/openshift/installer/blob/master/docs/user/aws/customization.md

      Also, via Hive:

      https://github.com/openshift/hive/blob/master/pkg/apis/hive/v1alpha1/machinepools.go#L59-L66
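
      For concreteness, the knob the customization doc describes is the controlPlane rootVolume stanza in install-config.yaml; a sketch (the size and IOPS values here are illustrative, not recommendations) looks like:

          controlPlane:
            name: master
            replicas: 3
            platform:
              aws:
                rootVolume:
                  type: io1        # provisioned-IOPS volume
                  size: 120        # GiB
                  iops: 1000

      Hive's v1alpha1 MachinePool exposes the equivalent rootVolume fields (iops, size, type) at the lines linked above, so the same shape applies to Hive-provisioned clusters.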

      Priority: unspecified → medium
      Target Release: — → 4.1.0
      Docs Contact: vigoyal@redhat.com
      Component: Documentation → Etcd
      CC: dgoodwin@redhat.com
      Assignee: scuppett@redhat.com → sbatsche@redhat.com
      QA Contact: xtian@redhat.com → geliu@redhat.com
      Severity: low → medium
      Red Hat Bugzilla 2019-05-06 17:47:00 CEST
      Flags: qa_ack+ pm_ack+ → qa_ack? pm_ack?
      RHEL Program Management 2019-05-06 17:47:04 CEST
      Flags: pm_ack? → pm_ack+
      Rule Engine Rule: OSE-pm-ack
      Flags: qa_ack? → qa_ack+
      Rule Engine Rule: OSE-qa-ack
      PrivateComment 4 Stephen Cuppett 2019-05-06 17:48:58 CEST
      RED HAT CONFIDENTIAL
      Currently, the installer default is 120G gp2 (360 iops). To achieve 1000 iops guaranteed/baseline, you can use a 350G gp2 or 20G/1000 iops io1 definition in the platform config.
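
      Spelling out the arithmetic (gp2's published baseline is 3 IOPS per GiB, so 120 GiB x 3 = 360 IOPS and 350 GiB x 3 = 1050 IOPS), the gp2 alternative to the io1 stanza sketched above simply trades size for IOPS:

          controlPlane:
            platform:
              aws:
                rootVolume:
                  type: gp2
                  size: 350        # 350 GiB x 3 IOPS/GiB = 1050 baseline IOPS, no provisioned-IOPS pricing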

      CC: scuppett@redhat.com
      Flags: pm_ack+ qa_ack+ → pm_ack? qa_ack?
      RHEL Program Management 2019-05-06 17:49:02 CEST
      Flags: pm_ack? → pm_ack+
      Rule Engine Rule: OSE-pm-ack
      Flags: qa_ack? → qa_ack+
      Rule Engine Rule: OSE-qa-ack
      PrivateComment 5 Stephen Cuppett 2019-05-06 17:49:25 CEST
      Deferring "separate" volume question to 4.2.0 and the CORS stories we have for it. For the 4.1.0 release, adjusting the main master volume to the right size/type should get us what we need.

      Target Release: 4.1.0 → 4.2.0
      Flags: pm_ack+ qa_ack+ → pm_ack? qa_ack?
      RHEL Program Management 2019-05-06 17:49:29 CEST
      Flags: pm_ack? → pm_ack+
      Rule Engine Rule: OSE-pm-ack
      Flags: qa_ack? → qa_ack+
      Rule Engine Rule: OSE-qa-ack
      CEE Openshift PM Score Bot 2019-05-07 09:02:07 CEST
      PM Score: 0 → 27
      W. Trevor King 2019-05-09 07:06:31 CEST
      CC: trking@redhat.com
      PrivateComment 6 Neelesh Agrawal 2019-08-15 17:18:01 CEST
      RED HAT CONFIDENTIAL
      Need a jira card.

      Target Release: 4.2.0 → 4.3.0
      CC: nagrawal@redhat.com
      PrivateComment 7 W. Trevor King 2019-08-15 22:15:16 CEST
      RED HAT CONFIDENTIAL
      https://jira.coreos.com/browse/RFE-305
      See also discussion in and resolution of https://jira.coreos.com/browse/SREP-1171

      Naveen Malik 2019-08-19 14:55:53 CEST
      CC: mwoodson@redhat.com
      PrivateComment 8 Eric Rich 2019-09-05 00:00:38 CEST
      At what point will 4.1 and 4.2 clusters that grow in size start hitting scale issues because of this?
      Do we need to have a separate bug, epic, jira for tracking how to mitigate this and move etcd to more performant storage post-install?

      CC: erich@redhat.com, sbatsche@redhat.com
      Flags: needinfo?(sbatsche@redhat.com)
      PrivateComment 9 W. Trevor King 2019-09-05 00:35:50 CEST
      RED HAT CONFIDENTIAL
      > Do we need to have a separate bug, epic, jira for tracking how to mitigate this and move etcd to more performant storage post-install?

      Hopefully we will have an etcd-operator to help recover control plane machines before we need this. That operator is being tracked in [1]. Then you could just bump your MachineSet to ask for larger root volumes, and the machine API and etcd operator would handle rolling you out onto faster/larger disks. As it stands, you can work through the disaster-recovery workflows to add new control plane machines with larger disks. There are probably easier approaches you can take now too, although I'm not sure it's worth working up docs around that vs. just waiting for the etcd operator.

      [1]: https://jira.coreos.com/browse/ETCD-25
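
      For the "bump your MachineSet" path, the relevant piece is the blockDevices entry in the AWS providerSpec; a fragment sketch follows (volume values are illustrative, and note that today the control plane Machines are not actually managed by a MachineSet, which is why this is paired with the etcd-operator/DR work):

          apiVersion: machine.openshift.io/v1beta1
          kind: MachineSet
          spec:
            template:
              spec:
                providerSpec:
                  value:
                    apiVersion: awsproviderconfig.openshift.io/v1beta1
                    kind: AWSMachineProviderConfig
                    blockDevices:
                    - ebs:
                        volumeType: io1
                        volumeSize: 120
                        iops: 1000

      Changing this only affects newly created Machines; existing control plane nodes would still need to be replaced (via the DR flow today, or the etcd operator once it lands) to actually move onto the faster disks.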

      PrivateComment 10 Michal Fojtik 2019-11-07 10:55:17 CET
      RED HAT CONFIDENTIAL
      Sam, is this something the operator will handle?

      CC: mfojtik@redhat.com
      Devel Whiteboard: candidate-4.3
      Flags: needinfo?(gblomqui@redhat.com)
      PrivateComment 11 Sam Batschelet 2019-11-14 12:19:49 CET
      RED HAT CONFIDENTIAL
      We can look at this as a feature for the operator in 4.4, but it cannot happen in 4.3.

      Target Release: 4.3.0 → 4.4.0
      Devel Whiteboard: candidate-4.3 → rejected-4.3
      Flags: needinfo?(sbatsche@redhat.com) needinfo?(gblomqui@redhat.com)
      PrivateComment 12 Michal Fojtik 2020-01-30 10:41:25 CET
      RED HAT CONFIDENTIAL
      Is disaster recovery handled by the operator in 4.4 now?

      Flags: needinfo?(sbatsche@redhat.com)
      PrivateComment 13 Sam Batschelet 2020-01-30 13:28:46 CET
      RED HAT CONFIDENTIAL
      > Is disaster recovery handled by the operator in 4.4 now?

      1. Can:
      • With the cluster-etcd-operator in 4.4 we can replace a failed etcd instance on a master node. For example, if the etcd instance on node A has a catastrophic failure due to a gRPC bug and sits in CrashLoopBackoff, the operator will conclude it is in a failed state, scale down the etcd cluster, remove the data-dir, and scale etcd back up with a healthy member.
      • Scale etcd during DR. After the initial control plane is restored, new nodes added to the cluster will not require manual intervention to scale up etcd or its dependencies. Some manual steps might still be involved, such as DNS and accepting the CSRs of new master nodes.
      2. Can not:
      • Recover from lost quorum without manual intervention.

      Regarding https://bugzilla.redhat.com/show_bug.cgi?id=1706228#c9

      In general, I agree that the cluster has enough data to conclude that the instance it is running on is underperforming; that is, the cluster can know that the p99 of key metrics is regularly crossing baseline thresholds. Given that, we should be able to have a controller that understands, per cloud, how instance type A's specs relate to instance type B's, and then concludes from the metrics data which components should change to correct the issue. The etcd-operator can handle generating etcd dependencies such as TLS certs so that this process happens much more gracefully and with minimal human operator involvement. This work is currently unplanned.

      Future plans:

      We are migrating the bash DR scripts to golang to allow for more autonomous/graceful DR solutions.

      PrivateComment 14 Eric Rich 2020-02-19 20:39:15 CET
      RED HAT CONFIDENTIAL
      Because of https://access.redhat.com/articles/4766521 we need to be careful about what disks we use and how we deal with them.
      I have closed the connected RFE on this.
