
Type: Story
Resolution: Unresolved
Priority: Major
Fix Version/s: 2.11.1
Affects Version/s: None
Component/s: Documentation
Labels:
- doc

Activity Type:
Product / Portfolio Work
Story Points:
5
Blocked:
False
Blocked Reason:

None
Ready:
False
Epic Link:
MTV backend telemetry
Color Status:
Not Selected
This ticket is for documenting the new counter metric, mtv_migration_plan_vms_total, which will be used to gather and forward telemetry data about Virtual Machine (VM) migrations facilitated by the Migration Toolkit for Virtualization (MTV).

The goal of this metric is to provide a concise way of counting the total number of VM migrations on the cluster, broken down by their key attributes. Today, gathering this information relies on unreliable customer input.

Metric to Document

Metric Name	Type	Purpose	Labels
`mtv_migration_plan_vms_total`	Counter	Counts the total number of migration plan VMs.	`vm_status`, `provider`, `mode`, `target`

Label Details and Possible Values

The documentation should clearly define each label and list the possible values it can take.

vm_status

- Description: The final state of the VM after the migration attempt.

- Possible Values:

- - Succeeded

- - Failed

provider

- Description: The source virtualization platform the VM was migrated from.

- Possible Values (Current):

- - vsphere

- - ova

- - Other currently supported providers (e.g., OpenStack, oVirt, OpenShift) should be included.

- Future Values (Note for Inclusion):

- - awsec2 (Not yet implemented, but planned)

mode

- Description: The type of migration performed.

- Possible Values (Current):

- - Cold: The source VM is shut down while data is copied.

- - Warm: Most data is copied while the VM is running (pre-copy stage), and the VM is shut down only for the cutover stage.

- Future Values (Note for Inclusion):

- - RCM (Not yet implemented, but planned)

target

- Description: The destination cluster for the migration.

- Possible Values:

- - local

- - remote
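
The label definitions above could be captured as a small schema for checking documentation examples or scraped samples. The following is only a sketch: `ALLOWED_LABEL_VALUES` and `validate_labels` are hypothetical names, values are lowercased as an assumption (the label details capitalize some values while the examples in this ticket use lowercase), and only the provider strings spelled out above are included.

```python
# Label schema transcribed from this ticket. Lowercase values are an
# assumption; the exact casing emitted by MTV should be confirmed.
ALLOWED_LABEL_VALUES = {
    "vm_status": {"succeeded", "failed"},
    # Other supported providers are omitted because the ticket does not
    # give their exact label strings.
    "provider": {"vsphere", "ova"},
    "mode": {"cold", "warm"},
    "target": {"local", "remote"},
}


def validate_labels(labels: dict[str, str]) -> list[str]:
    """Return a list of problems found in a label set (empty if none)."""
    problems = []
    # Every documented label must be present.
    for name in ALLOWED_LABEL_VALUES:
        if name not in labels:
            problems.append(f"missing label: {name}")
    # Every supplied label must be known and carry a documented value.
    for name, value in labels.items():
        allowed = ALLOWED_LABEL_VALUES.get(name)
        if allowed is None:
            problems.append(f"unknown label: {name}")
        elif value.lower() not in allowed:
            problems.append(f"unexpected value for {name}: {value}")
    return problems
```

With this schema, a planned value such as `awsec2` is reported as unexpected until it is added to the set.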

Example Use Cases

Please include clear examples to illustrate how this counter aggregates data.

Example 1: Successful Warm Migrations

- If the metric reports: succeeded, vsphere, warm, local = 2

- This means: 2 VMs were successfully migrated using warm migration from a vSphere source provider to a local cluster.

Example 2: Failed Cold Migrations

- If the metric reports: failed, vsphere, cold, local = 3

- This means: 3 VMs failed cold migration from a vSphere source provider to a local cluster.
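
The aggregation in the two examples can be illustrated with a short simulation. This is a sketch of the counting model only, not MTV code: the `vm_outcomes` list is made-up input chosen to reproduce Examples 1 and 2, and each tuple is assumed to be ordered as (vm_status, provider, mode, target).

```python
from collections import Counter

# Hypothetical per-VM outcomes: (vm_status, provider, mode, target).
vm_outcomes = [
    ("succeeded", "vsphere", "warm", "local"),
    ("succeeded", "vsphere", "warm", "local"),
    ("failed", "vsphere", "cold", "local"),
    ("failed", "vsphere", "cold", "local"),
    ("failed", "vsphere", "cold", "local"),
]

# One series per distinct label combination; each VM adds 1 to its series.
series = Counter(vm_outcomes)

for labels, value in sorted(series.items()):
    print(", ".join(labels), "=", value)
# failed, vsphere, cold, local = 3
# succeeded, vsphere, warm, local = 2
```

Each distinct combination of label values is a separate series, so the five VMs above produce two series rather than one total.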

Content Journey for MTV Telemetry Metrics Documentation

Stage	User Goal	Documentation Content & Focus
Discover	"Is there a way to track how my migrations are performing?"	High-level summary and value proposition. Focus on why this feature exists (currently, there is no concise or reliable way to gather migration metrics) and what it provides (data on VM/migration counts, results, providers, OSes).
Learn	"What data is being collected and how do I access it?"	Conceptual guide and metric definitions. Introduce the main metrics (e.g., `mtv_migrations_status_total`, `mtv_migration_plan_vms_total`, `mtv_migration_vm_oses_total`) and explain the labels (status, provider, mode, target, OS). Explain the process—gathering and forwarding information as telemetry data.
Try	"How do I verify the metrics are working correctly?"	Hands-on verification and testing guide. Provide clear steps and example queries for Prometheus or monitoring tools. Show how to check if the metrics correspond to reality. Include examples of the counter and gauge metrics in action.
Adopt	"How do I integrate this data into my overall operational monitoring?"	Deployment and integration instructions. Detail the configuration needed on the MTV side. Provide instructions for setting up dashboards and alerts using the documented metrics and labels. Focus on using the data to address the original motivation: getting reliable insight into MTV usage.
Expand	"What deeper insights can I get, and what is coming next?"	Advanced usage, future capabilities, and data analysis examples. Discuss using metrics like `mtv_migration_duration_seconds` (histogram) for deeper analysis of performance. Document the currently implemented metrics alongside planned future metrics (e.g., network throughput, disk I/O).
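
For the Try stage, one way to check that the metric matches reality is to total the scraped samples by a label and compare the result with the migrations that were actually run. The sketch below assumes the samples arrive in the Prometheus text exposition format. The `SAMPLE` text is invented to mirror Examples 1 and 2, and `parse_samples` and `totals_by` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Hypothetical scrape output in the Prometheus text exposition format,
# mirroring Examples 1 and 2 of this ticket.
SAMPLE = '''\
mtv_migration_plan_vms_total{vm_status="succeeded",provider="vsphere",mode="warm",target="local"} 2
mtv_migration_plan_vms_total{vm_status="failed",provider="vsphere",mode="cold",target="local"} 3
'''

LINE_RE = re.compile(
    r'^mtv_migration_plan_vms_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (labels_dict, value) for each mtv_migration_plan_vms_total line."""
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            yield dict(LABEL_RE.findall(match["labels"])), float(match["value"])


def totals_by(text, label):
    """Sum the counter over all series, grouped by a single label."""
    totals = defaultdict(float)
    for labels, value in parse_samples(text):
        totals[labels.get(label, "")] += value
    return dict(totals)
```

Calling `totals_by(SAMPLE, "vm_status")` groups the five VMs into two succeeded and three failed, which can then be compared against the plans that were executed.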

"Jobs to be Done" (JTBD) statement:

"When I am running VM migrations with MTV, I want to automatically and reliably gather comprehensive data on the status, type, and outcomes of the migrations and individual VMs, so that I can report on usage, identify trends, and address failures without relying on manual customer input."

Breakdown

Job (What the user wants to accomplish): Gather comprehensive data on migration status, type, and outcomes.
Circumstance (The situation or context): When running VM migrations with MTV.
Motivation/Need (Why they want it): To report on usage, identify trends, and address failures.
Desired Outcome (What success looks like): Accessing this data automatically and reliably, without relying on manual customer input.

Personas:

1. Product Manager / Business Analyst

As a Product Manager, I want to perform trend analysis on migration success rates, provider usage, and popular OSes, to achieve the goal of understanding MTV adoption, identifying pain points, and making data-driven decisions for feature prioritization.

2. Reliability Engineer (SRE) / Operations Specialist

As a Reliability Engineer, I want to perform real-time monitoring of migration status, duration, and failures by plan and VM, to achieve the goal of proactively detecting and alerting on widespread issues, ensuring migration stability, and measuring the Mean Time To Recovery (MTTR).

3. Support Engineer / QE Specialist

As a Support Engineer, I want to perform detailed lookups of metrics (like data transferred and duration) associated with a specific failed migration plan ID, to achieve the goal of quickly diagnosing specific customer issues and verifying the root cause of migration failures or cancellations.

is related to

MTV-3891 [DOC] Generate metrics and forward as telemetry data - Tech Preview

In Progress
