AI Platform Core Components / AIPCC-2057

Enhanced Logging System - Logging System Overhaul for Benchmark Workloads

      Summary

      As Jbenchmark evolves into a customer-facing product, it must behave as a unified service that receives parameters, runs a benchmark, and produces structured, transparent, and insightful logs. When errors occur, we must be able to pinpoint their source — whether it's resource constraints during model spin-up, an authentication failure, an infrastructure issue (e.g., wrong node type, unavailable spot instance), or a bug in our own stack.

      This means logs can no longer be fragmented per container. Instead, we need consolidated, contextualized logging at the workload level, with a focus on clarity, troubleshooting, and future observability.
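
      To make this concrete, below is a minimal sketch of what a consolidated, workload-level log line could look like, using Python's standard logging module. Every field name and value (workload_id, phase, error_category, the "wl-1234" ID) is an illustrative placeholder, not a finalized schema.

      import json
      import logging
      import sys

      class JsonFormatter(logging.Formatter):
          """Render each record as one JSON line carrying workload-level context."""
          def format(self, record):
              entry = {
                  "timestamp": self.formatTime(record),
                  "severity": record.levelname,
                  "message": record.getMessage(),
                  # Context attached via `extra=` on the logging call.
                  "workload_id": getattr(record, "workload_id", None),
                  "phase": getattr(record, "phase", None),
                  "error_category": getattr(record, "error_category", None),
              }
              return json.dumps(entry)

      handler = logging.StreamHandler(sys.stdout)
      handler.setFormatter(JsonFormatter())
      log = logging.getLogger("jbenchmark")
      log.addHandler(handler)
      log.setLevel(logging.INFO)

      # Example: an auth failure during model spin-up, tied to its workload.
      log.error(
          "Model spin-up failed: token rejected",
          extra={"workload_id": "wl-1234", "phase": "setup", "error_category": "auth"},
      )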

      Goals

      • Provide a clear, single point of truth per benchmark workload for all key events and states.
      • Enable full lifecycle tracking: setup → model loading → inference → output → teardown.
      • Categorize and distinguish errors (infra, runtime, auth, quota, bugs); a combined sketch of lifecycle tracking and error categorization follows this list.
      • Structure logs so they’re searchable, filterable, and queryable by both internal teams and external users.
      • Expose logs through a user-friendly interface, such as Kibana, backed by a system like OpenSearch.
      • Lay the foundation for proactive monitoring and future dashboards.
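
      The sketch below builds on the structured logger above and shows one way lifecycle tracking and error categorization could fit together. The phase names and the exception-to-category mapping are assumptions for illustration; the real classification rules would be defined in the spec document.

      import logging
      from contextlib import contextmanager

      log = logging.getLogger("jbenchmark")

      # Deliberately simplified mapping from exception type to error category.
      ERROR_CATEGORIES = {
          PermissionError: "auth",
          ConnectionError: "infra",
          MemoryError: "quota",
      }

      @contextmanager
      def phase(workload_id, name):
          """Log the start and outcome of one lifecycle phase (setup, model_loading, ...)."""
          ctx = {"workload_id": workload_id, "phase": name}
          log.info("phase started", extra=ctx)
          try:
              yield
          except Exception as exc:
              category = ERROR_CATEGORIES.get(type(exc), "bug")
              log.error("phase failed: %s", exc, extra={**ctx, "error_category": category})
              raise
          else:
              log.info("phase completed", extra=ctx)

      # Usage (hypothetical calls):
      # with phase("wl-1234", "model_loading"):
      #     load_model()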

      Tasks (Phase 1):

      1. Write a detailed spec document, including:
        • Target logging behaviors and developer/operator expectations.
        • Logging structure, content, severity levels, and naming.
        • Log aggregation and exposure strategy (e.g., OpenSearch + Kibana).
        • Examples of filtered use cases (e.g., "show only workloads that failed due to auth"); see the query sketch after this task list.
      2. Propose multiple architectural alternatives:
        • Logging destinations and formats (stdout, file, system-level).
        • How logs are shipped and ingested.
        • Options for log visualization tools.
      3. Discuss with team:
        • Internal design review sessions to gather input and challenge assumptions.
      4. Approval:
        • Submit finalized spec and architecture for Aviran’s approval.
      5. Planning:
        • Aviran assigns ownership of the Epic.
        • Team breaks the scope into stories under this Epic.
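
      For the filtered use case named in task 1 ("show only workloads that failed due to auth"), one possible query against an OpenSearch-backed log store is sketched below. The endpoint, index name, and field mappings are assumptions carried over from the earlier examples, not an agreed design.

      import requests

      # Hypothetical index and endpoint; assumes severity and error_category
      # are indexed as keyword fields.
      query = {
          "query": {
              "bool": {
                  "filter": [
                      {"term": {"severity": "ERROR"}},
                      {"term": {"error_category": "auth"}},
                  ]
              }
          }
      }

      resp = requests.get(
          "https://opensearch.example.internal:9200/jbenchmark-logs/_search",
          json=query,
          timeout=10,
      )
      for hit in resp.json()["hits"]["hits"]:
          src = hit["_source"]
          print(src.get("workload_id"), src.get("message"))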

      Phase 2: Implementation

      Begins only after Phase 1 is approved and tasks are broken down.

       

      Note:
      As part of the logging overhaul, logs should not only be structured and accessible programmatically — they must also be searchable and filterable via a user-friendly interface. A recommended approach is to stream logs into a centralized system like OpenSearch and provide a Kibana interface on top of it (a minimal ingestion sketch follows the list below). This would enable both internal and external users to:

      • Filter logs by workload ID, model name, date, error type, or severity.
      • Easily visualize and investigate failures across runs.
      • Gain visibility into recurring issues and usage patterns without requiring access to the raw system.
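
      A minimal sketch of the ingestion side, assuming each container writes JSON log lines to a file and a small shipper bulk-indexes them into OpenSearch. In practice a dedicated collector (e.g., Fluent Bit) would replace a script like this; the endpoint and index name are placeholders.

      import json
      import requests

      OPENSEARCH_URL = "https://opensearch.example.internal:9200"  # placeholder
      INDEX = "jbenchmark-logs"  # placeholder

      def ship(log_path):
          """Bulk-index every JSON log line from `log_path` into OpenSearch."""
          lines = []
          with open(log_path) as fh:
              for raw in fh:
                  raw = raw.strip()
                  if not raw:
                      continue
                  lines.append(json.dumps({"index": {"_index": INDEX}}))
                  lines.append(raw)
          body = "\n".join(lines) + "\n"  # the bulk API expects NDJSON
          resp = requests.post(
              f"{OPENSEARCH_URL}/_bulk",
              data=body,
              headers={"Content-Type": "application/x-ndjson"},
              timeout=30,
          )
          resp.raise_for_status()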

       
