github-mirrors/oh-my-claudecode

mirror of https://fastgit.cc/github.com/Yeachan-Heo/oh-my-claudecode synced 2026-04-20 21:00:50 +08:00

Yeachan-Heo 0982e12a05 cleanup: normalize legacy OmO agent aliases to canonical names

2026-02-22 06:43:12 +00:00

Hephaestus vs Deep-Executor: Comparative Analysis

Analysis Summary

Research Question: How do the Hephaestus (oh-my-opencode) and Deep-Executor (oh-my-claudecode) agent architectures differ, and what can each learn from the other?
Methodology: Structured feature comparison across 14 capability dimensions, scored 0-3

1. Architectural Overview

Dimension	Hephaestus	Deep-Executor
Core Philosophy	Conductor/Delegator	Self-Contained Forge
Execution Model	Multi-agent parallel	Single-agent sequential
Agent Spawning	2-5 parallel background agents	BLOCKED (by design)
Tool Strategy	Agents as tools	Direct MCP/LSP tools
Model	GPT 5.2 with reasoning levels	Claude (Opus/Sonnet)

Key Insight

These are fundamentally different architectural paradigms. Hephaestus is a distributed system -- it treats agents as microservices. Deep-Executor is a monolith -- it concentrates all capability in one process. Neither is inherently superior; they optimize for different constraints.

2. Feature Gap Analysis: What Hephaestus Has That Deep-Executor Lacks

Feature Comparison Matrix

Category                                 Hephaestus    Deep-Exec    Delta
--------------------------------------------------------------------------------
Parallel Exploration                              3            0       +3
Delegation to Specialists                         3            0       +3
External Research (Docs/OSS)                      3            0       +3
Failure Recovery / Escalation                     3            1       +2
Dynamic Prompt Adaptation                         3            0       +3
Reasoning Level Configuration                     3            0       +3
TODO / Task Tracking Discipline                   1            3       -2
Verification Protocol Rigor                       1            3       -2
Structured Output Contract                        2            3       -1
MCP/LSP Tool Strategy                             1            3       -2
Ambiguity Resolution                              3            2       +1
Session Continuity                                3            2       +1
Token Efficiency                                  1            3       -2
Self-Sufficiency                                  1            3       -2
--------------------------------------------------------------------------------
TOTAL                                            31           23       +8
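
The totals and per-row deltas in the matrix can be re-derived mechanically. The sketch below copies the scores from the table; the names `SCORES`, `summarize`, and `deltas` are illustrative and not part of either project.

```python
# Scores copied from the matrix above: (Hephaestus, Deep-Executor), each 0-3.
SCORES = {
    "Parallel Exploration": (3, 0),
    "Delegation to Specialists": (3, 0),
    "External Research (Docs/OSS)": (3, 0),
    "Failure Recovery / Escalation": (3, 1),
    "Dynamic Prompt Adaptation": (3, 0),
    "Reasoning Level Configuration": (3, 0),
    "TODO / Task Tracking Discipline": (1, 3),
    "Verification Protocol Rigor": (1, 3),
    "Structured Output Contract": (2, 3),
    "MCP/LSP Tool Strategy": (1, 3),
    "Ambiguity Resolution": (3, 2),
    "Session Continuity": (3, 2),
    "Token Efficiency": (1, 3),
    "Self-Sufficiency": (1, 3),
}


def summarize(scores):
    """Return (hephaestus_total, deep_executor_total, delta)."""
    heph = sum(h for h, _ in scores.values())
    deep = sum(d for _, d in scores.values())
    return heph, deep, heph - deep


def deltas(scores):
    """Per-category delta; positive means Hephaestus scores higher."""
    return {name: h - d for name, (h, d) in scores.items()}


print(summarize(SCORES))
```

Summing unweighted scores treats every dimension as equally important, so the aggregate delta is best read as a rough indicator rather than a ranking.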

2.1 Parallel Exploration (Gap: 3/3)

Hephaestus: Fires 2-5 explore/document-specialist agents simultaneously as background tasks, continues working while results stream in, and collects them with background_output(task_id).

Deep-Executor: Sequential exploration only. Must complete each Glob/Grep/Read call before starting the next.

Impact: For large codebases, Hephaestus can gather context 3-5x faster. Deep-Executor compensates with more targeted, cheaper queries but loses wall-clock time on broad searches.
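
The wall-clock difference between the two exploration styles can be illustrated with a minimal sketch. It assumes exploration is latency-bound; `fake_search` is a hypothetical stand-in for a real search task, and the thread pool only mimics background agents. It does not reproduce either project's actual mechanism.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_search(query, delay=0.05):
    """Hypothetical stand-in for one explore task; sleeps to mimic I/O latency."""
    time.sleep(delay)
    return f"results for {query}"


def explore_sequential(queries):
    # Each search finishes before the next one starts.
    return [fake_search(q) for q in queries]


def explore_parallel(queries, max_workers=5):
    # All searches run concurrently; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_search, queries))


if __name__ == "__main__":
    queries = ["auth", "routes", "models", "migrations"]
    for explore in (explore_sequential, explore_parallel):
        start = time.perf_counter()
        explore(queries)
        print(explore.__name__, round(time.perf_counter() - start, 2), "s")
```

Both functions return the same results. Only the elapsed time differs, and the speedup is bounded by the number of independent queries.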

2.2 Delegation to Specialists (Gap: 3/3)

Hephaestus: Three specialized agent types:

- Explore agents: Parallel codebase search
- Document-Specialist: External docs, GitHub, OSS research
- Architect: High-capability consultation when the agent is stuck

Deep-Executor: No delegation. All work is self-performed. This is a deliberate design choice ("You are the forge") but means no access to specialist capabilities.

Impact: Hephaestus can handle broader task scopes. Deep-Executor is limited to what a single agent context window can reason about.

2.3 External Research Capability (Gap: 3/3)

Hephaestus: Document-Specialist agent fetches external documentation, GitHub repos, and OSS references. This provides real-time knowledge augmentation.

Deep-Executor: No external research capability. Relies entirely on pre-loaded context and available tools.

Impact: When working with unfamiliar APIs or libraries, Hephaestus has a significant advantage.

2.4 Failure Recovery / Escalation (Gap: 2/3)

Hephaestus: Structured 3-failure protocol: STOP -> REVERT -> DOCUMENT -> CONSULT Architect. Clear escalation path prevents infinite retry loops.

Deep-Executor: No explicit failure threshold or escalation. Has verification loops but no "give up and escalate" mechanism.

Impact: Hephaestus avoids wasting tokens on unrecoverable situations. Deep-Executor can get stuck in retry loops.

2.5 Dynamic Prompt Adaptation (Gap: 3/3)

Hephaestus: Uses helper functions (buildExploreSection(), etc.) to dynamically construct prompts based on available capabilities. Prompt adapts to runtime environment.

Deep-Executor: Static prompt. Same instructions regardless of available tools or context.

Impact: Hephaestus is more portable across environments with varying tool availability.
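
Capability-driven prompt assembly can be sketched as follows. Only the name buildExploreSection() comes from the description above; the Python function names, the capability labels, and the section text are all hypothetical.

```python
def build_explore_section(capabilities):
    """Hypothetical analogue of buildExploreSection(): emit text only when
    explore agents are available in the current environment."""
    if "explore" not in capabilities:
        return ""
    return "You may launch explore agents for codebase search."


def build_research_section(capabilities):
    """Hypothetical section for external research (assumed capability label)."""
    if "document-specialist" not in capabilities:
        return ""
    return "You may ask the document-specialist for external docs."


def build_prompt(base, capabilities):
    """Join the base prompt with every section whose capability is present."""
    sections = [
        base,
        build_explore_section(capabilities),
        build_research_section(capabilities),
    ]
    return "\n\n".join(s for s in sections if s)


print(build_prompt("You are an executor.", {"explore"}))
```

A static prompt corresponds to always emitting every section, whether or not the tool behind it exists at runtime.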

2.6 Reasoning Level Configuration (Gap: 3/3)

Hephaestus: Explicit reasoning budget per task type (MEDIUM for code changes, HIGH for complex refactoring). The "ROUTER NUDGE" steers the model's thinking depth.

Deep-Executor: No reasoning level control. Same approach for all task complexities.

Impact: Hephaestus can optimize cost/quality tradeoff per subtask.
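
A per-task reasoning budget amounts to a lookup from task type to level. In this sketch the two mappings come from the description above, while the task-type keys and the MEDIUM fallback are assumptions made for illustration.

```python
from enum import Enum


class Reasoning(Enum):
    MEDIUM = "medium"
    HIGH = "high"


# Mappings from the text; the key names themselves are hypothetical labels.
REASONING_BY_TASK = {
    "code_change": Reasoning.MEDIUM,
    "complex_refactor": Reasoning.HIGH,
}


def reasoning_for(task_type, default=Reasoning.MEDIUM):
    """Return the reasoning level for a task type.

    Falling back to MEDIUM for unknown task types is an assumption.
    """
    return REASONING_BY_TASK.get(task_type, default)


print(reasoning_for("complex_refactor").value)
```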

3. Inverse Gaps: What Deep-Executor Has That Hephaestus Could Benefit From

3.1 TODO Discipline (Gap: 2/3)

Deep-Executor: NON-NEGOTIABLE rules: TodoWrite for 2+ steps, ONE in_progress at a time, mark completed IMMEDIATELY. This creates a reliable audit trail and prevents task drift.

Hephaestus: Minimal task tracking. Relies on delegation structure rather than explicit progress tracking.

Recommendation for Hephaestus: Adopt mandatory task tracking for complex multi-step operations.
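
The "ONE in_progress at a time" rule is easy to enforce in code. The sketch below is a hypothetical in-memory tracker that shows the invariant only; it is not the TodoWrite tool itself.

```python
class TodoList:
    """Minimal sketch: at most one item may be in_progress at any time."""

    def __init__(self, items):
        self.status = {item: "pending" for item in items}

    def start(self, item):
        # Refuse to start a second item while another is still in progress.
        if "in_progress" in self.status.values():
            raise RuntimeError("another item is already in_progress")
        if self.status[item] != "pending":
            raise ValueError(f"{item!r} is not pending")
        self.status[item] = "in_progress"

    def complete(self, item):
        # Mark completed immediately; only an in-progress item can complete.
        if self.status[item] != "in_progress":
            raise ValueError(f"{item!r} is not in_progress")
        self.status[item] = "completed"

    def all_done(self):
        return all(s == "completed" for s in self.status.values())


todos = TodoList(["read config", "patch parser"])
todos.start("read config")
todos.complete("read config")
todos.start("patch parser")
```

Because `start` raises while another item is open, task drift shows up as an immediate error instead of as an ambiguous audit trail.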

3.2 Verification Protocol Rigor (Gap: 2/3)

Deep-Executor: After EVERY change: run lsp_diagnostics. Before completion: ALL of (todos, tests, build, diagnostics) must pass, reported in a specified evidence format.

Hephaestus: No structured verification protocol. Delegates verification implicitly through agent results.

Recommendation for Hephaestus: Add post-change diagnostic checks and a completion checklist.
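
A completion checklist of the kind recommended here can be modeled as a gate over four pieces of evidence. The field names below are assumptions derived from the "todos, tests, build, diagnostics" list; the source does not reproduce the actual evidence format.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    todos_completed: bool
    tests_passed: bool
    build_succeeded: bool
    diagnostics_clean: bool


def missing_checks(evidence):
    """Names of checks that have not passed, in field order."""
    return [name for name, ok in vars(evidence).items() if not ok]


def may_complete(evidence):
    """Completion is allowed only when ALL checks pass."""
    return not missing_checks(evidence)


e = Evidence(todos_completed=True, tests_passed=False,
             build_succeeded=True, diagnostics_clean=False)
print(missing_checks(e))
```

Returning the list of failing checks, instead of a bare boolean, gives the agent something concrete to report or retry.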

3.3 MCP/LSP Tool Strategy (Gap: 2/3)

Deep-Executor: Explicit strategy for lsp_diagnostics (single file), lsp_diagnostics_directory (project-wide), ast_grep_search/replace with dryRun protocol. Clear escalation from file to project scope.

Hephaestus: No explicit LSP/AST tool strategy documented.

Recommendation for Hephaestus: Document and enforce a tool selection hierarchy.

3.4 Token Efficiency (Gap: 2/3)

Deep-Executor: Single agent = single context window. No inter-agent communication overhead. No prompt duplication across spawned agents.

Hephaestus: Each spawned agent carries its own system prompt + context. 2-5 parallel agents means 2-5x prompt overhead. Background task management adds coordination tokens.

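Estimated overhead: agent spawning makes a Hephaestus exploration phase cost several times the tokens of the Deep-Executor equivalent. Section 4 estimates 7.5:1 for three parallel explore agents and about 2.5:1 for a typical whole task.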

3.5 Self-Sufficiency (Gap: 2/3)

Deep-Executor: Works in any environment. No dependency on agent infrastructure, background task systems, or multi-agent coordination. Degrades gracefully.

Hephaestus: Depends on delegation infrastructure. If agent spawning fails, core workflow breaks.

4. Token Efficiency Analysis

Operation	Hephaestus (est. tokens)	Deep-Executor (est. tokens)	Ratio
System prompt per agent	~3,000	~3,000 (once)	1:1
3 parallel explore agents	~9,000 prompt + ~6,000 output	~2,000 (sequential Grep/Glob)	7.5:1
Document-Specialist research call	~4,000 prompt + ~2,000 output	N/A (not available)	-
Architect consultation	~5,000 prompt + ~3,000 output	N/A (not available)	-
Coordination overhead	~1,000 per delegation	0	-
Typical task total	~30,000-50,000	~10,000-20,000	~2.5:1

Conclusion: Deep-Executor is approximately 2-3x more token-efficient for equivalent tasks. Hephaestus trades tokens for wall-clock speed and broader capability.
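
The ratios in the table follow from simple arithmetic, reproduced in the sketch below. Reading the ~9,000 prompt tokens as three agents times the ~3,000-token system prompt is an assumption; the table does not spell this out.

```python
SYSTEM_PROMPT_TOKENS = 3_000  # per agent, from the table


def explore_ratio(agents=3, output_tokens=6_000, sequential_tokens=2_000):
    """Token ratio of parallel explore agents to sequential Grep/Glob."""
    prompt_tokens = agents * SYSTEM_PROMPT_TOKENS  # 3 x 3,000 = 9,000 (assumed breakdown)
    return (prompt_tokens + output_tokens) / sequential_tokens


def total_ratio_range(heph=(30_000, 50_000), deep=(10_000, 20_000)):
    """Ratio of typical task totals at the low and high ends of each range."""
    return heph[0] / deep[0], heph[1] / deep[1]


print(explore_ratio())      # (9,000 + 6,000) / 2,000
print(total_ratio_range())  # low-end and high-end ratios
```

The exploration row gives 7.5:1, while the task totals give 3.0 at the low end and 2.5 at the high end, which matches the "approximately 2-3x" conclusion.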

5. Architectural Tradeoffs

Delegation Model (Hephaestus)

Strengths:

- Parallel execution reduces wall-clock time
- Specialist agents can be individually optimized
- External research augments knowledge
- Failure escalation prevents waste

Weaknesses:

- Higher token cost (2-3x)
- Coordination complexity
- Context fragmentation across agents
- Infrastructure dependency

Self-Contained Model (Deep-Executor)

Strengths:

- Token efficient
- No coordination overhead
- Unified context (no information loss between agents)
- Portable and infrastructure-independent
- Strong verification discipline

Weaknesses:

- Sequential exploration (slower wall-clock)
- No escalation path when stuck
- No external research
- Cannot parallelize independent subtasks
- Single point of failure (bounded by a single agent's context limit)

6. Prioritized Improvement Recommendations for Deep-Executor

Priority 1: Failure Recovery Protocol (HIGH IMPACT, LOW EFFORT)

Add a structured failure threshold:

After 3 consecutive failures on same task:
1. STOP current approach
2. DOCUMENT what was tried and why it failed
3. Try fundamentally different approach
4. If still failing: report to orchestrator with evidence

This requires NO delegation infrastructure -- just self-discipline rules.
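
Although the protocol is meant to live in the prompt, its bookkeeping can be sketched in code to make the threshold unambiguous. The class and method names are hypothetical. The default threshold of 3 and the reset on success follow the "3 consecutive failures" wording.

```python
class FailureTracker:
    """Counts consecutive failures on one task and signals when to stop."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.consecutive = 0
        self.attempts = []  # (approach, succeeded, note), kept for the DOCUMENT step

    def record(self, approach, succeeded, note=""):
        """Log an attempt; return True when the current approach must STOP."""
        self.attempts.append((approach, succeeded, note))
        self.consecutive = 0 if succeeded else self.consecutive + 1
        return self.consecutive >= self.threshold

    def failure_log(self):
        """What was tried and why it failed."""
        return [(a, n) for a, ok, n in self.attempts if not ok]


tracker = FailureTracker()
for attempt in ("patch v1", "patch v2", "patch v3"):
    must_stop = tracker.record(attempt, succeeded=False, note="tests still red")
print(must_stop, tracker.failure_log())
```

A success resets the counter, so only an unbroken run of failures triggers the stop.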

Priority 2: Exploration Batching (HIGH IMPACT, MEDIUM EFFORT)

While true parallel agents are blocked, Deep-Executor can batch exploration:

- Issue multiple Glob/Grep calls in a single turn (already possible)
- Structure 5 exploration questions upfront (already present)
- Add explicit "exploration budget" (max N tool calls before proceeding)

Ensure the agent always issues independent Glob/Grep/Read calls in parallel within a single response.
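
An "exploration budget" can be pictured as a counter that trims each batch of tool calls to what remains. This is a sketch under assumptions: the class name, the truncate-instead-of-reject policy, and the example calls are all illustrative.

```python
class ExplorationBudget:
    """Caps the total number of exploration tool calls (max N)."""

    def __init__(self, max_calls):
        self.max_calls = max_calls
        self.used = 0

    def take(self, batch):
        """Return the part of `batch` that still fits in the budget."""
        remaining = max(self.max_calls - self.used, 0)
        allowed = batch[:remaining]
        self.used += len(allowed)
        return allowed

    @property
    def exhausted(self):
        return self.used >= self.max_calls


budget = ExplorationBudget(max_calls=5)
first = budget.take(["Glob src/**", "Grep TODO", "Read README.md"])
second = budget.take(["Grep FIXME", "Read a.py", "Read b.py", "Read c.py"])
print(len(first), len(second), budget.exhausted)
```

Each batch corresponds to one turn of independent calls. Once the budget is exhausted, the agent proceeds with what it has.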

Priority 3: Reasoning Depth Hints (MEDIUM IMPACT, LOW EFFORT)

Add task-complexity classification to control thoroughness:

SIMPLE (1 file, < 20 lines): Quick fix, minimal exploration
MEDIUM (1-3 files, < 100 lines): Standard exploration + verification
COMPLEX (more than 3 files, or architectural): Full exploration + multiple verification passes
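
The sizing heuristic can be written as a small classifier. The sketch reads the SIMPLE tier as a single file, sends a three-file non-architectural change to MEDIUM, and treats any change of 100 lines or more as COMPLEX. All three boundary readings are assumptions, since the tiers leave those cases open.

```python
def classify(files_touched, lines_changed, architectural=False):
    """Map a change estimate to SIMPLE / MEDIUM / COMPLEX."""
    if architectural or files_touched > 3:
        return "COMPLEX"
    if files_touched <= 1 and lines_changed < 20:
        return "SIMPLE"
    if lines_changed < 100:
        return "MEDIUM"
    # Assumption: large line counts are COMPLEX even in few files.
    return "COMPLEX"


for estimate in [(1, 10), (2, 50), (5, 10), (2, 500)]:
    print(estimate, classify(*estimate))
```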

Priority 4: Dynamic Tool Adaptation (MEDIUM IMPACT, MEDIUM EFFORT)

Add capability detection:

IF lsp_diagnostics available: use for verification
ELSE IF build command known: use build output
ELSE: rely on ast_grep_search for structural validation
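
The fallback chain above translates directly into a selection function. The tool names are the ones used in the text; the function name, the set-of-tools argument, and the returned labels are hypothetical.

```python
def pick_verifier(available_tools, build_command=None):
    """Choose a verification strategy following the fallback chain."""
    if "lsp_diagnostics" in available_tools:
        return "lsp_diagnostics"
    if build_command:
        # A build command is known: verify via its output.
        return f"build: {build_command}"
    # Last resort: structural validation only.
    return "ast_grep_search"


print(pick_verifier({"lsp_diagnostics", "ast_grep_search"}))
print(pick_verifier({"ast_grep_search"}, build_command="make"))
print(pick_verifier(set()))
```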

Priority 5: Structured Escalation Reporting (LOW IMPACT, LOW EFFORT)

When stuck, produce a structured failure report:

## Escalation Report
- **Task**: What was attempted
- **Attempts**: What approaches were tried (with outcomes)
- **Blocker**: Why it cannot be resolved
- **Suggested Next Steps**: What a human or orchestrator should try
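
Filling in this template from structured data keeps escalation reports uniform. The sketch below is one possible rendering; the dataclass, its field names, and the example values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EscalationReport:
    task: str
    attempts: list  # (approach, outcome) pairs
    blocker: str
    next_steps: str

    def render(self):
        """Render the report using the template's four fields."""
        attempts = "; ".join(f"{a} ({o})" for a, o in self.attempts)
        return "\n".join([
            "## Escalation Report",
            f"- **Task**: {self.task}",
            f"- **Attempts**: {attempts}",
            f"- **Blocker**: {self.blocker}",
            f"- **Suggested Next Steps**: {self.next_steps}",
        ])


report = EscalationReport(
    task="Fix flaky test",
    attempts=[("retry", "failed"), ("mock clock", "failed")],
    blocker="Race in fixture",
    next_steps="Review fixture setup",
)
print(report.render())
```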

7. Implementation Suggestions

For Deep-Executor Enhancements

Enhancement	Implementation	Effort
Failure threshold	Add counter + rules to prompt	1 hour
Exploration batching	Add parallel tool call guidance	30 min
Complexity classification	Add task sizing heuristic	1 hour
Escalation report format	Add output template	30 min
Tool capability detection	Add conditional tool sections	2 hours

For Hephaestus Enhancements (Inverse)

Enhancement	Implementation	Effort
TODO discipline	Port Deep-Executor's TodoWrite rules	1 hour
Verification protocol	Add post-change lsp_diagnostics mandate	1 hour
LSP tool strategy	Document tool selection hierarchy	2 hours
Completion checklist	Port Definition of Done format	30 min
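
For planning purposes, the effort estimates in the two tables above can be totaled. The minute values are transcribed from the tables; the variable and function names are illustrative.

```python
# Effort estimates from the two tables, in minutes.
DEEP_EXECUTOR_EFFORT_MIN = {
    "Failure threshold": 60,
    "Exploration batching": 30,
    "Complexity classification": 60,
    "Escalation report format": 30,
    "Tool capability detection": 120,
}
HEPHAESTUS_EFFORT_MIN = {
    "TODO discipline": 60,
    "Verification protocol": 60,
    "LSP tool strategy": 120,
    "Completion checklist": 30,
}


def total_hours(effort_minutes):
    """Sum a table of per-enhancement estimates and convert to hours."""
    return sum(effort_minutes.values()) / 60


print(total_hours(DEEP_EXECUTOR_EFFORT_MIN), total_hours(HEPHAESTUS_EFFORT_MIN))
```

On these estimates the Deep-Executor enhancements total 5 hours and the Hephaestus ones 4.5 hours.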

8. Conclusion

Hephaestus and Deep-Executor represent two valid points on the agent architecture spectrum:

- Hephaestus optimizes for capability breadth and speed at the cost of token efficiency
- Deep-Executor optimizes for reliability and efficiency at the cost of parallelism

The most impactful improvements for Deep-Executor are those that require NO architectural changes: failure recovery protocols, exploration batching, and complexity-aware reasoning. These can be implemented purely through prompt engineering within the existing self-contained model.

The most impactful improvements for Hephaestus are Deep-Executor's discipline mechanisms: TODO tracking, verification protocols, and structured completion contracts. These add reliability without sacrificing Hephaestus's delegation strengths.

Analysis completed: 2026-02-01
Session: hephaestus-deep-executor-comparison
