mirror of
https://fastgit.cc/github.com/HKUDS/CLI-Anything
synced 2026-04-20 21:00:28 +08:00
feat: Add VideoCaptioner CLI harness — AI-powered video captioning
Add agent-harness for VideoCaptioner, an AI-powered video captioning tool.

Pipeline: Speech transcription → Subtitle optimization → Translation → Video synthesis with styled subtitles.

Key features:
- 4 ASR engines (bijian/jianying free, whisper-api, whisper-cpp)
- 3 translation services (LLM, Bing free, Google free), 38 languages
- Beautiful subtitle styles (ASS outline + rounded background)
- Full pipeline in one command
- 26 tests (14 unit + 12 e2e), all passing
83
videocaptioner/agent-harness/VIDEOCAPTIONER.md
Normal file
@@ -0,0 +1,83 @@
# VideoCaptioner: Project-Specific Analysis & SOP

## Architecture Summary

VideoCaptioner is an AI-powered video captioning tool that provides a complete
pipeline from speech recognition to styled subtitle synthesis. It ships as a
standalone CLI (`pip install videocaptioner`) with a well-defined command interface.

```
+----------------------------------------------------------+
|                    VideoCaptioner CLI                    |
| +------------+  +----------+ +-----------+ +-----------+ |
| | Transcribe |  | Subtitle | | Synthesize| | Process   | |
| |   (ASR)    |  |  (NLP)   | |  (FFmpeg) | | (Pipeline)| |
| +-----+------+  +----+-----+ +-----+-----+ +-----+-----+ |
|       |              |             |             |       |
| +-----+--------------+-------------+-------------+-----+ |
| |                     Core Engine                      | |
| |  ASR engines, LLM optimization, Translation,         | |
| |  Subtitle rendering (ASS + Rounded), FFmpeg          | |
| +------------------------------------------------------+ |
+----------------------------------------------------------+
```

## CLI Strategy: Subprocess Wrapper

Unlike applications that need reverse-engineering of internal formats,
VideoCaptioner already provides a production CLI. Our harness has four parts:

1. **Click wrapper** provides the CLI-Anything standard interface
2. **Subprocess backend** delegates to `videocaptioner` CLI commands
3. **JSON mode** (`--json`) returns structured output for agents
4. **REPL mode** provides an interactive session with tab-completion

### Why Subprocess?

VideoCaptioner's CLI is:
- **Production-tested** with 50+ unit tests and 200+ QA test cases
- **Feature-complete** with 7 subcommands covering the full pipeline
- **Well-documented** with clear `--help` text and exit codes
- **Actively maintained** on PyPI with automated releases

Wrapping via subprocess preserves all these qualities without reimplementation.

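The subprocess strategy described above can be sketched in a few lines. This is a minimal illustration, not the harness's actual backend module: `find_executable` and `run_cli` are hypothetical names, and the error handling (raising `RuntimeError` with the captured stderr) is an assumption about one reasonable design.

```python
import shutil
import subprocess
from typing import Optional, Sequence


def find_executable(name: str = "videocaptioner") -> str:
    """Locate the wrapped CLI on PATH, or fail with a clear message."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} not found on PATH; try: pip install {name}")
    return path


def run_cli(args: Sequence[str], executable: Optional[str] = None) -> str:
    """Run the wrapped CLI and return its stdout with whitespace stripped."""
    cmd = [executable or find_executable(), *args]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the tool's own error text rather than a bare exit code.
        raise RuntimeError(result.stderr.strip() or f"exit code {result.returncode}")
    return result.stdout.strip()
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with file paths and inline JSON values.
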
## Coverage

### Transcription (4 ASR engines)
- `bijian` — Free, Chinese & English, no setup needed
- `jianying` — Free, Chinese & English, no setup needed
- `whisper-api` — All languages, OpenAI-compatible API
- `whisper-cpp` — All languages, local model

### Subtitle Processing
- **Split** — Semantic re-segmentation via LLM
- **Optimize** — Fix ASR errors, punctuation, formatting via LLM
- **Translate** — 38 languages, 3 translators (LLM, Bing free, Google free)
- **Layout** — target-above, source-above, target-only, source-only

### Video Synthesis
- **Soft subtitles** — Embedded subtitle track (switchable)
- **Hard subtitles** — Burned into video frames
- **ASS style** — Traditional outline/shadow with presets (default, anime, vertical)
- **Rounded style** — Modern rounded background boxes
- **Customizable** — Inline JSON override for any style parameter
- **Quality levels** — ultra (CRF 18), high (CRF 23), medium (CRF 28), low (CRF 32)

### Utilities
- Configuration management (TOML config + env vars)
- Style preset listing with full parameters
- Online video download (YouTube, Bilibili, etc.)

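The quality levels listed above map names to CRF (constant rate factor) values, where a lower CRF means higher quality and a larger file. The lookup below is a small sketch of that mapping; `QUALITY_CRF` and `crf_for` are hypothetical names, and how VideoCaptioner applies the value internally is not shown in this document.

```python
# Quality names and CRF values as listed in the coverage notes.
QUALITY_CRF = {"ultra": 18, "high": 23, "medium": 28, "low": 32}


def crf_for(quality: str) -> int:
    """Return the CRF value for a named quality level."""
    try:
        return QUALITY_CRF[quality]
    except KeyError:
        valid = ", ".join(QUALITY_CRF)
        raise ValueError(f"unknown quality {quality!r}; expected one of: {valid}") from None
```

Validating the name before launching a long encode gives a faster, clearer failure than waiting for the wrapped CLI to reject it.
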
## Testing Strategy

- **Unit tests**: Mock subprocess calls, verify argument construction
- **End-to-end tests**: Real videocaptioner CLI with test media files
- **Prerequisite**: `videocaptioner` and `ffmpeg` must be installed

## Limitations

- Requires `videocaptioner` package to be installed separately
- Free ASR engines (bijian/jianying) only support Chinese & English
- LLM features require an OpenAI-compatible API key
- Hard subtitle styles require FFmpeg
@@ -0,0 +1,71 @@
# VideoCaptioner CLI

AI-powered video captioning tool with beautiful customizable subtitle styles.

## Architecture

- **Subprocess backend** delegates to the production `videocaptioner` CLI (`pip install videocaptioner`)
- **Click** provides the CLI framework with subcommand groups and REPL
- **JSON output mode** (`--json`) for agent consumption
- **Free features included**: bijian ASR (Chinese/English), Bing/Google translation

## Pipeline

```
Audio/Video → ASR Transcription → Subtitle Splitting → LLM Optimization → Translation → Video Synthesis
              (bijian/whisper)        (semantic)         (fix errors)    (38 languages) (styled subtitles)
```

## Install

```bash
pip install videocaptioner click prompt-toolkit
```

## Run

```bash
# One-shot: transcribe a Chinese video and add English subtitles
cli-anything-videocaptioner process video.mp4 --asr bijian --translator bing --target-language en --subtitle-mode hard

# Transcribe only
cli-anything-videocaptioner transcribe video.mp4 --asr bijian -o output.srt

# Translate existing subtitles
cli-anything-videocaptioner subtitle input.srt --translator google --target-language ja

# Burn subtitles with anime style
cli-anything-videocaptioner synthesize video.mp4 -s sub.srt --subtitle-mode hard --style anime

# Custom style (red outline, large font)
cli-anything-videocaptioner synthesize video.mp4 -s sub.srt --subtitle-mode hard \
  --style-override '{"outline_color": "#ff0000", "font_size": 48}'

# JSON output mode (for agent consumption)
cli-anything-videocaptioner --json transcribe video.mp4 --asr bijian

# Interactive REPL
cli-anything-videocaptioner
```

## Subtitle Styles

Two rendering modes for beautiful subtitles:

**ASS mode** — traditional outline/shadow:
- Presets: `default` (white+black), `anime` (warm+orange), `vertical` (portrait videos)

**Rounded mode** — modern rounded background boxes:
- Preset: `rounded` (dark text on semi-transparent background)

Fully customizable via `--style-override` with inline JSON.

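When the override is built from a program rather than typed in a shell, serializing a dictionary avoids hand-written quoting mistakes. The sketch below assumes only the `--style-override` flag and the two field names shown in the examples in this README; `style_override_args` is a hypothetical helper, not part of the harness.

```python
import json


def style_override_args(**fields) -> list:
    """Build the `--style-override` flag followed by its JSON value."""
    return ["--style-override", json.dumps(fields)]


# Argument list for a hard-subtitle render with a red outline and large font.
args = ["synthesize", "video.mp4", "-s", "sub.srt", "--subtitle-mode", "hard"]
args += style_override_args(outline_color="#ff0000", font_size=48)
```

Because the JSON travels as a single list element, it reaches the wrapped CLI intact when the list is passed to `subprocess.run` without a shell.
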
## Coverage

| Feature | Capabilities |
|---------|--------------|
| Transcription | 4 ASR engines, auto language detection, word timestamps |
| Subtitle Processing | Split + optimize + translate, 3 translators, 38 languages |
| Video Synthesis | Soft/hard subtitles, 4 quality levels, 5 style presets |
| Styles | ASS outline + rounded background, inline JSON customization |
| Utilities | Config management, style listing, video download |
@@ -0,0 +1,91 @@
"""Full pipeline — transcribe → optimize → translate → synthesize in one command."""

from cli_anything.videocaptioner.utils.vc_backend import run_quiet


def process(
    input_path: str,
    output_path: str | None = None,
    asr: str = "bijian",
    language: str = "auto",
    translator: str | None = None,
    target_language: str | None = None,
    subtitle_mode: str = "soft",
    quality: str = "medium",
    layout: str | None = None,
    style: str | None = None,
    style_override: str | None = None,
    render_mode: str | None = None,
    no_optimize: bool = False,
    no_translate: bool = False,
    no_split: bool = False,
    no_synthesize: bool = False,
    reflect: bool = False,
    prompt: str | None = None,
    api_key: str | None = None,
    api_base: str | None = None,
    model: str | None = None,
) -> str:
    """Run the complete captioning pipeline.

    Args:
        input_path: Video or audio file path.
        output_path: Output file or directory path.
        asr: ASR engine.
        language: Source language.
        translator: Translation service.
        target_language: Target language.
        subtitle_mode: soft or hard.
        quality: Video quality.
        layout: Bilingual layout.
        style: Style preset name.
        style_override: Inline JSON style override.
        render_mode: ass or rounded.
        no_optimize: Skip optimization.
        no_translate: Skip translation.
        no_split: Skip re-segmentation.
        no_synthesize: Skip video synthesis.
        reflect: Reflective translation.
        prompt: Custom LLM prompt.
        api_key: LLM API key.
        api_base: LLM API base URL.
        model: LLM model name.

    Returns:
        Output file path.
    """
    args = ["process", input_path, "--asr", asr, "--language", language,
            "--subtitle-mode", subtitle_mode, "--quality", quality]
    if output_path:
        args += ["-o", output_path]
    if translator:
        args += ["--translator", translator]
    if target_language:
        args += ["--target-language", target_language]
    if layout:
        args += ["--layout", layout]
    if style:
        args += ["--style", style]
    if style_override:
        args += ["--style-override", style_override]
    if render_mode:
        args += ["--render-mode", render_mode]
    if no_optimize:
        args.append("--no-optimize")
    if no_translate:
        args.append("--no-translate")
    if no_split:
        args.append("--no-split")
    if no_synthesize:
        args.append("--no-synthesize")
    if reflect:
        args.append("--reflect")
    if prompt:
        args += ["--prompt", prompt]
    if api_key:
        args += ["--api-key", api_key]
    if api_base:
        args += ["--api-base", api_base]
    if model:
        args += ["--model", model]
    return run_quiet(args)
@@ -0,0 +1,68 @@
"""Subtitle processing — optimize and translate subtitle files."""

from cli_anything.videocaptioner.utils.vc_backend import run_quiet


def process_subtitle(
    input_path: str,
    output_path: str | None = None,
    translator: str | None = None,
    target_language: str | None = None,
    format: str = "srt",
    layout: str | None = None,
    no_optimize: bool = False,
    no_translate: bool = False,
    no_split: bool = False,
    reflect: bool = False,
    prompt: str | None = None,
    api_key: str | None = None,
    api_base: str | None = None,
    model: str | None = None,
) -> str:
    """Optimize and/or translate a subtitle file.

    Args:
        input_path: Subtitle file (.srt, .ass, .vtt).
        output_path: Output file or directory path.
        translator: Translation service (llm, bing, google).
        target_language: Target language BCP 47 code.
        format: Output format (srt, ass, txt, json).
        layout: Bilingual layout (target-above, source-above, target-only, source-only).
        no_optimize: Skip LLM optimization.
        no_translate: Skip translation.
        no_split: Skip re-segmentation.
        reflect: Enable reflective translation (LLM only).
        prompt: Custom LLM prompt.
        api_key: LLM API key.
        api_base: LLM API base URL.
        model: LLM model name.

    Returns:
        Output file path.
    """
    args = ["subtitle", input_path, "--format", format]
    if output_path:
        args += ["-o", output_path]
    if translator:
        args += ["--translator", translator]
    if target_language:
        args += ["--target-language", target_language]
    if layout:
        args += ["--layout", layout]
    if no_optimize:
        args.append("--no-optimize")
    if no_translate:
        args.append("--no-translate")
    if no_split:
        args.append("--no-split")
    if reflect:
        args.append("--reflect")
    if prompt:
        args += ["--prompt", prompt]
    if api_key:
        args += ["--api-key", api_key]
    if api_base:
        args += ["--api-base", api_base]
    if model:
        args += ["--model", model]
    return run_quiet(args)
@@ -0,0 +1,49 @@
"""Video synthesis — burn subtitles into video with customizable styles."""

from cli_anything.videocaptioner.utils.vc_backend import run_quiet


def synthesize(
    video_path: str,
    subtitle_path: str,
    output_path: str | None = None,
    subtitle_mode: str = "soft",
    quality: str = "medium",
    layout: str | None = None,
    render_mode: str | None = None,
    style: str | None = None,
    style_override: str | None = None,
    font_file: str | None = None,
) -> str:
    """Burn subtitles into a video file.

    Args:
        video_path: Input video file.
        subtitle_path: Subtitle file (.srt, .ass).
        output_path: Output video file path.
        subtitle_mode: 'soft' (embedded track) or 'hard' (burned in).
        quality: Video quality (ultra, high, medium, low).
        layout: Bilingual layout.
        render_mode: 'ass' (outline/shadow) or 'rounded' (background boxes).
        style: Style preset name (default, anime, vertical, rounded).
        style_override: Inline JSON to override style fields.
        font_file: Custom font file path (.ttf/.otf).

    Returns:
        Output video file path.
    """
    args = ["synthesize", video_path, "-s", subtitle_path,
            "--subtitle-mode", subtitle_mode, "--quality", quality]
    if output_path:
        args += ["-o", output_path]
    if layout:
        args += ["--layout", layout]
    if render_mode:
        args += ["--render-mode", render_mode]
    if style:
        args += ["--style", style]
    if style_override:
        args += ["--style-override", style_override]
    if font_file:
        args += ["--font-file", font_file]
    return run_quiet(args)
@@ -0,0 +1,44 @@
"""Transcription — speech to subtitles via ASR engines."""

from cli_anything.videocaptioner.utils.vc_backend import run_quiet


def transcribe(
    input_path: str,
    output_path: str | None = None,
    asr: str = "bijian",
    language: str = "auto",
    format: str = "srt",
    word_timestamps: bool = False,
    whisper_api_key: str | None = None,
    whisper_api_base: str | None = None,
    whisper_model: str | None = None,
) -> str:
    """Transcribe audio/video to subtitles.

    Args:
        input_path: Audio or video file path.
        output_path: Output file or directory path.
        asr: ASR engine (bijian, jianying, whisper-api, whisper-cpp).
        language: Source language ISO 639-1 code, or 'auto'.
        format: Output format (srt, ass, txt, json).
        word_timestamps: Include word-level timestamps.
        whisper_api_key: Whisper API key (for whisper-api engine).
        whisper_api_base: Whisper API base URL.
        whisper_model: Whisper model name.

    Returns:
        Output file path.
    """
    args = ["transcribe", input_path, "--asr", asr, "--language", language, "--format", format]
    if output_path:
        args += ["-o", output_path]
    if word_timestamps:
        args.append("--word-timestamps")
    if whisper_api_key:
        args += ["--whisper-api-key", whisper_api_key]
    if whisper_api_base:
        args += ["--whisper-api-base", whisper_api_base]
    if whisper_model:
        args += ["--whisper-model", whisper_model]
    return run_quiet(args)
@@ -0,0 +1,123 @@
---
name: >-
  cli-anything-videocaptioner
description: >-
  AI-powered video captioning — transcribe speech, optimize/translate subtitles, burn into video with beautiful customizable styles (ASS outline or rounded background). Free ASR and translation included.
---

# cli-anything-videocaptioner

AI-powered video captioning tool. Transcribe speech → optimize subtitles → translate → burn into video with beautiful styles.

## Installation

```bash
pip install cli-anything-videocaptioner
```

**Prerequisites:**
- Python 3.10+
- `videocaptioner` must be installed (`pip install videocaptioner`)
- FFmpeg required for video synthesis

## Usage

### Basic Commands

```bash
# Show help
cli-anything-videocaptioner --help

# Start interactive REPL mode
cli-anything-videocaptioner

# Transcribe a video (free, no setup)
cli-anything-videocaptioner transcribe video.mp4 --asr bijian

# Translate subtitles (free Bing translator)
cli-anything-videocaptioner subtitle input.srt --translator bing --target-language en

# Full pipeline: transcribe → translate → burn subtitles
cli-anything-videocaptioner process video.mp4 --asr bijian --translator bing --target-language en --subtitle-mode hard

# JSON output (for agent consumption)
cli-anything-videocaptioner --json transcribe video.mp4 --asr bijian
```

### REPL Mode

When invoked without a subcommand, the CLI enters an interactive REPL session:

```bash
cli-anything-videocaptioner
# Enter commands interactively with tab-completion and history
```

## Command Groups

### transcribe — Speech to subtitles
```
transcribe <input> [--asr bijian|jianying|whisper-api|whisper-cpp] [--language CODE] [--format srt|ass|txt|json] [-o PATH]
```
- `bijian` (default): Free, Chinese & English, no setup
- `whisper-api`: All languages, requires `--whisper-api-key`

### subtitle — Optimize and translate
```
subtitle <input.srt> [--translator llm|bing|google] [--target-language CODE] [--layout target-above|source-above|target-only|source-only] [--no-optimize] [--no-translate] [-o PATH]
```
- Three steps: Split → Optimize → Translate
- Bing/Google translators are free
- 38 target languages supported (BCP 47 codes)

### synthesize — Burn subtitles into video
```
synthesize <video> -s <subtitle> [--subtitle-mode soft|hard] [--quality ultra|high|medium|low] [--style NAME] [--style-override JSON] [--render-mode ass|rounded] [--font-file PATH] [-o PATH]
```
- **ASS mode**: Outline/shadow style with presets (default, anime, vertical)
- **Rounded mode**: Modern rounded background boxes
- Customizable via `--style-override '{"outline_color": "#ff0000"}'`

### process — Full pipeline
```
process <input> [--asr ...] [--translator ...] [--target-language ...] [--subtitle-mode ...] [--style ...] [--no-optimize] [--no-translate] [--no-synthesize] [-o PATH]
```

### styles — List style presets
```
styles
```

### config — Manage settings
```
config show
config set <key> <value>
```

### download — Download online video
```
download <URL> [-o DIR]
```

## JSON Output

All commands support `--json` for machine-readable output:
```bash
cli-anything-videocaptioner --json transcribe video.mp4 --asr bijian
# {"output_path": "/path/to/output.srt"}
```

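An agent consuming this output would typically parse it rather than scrape text. The sketch below assumes the response is a JSON object with an `output_path` field, as in the example comment above; other commands may return different fields, and `parse_output_path` is a hypothetical helper name.

```python
import json


def parse_output_path(stdout: str) -> str:
    """Extract `output_path` from the CLI's JSON output."""
    payload = json.loads(stdout)
    if not isinstance(payload, dict) or "output_path" not in payload:
        raise ValueError(f"unexpected JSON output: {stdout!r}")
    return payload["output_path"]
```

Checking the shape explicitly turns a surprising response into a clear error instead of a `KeyError` deep inside the caller.
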
## Style Presets

| Name | Mode | Description |
|------|------|-------------|
| `default` | ASS | White text, black outline — clean and universal |
| `anime` | ASS | Warm white, orange outline — anime/cartoon style |
| `vertical` | ASS | High bottom margin — for portrait/vertical videos |
| `rounded` | Rounded | Dark text on semi-transparent rounded background |

Customize any field: `--style-override '{"font_size": 48, "outline_color": "#ff0000"}'`

## Target Languages

BCP 47 codes: `zh-Hans` `zh-Hant` `en` `ja` `ko` `fr` `de` `es` `ru` `pt` `it` `ar` `th` `vi` `id` and 23 more.
@@ -0,0 +1,110 @@
"""Unit tests for VideoCaptioner CLI harness core modules."""

import pytest
from unittest.mock import patch, MagicMock


class TestTranscribe:
    @patch("cli_anything.videocaptioner.core.transcribe.run_quiet", return_value="/tmp/o.srt")
    def test_basic(self, mock_run):
        from cli_anything.videocaptioner.core.transcribe import transcribe
        assert transcribe("video.mp4") == "/tmp/o.srt"
        assert "transcribe" in mock_run.call_args[0][0]
        assert "bijian" in mock_run.call_args[0][0]

    @patch("cli_anything.videocaptioner.core.transcribe.run_quiet", return_value="/tmp/o.json")
    def test_options(self, mock_run):
        from cli_anything.videocaptioner.core.transcribe import transcribe
        transcribe("v.mp4", asr="whisper-api", language="fr", format="json",
                   output_path="/tmp/o.json", whisper_api_key="sk-xxx")
        a = mock_run.call_args[0][0]
        assert "whisper-api" in a and "fr" in a and "json" in a and "sk-xxx" in a

    @patch("cli_anything.videocaptioner.core.transcribe.run_quiet", return_value="/tmp/o.srt")
    def test_word_timestamps(self, mock_run):
        from cli_anything.videocaptioner.core.transcribe import transcribe
        transcribe("v.mp4", word_timestamps=True)
        assert "--word-timestamps" in mock_run.call_args[0][0]


class TestSubtitle:
    @patch("cli_anything.videocaptioner.core.subtitle.run_quiet", return_value="/tmp/o.srt")
    def test_translate(self, mock_run):
        from cli_anything.videocaptioner.core.subtitle import process_subtitle
        process_subtitle("in.srt", translator="bing", target_language="en")
        a = mock_run.call_args[0][0]
        assert "bing" in a and "en" in a

    @patch("cli_anything.videocaptioner.core.subtitle.run_quiet", return_value="/tmp/o.srt")
    def test_skip(self, mock_run):
        from cli_anything.videocaptioner.core.subtitle import process_subtitle
        process_subtitle("in.srt", no_optimize=True, no_translate=True)
        a = mock_run.call_args[0][0]
        assert "--no-optimize" in a and "--no-translate" in a

    @patch("cli_anything.videocaptioner.core.subtitle.run_quiet", return_value="/tmp/o.srt")
    def test_llm(self, mock_run):
        from cli_anything.videocaptioner.core.subtitle import process_subtitle
        process_subtitle("in.srt", translator="llm", target_language="ja",
                         reflect=True, api_key="sk-xxx", layout="target-above")
        a = mock_run.call_args[0][0]
        assert "--reflect" in a and "sk-xxx" in a and "target-above" in a


class TestSynthesize:
    @patch("cli_anything.videocaptioner.core.synthesize.run_quiet", return_value="/tmp/o.mp4")
    def test_soft(self, mock_run):
        from cli_anything.videocaptioner.core.synthesize import synthesize
        synthesize("v.mp4", "s.srt")
        assert "soft" in mock_run.call_args[0][0]

    @patch("cli_anything.videocaptioner.core.synthesize.run_quiet", return_value="/tmp/o.mp4")
    def test_hard_style(self, mock_run):
        from cli_anything.videocaptioner.core.synthesize import synthesize
        synthesize("v.mp4", "s.srt", subtitle_mode="hard", style="anime", quality="high")
        a = mock_run.call_args[0][0]
        assert "hard" in a and "anime" in a and "high" in a

    @patch("cli_anything.videocaptioner.core.synthesize.run_quiet", return_value="/tmp/o.mp4")
    def test_rounded(self, mock_run):
        from cli_anything.videocaptioner.core.synthesize import synthesize
        synthesize("v.mp4", "s.srt", subtitle_mode="hard", render_mode="rounded",
                   style_override='{"bg_color":"#000000cc"}')
        a = mock_run.call_args[0][0]
        assert "rounded" in a and "#000000cc" in str(a)


class TestPipeline:
    @patch("cli_anything.videocaptioner.core.pipeline.run_quiet", return_value="/tmp/o.mp4")
    def test_full(self, mock_run):
        from cli_anything.videocaptioner.core.pipeline import process
        process("v.mp4", translator="bing", target_language="en", style="anime")
        a = mock_run.call_args[0][0]
        assert "process" in a and "bing" in a and "anime" in a

    @patch("cli_anything.videocaptioner.core.pipeline.run_quiet", return_value="/tmp/o.srt")
    def test_no_synth(self, mock_run):
        from cli_anything.videocaptioner.core.pipeline import process
        process("v.mp4", no_synthesize=True)
        assert "--no-synthesize" in mock_run.call_args[0][0]


class TestBackend:
    @patch("subprocess.run")
    def test_success(self, mock_sub):
        mock_sub.return_value = MagicMock(returncode=0, stdout="/tmp/o.srt\n", stderr="")
        from cli_anything.videocaptioner.utils.vc_backend import run_quiet
        assert run_quiet(["transcribe", "v.mp4"]) == "/tmp/o.srt"

    @patch("subprocess.run")
    def test_failure(self, mock_sub):
        mock_sub.return_value = MagicMock(returncode=5, stdout="", stderr="Error: fail")
        from cli_anything.videocaptioner.utils.vc_backend import run_quiet
        with pytest.raises(RuntimeError, match="fail"):
            run_quiet(["transcribe", "x.mp4"])

    @patch("shutil.which", return_value=None)
    def test_not_installed(self, _):
        from cli_anything.videocaptioner.utils.vc_backend import _find_vc
        with pytest.raises(RuntimeError, match="not found"):
            _find_vc()
@@ -0,0 +1,104 @@
"""End-to-end tests for VideoCaptioner CLI harness.

These tests require videocaptioner to be installed.
Skip with: pytest -m "not e2e"
"""

import pytest
import subprocess
import shutil

# Skip all tests if videocaptioner is not installed
pytestmark = pytest.mark.skipif(
    shutil.which("videocaptioner") is None,
    reason="videocaptioner not installed"
)


class TestCLIEntryPoint:
    def test_help(self):
        result = subprocess.run(
            ["videocaptioner", "--help"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode == 0
        assert "transcribe" in result.stdout
        assert "subtitle" in result.stdout
        assert "synthesize" in result.stdout

    def test_version(self):
        result = subprocess.run(
            ["videocaptioner", "--version"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode == 0
        assert "videocaptioner" in result.stdout

    def test_style_list(self):
        result = subprocess.run(
            ["videocaptioner", "style"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode == 0
        assert "default" in result.stdout
        assert "anime" in result.stdout or "rounded" in result.stdout

    def test_config_show(self):
        result = subprocess.run(
            ["videocaptioner", "config", "show"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode == 0

    def test_transcribe_missing_file(self):
        result = subprocess.run(
            ["videocaptioner", "transcribe", "nonexistent.mp4", "--asr", "bijian"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode == 3  # FILE_NOT_FOUND

    def test_subtitle_missing_file(self):
        result = subprocess.run(
            ["videocaptioner", "subtitle", "nonexistent.srt"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode == 3

    def test_synthesize_missing_args(self):
        result = subprocess.run(
            ["videocaptioner", "synthesize", "video.mp4"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode == 2  # USAGE_ERROR (missing -s)

    def test_invalid_asr_engine(self):
        result = subprocess.run(
            ["videocaptioner", "transcribe", "video.mp4", "--asr", "invalid"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode == 2

    def test_invalid_target_language(self):
        result = subprocess.run(
            ["videocaptioner", "subtitle", "test.srt", "--translator", "bing",
             "--target-language", "invalid-lang"],
            capture_output=True, text=True, timeout=10,
        )
        assert result.returncode != 0


class TestBackendIntegration:
    def test_get_version(self):
        from cli_anything.videocaptioner.utils.vc_backend import get_version
        version = get_version()
        assert "videocaptioner" in version.lower()

    def test_get_config(self):
        from cli_anything.videocaptioner.utils.vc_backend import get_config
        config = get_config()
        assert "llm" in config or "transcribe" in config

    def test_get_styles(self):
        from cli_anything.videocaptioner.utils.vc_backend import get_styles
        styles = get_styles()
        assert "default" in styles
@@ -0,0 +1,500 @@
"""cli-anything REPL Skin — Unified terminal interface for all CLI harnesses.

Copy this file into your CLI package at:
    cli_anything/<software>/utils/repl_skin.py

Usage:
    from cli_anything.<software>.utils.repl_skin import ReplSkin

    skin = ReplSkin("ollama", version="1.0.0")
    skin.print_banner()
    prompt_text = skin.prompt(project_name="llama3.2", modified=False)
    skin.success("Model pulled")
    skin.error("Connection failed")
    skin.warning("No models loaded")
    skin.info("Generating...")
    skin.status("Model", "llama3.2:latest")
    skin.table(headers, rows)
    skin.print_goodbye()
"""

import os
import sys

# ── ANSI color codes (no external deps for core styling) ──────────────

_RESET = "\033[0m"
_BOLD = "\033[1m"
_DIM = "\033[2m"
_ITALIC = "\033[3m"
_UNDERLINE = "\033[4m"

# Brand colors
_CYAN = "\033[38;5;80m"  # cli-anything brand cyan
_CYAN_BG = "\033[48;5;80m"
_WHITE = "\033[97m"
_GRAY = "\033[38;5;245m"
_DARK_GRAY = "\033[38;5;240m"
_LIGHT_GRAY = "\033[38;5;250m"

# Software accent colors — each software gets a unique accent
_ACCENT_COLORS = {
    "gimp": "\033[38;5;214m",        # warm orange
    "blender": "\033[38;5;208m",     # deep orange
    "inkscape": "\033[38;5;39m",     # bright blue
    "audacity": "\033[38;5;33m",     # navy blue
    "libreoffice": "\033[38;5;40m",  # green
    "obs_studio": "\033[38;5;55m",   # purple
    "kdenlive": "\033[38;5;69m",     # slate blue
    "shotcut": "\033[38;5;35m",      # teal green
    "ollama": "\033[38;5;255m",      # white (Ollama branding)
}
_DEFAULT_ACCENT = "\033[38;5;75m"  # default sky blue

# Status colors
_GREEN = "\033[38;5;78m"
_YELLOW = "\033[38;5;220m"
_RED = "\033[38;5;196m"
_BLUE = "\033[38;5;75m"
_MAGENTA = "\033[38;5;176m"

# ── Brand icon ────────────────────────────────────────────────────────

# The cli-anything icon: a small colored diamond/chevron mark
_ICON = f"{_CYAN}{_BOLD}◆{_RESET}"
_ICON_SMALL = f"{_CYAN}▸{_RESET}"

# ── Box drawing characters ────────────────────────────────────────────

_H_LINE = "─"
_V_LINE = "│"
_TL = "╭"
_TR = "╮"
_BL = "╰"
_BR = "╯"
_T_DOWN = "┬"
_T_UP = "┴"
_T_RIGHT = "├"
_T_LEFT = "┤"
_CROSS = "┼"


def _strip_ansi(text: str) -> str:
    """Remove ANSI escape codes for length calculation."""
    import re
    return re.sub(r"\033\[[^m]*m", "", text)


def _visible_len(text: str) -> int:
    """Get visible length of text (excluding ANSI codes)."""
    return len(_strip_ansi(text))


class ReplSkin:
    """Unified REPL skin for cli-anything CLIs.

    Provides consistent branding, prompts, and message formatting
    across all CLI harnesses built with the cli-anything methodology.
    """

    def __init__(self, software: str, version: str = "1.0.0",
                 history_file: str | None = None):
        """Initialize the REPL skin.

        Args:
            software: Software name (e.g., "gimp", "shotcut", "ollama").
            version: CLI version string.
            history_file: Path for persistent command history.
                Defaults to ~/.cli-anything-<software>/history
        """
        self.software = software.lower().replace("-", "_")
        self.display_name = software.replace("_", " ").title()
        self.version = version
        self.accent = _ACCENT_COLORS.get(self.software, _DEFAULT_ACCENT)

        # History file
        if history_file is None:
|
||||
from pathlib import Path
|
||||
hist_dir = Path.home() / f".cli-anything-{self.software}"
|
||||
hist_dir.mkdir(parents=True, exist_ok=True)
|
||||
self.history_file = str(hist_dir / "history")
|
||||
else:
|
||||
self.history_file = history_file
|
||||
|
||||
# Detect terminal capabilities
|
||||
self._color = self._detect_color_support()
|
||||
|
||||
def _detect_color_support(self) -> bool:
|
||||
"""Check if terminal supports color."""
|
||||
if os.environ.get("NO_COLOR"):
|
||||
return False
|
||||
if os.environ.get("CLI_ANYTHING_NO_COLOR"):
|
||||
return False
|
||||
if not hasattr(sys.stdout, "isatty"):
|
||||
return False
|
||||
return sys.stdout.isatty()
|
||||
|
||||
def _c(self, code: str, text: str) -> str:
|
||||
"""Apply color code if colors are supported."""
|
||||
if not self._color:
|
||||
return text
|
||||
return f"{code}{text}{_RESET}"
|
||||
|
||||
# ── Banner ────────────────────────────────────────────────────────
|
||||
|
||||
def print_banner(self):
|
||||
"""Print the startup banner with branding."""
|
||||
inner = 54
|
||||
|
||||
def _box_line(content: str) -> str:
|
||||
"""Wrap content in box drawing, padding to inner width."""
|
||||
pad = inner - _visible_len(content)
|
||||
vl = self._c(_DARK_GRAY, _V_LINE)
|
||||
return f"{vl}{content}{' ' * max(0, pad)}{vl}"
|
||||
|
||||
top = self._c(_DARK_GRAY, f"{_TL}{_H_LINE * inner}{_TR}")
|
||||
bot = self._c(_DARK_GRAY, f"{_BL}{_H_LINE * inner}{_BR}")
|
||||
|
||||
# Title: ◆ cli-anything · Ollama
|
||||
icon = self._c(_CYAN + _BOLD, "◆")
|
||||
brand = self._c(_CYAN + _BOLD, "cli-anything")
|
||||
dot = self._c(_DARK_GRAY, "·")
|
||||
name = self._c(self.accent + _BOLD, self.display_name)
|
||||
title = f" {icon} {brand} {dot} {name}"
|
||||
|
||||
ver = f" {self._c(_DARK_GRAY, f' v{self.version}')}"
|
||||
tip = f" {self._c(_DARK_GRAY, ' Type help for commands, quit to exit')}"
|
||||
empty = ""
|
||||
|
||||
print(top)
|
||||
print(_box_line(title))
|
||||
print(_box_line(ver))
|
||||
print(_box_line(empty))
|
||||
print(_box_line(tip))
|
||||
print(bot)
|
||||
print()
|
||||
|
||||
# ── Prompt ────────────────────────────────────────────────────────
|
||||
|
||||
def prompt(self, project_name: str = "", modified: bool = False,
|
||||
context: str = "") -> str:
|
||||
"""Build a styled prompt string for prompt_toolkit or input().
|
||||
|
||||
Args:
|
||||
project_name: Current project name (empty if none open).
|
||||
modified: Whether the project has unsaved changes.
|
||||
context: Optional extra context to show in prompt.
|
||||
|
||||
Returns:
|
||||
Formatted prompt string.
|
||||
"""
|
||||
parts = []
|
||||
|
||||
# Icon
|
||||
if self._color:
|
||||
parts.append(f"{_CYAN}◆{_RESET} ")
|
||||
else:
|
||||
parts.append("> ")
|
||||
|
||||
# Software name
|
||||
parts.append(self._c(self.accent + _BOLD, self.software))
|
||||
|
||||
# Project context
|
||||
if project_name or context:
|
||||
ctx = context or project_name
|
||||
mod = "*" if modified else ""
|
||||
parts.append(f" {self._c(_DARK_GRAY, '[')}")
|
||||
parts.append(self._c(_LIGHT_GRAY, f"{ctx}{mod}"))
|
||||
parts.append(self._c(_DARK_GRAY, ']'))
|
||||
|
||||
parts.append(self._c(_GRAY, " ❯ "))
|
||||
|
||||
return "".join(parts)
|
||||
|
||||
def prompt_tokens(self, project_name: str = "", modified: bool = False,
|
||||
context: str = ""):
|
||||
"""Build prompt_toolkit formatted text tokens for the prompt.
|
||||
|
||||
Use with prompt_toolkit's FormattedText for proper ANSI handling.
|
||||
|
||||
Returns:
|
||||
list of (style, text) tuples for prompt_toolkit.
|
||||
"""
|
||||
accent_hex = _ANSI_256_TO_HEX.get(self.accent, "#5fafff")
|
||||
tokens = []
|
||||
|
||||
tokens.append(("class:icon", "◆ "))
|
||||
tokens.append(("class:software", self.software))
|
||||
|
||||
if project_name or context:
|
||||
ctx = context or project_name
|
||||
mod = "*" if modified else ""
|
||||
tokens.append(("class:bracket", " ["))
|
||||
tokens.append(("class:context", f"{ctx}{mod}"))
|
||||
tokens.append(("class:bracket", "]"))
|
||||
|
||||
tokens.append(("class:arrow", " ❯ "))
|
||||
|
||||
return tokens
|
||||
|
||||
def get_prompt_style(self):
|
||||
"""Get a prompt_toolkit Style object matching the skin.
|
||||
|
||||
Returns:
|
||||
prompt_toolkit.styles.Style
|
||||
"""
|
||||
try:
|
||||
from prompt_toolkit.styles import Style
|
||||
except ImportError:
|
||||
return None
|
||||
|
||||
accent_hex = _ANSI_256_TO_HEX.get(self.accent, "#5fafff")
|
||||
|
||||
return Style.from_dict({
|
||||
"icon": "#5fd7d7 bold", # cyan brand color (ANSI 256-color 80)
|
||||
"software": f"{accent_hex} bold",
|
||||
"bracket": "#585858",
|
||||
"context": "#bcbcbc",
|
||||
"arrow": "#808080",
|
||||
# Completion menu
|
||||
"completion-menu.completion": "bg:#303030 #bcbcbc",
|
||||
"completion-menu.completion.current": f"bg:{accent_hex} #000000",
|
||||
"completion-menu.meta.completion": "bg:#303030 #808080",
|
||||
"completion-menu.meta.completion.current": f"bg:{accent_hex} #000000",
|
||||
# Auto-suggest
|
||||
"auto-suggest": "#585858",
|
||||
# Bottom toolbar
|
||||
"bottom-toolbar": "bg:#1c1c1c #808080",
|
||||
"bottom-toolbar.text": "#808080",
|
||||
})
|
||||
|
||||
# ── Messages ──────────────────────────────────────────────────────
|
||||
|
||||
def success(self, message: str):
|
||||
"""Print a success message with green checkmark."""
|
||||
icon = self._c(_GREEN + _BOLD, "✓")
|
||||
print(f" {icon} {self._c(_GREEN, message)}")
|
||||
|
||||
def error(self, message: str):
|
||||
"""Print an error message with red cross."""
|
||||
icon = self._c(_RED + _BOLD, "✗")
|
||||
print(f" {icon} {self._c(_RED, message)}", file=sys.stderr)
|
||||
|
||||
def warning(self, message: str):
|
||||
"""Print a warning message with yellow triangle."""
|
||||
icon = self._c(_YELLOW + _BOLD, "⚠")
|
||||
print(f" {icon} {self._c(_YELLOW, message)}")
|
||||
|
||||
def info(self, message: str):
|
||||
"""Print an info message with blue dot."""
|
||||
icon = self._c(_BLUE, "●")
|
||||
print(f" {icon} {self._c(_LIGHT_GRAY, message)}")
|
||||
|
||||
def hint(self, message: str):
|
||||
"""Print a subtle hint message."""
|
||||
print(f" {self._c(_DARK_GRAY, message)}")
|
||||
|
||||
def section(self, title: str):
|
||||
"""Print a section header."""
|
||||
print()
|
||||
print(f" {self._c(self.accent + _BOLD, title)}")
|
||||
print(f" {self._c(_DARK_GRAY, _H_LINE * len(title))}")
|
||||
|
||||
# ── Status display ────────────────────────────────────────────────
|
||||
|
||||
def status(self, label: str, value: str):
|
||||
"""Print a key-value status line."""
|
||||
lbl = self._c(_GRAY, f" {label}:")
|
||||
val = self._c(_WHITE, f" {value}")
|
||||
print(f"{lbl}{val}")
|
||||
|
||||
def status_block(self, items: dict[str, str], title: str = ""):
|
||||
"""Print a block of status key-value pairs.
|
||||
|
||||
Args:
|
||||
items: Dict of label -> value pairs.
|
||||
title: Optional title for the block.
|
||||
"""
|
||||
if title:
|
||||
self.section(title)
|
||||
|
||||
max_key = max(len(k) for k in items) if items else 0
|
||||
for label, value in items.items():
|
||||
lbl = self._c(_GRAY, f" {label:<{max_key}}")
|
||||
val = self._c(_WHITE, f" {value}")
|
||||
print(f"{lbl}{val}")
|
||||
|
||||
def progress(self, current: int, total: int, label: str = ""):
|
||||
"""Print a simple progress indicator.
|
||||
|
||||
Args:
|
||||
current: Current step number.
|
||||
total: Total number of steps.
|
||||
label: Optional label for the progress.
|
||||
"""
|
||||
pct = int(current / total * 100) if total > 0 else 0
|
||||
bar_width = 20
|
||||
filled = min(bar_width, max(0, int(bar_width * current / total))) if total > 0 else 0
|
||||
bar = "█" * filled + "░" * (bar_width - filled)
|
||||
text = f" {self._c(_CYAN, bar)} {self._c(_GRAY, f'{pct:3d}%')}"
|
||||
if label:
|
||||
text += f" {self._c(_LIGHT_GRAY, label)}"
|
||||
print(text)
|
||||
|
||||
# ── Table display ─────────────────────────────────────────────────
|
||||
|
||||
def table(self, headers: list[str], rows: list[list[str]],
|
||||
max_col_width: int = 40):
|
||||
"""Print a formatted table with box-drawing characters.
|
||||
|
||||
Args:
|
||||
headers: Column header strings.
|
||||
rows: List of rows, each a list of cell strings.
|
||||
max_col_width: Maximum column width before truncation.
|
||||
"""
|
||||
if not headers:
|
||||
return
|
||||
|
||||
# Calculate column widths
|
||||
col_widths = [min(len(h), max_col_width) for h in headers]
|
||||
for row in rows:
|
||||
for i, cell in enumerate(row):
|
||||
if i < len(col_widths):
|
||||
col_widths[i] = min(
|
||||
max(col_widths[i], len(str(cell))), max_col_width
|
||||
)
|
||||
|
||||
def pad(text: str, width: int) -> str:
|
||||
t = str(text)[:width]
|
||||
return t + " " * (width - len(t))
|
||||
|
||||
# Header
|
||||
header_cells = [
|
||||
self._c(_CYAN + _BOLD, pad(h, col_widths[i]))
|
||||
for i, h in enumerate(headers)
|
||||
]
|
||||
sep = self._c(_DARK_GRAY, f" {_V_LINE} ")
|
||||
header_line = f" {sep.join(header_cells)}"
|
||||
print(header_line)
|
||||
|
||||
# Separator
|
||||
sep_parts = [self._c(_DARK_GRAY, _H_LINE * w) for w in col_widths]
|
||||
sep_line = self._c(_DARK_GRAY, f" {'───'.join([_H_LINE * w for w in col_widths])}")
|
||||
print(sep_line)
|
||||
|
||||
# Rows
|
||||
for row in rows:
|
||||
cells = []
|
||||
for i, cell in enumerate(row):
|
||||
if i < len(col_widths):
|
||||
cells.append(self._c(_LIGHT_GRAY, pad(str(cell), col_widths[i])))
|
||||
row_sep = self._c(_DARK_GRAY, f" {_V_LINE} ")
|
||||
print(f" {row_sep.join(cells)}")
|
||||
|
||||
# ── Help display ──────────────────────────────────────────────────
|
||||
|
||||
def help(self, commands: dict[str, str]):
|
||||
"""Print a formatted help listing.
|
||||
|
||||
Args:
|
||||
commands: Dict of command -> description pairs.
|
||||
"""
|
||||
self.section("Commands")
|
||||
max_cmd = max(len(c) for c in commands) if commands else 0
|
||||
for cmd, desc in commands.items():
|
||||
cmd_styled = self._c(self.accent, f" {cmd:<{max_cmd}}")
|
||||
desc_styled = self._c(_GRAY, f" {desc}")
|
||||
print(f"{cmd_styled}{desc_styled}")
|
||||
print()
|
||||
|
||||
# ── Goodbye ───────────────────────────────────────────────────────
|
||||
|
||||
def print_goodbye(self):
|
||||
"""Print a styled goodbye message."""
|
||||
print(f"\n {self._c(_CYAN, '▸')} {self._c(_GRAY, 'Goodbye!')}\n")
|
||||
|
||||
# ── Prompt toolkit session factory ────────────────────────────────
|
||||
|
||||
def create_prompt_session(self):
|
||||
"""Create a prompt_toolkit PromptSession with skin styling.
|
||||
|
||||
Returns:
|
||||
A configured PromptSession, or None if prompt_toolkit unavailable.
|
||||
"""
|
||||
try:
|
||||
from prompt_toolkit import PromptSession
|
||||
from prompt_toolkit.history import FileHistory
|
||||
from prompt_toolkit.auto_suggest import AutoSuggestFromHistory
|
||||
from prompt_toolkit.formatted_text import FormattedText
|
||||
|
||||
style = self.get_prompt_style()
|
||||
|
||||
session = PromptSession(
|
||||
history=FileHistory(self.history_file),
|
||||
auto_suggest=AutoSuggestFromHistory(),
|
||||
style=style,
|
||||
enable_history_search=True,
|
||||
)
|
||||
return session
|
||||
except ImportError:
|
||||
return None
|
||||
|
||||
def get_input(self, pt_session, project_name: str = "",
|
||||
modified: bool = False, context: str = "") -> str:
|
||||
"""Get input from user using prompt_toolkit or fallback.
|
||||
|
||||
Args:
|
||||
pt_session: A prompt_toolkit PromptSession (or None).
|
||||
project_name: Current project name.
|
||||
modified: Whether project has unsaved changes.
|
||||
context: Optional context string.
|
||||
|
||||
Returns:
|
||||
User input string (stripped).
|
||||
"""
|
||||
if pt_session is not None:
|
||||
from prompt_toolkit.formatted_text import FormattedText
|
||||
tokens = self.prompt_tokens(project_name, modified, context)
|
||||
return pt_session.prompt(FormattedText(tokens)).strip()
|
||||
else:
|
||||
raw_prompt = self.prompt(project_name, modified, context)
|
||||
return input(raw_prompt).strip()
|
||||
|
||||
# ── Toolbar builder ───────────────────────────────────────────────
|
||||
|
||||
def bottom_toolbar(self, items: dict[str, str]):
|
||||
"""Create a bottom toolbar callback for prompt_toolkit.
|
||||
|
||||
Args:
|
||||
items: Dict of label -> value pairs to show in toolbar.
|
||||
|
||||
Returns:
|
||||
A callable that returns FormattedText for the toolbar.
|
||||
"""
|
||||
def toolbar():
|
||||
from prompt_toolkit.formatted_text import FormattedText
|
||||
parts = []
|
||||
for i, (k, v) in enumerate(items.items()):
|
||||
if i > 0:
|
||||
parts.append(("class:bottom-toolbar.text", " │ "))
|
||||
parts.append(("class:bottom-toolbar.text", f" {k}: "))
|
||||
parts.append(("class:bottom-toolbar", v))
|
||||
return FormattedText(parts)
|
||||
return toolbar
|
||||
|
||||
|
||||
# ── ANSI 256-color to hex mapping (for prompt_toolkit styles) ─────────
|
||||
|
||||
_ANSI_256_TO_HEX = {
|
||||
"\033[38;5;33m": "#0087ff", # audacity navy blue
|
||||
"\033[38;5;35m": "#00af5f", # shotcut teal
|
||||
"\033[38;5;39m": "#00afff", # inkscape bright blue
|
||||
"\033[38;5;40m": "#00d700", # libreoffice green
|
||||
"\033[38;5;55m": "#5f00af", # obs purple
|
||||
"\033[38;5;69m": "#5f87ff", # kdenlive slate blue
|
||||
"\033[38;5;75m": "#5fafff", # default sky blue
|
||||
"\033[38;5;80m": "#5fd7d7", # brand cyan
|
||||
"\033[38;5;208m": "#ff8700", # blender deep orange
|
||||
"\033[38;5;214m": "#ffaf00", # gimp warm orange
|
||||
"\033[38;5;255m": "#eeeeee", # ollama white
|
||||
}
|
||||
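The `_ANSI_256_TO_HEX` table above is maintained by hand, so a new accent color needs a matching hex entry. As a sketch of how those values can be derived, the function below computes the hex code from the standard xterm 256-color layout (a 6x6x6 color cube at indices 16 to 231 and a grayscale ramp at 232 to 255). The names `ansi256_to_hex` and `escape_to_hex` are hypothetical helpers, not part of the skin, and indices 0 to 15 are rejected because their colors depend on the terminal theme.

```python
import re

# Channel intensities used by the xterm 6x6x6 color cube.
_CUBE_LEVELS = (0, 95, 135, 175, 215, 255)


def ansi256_to_hex(index: int) -> str:
    """Convert an xterm 256-color index (16..255) to a '#rrggbb' string."""
    if not 16 <= index <= 255:
        raise ValueError("only indices 16..255 have fixed RGB values")
    if index >= 232:
        # Grayscale ramp: 24 steps starting at 8, spaced by 10.
        gray = 8 + (index - 232) * 10
        return f"#{gray:02x}{gray:02x}{gray:02x}"
    offset = index - 16
    r, g, b = offset // 36, (offset % 36) // 6, offset % 6
    return "#{:02x}{:02x}{:02x}".format(
        _CUBE_LEVELS[r], _CUBE_LEVELS[g], _CUBE_LEVELS[b]
    )


def escape_to_hex(escape: str) -> str:
    """Convert a foreground escape such as '\\033[38;5;80m' to hex."""
    match = re.fullmatch(r"\033\[38;5;(\d+)m", escape)
    if match is None:
        raise ValueError(f"not a 256-color foreground escape: {escape!r}")
    return ansi256_to_hex(int(match.group(1)))


print(escape_to_hex("\033[38;5;80m"))
print(ansi256_to_hex(255))
```

A helper like this could serve as a fallback when an accent escape has no entry in the table, instead of defaulting to the sky-blue hex.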
@@ -0,0 +1,87 @@
|
||||
"""VideoCaptioner CLI backend — subprocess wrapper for the videocaptioner command.

All core modules call through this single module to invoke the existing
videocaptioner CLI. This keeps the Click harness thin and delegates real
work to the production-tested videocaptioner package.
"""

import json
import subprocess
import shutil
from pathlib import Path
from typing import Any


def _find_vc() -> str:
    """Locate the videocaptioner binary."""
    path = shutil.which("videocaptioner")
    if not path:
        raise RuntimeError(
            "videocaptioner not found on PATH. "
            "Install with: pip install videocaptioner"
        )
    return path


def run(args: list[str], timeout: int = 600) -> dict[str, Any]:
    """Run a videocaptioner CLI command and return structured result.

    Args:
        args: Command arguments (without 'videocaptioner' prefix).
        timeout: Max seconds to wait.

    Returns:
        Dict with 'exit_code', 'stdout', 'stderr', 'command', 'output_path' (if found).
    """
    cmd = [_find_vc()] + args
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"Command timed out after {timeout}s: {' '.join(cmd)}") from exc

    # Extract output path from quiet mode stdout
    stdout = result.stdout.strip()
    output_path = stdout if stdout and Path(stdout).suffix else None

    return {
        "exit_code": result.returncode,
        "stdout": stdout,
        "stderr": result.stderr.strip(),
        "output_path": output_path,
        "command": " ".join(cmd),
    }


def run_quiet(args: list[str], timeout: int = 600) -> str:
    """Run in quiet mode and return the output file path.

    Raises RuntimeError on failure.
    """
    result = run(args + ["-q"], timeout=timeout)
    if result["exit_code"] != 0:
        error_msg = result["stderr"] or result["stdout"] or "Unknown error"
        raise RuntimeError(f"videocaptioner failed (exit {result['exit_code']}): {error_msg}")
    return result["stdout"]


def get_version() -> str:
    """Get videocaptioner version string."""
    result = run(["--version"])
    return result["stdout"]


def get_config() -> str:
    """Get current configuration."""
    result = run(["config", "show"])
    return result["stdout"]


def get_styles() -> str:
    """Get available subtitle styles."""
    result = run(["style"])
    return result["stdout"]
|
||||
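To show how callers experience the success and failure paths of a wrapper like `run_quiet`, here is a minimal sketch that uses the current Python interpreter as a stand-in for the `videocaptioner` binary. `run_checked`, `ok_cmd`, and `bad_cmd` are hypothetical names introduced for illustration, and the printed path and error text are made up.

```python
import subprocess
import sys


def run_checked(cmd: list[str], timeout: int = 600) -> str:
    """Run cmd, return stripped stdout, raise RuntimeError on failure."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"Command timed out after {timeout}s") from exc
    if result.returncode != 0:
        # Prefer stderr, fall back to stdout, then to a generic message.
        detail = result.stderr.strip() or result.stdout.strip() or "Unknown error"
        raise RuntimeError(f"command failed (exit {result.returncode}): {detail}")
    return result.stdout.strip()


# Stand-in for a successful quiet-mode run that prints an output path.
ok_cmd = [sys.executable, "-c", "print('out/video.srt')"]
# Stand-in for a failing run: message on stderr, exit status 2.
bad_cmd = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(2)"]

print(run_checked(ok_cmd))
try:
    run_checked(bad_cmd)
except RuntimeError as err:
    print(err)
```

Raising a single exception type for both timeouts and non-zero exits lets the Click layer handle every backend failure in one place.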
@@ -0,0 +1,362 @@
|
||||
#!/usr/bin/env python3
|
||||
"""VideoCaptioner CLI — AI-powered video captioning from the command line.
|
||||
|
||||
Transcribe speech, optimize and translate subtitles, then burn them into
|
||||
video with beautiful customizable styles (ASS outline or rounded background).
|
||||
|
||||
Usage:
|
||||
cli-anything-videocaptioner transcribe video.mp4 --asr bijian
|
||||
cli-anything-videocaptioner subtitle input.srt --translator bing --target-language en
|
||||
cli-anything-videocaptioner synthesize video.mp4 -s sub.srt --subtitle-mode hard --style anime
|
||||
cli-anything-videocaptioner process video.mp4 --asr bijian --translator bing --target-language ja
|
||||
cli-anything-videocaptioner --json transcribe video.mp4 --asr bijian
|
||||
"""
|
||||
|
||||
import sys
|
||||
import json
|
||||
import shlex
|
||||
import click
|
||||
from typing import Optional
|
||||
|
||||
from cli_anything.videocaptioner.utils import vc_backend
|
||||
from cli_anything.videocaptioner.core import transcribe as transcribe_mod
|
||||
from cli_anything.videocaptioner.core import subtitle as subtitle_mod
|
||||
from cli_anything.videocaptioner.core import synthesize as synthesize_mod
|
||||
from cli_anything.videocaptioner.core import pipeline as pipeline_mod
|
||||
|
||||
_json_output = False
|
||||
_repl_mode = False
|
||||
|
||||
|
||||
def output(data, message: str = ""):
    """Print data as JSON in --json mode, otherwise as readable text."""
    if _json_output:
        click.echo(json.dumps(data, indent=2, default=str))
    else:
        if message:
            click.echo(message)
        if isinstance(data, dict):
            for k, v in data.items():
                click.echo(f" {k}: {v}")
        elif isinstance(data, str):
            click.echo(data)


def handle_error(func):
    """Report RuntimeError from a command; exit with status 1 outside the REPL."""
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except RuntimeError as e:
            if _json_output:
                click.echo(json.dumps({"error": str(e), "type": "runtime_error"}))
            else:
                click.echo(f"Error: {e}", err=True)
            if not _repl_mode:
                sys.exit(1)
    # Preserve the command name and help text that Click reads from the function.
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    return wrapper
|
||||
|
||||
|
||||
# ── Main CLI Group ──────────────────────────────────────────────
|
||||
@click.group(invoke_without_command=True)
|
||||
@click.option("--json", "use_json", is_flag=True, help="Output as JSON")
|
||||
@click.pass_context
|
||||
def cli(ctx, use_json):
|
||||
"""VideoCaptioner CLI — AI-powered video captioning.
|
||||
|
||||
Transcribe speech, optimize/translate subtitles, burn into video with
|
||||
beautiful styles. Free ASR (bijian) and translation (Bing/Google) included.
|
||||
|
||||
Run without a subcommand to enter interactive REPL mode.
|
||||
"""
|
||||
global _json_output
|
||||
_json_output = use_json
|
||||
if ctx.invoked_subcommand is None:
|
||||
ctx.invoke(repl)
|
||||
|
||||
|
||||
# ── Transcribe ──────────────────────────────────────────────────
|
||||
@cli.command()
|
||||
@click.argument("input_path")
|
||||
@click.option("--asr", type=click.Choice(["bijian", "jianying", "whisper-api", "whisper-cpp"]),
|
||||
default="bijian", help="ASR engine (bijian/jianying: free, Chinese & English only)")
|
||||
@click.option("--language", default="auto", help="Source language ISO 639-1 code, or 'auto'")
|
||||
@click.option("--format", "fmt", type=click.Choice(["srt", "ass", "txt", "json"]),
|
||||
default="srt", help="Output format")
|
||||
@click.option("-o", "--output", "output_path", default=None, help="Output file or directory path")
|
||||
@click.option("--word-timestamps", is_flag=True, help="Include word-level timestamps")
|
||||
@click.option("--whisper-api-key", default=None, help="Whisper API key")
|
||||
@click.option("--whisper-api-base", default=None, help="Whisper API base URL")
|
||||
@click.option("--whisper-model", default=None, help="Whisper model name")
|
||||
@handle_error
|
||||
def transcribe(input_path, asr, language, fmt, output_path, word_timestamps,
|
||||
whisper_api_key, whisper_api_base, whisper_model):
|
||||
"""Transcribe audio/video to subtitles."""
|
||||
result_path = transcribe_mod.transcribe(
|
||||
input_path, output_path=output_path, asr=asr, language=language,
|
||||
format=fmt, word_timestamps=word_timestamps,
|
||||
whisper_api_key=whisper_api_key, whisper_api_base=whisper_api_base,
|
||||
whisper_model=whisper_model,
|
||||
)
|
||||
output({"output_path": result_path}, f"✓ Transcription complete → {result_path}")
|
||||
|
||||
|
||||
# ── Subtitle ────────────────────────────────────────────────────
|
||||
@cli.command()
|
||||
@click.argument("input_path")
|
||||
@click.option("--translator", type=click.Choice(["llm", "bing", "google"]),
|
||||
default=None, help="Translation service (bing/google: free)")
|
||||
@click.option("--target-language", default=None, help="Target language BCP 47 code (e.g. en, ja, ko)")
|
||||
@click.option("--format", "fmt", type=click.Choice(["srt", "ass", "txt", "json"]),
|
||||
default="srt", help="Output format")
|
||||
@click.option("-o", "--output", "output_path", default=None, help="Output file or directory path")
|
||||
@click.option("--layout", type=click.Choice(["target-above", "source-above", "target-only", "source-only"]),
|
||||
default=None, help="Bilingual subtitle layout")
|
||||
@click.option("--no-optimize", is_flag=True, help="Skip LLM optimization")
|
||||
@click.option("--no-translate", is_flag=True, help="Skip translation")
|
||||
@click.option("--no-split", is_flag=True, help="Skip re-segmentation")
|
||||
@click.option("--reflect", is_flag=True, help="Reflective translation (LLM only, higher quality)")
|
||||
@click.option("--prompt", default=None, help="Custom LLM prompt")
|
||||
@click.option("--api-key", default=None, help="LLM API key")
|
||||
@click.option("--api-base", default=None, help="LLM API base URL")
|
||||
@click.option("--model", default=None, help="LLM model name")
|
||||
@handle_error
|
||||
def subtitle(input_path, translator, target_language, fmt, output_path, layout,
|
||||
no_optimize, no_translate, no_split, reflect, prompt, api_key, api_base, model):
|
||||
"""Optimize and/or translate subtitle files.
|
||||
|
||||
Three processing steps (all enabled by default except translation):
|
||||
1. Split — re-segment by semantic boundaries (LLM)
|
||||
2. Optimize — fix ASR errors, punctuation (LLM)
|
||||
3. Translate — to another language (LLM/Bing/Google)
|
||||
|
||||
Use --translator or --target-language to enable translation.
|
||||
"""
|
||||
result_path = subtitle_mod.process_subtitle(
|
||||
input_path, output_path=output_path, translator=translator,
|
||||
target_language=target_language, format=fmt, layout=layout,
|
||||
no_optimize=no_optimize, no_translate=no_translate, no_split=no_split,
|
||||
reflect=reflect, prompt=prompt, api_key=api_key, api_base=api_base, model=model,
|
||||
)
|
||||
output({"output_path": result_path}, f"✓ Subtitle processing complete → {result_path}")
|
||||
|
||||
|
||||
# ── Synthesize ──────────────────────────────────────────────────
|
||||
@cli.command()
|
||||
@click.argument("video_path")
|
||||
@click.option("-s", "--subtitle", "subtitle_path", required=True, help="Subtitle file path")
|
||||
@click.option("--subtitle-mode", type=click.Choice(["soft", "hard"]),
|
||||
default="soft", help="soft: embedded track, hard: burned into frames")
|
||||
@click.option("--quality", type=click.Choice(["ultra", "high", "medium", "low"]),
|
||||
default="medium", help="Video quality (ultra=CRF18, high=CRF23, medium=CRF28, low=CRF32)")
|
||||
@click.option("-o", "--output", "output_path", default=None, help="Output video file path")
|
||||
@click.option("--layout", type=click.Choice(["target-above", "source-above", "target-only", "source-only"]),
|
||||
default=None, help="Bilingual subtitle layout")
|
||||
@click.option("--render-mode", type=click.Choice(["ass", "rounded"]),
|
||||
default=None, help="ass: outline/shadow, rounded: background boxes")
|
||||
@click.option("--style", default=None, help="Style preset (default, anime, vertical, rounded)")
|
||||
@click.option("--style-override", default=None, help='Inline JSON, e.g. \'{"outline_color": "#ff0000"}\'')
|
||||
@click.option("--font-file", default=None, help="Custom font file (.ttf/.otf)")
|
||||
@handle_error
|
||||
def synthesize(video_path, subtitle_path, subtitle_mode, quality, output_path,
|
||||
layout, render_mode, style, style_override, font_file):
|
||||
"""Burn subtitles into video with customizable styles.
|
||||
|
||||
Two rendering modes for beautiful subtitles:
|
||||
ASS — traditional outline/shadow (presets: default, anime, vertical)
|
||||
Rounded — modern rounded background boxes
|
||||
|
||||
Use 'cli-anything-videocaptioner styles' to see all presets.
|
||||
"""
|
||||
result_path = synthesize_mod.synthesize(
|
||||
video_path, subtitle_path, output_path=output_path,
|
||||
subtitle_mode=subtitle_mode, quality=quality, layout=layout,
|
||||
render_mode=render_mode, style=style, style_override=style_override,
|
||||
font_file=font_file,
|
||||
)
|
||||
output({"output_path": result_path}, f"✓ Video synthesis complete → {result_path}")
|
||||
|
||||
|
||||
# ── Process (full pipeline) ─────────────────────────────────────
|
||||
@cli.command()
|
||||
@click.argument("input_path")
|
||||
@click.option("--asr", type=click.Choice(["bijian", "jianying", "whisper-api", "whisper-cpp"]),
|
||||
default="bijian", help="ASR engine")
|
||||
@click.option("--language", default="auto", help="Source language")
|
||||
@click.option("--translator", type=click.Choice(["llm", "bing", "google"]),
|
||||
default=None, help="Translation service (bing/google: free)")
|
||||
@click.option("--target-language", default=None, help="Target language BCP 47 code")
|
||||
@click.option("--subtitle-mode", type=click.Choice(["soft", "hard"]), default="soft")
|
||||
@click.option("--quality", type=click.Choice(["ultra", "high", "medium", "low"]), default="medium")
|
||||
@click.option("-o", "--output", "output_path", default=None, help="Output file or directory path")
|
||||
@click.option("--layout", type=click.Choice(["target-above", "source-above", "target-only", "source-only"]), default=None)
|
||||
@click.option("--style", default=None, help="Style preset name")
|
||||
@click.option("--style-override", default=None, help="Inline JSON style override")
|
||||
@click.option("--render-mode", type=click.Choice(["ass", "rounded"]), default=None)
|
||||
@click.option("--no-optimize", is_flag=True, help="Skip optimization")
|
||||
@click.option("--no-translate", is_flag=True, help="Skip translation")
|
||||
@click.option("--no-split", is_flag=True, help="Skip re-segmentation")
|
||||
@click.option("--no-synthesize", is_flag=True, help="Skip video synthesis")
|
||||
@click.option("--reflect", is_flag=True, help="Reflective translation (LLM only)")
|
||||
@click.option("--prompt", default=None, help="Custom LLM prompt")
|
||||
@click.option("--api-key", default=None, help="LLM API key")
|
||||
@click.option("--api-base", default=None, help="LLM API base URL")
|
||||
@click.option("--model", default=None, help="LLM model name")
|
||||
@handle_error
|
||||
def process(input_path, asr, language, translator, target_language, subtitle_mode,
|
||||
quality, output_path, layout, style, style_override, render_mode,
|
||||
no_optimize, no_translate, no_split, no_synthesize, reflect,
|
||||
prompt, api_key, api_base, model):
|
||||
"""Full pipeline: transcribe → optimize → translate → synthesize.
|
||||
|
||||
One command to go from video to captioned video with translated subtitles.
|
||||
Audio files automatically skip video synthesis.
|
||||
"""
|
||||
result_path = pipeline_mod.process(
|
||||
input_path, output_path=output_path, asr=asr, language=language,
|
||||
translator=translator, target_language=target_language,
|
||||
subtitle_mode=subtitle_mode, quality=quality, layout=layout,
|
||||
style=style, style_override=style_override, render_mode=render_mode,
|
||||
no_optimize=no_optimize, no_translate=no_translate, no_split=no_split,
|
||||
no_synthesize=no_synthesize, reflect=reflect, prompt=prompt,
|
||||
api_key=api_key, api_base=api_base, model=model,
|
||||
)
|
||||
output({"output_path": result_path}, f"✓ Pipeline complete → {result_path}")
|
||||
|
||||
|
||||
# ── Styles ──────────────────────────────────────────────────────
|
||||
@cli.command()
|
||||
@handle_error
|
||||
def styles():
|
||||
"""List available subtitle style presets."""
|
||||
result = vc_backend.get_styles()
|
||||
if _json_output:
|
||||
click.echo(json.dumps({"styles": result}))
|
||||
else:
|
||||
click.echo(result)
|
||||
|
||||
|
||||
# ── Config ──────────────────────────────────────────────────────
|
||||
@cli.group()
|
||||
def config():
|
||||
"""View and manage configuration."""
|
||||
pass
|
||||
|
||||
|
||||
@config.command("show")
|
||||
@handle_error
|
||||
def config_show():
|
||||
"""Display current configuration."""
|
||||
result = vc_backend.get_config()
|
||||
if _json_output:
|
||||
click.echo(json.dumps({"config": result}))
|
||||
else:
|
||||
click.echo(result)
|
||||
|
||||
|
||||
@config.command("set")
|
||||
@click.argument("key")
|
||||
@click.argument("value")
|
||||
@handle_error
|
||||
def config_set(key, value):
|
||||
"""Set a configuration value."""
|
||||
result = vc_backend.run(["config", "set", key, value])
|
||||
if result["exit_code"] != 0:
|
||||
raise RuntimeError(result["stderr"] or result["stdout"])
|
||||
output({"key": key, "value": value}, f"✓ {key} = {value}")
|
||||
|
||||
|
||||
# ── Download ────────────────────────────────────────────────────
@cli.command()
@click.argument("url")
@click.option("-o", "--output", "output_dir", default=".", help="Output directory")
@handle_error
def download(url, output_dir):
    """Download online video (YouTube, Bilibili, etc.)."""
    result_path = vc_backend.run_quiet(["download", url, "-o", output_dir])
    output({"output_path": result_path}, f"✓ Downloaded → {result_path}")


# ── Session ─────────────────────────────────────────────────────
@cli.group()
def session():
    """Session state commands."""
    pass


@session.command("status")
@handle_error
def session_status():
    """Show VideoCaptioner version and configuration."""
    version = vc_backend.get_version()
    data = {"version": version, "json_output": _json_output}
    output(data, f"VideoCaptioner {version}")


# ── REPL ────────────────────────────────────────────────────────
@cli.command()
@handle_error
def repl():
    """Start interactive REPL session."""
    from cli_anything.videocaptioner.utils.repl_skin import ReplSkin

    global _repl_mode
    _repl_mode = True

    skin = ReplSkin("videocaptioner", version="1.0.0")
    skin.print_banner()

    pt_session = skin.create_prompt_session()

    _repl_commands = {
        "transcribe": "Transcribe audio/video to subtitles",
        "subtitle": "Optimize and/or translate subtitles",
        "synthesize": "Burn subtitles into video",
        "process": "Full pipeline (transcribe → translate → synthesize)",
        "styles": "List subtitle style presets",
        "config": "show|set <key> <value>",
        "download": "Download online video",
        "session": "status",
        "help": "Show this help",
        "quit": "Exit REPL",
    }

    while True:
        try:
            line = skin.get_input(pt_session, project_name="", modified=False)
            if not line:
                continue
            if line.lower() in ("quit", "exit", "q"):
                skin.print_goodbye()
                break
            if line.lower() == "help":
                skin.help(_repl_commands)
                continue

            try:
                args = shlex.split(line)
            except ValueError:
                args = line.split()
            try:
                cli.main(args, standalone_mode=False)
            except SystemExit:
                pass
            except click.exceptions.UsageError as e:
                skin.warning(f"Usage error: {e}")
            except Exception as e:
                skin.error(f"{e}")

        except (EOFError, KeyboardInterrupt):
            skin.print_goodbye()
            break

    _repl_mode = False


# ── Entry Point ─────────────────────────────────────────────────
def main():
    cli()


if __name__ == "__main__":
    main()
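The REPL loop above tokenizes each input line with `shlex.split` and falls back to plain whitespace splitting when the line has unbalanced quotes. A minimal sketch of that step in isolation follows; the helper name `tokenize` and the sample command lines are hypothetical and are not part of the harness.

```python
import shlex


def tokenize(line: str) -> list[str]:
    """Hypothetical helper mirroring the REPL's tokenization step."""
    try:
        # POSIX-style splitting: a quoted argument stays a single token.
        return shlex.split(line)
    except ValueError:
        # Unbalanced quotes make shlex raise; fall back to whitespace splitting.
        return line.split()


if __name__ == "__main__":
    print(tokenize('process "my video.mp4" --asr whisper-cpp'))
    print(tokenize('process "unterminated'))
```

With this fallback a mistyped quote still reaches Click as arguments rather than aborting the REPL, so the user sees a usage error from the command itself.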
51
videocaptioner/agent-harness/setup.py
Normal file
@@ -0,0 +1,51 @@
#!/usr/bin/env python3
"""setup.py for cli-anything-videocaptioner"""

from setuptools import setup, find_namespace_packages

with open("cli_anything/videocaptioner/README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setup(
    name="cli-anything-videocaptioner",
    version="1.0.0",
    author="Weifeng",
    author_email="",
    description="CLI harness for VideoCaptioner — AI-powered video captioning with beautiful subtitle styles. Requires: videocaptioner (pip install videocaptioner), ffmpeg",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/WEIFENG2333/VideoCaptioner",
    packages=find_namespace_packages(include=["cli_anything.*"]),
    classifiers=[
        "Development Status :: 4 - Beta",
        "Intended Audience :: Developers",
        "Topic :: Multimedia :: Video",
        "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
        "Programming Language :: Python :: 3.12",
    ],
    python_requires=">=3.10",
    install_requires=[
        "click>=8.0.0",
        "prompt-toolkit>=3.0.0",
        "videocaptioner",
    ],
    extras_require={
        "dev": [
            "pytest>=7.0.0",
            "pytest-cov>=4.0.0",
        ],
    },
    entry_points={
        "console_scripts": [
            "cli-anything-videocaptioner=cli_anything.videocaptioner.videocaptioner_cli:main",
        ],
    },
    package_data={
        "cli_anything.videocaptioner": ["skills/*.md"],
    },
    include_package_data=True,
    zip_safe=False,
)