Compare commits


7 Commits

Author SHA1 Message Date
Scott Hanselman
c9c1aed602 chore: upgrade and recompile agentic workflows
Some checks failed
Copilot Setup Steps / copilot-setup-steps (push) Has been cancelled
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-13 19:26:42 -07:00
AlexAlves87
f4dbc521df fix: repair screen recording capture and encoding pipeline
- Fix InvalidCastException in CreateForMonitor: pass IID_IInspectable
  instead of typeof(GraphicsCaptureItem).GUID, which returns a C#/WinRT-
  generated GUID unrecognized by the native COM method (E_NOINTERFACE).
- Replace PrepareStreamTranscodeAsync with PrepareMediaStreamSourceTranscodeAsync
  + MediaStreamSource feeding NV12 samples on demand, fixing "Transcode
  failed: Unknown" on all three screen recording commands.
- Add 500 MB frame-buffer cap (MaxFrameBufferBytes) with early stop and
  warning log to prevent OOM on long or high-fps recordings.
- Save encoded MP4 to %TEMP%\openclaw\ and return filePath in the response.
- Change ScreenRecordResult.Fps from float to int.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 17:00:09 +02:00
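
As a rough check on the 500 MB frame-buffer cap described in the commit above, the frame budget can be worked out from the raw BGRA frame size. The sketch below is illustrative only: `FrameBudget` and its methods are hypothetical names, not part of the change, and it assumes 4 bytes per pixel with no row padding.

```csharp
using System;

// 500 MB cap, matching the MaxFrameBufferBytes value named in the commit.
const long capBytes = 500L * 1024 * 1024;

Console.WriteLine(FrameBudget.MaxFrames(1920, 1080, capBytes)); // 63
Console.WriteLine(FrameBudget.MaxFrames(1280, 720, capBytes));  // 142

// Hypothetical helper for illustration; not part of the commit.
static class FrameBudget
{
    // Raw BGRA frame size, assuming 4 bytes per pixel and no row padding.
    public static long BytesPerFrame(int width, int height) =>
        (long)width * height * 4;

    // Whole frames that fit within the cap. A capture loop that checks the
    // running total before adding a frame may hold one frame more than this.
    public static long MaxFrames(int width, int height, long cap) =>
        cap / BytesPerFrame(width, height);

    // Approximate recording length in seconds at a given frame rate.
    public static double MaxSeconds(int width, int height, int fps, long cap) =>
        (double)MaxFrames(width, height, cap) / fps;
}
```

At 1920×1080 the budget is 63 frames, a little over six seconds at 10 fps, so longer fixed-duration recordings at that resolution would be expected to hit the early stop.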
AlexAlves87
21b0d315be Merge remote-tracking branch 'upstream/master' into feat/screen-record
2026-04-09 16:46:33 +02:00
AlexAlves87
7cce9fe01e feat: add screen.record.start and screen.record.stop
Two new commands for session-based recording:

- screen.record.start: opens a recording session and returns a recordingId
- screen.record.stop: closes the session and returns the video

ActiveSession manages the capture loop with a CancellationToken and stores
frames safely under a lock. A ConcurrentDictionary keyed by recordingId
allows concurrent sessions.

9 new tests cover: start/stop without a handler, args and monitor alias,
recordingId in the start response, full stop payload, and exception paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 19:19:30 +02:00
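
The session model in the commit above (a `ConcurrentDictionary` keyed by `recordingId`, a capture loop stopped through a `CancellationToken`, frames guarded by a lock) can be sketched roughly as follows. This is a simplified stand-in, not the `ActiveSession` code from the change: `DemoSession` and `SessionRegistry` are hypothetical names, and the loop stores placeholder byte arrays instead of captured frames.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var registry = new SessionRegistry();
var id = registry.Start(fps: 50);
await Task.Delay(200);
var frames = await registry.StopAsync(id);
Console.WriteLine($"session {id}: {frames.Count} placeholder frames");

// Hypothetical session type: runs a loop until cancelled.
sealed class DemoSession : IDisposable
{
    private readonly CancellationTokenSource _cts = new();
    private readonly List<byte[]> _frames = new();
    private readonly object _gate = new();
    private readonly Task _loop;

    public string Id { get; } = Guid.NewGuid().ToString("N");

    public DemoSession(int fps)
    {
        var intervalMs = 1000 / Math.Max(1, fps);
        _loop = Task.Run(() => CaptureLoopAsync(intervalMs, _cts.Token));
    }

    private async Task CaptureLoopAsync(int intervalMs, CancellationToken token)
    {
        try
        {
            while (!token.IsCancellationRequested)
            {
                // A real implementation would copy a captured frame here.
                lock (_gate) { _frames.Add(new byte[4]); }
                await Task.Delay(intervalMs, token);
            }
        }
        catch (OperationCanceledException) { /* normal stop path */ }
    }

    public async Task<List<byte[]>> StopAsync()
    {
        _cts.Cancel();
        await _loop; // wait for the loop to exit before reading frames
        lock (_gate) { return new List<byte[]>(_frames); }
    }

    public void Dispose() => _cts.Dispose();
}

// Hypothetical registry: one entry per active recordingId.
sealed class SessionRegistry
{
    private readonly ConcurrentDictionary<string, DemoSession> _sessions = new();

    public string Start(int fps)
    {
        var session = new DemoSession(fps);
        _sessions[session.Id] = session;
        return session.Id;
    }

    public async Task<List<byte[]>> StopAsync(string id)
    {
        // TryRemove makes stop a one-shot operation per recordingId.
        if (!_sessions.TryRemove(id, out var session))
            throw new KeyNotFoundException($"Recording session '{id}' not found");
        using (session) { return await session.StopAsync(); }
    }
}
```

Removing the entry before stopping means a second `stop` for the same id fails with a not-found error instead of racing with the first.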
AlexAlves87
e4e6abb01e feat: wire screen.record into NodeService and add capability tests
NodeService instantiates ScreenRecordingService and subscribes OnScreenRecord
to ScreenCapability's RecordRequested event.

Tests cover the full surface of screen.record: missing handler error, correct
arg forwarding, defaults (durationMs=5000, fps=10, screenIndex=0), the
monitor→screenIndex alias, and exception handling in the handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 19:19:24 +02:00
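
The wiring described in the commit above relies on an event whose delegate returns a `Task`, with the capability reporting an error when nothing is subscribed. A minimal sketch of that pattern, using a hypothetical `DemoCapability` class and plain strings in place of the real request and response types:

```csharp
using System;
using System.Threading.Tasks;

var capability = new DemoCapability();

// No subscriber yet: the handler reports that recording is unavailable.
Console.WriteLine(await capability.HandleAsync(5000));

// Subscribing plays the role of NodeService attaching OnScreenRecord.
capability.RecordRequested += ms => Task.FromResult($"recorded {ms} ms");
Console.WriteLine(await capability.HandleAsync(5000));

// Hypothetical stand-in for the capability class; not the code in the diff.
sealed class DemoCapability
{
    public event Func<int, Task<string>>? RecordRequested;

    public async Task<string> HandleAsync(int durationMs)
    {
        var handler = RecordRequested;
        if (handler == null)
            return "error: Screen recording not available";
        try
        {
            return await handler(durationMs);
        }
        catch (Exception ex)
        {
            // Exceptions from the subscriber become an error response.
            return $"error: Record failed: {ex.Message}";
        }
    }
}
```

The two calls print `error: Screen recording not available` and then `recorded 5000 ms`, which mirrors the missing-handler and forwarding cases the tests in the commit are said to cover.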
AlexAlves87
45912512d0 feat: add ScreenRecordingService for fixed-duration monitor capture
WinRT-based implementation backing screen.record:

- D3D11 + Direct3D11CaptureFramePool for GPU-backed frame acquisition
- Software BGRA→NV12 conversion (BT.601 limited range) before encoding
- MediaTranscoder pipeline with hardware acceleration and SW fallback
- No external dependencies: pure P/Invoke (d3d11.dll, combase.dll)

Records the full monitor only. Per-window capture is not yet implemented.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 19:19:19 +02:00
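
For reference, the BT.601 limited-range conversion named in the commit above maps each BGRA pixel to luma in 16..235 and chroma centred on 128. The sketch below applies the same integer coefficients that appear in `BgraToNv12` later in this diff to a single pixel; `Bt601.FromBgr` is a hypothetical helper name used only for illustration.

```csharp
using System;

var white = Bt601.FromBgr(255, 255, 255);
Console.WriteLine($"{white.Y} {white.U} {white.V}"); // 235 128 128

var black = Bt601.FromBgr(0, 0, 0);
Console.WriteLine($"{black.Y} {black.U} {black.V}"); // 16 128 128

// Hypothetical helper for illustration; not part of the commit.
static class Bt601
{
    // Fixed-point BT.601 limited range: coefficients are scaled by 256,
    // +128 rounds before the shift, then the range offset is added.
    public static (byte Y, byte U, byte V) FromBgr(byte b, byte g, byte r)
    {
        int y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16;
        int u = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128;
        int v = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128;
        return ((byte)y, (byte)u, (byte)v);
    }
}
```

In NV12 the Y values fill a full-resolution plane, followed by one interleaved U/V pair per 2×2 block of pixels, which is why the encoder rounds width and height down to even values.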
AlexAlves87
22e378f8a7 feat: add screen.record to ScreenCapability
New command in the shared capability layer:

- screen.record: fixed-duration capture; blocks until done and returns
  the video as base64 MP4.

Args: durationMs (def. 5000), fps (def. 10), screenIndex/monitor (def. 0).
The monitor→screenIndex alias keeps consistency with screen.capture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 19:19:16 +02:00
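
The argument defaults and the `monitor`→`screenIndex` alias from the commit above amount to a nested fallback lookup. The sketch below shows the idea with a plain dictionary of integers; `ArgDefaults` and its methods are hypothetical, and the real `GetIntArg` helper and argument type in the codebase may differ.

```csharp
using System;
using System.Collections.Generic;

// Caller supplied only the legacy "monitor" key.
var args = new Dictionary<string, int> { ["monitor"] = 2 };
var resolved = ArgDefaults.Resolve(args);
Console.WriteLine(resolved.DurationMs);  // 5000
Console.WriteLine(resolved.Fps);         // 10
Console.WriteLine(resolved.ScreenIndex); // 2

// Hypothetical helper for illustration; not the code in the diff.
static class ArgDefaults
{
    public static int GetInt(IReadOnlyDictionary<string, int> args, string key, int fallback) =>
        args.TryGetValue(key, out var value) ? value : fallback;

    public static (int DurationMs, int Fps, int ScreenIndex) Resolve(
        IReadOnlyDictionary<string, int> args) => (
            GetInt(args, "durationMs", 5000),
            GetInt(args, "fps", 10),
            // "screenIndex" wins; "monitor" is consulted only as its fallback.
            GetInt(args, "screenIndex", GetInt(args, "monitor", 0)));
}
```

With this ordering, a request that passes both keys uses `screenIndex`, and one that passes neither records screen 0.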
8 changed files with 2022 additions and 1052 deletions


@@ -0,0 +1,178 @@
---
description: GitHub Agentic Workflows (gh-aw) - Create, debug, and upgrade AI-powered workflows with intelligent prompt routing
disable-model-invocation: true
---
# GitHub Agentic Workflows Agent
This agent helps you work with **GitHub Agentic Workflows (gh-aw)**, a CLI extension for creating AI-powered workflows in natural language using markdown files.
## What This Agent Does
This is a **dispatcher agent** that routes your request to the appropriate specialized prompt based on your task:
- **Creating new workflows**: Routes to `create` prompt
- **Updating existing workflows**: Routes to `update` prompt
- **Debugging workflows**: Routes to `debug` prompt
- **Upgrading workflows**: Routes to `upgrade-agentic-workflows` prompt
- **Creating report-generating workflows**: Routes to `report` prompt — consult this whenever the workflow posts status updates, audits, analyses, or any structured output as issues, discussions, or comments
- **Creating shared components**: Routes to `create-shared-agentic-workflow` prompt
- **Fixing Dependabot PRs**: Routes to `dependabot` prompt — use this when Dependabot opens PRs that modify generated manifest files (`.github/workflows/package.json`, `.github/workflows/requirements.txt`, `.github/workflows/go.mod`). Never merge those PRs directly; instead update the source `.md` files and rerun `gh aw compile --dependabot` to bundle all fixes
- **Analyzing test coverage**: Routes to `test-coverage` prompt — consult this whenever the workflow reads, analyzes, or reports on test coverage data from PRs or CI runs
Workflows may optionally include:
- **Project tracking / monitoring** (GitHub Projects updates, status reporting)
- **Orchestration / coordination** (one workflow assigning agents or dispatching and coordinating other workflows)
## Files This Applies To
- Workflow files: `.github/workflows/*.md` and `.github/workflows/**/*.md`
- Workflow lock files: `.github/workflows/*.lock.yml`
- Shared components: `.github/workflows/shared/*.md`
- Configuration: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/github-agentic-workflows.md
## Problems This Solves
- **Workflow Creation**: Design secure, validated agentic workflows with proper triggers, tools, and permissions
- **Workflow Debugging**: Analyze logs, identify missing tools, investigate failures, and fix configuration issues
- **Version Upgrades**: Migrate workflows to new gh-aw versions, apply codemods, fix breaking changes
- **Component Design**: Create reusable shared workflow components that wrap MCP servers
## How to Use
When you interact with this agent, it will:
1. **Understand your intent** - Determine what kind of task you're trying to accomplish
2. **Route to the right prompt** - Load the specialized prompt file for your task
3. **Execute the task** - Follow the detailed instructions in the loaded prompt
## Available Prompts
### Create New Workflow
**Load when**: User wants to create a new workflow from scratch, add automation, or design a workflow that doesn't exist yet
**Prompt file**: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/create-agentic-workflow.md
**Use cases**:
- "Create a workflow that triages issues"
- "I need a workflow to label pull requests"
- "Design a weekly research automation"
### Update Existing Workflow
**Load when**: User wants to modify, improve, or refactor an existing workflow
**Prompt file**: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/update-agentic-workflow.md
**Use cases**:
- "Add web-fetch tool to the issue-classifier workflow"
- "Update the PR reviewer to use discussions instead of issues"
- "Improve the prompt for the weekly-research workflow"
### Debug Workflow
**Load when**: User needs to investigate, audit, debug, or understand a workflow, troubleshoot issues, analyze logs, or fix errors
**Prompt file**: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/debug-agentic-workflow.md
**Use cases**:
- "Why is this workflow failing?"
- "Analyze the logs for workflow X"
- "Investigate missing tool calls in run #12345"
### Upgrade Agentic Workflows
**Load when**: User wants to upgrade workflows to a new gh-aw version or fix deprecations
**Prompt file**: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/upgrade-agentic-workflows.md
**Use cases**:
- "Upgrade all workflows to the latest version"
- "Fix deprecated fields in workflows"
- "Apply breaking changes from the new release"
### Create a Report-Generating Workflow
**Load when**: The workflow being created or updated produces reports — recurring status updates, audit summaries, analyses, or any structured output posted as a GitHub issue, discussion, or comment
**Prompt file**: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/report.md
**Use cases**:
- "Create a weekly CI health report"
- "Post a daily security audit to Discussions"
- "Add a status update comment to open PRs"
### Create Shared Agentic Workflow
**Load when**: User wants to create a reusable workflow component or wrap an MCP server
**Prompt file**: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/create-shared-agentic-workflow.md
**Use cases**:
- "Create a shared component for Notion integration"
- "Wrap the Slack MCP server as a reusable component"
- "Design a shared workflow for database queries"
### Fix Dependabot PRs
**Load when**: User needs to close or fix open Dependabot PRs that update dependencies in generated manifest files (`.github/workflows/package.json`, `.github/workflows/requirements.txt`, `.github/workflows/go.mod`)
**Prompt file**: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/dependabot.md
**Use cases**:
- "Fix the open Dependabot PRs for npm dependencies"
- "Bundle and close the Dependabot PRs for workflow dependencies"
- "Update @playwright/test to fix the Dependabot PR"
### Analyze Test Coverage
**Load when**: The workflow reads, analyzes, or reports test coverage — whether triggered by a PR, a schedule, or a slash command. Always consult this prompt before designing the coverage data strategy.
**Prompt file**: https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/test-coverage.md
**Use cases**:
- "Create a workflow that comments coverage on PRs"
- "Analyze coverage trends over time"
- "Add a coverage gate that blocks PRs below a threshold"
## Instructions
When a user interacts with you:
1. **Identify the task type** from the user's request
2. **Load the appropriate prompt** from the GitHub repository URLs listed above
3. **Follow the loaded prompt's instructions** exactly
4. **If uncertain**, ask clarifying questions to determine the right prompt
## Quick Reference
```bash
# Initialize repository for agentic workflows
gh aw init
# Generate the lock file for a workflow
gh aw compile [workflow-name]
# Debug workflow runs
gh aw logs [workflow-name]
gh aw audit <run-id>
# Upgrade workflows
gh aw fix --write
gh aw compile --validate
```
## Key Features of gh-aw
- **Natural Language Workflows**: Write workflows in markdown with YAML frontmatter
- **AI Engine Support**: Copilot, Claude, Codex, or custom engines
- **MCP Server Integration**: Connect to Model Context Protocol servers for tools
- **Safe Outputs**: Structured communication between AI and GitHub API
- **Strict Mode**: Security-first validation and sandboxing
- **Shared Components**: Reusable workflow building blocks
- **Repo Memory**: Persistent git-backed storage for agents
- **Sandboxed Execution**: All workflows run in the Agent Workflow Firewall (AWF) sandbox, enabling full `bash` and `edit` tools by default
## Important Notes
- Always reference the instructions file at https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/github-agentic-workflows.md for complete documentation
- Use the MCP tool `agentic-workflows` when running in GitHub Copilot Cloud
- Workflows must be compiled to `.lock.yml` files before running in GitHub Actions
- **Bash tools are enabled by default** - Don't restrict bash commands unnecessarily since workflows are sandboxed by the AWF
- Follow security best practices: minimal permissions, explicit network access, no template injection
- **Network configuration**: Use ecosystem identifiers (`node`, `python`, `go`, etc.) or explicit FQDNs in `network.allowed`. Bare shorthands like `npm` or `pypi` are **not** valid. See https://github.com/github/gh-aw/blob/v0.68.1/.github/aw/network.md for the full list of valid ecosystem identifiers and domain patterns.
- **Single-file output**: When creating a workflow, produce exactly **one** workflow `.md` file. Do not create separate documentation files (architecture docs, runbooks, usage guides, etc.). If documentation is needed, add a brief `## Usage` section inside the workflow file itself.

.github/aw/actions-lock.json vendored Normal file

@@ -0,0 +1,14 @@
{
  "entries": {
    "actions/github-script@v9": {
      "repo": "actions/github-script",
      "version": "v9",
      "sha": "373c709c69115d41ff229c7e5df9f8788daa9553"
    },
    "github/gh-aw-actions/setup@v0.68.1": {
      "repo": "github/gh-aw-actions/setup",
      "version": "v0.68.1",
      "sha": "2fe53acc038ba01c3bbdc767d4b25df31ca5bdfc"
    }
  }
}


@@ -0,0 +1,26 @@
name: "Copilot Setup Steps"
# This workflow configures the environment for GitHub Copilot Agent with gh-aw MCP server
on:
  workflow_dispatch:
  push:
    paths:
      - .github/workflows/copilot-setup-steps.yml
jobs:
  # The job MUST be called 'copilot-setup-steps' to be recognized by GitHub Copilot Agent
  copilot-setup-steps:
    runs-on: ubuntu-latest
    # Set minimal permissions for setup steps
    # Copilot Agent receives its own token with appropriate permissions
    permissions:
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v6
      - name: Install gh-aw extension
        uses: github/gh-aw-actions/setup-cli@2fe53acc038ba01c3bbdc767d4b25df31ca5bdfc # v0.68.1
        with:
          version: v0.68.1

.github/workflows/repo-assist.lock.yml generated vendored

File diff suppressed because it is too large


@@ -14,15 +14,20 @@ public class ScreenCapability : NodeCapabilityBase
private static readonly string[] _commands = new[]
{
"screen.capture",
"screen.list"
// Future: "screen.record"
"screen.list",
"screen.record",
"screen.record.start",
"screen.record.stop",
};
public override IReadOnlyList<string> Commands => _commands;
// Events for UI/platform-specific implementation
public event Func<ScreenCaptureArgs, Task<ScreenCaptureResult>>? CaptureRequested;
public event Func<Task<ScreenInfo[]>>? ListRequested;
public event Func<ScreenRecordArgs, Task<ScreenRecordResult>>? RecordRequested;
public event Func<ScreenRecordStartArgs, Task<string>>? StartRequested;
public event Func<string, Task<ScreenRecordResult>>? StopRequested;
public ScreenCapability(IOpenClawLogger logger) : base(logger)
{
@@ -32,8 +37,11 @@ public class ScreenCapability : NodeCapabilityBase
{
return request.Command switch
{
"screen.capture" => await HandleCaptureAsync(request),
"screen.list" => await HandleListAsync(request),
"screen.capture" => await HandleCaptureAsync(request),
"screen.list" => await HandleListAsync(request),
"screen.record" => await HandleRecordAsync(request),
"screen.record.start" => await HandleStartAsync(request),
"screen.record.stop" => await HandleStopAsync(request),
_ => Error($"Unknown command: {request.Command}")
};
}
@@ -114,6 +122,143 @@ public class ScreenCapability : NodeCapabilityBase
return Error($"List failed: {ex.Message}");
}
}
private async Task<NodeInvokeResponse> HandleRecordAsync(NodeInvokeRequest request)
{
var durationMs = GetIntArg(request.Args, "durationMs", 5000);
var fps = GetIntArg(request.Args, "fps", 10);
var screenIndex = GetIntArg(request.Args, "screenIndex", GetIntArg(request.Args, "monitor", 0));
Logger.Info($"screen.record: durationMs={durationMs} fps={fps} screenIndex={screenIndex}");
if (RecordRequested == null)
return Error("Screen recording not available");
try
{
var result = await RecordRequested(new ScreenRecordArgs
{
DurationMs = durationMs,
Fps = fps,
ScreenIndex = screenIndex,
});
return Success(new
{
format = result.Format,
base64 = result.Base64,
filePath = result.FilePath,
durationMs = result.DurationMs,
fps = result.Fps,
screenIndex = result.ScreenIndex,
width = result.Width,
height = result.Height,
hasAudio = result.HasAudio,
});
}
catch (Exception ex)
{
Logger.Error("screen.record failed", ex);
return Error($"Record failed: {ex.GetType().Name}: {ex.Message} | {ex.StackTrace?.Split('\n').FirstOrDefault()?.Trim()}");
}
}
private async Task<NodeInvokeResponse> HandleStartAsync(NodeInvokeRequest request)
{
var fps = GetIntArg(request.Args, "fps", 10);
var screenIndex = GetIntArg(request.Args, "screenIndex", GetIntArg(request.Args, "monitor", 0));
Logger.Info($"screen.record.start: fps={fps} screenIndex={screenIndex}");
if (StartRequested == null)
return Error("Screen recording not available");
try
{
var recordingId = await StartRequested(new ScreenRecordStartArgs
{
Fps = fps,
ScreenIndex = screenIndex,
});
return Success(new { recordingId });
}
catch (Exception ex)
{
Logger.Error("screen.record.start failed", ex);
return Error($"Start failed: {ex.Message}");
}
}
private async Task<NodeInvokeResponse> HandleStopAsync(NodeInvokeRequest request)
{
var recordingId = GetStringArg(request.Args, "recordingId", "");
Logger.Info($"screen.record.stop: recordingId={recordingId}");
if (string.IsNullOrEmpty(recordingId))
return Error("recordingId is required");
if (StopRequested == null)
return Error("Screen recording not available");
try
{
var result = await StopRequested(recordingId);
return Success(new
{
format = result.Format,
base64 = result.Base64,
filePath = result.FilePath,
durationMs = result.DurationMs,
fps = result.Fps,
screenIndex = result.ScreenIndex,
width = result.Width,
height = result.Height,
hasAudio = result.HasAudio,
});
}
catch (Exception ex)
{
Logger.Error("screen.record.stop failed", ex);
return Error($"Stop failed: {ex.Message}");
}
}
}
/// <summary>
/// Parameters for a fixed-duration screen recording.
/// Memory usage: width × height × 4 bytes × (durationMs/1000 × fps) frames.
/// At 1080p a frame is about 8.3 MB, so 500 MB holds roughly 63 frames: keep durationMs ≤ 6 000 at fps = 10 to stay under the cap.
/// The service enforces a hard 500 MB frame-buffer cap and stops capture early if exceeded.
/// </summary>
public class ScreenRecordArgs
{
public int DurationMs { get; set; } = 5000;
public int Fps { get; set; } = 10;
public int ScreenIndex { get; set; }
}
/// <summary>
/// Parameters for an open-ended screen recording session (screen.record.start / screen.record.stop).
/// The same 500 MB frame-buffer cap applies; capture stops automatically if the limit is hit.
/// </summary>
public class ScreenRecordStartArgs
{
public int Fps { get; set; } = 10;
public int ScreenIndex { get; set; }
}
public class ScreenRecordResult
{
public string Base64 { get; set; } = "";
public string Format { get; set; } = "mp4";
public string? FilePath { get; set; }
public int DurationMs { get; set; }
public int Fps { get; set; }
public int ScreenIndex { get; set; }
public int Width { get; set; }
public int Height { get; set; }
public bool HasAudio { get; set; }
}
public class ScreenCaptureArgs


@@ -20,6 +20,7 @@ public class NodeService : IDisposable
private WindowsNodeClient? _nodeClient;
private CanvasWindow? _canvasWindow;
private ScreenCaptureService? _screenCaptureService;
private ScreenRecordingService? _screenRecordingService;
private CameraCaptureService? _cameraCaptureService;
private DateTime _lastScreenCaptureNotification = DateTime.MinValue;
private string? _a2uiHostUrl;
@@ -49,8 +50,9 @@ public class NodeService : IDisposable
_logger = logger;
_dispatcherQueue = dispatcherQueue;
_dataPath = dataPath;
_screenCaptureService = new ScreenCaptureService(logger);
_cameraCaptureService = new CameraCaptureService(logger);
_screenCaptureService = new ScreenCaptureService(logger);
_screenRecordingService = new ScreenRecordingService(logger);
_cameraCaptureService = new CameraCaptureService(logger);
}
/// <summary>
@@ -125,8 +127,11 @@ public class NodeService : IDisposable
// Screen capability
_screenCapability = new ScreenCapability(_logger);
_screenCapability.ListRequested += OnScreenList;
_screenCapability.ListRequested += OnScreenList;
_screenCapability.CaptureRequested += OnScreenCapture;
_screenCapability.RecordRequested += OnScreenRecord;
_screenCapability.StartRequested += OnScreenRecordStart;
_screenCapability.StopRequested += OnScreenRecordStop;
_nodeClient.RegisterCapability(_screenCapability);
// Camera capability
@@ -432,7 +437,31 @@ public class NodeService : IDisposable
return await _screenCaptureService.CaptureAsync(args);
}
private Task<ScreenRecordResult> OnScreenRecord(ScreenRecordArgs args)
{
if (_screenRecordingService == null)
throw new InvalidOperationException("Screen recording service not available");
return _screenRecordingService.RecordAsync(args);
}
private Task<string> OnScreenRecordStart(ScreenRecordStartArgs args)
{
if (_screenRecordingService == null)
throw new InvalidOperationException("Screen recording service not available");
return _screenRecordingService.StartAsync(args);
}
private Task<ScreenRecordResult> OnScreenRecordStop(string recordingId)
{
if (_screenRecordingService == null)
throw new InvalidOperationException("Screen recording service not available");
return _screenRecordingService.StopAsync(recordingId);
}
#endregion
#region Camera Capability Handlers
@@ -483,6 +512,7 @@ public class NodeService : IDisposable
_nodeClient = null;
try { client?.Dispose(); } catch { /* ignore */ }
try { _screenRecordingService?.Dispose(); } catch { /* ignore */ }
try { _cameraCaptureService?.Dispose(); } catch { /* ignore */ }
if (_canvasWindow != null && !_canvasWindow.IsClosed)


@@ -0,0 +1,562 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO; // Path, Directory and File are used by SaveToTempFile
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;
using Windows.Graphics.Capture;
using Windows.Graphics.DirectX;
using Windows.Graphics.DirectX.Direct3D11;
using Windows.Graphics.Imaging;
using Windows.Media.Core;
using Windows.Media.MediaProperties;
using Windows.Media.Transcoding;
using Windows.Storage.Streams;
using OpenClaw.Shared;
using OpenClaw.Shared.Capabilities;
using WinRT;
namespace OpenClawTray.Services;
/// <summary>
/// Records the screen using Windows.Graphics.Capture and encodes to MP4 via MediaTranscoder.
/// </summary>
internal sealed class ScreenRecordingService : IDisposable
{
private readonly IOpenClawLogger _logger;
private readonly ConcurrentDictionary<string, ActiveSession> _sessions = new();
private const int MaxFps = 60;
private const int MinFps = 1;
private const int MinDurationMs = 250;
private const int MaxDurationMs = 60_000;
private const int PoolBuffers = 2;
// BGRA frame buffer safety cap: 500 MB across all buffered frames.
// At 1080p (~8.3 MB/frame) this allows ~63 frames; at 720p (~3.7 MB/frame) ~142 frames.
// Capture stops early once the cap is reached, to prevent OOM on long/high-fps recordings.
private const long MaxFrameBufferBytes = 500L * 1024 * 1024;
public ScreenRecordingService(IOpenClawLogger logger)
{
_logger = logger;
}
// ── Public API ────────────────────────────────────────────────────────────
public async Task<ScreenRecordResult> RecordAsync(ScreenRecordArgs args)
{
var durationMs = Math.Clamp(args.DurationMs, MinDurationMs, MaxDurationMs);
var fps = Math.Clamp(args.Fps, MinFps, MaxFps);
var screenIndex = args.ScreenIndex;
_logger.Info($"[ScreenRecording] duration={durationMs}ms fps={fps} screen={screenIndex}");
var item = CreateCaptureItem(screenIndex);
var width = item.Size.Width;
var height = item.Size.Height;
var d3d = CreateDirect3DDevice();
Direct3D11CaptureFramePool? pool = null;
GraphicsCaptureSession? session = null;
var latestFrame = (Direct3D11CaptureFrame?)null;
using var ready = new SemaphoreSlim(0, 1);
var frames = new List<byte[]>();
var frameBytes = (long)width * height * 4; // BGRA bytes per frame
try
{
pool = Direct3D11CaptureFramePool.CreateFreeThreaded(
d3d,
DirectXPixelFormat.B8G8R8A8UIntNormalized,
PoolBuffers,
new global::Windows.Graphics.SizeInt32 { Width = width, Height = height });
session = pool.CreateCaptureSession(item);
session.IsCursorCaptureEnabled = false;
pool.FrameArrived += (p, _) =>
{
var f = p.TryGetNextFrame();
if (f == null) return;
Interlocked.Exchange(ref latestFrame, f)?.Dispose();
try { ready.Release(); } catch { /* already signaled */ }
};
session.StartCapture();
var intervalMs = 1000 / fps;
var deadline = DateTime.UtcNow.AddMilliseconds(durationMs);
var nextCapture = DateTime.UtcNow;
while (DateTime.UtcNow < deadline)
{
var waitMs = (int)(nextCapture - DateTime.UtcNow).TotalMilliseconds;
if (waitMs > 0)
await Task.Delay(waitMs);
if (!await ready.WaitAsync(intervalMs * 2))
continue;
var frame = Interlocked.Exchange(ref latestFrame, null);
if (frame == null) continue;
using (frame)
{
if (frames.Count * frameBytes >= MaxFrameBufferBytes)
{
_logger.Warn($"[ScreenRecording] Frame buffer cap reached ({MaxFrameBufferBytes / 1024 / 1024} MB), stopping early.");
break;
}
try
{
var bmp = await SoftwareBitmap.CreateCopyFromSurfaceAsync(frame.Surface);
frames.Add(ExtractBitmapBytes(bmp));
}
catch (Exception ex)
{
_logger.Warn($"[ScreenRecording] Frame skipped: {ex.Message}");
}
}
nextCapture = nextCapture.AddMilliseconds(intervalMs);
}
}
finally
{
session?.Dispose();
pool?.Dispose();
Interlocked.Exchange(ref latestFrame, null)?.Dispose();
}
_logger.Info($"[ScreenRecording] Captured {frames.Count} frames, encoding...");
var base64 = await EncodeToMp4Async(frames, width, height, fps);
var filePath = SaveToTempFile(base64);
return new ScreenRecordResult
{
Format = "mp4",
Base64 = base64,
FilePath = filePath,
DurationMs = durationMs,
Fps = fps,
ScreenIndex = screenIndex,
Width = width,
Height = height,
HasAudio = false,
};
}
public Task<string> StartAsync(ScreenRecordStartArgs args)
{
var fps = Math.Clamp(args.Fps, MinFps, MaxFps);
var screenIndex = args.ScreenIndex;
_logger.Info($"[ScreenRecording] start fps={fps} screen={screenIndex}");
var item = CreateCaptureItem(screenIndex);
var width = item.Size.Width;
var height = item.Size.Height;
var d3d = CreateDirect3DDevice();
var pool = Direct3D11CaptureFramePool.CreateFreeThreaded(
d3d,
DirectXPixelFormat.B8G8R8A8UIntNormalized,
PoolBuffers,
new global::Windows.Graphics.SizeInt32 { Width = width, Height = height });
var captureSession = pool.CreateCaptureSession(item);
captureSession.IsCursorCaptureEnabled = false;
var session = new ActiveSession(screenIndex, fps, width, height, pool, captureSession, _logger);
_sessions[session.Id] = session;
_logger.Info($"[ScreenRecording] started session {session.Id}");
return Task.FromResult(session.Id);
}
public async Task<ScreenRecordResult> StopAsync(string recordingId)
{
if (!_sessions.TryRemove(recordingId, out var session))
throw new KeyNotFoundException($"Recording session '{recordingId}' not found");
_logger.Info($"[ScreenRecording] stopping session {recordingId}...");
List<byte[]> frames;
int width, height, fps, screenIndex, durationMs;
using (session)
{
(frames, durationMs) = await session.StopAsync();
width = session.Width;
height = session.Height;
fps = session.Fps;
screenIndex = session.ScreenIndex;
}
_logger.Info($"[ScreenRecording] session {recordingId}: {frames.Count} frames, encoding...");
var base64 = await EncodeToMp4Async(frames, width, height, fps);
var filePath = SaveToTempFile(base64);
return new ScreenRecordResult
{
Format = "mp4",
Base64 = base64,
FilePath = filePath,
DurationMs = durationMs,
Fps = fps,
ScreenIndex = screenIndex,
Width = width,
Height = height,
HasAudio = false,
};
}
public void Dispose()
{
foreach (var kv in _sessions)
{
if (_sessions.TryRemove(kv.Key, out var s))
try { s.Dispose(); } catch { }
}
}
// ── Temp file ─────────────────────────────────────────────────────────────
private string SaveToTempFile(string base64)
{
var dir = Path.Combine(Path.GetTempPath(), "openclaw");
Directory.CreateDirectory(dir);
var path = Path.Combine(dir, $"openclaw-screen-record-{Guid.NewGuid()}.mp4");
File.WriteAllBytes(path, Convert.FromBase64String(base64));
_logger.Info($"[ScreenRecording] Saved to {path}");
return path;
}
// ── Encoding ──────────────────────────────────────────────────────────────
private static async Task<string> EncodeToMp4Async(
List<byte[]> frames, int width, int height, int fps)
{
if (frames.Count == 0)
throw new InvalidOperationException("No frames to encode");
var encWidth = (uint)(width & ~1);
var encHeight = (uint)(height & ~1);
var fi = new[] { 0 };
MediaStreamSource MakeMss()
{
fi[0] = 0;
var inputProps = VideoEncodingProperties.CreateUncompressed(
MediaEncodingSubtypes.Nv12, encWidth, encHeight);
inputProps.FrameRate.Numerator = (uint)fps;
inputProps.FrameRate.Denominator = 1;
var mss = new MediaStreamSource(new VideoStreamDescriptor(inputProps));
mss.BufferTime = TimeSpan.Zero;
mss.SampleRequested += (_, e) =>
{
if (fi[0] >= frames.Count) { e.Request.Sample = null; return; }
var nv12 = BgraToNv12(frames[fi[0]], width, height, (int)encWidth, (int)encHeight);
var ts = TimeSpan.FromTicks((long)(fi[0] * 10_000_000.0 / fps));
var dur = TimeSpan.FromTicks((long)(10_000_000.0 / fps));
var dw = new DataWriter();
dw.WriteBytes(nv12);
var sample = MediaStreamSample.CreateFromBuffer(dw.DetachBuffer(), ts);
sample.Duration = dur;
e.Request.Sample = sample;
fi[0]++;
};
return mss;
}
MediaEncodingProfile MakeProfile()
{
var profile = MediaEncodingProfile.CreateMp4(VideoEncodingQuality.Auto);
profile.Video.Width = encWidth;
profile.Video.Height = encHeight;
profile.Video.Bitrate = 4_000_000;
profile.Video.FrameRate.Numerator = (uint)fps;
profile.Video.FrameRate.Denominator = 1;
profile.Audio = null;
return profile;
}
foreach (var hwEnabled in new[] { true, false })
{
using var output = new InMemoryRandomAccessStream();
var transcoder = new MediaTranscoder { HardwareAccelerationEnabled = hwEnabled };
PrepareTranscodeResult result;
try
{
result = await transcoder
.PrepareMediaStreamSourceTranscodeAsync(MakeMss(), output, MakeProfile());
}
catch (System.Runtime.InteropServices.COMException) when (hwEnabled)
{
continue;
}
if (!result.CanTranscode) continue;
await result.TranscodeAsync();
var size = (uint)output.Size;
if (size == 0) continue;
var dr = new DataReader(output.GetInputStreamAt(0));
await dr.LoadAsync(size);
var bytes = new byte[size];
dr.ReadBytes(bytes);
return Convert.ToBase64String(bytes);
}
throw new InvalidOperationException("No encoder available (hardware or software)");
}
private static byte[] BgraToNv12(byte[] bgra, int srcWidth, int srcHeight,
int encWidth, int encHeight)
{
var nv12 = new byte[encWidth * encHeight * 3 / 2];
for (int y = 0; y < encHeight; y++)
for (int x = 0; x < encWidth; x++)
{
int i = (y * srcWidth + x) * 4;
int b = bgra[i], g = bgra[i + 1], r = bgra[i + 2];
nv12[y * encWidth + x] = (byte)(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
}
int uvBase = encWidth * encHeight;
for (int y = 0; y < encHeight; y += 2)
for (int x = 0; x < encWidth; x += 2)
{
int i = (y * srcWidth + x) * 4;
int b = bgra[i], g = bgra[i + 1], r = bgra[i + 2];
int uvIdx = uvBase + (y / 2) * encWidth + x;
nv12[uvIdx] = (byte)(((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128);
nv12[uvIdx + 1] = (byte)(((112 * r - 94 * g - 18 * b + 128) >> 8) + 128);
}
return nv12;
}
// ── D3D11 / WinRT interop ─────────────────────────────────────────────────
// IID_IDXGIDevice
private static readonly Guid IID_DXGIDevice =
new Guid("54ec77fa-1377-44e6-8c32-88fd5f44c84c");
private static IDirect3DDevice CreateDirect3DDevice()
{
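// D3D_DRIVER_TYPE_HARDWARE=1, D3D11_CREATE_DEVICE_BGRA_SUPPORT=0x20, D3D11_SDK_VERSION=7
// Throw on a failed HRESULT instead of continuing with a null device pointer.
Marshal.ThrowExceptionForHR(
D3D11CreateDevice(IntPtr.Zero, 1, IntPtr.Zero, 0x20, IntPtr.Zero, 0, 7,
out var d3dPtr, IntPtr.Zero, IntPtr.Zero));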
var iid = IID_DXGIDevice;
Marshal.QueryInterface(d3dPtr, ref iid, out var dxgiPtr);
Marshal.Release(d3dPtr);
NativeCreateDirect3D11DeviceFromDXGIDevice(dxgiPtr, out var winrtPtr);
Marshal.Release(dxgiPtr);
var device = MarshalInterface<IDirect3DDevice>.FromAbi(winrtPtr);
Marshal.Release(winrtPtr);
return device;
}
private static GraphicsCaptureItem CreateCaptureItem(int screenIndex)
{
var monitors = GetMonitorHandles();
if (screenIndex < 0 || screenIndex >= monitors.Count)
screenIndex = 0;
const string classId = "Windows.Graphics.Capture.GraphicsCaptureItem";
var iid = typeof(IGraphicsCaptureItemInterop).GUID;
WindowsCreateString(classId, classId.Length, out var hstring);
try
{
RoGetActivationFactory(hstring, ref iid, out var factoryPtr);
var factory = (IGraphicsCaptureItemInterop)Marshal.GetObjectForIUnknown(factoryPtr);
Marshal.Release(factoryPtr);
var itemIid = new Guid("AF86E2E0-B12D-4C6A-9C5A-D7AA65101E90"); // IInspectable
factory.CreateForMonitor(monitors[screenIndex], in itemIid, out var itemPtr);
var item = MarshalInspectable<GraphicsCaptureItem>.FromAbi(itemPtr);
Marshal.Release(itemPtr);
return item;
}
finally
{
WindowsDeleteString(hstring);
}
}
private static List<IntPtr> GetMonitorHandles()
{
var handles = new List<IntPtr>();
EnumDisplayMonitors(IntPtr.Zero, IntPtr.Zero,
(IntPtr hMon, IntPtr hdc, ref RECT rect, IntPtr data) => { handles.Add(hMon); return true; },
IntPtr.Zero);
return handles;
}
private static byte[] ExtractBitmapBytes(SoftwareBitmap bitmap)
{
var capacity = (uint)(bitmap.PixelWidth * bitmap.PixelHeight * 4);
var buf = new global::Windows.Storage.Streams.Buffer(capacity);
bitmap.CopyToBuffer(buf);
using var dr = DataReader.FromBuffer(buf);
var bytes = new byte[buf.Length];
dr.ReadBytes(bytes);
return bytes;
}
// ── P/Invoke declarations ─────────────────────────────────────────────────
[DllImport("d3d11.dll")]
private static extern int D3D11CreateDevice(
IntPtr pAdapter, uint DriverType, IntPtr Software, uint Flags,
IntPtr pFeatureLevels, uint FeatureLevels, uint SDKVersion,
out IntPtr ppDevice, IntPtr pFeatureLevel, IntPtr ppImmediateContext);
[DllImport("d3d11.dll", EntryPoint = "CreateDirect3D11DeviceFromDXGIDevice")]
private static extern int NativeCreateDirect3D11DeviceFromDXGIDevice(
IntPtr dxgiDevice, out IntPtr graphicsDevice);
[DllImport("combase.dll")]
private static extern int WindowsCreateString(
[MarshalAs(UnmanagedType.LPWStr)] string sourceString, int length, out IntPtr hstring);
[DllImport("combase.dll")]
private static extern int WindowsDeleteString(IntPtr hstring);
[DllImport("combase.dll")]
private static extern int RoGetActivationFactory(
IntPtr runtimeClassId, ref Guid iid, out IntPtr factory);
[DllImport("user32.dll")]
private static extern bool EnumDisplayMonitors(
IntPtr hdc, IntPtr lprcClip, MonitorEnumProc lpfnEnum, IntPtr dwData);
private delegate bool MonitorEnumProc(
IntPtr hMonitor, IntPtr hdcMonitor, ref RECT lprcMonitor, IntPtr dwData);
[StructLayout(LayoutKind.Sequential)]
private struct RECT { public int Left, Top, Right, Bottom; }
[ComImport]
[Guid("3628E81B-3CAC-4C60-B7F4-23CE0E0C3356")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
private interface IGraphicsCaptureItemInterop
{
void CreateForWindow(IntPtr hwnd, in Guid riid, out IntPtr ppv);
void CreateForMonitor(IntPtr hMonitor, in Guid riid, out IntPtr ppv);
}
// ── Active session (start/stop) ───────────────────────────────────────────
private sealed class ActiveSession : IDisposable
{
public readonly string Id = Guid.NewGuid().ToString("N")[..12];
public readonly int ScreenIndex;
public readonly int Fps;
public readonly int Width;
public readonly int Height;
private readonly IOpenClawLogger _logger;
private readonly List<byte[]> _frames = new();
private readonly object _framesLock = new();
private readonly CancellationTokenSource _cts = new();
private readonly Direct3D11CaptureFramePool _pool;
private readonly GraphicsCaptureSession _session;
private readonly DateTime _startedAt = DateTime.UtcNow;
private volatile Direct3D11CaptureFrame? _latestFrame;
private readonly SemaphoreSlim _ready = new(0, 1);
private readonly Task _captureTask;
public ActiveSession(int screenIndex, int fps, int width, int height,
Direct3D11CaptureFramePool pool, GraphicsCaptureSession session,
IOpenClawLogger logger)
{
ScreenIndex = screenIndex; Fps = fps; Width = width; Height = height;
_pool = pool; _session = session; _logger = logger;
pool.FrameArrived += OnFrameArrived;
session.StartCapture();
_captureTask = RunAsync(_cts.Token);
}
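private void OnFrameArrived(Direct3D11CaptureFramePool pool, object _)
{
var f = pool.TryGetNextFrame();
if (f == null) return;
// Keep only the newest frame; dispose the one it replaces so it returns to the pool.
Interlocked.Exchange(ref _latestFrame, f)?.Dispose();
try { _ready.Release(); }
catch (SemaphoreFullException) { /* already signaled */ }
catch (ObjectDisposedException) { /* session disposed while a frame was in flight */ }
}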
private async Task RunAsync(CancellationToken ct)
{
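// Clamp so a non-positive Fps cannot divide by zero and a very high Fps cannot yield a 0 ms interval.
var intervalMs = Math.Max(1, 1000 / Math.Max(1, Fps));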
var nextCapture = DateTime.UtcNow;
var frameBytes = (long)Width * Height * 4;
while (!ct.IsCancellationRequested)
{
try
{
var waitMs = (int)(nextCapture - DateTime.UtcNow).TotalMilliseconds;
if (waitMs > 0) await Task.Delay(waitMs, ct);
if (!await _ready.WaitAsync(intervalMs * 2, ct)) continue;
}
catch (OperationCanceledException) { break; }
var frame = Interlocked.Exchange(ref _latestFrame, null);
if (frame == null) continue;
using (frame)
{
int frameCount;
lock (_framesLock) frameCount = _frames.Count;
if (frameCount * frameBytes >= MaxFrameBufferBytes)
{
_logger.Warn($"[ScreenRecording] Session {Id}: frame buffer cap reached ({MaxFrameBufferBytes / 1024 / 1024} MB), stopping capture.");
_cts.Cancel();
break;
}
try
{
// SoftwareBitmap is IDisposable; release it as soon as the bytes are copied out.
using var bmp = await SoftwareBitmap.CreateCopyFromSurfaceAsync(frame.Surface);
var bytes = ExtractBitmapBytes(bmp);
lock (_framesLock) _frames.Add(bytes);
}
catch (Exception ex)
{
_logger.Warn($"[ScreenRecording] Session {Id} frame skipped: {ex.Message}");
}
}
nextCapture = nextCapture.AddMilliseconds(intervalMs);
}
}
public async Task<(List<byte[]> frames, int durationMs)> StopAsync()
{
_cts.Cancel();
try { await _captureTask; } catch { /* cancellation is expected; any other failure must not block returning the captured frames */ }
var durationMs = (int)(DateTime.UtcNow - _startedAt).TotalMilliseconds;
List<byte[]> snapshot;
lock (_framesLock) snapshot = new List<byte[]>(_frames);
return (snapshot, durationMs);
}
public void Dispose()
{
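_cts.Cancel();
// Detach the handler first so no new frames are queued during teardown.
try { _pool.FrameArrived -= OnFrameArrived; } catch { }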
try { _session.Dispose(); } catch { }
try { _pool.Dispose(); } catch { }
Interlocked.Exchange(ref _latestFrame, null)?.Dispose();
_cts.Dispose();
_ready.Dispose();
}
}
}

@@ -683,7 +683,8 @@ public class ScreenCapabilityTests
var cap = new ScreenCapability(NullLogger.Instance);
Assert.True(cap.CanHandle("screen.capture"));
Assert.True(cap.CanHandle("screen.list"));
-Assert.False(cap.CanHandle("screen.record"));
+Assert.True(cap.CanHandle("screen.record"));
Assert.False(cap.CanHandle("screen.unknown"));
Assert.Equal("screen", cap.Category);
}
@@ -835,6 +836,281 @@ public class ScreenCapabilityTests
Assert.NotNull(receivedArgs);
Assert.Equal(2, receivedArgs!.MonitorIndex);
}
[Fact]
public async Task Record_ReturnsError_WhenNoHandler()
{
var cap = new ScreenCapability(NullLogger.Instance);
var req = new NodeInvokeRequest { Id = "sr1", Command = "screen.record", Args = Parse("""{}""") };
var res = await cap.ExecuteAsync(req);
Assert.False(res.Ok);
Assert.Contains("not available", res.Error, StringComparison.OrdinalIgnoreCase);
}
[Fact]
public async Task Record_CallsHandler_WithArgs()
{
var cap = new ScreenCapability(NullLogger.Instance);
ScreenRecordArgs? receivedArgs = null;
cap.RecordRequested += (args) =>
{
receivedArgs = args;
return Task.FromResult(new ScreenRecordResult
{
Format = "mp4", Base64 = "vid", DurationMs = 2000, Fps = 10,
ScreenIndex = 1, Width = 1920, Height = 1080
});
};
var req = new NodeInvokeRequest
{
Id = "sr2",
Command = "screen.record",
Args = Parse("""{"durationMs":2000,"fps":10,"screenIndex":1}""")
};
var res = await cap.ExecuteAsync(req);
Assert.True(res.Ok);
Assert.NotNull(receivedArgs);
Assert.Equal(2000, receivedArgs!.DurationMs);
Assert.Equal(10, receivedArgs.Fps);
Assert.Equal(1, receivedArgs.ScreenIndex);
var json = JsonSerializer.Serialize(res.Payload);
using var doc = JsonDocument.Parse(json);
var root = doc.RootElement;
Assert.Equal("mp4", root.GetProperty("format").GetString());
Assert.Equal("vid", root.GetProperty("base64").GetString());
Assert.Equal(2000, root.GetProperty("durationMs").GetInt32());
Assert.Equal(10, root.GetProperty("fps").GetInt32());
Assert.Equal(1, root.GetProperty("screenIndex").GetInt32());
Assert.Equal(1920, root.GetProperty("width").GetInt32());
Assert.Equal(1080, root.GetProperty("height").GetInt32());
Assert.False(root.GetProperty("hasAudio").GetBoolean());
}
[Fact]
public async Task Record_UsesDefaults_WhenArgsMissing()
{
var cap = new ScreenCapability(NullLogger.Instance);
ScreenRecordArgs? receivedArgs = null;
cap.RecordRequested += (args) =>
{
receivedArgs = args;
return Task.FromResult(new ScreenRecordResult());
};
var req = new NodeInvokeRequest { Id = "sr3", Command = "screen.record", Args = Parse("""{}""") };
var res = await cap.ExecuteAsync(req);
Assert.True(res.Ok);
Assert.Equal(5000, receivedArgs!.DurationMs);
Assert.Equal(10, receivedArgs.Fps);
Assert.Equal(0, receivedArgs.ScreenIndex);
}
[Fact]
public async Task Record_UsesMonitorAlias_ForScreenIndex()
{
var cap = new ScreenCapability(NullLogger.Instance);
ScreenRecordArgs? receivedArgs = null;
cap.RecordRequested += (args) =>
{
receivedArgs = args;
return Task.FromResult(new ScreenRecordResult());
};
var req = new NodeInvokeRequest
{
Id = "sr4",
Command = "screen.record",
Args = Parse("""{"monitor":2}""")
};
var res = await cap.ExecuteAsync(req);
Assert.True(res.Ok);
Assert.Equal(2, receivedArgs!.ScreenIndex);
}
[Fact]
public async Task Record_ReturnsError_WhenHandlerThrows()
{
var cap = new ScreenCapability(NullLogger.Instance);
cap.RecordRequested += (args) => throw new InvalidOperationException("GPU capture failed");
var req = new NodeInvokeRequest { Id = "sr5", Command = "screen.record", Args = Parse("""{}""") };
var res = await cap.ExecuteAsync(req);
Assert.False(res.Ok);
Assert.Contains("GPU capture failed", res.Error);
}
// ── screen.record.start ────────────────────────────────────────────────────
[Fact]
public void CanHandle_RecordStartStop()
{
var cap = new ScreenCapability(NullLogger.Instance);
Assert.True(cap.CanHandle("screen.record.start"));
Assert.True(cap.CanHandle("screen.record.stop"));
Assert.False(cap.CanHandle("screen.record.pause"));
}
[Fact]
public async Task Start_ReturnsError_WhenNoHandler()
{
var cap = new ScreenCapability(NullLogger.Instance);
var req = new NodeInvokeRequest { Id = "ss1", Command = "screen.record.start", Args = Parse("""{}""") };
var res = await cap.ExecuteAsync(req);
Assert.False(res.Ok);
Assert.Contains("not available", res.Error!, StringComparison.OrdinalIgnoreCase);
}
[Fact]
public async Task Start_CallsHandler_WithArgs_AndReturnsRecordingId()
{
var cap = new ScreenCapability(NullLogger.Instance);
ScreenRecordStartArgs? receivedArgs = null;
cap.StartRequested += args =>
{
receivedArgs = args;
return Task.FromResult("abc123");
};
var req = new NodeInvokeRequest
{
Id = "ss2",
Command = "screen.record.start",
Args = Parse("""{"fps":15,"screenIndex":2}""")
};
var res = await cap.ExecuteAsync(req);
Assert.True(res.Ok);
Assert.NotNull(receivedArgs);
Assert.Equal(15, receivedArgs!.Fps);
Assert.Equal(2, receivedArgs.ScreenIndex);
var json = JsonSerializer.Serialize(res.Payload);
using var doc = JsonDocument.Parse(json);
Assert.Equal("abc123", doc.RootElement.GetProperty("recordingId").GetString());
}
[Fact]
public async Task Start_UsesMonitorAlias_ForScreenIndex()
{
var cap = new ScreenCapability(NullLogger.Instance);
ScreenRecordStartArgs? receivedArgs = null;
cap.StartRequested += args => { receivedArgs = args; return Task.FromResult("id1"); };
var req = new NodeInvokeRequest
{
Id = "ss3",
Command = "screen.record.start",
Args = Parse("""{"monitor":1}""")
};
var res = await cap.ExecuteAsync(req);
Assert.True(res.Ok);
Assert.Equal(1, receivedArgs!.ScreenIndex);
}
[Fact]
public async Task Start_ReturnsError_WhenHandlerThrows()
{
var cap = new ScreenCapability(NullLogger.Instance);
cap.StartRequested += _ => throw new InvalidOperationException("D3D init failed");
var req = new NodeInvokeRequest { Id = "ss4", Command = "screen.record.start", Args = Parse("""{}""") };
var res = await cap.ExecuteAsync(req);
Assert.False(res.Ok);
Assert.Contains("D3D init failed", res.Error);
}
// ── screen.record.stop ─────────────────────────────────────────────────────
[Fact]
public async Task Stop_ReturnsError_WhenNoHandler()
{
var cap = new ScreenCapability(NullLogger.Instance);
var req = new NodeInvokeRequest
{
Id = "st1",
Command = "screen.record.stop",
Args = Parse("""{"recordingId":"abc"}""")
};
var res = await cap.ExecuteAsync(req);
Assert.False(res.Ok);
Assert.Contains("not available", res.Error!, StringComparison.OrdinalIgnoreCase);
}
[Fact]
public async Task Stop_ReturnsError_WhenMissingRecordingId()
{
var cap = new ScreenCapability(NullLogger.Instance);
cap.StopRequested += _ => Task.FromResult(new ScreenRecordResult());
var req = new NodeInvokeRequest { Id = "st2", Command = "screen.record.stop", Args = Parse("""{}""") };
var res = await cap.ExecuteAsync(req);
Assert.False(res.Ok);
Assert.Contains("recordingId", res.Error!, StringComparison.OrdinalIgnoreCase);
}
[Fact]
public async Task Stop_CallsHandler_WithRecordingId_AndReturnsFullPayload()
{
var cap = new ScreenCapability(NullLogger.Instance);
string? receivedId = null;
cap.StopRequested += id =>
{
receivedId = id;
return Task.FromResult(new ScreenRecordResult
{
Format = "mp4",
Base64 = "dGVzdA==",
DurationMs = 3200,
Fps = 15,
ScreenIndex = 1,
Width = 1920,
Height = 1080,
HasAudio = false,
});
};
var req = new NodeInvokeRequest
{
Id = "st3",
Command = "screen.record.stop",
Args = Parse("""{"recordingId":"myRecId"}""")
};
var res = await cap.ExecuteAsync(req);
Assert.True(res.Ok);
Assert.Equal("myRecId", receivedId);
var json = JsonSerializer.Serialize(res.Payload);
using var doc = JsonDocument.Parse(json);
var p = doc.RootElement;
Assert.Equal("mp4", p.GetProperty("format").GetString());
Assert.Equal("dGVzdA==", p.GetProperty("base64").GetString());
Assert.Equal(3200, p.GetProperty("durationMs").GetInt32());
Assert.Equal(15, p.GetProperty("fps").GetInt32());
Assert.Equal(1, p.GetProperty("screenIndex").GetInt32());
Assert.Equal(1920, p.GetProperty("width").GetInt32());
Assert.Equal(1080, p.GetProperty("height").GetInt32());
Assert.False(p.GetProperty("hasAudio").GetBoolean());
}
[Fact]
public async Task Stop_ReturnsError_WhenHandlerThrows()
{
var cap = new ScreenCapability(NullLogger.Instance);
cap.StopRequested += _ => throw new KeyNotFoundException("session not found");
var req = new NodeInvokeRequest
{
Id = "st4",
Command = "screen.record.stop",
Args = Parse("""{"recordingId":"bad"}""")
};
var res = await cap.ExecuteAsync(req);
Assert.False(res.Ok);
Assert.Contains("session not found", res.Error);
}
}
public class CameraCapabilityTests