feat: Add explicit communication instructions for o3 model

- Enhanced system prompt to explicitly request thinking out loud
- Added instructions for models to share their reasoning process
- Increased temperature for o3 model to encourage more verbose output
- Set maxTokens to 4096 to ensure room for explanations

This should help make o3's thought process visible to users.
Peter Steinberger 2025-07-26 17:31:26 +02:00
parent f30b53b32c
commit ffcfaa052d
4 changed files with 204 additions and 24 deletions


@@ -38,9 +38,23 @@ npm run poltergeist:haunt
Once Poltergeist is running, you can assume the CLI is always fresh and up-to-date. No manual rebuilding is needed!
If you encounter a "build staleness" error when running the CLI:
1. Wait 1 second for the rebuild to complete
2. Try running the command again
**IMPORTANT**: Always use the smart wrapper script instead of calling the CLI directly:
```bash
# WRONG: ./peekaboo command
# RIGHT: ./scripts/peekaboo-wait.sh command
```
The wrapper automatically:
- Checks if the binary is fresh (newer than Swift sources)
- Waits for any ongoing Poltergeist rebuilds (up to 30 seconds)
- Runs the CLI once ready
This eliminates "build staleness" errors and ensures you're always using the latest code.
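The freshness check in the list above can be approximated in a few lines. This is a minimal sketch, not the wrapper's actual code: `is_fresh` is a hypothetical helper, and it assumes the sources are `*.swift` files under a single directory.

```bash
# Sketch only: is_fresh is a hypothetical helper, not part of Peekaboo.
# Succeeds when no *.swift file under source_dir is newer than the binary.
is_fresh() {
  local binary="$1" source_dir="$2"
  # A missing binary is never fresh
  [ -f "$binary" ] || return 1
  # find's -newer primary matches files modified more recently than the reference file
  local newer
  newer="$(find "$source_dir" -type f -name '*.swift' -newer "$binary" 2>/dev/null | head -n 1)"
  # Fresh only if find reported no newer source file
  [ -z "$newer" ]
}
```

For example, `is_fresh ./peekaboo ./Apps/CLI/Sources && echo "fresh"` would print `fresh` only when the binary exists and no source file in that directory has a later modification time.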
For debugging the wrapper, set `PEEKABOO_WAIT_DEBUG=true`:
```bash
PEEKABOO_WAIT_DEBUG=true ./scripts/peekaboo-wait.sh list apps
```
Poltergeist watches all Swift source files and automatically rebuilds when changes are detected.
@@ -208,29 +222,35 @@ peekaboo-mcp
```
### Using the Swift CLI directly
**ALWAYS use the smart wrapper to avoid build staleness issues:**
```bash
# Capture screenshots
./Apps/CLI/.build/debug/peekaboo image --app "Safari" --path screenshot.png
./Apps/CLI/.build/debug/peekaboo image --mode frontmost --path screenshot.png
# Create a convenient alias (add to your shell profile; the relative path resolves only from the repository root)
alias pb='./scripts/peekaboo-wait.sh'
# Then use it like:
pb image --app "Safari" --path screenshot.png
pb image --mode frontmost --path screenshot.png
# List applications or windows
./Apps/CLI/.build/debug/peekaboo list apps --json-output
./Apps/CLI/.build/debug/peekaboo list windows --app "Finder" --json-output
pb list apps --json-output
pb list windows --app "Finder" --json-output
# Analyze images with AI (NEW)
PEEKABOO_AI_PROVIDERS="openai/gpt-4.1" ./Apps/CLI/.build/debug/peekaboo analyze image.png "What is shown in this image?"
PEEKABOO_AI_PROVIDERS="ollama/llava:latest" ./Apps/CLI/.build/debug/peekaboo analyze image.png "Describe this screenshot" --json-output
# Analyze images with AI
PEEKABOO_AI_PROVIDERS="openai/gpt-4.1" pb analyze image.png "What is shown in this image?"
PEEKABOO_AI_PROVIDERS="ollama/llava:latest" pb analyze image.png "Describe this screenshot" --json-output
# Use multiple AI providers (auto-selects first available)
PEEKABOO_AI_PROVIDERS="openai/gpt-4.1,ollama/llava:latest" ./Apps/CLI/.build/debug/peekaboo analyze image.png "What application is this?"
PEEKABOO_AI_PROVIDERS="openai/gpt-4.1,ollama/llava:latest" pb analyze image.png "What application is this?"
# Configuration management (UPDATED)
./Apps/CLI/.build/debug/peekaboo config init # Create default config file
./Apps/CLI/.build/debug/peekaboo config show # Display current config
./Apps/CLI/.build/debug/peekaboo config show --effective # Show merged configuration
./Apps/CLI/.build/debug/peekaboo config edit # Edit config in default editor
./Apps/CLI/.build/debug/peekaboo config validate # Validate config syntax
./Apps/CLI/.build/debug/peekaboo config set-credential KEY VALUE # Set API key securely
# Configuration management
pb config init # Create default config file
pb config show # Display current config
pb config show --effective # Show merged configuration
pb config edit # Edit config in default editor
pb config validate # Validate config syntax
pb config set-credential KEY VALUE # Set API key securely
```
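Because the `pb` alias points at a relative path, it only resolves when the current directory is the repository root. One possible alternative is a shell function that builds the path from a variable. This is a sketch under an assumption: `PEEKABOO_ROOT` is a hypothetical variable you would set yourself, not something Peekaboo defines.

```bash
# Sketch only: PEEKABOO_ROOT is a hypothetical variable, not defined by Peekaboo.
# Set it to the absolute path of your repository checkout.
pb() {
  # ${VAR:?message} aborts with the message if PEEKABOO_ROOT is unset or empty
  "${PEEKABOO_ROOT:?set PEEKABOO_ROOT to the repository path}/scripts/peekaboo-wait.sh" "$@"
}
```

With `export PEEKABOO_ROOT="$HOME/Projects/Peekaboo"` (an example path) in your shell profile, `pb list apps --json-output` would then work from any directory.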
### Agent Command (NEW)


@@ -137,6 +137,8 @@ public final class PeekabooAgentService: AgentServiceProtocol {
tools: createPeekabooTools(),
modelSettings: ModelSettings(
modelName: modelName,
temperature: modelName == "o3" ? 0.7 : nil, // Slightly higher temperature for o3 to encourage more output
maxTokens: 4096, // Ensure we have room for explanations
toolChoice: .auto // Let model decide when to use tools
),
description: "An AI assistant for macOS automation using Peekaboo"
@@ -641,14 +643,26 @@ public final class PeekabooAgentService: AgentServiceProtocol {
IMPORTANT: You MUST use the provided tools to accomplish tasks. Do not describe what you would do - actually do it using the tools.
## Communication Style
You MUST communicate with the user throughout the process. Share your thinking and reasoning:
- ALWAYS explain your thought process before using tools
- State what you're about to do and why (e.g., "I need to see what's on screen, so I'll take a screenshot")
- After each tool completes, interpret the results and explain your next steps
- Share your reasoning when making decisions
- Think out loud - the user wants to understand your decision-making process
- Even if you're very capable, explain your approach so the user can follow along
## Critical Guidelines
1. **Be Resilient**: If a tool fails, try alternative approaches. Don't give up at the first error.
2. **Verify UI State**: ALWAYS take a screenshot after launching apps to see their current state (dialogs, intro screens, etc.)
3. **Complete ALL Tasks**: Read the user's request carefully and ensure you complete EVERY part, including any specific phrases they want you to say.
4. **Error Recovery**: When operations fail, analyze why and adapt your approach. If AppleScript fails, check your quoting!
5. **Dialog Handling**: Apps often show intro/welcome dialogs. Take a screenshot to see them, then click "Continue", "Get Started", or dismiss them.
6. **Final Response**: ALWAYS end with what you accomplished, what failed, and ANY requested output (like specific phrases the user wants).
2. **Communicate Your Actions**: Before using each tool, briefly explain what you're about to do and why. This helps the user understand your process.
3. **Verify UI State**: ALWAYS take a screenshot after launching apps to see their current state (dialogs, intro screens, etc.)
4. **Complete ALL Tasks**: Read the user's request carefully and ensure you complete EVERY part, including any specific phrases they want you to say.
5. **Error Recovery**: When operations fail, analyze why and adapt your approach. If AppleScript fails, check your quoting!
6. **Dialog Handling**: Apps often show intro/welcome dialogs. Take a screenshot to see them, then click "Continue", "Get Started", or dismiss them.
7. **Progress Updates**: After completing significant steps, briefly summarize what was accomplished before moving to the next step.
8. **Final Response**: ALWAYS end with what you accomplished, what failed, and ANY requested output (like specific phrases the user wants).
## Your Capabilities

Binary file not shown.

scripts/peekaboo-wait.sh Executable file

@@ -0,0 +1,146 @@
#!/bin/bash
# Smart CLI Wrapper for Peekaboo
# Automatically waits for Poltergeist rebuilds to complete before running
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
BINARY_PATH="$PROJECT_ROOT/peekaboo"
BUILD_LOCK="/tmp/peekaboo-swift-build.lock"
POLTERGEIST_LOG="$PROJECT_ROOT/.poltergeist.log"
MAX_WAIT=30 # Maximum seconds to wait for build
DEBUG="${PEEKABOO_WAIT_DEBUG:-false}"
# Debug logging
debug_log() {
if [ "$DEBUG" = "true" ]; then
echo "[peekaboo-wait] $1" >&2
fi
}
# Function to check if binary is newer than all Swift sources
is_binary_fresh() {
if [ ! -f "$BINARY_PATH" ]; then
debug_log "Binary not found at $BINARY_PATH"
return 1
fi
# Get binary modification time
if [[ "$OSTYPE" == "darwin"* ]]; then
BINARY_TIME=$(stat -f "%m" "$BINARY_PATH" 2>/dev/null)
else
BINARY_TIME=$(stat -c "%Y" "$BINARY_PATH" 2>/dev/null)
fi
debug_log "Binary modification time: $BINARY_TIME"
# Find newest Swift file modification time
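NEWEST_SWIFT=0
NEWEST_FILE=""
while IFS= read -r -d '' file; do
if [[ "$OSTYPE" == "darwin"* ]]; then
FILE_TIME=$(stat -f "%m" "$file" 2>/dev/null)
else
FILE_TIME=$(stat -c "%Y" "$file" 2>/dev/null)
fi
# Skip files whose modification time could not be read (e.g. removed mid-scan)
if [ -n "$FILE_TIME" ] && [ "$FILE_TIME" -gt "$NEWEST_SWIFT" ]; then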
NEWEST_SWIFT=$FILE_TIME
NEWEST_FILE="$file"
fi
done < <(find "$PROJECT_ROOT/Core/PeekabooCore/Sources" "$PROJECT_ROOT/Core/AXorcist/Sources" "$PROJECT_ROOT/Apps/CLI/Sources" -name "*.swift" -type f -print0 2>/dev/null)
debug_log "Newest Swift file: $NEWEST_FILE (time: $NEWEST_SWIFT)"
# Binary is fresh if it's newer than all Swift files
if [ "$BINARY_TIME" -ge "$NEWEST_SWIFT" ]; then
debug_log "Binary is fresh"
return 0
else
debug_log "Binary is stale (older than Swift sources)"
return 1
fi
}
# Function to check if a build is running
is_build_running() {
if [ -f "$BUILD_LOCK" ]; then
PID=$(cat "$BUILD_LOCK" 2>/dev/null)
if [ -n "$PID" ] && ps -p "$PID" > /dev/null 2>&1; then
return 0
else
# Stale lock file
debug_log "Removing stale build lock (PID $PID not running)"
rm -f "$BUILD_LOCK"
fi
fi
return 1
}
# Function to check if Poltergeist is active
is_poltergeist_active() {
# Check if there's recent activity in the log (within last 5 seconds)
if [ -f "$POLTERGEIST_LOG" ]; then
if [[ "$OSTYPE" == "darwin"* ]]; then
LOG_TIME=$(stat -f "%m" "$POLTERGEIST_LOG" 2>/dev/null)
else
LOG_TIME=$(stat -c "%Y" "$POLTERGEIST_LOG" 2>/dev/null)
fi
# Treat an unreadable log as "not active" instead of failing the arithmetic below
[ -n "$LOG_TIME" ] || return 1
CURRENT_TIME=$(date +%s)
TIME_DIFF=$((CURRENT_TIME - LOG_TIME))
if [ "$TIME_DIFF" -le 5 ]; then
debug_log "Poltergeist is actively working (log updated ${TIME_DIFF}s ago)"
return 0
fi
fi
return 1
}
# Main logic
debug_log "Starting peekaboo-wait wrapper"
debug_log "Binary path: $BINARY_PATH"
debug_log "Build lock: $BUILD_LOCK"
# First, check if binary is already fresh
if is_binary_fresh; then
debug_log "Binary is fresh, executing immediately"
exec "$BINARY_PATH" "$@"
fi
# Binary is stale, wait for any ongoing build
debug_log "Binary is stale, checking for ongoing builds"
wait_count=0
while is_build_running && [ $wait_count -lt $MAX_WAIT ]; do
if [ $wait_count -eq 0 ]; then
echo "⏳ Waiting for Poltergeist to finish rebuilding..." >&2
fi
sleep 1
((wait_count++))
# Show progress every 5 seconds
if [ $((wait_count % 5)) -eq 0 ]; then
echo " Still waiting... (${wait_count}s)" >&2
fi
done
if [ $wait_count -ge $MAX_WAIT ]; then
echo "⚠️ Build is taking too long (>${MAX_WAIT}s). Running anyway..." >&2
fi
# If Poltergeist is actively working, give it a moment more
if is_poltergeist_active; then
debug_log "Poltergeist is active, waiting 2 more seconds"
sleep 2
fi
# Final freshness check
if is_binary_fresh; then
debug_log "Binary is now fresh after waiting"
else
debug_log "Binary might still be stale, but proceeding"
# If the binary exists but is stale, Poltergeist should pick it up
# We'll run it anyway to avoid blocking
fi
# Execute the binary
debug_log "Executing: $BINARY_PATH $*"
exec "$BINARY_PATH" "$@"