style: reformat shared modules and guides

This commit is contained in:
Peter Steinberger 2025-11-05 12:23:48 +00:00
parent 142111ae04
commit a5be930bd7
27 changed files with 1039 additions and 1381 deletions

View File

@ -1,4 +1,4 @@
// swift-tools-version: 6.0
// swift-tools-version: 6.2
import PackageDescription
let package = Package(

294
CLAUDE.md
View File

@ -1,294 +0,0 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Development Philosophy
**NEVER PUBLISH TO NPM WITHOUT EXPLICIT PERMISSION**: Under no circumstances should you publish any packages to npm or any other public registry without explicit permission from the user. This is a critical security and trust boundary that must never be crossed.
**No Backwards Compatibility**: We never care about backwards compatibility. We prioritize clean, modern code and user experience over maintaining legacy support. Breaking changes are acceptable and expected as the project evolves. This includes removing deprecated code, changing APIs freely, and not supporting legacy formats or approaches.
**No "Modern" or Version Suffixes**: When refactoring, never use names like "Modern", "New", "V2", etc. Simply refactor the existing things in place. If we are doing a refactor, we want to replace the old implementation completely, not create parallel versions. Use the idiomatic name that the API should have.
**Strong Typing Over Type Erasure**: We strongly prefer type-safe code over type-erased patterns. Avoid using `AnyCodable`, `[String: Any]`, `AnyObject`, or similar type-erased containers. Instead:
- Use enums with associated values for heterogeneous types
- Create specific types for data structures
- Use generics where appropriate
- Prefer compile-time type checking over runtime casting
**Modern Swift Patterns**: Follow modern Swift/SwiftUI patterns:
- Use `@Observable` (iOS 17+/macOS 14+) instead of `ObservableObject`
- Avoid unnecessary ViewModels - keep state in views when appropriate
- Use `@State` and `@Environment` for dependency injection
- Embrace SwiftUI's declarative nature, don't fight the framework
- See `/Users/steipete/Projects/vibetunnel/apple/docs/modern-swift.md` for details
**Minimum macOS Version**: This project targets macOS 14.0 (Sonoma) and later. Do not add availability checks for macOS versions below 14.0.
**Direct API Over Subprocess**: Always prefer using PeekabooCore services directly instead of spawning CLI subprocesses. The migration to direct API calls improves performance by ~10x and provides better type safety.
**Ollama Timeout Requirements**: When testing Ollama integration, use longer timeouts (300000ms or 5+ minutes) for Bash tool commands, as Ollama can be slow to load models and process requests, especially on first use.
**Claude Opus 4.1 Availability**: Claude Opus 4.1 (model ID: `claude-opus-4-1-20250805`) is currently available and working. This is not a future model - it exists and functions properly as of August 2025.
**GPT-5 Availability**: GPT-5 (model ID: `gpt-5`) was released on August 7, 2025. It is now the default OpenAI model for Peekaboo agent tasks. The API offers three sizes: `gpt-5` (best for logic and multi-step tasks, 74.9% on SWE-bench), `gpt-5-mini` (cost-optimized), and `gpt-5-nano` (ultra-low latency). All models support 400K total context (272K input + 128K output tokens).
**GPT-5 Preamble Messages**: When instructed, GPT-5 outputs user-visible preamble messages before and between tool calls to update users on progress during longer agentic tasks. This makes complex operations more transparent by showing the AI's plan and progress at each step.
**GPT-5 Responses API**: GPT-5 uses OpenAI's Responses API (`/v1/responses`) which provides persisted reasoning across tool calls, leading to more coherent and efficient outputs. This API supports `reasoning_effort` (minimal/low/medium/high) and `verbosity` (low/medium/high) parameters for fine-tuned control.
**File Headers**: Use minimal file headers without author attribution or creation dates:
- Swift files: `//\n// FileName.swift\n// PeekabooCore\n//` (adapt module name: PeekabooCore, AXorcist, etc.)
- TypeScript files: `//\n// filename.ts\n// Peekaboo\n//`
- Omit "Created by" comments and dates to keep headers clean and focused
To test this project interactive we can use:
`PEEKABOO_AI_PROVIDERS="ollama/llava:latest" npx @modelcontextprotocol/inspector npx -y @steipete/peekaboo-mcp@beta`
## Binary Location and Version Checking
**CRITICAL: Always use `polter peekaboo` to ensure fresh builds!**
1. **Check the build timestamp**: Every Peekaboo execution shows when it was compiled:
```
Peekaboo 3.0.0-beta.1 (main/bdbaf32-dirty, 2025-07-28 17:13:41 +0200)
```
If the timestamp is older than your recent changes, the binary is stale!
2. **Expected binary location**: `/Users/steipete/Projects/Peekaboo/peekaboo` (project root)
- This is where Poltergeist puts the binary
- Always use `polter peekaboo` to run it (ensures fresh builds)
- If you see binaries in other locations, they might be outdated
3. **Verify before testing**:
```bash
# Check version and timestamp
polter peekaboo --version
```
## Quick Reference
```bash
# Core commands
polter peekaboo <command> # Run CLI with automatic rebuild
./scripts/pblog.sh -f # Stream logs
npm run poltergeist:status # Check build status
alias pb='polter peekaboo' # Add to ~/.zshrc for convenience
# Examples
polter peekaboo agent "take screenshot"
polter peekaboo list apps
polter peekaboo see --annotate
# NEVER use:
# ./peekaboo # May run stale binary
# ./scripts/peekaboo-wait.sh # Redundant wrapper, use polter directly
```
## Poltergeist Usage
**polter runs binaries, NOT commands. Poltergeist auto-builds when files change.**
### Commands
```bash
npm run poltergeist:status # Check if running & build status
npm run poltergeist:haunt # Start auto-builder
npm run poltergeist:stop # Stop auto-builder
polter peekaboo <args> # Run CLI (waits for fresh build)
```
### NEVER
- `polter wait` - doesn't exist
- `npm run build:swift` - Poltergeist does this automatically
- `./peekaboo` - use `polter peekaboo` for fresh builds
- `./scripts/peekaboo-wait.sh` - redundant wrapper, use `polter peekaboo` directly
### Workflow
1. Start: `npm run poltergeist:haunt`
2. Edit files → Poltergeist rebuilds automatically
3. Run: `polter peekaboo <command>`
### Build Failures
Exit code 42 = build failed. Fix: `npm run build:swift` once, then continue.
### State
- Location: `/tmp/poltergeist/{project}-{hash}-{target}.state`
- Contains: build status, timestamps, process info
### SPM Issues
Clean caches if corrupted: `rm -rf ~/Library/Developer/Xcode/DerivedData/* ~/Library/Caches/org.swift.swiftpm`
## Common Commands
### Building
#### Building the Mac App
# important-instruction-reminders
Do what has been asked; nothing more, nothing less.
NEVER create files unless they're absolutely necessary for achieving your goal.
ALWAYS prefer editing an existing file to creating a new one.
NEVER proactively create documentation files (*.md) or README files. Only create documentation files if explicitly requested by the User.
NEVER use AnyCodable anywhere in the codebase. We are actively removing all usage of AnyCodable. If you encounter a need for type-erased encoding/decoding, create proper typed structs instead. This is a critical architectural decision - AnyCodable leads to type-unsafe code and we've spent significant effort removing it.
NEVER open Xcode projects or workspaces - the user already has them open. Use polter or xcodebuild to verify builds.
Stay professional in code comments - avoid casual phrases like "FIXED VERSION" or "NEW AND IMPROVED". Keep comments technical and descriptive.
NEVER create duplicate files with suffixes like "Fixed", "Enhanced", "New", etc. Always work on the existing files. If a file needs fixes, fix it in place. Creating duplicates creates confusion and maintenance burden.
## Playground Testing Methodology
When asked to test CLI tools with the Playground app, follow the comprehensive testing methodology documented in `/docs/playground-testing.md`. Key points:
1. **Systematic Testing**: Test EVERY command exhaustively
2. **Documentation First**: Always read `--help` and source code
3. **Log Monitoring**: Check playground logs after each command
4. **Bug Tracking**: Document all issues in `Apps/Playground/PLAYGROUND_TEST.md`
5. **Fix and Verify**: Apply fixes and retest until working
The Playground app is specifically designed for testing Peekaboo's automation capabilities with various UI elements and logging to verify command execution.
## Agent Log Debug Mode
When the user types "agent log debug", analyze Peekaboo CLI logs to identify bugs and improvement opportunities. The goal is to make Peekaboo more agent-friendly.
**What to Look For:**
1. **Common Agent Mistakes**:
- Missing required parameters or incorrect parameter usage
- Misunderstanding of command syntax or options
- Attempting unsupported operations
- Confusion about tool capabilities or limitations
2. **Actual Bugs**:
- Crashes, errors, or unexpected behavior
- Missing functionality that should exist
- Performance issues or timeouts
- Inconsistent behavior across similar commands
3. **UX Improvements**:
- Unclear error messages that could be more helpful
- Missing hints or suggestions when agents make mistakes
- Opportunities to add guardrails or validation
- Places where agents get stuck in loops or retry patterns
4. **Missing Features**:
- Common operations that require multiple steps but could be simplified
- Patterns where agents work around limitations
- Frequently attempted unsupported commands
**How to Analyze:**
1. Read through the entire log systematically
2. Identify patterns of confusion or repeated attempts
3. Note any error messages that could be clearer
4. Look for places where the agent had to guess or try multiple approaches
5. Consider what helpful messages or features would have prevented issues
**Output Format:**
- List specific bugs found with reproduction steps
- Suggest concrete improvements to error messages
- Recommend new features or commands based on agent behavior
- Propose additions to system/tool prompts to guide future agents
- Prioritize fixes by impact on agent experience
## AXorcist Integration
- **Always use AXorcist APIs** rather than raw accessibility APIs
- **We can modify AXorcist** - Enhance the library directly when needed
- **You are encouraged to improve AXorcist** - When you encounter missing functionality (like `element.label()` not being available), add it to AXorcist rather than working around it
- **Move generic functionality to AXorcist** - If you have functionality in PeekabooCore that is generic enough to be useful for any accessibility automation, move it to AXorcist
- Use `Element` wrapper, typed attributes, and enum-based actions
- All Element methods are `@MainActor`
## Swift Testing Framework
**IMPORTANT**: Use Swift Testing (Xcode 16+), NOT XCTest:
- Import `Testing` not `XCTest`
- Use `@Test` attribute and `#expect()` macros
- See `/docs/swift-testing-playbook.md` for migration guide
## Debugging with pblog
pblog monitors logs from ALL Peekaboo apps and services:
```bash
# Show recent logs (default: last 50 lines from past 5 minutes)
./scripts/pblog.sh
# Stream logs continuously
./scripts/pblog.sh -f
# Show only errors
./scripts/pblog.sh -e
# Debug element detection issues
./scripts/pblog.sh -c ElementDetectionService -d
# Monitor specific subsystem
./scripts/pblog.sh --subsystem boo.peekaboo.core
# Search for specific text
./scripts/pblog.sh -s "Dialog" -n 100
```
See `./scripts/README-pblog.md` for full documentation.
Also available: `./scripts/playground-log.sh` for quick Playground-only logs.
## Agent System and Tool Prompts
### System Prompt
The agent system prompt is defined in `/Core/PeekabooCore/Sources/PeekabooCore/Services/Agent/PeekabooAgentService.swift` in the `generateSystemPrompt()` method (around line 875). This prompt contains:
- Communication style requirements
- Task completion guidelines
- Window management strategies
- Dialog interaction patterns
- Error recovery approaches
### Tool Prompts
Individual tool descriptions are defined in the same file (`PeekabooAgentService.swift`) in their respective creation methods:
- `createSeeTool()` - Primary screen capture and UI analysis
- `createShellTool()` - Shell command execution with quote handling examples
- `createMenuClickTool()` - Menu navigation with error guidance
- `createDialogInputTool()` - Dialog interaction with common issues
- `createFocusWindowTool()` - Window focusing with app state detection
- And many more...
When modifying agent behavior, update these prompts to guide the AI's responses and tool usage patterns.
## Troubleshooting
### Permission Errors
- **Screen Recording**: Grant in System Settings → Privacy & Security → Screen Recording
- **Accessibility**: Grant in System Settings → Privacy & Security → Accessibility
### Common Issues
- **Window capture hangs**: Use `PEEKABOO_USE_MODERN_CAPTURE=false`
- **API key issues**: Run `./peekaboo config set-credential OPENAI_API_KEY sk-...`
- **Build fails**: See Swift Package Manager troubleshooting section above
## SwiftUI App Delegate Pattern
**IMPORTANT**: In SwiftUI apps, `NSApp.delegate as? AppDelegate` does NOT work! SwiftUI manages its own internal app delegate, and the `@NSApplicationDelegateAdaptor` property wrapper doesn't make the delegate accessible via `NSApp.delegate`.
**Wrong approach**:
```swift
if let appDelegate = NSApp.delegate as? AppDelegate {
// This will always fail in SwiftUI apps!
}
```
**Correct approaches**:
1. Use notifications to communicate between components
2. Pass the AppDelegate through environment values
3. Use shared singleton patterns for app-wide services
4. Store references in accessible places during initialization
# important-instruction-reminders
Do what has been asked; nothing more, nothing less.
NEVER create files unless they're absolutely necessary for achieving your goal.
ALWAYS prefer editing an existing file to creating a new one.
NEVER proactively create documentation files (*.md) or README files. Only create documentation files if explicitly requested by the User.

1
CLAUDE.md Symbolic link
View File

@ -0,0 +1 @@
AGENTS.md

View File

@ -1,4 +1,4 @@
// swift-tools-version:6.0
// swift-tools-version: 6.2
// The swift-tools-version declares the minimum version of Swift required to build this package.
import PackageDescription

View File

@ -60,7 +60,7 @@ public struct AXValueWrapper: Codable, Sendable, Equatable {
private static func recursivelySanitize(_ item: Any?) -> Any {
return recursivelySanitizeWithDepth(item, depth: 0, visited: Set<ObjectIdentifier>())
}
// Convert sanitized Any value to AttributeValue
private static func convertToAttributeValue(_ value: Any) -> AttributeValue? {
switch value {

View File

@ -1,17 +1,16 @@
// swift-tools-version: 6.0
// swift-tools-version: 6.2
import PackageDescription
let package = Package(
name: "PeekabooExternalDependencies",
platforms: [
.macOS(.v14)
.macOS(.v14),
],
products: [
.library(
name: "PeekabooExternalDependencies",
targets: ["PeekabooExternalDependencies"]
),
targets: ["PeekabooExternalDependencies"]),
],
dependencies: [
// External dependencies centralized here
@ -33,8 +32,6 @@ let package = Package(
],
swiftSettings: [
.enableExperimentalFeature("StrictConcurrency"),
]
),
]),
],
swiftLanguageModes: [.v6]
)
swiftLanguageModes: [.v6])

View File

@ -6,9 +6,9 @@
// Re-export all external dependencies for easy access
// This centralizes version management and provides a single import point
@_exported import AXorcist
@_exported import AsyncAlgorithms
@_exported import ArgumentParser
@_exported import AsyncAlgorithms
@_exported import AXorcist
@_exported import Logging
@_exported import SystemPackage
@ -20,7 +20,7 @@ public enum DependencyInfo {
public static let argumentParserVersion = "1.3.0"
public static let swiftLogVersion = "1.5.3"
public static let swiftSystemVersion = "1.3.0"
public static var allDependencies: [String: String] {
[
"AXorcist": axorcistVersion,
@ -41,4 +41,4 @@ public enum DependencyConfiguration {
// Add any necessary configuration for external dependencies here
// For example, setting up default loggers, configuring HTTP clients, etc.
}
}
}

View File

@ -1,4 +1,4 @@
// swift-tools-version: 6.0
// swift-tools-version: 6.2
import PackageDescription
@ -24,4 +24,4 @@ let package = Package(
name: "PeekabooFoundationTests",
dependencies: ["PeekabooFoundation"]),
],
swiftLanguageModes: [.v6])
swiftLanguageModes: [.v6])

View File

@ -59,7 +59,7 @@ public enum ModifierKey: String, Sendable {
case command = "cmd"
case control = "ctrl"
case option = "alt"
case shift = "shift"
case shift
case function = "fn"
}
@ -127,4 +127,4 @@ extension ScrollDirection: CustomStringConvertible {
case .right: "right"
}
}
}
}

View File

@ -28,7 +28,6 @@ extension Error {
public func asPeekabooError(
context: String) -> PeekabooError
{
// Try to preserve specific PeekabooError types
if let peekabooError = self as? PeekabooError {
return peekabooError

View File

@ -1,17 +1,16 @@
// swift-tools-version: 6.0
// swift-tools-version: 6.2
import PackageDescription
let package = Package(
name: "PeekabooProtocols",
platforms: [
.macOS(.v14)
.macOS(.v14),
],
products: [
.library(
name: "PeekabooProtocols",
targets: ["PeekabooProtocols"]
),
targets: ["PeekabooProtocols"]),
],
dependencies: [
.package(path: "../PeekabooFoundation"),
@ -25,15 +24,12 @@ let package = Package(
swiftSettings: [
.unsafeFlags([
"-Xfrontend", "-warn-long-function-bodies=50",
"-Xfrontend", "-warn-long-expression-type-checking=50"
"-Xfrontend", "-warn-long-expression-type-checking=50",
], .when(configuration: .debug)),
.enableExperimentalFeature("StrictConcurrency"),
]
),
]),
.testTarget(
name: "PeekabooProtocolsTests",
dependencies: ["PeekabooProtocols"]
),
dependencies: ["PeekabooProtocols"]),
],
swiftLanguageModes: [.v6]
)
swiftLanguageModes: [.v6])

View File

@ -14,7 +14,7 @@ public protocol ObservablePermissionsServiceProtocol: AnyObject {
var accessibilityStatus: PermissionState { get }
var appleScriptStatus: PermissionState { get }
var hasAllPermissions: Bool { get }
func checkPermissions()
func requestPermissions() async
}
@ -36,8 +36,8 @@ public protocol ToolFormatterProtocol {
public struct ToolOutput: Sendable {
public let tool: String
public let result: String
public let metadata: [String: String] // Changed from Any to String for Sendable conformance
public let metadata: [String: String] // Changed from Any to String for Sendable conformance
public init(tool: String, result: String, metadata: [String: String] = [:]) {
self.tool = tool
self.result = result
@ -95,7 +95,7 @@ public struct ConversationSession: Sendable {
public let id: String
public let startedAt: Date
public let messages: [ConversationMessage]
public init(id: String, startedAt: Date, messages: [ConversationMessage] = []) {
self.id = id
self.startedAt = startedAt
@ -107,10 +107,10 @@ public struct ConversationMessage: Sendable {
public let role: String
public let content: String
public let timestamp: Date
public init(role: String, content: String, timestamp: Date) {
self.role = role
self.content = content
self.timestamp = timestamp
}
}
}

View File

@ -73,7 +73,8 @@ public protocol MenuServiceProtocol: Sendable {
/// Protocol for process service operations
public protocol ProcessServiceProtocol: Sendable {
func runCommand(_ command: String, arguments: [String], environment: [String: String]?) async throws -> ProcessOutput
func runCommand(_ command: String, arguments: [String], environment: [String: String]?) async throws
-> ProcessOutput
func runShellCommand(_ command: String) async throws -> ProcessOutput
func killProcess(pid: Int32) async throws
func findProcess(name: String) async throws -> Int32?
@ -83,10 +84,10 @@ public struct ProcessOutput: Sendable {
public let stdout: String
public let stderr: String
public let exitCode: Int32
public init(stdout: String, stderr: String, exitCode: Int32) {
self.stdout = stdout
self.stderr = stderr
self.exitCode = exitCode
}
}
}

View File

@ -3,8 +3,8 @@
// PeekabooProtocols
//
import Foundation
import CoreGraphics
import Foundation
import PeekabooFoundation
// MARK: - UI Service Protocols
@ -22,7 +22,7 @@ public struct WindowInfo: Sendable {
public let title: String
public let appName: String
public let bounds: CGRect
public init(id: Int, title: String, appName: String, bounds: CGRect) {
self.id = id
self.title = title
@ -44,7 +44,7 @@ public struct ScreenInfo: Sendable {
public let frame: CGRect
public let visibleFrame: CGRect
public let scaleFactor: CGFloat
public init(id: Int, frame: CGRect, visibleFrame: CGRect, scaleFactor: CGFloat) {
self.id = id
self.frame = frame
@ -68,7 +68,7 @@ public struct SessionData: Sendable {
public let id: String
public let createdAt: Date
public let metadata: [String: String]
public init(id: String, createdAt: Date, metadata: [String: String] = [:]) {
self.id = id
self.createdAt = createdAt
@ -79,7 +79,7 @@ public struct SessionData: Sendable {
public struct DetectionResult: Sendable {
public let elements: ElementCollection
public let timestamp: Date
public init(elements: ElementCollection, timestamp: Date) {
self.elements = elements
self.timestamp = timestamp
@ -88,13 +88,13 @@ public struct DetectionResult: Sendable {
public struct ElementCollection: Sendable {
public let all: [DetectedElement]
public init(all: [DetectedElement]) {
self.all = all
}
public func findById(_ id: String) -> DetectedElement? {
all.first { $0.id == id }
self.all.first { $0.id == id }
}
}
@ -105,8 +105,15 @@ public struct DetectedElement: Sendable {
public let label: String?
public let value: String?
public let isEnabled: Bool
public init(id: String, type: ElementType, bounds: CGRect, label: String? = nil, value: String? = nil, isEnabled: Bool = true) {
public init(
id: String,
type: ElementType,
bounds: CGRect,
label: String? = nil,
value: String? = nil,
isEnabled: Bool = true)
{
self.id = id
self.type = type
self.bounds = bounds
@ -136,4 +143,4 @@ public protocol WindowManagementServiceProtocol: Sendable {
func moveWindow(id: Int, to point: CGPoint) async throws
func resizeWindow(id: Int, to size: CGSize) async throws
func getWindowInfo(id: Int) async throws -> WindowInfo?
}
}

View File

@ -1,4 +1,4 @@
// swift-tools-version: 6.0
// swift-tools-version: 6.2
import PackageDescription
let package = Package(

View File

@ -1,10 +1,10 @@
// swift-tools-version: 6.0
// swift-tools-version: 6.2
import PackageDescription
let package = Package(
name: "TachikomaExamples",
platforms: [
.macOS(.v14)
.macOS(.v14),
],
products: [
// Individual executable examples
@ -13,14 +13,14 @@ let package = Package(
.executable(name: "TachikomaStreaming", targets: ["TachikomaStreaming"]),
.executable(name: "TachikomaAgent", targets: ["TachikomaAgent"]),
.executable(name: "TachikomaMultimodal", targets: ["TachikomaMultimodal"]),
// Shared utilities library
.library(name: "SharedExampleUtils", targets: ["SharedExampleUtils"]),
],
dependencies: [
// Local Tachikoma dependency
.package(path: "../Tachikoma"),
// External dependencies for examples
.package(url: "https://github.com/apple/swift-argument-parser", from: "1.0.0"),
.package(url: "https://github.com/jpsim/Yams.git", from: "5.0.0"),
@ -33,9 +33,8 @@ let package = Package(
.product(name: "Tachikoma", package: "Tachikoma"),
.product(name: "ArgumentParser", package: "swift-argument-parser"),
.product(name: "Yams", package: "Yams"),
]
),
]),
// 1. TachikomaComparison - The killer demo
.executableTarget(
name: "TachikomaComparison",
@ -43,9 +42,8 @@ let package = Package(
"SharedExampleUtils",
.product(name: "Tachikoma", package: "Tachikoma"),
.product(name: "ArgumentParser", package: "swift-argument-parser"),
]
),
]),
// 2. TachikomaBasics - Getting started
.executableTarget(
name: "TachikomaBasics",
@ -53,9 +51,8 @@ let package = Package(
"SharedExampleUtils",
.product(name: "Tachikoma", package: "Tachikoma"),
.product(name: "ArgumentParser", package: "swift-argument-parser"),
]
),
]),
// 3. TachikomaStreaming - Real-time responses
.executableTarget(
name: "TachikomaStreaming",
@ -63,9 +60,8 @@ let package = Package(
"SharedExampleUtils",
.product(name: "Tachikoma", package: "Tachikoma"),
.product(name: "ArgumentParser", package: "swift-argument-parser"),
]
),
]),
// 4. TachikomaAgent - Function calling and AI agents
.executableTarget(
name: "TachikomaAgent",
@ -73,9 +69,8 @@ let package = Package(
"SharedExampleUtils",
.product(name: "Tachikoma", package: "Tachikoma"),
.product(name: "ArgumentParser", package: "swift-argument-parser"),
]
),
]),
// 5. TachikomaMultimodal - Vision + text processing
.executableTarget(
name: "TachikomaMultimodal",
@ -83,7 +78,5 @@ let package = Package(
"SharedExampleUtils",
.product(name: "Tachikoma", package: "Tachikoma"),
.product(name: "ArgumentParser", package: "swift-argument-parser"),
]
),
]
)
]),
])

View File

@ -19,43 +19,43 @@ public enum TerminalOutput {
case bold = "\u{001B}[1m"
case dim = "\u{001B}[2m"
}
/// Print colored text to terminal
public static func print(_ text: String, color: Color = .reset) {
Swift.print("\(color.rawValue)\(text)\(Color.reset.rawValue)")
}
/// Print a separator line
public static func separator(_ char: Character = "", length: Int = 80) {
Swift.print(String(repeating: char, count: length))
}
/// Print a section header
public static func header(_ title: String) {
separator("")
print(" \(title) ", color: .bold)
separator("")
self.separator("")
self.print(" \(title) ", color: .bold)
self.separator("")
}
/// Print provider name with emoji
public static func providerHeader(_ provider: String) {
let emoji = providerEmoji(provider)
print("\(emoji) \(provider)", color: .cyan)
let emoji = self.providerEmoji(provider)
self.print("\(emoji) \(provider)", color: .cyan)
}
/// Get emoji for provider - makes provider identification visual and fun
public static func providerEmoji(_ provider: String) -> String {
switch provider.lowercased() {
case let p where p.contains("openai") || p.contains("gpt"):
return "🤖" // OpenAI - robot/AI theme
"🤖" // OpenAI - robot/AI theme
case let p where p.contains("anthropic") || p.contains("claude"):
return "🧠" // Anthropic - brain/thinking theme
"🧠" // Anthropic - brain/thinking theme
case let p where p.contains("ollama") || p.contains("llama"):
return "🦙" // Ollama - llama theme
"🦙" // Ollama - llama theme
case let p where p.contains("grok"):
return "🚀" // Grok - rocket/fast theme
"🚀" // Grok - rocket/fast theme
default:
return "🤖"
"🤖"
}
}
}
@ -70,8 +70,15 @@ public struct ResponseComparison: Sendable {
public let tokenCount: Int
public let estimatedCost: Double?
public let error: String?
public init(provider: String, response: String, duration: TimeInterval, tokenCount: Int, estimatedCost: Double? = nil, error: String? = nil) {
public init(
provider: String,
response: String,
duration: TimeInterval,
tokenCount: Int,
estimatedCost: Double? = nil,
error: String? = nil)
{
self.provider = provider
self.response = response
self.duration = duration
@ -82,36 +89,36 @@ public struct ResponseComparison: Sendable {
}
/// Format response comparison in a nice table
public struct ResponseFormatter {
public enum ResponseFormatter {
/// Format responses side by side
public static func formatSideBySide(_ comparisons: [ResponseComparison], maxWidth: Int = 60) -> String {
var output = ""
// Create header
let headers = comparisons.map { comparison in
let emoji = TerminalOutput.providerEmoji(comparison.provider)
return "\(emoji) \(comparison.provider)"
}
// Print headers with boxes
let headerLine = headers.map { _ in
"\(String(repeating: "", count: maxWidth))"
}.joined(separator: " ")
output += headerLine + "\n"
let headerContentLine = headers.map { header in
let padding = max(0, maxWidth - header.count)
let leftPad = padding / 2
let rightPad = padding - leftPad
return "\(String(repeating: " ", count: leftPad))\(header)\(String(repeating: " ", count: rightPad))"
}.joined(separator: " ")
output += headerContentLine + "\n"
// Content area
let maxLines = comparisons.map { $0.response.split(separator: "\n").count }.max() ?? 0
for lineIndex in 0..<maxLines {
let contentLine = comparisons.map { comparison in
let lines = comparison.response.split(separator: "\n")
@ -120,42 +127,42 @@ public struct ResponseFormatter {
let padding = maxWidth - truncated.count
return "\(truncated)\(String(repeating: " ", count: padding - 1))"
}.joined(separator: " ")
output += contentLine + "\n"
}
// Footer with stats
let footerLine = comparisons.map { comparison in
let stats = formatStats(comparison)
let stats = self.formatStats(comparison)
let padding = max(0, maxWidth - stats.count)
return "\(stats)\(String(repeating: " ", count: padding - 1))"
}.joined(separator: " ")
output += footerLine + "\n"
let bottomLine = comparisons.map { _ in
"\(String(repeating: "", count: maxWidth))"
}.joined(separator: " ")
output += bottomLine + "\n"
return output
}
/// Format statistics line for a comparison
public static func formatStats(_ comparison: ResponseComparison) -> String {
let timeStr = String(format: "⏱️ %.1fs", comparison.duration)
let tokenStr = "🔤 \(comparison.tokenCount) tokens"
var stats = "\(timeStr) | \(tokenStr)"
if let cost = comparison.estimatedCost {
let costStr = String(format: "💰 $%.4f", cost)
stats += " | \(costStr)"
} else {
stats += " | 💰 Free"
}
return stats
}
}
@ -163,39 +170,40 @@ public struct ResponseFormatter {
// MARK: - Provider Detection and Setup
/// Utility for detecting available providers based on environment variables
public struct ProviderDetector {
public enum ProviderDetector {
/// Detect which providers are available based on environment variables
/// This helps examples gracefully handle missing API keys
public static func detectAvailableProviders() -> [String] {
var providers: [String] = []
// Check for API keys in environment variables
if ProcessInfo.processInfo.environment["OPENAI_API_KEY"] != nil {
providers.append("OpenAI")
}
if ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] != nil {
providers.append("Anthropic")
}
if ProcessInfo.processInfo.environment["X_AI_API_KEY"] != nil ||
ProcessInfo.processInfo.environment["XAI_API_KEY"] != nil {
ProcessInfo.processInfo.environment["XAI_API_KEY"] != nil
{
providers.append("Grok")
}
// Ollama is always available (assuming local installation)
providers.append("Ollama")
return providers
}
/// Get recommended model for each provider - updated with latest models
public static func recommendedModels() -> [String: String] {
return [
"OpenAI": "gpt-4.1", // Latest GPT-4.1
[
"OpenAI": "gpt-4.1", // Latest GPT-4.1
"Anthropic": "claude-opus-4-20250514", // Claude Opus 4 (May 2025)
"Grok": "grok-4", // Latest Grok
"Ollama": "llama3.3" // Best Ollama model for function calling
"Grok": "grok-4", // Latest Grok
"Ollama": "llama3.3", // Best Ollama model for function calling
]
}
}
@ -203,47 +211,47 @@ public struct ProviderDetector {
// MARK: - Configuration Helpers
/// Helper for creating provider configurations
public struct ConfigurationHelper {
public enum ConfigurationHelper {
/// Create AIModelProvider with recommended models for available providers
public static func createProviderWithAvailableModels() throws -> AIModelProvider {
return try AIConfiguration.fromEnvironment()
try AIConfiguration.fromEnvironment()
}
/// Get available model names
public static func getAvailableModelNames() throws -> [String] {
let provider = try createProviderWithAvailableModels()
return provider.availableModels()
}
/// Print setup instructions for missing providers
public static func printSetupInstructions() {
TerminalOutput.header("🚀 Tachikoma Examples Setup")
let available = ProviderDetector.detectAvailableProviders()
TerminalOutput.print("Available providers: \(available.joined(separator: ", "))", color: .green)
if !available.contains("OpenAI") {
TerminalOutput.print("\n💡 To enable OpenAI:", color: .yellow)
TerminalOutput.print(" export OPENAI_API_KEY=sk-your-key-here", color: .dim)
}
if !available.contains("Anthropic") {
TerminalOutput.print("\n💡 To enable Anthropic:", color: .yellow)
TerminalOutput.print(" export ANTHROPIC_API_KEY=sk-ant-your-key-here", color: .dim)
}
if !available.contains("Grok") {
TerminalOutput.print("\n💡 To enable Grok:", color: .yellow)
TerminalOutput.print(" export X_AI_API_KEY=xai-your-key-here", color: .dim)
}
if available.contains("Ollama") {
TerminalOutput.print("\n🦙 For Ollama, ensure these models are installed:", color: .cyan)
TerminalOutput.print(" ollama pull llama3.3", color: .dim)
TerminalOutput.print(" ollama pull llava", color: .dim)
}
TerminalOutput.separator()
}
}
@ -251,38 +259,39 @@ public struct ConfigurationHelper {
// MARK: - Performance Measurement
/// Utility for measuring performance
public struct PerformanceMeasurement {
public enum PerformanceMeasurement {
/// Measure execution time of an async operation
public static func measure<T>(_ operation: () async throws -> T) async rethrows -> (result: T, duration: TimeInterval) {
public static func measure<T>(_ operation: () async throws -> T) async rethrows
-> (result: T, duration: TimeInterval) {
let startTime = Date()
let result = try await operation()
let endTime = Date()
return (result, endTime.timeIntervalSince(startTime))
}
/// Estimate token count (rough approximation)
public static func estimateTokenCount(_ text: String) -> Int {
// Rough approximation: ~4 characters per token
return text.count / 4
text.count / 4
}
/// Estimate cost based on provider and token count
public static func estimateCost(provider: String, inputTokens: Int, outputTokens: Int) -> Double? {
switch provider.lowercased() {
case let p where p.contains("gpt-4.1"):
return Double(inputTokens) * 0.00003 + Double(outputTokens) * 0.00012 // $30/$120 per 1M tokens
Double(inputTokens) * 0.00003 + Double(outputTokens) * 0.00012 // $30/$120 per 1M tokens
case let p where p.contains("gpt-4o"):
return Double(inputTokens) * 0.000005 + Double(outputTokens) * 0.000015 // $5/$15 per 1M tokens
Double(inputTokens) * 0.000005 + Double(outputTokens) * 0.000015 // $5/$15 per 1M tokens
case let p where p.contains("claude-opus-4"):
return Double(inputTokens) * 0.000015 + Double(outputTokens) * 0.000075 // $15/$75 per 1M tokens
Double(inputTokens) * 0.000015 + Double(outputTokens) * 0.000075 // $15/$75 per 1M tokens
case let p where p.contains("claude-sonnet-4"):
return Double(inputTokens) * 0.000003 + Double(outputTokens) * 0.000015 // $3/$15 per 1M tokens
Double(inputTokens) * 0.000003 + Double(outputTokens) * 0.000015 // $3/$15 per 1M tokens
case let p where p.contains("grok"):
return Double(inputTokens) * 0.000005 + Double(outputTokens) * 0.000015 // Estimated pricing
Double(inputTokens) * 0.000005 + Double(outputTokens) * 0.000015 // Estimated pricing
case let p where p.contains("ollama") || p.contains("llama"):
return nil // Free (local)
nil // Free (local)
default:
return nil
nil
}
}
}
@ -290,27 +299,27 @@ public struct PerformanceMeasurement {
// MARK: - Example Content
/// Predefined content for examples
public struct ExampleContent {
public enum ExampleContent {
/// Sample prompts for different use cases
public static let samplePrompts = [
"Explain quantum computing in simple terms",
"Write a Swift function to calculate fibonacci numbers",
"What are the key differences between async/await and callbacks?",
"Describe the architecture of a modern web application",
"How do neural networks learn?"
"How do neural networks learn?",
]
/// Sample images for multimodal examples (base64 encoded)
public static let sampleImages: [String: String] = [
"chart": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNkYPhfDwAChwGA60e6kgAAAABJRU5ErkJggg==",
// Add more sample images as needed
]
/// Sample tools for agent examples
public static let sampleTools = [
"weather": "Get current weather for a location",
"calculator": "Perform mathematical calculations",
"calculator": "Perform mathematical calculations",
"file_reader": "Read contents of text files",
"web_search": "Search the web for information"
"web_search": "Search the web for information",
]
}
}

View File

@ -1,7 +1,7 @@
import Foundation
import ArgumentParser
import Tachikoma
import Foundation
import SharedExampleUtils
import Tachikoma
/// Demonstrate AI agent patterns with function calling using Tachikoma
@main
@ -13,273 +13,286 @@ struct TachikomaAgent: AsyncParsableCommand {
This example showcases Tachikoma's function calling capabilities, demonstrating how to
build AI agents that can use custom tools to accomplish complex tasks. The agent can
call functions for weather, calculations, file operations, and more.
Examples:
tachikoma-agent "What's the weather in Tokyo?"
tachikoma-agent --tools calculator "Calculate 15% tip for $67.50"
tachikoma-agent --tools weather,file_reader "Check weather and save to file"
tachikoma-agent --conversation "Start multi-turn conversation"
"""
)
""")
@Argument(help: "The task for the AI agent to perform")
var task: String?
@Option(name: .shortAndLong, help: "Comma-separated list of tools to enable (weather, calculator, file_reader, web_search, all)")
@Option(
name: .shortAndLong,
help: "Comma-separated list of tools to enable (weather, calculator, file_reader, web_search, all)")
var tools: String?
@Option(name: .shortAndLong, help: "Specific provider to use for the agent")
var provider: String?
@Flag(name: .shortAndLong, help: "Start conversation mode for multi-turn interactions")
var conversation: Bool = false
@Flag(name: .shortAndLong, help: "Show verbose function call details")
var verbose: Bool = false
@Flag(name: .long, help: "List available tools and exit")
var listTools: Bool = false
@Option(help: "Maximum number of function calls per request")
var maxFunctionCalls: Int = 5
func run() async throws {
TerminalOutput.header("🤖 Tachikoma Agent Demo")
if listTools {
listAvailableTools()
if self.listTools {
self.listAvailableTools()
return
}
let modelProvider = try ConfigurationHelper.createProviderWithAvailableModels()
let availableModels = modelProvider.availableModels()
if availableModels.isEmpty {
TerminalOutput.print("❌ No AI providers configured! Please set up API keys.", color: .red)
ConfigurationHelper.printSetupInstructions()
return
}
// Select tools to enable
let enabledTools = selectTools()
if conversation {
try await runConversationMode(modelProvider: modelProvider, availableModels: availableModels, tools: enabledTools)
let enabledTools = self.selectTools()
if self.conversation {
try await self.runConversationMode(
modelProvider: modelProvider,
availableModels: availableModels,
tools: enabledTools)
} else {
guard let task = task else {
guard let task else {
TerminalOutput.print("❌ Please provide a task or use --conversation mode", color: .red)
return
}
try await runSingleTask(task: task, modelProvider: modelProvider, availableModels: availableModels, tools: enabledTools)
try await self.runSingleTask(
task: task,
modelProvider: modelProvider,
availableModels: availableModels,
tools: enabledTools)
}
}
/// List available tools
private func listAvailableTools() {
TerminalOutput.print("🔧 Available Agent Tools:", color: .cyan)
TerminalOutput.separator("")
for (name, description) in ExampleContent.sampleTools {
TerminalOutput.print("\(name): \(description)", color: .white)
}
TerminalOutput.separator("")
TerminalOutput.print("💡 Use --tools weather,calculator to enable specific tools", color: .yellow)
TerminalOutput.print("💡 Use --tools all to enable all available tools", color: .yellow)
}
/// Select which tools to enable for the agent
private func selectTools() -> [ToolDefinition] {
let allTools = createAllTools()
let allTools = self.createAllTools()
if let toolsString = tools {
if toolsString.lowercased() == "all" {
return allTools
}
// Parse comma-separated tool names
let requestedTools = toolsString.split(separator: ",").map { $0.trimmingCharacters(in: .whitespaces).lowercased() }
let requestedTools = toolsString.split(separator: ",")
.map { $0.trimmingCharacters(in: .whitespaces).lowercased() }
return allTools.filter { tool in
requestedTools.contains(tool.function.name.lowercased())
}
}
// Default: enable basic tools for demonstration
return allTools.filter { ["weather", "calculator"].contains($0.function.name) }
}
/// Run a single task
private func runSingleTask(task: String, modelProvider: AIModelProvider, availableModels: [String], tools: [ToolDefinition]) async throws {
private func runSingleTask(
task: String,
modelProvider: AIModelProvider,
availableModels: [String],
tools: [ToolDefinition]) async throws
{
let selectedModel = try selectModel(from: availableModels)
let model = try modelProvider.getModel(selectedModel)
let providerName = getProviderName(from: selectedModel)
let providerName = self.getProviderName(from: selectedModel)
TerminalOutput.print("🎯 Agent Provider: \(providerName)", color: .cyan)
TerminalOutput.print("🔧 Enabled Tools: \(tools.map { $0.function.name }.joined(separator: ", "))", color: .dim)
TerminalOutput.print("🔧 Enabled Tools: \(tools.map(\.function.name).joined(separator: ", "))", color: .dim)
TerminalOutput.print("💭 Task: \(task)", color: .yellow)
TerminalOutput.separator("")
let agent = AgentRunner(model: model, tools: tools, verbose: verbose, maxFunctionCalls: maxFunctionCalls)
try await agent.executeTask(task)
}
/// Run conversation mode
private func runConversationMode(modelProvider: AIModelProvider, availableModels: [String], tools: [ToolDefinition]) async throws {
private func runConversationMode(
modelProvider: AIModelProvider,
availableModels: [String],
tools: [ToolDefinition]) async throws
{
let selectedModel = try selectModel(from: availableModels)
let model = try modelProvider.getModel(selectedModel)
let providerName = getProviderName(from: selectedModel)
let providerName = self.getProviderName(from: selectedModel)
TerminalOutput.print("🎭 Starting conversation with \(providerName) agent", color: .cyan)
TerminalOutput.print("🔧 Available tools: \(tools.map { $0.function.name }.joined(separator: ", "))", color: .dim)
TerminalOutput.print("🔧 Available tools: \(tools.map(\.function.name).joined(separator: ", "))", color: .dim)
TerminalOutput.print("Type 'quit' or 'exit' to end the conversation.", color: .dim)
TerminalOutput.separator("")
let agent = AgentRunner(model: model, tools: tools, verbose: verbose, maxFunctionCalls: maxFunctionCalls)
while true {
TerminalOutput.print("\n🗣️ You: ", color: .magenta)
guard let input = readLine()?.trimmingCharacters(in: .whitespaces) else {
continue
}
if input.lowercased() == "quit" || input.lowercased() == "exit" {
TerminalOutput.print("👋 Goodbye!", color: .green)
break
}
if input.isEmpty {
continue
}
try await agent.continueConversation(input)
}
}
/// Select a model that supports function calling
private func selectModel(from availableModels: [String]) throws -> String {
if let requestedProvider = provider {
let recommended = ProviderDetector.recommendedModels()
if let recommendedModel = recommended[requestedProvider.capitalized],
availableModels.contains(recommendedModel) {
availableModels.contains(recommendedModel)
{
return recommendedModel
}
}
// Prefer models with good function calling support
let functionCallingPreferred = ["gpt-4.1", "claude-opus-4-20250514", "grok-4", "llama3.3"]
for preferred in functionCallingPreferred {
if availableModels.contains(preferred) {
return preferred
}
}
return availableModels.first!
}
/// Extract provider name from model name
private func getProviderName(from modelName: String) -> String {
switch modelName.lowercased() {
case let m where m.contains("gpt") || m.contains("o3") || m.contains("o4"):
return "OpenAI"
"OpenAI"
case let m where m.contains("claude"):
return "Anthropic"
"Anthropic"
case let m where m.contains("llama") || m.contains("llava"):
return "Ollama"
"Ollama"
case let m where m.contains("grok"):
return "Grok"
"Grok"
default:
return "Unknown"
"Unknown"
}
}
/// Create all available tools
private func createAllTools() -> [ToolDefinition] {
return [
createWeatherTool(),
createCalculatorTool(),
createFileReaderTool(),
createWebSearchTool(),
createTimeTool(),
createRandomTool()
[
self.createWeatherTool(),
self.createCalculatorTool(),
self.createFileReaderTool(),
self.createWebSearchTool(),
self.createTimeTool(),
self.createRandomTool(),
]
}
/// Weather lookup tool
private func createWeatherTool() -> ToolDefinition {
return ToolDefinition(
ToolDefinition(
function: FunctionDefinition(
name: "weather",
description: "Get current weather information for a specific location",
parameters: ToolParameters.object(properties: [
"location": .string(description: "The city and country/state, e.g. 'Tokyo, Japan' or 'San Francisco, CA'"),
"units": .string(description: "Temperature units: 'celsius' or 'fahrenheit'")
], required: ["location"])
)
)
"location": .string(
description: "The city and country/state, e.g. 'Tokyo, Japan' or 'San Francisco, CA'"),
"units": .string(description: "Temperature units: 'celsius' or 'fahrenheit'"),
], required: ["location"])))
}
/// Calculator tool
private func createCalculatorTool() -> ToolDefinition {
return ToolDefinition(
ToolDefinition(
function: FunctionDefinition(
name: "calculator",
description: "Perform mathematical calculations including basic math, percentages, and conversions",
parameters: ToolParameters.object(properties: [
"expression": .string(description: "Mathematical expression to evaluate, e.g. '15 * 0.15' or '67.50 * 1.15'"),
"operation": .enumeration(["basic", "percentage", "tip", "conversion"], description: "Type of calculation")
], required: ["expression"])
)
)
"expression": .string(
description: "Mathematical expression to evaluate, e.g. '15 * 0.15' or '67.50 * 1.15'"),
"operation": .enumeration(
["basic", "percentage", "tip", "conversion"],
description: "Type of calculation"),
], required: ["expression"])))
}
/// File reader tool
private func createFileReaderTool() -> ToolDefinition {
return ToolDefinition(
ToolDefinition(
function: FunctionDefinition(
name: "file_reader",
description: "Read contents of text files from the local filesystem",
parameters: ToolParameters.object(properties: [
"file_path": .string(description: "Path to the file to read"),
"encoding": .enumeration(["utf8", "ascii"], description: "Text encoding")
], required: ["file_path"])
)
)
"encoding": .enumeration(["utf8", "ascii"], description: "Text encoding"),
], required: ["file_path"])))
}
/// Web search tool
private func createWebSearchTool() -> ToolDefinition {
return ToolDefinition(
ToolDefinition(
function: FunctionDefinition(
name: "web_search",
description: "Search the web for current information (simulated for demo)",
parameters: ToolParameters.object(properties: [
"query": .string(description: "Search query"),
"num_results": .integer(description: "Number of results to return (1-10)")
], required: ["query"])
)
)
"num_results": .integer(description: "Number of results to return (1-10)"),
], required: ["query"])))
}
/// Time/date tool
private func createTimeTool() -> ToolDefinition {
return ToolDefinition(
ToolDefinition(
function: FunctionDefinition(
name: "time",
description: "Get current time, date, or timezone information",
parameters: ToolParameters.object(properties: [
"timezone": .string(description: "Timezone identifier, e.g. 'America/New_York' or 'UTC'"),
"format": .enumeration(["iso8601", "human", "timestamp"], description: "Output format")
])
)
)
"format": .enumeration(["iso8601", "human", "timestamp"], description: "Output format"),
])))
}
/// Random number/choice tool
private func createRandomTool() -> ToolDefinition {
return ToolDefinition(
ToolDefinition(
function: FunctionDefinition(
name: "random",
description: "Generate random numbers or make random choices",
@ -288,10 +301,8 @@ struct TachikomaAgent: AsyncParsableCommand {
"min": .integer(description: "Minimum value for number generation"),
"max": .integer(description: "Maximum value for number generation"),
"choices": .array(of: .string(), description: "List of choices to pick from"),
"sides": .integer(description: "Number of sides for dice roll")
], required: ["type"])
)
)
"sides": .integer(description: "Number of sides for dice roll"),
], required: ["type"])))
}
}
@ -304,51 +315,52 @@ class AgentRunner {
private let verbose: Bool
private let maxFunctionCalls: Int
private var conversationHistory: [Message] = []
init(model: ModelInterface, tools: [ToolDefinition], verbose: Bool, maxFunctionCalls: Int) {
self.model = model
self.tools = tools
self.verbose = verbose
self.maxFunctionCalls = maxFunctionCalls
}
/// Execute a single task
func executeTask(_ task: String) async throws {
conversationHistory = [
Message.system(content: createSystemPrompt()),
Message.user(content: .text(task))
self.conversationHistory = [
Message.system(content: self.createSystemPrompt()),
Message.user(content: .text(task)),
]
try await processConversation()
try await self.processConversation()
}
/// Continue an ongoing conversation
func continueConversation(_ userInput: String) async throws {
conversationHistory.append(Message.user(content: .text(userInput)))
try await processConversation()
self.conversationHistory.append(Message.user(content: .text(userInput)))
try await self.processConversation()
}
/// Process the conversation with function calling
/// This demonstrates the core agent loop: request -> response -> function calls -> repeat
private func processConversation() async throws {
var functionCallCount = 0
let startTime = Date() // Track total execution time
var totalTokens = 0 // Track total tokens used across all requests
while functionCallCount < maxFunctionCalls {
while functionCallCount < self.maxFunctionCalls {
// Create request with conversation history and available tools
let request = ModelRequest(
messages: conversationHistory,
tools: tools.isEmpty ? nil : tools, // Include tools for function calling
settings: ModelSettings(maxTokens: 1000)
)
if verbose {
TerminalOutput.print("📡 Sending request to model... (Function calls: \(functionCallCount)/\(maxFunctionCalls))", color: .yellow)
tools: tools.isEmpty ? nil : self.tools, // Include tools for function calling
settings: ModelSettings(maxTokens: 1000))
if self.verbose {
TerminalOutput.print(
"📡 Sending request to model... (Function calls: \(functionCallCount)/\(self.maxFunctionCalls))",
color: .yellow)
}
let response = try await model.getResponse(request: request)
// Extract text content and tool calls from response
// AssistantContent can contain both text and function calls
let textContent = response.content.compactMap { item in
@ -357,122 +369,127 @@ class AgentRunner {
}
return nil
}.joined()
// Track token usage for performance metrics
totalTokens += PerformanceMeasurement.estimateTokenCount(textContent)
let toolCalls = response.content.compactMap { item in
if case let .toolCall(call) = item {
return call
}
return nil
}
// Add assistant message to conversation history
conversationHistory.append(Message.assistant(content: response.content))
self.conversationHistory.append(Message.assistant(content: response.content))
// Check if the model wants to call functions
if !toolCalls.isEmpty {
if verbose {
if self.verbose {
TerminalOutput.print("🔧 Model requesting \(toolCalls.count) function call(s)", color: .cyan)
}
var functionResults: [Message] = []
// Execute each function call the model requested
for toolCall in toolCalls {
if verbose {
if self.verbose {
TerminalOutput.print(" 📞 Calling function: \(toolCall.function.name)", color: .yellow)
TerminalOutput.print(" Arguments: \(toolCall.function.arguments)", color: .dim)
}
// Execute the function and get the result
let result = try await executeFunction(toolCall.function.name, arguments: toolCall.function.arguments)
let result = try await executeFunction(
toolCall.function.name,
arguments: toolCall.function.arguments)
// Create a tool result message to send back to the model
let resultMessage = Message.tool(toolCallId: toolCall.id, content: result)
functionResults.append(resultMessage)
if verbose {
if self.verbose {
TerminalOutput.print(" Result: \(result)", color: .green)
}
}
// Add all function results to conversation history
conversationHistory.append(contentsOf: functionResults)
self.conversationHistory.append(contentsOf: functionResults)
functionCallCount += toolCalls.count
} else {
// No function calls, display the response and exit
let emoji = getProviderEmoji()
let emoji = self.getProviderEmoji()
TerminalOutput.print("\n\(emoji) Agent: ", color: .bold)
if !textContent.isEmpty {
TerminalOutput.print(textContent, color: .white)
} else {
TerminalOutput.print("(No response content)", color: .dim)
}
break
}
}
if functionCallCount >= maxFunctionCalls {
TerminalOutput.print("\n⚠️ Reached maximum function call limit (\(maxFunctionCalls))", color: .yellow)
if functionCallCount >= self.maxFunctionCalls {
TerminalOutput.print("\n⚠️ Reached maximum function call limit (\(self.maxFunctionCalls))", color: .yellow)
}
// Display performance metrics after agent task completion
let endTime = Date()
let totalDuration = endTime.timeIntervalSince(startTime)
displayAgentPerformance(duration: totalDuration, totalTokens: totalTokens, functionCalls: functionCallCount)
self.displayAgentPerformance(
duration: totalDuration,
totalTokens: totalTokens,
functionCalls: functionCallCount)
}
/// Execute a function call and return the result
private func executeFunction(_ functionName: String, arguments: String) async throws -> String {
switch functionName {
case "weather":
return try executeWeatherFunction(arguments)
try self.executeWeatherFunction(arguments)
case "calculator":
return try executeCalculatorFunction(arguments)
try self.executeCalculatorFunction(arguments)
case "file_reader":
return try executeFileReaderFunction(arguments)
try self.executeFileReaderFunction(arguments)
case "web_search":
return try executeWebSearchFunction(arguments)
try self.executeWebSearchFunction(arguments)
case "time":
return try executeTimeFunction(arguments)
try self.executeTimeFunction(arguments)
case "random":
return try executeRandomFunction(arguments)
try self.executeRandomFunction(arguments)
default:
return "Error: Unknown function '\(functionName)'"
"Error: Unknown function '\(functionName)'"
}
}
/// Execute weather function (simulated)
private func executeWeatherFunction(_ arguments: String) throws -> String {
// Parse JSON arguments
let data = arguments.data(using: .utf8) ?? Data()
let parsed = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
let args = parsed ?? [:]
guard let location = args["location"] as? String else {
return "Error: Missing location parameter"
}
let units = args["units"] as? String ?? "celsius"
// Simulate weather data
let weatherData = [
"Tokyo, Japan": ("Partly cloudy", 22, 18),
"San Francisco, CA": ("Foggy", 15, 12),
"New York, NY": ("Sunny", 25, 20),
"London, UK": ("Rainy", 12, 8),
"Sydney, Australia": ("Clear", 28, 24)
"Sydney, Australia": ("Clear", 28, 24),
]
if let (condition, highC, lowC) = weatherData[location] {
if units == "fahrenheit" {
let highF = Int(Double(highC) * 9/5 + 32)
let lowF = Int(Double(lowC) * 9/5 + 32)
let highF = Int(Double(highC) * 9 / 5 + 32)
let lowF = Int(Double(lowC) * 9 / 5 + 32)
return "Weather in \(location): \(condition), High: \(highF)°F, Low: \(lowF)°F"
} else {
return "Weather in \(location): \(condition), High: \(highC)°C, Low: \(lowC)°C"
@ -481,27 +498,27 @@ class AgentRunner {
return "Weather information not available for \(location). Try major cities like Tokyo, San Francisco, New York, London, or Sydney."
}
}
/// Execute calculator function
private func executeCalculatorFunction(_ arguments: String) throws -> String {
// Parse JSON arguments
let data = arguments.data(using: .utf8) ?? Data()
let parsed = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
let args = parsed ?? [:]
guard let expression = args["expression"] as? String else {
return "Error: Missing expression parameter"
}
// Simple expression evaluator (in real implementation, use a proper math parser)
let result = try evaluateExpression(expression)
let operation = args["operation"] as? String ?? "basic"
switch operation {
case "tip":
let tipAmount = result
let total = extractNumberFromExpression(expression) + tipAmount
let total = self.extractNumberFromExpression(expression) + tipAmount
return String(format: "Tip: $%.2f, Total: $%.2f", tipAmount, total)
case "percentage":
return String(format: "%.2f%%", result)
@ -509,18 +526,18 @@ class AgentRunner {
return String(format: "%.2f", result)
}
}
/// Execute file reader function
private func executeFileReaderFunction(_ arguments: String) throws -> String {
// Parse JSON arguments
let data = arguments.data(using: .utf8) ?? Data()
let parsed = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
let args = parsed ?? [:]
guard let filePath = args["file_path"] as? String else {
return "Error: Missing file_path parameter"
}
do {
let content = try String(contentsOfFile: filePath, encoding: .utf8)
return "File content (\(content.count) characters):\n\(content)"
@ -528,55 +545,55 @@ class AgentRunner {
return "Error reading file '\(filePath)': \(error.localizedDescription)"
}
}
/// Execute web search function (simulated)
private func executeWebSearchFunction(_ arguments: String) throws -> String {
// Parse JSON arguments
let data = arguments.data(using: .utf8) ?? Data()
let parsed = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
let args = parsed ?? [:]
guard let query = args["query"] as? String else {
return "Error: Missing query parameter"
}
let numResults = args["num_results"] as? Int ?? 3
// Simulate search results
return """
Search results for "\(query)" (\(numResults) results):
1. Example.com - Comprehensive guide to \(query)
Summary: Detailed information about \(query) with practical examples...
2. Reference.org - \(query) documentation
Summary: Official documentation and API reference for \(query)...
3. Tutorial.net - Learn \(query) step by step
Summary: Beginner-friendly tutorial covering the basics of \(query)...
Note: This is a simulated search. In a real implementation, this would use a web search API.
"""
}
/// Execute time function
private func executeTimeFunction(_ arguments: String) throws -> String {
// Parse JSON arguments
let data = arguments.data(using: .utf8) ?? Data()
let parsed = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
let args = parsed ?? [:]
let timezone = args["timezone"] as? String ?? "UTC"
let format = args["format"] as? String ?? "human"
let now = Date()
let formatter = DateFormatter()
// Set timezone
if let tz = TimeZone(identifier: timezone) {
formatter.timeZone = tz
}
switch format {
case "iso8601":
formatter.dateFormat = "yyyy-MM-dd'T'HH:mm:ssZ"
@ -589,95 +606,100 @@ class AgentRunner {
return formatter.string(from: now)
}
}
/// Execute random function
private func executeRandomFunction(_ arguments: String) throws -> String {
// Parse JSON arguments
let data = arguments.data(using: .utf8) ?? Data()
let parsed = try? JSONSerialization.jsonObject(with: data) as? [String: Any]
let args = parsed ?? [:]
guard let type = args["type"] as? String else {
return "Error: Missing type parameter"
}
switch type {
case "number":
let min = args["min"] as? Int ?? 1
let max = args["max"] as? Int ?? 100
let result = Int.random(in: min...max)
return "Random number between \(min) and \(max): \(result)"
case "choice":
guard let choicesArray = args["choices"] as? [String],
!choicesArray.isEmpty else {
!choicesArray.isEmpty
else {
return "Error: No choices provided"
}
let result = choicesArray.randomElement()!
return "Random choice from [\(choicesArray.joined(separator: ", "))]: \(result)"
case "dice":
let sides = args["sides"] as? Int ?? 6
let result = Int.random(in: 1...sides)
return "Rolled \(sides)-sided die: \(result)"
default:
return "Error: Unknown random type '\(type)'"
}
}
/// Simple expression evaluator
private func evaluateExpression(_ expression: String) throws -> Double {
// This is a very basic evaluator - in a real implementation, use NSExpression or a proper parser
let cleanExpression = expression.replacingOccurrences(of: " ", with: "")
// Handle simple operations
if cleanExpression.contains("*") {
let parts = cleanExpression.split(separator: "*")
if parts.count == 2,
let left = Double(parts[0]),
let right = Double(parts[1]) {
let right = Double(parts[1])
{
return left * right
}
}
if cleanExpression.contains("/") {
let parts = cleanExpression.split(separator: "/")
if parts.count == 2,
let left = Double(parts[0]),
let right = Double(parts[1]) {
let right = Double(parts[1])
{
return left / right
}
}
if cleanExpression.contains("+") {
let parts = cleanExpression.split(separator: "+")
if parts.count == 2,
let left = Double(parts[0]),
let right = Double(parts[1]) {
let right = Double(parts[1])
{
return left + right
}
}
if cleanExpression.contains("-") {
let parts = cleanExpression.split(separator: "-")
if parts.count == 2,
let left = Double(parts[0]),
let right = Double(parts[1]) {
let right = Double(parts[1])
{
return left - right
}
}
// Try to parse as a single number
if let number = Double(cleanExpression) {
return number
}
throw NSError(domain: "Calculator", code: 1, userInfo: [
NSLocalizedDescriptionKey: "Unable to evaluate expression: \(expression)"
NSLocalizedDescriptionKey: "Unable to evaluate expression: \(expression)",
])
}
/// Extract base number from expression for tip calculations
private func extractNumberFromExpression(_ expression: String) -> Double {
let components = expression.split(whereSeparator: { "+-*/".contains($0) })
@ -686,46 +708,46 @@ class AgentRunner {
}
return 0
}
/// Create system prompt for the agent
private func createSystemPrompt() -> String {
let toolNames = tools.map { $0.function.name }.joined(separator: ", ")
let toolNames = self.tools.map(\.function.name).joined(separator: ", ")
return """
You are a helpful AI agent with access to the following tools: \(toolNames).
Use these tools to help the user accomplish their tasks. Always:
1. Use appropriate tools when the task requires external information or computation
2. Provide clear, helpful responses
3. Explain what you're doing when calling functions
4. Be conversational and friendly
Available tools:
\(tools.map { "- \($0.function.name): \($0.function.description)" }.joined(separator: "\n"))
\(self.tools.map { "- \($0.function.name): \($0.function.description)" }.joined(separator: "\n"))
When a user asks for something that can be accomplished with your tools, use them!
"""
}
/// Get provider emoji for display
private func getProviderEmoji() -> String {
// This is a simple implementation - in practice, you'd detect from the model
return "🤖"
"🤖"
}
/// Display agent performance metrics after task completion
private func displayAgentPerformance(duration: TimeInterval, totalTokens: Int, functionCalls: Int) {
TerminalOutput.separator("")
TerminalOutput.print("📊 Agent Performance Summary:", color: .bold)
let stats = [
"⏱️ Total time: \(String(format: "%.2fs", duration))",
"🔤 Tokens used: ~\(totalTokens)",
"🔧 Function calls: \(functionCalls)"
"🔧 Function calls: \(functionCalls)",
]
TerminalOutput.print(stats.joined(separator: " | "), color: .dim)
// Performance assessment
if duration < 10 {
TerminalOutput.print("🚀 Performance: Fast", color: .green)
@ -734,7 +756,7 @@ class AgentRunner {
} else {
TerminalOutput.print("🐌 Performance: Slow (complex task or model latency)", color: .yellow)
}
TerminalOutput.separator("")
}
}
}

View File

@ -1,7 +1,7 @@
import Foundation
import ArgumentParser
import Tachikoma
import Foundation
import SharedExampleUtils
import Tachikoma
/// Simple getting started example demonstrating basic Tachikoma usage
@main
@ -15,58 +15,57 @@ struct TachikomaBasics: AsyncParsableCommand {
- Making basic requests
- Handling responses and errors
- Provider-agnostic code patterns
Examples:
tachikoma-basics "Hello, AI!"
tachikoma-basics --provider openai "Write a haiku"
tachikoma-basics --list-providers
"""
)
""")
@Argument(help: "The message to send to the AI")
var message: String?
@Option(name: .shortAndLong, help: "Specific provider to use (openai, anthropic, ollama, grok)")
var provider: String?
@Flag(name: .long, help: "List all available providers and exit")
var listProviders: Bool = false
@Flag(name: .shortAndLong, help: "Show detailed information about the process")
var verbose: Bool = false
func run() async throws {
TerminalOutput.header("🎓 Tachikoma Basics")
if listProviders {
try listAvailableProviders()
if self.listProviders {
try self.listAvailableProviders()
return
}
guard let message = message else {
guard let message else {
TerminalOutput.print("❌ Please provide a message or use --list-providers", color: .red)
return
}
try await demonstrateBasicUsage(message: message)
try await self.demonstrateBasicUsage(message: message)
}
/// List available providers and their status
private func listAvailableProviders() throws {
TerminalOutput.print("🔍 Scanning for available AI providers...\n", color: .cyan)
// Show environment-based detection
let detectedProviders = ProviderDetector.detectAvailableProviders()
TerminalOutput.print("Detected providers: \(detectedProviders.joined(separator: ", "))", color: .green)
// Try to create the model provider
do {
let modelProvider = try AIConfiguration.fromEnvironment()
let availableModels = modelProvider.availableModels()
TerminalOutput.print("\n📋 Available models (\(availableModels.count) total):", color: .bold)
let groupedModels = groupModelsByProvider(availableModels)
let groupedModels = self.groupModelsByProvider(availableModels)
for (provider, models) in groupedModels.sorted(by: { $0.key < $1.key }) {
TerminalOutput.providerHeader(provider)
for model in models.sorted() {
@ -74,28 +73,28 @@ struct TachikomaBasics: AsyncParsableCommand {
}
print("")
}
} catch {
TerminalOutput.print("❌ Failed to initialize providers: \(error)", color: .red)
ConfigurationHelper.printSetupInstructions()
}
}
/// Group models by their provider
private func groupModelsByProvider(_ models: [String]) -> [String: [String]] {
var grouped: [String: [String]] = [:]
for model in models {
let provider = detectProviderFromModel(model)
let provider = self.detectProviderFromModel(model)
if grouped[provider] == nil {
grouped[provider] = []
}
grouped[provider]?.append(model)
}
return grouped
}
/// Detect provider name from model string
private func detectProviderFromModel(_ model: String) -> String {
let lowercased = model.lowercased()
@ -111,19 +110,19 @@ struct TachikomaBasics: AsyncParsableCommand {
return "Unknown"
}
}
/// Demonstrate basic Tachikoma usage patterns
private func demonstrateBasicUsage(message: String) async throws {
if verbose {
if self.verbose {
TerminalOutput.print("🔧 Setting up Tachikoma...", color: .yellow)
}
// Step 1: Create the model provider
// AIConfiguration.fromEnvironment() automatically detects API keys and sets up providers
let modelProvider: AIModelProvider
do {
modelProvider = try AIConfiguration.fromEnvironment()
if verbose {
if self.verbose {
TerminalOutput.print("✅ Successfully initialized AIModelProvider", color: .green)
}
} catch {
@ -132,23 +131,23 @@ struct TachikomaBasics: AsyncParsableCommand {
ConfigurationHelper.printSetupInstructions()
return
}
// Step 2: Select which model to use
// This demonstrates Tachikoma's provider-agnostic approach
let selectedModel = try selectModel(from: modelProvider)
if verbose {
if self.verbose {
TerminalOutput.print("🎯 Selected model: \(selectedModel)", color: .cyan)
}
// Step 3: Get the model instance
// Same interface works for OpenAI, Anthropic, Ollama, or Grok
let model = try modelProvider.getModel(selectedModel)
if verbose {
if self.verbose {
TerminalOutput.print("📡 Creating request...", color: .yellow)
}
// Step 4: Create a request
// ModelRequest provides a unified interface across all providers
let request = ModelRequest(
@ -156,11 +155,13 @@ struct TachikomaBasics: AsyncParsableCommand {
tools: nil, // No function calling for this basic example
settings: ModelSettings(maxTokens: 300) // Limit response length
)
if verbose {
TerminalOutput.print("🚀 Sending request to \(detectProviderFromModel(selectedModel))...", color: .yellow)
if self.verbose {
TerminalOutput.print(
"🚀 Sending request to \(self.detectProviderFromModel(selectedModel))...",
color: .yellow)
}
// Step 5: Send the request and measure performance
let startTime = Date()
do {
@ -168,83 +169,85 @@ struct TachikomaBasics: AsyncParsableCommand {
let response = try await model.getResponse(request: request)
let endTime = Date()
let duration = endTime.timeIntervalSince(startTime)
// Display the results
displayResponse(
self.displayResponse(
message: message,
response: response,
model: selectedModel,
duration: duration
)
duration: duration)
} catch {
TerminalOutput.print("❌ Request failed: \(error)", color: .red)
if verbose {
if self.verbose {
TerminalOutput.print("\n🔍 Debugging information:", color: .yellow)
TerminalOutput.print("Model: \(selectedModel)", color: .dim)
TerminalOutput.print("Error type: \(type(of: error))", color: .dim)
}
}
}
/// Select a model based on user preference or auto-detection
private func selectModel(from modelProvider: AIModelProvider) throws -> String {
let availableModels = modelProvider.availableModels()
if availableModels.isEmpty {
throw NSError(domain: "TachikomaBasics", code: 1, userInfo: [
NSLocalizedDescriptionKey: "No models available. Please configure API keys."
NSLocalizedDescriptionKey: "No models available. Please configure API keys.",
])
}
// If user specified a provider, find the best model for it
if let requestedProvider = provider {
let recommended = ProviderDetector.recommendedModels()
// Try to use the recommended model for this provider
if let recommendedModel = recommended[requestedProvider.capitalized],
availableModels.contains(recommendedModel) {
availableModels.contains(recommendedModel)
{
return recommendedModel
} else {
// Find any model from the requested provider
let providerModels = availableModels.filter { model in
detectProviderFromModel(model).lowercased() == requestedProvider.lowercased()
self.detectProviderFromModel(model).lowercased() == requestedProvider.lowercased()
}
if let firstModel = providerModels.first {
return firstModel
} else {
TerminalOutput.print("⚠️ Provider '\(requestedProvider)' not available. Using default.", color: .yellow)
TerminalOutput.print(
"⚠️ Provider '\(requestedProvider)' not available. Using default.",
color: .yellow)
}
}
}
// Auto-select the best available model
// Prioritized by quality and general capabilities
let preferredOrder = ["claude-opus-4-20250514", "gpt-4.1", "llama3.3", "grok-4"]
for preferred in preferredOrder {
if availableModels.contains(preferred) {
return preferred
}
}
// Fallback to first available
return availableModels.first!
}
/// Display the response in a formatted way
private func displayResponse(message: String, response: ModelResponse, model: String, duration: TimeInterval) {
let provider = detectProviderFromModel(model)
let provider = self.detectProviderFromModel(model)
let emoji = TerminalOutput.providerEmoji(provider)
TerminalOutput.separator("")
TerminalOutput.print("💬 Your message: \(message)", color: .cyan)
TerminalOutput.separator("")
TerminalOutput.print("\(emoji) \(provider) response:", color: .bold)
TerminalOutput.separator("")
// Extract text content from response
// ModelResponse.content is an array of AssistantContent items
let textContent = response.content.compactMap { item in
@ -253,49 +256,51 @@ struct TachikomaBasics: AsyncParsableCommand {
}
return nil
}.joined()
if !textContent.isEmpty {
TerminalOutput.print(textContent, color: .white)
} else {
TerminalOutput.print("(No text content in response)", color: .dim)
}
TerminalOutput.separator("")
// Show statistics
let tokenCount = PerformanceMeasurement.estimateTokenCount(textContent)
let stats = [
"⏱️ Duration: \(String(format: "%.2fs", duration))",
"🔤 Tokens: ~\(tokenCount)",
"🤖 Model: \(model)"
"🤖 Model: \(model)",
]
TerminalOutput.print(stats.joined(separator: " | "), color: .dim)
// Cost estimation if available
if let cost = PerformanceMeasurement.estimateCost(
provider: model,
inputTokens: PerformanceMeasurement.estimateTokenCount(message),
outputTokens: tokenCount
) {
outputTokens: tokenCount)
{
TerminalOutput.print("💰 Estimated cost: $\(String(format: "%.4f", cost))", color: .green)
} else {
TerminalOutput.print("💰 Cost: Free (local model)", color: .green)
}
TerminalOutput.separator("")
if verbose {
if self.verbose {
TerminalOutput.print("\n🎓 Key concepts demonstrated:", color: .yellow)
TerminalOutput.print("1. ✅ Environment-based configuration (AIConfiguration.fromEnvironment())", color: .dim)
TerminalOutput.print(
"1. ✅ Environment-based configuration (AIConfiguration.fromEnvironment())",
color: .dim)
TerminalOutput.print("2. ✅ Provider-agnostic model access (modelProvider.getModel())", color: .dim)
TerminalOutput.print("3. ✅ Unified request/response format across all providers", color: .dim)
TerminalOutput.print("4. ✅ Error handling and graceful degradation", color: .dim)
TerminalOutput.print("\n💡 Next steps:", color: .cyan)
TerminalOutput.print("• Try: tachikoma-comparison \"Your question\" (side-by-side comparison)", color: .dim)
TerminalOutput.print("• Try: tachikoma-streaming \"Tell me a story\" (real-time responses)", color: .dim)
TerminalOutput.print("• Try: tachikoma-agent --help (AI agents with tool calling)", color: .dim)
}
}
}
}

View File

@ -1,7 +1,7 @@
import Foundation
import ArgumentParser
import Tachikoma
import Foundation
import SharedExampleUtils
import Tachikoma
/// The killer demo: Compare AI providers side-by-side using Tachikoma
@main
@ -13,159 +13,163 @@ struct TachikomaComparison: AsyncParsableCommand {
This demo showcases Tachikoma's unique value proposition: seamlessly switching between
AI providers with identical code. Send the same prompt to multiple providers and see
their responses, performance, and costs compared side-by-side.
Examples:
tachikoma-comparison "Explain quantum computing"
tachikoma-comparison --providers openai,anthropic "Write a Swift function"
tachikoma-comparison --interactive
"""
)
""")
@Argument(help: "The prompt to send to all providers")
var prompt: String?
@Option(name: .shortAndLong, help: "Comma-separated list of providers to compare (auto-detects if not specified)")
var providers: String?
@Flag(name: .shortAndLong, help: "Interactive mode - keep prompting for questions")
var interactive: Bool = false
@Flag(name: .shortAndLong, help: "Show verbose output including request/response details")
var verbose: Bool = false
@Option(help: "Maximum width for each provider column")
var columnWidth: Int = 60
@Option(help: "Maximum response length to display (0 = no limit)")
var maxLength: Int = 500
func run() async throws {
// Setup and show available providers
ConfigurationHelper.printSetupInstructions()
// Create the model provider using environment-based configuration
// This automatically detects all available API keys and sets up providers
let modelProvider = try ConfigurationHelper.createProviderWithAvailableModels()
let availableModels = modelProvider.availableModels()
if availableModels.isEmpty {
TerminalOutput.print("❌ No AI providers configured! Please set up API keys.", color: .red)
TerminalOutput.print("See setup instructions above.", color: .yellow)
return
}
// Determine which providers/models to use for comparison
let modelsToCompare = try selectModelsToCompare(availableModels: availableModels)
if modelsToCompare.isEmpty {
TerminalOutput.print("❌ No valid providers selected.", color: .red)
return
}
TerminalOutput.print("\n🎯 Comparing \(modelsToCompare.count) providers: \(modelsToCompare.joined(separator: ", "))", color: .green)
if interactive {
try await runInteractiveMode(modelProvider: modelProvider, models: modelsToCompare)
TerminalOutput.print(
"\n🎯 Comparing \(modelsToCompare.count) providers: \(modelsToCompare.joined(separator: ", "))",
color: .green)
if self.interactive {
try await self.runInteractiveMode(modelProvider: modelProvider, models: modelsToCompare)
} else {
guard let prompt = prompt else {
guard let prompt else {
TerminalOutput.print("❌ Please provide a prompt or use --interactive mode", color: .red)
return
}
try await compareProviders(prompt: prompt, modelProvider: modelProvider, models: modelsToCompare)
try await self.compareProviders(prompt: prompt, modelProvider: modelProvider, models: modelsToCompare)
}
}
/// Select which models to compare based on user preference and availability
private func selectModelsToCompare(availableModels: [String]) throws -> [String] {
if let providersString = providers {
// User specified providers
let requestedProviders = providersString.split(separator: ",").map { $0.trimmingCharacters(in: .whitespaces) }
let requestedProviders = providersString.split(separator: ",")
.map { $0.trimmingCharacters(in: .whitespaces) }
let providerToModel = ProviderDetector.recommendedModels()
var selectedModels: [String] = []
for provider in requestedProviders {
// Make provider matching case-insensitive
let normalizedProvider = provider.lowercased()
let matchingKey = providerToModel.keys.first { key in
key.lowercased() == normalizedProvider
}
if let key = matchingKey, let recommendedModel = providerToModel[key] {
if availableModels.contains(recommendedModel) {
selectedModels.append(recommendedModel)
} else {
TerminalOutput.print("⚠️ Provider \(provider) not available (missing \(recommendedModel))", color: .yellow)
TerminalOutput.print(
"⚠️ Provider \(provider) not available (missing \(recommendedModel))",
color: .yellow)
}
} else {
TerminalOutput.print("⚠️ Unknown provider: \(provider)", color: .yellow)
}
}
return selectedModels
} else {
// Auto-detect available providers, limit to 4 for display
let recommended = ProviderDetector.recommendedModels()
let availableProviders = recommended.values.filter { availableModels.contains($0) }
// Prefer a good mix if we have many available
let preferredOrder = ["gpt-4.1", "claude-opus-4-20250514", "llama3.3", "grok-4"]
var selected: [String] = []
for model in preferredOrder {
if availableProviders.contains(model) && selected.count < 4 {
if availableProviders.contains(model), selected.count < 4 {
selected.append(model)
}
}
// Fill remaining slots
for model in availableProviders {
if !selected.contains(model) && selected.count < 4 {
if !selected.contains(model), selected.count < 4 {
selected.append(model)
}
}
return selected
}
}
/// Run interactive mode where user can keep asking questions
private func runInteractiveMode(modelProvider: AIModelProvider, models: [String]) async throws {
TerminalOutput.header("🎭 Interactive AI Provider Comparison")
TerminalOutput.print("Type your questions and see how different AI providers respond!", color: .cyan)
TerminalOutput.print("Type 'quit' or 'exit' to stop.", color: .dim)
TerminalOutput.separator()
while true {
TerminalOutput.print("\n💭 Your question: ", color: .magenta)
guard let input = readLine()?.trimmingCharacters(in: .whitespaces) else {
continue
}
if input.lowercased() == "quit" || input.lowercased() == "exit" {
TerminalOutput.print("👋 Goodbye!", color: .green)
break
}
if input.isEmpty {
continue
}
try await compareProviders(prompt: input, modelProvider: modelProvider, models: models)
try await self.compareProviders(prompt: input, modelProvider: modelProvider, models: models)
}
}
/// The main comparison logic - this is where Tachikoma really shines!
private func compareProviders(prompt: String, modelProvider: AIModelProvider, models: [String]) async throws {
TerminalOutput.print("\n" + String(repeating: "", count: 100), color: .blue)
TerminalOutput.print("🤔 Prompt: \(prompt)", color: .bold)
TerminalOutput.print(String(repeating: "", count: 100), color: .blue)
// Send requests to all providers concurrently
// This demonstrates Tachikoma's power: same code, multiple providers
var comparisons: [ResponseComparison] = []
try await withThrowingTaskGroup(of: ResponseComparison.self) { group in
// Start all provider requests in parallel
for model in models {
@ -173,38 +177,41 @@ struct TachikomaComparison: AsyncParsableCommand {
try await self.getResponseFromProvider(
prompt: prompt,
modelProvider: modelProvider,
modelName: model
)
modelName: model)
}
}
// Collect results as they complete
for try await comparison in group {
comparisons.append(comparison)
}
}
// Sort by provider name for consistent display
comparisons.sort { $0.provider < $1.provider }
// Display results in requested format
if verbose {
displayVerboseResults(comparisons)
if self.verbose {
self.displayVerboseResults(comparisons)
} else {
displayCompactResults(comparisons)
self.displayCompactResults(comparisons)
}
// Display summary statistics
displaySummaryStats(comparisons)
self.displaySummaryStats(comparisons)
}
/// Get response from a single provider with performance measurement
private func getResponseFromProvider(prompt: String, modelProvider: AIModelProvider, modelName: String) async throws -> ResponseComparison {
private func getResponseFromProvider(
prompt: String,
modelProvider: AIModelProvider,
modelName: String) async throws -> ResponseComparison
{
do {
// Get the model instance - same interface for all providers
let model = try modelProvider.getModel(modelName)
let providerName = getProviderName(from: modelName)
let providerName = self.getProviderName(from: modelName)
// Measure performance while getting the response
let (response, duration) = try await PerformanceMeasurement.measure {
// Create a standard request that works with any provider
@ -213,9 +220,9 @@ struct TachikomaComparison: AsyncParsableCommand {
tools: nil, // No function calling for comparison
settings: ModelSettings(maxTokens: 500) // Limit response length
)
let result = try await model.getResponse(request: request)
// Extract text content from response
// All providers return the same AssistantContent format
let textContent = result.content.compactMap { item in
@ -224,114 +231,121 @@ struct TachikomaComparison: AsyncParsableCommand {
}
return nil
}.joined()
return textContent.isEmpty ? "No response" : textContent
}
let tokenCount = PerformanceMeasurement.estimateTokenCount(response)
let cost = PerformanceMeasurement.estimateCost(
provider: modelName,
inputTokens: PerformanceMeasurement.estimateTokenCount(prompt),
outputTokens: tokenCount
)
outputTokens: tokenCount)
return ResponseComparison(
provider: providerName,
response: response,
duration: duration,
tokenCount: tokenCount,
estimatedCost: cost
)
estimatedCost: cost)
} catch {
let providerName = getProviderName(from: modelName)
let providerName = self.getProviderName(from: modelName)
return ResponseComparison(
provider: providerName,
response: "",
duration: 0,
tokenCount: 0,
estimatedCost: nil,
error: error.localizedDescription
)
error: error.localizedDescription)
}
}
/// Extract provider name from model name
private func getProviderName(from modelName: String) -> String {
switch modelName.lowercased() {
case let m where m.contains("gpt") || m.contains("o3") || m.contains("o4"):
return "OpenAI \(modelName)"
"OpenAI \(modelName)"
case let m where m.contains("claude"):
return "Anthropic \(modelName)"
"Anthropic \(modelName)"
case let m where m.contains("llama") || m.contains("llava"):
return "Ollama \(modelName)"
"Ollama \(modelName)"
case let m where m.contains("grok"):
return "Grok \(modelName)"
"Grok \(modelName)"
default:
return modelName
modelName
}
}
/// Display results in compact side-by-side format
private func displayCompactResults(_ comparisons: [ResponseComparison]) {
let formatted = ResponseFormatter.formatSideBySide(comparisons, maxWidth: columnWidth)
let formatted = ResponseFormatter.formatSideBySide(comparisons, maxWidth: self.columnWidth)
print(formatted)
}
/// Display verbose results with full details
private func displayVerboseResults(_ comparisons: [ResponseComparison]) {
for comparison in comparisons {
TerminalOutput.separator("", length: 100)
TerminalOutput.providerHeader(comparison.provider)
if let error = comparison.error {
TerminalOutput.print("❌ Error: \(error)", color: .red)
} else {
TerminalOutput.print(comparison.response, color: .white)
let stats = ResponseFormatter.formatStats(comparison)
TerminalOutput.print("\n\(stats)", color: .dim)
}
TerminalOutput.separator("", length: 100)
}
}
/// Display summary statistics
private func displaySummaryStats(_ comparisons: [ResponseComparison]) {
let successful = comparisons.filter { $0.error == nil }
if successful.isEmpty {
TerminalOutput.print("\n❌ All providers failed", color: .red)
return
}
TerminalOutput.print("\n📊 Summary Statistics:", color: .bold)
// Speed comparison
let fastest = successful.min(by: { $0.duration < $1.duration })!
let slowest = successful.max(by: { $0.duration < $1.duration })!
TerminalOutput.print("⚡ Fastest: \(fastest.provider) (\(String(format: "%.2fs", fastest.duration)))", color: .green)
TerminalOutput.print("🐌 Slowest: \(slowest.provider) (\(String(format: "%.2fs", slowest.duration)))", color: .yellow)
TerminalOutput.print(
"⚡ Fastest: \(fastest.provider) (\(String(format: "%.2fs", fastest.duration)))",
color: .green)
TerminalOutput.print(
"🐌 Slowest: \(slowest.provider) (\(String(format: "%.2fs", slowest.duration)))",
color: .yellow)
// Cost comparison (if available)
let withCosts = successful.filter { $0.estimatedCost != nil }
if !withCosts.isEmpty {
let cheapest = withCosts.min(by: { $0.estimatedCost! < $1.estimatedCost! })!
let mostExpensive = withCosts.max(by: { $0.estimatedCost! < $1.estimatedCost! })!
TerminalOutput.print("💰 Cheapest: \(cheapest.provider) ($\(String(format: "%.4f", cheapest.estimatedCost!)))", color: .green)
TerminalOutput.print("💸 Most Expensive: \(mostExpensive.provider) ($\(String(format: "%.4f", mostExpensive.estimatedCost!)))", color: .yellow)
TerminalOutput.print(
"💰 Cheapest: \(cheapest.provider) ($\(String(format: "%.4f", cheapest.estimatedCost!)))",
color: .green)
TerminalOutput.print(
"💸 Most Expensive: \(mostExpensive.provider) ($\(String(format: "%.4f", mostExpensive.estimatedCost!)))",
color: .yellow)
}
// Response length comparison
let longest = successful.max(by: { $0.response.count < $1.response.count })!
let shortest = successful.min(by: { $0.response.count < $1.response.count })!
TerminalOutput.print("📏 Longest response: \(longest.provider) (\(longest.response.count) chars)", color: .cyan)
TerminalOutput.print("📏 Shortest response: \(shortest.provider) (\(shortest.response.count) chars)", color: .cyan)
TerminalOutput.print(
"📏 Shortest response: \(shortest.provider) (\(shortest.response.count) chars)",
color: .cyan)
TerminalOutput.separator()
}
}
}

View File

@ -1,7 +1,7 @@
import Foundation
import ArgumentParser
import Tachikoma
import Foundation
import SharedExampleUtils
import Tachikoma
/// Demonstrate multimodal AI capabilities (vision + text) using Tachikoma
@main
@ -13,203 +13,213 @@ struct TachikomaMultimodal: AsyncParsableCommand {
This example showcases Tachikoma's multimodal capabilities, demonstrating how different
AI providers handle image analysis, OCR, visual reasoning, and combining visual and
textual information. Compare vision capabilities across providers.
Examples:
tachikoma-multimodal --image chart.png "Analyze this chart"
tachikoma-multimodal --image photo.jpg --compare-vision "What do you see?"
tachikoma-multimodal --ocr document.png "Extract all text"
tachikoma-multimodal --describe screenshot.png
"""
)
""")
@Option(name: .shortAndLong, help: "Path to image file to analyze")
var image: String?
@Argument(help: "Text prompt about the image")
var prompt: String?
@Option(name: .shortAndLong, help: "Specific provider to use")
var provider: String?
@Flag(name: .long, help: "Compare vision capabilities across multiple providers")
var compareVision: Bool = false
@Flag(name: .long, help: "Perform OCR (Optical Character Recognition) on the image")
var ocr: Bool = false
@Flag(name: .long, help: "Just describe what's in the image without additional prompt")
var describe: Bool = false
@Flag(name: .shortAndLong, help: "Show verbose analysis details")
var verbose: Bool = false
@Flag(name: .long, help: "List vision-capable models and exit")
var listVisionModels: Bool = false
@Option(help: "Max dimension for image processing (default: 1024)")
var maxDimension: Int = 1024
func run() async throws {
TerminalOutput.header("👁️ Tachikoma Multimodal Demo")
let modelProvider = try ConfigurationHelper.createProviderWithAvailableModels()
let availableModels = modelProvider.availableModels()
if availableModels.isEmpty {
TerminalOutput.print("❌ No AI providers configured! Please set up API keys.", color: .red)
ConfigurationHelper.printSetupInstructions()
return
}
if listVisionModels {
listAvailableVisionModels(availableModels)
if self.listVisionModels {
self.listAvailableVisionModels(availableModels)
return
}
guard let imagePath = image else {
TerminalOutput.print("❌ Please provide an image file with --image", color: .red)
return
}
// Load and validate the image file
let imageData = try loadImage(from: imagePath)
// Determine the final prompt based on flags and user input
let finalPrompt = determineFinalPrompt()
if compareVision {
let finalPrompt = self.determineFinalPrompt()
if self.compareVision {
// Compare how different providers analyze the same image
try await compareVisionAcrossProviders(
try await self.compareVisionAcrossProviders(
imageData: imageData,
imagePath: imagePath,
prompt: finalPrompt,
modelProvider: modelProvider,
availableModels: availableModels
)
availableModels: availableModels)
} else {
// Analyze with a single provider
try await analyzeSingleProvider(
try await self.analyzeSingleProvider(
imageData: imageData,
imagePath: imagePath,
prompt: finalPrompt,
modelProvider: modelProvider,
availableModels: availableModels
)
availableModels: availableModels)
}
}
/// List available vision models
private func listAvailableVisionModels(_ availableModels: [String]) {
TerminalOutput.print("👁️ Vision-Capable Models:", color: .cyan)
TerminalOutput.separator("")
let visionModels = getVisionCapableModels(availableModels)
let visionModels = self.getVisionCapableModels(availableModels)
if visionModels.isEmpty {
TerminalOutput.print("❌ No vision-capable models available", color: .red)
TerminalOutput.print("💡 Vision models require: GPT-4V, Claude 3+, or LLaVA", color: .yellow)
return
}
for model in visionModels {
let provider = getProviderName(from: model)
let provider = self.getProviderName(from: model)
let emoji = TerminalOutput.providerEmoji(provider)
let capabilities = getModelCapabilities(model)
let capabilities = self.getModelCapabilities(model)
TerminalOutput.print("\(emoji) \(model)", color: .white)
TerminalOutput.print(" Capabilities: \(capabilities.joined(separator: ", "))", color: .dim)
}
TerminalOutput.separator("")
TerminalOutput.print("💡 Use --compare-vision to test multiple models at once", color: .yellow)
}
/// Load and validate image file
private func loadImage(from path: String) throws -> Data {
let url = URL(fileURLWithPath: path)
guard FileManager.default.fileExists(atPath: path) else {
throw NSError(domain: "TachikomaMultimodal", code: 1, userInfo: [
NSLocalizedDescriptionKey: "Image file not found: \(path)"
NSLocalizedDescriptionKey: "Image file not found: \(path)",
])
}
let imageData = try Data(contentsOf: url)
// Basic validation - check if it looks like an image
let validHeaders = [
[0xFF, 0xD8], // JPEG
[0x89, 0x50, 0x4E, 0x47], // PNG
[0x47, 0x49, 0x46], // GIF
[0x42, 0x4D], // BMP
[0x52, 0x49, 0x46, 0x46] // WebP
[0x52, 0x49, 0x46, 0x46], // WebP
]
let isValidImage = validHeaders.contains { header in
imageData.prefix(header.count).elementsEqual(header.map { UInt8($0) })
}
if !isValidImage {
TerminalOutput.print("⚠️ Warning: File doesn't appear to be a standard image format", color: .yellow)
}
if verbose {
if self.verbose {
TerminalOutput.print("📸 Loaded image: \(imageData.count) bytes", color: .dim)
}
return imageData
}
/// Determine the final prompt to use
private func determineFinalPrompt() -> String {
if ocr {
return "Extract all text from this image. Provide the text content as accurately as possible, maintaining formatting when relevant."
} else if describe {
return "Describe what you see in this image in detail. Include objects, people, text, colors, setting, and any other notable features."
if self.ocr {
"Extract all text from this image. Provide the text content as accurately as possible, maintaining formatting when relevant."
} else if self.describe {
"Describe what you see in this image in detail. Include objects, people, text, colors, setting, and any other notable features."
} else if let userPrompt = prompt {
return userPrompt
userPrompt
} else {
return "Analyze this image and describe what you see."
"Analyze this image and describe what you see."
}
}
/// Analyze with a single provider
private func analyzeSingleProvider(imageData: Data, imagePath: String, prompt: String, modelProvider: AIModelProvider, availableModels: [String]) async throws {
private func analyzeSingleProvider(
imageData: Data,
imagePath: String,
prompt: String,
modelProvider: AIModelProvider,
availableModels: [String]) async throws
{
let selectedModel = try selectVisionModel(from: availableModels)
let model = try modelProvider.getModel(selectedModel)
let providerName = getProviderName(from: selectedModel)
let providerName = self.getProviderName(from: selectedModel)
TerminalOutput.print("🎯 Analyzing with: \(providerName)", color: .cyan)
TerminalOutput.print("📸 Image: \(imagePath)", color: .dim)
TerminalOutput.print("💭 Prompt: \(prompt)", color: .yellow)
TerminalOutput.separator("")
let analysis = try await analyzeImageWithProvider(
imageData: imageData,
prompt: prompt,
model: model,
modelName: selectedModel
)
displaySingleAnalysis(analysis)
modelName: selectedModel)
self.displaySingleAnalysis(analysis)
}
/// Compare vision across multiple providers
private func compareVisionAcrossProviders(imageData: Data, imagePath: String, prompt: String, modelProvider: AIModelProvider, availableModels: [String]) async throws {
let visionModels = getVisionCapableModels(availableModels)
private func compareVisionAcrossProviders(
imageData: Data,
imagePath: String,
prompt: String,
modelProvider: AIModelProvider,
availableModels: [String]) async throws
{
let visionModels = self.getVisionCapableModels(availableModels)
if visionModels.count < 2 {
TerminalOutput.print("❌ Need at least 2 vision models for comparison. Available: \(visionModels.count)", color: .red)
TerminalOutput.print(
"❌ Need at least 2 vision models for comparison. Available: \(visionModels.count)",
color: .red)
return
}
TerminalOutput.print("👁️ Comparing vision across \(visionModels.count) providers", color: .cyan)
TerminalOutput.print("📸 Image: \(imagePath)", color: .dim)
TerminalOutput.print("💭 Prompt: \(prompt)", color: .yellow)
TerminalOutput.separator("")
var analyses: [VisionAnalysis] = []
// Analyze with each provider concurrently
try await withThrowingTaskGroup(of: VisionAnalysis.self) { group in
for model in visionModels.prefix(4) { // Limit to 4 for display
@ -219,48 +229,51 @@ struct TachikomaMultimodal: AsyncParsableCommand {
imageData: imageData,
prompt: prompt,
model: modelInstance,
modelName: model
)
modelName: model)
}
}
for try await analysis in group {
analyses.append(analysis)
}
}
// Sort by provider name for consistent display
analyses.sort { $0.provider < $1.provider }
displayComparisonResults(analyses)
self.displayComparisonResults(analyses)
}
/// Analyze image with a specific provider using multimodal capabilities
private func analyzeImageWithProvider(imageData: Data, prompt: String, model: ModelInterface, modelName: String) async throws -> VisionAnalysis {
let providerName = getProviderName(from: modelName)
private func analyzeImageWithProvider(
imageData: Data,
prompt: String,
model: ModelInterface,
modelName: String) async throws -> VisionAnalysis
{
let providerName = self.getProviderName(from: modelName)
let startTime = Date()
do {
// Prepare the image for multimodal request
let base64Image = imageData.base64EncodedString()
// Create multimodal content combining text prompt and image
// This demonstrates Tachikoma's unified multimodal interface
let multimodalContent = MessageContent.multimodal([
MessageContentPart(type: "text", text: prompt, imageUrl: nil),
MessageContentPart(type: "image_url", text: nil, imageUrl: ImageContent(base64: base64Image))
MessageContentPart(type: "image_url", text: nil, imageUrl: ImageContent(base64: base64Image)),
])
let request = ModelRequest(
messages: [Message.user(content: multimodalContent)],
tools: nil, // No function calling for vision analysis
settings: ModelSettings(maxTokens: 1000)
)
settings: ModelSettings(maxTokens: 1000))
let response = try await model.getResponse(request: request)
let endTime = Date()
let duration = endTime.timeIntervalSince(startTime)
// Extract text content from response
// Vision models return their analysis as text content
let responseText = response.content.compactMap { item in
@ -269,10 +282,10 @@ struct TachikomaMultimodal: AsyncParsableCommand {
}
return nil
}.joined()
let finalResponseText = responseText.isEmpty ? "No response" : responseText
let tokenCount = PerformanceMeasurement.estimateTokenCount(finalResponseText)
return VisionAnalysis(
provider: providerName,
model: modelName,
@ -280,10 +293,9 @@ struct TachikomaMultimodal: AsyncParsableCommand {
duration: duration,
tokenCount: tokenCount,
wordCount: finalResponseText.split(separator: " ").count,
confidenceScore: calculateConfidenceScore(finalResponseText),
capabilities: getModelCapabilities(modelName)
)
confidenceScore: self.calculateConfidenceScore(finalResponseText),
capabilities: self.getModelCapabilities(modelName))
} catch {
return VisionAnalysis(
provider: providerName,
@ -294,109 +306,110 @@ struct TachikomaMultimodal: AsyncParsableCommand {
wordCount: 0,
confidenceScore: 0,
capabilities: [],
error: error.localizedDescription
)
error: error.localizedDescription)
}
}
/// Select a vision-capable model
private func selectVisionModel(from availableModels: [String]) throws -> String {
let visionModels = getVisionCapableModels(availableModels)
let visionModels = self.getVisionCapableModels(availableModels)
if visionModels.isEmpty {
throw NSError(domain: "TachikomaMultimodal", code: 1, userInfo: [
NSLocalizedDescriptionKey: "No vision-capable models available. Need GPT-4V, Claude 3+, or LLaVA."
NSLocalizedDescriptionKey: "No vision-capable models available. Need GPT-4V, Claude 3+, or LLaVA.",
])
}
if let requestedProvider = provider {
for model in visionModels {
if getProviderName(from: model).lowercased().contains(requestedProvider.lowercased()) {
if self.getProviderName(from: model).lowercased().contains(requestedProvider.lowercased()) {
return model
}
}
TerminalOutput.print("⚠️ Requested provider '\(requestedProvider)' not available for vision. Using default.", color: .yellow)
TerminalOutput.print(
"⚠️ Requested provider '\(requestedProvider)' not available for vision. Using default.",
color: .yellow)
}
// Prefer high-quality vision models
let visionPreferred = ["gpt-4o", "claude-opus-4-20250514", "claude-3-5-sonnet", "llava"]
for preferred in visionPreferred {
if visionModels.contains(preferred) {
return preferred
}
}
return visionModels.first!
}
/// Get vision-capable models from available models
private func getVisionCapableModels(_ availableModels: [String]) -> [String] {
return availableModels.filter { model in
availableModels.filter { model in
let lowercased = model.lowercased()
return lowercased.contains("gpt-4o") ||
lowercased.contains("gpt-4-vision") ||
lowercased.contains("claude-3") ||
lowercased.contains("claude-4") ||
lowercased.contains("llava") ||
lowercased.contains("vision")
lowercased.contains("gpt-4-vision") ||
lowercased.contains("claude-3") ||
lowercased.contains("claude-4") ||
lowercased.contains("llava") ||
lowercased.contains("vision")
}
}
/// Get model capabilities
private func getModelCapabilities(_ model: String) -> [String] {
let lowercased = model.lowercased()
var capabilities: [String] = []
// All vision models can do basic analysis
capabilities.append("Vision")
capabilities.append("OCR")
capabilities.append("Description")
// Model-specific capabilities
if lowercased.contains("gpt-4o") {
capabilities.append("Chart Analysis")
capabilities.append("Code Reading")
capabilities.append("Spatial Reasoning")
}
if lowercased.contains("claude") {
capabilities.append("Document Analysis")
capabilities.append("Artistic Analysis")
capabilities.append("Technical Diagrams")
}
if lowercased.contains("llava") {
capabilities.append("General Vision")
capabilities.append("Scene Understanding")
}
return capabilities
}
/// Detect MIME type from image data
private func detectMimeType(from data: Data) -> String {
guard !data.isEmpty else { return "application/octet-stream" }
if data.count >= 2 && data[0] == 0xFF && data[1] == 0xD8 {
if data.count >= 2, data[0] == 0xFF, data[1] == 0xD8 {
return "image/jpeg"
} else if data.count >= 4 && data[0] == 0x89 && data[1] == 0x50 && data[2] == 0x4E && data[3] == 0x47 {
} else if data.count >= 4, data[0] == 0x89, data[1] == 0x50, data[2] == 0x4E, data[3] == 0x47 {
return "image/png"
} else if data.count >= 3 && data[0] == 0x47 && data[1] == 0x49 && data[2] == 0x46 {
} else if data.count >= 3, data[0] == 0x47, data[1] == 0x49, data[2] == 0x46 {
return "image/gif"
} else if data.count >= 2 && data[0] == 0x42 && data[1] == 0x4D {
} else if data.count >= 2, data[0] == 0x42, data[1] == 0x4D {
return "image/bmp"
} else if data.count >= 12 && data[8...11].elementsEqual([0x57, 0x45, 0x42, 0x50]) {
} else if data.count >= 12, data[8...11].elementsEqual([0x57, 0x45, 0x42, 0x50]) {
return "image/webp"
}
return "image/jpeg" // Default fallback
}
/// Calculate confidence score based on response characteristics
private func calculateConfidenceScore(_ response: String) -> Double {
var score = 0.5 // Base score
// Longer responses often indicate more detailed analysis
if response.count > 100 {
score += 0.1
@ -404,131 +417,137 @@ struct TachikomaMultimodal: AsyncParsableCommand {
if response.count > 300 {
score += 0.1
}
// Specific details indicate confidence
let specificWords = ["color", "text", "number", "person", "object", "background", "size", "position"]
let mentionedSpecifics = specificWords.filter { response.lowercased().contains($0) }
score += Double(mentionedSpecifics.count) * 0.05
// Hedging language indicates lower confidence
let hedgeWords = ["might", "possibly", "appears", "seems", "likely", "probably", "unclear"]
let hedgeCount = hedgeWords.filter { response.lowercased().contains($0) }.count
let hedgeCount = hedgeWords.count(where: { response.lowercased().contains($0) })
score -= Double(hedgeCount) * 0.05
return min(max(score, 0.0), 1.0)
}
/// Extract provider name from model name
private func getProviderName(from modelName: String) -> String {
switch modelName.lowercased() {
case let m where m.contains("gpt") || m.contains("o3") || m.contains("o4"):
return "OpenAI"
"OpenAI"
case let m where m.contains("claude"):
return "Anthropic"
"Anthropic"
case let m where m.contains("llama") || m.contains("llava"):
return "Ollama"
"Ollama"
case let m where m.contains("grok"):
return "Grok"
"Grok"
default:
return "Unknown"
"Unknown"
}
}
/// Display single analysis result
private func displaySingleAnalysis(_ analysis: VisionAnalysis) {
let emoji = TerminalOutput.providerEmoji(analysis.provider)
if let error = analysis.error {
TerminalOutput.print("\(emoji) \(analysis.provider) Error: \(error)", color: .red)
return
}
TerminalOutput.print("\(emoji) \(analysis.provider) Analysis:", color: .bold)
TerminalOutput.separator("")
TerminalOutput.print(analysis.response, color: .white)
TerminalOutput.separator("")
displayAnalysisStats(analysis)
if verbose {
self.displayAnalysisStats(analysis)
if self.verbose {
TerminalOutput.print("\n🔍 Model Capabilities:", color: .yellow)
for capability in analysis.capabilities {
TerminalOutput.print("\(capability)", color: .dim)
}
}
}
/// Display comparison results
private func displayComparisonResults(_ analyses: [VisionAnalysis]) {
let successful = analyses.filter { $0.error == nil }
if successful.isEmpty {
TerminalOutput.print("❌ All vision analyses failed", color: .red)
return
}
// Display each analysis
for analysis in analyses {
let emoji = TerminalOutput.providerEmoji(analysis.provider)
TerminalOutput.print("\(emoji) \(analysis.provider):", color: .bold)
if let error = analysis.error {
TerminalOutput.print("❌ Error: \(error)", color: .red)
} else {
// Show truncated response
let preview = analysis.response.count > 200 ?
String(analysis.response.prefix(200)) + "..." :
let preview = analysis.response.count > 200 ?
String(analysis.response.prefix(200)) + "..." :
analysis.response
TerminalOutput.print(preview, color: .white)
let stats = formatCompactStats(analysis)
let stats = self.formatCompactStats(analysis)
TerminalOutput.print(stats, color: .dim)
}
TerminalOutput.separator("", length: 60)
}
// Summary comparison
displayVisionComparisonSummary(successful)
self.displayVisionComparisonSummary(successful)
}
/// Display analysis statistics
private func displayAnalysisStats(_ analysis: VisionAnalysis) {
let stats = [
"⏱️ Duration: \(String(format: "%.2fs", analysis.duration))",
"🔤 Tokens: \(analysis.tokenCount)",
"📝 Words: \(analysis.wordCount)",
"🎯 Confidence: \(String(format: "%.0f%%", analysis.confidenceScore * 100))"
"🎯 Confidence: \(String(format: "%.0f%%", analysis.confidenceScore * 100))",
]
TerminalOutput.print(stats.joined(separator: " | "), color: .dim)
}
/// Format compact statistics for comparison
private func formatCompactStats(_ analysis: VisionAnalysis) -> String {
return "⏱️ \(String(format: "%.1fs", analysis.duration)) | 📝 \(analysis.wordCount) words | 🎯 \(String(format: "%.0f%%", analysis.confidenceScore * 100))"
"⏱️ \(String(format: "%.1fs", analysis.duration)) | 📝 \(analysis.wordCount) words | 🎯 \(String(format: "%.0f%%", analysis.confidenceScore * 100))"
}
/// Display vision comparison summary
private func displayVisionComparisonSummary(_ analyses: [VisionAnalysis]) {
TerminalOutput.separator("")
TerminalOutput.print("🏆 Vision Analysis Summary:", color: .bold)
TerminalOutput.separator("")
// Find best performers
let fastest = analyses.min(by: { $0.duration < $1.duration })!
let mostDetailed = analyses.max(by: { $0.wordCount < $1.wordCount })!
let mostConfident = analyses.max(by: { $0.confidenceScore < $1.confidenceScore })!
TerminalOutput.print("⚡ Fastest: \(fastest.provider) (\(String(format: "%.1fs", fastest.duration)))", color: .green)
TerminalOutput.print("📝 Most Detailed: \(mostDetailed.provider) (\(mostDetailed.wordCount) words)", color: .cyan)
TerminalOutput.print("🎯 Most Confident: \(mostConfident.provider) (\(String(format: "%.0f%%", mostConfident.confidenceScore * 100)))", color: .yellow)
TerminalOutput.print(
"⚡ Fastest: \(fastest.provider) (\(String(format: "%.1fs", fastest.duration)))",
color: .green)
TerminalOutput.print(
"📝 Most Detailed: \(mostDetailed.provider) (\(mostDetailed.wordCount) words)",
color: .cyan)
TerminalOutput.print(
"🎯 Most Confident: \(mostConfident.provider) (\(String(format: "%.0f%%", mostConfident.confidenceScore * 100)))",
color: .yellow)
// Response length comparison
let avgLength = analyses.reduce(0) { $0 + $1.wordCount } / analyses.count
TerminalOutput.print("📊 Average response: \(avgLength) words", color: .dim)
TerminalOutput.separator("")
TerminalOutput.print("💡 Each provider has different strengths for vision tasks", color: .yellow)
}
@ -547,8 +566,18 @@ struct VisionAnalysis {
let confidenceScore: Double
let capabilities: [String]
let error: String?
init(provider: String, model: String, response: String, duration: TimeInterval, tokenCount: Int, wordCount: Int, confidenceScore: Double, capabilities: [String], error: String? = nil) {
init(
provider: String,
model: String,
response: String,
duration: TimeInterval,
tokenCount: Int,
wordCount: Int,
confidenceScore: Double,
capabilities: [String],
error: String? = nil)
{
self.provider = provider
self.model = model
self.response = response
@ -559,4 +588,4 @@ struct VisionAnalysis {
self.capabilities = capabilities
self.error = error
}
}
}

View File

@ -1,7 +1,7 @@
import Foundation
import ArgumentParser
import Tachikoma
import Foundation
import SharedExampleUtils
import Tachikoma
/// Demonstrate real-time streaming responses from AI providers
@main
@ -13,244 +13,255 @@ struct TachikomaStreaming: AsyncParsableCommand {
This example demonstrates Tachikoma's streaming capabilities, showing how different
providers deliver responses in real-time. Perfect for understanding performance
characteristics and user experience differences between providers.
Examples:
tachikoma-streaming "Tell me a story about robots"
tachikoma-streaming --race "Compare streaming speeds across providers"
tachikoma-streaming --provider anthropic "Write a detailed explanation"
"""
)
""")
@Argument(help: "The prompt to stream responses for")
var prompt: String?
@Option(name: .shortAndLong, help: "Specific provider to use for streaming")
var provider: String?
@Flag(name: .long, help: "Race mode - stream from multiple providers simultaneously")
var race: Bool = false
@Flag(name: .shortAndLong, help: "Show verbose streaming information")
var verbose: Bool = false
@Option(help: "Maximum tokens to stream (default: 1000)")
var maxTokens: Int = 1000
@Option(help: "Character delay between streamed words (milliseconds)")
var delayMs: Int = 50
func run() async throws {
TerminalOutput.header("⚡ Tachikoma Streaming Demo")
guard let prompt = prompt else {
guard let prompt else {
TerminalOutput.print("❌ Please provide a prompt to stream", color: .red)
return
}
let modelProvider = try ConfigurationHelper.createProviderWithAvailableModels()
let availableModels = modelProvider.availableModels()
if availableModels.isEmpty {
TerminalOutput.print("❌ No AI providers configured! Please set up API keys.", color: .red)
ConfigurationHelper.printSetupInstructions()
return
}
if race {
try await runRaceMode(prompt: prompt, modelProvider: modelProvider, availableModels: availableModels)
if self.race {
try await self.runRaceMode(prompt: prompt, modelProvider: modelProvider, availableModels: availableModels)
} else {
try await runSingleStream(prompt: prompt, modelProvider: modelProvider, availableModels: availableModels)
try await self.runSingleStream(
prompt: prompt,
modelProvider: modelProvider,
availableModels: availableModels)
}
}
/// Stream from a single provider to demonstrate real-time responses
private func runSingleStream(prompt: String, modelProvider: AIModelProvider, availableModels: [String]) async throws {
private func runSingleStream(
prompt: String,
modelProvider: AIModelProvider,
availableModels: [String]) async throws
{
let selectedModel = try selectModel(from: availableModels)
let model = try modelProvider.getModel(selectedModel)
let providerName = getProviderName(from: selectedModel)
let providerName = self.getProviderName(from: selectedModel)
TerminalOutput.print("🎯 Streaming from: \(providerName)", color: .cyan)
TerminalOutput.print("💭 Prompt: \(prompt)", color: .dim)
TerminalOutput.separator("")
// Track performance metrics
let startTime = Date()
var totalTokens = 0
var firstTokenTime: Date?
// Create the streaming request
let request = ModelRequest(
messages: [Message.user(content: .text(prompt))],
tools: nil, // No function calling for streaming demo
settings: ModelSettings(maxTokens: maxTokens)
)
if verbose {
settings: ModelSettings(maxTokens: self.maxTokens))
if self.verbose {
TerminalOutput.print("📡 Starting stream request...", color: .yellow)
}
do {
let emoji = TerminalOutput.providerEmoji(providerName)
TerminalOutput.print("\(emoji) \(providerName) response:", color: .bold)
TerminalOutput.separator("")
var responseText = ""
// Process the streaming response
// getStreamedResponse() returns an AsyncSequence of StreamEvent
for try await event in try await model.getStreamedResponse(request: request) {
if firstTokenTime == nil {
firstTokenTime = Date()
if verbose {
if self.verbose {
let timeToFirst = Date().timeIntervalSince(startTime)
TerminalOutput.print("\n⚡ First token received in \(String(format: "%.2fs", timeToFirst))", color: .green)
TerminalOutput.print(
"\n⚡ First token received in \(String(format: "%.2fs", timeToFirst))",
color: .green)
}
}
// Handle different types of streaming events
switch event {
case .textDelta(let delta):
case let .textDelta(delta):
// Text content arrives incrementally as the model generates it
let text = delta.delta
responseText += text
print(text, terminator: "") // Print immediately for real-time effect
fflush(stdout)
totalTokens += PerformanceMeasurement.estimateTokenCount(text)
// Optional: Add artificial delay to visualize streaming
if delayMs > 0 {
try await Task.sleep(nanoseconds: UInt64(delayMs * 1_000_000))
if self.delayMs > 0 {
try await Task.sleep(nanoseconds: UInt64(self.delayMs * 1_000_000))
}
case .responseCompleted:
// Stream has finished - break out of the loop
break
case .error(let errorEvent):
case let .error(errorEvent):
// Handle streaming errors
throw NSError(domain: "StreamingError", code: 1, userInfo: [
NSLocalizedDescriptionKey: errorEvent.error.message
NSLocalizedDescriptionKey: errorEvent.error.message,
])
default:
// Handle other event types silently (metadata, etc.)
break
}
}
let endTime = Date()
let totalDuration = endTime.timeIntervalSince(startTime)
let timeToFirst = firstTokenTime?.timeIntervalSince(startTime) ?? 0
print("\n")
TerminalOutput.separator("")
// Display streaming statistics
displayStreamingStats(
self.displayStreamingStats(
provider: providerName,
model: selectedModel,
totalDuration: totalDuration,
timeToFirst: timeToFirst,
totalTokens: totalTokens,
responseLength: responseText.count
)
responseLength: responseText.count)
} catch {
TerminalOutput.print("\n❌ Streaming failed: \(error)", color: .red)
}
}
/// Race mode - stream from multiple providers simultaneously
private func runRaceMode(prompt: String, modelProvider: AIModelProvider, availableModels: [String]) async throws {
let racingModels = selectRacingModels(from: availableModels)
let racingModels = self.selectRacingModels(from: availableModels)
if racingModels.count < 2 {
TerminalOutput.print("❌ Need at least 2 providers for race mode. Configure more API keys.", color: .red)
return
}
TerminalOutput.print("🏁 Racing \(racingModels.count) providers:", color: .cyan)
for model in racingModels {
let provider = getProviderName(from: model)
let provider = self.getProviderName(from: model)
let emoji = TerminalOutput.providerEmoji(provider)
TerminalOutput.print(" \(emoji) \(provider)", color: .white)
}
TerminalOutput.print("\n💭 Prompt: \(prompt)", color: .dim)
TerminalOutput.separator("")
// Create racing lanes
var completionOrder: [String] = []
var raceResults: [RaceResult] = []
try await withThrowingTaskGroup(of: RaceResult.self) { group in
for model in racingModels {
group.addTask {
try await self.runRacingStream(
prompt: prompt,
modelProvider: modelProvider,
modelName: model
)
modelName: model)
}
}
var position = 1
for try await result in group {
result.finishPosition = position
raceResults.append(result)
completionOrder.append(result.provider)
position += 1
let emoji = TerminalOutput.providerEmoji(result.provider)
TerminalOutput.print("🏁 #\(result.finishPosition): \(emoji) \(result.provider) finished! (\(String(format: "%.2fs", result.totalDuration)))", color: result.finishPosition == 1 ? .green : .yellow)
TerminalOutput.print(
"🏁 #\(result.finishPosition): \(emoji) \(result.provider) finished! (\(String(format: "%.2fs", result.totalDuration)))",
color: result.finishPosition == 1 ? .green : .yellow)
}
}
// Display race results
displayRaceResults(raceResults.sorted { $0.finishPosition < $1.finishPosition })
self.displayRaceResults(raceResults.sorted { $0.finishPosition < $1.finishPosition })
}
/// Run a single racing stream
private func runRacingStream(prompt: String, modelProvider: AIModelProvider, modelName: String) async throws -> RaceResult {
private func runRacingStream(
prompt: String,
modelProvider: AIModelProvider,
modelName: String) async throws -> RaceResult
{
let model = try modelProvider.getModel(modelName)
let providerName = getProviderName(from: modelName)
let providerName = self.getProviderName(from: modelName)
let startTime = Date()
var firstTokenTime: Date?
var totalTokens = 0
var responseText = ""
let request = ModelRequest(
messages: [Message.user(content: .text(prompt))],
tools: nil,
settings: ModelSettings(maxTokens: maxTokens / 2) // Shorter for racing
settings: ModelSettings(maxTokens: self.maxTokens / 2) // Shorter for racing
)
do {
for try await event in try await model.getStreamedResponse(request: request) {
if firstTokenTime == nil {
firstTokenTime = Date()
}
// Handle different event types
switch event {
case .textDelta(let delta):
case let .textDelta(delta):
let text = delta.delta
responseText += text
totalTokens += PerformanceMeasurement.estimateTokenCount(text)
case .responseCompleted:
break
case .error(let errorEvent):
case let .error(errorEvent):
throw NSError(domain: "StreamingError", code: 1, userInfo: [
NSLocalizedDescriptionKey: errorEvent.error.message
NSLocalizedDescriptionKey: errorEvent.error.message,
])
default:
// Handle other event types silently
break
}
}
let endTime = Date()
let totalDuration = endTime.timeIntervalSince(startTime)
let timeToFirst = firstTokenTime?.timeIntervalSince(startTime) ?? 0
return RaceResult(
provider: providerName,
model: modelName,
@ -259,9 +270,9 @@ struct TachikomaStreaming: AsyncParsableCommand {
totalTokens: totalTokens,
responseLength: responseText.count,
responsePreview: String(responseText.prefix(100)),
finishPosition: 0 // Will be set later
finishPosition: 0 // Will be set later
)
} catch {
// Return error result
return RaceResult(
@ -273,176 +284,191 @@ struct TachikomaStreaming: AsyncParsableCommand {
responseLength: 0,
responsePreview: "Error: \(error.localizedDescription)",
finishPosition: 999,
error: error.localizedDescription
)
error: error.localizedDescription)
}
}
/// Select a single model based on user preference
private func selectModel(from availableModels: [String]) throws -> String {
if let requestedProvider = provider {
let recommended = ProviderDetector.recommendedModels()
if let recommendedModel = recommended[requestedProvider.capitalized],
availableModels.contains(recommendedModel) {
availableModels.contains(recommendedModel)
{
return recommendedModel
} else {
// Find any model from the requested provider
let providerModels = availableModels.filter { model in
getProviderName(from: model).lowercased().contains(requestedProvider.lowercased())
self.getProviderName(from: model).lowercased().contains(requestedProvider.lowercased())
}
if let firstModel = providerModels.first {
return firstModel
} else {
throw NSError(domain: "TachikomaStreaming", code: 1, userInfo: [
NSLocalizedDescriptionKey: "Provider '\(requestedProvider)' not available"
NSLocalizedDescriptionKey: "Provider '\(requestedProvider)' not available",
])
}
}
}
// Auto-select best available for streaming
let streamingPreferred = ["claude-opus-4-20250514", "gpt-4.1", "llama3.3", "grok-4"]
for preferred in streamingPreferred {
if availableModels.contains(preferred) {
return preferred
}
}
return availableModels.first!
}
/// Select models for racing (up to 4)
private func selectRacingModels(from availableModels: [String]) -> [String] {
let recommended = ProviderDetector.recommendedModels()
let availableProviderModels = recommended.values.filter { availableModels.contains($0) }
// Prefer a good mix for racing
let racingOrder = ["gpt-4.1", "claude-opus-4-20250514", "llama3.3", "grok-4"]
var selected: [String] = []
for model in racingOrder {
if availableProviderModels.contains(model) && selected.count < 4 {
if availableProviderModels.contains(model), selected.count < 4 {
selected.append(model)
}
}
return selected
}
/// Extract provider name from model name
private func getProviderName(from modelName: String) -> String {
switch modelName.lowercased() {
case let m where m.contains("gpt") || m.contains("o3") || m.contains("o4"):
return "OpenAI"
"OpenAI"
case let m where m.contains("claude"):
return "Anthropic"
"Anthropic"
case let m where m.contains("llama") || m.contains("llava"):
return "Ollama"
"Ollama"
case let m where m.contains("grok"):
return "Grok"
"Grok"
default:
return "Unknown"
"Unknown"
}
}
/// Display streaming statistics
private func displayStreamingStats(provider: String, model: String, totalDuration: TimeInterval, timeToFirst: TimeInterval, totalTokens: Int, responseLength: Int) {
private func displayStreamingStats(
provider: String,
model: String,
totalDuration: TimeInterval,
timeToFirst: TimeInterval,
totalTokens: Int,
responseLength: Int)
{
TerminalOutput.print("📊 Streaming Statistics:", color: .bold)
let tokensPerSecond = totalTokens > 0 ? Double(totalTokens) / totalDuration : 0
let charsPerSecond = responseLength > 0 ? Double(responseLength) / totalDuration : 0
let stats = [
"⏱️ Total time: \(String(format: "%.2fs", totalDuration))",
"🚀 Time to first token: \(String(format: "%.2fs", timeToFirst))",
"📊 Streaming rate: \(String(format: "%.1f", tokensPerSecond)) tokens/sec",
"⚡ Character rate: \(String(format: "%.0f", charsPerSecond)) chars/sec",
"🔤 Total tokens: \(totalTokens)",
"📏 Response length: \(responseLength) characters"
"📏 Response length: \(responseLength) characters",
]
for stat in stats {
TerminalOutput.print(" \(stat)", color: .dim)
}
// Performance rating
let rating = getPerformanceRating(timeToFirst: timeToFirst, tokensPerSecond: tokensPerSecond)
let rating = self.getPerformanceRating(timeToFirst: timeToFirst, tokensPerSecond: tokensPerSecond)
TerminalOutput.print("\n🎯 Performance: \(rating)", color: .cyan)
if let cost = PerformanceMeasurement.estimateCost(
provider: model,
inputTokens: PerformanceMeasurement.estimateTokenCount(prompt ?? ""),
outputTokens: totalTokens
) {
outputTokens: totalTokens)
{
TerminalOutput.print("💰 Estimated cost: $\(String(format: "%.4f", cost))", color: .green)
} else {
TerminalOutput.print("💰 Cost: Free (local model)", color: .green)
}
}
/// Display race results
private func displayRaceResults(_ results: [RaceResult]) {
TerminalOutput.separator("")
TerminalOutput.print("🏆 Final Race Results:", color: .bold)
TerminalOutput.separator("")
for result in results {
let emoji = TerminalOutput.providerEmoji(result.provider)
let medal = getMedal(result.finishPosition)
let medal = self.getMedal(result.finishPosition)
if let error = result.error {
TerminalOutput.print("\(medal) \(emoji) \(result.provider): ❌ \(error)", color: .red)
} else {
let tokensPerSecond = result.totalTokens > 0 ? Double(result.totalTokens) / result.totalDuration : 0
TerminalOutput.print("\(medal) \(emoji) \(result.provider):", color: result.finishPosition == 1 ? .green : .white)
TerminalOutput.print(" ⏱️ \(String(format: "%.2fs", result.totalDuration)) | 🚀 \(String(format: "%.2fs", result.timeToFirst)) TTFT | ⚡ \(String(format: "%.1f", tokensPerSecond)) tok/sec", color: .dim)
TerminalOutput.print(
"\(medal) \(emoji) \(result.provider):",
color: result.finishPosition == 1 ? .green : .white)
TerminalOutput.print(
" ⏱️ \(String(format: "%.2fs", result.totalDuration)) | 🚀 \(String(format: "%.2fs", result.timeToFirst)) TTFT | ⚡ \(String(format: "%.1f", tokensPerSecond)) tok/sec",
color: .dim)
if !result.responsePreview.isEmpty {
TerminalOutput.print(" 💬 \"\(result.responsePreview)...\"", color: .dim)
}
}
}
TerminalOutput.separator("")
// Race analysis
let successful = results.filter { $0.error == nil }
if successful.count >= 2 {
let fastest = successful.min(by: { $0.totalDuration < $1.totalDuration })!
let slowest = successful.max(by: { $0.totalDuration < $1.totalDuration })!
TerminalOutput.print("🔍 Race Analysis:", color: .yellow)
TerminalOutput.print(" 🥇 Winner: \(fastest.provider) (\(String(format: "%.2fs", fastest.totalDuration)))", color: .green)
TerminalOutput.print(" 🐌 Slowest: \(slowest.provider) (\(String(format: "%.2fs", slowest.totalDuration)))", color: .yellow)
TerminalOutput.print(
" 🥇 Winner: \(fastest.provider) (\(String(format: "%.2fs", fastest.totalDuration)))",
color: .green)
TerminalOutput.print(
" 🐌 Slowest: \(slowest.provider) (\(String(format: "%.2fs", slowest.totalDuration)))",
color: .yellow)
let speedDifference = slowest.totalDuration - fastest.totalDuration
let percentFaster = (speedDifference / slowest.totalDuration) * 100
TerminalOutput.print(" ⚡ Speed advantage: \(String(format: "%.1f%%", percentFaster)) faster", color: .cyan)
}
}
/// Get performance rating based on metrics
private func getPerformanceRating(timeToFirst: TimeInterval, tokensPerSecond: Double) -> String {
if timeToFirst < 1.0 && tokensPerSecond > 50 {
return "🚀 Excellent (Very fast response and streaming)"
} else if timeToFirst < 2.0 && tokensPerSecond > 30 {
return "⚡ Good (Fast streaming)"
} else if timeToFirst < 5.0 && tokensPerSecond > 15 {
return "👍 Fair (Acceptable speed)"
if timeToFirst < 1.0, tokensPerSecond > 50 {
"🚀 Excellent (Very fast response and streaming)"
} else if timeToFirst < 2.0, tokensPerSecond > 30 {
"⚡ Good (Fast streaming)"
} else if timeToFirst < 5.0, tokensPerSecond > 15 {
"👍 Fair (Acceptable speed)"
} else {
return "🐌 Slow (Consider different provider)"
"🐌 Slow (Consider different provider)"
}
}
/// Get medal emoji for race position
private func getMedal(_ position: Int) -> String {
switch position {
case 1: return "🥇"
case 2: return "🥈"
case 3: return "🥉"
default: return "🔸"
case 1: "🥇"
case 2: "🥈"
case 3: "🥉"
default: "🔸"
}
}
}
@ -460,8 +486,18 @@ class RaceResult: @unchecked Sendable {
let responsePreview: String
var finishPosition: Int
let error: String?
init(provider: String, model: String, totalDuration: TimeInterval, timeToFirst: TimeInterval, totalTokens: Int, responseLength: Int, responsePreview: String, finishPosition: Int, error: String? = nil) {
init(
provider: String,
model: String,
totalDuration: TimeInterval,
timeToFirst: TimeInterval,
totalTokens: Int,
responseLength: Int,
responsePreview: String,
finishPosition: Int,
error: String? = nil)
{
self.provider = provider
self.model = model
self.totalDuration = totalDuration
@ -472,4 +508,4 @@ class RaceResult: @unchecked Sendable {
self.finishPosition = finishPosition
self.error = error
}
}
}

View File

@ -38,4 +38,4 @@ do {
print("=== Basic Tachikoma API Test ===")
print("This would test:", testCode)
print("=== Test Completed ===")
print("=== Test Completed ===")

View File

@ -15,7 +15,7 @@ mkdir -p Core/PeekabooModels/Tests/PeekabooModelsTests
```swift
// Core/PeekabooModels/Package.swift
// swift-tools-version: 6.0
// swift-tools-version: 6.2
import PackageDescription
let package = Package(

View File

@ -676,7 +676,7 @@ Guarantee your code is free of data races by enabling the Swift 6 language mode.
A `Package.swift` file that uses `swift-tools-version` of `6.0` will enable the Swift 6 language mode for all targets:
```swift
// swift-tools-version: 6.0
// swift-tools-version: 6.2
let package = Package(
name: "MyPackage",

View File

@ -1,53 +1,29 @@
# Terminal User Interface (TUI) and Progressive Enhancement
# Terminal Output Modes and Progressive Enhancement
Peekaboo's agent command features intelligent terminal detection and progressive enhancement, automatically providing the best possible user experience based on your terminal's capabilities.
Peekaboo's agent command automatically adjusts its output for modern terminals while staying CI-friendly.
> **Note**: The TermKit-based TUI was retired in November 2025. The agent now focuses on enhanced, compact, and minimal text output modes.
## Overview
Instead of manual mode selection, Peekaboo automatically detects your terminal's capabilities and selects the optimal output mode:
Peekaboo automatically detects your terminal's capabilities and selects the optimal output mode:
- **Full TUI** for capable terminals with TermKit interface
- **Enhanced formatting** for color terminals with rich typography
- **Standard output** for basic terminals with colors and icons
- **Compact mode** for standard ANSI terminals
- **Minimal mode** for CI environments and pipes
You can still override the selection with `--quiet`, `--verbose`, `--simple`, or by setting `PEEKABOO_OUTPUT_MODE`.
## Output Modes
### 🎮 TUI Mode (Automatic)
*Enabled for terminals ≥100x20 characters with color support*
Full terminal user interface with:
- **Progress Dashboard**: Real-time task progress, step count, duration, token usage
- **Status Sidebar**: Current tool execution, recent tool history with timing
- **Live Output**: Streaming tool execution results and AI messages
- **Enhanced Visuals**: Split-pane layout, progress bars, structured information
```
┌─── Agent Status ────────────────────────────────────────────────┐
│ Task: Take a screenshot of Safari and save it to desktop │
│ Progress: ████████░░░░ 8/20 steps • 🕒 1m 23s • ⚒ 5 tools │
└─────────────────────────────────────────────────────────────────┘
┌─ Tools & Status ─┐ ┌─ Output ─────────────────────────────────┐
│ 🎯 Current: │ │ 14:32:15 🚀 Task started: Take a scr... │
│ 👁 see screen │ │ 14:32:16 🤖 Using model: Claude Opus 4 │
│ │ │ 14:32:17 👁 see: screen │
│ 📋 Recent Tools: │ │ 14:32:19 ✓ Captured screen (3 buttons) │
│ ✓ see (1.2s) │ │ 14:32:20 🖱 click: element B3 │
│ ✓ click (0.8s) │ │ 14:32:21 ✓ Clicked 'Address Bar' │
│ → type (running) │ │ 14:32:22 ⌨️ type: 'screenshot tutorial' │
└───────────────────┘ └─────────────────────────────────────────┘
```
### ✨ Enhanced Mode (Automatic)
*Enabled for color terminals ≥80 characters wide*
*Enabled for color terminals*
Rich formatting with improved typography:
- Enhanced completion summaries with visual separators
- Better emoji usage (🧠 for thinking, ✅ for completion)
- Improved spacing and visual structure
Provides rich formatting with improved typography:
- Structured completion summaries with visual separators
- Clear emoji usage (🧠 for thinking, ✅ for completion)
- Contextual progress information
Example output:
```
🤖 Peekaboo Agent v3.0.0 using Claude Opus 4 (main/abc123, 2025-01-30)
@ -60,23 +36,21 @@ Example output:
────────────────────────────────────────────────────────────
```
### 🎨 Compact Mode (Legacy)
*Standard color terminals*
### 🎨 Compact Mode (Automatic)
Current implementation with colors and icons:
Colorized output with status indicators for terminals that support ANSI colors:
- Ghost animation during thinking phases
- Colorized tool execution with status indicators
- Standard completion summaries
- Colorized tool execution summary
- Familiar single-column layout
### 📋 Minimal Mode (Automatic)
*CI environments, pipes, and limited terminals*
Plain text, CI-friendly output:
Plain text, automation-friendly output:
- No colors or special characters
- Simple "OK/FAILED" status indicators
- Pipe-safe formatting for automation
- Pipe-safe formatting for logs
Example output:
```
Starting: Take a screenshot of Safari
see screen OK Captured screen (1.2s)
@ -86,8 +60,6 @@ Task completed in 2m 15s with 5 tools
## Terminal Detection
### Capabilities Analysis
Peekaboo performs comprehensive terminal capability detection:
```swift
@ -95,158 +67,18 @@ struct TerminalCapabilities {
let isInteractive: Bool // isatty(STDOUT_FILENO)
let supportsColors: Bool // COLORTERM + TERM patterns
let supportsTrueColor: Bool // 24-bit color detection
let supportsTUI: Bool // Full TUI requirements
let width: Int // Real-time dimensions via ioctl
let width: Int // Real-time dimensions via ioctl
let height: Int
let termType: String? // $TERM environment variable
let isCI: Bool // CI environment detection
let isPiped: Bool // Output redirection detection
let termType: String? // $TERM environment variable
let isCI: Bool // CI environment detection
let isPiped: Bool // Output redirection detection
}
```
### Detection Methods
Key detection techniques:
**Color Support Detection**:
1. `COLORTERM` environment variable (most reliable)
2. `TERM` patterns (`xterm-256color`, `*-color`)
3. Known color-capable terminals
4. Platform-specific defaults (macOS terminals)
- **Color support** via `COLORTERM`, `TERM`, and known terminal lists
- **CI detection** for GitHub Actions, GitLab CI, CircleCI, Jenkins, etc.
- **Terminal size** through `ioctl` with fallbacks to `COLUMNS`/`LINES`
**CI Environment Detection**:
- Checks 20+ CI service environment variables
- GitHub Actions, GitLab CI, Travis, CircleCI, etc.
- Automatically uses minimal mode for automation
**Terminal Dimensions**:
- Real-time size detection via `ioctl(TIOCGWINSZ)`
- Fallback to `$COLUMNS`/`$LINES` environment variables
- Minimum size requirements for TUI mode (100x20)
## Manual Control
### Command Line Flags
Override automatic detection with explicit flags:
```bash
# Force specific output modes
peekaboo agent --force-tui "complex task" # Force TUI even in limited terminals
peekaboo agent --simple "basic task" # Force minimal output
peekaboo agent --no-color "ci task" # Disable colors only
# Standard flags (unchanged)
peekaboo agent --quiet "silent task" # Only final result
peekaboo agent --verbose "debug task" # Full JSON debug info
```
### Environment Variables
Control output mode via environment variables:
```bash
# Explicit mode selection
export PEEKABOO_OUTPUT_MODE=enhanced
export PEEKABOO_OUTPUT_MODE=minimal
# Standard color controls
export NO_COLOR=1 # Disable colors (forces minimal)
export FORCE_COLOR=1 # Force color support
export CLICOLOR_FORCE=1 # Alternative color forcing
```
## Usage Examples
### Automatic Mode Selection
```bash
# Automatically selects best mode for your terminal
peekaboo agent "Take a screenshot and analyze the content"
# In a good terminal (iTerm2, Terminal.app): Uses TUI mode
# In SSH session with colors: Uses enhanced mode
# In CI environment: Uses minimal mode
# When piped: Uses minimal mode
```
### Manual Overrides
```bash
# Force TUI for demonstration
peekaboo agent --force-tui "complex automation workflow"
# Force simple output for scripting
peekaboo agent --simple "automated task" | tee log.txt
# Disable colors for accessibility
NO_COLOR=1 peekaboo agent "task without colors"
```
### Environment-Specific Usage
```bash
# GitHub Actions (automatically minimal)
- run: peekaboo agent "CI automation task"
# Local development (automatically enhanced/TUI)
peekaboo agent "interactive development task"
# Docker container (automatically minimal)
docker run --rm app peekaboo agent "containerized task"
```
## Technical Implementation
### Progressive Enhancement Algorithm
1. **Explicit overrides** - User flags take highest priority
2. **Environment variables** - `NO_COLOR`, `FORCE_COLOR`, `PEEKABOO_OUTPUT_MODE`
3. **Context detection** - CI environments, pipes, non-interactive shells
4. **Capability analysis** - Terminal size, color support, TUI compatibility
5. **Optimal selection** - Best mode for detected capabilities
### Compatibility
**Supported Terminals**:
- **TUI Mode**: iTerm2, Terminal.app, Alacritty, Kitty, WezTerm
- **Enhanced Mode**: Most modern terminals with color support
- **Compact Mode**: Any terminal with ANSI color support
- **Minimal Mode**: Any terminal, text-only environments
**CI/Automation Support**:
- GitHub Actions, GitLab CI, Travis CI, CircleCI
- Jenkins, Azure Pipelines, Buildkite
- Docker containers, SSH sessions
- Shell pipes and redirections
### Debugging
View terminal detection details in verbose mode:
```bash
peekaboo agent --verbose "debug task"
# Shows:
# Terminal: xterm-256color (120x40) - interactive, colors, truecolor, TUI-capable
# Selected mode: TUI (full terminal interface)
```
## Benefits
### For Users
- **Zero configuration** - Optimal experience automatically
- **Universal compatibility** - Works everywhere
- **Enhanced productivity** - Rich visual feedback in capable terminals
- **Accessibility** - Respects color preferences and limitations
### For Automation
- **CI-friendly** - Automatic minimal mode for scripts
- **Pipe-safe** - Clean output for processing
- **Log-friendly** - Plain text for log analysis
- **Scriptable** - Predictable output formats
### For Development
- **Debugging support** - Verbose mode shows detection logic
- **Override options** - Force specific modes for testing
- **Environment awareness** - Adapts to deployment context
- **Future-proof** - Easy to add new modes or detection logic
The progressive enhancement system ensures that Peekaboo provides the best possible user experience across all terminal environments while maintaining complete backward compatibility and automation-friendly behavior.
The recommended mode is derived from these capabilities, but explicit flags and environment variables always take precedence.

10
info Normal file
View File

@ -0,0 +1,10 @@
{"timestamp":"2025-08-09T14:03:10.009Z","level":"info","message":"[peekaboo] Building with 0 changed file(s)"}
{"timestamp":"2025-08-09T14:03:10.010Z","level":"info","message":"[peekaboo] Build already in progress, skipping"}
{"timestamp":"2025-08-09T14:04:43.562Z","level":"info","message":"[peekaboo] Building with 0 changed file(s)"}
{"timestamp":"2025-08-09T14:04:43.563Z","level":"info","message":"[peekaboo] Build already in progress, skipping"}
{"timestamp":"2025-08-09T14:07:54.747Z","level":"info","message":"[peekaboo] Building with 0 changed file(s)"}
{"timestamp":"2025-08-09T14:07:54.747Z","level":"info","message":"[peekaboo] Build already in progress, skipping"}
{"timestamp":"2025-08-09T14:08:24.930Z","level":"info","message":"[peekaboo] Building with 0 changed file(s)"}
{"timestamp":"2025-08-09T14:08:24.931Z","level":"info","message":"[peekaboo] Build already in progress, skipping"}
{"timestamp":"2025-08-09T14:16:17.034Z","level":"info","message":"[peekaboo] [peekaboo] Building with 0 changed file(s)"}
{"timestamp":"2025-08-09T14:16:17.034Z","level":"info","message":"[peekaboo] [peekaboo] Build already in progress, skipping"}

View File

@ -2,7 +2,8 @@
import Foundation
let apiKey = ProcessInfo.processInfo.environment["X_AI_API_KEY"] ?? ProcessInfo.processInfo.environment["XAI_API_KEY"] ?? ""
let apiKey = ProcessInfo.processInfo.environment["X_AI_API_KEY"] ?? ProcessInfo.processInfo
.environment["XAI_API_KEY"] ?? ""
guard !apiKey.isEmpty else {
print("Error: X_AI_API_KEY or XAI_API_KEY not set")
exit(1)
@ -59,12 +60,12 @@ var startTime = Date()
let task = URLSession.shared.dataTask(with: request) { data, response, error in
let duration = Date().timeIntervalSince(startTime)
print("Response received after \(String(format: "%.2f", duration))s")
if let error = error {
if let error {
print("❌ Error: \(error)")
} else if let httpResponse = response as? HTTPURLResponse {
print("Response status: \(httpResponse.statusCode)")
if let data = data {
if let data {
let responseText = String(data: data, encoding: .utf8) ?? ""
print("Response (first 1000 chars): \(String(responseText.prefix(1000)))")
}
@ -79,4 +80,4 @@ if semaphore.wait(timeout: .now() + 30) == .timedOut {
print("❌ Request timed out after 30 seconds")
task.cancel()
exit(1)
}
}