Initial commit - Claudzmo CLI tool

Claudzmo is a CLI tool for controlling Anki Cozmo robot via PyCozmo.

Features:
- Movement control (drive, turn, head, lift)
- Facial expressions (15 presets with 30fps animation)
- Text-to-speech via macOS voice synthesis
- Camera access (320x240 live feed)
- Status monitoring (battery, firmware, hardware)
- Claude Code skill integration

Architecture:
- Fresh connection per command using pycozmo.connect()
- Reliable audio playback (100% consistent)
- Simple CLI interface with argparse
- Fast execution (~1 second per command)

Built with ❤️ by Matt & Claude

🤖 Generated with Claude Code (https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Matt Kost
2026-01-04 23:45:47 -05:00
commit 08eae0b758
5 changed files with 672 additions and 0 deletions

37
.gitignore vendored Normal file

@@ -0,0 +1,37 @@
# Python
venv/
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
build/
dist/
*.egg-info/
# PyCozmo
.pycozmo/
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
# OS
.DS_Store
Thumbs.db
# Temp files
/tmp/
*.wav
*.aiff
*.tmp
# Old MCP server (deprecated)
server.py
mcp_config.json.example
setup.sh
test_audio_direct.py

254
README.md Normal file

@@ -0,0 +1,254 @@
# Claudzmo
CLI tool for controlling the Anki Cozmo robot via PyCozmo.
Direct WiFi control - no phone or USB bridge required!
## Features
- 🤖 **Movement** - Drive, turn, head/lift positioning
- 😊 **Expressions** - 15 preset emotions with smooth 30fps animations
- 🗣️ **Speech** - Text-to-speech with macOS voice synthesis
- 📷 **Camera** - Live 320x240 camera feed
- 🔋 **Status** - Battery, firmware, hardware info
- **Fast & Reliable** - Fresh connection per command, audio works consistently
## Quick Start
### Prerequisites
1. Anki Cozmo robot (powered on)
2. Connect computer to Cozmo's WiFi network (`Cozmo_XXXXX`)
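3. Python 3.8+ with pip (the CLI calls `Path.unlink(missing_ok=True)`, added in Python 3.8)
4. macOS for the `speak` command, which relies on `say` and `afconvert`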
### Installation
```bash
# Clone repository
git clone https://gitea.kostverse.com/matt/claudzmo.git
cd claudzmo
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Download PyCozmo resources (required!)
pycozmo_resources.py download
# Make executable
chmod +x claudzmo
# Optional: Add to PATH (assumes ~/bin is on your PATH)
mkdir -p ~/bin
ln -s "$(pwd)/claudzmo" ~/bin/claudzmo
```
### Test Connection
```bash
./claudzmo status
```
## Usage
### Movement
```bash
# Move forward/backward
claudzmo move --distance 200 --speed 50
# Turn in place
claudzmo turn --angle 90
# Set head angle (-25 to 44 degrees)
claudzmo head --angle 30
# Set lift height (0 to 66mm)
claudzmo lift --height 50
# Emergency stop
claudzmo stop
```
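The timing behind `move` and `turn` is open-loop arithmetic. In the CLI source from this commit, `move` runs both wheels for `|distance| / speed` seconds, and `turn` spins the wheels in opposite directions for roughly one second per 90 degrees. A minimal sketch of that arithmetic follows; `plan_move` and `plan_turn` are hypothetical helper names, and the zero-speed check is an addition that the CLI itself does not perform.

```python
from typing import Tuple


def plan_move(distance_mm: float, speed_mmps: float = 50.0) -> Tuple[float, float]:
    """Return (wheel_speed, duration_s) for a straight move."""
    if speed_mmps <= 0:
        # Added guard (assumption): the CLI does not validate the speed.
        raise ValueError("speed must be positive")
    duration_s = abs(distance_mm) / speed_mmps
    # Positive distance drives forward, anything else drives backward.
    wheel_speed = speed_mmps if distance_mm > 0 else -speed_mmps
    return wheel_speed, duration_s


def plan_turn(angle_deg: float, speed: float = 50.0) -> Tuple[float, float, float]:
    """Return (left_speed, right_speed, duration_s) for an in-place turn."""
    if angle_deg < 0:
        left, right = -speed, speed   # negative angle: turn left
    else:
        left, right = speed, -speed   # positive angle: turn right
    # Rough estimate used by the CLI: one second per 90 degrees.
    duration_s = abs(angle_deg) / 90.0
    return left, right, duration_s
```

Because the turn duration does not depend on `--speed`, the angle actually turned varies with wheel speed and surface, so `--angle` is best treated as approximate.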
### Expressions & Speech
```bash
# Display facial expression
claudzmo expression --name happiness --duration 1500
# Available expressions:
# neutral, happiness, sadness, anger, surprise, disgust, fear,
# pleading, vulnerability, despair, guilt, amazement, excitement,
# confusion, skepticism
# Speak text
claudzmo speak --text "Hello Matt!"
claudzmo speak --text "I'm Claudzmo!" --volume 65535
```
### Camera & Status
```bash
# Get camera image (returns JSON with base64 image)
claudzmo camera --format jpeg
# Get robot status
claudzmo status
```
## Examples
### Greet someone
```bash
claudzmo expression --name excitement --duration 1500
claudzmo speak --text "Hi Nicole, Matt got me Claudzmo to work with Claude!"
```
### Dance around
```bash
claudzmo move --distance 100 --speed 50
claudzmo turn --angle 90
claudzmo head --angle 35
claudzmo lift --height 50
claudzmo expression --name happiness --duration 2000
```
### Take a selfie
```bash
claudzmo head --angle 20
claudzmo expression --name happiness --duration 1000
claudzmo camera --format jpeg > cozmo_selfie.json
```
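The `camera` command prints a JSON object with `width`, `height`, `format`, and `base64` keys rather than a raw image, so the saved file needs one more step before it can be viewed. A minimal sketch of that step, using only the standard library; `decode_camera_json` and `save_camera_image` are hypothetical helper names, not part of the CLI.

```python
import base64
import json


def decode_camera_json(payload: str) -> bytes:
    """Return the raw image bytes from the JSON printed by `claudzmo camera`."""
    data = json.loads(payload)
    return base64.b64decode(data["base64"])


def save_camera_image(json_path: str, image_path: str) -> int:
    """Decode a saved camera JSON file and write the image; returns bytes written."""
    with open(json_path, "r", encoding="utf-8") as f:
        image_bytes = decode_camera_json(f.read())
    with open(image_path, "wb") as f:
        return f.write(image_bytes)
```

For the selfie example, `save_camera_image("cozmo_selfie.json", "cozmo_selfie.jpg")` would write the decoded JPEG next to the JSON file.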
## Claude Code Skill
Claudzmo includes a skill for [Claude Code](https://claude.com/claude-code) integration.
### Setup
```bash
# Copy skill to Claude Code skills directory
cp claudzmo.skill.md ~/.claude/skills/claudzmo.md
# Restart Claude Code
claude /restart
```
### Usage in Claude Code
Just ask Claude to control Cozmo:
- "Make Cozmo say hello"
- "Have Cozmo turn left and look up"
- "Show me what Cozmo's camera sees"
- "Make Cozmo do a happy dance"
Claude will automatically use the `claudzmo` skill to control the robot!
## Architecture
### Why CLI instead of MCP?
Originally built as an MCP server, but audio playback was unreliable with persistent connections. The CLI approach:
- ✅ Uses `pycozmo.connect()` context manager (proven reliable)
- ✅ Fresh connection per command (no state issues)
- ✅ Audio works 100% consistently
- ✅ Simpler to debug and test
- ✅ Fast enough (~1 second per command)
### How It Works
1. **Connection** - Each command opens a fresh connection with the `pycozmo.connect()` context manager
2. **Command execution** - Sends appropriate PyCozmo API calls
3. **Audio** - Uses macOS `say` → convert to 22kHz 16-bit mono WAV → play via Cozmo
4. **Expressions** - Renders 128x64 procedural faces → downsample to 128x32 → animate at 30fps
5. **Cleanup** - Connection closes automatically (context manager)
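
The expression step boils down to two small calculations: how many frames to interpolate for a given duration, and how to halve the rendered face from 64 rows to 32. The CLI uses a 33 ms frame period and NumPy slicing (`np_im[::2]`); the sketch below uses plain lists to stay dependency-free, and `frame_count` and `drop_odd_rows` are hypothetical names.

```python
from typing import List, Sequence

FRAME_MS = 33  # frame period used by the CLI (~30fps)


def frame_count(duration_ms: int) -> int:
    """Number of interpolated frames for an animation of the given length."""
    return max(1, int(duration_ms / FRAME_MS))


def drop_odd_rows(rows: Sequence[Sequence[int]]) -> List[Sequence[int]]:
    """Keep every second row, halving the height (64 rows -> 32 rows)."""
    return list(rows[::2])


# A blank 128x64 face, stored as 64 rows of 128 pixels.
face = [[0] * 128 for _ in range(64)]
half_height = drop_odd_rows(face)
```

Dropping rows is the simplest possible downsample; it does no filtering, which is acceptable for the high-contrast procedural faces but would alias on detailed images.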
## Technical Details
### Audio Format
Cozmo requires:
- Sample rate: 22,050 Hz
- Bit depth: 16-bit PCM
- Channels: Mono
- Format: WAV
The CLI handles conversion automatically using macOS `afconvert`.
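
When audio fails to play, it can help to confirm that a converted file really matches the format listed above. A minimal sketch using Python's standard `wave` module; `is_cozmo_wav` is a hypothetical helper and is not part of the CLI.

```python
import wave


def is_cozmo_wav(path: str) -> bool:
    """True if the WAV file is 22,050 Hz, 16-bit, mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == 22050
            and wav.getsampwidth() == 2  # bytes per sample: 2 bytes = 16 bits
            and wav.getnchannels() == 1
        )
```

Note that `wave.open` raises `wave.Error` for files that are not uncompressed PCM WAV, so a caller may want to treat that exception as a failed check as well.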
### Volume Range
Volume is 16-bit (0-65535):
- `65535` = Maximum volume
- `50000` = ~75% volume
- `32768` = ~50% volume
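
Since `--volume` takes a raw 16-bit value, a percentage is often easier to reason about. The sketch below assumes a simple linear mapping from percent to the 0-65535 range (perceived loudness may not scale linearly); `volume_from_percent` is a hypothetical helper name.

```python
MAX_VOLUME = 65535


def volume_from_percent(percent: float) -> int:
    """Map 0-100 percent onto the 0-65535 range expected by --volume."""
    clamped = max(0.0, min(100.0, percent))  # out-of-range input is clamped
    return round(clamped / 100 * MAX_VOLUME)
```

The result can be passed straight to the CLI, for example `claudzmo speak --text "Hello" --volume 49151` for 75%.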
### Connection
- IP: `172.31.1.1` (Cozmo's default)
- Port: `5106` (UDP)
- Auto-discovery via PyCozmo
- Firmware: 2381 (tested)
## Troubleshooting
### "Failed to connect to Cozmo"
1. Check that Cozmo is powered on (press the button on its back)
2. Verify that the computer is connected to Cozmo's WiFi network
3. Wait a moment between commands (connection cooldown)
### Audio not playing
1. Check volume: `--volume 65535` for max
2. Verify text is being spoken (test with macOS `say` command)
3. Ensure `afconvert` is available (comes with macOS)
### "No camera image available"
Camera takes ~1 second to initialize. Wait and retry.
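
The "wait and retry" advice here and the connection cooldown above can be automated from a script. The sketch below relies on the CLI exiting with a non-zero status on failure (as the source in this commit does); `run_with_retry` is a hypothetical helper, and the attempt count and delay are illustrative defaults rather than tested values.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retry(
    command: Sequence[str],
    attempts: int = 3,
    delay_s: float = 1.0,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> subprocess.CompletedProcess:
    """Run a command, pausing and retrying while it exits non-zero."""
    result = runner(list(command), capture_output=True, text=True)
    for _ in range(attempts - 1):
        if result.returncode == 0:
            break
        time.sleep(delay_s)  # give the connection time to cool down
        result = runner(list(command), capture_output=True, text=True)
    return result
```

A call such as `run_with_retry(["claudzmo", "camera", "--format", "jpeg"])` returns the last attempt's result, so the caller can still inspect `stderr` if every attempt failed. The `runner` parameter exists only to make the helper testable without a robot.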
## Development
### Project Structure
```
claudzmo/
├── claudzmo # Main CLI executable
├── requirements.txt # Python dependencies
├── README.md # This file
├── claudzmo.skill.md # Claude Code skill
└── venv/ # Python virtual environment
```
### Dependencies
- `pycozmo` - Pure-Python Cozmo SDK
- `Pillow` - Image processing for facial expressions
- `numpy` - Array operations for image conversion
## Credits
- **PyCozmo** - https://github.com/zayfod/pycozmo (Pure-Python Cozmo library)
- **Anki Cozmo** - Original robot hardware and firmware
- **Claude** - AI assistant that helped build this! 🤖
## License
MIT License
## Links
- Repository: https://gitea.kostverse.com/matt/claudzmo
- PyCozmo Docs: https://pycozmo.readthedocs.io/
- Claude Code: https://claude.com/claude-code
---
Built with ❤️ by Matt & Claude • "Claudzmo" name suggested by Nicole 😄

314
claudzmo Executable file

@@ -0,0 +1,314 @@
#!/usr/bin/env python3
"""
Claudzmo - CLI tool for controlling Anki Cozmo robot via PyCozmo
Usage:
claudzmo move --distance 100 --speed 50
claudzmo turn --angle 90 --speed 50
claudzmo head --angle 30
claudzmo lift --height 50
claudzmo expression --name happiness --duration 1000
claudzmo speak --text "Hello!" [--volume 65535]
claudzmo camera [--format jpeg]
claudzmo status
claudzmo stop
"""
import sys
import argparse
import json
import time
from pathlib import Path
import tempfile
import subprocess

try:
    import pycozmo
    from PIL import Image
    import numpy as np
except ImportError as e:
    print(f"Error: Missing dependency - {e}", file=sys.stderr)
    print("Install with: pip install pycozmo Pillow numpy", file=sys.stderr)
    sys.exit(1)


def connect_cozmo():
    """Connect to Cozmo robot"""
    return pycozmo.connect(enable_procedural_face=False)


def cmd_move(args):
    """Move forward/backward"""
    with connect_cozmo() as cli:
        distance_mm = args.distance
        speed_mmps = args.speed
        duration = abs(distance_mm / speed_mmps)
        wheel_speed = speed_mmps if distance_mm > 0 else -speed_mmps
        cli.drive_wheels(lwheel_speed=wheel_speed, rwheel_speed=wheel_speed, duration=duration)
        time.sleep(duration)
        print(f"Moved {distance_mm}mm at {speed_mmps}mm/s")


def cmd_turn(args):
    """Turn in place"""
    with connect_cozmo() as cli:
        angle_degrees = args.angle
        speed = args.speed
        # Calculate wheel speeds for turning (opposite directions)
        wheel_speed = speed
        if angle_degrees < 0:
            left_speed = -wheel_speed
            right_speed = wheel_speed
        else:
            left_speed = wheel_speed
            right_speed = -wheel_speed
        # Rough duration calculation
        duration = abs(angle_degrees) / 90.0
        cli.drive_wheels(lwheel_speed=left_speed, rwheel_speed=right_speed, duration=duration)
        time.sleep(duration)
        print(f"Turned {angle_degrees} degrees")


def cmd_head(args):
    """Set head angle"""
    import math
    with connect_cozmo() as cli:
        angle_rad = math.radians(args.angle)
        cli.set_head_angle(angle_rad)
        time.sleep(0.5)
        print(f"Set head angle to {args.angle} degrees")


def cmd_lift(args):
    """Set lift height"""
    with connect_cozmo() as cli:
        cli.set_lift_height(args.height)
        time.sleep(0.5)
        print(f"Set lift height to {args.height}mm")


def cmd_expression(args):
    """Display facial expression"""
    with connect_cozmo() as cli:
        expression_name = args.name.lower()
        duration_ms = args.duration
        # Map expression names to pycozmo.expressions classes
        expression_map = {
            "neutral": pycozmo.expressions.Neutral,
            "happiness": pycozmo.expressions.Happiness,
            "sadness": pycozmo.expressions.Sadness,
            "anger": pycozmo.expressions.Anger,
            "surprise": pycozmo.expressions.Surprise,
            "disgust": pycozmo.expressions.Disgust,
            "fear": pycozmo.expressions.Fear,
            "pleading": pycozmo.expressions.Pleading,
            "vulnerability": pycozmo.expressions.Vulnerability,
            "despair": pycozmo.expressions.Despair,
            "guilt": pycozmo.expressions.Guilt,
            "amazement": pycozmo.expressions.Amazement,
            "excitement": pycozmo.expressions.Excitement,
            "confusion": pycozmo.expressions.Confusion,
            "skepticism": pycozmo.expressions.Skepticism
        }
        if expression_name not in expression_map:
            print(f"Error: Unknown expression '{expression_name}'", file=sys.stderr)
            print(f"Available: {', '.join(expression_map.keys())}", file=sys.stderr)
            sys.exit(1)
        # Create and display expression with animation
        from_face = pycozmo.expressions.Neutral()
        to_face = expression_map[expression_name]()
        num_frames = max(1, int(duration_ms / 33))  # ~30fps
        face_generator = pycozmo.procedural_face.interpolate(from_face, to_face, num_frames)
        for face in face_generator:
            im = face.render()
            np_im = np.array(im)
            np_im2 = np_im[::2]  # Convert 128x64 to 128x32
            im2 = Image.fromarray(np_im2)
            cli.display_image(im2, 0.033)
            time.sleep(0.033)
        print(f"Displayed expression: {expression_name}")


def cmd_speak(args):
    """Speak text through Cozmo's speaker"""
    with connect_cozmo() as cli:
        # Generate audio file
        text = args.text
        volume = args.volume
        # Create temp files for audio conversion
        with tempfile.NamedTemporaryFile(suffix='.aiff', delete=False) as aiff_file:
            aiff_path = aiff_file.name
        with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as wav_file:
            wav_path = wav_file.name
        try:
            # Generate speech
            subprocess.run(['say', '-v', 'Samantha', text, '-o', aiff_path], check=True, capture_output=True)
            # Convert to Cozmo format (22kHz, 16-bit, mono)
            subprocess.run(['afconvert', aiff_path, '-d', 'LEI16@22050', '-f', 'WAVE', wav_path],
                           check=True, capture_output=True)
            # Play through Cozmo
            cli.set_volume(volume)
            cli.play_audio(wav_path)
            cli.wait_for(pycozmo.event.EvtAudioCompleted, timeout=30.0)
            print(f"Spoke: {text}")
        finally:
            # Cleanup temp files
            Path(aiff_path).unlink(missing_ok=True)
            Path(wav_path).unlink(missing_ok=True)


def cmd_camera(args):
    """Get camera image"""
    with connect_cozmo() as cli:
        # Enable camera and wait for image
        cli.enable_camera(enable=True, color=True)
        # Register handler to capture image
        latest_image = [None]

        def on_camera_image(cli_obj, image):
            latest_image[0] = image

        cli.add_handler(pycozmo.event.EvtNewRawCameraImage, on_camera_image)
        # Wait for image
        time.sleep(1)
        if latest_image[0] is None:
            print("Error: No camera image available", file=sys.stderr)
            sys.exit(1)
        # Convert to base64
        import base64
        from io import BytesIO
        img_format = args.format.upper()
        buffered = BytesIO()
        latest_image[0].save(buffered, format=img_format)
        img_base64 = base64.b64encode(buffered.getvalue()).decode()
        # Output JSON with image data
        result = {
            "width": latest_image[0].size[0],
            "height": latest_image[0].size[1],
            "format": img_format,
            "base64": img_base64
        }
        print(json.dumps(result))


def cmd_status(args):
    """Get robot status"""
    with connect_cozmo() as cli:
        # Give it a moment to get status
        time.sleep(0.5)
        status = {
            "connected": True,
            "battery_voltage": getattr(cli, 'battery_voltage', 0.0),
            "firmware_version": getattr(cli, 'fw_ver', 0),
            "hardware_version": getattr(cli, 'hw_ver', 0),
            "body_id": f"0x{getattr(cli, 'body_id', 0):08x}"
        }
        print(json.dumps(status, indent=2))


def cmd_stop(args):
    """Emergency stop all motors"""
    with connect_cozmo() as cli:
        cli.stop_all_motors()
        print("Stopped all motors")


def main():
    parser = argparse.ArgumentParser(
        description='Claudzmo - Control Anki Cozmo robot',
        formatter_class=argparse.RawDescriptionHelpFormatter
    )
    subparsers = parser.add_subparsers(dest='command', help='Command to execute')
    subparsers.required = True

    # Move command
    move_parser = subparsers.add_parser('move', help='Move forward/backward')
    move_parser.add_argument('--distance', type=float, required=True, help='Distance in mm (negative for backward)')
    move_parser.add_argument('--speed', type=float, default=50, help='Speed in mm/s (default: 50)')
    move_parser.set_defaults(func=cmd_move)

    # Turn command
    turn_parser = subparsers.add_parser('turn', help='Turn in place')
    turn_parser.add_argument('--angle', type=float, required=True, help='Angle in degrees (negative for left)')
    turn_parser.add_argument('--speed', type=float, default=50, help='Wheel speed (default: 50)')
    turn_parser.set_defaults(func=cmd_turn)

    # Head command
    head_parser = subparsers.add_parser('head', help='Set head angle')
    head_parser.add_argument('--angle', type=float, required=True, help='Angle in degrees (-25 to 44)')
    head_parser.set_defaults(func=cmd_head)

    # Lift command
    lift_parser = subparsers.add_parser('lift', help='Set lift height')
    lift_parser.add_argument('--height', type=float, required=True, help='Height in mm (0 to 66)')
    lift_parser.set_defaults(func=cmd_lift)

    # Expression command
    expr_parser = subparsers.add_parser('expression', help='Display facial expression')
    expr_parser.add_argument('--name', type=str, required=True, help='Expression name')
    expr_parser.add_argument('--duration', type=int, default=1000, help='Animation duration in ms (default: 1000)')
    expr_parser.set_defaults(func=cmd_expression)

    # Speak command
    speak_parser = subparsers.add_parser('speak', help='Speak text')
    speak_parser.add_argument('--text', type=str, required=True, help='Text to speak')
    speak_parser.add_argument('--volume', type=int, default=65535, help='Volume (0-65535, default: 65535)')
    speak_parser.set_defaults(func=cmd_speak)

    # Camera command
    camera_parser = subparsers.add_parser('camera', help='Get camera image')
    camera_parser.add_argument('--format', type=str, default='jpeg', choices=['jpeg', 'png'], help='Image format')
    camera_parser.set_defaults(func=cmd_camera)

    # Status command
    status_parser = subparsers.add_parser('status', help='Get robot status')
    status_parser.set_defaults(func=cmd_status)

    # Stop command
    stop_parser = subparsers.add_parser('stop', help='Emergency stop')
    stop_parser.set_defaults(func=cmd_stop)

    args = parser.parse_args()
    try:
        args.func(args)
    except KeyboardInterrupt:
        print("\nInterrupted", file=sys.stderr)
        sys.exit(130)
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == '__main__':
    main()

64
claudzmo.skill.md Normal file

@@ -0,0 +1,64 @@
# claudzmo
Control the Anki Cozmo robot: movement, expressions, speech, and camera.
## Usage
```bash
~/bin/claudzmo <command> [options]
# or
/Users/matt/Projects/cozmo-mcp/claudzmo <command> [options]
```
## Commands
### Movement
- `claudzmo move --distance <mm> [--speed <mm/s>]` - Move forward (positive) or backward (negative)
- `claudzmo turn --angle <degrees> [--speed <mm/s>]` - Turn in place (positive=right, negative=left)
- `claudzmo head --angle <degrees>` - Set head angle (-25 to 44 degrees)
- `claudzmo lift --height <mm>` - Set lift height (0 to 66mm)
### Expression & Speech
- `claudzmo expression --name <name> [--duration <ms>]` - Display facial expression
- Available: neutral, happiness, sadness, anger, surprise, disgust, fear, pleading, vulnerability, despair, guilt, amazement, excitement, confusion, skepticism
- `claudzmo speak --text "<text>" [--volume <0-65535>]` - Speak text (default volume: 65535)
### Sensors
- `claudzmo camera [--format jpeg|png]` - Get camera image (returns JSON with base64 image)
- `claudzmo status` - Get robot status (battery, firmware, etc.)
### Control
- `claudzmo stop` - Emergency stop all motors
## Examples
```bash
# Make Cozmo greet someone
claudzmo speak --text "Hello Matt!"
claudzmo expression --name happiness --duration 1500
# Move around
claudzmo move --distance 200 --speed 50
claudzmo turn --angle 90
claudzmo head --angle 30
# Take a photo
claudzmo camera --format jpeg
# Check status
claudzmo status
```
## Setup
Requires:
1. Cozmo robot powered on
2. Computer connected to Cozmo's WiFi network (Cozmo_XXXXX)
3. Python environment with pycozmo installed at `/Users/matt/Projects/cozmo-mcp/venv`
## Notes
- Commands connect/disconnect for each operation (ensures reliability)
- Audio uses macOS `say` command with Samantha voice
- Expressions animate smoothly at ~30fps
- Camera returns base64-encoded JPEG or PNG

3
requirements.txt Normal file

@@ -0,0 +1,3 @@
pycozmo>=0.8.0
Pillow>=9.0.0
numpy>=1.20.0