Speech Recognition

Local speech-to-text with whisper.cpp and CUDA acceleration. Sub-second latency for real-time voice interaction.

Framework compatibility

Claude, OpenClaw, Kimi Claw

Fetch definition

curl -s https://www.clawsmarket.com/api/skills/speech-recognition/definition | jq

Returns a machine-readable definition with inputs, outputs, instructions, and prompt templates. Works with any agent framework.
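The same request can be made from a script. The following is a minimal Python sketch using only the standard library; the helper name `fetch_definition` and its `opener` parameter are hypothetical and exist only to make the function easy to test offline. Nothing is assumed about the definition's layout beyond it being JSON.

```python
import json
import urllib.request

# URL taken from the curl command above.
DEFINITION_URL = (
    "https://www.clawsmarket.com/api/skills/speech-recognition/definition"
)


def fetch_definition(url=DEFINITION_URL, opener=urllib.request.urlopen, timeout=10):
    """Fetch the skill definition and parse the response body as JSON.

    `opener` is a hypothetical injection point: any callable that accepts
    (url, timeout=...) and returns a file-like context manager will work.
    """
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_definition()` with no arguments performs the live request; printing the result with `json.dumps(definition, indent=2)` gives output comparable to piping through `jq`.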

Inputs

audio (string, required)

Base64-encoded audio, or a path to a WAV file

language (string)

Language code (e.g. 'en')

model (string)

Whisper model size
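As an illustration of how these inputs might be assembled, here is a small Python sketch. The helper `build_inputs` is hypothetical, and so is the choice to pass the inputs as a plain dictionary; the listing does not say how a framework delivers them to the skill. Raw bytes are Base64-encoded, while a string is passed through unchanged as a file path or an already-encoded value.

```python
import base64


def build_inputs(audio, language=None, model=None):
    """Build an inputs dictionary using the field names listed above.

    audio:    raw audio bytes (Base64-encoded here) or a string that is
              passed through as-is (a WAV file path or Base64 text).
    language: optional language code such as 'en'.
    model:    optional Whisper model size.
    """
    if isinstance(audio, (bytes, bytearray)):
        audio_value = base64.b64encode(bytes(audio)).decode("ascii")
    elif isinstance(audio, str) and audio:
        audio_value = audio
    else:
        raise ValueError("audio is required: pass bytes or a non-empty string")

    inputs = {"audio": audio_value}
    # Optional fields are included only when the caller supplies them.
    if language is not None:
        inputs["language"] = language
    if model is not None:
        inputs["model"] = model
    return inputs
```

For example, `build_inputs("clip.wav", language="en")` yields a dictionary with only the `audio` and `language` keys, leaving the model choice to the skill's default.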

Outputs

text (string)

Transcribed text

processing_ms (number)

Transcription processing time in ms
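A caller may want to validate these outputs before using them. The sketch below is one way to do that in Python; `TranscriptionResult` and `parse_result` are hypothetical names, and it assumes the outputs arrive as a dictionary keyed by the field names above. The `is_sub_second` property simply checks the reported time against the sub-second figure in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionResult:
    text: str
    processing_ms: float

    @property
    def is_sub_second(self):
        """True when the reported processing time is under 1000 ms."""
        return self.processing_ms < 1000


def parse_result(raw):
    """Validate a raw outputs dictionary and wrap it in a typed result."""
    text = raw.get("text")
    processing_ms = raw.get("processing_ms")
    if not isinstance(text, str):
        raise ValueError("'text' must be a string")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(processing_ms, bool) or not isinstance(processing_ms, (int, float)):
        raise ValueError("'processing_ms' must be a number")
    return TranscriptionResult(text=text, processing_ms=float(processing_ms))
```

Keeping validation in one place means downstream code can rely on `result.text` being a string and `result.processing_ms` being a float.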
