Agent Plugins: Reference Starters

Voice Transcribe (Deepgram)

The Speech Transcription plugin helps you turn speech into text and text into speech. It uses Deepgram to do this work.

This plugin can do three main things:

  • Turn a speech recording into written text (transcription).
  • Turn written text into speech (text-to-speech).
  • Chat with an AI model using the transcribed text.

Setting up your agent

  • Go to the "Add Panel" section and pick the Speech Transcription plugin.
  • Choose a Model from the list.
  • Pick a Simple Model for easier tasks like naming your conversations.
  • Set the Context Size for the model you're using.
  • Write a System Message to tell the model how you want it to act.
  • Enter your Deepgram API Key in the settings.
  • Optionally, select the default Deepgram voice for text-to-speech and the speech-to-text transcription model you would like to use.

After you add your API key, your agent will be ready to use!

Using the /transcribe command

To turn speech into text:

  • Upload your file.
  • Type /transcribe in your chat.
  • Add /file filename.mp3 to your command to tell the plugin which file to use.
  • If you want the plugin to identify who is speaking (speaker diarization), add /diarize to your command.

For example, a prompt using this command would look like:

/transcribe /file myrecording.mp3 /diarize
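
To illustrate how a command like the one above breaks down, here is a minimal Python sketch. It is not the plugin's actual parser: the function names parse_transcribe_command and build_listen_url are hypothetical, the model name "nova-2" is only a placeholder, and the sketch assumes filenames contain no spaces. The URL uses Deepgram's pre-recorded transcription endpoint with its model and diarize query options.

```python
from urllib.parse import urlencode


# Hypothetical helper: the plugin's real parser may work differently.
def parse_transcribe_command(prompt: str) -> dict:
    """Extract the filename and the diarize flag from a /transcribe prompt."""
    tokens = prompt.split()
    if "/transcribe" not in tokens:
        raise ValueError("prompt does not contain /transcribe")

    filename = None
    if "/file" in tokens:
        idx = tokens.index("/file")
        # Assumption: the filename is the single token right after /file.
        if idx + 1 < len(tokens) and not tokens[idx + 1].startswith("/"):
            filename = tokens[idx + 1]
    if filename is None:
        raise ValueError("missing /file filename")

    return {"filename": filename, "diarize": "/diarize" in tokens}


# Hypothetical helper: builds the request URL only, it does not send anything.
def build_listen_url(model: str, diarize: bool) -> str:
    """Build a Deepgram transcription URL with the chosen options."""
    params = {"model": model}
    if diarize:
        params["diarize"] = "true"
    return "https://api.deepgram.com/v1/listen?" + urlencode(params)


command = parse_transcribe_command("/transcribe /file myrecording.mp3 /diarize")
print(command)
print(build_listen_url("nova-2", command["diarize"]))
```

In this sketch the order of /file and /diarize does not matter, since each option is looked up by name.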

Using the /speak command

To turn text into speech:

  • Type /speak in your chat.
  • Write the text you want to turn into speech.
  • If you want to use a specific voice, add /voice voice_name at the end.

For example, you might type:

/speak Hello, how are you today? /voice aura-asteria-en

If you don't choose a voice, the plugin will use the one you set up earlier.
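
The fallback to the default voice can be sketched as follows. This is an illustrative Python example, not the plugin's code: parse_speak_command and DEFAULT_VOICE are hypothetical names, and the default here stands in for whatever voice is configured in the agent settings.

```python
# Hypothetical default; in the plugin this would come from the agent settings.
DEFAULT_VOICE = "aura-orion-en"


# Hypothetical helper: the plugin's real parser may work differently.
def parse_speak_command(prompt: str, default_voice: str = DEFAULT_VOICE) -> tuple[str, str]:
    """Return (text, voice) from a /speak prompt."""
    body = prompt.strip()
    if not body.startswith("/speak"):
        raise ValueError("prompt must start with /speak")
    body = body[len("/speak"):].strip()

    voice = default_voice
    # Split on the last /voice marker, if there is one.
    text, marker, tail = body.rpartition("/voice")
    if marker:
        voice_tokens = tail.split()
        if voice_tokens:
            voice = voice_tokens[0]
        body = text.strip()

    if not body:
        raise ValueError("no text to speak")
    return body, voice


print(parse_speak_command("/speak Hello, how are you today? /voice aura-asteria-en"))
print(parse_speak_command("/speak Hello, how are you today?"))
```

The first call uses the voice named in the prompt; the second falls back to the default because no /voice option is present.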

Available voices include:

  • aura-orion-en
  • aura-arcas-en
  • aura-perseus-en
  • aura-angus-en
  • aura-orpheus-en
  • aura-helios-en
  • aura-zeus-en

You can find an updated list of available Deepgram voices over at:

https://developers.deepgram.com/docs/tts-models
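
For readers curious how a voice name is used, the sketch below builds a request to Deepgram's text-to-speech endpoint, where the voice is passed as the model query parameter. It is a rough illustration under assumptions, not the plugin's implementation: build_speak_request is a hypothetical helper, the API key is a placeholder, and the request is only constructed, not sent.

```python
import json
import urllib.request


# Hypothetical helper: constructs the request without sending it.
def build_speak_request(text: str, voice: str, api_key: str) -> urllib.request.Request:
    """Build a POST request asking Deepgram to speak `text` with `voice`."""
    url = f"https://api.deepgram.com/v1/speak?model={voice}"
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_speak_request(
    "Hello, how are you today?",
    "aura-asteria-en",
    "YOUR_DEEPGRAM_API_KEY",  # placeholder, replace with a real key
)
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     audio_bytes = response.read()
```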

Review the code

Want to review how this plugin is built, or use it as the base for creating a new agent plugin?

You can find the code that powers this plugin here:
https://github.com/promptpanel/promptpanel/tree/main/plugins/voice_deepgram