Shisa API - Enterprise AI Translation, TTS & ASR APIs

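Voices that sound human

State-of-the-art neural TTS with emotional intelligence

Natural voices

Highly realistic Japanese and English voices with natural intonation and prosody

Emotional expression

Express joy, concern, excitement, or professionalism through emotional voice modulation

Custom voice cloning

Clone your brand voice or create custom voices for consistent audio branding

Real-time streaming

Low-latency streaming for interactive applications and real-time conversations

Multiple languages

Supports Japanese, English, and more with native-quality pronunciation

Enterprise-grade

99.9% uptime SLA, secure processing, and dedicated support for enterprise needs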

Quick start

Generate speech in three simple steps.

Step 1: Authentication

Include your API key in the Authorization header of every request.

Authorization: Bearer YOUR_API_KEY
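
As a minimal Python sketch of the header above, the helper below builds the request headers once so they can be reused for every call. The environment variable name SHISA_API_KEY and the function name build_headers are assumptions made for this example; the API itself only requires the Authorization: Bearer header.

```python
import os


def build_headers(api_key=None):
    """Return HTTP headers for a Shisa API request.

    SHISA_API_KEY is a hypothetical environment variable name chosen for
    this sketch; it is not defined by the API.
    """
    key = api_key or os.environ.get("SHISA_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set SHISA_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing api_key explicitly is handy in tests.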

Step 2: Send your first request

Send a POST request containing the text to convert to speech.

curl -s -X POST "https://api.shisa.ai/tts" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "voice_id": "c3abe79a-99b3-4a5f-8549-f5cb42985291",
    "format": "mp3",
    "stream": false,
    "text": "こんにちは。Shisa APIへようこそ。"
  }' \
  --output speech.mp3

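Minimal request

Only voice_id, text, and format are required. Set stream: true to enable real-time streaming.

Step 3: Play the audio

The API returns binary audio data in the format you requested. Save it to a file or stream it directly to an audio player.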

# Play the generated audio
ffplay -nodisp -autoexit speech.mp3

# Or stream directly
curl -s -X POST "https://api.shisa.ai/tts" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"voice_id": "c3abe79a-99b3-4a5f-8549-f5cb42985291", "format": "mp3", "stream": true, "text": "ストリーミングテストです。"}' \
  --output - | ffplay -nodisp -autoexit -

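API endpoints

Two endpoints: one for generating speech and one for listing available voices.

Generate speech

POST https://api.shisa.ai/tts

Converts text to speech audio. Returns binary audio data in the requested format.

List voices

GET https://api.shisa.ai/tts/voices

Returns a JSON array of all available voices, including metadata, supported formats, and streaming capability.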

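Request parameters

Parameters for the POST /tts endpoint.

POST /tts parameters

Parameter	Type	Required	Description
voice_id	string	Required	UUID of the voice to use. Get available IDs from GET /tts/voices.
text	string	Required	Text to convert to speech. Maximum 5,000 characters.
format	string	Required	Output audio format. Must be a format supported by the selected voice. Options: `mp3`, `wav`, `ogg`, `pcm`, `flac`
stream	boolean	Optional	When true, returns audio as a chunked stream for real-time playback. Only available for voices with streaming: true. Default: `false`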
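
The constraints in the table can be checked on the client before a request is sent, which avoids a round trip that would end in a 400. The sketch below is one possible way to do that. The function name validate_tts_request is hypothetical, and counting characters with len() (Unicode code points) is an assumption, since the page does not say how the 5,000-character limit is measured.

```python
MAX_TEXT_CHARS = 5000  # limit from the parameter table
KNOWN_FORMATS = {"mp3", "wav", "ogg", "pcm", "flac"}  # options from the table


def validate_tts_request(payload, voice=None):
    """Return a list of problems found in a POST /tts payload (empty if none).

    voice is an optional voice object from GET /tts/voices; when given,
    the format and stream settings are also checked against it.
    """
    problems = []
    for field in ("voice_id", "text", "format"):
        if not payload.get(field):
            problems.append(f"missing required field: {field}")

    text = payload.get("text") or ""
    # Assumption: the limit counts Unicode code points, which is what len() counts.
    if len(text) > MAX_TEXT_CHARS:
        problems.append(f"text has {len(text)} characters; the limit is {MAX_TEXT_CHARS}")

    fmt = payload.get("format")
    if fmt and fmt not in KNOWN_FORMATS:
        problems.append(f"unknown format: {fmt}")

    if "stream" in payload and not isinstance(payload["stream"], bool):
        problems.append("stream must be a boolean")

    if voice is not None:
        if fmt in KNOWN_FORMATS and fmt not in voice.get("formats", []):
            problems.append(f"voice does not support format: {fmt}")
        if payload.get("stream") is True and not voice.get("streaming", False):
            problems.append("voice does not support streaming")
    return problems
```

An empty list means the payload satisfies every rule stated in the table; the server remains the final authority.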

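Response format

Response formats for speech generation and the voice list.

POST /tts: binary audio

On success, the API returns raw binary audio data with an appropriate Content-Type header (for example, audio/mp3). Save the response body directly to a file.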

# The response is binary audio data — save directly to file
curl -s -X POST "https://api.shisa.ai/tts" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"voice_id": "c3abe79a-99b3-4a5f-8549-f5cb42985291", "format": "mp3", "text": "テスト"}' \
  --output speech.mp3
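
Because a failed request returns a JSON error body rather than audio, it can be worth checking the response before writing it to disk. The following Python sketch shows one way to guard the save step. The function name save_audio_response is hypothetical, and treating any 2xx status with a Content-Type beginning with audio/ as success is an assumption based on the description above, not documented behavior.

```python
def save_audio_response(status, content_type, body, path):
    """Write body to path only if the response looks like audio.

    Returns the number of bytes written. Assumes success is a 2xx status
    with a Content-Type that starts with "audio/".
    """
    if not 200 <= status < 300:
        raise RuntimeError(f"request failed with HTTP {status}")
    if not (content_type or "").lower().startswith("audio/"):
        raise RuntimeError(f"expected audio, got Content-Type {content_type!r}")
    with open(path, "wb") as f:
        return f.write(body)
```

With an HTTP client of your choice, pass in the status code, the Content-Type header, and the raw response bytes.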

GET /tts/voices: JSON

Returns a JSON array of available voice objects.

[
  {
    "id": "c3abe79a-99b3-4a5f-8549-f5cb42985291",
    "description": "Young male Japanese voice...",
    "language": "Japanese & English",
    "gender": "Male",
    "formats": ["mp3", "ogg", "pcm"],
    "streaming": true
  }
]

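Voice fields

id: UUID to use as voice_id in requests
description: Human-readable description of the voice
language: Supported languages
gender: Voice gender (Male, Female, Neutral)
formats: Supported output audio formats
streaming: Whether real-time streaming is supported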
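
The fields above are enough to pick a suitable voice programmatically. The sketch below filters a voice list that has already been fetched from GET /tts/voices. The function name find_voices is hypothetical, and matching language by case-insensitive substring is an assumption, since the sample value ("Japanese & English") looks like free-form text.

```python
def find_voices(voices, fmt=None, streaming=None, gender=None, language=None):
    """Filter voice objects by the documented fields.

    Any argument left as None is ignored. language is matched as a
    case-insensitive substring (an assumption about the field's format).
    """
    matches = []
    for voice in voices:
        if fmt is not None and fmt not in voice.get("formats", []):
            continue
        if streaming is not None and voice.get("streaming") != streaming:
            continue
        if gender is not None and voice.get("gender", "").lower() != gender.lower():
            continue
        if language is not None and language.lower() not in voice.get("language", "").lower():
            continue
        matches.append(voice)
    return matches
```

For example, find_voices(voices, fmt="mp3", streaming=True) returns only voices that can stream MP3 audio.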

Error handling

Error responses and how to handle them.

Error response format

{
  "context": ["..."],
  "code": 104,
  "name": "ErrAuthenticationFailed",
  "error": "Authentication error: Invalid token"
}
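
A client can turn this JSON body into a typed exception so callers can branch on code or name. The sketch below is one way to do that in Python. ShisaAPIError and parse_error are hypothetical names introduced for this example, and the fallback for non-JSON bodies is an assumption about how a proxy or gateway error might look.

```python
import json


class ShisaAPIError(Exception):
    """Hypothetical exception carrying the documented error fields."""

    def __init__(self, status, code=None, name=None, message=None, context=None):
        super().__init__(f"HTTP {status} {name or 'UnknownError'}: {message or 'no message'}")
        self.status = status
        self.code = code
        self.name = name
        self.message = message
        self.context = context or []


def parse_error(status, body):
    """Build a ShisaAPIError from an HTTP status and a response body (str or bytes)."""
    try:
        data = json.loads(body)
    except (ValueError, TypeError):
        data = {}  # body was not JSON, for example an HTML error page
    if not isinstance(data, dict):
        data = {}
    return ShisaAPIError(
        status,
        code=data.get("code"),
        name=data.get("name"),
        message=data.get("error"),
        context=data.get("context"),
    )
```

Raise the returned object when a request fails, for instance raise parse_error(response.status_code, response.content).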

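Error codes

Status	Cause	Solution
400	Missing or invalid parameter	Check the voice_id, text, and format fields
400	Format not supported by the voice	Use a format listed in the voice's formats array
401	Invalid or missing API key	Check the Authorization: Bearer header
429	Rate limit exceeded	Wait using exponential backoff, then retry
500	Internal server error	Retry the request or contact support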
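
The retry advice for 429 and 500 can be sketched as a small wrapper around any function that sends a request. Everything here is illustrative: call_with_retry and backoff_delays are hypothetical names, and the base delay, cap, retry count, and jitter are assumed values, since the page does not publish rate-limit windows or recommended delays.

```python
import random
import time

RETRYABLE = {429, 500}  # statuses the table says to retry


def backoff_delays(max_retries=5, base=0.5, cap=30.0):
    """Yield capped exponential delays: base, 2*base, 4*base, ..."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def call_with_retry(send, max_retries=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out.

    send is any zero-argument callable returning (status, body).
    Returns the last (status, body) pair.
    """
    status, body = send()
    for delay in backoff_delays(max_retries, base, cap):
        if status not in RETRYABLE:
            break
        # Add random jitter so many clients do not retry at the same instant.
        sleep(delay + random.uniform(0, delay / 2))
        status, body = send()
    return status, body
```

Errors such as 400 and 401 are returned immediately, because repeating the same request would fail the same way.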

Simple integration

Start generating speech in minutes with an easy-to-use API

Quick start with cURL

# List available voices
curl -s -X GET "https://api.shisa.ai/tts/voices" \
  -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Generate speech
curl -s -X POST "https://api.shisa.ai/tts" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "voice_id": "c3abe79a-99b3-4a5f-8549-f5cb42985291",
    "format": "mp3",
    "stream": false,
    "text": "こんにちは。Shisa APIへようこそ。"
  }' \
  --output speech.mp3

# Stream audio directly to a player
curl -s -X POST "https://api.shisa.ai/tts" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "voice_id": "c3abe79a-99b3-4a5f-8549-f5cb42985291",
    "format": "mp3",
    "stream": true,
    "text": "ストリーミングテストです。"
  }' \
  --output - | ffplay -nodisp -autoexit -

Python integration

import requests

API_URL = "https://api.shisa.ai"
API_KEY = "YOUR_API_KEY"
HEADERS = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

# List available voices
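def list_voices():
    response = requests.get(f"{API_URL}/tts/voices", headers=HEADERS)
    # Raise on 4xx/5xx instead of returning an error body as if it were a voice list
    response.raise_for_status()
    return response.json()

# Generate speech
def generate_speech(text, voice_id="c3abe79a-99b3-4a5f-8549-f5cb42985291", audio_format="mp3", stream=False):
    response = requests.post(
        f"{API_URL}/tts",
        headers=HEADERS,
        json={
            "voice_id": voice_id,
            "format": audio_format,
            "stream": stream,
            "text": text
        },
        stream=stream
    )
    # Fail early instead of writing a JSON error body to the audio file
    response.raise_for_status()

    output_file = f"output.{audio_format}"
    with open(output_file, "wb") as f:
        if stream:
            # iter_content() defaults to 1-byte chunks; read larger blocks instead
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        else:
            f.write(response.content)

    return output_file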

# Example usage
voices = list_voices()
print(voices)

audio_file = generate_speech(
    "お客様の声を大切にしています。",
    voice_id="c3abe79a-99b3-4a5f-8549-f5cb42985291"
)

JavaScript/TypeScript streaming

const API_URL = 'https://api.shisa.ai';
const API_KEY = 'YOUR_API_KEY';
const headers = {
  'Authorization': `Bearer ${API_KEY}`,
  'Content-Type': 'application/json',
};

// List available voices
const listVoices = async () => {
  const response = await fetch(`${API_URL}/tts/voices`, { headers });
  return response.json();
};

// Generate speech
const generateSpeech = async (text, voiceId = 'c3abe79a-99b3-4a5f-8549-f5cb42985291', format = 'mp3') => {
  const response = await fetch(`${API_URL}/tts`, {
    method: 'POST',
    headers,
    body: JSON.stringify({
      voice_id: voiceId,
      format,
      stream: true,
      text,
    }),
  });

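  // Surface API errors instead of trying to play a JSON error body
  if (!response.ok) {
    throw new Error(`TTS request failed with HTTP ${response.status}`);
  }

  // Handle streaming response
  const reader = response.body.getReader();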
  const chunks = [];

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
  }

  // Combine chunks and create audio blob
  const blob = new Blob(chunks, { type: `audio/${format}` });
  const url = URL.createObjectURL(blob);

  // Play audio
  const audio = new Audio(url);
  audio.play();
};

// Example usage
const voices = await listVoices();
console.log(voices);

await generateSpeech('ようこそ、Shisa APIへ。', 'c3abe79a-99b3-4a5f-8549-f5cb42985291');

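Trusted use cases

See how businesses use our TTS API

Virtual assistants

Power AI assistants, chatbots, and voice interfaces with natural speech for engaging user interactions.

AI phone agents
Smart home assistants
Interactive voice response
Voice-enabled apps

Audiobooks and content

Create audiobooks, podcasts, and educational content at scale with professional-quality narration.

Audiobook narration
Online learning courses
Podcast production
Video voice-overs

Accessibility

Make content accessible to visually impaired users and provide audio alternatives for all users.

Screen readers
News article audio
Document narration
Navigation assistance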

Give your application a voice

Start with 20,000 free characters per month. Upgrade anytime for more capacity and features.
