React Native Speech-to-Text

May 30, 2025, by Mr 焦

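There are several ways to implement speech-to-text in a React Native app. This post walks through two of them: a community library that uses the platform's native recognizer, and cloud speech APIs.

1. Using a React Native Community Library

The React Native community maintains open-source libraries for speech recognition. The most widely used is react-native-voice, published on npm as @react-native-community/voice and, in more recent releases, as @react-native-voice/voice.

Installing dependencies

First install the @react-native-community/voice library: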

npm install @react-native-community/voice

Or with Yarn:

yarn add @react-native-community/voice

On React Native 0.60 and later the library is autolinked (on iOS, run pod install in the ios directory afterwards). On older versions, link it manually:

react-native link @react-native-community/voice

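iOS needs additional configuration:

Open the Xcode project and add the following usage descriptions to Info.plist:

<key>NSSpeechRecognitionUsageDescription</key>
<string>We need your permission to use speech recognition</string>
<key>NSMicrophoneUsageDescription</key>
<string>We need microphone access to record your speech</string>

On Android, add the following permission to AndroidManifest.xml: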

<uses-permission android:name="android.permission.RECORD_AUDIO" />

Implementation

The following minimal example shows how to use @react-native-community/voice for speech-to-text:

import React, { useEffect, useState } from 'react';
import { Button, Text, View, PermissionsAndroid, Platform } from 'react-native';
import Voice from '@react-native-community/voice';

const App = () => {
  const [recognizedText, setRecognizedText] = useState('');
  const [isListening, setIsListening] = useState(false);

  // Request the microphone permission at runtime (Android only)
  const requestAudioPermission = async () => {
    if (Platform.OS === 'android') {
      try {
        const granted = await PermissionsAndroid.request(
          PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
          {
            title: 'Microphone Permission',
            message: 'App needs access to your microphone to recognize speech.',
            buttonNeutral: 'Ask Me Later',
            buttonNegative: 'Cancel',
            buttonPositive: 'OK',
          }
        );
        return granted === PermissionsAndroid.RESULTS.GRANTED;
      } catch (err) {
        console.warn(err);
        return false;
      }
    }
    return true; // iOS prompts automatically, using the Info.plist descriptions
  };

  // Register the speech recognition event listeners
  useEffect(() => {
    Voice.onSpeechStart = onSpeechStart;
    Voice.onSpeechRecognized = onSpeechRecognized;
    Voice.onSpeechEnd = onSpeechEnd;
    Voice.onSpeechError = onSpeechError;
    Voice.onSpeechResults = onSpeechResults;

    return () => {
      Voice.destroy().then(Voice.removeAllListeners);
    };
  }, []);

  // Start speech recognition
  const startSpeechToText = async () => {
    const isPermissionGranted = await requestAudioPermission();
    if (!isPermissionGranted) {
      console.log('Microphone permission not granted');
      return;
    }

    try {
      await Voice.start('zh-CN'); // recognize Chinese; change the locale as needed
      setIsListening(true);
    } catch (e) {
      console.error(e);
    }
  };

  // Stop speech recognition
  const stopSpeechToText = async () => {
    try {
      await Voice.stop();
      setIsListening(false);
    } catch (e) {
      console.error(e);
    }
  };

  // Event handlers
  const onSpeechStart = () => {
    console.log('Speech recognition started');
  };

  const onSpeechRecognized = () => {
    console.log('Speech recognized');
  };

  const onSpeechEnd = () => {
    console.log('Speech recognition ended');
    setIsListening(false);
  };

  const onSpeechError = (e) => {
    console.error('Speech recognition error:', e);
    setIsListening(false); // reset the button state after a failure
  };

  const onSpeechResults = (e) => {
    console.log('Speech recognition result:', e.value[0]);
    setRecognizedText(e.value[0]); // use the first candidate
  };

  return (
    <View style={{ flex: 1, justifyContent: 'center', alignItems: 'center' }}>
      <Text style={{ fontSize: 20, marginBottom: 20 }}>Recognized text: {recognizedText}</Text>
      <Button
        title={isListening ? 'Stop recording' : 'Start recording'}
        onPress={isListening ? stopSpeechToText : startSpeechToText}
      />
    </View>
  );
};

export default App;
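
The onSpeechResults handler above reads e.value[0] directly. As a minimal sketch, assuming the results event carries an optional value array of candidate strings as in the example, a small helper can guard against empty or missing results. The name pickTranscript is hypothetical and is not part of the voice library.

```javascript
// Hypothetical helper (not part of the voice library): returns the first
// non-empty candidate from a results event, or '' when there is none.
function pickTranscript(event) {
  const candidates = event && Array.isArray(event.value) ? event.value : [];
  for (const candidate of candidates) {
    if (typeof candidate === 'string' && candidate.trim() !== '') {
      return candidate.trim();
    }
  }
  return '';
}
```

Inside the component, the handler could then call setRecognizedText(pickTranscript(e)) so the displayed text never becomes undefined.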

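2. Using a Cloud Service API

If you need higher accuracy or support for more languages, you can use a third-party cloud service such as Google Cloud Speech-to-Text, Microsoft Azure Speech Service, or Baidu Speech Recognition.

Overview

Sign up and obtain an API key for the cloud service.
Record audio with react-native-sound-recorder or a similar library.
Upload the recorded audio to the cloud service and read the recognition result from the API response.

Example: Google Cloud Speech-to-Text

Install a recording library: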
```
npm install react-native-sound-recorder
```
Enable the Google Cloud Speech-to-Text API and upload the audio file.
Parse the returned JSON and extract the recognition result.

Because this approach involves a fair amount of backend logic, it is better suited to developers with some experience.
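
As a rough sketch of the upload and parse steps above, the two functions below build a request for the synchronous v1 speech:recognize REST method and pull the transcript out of its JSON response. The function names are hypothetical, and the encoding and sample rate are assumptions (16-bit linear PCM at 16 kHz); adjust them to match what your recorder actually produces.

```javascript
// Builds the URL and JSON body for Google Cloud Speech-to-Text's
// synchronous v1 `speech:recognize` REST method.
function buildGoogleRecognizeRequest(apiKey, base64Audio, languageCode) {
  return {
    url:
      'https://speech.googleapis.com/v1/speech:recognize?key=' +
      encodeURIComponent(apiKey),
    body: {
      config: {
        encoding: 'LINEAR16', // assumption: 16-bit linear PCM recording
        sampleRateHertz: 16000, // assumption: recorded at 16 kHz
        languageCode: languageCode,
      },
      audio: { content: base64Audio }, // Base64-encoded audio bytes
    },
  };
}

// Joins the top alternative of each result into one transcript string.
function extractGoogleTranscript(responseJson) {
  const results = (responseJson && responseJson.results) || [];
  return results
    .map((r) =>
      r.alternatives && r.alternatives[0] ? r.alternatives[0].transcript : ''
    )
    .join('')
    .trim();
}
```

The returned url and body can be passed to fetch or axios.post; an empty response object yields an empty transcript rather than an exception.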

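3. Notes

Device compatibility: speech recognition support varies across devices, so test on the mainstream devices you target.
Permissions: handle the microphone permission correctly on both Android and iOS.
Language: choose the language code that matches your users (for example, en-US for American English and zh-CN for Simplified Chinese).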
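
One way to apply the language note is to normalize whatever locale string the device reports before passing it to the recognizer. The sketch below is only an illustration: the supported list and the name resolveSpeechLocale are hypothetical, and the two codes are the ones mentioned above.

```javascript
// Hypothetical list of recognition languages the app chooses to offer.
const SUPPORTED_SPEECH_LOCALES = ['zh-CN', 'en-US'];

// Maps a device locale such as "zh_CN" or "en-gb" to a supported code.
function resolveSpeechLocale(deviceLocale, fallback = 'en-US') {
  const [language = '', region = ''] = String(deviceLocale || '')
    .replace('_', '-')
    .split('-');
  const lang = language.toLowerCase();
  const normalized = region ? `${lang}-${region.toUpperCase()}` : lang;
  if (SUPPORTED_SPEECH_LOCALES.includes(normalized)) {
    return normalized;
  }
  // Otherwise use the first supported locale with the same language.
  const sameLanguage = SUPPORTED_SPEECH_LOCALES.find((l) =>
    l.startsWith(lang + '-')
  );
  return sameLanguage || fallback;
}
```

The result can be passed straight to Voice.start in place of the hard-coded 'zh-CN'.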

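Summary

For most use cases, @react-native-community/voice is the recommended choice: it is simple to use and relies on the platform's native speech recognizer. If you need higher accuracy or support for more languages, use a cloud service API.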

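The rest of this post shows how to use Baidu Speech Recognition with React Native. The service supports multiple languages and scenarios; the steps are as follows.

1. Preparation

Register a Baidu AI Cloud account and obtain API keys

Visit Baidu AI Cloud (百度智能云) and register an account.
Create a speech recognition application to get an API Key and a Secret Key.
Optionally download the Baidu speech SDK; this post uses the RESTful API instead.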

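2. Installing Dependencies

Recording and uploading audio requires the following dependencies:

react-native-sound-recorder: records the audio.
react-native-fs: reads the recorded file so it can be Base64-encoded (imported as RNFS in the code below).
axios (or the built-in fetch): sends HTTP requests to the Baidu Speech Recognition API.

npm install react-native-sound-recorder react-native-fs axios

On React Native 0.60 and later the libraries are autolinked. On older versions, link them manually.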

3. Recording Audio

Use react-native-sound-recorder to record an audio file. Baidu Speech Recognition accepts audio in PCM format.

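import SoundRecorder from 'react-native-sound-recorder';

const startRecording = async () => {
  try {
    // The option values are kept from the original example; the values the
    // recorder accepts may differ by platform and library version.
    await SoundRecorder.start(SoundRecorder.PATH_CACHE + '/audio.pcm', {
      format: 'pcm',
      encoder: 'pcm',
      channels: 1, // mono
      sampleRate: 16000, // sample rate expected by Baidu Speech Recognition
    });
    console.log('Recording started');
  } catch (error) {
    console.error('Failed to start recording:', error);
  }
};

const stopRecording = async () => {
  try {
    // Per the library's README, stop() resolves with a result object and
    // the saved file location is in result.path.
    const result = await SoundRecorder.stop();
    console.log('Recording stopped, file path:', result.path);
    return result.path;
  } catch (error) {
    console.error('Failed to stop recording:', error);
  }
};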
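
Because raw PCM has no header or compression, the length of a recording follows directly from its size in bytes. The sketch below works through that arithmetic for the settings used above (16 kHz, mono), assuming 16-bit samples. The function names are hypothetical, and the 60-second cap is an assumption about the short-speech endpoint that should be confirmed against Baidu's current documentation.

```javascript
// Duration of raw PCM audio: bytes / (samples per second * channels * bytes per sample).
function pcmDurationSeconds(byteLength, sampleRate = 16000, channels = 1, bytesPerSample = 2) {
  return byteLength / (sampleRate * channels * bytesPerSample);
}

// Assumption: the short-speech endpoint accepts clips up to 60 seconds.
const MAX_SHORT_SPEECH_SECONDS = 60;

function fitsShortSpeechLimit(byteLength) {
  return pcmDurationSeconds(byteLength) <= MAX_SHORT_SPEECH_SECONDS;
}
```

At 16 kHz, mono, 16-bit, one second of audio is 32,000 bytes, so a 60-second clip is about 1.92 MB before Base64 encoding.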

4. Getting a Baidu Speech Recognition Token

The Baidu Speech Recognition API uses OAuth 2.0 authorization, so you first need to obtain an access_token.

import axios from 'axios';

const getAccessToken = async (apiKey, secretKey) => {
  const url = `https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=${apiKey}&client_secret=${secretKey}`;
  try {
    const response = await axios.get(url);
    return response.data.access_token;
  } catch (error) {
    console.error('Failed to get access_token:', error);
    throw error;
  }
};
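
Requesting a new token before every recognition call adds an extra round trip. As a minimal sketch, assuming the token response also carries an expires_in field in seconds alongside access_token, a small in-memory cache can reuse the token until shortly before it expires. createTokenCache and its methods are hypothetical names, not part of any Baidu SDK.

```javascript
// Hypothetical in-memory cache for an OAuth access token.
function createTokenCache(safetyMarginMs = 60 * 1000) {
  let token = null;
  let expiresAtMs = 0;
  return {
    // Store a token response of the shape { access_token, expires_in } (seconds).
    store(response, nowMs = Date.now()) {
      token = response.access_token;
      expiresAtMs = nowMs + response.expires_in * 1000;
    },
    // Return the cached token, or null if it is missing or about to expire.
    get(nowMs = Date.now()) {
      return token && nowMs < expiresAtMs - safetyMarginMs ? token : null;
    },
  };
}
```

A caller would check cache.get() first and only request a new token, then cache.store(response.data), when it returns null.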

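5. Calling the Baidu Speech Recognition API

Upload the recorded audio to the Baidu Speech Recognition API and parse the result it returns.

Sending the audio file

Baidu Speech Recognition supports two upload methods:

Uploading the audio data directly (suited to short clips).
Streaming over WebSocket (suited to long audio).

The following simple example uses the direct upload method: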

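import RNFS from 'react-native-fs'; // reads the recorded file

const recognizeSpeech = async (filePath, accessToken) => {
  const url = 'https://vop.baidu.com/server_api';

  // Read the audio file as a Base64 string
  const audioData = await RNFS.readFile(filePath, 'base64');

  // len is the size of the raw audio in bytes, not the Base64 string length
  const padding = audioData.endsWith('==') ? 2 : audioData.endsWith('=') ? 1 : 0;
  const byteLength = (audioData.length * 3) / 4 - padding;

  try {
    // With a JSON body, cuid and dev_pid are sent in the body with the audio
    const response = await axios.post(url, {
      format: 'pcm',
      rate: 16000,
      channel: 1,
      cuid: 'YOUR_DEVICE_ID', // unique identifier for the device or user
      dev_pid: 1537, // Mandarin model
      token: accessToken,
      speech: audioData, // Base64-encoded audio data
      len: byteLength,
    });

    if (response.data.err_no === 0) {
      console.log('Recognition result:', response.data.result[0]);
      return response.data.result[0]; // the recognized text
    } else {
      console.error('Recognition failed:', response.data.err_msg);
      throw new Error(response.data.err_msg);
    }
  } catch (error) {
    console.error('Speech recognition request failed:', error);
    throw error;
  }
};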

6. Complete Example

The following complete example records audio and converts it to text through the Baidu Speech Recognition API:

import React, { useState } from 'react';
import { Button, Text, View } from 'react-native';
import SoundRecorder from 'react-native-sound-recorder';
import RNFS from 'react-native-fs';
import axios from 'axios';

const App = () => {
  const [recognizedText, setRecognizedText] = useState('');
  // Placeholders. In production, obtain the token from your own server
  // rather than shipping the Secret Key inside the app.
  const apiKey = 'YOUR_API_KEY'; // replace with your API Key
  const secretKey = 'YOUR_SECRET_KEY'; // replace with your Secret Key

  const startRecording = async () => {
    try {
      await SoundRecorder.start(SoundRecorder.PATH_CACHE + '/audio.pcm', {
        format: 'pcm',
        encoder: 'pcm',
        channels: 1,
        sampleRate: 16000,
      });
      console.log('Recording started');
    } catch (error) {
      console.error('Failed to start recording:', error);
    }
  };

  const stopRecordingAndRecognize = async () => {
    try {
      // Per the library's README, stop() resolves with an object whose
      // path field holds the saved file location.
      const { path: filePath } = await SoundRecorder.stop();
      console.log('Recording stopped, file path:', filePath);

      // Get an access_token
      const accessToken = await getAccessToken(apiKey, secretKey);

      // Call the Baidu Speech Recognition API
      const result = await recognizeSpeech(filePath, accessToken);
      setRecognizedText(result);
    } catch (error) {
      console.error('Speech recognition failed:', error);
    }
  };

  const getAccessToken = async (apiKey, secretKey) => {
    const url = `https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=${apiKey}&client_secret=${secretKey}`;
    const response = await axios.get(url);
    return response.data.access_token;
  };

  const recognizeSpeech = async (filePath, accessToken) => {
    const url = 'https://vop.baidu.com/server_api';
    const audioData = await RNFS.readFile(filePath, 'base64');

    // len is the size of the raw audio in bytes, not the Base64 string length
    const padding = audioData.endsWith('==') ? 2 : audioData.endsWith('=') ? 1 : 0;
    const byteLength = (audioData.length * 3) / 4 - padding;

    const response = await axios.post(url, {
      format: 'pcm',
      rate: 16000,
      channel: 1,
      cuid: 'YOUR_DEVICE_ID', // unique identifier for the device or user
      dev_pid: 1537, // Mandarin model
      token: accessToken,
      speech: audioData,
      len: byteLength,
    });

    if (response.data.err_no === 0) {
      return response.data.result[0];
    } else {
      throw new Error(response.data.err_msg);
    }
  };

  return (
    <View style={{ flex: 1, justifyContent: 'center', alignItems: 'center' }}>
      <Text style={{ fontSize: 20, marginBottom: 20 }}>Recognized text: {recognizedText}</Text>
      <Button title="Start recording" onPress={startRecording} />
      <Button title="Stop and recognize" onPress={stopRecordingAndRecognize} />
    </View>
  );
};

export default App;

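7. Notes

Audio format: Baidu Speech Recognition expects mono, 16 kHz, 16-bit PCM audio.
Network permission: make sure the app is allowed to access the network on both Android and iOS.
- Android: add <uses-permission android:name="android.permission.INTERNET" /> to AndroidManifest.xml.
- iOS: HTTPS requests are allowed by default.
Error handling: handle the error codes returned by the Baidu Speech Recognition API carefully. For example, 3300 indicates invalid parameters and 3301 indicates an audio quality problem.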
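
To make the error-handling note concrete, the sketch below turns a response body into either the recognized text or an Error that carries the numeric code. It only maps the two codes mentioned above; the hint wording and the name parseBaiduResult are illustrative, and the full code table should come from Baidu's documentation.

```javascript
// Hints for the two error codes named in this post; extend from the official table.
const BAIDU_ERROR_HINTS = {
  3300: 'Invalid request parameters; check format, rate, cuid, and len.',
  3301: 'Audio quality problem; ask the user to record again.',
};

// Returns the first recognized string, or throws an Error with a `code` field.
function parseBaiduResult(data) {
  if (data && data.err_no === 0 && Array.isArray(data.result) && data.result.length > 0) {
    return data.result[0];
  }
  const code = data ? data.err_no : undefined;
  const hint = BAIDU_ERROR_HINTS[code] || (data && data.err_msg) || 'Unknown error';
  const error = new Error(`Baidu ASR error ${code}: ${hint}`);
  error.code = code;
  throw error;
}
```

Calling parseBaiduResult(response.data) keeps the error branch in one place, and the UI can switch on error.code to decide whether to prompt for a retry.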

8. Summary

With the steps above you can integrate Baidu Speech Recognition into a React Native app. The approach suits most scenarios, especially those that need high recognition accuracy.

Copyright: Mr 焦

Permalink: https://www.mtsws.cn/post-9.html

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.