How to Build a Speech-to-Text App Like Siri in iOS 10

Speech recognition technology has become widely used in recent years. Enterprises and individual developers alike rely on it to build speech-to-text apps.

One of the most useful benefits of speech recognition technology is dictation: users can create documents simply by speaking. Take ListNote, for example. This application lets its users take speech-to-text notes.

The good news for iOS developers is that Apple introduced an official speech-to-text framework at the WWDC 2016 event. This means you can now develop a speech recognition app with the help of an official API. As a matter of fact, this is the same framework that Siri uses for speech recognition. Before the release of iOS 10, there was no official API for developing speech-to-text apps, and developers had to rely on third-party frameworks to include this feature. That is no longer the case.

In this tutorial, we’ll walk through how to build a Siri-like app that converts speech to text.

Developing The App

Create a new project from the File menu, select “Single View Application”, and click the Next button.

On the next screen, enter a name for your project. Here, we’re naming it “SOSpeechtoText”.

Once you create the project, design the UI according to your requirements.

After designing the UI of your iPhone app, import the Speech framework in ViewController.swift and make the class conform to SFSpeechRecognizerDelegate.

import UIKit
import Speech

class ViewController: UIViewController, SFSpeechRecognizerDelegate {
    // Outlets, properties, and methods from the following steps go here.
}


Add outlets for the UITextView and the UIButton:
@IBOutlet weak var txtSpeech: UITextView!
@IBOutlet weak var btnRecord: UIButton!

Next, create an “SFSpeechRecognizer” object, which performs speech recognition for a specified locale.

An “SFSpeechAudioBufferRecognitionRequest” object recognizes live speech captured through the device microphone.

We’ll also use an “SFSpeechRecognitionTask” object to monitor the progress of speech recognition and cancel it if necessary.

An “AVAudioEngine” object manages the audio nodes that generate audio signals, process them, and perform audio input and output. Here, its input node supplies the microphone audio.
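The four objects described above can be declared as properties of the view controller. The declarations are not shown in this post, so treat the following as a minimal sketch: the property names match the ones used in the code later in this tutorial, while the "en-US" locale is an assumption that you should replace with the locale you need.

```swift
import UIKit
import Speech
import AVFoundation

class ViewController: UIViewController, SFSpeechRecognizerDelegate {
    // Outlets from the previous step are omitted here for brevity.

    // Recognizer for a fixed locale. "en-US" is an assumed value.
    // The initializer is failable, so this property is an optional.
    let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))

    // Request that receives live audio buffers from the microphone.
    var regRequest: SFSpeechAudioBufferRecognitionRequest?

    // Task used to monitor or cancel the recognition in progress.
    var regTask: SFSpeechRecognitionTask?

    // Audio engine whose input node supplies the microphone audio.
    let avEngine = AVAudioEngine()
}
```

Because the recognizer initializer can return nil for an unsupported locale, the later code accesses it with optional chaining (speechRecognizer?).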

Now, add the “NSMicrophoneUsageDescription” and “NSSpeechRecognitionUsageDescription” keys to Info.plist to show a custom message when the authorization popup appears.
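If you edit Info.plist as source code, the two entries could look like the fragment below. The key names are the ones mentioned above; the message strings are placeholders we made up for illustration, so write your own text explaining why your app needs each permission.

```xml
<!-- Placeholder messages: replace them with text that fits your app. -->
<key>NSMicrophoneUsageDescription</key>
<string>The microphone is used to capture your speech.</string>
<key>NSSpeechRecognitionUsageDescription</key>
<string>Speech recognition is used to convert your speech to text.</string>
```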


Set the delegate in viewDidLoad() and request speech recognition authorization with the getPermissionSpeechRecognizer() method.

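override func viewDidLoad() {
    super.viewDidLoad()
    speechRecognizer?.delegate = self
    getPermissionSpeechRecognizer()
}

func getPermissionSpeechRecognizer() {
    SFSpeechRecognizer.requestAuthorization { (status) in
        // The handler may run on a background queue, so update the UI on the main queue.
        OperationQueue.main.addOperation {
            // Assumption: the record button is enabled only when recognition is authorized.
            switch status {
            case .authorized:
                self.btnRecord.isEnabled = true
            case .denied, .notDetermined, .restricted:
                self.btnRecord.isEnabled = false
            }
        }
    }
}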

Create a new function to handle Speech Recognition.

func startRecording() {

    // Cancel the previous task if one is still running
    if regTask != nil {
        regTask?.cancel()
        regTask = nil
    }

    // Configure the AVAudioSession for audio recording
    let avAudioSession = AVAudioSession.sharedInstance()
    do {
        try avAudioSession.setCategory(AVAudioSessionCategoryRecord)
        try avAudioSession.setMode(AVAudioSessionModeMeasurement)
        try avAudioSession.setActive(true, with: .notifyOthersOnDeactivation)
    } catch {
        print("Audio Session is not active")
    }

    // Check the audio input (inputNode is an optional in the iOS 10 SDK)
    guard let inputEngineNode = avEngine.inputNode else {
        fatalError("Audio engine has no input node")
    }

    regRequest = SFSpeechAudioBufferRecognitionRequest()
    guard let recognitionRequest = regRequest else {
        fatalError("SFSpeechAudioBufferRecognitionRequest object is not created")
    }
    // Report intermediate results while the user is still speaking
    recognitionRequest.shouldReportPartialResults = true

    // Start the speech recognition task
    regTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in

        var isComplete = false

        if let result = result {
            self.txtSpeech.text = result.bestTranscription.formattedString
            isComplete = result.isFinal
        }

        // On error or final result, stop the engine and release the request and task
        if error != nil || isComplete {
            self.avEngine.stop()
            inputEngineNode.removeTap(onBus: 0)

            self.regRequest = nil
            self.regTask = nil
        }
    })

    // Set the format of the audio input and feed microphone buffers to the request
    let recordingFormat = inputEngineNode.outputFormat(forBus: 0)
    inputEngineNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
        self.regRequest?.append(buffer)
    }

    avEngine.prepare()
    do {
        try avEngine.start()
    } catch {
        print("Audio engine could not start")
    }
}
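The audio session calls in this function use the API names from the iOS 10 SDK. If you build with Swift 4.2 or later, the constants are namespaced and the labels differ slightly. The sketch below shows the same session setup in the newer spelling; configureRecordingSession is a hypothetical helper name of ours, not part of the demo project.

```swift
import AVFoundation

// Hypothetical helper: configures the shared audio session for recording,
// written with the Swift 4.2+ names of the same AVAudioSession calls.
func configureRecordingSession() throws {
    let session = AVAudioSession.sharedInstance()
    // Replaces setCategory(AVAudioSessionCategoryRecord) and setMode(AVAudioSessionModeMeasurement)
    try session.setCategory(.record, mode: .measurement, options: [])
    // Replaces setActive(true, with: .notifyOthersOnDeactivation)
    try session.setActive(true, options: .notifyOthersOnDeactivation)
}
```

You would call it inside a do/catch block in place of the three try lines. Note also that in newer SDKs the inputNode property of AVAudioEngine is no longer an optional, so there it can be assigned directly with let instead of guard let.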

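Use a trigger action to “Start” and “Stop” speech recognition.

If the audio engine is running, the action stops recognition; otherwise it calls “startRecording()”.
@IBAction func actionRecording(_ sender: AnyObject) {
    if avEngine.isRunning {
        // Stop capturing audio and tell the request that no more audio is coming
        avEngine.stop()
        regRequest?.endAudio()
    } else {
        startRecording()
    }
}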

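Add the SFSpeechRecognizer delegate method.
//MARK:- SFSpeechRecognizer Delegate
func speechRecognizer(_ speechRecognizer: SFSpeechRecognizer, availabilityDidChange available: Bool) {
    // Called when the availability of the given recognizer changes.
    // Assumption: the record button simply follows the recognizer's availability.
    btnRecord.isEnabled = available
}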

And mission accomplished!

Here’s what it will look like:



That’s it. You can use the speech-to-text API to develop a speech recognition app. As a developer, you can take full advantage of these new speech APIs to make iPhone applications that recognize speech and transcribe it into text.

If you face any problem with this demo, you can contact any of our developers to help you resolve your issue. Also, if you have an idea for a talk-to-text (speech recognition) app and are looking for an iPhone app development company, you can contact us.

We are one of the leading mobile app development companies in India, with a team of highly experienced and creative iOS developers and a ‘Get Shit Done’ environment. When you hire us for your iPhone app project, we not only develop the app as per your requirements but also research deeply and suggest trending features that could make your iPhone app more successful.
For your reference, you can access the source code of Speech-to-Text from our GitHub demo.

