Twilio - <Gather> multiple voice replies and transcribing text on outbound call

Question

OBJECTIVE using the Twilio Framework:

place an outbound call
present 3 questions and record 3 voice responses
transcribe text from the call from the 3 voice responses
use the # key to signal answer to each question and to move forward

CURRENTLY WORKING:

Python code calling daisy-chained TwiML hosted via TwiML Bins
3 questions being asked
the call is recorded and can be listened to via Twilio console

PROBLEMS/WHAT IS NOT WORKING:

there is no transcribed text from the call.
there is a noticeable time delay when calling the TwiML via the TwiML Bin.
the # key does not progress to the next question

Any suggestions appreciated:

from twilio.rest import Client
account_sid = 'XXXXXXXXXXXXXXXX'
auth_token = 'XXXXXXXXXXXXXXXX'
client = Client(account_sid, auth_token)

call = client.calls.create(
                        url='http://www.companyname.com/Auditor/MessageName.xml',
                        to="+61437231327",
                        from_='+61437231327',
                        record=True
                    )
print(call.sid)
print(call.status)
#print(call.transcription_text)
print(call.uri)


-------- TWIML hosted website MessageName.xml ---------------
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Say voice="alice">Collecting Name</Say>
    <Gather input="speech" timeout="3" numDigits="1" action="https://handler.twilio.com/twiml/someTwilBinURLID">
        <Say>Please say Name. Press # when complete</Say>
    </Gather>
</Response>


-------- TwimlBin TWIML - 2nd required voice response ---------------
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Gather input="speech" finishOnKey="#" timeout="3" numDigits="1" action="https://handler.twilio.com/twiml/someTwilBinURLIDForNextVoice">
        <Say>Please say how old you are. Press # when complete</Say>
    </Gather>    
</Response>
-------- TwimlBin TWIML -3rd required voice response ---------------
<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Gather input="speech" finishOnKey="#" timeout="3" numDigits="1">
        <Say>Please say what your location is. Press # when complete</Say>
    </Gather>    
</Response>

philnash · Accepted Answer · 2018-10-25T23:57:32

Twilio developer evangelist here.

Nice progress here, but you are going to have to change a few bits around to reach your goal.

First up, finishOnKey and numDigits are only appropriate attributes when <Gather> is being used for DTMF input. For speech input, Twilio listens to the user speaking and submits the result once they have been silent for the timeout (or speechTimeout) length of time. So, for speech input, you can't have the user press # when complete; the conversation should just flow through timeouts instead.
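
As a rough illustration of the point above, here is a minimal Python sketch that builds a speech-only <Gather> with the DTMF-only attributes left out. It uses only the standard library; the function name speech_gather_twiml and the example.com action URL are made up for this example and are not part of any Twilio library.

```python
import xml.etree.ElementTree as ET


def speech_gather_twiml(prompt, action_url, timeout=3):
    """Build TwiML for a speech-only <Gather>.

    finishOnKey and numDigits are deliberately left out because,
    as described above, they only apply to DTMF input.
    """
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {"input": "speech", "timeout": str(timeout), "action": action_url},
    )
    say = ET.SubElement(gather, "Say")
    say.text = prompt
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(response, encoding="unicode")


# Hypothetical action URL, for illustration only.
print(speech_gather_twiml("Please say your name.", "https://example.com/answer"))
```

The prompt no longer tells the caller to press #, since the silence timeout is what ends each answer.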

Next, the transcriptions are sent as part of the request Twilio makes to the URL in the <Gather>'s action attribute. To capture that transcribed text you need to set the action to a server you control so that you can read the text.
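
To make that concrete, here is a small Python sketch of reading the transcription out of the form-encoded body of the action request. SpeechResult and Confidence are the parameters Twilio sends for speech input; the helper name extract_speech and the sample body are invented for illustration, and a real request carries more parameters than shown.

```python
from urllib.parse import parse_qs


def extract_speech(form_body):
    """Return (text, confidence) from a form-encoded <Gather> action request."""
    params = parse_qs(form_body)
    # parse_qs maps each key to a list of values; take the first one
    text = params.get("SpeechResult", [""])[0]
    confidence = float(params.get("Confidence", ["0"])[0])
    return text, confidence


# Made-up request body, for illustration only.
body = "CallSid=CAxxxx&SpeechResult=Jane+Smith&Confidence=0.92"
print(extract_speech(body))
```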

You could achieve this with a Twilio Function if you didn't want to stand up your own application. The following code would read and log the result and then return the next TwiML in the call:

exports.handler = function(context, event, callback) {
  console.log(event.SpeechResult); // SpeechResult is the transcribed text 
  const twiml = new Twilio.twiml.VoiceResponse();
  twiml.gather({ input: 'speech', timeout: 3 }).say('The next question');
  callback(null, twiml);
};

You would likely want to save the transcribed text to a database of your own as part of this.

I note you're writing the application in Python. You could do this on a Python server of your own as well.
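
For what it's worth, a Python version might look something like the sketch below. It is not a definitive implementation: it uses the standard library's http.server instead of a web framework to stay dependency-free, and the /answer path, the step query parameter, the in-memory ANSWERS dict and port 8000 are all assumptions made for this example. The questions are the three from the TwiML in the question, without the "press #" instruction.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse
from xml.sax.saxutils import escape

QUESTIONS = [
    "Please say Name.",
    "Please say how old you are.",
    "Please say what your location is.",
]

# In-memory store keyed by CallSid; a real app would use a database.
ANSWERS = {}


def twiml_for_step(step):
    """Return TwiML asking question number `step`, or ending the call when done."""
    if step >= len(QUESTIONS):
        return "<Response><Say>Thank you.</Say><Hangup/></Response>"
    return (
        "<Response>"
        f'<Gather input="speech" timeout="3" action="/answer?step={step + 1}">'
        f"<Say>{escape(QUESTIONS[step])}</Say>"
        "</Gather>"
        "</Response>"
    )


def handle_request(path, form_body):
    """Store the answer to the previous question, then return the next TwiML."""
    query = parse_qs(urlparse(path).query)
    step = int(query.get("step", ["0"])[0])
    form = parse_qs(form_body)
    speech = form.get("SpeechResult", [None])[0]
    if step > 0 and speech is not None:
        call_sid = form.get("CallSid", ["unknown"])[0]
        ANSWERS.setdefault(call_sid, {})[step - 1] = speech
    return twiml_for_step(step)


class AnswerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form_body = self.rfile.read(length).decode("utf-8")
        body = handle_request(self.path, form_body).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("", port), AnswerHandler).serve_forever()
```

Calling serve() starts the server. You would then point the url argument of client.calls.create at the public address of /answer, so that the first request returns question one and each later request both stores the previous SpeechResult and returns the next question.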

Finally, I'm not sure what would be causing a delay with your TwiML Bins. It might be worth playing about with the timeout value to optimise that.

Let me know if that helps at all.
