Full Version: SkypeIn DTMF recognition
spud5
Also requested here.

CallDtmfReceived doesn't work for SkypeIn calls.

I need to detect SkypeIn DTMF events using VB.Net
Does anyone have a workaround for this scenario?
europolitan
Well, I can't help you, since getting the tones through on SkypeIn calls seems to work for me, but I have a similar problem: receiving the DTMF tones from the call recipient when I make a SkypeOut call. Sometimes it works, sometimes it doesn't. Any experience with that?

BTW, I use a Skype adapter (RJ11 to USB) for making outgoing calls. Any ideas? Or maybe this would be easier for you as well: using the adapter with a voice modem or Dialogic card.

/Henrik
spud5
QUOTE(europolitan @ Thu May 17 2007, 19:22) [snapback]396138[/snapback]

Well, I can't help you as it seems to work getting the tones through on skype-in calls..

When I call my SkypeIn number I can hear the tones. The problem is CallDtmfReceived in Skype4COM - it doesn't detect DTMF as an event.



QUOTE(europolitan @ Thu May 17 2007, 19:22) [snapback]396138[/snapback]

..but I have a similar problem: receive the DTMF tones from the call recipient when I make a skype-out call. Sometimes it works, sometimes it doesn't. Any experience on that?
..

Remote DTMF is supposed to have been working since Skype version 3.2.0.82. Again, this isn't detected by Skype4COM, but you should be able to hear the tones. Obviously the call recipient must use a DTMF phone, otherwise you will hear nothing. There is some more information here - https://developer.skype.com/jira/browse/SPA-91
TheUberOverlord
Unless I am mistaken, there is no notification of DTMF events on inbound PSTN calls ("from SkypeIn"). The only thing that was fixed is that DTMF tones are now sent inband on SkypeIn calls. Currently, you would need to create your own DTMF tone recogniser to process these DTMF tones.
spud5
QUOTE(TheUberOverlord @ Mon May 21 2007, 18:46) [snapback]397611[/snapback]

..you would need to create your own DTMF tone recogniser to process these DTMF tones.

That's what I need to do, but I don't know how to do it.
TheUberOverlord
Well, this gets complicated, because it will cause some overhead. Here is a link to one working example, with source code:

http://www.g6lvb.com/Articles/SSETI%20Expr...MF%20Telemetry/

Now, you would need to process audio via the API/Skype4COM interface, most likely using the port option, which delivers raw PCM data. That data would then need to be processed using some kludge of the code at the above link.

1. Rumor is that the port API option does not work. I am not sure how true this is.

2. The version above uses the audio out ("Speakers"), which is not really a good way to do this, but if it were modified to handle the port audio logic, it could work well. However, 16-bit audio at 16,000 samples per second is about 32 KB a second, which is why these DTMF decoders normally do NOT task the PC CPU. They are contained in firmware and use their own processor, as in a modem or some other external/internal device which contains another CPU for DTMF decoding.
spud5
Thanks for the info. It echoes what others have told me: SkypeIn DTMF has to be decoded using FFT, and I need to analyze the audio stream in real time to achieve this. It's not looking too promising for an amateur programmer like myself.

Would I be right in guessing that Skype-to-Skype DTMF events are not based on decoded DTMF tones, but are more likely simple data transmissions (e.g. one client sends a "keypad pressed" event, the other client receives a "keypad pressed" event)?
TheUberOverlord
Yes, there is no internal DTMF decoding being done in the client for inband DTMF on PSTN calls, and doing so automatically ("having FFT decoding running at all times in the client") would be a lot of overhead.

I am thinking of creating an example of one for the DevZone which would include source code. It sounds like there is more and more of a need for this.
BobS
Has a sample been created using Skype4Com? It sure would be nice.
TheUberOverlord
Working on it :-)
sequoyan
I'm curious too. Any idea when this might be done? I'm willing to help out if there's something I could do...
TheUberOverlord
Well, I have decided NOT to publicly publish source code at the moment showing the methods for inband DTMF processing on inbound/outbound ("SkypeIn/SkypeOut") PSTN calls on Skype.

However, I will explain how it can be done, and show you a working example done with MyToGo so you can see that it can in fact be done. MyToGo was done using Skype4COM; however, the same could be done using the raw Skype API interface as well.

1. First you will need to process Skype audio using the Skype API port interface.

You cannot use the Skype API file interface for audio to process inband DTMF in real time, because the Skype API opens the .wav file exclusively while it is writing audio data. The file remains opened by the Skype API until it is closed, so its content cannot currently be processed in real time.

2. You will need to decode the DTMF digits using your application.

The Skype API does not provide any inband DTMF decoding. (This is actually good: the CPU overhead is high and should NOT be incurred when not needed, which is why inband DTMF decoding is normally done using a separate CPU or processor, such as a modem.) This means you will need to do the decoding asynchronously, while receiving Skype audio data via a local port and while your program does its other work at the same time.

Skype uses codecs with a very high compression ratio for PSTN calls. The vast majority of Skype PSTN calls, both SkypeIn and SkypeOut, currently use codecs such as G.729, and these calls currently support only inband DTMF ("no Skype API interface for out-of-band DTMF is currently present for any PSTN type calls").

With inband DTMF and the G.729 codec currently used for PSTN calls on Skype, it is virtually impossible to process multiple sequential DTMF tones of the same digit at anything close to a 95 percent accuracy ratio. Nobody wants DTMF tone recognition to fail more than 5 percent of the time, calling wrong numbers, and at a cost.

This problem arises when packets are dropped, or when slices of audio time are removed for better codec compression to save bytes being sent from point A to point B. Those slices include the pauses between DTMF digits that could have been used to determine whether "33" is really two threes or a single 3 held down for a long time on the phone's keypad. So there is no failsafe method to determine where one DTMF tone stops and the next starts when both are the same digit: any pause that was between the digits prior to compression could have been removed by the G.729 codec, and only inband DTMF is available.

In fact, it is generally accepted that inband DTMF processing should NOT be relied on with the G.729 codec, because of this issue and because the correct amplitude of a DTMF signal cannot be obtained in all cases, depending on noise and other conditions. One thing I did with MyToGo was to use the * key to express a repeated digit, which avoids these processing problems with inband DTMF tones and the G.729 codec.

Excessive background noise is another source of trouble when processing inband DTMF tones through the G.729 codec. There is a standard for the DTMF tone amplitude over noise that must be met to determine that a DTMF tone is present. If there is high background noise and the phone's mouthpiece is not covered, compression may remove some of the data that was present prior to compression, and inband DTMF processing can be compromised. So even recognition of different DTMF digits can require lower noise levels with G.729 compression to reach high accuracy ratios.

Is This My DTMF Tone or Yours? or "Look My Garage Door Opener Opens my Neighbors Garage Door Too!"

3. Imagine if multiple applications for Skype are processing the same DTMF input

How does an application determine whether the tones are meant for it or not? Can you predict the outcome when two or more applications for Skype process the same DTMF digit?

What would happen if you conference called together two automated PBX systems that are both waiting for DTMF digits from your phone? Do you really want that mess?

When you call a PBX or an IVR system, you don't have this issue, because there is only one system ready to process your DTMF input. With Skype, because of the API interface, you can now have MULTIPLE applications all waiting for DTMF input, and different digits could mean different things. This is the case for MyToGo, where a # means either the end of a Skype Speed-Dial number, in which case MyToGo calls the Skype contact or PSTN number assigned to that Speed-Dial number, or that the digits prior to the # are a telephone number to call. Now what happens to the application where the # means to do something else?

A giant mess can be created if a Skype user is running two applications processing the same DTMF tones. We never see this in other cases because NOBODY has the rich API interface that Skype has.

PBX or IVR systems never need to worry about "What if there is another application listening for DTMF digits?". They simply say "If you want to do this or that, enter this or that now." This is NOT the case when designing applications using the Skype API.

So, please be careful if you assume that ONLY your application is going to be using DTMF, and plan for proper error logic for when your application ends up alongside other applications that may or may not be processing the same DTMF tones you are. Could another application have processed a DTMF tone that, for example, ended a call which you think is still in progress? Do you have error logic that says "if call in progress then...", or just "then..."? ;)
Vern Baker
This post answered so many of my questions.

The environment I am thinking of bringing to Skype could be one sided as far as DTMF goes. I see Xtend has developed something for it.

When we were looking at these a couple of years back, it appeared that speech recognition would be the only way to go. We had even thought of including a recognizer to "listen" for touch tones.

So, the question I have is:

-- Have there been any developments around the DTMF detection?
Ashish.za
Hi TheUberOverlord,

I was writing code to decode inband DTMF when I found that the DTMF pressed by the called party is not always passed to Skype. I placed a call to my cell phone and then pressed some DTMF keys; Skype receives the DTMF tones most of the time, but not always. After some investigation I found that DTMF is not passed to Skype when the call arrives at my cell phone with ANI +10000123456. Do you have any idea what could be going wrong here?
Also, as you mentioned, real-time decoding is impossible, so I am saving the stream to a file and then decoding it.
Your help will be highly appreciated.

Thanks,
Ashish
TheUberOverlord
Duplicate DTMF digits are very complicated to process because the G.729 codec used for Skype PSTN calls uses compression and, in many cases, actually removes silence as well.

The only real way to detect that another tone of the same digit has arrived is to sense when the last one ended. If that silence is removed, it becomes very complicated indeed to determine whether what seems like "55" is a single DTMF digit "5" or two DTMF digits "55".

This is one reason why I use the "*" key in MyToGo for Skype: to avoid this problem.

DTMF digits "55" would need to be entered as "5*" and "555" would need to be entered as "5*5", so there can never be repeating DTMF digits with the DTMF sensing application I created for Skype, called "MyToGo for Skype".
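
The "*" convention above can be sketched in a few lines of C. This is only an illustration of the rule as described in the post ("5*" means "55", "5*5" means "555"), not MyToGo's code; the function name expand_repeats and its error convention are hypothetical.

```c
#include <stddef.h>

/* Expand a '*'-encoded digit string into the digits the caller meant.
 * Each '*' stands for "the previous digit again".
 * Returns 0 on success, -1 if the input starts with '*' or the
 * output buffer is too small (including room for the terminator). */
int expand_repeats(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return -1;

    for (; *in != '\0'; ++in) {
        char c = *in;
        if (c == '*') {
            if (n == 0)
                return -1;          /* nothing to repeat yet */
            c = out[n - 1];         /* repeat the previous digit */
        }
        if (n + 1 >= out_size)
            return -1;              /* no room for c plus '\0' */
        out[n++] = c;
    }
    out[n] = '\0';
    return 0;
}
```

With this rule the received tone sequence never contains the same key twice in a row, so the decoder does not have to guess whether a long "5" was one press or two.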

More here: Click here for more information
RamanaNagineni
Hi there,

The links related to "http://testing.onlytherightanswers.com" are not working.
TheUberOverlord
Yes, I am in the process of moving the site to a new hosting company. What do you need?
mythosmint1
Is there another place where I can download your software "MyToGo for Skype Extra" ? Thanks
TheUberOverlord
Here:

http://forum.skype.com/index.php?showtopic...st&p=911331

Let me know if you need any help.
NakorTheIsalani
QUOTE (TheUberOverlord @ Fri May 25 2007, 21:00)
Yes, there is no internal DTMF decoding being done in the client for inband DTMF on PSTN calls, and doing so automatically ("having FFT decoding running at all times in the client") would be a lot of overhead.


FWIW, I have some *NON COMMERCIAL* (read: hacked-up for my own amateur uses and amusement) code which recognises in-band DTMF using the Goertzel algorithm (Wikipedia has a good page), which is far less CPU intensive than doing an FFT. It's part of a Windows app I use which sends the audio to a port for DTMF recognition, so it uses WinSock, but the code is generic. I have run 5 simultaneous parallel instances of it on an old P4, and the CPU sits at something like 40%. It's been hacked up from bits and pieces, so it's not pretty ;) but it might serve as a starting point for somebody...

What Uber says about recognising repeated digits, etc., is true. My own app listens for silence between digits and then requires a number of repeated "hits" on a given digit before detection, but this crude filtering method is just based on my own trial and error. Having said that, it works very well in my application (Scottish PSTN must be good ;)), but as Uber says, if you are using this commercially and charging people, you will want to make sure the reliability is sound.
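
The "silence, then repeated hits" idea can be sketched as a small state machine fed with one detection result per audio block. This is an illustration only, not the filter from NTI's app. The type dtmf_filter, the function names, and the two thresholds are hypothetical, and the threshold values are placeholders to be tuned by trial and error.

```c
#define HITS_REQUIRED    3   /* assumption: blocks in a row needed to accept a key */
#define SILENCE_REQUIRED 2   /* assumption: quiet blocks needed before the next key */

typedef struct {
    char candidate;   /* key currently being counted, 0 if none */
    int  hits;        /* consecutive blocks reporting candidate */
    int  silence;     /* consecutive blocks reporting no tone */
    int  armed;       /* nonzero once enough silence has been seen */
} dtmf_filter;

void dtmf_filter_init(dtmf_filter *f)
{
    f->candidate = 0;
    f->hits = 0;
    f->silence = 0;
    f->armed = 1;     /* start armed so the first key of a call counts */
}

/* Feed one per-block result (a key, or 0 for "no tone").
 * Returns the key exactly once per accepted press, otherwise 0. */
char dtmf_filter_push(dtmf_filter *f, char block_result)
{
    if (block_result == 0) {
        f->candidate = 0;
        f->hits = 0;
        if (f->silence < SILENCE_REQUIRED)
            f->silence++;
        if (f->silence >= SILENCE_REQUIRED)
            f->armed = 1;
        return 0;
    }

    f->silence = 0;
    if (block_result != f->candidate) {
        f->candidate = block_result;   /* a different key restarts the count */
        f->hits = 0;
    }
    if (f->hits < HITS_REQUIRED)
        f->hits++;

    if (f->armed && f->hits >= HITS_REQUIRED) {
        f->armed = 0;                  /* wait for silence before the next key */
        return block_result;
    }
    return 0;
}
```

Note the weakness discussed earlier in the thread: if the codec removes the pause between two presses, this filter never re-arms and the second press is lost.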

Anyway - take a look if you like. I'm not a proud programmer, so I make no apologies for the hackery (I override some 'quality' checks on the received DTMF among my other crimes). I've also not bothered cleaning it up for public consumption ;) Also, if you look at the Wikipedia page on Goertzel you'll see where I started from :), so credit for that doesn't go to me.

Btw, I use this code in the following way in a thread which receives the raw PCM samples sent to the socket:

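<codebox>
/* recvbuf holds the raw 16-bit PCM bytes just read from the socket.
   iResult (presumably the recv() return value) is assumed to equal
   bytes_received_over_socket. */
bytestoprocess = bytes_received_over_socket;

calc_coeffs(); /* set up the Goertzel coefficients before the loop */

/* Detect the DTMF tones in the voice stream buffer, one 16-bit sample at a time.
   Stop when fewer than 2 bytes remain so an odd byte count cannot underflow. */
for (; bytestoprocess >= 2; bytestoprocess -= 2)
{
    sample = (char *) (&recvbuf[iResult - bytestoprocess]);

    /* Offset the signed 16-bit sample to unsigned, then keep the top 8 bits (0..255). */
    DetectedCode = goertzel( (short)((unsigned short)((*((short *)sample))+32768)>>8)); // Told you it was ugly ;)

    if (DetectedCode != DTMF_CODE_NONE)
    {
        // DTMF Code Detected!
        DTMF_Detect(*CallID, DetectedCode); // What you do here is up to you...
    }
}
</codebox>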

Hope it helps.

NTI

Ugh. Sorry about the formatting. Not used to this funky forum :)