Well, I have decided NOT to publish source code at the moment showing the methods for inband DTMF processing on inbound/outbound ("SkypeIn/SkypeOut") PSTN calls on Skype.
However, I will explain how it can be done and show you a working example built with MyToGo, so you can see that it can in fact be done.
MyToGo was built using Skype4COM; however, the same could be done using the raw Skype API interface as well.
1. First, you will need to process Skype audio using the Skype API port interface. You cannot use the Skype API file interface for real-time inband DTMF processing, because the Skype API opens the .wav file exclusively while it is writing audio data. The file stays locked until the Skype API closes it, so its contents currently cannot be read in real time.
2. You will need to decode the DTMF digits in your application, since the Skype API does not provide any inband DTMF decoding. (Which is good, actually: the CPU overhead is high and should NOT be incurred when not needed. This is why inband DTMF decoding is normally done by a separate CPU or processor, such as a modem.) This means you will need to decode asynchronously while receiving Skype audio data via a local port, with your program doing its other work at the same time.
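Steps 1 and 2 can be sketched roughly as follows. This is not the MyToGo source, just a minimal illustration using the Goertzel algorithm, a common way to test a block of samples for the eight DTMF frequencies. It assumes that Skype connects to a local TCP port your application is listening on and sends raw 16 kHz, 16-bit, mono, little-endian PCM. The port number, the 40 ms frame size, and the two thresholds are arbitrary illustrative choices, so check them against your own setup.

```python
import asyncio
import math
import struct

SAMPLE_RATE = 16000   # assumption: 16 kHz, 16-bit mono PCM arrives on the port
FRAME_SAMPLES = 640   # 40 ms analysis window (illustrative choice)
LOW = (697, 770, 852, 941)        # DTMF row frequencies in Hz
HIGH = (1209, 1336, 1477, 1633)   # DTMF column frequencies in Hz
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, rate=SAMPLE_RATE):
    """Return the squared magnitude of `samples` at `freq` (Goertzel)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def decode_frame(pcm, min_ratio=0.5, min_mean_square=1e4):
    """Return the DTMF key found in one PCM frame, or None."""
    n = len(pcm) // 2
    samples = struct.unpack("<%dh" % n, pcm[:2 * n])
    energy = sum(x * x for x in samples)
    if energy < min_mean_square * max(n, 1):
        return None                       # too quiet to be a key press
    low = [goertzel_power(samples, f) for f in LOW]
    high = [goertzel_power(samples, f) for f in HIGH]
    row, col = low.index(max(low)), high.index(max(high))
    # For a clean dual tone this ratio is close to 1; noise pushes it down.
    ratio = 2.0 * (low[row] + high[col]) / (n * energy)
    return KEYS[row][col] if ratio >= min_ratio else None


async def handle_audio(reader, writer, on_key=print):
    """Decode frames from one audio connection until the stream closes."""
    try:
        while True:
            pcm = await reader.readexactly(FRAME_SAMPLES * 2)
            key = decode_frame(pcm)
            if key is not None:
                on_key(key)               # called once per frame, not per press
    except asyncio.IncompleteReadError:
        pass                              # the sender closed the connection
    finally:
        writer.close()


async def serve(port=3754):               # hypothetical port number
    """Listen on a local port while the rest of the program keeps running."""
    server = await asyncio.start_server(handle_audio, "127.0.0.1", port)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(serve())` starts the listener. Because the decoder reports a key for every frame in which the tone is present, a single key press produces several identical detections in a row, and these still have to be grouped into key presses.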
Skype uses high-compression codecs for SkypeIn/SkypeOut (inbound/outbound) PSTN calls. The vast majority of these calls, both SkypeIn and SkypeOut, currently use codecs such as g729, and they currently support only inband DTMF (no Skype API interface for out-of-band DTMF is currently present for any PSTN-type call).
It is currently virtually impossible, using inband DTMF and the g729 codec, to process repeated DTMF tones of the same digit in sequence on Skype PSTN calls at even close to 95 percent accuracy. Nobody wants DTMF tone recognition to fail more than 5 percent of the time, calling wrong numbers, and at a cost.
The problem arises when packets are dropped, or when slices of audio are removed for better codec compression to save bytes being sent from point A to point B. Those slices can include the pauses between DTMF digits that could have been used to determine whether "33" is really two threes or one 3 that was held down for a long time on the phone's keypad. There is therefore no failsafe way to determine when one DTMF tone stops and another starts when both are the same digit: any pause that existed between the digits before g729 compression could have been removed, and only inband DTMF is available.
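A small sketch makes the ambiguity concrete. It assumes a decoder that reports one detection per audio frame (a key, or None for no tone) and groups those detections into key presses by looking for silent frames between them. The function name and the `min_gap` parameter are made up for illustration.

```python
def collapse(frames, min_gap=1):
    """Group per-frame detections (a key or None) into a string of key presses.

    The same key counts as a new press only if at least `min_gap`
    silent frames separate it from the previous detection.
    """
    digits = []
    last = None      # key seen in the most recent tone burst
    gap = min_gap    # silent frames since that burst
    for key in frames:
        if key is None:
            gap += 1
            continue
        if key != last or gap >= min_gap:
            digits.append(key)
        last, gap = key, 0
    return "".join(digits)


# Two presses of 3 with the pause intact:
print(collapse(["3", "3", None, None, "3", "3"]))   # prints 33
# The same two presses after the pause has been removed:
print(collapse(["3", "3", "3", "3"]))               # prints 3
```

Once the pause is gone, the second input is indistinguishable from one long press, and no choice of `min_gap` can recover the missing digit.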
In fact, it is generally accepted that inband DTMF processing should NOT be relied on with the g729 codec, both because of this issue and because the correct amplitude of a DTMF signal cannot be obtained in all cases, depending on noise and other conditions. One thing I did with MyToGo was to use the * key to express a repeated digit, which avoids these processing problems with inband DTMF tones and the g729 codec.
Excessive background noise is another source of trouble with the g729 codec and inband DTMF tones. There is a standard for how far the DTMF tone amplitude must rise above the noise before a tone is considered present. If there is high background noise and the phone's mouthpiece is not covered, and g729 compression removes some of the data that was present before compression, inband DTMF processing can be compromised. So even recognizing different DTMF digits can require lower noise levels with g729 compression to reach higher accuracy.
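The * key convention can be sketched like this. The exact scheme MyToGo uses is not spelled out here, so the sketch assumes the simplest reading: a * means "repeat the previous digit". The function name is hypothetical.

```python
def expand_repeats(keys):
    """Expand '*' into a copy of the previous digit (assumed convention)."""
    out = []
    for key in keys:
        if key == "*":
            if out:                 # a leading '*' has nothing to repeat
                out.append(out[-1])
        else:
            out.append(key)
    return "".join(out)


print(expand_repeats("3*"))     # prints 33
print(expand_repeats("3*3"))    # prints 333
```

Under this assumed convention the caller never keys the same tone twice in a row: three threes are entered as 3*3 rather than 3**, since two adjacent * tones would run into the same problem.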
Is This My DTMF Tone or Yours? or "Look, My Garage Door Opener Opens My Neighbor's Garage Door Too!"
3. Imagine that multiple applications for Skype are processing the same DTMF input. How does each one determine whether the tones are meant for it or not? Can you predict what will happen when two or more applications for Skype process the same DTMF digit?
What would happen if you conferenced together two automated PBX systems that are both waiting for DTMF digits from your phone? Do you really want that mess?
When you call a PBX or an IVR system, you don't have this issue, because there is only one system ready to process your DTMF input. With Skype, because of the API interface, you can now have MULTIPLE applications all waiting for DTMF input, and different digits could mean different things. Take MyToGo, where a # means either the end of a Skype Speed-Dial number (so call the Skype contact or PSTN number assigned to that Speed-Dial number) or that the digits before the # are a telephone number (so call that telephone number). Now what happens in an application where the # means to do something else?
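The # handling described above might look something like the following sketch. How MyToGo tells a Speed-Dial number from a telephone number is not stated, so this assumes a simple rule: if the buffered digits match a known Speed-Dial code, call its target, otherwise dial the digits themselves. The class name, the table contents, and the phone number are all made up.

```python
class DigitCollector:
    """Buffers DTMF keys until '#', then decides what to call."""

    def __init__(self, speed_dial):
        self.speed_dial = speed_dial   # e.g. {"12": "echo123"} (hypothetical)
        self.buffer = []

    def feed(self, key):
        """Add one key; return a call target on '#', otherwise None."""
        if key != "#":
            self.buffer.append(key)
            return None
        digits, self.buffer = "".join(self.buffer), []
        if not digits:
            return None                # a bare '#': nothing to dial
        # Assumed rule: a known Speed-Dial code wins, else dial the digits.
        return self.speed_dial.get(digits, digits)


collector = DigitCollector({"12": "echo123"})
for key in "12#":
    target = collector.feed(key)
print(target)   # prints echo123
```

A second application listening to the same call sees exactly the same 1, 2, # sequence and is free to read the # in a completely different way.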
A giant mess can be created when or if a Skype user is running two applications processing the same DTMF tones.
We never see this in other cases because NOBODY has the rich API interface that Skype has. PBX or IVR systems never need to worry about "What if there is another application listening for DTMF digits?"; they simply say "If you want to do this or that, enter this or that now." This is NOT the case when designing applications using the Skype API.
So please be careful if you think that ONLY your application is going to be using DTMF, and plan proper error logic for if and when your application ends up alongside other applications that may or may not be processing the same DTMF tones you are. Could another application have processed a DTMF tone that, for example, ended a call which you think is still in progress? Do you have error logic that says "if call in progress then...", or just "then..."?
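As a minimal sketch of the "if call in progress then..." check: before acting on a DTMF key, ask for the call's current status instead of trusting what your application last saw. The function and the two callables are hypothetical stand-ins for whatever your Skype API wrapper provides. The status string "INPROGRESS" follows the Skype API call status names, but verify it against the interface you are using.

```python
def handle_end_call_key(call_id, get_status, hang_up):
    """Hang up only if the call is still in progress; report what happened.

    `get_status` and `hang_up` are hypothetical callables supplied by the
    caller, e.g. thin wrappers around Skype4COM or the raw Skype API.
    """
    status = get_status(call_id)       # re-query; do not use a cached value
    if status != "INPROGRESS":
        # Another application (or the far end) may already have ended it.
        return False
    hang_up(call_id)
    return True


ended = []
print(handle_end_call_key(7, lambda cid: "INPROGRESS", ended.append))  # prints True
print(handle_end_call_key(7, lambda cid: "FINISHED", ended.append))    # prints False
```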