Hi All,
I have had good success getting Skype to pull audio from a WAV file by redirecting the INPUT to that file.
I have now decided to have Skype connect to my app using the PORT approach instead. To this end, I have directed the INPUT and OUTPUT to two different ports.
To my surprise, in both cases Skype acts as a client and tries to connect to the port (so you need to listen as a server). This seems contrary to some of the posts I've read, unless MIC works the opposite way (I've not tried it, nor do I need to capture that). This gets quite tricky with conference calls, since you redirect one of the 'calls' to a port, and if that caller disconnects, it is hard to know whether to redirect someone else's call or not. Ideally the redirects would be done per 'conversation' rather than per call, but I'm guessing conference calling was a later addition.
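For anyone trying the same setup, here is a minimal sketch of the listening side in Python. It assumes Skype connects to the loopback address and streams raw PCM until it closes the connection. The names `open_listener` and `read_all` are made up for illustration and are not part of any Skype API.

```python
import socket

def open_listener(port, host="127.0.0.1"):
    """Open a TCP server socket for Skype to connect to (host is an assumption)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts without waiting for the old socket to time out.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv

def read_all(conn, chunk_size=4096):
    """Read raw bytes from a connected socket until the peer closes it."""
    chunks = []
    while True:
        data = conn.recv(chunk_size)
        if not data:  # empty read means the peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)
```

Typical use would be `srv = open_listener(port)`, then `conn, _ = srv.accept()` once the call's OUTPUT has been redirected to that port, then reading from `conn`.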
I can read the audio data fine, and in fact my app does a good job at silence detection and is able to isolate sentences very well. For the moment, I'm just dumping them into separate WAV files (with an autogenerated header), but the eventual aim is to pass them through the Julius speech recognition system (actually I've already done this, and it does an OK job). Sphinx isn't bad either.
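As an illustration of the "autogenerated header" step, Python's standard `wave` module can write the header for a block of raw PCM. This is only a sketch: the 16 kHz, 16-bit, mono defaults are an assumption about the port format, and `pcm_to_wav` is a hypothetical helper name.

```python
import wave

def pcm_to_wav(pcm, dest, sample_rate=16000, channels=1, sample_width=2):
    """Write raw PCM bytes to dest (a path or a seekable binary file object).

    The format defaults are assumptions; adjust them to the actual stream.
    """
    with wave.open(dest, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)   # bytes per sample
        w.setframerate(sample_rate)
        w.writeframes(pcm)             # header sizes are filled in on close
```

Each isolated sentence could then be saved with something like `pcm_to_wav(sentence_bytes, "sentence_001.wav")`.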
My issue is playing sound back to the caller. Skype correctly connects to my listener, and I can correctly pass the raw data to it (making sure to skip the first 44 or 46 bytes of WAV header if sending a WAV file). However, there seems to be no limit to how fast I can write to the socket.
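On skipping the header: the size varies because the `fmt` chunk can be 16 or 18 bytes long, and some files carry extra chunks before the audio data. Rather than hard-coding an offset, one option is to let Python's `wave` module locate the samples. The sketch below does that and also checks the format. The expected 16 kHz, 16-bit, mono values are an assumption, and `wav_payload` is a hypothetical name.

```python
import wave

def wav_payload(source, expect_rate=16000, expect_channels=1, expect_width=2):
    """Return only the raw PCM samples from a WAV file (path or file object).

    Raises ValueError if the file's format differs from the expected one,
    since sending mismatched audio to the socket would play back wrongly.
    """
    with wave.open(source, "rb") as r:
        actual = (r.getframerate(), r.getnchannels(), r.getsampwidth())
        expected = (expect_rate, expect_channels, expect_width)
        if actual != expected:
            raise ValueError(f"unexpected WAV format {actual}, wanted {expected}")
        return r.readframes(r.getnframes())
```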
I am performing a 'select' on the socket to tell me when it's available for writing, and it comes back ready-to-write nearly every time. This means I can write 30s of audio in just a few seconds.
What happens, however, is that after about 10s or so, the audio gets very mushy and garbled, as if the Skype audio buffer is being overrun.
Any thoughts? I'm trying to think how I might slow down my transmission to Skype without it skipping. Adding slight delays seems to make it last longer, but only a bit. As a fallback, I may go back to just redirecting INPUT to WAV files.
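One approach that might help, if the receiving buffer really is being overrun, is to pace the writes against the wall clock instead of relying on `select` or fixed sleeps. The idea is to never send audio more than a small lead ahead of when it should be played. The sketch below is untested against Skype. It assumes 16 kHz, 16-bit mono PCM, and `send_paced`, `CHUNK_MS` and `lead_ms` are names and values chosen purely for illustration.

```python
import time

SAMPLE_RATE = 16000    # assumption: samples per second
BYTES_PER_SAMPLE = 2   # assumption: 16-bit mono
CHUNK_MS = 20          # illustrative chunk duration

def send_paced(send, pcm, lead_ms=100, clock=time.monotonic, sleep=time.sleep):
    """Send pcm through send(bytes) at roughly real-time speed.

    Each chunk is held back until its playback time (minus lead_ms) arrives,
    measured from the start, so small sleep errors do not accumulate.
    """
    bytes_per_sec = SAMPLE_RATE * BYTES_PER_SAMPLE
    chunk = bytes_per_sec * CHUNK_MS // 1000
    start = clock()
    sent = 0
    while sent < len(pcm):
        # Moment at which the audio sent so far would finish playing.
        due = start + sent / bytes_per_sec - lead_ms / 1000
        delay = due - clock()
        if delay > 0:
            sleep(delay)
        piece = pcm[sent:sent + chunk]
        send(piece)
        sent += len(piece)
    return sent
```

With a connected socket `conn`, this would be called as `send_paced(conn.sendall, pcm)`. Scheduling against an absolute start time, rather than sleeping a fixed amount per chunk, keeps the average rate at real time even when individual sleeps overshoot.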
Keep in mind I'm using Skype for Linux, so for all I know, this issue is resolved in the far newer Windows version.
Thanks,
Bundabrg