It might also be worth trying the Real-Time Audio-Video (RTAV) functionality. For this you would leave the device local to the client, but split the USB device so that the HID part (i.e. the buttons) is forwarded to the guest.
Before trying the splitting, though, just check whether RTAV works for you. The audio sample rate might not be good enough for your guest application, so try it first and see. If it works, I can help you with the splitting commands.
There is also a KB article that helps with some dictation devices when they are forwarded via USB; see VMware KB: "How to improve audio quality when using USB headsets or speakers with a View desktop". However, I don't think it works for the Philips SpeechMike, unfortunately.
cheers
peterB