Dhwani is a self-hosted platform that provides voice-mode interaction for Kannada and other Indian languages.
It combines several tools and models to parse and transcribe speech and to improve conversations, with the goal of high-quality audio interactions.
It is an experiment in building a production-grade inference pipeline.
curl -X 'POST' \
'https://gaganyatri-tts-indic-server-cpu.hf.space/v1/audio/speech' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{"input": "ಉದ್ಯಾನದಲ್ಲಿ ಮಕ್ಕಳ ಆಟವಾಡುತ್ತಿದ್ದಾರೆ ಮತ್ತು ಪಕ್ಷಿಗಳು ಚಿಲಿಪಿಲಿ ಮಾಡುತ್ತಿವೆ.", "voice": "A female speaker delivers a slightly expressive and animated speech with a moderate speed and pitch. The recording is of very high quality, with the speakers voice sounding clear and very close up.", "response_type": "wav"}' -o audio_kannada_gpu_cloud.wav
curl -X 'POST' \
'https://gaganyatri-asr-indic-server-cpu.hf.space/transcribe/?language=kannada' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@audio_kannada_gpu_cloud.wav;type=audio/x-wav'
curl -X 'POST' \
'https://gaganyatri-translate-indic-server-cpu.hf.space/translate?src_lang=kan_Knda&tgt_lang=eng_Latn&device_type=cpu' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"sentences": [
"ನಮಸ್ಕಾರ, ಹೇಗಿದ್ದೀರಾ?", "ಶುಭೋದಯ!"
],
"src_lang": "kan_Knda",
"tgt_lang": "eng_Latn"
}'
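Hand-writing the JSON body is easy to get wrong (a stray comma or a typographic quote makes the request invalid), so one option is to build it in a small shell helper. The following is a minimal sketch, not part of Dhwani itself: the function names `build_translate_payload` and `translate` and the `TRANSLATE_URL` variable are hypothetical, while the endpoint, query parameters, and body fields are taken from the translate request above. It escapes only backslashes and double quotes, so it assumes sentences contain no newlines or other control characters.

```shell
#!/bin/sh
# Hypothetical helpers for the translate endpoint shown above.
# Assumption: the default URL is the one used in the curl example.
: "${TRANSLATE_URL:=https://gaganyatri-translate-indic-server-cpu.hf.space/translate}"

# build_translate_payload SRC_LANG TGT_LANG SENTENCE...
# Prints the JSON request body for the given language pair and sentences.
build_translate_payload() {
  src=$1
  tgt=$2
  shift 2
  items=""
  for s in "$@"; do
    # Escape backslashes and double quotes so each sentence is a valid JSON string.
    esc=$(printf '%s' "$s" | sed 's/\\/\\\\/g; s/"/\\"/g')
    items="${items:+$items, }\"$esc\""
  done
  printf '{"sentences": [%s], "src_lang": "%s", "tgt_lang": "%s"}' "$items" "$src" "$tgt"
}

# translate SRC_LANG TGT_LANG SENTENCE...
# Sends the request; nothing is sent until this function is called.
translate() {
  payload=$(build_translate_payload "$@")
  curl -sS -X POST \
    "${TRANSLATE_URL}?src_lang=$1&tgt_lang=$2&device_type=cpu" \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d "$payload"
}
```

After sourcing the file, a call such as `translate kan_Knda eng_Latn 'ನಮಸ್ಕಾರ, ಹೇಗಿದ್ದೀರಾ?' 'ಶುಭೋದಯ!'` would send the same request as the curl command above. Setting `TRANSLATE_URL` beforehand points it at a different deployment.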