French transcription with a fine-tuned Whisper model
FrWhisper is a fine-tuned version of OpenAI's Whisper Large V3 model, specifically optimized for French speech recognition with enhanced capabilities for transcribing interjections, hesitations, word repetitions, and interrupted words in conversational French.
FrWhisper was developed through a collaboration between the Chair of Romance Linguistics (French and Italian), coordinated by Annette Gerstenberg, and Hanno Müller at the AI Service Center of the Hasso-Plattner-Institut, with financial support from the Federal Ministry of Research, Technology and Space.
Model Description
This model was trained on a combination of two major French speech corpora:
- LangAge Corpus: Demographically-structured conversational French data
- ESLO (Enquêtes Sociolinguistiques à Orléans): Additional French conversational data for specific age groups (26-46 years, 65+ years)
The model is particularly well-suited for applications requiring detailed transcription of spontaneous speech, including discourse markers, hesitations, and other paralinguistic features that are typically ignored by standard ASR systems.
Key Features
- Enhanced Interjection Recognition: Accurately transcribes French interjections such as "euh", "ah", "hé", and "hein"
- Hesitation Patterns: Captures natural speech hesitations and fillers
- Conversational Speech: Optimized for informal, spontaneous French speech
- Demographic Coverage: Trained on diverse speaker demographics and age groups
- Robust Performance: Significant improvement over base Whisper Large V3
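For readers who want to try a fine-tuned Whisper checkpoint of this kind on their own recordings, the following is a minimal sketch using the Hugging Face transformers ASR pipeline. The repository ID below is a hypothetical placeholder (the text above does not state it), and the helper names build_generate_kwargs and transcribe are introduced here only for illustration; the 30-second chunk length is likewise an assumption, not a documented FrWhisper setting.

```python
from typing import Any, Dict

# Hypothetical placeholder: replace with the actual repository ID published
# by the FrWhisper authors on Hugging Face (not given in this document).
MODEL_ID = "your-namespace/frwhisper"


def build_generate_kwargs() -> Dict[str, Any]:
    """Decoding options that pin Whisper to French transcription."""
    return {"language": "french", "task": "transcribe"}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # assumed value: split long recordings into 30 s chunks
    )
    result = asr(audio_path, generate_kwargs=build_generate_kwargs())
    return result["text"]
```

Calling transcribe("interview.wav") with a real repository ID and an existing audio file (the file name here is only an example) would download the checkpoint on first use; decoding audio from a file path typically also requires ffmpeg to be installed.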
Key Performance Highlights
- Overall word error rate (WER) reduced by 14.18 percentage points compared to Whisper Large V3
- Consistent performance across both training and test data (no sign of overfitting)
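To make the "percentage points" wording concrete, here is a small sketch of how WER and a WER difference are commonly computed. The sentences are invented toy examples, not corpus data or evaluation results, and the actual FrWhisper evaluation may apply text normalization that this sketch omits.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: edit distance divided by reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Invented toy example: a reference that keeps the hesitation, the repetition
# and the final discourse marker.
reference = "euh je je sais pas hein"
baseline = "je sais pas"            # drops "euh", one "je" and "hein"
finetuned = "euh je sais pas hein"  # drops only the repeated "je"

# Improvement in percentage points = baseline WER minus fine-tuned WER.
improvement = wer(reference, baseline) - wer(reference, finetuned)
print(f"baseline {wer(reference, baseline):.2f} %, "
      f"fine-tuned {wer(reference, finetuned):.2f} %, "
      f"improvement {improvement:.2f} points")
```

A difference in percentage points is an absolute difference between two WER values, not a relative reduction, which is how the 14.18 figure above is stated.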
More information
Further information is available on Hugging Face and GitHub.
Would you like to give us feedback on FrWhisper? Then write to us at: langage@uni-potsdam.de
