## 1️⃣ PhonoCoach

A Chrome extension with a FastAPI backend that gives real-time pronunciation feedback through phoneme analysis, using OpenAI's Whisper ASR model.

---

## 2️⃣ Getting Started

### Prerequisites

- **Python 3.12+**
- **pip**
- **Chrome browser**
- **ffmpeg**, installed and available in your system PATH

```bash
# a) Clone the repository
git clone https://github.com/Manoj-HV30/phonocoach.git
cd phonocoach

# b) Create a virtual environment
python3 -m venv venv

# c) Activate the virtual environment
# Linux/macOS:
source venv/bin/activate
# Windows (PowerShell):
# venv\Scripts\Activate.ps1

# d) Install dependencies
pip install -r requirements.txt

# e) Start the FastAPI backend server
uvicorn backend.server:app --reload
```

### Load the Chrome Extension Locally

1. Open Chrome and go to `chrome://extensions/`.
2. Enable **Developer mode** (toggle in the top-right corner).
3. Click **Load unpacked** and select the `frontend` folder inside the cloned repo.
4. The **PhonoCoach** icon should appear in your toolbar.

## 3️⃣ PhonoCoach in Action

[![PhonoCoach in Action](https://i.postimg.cc/1tjyg0rG/2025-08-16-05-17.png)](https://postimg.cc/VrnxpCTv)

[![PhonoCoach in Action](https://i.postimg.cc/0jvr50kV/2025-08-16-05-38.png)](https://postimg.cc/Vr7zDMbt)

### Using PhonoCoach

1. Select any text on a webpage.
2. Open the **PhonoCoach** popup.
3. Click **🎙 Record** to start recording your voice.
4. Click **Stop** to upload the audio and analyze your pronunciation.
5. View the similarity score, phoneme-level feedback, and improvement tips.
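
To give a rough idea of what a similarity score and phoneme-level feedback can look like, here is a minimal sketch. It is not taken from the PhonoCoach backend, whose actual scoring method is not described here. It assumes both the expected and the spoken text have already been converted to phoneme lists, and it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in alignment technique. The function name `compare_phonemes` and the sample phonemes are hypothetical.

```python
from difflib import SequenceMatcher


def compare_phonemes(expected, spoken):
    """Align two phoneme sequences and classify the differences.

    Hypothetical helper for illustration only; not the PhonoCoach implementation.
    """
    matcher = SequenceMatcher(a=expected, b=spoken, autojunk=False)
    feedback = {"correct": [], "incorrect": [], "missing": [], "extra": []}

    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            feedback["correct"].extend(expected[i1:i2])
        elif tag == "replace":
            # Expected sounds that were swapped for different ones.
            # Simplification: if the spoken run is longer than the expected
            # run, the surplus sounds are not reported separately.
            feedback["incorrect"].extend(expected[i1:i2])
        elif tag == "delete":
            # Expected sounds that were not spoken at all.
            feedback["missing"].extend(expected[i1:i2])
        elif tag == "insert":
            # Spoken sounds with no counterpart in the expected sequence.
            feedback["extra"].extend(spoken[j1:j2])

    # ratio() is 2 * matches / total length, scaled here to a percentage.
    score = round(matcher.ratio() * 100, 1)
    return score, feedback


# Assumed sample data: "think" in ARPAbet-style symbols vs. a learner's attempt.
expected = ["TH", "IH", "NG", "K"]
spoken = ["T", "IH", "NG"]

score, feedback = compare_phonemes(expected, spoken)
print(score)
print(feedback)
```

In this sample, the first sound is reported as incorrect and the final sound as missing. A real system would likely weight phonetically close substitutions differently, which this sketch does not attempt.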
## 4️⃣ Features

- Real-time pronunciation analysis for any selected text on any webpage
- Phoneme-level feedback highlighting correct, incorrect, missing, and extra sounds
- Similarity score to quantify pronunciation accuracy
- Improvement tips based on your performance
- Uses OpenAI's Whisper ASR for accurate speech-to-text transcription
- Lightweight FastAPI backend for fast processing

## 5️⃣ Dev Notes

- Make sure the backend server is running before using the Chrome extension.
- Ensure `ffmpeg` is installed and accessible in your system PATH.
- The Chrome extension is loaded locally and is ***NOT YET PUBLISHED*** on the Chrome Web Store.
- In `backend/server.py`, the Whisper model `"small"` is loaded by default. You can switch to another model in `server.py` to match your available RAM and processing power:

| Model  | Approx. RAM Required | Recommended Use Case                        |
|--------|----------------------|---------------------------------------------|
| tiny   | ~1 GB                | Low-resource machines, faster processing    |
| base   | ~2 GB                | Lightweight, reasonable accuracy            |
| small  | ~4 GB                | Default, good balance of speed and accuracy |
| medium | ~8 GB                | Higher accuracy, slower processing          |
| large  | ~16+ GB              | Maximum accuracy, requires powerful CPU/GPU |

```python
import whisper

# Change the model here.
# Options: "tiny", "base", "small", "medium", "large"
model = whisper.load_model("small")
```