added README.md

2026-05-16 19:35:26 +00:00 · 2025-08-16 06:48:16 +05:30
parent 263fafb762
commit ed1b3cf99f
1 changed files with 84 additions and 1 deletions
@@ -1,2 +1,85 @@
-# PhonoCoach
+## 1️⃣   PhonoCoach
 Chrome extension with FastAPI backend for real-time pronunciation feedback using phoneme analysis leveraging OpenAI's whisper ASR model.
+
+---
+
+
+## 2️⃣   Getting Started
+
+### 🛠️ Prerequisites
+
+- **Python 3.12+**
+- **pip**
+- **Chrome browser**
+- **ffmpeg** installed and available in your system PATH.
+
+
+```bash
+# a)Clone the Repository
+git clone https://github.com/Manoj-HV30/phonocoach.git
+cd phonocoach
+# b)Create a virtual environment
+python3 -m venv venv
+
+### c)Activate the virtual environment
+# Linux/macOS:
+source venv/bin/activate
+# Windows (PowerShell):
+# venv\Scripts\Activate.ps1
+
+# d)Install dependencies
+pip install -r requirements.txt
+
+# e)Start the FastAPI backend server
+uvicorn backend.server:app --reload
+```
+### 🖥️ Load the Chrome Extension Locally
+
+1. Open Chrome and go to `chrome://extensions/`
+2. Enable **Developer mode** (toggle in the top-right) 🛠️
+3. Click **Load unpacked** and select the `frontend` folder inside the cloned repo 📂
+4. The **PhonoCoach** icon should appear in your toolbar 🚀
+
+## 3️⃣   PhonoCoach in action
+[![PhonoCoach in Action](https://i.postimg.cc/1tjyg0rG/2025-08-16-05-17.png)](https://postimg.cc/VrnxpCTv)
+[![PhonoCoach in Action](https://i.postimg.cc/0jvr50kV/2025-08-16-05-38.png)](https://postimg.cc/Vr7zDMbt)
+### Using PhonoCoach
+
+1. Select any text on a webpage.
+2. Open the **PhonoCoach** popup.
+3. Click **🎙 Record** to start recording your voice.
+4. Click **Stop** to upload audio and analyze pronunciation.
+5. View similarity score, phoneme-level feedback, and improvement tips.
+## 4️⃣   Features ✨
+
+- Real-time pronunciation analysis for any selected text on any webpage
+- Phoneme-level feedback highlighting correct, incorrect, missing, and extra sounds
+- Similarity score to quantify pronunciation accuracy
+- improvement tips based on your performance
+- Uses OpenAI's Whisper ASR for accurate speech-to-text transcription
+- Lightweight FastAPI backend for fast processing
+
+
+
+## 5️⃣   Dev Notes
+
+- Make sure the backend server is running before using the Chrome extension.
+- Ensure `ffmpeg` is installed and accessible in your system PATH.  ⚡
+- ⚠️ The Chrome extension is loaded locally and is ***NOT YET PUBLISHED*** on the Web Store.
+- In `backend/server.py`, the Whisper model `"small"` is loaded by default. Users can change it to other models based on their system’s processing power:
+
+| Model   | Approx. RAM Required | Recommended Use Case |
+|---------|------------------|-------------------|
+| tiny    | ~1 GB             | Low-resource machines, faster processing |
+| base    | ~2 GB             | Lightweight, reasonable accuracy |
+| small   | ~4 GB             | Default, good balance of speed and accuracy |
+| medium  | ~8 GB             | Higher accuracy, slower processing |
+| large   | ~16+ GB           | Maximum accuracy, requires powerful CPU/GPU |
+
+- Users can adjust the model in `server.py` according to their available RAM and processing power.
+```python
+import whisper
+
+# Change the model here:
+# Options: "tiny", "base", "small", "medium", "large"
+model = whisper.load_model("small")