VoiceCue
Many of us have faced the tedious task of analyzing a voice recording, where you have to listen to the whole audio to identify the most important parts.
Manual processing is very time-consuming. Listening from end to end is often not enough: pausing and replaying parts of the audio can double or even triple the time it takes.
I built an app that generates cue timecodes, which let you find the important parts of your voice recordings, such as sentiments, entities, and tags, with just a click.
Features
- Voice recognition - based on Deepgram
- General stats - an overview of voice recording
- Sentiment analysis - positive and negative word detection
- Word cloud generation - the most frequently used words
- Entity name recognition - categories such as person, place, etc.
- Activity tracking - find actions in past, present, or future
- Interactive transcript - see progress or click to control it
- Speaker detection - total number of speakers in recording
- Cue word usage - short text samples for better context
- Custom search - extended ability to query for cues
- Waveform preview - see the dynamics of voice, identify silences
- Audio controls - play, pause, fast forward, and backward
- Drag and drop support - drop audio in the file select area
- Upload MP3 files - the most commonly used audio format
- Progress loaders - improved UX for loading transcripts
- Fully responsive - works fine on mobile and tablets
- Colorful UI - for easier interaction and word highlighting
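To illustrate the cue idea behind features such as custom search and cue word usage, here is a minimal sketch, assuming the transcript is available as a list of words with start and end times in seconds. The `TimedWord` and `Cue` shapes and the `toTimecode` and `findCues` helpers are hypothetical names chosen for this example, not the app's actual code.

```typescript
// Hypothetical shape of one recognized word (an assumption for illustration).
interface TimedWord {
  word: string;
  start: number; // seconds from the beginning of the recording
  end: number;
}

// Hypothetical shape of a cue: where a word occurs, plus a short text sample.
interface Cue {
  word: string;
  start: number;
  timecode: string; // "mm:ss"
  context: string;
}

// Format a position in seconds as mm:ss.
function toTimecode(seconds: number): string {
  const total = Math.floor(seconds);
  const minutes = Math.floor(total / 60);
  const secs = total % 60;
  return (minutes < 10 ? "0" : "") + minutes + ":" + (secs < 10 ? "0" : "") + secs;
}

// Return one cue per occurrence of `query`, with `radius` words of context
// on each side. Matching ignores case and surrounding punctuation.
function findCues(words: TimedWord[], query: string, radius: number = 2): Cue[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") {
    return [];
  }
  const cues: Cue[] = [];
  words.forEach((w, i) => {
    const normalized = w.word.toLowerCase().replace(/[^a-z0-9']/g, "");
    if (normalized !== needle) {
      return;
    }
    const context = words
      .slice(Math.max(0, i - radius), i + radius + 1)
      .map((x) => x.word)
      .join(" ");
    cues.push({ word: w.word, start: w.start, timecode: toTimecode(w.start), context });
  });
  return cues;
}
```

Clicking a cue could then seek the audio player to `start`, while `context` provides the short text sample shown next to the timecode.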
Tech stack
The project is built with Next.js and the Deepgram API.
The project is released under the MIT licence, and the code is available on GitHub.
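As a rough sketch of how a transcription request to Deepgram might be assembled and its result read, assuming the pre-recorded audio endpoint (`/v1/listen`) with the `punctuate` and `diarize` options. The helper names `buildListenRequest`, `extractWords`, and `countSpeakers` are hypothetical, and the response interface is trimmed to the fields used here; the project's own code may be organized differently.

```typescript
// Subset of the Deepgram response used in this sketch (trimmed, not exhaustive).
interface DeepgramWord {
  word: string;
  start: number;
  end: number;
  confidence: number;
  speaker?: number; // present when diarization is requested
}

interface DeepgramResponse {
  results: {
    channels: { alternatives: { transcript: string; words: DeepgramWord[] }[] }[];
  };
}

// Describe the HTTP request: URL with query options, plus headers.
// The raw MP3 bytes would be sent as the request body.
function buildListenRequest(apiKey: string, options: Record<string, string | boolean>) {
  const query = Object.keys(options)
    .map((k) => encodeURIComponent(k) + "=" + encodeURIComponent(String(options[k])))
    .join("&");
  return {
    url: "https://api.deepgram.com/v1/listen" + (query ? "?" + query : ""),
    method: "POST",
    headers: {
      Authorization: "Token " + apiKey,
      "Content-Type": "audio/mpeg",
    },
  };
}

// Take the word list from the first alternative of the first channel.
function extractWords(response: DeepgramResponse): DeepgramWord[] {
  const channel = response.results.channels[0];
  if (!channel || !channel.alternatives[0]) {
    return [];
  }
  return channel.alternatives[0].words;
}

// Count distinct speaker labels among the words.
function countSpeakers(words: DeepgramWord[]): number {
  const seen: Record<string, boolean> = {};
  for (const w of words) {
    if (w.speaker !== undefined) {
      seen[String(w.speaker)] = true;
    }
  }
  return Object.keys(seen).length;
}
```

The object returned by `buildListenRequest` can be passed to `fetch` together with the audio bytes as the body. In a Next.js app this call would typically run server-side, for example in an API route, so the API key is not exposed to the browser.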
Final notes
The full article on the making of the app can be read here.