My First AI Tool Creation - Karaoke Now App
- jacketzkt
- 2 days ago
- 3 min read
Updated: 10 hours ago

Karaoke Now (Beta) is an AI tool that lets you easily control the vocal level of audio playing on any player, just like a real KTV, so you can enjoy singing along or simply listen to the instrumental track for a more relaxing and focused environment.
Here is a short demo:
How to Get Karaoke Now (Beta)?
Step 1: Download the zip file here or from the Google Drive link
Step 2: Install BlackHole, then extract the Karaoke Now app
Step 3: Drag the app to the Applications folder and launch Karaoke Now
Step 4: Play music and click the Start button to start removing the vocals
Only macOS is supported for the time being. Other platforms will be supported in subsequent updates...
Here is the beginning of the story...
One Saturday night, my brother took me to a karaoke venue. When I tried to pick a song, the system redirected me to a third-party karaoke app and asked me to pay extra — on top of the venue fee — because the song wasn’t in their library.
The following Saturday, my wife wanted to go to karaoke, but after hearing about my poor experience, she chose to simply hum along to Spotify on our TV at home. Even without a hi-fi system, we spent a fun afternoon together at zero extra cost.
Since then, the idea of creating a script to toggle vocals on and off in real time during Spotify playback has been stuck in my mind. As I observed the habits of my family and friends, I realized many of them would benefit from a simple vocal removal solution — whether at home, at their desks, or even in the car.
At the same time, I had been exploring GenAI tools like ComfyUI for image and video generation. That naturally led me to think: What if I built my own AI tool — but for audio, specifically vocal removal?
That's when I started diving into the audio separation AI models...
Scenario
There are three main target scenarios:



Home Entertainment
At home, users often listen to music on their TV, computer, or mobile device for leisure. Sometimes, they feel like singing along. However, traditional streaming services and local audio players don’t allow users to separate vocals from instrumentals.
Working or Studying from Home
When working or studying from home, users may play music on their computer for background ambiance. However, vocals and lyrics can be distracting.
Car Driving
While driving, users often play music through their car system or phone. Just like at home, they may want to casually hum along.
Requirements
Easy vocal level control
Easily toggle the vocal on/off and adjust the vocal level on the fly whenever needed, while retaining the instrumentals.
Multi-platform Support
Available on multiple platforms, including desktop, mobile devices, TVs and car systems.
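The vocal level requirement above can be illustrated with a minimal sketch. It assumes a separation model has already split the audio into an instrumental stem and a vocal stem (that step is not shown), and the function name and the 0.0 to 1.0 level range are hypothetical choices for illustration, not the app's actual code:

```python
def mix_with_vocal_level(instrumental, vocals, vocal_level):
    """Recombine two stems, scaling only the vocal stem.

    instrumental, vocals: equal-length sequences of float samples in [-1.0, 1.0].
    vocal_level: 0.0 mutes the vocals, 1.0 keeps them at their original level.
    (Assumed convention for this sketch.)
    """
    if len(instrumental) != len(vocals):
        raise ValueError("stems must have the same number of samples")
    level = min(max(vocal_level, 0.0), 1.0)  # keep the control within [0, 1]
    mixed = []
    for inst, voc in zip(instrumental, vocals):
        sample = inst + level * voc  # instrumental is always kept at full level
        mixed.append(min(max(sample, -1.0), 1.0))  # clip to the valid sample range
    return mixed
```

With this convention, toggling the vocal off is just a level of 0.0, and any value in between gives a quieter guide vocal over the untouched instrumental.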
MVP
The project faces a few limitations:
To process the audio, the app has to be either embedded in the music players themselves or run as an independent app that captures the audio via a virtual audio endpoint and redirects it to a physical audio endpoint.
To deliver a seamless user experience, the processing needs to occur in real time with minimal latency.
There are legal and compliance concerns regarding music copyright once a track is modified by separating vocals and instrumentals.
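To make the real-time constraint above more concrete, here is a rough sketch of block-based processing and the buffering delay it implies. It is not the app's implementation: `process_block` is a hypothetical stand-in for the vocal-removal step, and the block sizes and sample rates in the usage note are illustrative assumptions.

```python
def block_latency_ms(block_size, sample_rate):
    """Time needed to fill one block of audio, in milliseconds."""
    return 1000.0 * block_size / sample_rate


def process_in_blocks(samples, block_size, process_block):
    """Feed samples through process_block one fixed-size block at a time.

    process_block is a hypothetical callable standing in for the
    vocal-removal step; it receives a list of samples and returns a list.
    """
    output = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]  # the last block may be shorter
        output.extend(process_block(block))
    return output
```

For example, a 480-sample block at 48,000 Hz takes 10 ms to fill, and a 1,024-sample block at 44,100 Hz takes about 23 ms. Model inference time and audio driver buffering come on top of that, so smaller blocks lower the delay but leave less time to process each one.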
Considering the target users and testers around me, I decided to build the following MVP:
An independent plug-and-play macOS desktop app, supporting all kinds of music streaming services and local music resources.
Bundled with the open-source BlackHole virtual audio endpoint software.
Minimalist macOS-style UI with a vocal removal start/stop button and status.