NZG73 TTS-ASR-VC Banner

NZG73 Software Documentation & User Guidelines

Developer: Muhammad Noman | Company: NZG73

⚠️ Strict Warning Against Misuse

🚨 NZG73 strictly instructs all users to use this software only for positive and constructive purposes.
Misusing someone's voice, targeting individuals, or causing harm to anyone is highly unethical and a grave offense.
Such actions carry serious consequences and must be avoided.
This tool is designed to simplify complex tasks; please use it responsibly. 🤝

🌟 Key Features

The software is divided into five modules, each suited to a different task and hardware setup. Each module is described below:

🔊 Step 1: TTS CPU (Text-to-Speech)

This module is designed for users who do not have a dedicated graphics card (GPU).

🏷️ Feature	📝 Description
🌍 Supported Languages	Primarily supports two languages: English and Chinese
⚡ Performance	Runs smoothly on a CPU; no GPU is required.
🎙️ Voice Cloning	Includes a voice cloning feature (strictly for ethical use only)

🎤 Step 2: Voice Clone (Advanced)

A more powerful voice cloning module that supports a far wider range of languages.

🏷️ Feature	📝 Description
🌍 Supported Languages	Supports 600 different languages 🌐
💻 Hardware Requirements	The model is large, so it runs very slowly on a CPU
🚀 Performance	A GPU is strongly recommended. Runs easily on cards with 4 GB to 8 GB of VRAM; the more powerful the GPU, the faster the processing.

💎 Step 3: Voice Clone Metadata (Special Feature)

🌟 This is the most specialized part of the software!
Previously, preparing data for voice training meant recording audio and creating the matching text files separately, a process that took hours.

✨ "Dataset preparation is now easier than ever!" ✨

Instead of manually cutting and naming hundreds of clips, you let the automated system handle everything:

✍️ Enter your text
🖱️ Click a button
🎧 The system generates the voice
📁 It saves the clip to a folder in a numbered sequence
📝 It automatically updates your metadata file
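
To make the workflow above more concrete, here is a minimal Python sketch of the "save in a numbered sequence and update the metadata file" part. It is not the NZG73 implementation. The function names, the `0001.wav` naming scheme, the `metadata.csv` file name, and the pipe-separated `file|text` layout are all assumptions chosen for illustration, and `placeholder_synthesize` only writes one second of silence in place of the real voice generation.

```python
import wave
from pathlib import Path


def placeholder_synthesize(text: str, path: Path, sample_rate: int = 16000) -> None:
    """Stand-in for the real TTS call (assumption): writes one second of silence."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * sample_rate)


def next_clip_number(dataset_dir: Path) -> int:
    """Return one more than the highest numbered .wav already in the folder."""
    numbers = [int(p.stem) for p in dataset_dir.glob("*.wav") if p.stem.isdecimal()]
    return max(numbers, default=0) + 1


def add_clip(dataset_dir: Path, text: str, synthesize=placeholder_synthesize) -> Path:
    """Generate a clip for `text`, save it under the next number, and log it."""
    dataset_dir.mkdir(parents=True, exist_ok=True)
    clip_path = dataset_dir / f"{next_clip_number(dataset_dir):04d}.wav"
    synthesize(text, clip_path)
    # Hypothetical metadata layout: one "file|text" line per clip.
    with open(dataset_dir / "metadata.csv", "a", encoding="utf-8") as meta:
        meta.write(f"{clip_path.name}|{text.strip()}\n")
    return clip_path
```

Calling `add_clip(Path("my_dataset"), "Hello world")` twice with different text would leave `0001.wav`, `0002.wav`, and a two-line `metadata.csv` in the folder. Text containing a `|` character or a line break would need escaping in this layout.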

🎯 Preparing a training dataset is now child's play! 🧒

📝 Step 4: Speech to Text (ASR)

This module converts spoken audio into text with high accuracy.

🏷️ Feature	📝 Description
🌍 Supported Languages	Supports 99 languages
💻 Performance & Hardware	Requires a GPU with at least 3 GB of VRAM. It works very well on a GPU but is impractically slow on a CPU.

🔄 Step 5: Voice to Voice

This feature converts one voice directly into another.

🔧 How it works:
Suppose you generated a story or other audio with an AI model (the Target Voice), but you want it to sound like your own voice. You provide:

🎯 The Target Voice (AI generated audio)
🎙️ A 30-second reference clip of your Original Voice

After you click Generate, the entire audio is converted into your voice! 🎉
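
Since the reference clip is expected to be about 30 seconds long, it can help to check its length before uploading. The following is a small sketch using only Python's standard `wave` module; it assumes the clip is an uncompressed PCM WAV file. The function names and the 5-second tolerance are assumptions for illustration, not limits enforced by the software.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference_clip(path: str, expected: float = 30.0, tolerance: float = 5.0) -> bool:
    """True if the clip length is within `tolerance` seconds of `expected`.

    Both defaults are illustrative assumptions, not documented limits.
    """
    return abs(wav_duration_seconds(path) - expected) <= tolerance
```

For example, `check_reference_clip("my_voice.wav")` returns `True` for a clip between 25 and 35 seconds long, where `my_voice.wav` is a placeholder file name.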

⚡ Performance: Slow but usable on a CPU, and considerably faster on a GPU. 🚀
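
The hardware notes for the modules above can be collected into a small lookup, which is handy when deciding which steps to try on a given machine. This is only a sketch: the function name and table are hypothetical, and treating 4 GB as the lower bound for the advanced Voice Clone module is an interpretation of the "4GB to 8GB VRAM" note rather than a documented minimum. Step 3 is left out because this guide states no hardware requirement for it.

```python
from typing import Dict, List, Optional

# VRAM (in GB) this guide associates with practical use of each module.
# None means the guide describes the module as usable on a CPU.
MIN_VRAM_GB: Dict[str, Optional[float]] = {
    "TTS CPU": None,
    "Voice Clone (Advanced)": 4.0,  # assumed lower bound, see lead-in
    "Speech to Text (ASR)": 3.0,
    "Voice to Voice": None,
}


def recommended_modules(vram_gb: Optional[float]) -> List[str]:
    """List the modules that are practical for the given VRAM (None = no GPU)."""
    available = vram_gb or 0.0
    return [
        name
        for name, needed in MIN_VRAM_GB.items()
        if needed is None or available >= needed
    ]
```

With no GPU, `recommended_modules(None)` returns only the two CPU-friendly modules; with a 6 GB card it returns all four.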

🚀 Upcoming Releases

The journey of NZG73 doesn't end here. We are bringing you more exciting updates:

📱 N Preva (Mobile App)

Our mobile app, N Preva, will be released soon, bringing powerful AI features directly to your smartphone.

🌐 Advanced Web UI

A new, advanced Web UI is in development. It will support multiple AI models, allowing you to interact with various AI systems and handle different tasks simultaneously.