
NZG73 Software Documentation & User Guidelines

Developer: Muhammad Noman   |   Company: NZG73


โš ๏ธ Strict Warning Against Misuse

🚨 NZG73 requires all users to use this software only for positive and constructive purposes.
Misusing someone's voice, targeting individuals, or causing harm to anyone's life is highly unethical and a grave offense.
Such actions carry serious consequences and must be avoided.
This tool is designed to simplify complex tasks; please use it responsibly. 🤝


🌟 Key Features

The system is divided into five modules, each suited to a different task. Each module is described below.


🔊 Step 1: TTS CPU (Text-to-Speech)

This module is designed for users who do not have an expensive graphics card (GPU).

๐Ÿท๏ธ Feature ๐Ÿ“ Description
๐ŸŒ Supported Languages Primarily supports 2 languages (English and Chinese)
โšก Performance Runs smoothly on a CPU without any lag. No heavy GPU needed.
๐ŸŽ™๏ธ Voice Cloning Includes a highly functional Voice Cloning feature (Strictly for ethical use only)

🎤 Step 2: Voice Clone (Advanced)

An extremely powerful Voice Cloning module supporting a vast range of languages.

๐Ÿท๏ธ Feature ๐Ÿ“ Description
๐ŸŒ Supported Languages Supports 600 different languages ๐ŸŒ
๐Ÿ’ป Hardware Requirements Due to its heavy and advanced nature, it is very slow on a CPU
๐Ÿš€ Performance A GPU is mandatory for best results. Runs easily on cards with 4GB to 8GB VRAM. The more powerful the GPU, the faster the processing speed.

💎 Step 3: Voice Clone Metadata (Special Feature)

🌟 This is the most specialized and advanced part of the software!
Previously, users had to record audio and create text files separately for voice training, a process that took hours.

✨ "Dataset preparation is now easier than ever!" ✨

Instead of manually cutting and naming hundreds of clips, you let the automated system handle everything. The workflow is:

  1. ✍️ You enter your text
  2. 🖱️ You click a button
  3. 🎧 The system generates the voice
  4. 📁 It saves the clip in a numbered sequence in a folder
  5. 📝 It automatically updates your metadata file

🎯 Preparing a training dataset is now child's play! 🧒
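
The numbered-clip and metadata bookkeeping described above can be sketched in a few lines of Python. This is only an illustration, not the NZG73 implementation: the `wavs/` folder, the `metadata.csv` name, the `file_id|transcript` row format, and the `fake_synthesize` placeholder (which writes silence instead of calling a real TTS model) are all assumptions.

```python
import csv
import wave
from pathlib import Path


def fake_synthesize(text: str, sample_rate: int = 22050) -> bytes:
    """Hypothetical stand-in for the TTS call: 0.1 s of 16-bit mono silence per word."""
    n_words = max(1, len(text.split()))
    n_frames = int(0.1 * sample_rate) * n_words
    return b"\x00\x00" * n_frames


def add_clip(dataset_dir: Path, text: str, sample_rate: int = 22050) -> Path:
    """Save one clip under the next free number and append its transcript to metadata.csv."""
    wav_dir = dataset_dir / "wavs"  # assumed folder layout
    wav_dir.mkdir(parents=True, exist_ok=True)

    # Next index = highest existing numbered clip + 1 (starts at 1)
    existing = [int(p.stem) for p in wav_dir.glob("*.wav") if p.stem.isdigit()]
    index = max(existing, default=0) + 1
    wav_path = wav_dir / f"{index:04d}.wav"

    with wave.open(str(wav_path), "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(fake_synthesize(text, sample_rate))

    # Append one "file_id|transcript" row per clip (assumed metadata format)
    with open(dataset_dir / "metadata.csv", "a", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter="|").writerow([wav_path.stem, text])
    return wav_path
```

Calling `add_clip(Path("my_dataset"), "Hello world")` repeatedly would produce `0001.wav`, `0002.wav`, and so on, with one matching row per clip in `metadata.csv`.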


๐Ÿ“ Step 4: Speech to Text (ASR)

This module converts audio into text with high accuracy.

๐Ÿท๏ธ Feature ๐Ÿ“ Description
๐ŸŒ Supported Languages Supports 99 languages
๐Ÿ’ป Performance & Hardware This is a powerful model requiring at least a 3GB VRAM GPU. While it works excellently on a GPU, it is nearly impossible to use on a CPU due to extremely slow speeds.
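
The hardware notes for Steps 1, 2, and 4 can be summarized as a small lookup. The sketch below is a hedged illustration only: the module keys and the `is_practical` helper are hypothetical names, and the thresholds are simply the figures quoted in the tables above.

```python
from typing import Optional

# Minimum GPU memory in GB, as quoted in the module tables above.
# None means the module is documented as running on a CPU.
MIN_VRAM_GB = {
    "tts_cpu": None,     # Step 1: runs on a CPU
    "voice_clone": 4.0,  # Step 2: cards with 4 GB to 8 GB of VRAM
    "asr": 3.0,          # Step 4: at least 3 GB of VRAM
}


def is_practical(module: str, vram_gb: Optional[float]) -> bool:
    """Return True if the module should run at a usable speed on this machine.

    vram_gb is the available GPU memory, or None when there is no GPU.
    """
    required = MIN_VRAM_GB[module]
    if required is None:
        return True  # a CPU is enough
    return vram_gb is not None and vram_gb >= required
```

For example, `is_practical("asr", None)` is `False` because the ASR module is described as nearly unusable on a CPU, while `is_practical("tts_cpu", None)` is `True`.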

🔄 Step 5: Voice to Voice

This feature converts one voice directly into another.

🔧 How it works:
Suppose you generated a story or other audio using an AI model (the Target Voice), but you want it to sound like your own voice. You provide:

  • 🎯 The Target Voice (the AI-generated audio)
  • 🎙️ A 30-second reference clip of your Original Voice

After clicking Generate, the entire audio is transformed into your voice! 🎉

⚡ Performance: Functional but somewhat slow on a CPU. On a GPU, it runs smoothly. 🚀
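
Before supplying a reference clip, it can help to confirm that it is roughly 30 seconds long. The following is a minimal sketch using only Python's standard library, assuming the clip is an uncompressed PCM WAV file. The function names and the 5-second tolerance are illustrative assumptions, not part of the NZG73 software.

```python
import wave


def clip_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_reference_clip(path: str, target_seconds: float = 30.0,
                         tolerance: float = 5.0) -> bool:
    """True if the clip length is within `tolerance` seconds of the target (assumed margin)."""
    return abs(clip_duration_seconds(path) - target_seconds) <= tolerance
```

Other formats such as MP3 would need a different reader, since the `wave` module only handles WAV files.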


🚀 Upcoming Releases

The journey of NZG73 doesn't end here. We are bringing you more exciting updates:

📱 N Preva (Mobile App)

Our mobile app, N Preva, will be released soon, bringing powerful AI features directly to your smartphone.

๐ŸŒ Advanced Web UI

A new, advanced Web UI is in development. It will support multiple AI models, allowing you to interact with various AI systems and handle different tasks simultaneously.


๐Ÿ› ๏ธ Powered By ๐Ÿ› ๏ธ


๐ŸŒŸ Connect With Me ๐ŸŒŸ


๐Ÿ“Š GitHub Stats ๐Ÿ“Š



✨ Made with ❤️ by Muhammad Noman | NZG73 ✨

⭐ If you like this project, don't forget to give it a star! ⭐
