1kxia/gemma-3-270m-modelopt-fp8

Feature Extraction

text-generation

nvidia-modelopt

text-embeddings-inference


1kxia committed 23 days ago

Commit 23ad6cc · verified · 1 parent: 6820eba

Upload README.md with huggingface_hub

Files changed (1)

README.md +1 -16

@@ -7,7 +7,7 @@ tags:
   - quantized
   - embedding
   - nvidia-modelopt
-  - nanojet
 pipeline_tag: feature-extraction
 ---
@@ -78,21 +78,6 @@ All configurations achieve >0.99 cosine similarity with the BF16 baseline.
 └── generation_config.json   # Generation config
 ```
-## Usage
-This model is designed to be used with the [NanoJet](https://github.com/ai-microsoft/NanoJet_Kernels) inference engine:
-```python
-from test.infrastructure.model_utils import load_nanojet_model
-model = load_nanojet_model(
-    "1kxia/gemma-3-270m-modelopt-fp8",
-    batch=8,
-    seq_len=4096,
-    quantization="fp8"
-)
-```
 ## Intended Use
 This model is intended for efficient FP8 inference on NVIDIA GPUs with FP8 support (Hopper architecture and above).
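
The "Hopper architecture and above" requirement in the Intended Use section can be checked before loading the model. The following is a minimal sketch, not part of the model card: `meets_hopper_requirement` and `HOPPER_CAPABILITY` are hypothetical names, and the threshold assumes Hopper corresponds to CUDA compute capability 9.0.

```python
# Hypothetical helper for illustration; not part of the model card or any library.
HOPPER_CAPABILITY = (9, 0)  # Hopper GPUs (e.g. H100) report compute capability 9.0


def meets_hopper_requirement(capability):
    """Return True if a (major, minor) compute capability is Hopper or newer."""
    return tuple(capability) >= HOPPER_CAPABILITY


# With PyTorch installed, the tuple could come from
# torch.cuda.get_device_capability(), which returns (major, minor).
for cap in [(8, 0), (9, 0), (10, 0)]:
    print(cap, meets_hopper_requirement(cap))
```

Comparing tuples keeps the check readable and orders capabilities by major version first, then minor.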
