--- language: - en license: apache-2.0 tags: - sentence-transformers - sparse-encoder - sparse - splade - generated_from_trainer - dataset_size:99000 - loss:SpladeLoss - loss:SparseMultipleNegativesRankingLoss - loss:FlopsLoss base_model: distilbert/distilbert-base-uncased widget: - text: 'The term emergent literacy signals a belief that, in a literate society, young children even one and two year olds, are in the process of becoming literate”. ... Gray (1956:21) notes: Functional literacy is used for the training of adults to ''meet independently the reading and writing demands placed on them''.' - text: Rey is seemingly confirmed as being The Chosen One per a quote by a Lucasfilm production designer who worked on The Rise of Skywalker. - text: are union gun safes fireproof? - text: Fruit is an essential part of a healthy diet — and may aid weight loss. Most fruits are low in calories while high in nutrients and fiber, which can boost your fullness. Keep in mind that it's best to eat fruits whole rather than juiced. What's more, simply eating fruit is not the key to weight loss. - text: Treatment of suspected bacterial infection is with antibiotics, such as amoxicillin/clavulanate or doxycycline, given for 5 to 7 days for acute sinusitis and for up to 6 weeks for chronic sinusitis. 
datasets: - sentence-transformers/gooaq pipeline_tag: feature-extraction library_name: sentence-transformers metrics: - dot_accuracy@1 - dot_accuracy@3 - dot_accuracy@5 - dot_accuracy@10 - dot_precision@1 - dot_precision@3 - dot_precision@5 - dot_precision@10 - dot_recall@1 - dot_recall@3 - dot_recall@5 - dot_recall@10 - dot_ndcg@10 - dot_mrr@10 - dot_map@100 - query_active_dims - query_sparsity_ratio - corpus_active_dims - corpus_sparsity_ratio co2_eq_emissions: emissions: 13.144676625187973 energy_consumed: 0.03381684844736578 source: codecarbon training_type: fine-tuning on_cloud: false cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K ram_total_size: 31.777088165283203 hours_used: 0.145 hardware_used: 1 x NVIDIA GeForce RTX 3090 model-index: - name: splade-distilbert-base-uncased trained on GooAQ results: - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoMSMARCO type: NanoMSMARCO metrics: - type: dot_accuracy@1 value: 0.22 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.4 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.5 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.7 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.22 name: Dot Precision@1 - type: dot_precision@3 value: 0.13333333333333333 name: Dot Precision@3 - type: dot_precision@5 value: 0.1 name: Dot Precision@5 - type: dot_precision@10 value: 0.07 name: Dot Precision@10 - type: dot_recall@1 value: 0.22 name: Dot Recall@1 - type: dot_recall@3 value: 0.4 name: Dot Recall@3 - type: dot_recall@5 value: 0.5 name: Dot Recall@5 - type: dot_recall@10 value: 0.7 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.43322728177988873 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.35121428571428576 name: Dot Mrr@10 - type: dot_map@100 value: 0.36254438939466105 name: Dot Map@100 - type: query_active_dims value: 114.83999633789062 name: Query Active Dims - type: query_sparsity_ratio value: 0.9962374681758112 name: Query Sparsity Ratio - type: 
corpus_active_dims value: 504.9510192871094 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9834561621359311 name: Corpus Sparsity Ratio - type: dot_accuracy@1 value: 0.22 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.4 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.5 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.7 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.22 name: Dot Precision@1 - type: dot_precision@3 value: 0.13333333333333333 name: Dot Precision@3 - type: dot_precision@5 value: 0.1 name: Dot Precision@5 - type: dot_precision@10 value: 0.07 name: Dot Precision@10 - type: dot_recall@1 value: 0.22 name: Dot Recall@1 - type: dot_recall@3 value: 0.4 name: Dot Recall@3 - type: dot_recall@5 value: 0.5 name: Dot Recall@5 - type: dot_recall@10 value: 0.7 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.43322728177988873 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.35121428571428576 name: Dot Mrr@10 - type: dot_map@100 value: 0.36254438939466105 name: Dot Map@100 - type: query_active_dims value: 114.83999633789062 name: Query Active Dims - type: query_sparsity_ratio value: 0.9962374681758112 name: Query Sparsity Ratio - type: corpus_active_dims value: 504.9510192871094 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9834561621359311 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoNFCorpus type: NanoNFCorpus metrics: - type: dot_accuracy@1 value: 0.28 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.42 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.44 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.48 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.28 name: Dot Precision@1 - type: dot_precision@3 value: 0.2533333333333333 name: Dot Precision@3 - type: dot_precision@5 value: 0.20800000000000002 name: Dot Precision@5 - type: dot_precision@10 value: 0.172 name: Dot Precision@10 - type: dot_recall@1 value: 
0.01025265789874976 name: Dot Recall@1 - type: dot_recall@3 value: 0.024326098686792398 name: Dot Recall@3 - type: dot_recall@5 value: 0.03315745551680213 name: Dot Recall@5 - type: dot_recall@10 value: 0.058486915473213524 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.19719700869611326 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.35035714285714287 name: Dot Mrr@10 - type: dot_map@100 value: 0.06408607089134896 name: Dot Map@100 - type: query_active_dims value: 185.0 name: Query Active Dims - type: query_sparsity_ratio value: 0.9939387982438896 name: Query Sparsity Ratio - type: corpus_active_dims value: 1286.7938232421875 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9578404487503379 name: Corpus Sparsity Ratio - type: dot_accuracy@1 value: 0.28 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.42 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.44 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.48 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.28 name: Dot Precision@1 - type: dot_precision@3 value: 0.2533333333333333 name: Dot Precision@3 - type: dot_precision@5 value: 0.20800000000000002 name: Dot Precision@5 - type: dot_precision@10 value: 0.172 name: Dot Precision@10 - type: dot_recall@1 value: 0.01025265789874976 name: Dot Recall@1 - type: dot_recall@3 value: 0.024326098686792398 name: Dot Recall@3 - type: dot_recall@5 value: 0.03315745551680213 name: Dot Recall@5 - type: dot_recall@10 value: 0.058486915473213524 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.19719700869611326 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.35035714285714287 name: Dot Mrr@10 - type: dot_map@100 value: 0.06408607089134896 name: Dot Map@100 - type: query_active_dims value: 185.0 name: Query Active Dims - type: query_sparsity_ratio value: 0.9939387982438896 name: Query Sparsity Ratio - type: corpus_active_dims value: 1286.7938232421875 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9578404487503379 name: Corpus Sparsity 
Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoNQ type: NanoNQ metrics: - type: dot_accuracy@1 value: 0.22 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.36 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.5 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.56 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.22 name: Dot Precision@1 - type: dot_precision@3 value: 0.12 name: Dot Precision@3 - type: dot_precision@5 value: 0.10000000000000002 name: Dot Precision@5 - type: dot_precision@10 value: 0.05600000000000001 name: Dot Precision@10 - type: dot_recall@1 value: 0.2 name: Dot Recall@1 - type: dot_recall@3 value: 0.33 name: Dot Recall@3 - type: dot_recall@5 value: 0.46 name: Dot Recall@5 - type: dot_recall@10 value: 0.52 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.3556861493087894 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.322 name: Dot Mrr@10 - type: dot_map@100 value: 0.31376550096906575 name: Dot Map@100 - type: query_active_dims value: 98.22000122070312 name: Query Active Dims - type: query_sparsity_ratio value: 0.9967819932763022 name: Query Sparsity Ratio - type: corpus_active_dims value: 841.8667602539062 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9724177065639898 name: Corpus Sparsity Ratio - type: dot_accuracy@1 value: 0.22 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.36 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.5 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.56 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.22 name: Dot Precision@1 - type: dot_precision@3 value: 0.12 name: Dot Precision@3 - type: dot_precision@5 value: 0.10000000000000002 name: Dot Precision@5 - type: dot_precision@10 value: 0.05600000000000001 name: Dot Precision@10 - type: dot_recall@1 value: 0.2 name: Dot Recall@1 - type: dot_recall@3 value: 0.33 name: Dot Recall@3 - type: dot_recall@5 value: 0.46 name: Dot Recall@5 - type: dot_recall@10 value: 0.52 
name: Dot Recall@10 - type: dot_ndcg@10 value: 0.3556861493087894 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.322 name: Dot Mrr@10 - type: dot_map@100 value: 0.31376550096906575 name: Dot Map@100 - type: query_active_dims value: 98.22000122070312 name: Query Active Dims - type: query_sparsity_ratio value: 0.9967819932763022 name: Query Sparsity Ratio - type: corpus_active_dims value: 841.8667602539062 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9724177065639898 name: Corpus Sparsity Ratio - task: type: sparse-nano-beir name: Sparse Nano BEIR dataset: name: NanoBEIR mean type: NanoBEIR_mean metrics: - type: dot_accuracy@1 value: 0.24 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.39333333333333337 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.48 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.58 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.24 name: Dot Precision@1 - type: dot_precision@3 value: 0.16888888888888887 name: Dot Precision@3 - type: dot_precision@5 value: 0.13600000000000004 name: Dot Precision@5 - type: dot_precision@10 value: 0.09933333333333333 name: Dot Precision@10 - type: dot_recall@1 value: 0.14341755263291658 name: Dot Recall@1 - type: dot_recall@3 value: 0.25144203289559747 name: Dot Recall@3 - type: dot_recall@5 value: 0.3310524851722674 name: Dot Recall@5 - type: dot_recall@10 value: 0.42616230515773784 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.3287034799282638 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.3411904761904762 name: Dot Mrr@10 - type: dot_map@100 value: 0.24679865375169194 name: Dot Map@100 - type: query_active_dims value: 132.6866658528646 name: Query Active Dims - type: query_sparsity_ratio value: 0.9956527532320011 name: Query Sparsity Ratio - type: corpus_active_dims value: 812.3067522198979 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9733861885780781 name: Corpus Sparsity Ratio - type: dot_accuracy@1 value: 0.3254945054945055 name: Dot Accuracy@1 - 
type: dot_accuracy@3 value: 0.4843328100470958 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.5676295133437991 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.6615384615384615 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.3254945054945055 name: Dot Precision@1 - type: dot_precision@3 value: 0.2203453689167975 name: Dot Precision@3 - type: dot_precision@5 value: 0.1832904238618524 name: Dot Precision@5 - type: dot_precision@10 value: 0.13404081632653062 name: Dot Precision@10 - type: dot_recall@1 value: 0.17156366311931473 name: Dot Recall@1 - type: dot_recall@3 value: 0.27243997398612047 name: Dot Recall@3 - type: dot_recall@5 value: 0.3368199222866662 name: Dot Recall@5 - type: dot_recall@10 value: 0.4238029847392705 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.3726337418448364 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.4264663726296379 name: Dot Mrr@10 - type: dot_map@100 value: 0.2989418038202097 name: Dot Map@100 - type: query_active_dims value: 234.31433094300914 name: Query Active Dims - type: query_sparsity_ratio value: 0.9923231003557103 name: Query Sparsity Ratio - type: corpus_active_dims value: 808.1458433081926 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9735225134883626 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoClimateFEVER type: NanoClimateFEVER metrics: - type: dot_accuracy@1 value: 0.22 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.28 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.36 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.46 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.22 name: Dot Precision@1 - type: dot_precision@3 value: 0.11333333333333333 name: Dot Precision@3 - type: dot_precision@5 value: 0.084 name: Dot Precision@5 - type: dot_precision@10 value: 0.05800000000000001 name: Dot Precision@10 - type: dot_recall@1 value: 0.09166666666666666 name: Dot Recall@1 - type: dot_recall@3 
value: 0.15333333333333332 name: Dot Recall@3 - type: dot_recall@5 value: 0.17666666666666664 name: Dot Recall@5 - type: dot_recall@10 value: 0.22466666666666665 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.19429559758090853 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.27672222222222226 name: Dot Mrr@10 - type: dot_map@100 value: 0.15485373044420248 name: Dot Map@100 - type: query_active_dims value: 259.8599853515625 name: Query Active Dims - type: query_sparsity_ratio value: 0.9914861416240233 name: Query Sparsity Ratio - type: corpus_active_dims value: 1094.6026611328125 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9641372563681013 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoDBPedia type: NanoDBPedia metrics: - type: dot_accuracy@1 value: 0.52 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.74 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.84 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.88 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.52 name: Dot Precision@1 - type: dot_precision@3 value: 0.4599999999999999 name: Dot Precision@3 - type: dot_precision@5 value: 0.45199999999999996 name: Dot Precision@5 - type: dot_precision@10 value: 0.384 name: Dot Precision@10 - type: dot_recall@1 value: 0.04966217676438495 name: Dot Recall@1 - type: dot_recall@3 value: 0.10354828293616407 name: Dot Recall@3 - type: dot_recall@5 value: 0.16425525763608173 name: Dot Recall@5 - type: dot_recall@10 value: 0.2406829559845734 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.456594069464261 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.6436666666666666 name: Dot Mrr@10 - type: dot_map@100 value: 0.3020935356938311 name: Dot Map@100 - type: query_active_dims value: 191.25999450683594 name: Query Active Dims - type: query_sparsity_ratio value: 0.9937337004617379 name: Query Sparsity Ratio - type: corpus_active_dims value: 809.2098999023438 name: Corpus 
Active Dims - type: corpus_sparsity_ratio value: 0.9734876515332435 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoFEVER type: NanoFEVER metrics: - type: dot_accuracy@1 value: 0.58 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.66 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.72 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.88 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.58 name: Dot Precision@1 - type: dot_precision@3 value: 0.22 name: Dot Precision@3 - type: dot_precision@5 value: 0.14800000000000002 name: Dot Precision@5 - type: dot_precision@10 value: 0.08999999999999998 name: Dot Precision@10 - type: dot_recall@1 value: 0.56 name: Dot Recall@1 - type: dot_recall@3 value: 0.63 name: Dot Recall@3 - type: dot_recall@5 value: 0.7 name: Dot Recall@5 - type: dot_recall@10 value: 0.8466666666666666 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.681545812563628 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.6464126984126983 name: Dot Mrr@10 - type: dot_map@100 value: 0.6296549550854825 name: Dot Map@100 - type: query_active_dims value: 249.5399932861328 name: Query Active Dims - type: query_sparsity_ratio value: 0.9918242581322937 name: Query Sparsity Ratio - type: corpus_active_dims value: 1358.960205078125 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9554760433432237 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoFiQA2018 type: NanoFiQA2018 metrics: - type: dot_accuracy@1 value: 0.14 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.26 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.36 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.46 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.14 name: Dot Precision@1 - type: dot_precision@3 value: 0.10666666666666666 name: Dot Precision@3 - type: dot_precision@5 value: 0.10400000000000002 name: Dot 
Precision@5 - type: dot_precision@10 value: 0.068 name: Dot Precision@10 - type: dot_recall@1 value: 0.07933333333333334 name: Dot Recall@1 - type: dot_recall@3 value: 0.157 name: Dot Recall@3 - type: dot_recall@5 value: 0.2571666666666667 name: Dot Recall@5 - type: dot_recall@10 value: 0.30074603174603176 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.21720208465433088 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.23057936507936508 name: Dot Mrr@10 - type: dot_map@100 value: 0.17181110132538066 name: Dot Map@100 - type: query_active_dims value: 87.4000015258789 name: Query Active Dims - type: query_sparsity_ratio value: 0.9971364916609043 name: Query Sparsity Ratio - type: corpus_active_dims value: 517.6328735351562 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9830406633400447 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoHotpotQA type: NanoHotpotQA metrics: - type: dot_accuracy@1 value: 0.48 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.64 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.78 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.84 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.48 name: Dot Precision@1 - type: dot_precision@3 value: 0.26666666666666666 name: Dot Precision@3 - type: dot_precision@5 value: 0.204 name: Dot Precision@5 - type: dot_precision@10 value: 0.114 name: Dot Precision@10 - type: dot_recall@1 value: 0.24 name: Dot Recall@1 - type: dot_recall@3 value: 0.4 name: Dot Recall@3 - type: dot_recall@5 value: 0.51 name: Dot Recall@5 - type: dot_recall@10 value: 0.57 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.489382062974203 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.5975555555555556 name: Dot Mrr@10 - type: dot_map@100 value: 0.41273857719946977 name: Dot Map@100 - type: query_active_dims value: 151.22000122070312 name: Query Active Dims - type: query_sparsity_ratio value: 0.9950455408813085 name: Query Sparsity 
Ratio - type: corpus_active_dims value: 904.4683837890625 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9703666737504403 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoQuoraRetrieval type: NanoQuoraRetrieval metrics: - type: dot_accuracy@1 value: 0.34 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.48 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.58 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.74 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.34 name: Dot Precision@1 - type: dot_precision@3 value: 0.16 name: Dot Precision@3 - type: dot_precision@5 value: 0.12 name: Dot Precision@5 - type: dot_precision@10 value: 0.07800000000000001 name: Dot Precision@10 - type: dot_recall@1 value: 0.32666666666666666 name: Dot Recall@1 - type: dot_recall@3 value: 0.4466666666666667 name: Dot Recall@3 - type: dot_recall@5 value: 0.5406666666666666 name: Dot Recall@5 - type: dot_recall@10 value: 0.7106666666666667 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.5024501622170336 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.45037301587301587 name: Dot Mrr@10 - type: dot_map@100 value: 0.444050525697599 name: Dot Map@100 - type: query_active_dims value: 51.31999969482422 name: Query Active Dims - type: query_sparsity_ratio value: 0.9983185898796008 name: Query Sparsity Ratio - type: corpus_active_dims value: 59.146453857421875 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.998062169783847 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoSCIDOCS type: NanoSCIDOCS metrics: - type: dot_accuracy@1 value: 0.28 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.58 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.62 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.8 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.28 name: Dot Precision@1 - type: 
dot_precision@3 value: 0.24 name: Dot Precision@3 - type: dot_precision@5 value: 0.17199999999999996 name: Dot Precision@5 - type: dot_precision@10 value: 0.13599999999999998 name: Dot Precision@10 - type: dot_recall@1 value: 0.05866666666666667 name: Dot Recall@1 - type: dot_recall@3 value: 0.14766666666666667 name: Dot Recall@3 - type: dot_recall@5 value: 0.17566666666666664 name: Dot Recall@5 - type: dot_recall@10 value: 0.2796666666666667 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.25565589285716384 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.4341031746031745 name: Dot Mrr@10 - type: dot_map@100 value: 0.16804725663907635 name: Dot Map@100 - type: query_active_dims value: 195.27999877929688 name: Query Active Dims - type: query_sparsity_ratio value: 0.9936019920457605 name: Query Sparsity Ratio - type: corpus_active_dims value: 1035.02685546875 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9660891535460078 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoArguAna type: NanoArguAna metrics: - type: dot_accuracy@1 value: 0.02 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.12 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.14 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.16 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.02 name: Dot Precision@1 - type: dot_precision@3 value: 0.039999999999999994 name: Dot Precision@3 - type: dot_precision@5 value: 0.028000000000000004 name: Dot Precision@5 - type: dot_precision@10 value: 0.016 name: Dot Precision@10 - type: dot_recall@1 value: 0.02 name: Dot Recall@1 - type: dot_recall@3 value: 0.12 name: Dot Recall@3 - type: dot_recall@5 value: 0.14 name: Dot Recall@5 - type: dot_recall@10 value: 0.16 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.09097486504648661 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.06833333333333333 name: Dot Mrr@10 - type: dot_map@100 value: 0.07512669033130494 name: Dot 
Map@100 - type: query_active_dims value: 1119.800048828125 name: Query Active Dims - type: query_sparsity_ratio value: 0.9633117079867596 name: Query Sparsity Ratio - type: corpus_active_dims value: 936.6198120117188 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9693132883817667 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoSciFact type: NanoSciFact metrics: - type: dot_accuracy@1 value: 0.36 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.54 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.58 name: Dot Accuracy@5 - type: dot_accuracy@10 value: 0.64 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.36 name: Dot Precision@1 - type: dot_precision@3 value: 0.19333333333333333 name: Dot Precision@3 - type: dot_precision@5 value: 0.124 name: Dot Precision@5 - type: dot_precision@10 value: 0.074 name: Dot Precision@10 - type: dot_recall@1 value: 0.335 name: Dot Recall@1 - type: dot_recall@3 value: 0.515 name: Dot Recall@3 - type: dot_recall@5 value: 0.545 name: Dot Recall@5 - type: dot_recall@10 value: 0.63 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.4915918543191975 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.4533333333333332 name: Dot Mrr@10 - type: dot_map@100 value: 0.4491987141282297 name: Dot Map@100 - type: query_active_dims value: 299.3399963378906 name: Query Active Dims - type: query_sparsity_ratio value: 0.9901926480460688 name: Query Sparsity Ratio - type: corpus_active_dims value: 1136.7972412109375 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9627548246769236 name: Corpus Sparsity Ratio - task: type: sparse-information-retrieval name: Sparse Information Retrieval dataset: name: NanoTouche2020 type: NanoTouche2020 metrics: - type: dot_accuracy@1 value: 0.5714285714285714 name: Dot Accuracy@1 - type: dot_accuracy@3 value: 0.8163265306122449 name: Dot Accuracy@3 - type: dot_accuracy@5 value: 0.9591836734693877 name: Dot 
Accuracy@5 - type: dot_accuracy@10 value: 1.0 name: Dot Accuracy@10 - type: dot_precision@1 value: 0.5714285714285714 name: Dot Precision@1 - type: dot_precision@3 value: 0.5578231292517006 name: Dot Precision@3 - type: dot_precision@5 value: 0.5387755102040817 name: Dot Precision@5 - type: dot_precision@10 value: 0.4265306122448979 name: Dot Precision@10 - type: dot_recall@1 value: 0.03907945255462338 name: Dot Recall@1 - type: dot_recall@3 value: 0.1141786135299426 name: Dot Recall@3 - type: dot_recall@5 value: 0.17607960990710933 name: Dot Recall@5 - type: dot_recall@10 value: 0.26785623174003165 name: Dot Recall@10 - type: dot_ndcg@10 value: 0.4784358025208683 name: Dot Ndcg@10 - type: dot_mrr@10 value: 0.7194120505344994 name: Dot Mrr@10 - type: dot_map@100 value: 0.3382724018630733 name: Dot Map@100 - type: query_active_dims value: 39.1020393371582 name: Query Active Dims - type: query_sparsity_ratio value: 0.9987188900027142 name: Query Sparsity Ratio - type: corpus_active_dims value: 630.3636474609375 name: Corpus Active Dims - type: corpus_sparsity_ratio value: 0.9793472365028197 name: Corpus Sparsity Ratio ---

# splade-distilbert-base-uncased trained on GooAQ

This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model finetuned from [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on the [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) dataset using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences & paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.
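To make the idea of a 30522-dimensional sparse vector concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the `sparse_dot` and `sparsity_ratio` helpers and the toy vectors are hypothetical, and the sketch assumes that a sparsity ratio is the fraction of zero-valued dimensions. Under that assumption, an average of about 114.84 active query dimensions gives the `query_sparsity_ratio` of roughly 0.9962 reported for NanoMSMARCO.

```python
from typing import Dict

VOCAB_SIZE = 30522  # output dimensionality stated in this card


def sparse_dot(query: Dict[int, float], document: Dict[int, float]) -> float:
    """Dot product that only visits dimensions stored in both vectors."""
    # Iterate over the shorter vector so the loop stays small.
    if len(query) > len(document):
        query, document = document, query
    return sum(w * document[dim] for dim, w in query.items() if dim in document)


def sparsity_ratio(active_dims: float, vocab_size: int = VOCAB_SIZE) -> float:
    """Fraction of dimensions that are zero (assumed definition)."""
    return 1.0 - active_dims / vocab_size


# Hypothetical toy vectors stored as {dimension index: weight}.
query_vec = {101: 1.5, 2054: 0.5, 7592: 2.0}
doc_vec = {2054: 1.0, 7592: 0.5, 9999: 3.0}

score = sparse_dot(query_vec, doc_vec)  # only dims 2054 and 7592 overlap
ratio = sparsity_ratio(114.84)          # average active query dims on NanoMSMARCO

print(score)
print(round(ratio, 4))
```

Storing only the non-zero entries is what keeps dot-product scoring cheap despite the large vocabulary-sized output space.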
## Model Details

### Model Description
- **Model Type:** SPLADE Sparse Encoder
- **Base model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased)
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 30522 dimensions
- **Similarity Function:** Dot Product
- **Training Dataset:**
    - [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)

### Full Model Architecture

```
SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'DistilBertForMaskedLM'})
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("tomaarsen/splade-distilbert-base-uncased-gooaq-peft")
# Run inference
queries = [
    "how many days for doxycycline to work on sinus infection?",
]
documents = [
    'Treatment of suspected bacterial infection is with antibiotics, such as amoxicillin/clavulanate or doxycycline, given for 5 to 7 days for acute sinusitis and for up to 6 weeks for chronic sinusitis.',
    'Most engagements typically have a cocktail dress code, calling for dresses at, or slightly above, knee-length and high heels. If your party states a different dress code, however, such as semi-formal or dressy-casual, you may need to dress up or down accordingly.',
    'The average service life of a gas furnace is about 15 years, but the actual life span of an individual unit can vary greatly. There are a number of contributing factors that determine the age a furnace reaches: The quality of the equipment.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[93.4242, 28.8323, 33.3142]])
```

## Evaluation

### Metrics

#### Sparse Information Retrieval

* Datasets: `NanoMSMARCO`, `NanoNFCorpus`, `NanoNQ`, `NanoClimateFEVER`, `NanoDBPedia`, `NanoFEVER`, `NanoFiQA2018`, `NanoHotpotQA`, `NanoQuoraRetrieval`, `NanoSCIDOCS`, `NanoArguAna`, `NanoSciFact` and `NanoTouche2020`
* Evaluated with [SparseInformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseInformationRetrievalEvaluator)

| Metric | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
|:----------------------|:------------|:-------------|:-----------|:-----------------|:------------|:-----------|:-------------|:-------------|:-------------------|:------------|:------------|:------------|:---------------|
| dot_accuracy@1 | 0.22 | 0.28 | 0.22 | 0.22 | 0.52 | 0.58 | 0.14 | 0.48 | 0.34 | 0.28 | 0.02 | 0.36 | 0.5714 |
| dot_accuracy@3 | 0.4 | 0.42 | 0.36 | 0.28 | 0.74 | 0.66 | 0.26 | 0.64 | 0.48 | 0.58 | 0.12 | 0.54 | 0.8163 |
| dot_accuracy@5 | 0.5 | 0.44 | 0.5 | 0.36 | 0.84 | 0.72 | 0.36 | 0.78 | 0.58 | 0.62 | 0.14 | 0.58 | 0.9592 |
| dot_accuracy@10 | 0.7 | 0.48 | 0.56 | 0.46 | 0.88 | 0.88 | 0.46 | 0.84 | 0.74 | 0.8 | 0.16 | 0.64 | 1.0 |
| dot_precision@1 | 0.22 | 0.28 | 0.22 | 0.22 | 0.52 | 0.58 | 0.14 | 0.48 | 0.34 | 0.28 | 0.02 | 0.36 | 0.5714 |
| dot_precision@3 | 0.1333 | 0.2533 | 0.12 | 0.1133 | 0.46 | 0.22 | 0.1067 | 0.2667 | 0.16 | 0.24 | 0.04 | 0.1933 | 0.5578 |
| dot_precision@5 | 0.1 | 0.208 | 0.1 | 0.084 | 0.452 | 0.148 | 0.104 | 0.204 | 0.12 | 0.172 | 0.028 | 0.124 | 0.5388 |
| dot_precision@10 | 0.07 | 0.172 | 0.056 | 0.058 | 0.384 | 0.09 | 0.068 | 0.114 | 0.078 | 0.136 | 0.016 | 0.074 | 0.4265 |
| dot_recall@1 | 0.22 | 0.0103 | 0.2 | 0.0917 | 0.0497 | 0.56 | 0.0793 | 0.24 | 0.3267 | 0.0587 | 0.02 | 0.335 | 0.0391 |
| dot_recall@3 | 0.4 | 0.0243 | 0.33 | 0.1533 | 0.1035 | 0.63 | 0.157 | 0.4 | 0.4467 | 0.1477 | 0.12 | 0.515 | 0.1142 |
| dot_recall@5 | 0.5 | 0.0332 | 0.46 | 0.1767 | 0.1643 | 0.7 | 0.2572 | 0.51 | 0.5407 | 0.1757 | 0.14 | 0.545 | 0.1761 |
| dot_recall@10 | 0.7 | 0.0585 | 0.52 | 0.2247 | 0.2407 | 0.8467 | 0.3007 | 0.57 | 0.7107 | 0.2797 | 0.16 | 0.63 | 0.2679 |
| **dot_ndcg@10** | **0.4332** | **0.1972** | **0.3557** | **0.1943** | **0.4566** | **0.6815** | **0.2172** | **0.4894** | **0.5025** | **0.2557** | **0.091** | **0.4916** | **0.4784** |
| dot_mrr@10 | 0.3512 | 0.3504 | 0.322 | 0.2767 | 0.6437 | 0.6464 | 0.2306 | 0.5976 | 0.4504 | 0.4341 | 0.0683 | 0.4533 | 0.7194 |
| dot_map@100 | 0.3625 | 0.0641 | 0.3138 | 0.1549 | 0.3021 | 0.6297 | 0.1718 | 0.4127 | 0.4441 | 0.168 | 0.0751 | 0.4492 | 0.3383 |
| query_active_dims | 114.84 | 185.0 | 98.22 | 259.86 | 191.26 | 249.54 | 87.4 | 151.22 | 51.32 | 195.28 | 1119.8 | 299.34 | 39.102 |
| query_sparsity_ratio | 0.9962 | 0.9939 | 0.9968 | 0.9915 | 0.9937 | 0.9918 | 0.9971 | 0.995 | 0.9983 | 0.9936 | 0.9633 | 0.9902 | 0.9987 |
| corpus_active_dims | 504.951 | 1286.7938 | 841.8668 | 1094.6027 | 809.2099 | 1358.9602 | 517.6329 | 904.4684 | 59.1465 | 1035.0269 | 936.6198 | 1136.7972 | 630.3636 |
| corpus_sparsity_ratio | 0.9835 | 0.9578 | 0.9724 | 0.9641 | 0.9735 | 0.9555 | 0.983 | 0.9704 | 0.9981 | 0.9661 | 0.9693 | 0.9628 | 0.9793 |

#### Sparse Nano BEIR

* Dataset: `NanoBEIR_mean`
* Evaluated with [SparseNanoBEIREvaluator](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "msmarco",
          "nfcorpus",
          "nq"
      ]
  }
  ```

| Metric | Value |
|:----------------------|:-----------|
| dot_accuracy@1 | 0.24 |
| dot_accuracy@3 | 0.3933 |
| dot_accuracy@5 | 0.48 |
| dot_accuracy@10 | 0.58 |
| dot_precision@1 | 0.24 |
| dot_precision@3 | 0.1689 |
| dot_precision@5 | 0.136 |
| dot_precision@10 | 0.0993 |
| dot_recall@1 | 0.1434 |
| dot_recall@3 | 0.2514 |
| dot_recall@5 | 0.3311 |
| dot_recall@10 | 0.4262 |
| **dot_ndcg@10** | **0.3287** |
| dot_mrr@10 | 0.3412 |
| dot_map@100 | 0.2468 |
| query_active_dims | 132.6867 |
| query_sparsity_ratio | 0.9957 |
| corpus_active_dims | 812.3068 |
| corpus_sparsity_ratio | 0.9734 |

#### Sparse Nano BEIR

* Dataset: `NanoBEIR_mean`
* Evaluated with [SparseNanoBEIREvaluator](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "climatefever",
          "dbpedia",
          "fever",
          "fiqa2018",
          "hotpotqa",
          "msmarco",
          "nfcorpus",
          "nq",
          "quoraretrieval",
          "scidocs",
          "arguana",
          "scifact",
          "touche2020"
      ]
  }
  ```

| Metric | Value |
|:----------------------|:-----------|
| dot_accuracy@1 | 0.3255 |
| dot_accuracy@3 | 0.4843 |
| dot_accuracy@5 | 0.5676 |
| dot_accuracy@10 | 0.6615 |
| dot_precision@1 | 0.3255 |
| dot_precision@3 | 0.2203 |
| dot_precision@5 | 0.1833 |
| dot_precision@10 | 0.134 |
| dot_recall@1 | 0.1716 |
| dot_recall@3 | 0.2724 |
| dot_recall@5 | 0.3368 |
| dot_recall@10 | 0.4238 |
| **dot_ndcg@10** | **0.3726** |
| dot_mrr@10 | 0.4265 |
| dot_map@100 | 0.2989 |
| query_active_dims | 234.3143 |
| query_sparsity_ratio | 0.9923 |
| corpus_active_dims | 808.1458 |
| corpus_sparsity_ratio | 0.9735 |

## Training Details

### Training Dataset

#### gooaq

* Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
* Size: 99,000 training samples
* Columns: `question` and `answer`
* Approximate statistics based on the first 1000 samples:

|         | question | answer |
|:--------|:---------|:-------|
| type    | string   | string |
| details |          |        |

* Samples:

| question | answer |
|:---------|:-------|
| what are the 5 characteristics of a star? | Key Concept: Characteristics used to classify stars include color, temperature, size, composition, and brightness. |
| are copic markers alcohol ink? | Copic Ink is alcohol-based and flammable. Keep away from direct sunlight and extreme temperatures. |
| what is the difference between appellate term and appellate division? | Appellate terms An appellate term is an intermediate appellate court that hears appeals from the inferior courts within their designated counties or judicial districts, and are intended to ease the workload on the Appellate Division and provide a less expensive forum closer to the people.
| * Loss: [SpladeLoss](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters: ```json { "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')", "document_regularizer_weight": 3e-05, "query_regularizer_weight": 5e-05 } ``` ### Evaluation Dataset #### gooaq * Dataset: [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c) * Size: 1,000 evaluation samples * Columns: question and answer * Approximate statistics based on the first 1000 samples: | | question | answer | |:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------| | type | string | string | | details | | | * Samples: | question | answer | |:-----------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | should you take ibuprofen with high blood pressure? | In general, people with high blood pressure should use acetaminophen or possibly aspirin for over-the-counter pain relief. Unless your health care provider has said it's OK, you should not use ibuprofen, ketoprofen, or naproxen sodium. If aspirin or acetaminophen doesn't help with your pain, call your doctor. | | how old do you have to be to work in sc? | The general minimum age of employment for South Carolina youth is 14, although the state allows younger children who are performers to work in show business. 
If their families are agricultural workers, children younger than age 14 may also participate in farm labor. | | how to write a topic proposal for a research paper? | ['Write down the main topic of your paper. ... ', 'Write two or three short sentences under the main topic that explain why you chose that topic. ... ', 'Write a thesis sentence that states the angle and purpose of your research paper. ... ', 'List the items you will cover in the body of the paper that support your thesis statement.'] | * Loss: [SpladeLoss](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters: ```json { "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')", "document_regularizer_weight": 3e-05, "query_regularizer_weight": 5e-05 } ``` ### Training Hyperparameters #### Non-Default Hyperparameters - `eval_strategy`: steps - `per_device_train_batch_size`: 32 - `per_device_eval_batch_size`: 32 - `learning_rate`: 2e-05 - `num_train_epochs`: 1 - `bf16`: True - `load_best_model_at_end`: True - `batch_sampler`: no_duplicates #### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

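
The batch size and dataset size listed in this card determine how the Step and Epoch columns of the training logs relate to each other. The sketch below is a rough sanity check and does not come from the training script: it assumes a single device, `gradient_accumulation_steps` of 1, a final partial batch that is kept (`dataloader_drop_last` is False), and that the `no_duplicates` batch sampler does not discard samples. The helper names `steps_per_epoch` and `epoch_fraction` are hypothetical.

```python
import math

# Values reported in this card.
TRAIN_SAMPLES = 99_000   # size of the gooaq training split
BATCH_SIZE = 32          # per_device_train_batch_size


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Optimizer steps in one epoch (assumes one device, no gradient
    accumulation, and that the last partial batch is kept)."""
    return math.ceil(num_samples / batch_size)


def epoch_fraction(step: int, total_steps: int) -> float:
    """Fraction of the epoch completed at a given step, rounded to four decimals."""
    return round(step / total_steps, 4)


total = steps_per_epoch(TRAIN_SAMPLES, BATCH_SIZE)
print(total)  # 3094

# Compare against the Epoch column of the training logs.
for step in (100, 610, 3050):
    print(step, epoch_fraction(step, total))
# 100 0.0323
# 610 0.1972
# 3050 0.9858
```

Under these assumptions the computed fractions agree with the logged epochs, including 0.9858 for the saved checkpoint at step 3050.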
### Training Logs

| Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_dot_ndcg@10 | NanoNFCorpus_dot_ndcg@10 | NanoNQ_dot_ndcg@10 | NanoBEIR_mean_dot_ndcg@10 | NanoClimateFEVER_dot_ndcg@10 | NanoDBPedia_dot_ndcg@10 | NanoFEVER_dot_ndcg@10 | NanoFiQA2018_dot_ndcg@10 | NanoHotpotQA_dot_ndcg@10 | NanoQuoraRetrieval_dot_ndcg@10 | NanoSCIDOCS_dot_ndcg@10 | NanoArguAna_dot_ndcg@10 | NanoSciFact_dot_ndcg@10 | NanoTouche2020_dot_ndcg@10 |
|:----------:|:--------:|:-------------:|:---------------:|:-----------------------:|:------------------------:|:------------------:|:-------------------------:|:----------------------------:|:-----------------------:|:---------------------:|:------------------------:|:------------------------:|:------------------------------:|:-----------------------:|:-----------------------:|:-----------------------:|:--------------------------:|
| 0.0323 | 100 | 234.4946 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0646 | 200 | 90.2538 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0970 | 300 | 35.2404 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1293 | 400 | 15.0794 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1616 | 500 | 5.7405 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1939 | 600 | 2.6706 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1972 | 610 | - | 1.5711 | 0.1942 | 0.1431 | 0.1568 | 0.1647 | - | - | - | - | - | - | - | - | - | - |
| 0.2262 | 700 | 1.4867 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2586 | 800 | 0.9108 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2909 | 900 | 0.7938 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3232 | 1000 | 0.6679 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3555 | 1100 | 0.5505 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3878 | 1200 | 0.4851 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3943 | 1220 | - | 0.3510 | 0.3406 | 0.1831 | 0.2740 | 0.2659 | - | - | - | - | - | - | - | - | - | - |
| 0.4202 | 1300 | 0.4882 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4525 | 1400 | 0.4156 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4848 | 1500 | 0.452 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5171 | 1600 | 0.3446 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5495 | 1700 | 0.307 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5818 | 1800 | 0.3416 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5915 | 1830 | - | 0.2682 | 0.3942 | 0.1917 | 0.3140 | 0.3000 | - | - | - | - | - | - | - | - | - | - |
| 0.6141 | 1900 | 0.2875 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6464 | 2000 | 0.2989 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6787 | 2100 | 0.3032 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7111 | 2200 | 0.3843 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7434 | 2300 | 0.2845 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7757 | 2400 | 0.2838 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7886 | 2440 | - | 0.2365 | 0.4144 | 0.1952 | 0.3378 | 0.3158 | - | - | - | - | - | - | - | - | - | - |
| 0.8080 | 2500 | 0.2422 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8403 | 2600 | 0.2546 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8727 | 2700 | 0.2683 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9050 | 2800 | 0.2923 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9373 | 2900 | 0.301 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9696 | 3000 | 0.2796 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| **0.9858** | **3050** | **-** | **0.2284** | **0.4332** | **0.1972** | **0.3557** | **0.3287** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** | **-** |
| -1 | -1 | - | - | 0.4332 | 0.1972 | 0.3557 | 0.3726 | 0.1943 | 0.4566 | 0.6815 | 0.2172 | 0.4894 | 0.5025 | 0.2557 | 0.0910 | 0.4916 | 0.4784 |

* The bold row denotes the saved checkpoint.

### Environmental Impact

Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).

- **Energy Consumed**: 0.034 kWh
- **Carbon Emitted**: 0.013 kg of CO2
- **Hours Used**: 0.145 hours

### Training Hardware

- **On Cloud**: No
- **GPU Model**: 1 x NVIDIA GeForce RTX 3090
- **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
- **RAM Size**: 31.78 GB

### Framework Versions

- Python: 3.11.6
- Sentence Transformers: 4.2.0.dev0
- Transformers: 4.52.4
- PyTorch: 2.7.1+cu126
- Accelerate: 1.5.1
- Datasets: 2.21.0
- Tokenizers: 0.21.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### SpladeLoss

```bibtex
@misc{formal2022distillationhardnegativesampling,
    title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
    author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
    year={2022},
    eprint={2205.04733},
    archivePrefix={arXiv},
    primaryClass={cs.IR},
    url={https://arxiv.org/abs/2205.04733},
}
```

#### SparseMultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

#### FlopsLoss

```bibtex
@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}
```