SentenceTransformer based on sentence-transformers/paraphrase-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/paraphrase-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
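
The Pooling module above has pooling_mode_mean_tokens set to True, so the sentence embedding is the average of the token embeddings produced by the Transformer module. The following is a minimal plain-Python sketch of that idea, assuming padded positions are excluded through an attention mask. The function name `mean_pool` and the toy 4-dimensional vectors are illustrative only; the real model works on 384-dimensional tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring positions where the mask is 0."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy input: 3 token positions, 4 dimensions (illustrative values).
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],  # padding position, masked out below
]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

Only the two unmasked rows contribute to the average, so the padding row has no effect on the result.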

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("aasifali4813/bert-summarizer")
# Run inference
sentences = [
    # The article is one list element: adjacent string literals are concatenated.
    '(CNN) -- Eight Florida teenagers -- six of them girls -- will be tried as adults and could be sentenced to life in prison for their alleged roles in the videotaped beating of another teen, the state attorney\'s office said Thursday. The teenagers seen in a video assaulting a 16-year-old could face life in prison. The suspects, who range in age from 14 to 18, all face charges of kidnapping, which is a first-degree felony, and battery, said Chip Thullbery, a spokesman for the Polk County state attorney. Three of them are also charged with tampering with a witness. Everyone involved in the case was under a gag order imposed by a judge. The only attorney for the teens who has been publicly identified did not return calls from CNN, and his assistant cited the gag order as the reason. The teens are scheduled for their first appearance in court Friday. The video shows a brutal scene: The 16-year-old victim is punched, kneed and slapped by other girls. She huddles in the fetal position, or stands and screams at her attackers, but the assault continues. Authorities say the eight teens said they were retaliating for insults posted on the Internet by the attack victim. Polk County Sheriff Grady Judd called the March 30 attack "animalistic." "I\'ve been involved in law enforcement for 35 years, and I\'ve seen a lot of extremely violent events, but I\'ve never seen children, 14 to 18 years of age, engage in this conduct for a 30-minute period of time and then make these video clips," he said. Police say the teens planned to post the video on YouTube.  Watch the disturbing video » . The victim, a 16-year-old from Lakeland, Florida, was hospitalized, and still has blurred vision, hearing loss, and a swollen face, her mother told CNN on Wednesday. The video shows only girls doing the beating; Judd said the boys acted as lookouts. \n'
    'The idea of girls administering a vicious beating so they can post the video online may seem shocking, but it\'s becoming an increasingly common scenario, according to experts and news reports.  Watch why more teens are putting fights online » . A search for "girl fight" on YouTube gets thousands of results, and a suggestion to also try "girl fight at school, boy girl fight" and other search terms. There\'s at least one Web site devoted exclusively to videos of girls fighting. In 2003, 25 percent of high school girls said they had been in a physical fight in the past year, according to a survey by the Centers for Disease Control and Prevention. (The figure for boys was 40.5 percent.) A Justice Department report released in 2006 showed that by age 17, 21 percent of girls said they had assaulted someone with the intent to cause serious harm. Frank Green is executive director of Keys to Safer Schools, a group that studies and tries to prevent school violence. He said he\'s not sure whether girls have actually become more violent, or whether there\'s just more awareness of their fights. "In one respect, girls have always been more vicious than boys," Green said. "Their violence is of a personal nature." He said boys usually have some focus and a concrete goal when they fight. "But girls want to cause pain and make the other girl feel bad," he said. Judd, the Polk County sheriff, said an important part of the plan in the Lakeland attack was to post the video of the beating on YouTube to humiliate and embarrass the victim. "It\'s the next stage of cyberbullying," psychologist Susan Lipkins said. "They want to show what they\'re doing." "Our kids are being peer pressured, in another sense of a trend, to put these shock videos out there at other peoples\' expense," said Talisa Lindsay, the victim\'s mother. \n'
    '"And I hope that it doesn\'t come to the point where there\'s more people\'s lives that are being affected by having to take a beating for entertainment, or possibly being killed."  Watch mother describe how the victim is doing » . The suspects didn\'t have a chance to post the video online before police moved in and seized it, Judd said. The Sheriff\'s Department made it public, and it wound up on YouTube anyway. Judd recognizes the irony. "In a perverted sense, we were feeding into exactly what the kids wanted," he said. "But according to Florida law, [the video] is public record, and it\'s going to be in the public domain whether we agree with that or not." Judd said the suspects showed no remorse when they were arrested and booked. "They were laughing and joking about, \'I guess we won\'t get to go to the beach during spring break.\' And one ... asked whether she could go to cheerleading practice," he said. Lipkins, the psychologist, says there\'s a "disconnect between their actions and their thoughts." "They think the entire society is doing it, and they think it\'s funny. So they put it on YouTube. And I don\'t think they expect kids to get really hurt, and they also don\'t expect to get really caught." E-mail to a friend . CNN\'s Rich Phillips contributed to this report.',
    'Eight Florida teens to be tried as adults in videotaped beating case .\nVideo shows 16-year-old girl punched by other girls .\n21 percent of girls age 17 say they\'ve assaulted someone, the Justice Dept. reports .\nThe teens have "disconnect" between thoughts and actions, psychologist says .',
    'Cornelia Wallace was in her late 60s .\nShe was with Wallace when would-be assassin shot him in 1972 .\n"She served as first lady during a very turbulent time," Gov. Bob Riley says .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7511, 0.1292],
#         [0.7511, 1.0000, 0.1549],
#         [0.1292, 0.1549, 1.0000]])
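
To make the printed matrix easier to interpret, here is a small plain-Python sketch of pairwise cosine similarity, under the assumption that `model.similarity` is using cosine similarity (the library default unless the model is configured otherwise). The helper names `cosine_similarity` and `best_match` are hypothetical, and the score matrix is copied from the output shown above rather than recomputed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(scores, row):
    """Index of the highest-scoring entry in a row, skipping the row itself."""
    candidates = [(s, j) for j, s in enumerate(scores[row]) if j != row]
    return max(candidates)[1]


# Toy 2-dimensional vectors (illustrative, not real embeddings).
toy_score = cosine_similarity([1.0, 0.0], [1.0, 1.0])

# Scores copied from the printed similarity matrix above.
printed_scores = [
    [1.0000, 0.7511, 0.1292],
    [0.7511, 1.0000, 0.1549],
    [0.1292, 0.1549, 1.0000],
]
closest_to_article = best_match(printed_scores, 0)
print(closest_to_article)
```

In the printed matrix, the article (index 0) scores 0.7511 against its own highlights (index 1) and 0.1292 against the unrelated highlights (index 2), so the matching summary ranks first.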

Training Details

Training Dataset

Unnamed Dataset

  • Size: 5,000 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
             sentence_0      sentence_1
    type     string          string
    min      29 tokens       26 tokens
    mean     127.89 tokens   56.37 tokens
    max      128 tokens      93 tokens
  • Samples:
    Sample 1
      sentence_0: (CNN) -- It may take a lot of frequent-flier miles, a penchant for cold places, a tolerance of taxes and regular doses of chocolate, but happiness could be within reach. However, it's not where most people might expect. Journalist Eric Weiner says he wanted to explore the relationship between place and happiness. Just ask Eric Weiner, who made it his mission to find the most content places around the globe, uncovering lots of surprises along the way. Hungering for a tropical paradise? A warm climate doesn't necessarily make a happy nation, Weiner said. Thinking of moving to a wealthy state? Money can degrade happiness, he found. Weiner, who wrote the book, "The Geography of Bliss: One Grump's Search for the Happiest Places in the World," began his quest for very personal reasons. "I'm an unhappy person, so it's kind of what prompts a hungry person to search for food," he said. Weiner spent 10 years as a foreign correspondent for National Public Radio, a job that took him to some of the...
      sentence_1: Journalist spent a year looking for the world's happiest countries .
                  Eric Weiner: Bhutan is probably the closest thing on Earth to Shangri-La .
                  He marvels at the creativity and "coziness" of Iceland .
                  Self-described "grump:" chocolate contributes to happiness in Switzerland .
    Sample 2
      sentence_0: (CNN) -- Hamburg have put one foot in the UEFA Cup final after a header from Germany winger Piotr Trochowski proved enough to give them a 1-0 win at Bundesliga rivals Werder Bremen in the first leg of their semifinal. Piotr Trochowski celebrates the only goal as Hamburg took a major step towards the UEFA Cup final. Martin Jol's side scored the only goal of a pulsating match in the 38th minute when Trochowski, the smallest player on the pitch, rose superbly at the back post to head Guy Demel's right-wing cross past goalkeeper Tim Wiese. Both side had countless half-chances to score but Hamburg wasted the best of them on the hour mark when Bayern Munich-bound striker Ivica Olic broke through in acres of space, but fired his shot straight at Wiese. Hamburg's victory puts them on line for their first European final since 1983, when they beat Juventus 1-0 to win the European Cup. This was the second of four matches in quick succession between the north Germany neighbors -- who met each othe...
      sentence_1: Hamburg in line for first European final since 1983 after defeating Werder 1-0 .
                  Winger Piotr Trochowski heads in the only goal of their UEFA Cup semifinal .
                  Eventual winners will face Shakhtar Donetsk or Dynamo Kiev in Istanbul final .
    Sample 3
      sentence_0: (CNET) -- Suleman Ali cashed out just in time. Suleman Ali sold Esgut, his portfolio of Facebook applications, for seven figures in April. The 26-year-old, a former Microsoft employee who helped put together the Windows Home Server product, founded a company called Esgut within months of the debut of Facebook's developer platform in May 2007. Esgut is a portfolio of Facebook applications, and a few of them, like Superlatives and Entourage, became genuine viral hits. In April, Ali sold the 12-employee Esgut to the Social Gaming Network, a Silicon Valley company backed by the likes of Bezos Expeditions, the Founders Fund, and Greylock Partners. He said the price was in the seven figures. But Ali is the first to acknowledge that for upstart social-platform developers, hailed just months ago as the Valley's hottest breed of bright young things, the condition has taken a significant turn for the worse. "Most people are not counting on anything," the lanky and bespectacled Ali said over lunc...
      sentence_1: Suleman Ali sold Esgut, his tech startup, for seven figures in April .
                  Esgut is a portfolio of Facebook applications; a few of them became big viral hits .
                  Suleman "started building Facebook apps just out of restlessness"
                  He sold his company just before the social-platform craze subsided .
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false,
        "directions": [
            "query_to_doc"
        ],
        "partition_mode": "joint",
        "hardness_mode": null,
        "hardness_strength": 0.0
    }
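
As a rough illustration of what this loss computes, the sketch below treats every other sentence_1 in the batch as a negative for a given sentence_0, multiplies the cosine similarities by the scale of 20.0, and applies cross-entropy with the matching pair as the correct class. It is a simplified plain-Python approximation that covers only the query_to_doc direction; it does not model partition_mode, hardness_mode, or gather_across_devices, and the names `cos_sim` and `mnrl_loss` are illustrative rather than the library's implementation.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mnrl_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two pairs (illustrative 2-dimensional "embeddings").
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the wrong positive

loss_aligned = mnrl_loss(anchors, aligned)
loss_swapped = mnrl_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The loss is close to zero when each anchor is most similar to its own positive and grows toward the scale value when the pairs are mismatched, which is why larger batches (more in-batch negatives) generally make the task harder.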
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin
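
For orientation, the number of optimizer steps implied by these settings can be estimated with a few lines of arithmetic. This sketch assumes training on a single device (the card does not state the device count) and uses the 5,000-sample dataset size, gradient_accumulation_steps of 1, and dataloader_drop_last of False reported elsewhere in this card.

```python
import math

num_samples = 5000                  # training set size reported in this card
per_device_train_batch_size = 16
num_train_epochs = 1
gradient_accumulation_steps = 1     # from the full hyperparameter list
num_devices = 1                     # assumption: not stated in the card

effective_batch = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
# dataloader_drop_last is False, so the final partial batch still counts.
steps_per_epoch = math.ceil(num_samples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs

# With in-batch negatives, each anchor is contrasted with the other positives.
in_batch_negatives = per_device_train_batch_size - 1

print(total_steps, in_batch_negatives)
```

Under these assumptions the run is short: 313 steps, with each sentence_0 compared against 15 in-batch negatives per step.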

All Hyperparameters

  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}
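
The combination of learning_rate 5e-05, lr_scheduler_type linear, and warmup_steps 0 means the learning rate starts at its peak and decays linearly to zero over training. The sketch below mirrors the usual linear-with-warmup formula in plain Python; `linear_lr` is a hypothetical helper, and the total of 313 steps is an assumption derived from 5,000 samples at batch size 16 on a single device.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at a given step under linear warmup and linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 313  # assumption: ceil(5000 / 16) on one device, one epoch

start_lr = linear_lr(0, TOTAL_STEPS)
middle_lr = linear_lr(156, TOTAL_STEPS)
end_lr = linear_lr(TOTAL_STEPS, TOTAL_STEPS)
print(start_lr, middle_lr, end_lr)
```

With no warmup, the very first update already uses the full 5e-05, and roughly half of that remains at the midpoint of the epoch.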

Framework Versions

  • Python: 3.12.13
  • Sentence Transformers: 5.3.0
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.13.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{oord2019representationlearningcontrastivepredictive,
      title={Representation Learning with Contrastive Predictive Coding},
      author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
      year={2019},
      eprint={1807.03748},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/1807.03748},
}