huggingPartyParis

community

https://partiful.com/e/oWOMGoPxB5D37qw5F8yN

Activity Feed Request to join this org

AI & ML interests

None defined yet.

Recent Activity

phongnhhn92 authored a paper about 2 months ago

SwiftTailor: Efficient 3D Garment Generation with Geometry Image Representation

kaopanboonyuen authored a paper about 2 months ago

HOMEY: Heuristic Object Masking with Enhanced YOLO for Property Insurance Risk Detection

kaopanboonyuen authored a paper about 2 months ago

Foundations and Architectures of Artificial Intelligence for Motor Insurance

View all activity

Shrijanagain

posted an update about 1 month ago

Post

4266

sKT-Ai-Labs

Join fast we will soon published tokens and all join and get started because we will soon off join request button if you want you can join fast guys

1 reply

Shrijanagain

posted an update about 2 months ago

Post

2645

🚀 Bharat AI Revolution ka Hissa Banein! 🇮🇳

Kya aap Bharat ko AI ki duniya mein ek nayi pehchan dilana chahte hain ?

SKT AI Labs sirf ek naam nahi, ek mission hai—desh ko digital shakti dene ka aur "Viksit Bharat" ke sapne ko sach karne ka.

Humse Kyun Judein?

1. Desh ka Apna AI: Hum aise models bana rahe hain jo khas taur par Bharat ki zarooraton aur bhashaon ke liye hain.

2. Open Collaboration: Hamare Hugging Face repository par hamare kaam ko dekhein, test karein aur apna yogdan dein.

3. Technological Growth: Agar aap student hain, developer hain ya tech enthusiast hain, toh hamare saath naya seekhne aur grow karne ka yeh behtareen mauka hai.

Join here

sKT-Ai-Labs
🔗

sKT-Ai-Labs

Aaiye, saath milkar Bharat AI Revolution ko aage badhate hain! 💻🔥

#SKTAILabs #DigitalIndia #AIRevolution #ViksitBharat #TechInnovation #JoinTheMission

Shrijanagain

posted an update about 2 months ago

Post

6891

SOME NEW HINDI + ENGLISH DATASETS

🔗
- sKT-Ai-Labs/HIN
- sKT-Ai-Labs/SKT-MIX
- sKT-Ai-Labs/ST-H

Download and Use And Train Models

You Can Alsoo Use ST-x-LIGHTING Module For Faster Training

pip install ST-x-LIGHT-V11

2 replies

Shrijanagain

posted an update about 2 months ago

Post

5620

We are thrilled to announce the launch of SKT-OMNI-CORPUS-2T, a massive-scale, high-quality dataset designed to power the next generation of Foundation Models (LLMs) from scratch.
Developed at SKT AI LABS, this corpus is not just a collection of data; it’s a mission to decentralize high-grade AI training for regional languages and global knowledge.

💎 Key Highlights:

•• Massive Scale: Targeting a multi-terabyte architecture for 2T-level tokenization.

•• Pure Quality: Curated from 500+ Elite Sources

•• Structured for MoE: Perfectly sharded into 3.5GB standardized units (SKT-𝕻 series) for seamless distributed training.

🤝 Open for Collaboration!

We are looking for AI researchers, CUDA engineers, and data scientists to join us in this journey of building Project Surya and the ST-X Series models. Whether it's optimization, custom tokenization, or architecture design—let’s build the future together.

Explore the Dataset on Hugging Face:

🔗 https://huggingface.co/datasets/Shrijanagain/SKT-OMNI-CORPUS-146T-V1

DSR -- 🔗 https://huggingface.co/datasets/Shrijanagain/SKT-DSRx10000

#AI #MachineLearning #OpenSource #IndicAI #SKTAILABS #LLM #BigData #HuggingFace #InnovationIndia

jorgemunozl

posted an update 4 months ago

Post

407

Test

I know that it was buggy, OMG

1 reply

julien-c

submitted a paper to Daily Papers 4 months ago

Shaping capabilities with token-level data filtering

Paper • 2601.21571 • Published Jan 29 • 29

eliebak

submitted a paper to Daily Papers 5 months ago

SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations

Paper • 2512.14080 • Published Dec 16, 2025 • 9

leonsick

authored a paper 5 months ago

S2D: Sparse-To-Dense Keymask Distillation for Unsupervised Video Instance Segmentation

Paper • 2512.14440 • Published Dec 16, 2025 • 1

leonsick

submitted a paper to Daily Papers 5 months ago

S2D: Sparse-To-Dense Keymask Distillation for Unsupervised Video Instance Segmentation

Paper • 2512.14440 • Published Dec 16, 2025 • 1

lbourdois

posted an update 7 months ago

Post

1728

New blog post analyzing the top 50 entities with the most downloaded models on @huggingface 🤗!

https://huggingface.co/blog/lbourdois/huggingface-models-stats

The purpose here is to get an idea of the profile of the models with the greatest impact in open source (we are not interested in closed models here!).

32 figures + data

Enjoy 🤗

eliebak

posted an update 9 months ago

Post

4467

Super excited to announce that our research team at Hugging Face will be doing an AMA on reddit r/LocalLLaMA.

Come ask any questions to the team behind SmolLM, FineWeb and more! And who knows, maybe there’ll be a shiny new release to talk about?

Thursday 4th September, 8AM-11AM PST 🤗

science

eliebak

posted an update 9 months ago

Post

767

Motif 2.6B tech report is pretty insane, first time i see a model with differential attention and polynorm trained at scale!

> It's trained on 2.5T of token, with a "data mixture schedule" to continuously adjust the mixture over training.
> They use WSD with a "Simple moving average" averaging the last 6 ckpt every 8B token.
> They trained on Finemath, Fineweb2, DCLM, TxT360.
> Lot of details in the finetuning data they used, for instance they used EvolKit and did some "dataset fusion" to have more compressed knowledge into the data.
> They mention they also tried Normalized GPT, QK-Norm and Cross Layer Attention.

Motif-Technologies/Motif-2.6B

eliebak

posted an update 10 months ago

Post

4837

Kimi K2 tech report is full of gems as always. Here are my notes on it:

> MuonClip: Pretty crazy how after 70k the training stabilizes and the QK-clip is basically inactive. There is also no loss in perf with QK-clip which is not trivial at all (at small scale but with aggressive threshold). Also a cool explanation of why muon makes the logit explode in appendix E (tl;dr is that muon makes the singular value of the update matrix higher)
> Sparsity scaling laws to justify their ratio, they have a very solid training infra that allows the model to be trained at this sparsity level, they could have increased even more but as sparsity increases the training becomes less efficient.
> They diminish the number of attention heads to make it more efficient for long context since attention heads are a big bottleneck for long context. They also remove 2 of the 3 "first dense" layers in the dsv3 arch.

With the sparsity and attention heads (divided by 2) they achieve 83% increased flops compared to deepseek v3 arch at 128k.

> Data: Rephrasing is KEY. They do a lot more synthetic data generation and rephrase their corpus to have different styles, for longer documents they do it by chunk. I'm (half) surprised by the fact that ONLY 1 epoch (assuming same number of training tokens I think?) of data rephrased 10 times has better accuracy than 10 epochs of the same data rephrased once.
> They do rewriting for Math and Knowledge, for Math they apply the ShallowMath recipe and instruct the model to rephrase in a "learning note" style
> They talk about diversity and probably have some internal stuff/eval to test that, as always still a bit unclear for me how to properly measure that.

The infra is also very nice, quick summary:
> PP=16 (1F1B schedule, a bit custom), EP=16, zero1
> No FP8 computation but for storage of specific layers, selective recomputation for inexpensive block, activation offloading to CPU

Lukas431

authored a paper 10 months ago

Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning

Paper • 2507.14137 • Published Jul 18, 2025 • 36

loubnabnl

authored a paper 12 months ago

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5, 2025 • 61

eliebak

authored a paper 12 months ago

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5, 2025 • 61

loubnabnl

posted an update about 1 year ago

Post

7332

SmolVLM is now available on PocketPal — you can run it offline on your smartphone to interpret the world around you. 🌍📱

And check out this real-time camera demo by @ngxson , powered by llama.cpp:
https://github.com/ngxson/smolvlm-realtime-webcam
https://x.com/pocketpal_ai

4 replies

julien-c

posted an update about 1 year ago

Post

10949

BOOOOM: Today I'm dropping TINY AGENTS

the 50 lines of code Agent in Javascript 🔥

I spent the last few weeks working on this, so I hope you will like it.

I've been diving into MCP (Model Context Protocol) to understand what the hype was all about.

It is fairly simple, but still quite powerful: MCP is a standard API to expose sets of Tools that can be hooked to LLMs.

But while doing that, came my second realization:

Once you have a MCP Client, an Agent is literally just a while loop on top of it. 🤯

➡️ read it exclusively on the official HF blog: https://huggingface.co/blog/tiny-agents

1 reply

leonsick

authored a paper about 1 year ago

Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding

Paper • 2504.06719 • Published Apr 9, 2025 • 8

eliebak

authored a paper about 1 year ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 207

AI & ML interests

Recent Activity

Team members 973

HuggingPartyParis's activity