AI & ML interests

Connecting individuals with innovation: Emancipating and Truly Federalizing Private Intelligence

Recent Activity

Sri-Vigneshwar-DJ posted an update about 15 hours ago
🦅 Introducing Hawky AI H1 Mini 4B: A Domain-Specific Model for Performance Marketing

Hey Hugging Face community! 👋

We're excited to share our first open-source release: **Hawky AI H1 Mini 4B Experimental** - a Gemma 3 4B model fine-tuned specifically for Meta advertising and performance marketing strategy.

🎯 Why We Built This

At [Hawky.ai](https://hawky.ai), we build AI-powered creative intelligence tools for performance marketers. We work with major agencies (WPP, Madison, GroupM) and brands (TVS Motors, Tanishq, Bajaj Finserv) on campaign optimization.

We wanted to explore: Can a small, domain-specific model provide expert-level guidance on performance marketing?

Specifically, we focused on Meta's Andromeda algorithm - the AI system that now powers ad delivery across Facebook and Instagram. Understanding Andromeda is crucial for modern media buying, but the knowledge is scattered and constantly evolving.

🧠 What Makes This Different

Chain-of-Thought Reasoning
The model doesn't just answer - it **thinks through problems** step by step:

Sri-Vigneshwar-DJ/hawky-ai-h1-mini-4b-experimental
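A minimal sketch for trying it with transformers - only the repo id comes from this post; the prompt and generation settings are illustrative:

```python
# Minimal sketch with transformers. Only the repo id comes from this post;
# the prompt and generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Sri-Vigneshwar-DJ/hawky-ai-h1-mini-4b-experimental"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

messages = [{"role": "user", "content": "My Meta CPA doubled after a creative refresh. How should I restructure the campaign?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```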
Sri-Vigneshwar-DJ posted an update 5 days ago
Domain-specific reasoning is crucial when working with big-budget campaigns on Meta. That's why we've launched an experimental Chain-of-Thought (CoT) reasoning model for critical thinking, tailored to campaign structuring and optimization under Meta's Andromeda algorithm.

Sri-Vigneshwar-DJ/hawky-ai-h1-mini-1b-experimental
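A quick smoke test of the 1B variant with transformers (the prompt is illustrative):

```python
# Quick smoke test of the 1B variant; the prompt is illustrative.
from transformers import pipeline

gen = pipeline("text-generation", model="Sri-Vigneshwar-DJ/hawky-ai-h1-mini-1b-experimental")
out = gen("Walk me through structuring a Meta campaign under Andromeda.", max_new_tokens=256)
print(out[0]["generated_text"])
```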
mitkox posted an update 6 days ago
I just stress-tested the Beast: MiniMax-M2.1 on Z8 Fury G5.
2101 tokens/sec. FORTY concurrent clients. That's 609 t/s out, 1492 t/s in. The model outputs fire faster than I can type, but feeds on data like a black hole on cheat day.
But wait, there's more! Threw it into Claude Code torture testing with 60+ tools, 8 agents (7 sub-agents because apparently one wasn't enough chaos). It didn't even flinch. Extremely fast, scary good at coding. The kind of performance that makes you wonder if the model's been secretly reading Stack Overflow in its spare time lol
3 months ago, these numbers lived in my "maybe in 2030" dreams. Today it's running on my desk AND heats my home office during the winter!
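For the curious, a toy version of that concurrency test - it assumes an OpenAI-compatible endpoint at localhost:8000 and a served model name of "MiniMax-M2.1", both placeholders:

```python
# Toy version of the 40-client stress test. Assumes an OpenAI-compatible
# server at localhost:8000 serving "MiniMax-M2.1" - both are placeholders.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="none")

async def one_client(i: int) -> int:
    resp = await client.chat.completions.create(
        model="MiniMax-M2.1",
        messages=[{"role": "user", "content": f"Client {i}: write a haiku about GPUs."}],
    )
    return resp.usage.completion_tokens

async def main() -> None:
    counts = await asyncio.gather(*(one_client(i) for i in range(40)))
    print(f"{sum(counts)} completion tokens across {len(counts)} concurrent clients")

asyncio.run(main())
```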
Sri-Vigneshwar-DJ posted an update 7 days ago
The recent update to Meta's ad algorithm is very difficult to crack, and even the latest models struggle to keep up with it. To address this, we've created a small experimental dataset for fine-tuning models to better tackle Meta's Andromeda algorithm: Sri-Vigneshwar-DJ/hawky-ai-andromeda-dataset
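A quick way to peek at it (the "train" split and its columns below are guesses, not documented here):

```python
# Peek at the dataset; the "train" split and its columns are assumptions.
from datasets import load_dataset

ds = load_dataset("Sri-Vigneshwar-DJ/hawky-ai-andromeda-dataset")
print(ds)              # lists the actual splits and columns
print(ds["train"][0])  # first example, assuming a "train" split exists
```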
Sri-Vigneshwar-DJ posted an update 11 days ago
csabakecskemeti posted an update 13 days ago
Just sharing a result of a homelab infrastructure experiment:

I've managed to set up a distributed inference infra at home using a DGX Spark (128GB unified LPDDR5X) and a Linux workstation with an RTX 6000 Pro (96GB GDDR7), connected via 100Gbps RoCEv2. The model I used (https://lnkd.in/gx6J7YuB) is about 140GB, so it could not fit on either GPU alone. Full setup and tutorial soon on devquasar.com



Screen recording:
https://lnkd.in/gKM9H5GJ
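Until the full tutorial is up, here's a rough sketch of one way to split a model of that size across two boxes - vLLM pipeline parallelism over a Ray cluster is an assumption on my part (the actual stack isn't stated above), and the model id is a placeholder:

```python
# Hypothetical sketch: vLLM pipeline parallelism across two machines joined
# into a Ray cluster (`ray start` on both, traffic over the RoCEv2 link).
# The model id is a placeholder - the actual model is behind the short link.
from vllm import LLM, SamplingParams

llm = LLM(
    model="some-org/some-140GB-model",   # placeholder
    pipeline_parallel_size=2,            # one pipeline stage per machine
    distributed_executor_backend="ray",
)
print(llm.generate(["Hello"], SamplingParams(max_tokens=16))[0].outputs[0].text)
```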
immutable-reality: #2 opened 19 days ago by upgraedd
mitkox posted an update about 1 month ago
Got to 1199.8 tokens/sec with Devstral Small 2 on my desktop GPU workstation, on vLLM nightly.
Works out of the box with Mistral Vibe. Next it's time to test the big one.
takarajordan posted an update about 1 month ago
csabakecskemeti posted an update about 1 month ago
Looking for some help testing an INT8 DeepSeek 3.2:
SGLang supports channel-wise INT8 quants on CPUs with AMX instructions (Xeon 5 and above, AFAIK):
https://lmsys.org/blog/2025-07-14-intel-xeon-optimization/

Currently uploading an INT8 version of DeepSeek 3.2 Speciale:
DevQuasar/deepseek-ai.DeepSeek-V3.2-Speciale-Channel-INT8

I cannot test this myself since I'm on AMD:
"AssertionError: W8A8Int8LinearMethod on CPU requires that CPU has AMX support"
(I assumed it could fall back to some non-optimized kernel, but apparently not.)

If anyone with the required resources (Intel Xeon 5/6 + ~768GB-1TB RAM) can help test this, that would be awesome.

If you have hints on how to make this work on an AMD Threadripper 7000 Pro series, please guide me.

Thanks all!
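For would-be testers, an untested launch sketch - the flag names follow the linked LMSYS post from memory and may need checking against current SGLang docs:

```python
# Untested launch sketch for an AMX-capable Xeon. Flags are from memory of
# the LMSYS post above and may need adjusting against SGLang's docs.
import subprocess

subprocess.run([
    "python", "-m", "sglang.launch_server",
    "--model-path", "DevQuasar/deepseek-ai.DeepSeek-V3.2-Speciale-Channel-INT8",
    "--device", "cpu",
    "--quantization", "w8a8_int8",  # the W8A8Int8LinearMethod from the error above
])
```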
takarajordan posted an update about 1 month ago
Two weeks ago I had an engaging discussion with locals in Cockermouth about AI and the broader industry, a reminder that hearing candid perspectives beyond our professional circles is invaluable and something anyone working full-time in this field should make time for.

Thank you!
mitkox posted an update about 2 months ago
I run 20 AI coding agents locally on my desktop workstation at 400+ tokens/sec with MiniMax-M2. It's a Sonnet drop-in replacement in my Cursor, Claude Code, Droid, Kilo, and Cline setups: peaks at 11k tok/sec input and 433 tok/sec output, and can generate 1B+ tokens/month. All with a 196k context window. I've been running it with this config for 6 days now.

Today's max performance was stable at 490.2 tokens/sec across 48 concurrent clients with MiniMax M2.

Z8 Fury G5, Xeon 3455, 4xA6K. AIBrix 0.5.0, vLLM 0.11.2.
csabakecskemeti posted an update about 2 months ago
Recently there has been so much activity around token-efficient formats that I've built a package of my own (inspired by TOON).

Deep-TOON

My goal was to handle JSON structures with complex embeddings in a token-efficient way.

So this is what I built over the weekend. Feel free to try it:

https://pypi.org/project/deep-toon/0.1.0/

takarajordan 
posted an update about 2 months ago
view post
Post
264
🌞 LOVABLE IS CRACKED

Built a golden hour tracker in under 15 minutes with Lovable: it uses your phone's Geolocation API and the SunCalc library, and runs fully client-side with no servers. https://goldenhour.404missing.link
mitkox posted an update 2 months ago
I just threw Qwen3-0.6B in BF16 into an on-device AI drag race on AMD Strix Halo with vLLM:

564 tokens/sec on short 100-token sprints
96 tokens/sec on 8K-token marathons

TL;DR You don't just run AI on AMD. You negotiate with it.

The hardware absolutely delivers. Spoiler alert: there is exactly ONE configuration where vLLM + ROCm + Triton + PyTorch + drivers + Ubuntu kernel all work at the same time. Finding it required the patience of a saint.

Consumer AMD for AI inference is the ultimate "budget warrior" play: insane performance-per-euro, but you need hardcore technical skills that would make a senior sysadmin nod in quiet respect.
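If you want to reproduce the numbers, a rough throughput probe of the kind described above - the model name matches the post, everything else (prompt, batch size, sampling) is an illustrative assumption, and you'd need a working ROCm build of vLLM first:

```python
# Rough throughput probe; model name is from the post, prompt/batch/sampling
# are illustrative, and this presumes a working ROCm build of vLLM.
import time
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3-0.6B", dtype="bfloat16")
params = SamplingParams(max_tokens=100)  # the "short 100-token sprint" case

start = time.perf_counter()
outputs = llm.generate(["Explain RoCE in one paragraph."] * 8, params)
elapsed = time.perf_counter() - start

generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated / elapsed:.1f} tokens/sec")
```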
mitkox posted an update 2 months ago
I have just vibe-coded a feature for ODA on-device AI with MiniMax M2, running locally on my Z8 Fury - and holy silicon, this thing SLAPS!
TL;DR, the nerd stuff:

Specialized in coding and agentic work
60 tokens/sec
Ryzen AI is getting some serious ROCm 7.0.2 brain implants
One extra script to rule them all and bind them to my GPU
Vibe-coded feature implementation that actually worked on the first try. I know, I'm scared too.