Transformers

07 May

Mastering Classification Metrics: A Deep Dive from F1-Score to AUC-ROC

Accuracy lies. Learn why F1-score and AUC-ROC give a true picture of AI model performance—especially in imbalanced datasets.
18 min read
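
Below is a minimal scikit-learn illustration of the point this post makes: on a 95/5 imbalanced set, an "always predict the majority class" baseline looks excellent by accuracy yet scores zero F1 and a chance-level AUC. The class ratio and baseline are illustrative, not the post's dataset.

```python
# Toy demonstration: accuracy flatters a useless majority-class baseline,
# while F1 and AUC-ROC expose it.
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

y_true = [0] * 95 + [1] * 5            # 95% negatives, 5% positives
y_pred = [0] * 100                     # baseline: always predict negative
y_score = [0.0] * 100                  # constant scores: no ranking signal

print(accuracy_score(y_true, y_pred))             # 0.95, looks great
print(f1_score(y_true, y_pred, zero_division=0))  # 0.0, reveals the failure
print(roc_auc_score(y_true, y_score))             # 0.5, chance level
```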
01 May

Diving Deeper: Inside the Transformer Layer

"A clear breakdown of Transformer layers—LayerNorm, Attention, FeedForward, and Residuals—explained step-by-step with visuals."
9 min read
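
As a companion to the breakdown above, here is a minimal pre-norm Transformer block in PyTorch showing how LayerNorm, attention, the feed-forward network, and residual connections compose. The dimensions are illustrative defaults, not the post's configuration.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x):
        # Residual connection around self-attention (pre-norm variant).
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out
        # Residual connection around the feed-forward network.
        return x + self.ff(self.norm2(x))

x = torch.randn(2, 16, 512)            # (batch, sequence, d_model)
print(TransformerBlock()(x).shape)     # torch.Size([2, 16, 512])
```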
30 Apr

Understanding Transformers Intuitively: From Non-Linearity to Attention

Discover how neural networks evolve from simple activations to powerful attention mechanisms in transformers—explained intuitively, without complex math.
29 min read
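
For the attention piece specifically, the core mechanism fits in a few lines of NumPy: a softmax over scaled dot products decides how much each token "looks at" every other token. A sketch for intuition only; the shapes and data here are made up.

```python
import numpy as np

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # query-key similarity
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)     # row-wise softmax
    return weights @ V                            # weighted mix of values

rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))               # 4 tokens, dimension 8
print(attention(Q, K, V).shape)                   # (4, 8)
```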
13 Apr

The Emotion Illusion: Why Language in AI Matters

When a leading AI scientist like Yann LeCun says, "AI systems will have emotions," it sounds like science fiction — or a warning. But the truth is far more complicated, and far more important to understand.
15 min read
13 Apr

Extending Pretrained Transformers with Domain-Specific Vocabulary: A Hugging Face Walkthrough

Learn how to safely extend Hugging Face tokenizers with domain-specific vocabulary, resize model embeddings, and preserve compatibility for fine-tuning without retraining from scratch.
24 min read
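
The gist of the walkthrough, as a minimal sketch against the Hugging Face transformers API; the checkpoint and example tokens here are placeholders, not the post's choices.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Add domain-specific vocabulary; returns the number of genuinely new tokens.
num_added = tokenizer.add_tokens(["immunohistochemistry", "qubit"])

# Grow the embedding matrix so the new token ids have rows. Existing
# embeddings are preserved; the new rows are freshly initialized and
# learn their values during fine-tuning.
model.resize_token_embeddings(len(tokenizer))

print(num_added, len(tokenizer))
```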
10 Apr

Understanding Machine Learning Pipelines: From Data to Deployment

“No matter how advanced your model or pipeline, it’s only as good as the truth it learns from. Ground truth isn’t just the start — it’s the standard that guides the entire machine learning journey.”
10 min read
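
As a concrete anchor for the pipeline idea, here is a minimal scikit-learn Pipeline that chains preprocessing and a model so both are fit on training data only; the dataset and steps are illustrative, not the post's example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One object from raw features to predictions: scaling is learned on the
# training split only, so no information leaks from the test set.
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])
pipe.fit(X_train, y_train)
print(pipe.score(X_test, y_test))
```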
04 Apr

Understanding GANs: How Machines Learn to Create

“The Discriminator knows the domain. The Generator starts with nothing. It generates gibberish, gets rejected, and slowly adapts — until it produces something so good, the Discriminator can't tell anymore.”
37 min read
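
The quoted dynamic in code: a toy PyTorch adversarial loop in which a Generator learns to mimic simple 1-D Gaussian "domain" data by fooling a Discriminator. A sketch under toy assumptions, not the post's implementation.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0        # "domain" data: N(3, 0.5)
    fake = G(torch.randn(64, 8))                 # starts as gibberish

    # Discriminator: learn to tell real from generated.
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: adapt until the Discriminator is fooled.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Generated samples should drift toward mean ~3.0, std ~0.5.
print(fake.mean().item(), fake.std().item())
```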
26 Mar

Building a Cost-Efficient AI Query Router: From Fuzzy Logic to Quantized BERT

We built a BERT-powered router to classify query complexity and smartly route to LLMs like GPT-4 or Mistral—balancing cost, speed, and accuracy with ONNX quantization.
13 min read
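
Schematically, the router reduces to scoring a query's complexity and picking a model tier. In this sketch, score_complexity is a hypothetical stub standing in for the post's quantized BERT (ONNX) classifier, and the model names and thresholds are illustrative, not the post's settings.

```python
def score_complexity(query: str) -> float:
    # Placeholder heuristic; the post uses a quantized BERT classifier here.
    return min(len(query.split()) / 50.0, 1.0)

def route(query: str) -> str:
    score = score_complexity(query)
    if score < 0.3:
        return "mistral-small"    # cheap, fast tier for simple queries
    elif score < 0.7:
        return "mistral-large"    # mid tier
    return "gpt-4"                # costly tier reserved for hard queries

print(route("What is 2 + 2?"))    # -> mistral-small
```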
13 Mar

The Belief State Transformer (BST): A Leap Beyond Next-Token Prediction

The Belief State Transformer (BST) enhances AI text generation by encoding both past and future context, ensuring coherence in long-form content. Unlike traditional models that predict words based only on past tokens, BST constructs a global belief state using bidirectional reasoning.
4 min read
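
In rough notation (a sketch of the formulation as we understand it, not quoted from the paper): with a forward encoder $F$ over a prefix $x_{1:t}$ and a backward encoder $B$ over a suffix $x_{s:n}$, the model is trained to jointly predict the token after the prefix and the token before the suffix,

$$\max_\theta \; \log p_\theta\!\left(x_{t+1},\, x_{s-1} \,\middle|\, F(x_{1:t}),\, B(x_{s:n})\right),$$

so the learned state must carry beliefs about both directions of the text at once.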
05 Mar

Explanation of Distillation

Distillation in the context of machine learning, particularly as used by companies like DeepSeek or others working with large-scale models, refers to training a compact "student" model to reproduce the behavior of a much larger "teacher" model, retaining most of its capability at a fraction of the size and cost.
2 min read
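
A minimal sketch of the standard soft-label distillation loss (Hinton et al.) in PyTorch; the temperature and mixing weight are illustrative defaults, not anyone's published settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: match the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s = torch.randn(4, 10)                 # student logits (batch=4, classes=10)
t = torch.randn(4, 10)                 # teacher logits
y = torch.randint(0, 10, (4,))
print(distillation_loss(s, t, y))
```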