The design of Madhuram centers on the minimum architectural complexity required for high-fidelity reasoning. This guide details the foundational choices that allow the model to operate efficiently across diverse environments.
- Parameters: 150M
- Context Window: 4,096 tokens
- Training Tokens: 275B
- Architecture: Perseus
Madhuram-v0.5 is optimized for a minimal memory footprint while delivering competitive performance. With 150 million parameters trained on 275 billion tokens, the model is designed to run natively on edge devices without relying on aggressive quantization.
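As a rough check of the edge-deployment claim, the raw weight memory of a 150M-parameter model can be estimated directly from parameter count and numeric precision. This is a back-of-envelope sketch only; it ignores activations, the KV cache, and runtime overhead:

```python
PARAMS = 150_000_000  # Madhuram-v0.5 parameter count

def weight_memory_mib(params: int, bytes_per_param: int) -> float:
    """Raw memory needed to hold the weights alone, in MiB."""
    return params * bytes_per_param / 1024**2

for dtype, size in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    print(f"{dtype}: {weight_memory_mib(PARAMS, size):.0f} MiB")
```

At fp16 the weights occupy roughly 286 MiB, which is why a model of this size can run unquantized on typical edge hardware.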
A primary feature of Madhuram is its 4,096-token context window, supported by the Perseus architecture, which encodes dense structural information in its positional embeddings to capture long-range dependencies efficiently. This enables precise retrieval across the full context length, so the model maintains coherence and reference accuracy even in extended sequences.
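The exact positional-embedding scheme used by Perseus is not detailed here. As a generic illustration of how positional embeddings can encode long-range structure (not the Perseus implementation), the sketch below applies rotary embeddings (RoPE), whose key property is that a query–key score depends only on the relative distance between the two positions:

```python
import numpy as np

def rotary_embed(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
    """Rotate feature pairs of x by position-dependent angles (RoPE-style)."""
    half = x.shape[-1] // 2
    freqs = base ** (-np.arange(half) / half)  # one rotation frequency per pair
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:half], x[half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos])

rng = np.random.default_rng(0)
q, k = rng.standard_normal(64), rng.standard_normal(64)

# The score depends only on the positional offset (here, 2 tokens apart),
# whether the pair sits at the start or the end of a 4,096-token window.
near = rotary_embed(q, 5) @ rotary_embed(k, 3)
far = rotary_embed(q, 4005) @ rotary_embed(k, 4003)
print(np.isclose(near, far))
```

This relative-position property is one common way small models keep retrieval precise across a full context window without storing absolute-position tables.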
Evaluated on standard NLP benchmarks, Madhuram-v0.5 demonstrates competitive performance against larger models in the same class. The following results reflect zero-shot performance across commonsense reasoning and language understanding tasks.
| Model | ARC-C | ARC-E | HellaSwag | OBQA | PIQA | WinoGrande | BoolQ | SIQA | Average |
|---|---|---|---|---|---|---|---|---|---|
| Madhuram-v0.5 | 31.40 | **58.96** | **45.94** | 34.00 | **70.84** | 52.17 | 56.57 | 38.54 | 48.55 |
| SmolLM2-135M | **34.50** | 58.90 | 43.60 | **41.10** | 68.90 | 52.80 | **60.50** | **43.50** | **50.48** |
| Gemma-3-270M-pt | 31.50 | 57.50 | 41.40 | 34.60 | 68.50 | **53.90** | 56.50 | 43.10 | 48.38 |
| MobileLLM125M-LS | 28.70 | 45.80 | 39.50 | **41.10** | 65.70 | 52.10 | 60.40 | 42.90 | 47.03 |
| MobileLLM-R1-140M | 32.50 | 47.30 | 32.90 | 31.50 | 62.50 | 51.00 | 57.20 | 42.60 | 44.69 |

Benchmarks evaluated using lm-evaluation-harness (zero-shot). The best score in each column is shown in bold.
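The Average column is the unweighted mean of the eight per-task scores. As a quick sanity check, the means can be recomputed from the table values directly:

```python
# Per-task scores copied from the benchmark table above
rows = {
    "Madhuram-v0.5":     [31.40, 58.96, 45.94, 34.00, 70.84, 52.17, 56.57, 38.54],
    "SmolLM2-135M":      [34.50, 58.90, 43.60, 41.10, 68.90, 52.80, 60.50, 43.50],
    "Gemma-3-270M-pt":   [31.50, 57.50, 41.40, 34.60, 68.50, 53.90, 56.50, 43.10],
    "MobileLLM125M-LS":  [28.70, 45.80, 39.50, 41.10, 65.70, 52.10, 60.40, 42.90],
    "MobileLLM-R1-140M": [32.50, 47.30, 32.90, 31.50, 62.50, 51.00, 57.20, 42.60],
}

# Recompute the Average column as the unweighted mean of the eight tasks
averages = {model: sum(s) / len(s) for model, s in rows.items()}
for model, avg in averages.items():
    print(f"{model}: {avg:.2f}")
```

Note that the unweighted mean treats each task equally regardless of its difficulty or score range, which is the usual convention in small-model comparisons.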
The model serves as a versatile baseline for distributed intelligence, targeting environments where low latency and privacy are paramount.
We continue to evaluate the model's performance on specialized benchmarks and expert reasoning tasks. For inquiries regarding specific evaluation results or research collaborations, please contact the lab.