Fastino Labs, Creator of GLiNER, Releases Two State-of-the-Art Language Models 1,000x Smaller Than Frontier

The two open-source language models are 1000x smaller than large models from companies like OpenAI and Anthropic, with higher accuracy. PALO ALTO, Calif., May 14, 2026 /PRNewswire/ —ย Today, Fastino Labs released two new open-source small language models, GLiGuard and GLiNER2-PII, both built primarily with an autonomous agent, Pioneer. The models contain 300 million parameters, run…


Fastino Labs, Creator of GLiNER, Releases Two State-of-the-Art Language Models 1,000x Smaller Than Frontier

The two open-source language models are 1000x smaller than large models from companies like OpenAI and Anthropic, with higher accuracy.

PALO ALTO, Calif., May 14, 2026 /PRNewswire/ —ย Today, Fastino Labs released two new open-source small language models, GLiGuard and GLiNER2-PII, both built primarily with an autonomous agent, Pioneer. The models contain 300 million parameters, run inference in under 100 milliseconds, and outperform the accuracy of decoder models from OpenAI, NVIDIA, Meta, and Google that are up to 90 times larger. GLiGuard runs up to 20 times faster than current state-of-the-art guardrail models, while GLiNER2-PII achieves the highest span-level F1 of any publicly available PII model across seven languages and 42 entity types.

GLiNER2-PII achieves the highest accuracy of any publicly available PII model on the SPY benchmark
GLiNER2-PII achieves the highest accuracy of any publicly available PII model on the SPY benchmark

The releases come as enterprise AI deployments increasingly require dedicated infrastructure for safety moderation and privacy filtering. As agents gain the ability to browse the web, execute code, and act on a user’s behalf, the cost of unsafe LLM inputs and outputs or PII leakage has grown substantially.

GLiGuard: State-of-the-Art LLM Guardrails, 20x Faster

GLiGuard is a 300 million parameter encoder model that performs four safety moderation tasks in a single forward pass: safety classification, jailbreak detection, harm category detection, and refusal detection. Across nine established safety benchmarks, GLiGuard’s accuracy matches or exceeds decoder-based models 23 to 90 times its size, including Meta’s LlamaGuard4 (12B), Google’s ShieldGemma (27B), and NVIDIA’s NemoGuard (8B), while running up to 20 times faster.

“Current state-of-the-art guardrail models are doing safety moderation with 7 to 27 billion parameter decoder models. They’re using text generation to solve what is fundamentally a classification problem, which is slow, expensive, and impractical at production scale.” said Ash Lewis, CEO and Co-Founder of Fastino Labs.

GLiNER2-PII: Best-in-Classย PII Detection

GLiNER2-PII is a 300 million parameter multilingual model for detecting and redacting personally identifiable information across 42 entity types and seven languages. On the SPY benchmark, GLiNER2-PII achieved the highest span-level F1 of any publicly available PII model, outperforming OpenAI’s recently released Privacy Filter, NVIDIA’s GLiNER PII, and two other leading detectors.

Unlike OpenAI’s Privacy Filter, which repurposes a 1.5 billion parameter decoder checkpoint and locks developers into a fixed schema of 8 entity types, GLiNER2-PII is label-conditioned, meaning the target schema is an input to the model rather than a property baked into its weights. This lets the same checkpoint serve any organization’s PII policy without retraining, broad masking for analytics pipelines or fine-grained redaction for compliance audits, all from one model.

Source link