Fastino Labs, Creator of GLiNER, Releases Two State-of-the-Art Language Models 1,000x Smaller Than Frontier Models
The two open-source language models are 1,000x smaller than large models from companies like OpenAI and Anthropic, with higher accuracy.
PALO ALTO, Calif., May 14, 2026 /PRNewswire/ — Today, Fastino Labs released two new open-source small language models, GLiGuard and GLiNER2-PII, both built primarily with an autonomous agent, Pioneer. The models contain 300 million parameters, run inference in under 100 milliseconds, and exceed the accuracy of decoder models from OpenAI, NVIDIA, Meta, and Google that are up to 90 times larger. GLiGuard runs up to 20 times faster than current state-of-the-art guardrail models, while GLiNER2-PII achieves the highest span-level F1 of any publicly available PII model across seven languages and 42 entity types.
GLiNER2-PII achieves the highest accuracy of any publicly available PII model on the SPY benchmark
The releases come as enterprise AI deployments increasingly require dedicated infrastructure for safety moderation and privacy filtering. As agents gain the ability to browse the web, execute code, and act on a user’s behalf, the cost of unsafe LLM inputs and outputs or PII leakage has grown substantially.
GLiGuard is a 300 million parameter encoder model that performs four safety moderation tasks in a single forward pass: safety classification, jailbreak detection, harm category detection, and refusal detection. Across nine established safety benchmarks, GLiGuard's accuracy matches or exceeds that of decoder-based models 23 to 90 times its size, including Meta's Llama Guard 4 (12B), Google's ShieldGemma (27B), and NVIDIA's NemoGuard (8B), while running up to 20 times faster.
“Current state-of-the-art guardrail models are doing safety moderation with 7 to 27 billion parameter decoder models. They’re using text generation to solve what is fundamentally a classification problem, which is slow, expensive, and impractical at production scale,” said Ash Lewis, CEO and Co-Founder of Fastino Labs.
GLiNER2-PII: Best-in-Class PII Detection
GLiNER2-PII is a 300 million parameter multilingual model for detecting and redacting personally identifiable information across 42 entity types and seven languages. On the SPY benchmark, GLiNER2-PII achieved the highest span-level F1 of any publicly available PII model, outperforming OpenAI’s recently released Privacy Filter, NVIDIA’s GLiNER PII, and two other leading detectors.
Unlike OpenAI’s Privacy Filter, which repurposes a 1.5 billion parameter decoder checkpoint and locks developers into a fixed schema of 8 entity types, GLiNER2-PII is label-conditioned, meaning the target schema is an input to the model rather than a property baked into its weights. This lets the same checkpoint serve any organization’s PII policy without retraining: broad masking for analytics pipelines or fine-grained redaction for compliance audits, all from one model.
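The label-conditioned pattern described above can be illustrated with a minimal sketch. This is not the GLiNER2-PII API; it is a toy stand-in (regexes in place of a learned model) that shows only the interface idea the release describes: the entity schema is passed in at call time, so one artifact serves any redaction policy.

```python
import re

# Toy stand-ins for learned entity recognition; a real model generalizes
# far beyond fixed patterns. Names and patterns here are illustrative only.
DETECTORS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.\w+"),
    "phone": re.compile(r"\+?\d[\d\- ]{7,}\d"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def detect_pii(text, labels):
    """Return (label, span) pairs for only the labels requested.

    The schema is an argument, not a property of the 'model', so the
    same detector serves a broad analytics policy or a narrow audit
    policy without any retraining."""
    spans = []
    for label in labels:
        pattern = DETECTORS.get(label)
        if pattern is None:
            continue  # unknown label in this toy sketch
        for match in pattern.finditer(text):
            spans.append((label, match.group()))
    return spans

text = "Contact jane@example.com or +1 415 555 0100."
print(detect_pii(text, ["email", "phone"]))  # broad masking policy
print(detect_pii(text, ["email"]))           # narrow compliance policy
```

The same call site switches policies by changing the label list alone, which is the property that lets one checkpoint serve many organizations.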
“Developers building agents today need models that are faster and more deterministic than what frontier decoder models can offer,” said George Hurn-Maloney, COO and Co-Founder of Fastino Labs. “When a guardrail or PII model gets called on every input and every output, latency compounds quickly and probabilistic behavior becomes a real liability. GLiGuard and GLiNER2-PII give developers sub-100ms inference with deterministic outputs, exactly what production agentic systems need.”
Pioneer: The Autonomous Research Agent Behind Both Models
Pioneer, Fastino Labs’ autonomous research agent, played a central role in pushing both models past the accuracy of much larger alternatives. Pioneer synthesized targeted training data, ran parallel post-training experiments, and iterated on real and synthetic dataset composition without human intervention.
For GLiGuard, Pioneer generated supplemental synthetic data targeting fine-grained distinctions between similar harm categories like toxic speech and violence, which the model initially struggled to separate. For GLiNER2-PII, Pioneer produced 4,910 high-quality annotated examples across seven languages and document formats including chat logs, support tickets, CRM notes, KYC forms, invoices, and medical records.
“What used to take our research team months of manual experimentation now takes hours,” said Lewis. “Pioneer ran dozens of training experiments autonomously, meaning the accuracy you see in GLiGuard and GLiNER2-PII came out of an agentic research process, not a traditional one.”
Pioneer was introduced in a recent Fastino Labs research paper, which demonstrated gains of up to 83.8 percentage points on standard benchmarks across cold-start fine-tuning and production failure repair. Both GLiGuard and GLiNER2-PII are flagship examples of agentic post-training in practice: research-grade models developed in days rather than months, with model quality driven by an autonomous loop rather than manual experimentation.
Why Small Models Matter for Agentic AI
Both models reflect Fastino Labs’ thesis that small, highly accurate language models will power the next wave of production AI deployments. Guardrail and privacy models are called on every user input and every model output, meaning even small latency increases compound quickly as conversations grow.
“Every Fortune 500 deploying agents today is building their own internal guardrail and PII infrastructure,” added Hurn-Maloney. “We’re open-sourcing two of the best models in the world for these tasks because the entire industry benefits when this layer becomes a commodity.”
Availability
Both GLiGuard and GLiNER2-PII are available today on Hugging Face under the Apache 2.0 license and for inference on Pioneer, Fastino Labs’ agentic inference platform. The accompanying research papers are available on arXiv.
About Fastino Labs
Fastino Labs is a research lab based in Palo Alto, California, building small language models and tooling, such as Pioneer, for running inference on and fine-tuning language models. The company is the creator of Pioneer, an agent that improves the accuracy of LLMs in production over time, and of the GLiNER open-source model family, which has been downloaded more than 30 million times and is used in production by Fortune 500 teams including NVIDIA, Meta, and Airbnb. Fastino Labs has raised $25 million through its seed round and is backed by investors including Khosla Ventures, Insight Partners, and Microsoft M12.
About Pioneer
Pioneer is an inference API from Fastino Labs that gives developers access to 30+ leading open-source and frontier models, including Anthropic’s Opus, GPT, Gemma, Nemotron, and DeepSeek. Pioneer continuously improves these models using real production traffic: it watches live requests, identifies where a model is failing, and retrains and promotes new checkpoints when they outperform the current one, with no ML engineer required and no fine-tuning code to write. Customers see an average 30% accuracy lift on agentic tasks such as classification and extraction versus base open-source models, with the first auto-improvement run typically landing in production within days. Learn more at pioneer.ai.