
Tether launches on-device medical AI that outperforms Google’s models in benchmark tests


Tether’s AI Research Group has released QVAC MedPsy-1.7B and MedPsy-4B, specialized text-only medical language models built to run directly on low-power devices such as smartphones and wearables.

According to the team, the models outperform larger medical AI systems, including Google's, on a range of standard benchmarks, and perform comparably to far bigger systems on medical reasoning and knowledge tasks while keeping all execution local and private.

Traditional AI systems in healthcare rely on large cloud-hosted models, requiring sensitive data like patient records and diagnostic inputs to be transmitted to external servers, creating privacy and compliance risks. This architecture is increasingly under pressure as the healthcare AI sector is projected to grow from roughly $36 billion today to potentially over $500 billion by 2033.

Tether’s team says QVAC MedPsy challenges the scaling paradigm by focusing on efficiency.

The smartphone-friendly 1.7B model scored an average of 62.62 across seven standard medical benchmarks, beating Google's MedGemma-1.5-4B-it by more than 11 points despite being less than half its size, according to the researchers. It also outperformed MedGemma 27B on real-world clinical tasks such as HealthBench Hard.

The 4B model scored 70.54 on the same tests, surpassing MedGemma-27B, a model nearly seven times its size, and delivered strong performance on HealthBench, HealthBench Hard, and MedXpertQA.

The results span eight benchmark sets, including MedQA, MedMCQA, MMLU Health, PubMedQA, AfriMedQA, MedXpertQA, and HealthBench. The team credits a staged medical training pipeline that combines supervised fine-tuning, curated clinical reasoning data, and reinforcement learning.

“With QVAC MedPsy, our focus was improving efficiency at the model level, rather than scaling up size,” Tether CEO Paolo Ardoino commented on the release.

The researchers note that the models are practical as well as capable. They respond quickly with concise yet complete answers, saving time and battery life, and they ship in compressed formats that fit comfortably on mobile devices with little loss in quality.

On output efficiency, the 4B model generates responses averaging roughly 909 tokens, compared with about 2,953 for comparable systems, a 3.2x reduction. The 1.7B model averages around 1,110 tokens versus 1,901, a 1.7x cut.

Both models are being released in quantized GGUF format, with compressed versions weighing approximately 1.2 GB and 2.6 GB respectively.
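For readers who want to experiment, the sketch below shows one way a quantized GGUF model of this size could be run locally with the open-source llama-cpp-python library. The file name, prompt, and settings are illustrative assumptions, not official Tether artifacts.

# Minimal on-device inference sketch using llama-cpp-python
# (pip install llama-cpp-python). The GGUF file name is a placeholder;
# substitute the actual file from Tether's Hugging Face release.
from llama_cpp import Llama

llm = Llama(
    model_path="qvac-medpsy-1.7b-q4.gguf",  # hypothetical ~1.2 GB quantized file
    n_ctx=4096,     # context window
    n_threads=4,    # tune to the target device's CPU
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a careful medical assistant."},
        {"role": "user", "content": "List common early symptoms of iron-deficiency anemia."},
    ],
    max_tokens=256,   # short, complete answers keep latency and battery use down
    temperature=0.2,
)
print(response["choices"][0]["message"]["content"])

Everything in this sketch runs on the local CPU; once the model file is on disk, no network call is made, which is the privacy property the release emphasizes.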

“That combination matters because it directly reduces compute requirements, latency, and cost. It allows the model to run locally on standard hardware instead of relying on remote infrastructure,” Ardoino added. “In healthcare, that changes the constraints entirely; you can run medical reasoning where the data already exists, inside a hospital system or on a device, without moving sensitive information through the cloud or waiting on external processing.”

The models are now available for free under an open license on Hugging Face.
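As a companion to the inference sketch above, here is a minimal way such a release could be fetched from Hugging Face with the huggingface_hub client. The repository and file names are assumptions for illustration; check the official model page for the real identifiers.

# Download sketch (pip install huggingface_hub). Names are hypothetical.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="tetherto/QVAC-MedPsy-1.7B",   # hypothetical repository id
    filename="qvac-medpsy-1.7b-q4.gguf",   # hypothetical quantized file (~1.2 GB)
)
print("Model saved to", local_path)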

Disclosure: This article was edited by Vivian Nguyen. For more information on how we create and review content, see our Editorial Policy.


