• About Us
  • Privacy Policy
  • Disclaimer
  • Contact Us
AimactGrow
  • Home
  • Technology
  • AI
  • SEO
  • Coding
  • Gaming
  • Cybersecurity
  • Digital marketing
No Result
View All Result
  • Home
  • Technology
  • AI
  • SEO
  • Coding
  • Gaming
  • Cybersecurity
  • Digital marketing
No Result
View All Result
AimactGrow
No Result
View All Result

Liquid AI Launched LFM2-Audio-1.5B: An Finish-to-Finish Audio Basis Mannequin with Sub-100 ms Response Latency

Admin by Admin
October 1, 2025
Home AI
Share on FacebookShare on Twitter


Liquid AI has launched LFM2-Audio-1.5B, a compact audio–language basis mannequin that each understands and generates speech and textual content via a single end-to-end stack. It positions itself for low-latency, real-time assistants on resource-constrained units, extending the LFM2 household into audio whereas retaining a small footprint.

https://www.liquid.ai/weblog/lfm2-audio-an-end-to-end-audio-foundation-model

However what’s really new? a unified spine with disentangled audio I/O

LFM2-Audio extends the 1.2B-parameter LFM2 language spine to deal with audio and textual content as first-class sequence tokens. Crucially, the mannequin disentangles audio representations: inputs are steady embeddings projected straight from uncooked waveform chunks (~80 ms), whereas outputs are discrete audio codes. This avoids discretization artifacts on the enter path whereas conserving coaching and era autoregressive for each modalities on the output path.

On the implementation aspect, the launched checkpoint makes use of:

🚨 [Recommended Read] ViPE (Video Pose Engine): A Highly effective and Versatile 3D Video Annotation Instrument for Spatial AI
  • Spine: LFM2 (hybrid conv + consideration), 1.2B params (LM solely)
  • Audio encoder: FastConformer (~115M, canary-180m-flash)
  • Audio decoder: RQ-Transformer predicting discrete Mimi codec tokens (8 codebooks)
  • Context: 32,768 tokens; vocab: 65,536 (textual content) / 2049×8 (audio)
  • Precision: bfloat16; license: LFM Open License v1.0; languages: English
https://www.liquid.ai/weblog/lfm2-audio-an-end-to-end-audio-foundation-model

Two era modes for real-time brokers

  • Interleaved era for dwell, speech-to-speech chat the place the mannequin alternates textual content and audio tokens to attenuate perceived latency.
  • Sequential era for ASR/TTS (switching modalities turn-by-turn).

Liquid AI offers a Python package deal (liquid-audio) and a Gradio demo to breed these behaviors.

Latency: <100 ms to first audio

Liquid AI staff experiences end-to-end latency beneath 100 ms from a 4-second audio question to the primary audible response—a proxy for perceived responsiveness in interactive use—stating it’s quicker than fashions smaller than 1.5B parameters below their setup.

Benchmarks: VoiceBench and ASR outcomes

On VoiceBench—a set of 9 audio-assistant evaluations—Liquid experiences an total rating of 56.78 for LFM2-Audio-1.5B, with per-task numbers disclosed within the weblog’s chart (e.g., AlpacaEval 3.71, CommonEval 3.49, WildVoice 3.17). The Liquid AI staff contrasts this consequence with bigger fashions like Qwen2.5-Omni-3B and Moshi-7B in the identical desk. (VoiceBench is an exterior benchmark launched in late 2024 for LLM-based voice assistants)

The mannequin card on Hugging Face offers a further VoiceBench desk (with intently associated—however not equivalent—per-task values) and consists of traditional ASR WERs the place LFM2-Audio matches or improves on Whisper-large-v3-turbo for some datasets regardless of being a generalist speech–textual content mannequin. For instance (decrease is healthier): AMI 15.36 vs. 16.13 (Whisper-large-v3-turbo), LibriSpeech-clean 2.03 vs. 2.10.

https://huggingface.co/LiquidAI/LFM2-Audio-1.5B

Alright, however why does it actually matter in voice AI traits?

Most “omni” stacks couple ASR → LLM → TTS, which provides latency and brittle interfaces. LFM2-Audio’s single-backbone design with steady enter embeddings and discrete output codes reduces glue logic and permits interleaved decoding for early audio emission. For builders, this interprets to less complicated pipelines and quicker perceived response occasions, whereas nonetheless supporting ASR, TTS, classification, and conversational brokers from one mannequin. Liquid AI offers code, demo entry factors, and distribution through Hugging Face.


Take a look at the GitHub Web page, Hugging Face Mannequin Card and Technical particulars. Be happy to take a look at our GitHub Web page for Tutorials, Codes and Notebooks. Additionally, be at liberty to comply with us on Twitter and don’t neglect to affix our 100k+ ML SubReddit and Subscribe to our E-newsletter. Wait! are you on telegram? now you’ll be able to be a part of us on telegram as effectively.


Asif Razzaq is the CEO of Marktechpost Media Inc.. As a visionary entrepreneur and engineer, Asif is dedicated to harnessing the potential of Synthetic Intelligence for social good. His most up-to-date endeavor is the launch of an Synthetic Intelligence Media Platform, Marktechpost, which stands out for its in-depth protection of machine studying and deep studying information that’s each technically sound and simply comprehensible by a large viewers. The platform boasts of over 2 million month-to-month views, illustrating its recognition amongst audiences.

🔥[Recommended Read] NVIDIA AI Open-Sources ViPE (Video Pose Engine): A Highly effective and Versatile 3D Video Annotation Instrument for Spatial AI
Tags: AudioendtoendFoundationLatencyLFM2Audio1.5BLiquidmodelreleasedResponseSub100
Admin

Admin

Next Post
OpenAI’s new social app is stuffed with terrifying Sam Altman deepfakes

OpenAI's new social app is stuffed with terrifying Sam Altman deepfakes

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Recommended.

Google Sues SerpApi Over Scraping Search Outcomes

Google Sues SerpApi Over Scraping Search Outcomes

December 20, 2025
New, Very Good Sport Go Survival Sport Mixes Half-Life And Rust

New, Very Good Sport Go Survival Sport Mixes Half-Life And Rust

July 24, 2025

Trending.

The way to Clear up the Wall Puzzle in The place Winds Meet

The way to Clear up the Wall Puzzle in The place Winds Meet

November 16, 2025
Researchers Uncover Crucial GitHub CVE-2026-3854 RCE Flaw Exploitable by way of Single Git Push

Researchers Uncover Crucial GitHub CVE-2026-3854 RCE Flaw Exploitable by way of Single Git Push

April 29, 2026
Google Introduces Simula: A Reasoning-First Framework for Producing Controllable, Scalable Artificial Datasets Throughout Specialised AI Domains

Google Introduces Simula: A Reasoning-First Framework for Producing Controllable, Scalable Artificial Datasets Throughout Specialised AI Domains

April 21, 2026
Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Coaching Structure Reaching 88% Goodput Below Excessive {Hardware} Failure Charges

Google DeepMind Introduces Decoupled DiLoCo: An Asynchronous Coaching Structure Reaching 88% Goodput Below Excessive {Hardware} Failure Charges

April 24, 2026
5 AI Compute Architectures Each Engineer Ought to Know: CPUs, GPUs, TPUs, NPUs, and LPUs In contrast

5 AI Compute Architectures Each Engineer Ought to Know: CPUs, GPUs, TPUs, NPUs, and LPUs In contrast

April 10, 2026

AimactGrow

Welcome to AimactGrow, your ultimate source for all things technology! Our mission is to provide insightful, up-to-date content on the latest advancements in technology, coding, gaming, digital marketing, SEO, cybersecurity, and artificial intelligence (AI).

Categories

  • AI
  • Coding
  • Cybersecurity
  • Digital marketing
  • Gaming
  • SEO
  • Technology

Recent News

Your Web site Is A Supply, Not A Megaphone

Your Web site Is A Supply, Not A Megaphone

May 3, 2026
You’re allowed to make use of AI to assist make a film, however you’re not allowed to make use of AI actors or writers

You’re allowed to make use of AI to assist make a film, however you’re not allowed to make use of AI actors or writers

May 3, 2026
  • About Us
  • Privacy Policy
  • Disclaimer
  • Contact Us

© 2025 https://blog.aimactgrow.com/ - All Rights Reserved

No Result
View All Result
  • Home
  • Technology
  • AI
  • SEO
  • Coding
  • Gaming
  • Cybersecurity
  • Digital marketing

© 2025 https://blog.aimactgrow.com/ - All Rights Reserved