With private-company defaults above 9.2% – the highest rate in years – venture capital firm Lux Capital recently advised companies relying on AI to get their computing capacity commitments confirmed in writing. With financial instability rippling through the AI supply chain, Lux warned, a handshake deal is not enough.
But there is another option: stop depending on external computing infrastructure altogether. Smaller AI models that run directly on the user's own device – no data center, no cloud provider, no counterparty risk – are becoming good enough to be worth considering. And Multiverse Computing is raising its hand.
Until now, the Spanish startup has kept a lower profile than some of its peers, but as demand for AI efficiency grows, this is changing. After compressing models from leading AI labs, including OpenAI, Meta, DeepSeek, and Mistral AI, it launched an app that showcases the capabilities of its compressed models and an API portal (a gateway that allows developers to access and build with those models) that makes them more widely available.
The CompactifAI app, which shares its name with Multiverse's quantum-inspired compression technology, is an AI chat tool in the vein of ChatGPT or Mistral's Le Chat. Ask a question and the model answers. The difference is that Multiverse built it around Gilda, a model so small that it can run locally and offline, according to the company.
For end users, this is a taste of AI at the edge, with data that never leaves their devices and no connection required. But there is one caveat: the device must have enough RAM and storage. If it doesn't (and many older iPhones don't), the app falls back to cloud-based models via API. Routing between on-device and cloud processing is handled automatically by a system Multiverse calls Ash Nazg, a name that will ring a bell for Tolkien fans: it references the inscription on the One Ring in "The Lord of the Rings." But when the app heads to the cloud, it loses its main privacy advantage in the process.
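The local-first-with-cloud-fallback pattern the article describes can be sketched in a few lines. This is a hypothetical illustration only: the thresholds, function names, and logic are invented for the example and are not Multiverse's actual Ash Nazg implementation.

```python
# Hypothetical sketch of an on-device vs. cloud routing check, loosely
# modeled on the behavior described in the article. All thresholds and
# names are invented assumptions, not Multiverse's real values.

MIN_RAM_GB = 8        # assumed minimum RAM to host the local model
MIN_STORAGE_GB = 4    # assumed free space needed for the model weights

def choose_backend(ram_gb: float, free_storage_gb: float) -> str:
    """Return 'local' if the device can run the small model offline;
    otherwise fall back to a cloud-hosted model via API, which gives
    up the privacy advantage of on-device inference."""
    if ram_gb >= MIN_RAM_GB and free_storage_gb >= MIN_STORAGE_GB:
        return "local"   # data never leaves the device
    return "cloud"       # request is routed to a hosted model
```

The design choice worth noting is that the fallback is silent and automatic: a capable phone gets fully offline inference, while an older one transparently becomes a thin client.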
These limitations mean that CompactifAI is not yet ready for mass consumer adoption, although that may never have been the goal. According to data from Sensor Tower, the app had fewer than 5,000 downloads in the last month.
The real target is companies. Today, Multiverse is launching a self-service API portal that gives developers and companies direct access to its compressed models, without going through AWS Marketplace.
“The CompactifAI API Portal gives developers direct access to compressed models with the transparency and control necessary to run them in production,” CEO Enrique Lizaso said in a statement.
Real-time usage monitoring is one of the API's key features, and that is no accident. Beyond the potential advantages of edge deployment, lower compute costs are one of the main reasons enterprises are considering smaller models as an alternative to large language models (LLMs).
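To make the cost argument concrete, here is a back-of-the-envelope comparison of per-request inference cost between a large and a compressed model. Every figure is an invented placeholder for illustration; neither price reflects Multiverse's or any provider's actual rates.

```python
# Illustrative per-request cost comparison: large LLM vs. compressed model.
# All prices are hypothetical assumptions, not real pricing.

def request_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of one request of `tokens` tokens at a given
    price per million tokens."""
    return tokens * price_per_million / 1_000_000

LARGE_PRICE = 5.00        # assumed $/1M tokens for a large hosted LLM
COMPRESSED_PRICE = 0.50   # assumed $/1M tokens for a compressed model

tokens = 2_000  # a typical chat exchange
saving = request_cost(tokens, LARGE_PRICE) - request_cost(tokens, COMPRESSED_PRICE)
print(f"Savings per request: ${saving:.4f}")  # prints "Savings per request: $0.0090"
```

Fractions of a cent per request look trivial until multiplied across millions of daily calls, which is why usage monitoring sits next to pricing in an enterprise API.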
It also helps that small models are less limited than they used to be. Earlier this week, Mistral updated its family of small models with the launch of Mistral Small 4, which it says is simultaneously optimized for general chat, coding, agentic tasks, and reasoning. The French company also launched Forge, a system that lets companies create custom models, including small ones for which they can choose the trade-offs their use cases can best tolerate.
Recent Multiverse results also suggest the gap with LLMs is narrowing. Its latest compressed model, HyperNova 60B, is based on gpt-oss-120b, an OpenAI model whose underlying code is publicly available. The company claims it now delivers faster responses at a lower cost than the original from which it was derived, an advantage that matters especially for agentic coding workflows, where AI autonomously completes complex, multi-step programming tasks.
Making models small enough to work on mobile devices while remaining useful is a big challenge. Apple Intelligence sidestepped the problem by combining an on-device model with a cloud model. Multiverse's CompactifAI app can likewise route requests to gpt-oss-120b via API, but its primary goal is to show that on-device models like Gilda and its future successors have advantages beyond cost savings.
For workers in critical fields, a model that can run locally and without connecting to the cloud offers more privacy and resiliency. But the greatest value is in the business use cases this can unlock: for example, embedding AI in drones, satellites, and other environments where connectivity cannot be taken for granted.
The company already serves more than 100 global clients, including the Bank of Canada, Bosch, and Iberdrola, but expanding its customer base could help it unlock more financing. After raising a $215 million Series B last year, it is now rumored to be raising a new €500 million round at a valuation of more than €1.5 billion.
