Pollen Robotics has released Reachy Mini, an open-source humanoid robot, now capable of fully local operation. This update allows the robot to run all its AI models, including object detection and speech recognition, directly on its embedded NVIDIA Jetson Orin Nano. The move to local processing enhances privacy, reduces latency, and enables operation in environments without internet connectivity.
→ This is a prime example of on-device AI, showing how local LLMs and STT can power advanced robotics without cloud dependency, highly relevant for Mark's interest in local inference and accessibility.
Hugging Face introduced Delta Weight Sync in TRL, a new feature designed to efficiently ship and manage large language models with trillions of parameters. This method optimizes the synchronization of model weights by only transferring the changed parts, significantly reducing bandwidth and storage requirements. It leverages a Hub Bucket for storage, enabling faster deployment and iteration of massive models.
→ This is a high-signal improvement for anyone working with large local LLMs like Gemma or Llama, making it much more practical to iterate and deploy massive models on consumer hardware.
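The idea of "only transferring the changed parts" can be sketched in a few lines. This is a minimal illustration, not TRL's or the Hub's actual API: the function names `manifest`, `delta`, and `apply_delta` are hypothetical, weights are modeled as plain byte strings keyed by tensor name, and change detection by SHA-256 digest is an assumption (the summary above does not say how Delta Weight Sync detects changes).

```python
import hashlib


def manifest(weights: dict[str, bytes]) -> dict[str, str]:
    """Map each tensor name to a SHA-256 digest of its raw bytes."""
    return {name: hashlib.sha256(blob).hexdigest() for name, blob in weights.items()}


def delta(old_manifest: dict[str, str], new_weights: dict[str, bytes]) -> dict[str, bytes]:
    """Return only the tensors whose digest differs from the previous manifest."""
    new_manifest = manifest(new_weights)
    return {
        name: new_weights[name]
        for name, digest in new_manifest.items()
        if old_manifest.get(name) != digest
    }


def apply_delta(base: dict[str, bytes], changed: dict[str, bytes]) -> dict[str, bytes]:
    """Rebuild the new checkpoint on the receiving side from base + changed tensors."""
    merged = dict(base)
    merged.update(changed)
    return merged


if __name__ == "__main__":
    old = {"layer0": b"\x00" * 8, "layer1": b"\x01" * 8}
    new = {"layer0": b"\x00" * 8, "layer1": b"\x02" * 8}
    changed = delta(manifest(old), new)
    print(sorted(changed))  # names of the tensors that would need to be uploaded
```

The sketch ignores tensors that were deleted between checkpoints and works at whole-tensor granularity; a real system would likely need to handle both.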
Mistral AI engineer Mathis Felardos details the process of debugging a memory leak within vLLM, a popular library for LLM serving. The article explains how a specific issue with Python's garbage collection and object referencing led to increasing memory usage over time, particularly when handling long sequences. Felardos outlines the use of memory profiling tools and a custom CPython extension to identify and resolve the root cause.
→ This deep dive into vLLM memory leaks is crucial for anyone optimizing local LLM deployments, especially for long-context Gemma or Llama models where efficiency is key.
Mistral AI has announced a new initiative focused on physics-informed AI, aiming to develop foundational models for engineering acceleration. This involves integrating physical laws and scientific data into their AI models to enhance performance and reliability in scientific and industrial applications. The goal is to create more robust and accurate AI solutions for complex engineering challenges.
→ Mistral's move into physics AI could lead to more specialized and efficient local LLMs for scientific computing, potentially impacting future RAG and on-device applications.
Mistral AI has released "Leanstral," an open-source foundational model aimed at trustworthy "vibe-coding," that is, AI-assisted code generation driven by natural-language prompts. The release emphasizes its open-source nature, encouraging community contributions and broader adoption in various AI-driven projects.
→ Leanstral's open-source release from Mistral is a high-signal event, offering a new local LLM foundation that could be highly relevant for on-device applications and potentially for accessibility tools.
Editor’s note: In our first BioHub pod with Priscilla and Mark they discussed their acquisition of EvoScale, led by Alex Rives, who is now Head of Science at BioHub. With ESM-1 they trained language models on millions of protein sequences drawn from across life, with a simple “
Did you know your phone can easily tell you which videos or images are AI? AI videos are getting so realistic that it's bordering on scary. So I point blank asked the CTO of Google DeepMind at Google I/O this year: What is your company doing about this? His answer wa
See how OpenAI, Thrive, and Crete built a self-improving tax agent with Codex, automating filings, improving accuracy, and accelerating workflows.
Warp uses GPT-5.5 and OpenAI models to coordinate coding agents across local, cloud, and open-source development workflows.
sqlite AGENTS.md SQLite gained an AGENTS.md file five days ago - but it's not intended for their own development, it's presumably aimed at people who are pointing agents at the SQLite codebase. It includes: SQLite does not accept pull requests without prior agreement and/or accom
Anthropic are strongly rumored to be about to have their first profitable quarter. Stories are circulating of companies surprised at how expensive their LLM bills are becoming from usage by their staff. I think this is because OpenAI and Anthropic have both found product-market f
Microsoft Copilot Cowork Exfiltrates Files The biggest challenge in designing agentic systems continues to be preventing them from enabling attackers to exfiltrate data. In this case Microsoft Copilot Cowork (yes, that's a real product name) was allowing agents to send emails to
A lot of the emails I get from founders are now written in a hard-hitting journalistic style. I know they're written by AI, because no founder ever wrote this way before. And once you realize something is written by AI, it's hard not to ignore it. I have never knowingly finished
I cannot believe I'm saying this, but getting the literal Pope to canonize your product's specific technical limitations as a spiritual treatise is the single greatest act of vendor lobbying I have ever seen. — Corey Quinn, on Anthropic co-founder Christopher Olah's influence on
Dropped this morning by the Vatican: Magnifica Humanitas of His Holiness Pope Leo XIV on Safeguarding the Human Person in the Time of Artificial Intelligence. This is a very interesting document. It's some of the clearest writing I've seen on the ethics of integrating AI into mo
We last wrote about Cognition in September’s $10B Series C when Smol.ai also joined Cognition and AINews was eventually moved here to Latent Space. 8 months later, it is worth 2.5x more, and officially the largest remaining independent agent lab in AI, a thesis we mapped out la
Take the 2026 AI Engineering Survey and get >$2k in credits and AIE WF tickets! Readers like when we report no news, but our second favorite to that is when we can simply reinforce a trend you should be aware of. In April we highlighted the Inference Inflection, and if today’s
This is stirring up a lot of controversy... OpenAI just rolled out a new personal finance experience that lets you link your bank accounts directly to ChatGPT via Plaid. The pro is that it basically puts a financial advisor at your fingertips. But the con is that peo
Anthropic's Mythos and Project Glasswing exposed thousands of high‑severity software vulnerabilities and shifted the bottleneck to human triage and patching. Governments sought broader access and planned classified inference infrastructure, with a reported $9 billion US request f
AI inequality explored as access to frontier models becomes scarce and selectively allocated by security, compute, and government controls. The role of Mythos, distillation, and recent pricing shifts demonstrates how token scarcity and stricter KYC concentrate capabilities among
NLW explores the next wave of human-agent collaboration, using Dan Shipper’s “After Automation” essay and Every’s agent experiments to argue that automation is creating more expert human work, not less. The episode looks at shared team agents, the “human sandwich” model, the limi
PICARD: Data, shields up DATA: Brilliant! Shields can reduce damage we sustain. Not immunity. Not hubris. Just prudence. It's not precaution—it's strategy. [camera shakes] WORF: HULL BREACHES ON NINE DECKS DATA: Here's what happened: you told me to raise shields, and I didn't — K
Cisco and OpenAI are redefining enterprise engineering with Codex, helping Cisco scale AI-native development, accelerate AI Defense work, and automate defect remediation.
Ahead of global elections, we’re helping people access information, supporting cyber defenders, and increasing AI transparency