AI Radar

Thursday, June 04, 2026 · auto-generated by tp3_ai_radar.py

Top 3 today

datasette-agent-micropython 0.1a0

9/10

Simon Willison local LLM · Jun 02 · simonwillison.net

Simon Willison released datasette-agent-micropython 0.1a0, an alpha version of a tool designed to safely generate and execute Python code within a sandboxed environment. In initial tests GPT-5.5 has been unable to escape the sandbox, an encouraging but preliminary sign for the security of its code execution. This release is a step towards enabling Datasette Agent to safely integrate AI-generated code.

This is a high-signal release for secure, sandboxed Python execution, directly relevant to local LLM tooling and safe code generation for projects like Bidet or TP3.

datasette-agent micropython sandboxing python execution local llm tooling

Direct Preference Optimization Beyond Chatbots

9/10

Hugging Face blog local LLM · Jun 03 · huggingface.co

Hugging Face released a blog post discussing Direct Preference Optimization (DPO) and its applications beyond traditional chatbots. DPO is a method for aligning language models with human preferences without requiring a separate reward model. The post explores how DPO can be used for tasks like text summarization, style transfer, and even image generation, highlighting its versatility in fine-tuning models based on human feedback.

This deep dive into DPO's broader applications is highly relevant for fine-tuning local LLMs like Gemma or Llama for specific tasks beyond chat, potentially improving RAG or accessibility tools.

direct preference optimization local llm fine-tuning rag improvements accessibility ai hugging face
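
The summary above notes that DPO aligns a model without a separate reward model. As a minimal sketch of why that works, the standard per-pair DPO objective can be computed directly from log-probabilities under the trained policy and a frozen reference model. The function name `dpo_loss`, its parameter names, and the default `beta=0.1` are illustrative assumptions, not taken from the Hugging Face post.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    All names here are hypothetical, chosen for illustration.
    """
    # Implicit reward margin: how much more strongly the policy prefers the
    # chosen response over the rejected one, relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # Loss is -log(sigmoid(z)), written in a numerically stable form.
    if z > 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Policy identical to the reference: no preference learned yet.
neutral = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# Policy has moved towards the chosen response and away from the rejected one.
improved = dpo_loss(-5.0, -15.0, -10.0, -10.0, beta=0.1)
print(neutral, improved)
```

The loss falls as the policy raises the chosen response's likelihood relative to the reference and lowers the rejected one's, so the preference data itself supplies the training signal that a reward model would otherwise provide.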

Holo3.1: Fast & Local Computer Use Agents

9/10

Hugging Face blog local LLM · Jun 02 · huggingface.co

Holo3.1 is a new open-source computer use agent designed for fast, local operation. It leverages a multimodal LLM to understand screen content and user instructions, enabling autonomous task execution on a user's desktop. The agent is optimized for efficiency, running locally to provide quick responses without cloud dependency.

This is a high-signal release for local AI agents, offering on-device computer control that could integrate well with local LLMs like Gemma or Llama for enhanced accessibility and automation.

local llm on-device ai computer vision accessibility ai multimodal llm

By topic

rag improvements (2)

Direct Preference Optimization Beyond Chatbots

9/10

Hugging Face blog local LLM · Jun 03 · huggingface.co

Hugging Face released a blog post discussing Direct Preference Optimization (DPO) and its applications beyond traditional chatbots. DPO is a method for aligning language models with human preferences without requiring a separate reward model. The post explores how DPO can be used for tasks like text summarization, style transfer, and even image generation, highlighting its versatility in fine-tuning models based on human feedback.

This deep dive into DPO's broader applications is highly relevant for fine-tuning local LLMs like Gemma or Llama for specific tasks beyond chat, potentially improving RAG or accessibility tools.

direct preference optimization local llm fine-tuning rag improvements accessibility ai hugging face

🔬Scaling Past Informal AI - Carina Hong, Axiom Math

9/10

Latent Space commentary · Jun 03 · www.latent.space

Axiom Math, a seven-month-old startup, reportedly solved all 12 problems of the Putnam exam, a notoriously difficult undergraduate math competition. Their score of 12/12 surpasses top human undergraduates and other AI systems like DeepSeek, though the impact of time limits on other scores is unclear. This achievement highlights significant progress in AI's ability to tackle complex mathematical problems.

This is a high-signal item demonstrating significant progress in AI's ability to solve complex, formal mathematical problems, which could have implications for RAG and education AI tools.

math ai putnam exam problem solving education ai rag improvements

other (2)

datasette-agent-micropython 0.1a0

9/10

Simon Willison local LLM · Jun 02 · simonwillison.net

Simon Willison released datasette-agent-micropython 0.1a0, an alpha version of a tool designed to safely generate and execute Python code within a sandboxed environment. In initial tests GPT-5.5 has been unable to escape the sandbox, an encouraging but preliminary sign for the security of its code execution. This release is a step towards enabling Datasette Agent to safely integrate AI-generated code.

This is a high-signal release for secure, sandboxed Python execution, directly relevant to local LLM tooling and safe code generation for projects like Bidet or TP3.

datasette-agent micropython sandboxing python execution local llm tooling

[AINews] NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark

9/10

Latent Space commentary · Jun 02 · www.latent.space

NVIDIA has launched Cosmos 3, a Mixture-of-Transformers architecture that unifies language, image, video, audio, and action. It features an autoregressive reasoner paired with a diffusion generator, available in Nano (16B) and Super (64B) models. Finetuned versions for Text2Image and Image2Video are now considered state-of-the-art open-weight models.

NVIDIA's Cosmos 3, with its open-weight SOTA image and video generation, is a strong contender for local LLM and multimodal on-device AI applications, potentially impacting future Gemma 4-style hackathons.

nvidia cosmos 3 multimodal ai open-weight models text-to-image image-to-video

Full ranked list

datasette-agent-micropython 0.1a0

9/10

Simon Willison local LLM · Jun 02 · simonwillison.net

Simon Willison released datasette-agent-micropython 0.1a0, an alpha version of a tool designed to safely generate and execute Python code within a sandboxed environment. In initial tests GPT-5.5 has been unable to escape the sandbox, an encouraging but preliminary sign for the security of its code execution. This release is a step towards enabling Datasette Agent to safely integrate AI-generated code.

This is a high-signal release for secure, sandboxed Python execution, directly relevant to local LLM tooling and safe code generation for projects like Bidet or TP3.

datasette-agent micropython sandboxing python execution local llm tooling

Direct Preference Optimization Beyond Chatbots

9/10

Hugging Face blog local LLM · Jun 03 · huggingface.co

Hugging Face released a blog post discussing Direct Preference Optimization (DPO) and its applications beyond traditional chatbots. DPO is a method for aligning language models with human preferences without requiring a separate reward model. The post explores how DPO can be used for tasks like text summarization, style transfer, and even image generation, highlighting its versatility in fine-tuning models based on human feedback.

This deep dive into DPO's broader applications is highly relevant for fine-tuning local LLMs like Gemma or Llama for specific tasks beyond chat, potentially improving RAG or accessibility tools.

direct preference optimization local llm fine-tuning rag improvements accessibility ai hugging face

Holo3.1: Fast & Local Computer Use Agents

9/10

Hugging Face blog local LLM · Jun 02 · huggingface.co

Holo3.1 is a new open-source computer use agent designed for fast, local operation. It leverages a multimodal LLM to understand screen content and user instructions, enabling autonomous task execution on a user's desktop. The agent is optimized for efficiency, running locally to provide quick responses without cloud dependency.

This is a high-signal release for local AI agents, offering on-device computer control that could integrate well with local LLMs like Gemma or Llama for enhanced accessibility and automation.

local llm on-device ai computer vision accessibility ai multimodal llm

🔬Scaling Past Informal AI - Carina Hong, Axiom Math

9/10

Latent Space commentary · Jun 03 · www.latent.space

Axiom Math, a seven-month-old startup, reportedly solved all 12 problems of the Putnam exam, a notoriously difficult undergraduate math competition. Their score of 12/12 surpasses top human undergraduates and other AI systems like DeepSeek, though the impact of time limits on other scores is unclear. This achievement highlights significant progress in AI's ability to tackle complex mathematical problems.

This is a high-signal item demonstrating significant progress in AI's ability to solve complex, formal mathematical problems, which could have implications for RAG and education AI tools.

math ai putnam exam problem solving education ai rag improvements

[AINews] NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark

9/10

Latent Space commentary · Jun 02 · www.latent.space

NVIDIA has launched Cosmos 3, a Mixture-of-Transformers architecture that unifies language, image, video, audio, and action. It features an autoregressive reasoner paired with a diffusion generator, available in Nano (16B) and Super (64B) models. Finetuned versions for Text2Image and Image2Video are now considered state-of-the-art open-weight models.

NVIDIA's Cosmos 3, with its open-weight SOTA image and video generation, is a strong contender for local LLM and multimodal on-device AI applications, potentially impacting future Gemma 4-style hackathons.

nvidia cosmos 3 multimodal ai open-weight models text-to-image image-to-video

Why Video Agent models are next — Ethan He, xAI Grok Imagine

9/10

Latent Space commentary · Jun 01 · www.latent.space

We’re announcing AIEWF speakers this week! Take the AI Engineering Survey! Today’s guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months: He comes back on Latent Space with some nuclear

micropython-wasm 0.1a1

8/10

Simon Willison local LLM · Jun 02 · simonwillison.net

Release: micropython-wasm 0.1a1. Fixes for some limitations that emerged while I was trying to use this to build datasette-agent-micropython. Tags: python, sandboxing, webassembly

Pasted File Editor

8/10

Simon Willison local LLM · Jun 02 · simonwillison.net

Tool: Pasted File Editor. I really like how you can paste a large volume of text into claude.ai (or the Claude desktop/mobile apps) and it will detect it as a large paste and turn it into a file attachment instead. I decided to have Codex desktop build me a version of that as a pr

micropython-wasm 0.1a0

8/10

Simon Willison local LLM · Jun 02 · simonwillison.net

Release: micropython-wasm 0.1a0. My latest sandboxing experiment: This alpha package bundles a lightly customized WASM build of MicroPython with a wrapper to execute code in it via wasmtime. Tags: python, sandboxing, webassembly

Introducing new capabilities to GPT-Rosalind

8/10

OpenAI news frontier · Jun 03 · openai.com

GPT-Rosalind advances life sciences research with enhanced biological reasoning, medicinal chemistry expertise, genomics analysis, and experimental workflow capabilities.

How Wasmer used Codex to build a Node.js runtime for the edge

8/10

OpenAI news frontier · Jun 03 · openai.com

See how Wasmer used Codex with GPT-5.5 to build a Node.js runtime for the edge, accelerating development 10x to 20x and shipping in weeks instead of months.

[AINews] Reve 2 and Ideogram 4: Layouts in Imagegen

8/10

Latent Space commentary · Jun 04 · www.latent.space

4 years ago we argued that image composition was partially AGI-Hard. That gate has fallen this year. It can’t be pure coincidence that both Reve and Ideogram launched today, both with a heavy emphasis on how they made advances with strong labeling and code for layouts: and here’

[AINews] Microsoft Build: MAI-Thinking-1 and MAI Family models

8/10

Latent Space commentary · Jun 03 · www.latent.space

Today was a big day, not least because we caught up on the state of GitHub vs Agents, and recorded a special pod with No Priors and Satya Nadella. At MS Build, Satya and Mustafa announced 7 new MAI models: This is an impressive lineup, especially considering that the Microsoft-

Uber Caps Usage of AI Tools Like Claude Code to Manage Costs

7/10

Simon Willison local LLM · Jun 03 · simonwillison.net

Uber Caps Usage of AI Tools Like Claude Code to Manage Costs. I wrote the other day about Uber blowing its 2026 AI budget in four months, and how that wasn't particularly surprising given they would have set that budget in 2025, before anyone could have predicted how popular token

Microsoft's new MAI models

7/10

Simon Willison local LLM · Jun 02 · simonwillison.net

Microsoft announced two new text LLMs this morning - MAI-Thinking-1 (reasoning, 1T parameters, 35B active, available to "select early partners") and MAI-Code-1-Flash (137B Parameters, 5B active, "purpose-built for GitHub Copilot and VS Code to deliver high performance and lower c
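
To put the parameter counts quoted above in perspective, the following is a minimal arithmetic sketch of the share of each model's parameters that is active. It assumes "active" means parameters used per token, as in sparse mixture-of-experts designs; the excerpt does not confirm the architecture. The helper name `active_fraction` is hypothetical.

```python
def active_fraction(total_params_b, active_params_b):
    """Return the share of parameters that are active (both in billions)."""
    return active_params_b / total_params_b


# Figures as quoted in the excerpt: (total, active) in billions of parameters.
# 1T total is written as 1000B.
models = {
    "MAI-Thinking-1": (1000, 35),
    "MAI-Code-1-Flash": (137, 5),
}

for name, (total, active) in models.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active")
```

Under that assumption both models activate roughly 3.5% to 3.6% of their weights, despite differing in total size by about a factor of seven.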

California Brown Pelican

7/10

Simon Willison local LLM · Jun 02 · simonwillison.net

California Brown Pelican, in Fort Mason, CA, US. I'm at the Microsoft Build conference today, held at Fort Mason in San Francisco. There are California Brown Pelicans diving into the water directly behind the venue! Tags: microsoft, ai, generative-ai, llms, llm-release

Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked

7/10

Simon Willison local LLM · Jun 01 · simonwillison.net

Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked. I had trouble believing this story was true, but I've seen it verified from multiple sources now: One video shows a hacker starting a conversation with Meta’s AI support bot and asking

Adding MCP Tools to Reachy Mini

7/10

Hugging Face blog local LLM · Jun 03 · huggingface.co

Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains

7/10

Hugging Face blog local LLM · Jun 01 · huggingface.co

Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic

7/10

Hugging Face blog local LLM · Jun 01 · huggingface.co

5 ways Google Search can level up your thrift and vintage shopping

7/10

Google AI blog frontier · Jun 03 · blog.google

Uncover second-hand scores with AI tools in Google Search and Shopping.

A blueprint for democratic governance of frontier AI

7/10

OpenAI news frontier · Jun 03 · openai.com

OpenAI outlines a blueprint for U.S. governance of frontier AI, proposing a federal framework for safety, resilience, and national security.

Codex for every role, tool, and workflow

7/10

OpenAI news frontier · Jun 02 · openai.com

Discover new Codex plugins, sites, and annotations that help analysts, marketers, designers, investors, and other teams get more done with AI.

Codex is becoming a productivity tool for everyone

7/10

OpenAI news frontier · Jun 02 · openai.com

The Next Era of Knowledge Work report explores how Codex is transforming productivity through AI-powered research, data analysis, workflow automation, and content creation.

OpenAI frontier models and Codex are now available on AWS

7/10

OpenAI news frontier · Jun 01 · openai.com

OpenAI frontier models and Codex are now generally available on AWS, giving enterprises a new path to build with OpenAI through the AWS environments, controls, and procurement workflows they already use. Customers can get started with OpenAI on AWS and move faster from evaluation

⚡️Satya Nadella: No Priors x Latent Space Crossover Special at Microsoft Build

7/10

Latent Space commentary · Jun 03 · www.latent.space

We’ve informally heard that Satya has been a listener of LS for a couple of years now, but it was still absolutely surreal to meet him and do a live pod at Build, together with our friends at No Priors, the leading VC AI Podcast that we also greatly admire! We covered the MAI model techn

GitHub's plan for Agents — Kyle Daigle, GitHub

7/10

Latent Space commentary · Jun 02 · www.latent.space

I’m excited to work with Microsoft once again as the presenting sponsors of the AI Engineer World’s Fair! We’ll be streaming live from MS Build today for a special crossover pod with our friends at No Priors and the one and only Satya Nadella. However we did not hold back with thi

AI Transformed My Website In A Few Hours

7/10

Matt Wolfe YouTube commentary · Jun 02 · www.youtube.com

Head to this link https://app.mindstudio.ai/signup?plan=mattwolfe for 90 days of free access to Remy and $25 in credits! I wanted to see if AI could rebuild my website… and honestly the result kind of shocked me. I used Remy from @MindStudio_ai to see if it could completely redes

With AI IPOs On the Way, Should the Public Own AI Companies

7/10

AI Daily Brief YouTube commentary · Jun 04 · www.youtube.com

NVIDIA unveiled the RTX Spark and Vera Rubin platform, pushing CPUs into AI inference and accelerating agent-based workloads. Anthropic and OpenAI entered an IPO race as Google plans an $80 billion equity raise and semiconductors drive a market rally. Policy debates intensified o

The AI Token Shortage Begins

7/10

AI Daily Brief YouTube commentary · Jun 01 · www.youtube.com

Recap of May's major AI shift from subsidized seat pricing to a token-based economy driven by agentic usage and runaway API consumption. Coverage of business-model pivots to usage-based billing, enterprise deployment and consulting plays, and market responses including cheaper sp

OpenAI public policy agenda

6/10

OpenAI news frontier · Jun 03 · openai.com

OpenAI outlines its public policy agenda for AI, including safety, youth protection, workforce transition, and global standards to ensure AI benefits society.

Advancing youth safety and opportunity through global leadership

6/10

OpenAI news frontier · Jun 02 · openai.com

OpenAI calls for global action on youth AI safety, proposing an international institute to strengthen safeguards, standards, and opportunities for young people.

Our views on AI policy and political advocacy

6/10

OpenAI news frontier · Jun 01 · openai.com

OpenAI describes its approach to AI policy and political advocacy: transparency, support for thoughtful regulation and AI safety, and a commitment that no outside political group speaks on the company’s behalf.

Building the infrastructure for the Intelligence Age in Michigan

6/10

OpenAI news frontier · Jun 01 · openai.com

OpenAI breaks ground on a 1GW data center project in Michigan as part of Stargate, building AI infrastructure to expand access, create jobs, and support communities.

The Next Wave of Enterprise AI

6/10

AI Daily Brief YouTube commentary · Jun 03 · www.youtube.com

The White House AI executive order provoked intense debate over voluntary pre-release testing, a 30-day sharing window, and the risk of a de facto licensing regime. Anthropic’s Mythos and Project Glasswing expanded to critical infrastructure partners while exposing extreme token

Expanding Project Glasswing

5/10

Anthropic news frontier · Jun 02 · www.anthropic.com

How Endava is redesigning software delivery around AI agents

5/10

OpenAI news frontier · Jun 04 · openai.com

Learn how Endava is using AI agents, ChatGPT Enterprise, and Codex to accelerate software delivery, automate workflows, and build an AI-native culture across the enterprise.

Travelers deploys AI-powered claims countrywide with OpenAI

5/10

OpenAI news frontier · Jun 02 · openai.com

Travelers built an AI-powered Claim Assistant with OpenAI to guide customers through filing claims, provide 24/7 support, and scale operations during peak demand.

Pipeline status

Pipeline stats

Dead sources today

All sources green today.