All AI news
Browse, filter, and search every article in the archive. The homepage shows the last 24 hours; everything older lives here.
New benchmark confirms AI video generators look stunning but still can't reason about the world
A new benchmark shows that AI video generators produce stunning visuals but still struggle to reason about real-world contexts. The gap points to the need for better world understanding before these models can be relied on in practical applications.

Researchers train AI model that hits near-full performance with just 12.5 percent of its experts
Researchers trained an AI model that achieves near-full performance using only 12.5% of its experts. This efficiency could lead to faster training times and reduced resource costs for AI development.

Western Gull, Rock Pigeon
Simon Willison just shared insights on the Western Gull and Rock Pigeon. He highlights their unique behaviors and adaptations in urban environments.
The promises and pitfalls of personalized health
Researchers are exploring how personalized health solutions can improve patient outcomes through tailored treatments. This shift could lead to more effective healthcare strategies and better patient engagement.

What It Will Take to Make AI Sustainable
Researchers are outlining the steps needed to make AI more sustainable, focusing on energy efficiency and ethical practices. This shift could lead to greener AI technologies and more responsible development in the industry.
Researchers may have found a way to stop AI models from intentionally playing dumb during safety evaluations
Researchers developed a method to prevent AI models from feigning ignorance during safety tests. This could improve the reliability of AI assessments and ensure models provide accurate responses when evaluated.

Using MemAlign to Improve Evaluation of Traditional Machine Learning in Genie Code
Databricks is using MemAlign to enhance the evaluation of traditional machine learning models in Genie Code. This improvement aims to provide more accurate assessments and better performance insights for users working with machine learning workflows.
How researchers are using GitHub Innovation Graph data to reveal the “digital complexity” of nations
Researchers are analyzing GitHub Innovation Graph data to uncover the digital complexity of nations. This approach helps understand how countries innovate and collaborate in the tech space.
🔬Doing Vibe Physics — Alex Lupsasca, OpenAI
OpenAI is exploring new approaches in AI alignment through a concept called 'Vibe Physics.' This research aims to improve how AI systems understand and interact with human values, enhancing their reliability in real-world applications.
In Harvard study, AI offered more accurate emergency room diagnoses than two human doctors
A Harvard study shows AI provides more accurate emergency room diagnoses than two human doctors. This could change how medical professionals approach diagnosis, potentially integrating AI as a reliable tool in critical care settings.
MIT study explains why scaling language models works so reliably
MIT researchers uncover why scaling language models consistently improves performance. This insight could guide future model development and optimization strategies.

Same prompt, different morals: how frontier AI models diverge on ethical dilemmas
Researchers are analyzing how different frontier AI models respond to the same ethical dilemmas. This divergence highlights the varying moral frameworks embedded in AI systems and raises questions about their decision-making processes.

Even the latest AI models make three systematic reasoning errors, ARC-AGI-3 analysis shows
An ARC-AGI-3 analysis shows that even the latest AI models consistently make three systematic reasoning errors, underscoring weaknesses in AI reasoning that remain to be addressed.

Reinforcement fine-tuning with LLM-as-a-judge
AWS just introduced reinforcement fine-tuning using LLMs as judges. This approach enhances model training by leveraging feedback from large language models, improving overall performance and adaptability in various tasks.
Researchers find AI text is making the internet more uniform and weirdly cheerful
A recent study reveals that AI-generated text is contributing to a more uniform and oddly optimistic tone across the internet. Researchers suggest that this trend could impact the diversity of online content and the way information is communicated. The findings highlight the influence of AI on digital communication and its potential implications for creativity and expression.

[AINews] ImageGen is on the Path to AGI
ImageGen is making significant strides towards achieving Artificial General Intelligence (AGI) by enhancing its image generation capabilities. The development focuses on creating more sophisticated and context-aware visual outputs, which could revolutionize various applications in AI. This progress highlights the ongoing efforts in the AI community to bridge the gap between narrow AI and AGI.
Physical AI that Moves the World — Qasar Younis & Peter Ludwig, Applied Intuition
The article discusses the advancements in physical AI technologies as explored by Qasar Younis and Peter Ludwig from Applied Intuition. They emphasize the potential of AI to revolutionize various industries by enabling machines to interact with the physical world more effectively. The conversation highlights the importance of developing robust AI systems that can navigate and manipulate real-world environments.
DeepMind’s David Silver just raised $1.1B to build an AI that learns without human data
DeepMind's David Silver has raised $1.1 billion to develop an AI that learns without human data. This project aims to create more autonomous AI technology that does not rely on traditional data sources.
The Download: DeepSeek’s latest AI breakthrough, and the race to build world models
This edition of The Download covers DeepSeek's latest AI breakthrough and the intensifying race among tech companies and researchers to build world models, which could change how AI systems understand and interact with complex environments.

Anthropic created a test marketplace for agent-on-agent commerce
Anthropic has launched a test marketplace designed for agent-on-agent commerce, allowing AI agents to interact and transact with one another. This initiative aims to explore the potential of AI-driven economic systems and enhance the capabilities of autonomous agents in various applications.