AI Daily News - 2026-05-23

Published: 2026-05-23

Items included: 20

1. Google’s AI search is so broken it can ‘disregard’ what you’re looking for

Source: The Verge AI
Published: 2026-05-22 20:39 UTC
Link: https://www.theverge.com/tech/936176/google-ai-overviews-search-disregard

Summary: Google's AI Overviews are running into an interesting problem right now. Earlier on Friday, if you searched for the term "disregard," the AI Overview section would include a response like what you'd see from a more tradi

2. A Step-by-Step Coding Tutorial to Implement GBrain: The Self-Wiring Memory Layer Built by Y Combinator’s Garry Tan for AI Agents

Source: MarkTechPost
Published: 2026-05-22 18:23 UTC
Link: https://www.marktechpost.com/2026/05/22/a-step-by-step-coding-tutorial-to-implement-gbrain-the-self-wiring-memory-layer-built-by-y-combinators-garry-tan-for-ai-agents/

Summary: AI agents start every session from zero, with no memory of meetings, notes, or decisions. GBrain, the open-source memory layer Y Combinator's Garry Tan built to power his own OpenClaw and Hermes deployments, fixes that with

3. Catch up on the Dialogues stage at Google I/O 2026.

Source: Google AI Blog
Published: 2026-05-22 18:00 UTC
Link: https://blog.google/innovation-and-ai/technology/ai/io-2026-dialogues-recap/

Summary: A recap of the 2026 I/O Dialogues, where leaders discuss the future of AI, quantum computing, robotics and creativity.

4. Elon, stop trying to make Grok happen

Source: The Verge AI
Published: 2026-05-22 17:17 UTC
Link: https://www.theverge.com/ai-artificial-intelligence/936219/elon-stop-trying-to-make-grok-happen

Summary: There is a harsh truth about Elon Musk's "truth-seeking" AI chatbot Grok: It's not very good, and not many people are using it. That's the takeaway of a new Reuters report, which found that Grok barely appears in federal

5. The literary world isn’t prepared for AI

Source: The Verge AI
Published: 2026-05-22 14:30 UTC
Link: https://www.theverge.com/tech/936073/ai-writing-granta-commonwealth-prize

Summary: Since 2012, the British literary magazine Granta has published the regional winners of the annual Commonwealth Short Story Prize. This year, however, there was something off about one of the selections for the prestigiou

6. Spotify says its AI remix tool is for superfans, but I’m not convinced

Source: The Verge AI
Published: 2026-05-22 14:20 UTC
Link: https://www.theverge.com/ai-artificial-intelligence/936072/spotify-umg-ai-music-remix-cover-superfan

Summary: AI covers and remixes of songs are already a blight on the internet. Spotify, YouTube, TikTok, and Instagram are awash in flat reggae versions of "Smells Like Teen Spirit," dinky country renditions of The Weeknd, and mon

7. Samsung’s memory chip employees negotiated $340,000 bonuses this year

Source: The Verge AI
Published: 2026-05-22 11:05 UTC
Link: https://www.theverge.com/tech/936002/samsung-memory-chip-employees-deal-strike-bonus

Summary: Details have emerged about a tentative deal struck between Samsung and semiconductor employees who had threatened to strike. The deal reportedly makes some workers eligible for average annual bonuses of $340,000. The pro

8. Microsoft Releases Fara1.5: A Family of Browser Computer-Use Agents (4B/9B/27B) That Outperform OpenAI Operator and Gemini 2.5 Computer Use on Online-Mind2Web

Source: MarkTechPost
Published: 2026-05-22 08:32 UTC
Link: https://www.marktechpost.com/2026/05/22/microsoft-releases-fara1-5-a-family-of-browser-computer-use-agents-4b-9b-27b-that-outperform-openai-operator-and-gemini-2-5-computer-use-on-online-mind2web/

Summary: Microsoft Research released Fara1.5, a family of browser computer-use agents in 4B, 9B, and 27B sizes. Fara1.5-27B scores 72% on Online-Mind2Web, outperforming OpenAI Operator, Gemini 2.5 Computer Use, and Yutori Navigat

9. Build Recurrent-Depth Transformers with OpenMythos for MLA, GQA, Sparse MoE, and Loop-Scaled Reasoning

Source: MarkTechPost
Published: 2026-05-22 07:39 UTC
Link: https://www.marktechpost.com/2026/05/22/build-recurrent-depth-transformers-with-openmythos-for-mla-gqa-sparse-moe-and-loop-scaled-reasoning/

Summary: In this tutorial, we explore OpenMythos by building an advanced recurrent-depth transformer workflow that runs end-to-end in Google Colab. We create both MLA and GQA model variants, compare their parameter counts, and ch

10. SOLAR: A Self-Optimizing Open-Ended Autonomous Agent for Lifelong Learning and Continual Adaptation

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20189

Summary: arXiv:2605.20189v1. Despite the remarkable success of large language models (LLMs), they still face bottlenecks while deploying in dynamic, real-world settings with primary challenges being co

11. Tool-Augmented Agent for Closed-loop Optimization, Simulation, and Modeling Orchestration

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20190

Summary: arXiv:2605.20190v1. Iterative industrial design-simulation optimization is bottlenecked by the CAD-CAE semantic gap: translating simulation feedback into valid geometric edits under diverse, c

12. OSCToM: RL-Guided Adversarial Generation for High-Order Theory of Mind

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20423

Summary: arXiv:2605.20423v1. Large Language Models (LLMs) perform well on many language tasks, but their Theory of Mind (ToM) reasoning is still uneven in complex social settings. Existing benchmarks,

13. AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20425

Summary: arXiv:2605.20425v1. Designing multi-agent workflows is especially difficult in open-ended scientific settings where tasks lack curated training sets, reliable scalar evaluation metrics, and st

14. High Quality Embeddings for Horn Logic Reasoning

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20467

Summary: arXiv:2605.20467v1. Neural networks can be trained to rank the choices made by logical reasoners, resulting in more efficient searches for answers. A key step in this process is creating usefu

15. $ECUAS_n$: A family of metrics for principled evaluation of uncertainty-augmented systems

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20490

Summary: arXiv:2605.20490v2. In high-stakes automated decision-making, access to predictive uncertainty is essential for enabling users (human or downstream systems) to accept or reject predictions

16. Open-World Evaluations for Measuring Frontier AI Capabilities

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20520

Summary: arXiv:2605.20520v1. Benchmark-based evaluation remains important for tracking frontier AI progress. But it can both overstate and understate deployed capability because it privileges tasks tha

17. AgentAtlas: Beyond Outcome Leaderboards for LLM Agents

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20530

Summary: arXiv:2605.20530v1. Large language model agents now act on codebases, browsers, operating systems, calendars, files, and tool ecosystems, but the benchmarks used to evaluate them are fragmente

18. Personality Engineering with AI Agents: A New Methodology for Negotiation Research

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20554

Summary: arXiv:2605.20554v1. According to canonical negotiation theory, people's success in a negotiation depends on how well they balance competing demands: empathizing and asserting, demonstrating co

19. Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20577

Summary: arXiv:2605.20577v1. Riichi Mahjong is a multi-player, imperfect-information game characterized by stochasticity and high-dimensional state spaces. These attributes present a unique combination

20. From Automated to Autonomous: Hierarchical Agent-native Network Architecture (HANA)

Source: arXiv cs.AI
Published: 2026-05-22 04:00 UTC
Link: https://arxiv.org/abs/2605.20608

Summary: arXiv:2605.20608v1. Realizing Level 4/5 Autonomous Networks (AN) demands a shift from static automation to agent-native intelligence. Current operations, reliant on rigid scripts, lack the cog
