Publication date: 2026-04-25
Entries included: 20
1. Building Workforce AI Agents with Visier and Amazon Quick
- Source: AWS ML Blog
- Published: 2026-04-24 18:04 UTC
- Link: https://aws.amazon.com/blogs/machine-learning/building-workforce-ai-agents-with-visier-and-amazon-quick/
Summary: In this post, we show how connecting the Visier Workforce AI platform with Amazon Quick through Model Context Protocol (MCP) gives every knowledge worker a unified agentic workspace to ask questions in. Visier helps grou
2. How Project Maven taught the military to love AI
- Source: The Verge AI
- Published: 2026-04-24 17:00 UTC
- Link: https://www.theverge.com/ai-artificial-intelligence/917996/project-maven-military-ai-katrina-manson
Summary: In the first 24 hours of the assault on Iran, the US military struck more than 1,000 targets, nearly double the scale of the "shock and awe" attack on Iraq over two decades ago. This acceleration was made possible by AI
3. AirPods, Touch Bars, and the rest of Tim Cook’s legacy
- Source: The Verge AI
- Published: 2026-04-24 14:43 UTC
- Link: https://www.theverge.com/podcast/917965/apple-ceo-cook-ternus-transition
Summary: We knew at some point Tim Cook would step down from his position as Apple's CEO. Over the last year, it has become increasingly obvious that John Ternus was his likely successor. The news this week was still a surprise,
4. Musk vs. Altman is here, and it’s going to get messy
- Source: The Verge AI
- Published: 2026-04-24 12:00 UTC
- Link: https://www.theverge.com/ai-artificial-intelligence/917755/musk-altman-openai-xai-gossip
Summary: Elon Musk cofounded OpenAI, and then flounced off in a huff when he wasn't anointed CEO, leaving Sam Altman as the last power-hungry man standing. Now, Musk is back with a lawsuit, and a trial is scheduled to start in Oa
5. China’s DeepSeek previews new AI model a year after jolting US rivals
- Source: The Verge AI
- Published: 2026-04-24 09:45 UTC
- Link: https://www.theverge.com/ai-artificial-intelligence/918035/deepseek-preview-v4-ai-model
Summary: Chinese AI company DeepSeek released a preview of its hotly anticipated next-generation AI model V4 on Friday, saying that the open-source model can compete with leading closed-source systems from US rivals including Ant
6. Prestigious photo contest answers ‘what is a photo?’
- Source: The Verge AI
- Published: 2026-04-24 09:40 UTC
- Link: https://www.theverge.com/gadgets/918016/prestigious-photo-contest-answers-what-is-a-photo
Summary: We love to muse over how "real" photography is defined here at The Verge now that generative AI is so prolific, and the World Press Photo competition might have the answer. The prestigious award celebrates the best of ph
7. Architecture of an AI-Based Automated Course of Action Generation System for Military Operations
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.20862
Summary: The automation system for Course of Action (CoA) planning is an essential element in future warfare. As maneuver speeds increase, surveillance ranges extend, and weapon ran
8. Escaping the Agreement Trap: Defensibility Signals for Evaluating Rule-Governed AI
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.20972
Summary: Content moderation systems are typically evaluated by measuring agreement with human labels. In rule-governed environments this assumption fails: multiple decisions may be
9. Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.20987
Summary: Long horizon interactive environments are a testbed for evaluating agents skill usage abilities. These environments demand multi step reasoning, the chaining of multiple sk
10. Value-Conflict Diagnostics Reveal Widespread Alignment Faking in Language Models
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.20995
Summary: Alignment faking, where a model behaves aligned with developer policy when monitored but reverts to its own preferences when unobserved, is a concerning yet poorly understo
11. The Last Harness You'll Ever Build
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21003
Summary: AI agents are increasingly deployed on complex, domain-specific workflows -- navigating enterprise web applications that require dozens of clicks and form fills, orchestrat
12. Deep FinResearch Bench: Evaluating AI's Ability to Conduct Professional Financial Investment Research
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21006
Summary: We introduce Deep FinResearch Bench, a practical and comprehensive evaluation framework for deep research (DR) agents in financial investment research. The benchmark assess
13. Adaptive Test-Time Compute Allocation with Evolving In-Context Demonstrations
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21018
Summary: While scaling test-time compute can substantially improve model performance, existing approaches either rely on static compute allocation or sample from fixed generation di
14. HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21027
Summary: Electronic health record (EHR) question answering is often handled by LLM-based pipelines that are costly to deploy and do not explicitly leverage the hierarchical structur
15. Who Defines Fairness? Target-Based Prompting for Demographic Representation in Generative Models
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21036
Summary: Text-to-image(T2I) models like Stable Diffusion and DALL-E have made generative AI widely accessible, yet recent studies reveal that these systems often replicate societal
16. Active Data
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21044
Summary: In some complex domains, certain problem-specific decompositions can provide advantages over monolithic designs by enabling comprehension and specification of the design. I
17. InVitroVision: a Multi-Modal AI Model for Automated Description of Embryo Development using Natural Language
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21061
Summary: The application of artificial intelligence (AI) in IVF has shown promise in improving consistency and standardization of decisions, but often relies on annotated data and d
18. Mind the Prompt: Self-adaptive Generation of Task Plan Explanations via LLMs
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21092
Summary: Integrating Large Language Models (LLMs) into complex software systems enables the generation of human-understandable explanations of opaque AI processes, such as automated
19. Propensity Inference: Environmental Contributors to LLM Behaviour
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21098
Summary: Motivated by loss of control risks from misaligned AI systems, we develop and apply methods for measuring language models' propensity for unsanctioned behaviour. We contrib
20. AI Governance under Political Turnover: The Alignment Surface of Compliance Design
- Source: arXiv cs.AI
- Published: 2026-04-24 04:00 UTC
- Link: https://arxiv.org/abs/2604.21103
Summary: Governments are increasingly interested in using AI to make administrative decisions cheaper, more scalable, and more consistent. But for probabilistic AI to be incorporate