Publication date: 2026-05-30
Entries: 20
1. Comprehensive observability for Amazon SageMaker AI LLM inference: From GPU utilization to LLM quality
- Source: AWS ML Blog
- Published: 2026-05-29 23:36 UTC
- Link: https://aws.amazon.com/blogs/machine-learning/comprehensive-observability-for-amazon-sagemaker-ai-llm-inference-from-gpu-utilization-to-llm-quality/
Summary: This post demonstrates a comprehensive observability solution using Amazon Managed Grafana dashboards that provides a holistic view of both quality and quantity for LLMs served on Amazon SageMaker AI endpoints with infer
2. NVIDIA Introduces X-Token: Projection-Guided Cross-Tokenizer KD That Outperforms GOLD by +3.82 Average Points on Llama-3.2-1B
- Source: MarkTechPost
- Published: 2026-05-29 23:19 UTC
- Link: https://www.marktechpost.com/2026/05/29/nvidia-introduces-x-token-projection-guided-cross-tokenizer-kd-that-outperforms-gold-by-3-82-average-points-on-llama-3-2-1b/
Summary: NVIDIA's X-Token fixes two structural failures in GOLD and improves GSM8k accuracy from 2.56 to 15.54.
3. StepFun Releases Step 3.7 Flash: A 198B MoE Vision-Language Model for Coding Agents and Search Workflows
- Source: MarkTechPost
- Published: 2026-05-29 21:25 UTC
- Link: https://www.marktechpost.com/2026/05/29/stepfun-releases-step-3-7-flash-a-198b-moe-vision-language-model-for-coding-agents-and-search-workflows/
Summary: StepFun releases Step 3.7 Flash, a 198B MoE model with native vision, 256k context, and Advisor Mode.
4. Tech companies desperately want to film you doing chores
- Source: The Verge AI
- Published: 2026-05-29 17:37 UTC
- Link: https://www.theverge.com/ai-artificial-intelligence/940007/ai-companies-will-pay-for-robot-training-data
Summary: This week, an AI training startup called Shift said it would clean New Yorkers' homes for free. It has plans to expand into other cities as well, including London, and looking around my flat, I get the appeal. But there'
5. Jony Ive’s funky Ferrari
- Source: The Verge AI
- Published: 2026-05-29 12:25 UTC
- Link: https://www.theverge.com/podcast/939589/ferrari-luce-jony-ive-vergecast
Summary: Most people will never own, drive, or even sit inside a Ferrari Luce. (If you can, or do… hit us up.) There's still no question that Ferrari's first electric vehicle is one of the most interesting, surprising cars of the
6. Boston Children’s uses AI to unlock new diagnoses
- Source: OpenAI News
- Published: 2026-05-29 12:00 UTC
- Link: https://openai.com/index/boston-childrens-hospital
Summary: Boston Children’s Hospital uses OpenAI technology to improve patient care, reduce operational burden, and help diagnose more than 40 rare disease cases.
7. How Braintrust turns customer requests into code with Codex
- Source: OpenAI News
- Published: 2026-05-29 12:00 UTC
- Link: https://openai.com/index/braintrust
Summary: How Braintrust engineers use Codex with GPT-5.5 to run experiments and code faster.
8. This AI startup will clean your home for free to train future robots
- Source: The Verge AI
- Published: 2026-05-29 11:58 UTC
- Link: https://www.theverge.com/ai-artificial-intelligence/939765/ai-training-data-startup-shift-free-cleaning
Summary: AI training startup Shift wants to clean your home for free. The catch - because, despite what its website says, there's always a catch - is that it will record cleaners as they scrub, vacuum, dust, tidy, and wash, and u
9. Adobe’s conversational AI agent is a mediocre design intern
- Source: The Verge AI
- Published: 2026-05-29 10:00 UTC
- Link: https://www.theverge.com/tech/939686/adobes-conversational-ai-agent-is-a-mediocre-design-intern
Summary: AI image tools rarely make me feel like I'm part of the creative process. They are, after all, mostly designed so that people with no design experience can type in a few words and get back a usable result. So I was pleas
10. Meet mKernel: A Multi-GPU, Multi-Node Fused Kernel Library for GPU-Driven Communication
- Source: MarkTechPost
- Published: 2026-05-29 08:43 UTC
- Link: https://www.marktechpost.com/2026/05/29/meet-mkernel-a-multi-gpu-multi-node-fused-kernel-library-for-gpu-driven-communication/
Summary: UC Berkeley's UCCL team releases mKernel, fusing intra-node NVLink, inter-node RDMA, and dense compute into a single persistent CUDA kernel.
11. Hexo Labs Open-Sources SIA: A Self-Improving Agent That Updates Both the Harness and the Model Weights
- Source: MarkTechPost
- Published: 2026-05-29 07:28 UTC
- Link: https://www.marktechpost.com/2026/05/29/hexo-labs-open-sources-sia-a-self-improving-agent-that-updates-both-the-harness-and-the-model-weights/
Summary: Hexo Labs released SIA, an open-source self-improving loop, under an MIT license. A Feedback-Agent reads each run's trajectory, then either rewrites the scaffold or triggers a LoRA weight update on gpt-oss-120b. Combinin
12. Behavior-Induced Mirror-Prox Temporal-Difference Learning for Faster Off-Policy Prediction
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28849
Summary (arXiv:2605.28849v1): Gradient temporal-difference methods provide stable off-policy prediction with linear function approximation, but their practical performance is strongly affected by the ge
13. Behavior-Aware Auxiliary Corrections for Off-Policy Temporal-Difference Prediction
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28855
Summary (arXiv:2605.28855v1): Temporal-difference learning with function approximation can be unstable under off-policy sampling. TDC stabilizes off-policy TD through an auxiliary covariance correction,
14. The Cognitive Categorical Transformer: Category-Theoretic Inductive Biases for Language Modeling
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28864
Summary (arXiv:2605.28864v1): The Cognitive Categorical Transformer (CCT) is a 306M-parameter architecture that augments a pretrained GPT-2 Small backbone with cognitively grounded components derived fr
15. Ultra-Reduced-Impact-Encased-Logging (URIEL): propose a new method for selective sustainable logging and post-harvest silvicultural treatment in tropical forest using airborne robotics systems
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28883
Summary (arXiv:2605.28883v1): Tropical forests worldwide are under intense deforestation pressure driven by economic and political interests, and scientific evidence suggests this deforestation contribu
16. Review Arcade: On the Human Alignment and Gameability of LLM Reviews
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28897
Summary (arXiv:2605.28897v1): LLM-generated reviews for scientific papers are gaining considerable traction and are even being officially piloted by major conferences. We have to assume that not only re
17. Orthogonal Concept Erasure for Diffusion Models
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28902
Summary (arXiv:2605.28902v1): Concept erasure has emerged as a promising approach to mitigate undesired or unsafe content in diffusion models, yet existing methods still face significant limitations. Wh
18. Frontier LLM-based agents can overcome the ontology curation bottleneck for natural phenotypes
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28965
Summary (arXiv:2605.28965v1): Linking free-text phenotype descriptions to ontology terms, typically referred to as phenotype annotation, is essential for the cross-study integration of comparative morph
19. VFEAgent: A Multimodal Agent Framework for End-to-End Automated Finite Element Analysis
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28978
Summary (arXiv:2605.28978v1): Finite Element Analysis (FEA) serves as the cornerstone of modern engineering design. However, its workflow is inherently complex and relies heavily on domain expertise. Al
20. BEAMS: Benchmarking and Evaluating AI for Modeling and Simulation
- Source: arXiv cs.AI
- Published: 2026-05-29 04:00 UTC
- Link: https://arxiv.org/abs/2605.28994
Summary (arXiv:2605.28994v1): AI tools to support real world decision making must be able to build simulation models that inform their recommendations and render them interpretable. Tools that can autom