📄 Paper Explainer: Don't Break the Cache - Optimal Prompt Cache Design for Agentic Tasks

This article explains Don’t Break the Cache: An Evaluation of Prompt Caching for Long-Horizon Agentic Tasks. Abstract: LLM agents make large numbers of tool calls and accumulate context in complex multi-turn tasks, but the cost savings from prompt caching have not been sufficiently studied...

05/03/2026 blog paper

prompt-caching LLM agent +6

✍️ Anthropic Advanced Tool Use Explained: Agent Optimization with Tool Search Tool and Programmatic Tool Calling

This article explains Introducing advanced tool use on the Claude Developer Platform. Summary: In November 2025, Anthropic released three beta features on the Claude Developer Platform: Tool Search Tool (dynamic search and loading of tool definitions), Programma...

05/03/2026 blog tech_blog

Anthropic Claude tool-use +6

📄 Paper Explainer: SemCache - Improving the Accuracy and Granularity of Semantic Caching for LLM Inference

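Abstract: This article explains arXiv:2502.03771 SemCache: Semantic-Aware GPT-Cache through LLM-Embedded Similarity. SemCache is a method that addresses two fundamental challenges in semantic caching for LLM inference. The first challenge is that with existing embedding-based similarity (EBS), semantically different...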

05/03/2026 blog paper

semantic caching LLM inference optimization +2

📄 EMNLP 2025 Paper Explainer: RouterEval - A Comprehensive Benchmark of LLM Routing Strategies

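Abstract: RouterEval is a comprehensive benchmark framework for systematically evaluating LLM routing strategies. It evaluates 16 routers (classifier-based, scoring-based, cascade, and others) across 9 LLM pools, 12 query sets, and 7 evaluation metrics, testing 1,152 combinations in total. A key finding is that no one-size-fits-all router exists, and the composition of the LLM pool...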

05/03/2026 blog paper

LLM routing model selection cost optimization +2

✍️ Azure API Management Unified AI Gateway Design Pattern Explained

Summary: This article explains Microsoft Tech Community: Azure API Management - Unified AI Gateway Design Pattern. With the "Unified AI Gateway" design pattern developed by the European energy company Uniper, Azure API Management (APIM) policy extension...

05/03/2026 blog tech_blog

Azure API Management AI Gateway +2

📄 Paper Explainer: Agentic Plan Caching - Test-Time Plan Caching for Reducing LLM Agent Costs

This article explains Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents. Abstract: LLM-based agents can automate complex workflows, but because they generate a plan from scratch for every task, API cost and latency are major challenges. This paper...

05/03/2026 blog paper

LLM agent caching +7

✍️ NVIDIA Blackwell MoE Inference Optimization - Fast Inference Techniques for Large Sparse Models, Demonstrated on DeepSeek-R1

This article explains Delivering Massive Performance Leaps for Mixture of Experts Inference on NVIDIA Blackwell (NVIDIA Developer Blog, 2026-01-08). Summary: In this tech blog post, published by NVIDIA in January 2026, the Blackwell arch...

04/03/2026 blog tech_blog

NVIDIA Blackwell MoE +4

✍️ NVIDIA Official Blog Explained: Improving Embedding Model Accuracy by 12% through Data Curation with NeMo Curator

This article explains NVIDIA Technical Blog: Boost Embedding Model Accuracy for Custom Information Retrieval (published June 25, 2025). Summary: In this post, published on the NVIDIA Technical Blog, the conversational AI analytics platform "Coxwave Align...

04/03/2026 blog tech_blog

embedding data-curation fine-tuning +4

📄 Paper Explainer: A Survey of Diffusion Language Models - Method Taxonomy, Challenges, and Future Research Directions

This article explains arXiv:2508.10875 “A Survey on Diffusion Language Models”. Abstract: This survey paper is a comprehensive study of diffusion language models (DLMs), published in August 2025 by a team at VILA Lab (Georgia Tech). For text generation, the authors...

04/03/2026 blog paper

diffusion llm survey +2

📄 Paper Explainer: Diffusion Models in De Novo Drug Design - A Systematic Review of Diffusion Models in Drug Discovery

This article explains Diffusion Models in De Novo Drug Design (Alakhdar, Poczos & Washburn, Journal of Chemical Information and Modeling, 2024). Abstract: This paper is a survey that systematically reviews the application of diffusion models to de novo drug design...

04/03/2026 blog paper

diffusion-model de-novo-drug-design molecular-generation +7
