A Code-First Agent Framework: TaskWeaver Principles, Architecture, and Use Cases

Academic 2024-09-14 15:58 Beijing

TaskWeaver Introduction

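TaskWeaver is a code-first agent framework for seamlessly planning and executing data analytics tasks. It interprets user requests through code snippets and efficiently coordinates a variety of plugins, which take the form of functions, to carry out data analytics or workflow automation tasks.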

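Rich data structures - TaskWeaver lets you work with rich Python data structures such as lists, dictionaries, or Pandas DataFrames, rather than being limited to text strings.
Custom algorithms - TaskWeaver lets you wrap your own algorithms or functions as plugins (in the form of Python functions) and orchestrate them to complete complex tasks.
Domain-specific knowledge - TaskWeaver is designed to make it easy to incorporate domain-specific knowledge, such as knowledge or experience about task planning and code generation, to make task execution more reliable.
Stateful conversation - TaskWeaver supports stateful conversations, so you can interact with in-memory data across multiple chat rounds.
Code verification - TaskWeaver verifies generated code before executing it. It can detect potential issues in the generated code and fix them automatically.
Security considerations - TaskWeaver can run code in a sandboxed environment to prevent malicious code execution. It also provides basic session management to keep different users' data separate.
Easy to debug - TaskWeaver provides detailed logs, metrics, and traces, instrumented with OpenTelemetry, to help you understand what happens behind the simple chat interface.
Easy to extend - TaskWeaver can be extended to handle more complex tasks. Beyond plugins, you can create additional roles, define how they interact with the Planner, and orchestrate them to accomplish more complex tasks.
Easy to use - TaskWeaver ships with a set of sample plugins and tutorials to help you get started quickly, plus an all-in-one Docker image for trying it out.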
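The "stateful conversation" feature can be pictured with a small sketch. This is not TaskWeaver's implementation; `StatefulSession` is a hypothetical stand-in that only assumes each chat round's code runs against one shared namespace, so data loaded in an earlier round stays available later.

```python
class StatefulSession:
    """Toy stand-in for a stateful chat session: one namespace shared by all rounds."""

    def __init__(self) -> None:
        self.namespace: dict = {}

    def run(self, code: str) -> None:
        # Each round executes against the same dict, so variables persist.
        exec(code, self.namespace)


session = StatefulSession()
# Round 1: load some data into memory (a plain list of dicts here).
session.run("rows = [{'city': 'Beijing', 'sales': 120}, {'city': 'Shanghai', 'sales': 95}]")
# Round 2: a follow-up request reuses `rows` without reloading it.
session.run("total = sum(r['sales'] for r in rows)")
print(session.namespace["total"])  # 215
```

Because the second round sees `rows` as a real Python object, the same idea extends to DataFrames or any other in-memory structure.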

TaskWeaver Quick Start

Set up the environment:

# clone the repository
git clone https://github.com/microsoft/TaskWeaver.git
cd TaskWeaver
# install the requirements
pip install -r requirements.txt

TaskWeaver runs as a process and needs a project directory to store plugins and configuration files:

📦project
 ┣ 📜taskweaver_config.json # the project configuration file for TaskWeaver
 ┣ 📂plugins # the folder to store plugins
 ┣ 📂planner_examples # the folder to store planner examples
 ┣ 📂codeinterpreter_examples # the folder to store code interpreter examples
 ┣ 📂logs # the folder to store logs, will be generated after program starts
 ┗ 📂workspace # the directory that stores session data, will be generated after program starts
    ┗ 📂 session_id 
      ┣ 📂ces # the folder used by the code execution service
      ┣ 📂cwd # the current working directory to run the generated code
      ┗ other session data
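The layout above could be scaffolded with a few lines of Python. This is a minimal sketch under stated assumptions: `init_project` is a hypothetical helper, not part of TaskWeaver, and it starts the config file as an empty JSON object.

```python
import json
from pathlib import Path

# Folder names taken from the layout above; logs/ and workspace/ are
# omitted because they are generated after the program starts.
PROJECT_DIRS = ["plugins", "planner_examples", "codeinterpreter_examples"]


def init_project(root: str) -> Path:
    """Create the project skeleton and an empty config file if missing."""
    project = Path(root)
    for name in PROJECT_DIRS:
        (project / name).mkdir(parents=True, exist_ok=True)
    config_path = project / "taskweaver_config.json"
    if not config_path.exists():
        config_path.write_text(json.dumps({}, indent=2), encoding="utf-8")
    return project
```

Calling `init_project("./project")` is safe to repeat, since existing folders and an existing config file are left untouched.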

TaskWeaver Code Execution Modes

https://microsoft.github.io/TaskWeaver/docs/code_execution

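TaskWeaver supports two code execution modes: local and container. Container mode is the default. The key difference between the two is where the code runs.

Container mode executes code inside a Docker container, which gives code execution a more secure environment, while local mode executes code as a subprocess of the TaskWeaver process.

In local mode, a user with malicious intent could instruct TaskWeaver to execute harmful code on the host machine. The LLM (large language model) may also generate harmful code on its own, which is a further security risk.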
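To make the difference concrete, here is a rough sketch of what "subprocess on the host" versus "inside a container" means at the command level. It is not how TaskWeaver's execution service is implemented; `build_command` and `run_local` are hypothetical helpers, and the Docker image name and the `--network none` flag are assumptions chosen for illustration.

```python
import subprocess
import sys


def build_command(code: str, mode: str = "container") -> list[str]:
    """Return the command line used to run `code` under the given mode."""
    if mode == "local":
        # Child process of the current interpreter: same host, same privileges.
        return [sys.executable, "-c", code]
    if mode == "container":
        # Assumed image name; --network none is an extra hardening choice.
        return ["docker", "run", "--rm", "--network", "none",
                "python:3.11-slim", "python", "-c", code]
    raise ValueError(f"unknown mode: {mode!r}")


def run_local(code: str) -> str:
    """Execute code in a subprocess and return its captured stdout."""
    result = subprocess.run(build_command(code, "local"),
                            capture_output=True, text=True, timeout=30, check=True)
    return result.stdout.strip()
```

In the local branch the snippet can read and write anything the TaskWeaver user can, which is exactly the risk described above; the container branch confines it to the container's filesystem.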

TaskWeaver Overall Architecture

TaskWeaver has two main parts, the Planner and the Code Interpreter. The Code Interpreter in turn consists of the Code Generator, which writes code, and the Code Executor, which runs it.

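The execution flow can be roughly divided into the following steps:

The user issues an instruction, and the Planner receives it.
The Planner draws up an initial plan (init plan) based on the instruction and the examples in its prompt.
The Planner condenses the init plan into a more concise final plan, in which each step can be seen as a subtask.
The Planner sends the steps of the plan to the Code Interpreter one at a time.
The Code Generator picks from the available plugins according to the task it receives, generates code that completes the current step, and sends it to the Code Executor.
The Code Executor runs the code and reports the execution result back to the Planner.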
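The steps above can be sketched as a small loop. Everything here is a stub written for illustration: `make_plan`, `generate_code`, and `execute_code` are hypothetical names, the plan is hard-coded where a real Planner would call an LLM, and the generated code is looked up in a dictionary.

```python
def make_plan(request: str) -> list[str]:
    # Stub planner: a real planner would ask an LLM; here the plan is fixed.
    return ["build the list of numbers", "sum the numbers"]


# Stub code generator: maps each plan step to a canned snippet.
SNIPPETS = {
    "build the list of numbers": "numbers = [1, 2, 3, 4]\nresult = numbers",
    "sum the numbers": "result = sum(numbers)",
}


def generate_code(step: str) -> str:
    return SNIPPETS[step]


def execute_code(code: str, namespace: dict):
    # Stub executor: runs the snippet and reports the `result` variable.
    exec(code, namespace)
    return namespace["result"]


def handle_request(request: str) -> list:
    namespace: dict = {}
    feedback = []
    for step in make_plan(request):                     # one step at a time
        code = generate_code(step)                      # Code Generator
        feedback.append(execute_code(code, namespace))  # Code Executor -> Planner
    return feedback


print(handle_request("add up 1 to 4"))  # [[1, 2, 3, 4], 10]
```

The shared `namespace` is what lets the second step use `numbers` from the first, mirroring how later plan steps build on earlier execution results.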

TaskWeaver Core Concepts

Project

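A project is the highest-level concept in TaskWeaver. Each project points to a specific directory that contains the prompts and examples that guide the Planner, the logs of every session under the project, the plugins the project can use, and the configuration of the LLM in use.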

Session

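A session is a relatively high-level, self-contained concept at execution time. All roles and components are attached to a specific session, and each launch of TaskWeaver starts one session.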

Memory

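Memory is conversation history plus more. Like most LLM projects, it stores the conversation history, but it also keeps the Code Executor's execution results, and when storing them it uses the compression component to summarize the conversation.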
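A minimal sketch of this idea, assuming a simple policy where only the most recent rounds are kept verbatim and older ones are folded into a summary. The `Round` and `Memory` classes are hypothetical, and the string concatenation stands in for the LLM-based summarization a real compression component would perform.

```python
from dataclasses import dataclass, field


@dataclass
class Round:
    user_query: str
    execution_result: str


@dataclass
class Memory:
    keep_last: int = 2                      # rounds kept verbatim (assumed value)
    summary: str = ""                       # compressed older rounds
    rounds: list = field(default_factory=list)

    def add(self, rnd: Round) -> None:
        self.rounds.append(rnd)
        while len(self.rounds) > self.keep_last:
            old = self.rounds.pop(0)
            # Placeholder "compression": a real system would ask an LLM to summarize.
            self.summary += f"[{old.user_query} -> {old.execution_result}] "

    def as_prompt(self) -> str:
        recent = "\n".join(f"{r.user_query}: {r.execution_result}" for r in self.rounds)
        return f"Summary: {self.summary.strip()}\n{recent}"
```

Keeping execution results alongside queries is what distinguishes this from plain chat history: a later round can refer to what the code actually returned.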

Attachment

An attachment is a part of a Post that, unlike ordinary text, is specially tagged, such as code, markdown, or execution_result. Each role inspects the type and content of specific attachments to decide what to do next.
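The type-based dispatch described here might look like the following sketch. The class and field names are illustrative assumptions rather than TaskWeaver's actual data model, and `next_action` is a hypothetical decision function.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    type: str       # e.g. "code", "markdown", "execution_result"
    content: str


@dataclass
class Post:
    send_from: str
    send_to: str
    message: str
    attachments: list = field(default_factory=list)

    def get(self, type_: str) -> list:
        """Return the contents of all attachments of the given type."""
        return [a.content for a in self.attachments if a.type == type_]


def next_action(post: Post) -> str:
    # A role decides what to do by inspecting attachment types.
    if post.get("execution_result"):
        return "review_result"
    if post.get("code"):
        return "execute_code"
    return "reply_with_text"
```

For example, a post carrying only a `code` attachment yields `"execute_code"`, while one that also carries an `execution_result` is ready for review.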

Plugin

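Plugins are made available to the Code Generator during code generation to handle relatively complex tasks. A plugin consists of two files. The first is a .py script, which is handed to the Code Executor for execution once the plugin is selected. The second is a YAML file, which is added to the Code Generator's prompt and describes the plugin, gives usage examples, and specifies the input and output types.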
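The two halves of a plugin can be illustrated without TaskWeaver itself. In this sketch the function `top_n` plays the role of the .py script, and a plain dictionary stands in for the YAML file to avoid a YAML dependency. The field names and the `render_for_prompt` helper are assumptions for illustration, not TaskWeaver's actual plugin schema or API.

```python
def top_n(values: list, n: int = 3) -> list:
    """The 'py script' half: plain Python that the executor can call."""
    return sorted(values, reverse=True)[:n]


# The 'yaml' half, written as a dict. Field names are illustrative only.
TOP_N_SPEC = {
    "name": "top_n",
    "description": "Return the n largest values from a list of numbers.",
    "parameters": [
        {"name": "values", "type": "list", "required": True},
        {"name": "n", "type": "int", "required": False},
    ],
    "returns": [{"name": "largest", "type": "list"}],
    "example": "largest = top_n([5, 1, 9, 7], n=2)",
}


def render_for_prompt(spec: dict) -> str:
    """Format the metadata as text that could be appended to a generator prompt."""
    params = ", ".join(f"{p['name']}: {p['type']}" for p in spec["parameters"])
    return f"{spec['name']}({params})\n  {spec['description']}\n  e.g. {spec['example']}"


print(render_for_prompt(TOP_N_SPEC))
print(top_n([5, 1, 9, 7], n=2))  # [9, 7]
```

The split matters because the generator only ever sees the description, while the executor only ever needs the callable.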

TaskWeaver Configuration File

TaskWeaver's configuration file is located at project/taskweaver_config.json. Edit this file to configure TaskWeaver's settings. The available parameters are summarized below:

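Parameter	Description
`llm.model`	Name of the language model to use.
`llm.api_base`	Base URL of the OpenAI API.
`llm.api_key`	API key for the OpenAI API.
`llm.api_type`	Type of the OpenAI API, for example `openai` or `azure`.
`llm.api_version`	Version of the OpenAI API.
`llm.embedding_model`	Name of the embedding model.
`code_generator.prompt_compression`	Whether to compress the Code Interpreter's chat history.
`code_generator.enable_auto_plugin_selection`	Whether to enable automatic plugin selection.
`code_generator.use_experience`	Whether the Code Generator uses experience summarized from previous chat history.
`code_generator.auto_plugin_selection_topk`	Number of plugins automatically selected in each round.
`session.max_internal_chat_round_num`	Maximum number of internal chat rounds between the Planner and the Code Interpreter.
`session.roles`	Roles included in the conversation.
`execution_service.kernel_mode`	Mode of the code executor, either `local` or `container`.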
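Since the file is flat JSON with dotted keys, it can be read and sanity-checked with the standard library alone. The sketch below is a hypothetical helper, not TaskWeaver's own loader, and it validates only the one parameter whose allowed values are listed in the table.

```python
import json
from pathlib import Path

ALLOWED_KERNEL_MODES = {"local", "container"}  # the two values listed in the table


def load_config(path: str) -> dict:
    """Read a flat, dotted-key JSON config and sanity-check one field."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    # "container" is described as the default mode in the execution-mode section.
    mode = config.get("execution_service.kernel_mode", "container")
    if mode not in ALLOWED_KERNEL_MODES:
        raise ValueError(f"execution_service.kernel_mode must be one of "
                         f"{sorted(ALLOWED_KERNEL_MODES)}, got {mode!r}")
    return config
```

Note that keys such as `llm.model` are literal strings containing a dot, so they are looked up as `config["llm.model"]` rather than as nested objects.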

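TaskWeaver Use Cases

This example uses Microsoft's TaskWeaver framework with the GLM-4 model to complete agent tasks that require planning.

Edit the relevant settings in project/taskweaver_config.json, following the official documentation, to use GLM-4 as the underlying model.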

{
  "llm.api_base": "https://open.bigmodel.cn/api/paas/v4",
  "llm.api_key": "your keys",
  "llm.api_type": "zhipuai",
  "llm.model": "glm-4",
  "llm.embedding_model": "embedding-2",
  "llm.embedding_api_type": "zhipuai"
}
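To keep the real key out of the file you commit, one option is to generate the config from an environment variable. This is a sketch under assumptions: `build_glm4_config` and `write_config` are hypothetical helpers, and `ZHIPUAI_API_KEY` is a variable name chosen for this example, not one TaskWeaver is known to read.

```python
import json
import os


def build_glm4_config(api_key: str) -> dict:
    """Return the same settings as the JSON above, with the key passed in."""
    return {
        "llm.api_base": "https://open.bigmodel.cn/api/paas/v4",
        "llm.api_key": api_key,
        "llm.api_type": "zhipuai",
        "llm.model": "glm-4",
        "llm.embedding_model": "embedding-2",
        "llm.embedding_api_type": "zhipuai",
    }


def write_config(path: str, env_var: str = "ZHIPUAI_API_KEY") -> None:
    # ZHIPUAI_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"set the {env_var} environment variable first")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_glm4_config(api_key), fh, indent=2)
```

Running `write_config("project/taskweaver_config.json")` after exporting the variable produces a file equivalent to the one shown above.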

Case 1: Get the machine's username

   Human   ▶  Get the username of the current machine, which is a Mac
 ╭───< Planner >
 ├─► [init_plan] 1. Use the os module to get the current username
 ├─► [plan] 1. Use the os module to get the current username
 ├─► [current_plan_step] 1. Use the os module to get the current username
 ├──● Please use the os module to get the current username of the MAC machine
 ├─► [board]
 │
 │   ====== Plan ======
 │   I have drawn up a plan:
 │   1. Use the os module to get the current username
 │   ==================
 │
 ╰──● sending message to CodeInterpreter
 ╭───< CodeInterpreter >
 ├─► [thought] I will use the os module to retrieve the current username of the MAC machine. I will make sure to directly import the required module without installing it. I will also ensure that I refer to
 │   intermediate variables from previous successful rounds and not refer to any failed rounds or previous conversations.
 ├─► [reply_type] python
 ├─► [reply_content]
 │   import os
 │
 │   # Get the current username
 │   username = os.getlogin()
 │
 │   username
 ├─► [verification] NONE
 ├─► [code_error] No code verification is performed.
 ├─► [execution_status] SUCCESS
 ├─► [execution_result]
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │   liufinlay
 ├─► [artifact_paths] []
 ├──● The following python code has been executed:
 │   ```python
 │   import os
 │
 │   # Get the current username
 │   username = os.getlogin()
 │
 │   username
 │   ```
 │
 │
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │   liufinlay
 ╰──● sending message to Planner
 ╭───< Planner >
 ├─► [init_plan] 1. Use the os module to get the current username
 ├─► [plan] 1. Use the os module to get the current username
 ├─► [current_plan_step] 1. Use the os module to get the current username
 ├─► [review]
 │   The Python code to get the current username has been executed successfully.
 │   The current username on the MAC machine is 'liufinlay'.
 ├──● The current username on the MAC machine is 'liufinlay'.
 ├─► [board]
 │
 │   ====== Plan ======
 │   I have drawn up a plan:
 │   1. Use the os module to get the current username
 │   ==================
 │
 ╰──● sending message to User
 TaskWeaver ▶  The current username on the MAC machine is 'liufinlay'.
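The generated code relies on `os.getlogin()`, which reads the user from the process's controlling terminal and raises `OSError` when there is none, as can happen in a container or background service. A more tolerant variant, sketched here with a hypothetical helper name, tries `getpass.getuser()` first.

```python
import getpass
import os


def current_username() -> str:
    """Return the login name, tolerating sessions without a terminal."""
    try:
        # Checks LOGNAME, USER, LNAME, USERNAME, then the password database.
        return getpass.getuser()
    except (KeyError, OSError, ImportError):
        # Last resort: the call the generated code used.
        return os.getlogin()
```

This matters mostly if the same request is later run in container mode, where no terminal is attached to the executing process.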

Case 2: Count the files in a directory

   Human   ▶  Count the number of files under /Users/liufinlay/Downloads
 ╭───< Planner >
 ├─► [init_plan]
 │   1. Use the os module to list all files in the specified directory
 │   2. Count the number of files in the listed data <sequentially depends on 1>
 │   3. Check the execution result and report the file count to the user <interactively depends on 2>
 ├─► [plan]
 │   1. Instruct CodeInterpreter to list all files in the directory /Users/liufinlay/Downloads and count the number of files
 │   2. Check the execution result and report the file count to the user if it is correct
 ├─► [current_plan_step] 1. Instruct CodeInterpreter to list all files in the directory /Users/liufinlay/Downloads and count the number of files
 ├──● Please list all files in the directory /Users/liufinlay/Downloads and count the number of files
 ├─► [board]
 │
 │   ====== Plan ======
 │   I have drawn up a plan:
 │   1. Instruct CodeInterpreter to list all files in the directory /Users/liufinlay/Downloads and count the number of files
 │   2. Check the execution result and report the file count to the user if it is correct
 │   ==================
 │
 ╰──● sending message to CodeInterpreter
 ╭───< CodeInterpreter >
 ├─► [thought] I will use the os module to list all files in the directory /Users/liufinlay/Downloads and count the number of files. I will make sure to directly import the required module without installing it.
 │   I will also ensure that I refer to intermediate variables from previous successful rounds and not refer to any failed rounds or previous conversations.
 ├─► [reply_type] python
 ├─► [reply_content]
 │   import os
 │
 │   # List all files in the directory /Users/liufinlay/Downloads and count the number of files
 │   directory_path = '/Users/liufinlay/Downloads'
 │   file_count = len(os.listdir(directory_path))
 │   file_count
 ├─► [verification] NONE
 ├─► [code_error] No code verification is performed.
 ├─► [execution_status] SUCCESS
 ├─► [execution_result]
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │   277
 ├─► [artifact_paths] []
 ├──● The following python code has been executed:
 │   ```python
 │   import os
 │
 │   # List all files in the directory /Users/liufinlay/Downloads and count the number of files
 │   directory_path = '/Users/liufinlay/Downloads'
 │   file_count = len(os.listdir(directory_path))
 │   file_count
 │   ```
 │
 │
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │   277
 ╰──● sending message to Planner
 ╭───< Planner >
 ├─► [init_plan]
 │   1. Use the os module to list all files in the specified directory
 │   2. Count the number of files in the listed data <sequentially depends on 1>
 │   3. Check the execution result and report the file count to the user <interactively depends on 2>
 ├─► [plan]
 │   1. Instruct CodeInterpreter to list all files in the directory /Users/liufinlay/Downloads and count the number of files
 │   2. Check the execution result and report the file count to the user if it is correct
 ├─► [current_plan_step] 2. Check the execution result and report the file count to the user
 ├─► [review]
 │   The Python code to list all files in the directory /Users/liufinlay/Downloads and count the number of files has been executed successfully.
 │   There are 277 files in the directory.
 ├──● There are 277 files in the directory /Users/liufinlay/Downloads.
 ├─► [board]
 │
 │   ====== Plan ======
 │   I have drawn up a plan:
 │   1. Instruct CodeInterpreter to list all files in the directory /Users/liufinlay/Downloads and count the number of files
 │   2. Check the execution result and report the file count to the user if it is correct
 │   ==================
 │
 ╰──● sending message to User
 TaskWeaver ▶  There are 277 files in the directory /Users/liufinlay/Downloads.
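Note that `len(os.listdir(...))` counts every directory entry, including subdirectories and hidden files, so 277 is an entry count rather than strictly a file count. If the distinction matters, a breakdown along these lines could be requested instead; `count_entries` is a hypothetical helper written for this illustration.

```python
from pathlib import Path


def count_entries(directory: str) -> dict:
    """Count regular files and subdirectories, and tally hidden entries."""
    counts = {"files": 0, "dirs": 0, "hidden": 0}
    for entry in Path(directory).iterdir():
        if entry.name.startswith("."):
            # Hidden entries (e.g. .DS_Store on macOS) are tallied here
            # in addition to being counted as a file or a directory below.
            counts["hidden"] += 1
        if entry.is_dir():
            counts["dirs"] += 1
        elif entry.is_file():
            counts["files"] += 1
    return counts
```

The sum of `files` and `dirs` matches what `os.listdir` would report, except for entries that are neither, such as broken symlinks.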
