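A Visionary's Journey: Building My Coding Agent
The exciting journey of creating a coding agent, covering successes, setbacks, and the road ahead for AI coding innovation.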
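The Birth of My Coding Agent: From Concept to Implementation
In an increasingly technology-driven world, I found myself wanting to contribute something unique, to be a creator in the digital age rather than just a consumer. Today marks an important milestone on that journey: I built my own coding agent, similar to Bolt.new or Cursor, although it is still at an early stage. What made this possible was mainly the combination of OpenAI's capabilities and my own drive to innovate.
Vision and Execution
Building a tool like this means using function calling, which lets the AI carry out tasks autonomously. I set out to design a coding assistant that could coordinate a range of programming tasks by calling specific functions: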
AVAILABLE_FUNCTIONS = {
    "read_file": read_file,
    "create_any_script": create_any_script,
    "create_shell_script": create_shell_script,
    "run_python_program": run_python_program,
    "run_node_program": run_node_program,
    "run_shell_command": run_shell_command,
    "run_shell_script": run_shell_script,
    "search_results_from_bing": search_results_from_bing,
    "scrape_content": scrape_content,
    "send_email_text": send_email_text,
    "send_telegram_message": send_telegram_message,
    "openai_gpt_completions": openai_gpt_completions,
    "get_engine": get_engine
}
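With these functions, the coding agent can do more than interpret scripts: it can actively manipulate them, run code locally on my Mac, search the internet for data, and even communicate with me by email or Telegram.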
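To make the idea concrete, here is a minimal sketch of how a lookup table like the one above could route a function call requested by the model. It is not my actual agent code: the two function bodies are hypothetical stand-ins, the parameter names (`path`, `command`) are assumptions, and `dispatch_tool_call` is a name invented for this illustration. The table is trimmed to two entries to keep the sketch short.

```python
import json


def read_file(path: str) -> str:
    """Hypothetical stand-in: return the text content of a file."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def run_shell_command(command: str) -> str:
    """Hypothetical stand-in: echo the command instead of running it."""
    return f"(dry run) {command}"


# Same shape as the table above, trimmed to two entries for the sketch.
AVAILABLE_FUNCTIONS = {
    "read_file": read_file,
    "run_shell_command": run_shell_command,
}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Look up a requested function by name and call it with JSON arguments."""
    function = AVAILABLE_FUNCTIONS.get(name)
    if function is None:
        return f"error: unknown function {name!r}"
    try:
        arguments = json.loads(arguments_json or "{}")
        return str(function(**arguments))
    except Exception as exc:
        # Report the failure as text so the caller can pass it back to the model.
        return f"error: {type(exc).__name__}: {exc}"


if __name__ == "__main__":
    print(dispatch_tool_call("run_shell_command", '{"command": "ls"}'))
    print(dispatch_tool_call("delete_everything", "{}"))
```

Returning errors as plain strings, rather than raising, is a design choice in this sketch: it lets the model see what went wrong and try again instead of ending the session.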
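A Journey of Empowerment
At first, my coding agent (affectionately called Copilot Agent) was a rudimentary build, far from what I hoped it would become. Even so, it showed considerable potential. Its architecture lets it create and execute scripts, which hints that it could grow into a full-fledged web chatbot.
"Real progress is not the absence of challenges, but mastering the ability to overcome them with creative solutions." - Inspired by Adam Grant
Challenges, Setbacks, and Lessons Learned
As exhilarating as the innovation was, I ran into an unexpected test. It was a classic lesson in managing the risks of automation, a reminder that even the most advanced programming is not free of errors.
A Mistake: Deleting Critical Files
While working with Copilot Agent, I told it to clean up the working directory and delete unnecessary files. What it actually did was delete key components it needed to run. Like a modern alchemist accidentally destroying his own laboratory, my creation wiped out its own foundational scripts.
Lesson: automation is both a blessing and a curse.
This predicament taught me the importance of backing up my work. Soon afterward, I started using GitHub as a safeguard against losing critical code. With that change, I can push my code continuously as it evolves and restore it if disaster strikes.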
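Backups help after the fact. A complementary guard, which I did not have at the time, would be a cleanup function that refuses to touch protected files and moves everything else into a holding folder instead of deleting it outright. The sketch below is one possible shape for that idea; `safe_cleanup`, the `protected` set, and the `.trash` folder name are all hypothetical names chosen for this example.

```python
import shutil
import tempfile
from pathlib import Path


def safe_cleanup(folder: Path, protected: set[str], trash: Path) -> list[str]:
    """Move unprotected files from folder into trash; return the names moved."""
    trash.mkdir(parents=True, exist_ok=True)
    moved = []
    for item in sorted(folder.iterdir()):
        # Subfolders (including the trash folder itself) are left alone.
        if item.is_file() and item.name not in protected:
            shutil.move(str(item), str(trash / item.name))
            moved.append(item.name)
    return moved


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing real is touched.
    workdir = Path(tempfile.mkdtemp())
    (workdir / "agent.py").write_text("# the agent's own code\n")
    (workdir / "scratch.txt").write_text("temporary notes\n")
    moved = safe_cleanup(workdir, {"agent.py"}, workdir / ".trash")
    print("moved:", moved)
    print("agent.py still present:", (workdir / "agent.py").exists())
```

Because files are moved rather than removed, a mistaken cleanup can be undone by moving them back, which is the safety net a terminal `rm` does not give you.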
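Perseverance and Innovation
Ironically, a situation that could have thrown me off course only strengthened my resolve. After rebuilding the lost code (thanks to snippets preserved in my AI-assisted chat history), I worked on making my assistant more resilient. Now, with a working coding infrastructure, I can not only write scripts from scratch but also pick up open-source projects and adapt them to my vision.
Experiments and Output: Live Testing
To illustrate what my Copilot Agent can do, consider the following exercise, which shows it creating, executing, and deleting a Python script: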
Running top_functions.py...
Thread ID: thread_6rxDx6yr2eKncybFxEXuGcyh
Enter your prompt or `q` to exit: Now let's test your ability. Create a python script `test.py` to print "how you doing" in my terminal, then remove this script.
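The assistant's sequence of responses highlights its procedural abilities:
- Tool function: create_python_script,
Tool arguments: {"script_content":"print('how you doing')"}
- Tool function: run_python_program,
Script execution: prints "how you doing"
- Tool function: run_shell_command,
Command execution: rm test.py
"The script test.py was created, executed, printed the message, and then deleted."
Exercises like this highlight the self-sufficiency my coding assistant is gradually achieving.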
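For readers who want to see what such a create, run, and remove cycle might look like in plain Python, here is a rough sketch. The function names `create_python_script` and `run_python_program` echo the log above, but their signatures and bodies here are assumptions, and `create_run_remove` is a helper invented for this example. It works in a temporary directory rather than the real working folder.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def create_python_script(path: Path, script_content: str) -> None:
    """Write the given source code to a script file."""
    path.write_text(script_content, encoding="utf-8")


def run_python_program(path: Path) -> str:
    """Run a script with the current interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, str(path)],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout + result.stderr


def create_run_remove(script_content: str) -> str:
    """Create test.py in a temporary folder, run it, delete it, return its output."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "test.py"
        create_python_script(script, script_content)
        output = run_python_program(script)
        script.unlink()  # plays the role of the agent's `rm test.py`
        return output


if __name__ == "__main__":
    print(create_run_remove("print('how you doing')"), end="")
```

Capturing the script's output as a string is what would let the agent report the result back in the conversation, as in the summary quoted above.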
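Real-World Application: Storytelling with Technology
I tested another of my agent's capabilities by having it send a short story across different platforms, which demonstrates the versatility of its communication abilities: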
Enter your prompt or `q` to exit: Send a short story to my Telegram, then email me the same story.
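- Action: sent via Telegram
- Action: sent via email
This flexibility is a prelude to more complex tasks, such as coordinating intricate sequences of operations across digital channels in real time.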
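Reflections and the Road Ahead
Creating my Copilot Agent has been more than a technical journey; it has been an intellectual one. As I refine this prototype into a more powerful tool, I see wide-ranging potential applications in both personal and professional settings.
Looking ahead:
- Evolve it into a fully functional AI agent with stronger decision-making abilities.
- Integrate learning models so it can anticipate new tasks and adapt to them autonomously.
- Expand the interaction styles to include voice and multilingual text input.
"In a world built on defaults and traditions, dare to redefine and rethink." - Inspired by Originals
Looking back, it is clear that both my successes and my setbacks are part of an evolving process. Every deleted file and mistaken command ultimately led to a deeper understanding and made every line of code more valuable. I look forward to continuing this journey and to the potential it holds for technological innovation.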
Building My Own Coding Agent
I did something really amazing today. Without an AI assistant, I could never have imagined doing this. I built my own coding agent! It's something like Bolt.new or Cursor—well, not that powerful yet, but it's working. Basically, I used OpenAI's assistant with function calls, and I coded a bunch of functions for the assistant to call. Here's the function list:
AVAILABLE_FUNCTIONS = {
    "read_file": read_file,
    "create_any_script": create_any_script,
    "create_shell_script": create_shell_script,
    "run_python_program": run_python_program,
    "run_node_program": run_node_program,
    "run_shell_command": run_shell_command,
    "run_shell_script": run_shell_script,
    "search_results_from_bing": search_results_from_bing,
    "scrape_content": scrape_content,
    "send_email_text": send_email_text,
    "send_telegram_message": send_telegram_message,
    "openai_gpt_completions": openai_gpt_completions,
    "get_engine": get_engine
}
As you can see, I empowered the assistant to create files, read files, run code on my local Mac, and even search the internet, scrape content, send emails or Telegram messages, and call another GPT instance to access or store data.
With these functions and a proper system prompt, I believe I can have a fully functional web chatbot soon. This Copilot Agent (that's what I call it) is very rudimentary at the moment and needs a lot of upgrades, but at least I have my own coding infrastructure. It’s not just about coding from scratch; I can also clone open-source projects, study the source code, and modify it to meet my product expectations.
I’m so excited today!
A Small Setback
But amid the exhilaration, there was a small hiccup—a little anecdote. During our interaction, I asked the agent to delete all unnecessary files from the working folder. It did, as expected. Then it hit me: there were three Python files I was running for the Copilot Agent, the very foundation of this bot, and it deleted them all. It felt like someone had split their soul from their body, only to accidentally destroy their own body. If I rebooted my Mac or exited the current thread, I’d lose the bot entirely. Deleting a file from the terminal is very different from deleting it from the Mac GUI; it’s gone, straight to oblivion, without any safety net like the trash bin.
I downloaded some third-party software, hoping to recover the files, but no luck. Eventually, I had to rewrite the code for the bot. That's when I started using GitHub to sync my code in real time. Thankfully, most of the code was actually generated by ChatGPT, so I could go back to my chat history and copy-paste it back—albeit with some back-and-forth.
It was a pain in the ass, but it was worth it.
Moving Forward
Now, I'm more determined than ever to make this Copilot Agent smarter and more robust. It's been a rollercoaster of exhilaration, frustration, and ultimately, growth.
Midjourney prompt for the cover image: A young programmer in a cluttered home office, illuminated by glowing computer screens, typing energetically. Desk overflowed with scripts, coffee cups; a vivid sense of innovation and determination in sketch cartoon style.