<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI on Hanguangwu</title><link>https://hanguangwu.github.io/blog/en/categories/ai/</link><description>Recent content in AI on Hanguangwu</description><generator>Hugo -- gohugo.io</generator><language>en</language><copyright>hanguangwu</copyright><lastBuildDate>Tue, 10 Feb 2026 18:34:25 -0800</lastBuildDate><atom:link href="https://hanguangwu.github.io/blog/en/categories/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Awesome LLM Apps</title><link>https://hanguangwu.github.io/blog/en/p/awesome-llm-apps/</link><pubDate>Tue, 10 Feb 2026 18:34:25 -0800</pubDate><guid>https://hanguangwu.github.io/blog/en/p/awesome-llm-apps/</guid><description>&lt;h1 id="-awesome-llm-apps"&gt;🌟 Awesome LLM Apps
&lt;/h1&gt;&lt;h2 id="introduction"&gt;Introduction
&lt;/h2&gt;&lt;p&gt;A curated collection of &lt;strong&gt;Awesome LLM apps built with RAG, AI Agents, Multi-agent Teams, MCP, Voice Agents, and more.&lt;/strong&gt; This repository features LLM apps that use models from &lt;img src="https://cdn.simpleicons.org/openai" alt="openai logo" width="25" height="15"&gt;&lt;strong&gt;OpenAI&lt;/strong&gt;, &lt;img src="https://cdn.simpleicons.org/anthropic" alt="anthropic logo" width="25" height="15"&gt;&lt;strong&gt;Anthropic&lt;/strong&gt;, &lt;img src="https://cdn.simpleicons.org/googlegemini" alt="google logo" width="25" height="18"&gt;&lt;strong&gt;Google&lt;/strong&gt;, &lt;img src="https://cdn.simpleicons.org/x" alt="X logo" width="25" height="15"&gt;&lt;strong&gt;xAI&lt;/strong&gt;, and open-source models like &lt;img src="https://cdn.simpleicons.org/alibabacloud" alt="alibaba logo" width="25" height="15"&gt;&lt;strong&gt;Qwen&lt;/strong&gt; or &lt;img src="https://cdn.simpleicons.org/meta" alt="meta logo" width="25" height="15"&gt;&lt;strong&gt;Llama&lt;/strong&gt; that you can run locally on your computer.&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="https://github.com/Shubhamsaboo/awesome-llm-apps" target="_blank" rel="noopener"
&gt;GitHub-Awesome LLM Apps&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="https://www.theunwindai.com/" target="_blank" rel="noopener"
&gt;Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini, and open-source models.&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="-why-awesome-llm-apps"&gt;🤔 Why Awesome LLM Apps?
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;💡 Discover practical and creative ways LLMs can be applied across different domains, from code repositories to email inboxes and more.&lt;/li&gt;
&lt;li&gt;🔥 Explore apps that combine LLMs from OpenAI, Anthropic, Gemini, and open-source alternatives with AI Agents, Agent Teams, MCP &amp;amp; RAG.&lt;/li&gt;
&lt;li&gt;🎓 Learn from well-documented projects and contribute to the growing open-source ecosystem of LLM-powered applications.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="-featured-ai-projects"&gt;📂 Featured AI Projects
&lt;/h2&gt;&lt;h3 id="ai-agents"&gt;AI Agents
&lt;/h3&gt;&lt;h3 id="-starter-ai-agents"&gt;🌱 Starter AI Agents
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/ai_blog_to_podcast_agent/" &gt;🎙️ AI Blog to Podcast Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/ai_breakup_recovery_agent/" &gt;❤️‍🩹 AI Breakup Recovery Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/ai_data_analysis_agent/" &gt;📊 AI Data Analysis Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/ai_medical_imaging_agent/" &gt;🩻 AI Medical Imaging Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/ai_meme_generator_agent_browseruse/" &gt;😂 AI Meme Generator Agent (Browser)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/ai_music_generator_agent/" &gt;🎵 AI Music Generator Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/ai_travel_agent/" &gt;🛫 AI Travel Agent (Local &amp;amp; Cloud)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/gemini_multimodal_agent_demo/" &gt;✨ Gemini Multimodal Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/mixture_of_agents/" &gt;🔄 Mixture of Agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/xai_finance_agent/" &gt;📊 xAI Finance Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/opeani_research_agent/" &gt;🔍 OpenAI Research Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="starter_ai_agents/web_scrapping_ai_agent/" &gt;🕸️ Web Scraping AI Agent (Local &amp;amp; Cloud SDK)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-advanced-ai-agents"&gt;🚀 Advanced AI Agents
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/ai_home_renovation_agent" &gt;🏚️ 🍌 AI Home Renovation Agent with Nano Banana Pro&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/ai_deep_research_agent/" &gt;🔍 AI Deep Research Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_vc_due_diligence_agent_team" &gt;📊 AI VC Due Diligence Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/research_agent_gemini_interaction_api" &gt;🔬 AI Research Planner &amp;amp; Executor (Google Interactions API)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/ai_consultant_agent" &gt;🤝 AI Consultant Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/ai_system_architect_r1/" &gt;🏗️ AI System Architect Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/ai_financial_coach_agent/" &gt;💰 AI Financial Coach Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/ai_movie_production_agent/" &gt;🎬 AI Movie Production Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/ai_investment_agent/" &gt;📈 AI Investment Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/ai_health_fitness_agent/" &gt;🏋️‍♂️ AI Health &amp;amp; Fitness Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/product_launch_intelligence_agent" &gt;🚀 AI Product Launch Intelligence Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/ai_journalist_agent/" &gt;🗞️ AI Journalist Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/ai_mental_wellbeing_agent/" &gt;🧠 AI Mental Wellbeing Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/single_agent_apps/ai_meeting_agent/" &gt;📑 AI Meeting Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/ai_Self-Evolving_agent/" &gt;🧬 AI Self-Evolving Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_sales_intelligence_agent_team" &gt;👨🏻‍💼 AI Sales Intelligence Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/ai_news_and_podcast_agents/" &gt;🎧 AI Social Media News and Podcast Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/accomplish-ai/openwork" target="_blank" rel="noopener"
&gt;🌐 Openwork - Open Browser Automation Agent&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-autonomous-game-playing-agents"&gt;🎮 Autonomous Game Playing Agents
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/autonomous_game_playing_agent_apps/ai_3dpygame_r1/" &gt;🎮 AI 3D Pygame Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/autonomous_game_playing_agent_apps/ai_chess_agent/" &gt;♜ AI Chess Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/autonomous_game_playing_agent_apps/ai_tic_tac_toe_agent/" &gt;🎲 AI Tic-Tac-Toe Agent&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-multi-agent-teams"&gt;🤝 Multi-agent Teams
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_competitor_intelligence_agent_team/" &gt;🧲 AI Competitor Intelligence Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_finance_agent_team/" &gt;💲 AI Finance Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_game_design_agent_team/" &gt;🎨 AI Game Design Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_legal_agent_team/" &gt;👨‍⚖️ AI Legal Agent Team (Cloud &amp;amp; Local)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_recruitment_agent_team/" &gt;💼 AI Recruitment Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_real_estate_agent_team" &gt;🏠 AI Real Estate Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_services_agency/" &gt;👨‍💼 AI Services Agency (CrewAI)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/ai_teaching_agent_team/" &gt;👨‍🏫 AI Teaching Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/multimodal_coding_agent_team/" &gt;💻 Multimodal Coding Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/multimodal_design_agent_team/" &gt;✨ Multimodal Design Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_ai_agents/multi_agent_apps/agent_teams/multimodal_uiux_feedback_agent_team/" &gt;🎨 🍌 Multimodal UI/UX Feedback Agent Team with Nano Banana&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://hanguangwu.github.io/blog/advanced_ai_agents/multi_agent_apps/agent_teams/ai_travel_planner_agent_team/" &gt;🌏 AI Travel Planner Agent Team&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-voice-ai-agents"&gt;🗣️ Voice AI Agents
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="voice_ai_agents/ai_audio_tour_agent/" &gt;🗣️ AI Audio Tour Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="voice_ai_agents/customer_support_voice_agent/" &gt;📞 Customer Support Voice Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="voice_ai_agents/voice_rag_openaisdk/" &gt;🔊 Voice RAG Agent (OpenAI SDK)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/akshayaggarwal99/jarvis-ai-assistant" target="_blank" rel="noopener"
&gt;🎙️ Open-Source Voice Dictation Agent (like Wispr Flow)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="mcp-ai-agents"&gt;&lt;img src="https://cdn.simpleicons.org/modelcontextprotocol" alt="mcp logo" width="25" height="20"&gt; MCP AI Agents
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="mcp_ai_agents/browser_mcp_agent/" &gt;♾️ Browser MCP Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="mcp_ai_agents/github_mcp_agent/" &gt;🐙 GitHub MCP Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="mcp_ai_agents/notion_mcp_agent" &gt;📑 Notion MCP Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="mcp_ai_agents/ai_travel_planner_mcp_agent_team" &gt;🌍 AI Travel Planner MCP Agent&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-rag-retrieval-augmented-generation"&gt;📀 RAG (Retrieval Augmented Generation)
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/agentic_rag_embedding_gemma" &gt;🔥 Agentic RAG with Embedding Gemma&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/agentic_rag_with_reasoning/" &gt;🧐 Agentic RAG with Reasoning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/ai_blog_search/" &gt;📰 AI Blog Search (RAG)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/autonomous_rag/" &gt;🔍 Autonomous RAG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/contextualai_rag_agent/" &gt;🔄 Contextual AI RAG Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/corrective_rag/" &gt;🔄 Corrective RAG (CRAG)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/deepseek_local_rag_agent/" &gt;🐋 Deepseek Local RAG Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/gemini_agentic_rag/" &gt;🤔 Gemini Agentic RAG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/hybrid_search_rag/" &gt;👀 Hybrid Search RAG (Cloud)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/llama3.1_local_rag/" &gt;🔄 Llama 3.1 Local RAG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/local_hybrid_search_rag/" &gt;🖥️ Local Hybrid Search RAG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/local_rag_agent/" &gt;🦙 Local RAG Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/rag-as-a-service/" &gt;🧩 RAG-as-a-Service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/rag_agent_cohere/" &gt;✨ RAG Agent with Cohere&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/rag_chain/" &gt;⛓️ Basic RAG Chain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/rag_database_routing/" &gt;📠 RAG with Database Routing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="rag_tutorials/vision_rag/" &gt;🖼️ Vision RAG&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-llm-apps-with-memory-tutorials"&gt;💾 LLM Apps with Memory Tutorials
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/llm_apps_with_memory_tutorials/ai_arxiv_agent_memory/" &gt;💾 AI ArXiv Agent with Memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/llm_apps_with_memory_tutorials/ai_travel_agent_memory/" &gt;🛩️ AI Travel Agent with Memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/llm_apps_with_memory_tutorials/llama3_stateful_chat/" &gt;💬 Llama3 Stateful Chat&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/llm_apps_with_memory_tutorials/llm_app_personalized_memory/" &gt;📝 LLM App with Personalized Memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/llm_apps_with_memory_tutorials/local_chatgpt_with_memory/" &gt;🗄️ Local ChatGPT Clone with Memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/llm_apps_with_memory_tutorials/multi_llm_memory/" &gt;🧠 Multi-LLM Application with Shared Memory&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-chat-with-x-tutorials"&gt;💬 Chat with X Tutorials
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/chat_with_X_tutorials/chat_with_github/" &gt;💬 Chat with GitHub (GPT &amp;amp; Llama3)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/chat_with_X_tutorials/chat_with_gmail/" &gt;📨 Chat with Gmail&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/chat_with_X_tutorials/chat_with_pdf/" &gt;📄 Chat with PDF (GPT &amp;amp; Llama3)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/chat_with_X_tutorials/chat_with_research_papers/" &gt;📚 Chat with Research Papers (ArXiv) (GPT &amp;amp; Llama3)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/chat_with_X_tutorials/chat_with_substack/" &gt;📝 Chat with Substack&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/chat_with_X_tutorials/chat_with_youtube_videos/" &gt;📽️ Chat with YouTube Videos&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-llm-optimization-tools"&gt;🎯 LLM Optimization Tools
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/llm_optimization_tools/toonify_token_optimization/" &gt;🎯 Toonify Token Optimization&lt;/a&gt; - Reduce LLM API costs by 30-60% using TOON format&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="advanced_llm_apps/llm_optimization_tools/headroom_context_optimization/" &gt;🧠 Headroom Context Optimization&lt;/a&gt; - Reduce LLM API costs by 50-90% through intelligent context compression for AI agents (includes persistent memory &amp;amp; MCP support)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-llm-fine-tuning-tutorials"&gt;🔧 LLM Fine-tuning Tutorials
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;img src="https://cdn.simpleicons.org/google" alt="google logo" width="20" height="15"&gt; &lt;a class="link" href="advanced_llm_apps/llm_finetuning_tutorials/gemma3_finetuning/" &gt;Gemma 3 Fine-tuning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;img src="https://cdn.simpleicons.org/meta" alt="meta logo" width="25" height="15"&gt; &lt;a class="link" href="advanced_llm_apps/llm_finetuning_tutorials/llama3.2_finetuning/" &gt;Llama 3.2 Fine-tuning&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="-ai-agent-framework-crash-course"&gt;🧑‍🏫 AI Agent Framework Crash Course
&lt;/h3&gt;&lt;p&gt;&lt;img src="https://cdn.simpleicons.org/google" alt="google logo" width="25" height="15"&gt; &lt;a class="link" href="ai_agent_framework_crash_course/google_adk_crash_course/" &gt;Google ADK Crash Course&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Starter agent; model‑agnostic (OpenAI, Claude)&lt;/li&gt;
&lt;li&gt;Structured outputs (Pydantic)&lt;/li&gt;
&lt;li&gt;Tools: built‑in, function, third‑party, MCP tools&lt;/li&gt;
&lt;li&gt;Memory; callbacks; Plugins&lt;/li&gt;
&lt;li&gt;Simple multi‑agent; Multi‑agent patterns&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="https://cdn.simpleicons.org/openai" alt="openai logo" width="25" height="15"&gt; &lt;a class="link" href="ai_agent_framework_crash_course/openai_sdk_crash_course/" &gt;OpenAI Agents SDK Crash Course&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Starter agent; function calling; structured outputs&lt;/li&gt;
&lt;li&gt;Tools: built‑in, function, third‑party integrations&lt;/li&gt;
&lt;li&gt;Memory; callbacks; evaluation&lt;/li&gt;
&lt;li&gt;Multi‑agent patterns; agent handoffs&lt;/li&gt;
&lt;li&gt;Swarm orchestration; routing logic&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="-getting-started"&gt;🚀 Getting Started
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Clone the repository&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Navigate to the desired project directory&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; awesome-llm-apps/starter_ai_agents/ai_travel_agent
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install the required dependencies&lt;/strong&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pip install -r requirements.txt
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
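&lt;li&gt;
&lt;p&gt;&lt;strong&gt;(Optional) Export provider API keys&lt;/strong&gt;. Most of these apps read credentials from environment variables. The variable names below are common conventions, not guaranteed for every project, so confirm the exact names in each project&amp;rsquo;s &lt;code&gt;README.md&lt;/code&gt;. A minimal sketch:&lt;/p&gt;

```shell
# Hypothetical variable names -- check each project's README.md for the exact ones.
export OPENAI_API_KEY="sk-your-key-here"
export ANTHROPIC_API_KEY="your-key-here"
echo "$OPENAI_API_KEY"   # prints: sk-your-key-here
```
&lt;/li&gt;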
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Follow the project-specific instructions&lt;/strong&gt; in each project&amp;rsquo;s &lt;code&gt;README.md&lt;/code&gt; file to set up and run the app.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>500+ AI Agent Projects / UseCases</title><link>https://hanguangwu.github.io/blog/en/p/500-ai-agent-projects-/-usecases/</link><pubDate>Mon, 02 Feb 2026 17:34:25 -0800</pubDate><guid>https://hanguangwu.github.io/blog/en/p/500-ai-agent-projects-/-usecases/</guid><description>&lt;h1 id="-500-ai-agent-projects--usecases"&gt;🌟 500+ AI Agent Projects / UseCases
&lt;/h1&gt;&lt;p&gt;&lt;img src="https://cdn.jsdelivr.net/gh/Hanguangwu/MyImageBed01/img/20260202175639846.png"
loading="lazy"
&gt;&lt;/p&gt;
&lt;p&gt;A curated collection of AI agent use cases across industries, showcasing practical applications and linking to open-source projects for implementation. Explore how AI agents are transforming industries like healthcare, finance, education, and more! 🤖✨&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="https://github.com/ashishpatel26/500-AI-Agents-Projects" target="_blank" rel="noopener"
&gt;GitHub-Repo&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="-introduction"&gt;🧠 Introduction
&lt;/h2&gt;&lt;p&gt;Artificial Intelligence (AI) agents are revolutionizing the way industries operate. From personalized learning to financial trading bots, AI agents bring efficiency, innovation, and scalability. This repository provides:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A categorized list of industries where AI agents are making an impact.&lt;/li&gt;
&lt;li&gt;Detailed use cases with links to open-source projects for implementation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Whether you&amp;rsquo;re a developer, researcher, or business enthusiast, this repository is your go-to resource for AI agent inspiration and learning.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="selected-usecase-by-myself"&gt;Selected Usecase By Myself
&lt;/h2&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Industry&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Code Github&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI Health Assistant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Healthcare&lt;/td&gt;
&lt;td&gt;Diagnoses and monitors diseases using patient data.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/ahmadvh/AI-Agents-for-Medical-Diagnostics.git" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/465962dd14abb8181b8d1a3dbaf186be171bc3c5338d347e03b863e17980be8b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f64652d4769744875622d626c61636b3f6c6f676f3d676974687562"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Automated Trading Bot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Finance&lt;/td&gt;
&lt;td&gt;Automates stock trading with real-time market analysis.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/MingyuJ666/Stockagent.git" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/465962dd14abb8181b8d1a3dbaf186be171bc3c5338d347e03b863e17980be8b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f64652d4769744875622d626c61636b3f6c6f676f3d676974687562"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Content Personalization Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Entertainment&lt;/td&gt;
&lt;td&gt;Recommends personalized media based on preferences.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crosleythomas/MirrorGPT" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/465962dd14abb8181b8d1a3dbaf186be171bc3c5338d347e03b863e17980be8b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f64652d4769744875622d626c61636b3f6c6f676f3d676974687562"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Legal Document Review Assistant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Legal&lt;/td&gt;
&lt;td&gt;Automates document review and highlights key clauses.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/firica/legalai" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/465962dd14abb8181b8d1a3dbaf186be171bc3c5338d347e03b863e17980be8b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f64652d4769744875622d626c61636b3f6c6f676f3d676974687562"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recruitment Recommendation Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Human Resources&lt;/td&gt;
&lt;td&gt;Suggests best-fit candidates for job openings.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/sentient-engineering/jobber" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/465962dd14abb8181b8d1a3dbaf186be171bc3c5338d347e03b863e17980be8b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f64652d4769744875622d626c61636b3f6c6f676f3d676974687562"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Virtual Travel Assistant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hospitality&lt;/td&gt;
&lt;td&gt;Plans travel itineraries based on preferences.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/nirbar1985/ai-travel-agent" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/465962dd14abb8181b8d1a3dbaf186be171bc3c5338d347e03b863e17980be8b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f64652d4769744875622d626c61636b3f6c6f676f3d676974687562"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI Game Companion Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gaming&lt;/td&gt;
&lt;td&gt;Enhances player experience with real-time assistance.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/onjas-buidl/LLM-agent-game" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/465962dd14abb8181b8d1a3dbaf186be171bc3c5338d347e03b863e17980be8b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f64652d4769744875622d626c61636b3f6c6f676f3d676974687562"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🌐 Landing Page Generator&lt;/td&gt;
&lt;td&gt;💻 Web Development&lt;/td&gt;
&lt;td&gt;Automates the creation of landing pages for websites, facilitating web development tasks.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/landing_page_generator" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/78ef5623d7361e74e909b90ea5f4af9d939df5307c2896284062b70b0762bdbe/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875622d5265706f7369746f72792d626c7565"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🎮 Game Builder Crew&lt;/td&gt;
&lt;td&gt;🎮 Game Development&lt;/td&gt;
&lt;td&gt;Assists in the development of games by automating certain aspects of game creation.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/game-builder-crew" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/78ef5623d7361e74e909b90ea5f4af9d939df5307c2896284062b70b0762bdbe/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875622d5265706f7369746f72792d626c7565"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💹 Stock Analysis Tool&lt;/td&gt;
&lt;td&gt;💰 Finance&lt;/td&gt;
&lt;td&gt;Provides tools for analyzing stock market data to assist in financial decision-making.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/stock_analysis" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/78ef5623d7361e74e909b90ea5f4af9d939df5307c2896284062b70b0762bdbe/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875622d5265706f7369746f72792d626c7565"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🗺️ Trip Planner&lt;/td&gt;
&lt;td&gt;✈️ Travel&lt;/td&gt;
&lt;td&gt;Assists in planning trips by organizing itineraries and managing travel details.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/trip_planner" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/78ef5623d7361e74e909b90ea5f4af9d939df5307c2896284062b70b0762bdbe/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875622d5265706f7369746f72792d626c7565"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🎁 Surprise Trip Planner&lt;/td&gt;
&lt;td&gt;✈️ Travel&lt;/td&gt;
&lt;td&gt;Plans surprise trips by selecting destinations and activities based on user preferences.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/surprise_trip" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/78ef5623d7361e74e909b90ea5f4af9d939df5307c2896284062b70b0762bdbe/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875622d5265706f7369746f72792d626c7565"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📚 Write a Book with Flows&lt;/td&gt;
&lt;td&gt;✍️ Creative Writing&lt;/td&gt;
&lt;td&gt;Assists authors in writing books by providing structured workflows and writing assistance.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/flows/write_a_book_with_flows" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/78ef5623d7361e74e909b90ea5f4af9d939df5307c2896284062b70b0762bdbe/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875622d5265706f7369746f72792d626c7565"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🎬 Screenplay Writer&lt;/td&gt;
&lt;td&gt;✍️ Creative Writing&lt;/td&gt;
&lt;td&gt;Aids in writing screenplays by offering templates and guidance for script development.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/screenplay_writer" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/78ef5623d7361e74e909b90ea5f4af9d939df5307c2896284062b70b0762bdbe/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875622d5265706f7369746f72792d626c7565"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;✅ Markdown Validator&lt;/td&gt;
&lt;td&gt;📄 Documentation&lt;/td&gt;
&lt;td&gt;Validates Markdown files to ensure proper formatting and adherence to standards.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/markdown_validator" target="_blank" rel="noopener"
&gt;&lt;img src="https://camo.githubusercontent.com/78ef5623d7361e74e909b90ea5f4af9d939df5307c2896284062b70b0762bdbe/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4769744875622d5265706f7369746f72792d626c7565"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="-industry-usecase-mindmap"&gt;🏭 Industry UseCase MindMap
&lt;/h2&gt;&lt;p&gt;&lt;img src="https://cdn.jsdelivr.net/gh/Hanguangwu/MyImageBed01/img/20260202175656635.png"
loading="lazy"
&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="-use-case-table"&gt;🧩 Use Case Table
&lt;/h2&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Industry&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;GitHub&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HIA (Health Insights Agent)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Healthcare&lt;/td&gt;
&lt;td&gt;Analyzes medical reports and provides health insights.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/harshhh28/hia.git" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI Health Assistant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Healthcare&lt;/td&gt;
&lt;td&gt;Diagnoses and monitors diseases using patient data.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/ahmadvh/AI-Agents-for-Medical-Diagnostics.git" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Automated Trading Bot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Finance&lt;/td&gt;
&lt;td&gt;Automates stock trading with real-time market analysis.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/MingyuJ666/Stockagent.git" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Virtual AI Tutor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Education&lt;/td&gt;
&lt;td&gt;Provides personalized education tailored to users.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/hqanhh/EduGPT.git" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;24/7 AI Chatbot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Customer Service&lt;/td&gt;
&lt;td&gt;Handles customer queries around the clock.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/NirDiamant/GenAI_Agents/blob/main/all_agents_tutorials/customer_support_agent_langgraph.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Product Recommendation Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Retail&lt;/td&gt;
&lt;td&gt;Suggests products based on user preferences and history.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/microsoft/RecAI" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Self-Driving Delivery Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Transportation&lt;/td&gt;
&lt;td&gt;Optimizes routes and autonomously delivers packages.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/sled-group/driVLMe" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Factory Process Monitoring Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manufacturing&lt;/td&gt;
&lt;td&gt;Monitors production lines and ensures quality control.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/yuchenxia/llm4ias" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Property Pricing Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real Estate&lt;/td&gt;
&lt;td&gt;Analyzes market trends to determine property prices.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/AleksNeStu/ai-real-estate-assistant" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Smart Farming Assistant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agriculture&lt;/td&gt;
&lt;td&gt;Provides insights on crop health and yield predictions.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/mohammed97ashraf/LLM_Agri_Bot" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Energy Demand Forecasting Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Energy&lt;/td&gt;
&lt;td&gt;Predicts energy usage to optimize grid management.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/yecchen/MIRAI" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Content Personalization Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Entertainment&lt;/td&gt;
&lt;td&gt;Recommends personalized media based on preferences.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crosleythomas/MirrorGPT" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Legal Document Review Assistant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Legal&lt;/td&gt;
&lt;td&gt;Automates document review and highlights key clauses.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/firica/legalai" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recruitment Recommendation Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Human Resources&lt;/td&gt;
&lt;td&gt;Suggests best-fit candidates for job openings.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/sentient-engineering/jobber" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Virtual Travel Assistant&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hospitality&lt;/td&gt;
&lt;td&gt;Plans travel itineraries based on preferences.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/nirbar1985/ai-travel-agent" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI Game Companion Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gaming&lt;/td&gt;
&lt;td&gt;Enhances player experience with real-time assistance.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/onjas-buidl/LLM-agent-game" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Time Threat Detection Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cybersecurity&lt;/td&gt;
&lt;td&gt;Identifies potential threats and mitigates attacks.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/NVISOsecurity/cyber-security-llm-agents" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;E-commerce Personal Shopper Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;E-commerce&lt;/td&gt;
&lt;td&gt;Helps customers find products they’ll love.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/Hoanganhvu123/ShoppingGPT" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Logistics Optimization Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Supply Chain&lt;/td&gt;
&lt;td&gt;Plans efficient delivery routes and manages inventory.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/microsoft/OptiGuide" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vibe Hacking Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cybersecurity&lt;/td&gt;
&lt;td&gt;An autonomous multi-agent red-team testing service.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/PurpleAILAB/Decepticon" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MediSuite-Ai-Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Health insurance&lt;/td&gt;
&lt;td&gt;A medical AI agent that helps automate hospital and insurance claim workflows.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/MahmoudRabea13/MediSuite-Ai-Agent" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lina-Egyptian-Medical-Chatbot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Health insurance&lt;/td&gt;
&lt;td&gt;A medical AI agent that helps automate hospital and insurance claim workflows.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/MahmoudRabea13/MediSuite-Ai-Agent" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/Code-GitHub-black?logo=github"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="framework-wise-usecases"&gt;Framework wise Usecases
&lt;/h2&gt;&lt;hr&gt;
&lt;h3 id="framework-name-crewai"&gt;&lt;strong&gt;Framework Name&lt;/strong&gt;: &lt;strong&gt;CrewAI&lt;/strong&gt;
&lt;/h3&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Industry&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;GitHub&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;📧 Email Auto Responder Flow&lt;/td&gt;
&lt;td&gt;🗣️ Communication&lt;/td&gt;
&lt;td&gt;Automates email responses based on predefined criteria to enhance communication efficiency.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/flows/email_auto_responder_flow" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📝 Meeting Assistant Flow&lt;/td&gt;
&lt;td&gt;🛠️ Productivity&lt;/td&gt;
&lt;td&gt;Assists in organizing and managing meetings, including scheduling and agenda preparation.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/flows/meeting_assistant_flow" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔄 Self Evaluation Loop Flow&lt;/td&gt;
&lt;td&gt;👥 Human Resources&lt;/td&gt;
&lt;td&gt;Facilitates self-assessment processes within an organization, aiding in performance reviews.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/flows/self_evaluation_loop_flow" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📈 Lead Score Flow&lt;/td&gt;
&lt;td&gt;💼 Sales&lt;/td&gt;
&lt;td&gt;Evaluates and scores potential leads to prioritize outreach in sales strategies.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/flows/lead-score-flow" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📊 Marketing Strategy Generator&lt;/td&gt;
&lt;td&gt;📢 Marketing&lt;/td&gt;
&lt;td&gt;Develops marketing strategies by analyzing market trends and audience data.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/marketing_strategy" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📝 Job Posting Generator&lt;/td&gt;
&lt;td&gt;🧑‍💼 Recruitment&lt;/td&gt;
&lt;td&gt;Creates job postings by analyzing job requirements, aiding in recruitment processes.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/job-posting" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔄 Recruitment Workflow&lt;/td&gt;
&lt;td&gt;🧑‍💼 Recruitment&lt;/td&gt;
&lt;td&gt;Streamlines the recruitment process by automating various tasks involved in hiring.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/recruitment" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔍 Match Profile to Positions&lt;/td&gt;
&lt;td&gt;🧑‍💼 Recruitment&lt;/td&gt;
&lt;td&gt;Matches candidate profiles to suitable job positions to enhance recruitment efficiency.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/match_profile_to_positions" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📸 Instagram Post Generator&lt;/td&gt;
&lt;td&gt;📱 Social Media&lt;/td&gt;
&lt;td&gt;Generates and schedules Instagram posts automatically, streamlining social media management.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/instagram_post" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🌐 Landing Page Generator&lt;/td&gt;
&lt;td&gt;💻 Web Development&lt;/td&gt;
&lt;td&gt;Automates the creation of landing pages for websites, facilitating web development tasks.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/landing_page_generator" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🎮 Game Builder Crew&lt;/td&gt;
&lt;td&gt;🎮 Game Development&lt;/td&gt;
&lt;td&gt;Assists in the development of games by automating certain aspects of game creation.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/game-builder-crew" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💹 Stock Analysis Tool&lt;/td&gt;
&lt;td&gt;💰 Finance&lt;/td&gt;
&lt;td&gt;Provides tools for analyzing stock market data to assist in financial decision-making.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/stock_analysis" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🗺️ Trip Planner&lt;/td&gt;
&lt;td&gt;✈️ Travel&lt;/td&gt;
&lt;td&gt;Assists in planning trips by organizing itineraries and managing travel details.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/trip_planner" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🎁 Surprise Trip Planner&lt;/td&gt;
&lt;td&gt;✈️ Travel&lt;/td&gt;
&lt;td&gt;Plans surprise trips by selecting destinations and activities based on user preferences.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/surprise_trip" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📚 Write a Book with Flows&lt;/td&gt;
&lt;td&gt;✍️ Creative Writing&lt;/td&gt;
&lt;td&gt;Assists authors in writing books by providing structured workflows and writing assistance.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/flows/write_a_book_with_flows" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🎬 Screenplay Writer&lt;/td&gt;
&lt;td&gt;✍️ Creative Writing&lt;/td&gt;
&lt;td&gt;Aids in writing screenplays by offering templates and guidance for script development.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/screenplay_writer" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;✅ Markdown Validator&lt;/td&gt;
&lt;td&gt;📄 Documentation&lt;/td&gt;
&lt;td&gt;Validates Markdown files to ensure proper formatting and adherence to standards.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/markdown_validator" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🧠 Meta Quest Knowledge&lt;/td&gt;
&lt;td&gt;📚 Knowledge Management&lt;/td&gt;
&lt;td&gt;Manages and organizes knowledge related to Meta Quest, facilitating information retrieval.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/meta_quest_knowledge" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🤖 NVIDIA Models Integration&lt;/td&gt;
&lt;td&gt;🤖 AI Integration&lt;/td&gt;
&lt;td&gt;Integrates NVIDIA AI models into workflows to enhance computational capabilities.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/integrations/nvidia_models" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🗂️ Prep for a Meeting&lt;/td&gt;
&lt;td&gt;🛠️ Productivity&lt;/td&gt;
&lt;td&gt;Assists in preparing for meetings by organizing materials and setting agendas.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/prep-for-a-meeting" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🛠️ Starter Template&lt;/td&gt;
&lt;td&gt;🛠️ Development&lt;/td&gt;
&lt;td&gt;Provides a starter template for new projects to streamline the setup process.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/crews/starter_template" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔗 CrewAI + LangGraph Integration&lt;/td&gt;
&lt;td&gt;🤖 AI Integration&lt;/td&gt;
&lt;td&gt;Demonstrates integration between CrewAI and LangGraph for enhanced workflow automation.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://github.com/crewAIInc/crewAI-examples/tree/main/integrations/CrewAI-LangGraph" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/GitHub-Repository-blue"
loading="lazy"
alt="GitHub"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="framework-name-autogen"&gt;&lt;strong&gt;Framework Name&lt;/strong&gt;: &lt;strong&gt;Autogen&lt;/strong&gt;
&lt;/h3&gt;&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Code Generation, Execution, and Debugging&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;th&gt;Industry&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🤖 Automated Task Solving with Code Generation, Execution &amp;amp; Debugging&lt;/td&gt;
&lt;td&gt;💻 Software Development&lt;/td&gt;
&lt;td&gt;Demonstrates automated task-solving by generating, executing, and debugging code.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_auto_feedback_from_code_execution" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🧑‍💻 Automated Code Generation and Question Answering with Retrieval Augmented Agents&lt;/td&gt;
&lt;td&gt;💻 Software Development&lt;/td&gt;
&lt;td&gt;Generates code and answers questions using retrieval-augmented methods.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_RetrieveChat" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🧠 Automated Code Generation and Question Answering with Qdrant-based Retrieval&lt;/td&gt;
&lt;td&gt;💻 Software Development&lt;/td&gt;
&lt;td&gt;Utilizes Qdrant for enhanced retrieval-augmented agent performance.&lt;/td&gt;
&lt;td&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_RetrieveChat_qdrant" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
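&lt;p&gt;The loop behind these code-generation notebooks is simple: an agent proposes code, an executor runs it, and any error is fed back for a repaired attempt. A minimal sketch of that loop with a canned "writer" standing in for the LLM (the names &lt;code&gt;write_code&lt;/code&gt;, &lt;code&gt;execute&lt;/code&gt;, and &lt;code&gt;solve&lt;/code&gt; are illustrative, not AutoGen's API):&lt;/p&gt;

```python
# Sketch of the generate-execute-debug loop: a "writer" proposes code,
# an executor runs it, and errors are fed back for another attempt.
# The writer is a canned stub standing in for an LLM call.

def write_code(task, error=None):
    if error:  # "debugging": react to the previous failure
        return "result = sum(range(1, 11))"
    return "result = sum(range(1, 11)"  # first draft has a syntax bug

def execute(code):
    scope = {}
    try:
        exec(code, scope)
        return scope.get("result"), None
    except Exception as exc:
        return None, str(exc)

def solve(task, max_attempts=3):
    error = None
    for _ in range(max_attempts):
        code = write_code(task, error)
        result, error = execute(code)
        if error is None:
            return result
    raise RuntimeError(f"gave up: {error}")

print(solve("sum 1..10"))  # prints 55
```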
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Multi-Agent Collaboration (&amp;gt;3 Agents)&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤝 Automated Task Solving by Group Chat (3 members, 1 manager)&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤝 Collaboration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates group task-solving via multi-agent collaboration.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_groupchat" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📊 Automated Data Visualization by Group Chat (3 members, 1 manager)&lt;/td&gt;
&lt;td style="text-align: left"&gt;📊 Data Analysis&lt;/td&gt;
&lt;td style="text-align: left"&gt;Uses multi-agent collaboration to create data visualizations.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_groupchat_vis" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧩 Automated Complex Task Solving by Group Chat (6 members, 1 manager)&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤝 Collaboration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Solves complex tasks collaboratively with a larger group of agents.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_groupchat_research" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧑‍💻 Automated Task Solving with Coding &amp;amp; Planning Agents&lt;/td&gt;
&lt;td style="text-align: left"&gt;🛠️ Planning &amp;amp; Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;Combines coding and planning agents for solving tasks effectively.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_planning.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📐 Automated Task Solving with Transition Paths Specified in a Graph&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤝 Collaboration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Uses predefined transition paths in a graph for solving tasks.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/docs/notebooks/agentchat_groupchat_finite_state_machine" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Running a Group Chat as an Inner-Monologue via the SocietyOfMindAgent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 Cognitive Sciences&lt;/td&gt;
&lt;td style="text-align: left"&gt;Simulates inner-monologue for problem-solving using group chats.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_society_of_mind" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔧 Running a Group Chat with Custom Speaker Selection Function&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤝 Collaboration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Implements a custom function for speaker selection in group chats.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_groupchat_customized" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
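&lt;p&gt;All of these group-chat rows share one mechanism: a manager repeatedly selects the next speaker and appends that agent's reply to a shared transcript, with the last row swapping in a custom selection function. A framework-agnostic sketch of that mechanism, with canned replies standing in for LLM calls (names like &lt;code&gt;run_group_chat&lt;/code&gt; are illustrative, not AutoGen's API):&lt;/p&gt;

```python
# Framework-agnostic sketch of a managed group chat with a custom
# speaker-selection function. Replies are canned strings standing in
# for LLM calls; this mirrors the pattern, not AutoGen's actual API.

class Agent:
    def __init__(self, name, reply_fn):
        self.name = name
        self.reply_fn = reply_fn  # (transcript) -> str

    def reply(self, transcript):
        return self.reply_fn(transcript)

def round_robin(agents, transcript):
    """Custom speaker selection: cycle through agents in order."""
    # len(transcript) - 1 skips the seeded user message.
    return agents[(len(transcript) - 1) % len(agents)]

def run_group_chat(agents, task, select_speaker, max_turns=3):
    transcript = [("user", task)]
    for _ in range(max_turns):
        speaker = select_speaker(agents, transcript)
        message = speaker.reply(transcript)
        transcript.append((speaker.name, message))
    return transcript

agents = [
    Agent("planner", lambda t: "1. outline the task"),
    Agent("coder", lambda t: "def solve(): ..."),
    Agent("critic", lambda t: "looks good, ship it"),
]
chat = run_group_chat(agents, "build a tool", round_robin)
for name, msg in chat:
    print(f"{name}: {msg}")
```

&lt;p&gt;Swapping &lt;code&gt;round_robin&lt;/code&gt; for a function that inspects the transcript before choosing gives exactly the "custom speaker selection" variant in the last row.&lt;/p&gt;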
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Sequential Multi-Agent Chats&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔄 Solving Multiple Tasks in a Sequence of Chats Initiated by a Single Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔄 Workflow Automation&lt;/td&gt;
&lt;td style="text-align: left"&gt;Automates sequential task-solving with a single initiating agent.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_multi_task_chats" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;⏳ Async-solving Multiple Tasks in a Sequence of Chats Initiated by a Single Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔄 Workflow Automation&lt;/td&gt;
&lt;td style="text-align: left"&gt;Handles asynchronous task-solving in a sequence of chats initiated by one agent.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_multi_task_async_chats" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤝 Solving Multiple Tasks in a Sequence of Chats Initiated by Different Agents&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔄 Workflow Automation&lt;/td&gt;
&lt;td style="text-align: left"&gt;Facilitates sequential task-solving with different agents initiating each chat.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchats_sequential_chats" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
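&lt;p&gt;What makes these sequential chats work is the carryover: the summary of each finished chat is injected into the context of the next. A minimal sketch, with stubbed summaries in place of real LLM chats (&lt;code&gt;run_chat&lt;/code&gt; and &lt;code&gt;run_sequence&lt;/code&gt; are illustrative names, not AutoGen's API):&lt;/p&gt;

```python
# Sketch of sequential multi-agent chats: each chat's summary is
# carried into the next chat's context, as in the rows above.
# Stubbed summaries; illustrative names, not AutoGen's API.

def run_chat(task, carryover):
    context = " | ".join(carryover)
    summary = f"done: {task} (given: {context or 'nothing'})"
    return summary

def run_sequence(tasks):
    carryover = []
    for task in tasks:
        summary = run_chat(task, carryover)
        carryover.append(summary)  # feed forward into the next chat
    return carryover

summaries = run_sequence(["collect data", "analyze data", "write report"])
print(summaries[-1])
```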
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Nested Chats&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Solving Complex Tasks with Nested Chats&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 Problem Solving&lt;/td&gt;
&lt;td style="text-align: left"&gt;Uses nested chats to solve hierarchical and complex problems.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_nestedchat" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔄 Solving Complex Tasks with a Sequence of Nested Chats&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 Problem Solving&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates sequential task-solving using nested chats.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_nested_sequential_chats" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🏭 OptiGuide for Solving a Supply Chain Optimization Problem with Nested Chats&lt;/td&gt;
&lt;td style="text-align: left"&gt;🏭 Supply Chain Optimization&lt;/td&gt;
&lt;td style="text-align: left"&gt;Showcases how to solve supply chain optimization problems using nested chats, a coding agent, and a safeguard agent.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_nestedchat_optiguide" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;♟️ Conversational Chess with Nested Chats and Tool Use&lt;/td&gt;
&lt;td style="text-align: left"&gt;🎮 Gaming&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explores the use of nested chats for playing conversational chess with integrated tools.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_nested_chats_chess" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
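&lt;p&gt;The nested-chat rows above share one shape: an outer agent spins up an inner conversation (for example, a critic refining a draft) and folds its summary back into the outer reply. A minimal illustrative sketch of that control flow in plain Python — not AutoGen's actual API, whose &lt;code&gt;register_nested_chats&lt;/code&gt; machinery the linked notebooks show:&lt;/p&gt;

```python
# Minimal sketch of the nested-chat pattern: an outer agent delegates a
# sub-task to an inner conversation and folds the result into its reply.
# Illustrative only; the names here are hypothetical, not AutoGen's API.

def inner_chat(task: str) -> str:
    # Stand-in for a nested agent conversation, e.g. a critic refining a draft.
    draft = f"draft for: {task}"
    critique = "tightened"
    return f"{draft} ({critique})"

def outer_agent(message: str) -> str:
    # The outer agent triggers the nested chat, then builds on its summary.
    summary = inner_chat(message)
    return f"final answer based on -> {summary}"

print(outer_agent("summarize Q3 report"))
# -> final answer based on -> draft for: summarize Q3 report (tightened)
```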
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Application&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔄 Automated Continual Learning from New Data&lt;/td&gt;
&lt;td style="text-align: left"&gt;📊 Machine Learning&lt;/td&gt;
&lt;td style="text-align: left"&gt;Continuously learns from new data inputs for adaptive AI.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_stream.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🏭 OptiGuide - Coding, Tool Using, Safeguarding &amp;amp; Question Answering for Supply Chain Optimization&lt;/td&gt;
&lt;td style="text-align: left"&gt;🏭 Supply Chain Optimization&lt;/td&gt;
&lt;td style="text-align: left"&gt;Highlights a solution combining coding, tool use, and safeguarding for supply chain optimization.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_nestedchat_optiguide" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 AutoAnny - A Discord bot built using AutoGen&lt;/td&gt;
&lt;td style="text-align: left"&gt;💬 Communication Tools&lt;/td&gt;
&lt;td style="text-align: left"&gt;Showcases the development of a Discord bot using AutoGen for enhanced interaction.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/tree/main/samples/apps/auto-anny" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tools&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🌐 Web Search: Solve Tasks Requiring Web Info&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔍 Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;Searches the web to gather information required for completing tasks.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_web_info.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔧 Use Provided Tools as Functions&lt;/td&gt;
&lt;td style="text-align: left"&gt;🛠️ Tool Integration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates how to use pre-provided tools as callable functions in AutoGen.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_function_call_currency_calculator" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔗 Use Tools via Sync and Async Function Calling&lt;/td&gt;
&lt;td style="text-align: left"&gt;🛠️ Tool Integration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Illustrates synchronous and asynchronous tool usage within AutoGen workflows.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_function_call_async" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧩 Task Solving with Langchain Provided Tools as Functions&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔍 Language Processing&lt;/td&gt;
&lt;td style="text-align: left"&gt;Leverages Langchain tools for task-solving within AutoGen.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_langchain.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📚 RAG: Group Chat with Retrieval Augmented Generation&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤝 Collaboration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Enables group chat with Retrieval Augmented Generation (RAG) to support information sharing.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_groupchat_RAG" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;⚙️ Function Inception: Update/Remove Functions During Conversations&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔧 Development Tools&lt;/td&gt;
&lt;td style="text-align: left"&gt;Allows AutoGen agents to modify their functions dynamically during conversations.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_inception_function.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔊 Agent Chat with Whisper&lt;/td&gt;
&lt;td style="text-align: left"&gt;🎙️ Audio Processing&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates AI agent capabilities for transcription and translation using Whisper.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_video_transcript_translate_with_whisper" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📏 Constrained Responses via Guidance&lt;/td&gt;
&lt;td style="text-align: left"&gt;💡 Natural Language Processing&lt;/td&gt;
&lt;td style="text-align: left"&gt;Shows how to use the Guidance library to constrain responses generated by agents.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_guidance.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🌍 Browse the Web with Agents&lt;/td&gt;
&lt;td style="text-align: left"&gt;🌐 Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explains how to configure agents to browse and retrieve information from the web.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_surfer.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📊 SQL: Natural Language Text to SQL Query Using Spider Benchmark&lt;/td&gt;
&lt;td style="text-align: left"&gt;💾 Database Management&lt;/td&gt;
&lt;td style="text-align: left"&gt;Converts natural language inputs into SQL queries using the Spider benchmark.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_sql_spider.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🕸️ Web Scraping with Apify&lt;/td&gt;
&lt;td style="text-align: left"&gt;🌐 Data Gathering&lt;/td&gt;
&lt;td style="text-align: left"&gt;Illustrates web scraping techniques with Apify using AutoGen.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_webscraping_with_apify" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🕷️ Web Crawling: Crawl Entire Domain with Spider API&lt;/td&gt;
&lt;td style="text-align: left"&gt;🌐 Data Gathering&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explains how to crawl entire domains using the Spider API.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_webcrawling_with_spider" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;💻 Write a Software App Task by Task with Specially Designed Functions&lt;/td&gt;
&lt;td style="text-align: left"&gt;💻 Software Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;Builds a software application step-by-step using designed functions.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_function_call_code_writing.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
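&lt;p&gt;Most rows in the tools table build on one idea: tools are plain typed functions in a registry, and a model's tool call (a name plus JSON arguments) is dispatched to the matching callable, with the result serialized back. A self-contained sketch of that pattern — the registry and dispatcher here are hypothetical stand-ins, as AutoGen's own registration API (shown in the notebooks) differs:&lt;/p&gt;

```python
import json

# Minimal sketch of the tools-as-functions pattern: a registry of plain
# functions, and a dispatcher that executes a model-emitted tool call.
# Illustrative only; AutoGen's registration API differs.

def currency_calculator(amount: float, rate: float) -> float:
    """Convert an amount using a fixed exchange rate."""
    return round(amount * rate, 2)

TOOLS = {"currency_calculator": currency_calculator}

def dispatch(tool_call: str) -> str:
    call = json.loads(tool_call)          # {"name": ..., "arguments": {...}}
    fn = TOOLS[call["name"]]              # look up the registered function
    result = fn(**call["arguments"])      # invoke with the model's arguments
    return json.dumps({"result": result})  # serialize the result for the model

# A tool call as a model might emit it:
print(dispatch('{"name": "currency_calculator", "arguments": {"amount": 100, "rate": 0.92}}'))
# -> {"result": 92.0}
```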
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Human Involvement&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;💬 Simple Example in ChatGPT Style&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 Conversational AI&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates a simple conversational example in the style of ChatGPT.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/samples/simple_chat.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Example-blue?logo=openai"
loading="lazy"
alt="Example"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 Auto Code Generation, Execution, Debugging and Human Feedback&lt;/td&gt;
&lt;td style="text-align: left"&gt;💻 Software Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;Showcases code generation, execution, and debugging with human feedback integrated into the workflow.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_human_feedback.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;👥 Automated Task Solving with GPT-4 + Multiple Human Users&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤝 Collaboration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Enables task solving with multiple human users collaborating with GPT-4.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_two_users.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔄 Agent Chat with Async Human Inputs&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 Conversational AI&lt;/td&gt;
&lt;td style="text-align: left"&gt;Supports asynchronous human input during agent conversations.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/Async_human_input.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
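&lt;p&gt;The async-human-input row describes an agent that awaits a human reply without blocking its other work. A minimal sketch of that concurrency shape with &lt;code&gt;asyncio&lt;/code&gt;, where the "human" is simulated by a delayed coroutine — in the linked notebook this would be a real asynchronous input prompt:&lt;/p&gt;

```python
import asyncio

# Minimal sketch of asynchronous human input: the agent awaits a human
# reply while background work continues. The "human" here is simulated.

async def get_human_input(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulated human think time
    return "approved"

async def agent_step(proposal: str) -> str:
    # Kick off background work, then wait on the human without blocking it.
    background = asyncio.create_task(asyncio.sleep(0.01, result="logs flushed"))
    feedback = await get_human_input(f"OK to run: {proposal}?")
    await background
    return f"{proposal}: {feedback}"

print(asyncio.run(agent_step("deploy v2")))
# -> deploy v2: approved
```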
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Agent Teaching and Learning&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📘 Teach Agents New Skills &amp;amp; Reuse via Automated Chat&lt;/td&gt;
&lt;td style="text-align: left"&gt;🎓 Education &amp;amp; Training&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates teaching new skills to agents and enabling their reuse in automated chats.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_teaching" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Teach Agents New Facts, User Preferences and Skills Beyond Coding&lt;/td&gt;
&lt;td style="text-align: left"&gt;🎓 Education &amp;amp; Training&lt;/td&gt;
&lt;td style="text-align: left"&gt;Shows how to teach agents new facts, user preferences, and non-coding skills.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_teachability" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 Teach OpenAI Assistants Through GPTAssistantAgent&lt;/td&gt;
&lt;td style="text-align: left"&gt;💻 AI Assistant Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;Illustrates how to enhance OpenAI assistants&amp;rsquo; capabilities using GPTAssistantAgent.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_teachable_oai_assistants.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔄 Agent Optimizer: Train Agents in an Agentic Way&lt;/td&gt;
&lt;td style="text-align: left"&gt;🛠️ Optimization&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explains how to train agents effectively in an agentic manner using the Agent Optimizer.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_agentoptimizer.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Multi-Agent Chat with OpenAI Assistants in the Loop&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🌟 Hello-World Chat with OpenAI Assistant in AutoGen&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤖 Conversational AI&lt;/td&gt;
&lt;td style="text-align: left"&gt;A basic example of chatting with OpenAI Assistant using AutoGen.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_oai_assistant_twoagents_basic.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔧 Chat with OpenAI Assistant using Function Call&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔧 Development Tools&lt;/td&gt;
&lt;td style="text-align: left"&gt;Illustrates how to use function calls with OpenAI Assistant in chats.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_oai_assistant_function_call.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Chat with OpenAI Assistant with Code Interpreter&lt;/td&gt;
&lt;td style="text-align: left"&gt;💻 Software Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates the use of OpenAI Assistant as a code interpreter in chats.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_oai_code_interpreter.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔍 Chat with OpenAI Assistant with Retrieval Augmentation&lt;/td&gt;
&lt;td style="text-align: left"&gt;📚 Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;Enables retrieval-augmented conversations with OpenAI Assistant.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_oai_assistant_retrieval.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤝 OpenAI Assistant in a Group Chat&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤝 Collaboration&lt;/td&gt;
&lt;td style="text-align: left"&gt;Shows how OpenAI Assistant can collaborate with other agents in a group chat.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_oai_assistant_groupchat.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🛠️ GPTAssistantAgent based Multi-Agent Tool Use&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔧 Development Tools&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explains how to use GPTAssistantAgent for multi-agent tool usage.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/gpt_assistant_agent_function_call.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Non-OpenAI Models&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;♟️ Conversational Chess using Non-OpenAI Models&lt;/td&gt;
&lt;td style="text-align: left"&gt;🎮 Gaming&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explores conversational chess implemented with non-OpenAI models.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_nested_chats_chess_altmodels" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Multimodal Agent&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🎨 Multimodal Agent Chat with DALLE and GPT-4V&lt;/td&gt;
&lt;td style="text-align: left"&gt;🖼️ Multimedia AI&lt;/td&gt;
&lt;td style="text-align: left"&gt;Combines DALLE and GPT-4V for multimodal agent communication.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_dalle_and_gpt4v.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🖌️ Multimodal Agent Chat with Llava&lt;/td&gt;
&lt;td style="text-align: left"&gt;📷 Image Processing&lt;/td&gt;
&lt;td style="text-align: left"&gt;Uses Llava for enabling multimodal agent conversations with image processing.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_lmm_llava.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🖼️ Multimodal Agent Chat with GPT-4V&lt;/td&gt;
&lt;td style="text-align: left"&gt;🖼️ Multimedia AI&lt;/td&gt;
&lt;td style="text-align: left"&gt;Leverages GPT-4V for visual and conversational interactions in multimodal agents.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_lmm_gpt-4v.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Long Context Handling&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📜 Long Context Handling as a Capability&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI Capability&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates techniques for handling long context effectively within AI workflows.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/notebooks/agentchat_transform_messages" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
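&lt;p&gt;The long-context row links a notebook on message transforms: before each LLM call, the history is compressed to fit a budget. A hand-rolled sketch of one such transform (keep the system message plus the last few turns); the notebook itself uses AutoGen's transform-messages capability rather than this simplified version:&lt;/p&gt;

```python
# Minimal sketch of a message transform for long-context handling:
# keep system messages and only the most recent turns before an LLM call.
# Illustrative only; AutoGen ships its own transform_messages capability.

def transform_messages(messages: list, max_turns: int = 4) -> list:
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]  # system prompt + trailing window

history = [{"role": "system", "content": "be terse"}] + [
    {"role": "user", "content": f"turn {i}"} for i in range(10)
]
trimmed = transform_messages(history)
print([m["content"] for m in trimmed])
# -> ['be terse', 'turn 6', 'turn 7', 'turn 8', 'turn 9']
```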
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Evaluation and Assessment&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📊 AgentEval: A Multi-Agent System for Assessing Utility of LLM-Powered Applications&lt;/td&gt;
&lt;td style="text-align: left"&gt;📈 Performance Evaluation&lt;/td&gt;
&lt;td style="text-align: left"&gt;Introduces AgentEval for evaluating and assessing the performance of LLM-based applications.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agenteval_cq_math.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Automatic Agent Building&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🏗️ Automatically Build Multi-agent System with AgentBuilder&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤖 AI Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explains how to automatically build multi-agent systems using the AgentBuilder tool.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/autobuild_basic.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📚 Automatically Build Multi-agent System from Agent Library&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤖 AI Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;Shows how to construct multi-agent systems by leveraging a pre-defined agent library.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/autobuild_agent_library.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Observability&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📊 Track LLM Calls, Tool Usage, Actions and Errors using AgentOps&lt;/td&gt;
&lt;td style="text-align: left"&gt;📈 Monitoring &amp;amp; Analytics&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates how to monitor LLM interactions, tool usage, and errors using AgentOps.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_agentops.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Enhanced Inference&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔗 API Unification&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔧 API Management&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explains how to unify API usage with documentation and code examples.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/docs/Use-Cases/enhanced_inference/#api-unification" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Documentation-blue?logo=readthedocs"
loading="lazy"
alt="Documentation"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;⚙️ Utility Functions to Help Managing API Configurations Effectively&lt;/td&gt;
&lt;td style="text-align: left"&gt;🔧 API Management&lt;/td&gt;
&lt;td style="text-align: left"&gt;Demonstrates utility functions to manage API configurations more effectively.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://microsoft.github.io/autogen/0.2/docs/topics/llm_configuration" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;💰 Cost Calculation&lt;/td&gt;
&lt;td style="text-align: left"&gt;📈 Cost Management&lt;/td&gt;
&lt;td style="text-align: left"&gt;Introduces methods for tracking token usage and estimating costs for LLM interactions.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_cost_token_tracking.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;⚡ Optimize for Code Generation&lt;/td&gt;
&lt;td style="text-align: left"&gt;📊 Optimization&lt;/td&gt;
&lt;td style="text-align: left"&gt;Highlights cost-effective optimization techniques for improving code generation with LLMs.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/oai_completion.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📐 Optimize for Math&lt;/td&gt;
&lt;td style="text-align: left"&gt;📊 Optimization&lt;/td&gt;
&lt;td style="text-align: left"&gt;Explains techniques to optimize LLM performance for solving mathematical problems.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/microsoft/autogen/blob/0.2/notebook/oai_chatgpt_gpt4.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/badge/View-Notebook-blue?logo=jupyter"
loading="lazy"
alt="Notebook"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="framework-name-agno"&gt;&lt;strong&gt;Framework Name&lt;/strong&gt;: &lt;strong&gt;Agno&lt;/strong&gt;
&lt;/h3&gt;&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 Support Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;💻 Software Development / AI / Framework Support&lt;/td&gt;
&lt;td style="text-align: left"&gt;The Agno Support Agent helps developers with the Agno framework by providing real-time answers, explanations, and code examples.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/agno_support_agent.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🎥 YouTube Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;📺 Media &amp;amp; Content&lt;/td&gt;
&lt;td style="text-align: left"&gt;An intelligent agent that analyzes YouTube videos by generating detailed summaries, timestamps, themes, and content breakdowns using AI tools.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/youtube_agent.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📊 Finance Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;💼 Finance&lt;/td&gt;
&lt;td style="text-align: left"&gt;An advanced AI-powered market analyst that delivers real-time stock market insights, analyst recommendations, financial deep-dives, and sector-specific trends. Supports prompts for detailed analysis of companies like AAPL, TSLA, NVDA, etc.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/thinking_finance_agent.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📚 Study Partner&lt;/td&gt;
&lt;td style="text-align: left"&gt;🎓 Education&lt;/td&gt;
&lt;td style="text-align: left"&gt;Assists users in learning by finding resources, answering questions, and creating study plans.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/study_partner.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🛍️ Shopping Partner Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🏬 E-commerce&lt;/td&gt;
&lt;td style="text-align: left"&gt;A product recommender agent that helps users find matching products based on preferences from trusted platforms like Amazon, Flipkart, etc.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/shopping_partner.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🎓 Research Scholar Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 Education / Research&lt;/td&gt;
&lt;td style="text-align: left"&gt;An AI-powered academic assistant that performs advanced academic searches, analyzes recent publications, synthesizes findings across disciplines, and writes well-structured academic reports with proper citations.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/research_agent_exa.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Research Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🗞️ Media &amp;amp; Journalism&lt;/td&gt;
&lt;td style="text-align: left"&gt;A research agent that combines web search and professional journalistic writing. It performs deep investigations and produces NYT-style reports.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/research_agent.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🍳 Recipe Creator&lt;/td&gt;
&lt;td style="text-align: left"&gt;🍽️ Food &amp;amp; Culinary&lt;/td&gt;
&lt;td style="text-align: left"&gt;An AI-powered recipe recommendation agent that provides personalized recipes based on ingredients, preferences, and time constraints.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/recipe_creator.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🗞️ Finance Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;💼 Finance&lt;/td&gt;
&lt;td style="text-align: left"&gt;A powerful financial analyst agent combining real-time stock data, analyst insights, company fundamentals, and market news. Ideal for analyzing companies like Apple, Tesla, NVIDIA, and sectors like semiconductors or automotive.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/finance_agent.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Financial Reasoning Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;📈 Finance&lt;/td&gt;
&lt;td style="text-align: left"&gt;Uses a Claude-3.5 Sonnet-based agent to analyze stocks like NVDA using tools for reasoning and Yahoo Finance data.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/reasoning_finance_agent.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 Readme Generator Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;💻 Software Dev&lt;/td&gt;
&lt;td style="text-align: left"&gt;Agent generates high-quality READMEs for GitHub repositories using repo metadata.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/readme_generator.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🎬 Movie Recommendation Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🎥 Entertainment&lt;/td&gt;
&lt;td style="text-align: left"&gt;An intelligent agent that gives personalized movie recommendations using Exa and GPT-4o, analyzing genres, themes, and latest ratings.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/movie_recommedation.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔍 Media Trend Analysis Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;📰 Media &amp;amp; News&lt;/td&gt;
&lt;td style="text-align: left"&gt;Analyzes emerging trends, patterns, and influencers from digital platforms using AI-powered agents and scraping.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/media_trend_analysis_agent.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;⚖️ Legal Document Analysis Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🏛️ Legal Tech&lt;/td&gt;
&lt;td style="text-align: left"&gt;An AI agent that analyzes legal documents from PDF URLs and provides legal insights based on a knowledge base using vector embeddings and GPT-4o.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/legal_consultant.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤔 DeepKnowledge&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 Research&lt;/td&gt;
&lt;td style="text-align: left"&gt;This agent performs iterative searches through its knowledge base, breaking down complex queries into sub-questions and synthesizing comprehensive answers. It uses Agno docs for demonstration and is designed for deep reasoning and exploration.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/deep_knowledge.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;📚 Book Recommendation Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 Publishing &amp;amp; Media&lt;/td&gt;
&lt;td style="text-align: left"&gt;An intelligent agent that provides personalized book suggestions using literary data, reader preferences, reviews, and release info.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/book_recommendation.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🏠 MCP Airbnb Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🛎️ Hospitality&lt;/td&gt;
&lt;td style="text-align: left"&gt;Create an AI Agent using MCP and Llama 4 to search Airbnb listings with filters like workspace &amp;amp; transport proximity.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/airbnb_mcp.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 Assist Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI Framework&lt;/td&gt;
&lt;td style="text-align: left"&gt;An AI agent using GPT-4o to answer questions about the Agno framework with hybrid search and embedded knowledge.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/agno-agi/agno/blob/main/cookbook/examples/agents/agno_assist.py" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id="framework-name-langgraph"&gt;&lt;strong&gt;Framework Name&lt;/strong&gt;: &lt;strong&gt;Langgraph&lt;/strong&gt;
&lt;/h3&gt;&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Use Case&lt;/th&gt;
&lt;th style="text-align: left"&gt;Industry&lt;/th&gt;
&lt;th style="text-align: left"&gt;Description&lt;/th&gt;
&lt;th style="text-align: left"&gt;Notebook&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 Chatbot Simulation Evaluation&lt;/td&gt;
&lt;td style="text-align: left"&gt;💻 💬 AI / Quality Assurance&lt;/td&gt;
&lt;td style="text-align: left"&gt;Simulate user interactions to evaluate chatbot performance, ensuring robustness and reliability in real-world scenarios.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/chatbot-simulation-evaluation/agent-simulation-evaluation.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Information Gathering via Prompting&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Research &amp;amp; Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to design a LangGraph workflow that utilizes prompting techniques to gather information effectively. It showcases how to structure prompts and manage the flow of information to build intelligent agents.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/chatbots/information-gather-prompting.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Code Assistant with LangGraph&lt;/td&gt;
&lt;td style="text-align: left"&gt;💻 Software Development&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build a resilient code assistant using LangGraph. It guides you through creating a graph-based agent that can handle code generation, error checking, and iterative refinement, ensuring robust and accurate coding assistance.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/code_assistant/langgraph_code_assistant.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧑‍💼 Customer Support Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧑‍💼 Customer Support Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build a customer support agent using LangGraph. It guides you through creating a graph-based agent that can handle customer inquiries, providing automated support and enhancing user experience.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/customer-support/customer-support.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🔁 Extraction with Retries&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Data Extraction&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to implement retry mechanisms in LangGraph workflows, ensuring robust data extraction processes that can handle transient errors and improve reliability.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/extraction/retries.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Multi-Agent Workflow&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Workflow Orchestration&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build a multi-agent system using LangGraph&amp;rsquo;s agent supervisor. It guides you through creating a supervisor agent that orchestrates multiple specialized agents, managing task delegation and communication flow.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/multi_agent/agent_supervisor.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Hierarchical Agent Teams&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Workflow Orchestration&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build a hierarchical agent system using LangGraph. It guides you through creating a top-level supervisor agent that delegates tasks to specialized sub-agents, enabling complex workflows with clear task delegation and communication.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/multi_agent/hierarchical_agent_teams.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤝 Multi-Agent Collaboration&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Workflow Orchestration&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to implement multi-agent collaboration using LangGraph. It guides you through creating multiple specialized agents that work together to accomplish a complex task, showcasing the power of agent collaboration in AI workflows.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/multi_agent/multi-agent-collaboration.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Plan-and-Execute Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Workflow Orchestration&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build a &amp;ldquo;Plan-and-Execute&amp;rdquo; style agent using LangGraph. It guides you through creating an agent that first generates a multi-step plan and then executes each step sequentially, revisiting and modifying the plan as necessary. This approach is inspired by the Plan-and-Solve paper and the Baby-AGI project, aiming to enhance long-term planning and task execution in AI workflows.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/plan-and-execute/plan-and-execute.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 SQL Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Database Interaction&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build an agent that can answer questions about a SQL database. The agent fetches available tables, determines relevance to the question, retrieves schemas, generates a query, checks for errors, executes it, and formulates a response.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/sql-agent.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Reflection Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Workflow Orchestration&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build a reflection agent using LangGraph. It guides you through creating an agent that can critique and revise its own outputs, enhancing the quality and reliability of generated content.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/reflection/reflection.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 Reflexion Agent&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Workflow Orchestration&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build a reflexion agent using LangGraph. It guides you through creating an agent that can reflect on its actions and outcomes, enabling iterative improvement and more accurate decision-making in complex workflows.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/reflexion/reflexion.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;&lt;strong&gt;LangGraph Agentic RAG&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 &lt;strong&gt;Adaptive RAG&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to build an Adaptive RAG system using LangGraph. It guides you through creating a dynamic retrieval process that adjusts based on query complexity, enhancing the efficiency and accuracy of information retrieval.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/rag/langgraph_adaptive_rag.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 &lt;strong&gt;Adaptive RAG (Local)&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial focuses on implementing Adaptive RAG with local models, allowing for offline retrieval and generation, which is crucial for environments with limited internet access or privacy concerns.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/rag/langgraph_adaptive_rag_local.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 &lt;strong&gt;Agentic RAG&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤖 AI / Intelligent Agents&lt;/td&gt;
&lt;td style="text-align: left"&gt;Learn to build an Agentic RAG system where an agent determines the best retrieval strategy before generating a response, improving the relevance and accuracy of answers.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/rag/langgraph_agentic_rag.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🤖 &lt;strong&gt;Agentic RAG (Local)&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;🤖 AI / Intelligent Agents&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial extends Agentic RAG to local environments, enabling the use of local models and data sources for retrieval and generation tasks.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/rag/langgraph_agentic_rag_local.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 &lt;strong&gt;Corrective RAG (CRAG)&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;Implement a Corrective RAG system that evaluates and refines retrieved documents before passing them to the generator, ensuring higher-quality outputs.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/rag/langgraph_crag.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 &lt;strong&gt;Corrective RAG (Local)&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial focuses on building a Corrective RAG system using local resources, allowing for offline document evaluation and refinement processes.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/rag/langgraph_crag_local.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 &lt;strong&gt;Self-RAG&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;Learn to implement Self-RAG, where the system reflects on its responses and retrieves additional information if necessary, enhancing the accuracy and relevance of generated content.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/rag/langgraph_self_rag.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;🧠 &lt;strong&gt;Self-RAG (Local)&lt;/strong&gt;&lt;/td&gt;
&lt;td style="text-align: left"&gt;🧠 AI / Information Retrieval&lt;/td&gt;
&lt;td style="text-align: left"&gt;This tutorial demonstrates how to implement Self-RAG using local models and data sources, enabling offline reflection and retrieval processes.&lt;/td&gt;
&lt;td style="text-align: left"&gt;&lt;a class="link" href="https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials/rag/langgraph_self_rag_local.ipynb" target="_blank" rel="noopener"
&gt;&lt;img src="https://img.shields.io/static/v1?label=AI&amp;#43;Agent&amp;#43;Code&amp;amp;message=Python&amp;amp;color=%23244cd1"
loading="lazy"
alt="AI Agent Code - Python"
&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;</description></item><item><title>Free-LLM-API-Resources</title><link>https://hanguangwu.github.io/blog/en/p/free-llm-api-resources/</link><pubDate>Sat, 03 Jan 2026 15:34:25 -0800</pubDate><guid>https://hanguangwu.github.io/blog/en/p/free-llm-api-resources/</guid><description>&lt;h1 id="free-llm-api-resources"&gt;Free LLM API resources
&lt;/h1&gt;&lt;p&gt;This lists various services that provide free access or credits towards API-based LLM usage.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[!NOTE]&lt;br&gt;
Please don&amp;rsquo;t abuse these services, or we might lose them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;[!WARNING]&lt;br&gt;
This list explicitly excludes any services that are not legitimate (e.g., ones that reverse-engineer an existing chatbot).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a class="link" href="https://github.com/cheahjs/free-llm-api-resources" target="_blank" rel="noopener"
&gt;GitHub repo: a list of free LLM inference resources accessible via API.&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="free-providers"&gt;Free Providers
&lt;/h2&gt;&lt;h3 id="openrouter"&gt;&lt;a class="link" href="https://openrouter.ai" target="_blank" rel="noopener"
&gt;OpenRouter&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="https://openrouter.ai/docs/api-reference/limits" target="_blank" rel="noopener"
&gt;20 requests/minute&lt;br&gt;50 requests/day&lt;br&gt;Up to 1000 requests/day with $10 lifetime topup&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Models share a common quota.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/google/gemma-3-12b-it:free" target="_blank" rel="noopener"
&gt;Gemma 3 12B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/google/gemma-3-27b-it:free" target="_blank" rel="noopener"
&gt;Gemma 3 27B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/google/gemma-3-4b-it:free" target="_blank" rel="noopener"
&gt;Gemma 3 4B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/nousresearch/hermes-3-llama-3.1-405b:free" target="_blank" rel="noopener"
&gt;Hermes 3 Llama 3.1 405B&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/meta-llama/llama-3.1-405b-instruct:free" target="_blank" rel="noopener"
&gt;Llama 3.1 405B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/meta-llama/llama-3.2-3b-instruct:free" target="_blank" rel="noopener"
&gt;Llama 3.2 3B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/meta-llama/llama-3.3-70b-instruct:free" target="_blank" rel="noopener"
&gt;Llama 3.3 70B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/mistralai/mistral-7b-instruct:free" target="_blank" rel="noopener"
&gt;Mistral 7B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/mistralai/mistral-small-3.1-24b-instruct:free" target="_blank" rel="noopener"
&gt;Mistral Small 3.1 24B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/qwen/qwen-2.5-vl-7b-instruct:free" target="_blank" rel="noopener"
&gt;Qwen 2.5 VL 7B Instruct&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/alibaba/tongyi-deepresearch-30b-a3b:free" target="_blank" rel="noopener"
&gt;alibaba/tongyi-deepresearch-30b-a3b:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/allenai/olmo-3-32b-think:free" target="_blank" rel="noopener"
&gt;allenai/olmo-3-32b-think:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/allenai/olmo-3.1-32b-think:free" target="_blank" rel="noopener"
&gt;allenai/olmo-3.1-32b-think:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/arcee-ai/trinity-mini:free" target="_blank" rel="noopener"
&gt;arcee-ai/trinity-mini:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/cognitivecomputations/dolphin-mistral-24b-venice-edition:free" target="_blank" rel="noopener"
&gt;cognitivecomputations/dolphin-mistral-24b-venice-edition:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/deepseek/deepseek-r1-0528:free" target="_blank" rel="noopener"
&gt;deepseek/deepseek-r1-0528:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/google/gemma-3n-e2b-it:free" target="_blank" rel="noopener"
&gt;google/gemma-3n-e2b-it:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/google/gemma-3n-e4b-it:free" target="_blank" rel="noopener"
&gt;google/gemma-3n-e4b-it:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/kwaipilot/kat-coder-pro:free" target="_blank" rel="noopener"
&gt;kwaipilot/kat-coder-pro:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/mistralai/devstral-2512:free" target="_blank" rel="noopener"
&gt;mistralai/devstral-2512:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/moonshotai/kimi-k2:free" target="_blank" rel="noopener"
&gt;moonshotai/kimi-k2:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/nex-agi/deepseek-v3.1-nex-n1:free" target="_blank" rel="noopener"
&gt;nex-agi/deepseek-v3.1-nex-n1:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/nvidia/nemotron-3-nano-30b-a3b:free" target="_blank" rel="noopener"
&gt;nvidia/nemotron-3-nano-30b-a3b:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/nvidia/nemotron-nano-12b-v2-vl:free" target="_blank" rel="noopener"
&gt;nvidia/nemotron-nano-12b-v2-vl:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/nvidia/nemotron-nano-9b-v2:free" target="_blank" rel="noopener"
&gt;nvidia/nemotron-nano-9b-v2:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/openai/gpt-oss-120b:free" target="_blank" rel="noopener"
&gt;openai/gpt-oss-120b:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/openai/gpt-oss-20b:free" target="_blank" rel="noopener"
&gt;openai/gpt-oss-20b:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/qwen/qwen3-4b:free" target="_blank" rel="noopener"
&gt;qwen/qwen3-4b:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/qwen/qwen3-coder:free" target="_blank" rel="noopener"
&gt;qwen/qwen3-coder:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/tngtech/deepseek-r1t-chimera:free" target="_blank" rel="noopener"
&gt;tngtech/deepseek-r1t-chimera:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/tngtech/deepseek-r1t2-chimera:free" target="_blank" rel="noopener"
&gt;tngtech/deepseek-r1t2-chimera:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/tngtech/tng-r1t-chimera:free" target="_blank" rel="noopener"
&gt;tngtech/tng-r1t-chimera:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/xiaomi/mimo-v2-flash:free" target="_blank" rel="noopener"
&gt;xiaomi/mimo-v2-flash:free&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://openrouter.ai/z-ai/glm-4.5-air:free" target="_blank" rel="noopener"
&gt;z-ai/glm-4.5-air:free&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
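The free models above are all served through OpenRouter's OpenAI-compatible chat completions endpoint. As a minimal sketch (the model slug is one of the free models listed above; the `OPENROUTER_API_KEY` environment variable is an assumption, and the request is only assembled here, not sent):

```python
import json
import os
import urllib.request

# OpenRouter's OpenAI-compatible chat completions endpoint.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt, model="meta-llama/llama-3.3-70b-instruct:free"):
    """Assemble (but do not send) an HTTP request for a chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {
        # Key read from the environment; empty string if unset.
        "Authorization": "Bearer " + os.environ.get("OPENROUTER_API_KEY", ""),
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        API_URL, data=json.dumps(payload).encode("utf-8"), headers=headers
    )

req = build_request("Say hello in one word.")
print(req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` (or any HTTP client) returns an OpenAI-style JSON response; note that all free models draw from the shared daily quota described above.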
&lt;h3 id="google-ai-studio"&gt;&lt;a class="link" href="https://aistudio.google.com" target="_blank" rel="noopener"
&gt;Google AI Studio&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;Data is used for training when the API is accessed from outside the UK/CH/EEA/EU.&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Model Name&lt;/th&gt;&lt;th&gt;Model Limits&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Gemini 3 Flash&lt;/td&gt;&lt;td&gt;250,000 tokens/minute&lt;br&gt;20 requests/day&lt;br&gt;5 requests/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Gemini 2.5 Flash&lt;/td&gt;&lt;td&gt;250,000 tokens/minute&lt;br&gt;20 requests/day&lt;br&gt;5 requests/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Gemini 2.5 Flash-Lite&lt;/td&gt;&lt;td&gt;250,000 tokens/minute&lt;br&gt;20 requests/day&lt;br&gt;10 requests/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Gemma 3 27B Instruct&lt;/td&gt;&lt;td&gt;15,000 tokens/minute&lt;br&gt;14,400 requests/day&lt;br&gt;30 requests/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Gemma 3 12B Instruct&lt;/td&gt;&lt;td&gt;15,000 tokens/minute&lt;br&gt;14,400 requests/day&lt;br&gt;30 requests/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Gemma 3 4B Instruct&lt;/td&gt;&lt;td&gt;15,000 tokens/minute&lt;br&gt;14,400 requests/day&lt;br&gt;30 requests/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Gemma 3 1B Instruct&lt;/td&gt;&lt;td&gt;15,000 tokens/minute&lt;br&gt;14,400 requests/day&lt;br&gt;30 requests/minute&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3 id="nvidia-nim"&gt;&lt;a class="link" href="https://build.nvidia.com/explore/discover" target="_blank" rel="noopener"
&gt;NVIDIA NIM&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;Phone number verification is required.
Models tend to have limited context windows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt; 40 requests/minute&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://build.nvidia.com/models" target="_blank" rel="noopener"
&gt;Various open models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="mistral-la-plateforme"&gt;&lt;a class="link" href="https://console.mistral.ai/" target="_blank" rel="noopener"
&gt;Mistral (La Plateforme)&lt;/a&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Free tier (Experiment plan) requires opting into data training&lt;/li&gt;
&lt;li&gt;Requires phone number verification.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Limits (per-model):&lt;/strong&gt; 1 request/second, 500,000 tokens/minute, 1,000,000,000 tokens/month&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://docs.mistral.ai/getting-started/models/models_overview/" target="_blank" rel="noopener"
&gt;Open and Proprietary Mistral models&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="mistral-codestral"&gt;&lt;a class="link" href="https://codestral.mistral.ai/" target="_blank" rel="noopener"
&gt;Mistral (Codestral)&lt;/a&gt;
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;Currently free to use&lt;/li&gt;
&lt;li&gt;Monthly subscription based&lt;/li&gt;
&lt;li&gt;Requires phone number verification&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt; 30 requests/minute, 2,000 requests/day&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Codestral&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="huggingface-inference-providers"&gt;&lt;a class="link" href="https://huggingface.co/docs/inference-providers/en/index" target="_blank" rel="noopener"
&gt;HuggingFace Inference Providers&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;HuggingFace Serverless Inference is limited to models smaller than 10GB; some popular models are supported even if they exceed that size.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt; &lt;a class="link" href="https://huggingface.co/docs/inference-providers/en/pricing" target="_blank" rel="noopener"
&gt;$0.10/month in credits&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Various open models across supported providers&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="vercel-ai-gateway"&gt;&lt;a class="link" href="https://vercel.com/docs/ai-gateway" target="_blank" rel="noopener"
&gt;Vercel AI Gateway&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;Routes to various supported providers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt; &lt;a class="link" href="https://vercel.com/docs/ai-gateway/pricing" target="_blank" rel="noopener"
&gt;$5/month&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="cerebras"&gt;&lt;a class="link" href="https://cloud.cerebras.ai/" target="_blank" rel="noopener"
&gt;Cerebras&lt;/a&gt;
&lt;/h3&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Model Name&lt;/th&gt;&lt;th&gt;Model Limits&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;gpt-oss-120b&lt;/td&gt;&lt;td&gt;30 requests/minute&lt;br&gt;60,000 tokens/minute&lt;br&gt;900 requests/hour&lt;br&gt;1,000,000 tokens/hour&lt;br&gt;14,400 requests/day&lt;br&gt;1,000,000 tokens/day&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Qwen 3 235B A22B Instruct&lt;/td&gt;&lt;td&gt;30 requests/minute&lt;br&gt;60,000 tokens/minute&lt;br&gt;900 requests/hour&lt;br&gt;1,000,000 tokens/hour&lt;br&gt;14,400 requests/day&lt;br&gt;1,000,000 tokens/day&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Llama 3.3 70B&lt;/td&gt;&lt;td&gt;30 requests/minute&lt;br&gt;64,000 tokens/minute&lt;br&gt;900 requests/hour&lt;br&gt;1,000,000 tokens/hour&lt;br&gt;14,400 requests/day&lt;br&gt;1,000,000 tokens/day&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Qwen 3 32B&lt;/td&gt;&lt;td&gt;30 requests/minute&lt;br&gt;64,000 tokens/minute&lt;br&gt;900 requests/hour&lt;br&gt;1,000,000 tokens/hour&lt;br&gt;14,400 requests/day&lt;br&gt;1,000,000 tokens/day&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Llama 3.1 8B&lt;/td&gt;&lt;td&gt;30 requests/minute&lt;br&gt;60,000 tokens/minute&lt;br&gt;900 requests/hour&lt;br&gt;1,000,000 tokens/hour&lt;br&gt;14,400 requests/day&lt;br&gt;1,000,000 tokens/day&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Z.ai GLM-4.6&lt;/td&gt;&lt;td&gt;10 requests/minute&lt;br&gt;60,000 tokens/minute&lt;br&gt;100 requests/hour&lt;br&gt;100,000 tokens/hour&lt;br&gt;100 requests/day&lt;br&gt;1,000,000 tokens/day&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3 id="groq"&gt;&lt;a class="link" href="https://console.groq.com" target="_blank" rel="noopener"
&gt;Groq&lt;/a&gt;
&lt;/h3&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Model Name&lt;/th&gt;&lt;th&gt;Model Limits&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Allam 2 7B&lt;/td&gt;&lt;td&gt;7,000 requests/day&lt;br&gt;6,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Llama 3.1 8B&lt;/td&gt;&lt;td&gt;14,400 requests/day&lt;br&gt;6,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Llama 3.3 70B&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;12,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Llama 4 Maverick 17B 128E Instruct&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;6,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Llama 4 Scout Instruct&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;30,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Whisper Large v3&lt;/td&gt;&lt;td&gt;7,200 audio-seconds/minute&lt;br&gt;2,000 requests/day&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Whisper Large v3 Turbo&lt;/td&gt;&lt;td&gt;7,200 audio-seconds/minute&lt;br&gt;2,000 requests/day&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;canopylabs/orpheus-arabic-saudi&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;canopylabs/orpheus-v1-english&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;groq/compound&lt;/td&gt;&lt;td&gt;250 requests/day&lt;br&gt;70,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;groq/compound-mini&lt;/td&gt;&lt;td&gt;250 requests/day&lt;br&gt;70,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;meta-llama/llama-guard-4-12b&lt;/td&gt;&lt;td&gt;14,400 requests/day&lt;br&gt;15,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;meta-llama/llama-prompt-guard-2-22m&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;meta-llama/llama-prompt-guard-2-86m&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;moonshotai/kimi-k2-instruct&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;10,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;moonshotai/kimi-k2-instruct-0905&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;10,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;openai/gpt-oss-120b&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;8,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;openai/gpt-oss-20b&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;8,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;openai/gpt-oss-safeguard-20b&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;8,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;qwen/qwen3-32b&lt;/td&gt;&lt;td&gt;1,000 requests/day&lt;br&gt;6,000 tokens/minute&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3 id="cohere"&gt;&lt;a class="link" href="https://cohere.com" target="_blank" rel="noopener"
&gt;Cohere&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="https://docs.cohere.com/docs/rate-limits" target="_blank" rel="noopener"
&gt;20 requests/minute&lt;br&gt;1,000 requests/month&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Models share a common monthly quota.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;c4ai-aya-expanse-32b&lt;/li&gt;
&lt;li&gt;c4ai-aya-expanse-8b&lt;/li&gt;
&lt;li&gt;c4ai-aya-vision-32b&lt;/li&gt;
&lt;li&gt;c4ai-aya-vision-8b&lt;/li&gt;
&lt;li&gt;command-a-03-2025&lt;/li&gt;
&lt;li&gt;command-a-reasoning-08-2025&lt;/li&gt;
&lt;li&gt;command-a-translate-08-2025&lt;/li&gt;
&lt;li&gt;command-a-vision-07-2025&lt;/li&gt;
&lt;li&gt;command-r-08-2024&lt;/li&gt;
&lt;li&gt;command-r-plus-08-2024&lt;/li&gt;
&lt;li&gt;command-r7b-12-2024&lt;/li&gt;
&lt;li&gt;command-r7b-arabic-02-2025&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="github-models"&gt;&lt;a class="link" href="https://github.com/marketplace/models" target="_blank" rel="noopener"
&gt;GitHub Models&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;Extremely restrictive input/output token limits.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt; &lt;a class="link" href="https://docs.github.com/en/github-models/prototyping-with-ai-models#rate-limits" target="_blank" rel="noopener"
&gt;Dependent on Copilot subscription tier (Free/Pro/Pro+/Business/Enterprise)&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AI21 Jamba 1.5 Large&lt;/li&gt;
&lt;li&gt;Codestral 25.01&lt;/li&gt;
&lt;li&gt;Cohere Command A&lt;/li&gt;
&lt;li&gt;Cohere Command R 08-2024&lt;/li&gt;
&lt;li&gt;Cohere Command R+ 08-2024&lt;/li&gt;
&lt;li&gt;DeepSeek-R1&lt;/li&gt;
&lt;li&gt;DeepSeek-R1-0528&lt;/li&gt;
&lt;li&gt;DeepSeek-V3-0324&lt;/li&gt;
&lt;li&gt;Grok 3&lt;/li&gt;
&lt;li&gt;Grok 3 Mini&lt;/li&gt;
&lt;li&gt;Llama 4 Maverick 17B 128E Instruct FP8&lt;/li&gt;
&lt;li&gt;Llama 4 Scout 17B 16E Instruct&lt;/li&gt;
&lt;li&gt;Llama-3.2-11B-Vision-Instruct&lt;/li&gt;
&lt;li&gt;Llama-3.2-90B-Vision-Instruct&lt;/li&gt;
&lt;li&gt;Llama-3.3-70B-Instruct&lt;/li&gt;
&lt;li&gt;MAI-DS-R1&lt;/li&gt;
&lt;li&gt;Meta-Llama-3.1-405B-Instruct&lt;/li&gt;
&lt;li&gt;Meta-Llama-3.1-8B-Instruct&lt;/li&gt;
&lt;li&gt;Ministral 3B&lt;/li&gt;
&lt;li&gt;Mistral Medium 3 (25.05)&lt;/li&gt;
&lt;li&gt;Mistral Small 3.1&lt;/li&gt;
&lt;li&gt;OpenAI GPT-4.1&lt;/li&gt;
&lt;li&gt;OpenAI GPT-4.1-mini&lt;/li&gt;
&lt;li&gt;OpenAI GPT-4.1-nano&lt;/li&gt;
&lt;li&gt;OpenAI GPT-4o&lt;/li&gt;
&lt;li&gt;OpenAI GPT-4o mini&lt;/li&gt;
&lt;li&gt;OpenAI Text Embedding 3 (large)&lt;/li&gt;
&lt;li&gt;OpenAI Text Embedding 3 (small)&lt;/li&gt;
&lt;li&gt;OpenAI gpt-5&lt;/li&gt;
&lt;li&gt;OpenAI gpt-5-chat (preview)&lt;/li&gt;
&lt;li&gt;OpenAI gpt-5-mini&lt;/li&gt;
&lt;li&gt;OpenAI gpt-5-nano&lt;/li&gt;
&lt;li&gt;OpenAI o1&lt;/li&gt;
&lt;li&gt;OpenAI o1-mini&lt;/li&gt;
&lt;li&gt;OpenAI o1-preview&lt;/li&gt;
&lt;li&gt;OpenAI o3&lt;/li&gt;
&lt;li&gt;OpenAI o3-mini&lt;/li&gt;
&lt;li&gt;OpenAI o4-mini&lt;/li&gt;
&lt;li&gt;Phi-4&lt;/li&gt;
&lt;li&gt;Phi-4-mini-instruct&lt;/li&gt;
&lt;li&gt;Phi-4-mini-reasoning&lt;/li&gt;
&lt;li&gt;Phi-4-multimodal-instruct&lt;/li&gt;
&lt;li&gt;Phi-4-reasoning&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="cloudflare-workers-ai"&gt;&lt;a class="link" href="https://developers.cloudflare.com/workers-ai" target="_blank" rel="noopener"
&gt;Cloudflare Workers AI&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Limits:&lt;/strong&gt; &lt;a class="link" href="https://developers.cloudflare.com/workers-ai/platform/pricing/#free-allocation" target="_blank" rel="noopener"
&gt;10,000 neurons/day&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;@cf/aisingapore/gemma-sea-lion-v4-27b-it&lt;/li&gt;
&lt;li&gt;@cf/ibm-granite/granite-4.0-h-micro&lt;/li&gt;
&lt;li&gt;@cf/openai/gpt-oss-120b&lt;/li&gt;
&lt;li&gt;@cf/openai/gpt-oss-20b&lt;/li&gt;
&lt;li&gt;@cf/qwen/qwen3-30b-a3b-fp8&lt;/li&gt;
&lt;li&gt;DeepSeek R1 Distill Qwen 32B&lt;/li&gt;
&lt;li&gt;Deepseek Coder 6.7B Base (AWQ)&lt;/li&gt;
&lt;li&gt;Deepseek Coder 6.7B Instruct (AWQ)&lt;/li&gt;
&lt;li&gt;Deepseek Math 7B Instruct&lt;/li&gt;
&lt;li&gt;Discolm German 7B v1 (AWQ)&lt;/li&gt;
&lt;li&gt;Falcon 7B Instruct&lt;/li&gt;
&lt;li&gt;Gemma 2B Instruct (LoRA)&lt;/li&gt;
&lt;li&gt;Gemma 3 12B Instruct&lt;/li&gt;
&lt;li&gt;Gemma 7B Instruct&lt;/li&gt;
&lt;li&gt;Gemma 7B Instruct (LoRA)&lt;/li&gt;
&lt;li&gt;Hermes 2 Pro Mistral 7B&lt;/li&gt;
&lt;li&gt;Llama 2 13B Chat (AWQ)&lt;/li&gt;
&lt;li&gt;Llama 2 7B Chat (FP16)&lt;/li&gt;
&lt;li&gt;Llama 2 7B Chat (INT8)&lt;/li&gt;
&lt;li&gt;Llama 2 7B Chat (LoRA)&lt;/li&gt;
&lt;li&gt;Llama 3 8B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3 8B Instruct (AWQ)&lt;/li&gt;
&lt;li&gt;Llama 3.1 8B Instruct (AWQ)&lt;/li&gt;
&lt;li&gt;Llama 3.1 8B Instruct (FP8)&lt;/li&gt;
&lt;li&gt;Llama 3.2 11B Vision Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.2 1B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.2 3B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.3 70B Instruct (FP8)&lt;/li&gt;
&lt;li&gt;Llama 4 Scout Instruct&lt;/li&gt;
&lt;li&gt;Llama Guard 3 8B&lt;/li&gt;
&lt;li&gt;LlamaGuard 7B (AWQ)&lt;/li&gt;
&lt;li&gt;Mistral 7B Instruct v0.1&lt;/li&gt;
&lt;li&gt;Mistral 7B Instruct v0.1 (AWQ)&lt;/li&gt;
&lt;li&gt;Mistral 7B Instruct v0.2&lt;/li&gt;
&lt;li&gt;Mistral 7B Instruct v0.2 (LoRA)&lt;/li&gt;
&lt;li&gt;Mistral Small 3.1 24B Instruct&lt;/li&gt;
&lt;li&gt;Neural Chat 7B v3.1 (AWQ)&lt;/li&gt;
&lt;li&gt;OpenChat 3.5 0106&lt;/li&gt;
&lt;li&gt;OpenHermes 2.5 Mistral 7B (AWQ)&lt;/li&gt;
&lt;li&gt;Phi-2&lt;/li&gt;
&lt;li&gt;Qwen 1.5 0.5B Chat&lt;/li&gt;
&lt;li&gt;Qwen 1.5 1.8B Chat&lt;/li&gt;
&lt;li&gt;Qwen 1.5 14B Chat (AWQ)&lt;/li&gt;
&lt;li&gt;Qwen 1.5 7B Chat (AWQ)&lt;/li&gt;
&lt;li&gt;Qwen 2.5 Coder 32B Instruct&lt;/li&gt;
&lt;li&gt;Qwen QwQ 32B&lt;/li&gt;
&lt;li&gt;SQLCoder 7B 2&lt;/li&gt;
&lt;li&gt;Starling LM 7B Beta&lt;/li&gt;
&lt;li&gt;TinyLlama 1.1B Chat v1.0&lt;/li&gt;
&lt;li&gt;Una Cybertron 7B v2 (BF16)&lt;/li&gt;
&lt;li&gt;Zephyr 7B Beta (AWQ)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="google-cloud-vertex-ai"&gt;&lt;a class="link" href="https://console.cloud.google.com/vertex-ai/model-garden" target="_blank" rel="noopener"
&gt;Google Cloud Vertex AI&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;Very stringent payment verification for Google Cloud.&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Model Name&lt;/th&gt;&lt;th&gt;Model Limits&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="https://console.cloud.google.com/vertex-ai/publishers/meta/model-garden/llama-3-2-90b-vision-instruct-maas" target="_blank"&gt;Llama 3.2 90B Vision Instruct&lt;/a&gt;&lt;/td&gt;&lt;td&gt;30 requests/minute&lt;br&gt;Free during preview&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="https://console.cloud.google.com/vertex-ai/publishers/meta/model-garden/llama-3-1-405b-instruct-maas" target="_blank"&gt;Llama 3.1 70B Instruct&lt;/a&gt;&lt;/td&gt;&lt;td&gt;60 requests/minute&lt;br&gt;Free during preview&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;a href="https://console.cloud.google.com/vertex-ai/publishers/meta/model-garden/llama-3-1-405b-instruct-maas" target="_blank"&gt;Llama 3.1 8B Instruct&lt;/a&gt;&lt;/td&gt;&lt;td&gt;60 requests/minute&lt;br&gt;Free during preview&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h2 id="providers-with-trial-credits"&gt;Providers with trial credits
&lt;/h2&gt;&lt;h3 id="fireworks"&gt;&lt;a class="link" href="https://fireworks.ai/" target="_blank" rel="noopener"
&gt;Fireworks&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $1&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; &lt;a class="link" href="https://fireworks.ai/models" target="_blank" rel="noopener"
&gt;Various open models&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="baseten"&gt;&lt;a class="link" href="https://app.baseten.co/" target="_blank" rel="noopener"
&gt;Baseten&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $30&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; &lt;a class="link" href="https://www.baseten.co/library/" target="_blank" rel="noopener"
&gt;Any supported model - pay by compute time&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="nebius"&gt;&lt;a class="link" href="https://studio.nebius.com/" target="_blank" rel="noopener"
&gt;Nebius&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $1&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; &lt;a class="link" href="https://studio.nebius.ai/models" target="_blank" rel="noopener"
&gt;Various open models&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="novita"&gt;&lt;a class="link" href="https://novita.ai/?ref=ytblmjc&amp;amp;utm_source=affiliate" target="_blank" rel="noopener"
&gt;Novita&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $0.50, valid for 1 year&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; &lt;a class="link" href="https://novita.ai/models" target="_blank" rel="noopener"
&gt;Various open models&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="ai21"&gt;&lt;a class="link" href="https://studio.ai21.com/" target="_blank" rel="noopener"
&gt;AI21&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $10 for 3 months&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; Jamba family of models&lt;/p&gt;
&lt;h3 id="upstage"&gt;&lt;a class="link" href="https://console.upstage.ai/" target="_blank" rel="noopener"
&gt;Upstage&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $10 for 3 months&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; Solar Pro/Mini&lt;/p&gt;
&lt;h3 id="nlp-cloud"&gt;&lt;a class="link" href="https://nlpcloud.com/home" target="_blank" rel="noopener"
&gt;NLP Cloud&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $15&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Requirements:&lt;/strong&gt; Phone number verification&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; Various open models&lt;/p&gt;
&lt;h3 id="alibaba-cloud-international-model-studio"&gt;&lt;a class="link" href="https://bailian.console.alibabacloud.com/" target="_blank" rel="noopener"
&gt;Alibaba Cloud (International) Model Studio&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; 1 million tokens/model&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; &lt;a class="link" href="https://www.alibabacloud.com/en/product/modelstudio" target="_blank" rel="noopener"
&gt;Various open and proprietary Qwen models&lt;/a&gt;&lt;/p&gt;
&lt;h3 id="modal"&gt;&lt;a class="link" href="https://modal.com" target="_blank" rel="noopener"
&gt;Modal&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $5/month upon sign-up; $30/month once a payment method is added&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; Any supported model - pay by compute time&lt;/p&gt;
&lt;h3 id="inferencenet"&gt;&lt;a class="link" href="https://inference.net" target="_blank" rel="noopener"
&gt;Inference.net&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $1, plus $25 for responding to an email survey&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt; Various open models&lt;/p&gt;
&lt;h3 id="hyperbolic"&gt;&lt;a class="link" href="https://app.hyperbolic.xyz/" target="_blank" rel="noopener"
&gt;Hyperbolic&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $1&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;DeepSeek V3&lt;/li&gt;
&lt;li&gt;DeepSeek V3 0324&lt;/li&gt;
&lt;li&gt;Llama 3 70B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.1 405B Base&lt;/li&gt;
&lt;li&gt;Llama 3.1 405B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.1 70B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.1 8B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.2 3B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.3 70B Instruct&lt;/li&gt;
&lt;li&gt;Pixtral 12B (2409)&lt;/li&gt;
&lt;li&gt;Qwen QwQ 32B&lt;/li&gt;
&lt;li&gt;Qwen2.5 72B Instruct&lt;/li&gt;
&lt;li&gt;Qwen2.5 Coder 32B Instruct&lt;/li&gt;
&lt;li&gt;Qwen2.5 VL 72B Instruct&lt;/li&gt;
&lt;li&gt;Qwen2.5 VL 7B Instruct&lt;/li&gt;
&lt;li&gt;deepseek-ai/deepseek-r1-0528&lt;/li&gt;
&lt;li&gt;openai/gpt-oss-120b&lt;/li&gt;
&lt;li&gt;openai/gpt-oss-120b-turbo&lt;/li&gt;
&lt;li&gt;openai/gpt-oss-20b&lt;/li&gt;
&lt;li&gt;qwen/qwen3-235b-a22b&lt;/li&gt;
&lt;li&gt;qwen/qwen3-235b-a22b-instruct-2507&lt;/li&gt;
&lt;li&gt;qwen/qwen3-coder-480b-a35b-instruct&lt;/li&gt;
&lt;li&gt;qwen/qwen3-next-80b-a3b-instruct&lt;/li&gt;
&lt;li&gt;qwen/qwen3-next-80b-a3b-thinking&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="sambanova-cloud"&gt;&lt;a class="link" href="https://cloud.sambanova.ai/" target="_blank" rel="noopener"
&gt;SambaNova Cloud&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; $5 for 3 months&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;E5-Mistral-7B-Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.1 8B&lt;/li&gt;
&lt;li&gt;Llama 3.3 70B&lt;/li&gt;
&lt;li&gt;Llama-4-Maverick-17B-128E-Instruct&lt;/li&gt;
&lt;li&gt;Qwen/Qwen3-235B&lt;/li&gt;
&lt;li&gt;Qwen/Qwen3-32B&lt;/li&gt;
&lt;li&gt;Whisper-Large-v3&lt;/li&gt;
&lt;li&gt;deepseek-ai/DeepSeek-R1-0528&lt;/li&gt;
&lt;li&gt;deepseek-ai/DeepSeek-R1-Distill-Llama-70B&lt;/li&gt;
&lt;li&gt;deepseek-ai/DeepSeek-V3-0324&lt;/li&gt;
&lt;li&gt;deepseek-ai/DeepSeek-V3.1&lt;/li&gt;
&lt;li&gt;deepseek-ai/DeepSeek-V3.1-Terminus&lt;/li&gt;
&lt;li&gt;openai/gpt-oss-120b&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="scaleway-generative-apis"&gt;&lt;a class="link" href="https://console.scaleway.com/generative-api/models" target="_blank" rel="noopener"
&gt;Scaleway Generative APIs&lt;/a&gt;
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Credits:&lt;/strong&gt; 1,000,000 free tokens&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Models:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;BGE-Multilingual-Gemma2&lt;/li&gt;
&lt;li&gt;DeepSeek R1 Distill Llama 70B&lt;/li&gt;
&lt;li&gt;Gemma 3 27B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.1 8B Instruct&lt;/li&gt;
&lt;li&gt;Llama 3.3 70B Instruct&lt;/li&gt;
&lt;li&gt;Mistral Nemo 2407&lt;/li&gt;
&lt;li&gt;Pixtral 12B (2409)&lt;/li&gt;
&lt;li&gt;Whisper Large v3&lt;/li&gt;
&lt;li&gt;gpt-oss-120b&lt;/li&gt;
&lt;li&gt;holo2-30b-a3b&lt;/li&gt;
&lt;li&gt;mistral-small-3.2-24b-instruct-2506&lt;/li&gt;
&lt;li&gt;qwen3-235b-a22b-instruct-2507&lt;/li&gt;
&lt;li&gt;qwen3-coder-30b-a3b-instruct&lt;/li&gt;
&lt;li&gt;qwen3-embedding-8b&lt;/li&gt;
&lt;li&gt;voxtral-small-24b-2507&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Fundamentals-of-Large-Language-Models</title><link>https://hanguangwu.github.io/blog/en/p/fundamentals-of-large-language-models/</link><pubDate>Mon, 29 Dec 2025 12:34:25 -0800</pubDate><guid>https://hanguangwu.github.io/blog/en/p/fundamentals-of-large-language-models/</guid><description>&lt;h1 id="fundamentals-of-large-language-models"&gt;Fundamentals of Large Language Models
&lt;/h1&gt;&lt;p&gt;&lt;a class="link" href="https://datawhalechina.github.io/hello-agents/#/en/README_EN.md" target="_blank" rel="noopener"
&gt;Source Web Page&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The first two chapters introduced the definition and development history of agents. This chapter focuses entirely on large language models themselves to answer a key question: how do modern agents work? We start from the basic definition of a language model and, by working through these principles, lay a solid foundation for understanding how LLMs acquire their powerful knowledge and reasoning capabilities.&lt;/p&gt;
&lt;h2 id="31-language-models-and-transformer-architecture"&gt;3.1 Language Models and Transformer Architecture
&lt;/h2&gt;&lt;h3 id="311-from-n-gram-to-rnn"&gt;3.1.1 From N-gram to RNN
&lt;/h3&gt;&lt;p&gt;A &lt;strong&gt;Language Model (LM)&lt;/strong&gt; is at the core of natural language processing; its fundamental task is to compute the probability that a word sequence (i.e., a sentence) appears. A good language model can tell us which sentences are fluent and natural. In multi-agent systems, language models are the foundation that lets agents understand human instructions and generate responses. This section reviews the evolution from classical statistical methods to modern deep learning models, preparing the ground for the Transformer architecture that follows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(1) Statistical Language Models and the N-gram Idea&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Before the rise of deep learning, statistical methods were the mainstream of language models. The core idea is that the probability of a sentence appearing equals the product of the conditional probabilities of each word in the sentence. For a sentence S composed of words $w_1,w_2,\cdots,w_m$, its probability P(S) can be expressed as:&lt;/p&gt;
$$P(S)=P(w_1,w_2,\cdots,w_m)=P(w_1)\cdot P(w_2\mid w_1)\cdot P(w_3\mid w_1,w_2)\cdots P(w_m\mid w_1,\cdots,w_{m-1})$$&lt;p&gt;This formula is called the chain rule of probability. However, directly calculating this formula is almost impossible because conditional probabilities like $P(w_m\mid w_1,\cdots,w_{m-1})$ are too difficult to estimate from a corpus, as the word sequence $w_1,\cdots,w_{m-1}$ may have never appeared in the training data.&lt;/p&gt;
&lt;div align="center"&gt;
&lt;img src="https://raw.githubusercontent.com/datawhalechina/Hello-Agents/main/docs/images/3-figures/1757249275674-0.png" alt="Figure description" width="90%"/&gt;
&lt;p&gt;Figure 3.1 Schematic diagram of Markov assumption&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;To solve this problem, researchers introduced the &lt;strong&gt;Markov Assumption&lt;/strong&gt;. Its core idea is: we don&amp;rsquo;t need to trace back a word&amp;rsquo;s entire history; we can approximately assume that a word&amp;rsquo;s probability of appearing is only related to the limited $n-1$ words before it, as shown in Figure 3.1. Language models built on this assumption are called &lt;strong&gt;N-gram models&lt;/strong&gt;. Here, &amp;ldquo;N&amp;rdquo; represents the context window size we consider. Let&amp;rsquo;s look at some of the most common examples to understand this concept:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Bigram (when N=2)&lt;/strong&gt;: This is the simplest case, where we assume a word&amp;rsquo;s appearance is only related to the one word before it. Therefore, the complex conditional probability $P(w_i\mid w_1,\cdots,w_{i-1})$ in the chain rule can be approximated by a more easily computable form:&lt;/li&gt;
&lt;/ul&gt;
$$P(w_i\mid w_1,\cdots,w_{i-1})\approx P(w_i\mid w_{i-1})$$&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Trigram (when N=3)&lt;/strong&gt;: Similarly, we assume a word&amp;rsquo;s appearance is only related to the two words before it:&lt;/li&gt;
&lt;/ul&gt;
$$P(w_i\mid w_1,\cdots,w_{i-1})\approx P(w_i\mid w_{i-2},w_{i-1})$$&lt;p&gt;These probabilities can be calculated through &lt;strong&gt;Maximum Likelihood Estimation (MLE)&lt;/strong&gt; on a large corpus. The term sounds complex, but the idea is very intuitive: what is most likely to appear is what we see most often in the data. For example, for a Bigram model we want the probability $P(w_i\mid w_{i-1})$ that the next word is $w_i$ given that $w_{i-1}$ has just appeared. According to maximum likelihood estimation, this probability can be estimated by simple counting:&lt;/p&gt;
$$P(w_i\mid w_{i-1})=\frac{\text{Count}(w_{i-1},w_i)}{\text{Count}(w_{i-1})}$$&lt;p&gt;Here, the &lt;code&gt;Count()&lt;/code&gt; function represents &amp;ldquo;counting&amp;rdquo;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$\text{Count}(w_{i-1},w_i)$: the total number of times the word pair $(w_{i-1},w_i)$ appears consecutively in the corpus.&lt;/li&gt;
&lt;li&gt;$\text{Count}(w_{i-1})$: the total number of times the single word $w_{i-1}$ appears in the corpus.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The formula&amp;rsquo;s meaning is: we use the number of times the word pair appears, $\text{Count}(w_{i-1},w_i)$, divided by the total number of times the word $w_{i-1}$ appears, $\text{Count}(w_{i-1})$, as an approximate estimate of $P(w_i\mid w_{i-1})$.&lt;/p&gt;
&lt;p&gt;To make this process more concrete, let&amp;rsquo;s manually perform a calculation. Suppose we have a mini corpus containing only the following two sentences: &lt;code&gt;datawhale agent learns&lt;/code&gt;, &lt;code&gt;datawhale agent works&lt;/code&gt;. Our goal is: using a Bigram (N=2) model, estimate the probability of the sentence &lt;code&gt;datawhale agent learns&lt;/code&gt; appearing. According to the Bigram assumption, we examine consecutive pairs of words (i.e., word pairs) each time.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1: Calculate the probability of the first word&lt;/strong&gt;, $P(\text{datawhale})$. This is the number of times &lt;code&gt;datawhale&lt;/code&gt; appears divided by the total number of words: &lt;code&gt;datawhale&lt;/code&gt; appears 2 times, and the total number of words is 6.&lt;/p&gt;
$$P(\text{datawhale}) = \frac{\text{Number of "datawhale" in total corpus}}{\text{Total number of words in corpus}} = \frac{2}{6} \approx 0.333$$&lt;p&gt;&lt;strong&gt;Step 2: Calculate the conditional probability&lt;/strong&gt; $P(\text{agent}\mid\text{datawhale})$. This is the number of times the word pair &lt;code&gt;datawhale agent&lt;/code&gt; appears divided by the total number of times &lt;code&gt;datawhale&lt;/code&gt; appears: &lt;code&gt;datawhale agent&lt;/code&gt; appears 2 times, &lt;code&gt;datawhale&lt;/code&gt; appears 2 times.&lt;/p&gt;
$$P(\text{agent}|\text{datawhale}) = \frac{\text{Count}(\text{datawhale agent})}{\text{Count}(\text{datawhale})} = \frac{2}{2} = 1$$&lt;p&gt;&lt;strong&gt;Step 3: Calculate the conditional probability&lt;/strong&gt; $P(\text{learns}\mid\text{agent})$. This is the number of times the word pair &lt;code&gt;agent learns&lt;/code&gt; appears divided by the total number of times &lt;code&gt;agent&lt;/code&gt; appears: &lt;code&gt;agent learns&lt;/code&gt; appears 1 time, &lt;code&gt;agent&lt;/code&gt; appears 2 times.&lt;/p&gt;
$$P(\text{learns}|\text{agent}) = \frac{\text{Count(agent learns)}}{\text{Count(agent)}} = \frac{1}{2} = 0.5$$&lt;p&gt;&lt;strong&gt;Finally: Multiply the probabilities&lt;/strong&gt; So, the approximate probability of the entire sentence is:&lt;/p&gt;
$$P(\text{datawhale agent learns}) \approx P(\text{datawhale}) \cdot P(\text{agent}|\text{datawhale}) \cdot P(\text{learns}|\text{agent}) \approx 0.333 \cdot 1 \cdot 0.5 \approx 0.167$$&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;span class="lnt"&gt;22
&lt;/span&gt;&lt;span class="lnt"&gt;23
&lt;/span&gt;&lt;span class="lnt"&gt;24
&lt;/span&gt;&lt;span class="lnt"&gt;25
&lt;/span&gt;&lt;span class="lnt"&gt;26
&lt;/span&gt;&lt;span class="lnt"&gt;27
&lt;/span&gt;&lt;span class="lnt"&gt;28
&lt;/span&gt;&lt;span class="lnt"&gt;29
&lt;/span&gt;&lt;span class="lnt"&gt;30
&lt;/span&gt;&lt;span class="lnt"&gt;31
&lt;/span&gt;&lt;span class="lnt"&gt;32
&lt;/span&gt;&lt;span class="lnt"&gt;33
&lt;/span&gt;&lt;span class="lnt"&gt;34
&lt;/span&gt;&lt;span class="lnt"&gt;35
&lt;/span&gt;&lt;span class="lnt"&gt;36
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Example corpus, consistent with the corpus in the case explanation above&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;corpus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;datawhale agent learns datawhale agent works&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;corpus&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;total_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Step 1: Calculate P(datawhale) ---&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;count_datawhale&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;datawhale&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;p_datawhale&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;count_datawhale&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;total_tokens&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Step 1: P(datawhale) = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;count_datawhale&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_tokens&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p_datawhale&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Step 2: Calculate P(agent|datawhale) ---&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# First calculate bigrams for subsequent steps&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;bigrams&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;bigram_counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bigrams&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;count_datawhale_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bigram_counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;datawhale&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;agent&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# count_datawhale was already calculated in step 1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;p_agent_given_datawhale&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;count_datawhale_agent&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;count_datawhale&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Step 2: P(agent|datawhale) = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;count_datawhale_agent&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;count_datawhale&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p_agent_given_datawhale&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Step 3: Calculate P(learns|agent) ---&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;count_agent_learns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bigram_counts&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;agent&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;learns&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;count_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;agent&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;p_learns_given_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;count_agent_learns&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;count_agent&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Step 3: P(learns|agent) = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;count_agent_learns&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;count_agent&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p_learns_given_agent&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Finally: Multiply the probabilities ---&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;p_sentence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_datawhale&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p_agent_given_datawhale&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p_learns_given_agent&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Finally: P(&amp;#39;datawhale agent learns&amp;#39;) ≈ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p_datawhale&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; * &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p_agent_given_datawhale&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; * &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p_learns_given_agent&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; = &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;p_sentence&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.3f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Step&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;P&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datawhale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.333&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Step&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;P&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;datawhale&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Step&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;P&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;learns&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.500&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Finally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;P&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;datawhale agent learns&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="err"&gt;≈&lt;/span&gt; &lt;span class="mf"&gt;0.333&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;0.500&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.167&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;N-gram models, although simple and effective, have two fatal flaws:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Data Sparsity&lt;/strong&gt;: If a word sequence has never appeared in the corpus, its probability estimate is 0, which is obviously unreasonable. Although this can be alleviated through smoothing techniques, it cannot be eradicated.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Poor Generalization Ability&lt;/strong&gt;: The model cannot capture semantic similarity between words. For example, even if the model has seen &lt;code&gt;agent learns&lt;/code&gt; many times in the corpus, it cannot transfer this knowledge to semantically similar words: when we calculate the probability of &lt;code&gt;robot learns&lt;/code&gt;, if the word &lt;code&gt;robot&lt;/code&gt; or the combination &lt;code&gt;robot learns&lt;/code&gt; has never appeared, the estimated probability is zero, because the model has no notion that &lt;code&gt;agent&lt;/code&gt; and &lt;code&gt;robot&lt;/code&gt; are semantically close.&lt;/li&gt;
&lt;/ol&gt;
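&lt;p&gt;The data-sparsity flaw is easy to reproduce on the same mini corpus. The following is a minimal sketch (the &lt;code&gt;bigram_prob&lt;/code&gt; helper and the smoothing constant &lt;code&gt;k&lt;/code&gt; are illustrative choices, not from the text above): with plain MLE an unseen word pair gets probability 0, while add-one (Laplace) smoothing reserves a little probability mass for unseen pairs.&lt;/p&gt;

```python
import collections

# Same mini corpus as in the worked example.
tokens = "datawhale agent learns datawhale agent works".split()
vocab = set(tokens)
unigram_counts = collections.Counter(tokens)
bigram_counts = collections.Counter(zip(tokens, tokens[1:]))

def bigram_prob(prev, word, k=0):
    """MLE bigram probability; k=1 gives add-one (Laplace) smoothing."""
    return (bigram_counts[(prev, word)] + k) / (unigram_counts[prev] + k * len(vocab))

# Unsmoothed MLE: the unseen pair ('agent', 'datawhale') gets probability 0.
print(bigram_prob("agent", "datawhale"))       # 0.0
# Add-one smoothing: (0 + 1) / (2 + 1 * 4) = 1/6, no longer zero.
print(bigram_prob("agent", "datawhale", k=1))
```

&lt;p&gt;Smoothing trades a little accuracy on seen pairs for non-zero estimates on unseen ones, which is exactly the alleviation (not cure) mentioned above.&lt;/p&gt;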
&lt;p&gt;&lt;strong&gt;(2) Neural Network Language Models and Word Embeddings&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The fundamental flaw of N-gram models is that they treat words as isolated, discrete symbols. To overcome this problem, researchers turned to neural networks with a new idea: represent words as continuous vectors. The &lt;strong&gt;Feedforward Neural Network Language Model&lt;/strong&gt; proposed by Bengio et al. in 2003 was a milestone in this field&lt;sup&gt;[1]&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Its core idea can be divided into two steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Build a semantic space&lt;/strong&gt;: Create a high-dimensional continuous vector space, then map each word in the vocabulary to a point in that space. This point (i.e., vector) is called a &lt;strong&gt;Word Embedding&lt;/strong&gt; or word vector. In this space, semantically similar words have vectors that are close together in position. For example, the vectors of &lt;code&gt;agent&lt;/code&gt; and &lt;code&gt;robot&lt;/code&gt; will be very close, while the vectors of &lt;code&gt;agent&lt;/code&gt; and &lt;code&gt;apple&lt;/code&gt; will be far apart.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Learn the mapping from context to the next word&lt;/strong&gt;: Utilize the powerful fitting ability of neural networks to learn a function. The input of this function is the word vectors of the previous $n−1$ words, and the output is the probability distribution of each word in the vocabulary appearing after the current context.&lt;/li&gt;
&lt;/ol&gt;
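&lt;p&gt;The two steps can be sketched as a toy forward pass in the style of Bengio et al.&amp;rsquo;s model (the sizes, the random initialization, and the &lt;code&gt;predict_next&lt;/code&gt; helper are illustrative assumptions; the hidden layer and gradient-descent training are omitted): each context word is looked up in the embedding matrix, the vectors are concatenated, and a softmax layer maps them to a probability distribution over the vocabulary.&lt;/p&gt;

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["datawhale", "agent", "learns", "works"]
word2id = {w: i for i, w in enumerate(vocab)}
V, d, n_context = len(vocab), 8, 2            # vocab size, embedding dim, context length

# Trainable parameters (randomly initialized here; training would adjust them).
C = rng.normal(size=(V, d))                   # embedding matrix: one row per word
W = rng.normal(size=(n_context * d, V))       # output projection
b = np.zeros(V)

def predict_next(context_words):
    """Map n-1 context words to a probability distribution over the vocabulary."""
    # Step 1: look up each context word's embedding and concatenate them.
    x = np.concatenate([C[word2id[w]] for w in context_words])
    # Step 2: project to vocabulary scores and apply a numerically stable softmax.
    logits = x @ W + b
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

probs = predict_next(["datawhale", "agent"])
print(dict(zip(vocab, probs.round(3))))       # one probability per word, summing to 1
```

&lt;p&gt;Training this model on a corpus would push the rows of the embedding matrix &lt;code&gt;C&lt;/code&gt; so that words appearing in similar contexts end up with similar vectors, which is precisely where word embeddings come from.&lt;/p&gt;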
&lt;div align="center"&gt;
&lt;img src="https://raw.githubusercontent.com/datawhalechina/Hello-Agents/main/docs/images/3-figures/1757249275674-1.png" alt="Figure description" width="90%"/&gt;
&lt;p&gt;Figure 3.2 Schematic diagram of neural network language model architecture&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;As shown in Figure 3.2, in this architecture, word embeddings are automatically learned during model training. To complete the task of &amp;ldquo;predicting the next word,&amp;rdquo; the model continuously adjusts the vector position of each word, ultimately making these vectors contain rich semantic information. Once we convert words into vectors, we can use mathematical tools to measure the relationships between them. The most commonly used method is &lt;strong&gt;Cosine Similarity&lt;/strong&gt;, which measures their similarity by calculating the cosine of the angle between two vectors.&lt;/p&gt;
$$\text{similarity}(\vec{a}, \vec{b}) = \cos(\theta) = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}| |\vec{b}|}$$&lt;p&gt;The meaning of this formula is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If two vectors have exactly the same direction, the angle is 0°, the cosine value is 1, indicating complete correlation.&lt;/li&gt;
&lt;li&gt;If two vectors are orthogonal, the angle is 90°, the cosine value is 0, indicating no relationship.&lt;/li&gt;
&lt;li&gt;If two vectors have completely opposite directions, the angle is 180°, the cosine value is -1, indicating complete negative correlation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Through this method, word vectors can not only capture simple relationships like &amp;ldquo;synonyms&amp;rdquo; but also capture more complex analogical relationships.&lt;/p&gt;
&lt;p&gt;A famous example demonstrates the semantic relationships captured by word vectors: &lt;code&gt;vector('King') - vector('Man') + vector('Woman')&lt;/code&gt;. The result of this vector operation lands surprisingly close to &lt;code&gt;vector('Queen')&lt;/code&gt; in the vector space. This is like performing a semantic translation: we start from the point &amp;ldquo;king,&amp;rdquo; subtract the vector of &amp;ldquo;male,&amp;rdquo; add the vector of &amp;ldquo;female,&amp;rdquo; and arrive at the position of &amp;ldquo;queen.&amp;rdquo; This suggests that word embeddings can learn abstract concepts like &amp;ldquo;gender&amp;rdquo; and &amp;ldquo;royalty.&amp;rdquo;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;span class="lnt"&gt;22
&lt;/span&gt;&lt;span class="lnt"&gt;23
&lt;/span&gt;&lt;span class="lnt"&gt;24
&lt;/span&gt;&lt;span class="lnt"&gt;25
&lt;/span&gt;&lt;span class="lnt"&gt;26
&lt;/span&gt;&lt;span class="lnt"&gt;27
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Assume we have learned simplified 2D word vectors&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;king&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;queen&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;man&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;woman&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cosine_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;dot_product&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;norm_product&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linalg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vec2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dot_product&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;norm_product&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# king - man + woman&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;result_vec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;king&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;man&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;woman&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Calculate similarity between result vector and &amp;#34;queen&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;sim&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cosine_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result_vec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;queen&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Result vector of king - man + woman: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result_vec&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Similarity of this result with &amp;#39;queen&amp;#39;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sim&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;.4f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Result&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;king&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;man&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;woman&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Similarity&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;this&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;queen&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;1.0000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Neural network language models successfully solved the poor generalization problem of N-gram models through word embeddings. However, they share a limitation with N-gram models: the context window is fixed. They can only consider a fixed number of preceding words. This limitation laid the groundwork for recurrent neural networks, which can handle sequences of arbitrary length.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(3) Recurrent Neural Networks (RNN) and Long Short-Term Memory Networks (LSTM)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Although the neural network language model in the previous section introduced word embeddings to solve the generalization problem, its context window, like that of N-gram models, is of fixed size. To predict the next word, it can only see the previous n−1 words; earlier history is discarded. This clearly does not match how humans process language. To break the limitation of fixed windows, &lt;strong&gt;Recurrent Neural Networks (RNN)&lt;/strong&gt; emerged, with a very intuitive core idea: give the network a &amp;ldquo;memory&amp;rdquo;&lt;sup&gt;[2]&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;As shown in Figure 3.3, RNN&amp;rsquo;s design introduces a &lt;strong&gt;hidden state&lt;/strong&gt; vector, which we can understand as the network&amp;rsquo;s short-term memory. At each step of processing the sequence, the network reads the current input word and combines it with its memory from the previous moment (i.e., the hidden state from the previous time step), then generates a new memory (i.e., the hidden state of the current time step) to pass to the next moment. This cyclical process allows information to propagate forward through the sequence, step by step.&lt;/p&gt;
&lt;div align="center"&gt;
&lt;img src="https://raw.githubusercontent.com/datawhalechina/Hello-Agents/main/docs/images/3-figures/1757249275674-2.png" alt="Figure description" width="90%"/&gt;
&lt;p&gt;Figure 3.3 Schematic diagram of RNN structure&lt;/p&gt;
&lt;/div&gt;
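The recurrence described above can be sketched in a few lines of NumPy. The sizes, random weights, and tanh nonlinearity below are illustrative choices, not tied to any particular model; the point is that the same parameters are reused at every time step, so sequences of any length can be processed.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 4, 3  # illustrative input and hidden sizes

# Parameters are shared across all time steps
W_xh = rng.normal(scale=0.1, size=(d_h, d_in))  # input -> hidden
W_hh = rng.normal(scale=0.1, size=(d_h, d_h))   # hidden -> hidden (the "memory" path)
b_h = np.zeros(d_h)

def rnn_step(x_t, h_prev):
    """One recurrence step: combine the current input with the previous memory."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Process a sequence of arbitrary length with the same parameters
sequence = rng.normal(size=(5, d_in))  # 5 time steps
h = np.zeros(d_h)                      # initial memory
for x_t in sequence:
    h = rnn_step(x_t, h)

print(h.shape)  # (3,) -- the final hidden state summarizes the whole sequence
```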
&lt;p&gt;However, standard RNNs have a serious problem in practice: the &lt;strong&gt;Long-term Dependency Problem&lt;/strong&gt;. During training, backpropagation must adjust weights deep in the network based on errors at the output end, and for an RNN the sequence length effectively becomes the network depth. When the sequence is very long, gradients are multiplied repeatedly as they flow backward, so their values either rapidly approach zero (&lt;strong&gt;gradient vanishing&lt;/strong&gt;) or grow extremely large (&lt;strong&gt;gradient explosion&lt;/strong&gt;). Gradient vanishing prevents the model from learning how early parts of the sequence affect later outputs, making long-distance dependencies difficult to capture.&lt;/p&gt;
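A toy calculation makes the issue concrete: backpropagating through a long sequence multiplies many per-step factors together, so a factor even slightly below 1 collapses toward zero while one slightly above 1 blows up. The factors 0.9 and 1.1 are purely illustrative stand-ins for the per-step gradient magnitude.

```python
# Repeated multiplication over 100 "time steps"
shrink, grow = 0.9, 1.1
g_vanish = g_explode = 1.0
for _ in range(100):
    g_vanish *= shrink   # per-step gradient factor slightly below 1
    g_explode *= grow    # per-step gradient factor slightly above 1

print(f"0.9^100 = {g_vanish:.2e}")   # 2.66e-05: the gradient vanishes
print(f"1.1^100 = {g_explode:.2e}")  # 1.38e+04: the gradient explodes
```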
&lt;p&gt;To solve the long-term dependency problem, &lt;strong&gt;Long Short-Term Memory (LSTM)&lt;/strong&gt; was designed&lt;sup&gt;[3]&lt;/sup&gt;. LSTM is a special type of RNN, and its core innovation lies in introducing &lt;strong&gt;Cell State&lt;/strong&gt; and a sophisticated &lt;strong&gt;Gating Mechanism&lt;/strong&gt;. The cell state can be seen as an information pathway independent of the hidden state, allowing information to pass more smoothly between time steps. The gating mechanism consists of several small neural networks that can learn how to selectively let information through, thereby controlling the addition and removal of information in the cell state. These gates include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Forget Gate&lt;/strong&gt;: Decides which information to discard from the cell state of the previous moment.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Input Gate&lt;/strong&gt;: Decides which new information from the current input to store in the cell state.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Output Gate&lt;/strong&gt;: Decides which information to output to the hidden state based on the current cell state.&lt;/li&gt;
&lt;/ul&gt;
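A single LSTM step with these three gates can be sketched directly from the description above. All weights here are random placeholders for illustration; a real implementation would also include bias terms and learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 4, 3  # illustrative input and hidden sizes

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate plus the candidate update; each sees [h_prev, x_t]
W_f, W_i, W_o, W_c = (rng.normal(scale=0.1, size=(d_h, d_h + d_in)) for _ in range(4))

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W_f @ z)          # forget gate: what to discard from c_prev
    i = sigmoid(W_i @ z)          # input gate: what new information to store
    o = sigmoid(W_o @ z)          # output gate: what to expose as the hidden state
    c_tilde = np.tanh(W_c @ z)    # candidate cell update
    c = f * c_prev + i * c_tilde  # cell state: the smoother information pathway
    h = o * np.tanh(c)
    return h, c

h = c = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):
    h, c = lstm_step(x_t, h, c)
print(h.shape, c.shape)  # (3,) (3,)
```

Note how the cell state `c` is updated additively (gated sum) rather than through repeated matrix multiplication, which is what lets gradients flow over long distances.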
&lt;h3 id="312-transformer-architecture-analysis"&gt;3.1.2 Transformer Architecture Analysis
&lt;/h3&gt;&lt;p&gt;In the previous section, we saw that RNNs and LSTMs process sequential data by introducing recurrent structures, which partly solved the problem of capturing long-distance dependencies. However, recurrence brings a new bottleneck: data must be processed sequentially. The computation at time step t cannot begin until time step t−1 has finished, so RNNs cannot exploit large-scale parallel computation and are inefficient on long sequences, which severely limits model scale and training speed. The Transformer, proposed by a Google team in 2017&lt;sup&gt;[4]&lt;/sup&gt;, abandoned the recurrent structure entirely and instead relies solely on a mechanism called &lt;strong&gt;Attention&lt;/strong&gt; to capture dependencies within sequences, thereby achieving truly parallel computation.&lt;/p&gt;
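The parallelism comes from the fact that attention is just a few matrix multiplications over the whole sequence at once, with no step-by-step loop. Here is a minimal sketch of scaled dot-product attention in NumPy (the full multi-head version is implemented in the subsections that follow); the sizes and random inputs are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Every position attends to every position in one matrix product -- no sequential loop."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq, seq) pairwise relevance scores
    weights = softmax(scores, axis=-1)  # each row is a distribution over positions
    return weights @ V                  # weighted mixture of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 6, 8
X = rng.normal(size=(seq_len, d_model))
out = attention(X, X, X)  # self-attention: queries, keys, values all come from X
print(out.shape)          # (6, 8)
```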
&lt;p&gt;&lt;strong&gt;(1) Overall Encoder-Decoder Structure&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The original Transformer model was designed for the end-to-end task of machine translation. As shown in Figure 3.4, it follows a classic &lt;strong&gt;Encoder-Decoder&lt;/strong&gt; architecture at the macro level.&lt;/p&gt;
&lt;div align="center"&gt;
&lt;img src="https://raw.githubusercontent.com/datawhalechina/Hello-Agents/main/docs/images/3-figures/1757249275674-3.png" alt="Figure description" width="50%"/&gt;
&lt;p&gt;Figure 3.4 Overall Transformer architecture diagram&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;We can understand this structure as a team with clear division of labor:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Encoder&lt;/strong&gt;: The task is to &amp;ldquo;&lt;strong&gt;understand&lt;/strong&gt;&amp;rdquo; the entire input sentence. It reads all input tokens (this concept will be introduced in Section 3.2.2) and ultimately generates a vector representation rich in contextual information for each token.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Decoder&lt;/strong&gt;: The task is to &amp;ldquo;&lt;strong&gt;generate&lt;/strong&gt;&amp;rdquo; the target sentence. It references the preceding text it has already generated and &amp;ldquo;consults&amp;rdquo; the encoder&amp;rsquo;s understanding results to generate the next word.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;To truly understand how Transformer works, the best method is to implement it yourself. In this section, we will adopt a &amp;ldquo;top-down&amp;rdquo; approach: first, we build the complete code framework of Transformer, defining all necessary classes and methods. Then, like completing a puzzle, we will implement the specific functions of these classes one by one.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;span class="lnt"&gt;22
&lt;/span&gt;&lt;span class="lnt"&gt;23
&lt;/span&gt;&lt;span class="lnt"&gt;24
&lt;/span&gt;&lt;span class="lnt"&gt;25
&lt;/span&gt;&lt;span class="lnt"&gt;26
&lt;/span&gt;&lt;span class="lnt"&gt;27
&lt;/span&gt;&lt;span class="lnt"&gt;28
&lt;/span&gt;&lt;span class="lnt"&gt;29
&lt;/span&gt;&lt;span class="lnt"&gt;30
&lt;/span&gt;&lt;span class="lnt"&gt;31
&lt;/span&gt;&lt;span class="lnt"&gt;32
&lt;/span&gt;&lt;span class="lnt"&gt;33
&lt;/span&gt;&lt;span class="lnt"&gt;34
&lt;/span&gt;&lt;span class="lnt"&gt;35
&lt;/span&gt;&lt;span class="lnt"&gt;36
&lt;/span&gt;&lt;span class="lnt"&gt;37
&lt;/span&gt;&lt;span class="lnt"&gt;38
&lt;/span&gt;&lt;span class="lnt"&gt;39
&lt;/span&gt;&lt;span class="lnt"&gt;40
&lt;/span&gt;&lt;span class="lnt"&gt;41
&lt;/span&gt;&lt;span class="lnt"&gt;42
&lt;/span&gt;&lt;span class="lnt"&gt;43
&lt;/span&gt;&lt;span class="lnt"&gt;44
&lt;/span&gt;&lt;span class="lnt"&gt;45
&lt;/span&gt;&lt;span class="lnt"&gt;46
&lt;/span&gt;&lt;span class="lnt"&gt;47
&lt;/span&gt;&lt;span class="lnt"&gt;48
&lt;/span&gt;&lt;span class="lnt"&gt;49
&lt;/span&gt;&lt;span class="lnt"&gt;50
&lt;/span&gt;&lt;span class="lnt"&gt;51
&lt;/span&gt;&lt;span class="lnt"&gt;52
&lt;/span&gt;&lt;span class="lnt"&gt;53
&lt;/span&gt;&lt;span class="lnt"&gt;54
&lt;/span&gt;&lt;span class="lnt"&gt;55
&lt;/span&gt;&lt;span class="lnt"&gt;56
&lt;/span&gt;&lt;span class="lnt"&gt;57
&lt;/span&gt;&lt;span class="lnt"&gt;58
&lt;/span&gt;&lt;span class="lnt"&gt;59
&lt;/span&gt;&lt;span class="lnt"&gt;60
&lt;/span&gt;&lt;span class="lnt"&gt;61
&lt;/span&gt;&lt;span class="lnt"&gt;62
&lt;/span&gt;&lt;span class="lnt"&gt;63
&lt;/span&gt;&lt;span class="lnt"&gt;64
&lt;/span&gt;&lt;span class="lnt"&gt;65
&lt;/span&gt;&lt;span class="lnt"&gt;66
&lt;/span&gt;&lt;span class="lnt"&gt;67
&lt;/span&gt;&lt;span class="lnt"&gt;68
&lt;/span&gt;&lt;span class="lnt"&gt;69
&lt;/span&gt;&lt;span class="lnt"&gt;70
&lt;/span&gt;&lt;span class="lnt"&gt;71
&lt;/span&gt;&lt;span class="lnt"&gt;72
&lt;/span&gt;&lt;span class="lnt"&gt;73
&lt;/span&gt;&lt;span class="lnt"&gt;74
&lt;/span&gt;&lt;span class="lnt"&gt;75
&lt;/span&gt;&lt;span class="lnt"&gt;76
&lt;/span&gt;&lt;span class="lnt"&gt;77
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch.nn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;nn&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;math&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Placeholder modules, to be implemented in subsequent subsections ---&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PositionalEncoding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;&amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; Positional encoding module
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; &amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MultiHeadAttention&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;&amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; Multi-head attention mechanism module
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; &amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PositionWiseFeedForward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;&amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; Position-wise feed-forward network module
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; &amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Encoder core layer ---&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EncoderLayer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_ff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;EncoderLayer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;self_attn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MultiHeadAttention&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# To be implemented&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feed_forward&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PositionWiseFeedForward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# To be implemented&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayerNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayerNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Residual connection and layer normalization will be explained in detail in Section 3.1.2.4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 1. Multi-head self-attention&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;attn_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;self_attn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attn_output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 2. Feed-forward network&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;ff_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feed_forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ff_output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# --- Decoder core layer ---&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DecoderLayer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_ff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DecoderLayer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;self_attn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MultiHeadAttention&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# To be implemented&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cross_attn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MultiHeadAttention&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# To be implemented&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feed_forward&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PositionWiseFeedForward&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# To be implemented&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayerNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayerNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LayerNorm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoder_output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;src_mask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tgt_mask&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 1. Masked multi-head self-attention (on itself)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;attn_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;self_attn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tgt_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attn_output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 2. Cross-attention (on encoder output)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;cross_attn_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cross_attn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoder_output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoder_output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;src_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cross_attn_output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 3. Feed-forward network&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;ff_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;feed_forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;norm3&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ff_output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;strong&gt;(2) From Self-Attention to Multi-Head Attention&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Now, let&amp;rsquo;s fill in the most critical module in the skeleton: the attention mechanism.&lt;/p&gt;
&lt;p&gt;Imagine we are reading this sentence: &amp;ldquo;The agent learns because &lt;strong&gt;it&lt;/strong&gt; is intelligent.&amp;rdquo; When we read the bolded &amp;ldquo;&lt;strong&gt;it&lt;/strong&gt;,&amp;rdquo; to understand its reference, our brain unconsciously places more attention on the word &amp;ldquo;agent&amp;rdquo; earlier in the sentence. The &lt;strong&gt;Self-Attention&lt;/strong&gt; mechanism is a mathematical modeling of this phenomenon. It allows the model to consider all other words in the sentence when processing each word and assign different &amp;ldquo;attention weights&amp;rdquo; to these words. The higher the weight of a word, the stronger its association with the current word, and the greater the proportion its information should occupy in the current word&amp;rsquo;s representation.&lt;/p&gt;
&lt;p&gt;To implement the above process, the self-attention mechanism introduces three learnable roles for each input token vector:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Query (Q)&lt;/strong&gt;: Represents the current token, which is actively &amp;ldquo;querying&amp;rdquo; other tokens to obtain information.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key (K)&lt;/strong&gt;: Represents the &amp;ldquo;label&amp;rdquo; or &amp;ldquo;index&amp;rdquo; of tokens in the sentence that can be queried.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Value (V)&lt;/strong&gt;: Represents the &amp;ldquo;content&amp;rdquo; or &amp;ldquo;information&amp;rdquo; carried by the token itself.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These three vectors are all obtained by multiplying the original word embedding vector by three different, learnable weight matrices ($W^Q,W^K,W^V$). The entire computation process can be divided into the following steps, which we can imagine as an efficient open-book exam:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Prepare &amp;ldquo;exam questions&amp;rdquo; and &amp;ldquo;materials&amp;rdquo;: For each word in the sentence, generate its $Q,K,V$ vectors through weight matrices.&lt;/li&gt;
&lt;li&gt;Calculate relevance scores: To calculate the new representation of word $A$, use word $A$&amp;rsquo;s $Q$ vector to perform dot product operations with the $K$ vectors of all words in the sentence (including $A$ itself). This score reflects the importance of other words for understanding word $A$.&lt;/li&gt;
&lt;li&gt;Stabilization and normalization: Divide all obtained scores by a scaling factor $\sqrt{d_{k}}$ ($d_{k}$ is the dimension of the $K$ vector). This keeps the dot products from growing too large, which would otherwise push Softmax into saturated regions where gradients vanish. Then use the Softmax function to convert the scores into weights that sum to 1; this is the normalization step.&lt;/li&gt;
&lt;li&gt;Weighted sum: Multiply the weights obtained in the previous step by each word&amp;rsquo;s corresponding $V$ vector, then add all results together. The final vector is the new representation of word $A$ after integrating global contextual information.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This process can be summarized by a concise formula:&lt;/p&gt;
$$\text{Attention}(Q,K,V)=\text{softmax}\left(\frac{QK^{T}}{\sqrt{d_{k}}}\right)V$$&lt;p&gt;If only one attention calculation is performed (i.e., single-head), the model may learn to capture only one type of association. For example, when processing &amp;ldquo;it,&amp;rdquo; it might learn to attend only to the subject. But relationships in language are complex, and we want the model to attend to several kinds of relationships at once (referential relationships, tense relationships, subordinate relationships, and so on). This is where the &lt;strong&gt;multi-head attention&lt;/strong&gt; mechanism comes in. Its idea is simple: instead of computing attention once over the full dimension, split the computation into several groups, run them independently, then merge the results.&lt;/p&gt;
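&lt;p&gt;The four steps above can be sketched numerically. The following is a minimal, framework-free NumPy illustration of the attention formula on a toy three-token sequence; the random matrices stand in for the learned $W^Q,W^K,W^V$, and the causal mask at the end mirrors the decoder&amp;rsquo;s masked self-attention (all sizes and values here are illustrative assumptions, not from the text).&lt;/p&gt;

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, mask=None):
    # softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    if mask is not None:
        # Positions where mask == 0 get a large negative score,
        # so their weight is ~0 after softmax
        scores = np.where(mask == 0, -1e9, scores)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 3, 8, 4           # toy sizes
X = rng.normal(size=(seq_len, d_model))   # "embeddings" of 3 tokens

# Learnable projections W^Q, W^K, W^V (random stand-ins here)
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

out, weights = attention(X @ W_q, X @ W_k, X @ W_v)
print(out.shape)             # (3, 4): one new representation per token
print(weights.sum(axis=-1))  # each row of weights sums to 1

# Causal (lower-triangular) mask: no token may attend to later tokens
causal = np.tril(np.ones((seq_len, seq_len)))
_, w_masked = attention(X @ W_q, X @ W_k, X @ W_v, mask=causal)
print(np.triu(w_masked, k=1).max())  # ~0: future positions masked out
```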
&lt;p&gt;It splits the original Q, K, V vectors into h parts along the feature dimension (h is the number of &amp;ldquo;heads&amp;rdquo;), and each part independently performs a single-head attention calculation. This is like having h different &amp;ldquo;experts&amp;rdquo; examine the sentence from different perspectives, each capturing a different kind of relationship. Finally, the &amp;ldquo;opinions&amp;rdquo; (i.e., output vectors) of these h experts are concatenated and integrated through a final linear transformation to produce the output.&lt;/p&gt;
&lt;div align="center"&gt;
&lt;img src="https://raw.githubusercontent.com/datawhalechina/Hello-Agents/main/docs/images/3-figures/1757249275674-4.png" alt="Figure description" width="50%"/&gt;
&lt;p&gt;Figure 3.5 Multi-head attention mechanism&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;As shown in Figure 3.5, this design allows the model to jointly attend to information from different positions and different representation subspaces, greatly enhancing the model&amp;rsquo;s expressive power. Below is a simple implementation of multi-head attention for reference.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;span class="lnt"&gt;22
&lt;/span&gt;&lt;span class="lnt"&gt;23
&lt;/span&gt;&lt;span class="lnt"&gt;24
&lt;/span&gt;&lt;span class="lnt"&gt;25
&lt;/span&gt;&lt;span class="lnt"&gt;26
&lt;/span&gt;&lt;span class="lnt"&gt;27
&lt;/span&gt;&lt;span class="lnt"&gt;28
&lt;/span&gt;&lt;span class="lnt"&gt;29
&lt;/span&gt;&lt;span class="lnt"&gt;30
&lt;/span&gt;&lt;span class="lnt"&gt;31
&lt;/span&gt;&lt;span class="lnt"&gt;32
&lt;/span&gt;&lt;span class="lnt"&gt;33
&lt;/span&gt;&lt;span class="lnt"&gt;34
&lt;/span&gt;&lt;span class="lnt"&gt;35
&lt;/span&gt;&lt;span class="lnt"&gt;36
&lt;/span&gt;&lt;span class="lnt"&gt;37
&lt;/span&gt;&lt;span class="lnt"&gt;38
&lt;/span&gt;&lt;span class="lnt"&gt;39
&lt;/span&gt;&lt;span class="lnt"&gt;40
&lt;/span&gt;&lt;span class="lnt"&gt;41
&lt;/span&gt;&lt;span class="lnt"&gt;42
&lt;/span&gt;&lt;span class="lnt"&gt;43
&lt;/span&gt;&lt;span class="lnt"&gt;44
&lt;/span&gt;&lt;span class="lnt"&gt;45
&lt;/span&gt;&lt;span class="lnt"&gt;46
&lt;/span&gt;&lt;span class="lnt"&gt;47
&lt;/span&gt;&lt;span class="lnt"&gt;48
&lt;/span&gt;&lt;span class="lnt"&gt;49
&lt;/span&gt;&lt;span class="lnt"&gt;50
&lt;/span&gt;&lt;span class="lnt"&gt;51
&lt;/span&gt;&lt;span class="lnt"&gt;52
&lt;/span&gt;&lt;span class="lnt"&gt;53
&lt;/span&gt;&lt;span class="lnt"&gt;54
&lt;/span&gt;&lt;span class="lnt"&gt;55
&lt;/span&gt;&lt;span class="lnt"&gt;56
&lt;/span&gt;&lt;span class="lnt"&gt;57
&lt;/span&gt;&lt;span class="lnt"&gt;58
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MultiHeadAttention&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;&amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; Multi-head attention mechanism module
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; &amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MultiHeadAttention&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;num_heads&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;d_model must be divisible by num_heads&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_heads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;num_heads&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;d_k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;num_heads&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Define linear transformation layers for Q, K, V and output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W_q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W_k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W_v&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W_o&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scaled_dot_product_attention&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 1. Calculate attention scores (QK^T)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;attn_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;matmul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transpose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;d_k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 2. Apply mask (if provided)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Set positions where mask is 0 to a very small negative number, so they approach 0 after softmax&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;attn_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;attn_scores&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;masked_fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mask&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1e9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 3. Calculate attention weights (Softmax)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;attn_probs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;softmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attn_scores&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 4. Weighted sum (weights * V)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;matmul&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attn_probs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;split_heads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Transform input x shape from (batch_size, seq_length, d_model)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# to (batch_size, num_heads, seq_length, d_k)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seq_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seq_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;d_k&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transpose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;combine_heads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Transform input x shape from (batch_size, num_heads, seq_length, d_k)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# back to (batch_size, seq_length, d_model)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_heads&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seq_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transpose&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contiguous&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seq_length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 1. Perform linear transformations on Q, K, V&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;Q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split_heads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W_q&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Q&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;K&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split_heads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W_k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;V&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split_heads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W_v&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 2. Calculate scaled dot-product attention&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;attn_output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scaled_dot_product_attention&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;K&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;V&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# 3. Combine multi-head outputs and perform final linear transformation&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;output&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;W_o&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;combine_heads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attn_output&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;strong&gt;(3) Feed-Forward Neural Network&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In each Encoder and Decoder layer, the multi-head attention sublayer is followed by a &lt;strong&gt;Position-wise Feed-Forward Network (FFN)&lt;/strong&gt;. If the role of the attention layer is to &amp;ldquo;dynamically aggregate&amp;rdquo; relevant information from the entire sequence, then the role of the feed-forward network is to extract higher-order features from this aggregated information.&lt;/p&gt;
&lt;p&gt;The key to this name is &amp;ldquo;position-wise.&amp;rdquo; It means this feed-forward network acts independently on each token vector in the sequence. In other words, for a sequence of length &lt;code&gt;seq_len&lt;/code&gt;, the FFN is conceptually applied &lt;code&gt;seq_len&lt;/code&gt; times, once per position (in practice, this is computed as a single batched matrix multiplication). Importantly, all positions share the same set of network weights. This design both preserves the ability to process each position independently and greatly reduces the model&amp;rsquo;s parameter count. The network&amp;rsquo;s structure is very simple: two linear transformations with a ReLU activation in between:&lt;/p&gt;
$$\mathrm{FFN}(x)=\max\left(0, xW_{1}+b_{1}\right) W_{2}+b_{2}$$&lt;p&gt;Where $x$ is the output of the attention sublayer. $W_1,b_1,W_2,b_2$ are learnable parameters. Typically, the output dimension &lt;code&gt;d_ff&lt;/code&gt; of the first linear layer is much larger than the input dimension &lt;code&gt;d_model&lt;/code&gt; (for example, &lt;code&gt;d_ff = 4 * d_model&lt;/code&gt;), then after ReLU activation, it is mapped back to &lt;code&gt;d_model&lt;/code&gt; dimension through the second linear layer. This &amp;ldquo;expand then shrink&amp;rdquo; design is believed to help the model learn richer feature representations.&lt;/p&gt;
&lt;p&gt;In our PyTorch skeleton, we can implement this module with the following code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PositionWiseFeedForward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;&amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; Position-wise feed-forward network module
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; &amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_ff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;PositionWiseFeedForward&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linear1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_ff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linear2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d_ff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# x shape: (batch_size, seq_len, d_model)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linear1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linear2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Final output shape: (batch_size, seq_len, d_model)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;strong&gt;(4) Residual Connections and Layer Normalization&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In each encoder and decoder layer of the Transformer, every submodule (such as multi-head attention and the feed-forward network) is wrapped by an &lt;code&gt;Add &amp;amp; Norm&lt;/code&gt; operation. This combination is what allows the Transformer to train stably.&lt;/p&gt;
&lt;p&gt;This operation consists of two parts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Residual Connection (Add)&lt;/strong&gt;: This operation directly adds the submodule&amp;rsquo;s input &lt;code&gt;x&lt;/code&gt; to the submodule&amp;rsquo;s output &lt;code&gt;Sublayer(x)&lt;/code&gt;. This structure mitigates the &lt;strong&gt;Vanishing Gradients&lt;/strong&gt; problem in deep neural networks. During backpropagation, gradients can bypass the submodule and flow directly through the identity path to earlier layers, ensuring that the model can be trained effectively even when the network is very deep. Its formula can be expressed as: $\text{Output} = x + \text{Sublayer}(x)$.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Layer Normalization (Norm)&lt;/strong&gt;: This operation normalizes all features of a single sample so that their mean is 0 and variance is 1. This alleviates the &lt;strong&gt;Internal Covariate Shift&lt;/strong&gt; problem during training, keeping the input distribution of each layer stable, thereby accelerating convergence and improving training stability.&lt;/li&gt;
&lt;/ul&gt;
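&lt;p&gt;As a minimal PyTorch sketch (illustrative only; the &lt;code&gt;AddNorm&lt;/code&gt; class name is our own and not part of the skeleton built earlier), the post-norm &lt;code&gt;Add &amp;amp; Norm&lt;/code&gt; of the original Transformer can be written as:&lt;/p&gt;

```python
import torch
import torch.nn as nn

class AddNorm(nn.Module):
    """Post-norm 'Add and Norm': LayerNorm(x + Dropout(Sublayer(x)))."""
    def __init__(self, d_model, dropout=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer_out):
        # Residual connection (Add) followed by layer normalization (Norm)
        return self.norm(x + self.dropout(sublayer_out))

# Wrap a feed-forward sublayer with Add and Norm
ffn = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 8))
add_norm = AddNorm(d_model=8)
x = torch.randn(2, 5, 8)
out = add_norm(x, ffn(x))
print(out.shape)  # torch.Size([2, 5, 8])
```

After normalization, each position's feature vector has mean 0 along the model dimension, which is what keeps the layer inputs stable during training.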
&lt;p&gt;&lt;strong&gt;3.1.2.5 Positional Encoding&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We already understand that the core of Transformer is the self-attention mechanism, which captures dependencies by calculating relationships between any two tokens in a sequence. However, this computation method has an inherent problem: it does not contain any information about token order or position. For self-attention, the two sequences &amp;ldquo;agent learns&amp;rdquo; and &amp;ldquo;learns agent&amp;rdquo; are completely equivalent because it only cares about relationships between tokens and ignores their arrangement. To solve this problem, Transformer introduced &lt;strong&gt;Positional Encoding&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The core idea of positional encoding is to add to each token embedding in the input sequence an additional &amp;ldquo;position vector&amp;rdquo; that carries absolute and relative position information. This position vector is not learned but computed directly from a fixed mathematical formula. This way, even if two tokens (for example, two tokens both called &lt;code&gt;agent&lt;/code&gt;) have the same embedding, because they occupy different positions in the sentence, the vectors ultimately fed into the Transformer become unique after different positional encodings are added. The positional encoding proposed in the original paper is generated using sine and cosine functions, with the following formulas:&lt;/p&gt;
$$PE_{(pos,2i)}=\sin\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)$$$$PE_{(pos,2i+1)}=\cos\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)$$&lt;p&gt;Where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$pos$ is the position of the token in the sequence (for example, $0$, $1$, $2$, &amp;hellip;)&lt;/li&gt;
&lt;li&gt;$i$ is the dimension index in the position vector (from $0$ to $d_{\text{model}}/2 - 1$)&lt;/li&gt;
&lt;li&gt;$d_{\text{model}}$ is the dimension of the word embedding vector (consistent with what we defined in the model)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now, let&amp;rsquo;s implement the &lt;code&gt;PositionalEncoding&lt;/code&gt; module and complete the last part of our Transformer skeleton code.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;span class="lnt"&gt;22
&lt;/span&gt;&lt;span class="lnt"&gt;23
&lt;/span&gt;&lt;span class="lnt"&gt;24
&lt;/span&gt;&lt;span class="lnt"&gt;25
&lt;/span&gt;&lt;span class="lnt"&gt;26
&lt;/span&gt;&lt;span class="lnt"&gt;27
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PositionalEncoding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;&amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; Add positional encoding to word embedding vectors of input sequence.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; &amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_len&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;super&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="fm"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Create a sufficiently long positional encoding matrix&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unsqueeze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;div_term&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;10000.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# pe (positional encoding) size is (max_len, d_model)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;pe&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zeros&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d_model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Even dimensions use sin, odd dimensions use cos&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;pe&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;div_term&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;pe&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;div_term&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Register pe as buffer, so it won&amp;#39;t be treated as model parameter but will move with the model (e.g., to(device))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;register_buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;pe&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unsqueeze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Tensor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# x.size(1) is the current input sequence length&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# Add positional encoding to input vector&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pe&lt;/span&gt;&lt;span class="p"&gt;[:,&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;This subsection has focused on the macro structure of the Transformer and the inner workings of each of its modules. Since its purpose is to supplement the large-model background needed for agent learning, we will not take the implementation any further. At this point, we have laid a solid architectural foundation for understanding modern large language models. In the next section, we will explore the Decoder-Only architecture and see how it evolved from the Transformer&amp;rsquo;s ideas.&lt;/p&gt;
&lt;h3 id="313-decoder-only-architecture"&gt;3.1.3 Decoder-Only Architecture
&lt;/h3&gt;&lt;p&gt;In the previous section, we built a complete Transformer model by hand, an architecture that excels in many end-to-end scenarios. But when the task shifts to building a general model that can converse with people, create content, and serve as an agent&amp;rsquo;s brain, we may not need such a complex structure.&lt;/p&gt;
&lt;p&gt;Transformer&amp;rsquo;s design philosophy is &amp;ldquo;understand first, then generate.&amp;rdquo; The encoder is responsible for deeply understanding the entire input sentence, forming a contextual memory containing global information, then the decoder generates translation based on this memory. But when OpenAI developed &lt;strong&gt;GPT (Generative Pre-trained Transformer)&lt;/strong&gt;, they proposed a simpler idea&lt;sup&gt;[5]&lt;/sup&gt;: &lt;strong&gt;Isn&amp;rsquo;t the core task of language to predict the next most likely word?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Whether answering questions, writing stories, or generating code, the task is essentially to append the most plausible content, word by word, to an existing text sequence. Based on this idea, GPT made a bold simplification: &lt;strong&gt;it completely abandoned the encoder and kept only the decoder.&lt;/strong&gt; This is the origin of the &lt;strong&gt;Decoder-Only&lt;/strong&gt; architecture.&lt;/p&gt;
&lt;p&gt;The working mode of the Decoder-Only architecture is called &lt;strong&gt;Autoregressive&lt;/strong&gt;. This professional-sounding term actually describes a very simple process:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Give the model a starting text (for example, &amp;ldquo;Datawhale Agent is&amp;rdquo;).&lt;/li&gt;
&lt;li&gt;The model predicts the next most likely word (for example, &amp;ldquo;a&amp;rdquo;).&lt;/li&gt;
&lt;li&gt;The model adds the word &amp;ldquo;a&amp;rdquo; it just generated to the end of the input text, forming a new input (&amp;ldquo;Datawhale Agent is a&amp;rdquo;).&lt;/li&gt;
&lt;li&gt;Based on this new input, the model predicts the next word again (for example, &amp;ldquo;powerful&amp;rdquo;).&lt;/li&gt;
&lt;li&gt;Continuously repeat this process until a complete sentence is generated or a stop condition is reached.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The model is like playing a &amp;ldquo;word chain&amp;rdquo; game, constantly &amp;ldquo;reviewing&amp;rdquo; the content it has already written, then thinking about what the next word should be.&lt;/p&gt;
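&lt;p&gt;The steps above can be sketched as a simple loop. The &lt;code&gt;toy_next_token&lt;/code&gt; function below is a hypothetical stand-in for a real model&amp;rsquo;s forward pass; the point is the loop structure of feeding each generated token back into the input.&lt;/p&gt;

```python
def toy_next_token(tokens):
    # Hypothetical lookup table standing in for a full model forward pass.
    continuations = {
        ("Datawhale", "Agent", "is"): "a",
        ("Datawhale", "Agent", "is", "a"): "powerful",
        ("Datawhale", "Agent", "is", "a", "powerful"): "[EOS]",
    }
    return continuations.get(tuple(tokens), "[EOS]")

tokens = ["Datawhale", "Agent", "is"]  # 1. starting text
for _ in range(10):                    # hard cap on generation length
    nxt = toy_next_token(tokens)       # 2. predict the next word
    if nxt == "[EOS]":                 # 5. stop condition reached
        break
    tokens.append(nxt)                 # 3. append it to the input
print(" ".join(tokens))  # Datawhale Agent is a powerful
```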
&lt;p&gt;You might ask, how does the decoder ensure that when predicting the &lt;code&gt;t&lt;/code&gt;-th word, it doesn&amp;rsquo;t &amp;ldquo;peek&amp;rdquo; at the answer of the &lt;code&gt;t+1&lt;/code&gt;-th word?&lt;/p&gt;
&lt;p&gt;The answer is &lt;strong&gt;Masked Self-Attention&lt;/strong&gt;. In the Decoder-Only architecture, this mechanism becomes crucial. Its working principle is very clever:&lt;/p&gt;
&lt;p&gt;After the self-attention mechanism calculates the attention score matrix (i.e., each token&amp;rsquo;s attention score with respect to every other token), but before Softmax normalization, the model applies a &amp;ldquo;mask.&amp;rdquo; This mask replaces the scores of all tokens located after the current position (i.e., not yet observed) with negative infinity (in practice, a very large negative number). When this masked matrix passes through the Softmax function, the probabilities at those positions become 0. This way, when the model computes the output at any position, it is mathematically prevented from attending to information that comes after it. This mechanism ensures that when predicting the next word, the model can rely only on the information it has already seen before the current position, guaranteeing fair prediction and coherent logic.&lt;/p&gt;
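&lt;p&gt;A minimal standalone sketch of this masking step in PyTorch (not tied to the skeleton code above): build an upper-triangular Boolean mask, fill the &amp;ldquo;future&amp;rdquo; positions with negative infinity, and apply Softmax.&lt;/p&gt;

```python
import torch

seq_len = 4
scores = torch.randn(seq_len, seq_len)  # raw attention scores (queries x keys)

# Causal mask: True above the diagonal marks "future" positions to block.
mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
masked_scores = scores.masked_fill(mask, float("-inf"))

# Softmax turns the -inf entries into exactly 0 probability.
weights = torch.softmax(masked_scores, dim=-1)
print(weights)  # upper triangle is 0: each position sees only itself and earlier tokens
```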
&lt;p&gt;&lt;strong&gt;Advantages of Decoder-Only Architecture&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This seemingly simple architecture has brought tremendous success, with advantages including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Unified Training Objective&lt;/strong&gt;: The model&amp;rsquo;s only task is to &amp;ldquo;predict the next word,&amp;rdquo; a simple goal very suitable for pre-training on massive unlabeled text data.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Simple Structure, Easy to Scale&lt;/strong&gt;: Fewer components mean easier scaling. Today&amp;rsquo;s GPT-4, Llama, and other giant models with hundreds of billions or even trillions of parameters are all based on this concise architecture.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Naturally Suited for Generation Tasks&lt;/strong&gt;: Its autoregressive working mode perfectly matches all generative tasks (dialogue, writing, code generation, etc.), which is also the core reason it can become the foundation for building general agents.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In summary, the Decoder-Only architecture evolved from the Transformer&amp;rsquo;s decoder and, through the simple paradigm of &amp;ldquo;predicting the next word,&amp;rdquo; opened the era of large language models we live in today.&lt;/p&gt;
&lt;h2 id="32-interacting-with-large-language-models"&gt;3.2 Interacting with Large Language Models
&lt;/h2&gt;&lt;h3 id="321-prompt-engineering"&gt;3.2.1 Prompt Engineering
&lt;/h3&gt;&lt;p&gt;If we compare large language models to an extremely capable &amp;ldquo;brain,&amp;rdquo; then &lt;strong&gt;Prompt&lt;/strong&gt; is the language we use to communicate with this &amp;ldquo;brain.&amp;rdquo; Prompt engineering is the study of how to design precise prompts to guide the model to produce the responses we expect. For building agents, a carefully designed prompt can make collaboration and division of labor between agents efficient.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(1) Model Sampling Parameters&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;When using large models, you often see configurable parameters like &lt;code&gt;Temperature&lt;/code&gt;. Their essence is to adjust the model&amp;rsquo;s sampling strategy for &amp;ldquo;probability distribution&amp;rdquo; to match specific scenario needs. Configuring appropriate parameters can improve Agent performance in specific scenarios.&lt;/p&gt;
&lt;p&gt;By default, the model&amp;rsquo;s output probability distribution is calculated with the Softmax formula: $p_i = \frac{e^{z_i}}{\sum_{j=1}^k e^{z_j}}$. The essence of sampling parameters is to &amp;ldquo;reshape&amp;rdquo; or &amp;ldquo;truncate&amp;rdquo; this distribution according to different strategies, thereby changing the next token the model outputs.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Temperature&lt;/code&gt;: Temperature is a key parameter controlling the &amp;ldquo;randomness&amp;rdquo; and &amp;ldquo;determinism&amp;rdquo; of model output. Its principle is to introduce a temperature coefficient $T\gt0$, rewriting Softmax as $p_i^{(T)} = \frac{e^{z_i / T}}{\sum_{j=1}^k e^{z_j / T}}$.&lt;/p&gt;
&lt;p&gt;When T decreases, the distribution becomes &amp;ldquo;steeper,&amp;rdquo; high-probability item weights are further amplified, generating more &amp;ldquo;conservative&amp;rdquo; text with higher repetition rates. When T increases, the distribution becomes &amp;ldquo;flatter,&amp;rdquo; low-probability item weights increase, generating more &amp;ldquo;diverse&amp;rdquo; but possibly incoherent content.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Low temperature (0 $\leqslant$ Temperature $\lt$ 0.3): Output is more &amp;ldquo;precise, deterministic.&amp;rdquo; Applicable scenarios: Factual tasks: such as Q&amp;amp;A, data calculation, code generation; Rigorous scenarios: legal text interpretation, technical documentation writing, academic concept explanation, etc.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Medium temperature (0.3 $\leqslant$ Temperature $\lt$ 0.7): Output is &amp;ldquo;balanced, natural.&amp;rdquo; Applicable scenarios: Daily conversation: such as customer service interaction, chatbots; Regular creation: such as email writing, product copy, simple story creation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;High temperature (0.7 $\leqslant$ Temperature $\lt$ 2): Output is &amp;ldquo;innovative, divergent.&amp;rdquo; Applicable scenarios: Creative tasks: such as poetry creation, science fiction story conception, advertising slogan brainstorming, artistic inspiration; Divergent thinking.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
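&lt;p&gt;The effect of the temperature coefficient can be verified with a minimal pure-Python sketch (the logits below are illustrative values, not from a real model):&lt;/p&gt;

```python
import math

def softmax_with_temperature(logits, T=1.0):
    """Temperature-scaled Softmax: p_i = exp(z_i / T) / sum_j exp(z_j / T)."""
    scaled = [z / T for z in logits]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                          # hypothetical next-token logits
print(softmax_with_temperature(logits, T=1.0))    # baseline distribution
print(softmax_with_temperature(logits, T=0.2))    # steeper: the top token dominates
print(softmax_with_temperature(logits, T=2.0))    # flatter: more diversity
```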
&lt;p&gt;&lt;code&gt;Top-k&lt;/code&gt;: Its principle is to sort all tokens by probability from high to low, take the top k tokens to form a &amp;ldquo;candidate set,&amp;rdquo; then &amp;ldquo;normalize&amp;rdquo; the probabilities of the filtered k tokens: $\hat{p}_i = \frac{p_i}{\sum_{j \in \text{candidate set}} p_j}$&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Difference and connection with temperature sampling: Temperature sampling adjusts the probability distribution of all tokens (smooth or steep) through temperature T, without changing the number of candidate tokens (still considering all N). Top-k sampling limits the number of candidate tokens (only keeping the top k high-probability tokens) through the k value, then samples from them. When k=1, output is completely deterministic, degenerating to &amp;ldquo;greedy sampling.&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;Top-p&lt;/code&gt;: Its principle is to sort all tokens by probability from high to low, starting from the first token after sorting, gradually accumulating probabilities until the cumulative sum first reaches or exceeds threshold p: $\sum_{i \in S} p_{(i)} \geq p$. At this point, all tokens included in the accumulation process form the &amp;ldquo;nucleus set,&amp;rdquo; and finally the nucleus set is normalized.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Difference and connection with Top-k: Compared to Top-k with fixed truncation size, Top-p can dynamically adapt to the &amp;ldquo;long tail&amp;rdquo; characteristics of different distributions, with better adaptability to extreme cases of uneven probability distribution.&lt;/li&gt;
&lt;/ul&gt;
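&lt;p&gt;The two truncation strategies can be sketched side by side (an illustrative pure-Python sketch over a toy distribution; real inference engines operate on the full logits tensor):&lt;/p&gt;

```python
def top_k_filter(probs, k):
    """Top-k: keep the k highest-probability tokens, then renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(q for _, q in ranked)
    return {tok: q / total for tok, q in ranked}

def top_p_filter(probs, p):
    """Top-p: keep the smallest high-probability prefix whose cumulative sum reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for tok, q in ranked:
        kept.append((tok, q))
        cum += q
        if cum >= p:          # the "nucleus" is covered, stop here
            break
    total = sum(q for _, q in kept)
    return {tok: q / total for tok, q in kept}

probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
print(top_k_filter(probs, k=2))     # fixed-size candidate set
print(top_p_filter(probs, p=0.7))   # dynamic candidate set adapted to the distribution
```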
&lt;p&gt;In text generation, when Top-p, Top-k, and temperature coefficient are set simultaneously, these parameters work together in a layered filtering manner, with priority order: temperature adjustment → Top-k → Top-p. Temperature adjusts the overall steepness of the distribution, Top-k first retains the k candidates with highest probability, then Top-p selects the minimum set with cumulative probability ≥ p from Top-k results as the final candidate set. However, usually choosing one of Top-k or Top-p is sufficient; if both are set, the actual candidate set is the intersection of the two.
Note that if temperature is set to 0, Top-k and Top-p become irrelevant because the most likely Token will be the next predicted Token; if Top-k is set to 1, temperature and Top-p also become irrelevant because only one Token passes the Top-k criterion and it will be the next predicted Token.&lt;/p&gt;
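&lt;p&gt;The layered order described above (temperature adjustment → Top-k → Top-p) can be combined into a single sketch (illustrative only; the logits are made-up values):&lt;/p&gt;

```python
import math

def layered_filter(logits, T=1.0, k=None, p=None):
    """Apply the combined order: temperature, then Top-k, then Top-p; renormalize at the end."""
    scaled = {tok: z / T for tok, z in logits.items()}
    m = max(scaled.values())               # subtract the max for numerical stability
    exps = {tok: math.exp(z - m) for tok, z in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda kv: kv[1], reverse=True)
    if k is not None:
        ranked = ranked[:k]                # Top-k: fixed-size truncation
    if p is not None:
        kept, cum = [], 0.0
        for tok, q in ranked:
            kept.append((tok, q))
            cum += q
            if cum >= p:                   # Top-p: stop once the nucleus is covered
                break
        ranked = kept
    norm = sum(q for _, q in ranked)
    return {tok: q / norm for tok, q in ranked}

logits = {"the": 2.0, "a": 1.2, "cat": 0.4, "zebra": -1.0}
print(layered_filter(logits, T=0.8, k=3, p=0.9))   # zebra is cut by Top-k
print(layered_filter(logits, T=0.0001))            # near-zero temperature: near-greedy
```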
&lt;p&gt;&lt;strong&gt;(2) Zero-shot, One-shot, and Few-shot Prompting&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;According to the number of examples (Exemplars) we provide to the model, prompts can be divided into three types. To better understand them, let&amp;rsquo;s use a sentiment classification task as an example, with the goal of having the model judge the emotional tone of a text (such as positive, negative, or neutral).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Zero-shot Prompting&lt;/strong&gt; This means we don&amp;rsquo;t give the model any examples and directly ask it to complete the task based on instructions. This benefits from the model&amp;rsquo;s powerful generalization ability acquired after pre-training on massive data.&lt;/p&gt;
&lt;p&gt;Case: We directly give the model instructions, requiring it to complete the sentiment classification task.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Datawhale&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;s AI Agent course is excellent!&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Sentiment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Positive&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;strong&gt;One-shot Prompting&lt;/strong&gt; We provide the model with one complete example, showing it the task format and expected output style.&lt;/p&gt;
&lt;p&gt;Case: We first give the model a complete &amp;ldquo;question-answer&amp;rdquo; pair as a demonstration, then pose our new question.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;s service is too slow.&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Sentiment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Negative&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Datawhale&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;s AI Agent course is excellent!&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Sentiment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The model will imitate the given example format and complete &amp;ldquo;Positive&amp;rdquo; for the second text.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Few-shot Prompting&lt;/strong&gt; We provide multiple examples, which allows the model to more accurately understand the task&amp;rsquo;s details, boundaries, and nuances, thereby achieving better performance.&lt;/p&gt;
&lt;p&gt;Case: We provide multiple examples covering different situations, allowing the model to have a more comprehensive understanding of the task.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="n"&gt;restaurant&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;s service is too slow.&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Sentiment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Negative&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;This&lt;/span&gt; &lt;span class="n"&gt;movie&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;s plot is very bland.&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Sentiment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Neutral&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Datawhale&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;s AI Agent course is excellent!&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Sentiment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The model will synthesize all examples and more accurately classify the sentiment of the last sentence as &amp;ldquo;Positive.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(3) Impact of Instruction Tuning&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Early GPT models (such as GPT-3) were mainly &amp;ldquo;text completion&amp;rdquo; models; they were good at continuing text based on preceding text but not necessarily good at understanding and executing human instructions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Instruction Tuning&lt;/strong&gt; is a fine-tuning technique that uses a large amount of &amp;ldquo;instruction-answer&amp;rdquo; format data to further train pre-trained models. After instruction tuning, models can better understand and follow user instructions. The models we use in daily work and study (such as &lt;code&gt;ChatGPT&lt;/code&gt;, &lt;code&gt;DeepSeek&lt;/code&gt;, &lt;code&gt;Qwen&lt;/code&gt;) are the instruction-tuned members of their model families.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Prompts for &amp;ldquo;text completion&amp;rdquo; models (you need to use few-shot prompts to &amp;ldquo;teach&amp;rdquo; the model what to do):&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Plain" data-lang="Plain"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;This is a program that translates English to Chinese.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;English: Hello
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Chinese: 你好
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;English: How are you?
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Chinese:
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Prompts for &amp;ldquo;instruction-tuned&amp;rdquo; models (you can directly give instructions):&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Plain" data-lang="Plain"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Please translate the following English to Chinese:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;How are you?
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;The emergence of instruction tuning has greatly simplified how we interact with models, making direct, clear natural language instructions possible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;(4) Basic Prompting Techniques&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Role-playing&lt;/strong&gt; By assigning the model a specific role, we can guide its response style, tone, and knowledge scope, making its output more suitable for specific scenario needs.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Plain" data-lang="Plain"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# Case
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;You are now a senior Python programming expert. Please explain what GIL (Global Interpreter Lock) is in Python in a way that even a beginner can understand.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;strong&gt;In-context Example&lt;/strong&gt; This is consistent with the idea of few-shot prompting. By providing clear input-output examples in the prompt, we &amp;ldquo;teach&amp;rdquo; the model how to handle our requests, which is especially effective when dealing with complex formats or specific style tasks.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Plain" data-lang="Plain"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# Case
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;I need you to extract product names and user sentiment from product reviews. Please output strictly in the JSON format below.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Review: The screen display of this &amp;#34;Stardust&amp;#34; laptop is amazing, but I don&amp;#39;t really like its keyboard feel.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Output: {&amp;#34;product_name&amp;#34;: &amp;#34;Stardust Laptop&amp;#34;, &amp;#34;sentiment&amp;#34;: &amp;#34;Mixed&amp;#34;}
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Review: The &amp;#34;SoundMove&amp;#34; headphones I just bought have great sound quality, and the battery life exceeded my expectations!
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Output:
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;&lt;strong&gt;(5) Chain-of-Thought&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For complex problems requiring logical reasoning, calculation, or multi-step thinking, directly asking the model to give an answer often leads to errors. &lt;strong&gt;Chain-of-Thought (CoT)&lt;/strong&gt; is a powerful prompting technique that improves the model&amp;rsquo;s reasoning ability on complex tasks by guiding it to &amp;ldquo;think step by step.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The key to implementing CoT is to add a simple guiding phrase in the prompt, such as &amp;ldquo;please think step by step&amp;rdquo; or &amp;ldquo;Let&amp;rsquo;s think step by step.&amp;rdquo;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Plain" data-lang="Plain"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# Chain-of-Thought Prompt
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;A basketball team won 60% of their 80 games in one season. In the next season, they played 15 games and won 12. What is the total winning percentage for both seasons?
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Please think step by step and solve.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&amp;gt;&amp;gt;&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;(The model might output)
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Okay, let&amp;#39;s calculate step by step.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Step 1: Calculate the number of games won in the first season.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;80 games * 60% = 48 games.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Step 2: Calculate the total number of games and total wins for both seasons.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Total games = 80 + 15 = 95 games.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Total wins = 48 + 12 = 60 games.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Step 3: Calculate the total winning percentage.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Total winning percentage = (Total wins / Total games) * 100% = (60 / 95) * 100% ≈ 63.16%.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;So, the total winning percentage for both seasons is approximately 63.16%.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;By explicitly showing its reasoning process, the model not only more easily arrives at the correct answer but also makes its response more credible and easier for us to check and correct.&lt;/p&gt;
&lt;h3 id="322-text-tokenization"&gt;3.2.2 Text Tokenization
&lt;/h3&gt;&lt;p&gt;We know that computers essentially can only understand numbers. Therefore, before feeding natural language text to large language models, it must first be converted into a numerical format that the model can process. This process of converting text sequences into numerical sequences is called &lt;strong&gt;Tokenization&lt;/strong&gt;. The role of a &lt;strong&gt;Tokenizer&lt;/strong&gt; is to define a set of rules to split raw text into minimal units, which we call &lt;strong&gt;Tokens&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3.2.2.1 Why Tokenization is Needed&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Early natural language processing systems often adopted simple tokenization strategies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Word-based&lt;/strong&gt;: Directly splits sentences into words using spaces or punctuation. This method is intuitive but faces significant challenges:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Vocabulary Explosion and OOV&lt;/strong&gt;: A language&amp;rsquo;s vocabulary is vast. If each word is treated as an independent token, the vocabulary becomes difficult to manage. Worse, the model cannot handle any word that does not appear in its vocabulary (e.g., &amp;ldquo;DatawhaleAgent&amp;rdquo;). This phenomenon is known as the &amp;ldquo;Out-Of-Vocabulary&amp;rdquo; (OOV) problem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Lack of Semantic Association&lt;/strong&gt;: The model struggles to capture the semantic relationships between morphologically similar words. For instance, &amp;ldquo;look,&amp;rdquo; &amp;ldquo;looks,&amp;rdquo; and &amp;ldquo;looking&amp;rdquo; are treated as three completely different tokens, despite sharing a common core meaning. Similarly, the semantics of low-frequency words in the training data cannot be fully learned due to their rare occurrences.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Character-based&lt;/strong&gt;: Splits text into individual characters. This method has a very small vocabulary (e.g., English letters, numbers, and punctuation) and thus avoids the OOV problem. However, its disadvantage is that individual characters mostly lack independent semantic meaning. The model must expend more effort learning to combine characters into meaningful words, leading to inefficient learning.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To balance vocabulary size and semantic expression, modern large language models widely adopt &lt;strong&gt;Subword Tokenization&lt;/strong&gt; algorithms. The core idea is to keep common words (like &amp;ldquo;agent&amp;rdquo;) as single, complete tokens while breaking down uncommon words (like &amp;ldquo;Tokenization&amp;rdquo;) into meaningful subword pieces (such as &amp;ldquo;Token&amp;rdquo; and &amp;ldquo;ization&amp;rdquo;). This approach not only controls the size of the vocabulary but also enables the model to understand and generate new words by combining subwords.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3.2.2.2 Byte-Pair Encoding Algorithm Analysis&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Byte-Pair Encoding (BPE) is one of the most mainstream subword tokenization algorithms&lt;sup&gt;[6]&lt;/sup&gt;, adopted by the GPT series models. Its core idea is very concise and can be understood as a &amp;ldquo;greedy&amp;rdquo; merging process:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Initialization&lt;/strong&gt;: Initialize the vocabulary to all basic characters appearing in the corpus.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Iterative Merging&lt;/strong&gt;: In the corpus, count the frequency of all adjacent token pairs, find the pair with the highest frequency, merge them into a new token, and add it to the vocabulary.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Repeat&lt;/strong&gt;: Repeat step 2 until the vocabulary size reaches a preset threshold.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;Case Demonstration:&lt;/strong&gt; Suppose our mini corpus is &lt;code&gt;{&amp;quot;hug&amp;quot;: 1, &amp;quot;pug&amp;quot;: 1, &amp;quot;pun&amp;quot;: 1, &amp;quot;bun&amp;quot;: 1}&lt;/code&gt;, and we want to build a vocabulary of size 10. The BPE training process can be represented by Table 3.1:&lt;/p&gt;
&lt;div align="center"&gt;
&lt;p&gt;Table 3.1 Example of BPE Algorithm Merging Process&lt;/p&gt;
&lt;img src="https://raw.githubusercontent.com/datawhalechina/Hello-Agents/main/docs/images/3-figures/1757249275674-5.png" alt="Figure description" width="90%"/&gt;
&lt;/div&gt;
&lt;p&gt;After training ends, when the vocabulary size reaches 10, we get new tokenization rules. Now, for an unseen word &amp;ldquo;bug,&amp;rdquo; the tokenizer will first check if &amp;ldquo;bug&amp;rdquo; is in the vocabulary and find it&amp;rsquo;s not; then check &amp;ldquo;bu&amp;rdquo; and find it&amp;rsquo;s not; finally check &amp;ldquo;b&amp;rdquo; and &amp;ldquo;ug,&amp;rdquo; find both are in, and thus split it into &lt;code&gt;['b', 'ug']&lt;/code&gt;.&lt;/p&gt;
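&lt;p&gt;This longest-match-first lookup can be sketched directly (a toy illustration; the vocabulary below is assumed from the example, and production BPE tokenizers replay learned merge rules rather than searching raw prefixes):&lt;/p&gt;

```python
def bpe_tokenize(word, vocab):
    """Greedily match the longest vocabulary entry from the current position."""
    tokens, start = [], 0
    while start != len(word):
        # try the longest remaining prefix first, shrinking one character at a time
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                tokens.append(piece)
                start = end
                break
        else:
            return None  # no entry matches: an unknown character
    return tokens

# vocabulary assumed to result from the example training run above
vocab = {"b", "g", "h", "n", "p", "u", "un", "ug", "hug", "pug"}
print(bpe_tokenize("bug", vocab))  # the unseen word splits into known subwords
```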
&lt;p&gt;Below we use a simple Python code to simulate the above process:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;span class="lnt"&gt;22
&lt;/span&gt;&lt;span class="lnt"&gt;23
&lt;/span&gt;&lt;span class="lnt"&gt;24
&lt;/span&gt;&lt;span class="lnt"&gt;25
&lt;/span&gt;&lt;span class="lnt"&gt;26
&lt;/span&gt;&lt;span class="lnt"&gt;27
&lt;/span&gt;&lt;span class="lnt"&gt;28
&lt;/span&gt;&lt;span class="lnt"&gt;29
&lt;/span&gt;&lt;span class="lnt"&gt;30
&lt;/span&gt;&lt;span class="lnt"&gt;31
&lt;/span&gt;&lt;span class="lnt"&gt;32
&lt;/span&gt;&lt;span class="lnt"&gt;33
&lt;/span&gt;&lt;span class="lnt"&gt;34
&lt;/span&gt;&lt;span class="lnt"&gt;35
&lt;/span&gt;&lt;span class="lnt"&gt;36
&lt;/span&gt;&lt;span class="lnt"&gt;37
&lt;/span&gt;&lt;span class="lnt"&gt;38
&lt;/span&gt;&lt;span class="lnt"&gt;39
&lt;/span&gt;&lt;span class="lnt"&gt;40
&lt;/span&gt;&lt;span class="lnt"&gt;41
&lt;/span&gt;&lt;span class="lnt"&gt;42
&lt;/span&gt;&lt;span class="lnt"&gt;43
&lt;/span&gt;&lt;span class="lnt"&gt;44
&lt;/span&gt;&lt;span class="lnt"&gt;45
&lt;/span&gt;&lt;span class="lnt"&gt;46
&lt;/span&gt;&lt;span class="lnt"&gt;47
&lt;/span&gt;&lt;span class="lnt"&gt;48
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;collections&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_stats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vocab&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;&amp;#34;&amp;#34;Count token pair frequencies&amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;pairs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;vocab&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;symbols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="n"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;merge_vocab&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v_in&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;&amp;#34;&amp;#34;Merge token pairs&amp;#34;&amp;#34;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;v_out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;bigram&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;escape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39; &amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(?&amp;lt;!\S)&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;bigram&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(?!\S)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;v_in&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;w_out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;v_out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;w_out&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v_in&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;v_out&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Prepare corpus, add &amp;lt;/w&amp;gt; at the end of each word to indicate ending, and split characters&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;vocab&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;h u g &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p u g &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p u n &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;b u n &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;num_merges&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="c1"&gt;# Set number of merges&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_merges&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;pairs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_stats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vocab&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="k"&gt;break&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;best&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pairs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;vocab&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;merge_vocab&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;vocab&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Merge &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; -&amp;gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;New vocabulary (partial): &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vocab&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;-&amp;#34;&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Merge&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;u&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;g&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ug&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;New&lt;/span&gt; &lt;span class="n"&gt;vocabulary&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;h ug &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p ug &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p u n &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;b u n &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;--------------------&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Merge&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;ug&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ug&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;New&lt;/span&gt; &lt;span class="n"&gt;vocabulary&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;h ug&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p ug&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p u n &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;b u n &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;--------------------&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Merge&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;u&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;n&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;un&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;New&lt;/span&gt; &lt;span class="n"&gt;vocabulary&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;h ug&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p ug&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p un &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;b un &amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;--------------------&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Merge&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;un&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;un&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;New&lt;/span&gt; &lt;span class="n"&gt;vocabulary&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;h ug&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p ug&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;p un&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;b un&amp;lt;/w&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;--------------------&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;This code demonstrates how the BPE algorithm builds its vocabulary step by step, at each iteration merging the highest-frequency adjacent token pair.&lt;/p&gt;
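As a complement to the training loop above, the learned merge rules can be replayed to segment a new word at inference time. The sketch below is illustrative and simplified: an underscore stands in for the end-of-word marker used above, and the merges are applied greedily in the order they were learned.

```python
# Merge rules in learned order, matching the run above
# (an underscore stands in for the end-of-word marker).
merges = [('u', 'g'), ('ug', '_'), ('u', 'n'), ('un', '_')]

def bpe_encode(word):
    """Split a word into characters, then apply each merge rule in order."""
    symbols = list(word)
    for a, b in merges:
        i = 0
        while i < len(symbols) - 1:
            if symbols[i] == a and symbols[i + 1] == b:
                symbols[i:i + 2] = [a + b]  # fuse the pair in place
            else:
                i += 1
    return symbols

# The unseen word 'bug' still decomposes into known subword tokens.
print(bpe_encode('bug_'))  # ['b', 'ug_']
print(bpe_encode('pun_'))  # ['p', 'un_']
```

This is why BPE handles out-of-vocabulary words gracefully: any new word falls back to a combination of learned subwords, or in the worst case to single characters.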
&lt;p&gt;Many subsequent algorithms are refinements of BPE. Among them, Google&amp;rsquo;s WordPiece and SentencePiece are the two most influential.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;WordPiece&lt;/strong&gt;: The algorithm adopted by Google&amp;rsquo;s BERT model&lt;sup&gt;[7]&lt;/sup&gt;. It is very similar to BPE, but its merge criterion is not &amp;ldquo;highest frequency&amp;rdquo;; instead it merges the token pair that most increases the likelihood of the corpus under a language model. Put simply, it prioritizes the merges that make the corpus as a whole most probable.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SentencePiece&lt;/strong&gt;: An open-source tokenization tool from Google&lt;sup&gt;[8]&lt;/sup&gt;, adopted by the Llama series of models. Its distinguishing feature is treating spaces as ordinary characters (encoded as the special symbol &lt;code&gt;▁&lt;/code&gt;, U+2581). This makes tokenization and decoding fully reversible and language-independent (for example, it does not need to know that Chinese is written without spaces between words).&lt;/li&gt;
&lt;/ul&gt;
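To make the WordPiece criterion concrete, a commonly cited approximation of its merge score is freq(pair) / (freq(first) × freq(second)): it favors pairs whose parts are rare on their own. The toy corpus and frequencies below are invented for illustration; they show plain BPE and this WordPiece-style score selecting different merges.

```python
from collections import Counter

# Toy corpus: pre-split word -> frequency (frequencies invented for illustration).
vocab = {'h u g': 10, 'p u g': 5, 'p u n': 12, 'b u n': 4, 'h u g s': 5}

pair_freq, unit_freq = Counter(), Counter()
for word, freq in vocab.items():
    symbols = word.split()
    for s in symbols:
        unit_freq[s] += freq
    for a, b in zip(symbols, symbols[1:]):
        pair_freq[(a, b)] += freq

# BPE merges the most frequent pair.
bpe_pick = max(pair_freq, key=pair_freq.get)

# WordPiece-style score rewards pairs whose parts are rare individually.
wp_score = lambda p: pair_freq[p] / (unit_freq[p[0]] * unit_freq[p[1]])
wp_pick = max(pair_freq, key=wp_score)

print(bpe_pick)  # ('u', 'g'): the most frequent pair overall
print(wp_pick)   # ('g', 's'): 's' is rare, so merging it gains the most
```

Both criteria count the same pair statistics, but WordPiece promotes the pair containing the rare `s` first, while BPE goes straight for raw frequency.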
&lt;p&gt;&lt;strong&gt;3.2.2.3 Significance of Tokenizers for Developers&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Mastering every detail of tokenization algorithms is not the goal. As an agent developer, however, you should understand the practical impact of tokenizers, because it bears directly on agent performance, cost, and stability:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Context Window Limitation&lt;/strong&gt;: The model&amp;rsquo;s context window (such as 8K, 128K) is calculated in &lt;strong&gt;Token count&lt;/strong&gt;, not character count or word count. The same text may have vastly different Token counts in different languages (such as Chinese and English) or with different tokenizers. Precisely managing input length and avoiding exceeding context limits is the foundation for building long-term memory agents.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;API Cost&lt;/strong&gt;: Most model APIs charge based on Token count. Understanding how your text will be tokenized is a key step in estimating and controlling agent operating costs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model Performance Anomalies&lt;/strong&gt;: Strange model behavior sometimes stems from tokenization. For example, a model may handle &lt;code&gt;2 + 2&lt;/code&gt; correctly yet stumble on &lt;code&gt;2+2&lt;/code&gt; (without spaces), because the latter may be tokenized as a single, uncommon token. Similarly, the same word with different capitalization may be split into entirely different Token sequences, affecting the model&amp;rsquo;s understanding. Accounting for these &amp;ldquo;traps&amp;rdquo; when designing prompts and parsing model outputs improves agent robustness.&lt;/li&gt;
&lt;/ul&gt;
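As one concrete consequence of the context-window point above, an agent usually has to trim its message history to a token budget before each call. The sketch below is a hypothetical helper, not a library API; the 4-characters-per-token estimate is only a rough rule of thumb for English text, and in practice you would count tokens with the model&amp;rsquo;s actual tokenizer.

```python
def fit_to_budget(messages, count_tokens, budget):
    """Keep the most recent messages whose combined token count fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > budget:
            break                   # older messages are dropped first
        kept.append(msg)
        used += cost
    return list(reversed(kept))     # restore chronological order

# Crude stand-in: ~4 characters per token is a common rule of thumb for English.
approx = lambda m: max(1, len(m) // 4)

history = ["hello there", "tell me about BPE",
           "BPE merges frequent pairs...", "and WordPiece?"]
print(fit_to_budget(history, approx, budget=12))
# ['BPE merges frequent pairs...', 'and WordPiece?']
```

Dropping the oldest messages first is the simplest policy; real agents often combine it with summarization of the evicted turns.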
&lt;h3 id="323-calling-open-source-large-language-models"&gt;3.2.3 Calling Open-Source Large Language Models
&lt;/h3&gt;&lt;p&gt;In Chapter 1 of this book, we drove our agents by calling large language models through APIs. That approach is fast and convenient, but not the only one. In scenarios involving sensitive data, offline operation, or fine-grained cost control, running large language models locally becomes essential.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hugging Face Transformers&lt;/strong&gt; is a powerful open-source library that provides standardized interfaces to load and use tens of thousands of pre-trained models. We will use it to complete this practice.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Environment Configuration and Model Selection&lt;/strong&gt;: To ensure most readers can follow along on a personal computer, we deliberately chose a small but capable model: &lt;code&gt;Qwen/Qwen1.5-0.5B-Chat&lt;/code&gt;. This is a dialogue model with about 500 million parameters, open-sourced by Alibaba. Its small size and solid performance make it well suited to introductory learning and local deployment.&lt;/p&gt;
&lt;p&gt;First, please ensure you have installed the necessary libraries:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Plain" data-lang="Plain"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pip install transformers torch
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;In the &lt;code&gt;transformers&lt;/code&gt; library, we typically use the &lt;code&gt;AutoModelForCausalLM&lt;/code&gt; and &lt;code&gt;AutoTokenizer&lt;/code&gt; classes to automatically load weights and tokenizers matching the model. The following code will automatically download required model files and tokenizer configurations from Hugging Face Hub, which may take some time depending on your network speed.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;torch&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Specify model ID&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;model_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;Qwen/Qwen1.5-0.5B-Chat&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Set device, prioritize GPU&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;device&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;cuda&amp;#34;&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cuda&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_available&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;cpu&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Using device: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Load tokenizer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;tokenizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Load model and move it to specified device&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForCausalLM&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Model and tokenizer loaded!&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Next, let&amp;rsquo;s create a dialogue prompt. The Qwen1.5-Chat model expects a specific chat template, so we first use the &lt;code&gt;tokenizer&lt;/code&gt; loaded in the previous step to apply that template, then convert the resulting text into the numerical IDs (Token IDs) the model understands.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;span class="lnt"&gt;22
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Prepare dialogue input&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;role&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;system&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;You are a helpful assistant.&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;role&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;user&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;content&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;Hello, please introduce yourself.&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Use tokenizer&amp;#39;s template to format input&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apply_chat_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;tokenize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;add_generation_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Encode input text&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;model_inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;return_tensors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;pt&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;Encoded input text:&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;input_ids&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;([[&lt;/span&gt;&lt;span class="mi"&gt;151644&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8948&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;198&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2610&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;525&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;264&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10950&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;17847&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;151645&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;198&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;151644&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;872&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;198&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;108386&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;37945&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100157&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;107828&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;1773&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;151645&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;198&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;151644&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span 
class="mi"&gt;77091&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;198&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;cuda:0&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;attention_mask&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tensor&lt;/span&gt;&lt;span class="p"&gt;([[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span 
class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;cuda:0&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Now we can call the model&amp;rsquo;s &lt;code&gt;generate()&lt;/code&gt; method to generate an answer. The model will output a series of Token IDs representing its answer.&lt;/p&gt;
&lt;p&gt;Finally, we need to use the tokenizer&amp;rsquo;s &lt;code&gt;decode()&lt;/code&gt; method to translate these numerical IDs back into human-readable text.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;span class="lnt"&gt;16
&lt;/span&gt;&lt;span class="lnt"&gt;17
&lt;/span&gt;&lt;span class="lnt"&gt;18
&lt;/span&gt;&lt;span class="lnt"&gt;19
&lt;/span&gt;&lt;span class="lnt"&gt;20
&lt;/span&gt;&lt;span class="lnt"&gt;21
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-Python" data-lang="Python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Use model to generate answer&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# max_new_tokens controls the maximum number of new Tokens the model can generate&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;generated_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;model_inputs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;max_new_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Truncate the input part from generated Token IDs&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# This way we only decode the newly generated part by the model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;generated_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;output_ids&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;):]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_ids&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_inputs&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;generated_ids&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Decode generated Token IDs&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenizer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generated_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skip_special_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;Model&amp;#39;s answer:&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;My&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;Tongyi&lt;/span&gt; &lt;span class="n"&gt;Qianwen&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;pre&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;trained&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="n"&gt;developed&lt;/span&gt; &lt;span class="n"&gt;by&lt;/span&gt; &lt;span class="n"&gt;Alibaba&lt;/span&gt; &lt;span class="n"&gt;Cloud&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="n"&gt;can&lt;/span&gt; &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="n"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;express&lt;/span&gt; &lt;span class="n"&gt;opinions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="n"&gt;My&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt; &lt;span class="n"&gt;functions&lt;/span&gt; &lt;span class="n"&gt;are&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;provide&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;multiple&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;including&lt;/span&gt; &lt;span class="n"&gt;but&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;limited&lt;/span&gt; &lt;span 
class="n"&gt;to&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt; &lt;span class="n"&gt;understanding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="n"&gt;generation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;machine&lt;/span&gt; &lt;span class="n"&gt;translation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;answering&lt;/span&gt; &lt;span class="n"&gt;systems&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;etc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Is&lt;/span&gt; &lt;span class="n"&gt;there&lt;/span&gt; &lt;span class="n"&gt;anything&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="n"&gt;can&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt;&lt;span class="err"&gt;?&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;After running all the code, you will see the model-generated introduction about the Qwen model on your local computer. Congratulations, you have successfully deployed and run an open-source large language model locally!&lt;/p&gt;
&lt;h3 id="324-model-selection"&gt;3.2.4 Model Selection
&lt;/h3&gt;&lt;p&gt;In the previous section, we successfully ran a small open-source language model locally. This naturally raises a crucial question for agent developers: amid today&amp;rsquo;s profusion of competing models, how should we choose the one best suited to a given task?&lt;/p&gt;
&lt;p&gt;Choosing a language model is not simply a matter of pursuing &amp;ldquo;the biggest and strongest&amp;rdquo;; it is a decision-making process that balances performance, cost, speed, and deployment method. This section first organizes the key considerations for model selection, then reviews the current mainstream closed-source and open-source models.&lt;/p&gt;
&lt;p&gt;Large language model technology is evolving rapidly, with new models and versions appearing constantly. This section strives to give an accurate overview of mainstream models and selection considerations as of the time of writing, but readers should note that the specific model versions and performance figures mentioned may change over time, and that the survey is representative rather than exhaustive. Our focus is on the core technical characteristics, development trends, and general selection principles relevant to agent development.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;3.2.4.1 Key Considerations for Model Selection&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;When choosing a large language model for your agent, you can comprehensively evaluate from the following dimensions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Performance and Capability&lt;/strong&gt;: This is the core consideration. Different models excel at different tasks; some are good at logical reasoning and code generation, while others are better at creative writing or multilingual translation. You can refer to some public benchmark leaderboards (such as LMSys Chatbot Arena Leaderboard) to evaluate models&amp;rsquo; comprehensive capabilities.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cost&lt;/strong&gt;: For closed-source models, cost mainly takes the form of API call fees, usually billed by Token count. For open-source models, it takes the form of the hardware (GPU, memory) and operations work required for local deployment. The choice should be made based on the application&amp;rsquo;s expected usage volume and budget.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Speed (Latency)&lt;/strong&gt;: For agents requiring real-time interaction (such as customer service or game NPCs), model response speed is crucial. Some lightweight or optimized models (such as GPT-3.5 Turbo and Claude 3.5 Sonnet) offer lower latency.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Context Window&lt;/strong&gt;: The maximum number of Tokens the model can process at once. For agents that need to understand long documents, analyze code repositories, or maintain long-term conversation memory, a model with a larger context window (such as 128K Tokens or more) is necessary.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deployment Method&lt;/strong&gt;: Using an API is the simplest and most convenient option, but data must be sent to a third party and use is subject to the service provider&amp;rsquo;s terms. Local deployment ensures data privacy and the highest degree of autonomy, but it demands more in terms of technical skill and hardware.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ecosystem and Toolchain&lt;/strong&gt;: A model&amp;rsquo;s popularity also determines the maturity of its surrounding ecosystem. Mainstream models usually have richer community support, tutorials, pre-trained models, fine-tuning tools, and compatible development frameworks (such as LangChain, LlamaIndex, Hugging Face Transformers), which can greatly accelerate development and reduce difficulty. Choosing a model with an active community and complete toolchain makes it easier to find solutions and resources when encountering problems.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fine-tunability and Customization&lt;/strong&gt;: For agents needing to process domain-specific data or perform specific tasks, model fine-tuning capability is crucial. Some models provide convenient fine-tuning interfaces and tools, allowing developers to customize training using their own datasets, significantly improving model performance and accuracy in specific scenarios. Open-source models usually provide greater flexibility in this regard.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Safety and Ethics&lt;/strong&gt;: With widespread application of large language models, their potential safety risks and ethical issues are increasingly prominent. When choosing models, consider their performance in bias, toxicity, hallucination, etc., and service providers&amp;rsquo; or open-source communities&amp;rsquo; investment in model safety and responsible AI. For applications facing the public or involving sensitive information, model safety and ethical compliance are considerations that cannot be ignored.&lt;/li&gt;
&lt;/ul&gt;
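The trade-offs above can be sketched as a simple weighted scoring exercise. The candidate models, per-dimension scores, and weights below are hypothetical placeholders for illustration, not real benchmark data:

```python
# Hypothetical weighted scoring for model selection.
# All candidate scores and weights are illustrative placeholders.

CANDIDATES = {
    # name: per-dimension scores on a 0-10 scale (made up for illustration)
    "api-flagship": {"capability": 9, "cost": 3, "latency": 6, "context": 9, "privacy": 2},
    "api-light":    {"capability": 7, "cost": 8, "latency": 9, "context": 7, "privacy": 2},
    "local-7b":     {"capability": 5, "cost": 9, "latency": 7, "context": 5, "privacy": 10},
}

def pick_model(weights):
    """Return the candidate whose weighted score is highest."""
    def score(scores):
        return sum(weights[dim] * scores[dim] for dim in weights)
    return max(CANDIDATES, key=lambda name: score(CANDIDATES[name]))

# A privacy-sensitive application weights data autonomy heavily,
# so the locally deployable model wins under these weights:
weights = {"capability": 0.2, "cost": 0.1, "latency": 0.1, "context": 0.1, "privacy": 0.5}
print(pick_model(weights))  # -> local-7b
```

Shifting the weights toward capability and context window would instead favor the flagship API model; the point is that selection is a function of the application's priorities, not of a single leaderboard.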
&lt;p&gt;&lt;strong&gt;3.2.4.2 Overview of Closed-Source Models&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Closed-source models usually represent the cutting edge of current AI technology and provide stable, easy-to-use API services, making them the first choice for building high-performance agents.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;OpenAI GPT Series&lt;/strong&gt;: From GPT-3 that opened the large model era, to ChatGPT that introduced RLHF (Reinforcement Learning from Human Feedback) and achieved alignment with human intent, to GPT-4 that opened the multimodal era, OpenAI continues to lead industry development. The latest GPT-5 further elevates multimodal capabilities and general intelligence to new heights, seamlessly processing text, audio, and image inputs and generating corresponding outputs, with significantly improved response speed and naturalness, especially excelling in real-time voice dialogue.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Google Gemini Series&lt;/strong&gt;: Google DeepMind&amp;rsquo;s Gemini series models are representatives of native multimodality: their core feature is unified processing of text, code, audio/video, and images, and their ultra-long context windows give them an advantage in processing massive amounts of information. Gemini Ultra is the most powerful model, suited to highly complex tasks; Gemini Pro offers high performance and efficiency across a wide range of tasks; Gemini Nano is optimized for on-device deployment. The latest Gemini 2.5 series, including Gemini 2.5 Pro and Gemini 2.5 Flash, further improves reasoning capabilities and context windows; Gemini 2.5 Flash in particular offers faster inference and better cost-effectiveness, making it suitable for scenarios requiring quick responses.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Anthropic Claude Series&lt;/strong&gt;: Anthropic is a company focused on AI safety and responsible AI. Its Claude series models have prioritized safety from the design stage and are renowned for reliability in handling long documents, reducing harmful outputs, and following instructions, making them a favorite in enterprise applications. The Claude 3 series includes Claude 3 Opus (the most intelligent, with the strongest performance), Claude 3 Sonnet (a balance of performance and speed), and Claude 3 Haiku (the fastest and most compact, suitable for near real-time interaction). The latest Claude 4 series models, such as Claude 4 Opus, have made significant progress in general intelligence, complex reasoning, and code generation, further improving long-context and multimodal capabilities.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Domestic Mainstream Models&lt;/strong&gt;: Many competitive closed-source models have emerged in China, represented by Baidu ERNIE Bot, Tencent Hunyuan, Huawei Pangu-α, iFlytek SparkDesk, and Moonshot AI. These domestic models hold natural advantages in Chinese-language processing and deeply empower local industries.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;3.2.4.3 Overview of Open-Source Models&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Open-source models provide developers with the highest degree of flexibility, transparency, and autonomy, catalyzing a prosperous community ecosystem. They allow developers to deploy locally, perform customized fine-tuning, and have complete model control.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Meta Llama Series&lt;/strong&gt;: Meta&amp;rsquo;s Llama series is an important milestone for open-source large language models. With its excellent overall performance, open licensing, and strong community support, the series has become the foundation for many derivative projects and research efforts. The Llama 4 series, released in April 2025, comprises Meta&amp;rsquo;s first models to adopt a Mixture of Experts (MoE) architecture, which significantly improves computational efficiency by activating only the parts of the model needed for a given task. The series includes three distinctly positioned models: Llama 4 Scout supports a 10-million-token context window and is designed for long-document analysis and on-device deployment; Llama 4 Maverick focuses on multimodal capabilities, excelling in coding, complex reasoning, and multilingual support; Llama 4 Behemoth outperforms competitors on multiple STEM benchmarks and is currently Meta&amp;rsquo;s most powerful model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mistral AI Series&lt;/strong&gt;: Mistral AI from France is renowned for its &amp;ldquo;small size, high performance&amp;rdquo; model design. Its latest model, Mistral Medium 3.1, was released in August 2025, with significantly improved accuracy and response speed on tasks such as code generation, STEM reasoning, and cross-domain Q&amp;amp;A, outperforming comparable models such as Claude Sonnet 3.7 and Llama 4 Maverick on benchmarks. It has native multimodal capabilities, can process mixed image and text inputs, and includes a built-in &amp;ldquo;tone adaptation layer&amp;rdquo; that helps enterprises more easily produce brand-aligned outputs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Domestic Open-Source Forces&lt;/strong&gt;: Domestic manufacturers and research institutions are also actively embracing open source, such as Alibaba&amp;rsquo;s &lt;strong&gt;Qwen (Tongyi Qianwen)&lt;/strong&gt; series and the &lt;strong&gt;ChatGLM&lt;/strong&gt; series from Tsinghua University in collaboration with Zhipu AI. They provide powerful Chinese-language capabilities and have built active communities around them.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For agent developers, closed-source models provide &amp;ldquo;out-of-the-box&amp;rdquo; convenience, while open-source models grant us &amp;ldquo;customization freedom.&amp;rdquo; Understanding the characteristics and representative models of these two camps is the first step in making wise technical selections for our agent projects.&lt;/p&gt;
&lt;h2 id="33-scaling-laws-and-limitations-of-large-language-models"&gt;3.3 Scaling Laws and Limitations of Large Language Models
&lt;/h2&gt;&lt;p&gt;Large Language Models (LLMs) have made remarkable progress in recent years, with continuously expanding capability boundaries and increasingly rich application scenarios. However, behind these achievements lies a deep understanding of the relationship between model scale, data volume, and computational resources, namely &lt;strong&gt;Scaling Laws&lt;/strong&gt;. Meanwhile, as an emerging technology, LLMs also face many challenges and limitations. This section will deeply explore these core concepts, aiming to help readers comprehensively understand LLMs&amp;rsquo; capability boundaries, thereby leveraging strengths and avoiding weaknesses when building agents.&lt;/p&gt;
&lt;h3 id="331-scaling-laws"&gt;3.3.1 Scaling Laws
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Scaling Laws&lt;/strong&gt; are one of the most important discoveries in the large language model field in recent years. They reveal that there are predictable power-law relationships between model performance and model parameter count, training data volume, and computational resources. This discovery provides theoretical guidance for the continuous development of large language models, clarifying the underlying logic that increasing resource investment can systematically improve model performance.&lt;/p&gt;
&lt;p&gt;Research found that in log-log coordinate systems, model performance (usually measured by Loss) shows smooth power-law relationships with all three factors: parameter count, data volume, and computation&lt;sup&gt;[9]&lt;/sup&gt;. Simply put, as long as we continuously and proportionally increase these three elements, model performance will predictably and smoothly improve without obvious bottlenecks. This discovery provides clear guidance for large model design and training: within resource constraints, maximize model scale and training data volume as much as possible.&lt;/p&gt;
&lt;p&gt;Early research focused more on increasing model parameter count, but DeepMind&amp;rsquo;s &amp;ldquo;Chinchilla Law&amp;rdquo; proposed in 2022 made important corrections&lt;sup&gt;[10]&lt;/sup&gt;. This law points out that under a given computational budget, to achieve optimal performance, &lt;strong&gt;there is an optimal ratio between model parameter count and training data volume&lt;/strong&gt;. Specifically, optimal models should be smaller than previously commonly believed but need to be trained with much more data. For example, a 70 billion parameter Chinchilla model, because it was trained with 4 times more data than GPT-3 (175 billion parameters), actually outperforms the latter. This discovery corrected the one-sided perception of &amp;ldquo;bigger is better,&amp;rdquo; emphasized the importance of data efficiency, and guided the design of many subsequent efficient large models (such as the Llama series).&lt;/p&gt;
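The Chinchilla finding can be turned into a quick back-of-the-envelope calculation. A commonly cited rule of thumb is roughly 20 training tokens per parameter for compute-optimal training, with training compute approximated as 6ND FLOPs; these are approximations, not the paper's exact fitted coefficients:

```python
# Back-of-the-envelope Chinchilla-style sizing (approximate rules of thumb):
#   compute-optimal data volume D ~ 20 tokens per parameter,
#   training compute C ~ 6 * N * D FLOPs.

def optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal number of training tokens for N parameters."""
    return 20.0 * n_params

def training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training-compute estimate: about 6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens

# A 70-billion-parameter model (Chinchilla's size):
n = 70e9
d = optimal_tokens(n)      # ~1.4e12 tokens, matching Chinchilla's ~1.4T training set
c = training_flops(n, d)   # ~5.9e23 FLOPs
print(f"optimal tokens: {d:.2e}, training compute: {c:.2e} FLOPs")
```

Under these approximations, GPT-3's 175B parameters would have called for roughly 3.5T training tokens, far more than it actually saw, which is exactly the imbalance the Chinchilla work highlighted.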
&lt;p&gt;The most surprising product of scaling laws is &amp;ldquo;capability emergence.&amp;rdquo; Capability emergence refers to the phenomenon in which, once model scale crosses a certain threshold, the model suddenly exhibits entirely new capabilities that are absent or weak in smaller models. For example, &lt;strong&gt;Chain-of-Thought&lt;/strong&gt; reasoning, &lt;strong&gt;Instruction Following&lt;/strong&gt;, multi-step reasoning, code generation, and other capabilities emerged significantly only after parameter counts reached the tens or even hundreds of billions. This phenomenon suggests that large language models are not simply memorizing and reciting; they may have formed some deeper level of abstraction and reasoning capability during learning. For agent developers, capability emergence means that choosing a sufficiently large model is a prerequisite for complex autonomous decision-making and planning.&lt;/p&gt;
&lt;h3 id="332-model-hallucination"&gt;3.3.2 Model Hallucination
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Model Hallucination&lt;/strong&gt; usually refers to content generated by large language models that contradicts objective facts, user input, or contextual information, or that invents non-existent facts, entities, or events. The essence of hallucination is that the model over-confidently &amp;ldquo;fabricates&amp;rdquo; information during generation rather than accurately retrieving or reasoning. Based on how they manifest, hallucinations can be divided into several types&lt;sup&gt;[11]&lt;/sup&gt;, such as:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Factual Hallucinations&lt;/strong&gt;: Models generate information inconsistent with real-world facts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Faithfulness Hallucinations&lt;/strong&gt;: In tasks like text summarization and translation, generated content fails to faithfully reflect source text meaning.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intrinsic Hallucinations&lt;/strong&gt;: Model-generated content directly contradicts input information.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Hallucinations arise from multiple interacting factors. First, training data may contain erroneous or contradictory information. Second, the model&amp;rsquo;s autoregressive generation mechanism means it only predicts the next most likely token, with no built-in fact-checking module. Finally, when facing tasks that require complex reasoning, the model may err in its logical chain and thus &amp;ldquo;fabricate&amp;rdquo; wrong conclusions. For example, a travel-planning Agent might recommend a non-existent scenic spot or book a ticket with an incorrect flight number.&lt;/p&gt;
&lt;p&gt;Additionally, large language models face challenges such as outdated knowledge and biases in the training data. A large language model&amp;rsquo;s capabilities come from its training data, which means its knowledge is frozen at the point the data was collected. For events that occur after that date, newly coined concepts, or the latest facts, the model cannot perceive them or answer correctly. Meanwhile, training data often contains the various biases and stereotypes of human society, and a model trained on such data inevitably absorbs and reflects them&lt;sup&gt;[12]&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;To improve large language model reliability, researchers and developers are actively exploring multiple methods to detect and mitigate hallucinations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Data Level&lt;/strong&gt;: Reduce hallucinations from the source through high-quality data cleaning, introducing factual knowledge, and Reinforcement Learning from Human Feedback (RLHF)&lt;sup&gt;[13]&lt;/sup&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model Level&lt;/strong&gt;: Explore new model architectures or enable models to express uncertainty about generated content.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Inference and Generation Level&lt;/strong&gt;:
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt;&lt;sup&gt;[14]&lt;/sup&gt;: This is currently one of the effective methods to mitigate hallucinations. RAG systems retrieve relevant information from external knowledge bases (such as document databases, web pages) before generation, then use retrieved information as context to guide models to generate fact-based answers.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Multi-step Reasoning and Verification&lt;/strong&gt;: Guide models to perform multi-step reasoning and conduct self-checking or external verification at each step.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Introducing External Tools&lt;/strong&gt;: Allow models to call external tools (such as search engines, calculators, code interpreters) to obtain real-time information or perform precise calculations.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ol&gt;
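The first generation-level mitigation, RAG, can be sketched in a few lines. The example below assumes a toy in-memory corpus and a naive keyword-overlap retriever; a real system would use embedding search, and `call_llm` is a hypothetical placeholder for an actual model call:

```python
# Minimal Retrieval-Augmented Generation sketch.
# Assumptions: a toy in-memory corpus and naive keyword-overlap retrieval;
# `call_llm` is a hypothetical placeholder, not a real API.

CORPUS = [
    "Qwen is an open-source large language model series from Alibaba Cloud.",
    "The Transformer architecture relies on self-attention.",
    "Chinchilla showed that data volume matters as much as parameter count.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    ranked = sorted(
        CORPUS,
        key=lambda doc: len(q.intersection(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Ground the model's answer in retrieved context to curb hallucination."""
    context = "\n".join(retrieve(query))
    return (
        "Answer using ONLY the context below; say 'I don't know' otherwise.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

prompt = build_prompt("Who develops the Qwen model series?")
print(prompt)
# response = call_llm(prompt)  # hypothetical model call
```

The key design point is the instruction to answer only from the supplied context: it shifts the model from free recall, where fabrication is likely, toward grounded extraction from retrieved evidence.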
&lt;p&gt;Although hallucination problems are difficult to completely eliminate in the short term, through the above strategies, their occurrence frequency and impact can be significantly reduced, improving large language model reliability and practicality in actual applications.&lt;/p&gt;
&lt;h2 id="34-chapter-summary"&gt;3.4 Chapter Summary
&lt;/h2&gt;&lt;p&gt;This chapter introduced foundational knowledge needed for building agents, focusing on large language models (LLMs) as their core component. Content started from early language model development, detailed the Transformer architecture, and introduced methods for interacting with LLMs. Finally, this chapter organized current mainstream model ecosystems, development patterns, and their inherent limitations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Core Knowledge Review:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Model Evolution and Core Architecture&lt;/strong&gt;: This chapter traced the evolution from statistical language models (N-gram) through neural network models (RNN, LSTM) to the Transformer architecture that laid the foundation for modern LLMs. Through a &amp;ldquo;top-down&amp;rdquo; code implementation, it dissected the Transformer&amp;rsquo;s core components and explained the self-attention mechanism&amp;rsquo;s key role in parallel computation and in capturing long-distance dependencies.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Interaction Methods with Models&lt;/strong&gt;: This chapter introduced two core aspects of interacting with LLMs: prompt engineering and tokenization. The former guides model behavior; the latter is the foundation for understanding how models process input. By deploying and running an open-source model locally, theoretical knowledge was put into practice.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model Ecosystem and Selection&lt;/strong&gt;: This chapter systematically organized the key factors to weigh when choosing models for agents and surveyed the characteristics and positioning of closed-source models (represented by OpenAI&amp;rsquo;s GPT and Google&amp;rsquo;s Gemini) and open-source models (represented by Llama and Mistral).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Laws and Limitations&lt;/strong&gt;: This chapter explored the scaling laws driving LLM capability improvements and explained their underlying principles. It also analyzed the models&amp;rsquo; inherent limitations, such as factual hallucinations and outdated knowledge; understanding these is crucial for building reliable, robust agents.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;From LLM Foundations to Building Agents:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The LLM foundations in this chapter are meant primarily to build a clear picture of how large models emerged and evolved, but they also carry lessons for agent design: how to design effective prompts to guide an agent&amp;rsquo;s planning and decision-making, how to choose an appropriate model for the task at hand, and how to add verification mechanisms to agent workflows to guard against hallucinations. Solutions to all of these problems build on this chapter&amp;rsquo;s foundation. We are now ready to move from theory to practice: in the next chapter, we will begin constructing classic agent paradigms, applying what we have learned here to actual agent design.&lt;/p&gt;
&lt;h2 id="exercises"&gt;Exercises
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;In natural language processing, language models have evolved from statistical to neural network models.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Please use the mini corpus provided in this chapter (&lt;code&gt;datawhale agent learns&lt;/code&gt;, &lt;code&gt;datawhale agent works&lt;/code&gt;) to calculate the probability of the sentence &lt;code&gt;agent works&lt;/code&gt; under the Bigram model.&lt;/li&gt;
&lt;li&gt;The core assumption of N-gram models is the Markov assumption. Please explain what this assumption means and what fundamental limitations N-gram models have.&lt;/li&gt;
&lt;li&gt;How do neural network language models (RNN/LSTM) and Transformer overcome N-gram model limitations respectively? What are their respective advantages?&lt;/li&gt;
&lt;/ul&gt;
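&lt;p&gt;For the first sub-question, the Bigram computation can be sketched as follows. Sentence-boundary tokens are omitted here for simplicity, so only conditional probabilities of the form P(next | current) are estimated from counts.&lt;/p&gt;

```python
from collections import Counter

# Bigram sketch on the chapter's mini corpus: estimate P(next | current)
# as count(current next) / count(current).
corpus = ["datawhale agent learns", "datawhale agent works"]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = sentence.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p(next_word, given):
    """Maximum-likelihood estimate of P(next_word | given)."""
    return bigrams[(given, next_word)] / unigrams[given]

# count(agent works) = 1 and count(agent) = 2, so:
print(p("works", "agent"))  # 0.5
```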
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The Transformer architecture&lt;sup&gt;[4]&lt;/sup&gt; is the foundation of modern large language models. Among them:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hint&lt;/strong&gt;: You may refer to the code implementation in Section 3.1.2 of this chapter to aid understanding&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;What is the core idea of the Self-Attention mechanism?&lt;/li&gt;
&lt;li&gt;Why can Transformer process sequences in parallel while RNN must process serially? What role does Positional Encoding play?&lt;/li&gt;
&lt;li&gt;What is the difference between Decoder-Only architecture and complete Encoder-Decoder architecture? Why do current mainstream large language models all adopt Decoder-Only architecture?&lt;/li&gt;
&lt;/ul&gt;
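&lt;p&gt;A minimal single-head self-attention computation may help with the first sub-question. The shapes and random projection matrices below are illustrative only: each output position is a weighted sum over all value vectors, with weights given by query-key similarity, which is why all positions can be computed in parallel.&lt;/p&gt;

```python
import numpy as np

# Single-head scaled dot-product self-attention on a toy sequence.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8

x = rng.normal(size=(seq_len, d_model))   # token embeddings
w_q = rng.normal(size=(d_model, d_model))
w_k = rng.normal(size=(d_model, d_model))
w_v = rng.normal(size=(d_model, d_model))

q, k, v = x @ w_q, x @ w_k, x @ w_v       # queries, keys, values
scores = q @ k.T / np.sqrt(d_model)       # scaled pairwise similarity

# Row-wise softmax turns scores into attention weights that sum to 1.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = weights / weights.sum(axis=-1, keepdims=True)

out = weights @ v                         # each row mixes all value vectors
print(out.shape)  # (4, 8)
```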
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Subword tokenization algorithms are a key technology for large language models, responsible for converting text into token sequences the model can process. Why can&amp;rsquo;t we directly use &amp;ldquo;characters&amp;rdquo; or &amp;ldquo;words&amp;rdquo; as model input units? What problem does the BPE (Byte Pair Encoding) algorithm solve?&lt;/p&gt;
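&lt;p&gt;As a reference point for this question, a toy version of the BPE merge loop can be sketched as follows. The word list and the number of merges are illustrative; real tokenizers also record the merge rules and handle symbol boundaries more carefully than a plain string replace.&lt;/p&gt;

```python
from collections import Counter

# Toy BPE: repeatedly merge the most frequent adjacent symbol pair.
# Words are kept as space-separated symbol sequences.
words = ["l o w e r", "l o w e s t", "n e w e r", "w i d e r"]

def best_pair(words):
    """Count adjacent symbol pairs and return the most frequent one."""
    pairs = Counter()
    for w in words:
        syms = w.split()
        pairs.update(zip(syms, syms[1:]))
    return max(pairs, key=pairs.get)

def apply_merge(words, pair):
    """Replace every occurrence of the pair with one merged symbol."""
    old = " ".join(pair)
    new = "".join(pair)
    return [w.replace(old, new) for w in words]

for _ in range(2):
    words = apply_merge(words, best_pair(words))

print(words)
```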
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Section 3.2.3 of this chapter introduced how to deploy open-source large language models locally. Please complete the following practice and analysis:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hint&lt;/strong&gt;: This is a hands-on exercise; working through it on your own machine is recommended&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Following this chapter&amp;rsquo;s guidance, deploy a lightweight open-source model locally (&lt;a class="link" href="https://modelscope.cn/models/Qwen/Qwen3-0.6B" target="_blank" rel="noopener"
&gt;Qwen3-0.6B&lt;/a&gt; is recommended), then try adjusting the sampling parameters and observe their impact on the output&lt;/li&gt;
&lt;li&gt;Choose a specific task (such as text classification, information extraction, code generation, etc.), design and compare different prompt strategies (such as Zero-shot, Few-shot, Chain-of-Thought) and their effect differences on output results&lt;/li&gt;
&lt;li&gt;Compare closed-source models and open-source models from dimensions of performance, cost, controllability, privacy, etc.&lt;/li&gt;
&lt;li&gt;If you want to build an enterprise-level customer service agent, which type of model would you choose? What factors need to be considered?&lt;/li&gt;
&lt;/ul&gt;
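&lt;p&gt;For the first sub-question, the effect of the two most common sampling parameters can be illustrated offline, without a deployed model. The logit values below are made up: temperature rescales the distribution (lower is peakier, higher is flatter), while top-k keeps only the k most likely tokens.&lt;/p&gt;

```python
import numpy as np

# Illustrative next-token logits for a 5-token vocabulary.
logits = np.array([2.0, 1.0, 0.5, 0.1, -1.0])

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sample_probs(logits, temperature=1.0, top_k=None):
    """Sampling probabilities after temperature and top-k filtering."""
    z = logits / temperature
    if top_k is not None:
        keep = np.argsort(z)[-top_k:]        # indices of the k largest logits
        masked = np.full_like(z, -np.inf)    # everything else gets probability 0
        masked[keep] = z[keep]
        z = masked
    return softmax(z)

sharp = sample_probs(logits, temperature=0.5)  # peakier: favors the top token
flat = sample_probs(logits, temperature=2.0)   # flatter: more diverse output
topk = sample_probs(logits, top_k=2)           # only the 2 best tokens survive
```

&lt;p&gt;Running the same comparison against a locally deployed model simply replaces the made-up logits with the model&amp;rsquo;s real ones.&lt;/p&gt;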
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Model hallucination&lt;sup&gt;[11]&lt;/sup&gt; is one of the key limitations of current large language models. This chapter introduced methods for mitigating hallucinations (such as retrieval-augmented generation, multi-step reasoning, and external tool invocation).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Please choose one and explain its working principle and applicable scenarios&lt;/li&gt;
&lt;li&gt;Research cutting-edge studies and papers—are there other methods to mitigate model hallucinations, and what improvements and advantages do they have?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Suppose you want to design a paper-assisted reading agent that can help researchers quickly read and understand academic papers, including: summarizing core content of paper research, answering questions about papers, extracting key information, comparing viewpoints of different papers, etc. Please answer:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Which model would you choose as the base model when designing the agent? What factors need to be considered when choosing?&lt;/li&gt;
&lt;li&gt;How would you design prompts to guide the model to better understand academic papers? Academic papers are usually very long and may exceed the model&amp;rsquo;s context window; how would you solve this problem?&lt;/li&gt;
&lt;li&gt;Academic research is rigorous, meaning we need to ensure information generated by the agent is accurate, objective, and faithful to the original text. What designs do you think should be added to the system to better achieve this requirement?&lt;/li&gt;
&lt;/ul&gt;
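&lt;p&gt;For the context-window question above, one common workaround is to split the paper into overlapping chunks, process each chunk separately, and then merge the partial results (a map-reduce style summarization). The character-based chunking and sizes below are illustrative; real systems usually chunk by tokens or by section structure.&lt;/p&gt;

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into fixed-size chunks that overlap slightly, so a
    sentence straddling a boundary appears whole in at least one chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Stand-in for a long paper that exceeds the context window.
paper = "word " * 1000
chunks = chunk_text(paper, chunk_size=500, overlap=100)
# Each chunk would be summarized by the model separately, and the
# partial summaries merged in a final pass.
print(len(chunks))  # 13 chunks of at most 500 characters
```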
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="references"&gt;References
&lt;/h2&gt;&lt;p&gt;[1] Bengio, Y., Ducharme, R., Vincent, P., &amp;amp; Jauvin, C. (2003). A neural probabilistic language model. &lt;em&gt;Journal of Machine Learning Research&lt;/em&gt;, 3, 1137-1155.&lt;/p&gt;
&lt;p&gt;[2] Elman, J. L. (1990). Finding structure in time. &lt;em&gt;Cognitive Science&lt;/em&gt;, 14(2), 179-211.&lt;/p&gt;
&lt;p&gt;[3] Hochreiter, S., &amp;amp; Schmidhuber, J. (1997). Long short-term memory. &lt;em&gt;Neural Computation&lt;/em&gt;, 9(8), 1735-1780.&lt;/p&gt;
&lt;p&gt;[4] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., &amp;hellip; &amp;amp; Polosukhin, I. (2017). Attention is all you need. In &lt;em&gt;Advances in neural information processing systems&lt;/em&gt; (pp. 5998-6008).&lt;/p&gt;
&lt;p&gt;[5] Radford, A., Narasimhan, K., Salimans, T., &amp;amp; Sutskever, I. (2018). Improving language understanding by generative pre-training. OpenAI.&lt;/p&gt;
&lt;p&gt;[6] Gage, P. (1994). A new algorithm for data compression. &lt;em&gt;C Users Journal&lt;/em&gt;, &lt;em&gt;12&lt;/em&gt;(2), 23-38.&lt;/p&gt;
&lt;p&gt;[7] Schuster, M., &amp;amp; Nakajima, K. (2012, March). Japanese and korean voice search. In &lt;em&gt;2012 IEEE international conference on acoustics, speech and signal processing (ICASSP)&lt;/em&gt; (pp. 5149-5152). IEEE.&lt;/p&gt;
&lt;p&gt;[8] Kudo, T., &amp;amp; Richardson, J. (2018). SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. &lt;em&gt;arXiv preprint arXiv:1808.06226&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;[9] Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., &amp;hellip; &amp;amp; Amodei, D. (2020). Scaling laws for neural language models. &lt;em&gt;arXiv preprint arXiv:2001.08361&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;[10] Hoffmann, J., Borgeaud, S., Mensch, A., Buchatskaya, E., Cai, T., Rutherford, E., &amp;hellip; &amp;amp; Sifre, L. (2022). Training compute-optimal large language models. &lt;em&gt;arXiv preprint arXiv:2203.15556&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;[11] Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., &amp;hellip; &amp;amp; Fung, P. (2023). Survey of hallucination in natural language generation. &lt;em&gt;ACM Computing Surveys&lt;/em&gt;, 55(12), 1-38.&lt;/p&gt;
&lt;p&gt;[12] Bender, E. M., Gebru, T., McMillan-Major, A., &amp;amp; Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? In &lt;em&gt;Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency&lt;/em&gt; (pp. 610-623).&lt;/p&gt;
&lt;p&gt;[13] Christiano, P., Leike, J., Brown, T. B., Martic, M., Legg, S., &amp;amp; Amodei, D. (2017). Deep reinforcement learning from human preferences. &lt;em&gt;arXiv preprint arXiv:1706.03741&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;[14] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goswami, N., &amp;hellip; &amp;amp; Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. In &lt;em&gt;Advances in neural information processing systems&lt;/em&gt; (pp. 9459-9474).&lt;/p&gt;</description></item></channel></rss>