Show HN: Agentflow – Run Complex LLM Workflows from Simple JSON (github.com/simonmesmith)
51 points by simonmesmith on Aug 8, 2023 | 23 comments
So, it feels like this should exist. But I couldn't find it. So I tried to build it.

Agentflow lets you run complex LLM workflows from a simple JSON file. This can be as little as a list of tasks. Tasks can include variables, so you can reuse workflows for different outputs by providing different variable values. They can also include custom functions, so you can go beyond text generation to do anything you want to write a function for.

Someone might say: "Why not just use ChatGPT?" Among other reasons, I'd say that you can't template a workflow with ChatGPT, trigger it with different variable values, easily add in custom functions, or force the use of custom functions for steps in the workflow.

Someone might also say: "Then why not use Auto-GPT or BabyAGI?" Among other reasons, I'd say you can't if you want consistency, because these tools operate autonomously, creating and executing their own tasks. Agentflow, on the other hand, lets you define a step-by-step workflow to give you more control.

I'd like to do more with this, including adding more custom functions, and more examples, and more ways to trigger workflows (such as in response to events). But first, I want to make sure I'm not wasting my time! For starters, if something like this already exists, please tell me.



Y'all should team up with the Magic Loop folks: https://news.ycombinator.com/item?id=36958731

And if you want to talk to a bunch of APIs: https://news.ycombinator.com/item?id=37020783


Yeah, I saw the Magic Loop on HN and was like, “oh no, time to dump my project!”

But I think there are some key differences. I’m not sure if Magic Loop is open source, for example, so I don’t know how they’re currently building their workflows.

Automating API calls is neat, too, but I get a bit anxious with too much automation when you want to run a workflow with consistent results. My gut says you want to have the functions here pretty locked down and tested, rather than rely on automating API calls. But maybe I’ll be proven wrong.


I created api2ai. I agree that we want consistent results, but that can be solved by looking up cached AI results. If the input is user-driven, then you’ll need AI to decipher the natural language.
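One way to read "looking up cached AI results" is a cache keyed on the exact prompt, so that a repeated prompt returns the stored answer instead of a fresh one. This is only a minimal sketch of that idea, not how api2ai does it; `fake_model` and `CachedModel` are hypothetical names standing in for a real LLM call.

```python
import hashlib


def fake_model(prompt: str) -> str:
    """Placeholder for a real LLM call (hypothetical, deterministic here)."""
    return f"response to: {prompt}"


class CachedModel:
    """Wrap a model callable so identical prompts reuse the first result."""

    def __init__(self, model):
        self.model = model
        self.cache = {}
        self.calls = 0  # how many times the underlying model was actually called

    def __call__(self, prompt: str) -> str:
        # Hash the prompt so the cache key stays small for long prompts.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.model(prompt)
        return self.cache[key]


cached = CachedModel(fake_model)
first = cached("List three product ideas for college students.")
second = cached("List three product ideas for college students.")
print(first == second, cached.calls)
```

A cache like this only helps when the prompt text repeats exactly; free-form user input would rarely hit it.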


Nice work. I'm trying to build something similar, yet different: FlowFlow.AI: Automate your core business workflows by defining step-by-step tasks and follow up actions for the AI agent to execute.

- The AI agent decides the appropriate next step in the workflow based on one of the predefined conditions.
- Keeps the AI in control within the boundaries of your complex workflows.

FlowFlow.AI is almost ready for the beta launch. :-)


Seems cool. Seems a little like an open source version of Step Functions that only needs Python to run.

I did a similar, rudimentary version where I save the JSON in DynamoDB and the tasks indicate how to transform tabular (Excel, CSV) data. The tasks range from renaming and adding columns to transposing columns into new rows.
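To make that concrete, here is a rough sketch of JSON tasks driving tabular transformations. The task schema (`op`, `from`, `to`, and so on) is invented for illustration and is not the commenter's actual format; rows are plain dictionaries rather than a DynamoDB or spreadsheet source.

```python
import json

# Hypothetical task list; the field names are assumptions for this sketch.
TASKS_JSON = """
[
  {"op": "rename", "from": "qty", "to": "quantity"},
  {"op": "add", "column": "currency", "value": "USD"},
  {"op": "unpivot", "columns": ["jan", "feb"], "name": "month", "value": "sales"}
]
"""


def apply_task(rows, task):
    """Apply one transformation task to a list of row dictionaries."""
    if task["op"] == "rename":
        return [
            {(task["to"] if key == task["from"] else key): value
             for key, value in row.items()}
            for row in rows
        ]
    if task["op"] == "add":
        return [{**row, task["column"]: task["value"]} for row in rows]
    if task["op"] == "unpivot":
        # Turn each listed column into its own row (columns become rows).
        result = []
        for row in rows:
            kept = {k: v for k, v in row.items() if k not in task["columns"]}
            for column in task["columns"]:
                result.append({**kept, task["name"]: column, task["value"]: row[column]})
        return result
    raise ValueError(f"Unknown op: {task['op']}")


rows = [{"sku": "A1", "qty": 2, "jan": 10, "feb": 12}]
for task in json.loads(TASKS_JSON):
    rows = apply_task(rows, task)

for row in rows:
    print(row)
```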


Perhaps we could give people the option of saving workflows in different places. I went with JSON as I want to make it as easy as possible for people to create new workflows. Plus, they also get versioned easily this way.


This is really cool! I’m doing something similar with Lemon Agent (https://github.com/felixbrock/lemon-agent). What I find most powerful about defining workflows in a json file is that you can add additional fields to let the LLM know about specific execution requirements, like asking the user for permission before executing a specific workflow step. This allows for infinite configuration options. Curious to hear if you already experimented with something like this or if you are planning to include something similar?


Sorry for the delay. I hadn’t thought about that specifically, but I think it’s a good idea. Basically, we can add all kinds of settings and options with the JSON. But I do want to avoid unnecessary bloat, and keep this as simple as possible.


I'm trying to understand what this does and how it works. Can you provide more examples with different use cases? How does this work with / compare to LangChain?


For sure.

Let's use an example, like this: https://github.com/simonmesmith/agentflow/blob/main/agentflo...

This is a workflow for coming up with a product idea and illustrating it with an image.

This workflow has 9 steps, starting with brainstorming ideas, and ending with saving an HTML file containing the product name, description, and image.

It also has two variables, {market} for the target market, and {price_point} for the price point.

To run this workflow, you simply enter this in the command line: python -m run --flow=example_with_variables --variables 'market=college students' 'price_point=$50'
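As a rough illustration of what happens to those variables, the sketch below parses `key=value` pairs and substitutes them into task text. The JSON layout (`tasks`, `action`) and the helper names `parse_variables` and `fill_tasks` are assumptions for this example and may not match Agentflow's actual schema or internals.

```python
import json

# Hypothetical workflow file contents; the real Agentflow schema may differ.
FLOW_JSON = """
{
  "tasks": [
    {"action": "Brainstorm three product ideas for {market} priced around {price_point}."},
    {"action": "Pick the best idea and name it."}
  ]
}
"""


def parse_variables(pairs):
    """Turn ['market=college students', 'price_point=$50'] into a dict."""
    variables = {}
    for pair in pairs:
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = pair.partition("=")
        variables[key.strip()] = value
    return variables


def fill_tasks(flow, variables):
    """Replace each {name} placeholder in every task's action text."""
    filled = []
    for task in flow["tasks"]:
        action = task["action"]
        for name, value in variables.items():
            action = action.replace("{" + name + "}", value)
        filled.append(action)
    return filled


variables = parse_variables(["market=college students", "price_point=$50"])
actions = fill_tasks(json.loads(FLOW_JSON), variables)
for action in actions:
    print(action)
```

The same workflow file can then be reused for a different market or price point by changing only the variable values.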

You don't need to write any code with LangChain.

You simply specify your workflow in a JSON file, and execute it.

Does that help to clarify?


Thanks - this does help. Curious about the function calls, especially around image generation.

Also, can you clarify what you mean by "You don't need to write any code with LangChain."?


Sure!

In Agentflow, you write functions by inheriting from the BaseFunction class. You need to provide two things: the definition in JSON that GPT-3.5/4 uses to understand how to call the function, and the function logic itself. This just means creating a get_definition() function that returns a JSON Schema object, and an execute() function that performs your logic and returns a string.

Once you have those, you can use the function in your workflow by adding "function_call": "your_function". The application does the rest.

Here's the create_image function, for example, which uses the DALL-E API: https://github.com/simonmesmith/agentflow/blob/main/agentflo...
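A minimal sketch of that pattern follows. The `BaseFunction` class here is a stand-in written for this example, not Agentflow's real base class, and `WordCount` is a hypothetical function; the definition uses the shape OpenAI's function calling expects (a name, a description, and JSON Schema parameters), which is assumed rather than taken from the repository.

```python
import json
from abc import ABC, abstractmethod


class BaseFunction(ABC):
    """Stand-in for Agentflow's base class; the real one may differ."""

    @abstractmethod
    def get_definition(self) -> dict:
        """Return the definition the model uses to decide how to call this."""

    @abstractmethod
    def execute(self, **kwargs) -> str:
        """Run the function logic and return a string result."""


class WordCount(BaseFunction):
    """Hypothetical function that counts the words in a piece of text."""

    def get_definition(self) -> dict:
        return {
            "name": "word_count",
            "description": "Count the words in a piece of text.",
            "parameters": {
                "type": "object",
                "properties": {
                    "text": {"type": "string", "description": "Text to count."}
                },
                "required": ["text"],
            },
        }

    def execute(self, text: str) -> str:
        return str(len(text.split()))


# With OpenAI function calling, the model returns arguments as a JSON string,
# so the caller decodes them before passing them to execute().
function = WordCount()
arguments = json.loads('{"text": "reliable versatile building blocks"}')
result = function.execute(**arguments)
print(result)
```

Returning a string keeps the result easy to feed back into the next step of a workflow as plain text.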

What I mean by "you don't need to write any code with LangChain" is that you don't need to write any Python at all to use Agentflow, unless you want to create a new function. Creating workflows just involves creating JSON files. It's not like LangChain, for which you'd have to chain together multiple prompts in Python.

Does that help clarify?

PS: You'll notice heavy documentation in the link above. I want to experiment with automatically generating documentation using Sphinx, so I documented everything with Sphinx formatting. It might be overkill.


Thanks! I'll check this out!


Looks awesome. Any plans to allow for it to use local LLMs (like llama) instead of openai APIs?


Great question, and something I’ve thought about.

The main issue right now is that I’m relying on the OpenAI API’s function-calling capabilities to enable the use of functions in workflows.

If we switch to other LLMs, we’ll need to create some additional wrapping around them to allow such function calling. (As far as I know. If any open source LLMs already have function calling built in, let me know.)
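Such a wrapper could, for instance, ask the model to reply with JSON naming a function and its arguments, then parse and dispatch that reply. The sketch below assumes this prompt-and-parse approach; `fake_model` is a canned placeholder for a local LLM, and `save_file` here only pretends to save. None of this is Agentflow code.

```python
import json


def fake_model(prompt: str) -> str:
    """Placeholder for a local LLM call; returns a canned reply."""
    return '{"function": "save_file", "arguments": {"name": "idea.txt", "contents": "hello"}}'


def save_file(name: str, contents: str) -> str:
    # Pretend to save; a real function would write to disk.
    return f"saved {len(contents)} characters to {name}"


# Registry of functions the model is allowed to call.
FUNCTIONS = {"save_file": save_file}


def call_with_functions(task: str, model=fake_model) -> str:
    """Ask the model for a JSON function call, then run the named function."""
    prompt = (
        "Reply only with JSON of the form "
        '{"function": <name>, "arguments": {...}}.\n'
        f"Available functions: {sorted(FUNCTIONS)}\nTask: {task}"
    )
    reply = model(prompt)
    try:
        call = json.loads(reply)
        function = FUNCTIONS[call["function"]]
        return function(**call["arguments"])
    except (json.JSONDecodeError, KeyError, TypeError) as error:
        # Models can drift from the requested format, so fail loudly.
        raise ValueError(f"Model reply was not a valid function call: {reply!r}") from error


outcome = call_with_functions("Save the text 'hello' as idea.txt")
print(outcome)
```

The explicit error path matters here: without a built-in function-calling mode, nothing guarantees the model's reply will parse.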

LangChain (which I’m not using due to personally finding it overcomplicated and too obfuscating) does have function calling, but it uses approaches like ReAct (if I remember correctly) that aren’t as reliable as OpenAI’s approach.


There is probably use for go-skynet/LocalAI[0] or lm-sys/FastChat[1] which can emulate an OpenAI API using local models.

0: https://github.com/go-skynet/LocalAI
1: https://github.com/lm-sys/FastChat/

Edit: idk if any of these support function calling, though.


I don’t know of any other models that support function calling like OpenAI’s, unfortunately.


If ever there was a place where a standard would be nice, it's with LLM APIs.


Agreed. Right now it seems like everyone’s trying to differentiate, e.g. with function calling (OpenAI) and longer context windows (Anthropic). So a universal API wouldn’t work.


Thank you for creating this. I've been looking for something that provides a more lightweight alternative to LangChain.


Thank you for saying so! It’s very gratifying that I’m not the only one who finds this useful :)


kinda reminds me of dagster or langchain. do you anticipate building a huge library of functions like 'save_file' that would add up to a library or is that intended to be left to the reader? if the latter, the fact that this is based on json feels kinda moot.


I definitely plan on adding more functions, and hopefully having others do the same. And I think these should be versatile things like get_url(), which I added today, versus the very specialized plugins that seem to dominate something like ChatGPT’s plugin space. I think we want reliable, versatile building blocks as functions.



