Using Open WebUI To Teach Myself Cooking

My last post went into the how of setting up Open WebUI on a server in my basement.

I got some feedback, basically asking why I would want to do that. This post is going to go into the why of setting up Open WebUI.

Just to recap the last article, there are some things that Open WebUI does particularly well that are relevant for household use:

  • It has mobile access and a voice interface, meaning you can use it on the phone while you're busy doing other things.
  • It centralizes access, chat history, and context across several LLMs.
  • It allows for more privacy and control over data and LLM usage.

The TL;DR is that I'm learning how to cook, and I'm leveraging Open WebUI's features to make it easier for me.

Open WebUI Lets Me Pick My Tools

I will admit that learning how to cook is not the only reason. Before I learned about local LLMs, I was opposed to AI in general, especially the focus on AGI, automation, and the generalized AI slop that LLMs produced. But if I have local LLMs, then I have tools that I can control and customize. And I can feed them my personal context and history without worrying about a third party being involved.

I do like AI text autocompletion, but I don't love the idea that everything I type is going to a corporation. I do use Gmail, but I don't want to start using DeepSeek for my financial records – and not just because of their terrible security practices. Corporations in general have a tendency to think "hey, we've got all this customer data, let's use it for advertising/marketing/data mining." In a world where the product is free, you are the product.

So when I see a proposed solution, one of the questions I ask myself is "How much will this corporation know about me if I use their product on a regular basis?"

So I use Continue for autocomplete with qwen2.5-coder:7b-instruct running locally. I do use Claude for some things, but my default chat client is a local Llama 3.2 model, and I have a variety of private cloud-based LLMs.
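For reference, the Continue side of that setup is a small piece of configuration. This is a minimal sketch assuming Ollama is serving the model locally; the exact file location and fields depend on your Continue version:

```json
{
  "tabAutocompleteModel": {
    "title": "qwen2.5-coder",
    "provider": "ollama",
    "model": "qwen2.5-coder:7b-instruct"
  }
}
```

With that in place, autocomplete requests go to the local Ollama instance instead of a hosted API.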

I like to be able to have a variety of tools I can use, and pick the most appropriate one for the job. Open WebUI lets me pick my tools, and gives me one single place to manage my interactions with LLMs.

Open WebUI As A Personal Assistant

The selling point of free cloud-based LLMs – "automating things" and acting as your agent – is at best useless in a household environment. There's always a human in the household, so there's always an agent. Yes, LLMs can be used in home automation for voice recognition and turning the lights on and off, but Rhasspy and Home Assistant already do a great job of that.

My working model is that I never want the LLM to be the agent doing things. I want to be the LLM's agent. I give the computer a problem, the computer tells me what to do to solve the problem, and I do it. There is a direct accountability chain between the LLM and the person executing the actions – if I do something stupid, it's on me for not pushing back on the LLM enough.

Most of the "work" that is done at home is not technical in nature, and cannot be automated: it's budget management, mail and packages, groceries, cooking, meal planning, and getting enough exercise. These problems don't require technology as a solution. People already have processes in place to do these things. The only reason to use technology is to smooth out or simplify a step in the existing process.

Where an LLM can help me personally is with process: when I'm stuck on a task, I can ask an LLM to help me figure out what to do next. This requires personal context and history. The more the LLM knows about me, the better it can help.

I can tweak Open WebUI to have memory and understand my context, and do it all safely and locally. There's no data that's going offsite, and I can take all the time and processing I need.

One problem that I have with cooking is recipes. They make lots of assumptions about how cooking happens, and it can be difficult to unpack what actions I need to take even when mentally running through them. I also have a hard time comprehending recipes – the style and informality really confuse me. I have Paprika Recipe Manager to clip and organize recipes, but there are times I really need to be walked through the steps.

I don't trust an LLM to invent a recipe for me, but I can ask it "How long do I cook broccoli for?" and "What's the next step in the recipe?" and get a decent answer. Practically speaking, I can talk through a problem with Open WebUI's voice chat feature, and have a much better experience than if I was using Siri.

Open WebUI Has Customization

The great thing is that I can add my own features if I'm not happy with the existing functionality.

Using Open WebUI, I created a Model with the following system prompt:

I am a beginner at cooking and need extra help at prepping and time management, so I can understand which parts of the instructions can be done in parallel and which should be done serially.

Please mark your instructions with labeled steps, e.g. A. Start, B. Saute "Hot" Chicken, C. End.

Encourage prep work and mise en place. Encourage principles for efficient cooking:

  • Start long processes first (preheating, boiling water)
  • Group similar tasks (all chopping together)
  • Clean as you go during passive cooking time
  • Have all ingredients ready before starting active cooking

Take into account how long it takes for water to boil (more water will take longer) and for the oven to preheat (around 20 minutes), but do not include appliances when they are not necessary in the recipe.

Any step involving passive cooking (baking or simmering) is a good place to look for parallel activities.
Any step involving active cooking or being in front of a stove (sautéing, grilling, stir-fry) requires my full attention and so cannot be done in parallel.

I like mermaid diagrams. Where appropriate, draw mermaid diagrams to show dependencies or a series of steps to follow, including parallel flows and organizational grouping of tasks, using the referenced labeled steps from earlier. Provide a diagram for each stage of the process: prepping, cooking, and serving.

Use double quotes around text, and escape sensitive characters with HTML character entities inside of text, e.g. "B. Saute &quot;Hot&quot; Chicken" in the example diagram. When drawing transitions, include the activity and amount of time in a label, e.g. B -- "Cook for 5 Minutes" --> C["C. Serve"]; in the example diagram.

Example diagram:

flowchart TD;
    A["A. Start"] --> B{"B. Saute &quot;Hot&quot; Chicken"}
    B -- "Cook for 5 Minutes" --> C["C. Serve"]
    C --> D["D. End"]

I pointed the backend to a high-end reasoning LLM, as it turns out this is a complex task that lower-end LLMs don't do well: they hallucinate steps, draw diagrams that point back to themselves, or try to do steps out of order.

This means I can select the model, paste recipe text into the chat, and get flow charts detailing exactly what cooking steps I need to take, so I can decide whether I can manage them.

I streamlined it a bit by adding an Open WebUI Filter. The filter implementation uses recipe-scrapers and validators to check if the input text was a single URL, and then replaces it with the recipe text before it gets to the LLM.

import logging
from typing import Optional
from urllib.request import urlopen

from recipe_scrapers import scrape_html


def scrape_recipe_to_markdown(url: str) -> Optional[str]:
    try:
        html = urlopen(url).read().decode("utf-8")
        scraper = scrape_html(html, org_url=url)

        # Extracting the necessary information
        title = scraper.title()
        ingredients = scraper.ingredients()
        instructions = scraper.instructions()
        total_time = (
            f"{scraper.total_time()} minutes" if scraper.total_time() else "Not specified"
        )
        yields = scraper.yields() if scraper.yields() else "Not specified"

        # Constructing the Markdown string
        markdown_recipe = f"# {title}\n\n"
        markdown_recipe += f"**Total Time:** {total_time}\n"
        markdown_recipe += f"**Yields:** {yields}\n\n"
        markdown_recipe += "## Ingredients\n"
        markdown_recipe += (
            "\n".join(f"- {ingredient}" for ingredient in ingredients) + "\n\n"
        )
        markdown_recipe += "## Instructions\n"
        markdown_recipe += instructions

        return markdown_recipe
    except Exception as e:
        logging.error(f"Error scraping recipe from {url}: {str(e)}")
        return None

Open WebUI got confused trying to install recipe-scrapers with pip when I was using uv, but otherwise it was straightforward – I just copy/pasted the script and set a checkbox on the model and I was done. It's up on the community website.

So now I can type https://www.saveur.com/gamjatang-spicy-korean-pork-neck-and-potato-stew-recipe/ into the chat box in Open WebUI and get this as part of the response:

graph TD;
    A["A. Start Stock"] --> B["B. Initial Bone Boil"];
    B -- "Simmer 2 hours" --> C["C. Add Aromatics"];
    C -- "Simmer 1 hour" --> D["D. Strain & Cool"];

and

graph TD;
    E["E. Preheat Oven"] --> G["G. Roast Bones"];
    F["F. Prep Ingredients"] --> H["H. Make Sauce"];
    F --> G;
    G -- "Roast 30 min" --> I["I. Combine & Simmer"];
    H --> I;
    I -- "Simmer 45 min" --> J["J. Garnish & Serve"];

I can use these workflow diagrams to immediately see where I need to start, how long different stages will take, and what steps are involved in each process.

The next steps for me are probably going to be integrating Paprika and making some kind of basic RAG/CRUD available so there's more memory and context. I'm not sure I understand how embeddings work, but it's fun to pick things up.
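To get my head around embeddings, it helps to remember that retrieval is mostly just vector math: each text maps to a vector, and "similar meaning" becomes "high cosine similarity". A toy sketch, with hand-made three-dimensional vectors standing in for a real embedding model:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Hypothetical "embeddings" for three recipe notes (a real system would
# compute these with an embedding model, in far more dimensions).
store = {
    "braise pork neck 2 hours": [0.9, 0.1, 0.0],
    "simmer potato stew": [0.5, 0.5, 0.2],
    "fold egg whites for meringue": [0.0, 0.2, 0.9],
}

query = [0.85, 0.2, 0.05]  # pretend this embeds "how do I cook the pork?"
best = max(store, key=lambda note: cosine_similarity(query, store[note]))
```

A real RAG setup adds an embedding model and a vector store, but the retrieval step is conceptually just this `max` over similarities.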
