A Fast Path to Functional RAG Agents

Building an AI agent that’s aware of your docs or support pages is one of the more useful and practical things you can do with LLMs right now. It’s also a perfect use case for showing how Midio can integrate various services and easily connect to a frontend.

In this guide, we’ll walk through how to build a complete retrieval-augmented generation (RAG) system that:

  • Crawls and structures web content using Firecrawl

  • Stores and retrieves scraped data using OpenAI’s vector store API

  • Uses Midio to orchestrate the flow, from extraction to response

  • And delivers it all through a frontend built with Lovable

Here's what we're building. Midio makes it easy to iterate quickly, with a live, always-up-to-date editor that shows request traces in real time.

A simple RAG system

A common approach for RAG systems is the FTI architecture (Feature, Training, Inference). It divides the system into three main parts:

  • Feature pipeline - prepares our data for use in the RAG system.

  • Training pipeline - fine-tunes our model for specific needs (this step is sometimes skipped if the raw model performs well enough, which it does in our case).

  • Inference pipeline - gathers relevant data based on user requests and generates responses.

In this guide, we skip the training step but create a simple feature pipeline using Firecrawl to gather data and OpenAI's vector store to chunk and store data for retrieval. We also build a basic inference pipeline to retrieve relevant data from the vector store and use an LLM to create a formatted response.

Data extraction using Firecrawl

Firecrawl is a fantastic service for gathering data from any website, returning it in an easy-to-understand format for LLMs. They recently introduced the new extract API, which we'll use to gather our data. Midio has a Firecrawl package available, which we first add using the package manager.

NOTE: Firecrawl offers generous free credits to get started, so sign up on their website if you want to follow along.

Starting an extraction job

The extract API is asynchronous, so the first thing we do is start an extraction job. We specify the format of the returned data using the Generate JSON Schema node, passing the generated schema into the schema input of the Extract node.

We then check its progress every 5 seconds in a loop until completion (notice the green arrow looping back to Get Extraction from the Delay node).
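For a sense of what these nodes do under the hood, here is a rough TypeScript sketch of the same start-then-poll flow against Firecrawl's v1 extract endpoints. The request and response fields are assumptions based on Firecrawl's public docs, and the URL and schema in the usage comment are hypothetical:

// Start a Firecrawl extract job, then poll every 5 seconds until it finishes.
const FIRECRAWL = "https://api.firecrawl.dev/v1/extract";

async function extractDocs(apiKey: string, urls: string[], schema: object) {
  // 1. Start the asynchronous extraction job.
  const start = await fetch(FIRECRAWL, {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({ urls, schema }),
  });
  const { id } = await start.json();

  // 2. Poll until completion, mirroring the Delay -> Get Extraction loop.
  while (true) {
    const poll = await fetch(`${FIRECRAWL}/${id}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    const job = await poll.json();
    if (job.status === "completed") return job.data;
    if (job.status === "failed") throw new Error("extraction failed");
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}

// Usage (hypothetical URL and schema):
// await extractDocs(process.env.FIRECRAWL_API_KEY!, ["https://<your-docs-site>"],
//   { type: "object", properties: { title: { type: "string" }, content: { type: "string" } } });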

Uploading our data to a vector store

First, we create a new vector store using the Create Vector Store function from the open-ai package (also added via the package manager). This returns an ID, which we must retain for later use. The simplest way is to copy it and store it securely as an environment variable, allowing retrieval using the Get Environment Variable function.
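Outside Midio, that one-time setup looks roughly like this with the OpenAI Node SDK (a sketch: the store name is arbitrary, and in older SDK versions the vector store methods live under client.beta instead):

import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// One-time setup: create the store and note its ID.
const store = await client.vectorStores.create({ name: "midio-docs" });
console.log(store.id); // save this, e.g. as a VECTOR_STORE_ID environment variable

// The rest of the pipeline later reads it back:
const vectorStoreId = process.env.VECTOR_STORE_ID!;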

Now we can add data to our vector store. The OpenAI API requires two steps:

  1. Upload data as a file.

  2. Connect the uploaded file to the vector store.

In Midio, this is straightforward.

The vector store supports various chunking strategies, but we use the default 'auto' strategy by leaving that input blank. The attributes field is also left blank.
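Sketched with the same SDK, the two steps look like this (the file name is illustrative, and no chunking_strategy or attributes are passed, matching the blank inputs above):

import fs from "node:fs";
import OpenAI from "openai";

const client = new OpenAI();
const vectorStoreId = process.env.VECTOR_STORE_ID!;

// Step 1: upload one extracted page as a file.
const file = await client.files.create({
  file: fs.createReadStream("extracted-page.md"), // a local copy of one page
  purpose: "assistants",
});

// Step 2: attach the uploaded file to the vector store.
// Omitting chunking_strategy selects the default "auto" strategy.
await client.vectorStores.files.create(vectorStoreId, { file_id: file.id });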

Next, we connect these two nodes to our Firecrawl workflow, ensuring that once the extraction completes, it automatically creates and uploads one file per extracted page and adds it to the vector store.

If we want regular extractions, we can use a Schedule node, configuring it to run daily, for example.

Inference

The inference pipeline generates responses based on user queries, typically questions about the service we extracted documentation from—in this case, Midio.

Our pipeline includes two steps:

  1. Retrieve data from the vector store.

  2. Use an LLM to format a response based on the retrieved data and the user's query.

There are two important details here:

  • User queries are passed directly into the vector store (this is not necessarily an optimal approach, but it works well enough in our case).

  • Search results are converted into a string and combined with the original query in a template using XML tags (we do this to help the LLM distinguish between our data and the user’s query):

<context>{{context}}</context>
<user_query>{{user query}}</user_query>

This template is then provided to the Chat Complete node with the following system message:

You are a documentation answering agent.
Your job is to answer questions about a product called Midio,
a visual, node-based programming language for building automations 
and AI-powered backend applications.

You are provided with a user query and a set of relevant documents 
containing information related to the query.
You MUST answer the user only based on the provided documentation.

Return one or two short paragraphs in your response.
Your response MUST be valid markdown.
If available in the documentation, include links for the user.
You should also add images to the response, if present in the context data.
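Expressed as code, the whole inference step is roughly the following sketch. It assumes the newer vector store search endpoint in the OpenAI Node SDK, and the model name is an illustrative choice rather than what the Midio node necessarily uses:

import OpenAI from "openai";

const client = new OpenAI();
const vectorStoreId = process.env.VECTOR_STORE_ID!;
const SYSTEM_MESSAGE = "You are a documentation answering agent. ..."; // the prompt above

async function answer(userQuery: string): Promise<string> {
  // 1. Pass the raw user query straight into the vector store search.
  const results = await client.vectorStores.search(vectorStoreId, { query: userQuery });

  // 2. Flatten the search results into a single context string.
  const context = results.data
    .flatMap((result) => result.content.map((part) => part.text))
    .join("\n\n");

  // 3. Wrap context and query in XML tags so the model can tell them apart.
  const completion = await client.chat.completions.create({
    model: "gpt-4o", // illustrative; use whatever your Chat Complete node is configured with
    messages: [
      { role: "system", content: SYSTEM_MESSAGE },
      { role: "user", content: `<context>${context}</context>\n<user_query>${userQuery}</user_query>` },
    ],
  });
  return completion.choices[0].message.content ?? "";
}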

The inference pipeline is simple yet effective.

The last step before creating our Lovable frontend is connecting our inference pipeline to an HTTP endpoint.

Creating a REST API

This is also simple in Midio. We use two nodes: an Endpoint and a Respond to HTTP Request, placing our inference flow between them.

We configure the endpoint to accept POST requests to the /ask route, parse the JSON body, and feed the query into our inference pipeline. The response is piped directly into the body input of the Respond to HTTP Request node.
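A quick way to verify the endpoint before building a frontend is a plain fetch call (the host below is the same placeholder used later in the Lovable prompt):

// Smoke test for the /ask endpoint; replace <project-id> with your own.
const res = await fetch("https://<project-id>.midio.dev:3000/ask", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: "What is Midio?" }),
});
console.log(await res.text()); // the markdown answer from the inference pipeline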

A simple frontend in Lovable

Finally, we create a frontend to interact with our RAG application. Lovable makes it easy to quickly build frontends that connect to REST endpoints.

To try this yourself, you can launch Lovable with the following prompt, which is often sufficient to create this simple app after a couple of back-and-forth edits.

Lovable prompt (remember to replace <project-id> with your own project ID)

Create a simple application called "Midio QA" that lets users ask a question about Midio, a visual programming language. The app should:

  1. Provide a text input for the Base URL (default value: https://<project-id>.midio.dev:3000/ask).

  2. Provide a second text input for the User Query (a question about Midio).

  3. On clicking "Ask Midio", the app should make a POST request to the Base URL with the JSON body:

    {
      "query": "<the user’s question>"
    }

  4. Wait for up to one minute for the server response. During this time, show a loading indicator (such as a progress bar, spinner, or “Waiting for response...” message).

  5. When the server response arrives, display it to the user as well-formatted Markdown. The Markdown might include:

    • Text (paragraphs, headings)

    • Lists

    • Code blocks

    • Links

    • Images

    • Anything else in the CommonMark spec

  6. If the request times out or fails, display a friendly error message (e.g. “Something went wrong. Please try again.”).

Important details & notes for Lovable

  • The Base URL must be editable by the user in a text field so they can customize it if needed.

  • By default, populate the Base URL field with https://<project-id>.midio.dev:3000/ask.

  • The User Query field is required. If the user presses "Ask Midio" without providing a query, show a small validation error or warning.

  • The request might take a long time (up to 60 seconds). Show a visible loading indicator during this waiting period.

  • When the response comes back, render the returned text as Markdown so that code blocks, headings, etc., look nice.

  • Make sure the user can submit a new query after receiving a response.

  • Provide a clean, modern layout that looks professional.

Additional suggestions

  • Include helpful placeholders or labels for each field, e.g. “Enter Base URL here”, “Ask your question about Midio…”.

  • For the Markdown rendering, ensure images and code blocks are displayed correctly.

  • Consider adding a small info panel or help text that explains what Midio is (“Midio is a visual programming language for building interactive applications.”) so new users understand the context.

  • If you can, show partial updates in the interface or a real-time streaming effect if the server supports it (not required, but a nice extra).

With these instructions, please generate a fully functional Lovable application that meets the above requirements in one pass.

Alternatively, you can remix our existing Lovable project.

Before testing, we need to handle one final issue: CORS. Since our backend and frontend are on different domains, we must explicitly allow cross-origin requests. A simple solution is to allow requests from our Lovable frontend's domain by returning these headers in the Respond to HTTP Request node:

{
  "Access-Control-Allow-Origin": "https://<your-project-id>.lovableproject.com",
  "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
  "Access-Control-Allow-Headers": "Content-Type, Authorization"
}

Additionally, browsers send a CORS preflight (OPTIONS) request before a JSON POST, so we also add an OPTIONS handler that returns the headers above.
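For reference, the preflight behavior is the same as what a plain Node HTTP server would do here (a sketch for illustration; in Midio it is simply an OPTIONS endpoint returning the headers above):

import { createServer } from "node:http";

const CORS_HEADERS = {
  "Access-Control-Allow-Origin": "https://<your-project-id>.lovableproject.com",
  "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
  "Access-Control-Allow-Headers": "Content-Type, Authorization",
};

createServer((req, res) => {
  if (req.method === "OPTIONS") {
    // Preflight: respond with the CORS headers and an empty body.
    res.writeHead(204, CORS_HEADERS);
    return res.end();
  }
  // Normal requests (e.g. POST /ask) must include the same headers.
  res.writeHead(200, { "Content-Type": "application/json", ...CORS_HEADERS });
  res.end(JSON.stringify({ answer: "..." }));
}).listen(3000);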

Next steps

The Midio project is available as a template when you create a new project. It also contains some additional features, like a question history and simple user tracking.

If you haven’t already, you can sign up for the Midio beta here.
