Introduction
Compute Engine for Building and Serving Multi-Stage Data-Intensive Workflows
Indexify is a compute framework for building durable, data-intensive workflows and serving them as APIs. Workflows are elastic: functions run in parallel across multiple machines, and outputs are automatically passed between dependent functions. Graphs are served as live API endpoints for seamless integration with existing systems.
Key Features
- Conditional Branching and Data Flow: Router functions can dynamically choose one or more edges in a Graph, making it easy to invoke expert models based on inputs (see the sketch after this list).
- Local Inference: Run LLMs in workflow functions using llama.cpp, vLLM, or Hugging Face Transformers.
- Distributed Map and Reduce: Automatically parallelizes functions over sequences across multiple machines. Reducer functions are durable and are invoked as map functions finish (see the sketch after the next paragraph).
- Version Graphs and Backfill: Use the Backfill API to re-process previously ingested data when functions or models are updated.
- Request Queuing and Batching: Automatically queues and batches parallel workflow invocations to maximize GPU utilization.
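As a sketch of conditional branching, the example below routes a request to one of two summarization functions. It assumes the SDK's indexify_router decorator and Graph.route method; the function names and the length-based routing rule are illustrative.

from typing import List, Union
from indexify import indexify_function, indexify_router, Graph

@indexify_function()
def fetch_text(url: str) -> str:
    import requests
    return requests.get(url).text

@indexify_function()
def summarize_with_small_model(text: str) -> str:
    # Stand-in for a call to a small, cheap model.
    return text[:100]

@indexify_function()
def summarize_with_large_model(text: str) -> str:
    # Stand-in for a call to a large expert model.
    return text[:500]

@indexify_router()
def pick_model(text: str) -> List[Union[summarize_with_small_model, summarize_with_large_model]]:
    # Route short documents to the small model, long ones to the large model.
    if len(text) < 10_000:
        return [summarize_with_small_model]
    return [summarize_with_large_model]

g = Graph(name="routed-summarizer", start_node=fetch_text)
g.add_edge(fetch_text, pick_model)
g.route(pick_model, [summarize_with_small_model, summarize_with_large_model])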
Workflows were traditionally written as a sequence of steps. We chose a graph representation instead to exploit the inherent parallelism in many AI workflows: embedding, chunking, summarization, object detection, transcription, and more can all run in parallel.
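For example, here is a minimal map/reduce sketch: a function that returns a sequence fans out to the next function in parallel, and a reducer folds the results as they arrive. It assumes the SDK's accumulate parameter for declaring reducers; the arithmetic is illustrative.

from typing import List
from pydantic import BaseModel
from indexify import indexify_function, Graph

class Total(BaseModel):
    val: int = 0

@indexify_function()
def generate_numbers(count: int) -> List[int]:
    # Returning a sequence fans out: each element becomes a
    # separate, parallel invocation of the next function.
    return list(range(count))

@indexify_function()
def square(x: int) -> int:
    return x * x

# accumulate marks this as a reducer (an assumption about the SDK);
# it is invoked durably as each map task finishes.
@indexify_function(accumulate=Total)
def add(total: Total, value: int) -> Total:
    total.val += value
    return total

g = Graph(name="sum-of-squares", start_node=generate_numbers)
g.add_edge(generate_numbers, square)
g.add_edge(square, add)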
Quick Start
Let's build a workflow to summarize a website! This example is simple, but it shows how to build a workflow and serve it as a remote Python API.
Install
pip install indexify
This installs the Python SDK and the compute engine for building and serving workflows.
Step 1: Define the Workflow and Register a Graph
We will write two functions, scrape_website and summarize_text.
We create a Graph, website-summarizer, that executes the scrape function, and then executes the summarizer with the output of the scraper.
from indexify import indexify_function, Graph

@indexify_function()
def scrape_website(url: str) -> str:
    import requests
    # Fetch a markdown rendering of the page via the Jina reader proxy.
    return requests.get(f"http://r.jina.ai/{url}").text

@indexify_function()
def summarize_text(text: str) -> str:
    from openai import OpenAI
    completion = OpenAI().chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant. Generate a summary of this website"},
            {"role": "user", "content": text},
        ],
    )
    return completion.choices[0].message.content

g = Graph(name="website-summarizer", start_node=scrape_website)
g.add_edge(scrape_website, summarize_text)
Step 2: Test the Graph In-Process
The graph can be run as-is; this is useful for testing.
g.run(url="https://en.wikipedia.org/wiki/Golden_State_Warriors")
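To inspect the result of a local run, you can capture the invocation id and read the outputs. This sketch assumes the in-process graph returns an invocation id from run() and supports the same get_output accessor shown for the remote graph in Step 3.3:

# Assumes run() returns an invocation id and the in-process graph
# mirrors the remote get_output API shown in Step 3.3.
invocation_id = g.run(url="https://en.wikipedia.org/wiki/Golden_State_Warriors")
results = g.get_output(invocation_id)
print(results)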
Step 3.1: Deploying a Graph as a Remote API
When it's time to consume your graph from other applications, you can serve it as an API. The server can run in production in many ways; here we run it on a laptop to show how it works.
indexify-cli server-dev-mode
This starts the following processes -
- A server to orchestrate the functions in the graph, store state across functions, and host the Remote Graph APIs.
- An executor, which runs the individual functions in the graph. Any number of executors can run in parallel to handle more data.
Once the server is ready, you can deploy the graph -
from indexify import RemoteGraph
RemoteGraph.deploy(g, server_url="http://localhost:8900")
Step 3.2: Consume the Remote Graph from Applications
Once the graph is deployed, you can get a reference to the Graph from any application.
graph = RemoteGraph.by_name(name="website-summarizer", server_url="http://localhost:8900")
You can now call the graph as a remote API.
invocation_id = graph.run(blocking_until_done=True, url="https://en.wikipedia.org/wiki/Golden_State_Warriors")
Step 3.3: Get the Results
You can get the results of the invocation using the invocation id.
results = graph.get_output(invocation_id)
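Putting Steps 3.2 and 3.3 together, a complete client uses only the calls shown above:

from indexify import RemoteGraph

# Get a handle to the deployed graph by name.
graph = RemoteGraph.by_name(name="website-summarizer", server_url="http://localhost:8900")

# Invoke the graph and block until the workflow finishes.
invocation_id = graph.run(
    blocking_until_done=True,
    url="https://en.wikipedia.org/wiki/Golden_State_Warriors",
)

# Fetch the outputs of the invocation.
results = graph.get_output(invocation_id)
print(results)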
Step 4: Building Executor Container Images for Functions
More often than not, your functions will need custom Python packages. You can build a Docker image with any Python and system dependencies: create an Image object, run any commands to install dependencies, and reference the image name in the function decorator.
from indexify import indexify_function, Image

image = (
    Image()
    .name("my-custom-image")
    .base_image("python:3.10.15-slim-bookworm")
    .tag("latest")
    .run("pip install indexify")
)

@indexify_function(image=image)
def func_a(x: int) -> str:
    ...
This instructs Indexify to run the function in images named my-custom-image. Unlike other systems, we don't burn your functions into the images; you only specify the environment in which the function should run. This lets you simply re-deploy your graph when business logic changes, and the images continue to be used.
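Because images are attached per function, one graph can span heterogeneous environments, for example a lightweight image for scraping and a heavier one for inference. A sketch using the Image API from above (image names and dependencies are illustrative):

from indexify import indexify_function, Image

# A lightweight image for network-bound scraping.
scraper_image = (
    Image()
    .name("scraper-image")
    .base_image("python:3.10.15-slim-bookworm")
    .tag("latest")
    .run("pip install indexify requests")
)

# A heavier image for model inference.
inference_image = (
    Image()
    .name("inference-image")
    .base_image("python:3.10.15-slim-bookworm")
    .tag("latest")
    .run("pip install indexify transformers torch")
)

@indexify_function(image=scraper_image)
def scrape(url: str) -> str:
    ...

@indexify_function(image=inference_image)
def summarize(text: str) -> str:
    ...

Executors built from each image pick up only the functions that reference it, which is what enables the heterogeneous deployments described in the Kubernetes section below.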
Deployment
Docker Compose
You can run the server and executors in a single machine using Docker Compose.
Copy the docker-compose.yml file from the GitHub repository.
docker-compose up
This starts the Indexify server and two executor replicas on a single machine. You can modify the Docker Compose file to add your custom executor images.
Kubernetes
The most involved deployment option is Kubernetes. You can run the Indexify server as a StatefulSet and executors as Deployments. Executor deployments are usually heterogeneous: since images are built differently for different functions, each image type typically gets its own Deployment. The advantage is that each can be scaled independently based on throughput and latency requirements.
You can find our Helm chart in the GitHub repository; start from it and customize it to your requirements.
Bare Metal
Indexify doesn't depend on Kubernetes or Docker; you can run the server and executors on any machine. Run the server on one machine and executors on as many machines as you need.
Start Server
Start the server on one machine. Read the configuration reference to understand how to customize the server to use blob stores for storing function outputs.
indexify-server
Start Executor
Start as many executors as you want on different machines.
indexify-cli executor --server-addr <server-ip>:<server-port>
To understand how Indexify is used for various use cases, we recommend the following starting points.
Basic Tutorial
A Wikipedia ingestion and indexing pipeline. The tutorial teaches the basics of Indexify. It introduces all the system components and how to build a simple pipeline.
Intermediate Tutorial
A tax document processing and Q&A system. The tutorial covers text extraction from PDF and building a RAG pipeline.
Multi-Modal RAG
A detailed example of text, table, and image extraction from PDFs. It also covers building image and text indexes and doing cross-modal retrieval, re-ranking, and reciprocal rank fusion.