The Python and system dependencies of your functions can be packaged into images.

Custom Docker Images

Specify the commands to install dependencies in a custom Docker image. You can choose any base image and install any system or Python dependencies.

An image can be used to run multiple functions. You associate a function with an image by passing the image to the function decorator.

Step 1: Define the Image

from indexify import Image

image = (
    Image()
    .name("my-pdf-parser-image")
    .base_image("ubuntu:22.04")
    .run("apt update")
    .run("apt install -y libgl1-mesa-glx git g++")
    .run("pip install torch")
    .run("pip install numpy")
    .run("pip install langchain")
    .run("pip install git+https://github.com/facebookresearch/detectron2.git@v0.6")
    .run("apt install -y tesseract-ocr")
    .run("apt install -y libtesseract-dev")
    .run("pip install py-inkwell")
)

This defines an Image object and specifies the name of the image. The chained run calls then install the dependencies. You can use any base image; the default is python:3.10.15-slim-bookworm.

The Indexify executor process is automatically installed in the image. You don't need to install it manually. The executor is responsible for running the functions in the image.
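To build intuition for what a chained image definition amounts to, here is a minimal sketch of a builder that collects the base image and run commands and renders them as Dockerfile lines. ImageSketch and to_dockerfile are hypothetical names made up for this illustration; they are not part of the indexify package, and Indexify's actual build logic may differ (for example, it also installs the executor).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageSketch:
    """Toy stand-in for an image definition; NOT the indexify Image class."""

    image_name: str = ""
    base: str = "python:3.10.15-slim-bookworm"  # default base image named in the text
    commands: List[str] = field(default_factory=list)

    def name(self, value: str) -> "ImageSketch":
        self.image_name = value
        return self  # returning self is what allows method chaining

    def base_image(self, value: str) -> "ImageSketch":
        self.base = value
        return self

    def run(self, command: str) -> "ImageSketch":
        self.commands.append(command)
        return self

    def to_dockerfile(self) -> str:
        # Hypothetical helper: one FROM line, then one RUN line per command.
        lines = [f"FROM {self.base}"]
        lines += [f"RUN {command}" for command in self.commands]
        return "\n".join(lines)


sketch = (
    ImageSketch()
    .name("my-pdf-parser-image")
    .base_image("ubuntu:22.04")
    .run("apt update")
    .run("pip install numpy")
)
dockerfile_text = sketch.to_dockerfile()
print(dockerfile_text)
```

Each run call becomes its own layer in this model, so grouping related installs into one command is a common way to keep the layer count down.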

Step 2: Use the Image in a Function

from indexify import indexify_function

@indexify_function(image=image)
def parse_pdf(pdf_path: str) -> str:
    ...

In the function decorator, we pass the image object. This tells Indexify to run the function in the specified image.

Step 3: Build the Image

You can build the Docker image using the indexify build-image command.

Assuming the function is in a file named pdf_parser.py, you can run:

indexify build-image pdf_parser.py parse_pdf

This builds a Docker image named my-pdf-parser-image. You can then push the image to your container registry or Docker Hub.
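Pushing typically means tagging the local image with the registry path and then pushing that tag. The sketch below only assembles the standard docker tag and docker push invocations. It assumes the built image is available locally as my-pdf-parser-image; the registry path registry.example.com/team and the helper name push_commands are placeholders, not something Indexify provides.

```python
from typing import List


def push_commands(local_image: str, registry: str, tag: str = "latest") -> List[List[str]]:
    """Return the docker CLI invocations that tag and push a local image."""
    remote = f"{registry}/{local_image}:{tag}"
    return [
        ["docker", "tag", local_image, remote],
        ["docker", "push", remote],
    ]


# "registry.example.com/team" is a placeholder; substitute your own registry path.
commands = push_commands("my-pdf-parser-image", "registry.example.com/team")
for argv in commands:
    print(" ".join(argv))
    # To execute for real: subprocess.run(argv, check=True)
```

Keeping the commands as argument lists rather than shell strings makes them safe to hand to subprocess.run without quoting issues.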

Step 4: Deploying Functions

When you create a graph that references the parse_pdf function, Indexify automatically routes the function to the specified image.

It does so based on the association you made in the function decorator: @indexify_function(image=image)

You need to use Kubernetes, ECS, or any other container orchestration engine to deploy your images. If no container with the image is running, Indexify queues the function internally and executes it once an executor container with that image becomes available.
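The queue-then-execute behavior described here can be pictured with a toy model. This is only an illustration under simplifying assumptions: ToyRouter, submit, and executor_joined are hypothetical names, and Indexify's real scheduler is not shown in this document.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Set


class ToyRouter:
    """Toy model of image-based routing; NOT Indexify's scheduler."""

    def __init__(self) -> None:
        self.pending: Dict[str, Deque[str]] = defaultdict(deque)
        self.live_images: Set[str] = set()
        self.executed: List[str] = []

    def submit(self, function_name: str, image_name: str) -> None:
        # Run immediately if an executor with this image is up, else queue.
        if image_name in self.live_images:
            self.executed.append(function_name)
        else:
            self.pending[image_name].append(function_name)

    def executor_joined(self, image_name: str) -> None:
        # A container with this image came online: drain its queue in order.
        self.live_images.add(image_name)
        queue = self.pending[image_name]
        while queue:
            self.executed.append(queue.popleft())


router = ToyRouter()
router.submit("parse_pdf", "my-pdf-parser-image")  # no executor yet, so it waits
queued_before = list(router.pending["my-pdf-parser-image"])
router.executor_joined("my-pdf-parser-image")  # queued work now runs
print(queued_before, router.executed)
```

In this model, work is keyed by image name, which mirrors the association made in the function decorator.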