Extractors

Extractors are managed by the indexify-extractor CLI.

Indexify Extractor CLI

Install the indexify-extractor CLI with pip:

pip install indexify-extractor-sdk

List Available Extractors

indexify-extractor list

[Image: extractor list]

Download Extractors

Extractors have to be downloaded before they can be used locally or in production. For example, you can download the PDF extractor like this:

indexify-extractor download tensorlake/pdfextractor
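In a setup script it can help to confirm that the CLI is on the PATH before attempting a download. The following is a minimal sketch, assuming a POSIX shell; `have_cli` and `download_extractor` are hypothetical helper names used for illustration and are not part of the CLI.

```shell
# Hypothetical helper: succeed only if the given command is on the PATH.
have_cli() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper: download an extractor only when the CLI is installed.
download_extractor() {
  if have_cli indexify-extractor; then
    indexify-extractor download "$1"
  else
    echo "indexify-extractor not found; run: pip install indexify-extractor-sdk" >&2
    return 1
  fi
}

# Usage (not executed here):
#   download_extractor tensorlake/pdfextractor
```

The wrapper returns a non-zero status instead of exiting, so it can be sourced into a larger script without terminating it.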

Test Extractors Locally

You can test extractors locally without running them with the server in a production setting. Let's say we want to test the PDF extractor:

indexify-extractor run-local pdfextractor.pdf_extractor:PDFExtractor --file /path/to/pdf

Options

--file to pass in a file to the extractor
--text to pass in text to the extractor
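The two options above can be selected automatically based on the input. The sketch below assumes a POSIX shell and assumes that `--text` takes the text as its argument in the same way `--file` takes a path; `input_flag` and `run_local` are hypothetical helper names, not part of the CLI.

```shell
# Hypothetical helper: print the run-local flag that matches the input.
# An existing file path maps to --file; anything else is treated as raw text.
input_flag() {
  if [ -f "$1" ]; then
    echo "--file"
  else
    echo "--text"
  fi
}

# Hypothetical wrapper around run-local that uses the flag chosen above.
# Assumption: --text accepts the text as a single argument.
run_local() {
  extractor="$1"
  input="$2"
  indexify-extractor run-local "$extractor" "$(input_flag "$input")" "$input"
}

# Usage (not executed here):
#   run_local pdfextractor.pdf_extractor:PDFExtractor /path/to/pdf
```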

Join the Extractor to the Server

You can join the extractor to the server so that it starts extracting data ingested by the server:

indexify-extractor join-server

Options

--coordinator-addr - Address of the coordinator. Default: localhost:8950
--ingestion-addr - Address of the ingestion server. Default: localhost:8900
--listen-port - The port on which the extractor listens for on-demand extraction requests
--advertise-addr - The address that is advertised to the ingestion server. If this is an embedding extractor, the address must be reachable by the server for embedding lookups to work.
--workers - Number of workers that the extractor spawns

These settings are printed in the log when the extractor starts up.
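To illustrate how the options combine, here is a sketch that assembles a join-server command with every option set explicitly. The coordinator and ingestion addresses are the documented defaults; the listen port, advertise address (including its host:port form), and worker count are assumed values chosen only for illustration.

```shell
# Documented defaults.
COORDINATOR_ADDR="localhost:8950"
INGESTION_ADDR="localhost:8900"

# Assumed values for illustration only; adjust for your environment.
LISTEN_PORT="9501"
ADVERTISE_ADDR="localhost:9501"
WORKERS="2"

# Assemble the full command line as a single string.
JOIN_CMD="indexify-extractor join-server \
--coordinator-addr $COORDINATOR_ADDR \
--ingestion-addr $INGESTION_ADDR \
--listen-port $LISTEN_PORT \
--advertise-addr $ADVERTISE_ADDR \
--workers $WORKERS"

# Print the command for review; run it yourself once the values look right.
echo "$JOIN_CMD"
```

Printing the command first makes it easy to compare the values against what the extractor logs at startup.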
