Extraction Graphs are defined declaritively using the HTTP API or any of the clients

This creates an extraction graph -

  1. name - myextractiongraph
  2. extraction_policies - List of extractors that the pipeline runs when data is uploaded into it from external sources -
    1. extractor - Name of the extractor function
    2. name - Give it a unique name, since the same extractor can be repeated multiple times.
    3. filters - The filters to evaluate to decide whether content ingested into the graph should be processed by the extractor.

Chained Extraction Graph Policies

Mulitple extractors can be chained together in a single graph. Specify a content_source in a policy to send content from an upstream extractor to downstream.

For ex:

In this example any text uploaded into the graph is chunked using the chunking extractor, the chunked text is then passed into the embedding extractor to generate text embeddings from the chunks.