Get Sources - Retab Docs

from retab import Retab

client = Retab()

result = client.extractions.sources("extr_01G34H8J2K")
print(result)

{
  "object": "extraction.sources",
  "extraction_id": "extr_01G34H8J2K",
  "document_type": "pdf",
  "file": {
    "id": "file_abc123",
    "filename": "invoice_001.pdf",
    "mime_type": "application/pdf"
  },
  "extraction": {
    "invoice_number": "INV-1032",
    "customer": {
      "name": "Acme Inc."
    },
    "total_amount": 1240.0
  },
  "sources": {
    "invoice_number": {
      "value": "INV-1032",
      "source": {
        "content": "INV-1032",
        "anchor": {
          "kind": "pdf_bbox",
          "page": 1,
          "left": 0.6,
          "top": 0.12,
          "width": 0.25,
          "height": 0.03
        }
      }
    },
    "customer": {
      "name": {
        "value": "Acme Inc.",
        "source": {
          "content": "Acme Inc.",
          "anchor": {
            "kind": "pdf_bbox",
            "page": 1,
            "left": 0.1,
            "top": 0.25,
            "width": 0.3,
            "height": 0.03
          }
        }
      }
    },
    "total_amount": {
      "value": 1240.0,
      "source": {
        "content": "1,240.00",
        "anchor": {
          "kind": "pdf_bbox",
          "page": 1,
          "left": 0.65,
          "top": 0.85,
          "width": 0.2,
          "height": 0.03
        }
      }
    }
  }
}
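The pdf_bbox anchors in the example response can be turned into pixel rectangles for highlighting. The sketch below assumes that left, top, width and height are fractions of the page size measured from the top-left corner; the sample values all lie between 0 and 1, which suggests this, but the page does not state it. The helper name bbox_to_pixels and the page dimensions are hypothetical.

```python
from typing import Dict, Tuple


def bbox_to_pixels(anchor: Dict[str, float], page_width: int, page_height: int) -> Tuple[int, int, int, int]:
    """Return (x0, y0, x1, y1) in pixels for a pdf_bbox anchor.

    Assumption: coordinates are normalized to [0, 1] with the origin
    at the top-left corner of the page.
    """
    if anchor.get("kind") != "pdf_bbox":
        raise ValueError(f"unsupported anchor kind: {anchor.get('kind')!r}")
    x0 = round(anchor["left"] * page_width)
    y0 = round(anchor["top"] * page_height)
    x1 = round((anchor["left"] + anchor["width"]) * page_width)
    y1 = round((anchor["top"] + anchor["height"]) * page_height)
    return x0, y0, x1, y1


# Anchor of invoice_number from the example response.
anchor = {"kind": "pdf_bbox", "page": 1, "left": 0.6, "top": 0.12, "width": 0.25, "height": 0.03}

# Hypothetical rendered page size of 1000 x 2000 pixels.
box = bbox_to_pixels(anchor, page_width=1000, page_height=2000)
print(box)
```

The page field is left untouched here; whether it is 0-based or 1-based is not stated on this page, so check it against a known document before indexing into rendered pages.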

GET /v1/extractions/{extraction_id}/sources

Authorizations

Api-Key · string · header · required

Path Parameters

extraction_id · string · required
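For callers not using the Python SDK, the endpoint, path parameter and Api-Key header described above can be combined into a plain HTTP request. This is a minimal sketch using only the standard library; the base URL shown is a placeholder, not the documented host, and build_sources_request is a hypothetical helper name.

```python
import urllib.parse
import urllib.request


def build_sources_request(base_url: str, extraction_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for an extraction's sources."""
    # Percent-encode the ID so it is safe to place in the URL path.
    safe_id = urllib.parse.quote(extraction_id, safe="")
    url = f"{base_url.rstrip('/')}/v1/extractions/{safe_id}/sources"
    # The API key travels in the Api-Key request header.
    return urllib.request.Request(url, headers={"Api-Key": api_key}, method="GET")


# Placeholder base URL and key; substitute the real values for your account.
req = build_sources_request("https://api.example.com", "extr_01G34H8J2K", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it; the response body should then be JSON of the shape shown in the example response.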

Response

Successful Response

An extraction's output annotated with the source that backs each value.

Returned when fetching the sources for an extraction. Carries the source file and its detected document_type, the original extraction output, and a parallel sources tree where each leaf is a {value, source} object locating the value in the document (a page region for PDFs, a cell for spreadsheets, a text span for plain text, and so on).
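Because every leaf of the sources tree is a {value, source} object, the tree can be flattened into (path, leaf) pairs with a short recursive walk. The sketch below works on a plain dict parsed from the JSON response; iter_leaves and the dotted-path format are illustrative choices, not part of the API, and list handling is an assumption for schemas that contain arrays.

```python
from typing import Any, Dict, Iterator, Tuple


def is_leaf(node: Any) -> bool:
    """A leaf is a dict with exactly the keys 'value' and 'source'."""
    return isinstance(node, dict) and set(node.keys()) == {"value", "source"}


def iter_leaves(node: Any, path: str = "") -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (dotted_path, leaf) for every {value, source} leaf under node."""
    if is_leaf(node):
        yield path, node
    elif isinstance(node, dict):
        for key, child in node.items():
            yield from iter_leaves(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        # Assumption: arrays in the extraction appear as lists of subtrees.
        for index, child in enumerate(node):
            yield from iter_leaves(child, f"{path}[{index}]")


# Trimmed copy of the "sources" object from the example response.
sources = {
    "invoice_number": {"value": "INV-1032", "source": {"content": "INV-1032"}},
    "customer": {"name": {"value": "Acme Inc.", "source": {"content": "Acme Inc."}}},
    "total_amount": {"value": 1240.0, "source": {"content": "1,240.00"}},
}

paths = [path for path, _ in iter_leaves(sources)]
for path, leaf in iter_leaves(sources):
    print(path, "=", leaf["value"], "| found as:", leaf["source"]["content"])
```

One caveat: a schema with an object whose only fields are literally named value and source would be mistaken for a leaf by this check.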

extraction_id · string · required

ID of the extraction

document_type · enum<string> · required

Detected document type of the source file

Available options: pdf, image, csv, xlsx, docx, txt

file · FileRef · object · required

File metadata (id, filename, mime_type)

extraction · Extraction · object · required

Original extraction output

sources · Sources · object · required

Same shape as extraction but leaves are {value, source} objects

object · string · default: extraction.sources

Allowed value: "extraction.sources"
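Since sources has the same shape as extraction, replacing each {value, source} leaf with its value should give back a tree that can be compared with extraction. The sketch below does this on a trimmed copy of the example response. It assumes each leaf's value equals the corresponding extraction value, as it does in the example; strip_sources is a hypothetical helper name.

```python
from typing import Any


def strip_sources(node: Any) -> Any:
    """Replace every {value, source} leaf with its bare value."""
    if isinstance(node, dict):
        if set(node.keys()) == {"value", "source"}:
            return node["value"]
        return {key: strip_sources(child) for key, child in node.items()}
    if isinstance(node, list):
        return [strip_sources(child) for child in node]
    return node


# Trimmed copy of the example response (anchors omitted for brevity).
response = {
    "object": "extraction.sources",
    "extraction": {
        "invoice_number": "INV-1032",
        "customer": {"name": "Acme Inc."},
        "total_amount": 1240.0,
    },
    "sources": {
        "invoice_number": {"value": "INV-1032", "source": {"content": "INV-1032"}},
        "customer": {"name": {"value": "Acme Inc.", "source": {"content": "Acme Inc."}}},
        "total_amount": {"value": 1240.0, "source": {"content": "1,240.00"}},
    },
}

rebuilt = strip_sources(response["sources"])
matches = rebuilt == response["extraction"]
print("sources tree matches extraction:", matches)
```

Note that source.content holds the text as it appears in the document ("1,240.00"), while value holds the typed result (1240.0), so the comparison uses value rather than content.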
