cURL
curl --request POST \
  --url https://rag-prod.studio.lyzr.ai/v3/parse/text/ \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "data": [
    {
      "text": "string",
      "source": "string",
      "extra_info": {}
    }
  ],
  "chunk_size": 1000,
  "chunk_overlap": 100
}
'
{
  "documents": [
    {
      "id_": "1140ba4e-e5f5-4999-a5d6-4263f2c48b57",
      "embedding": {},
      "metadata": {
        "source": "<string>",
        "chunked": true
      },
      "text": "string",
      "excluded_embed_metadata_keys": [
        "<string>"
      ],
      "excluded_llm_metadata_keys": [
        "<string>"
      ]
    }
  ]
}
Process raw text data into structured document chunks.
data: Array of text objects to be parsed.
data[].text: The text content to be processed.
"string"
data[].source: Identifier for the source of the text (e.g., 'document_1.pdf').
data[].extra_info: Additional key-value metadata to associate with the text.
{}
[ { "text": "string", "source": "string", "extra_info": {} } ]
chunk_size: Size of each chunk produced when splitting the text.
1000
chunk_overlap: Overlap between consecutive text chunks.
100
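The same request body can be assembled from Python. The following is a minimal sketch using only the standard library; the helper names `build_parse_text_request` and `parse_text` are hypothetical and not part of any official client, and the `chunk_size=1000` and `chunk_overlap=100` arguments simply mirror the example values in the cURL request, which may or may not be the service's own defaults. It also assumes every item supplies both `text` and `source`.

```python
import json
import urllib.request

PARSE_TEXT_URL = "https://rag-prod.studio.lyzr.ai/v3/parse/text/"


def build_parse_text_request(api_key, items, chunk_size=1000, chunk_overlap=100):
    """Build (but do not send) a POST request for the parse/text endpoint.

    `items` is a list of dicts with "text", "source" and optional "extra_info".
    Illustrative helper only; the name and signature are assumptions.
    """
    body = {
        "data": [
            {
                "text": item["text"],
                "source": item["source"],
                "extra_info": item.get("extra_info", {}),
            }
            for item in items
        ],
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
    }
    return urllib.request.Request(
        PARSE_TEXT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def parse_text(api_key, items, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_parse_text_request(api_key, items, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made; `parse_text("<api-key>", [{"text": "...", "source": "document_1.pdf"}])` would then perform the actual call.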
Text successfully parsed and documents returned.
documents: List of processed document chunks.
documents[].id_: Unique identifier for the processed document chunk.
"1140ba4e-e5f5-4999-a5d6-4263f2c48b57"
documents[].embedding: Placeholder for the text embedding (null if not yet computed).
documents[].metadata: Metadata about the chunking and source.
documents[].text: The text content of the chunk.
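To show how the response fields fit together, here is a small sketch that pulls the identifier, source, and text out of each returned chunk. `SAMPLE_RESPONSE` copies the example response shown on this page, and `extract_chunks` is a hypothetical helper name, not something the API provides; a real response may carry additional fields.

```python
# Example response from this page, written as a Python dict.
SAMPLE_RESPONSE = {
    "documents": [
        {
            "id_": "1140ba4e-e5f5-4999-a5d6-4263f2c48b57",
            "embedding": {},
            "metadata": {"source": "<string>", "chunked": True},
            "text": "string",
            "excluded_embed_metadata_keys": ["<string>"],
            "excluded_llm_metadata_keys": ["<string>"],
        }
    ]
}


def extract_chunks(response):
    """Return a list of (id, source, text) tuples, one per document chunk.

    Missing keys are tolerated so that a partial response does not raise.
    """
    chunks = []
    for doc in response.get("documents", []):
        metadata = doc.get("metadata") or {}
        chunks.append((doc.get("id_"), metadata.get("source"), doc.get("text", "")))
    return chunks


for chunk_id, source, text in extract_chunks(SAMPLE_RESPONSE):
    print(chunk_id, source, len(text))
```

Reading `source` from `metadata` rather than from the top level follows the structure of the example response, where each chunk carries its origin alongside the `chunked` flag.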