Text
Adding Text Files to Your Search Agent
Adding plain text (.txt) documents to your search agent broadens its knowledge base, letting it draw on information from text-based documents. The add_text method integrates text content into the agent's searchable data.
Function Signature
The add_text function accepts text files either from a specified directory or as a list of individual files.
Parameters
- input_dir (Optional[str]): Directory path containing text files to be added. If specified, the function searches this directory for eligible files.
- input_files (Optional[List]): A list of specific text file paths to add. Takes precedence over input_dir if provided.
- exclude_hidden (bool): If True, ignores hidden files (files whose names start with a dot) within input_dir.
- filename_as_id (bool): If True, uses the filename as the unique identifier for each document.
- recursive (bool): If True, includes files from subdirectories within input_dir.
- required_exts (Optional[List[str]]): Specifies the file extensions to include, defaulting to text files.
- system_prompt (str): An optional prompt to guide the system in processing text content.
- query_wrapper_prompt (str): An optional prompt that wraps user queries to improve their relevance.
- embed_model (Union[str, EmbedType]): The embedding model used for text extraction and embedding, defaulting to a standard model.
- llm_params (dict): Configuration parameters for integrating Large Language Models, if needed.
- vector_store_params (dict): Configuration for the vector storage, detailing how and where embeddings are stored.
- service_context_params (dict): Additional parameters for customizing the service context.
- query_engine_params (dict): Parameters to customize the behavior of the query engine.
- retriever_params (dict): Configuration for the document retriever, influencing how documents are retrieved based on queries.
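The file-selection parameters interact in a way that a small sketch may make easier to follow. The helper below is hypothetical and is not the library's implementation: select_text_files, its default values, and the assumption that "text files" means the .txt extension are all illustrative. It only mirrors the documented rules that input_files takes precedence over input_dir, and that exclude_hidden, recursive, and required_exts filter what is found in the directory.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def select_text_files(
    input_dir: Optional[str] = None,
    input_files: Optional[List[str]] = None,
    exclude_hidden: bool = True,   # assumed default, not stated in the docs
    recursive: bool = False,       # assumed default, not stated in the docs
    required_exts: Optional[List[str]] = None,
) -> List[Path]:
    """Hypothetical helper that mimics the documented selection rules."""
    # An explicit file list takes precedence over the directory.
    if input_files:
        return [Path(f) for f in input_files]
    if input_dir is None:
        raise ValueError("Provide either input_dir or input_files.")

    # Assumption: "defaulting to text files" means the .txt extension.
    exts = {e.lower() for e in (required_exts or [".txt"])}
    root = Path(input_dir)
    pattern = "**/*" if recursive else "*"

    selected = []
    for path in sorted(root.glob(pattern)):
        if not path.is_file():
            continue
        # Skip files that are hidden or sit under a hidden directory.
        rel_parts = path.relative_to(root).parts
        if exclude_hidden and any(part.startswith(".") for part in rel_parts):
            continue
        if path.suffix.lower() in exts:
            selected.append(path)
    return selected


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.txt").write_text("alpha")
    (root / ".secret.txt").write_text("hidden")
    (root / "notes.md").write_text("markdown")
    (root / "sub" / "b.txt").write_text("beta")

    print([p.name for p in select_text_files(input_dir=tmp)])
    print([p.name for p in select_text_files(input_dir=tmp, recursive=True)])
```

In this sketch the non-recursive call finds only the top-level .txt file, while the recursive call also picks up the file in the subdirectory; the hidden file and the .md file are skipped in both cases.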
Example Usage
Adding Text Files from a Directory
search_agent.add_text(
    input_dir="/path/to/text/files",
    recursive=True
)
This snippet scans the specified directory (and its subdirectories, if recursive is True) for text files and adds them to the search agent's database.
Adding Specific Text Files
search_agent.add_text(
    input_files=["/path/to/specific_file1.txt", "/path/to/specific_file2.txt"],
)
Here, the listed text files are added directly. If filename_as_id is False, the search agent generates a unique identifier for each document.
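To illustrate the difference filename_as_id makes, here is a minimal sketch of the two identifier strategies. The helper assign_doc_ids is hypothetical and not part of the library. It assumes the filename-based identifier is the file path as given and that generated identifiers are random UUIDs; the actual library may use different schemes.

```python
import uuid
from typing import Dict, List


def assign_doc_ids(files: List[str], filename_as_id: bool = False) -> Dict[str, str]:
    """Hypothetical helper: map each file path to a document identifier."""
    ids = {}
    for f in files:
        if filename_as_id:
            # Assumption: the path as given serves as a stable identifier.
            ids[f] = f
        else:
            # Assumption: generated identifiers are random UUIDs.
            ids[f] = str(uuid.uuid4())
    return ids


files = ["/path/to/specific_file1.txt", "/path/to/specific_file2.txt"]

# Filename-based identifiers are the same on every run.
print(assign_doc_ids(files, filename_as_id=True))

# Generated identifiers differ from one call to the next.
print(assign_doc_ids(files))
```

Under these assumptions, filename-based identifiers are convenient when the same files are added again later, because a document can be matched to its earlier entry, whereas generated identifiers are unique but not reproducible.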