DataAnalyzr.get_data(
    db_type: Literal["files", "redshift", "postgres", "sqlite"],
    db_config: dict,
    vector_store_config: dict = {},
) -> None
This method retrieves data from a database or from files based on the provided configuration, and creates a vector store over the data. It must be called before performing any analysis.
The required keys in db_config depend on the specified db_type. The vector_store_config dictionary is optional and can be used to configure the vector store.
The method sets the df_dict, database_connector and vector_store attributes of the DataAnalyzr object; it does not return any value.
Parameters
db_type
Literal['files', 'redshift', 'postgres', 'sqlite']
required
The type of database to connect to.
db_config
dict
required
Configuration dictionary for the database connection. The expected keys depend on db_type. When db_type is "files":
db_config = {
"datasets" : [
{
"name" : "dataset1" ,
"value" : "path/to/dataset1.csv" ,
# files can be in .csv, .xlsx, .xls, and .json formats
},
{
"name" : "dataset2" ,
"value" : "path/to/dataset2.xlsx" ,
"kwargs" : { "sheet_name" : "Sheet1" },
# pass optional keyword arguments for reading the file
},
{
"name" : "dataset3" ,
"value" : pd.read_csv( "path/to/dataset3.csv" ),
# you can also pass pandas DataFrame objects
},
],
"db_path" : "path/to/construct/sqlite.db" , # optional
}
datasets
List of dictionaries containing the name and value of the datasets to load.
db_path
Location where a SQLite database should be created. Only relevant when analysis_type is "sql". Defaults to sqlite/<random-path>.db.
When db_type is "redshift" or "postgres":
db_config = {
    "host": "localhost",
    "port": 5432,
    "user": "username",
    "password": "password",
    "database": "dbname",
    "schema": ["schema_name1", "schema_name2"],  # optional
    "tables": ["table_name1", "table_name2"],  # optional
}
host
Hostname of the database server.
port
Port number of the database server.
user
Username for the database connection.
password
Password for the database connection.
database
Name of the database to connect to.
schema
List of schema names to load. Defaults to all schemas except information_schema and pg_catalog.
tables
List of table names to load. Defaults to all tables in the specified schemas.
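As a hedged sketch of how such a connection might be configured (the hostname, credentials, and table names below are placeholders, and data_analyzr is assumed to be an already-instantiated DataAnalyzr object):

```python
# Sketch only: placeholder connection details for a local Postgres instance.
db_config = {
    "host": "localhost",
    "port": 5432,
    "user": "username",
    "password": "password",
    "database": "dbname",
    "tables": ["table_name1"],  # optional; omit to load all tables
}

# With a reachable database and an instantiated DataAnalyzr object:
# data_analyzr.get_data(db_type="postgres", db_config=db_config)
```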
When db_type is "sqlite":
db_config = {
    "db_path": "path/to/sqlite.db",
}
db_path
Path to the SQLite database file.
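As a sketch of the sqlite case, the snippet below builds a throwaway SQLite file with the standard library and points db_config at it. The table name and data are purely illustrative, and the final get_data call (commented out) assumes an instantiated DataAnalyzr object:

```python
import os
import sqlite3
import tempfile

# Build a small throwaway SQLite database to point get_data at.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 95.5)],
)
conn.commit()
conn.close()

db_config = {"db_path": db_path}

# With an instantiated DataAnalyzr object:
# data_analyzr.get_data(db_type="sqlite", db_config=db_config)
```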
vector_store_config
dict
optional
Configuration dictionary for the vector store. For details on vector store usage and configuration, refer to the Vector Store guide.
vector_store_config = {
    "path": "path/to/vector_store",  # optional
    "remake_store": False  # optional
}
path
Path to the vector store. If no vector store is found at the specified path, a new one will be created. Defaults to vector_store/<random-path>.
remake_store
Whether to recreate the vector store. If set to True, the vector store will be recreated. Defaults to False.
Example usage
import pandas as pd

db_config = {
    "datasets": [
        {
            "name": "dataset1",
            "value": "path/to/dataset1.csv",
            # files can be in .csv, .xlsx, .xls, and .json formats
        },
        {
            "name": "dataset2",
            "value": "path/to/dataset2.xlsx",
            "kwargs": {"sheet_name": "Sheet1"},
            # pass optional keyword arguments for reading the file
        },
        {
            "name": "dataset3",
            "value": pd.read_csv("path/to/dataset3.csv"),
            # you can also pass pandas DataFrame objects
        },
    ],
    "db_path": "path/to/construct/sqlite.db",  # optional
}
vector_store_config = {
    "path": "path/to/vector_store",
    "remake_store": False,
}
data_analyzr.get_data(
    db_type="files",
    db_config=db_config,
    vector_store_config=vector_store_config,  # optional
)