DataAnalyzr.get_data(
    db_type: Literal["files", "redshift", "postgres", "sqlite"],
    db_config: dict,
    vector_store_config: dict = {},
) -> None

This method loads data from files or from a Redshift, PostgreSQL, or SQLite database, as specified by db_type and db_config, and creates a vector store for that data. It must be called before performing any analysis.

  1. The required keys in the db_config dictionary depend on the specified db_type.
  2. The vector_store_config dictionary is optional and can be used to configure the vector store.
  3. Sets the df_dict, database_connector and vector_store attributes of the DataAnalyzr object.
  4. The method does not return any value.
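
Because the required db_config keys vary with db_type, it can help to check a configuration before calling get_data. The following is a minimal sketch, not part of the library: the helper name missing_db_config_keys and the REQUIRED_KEYS table are hypothetical, and the key sets are inferred from the example configurations on this page (keys marked optional there are left out). The library's own validation may differ.

```python
from typing import Any

# Assumption: required keys inferred from the example configurations on this
# page; keys marked "optional" there are omitted. Not the library's own rules.
REQUIRED_KEYS: dict[str, set[str]] = {
    "files": {"datasets"},
    "redshift": {"host", "port", "user", "password", "database"},
    "postgres": {"host", "port", "user", "password", "database"},
    "sqlite": {"db_path"},
}


def missing_db_config_keys(db_type: str, db_config: dict[str, Any]) -> list[str]:
    """Return the sorted required keys that are absent from db_config."""
    if db_type not in REQUIRED_KEYS:
        raise ValueError(f"unsupported db_type: {db_type!r}")
    return sorted(REQUIRED_KEYS[db_type] - set(db_config))
```

An empty list means every key assumed to be required is present; anything else names the keys to add before the configuration is passed to get_data.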

Parameters

db_type
Literal['files', 'redshift', 'postgres', 'sqlite']
required

The type of database to connect to.

db_type = "postgres"

db_config
dictionary
required

Configuration dictionary for the database connection.

When db_type is files:

import pandas as pd  # needed only for the DataFrame entry (dataset3)

db_config = {
    "datasets": [
        {
            "name": "dataset1",
            "value": "path/to/dataset1.csv",
            # files can be in .csv, .xlsx, .xls, and .json formats
        },
        {
            "name": "dataset2",
            "value": "path/to/dataset2.xlsx",
            "kwargs": {"sheet_name": "Sheet1"},
            # pass optional keyword arguments for reading the file
        },
        {
            "name": "dataset3",
            "value": pd.read_csv("path/to/dataset3.csv"),
            # you can also pass pandas DataFrame objects
        },
    ],
    "db_path": "path/to/construct/sqlite.db", # optional
}
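
When many files are involved, the "datasets" list can be generated rather than written by hand. The sketch below assumes a hypothetical helper, datasets_from_paths, that names each dataset after its file stem and rejects extensions outside the formats listed in the configuration above. The example paths are placeholders.

```python
from pathlib import Path

# File formats listed as supported in the files configuration.
SUPPORTED_SUFFIXES = {".csv", ".xlsx", ".xls", ".json"}


def datasets_from_paths(paths: list[str]) -> list[dict[str, str]]:
    """Build "datasets" entries, naming each dataset after its file stem."""
    datasets = []
    for raw in paths:
        if Path(raw).suffix.lower() not in SUPPORTED_SUFFIXES:
            raise ValueError(f"unsupported file format: {raw}")
        datasets.append({"name": Path(raw).stem, "value": raw})
    return datasets


# Placeholder paths; the files are not opened by this helper.
db_config = {"datasets": datasets_from_paths(["data/sales.csv", "data/users.json"])}
```

Entries that need "kwargs" or that hold a DataFrame can still be appended to the resulting list manually.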

When db_type is redshift or postgres:

db_config = {
    "host": "localhost",
    "port": 5432,
    "user": "username",
    "password": "password",
    "database": "dbname",
    "schema": ["schema_name1", "schema_name2"], # optional
    "tables": ["table_name1", "table_name2"], # optional
}
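
Hardcoding credentials as in the example above is fine for illustration, but in practice they are often read from the environment. The following sketch shows one way to do that. The function db_config_from_env and the ANALYZR_DB_* variable names are assumptions made for this example, not something the library defines.

```python
import os


def db_config_from_env(prefix: str = "ANALYZR_DB_") -> dict:
    """Build a redshift/postgres db_config from environment variables.

    The variable names (<prefix>HOST, <prefix>PORT, ...) are hypothetical.
    """
    env = os.environ
    config = {
        "host": env.get(f"{prefix}HOST", "localhost"),
        # 5432 is the PostgreSQL default; Redshift clusters default to 5439.
        "port": int(env.get(f"{prefix}PORT", "5432")),
        "user": env[f"{prefix}USER"],          # KeyError if unset
        "password": env[f"{prefix}PASSWORD"],  # KeyError if unset
        "database": env[f"{prefix}NAME"],      # KeyError if unset
    }
    # Optional keys: comma-separated lists, included only when set.
    for key in ("schema", "tables"):
        raw = env.get(f"{prefix}{key.upper()}")
        if raw:
            config[key] = [item.strip() for item in raw.split(",") if item.strip()]
    return config
```

The returned dictionary has the same shape as the example above and can be passed as db_config with db_type set to "redshift" or "postgres".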

When db_type is sqlite:

db_config = {
    "db_path": "path/to/sqlite.db",
}

vector_store_config
dictionary

Configuration dictionary for the vector store.

vector_store_config = {
    "path": "path/to/vector_store", # optional
    "remake_store": False # optional
}
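
Since both keys are optional, one approach is to include only the keys that are explicitly set and leave the rest to the library's defaults. The helper below, make_vector_store_config, is a hypothetical convenience for this page and not part of the library.

```python
from typing import Optional


def make_vector_store_config(
    path: Optional[str] = None,
    remake_store: Optional[bool] = None,
) -> dict:
    """Return a new vector_store_config containing only the keys that were set."""
    config: dict = {}
    if path is not None:
        config["path"] = path
    if remake_store is not None:
        config["remake_store"] = remake_store
    return config
```

Each call returns a fresh dictionary, so one configuration is never shared between calls by accident.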

For details on vector store usage and configuration, refer to the Vector Store guide.

Example usage