Interface
- class gretel_client.gretel.interface.Gretel(*, project_name: str | None = None, project_display_name: str | None = None, **session_kwargs)
High-level interface for interacting with Gretel’s APIs.
An instance of this class is bound to a single Gretel project. If a project name is not provided at instantiation, a new project will be created with the first job submission. You can change projects using the set_project method.
- Parameters:
project_name (str) – Name of new or existing project. If a new project name is given, it will be created at instantiation. If no name is given, a new randomly-named project will be created with the first job submission.
project_display_name (str) – Project display name. If None, will use the project name. This argument is only used when creating a new project.
**session_kwargs – kwargs for your Gretel session. See options below.
- Keyword Arguments:
api_key (str) – Your Gretel API key. If set to “prompt” and no API key is found on the system, you will be prompted for the key.
endpoint (str) – Specifies the Gretel API endpoint. This must be a fully qualified URL. The default is “https://api.gretel.cloud”.
default_runner (str) – Specifies the runner mode. Must be one of “cloud”, “local”, “manual”, or “hybrid”. The default is “cloud”.
artifact_endpoint (str) – Specifies the endpoint for project and model artifacts. Defaults to “cloud” for running in Gretel Cloud. If working in hybrid mode, set to the URL of your artifact storage bucket.
cache (str) – Valid options are “yes” or “no”. If set to “no”, the session configuration will not be written to disk. If set to “yes”, the session configuration will be written to disk only if one doesn’t already exist. The default is “no”.
validate (bool) – If True, will validate the login credentials at instantiation. The default is False.
clear (bool) – If True, existing Gretel credentials will be removed. The default is False.
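As a rough illustration of the session options above, the sketch below collects them into keyword arguments and checks the two string options against their documented values before constructing the interface. The helpers `build_session_kwargs` and `connect` are hypothetical and not part of gretel_client; only the option names, defaults, and allowed values come from the list above.

```python
VALID_RUNNERS = {"cloud", "local", "manual", "hybrid"}
VALID_CACHE = {"yes", "no"}


def build_session_kwargs(api_key="prompt", default_runner="cloud",
                         cache="no", validate=False, **extra):
    """Hypothetical helper: check values against the documented options."""
    if default_runner not in VALID_RUNNERS:
        raise ValueError(f"default_runner must be one of {sorted(VALID_RUNNERS)}")
    if cache not in VALID_CACHE:
        raise ValueError('cache must be "yes" or "no"')
    kwargs = {
        "api_key": api_key,
        "default_runner": default_runner,
        "cache": cache,
        "validate": validate,
    }
    kwargs.update(extra)  # e.g. endpoint, artifact_endpoint, clear
    return kwargs


def connect(project_name=None, **session_options):
    """Hypothetical helper: build the kwargs and instantiate Gretel."""
    # Imported lazily so build_session_kwargs works without gretel_client installed.
    from gretel_client import Gretel

    return Gretel(project_name=project_name,
                  **build_session_kwargs(**session_options))
```

With these helpers, `connect("my-project", validate=True)` would create an interface bound to `my-project` and check the credentials at instantiation.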
- fetch_generate_job_results(model_id: str, record_id: str) -> GenerateJobResults
Fetch the results object from a Gretel generate job.
- Parameters:
model_id – The Gretel model ID.
record_id – The Gretel record handler ID.
- Raises:
GretelProjectNotSetError – If a project has not been set.
- Returns:
Job results including the model object, record handler, and synthetic data.
- fetch_model(model_id: str) -> Model
Fetch a Gretel model using its ID.
You must set a project before calling this method.
- Parameters:
model_id – The Gretel model ID.
- Raises:
GretelProjectNotSetError – If a project has not been set.
- Returns:
The Gretel model object.
- fetch_train_job_results(model_id: str) -> TrainJobResults
Fetch the results object from a Gretel training job.
You must set a project before calling this method.
- Parameters:
model_id – The Gretel model ID.
- Raises:
GretelProjectNotSetError – If a project has not been set.
- Returns:
Job results including the model object, report, logs, and final config.
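The fetch methods above can be used to recover results from earlier jobs when only their IDs are known. The following is a minimal sketch, assuming `gretel` is a Gretel instance with a project already set; the `fetch_job_results` helper is hypothetical and simply dispatches to the two documented methods.

```python
def fetch_job_results(gretel, model_id, record_id=None):
    """Hypothetical helper: pick the fetch method based on which IDs are known.

    `gretel` is expected to be a Gretel instance with a project already set.
    """
    if record_id is None:
        # Only the model ID is known: fetch the training job results.
        return gretel.fetch_train_job_results(model_id)
    # A record handler ID identifies a specific generate job on that model.
    return gretel.fetch_generate_job_results(model_id, record_id)
```

Both underlying methods raise GretelProjectNotSetError if no project has been set, so the helper does not add its own check.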
- get_project() -> Project
Returns the current Gretel project.
If a project has not been set, a new one will be created.
- set_project(name: str | None = None, desc: str | None = None, display_name: str | None = None)
Set the current Gretel project.
If a project with the given name does not exist, it will be created. If the name is not unique, the user ID will be appended to the name.
- Parameters:
name – Name of new or existing project. If None, a new randomly-named project will be created.
desc – Project description.
display_name – Project display name. If None, will use project name.
- Raises:
ApiException – If an error occurs while creating the project.
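Switching projects and then reading back the active project might look like the sketch below. The `use_project` helper is hypothetical; it only chains the documented `set_project` and `get_project` calls on an existing Gretel instance.

```python
def use_project(gretel, name, display_name=None):
    """Hypothetical helper: switch projects and return the active one."""
    # set_project creates the project if no project with this name exists.
    gretel.set_project(name=name, display_name=display_name)
    # get_project returns the project that was just set.
    return gretel.get_project()
```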
- submit_generate(model_id: str, *, num_records: int | None = None, seed_data: str | Path | _DataFrameT | None = None, wait: bool = True, fetch_data: bool = True, **generate_kwargs) -> GenerateJobResults
Submit a Gretel model generate job.
Only one of num_records or seed_data can be provided. The former will generate a complete synthetic dataset, while the latter will conditionally generate synthetic data based on the seed data.
- Parameters:
model_id – The Gretel model ID.
num_records – Number of records to generate.
seed_data – Seed data source as a file path or pandas DataFrame.
wait – If True, wait for the job to complete before returning.
fetch_data – If True, fetch the synthetic data as a DataFrame.
- Raises:
GretelJobSubmissionError – If the combination of arguments is invalid.
- Returns:
Job results including the model object, record handler, and synthetic data.
Examples:
# Generate a synthetic dataset with 1000 records.
# model_id is the ID of a previously trained model.
from gretel_client import Gretel

gretel = Gretel(project_name="my-project")
generated = gretel.submit_generate(model_id, num_records=1000)

# Conditionally generate synthetic examples of a rare class.
import pandas as pd
from gretel_client import Gretel

gretel = Gretel(project_name="my-project")
df_seed = pd.DataFrame(["rare_class"] * 1000, columns=["field_name"])
generated = gretel.submit_generate(model_id, seed_data=df_seed)
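The rule that num_records and seed_data are mutually exclusive can be checked before submitting a job. This is a minimal sketch of such a pre-check, not the library's own validation: `check_generate_args` is a hypothetical helper, it raises ValueError rather than GretelJobSubmissionError, and it leaves the case where neither argument is given for the library to handle.

```python
def check_generate_args(num_records=None, seed_data=None):
    """Hypothetical pre-check mirroring the documented rule."""
    if num_records is not None and seed_data is not None:
        raise ValueError("Provide num_records or seed_data, not both.")
    # Report which generation mode the arguments select.
    if seed_data is not None:
        return "conditional"
    if num_records is not None:
        return "unconditional"
    # Neither argument given: this sketch does not decide whether that is valid.
    return "unspecified"
```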
- submit_train(base_config: str, *, data_source: str | Path | _DataFrameT | None, job_label: str | None = None, wait: bool = True, **non_default_config_settings) -> TrainJobResults
Submit a Gretel model training job.
Training jobs are configured by updating a base config, which can be given as a yaml file path or as the name of one of the Gretel base config files (without the extension) listed here: https://github.com/gretelai/gretel-blueprints/tree/main/config_templates/gretel/synthetics
- Parameters:
base_config – Gretel base config name or yaml config file path.
data_source – Training data source as a file path or pandas DataFrame.
job_label – Descriptive label to append to the job name.
wait – If True, wait for the job to complete before returning.
**non_default_config_settings – Config settings to override in the template. The format is section={"setting": "value"}, where section is the name of a yaml section within the specific model settings, e.g. params or privacy_filters. If the parameter is not nested within a section, pass it directly as a keyword argument.
- Returns:
Job results including the model object, report, logs, and final config.
Example:
from gretel_client import Gretel

gretel = Gretel(project_name="my-project")
trained = gretel.submit_train(
    base_config="tabular-actgan",
    data_source="data.csv",
    params={"epochs": 100, "generator_dim": [128, 128]},
    privacy_filters={"similarity": "high", "outliers": None},
)
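To make the section={"setting": "value"} override format concrete, the sketch below shows one way such keyword arguments could map onto the model settings of a base config. It is an illustration only and not how gretel_client implements the merge: `apply_overrides` is a hypothetical function, and the base values are placeholders rather than the contents of any real template.

```python
import copy


def apply_overrides(model_settings, **overrides):
    """Hypothetical illustration of how section overrides map onto settings."""
    merged = copy.deepcopy(model_settings)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested section (e.g. params): update only the named settings.
            merged[key].update(value)
        else:
            # Setting that is not nested in a section: set it directly.
            merged[key] = value
    return merged


# Placeholder base settings, not taken from a real Gretel template.
base = {
    "params": {"epochs": "auto", "generator_dim": [1024, 1024]},
    "privacy_filters": {"similarity": "auto", "outliers": "auto"},
}
final = apply_overrides(
    base,
    params={"epochs": 100},
    privacy_filters={"outliers": None},
)
```

In this sketch, settings that are not named in an override keep their base values, and the base dictionary itself is left unmodified.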