Models

Classes and methods for working with Gretel Models

class gretel_client.projects.models.Model(project: Project, model_config: str | Path | dict | None = None, model_id: str | None = None)

Represents a Gretel Model. This class can be used to train new models or run and lookup existing ones.

property artifact_types: List[str]: Returns a list of artifact types associated with the model.

property billing_details: dict: Get billing details for the current job.

cancel(): Cancels the active job.

property container_image: str: Return the container image for the job.

Creates a new record handler for the model.

Parameters:

data_source – A data source to upload to the record handler.
params – Any custom params for the record handler. These params are specific to the upstream model.

property data_source: str | _DataFrameT | None

Retrieves the configured data source from the model config.

If the model config has a local data_source, we'll try to resolve that path relative to the location of the model config.
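
The relative-path rule above can be illustrated with a small standalone sketch. It does not call gretel_client. The helper name resolve_data_source, and the rule that URLs and absolute paths pass through unchanged, are assumptions made for illustration only.

```python
from pathlib import Path


def resolve_data_source(config_path: str, data_source: str) -> str:
    """Resolve a local data_source relative to the model config's directory.

    Hypothetical helper: it mirrors the behavior described above and is not
    the library's implementation.
    """
    # Assumption: remote references (e.g. https://...) are left untouched.
    if "://" in data_source:
        return data_source
    candidate = Path(data_source)
    # Absolute paths need no resolution.
    if candidate.is_absolute():
        return str(candidate)
    # Relative paths are interpreted relative to the config file's folder.
    return str(Path(config_path).parent / candidate)


resolved = resolve_data_source("configs/synthetics.yml", "data/train.csv")
print(resolved)
```

Here the data file is looked up next to the config file rather than in the current working directory, which is the behavior the property description suggests.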

delete() → dict | None: Deletes the remote model.

download_artifacts(target_dir: str | Path)

Given a target directory, either as a string or a Path object, attempt to enumerate and download all artifacts associated with this Job.

Parameters: target_dir – The target directory to store artifacts in. If the directory does not exist, it will be created for you.

property errors: Return any errors associated with the model.

property external_data_source: bool: Returns True if the data source is external to Gretel Cloud. If the data source is a Gretel Artifact, returns False.

property external_ref_data: bool: Returns True if the data refs are external to Gretel Cloud. If the data refs are Gretel Artifacts, returns False.

get_artifact_handle(artifact_key: str) → BinaryIO

Returns a reference to a remote artifact that can be used to read binary data within a context manager:

>>> with job.get_artifact_handle("report_json") as file:
...   print(file.read())

Parameters: artifact_key – Artifact type to download.
Returns: A file-like object.
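
As a slightly fuller sketch of the same pattern, the helper below parses a JSON artifact read through get_artifact_handle. FakeJob and load_json_artifact are hypothetical names introduced so the example runs without a Gretel session. Only the get_artifact_handle(artifact_key) call and its use as a context manager come from the documentation above.

```python
import io
import json


def load_json_artifact(job, artifact_key: str) -> dict:
    """Read a JSON artifact through get_artifact_handle and parse it."""
    # The handle is a binary file-like object, so it is read inside a
    # context manager and closed automatically afterwards.
    with job.get_artifact_handle(artifact_key) as handle:
        return json.loads(handle.read())


class FakeJob:
    """Hypothetical stand-in for a Model; not part of gretel_client."""

    def get_artifact_handle(self, artifact_key: str) -> io.BytesIO:
        # A real Model would return a handle to the remote artifact instead.
        return io.BytesIO(b'{"artifact": "' + artifact_key.encode() + b'"}')


report = load_json_artifact(FakeJob(), "report_json")
print(report)
```

With a real Model instance, the same helper would be called as load_json_artifact(model, "report_json"), assuming that artifact type exists for the model.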

get_artifact_link(artifact_key: str) → str

Retrieves a signed S3 link that will download the specified artifact type.

Parameters: artifact_key – Artifact type to download.

get_artifacts() → Iterator[Tuple[str, str]]: List artifact links for all known artifact types.

get_artifacts_by_artifact_types(artifact_types: List[str]) → Iterator[Tuple[str, str]]: List artifact links for the given artifact types.
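
Because both methods return an iterator of two-element tuples, the results can be collected into a dictionary. The sketch below assumes each tuple is ordered as (artifact type, link), which the signature does not state explicitly. FakeJob and its example URLs are hypothetical stand-ins so the snippet runs offline.

```python
from typing import Dict, Iterator, Tuple


def artifact_links(job) -> Dict[str, str]:
    """Collect the (artifact type, link) pairs from get_artifacts() into a dict."""
    # Assumption: the first tuple element is the artifact type and the
    # second is the download link.
    return dict(job.get_artifacts())


class FakeJob:
    """Hypothetical stand-in for a Model; not part of gretel_client."""

    def get_artifacts(self) -> Iterator[Tuple[str, str]]:
        yield "report_json", "https://example.com/report.json"
        yield "data_preview", "https://example.com/preview.csv"


links = artifact_links(FakeJob())
print(sorted(links))
```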

get_record_handlers() → Iterator[RecordHandler]: Returns an iterator over the record handlers associated with the model.

get_report_summary(report_path: str | None = None) → dict | None

Return a summary of the job results.

Parameters: report_path – If a report_path is passed, that report will be used for the summary. If no report path is passed, the function will check for a cloud report artifact.

property instance_type: str: Returns CPU or GPU based on the model being trained.

property is_cloud_model: Returns True if the model was created to run in Gretel’s Cloud. False otherwise.

property logs: Returns run logs for the job.

property model_config: dict: Returns the model config used to create the model.

property model_type: str: Returns the type of model, e.g. synthetics, transforms, or classify.

property name: str | None

Gets the name of the model. If no name is specified, a random name will be selected when the model is submitted to the backend.

Getter: Returns the model name.
Setter: Sets the model name.

peek_report(report_path: str | None = None) → dict | None

Return a summary of the job results.

Parameters: report_path – If a report_path is passed, that report will be used for the summary. If no report path is passed, the function will check for a cloud-based artifact.

poll_logs_status(wait: int = -1, callback: Callable | None = None) → Iterator[LogStatus]

Returns an iterator that may be used to tail the logs of a running Model.

Parameters:

wait – The time in seconds to wait before closing the iterator. If wait is -1 (WAIT_UNTIL_DONE), the iterator will run until the model has reached a “completed” or “error” state.
callback – This function will be executed on every polling loop. A callback is useful for checking some external state that is working on a Job.
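
The wait and callback semantics described above can be sketched with a generic polling generator. This is not the library's implementation. The names poll_status and get_status, the zero-argument callback signature, and the interval parameter are all assumptions made for illustration.

```python
import time
from typing import Callable, Iterator, Optional

WAIT_UNTIL_DONE = -1  # sentinel value named in the parameter description
END_STATES = {"completed", "error"}


def poll_status(
    get_status: Callable[[], str],
    wait: int = WAIT_UNTIL_DONE,
    callback: Optional[Callable[[], None]] = None,
    interval: float = 0.0,
) -> Iterator[str]:
    """Yield statuses until an end state is reached or `wait` seconds pass."""
    start = time.monotonic()
    while True:
        # A non-negative wait caps how long the iterator stays open.
        if wait >= 0 and time.monotonic() - start > wait:
            return
        # The callback runs once per polling loop (assumed to take no args).
        if callback is not None:
            callback()
        status = get_status()
        yield status
        if status in END_STATES:
            return
        time.sleep(interval)


states = iter(["pending", "active", "completed"])
ticks = []
seen = list(poll_status(lambda: next(states), callback=lambda: ticks.append(1)))
print(seen, len(ticks))
```

With wait left at -1 the loop only ends when a terminal state is observed, which is why a callback is a convenient place to check external conditions and raise if polling should stop early.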

property print_obj: dict: Returns a printable object representation of the job.

project: Project: Project associated with the job.

property ref_data: RefData: Retrieves configured ref data from the model config. If there are local ref data sources, we will try to resolve their paths relative to the location of the model config.

refresh(): Update internal state of the job by making an API call to Gretel Cloud.

property runner_mode: str: Returns the runner_mode of the job. May be one of hybrid, manual or cloud.

property status: Status: The status of the job. One of the gretel_client.projects.jobs.Status values.

submit(runner_mode: str | RunnerMode | None = None, dry_run: bool = False) → Job

Submit this Job to the Gretel Cloud API.

Parameters:

runner_mode – Determines where to run the model. If not specified, the runner mode of the project (if configured) is used, otherwise the default runner mode of the session is used.
dry_run – If set to True the model config will be submitted for validation, but won’t be run. Ignored for record handlers.
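
The fallback order described for runner_mode can be written out as a small pure function. This is a sketch of the documented precedence only. pick_runner_mode and its parameter names are hypothetical and do not exist in gretel_client.

```python
from typing import Optional


def pick_runner_mode(
    requested: Optional[str],
    project_mode: Optional[str],
    session_default: str,
) -> str:
    """Return the runner mode using the precedence described above."""
    # 1. An explicitly passed runner_mode always wins.
    if requested is not None:
        return requested
    # 2. Otherwise fall back to the project's runner mode, if configured.
    if project_mode is not None:
        return project_mode
    # 3. Finally use the session's default runner mode.
    return session_default


print(pick_runner_mode(None, "hybrid", "cloud"))
print(pick_runner_mode(None, None, "cloud"))
```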

submit_cloud(dry_run: bool = False) → Job

Submit this Job to the Gretel Cloud API to be scheduled for running in Gretel Cloud.

Returns: The response from the Gretel API.

submit_hybrid(dry_run: bool = False) → Job

Submit this Job to the Gretel Cloud API to be scheduled for running in a hybrid deployment.

Returns: The response from the Gretel API.

submit_local(dry_run: bool = False) → Job

Submit this Job to the Gretel Cloud API to be scheduled for running in a local container.

Returns: The response from the Gretel API.

submit_manual(dry_run: bool = False) → Job

Submit this Job to the Gretel Cloud API, which will create the job metadata but no runner will be started. The Model instance can now be passed into a dedicated runner.

Returns: The response from the Gretel API.

property traceback: str | None: Returns the traceback associated with any job errors.

upload_data_source(_validate: bool = True, _artifacts_handler: CloudArtifactsHandler | HybridArtifactsHandler | None = None) → str | None

Resolves and uploads the data source specified in the model config.

If the data source is already a Gretel artifact, the artifact will not be uploaded.

Returns: A Gretel artifact key.

upload_ref_data(_validate: bool = True, _artifacts_handler: ArtifactsHandler | None = None) → RefData

Resolves and uploads ref data sources specified in the model config.

If the ref data are already Gretel artifacts, we’ll return the ref data as-is.

Returns: A RefData instance that contains the new Gretel artifact values.

validate_data_source()

Tests that the attached data source is a valid CSV or JSON file. If the data source is a Gretel cloud artifact OR a hybrid artifact and the runner mode is hybrid, data validation will be skipped.

Raises:

DataSourceError – file can’t be opened.
DataValidationError – the data isn’t valid CSV or JSON.

worker_key: str | None: Worker key used to launch the job.

gretel_client.projects.models.read_model_config(model_config: str | Path | dict, *, base_url: str = 'https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/config_templates/gretel') → dict

Load a Gretel configuration into a dictionary.

Parameters:

model_config – This argument may be a path to a file on disk or a Gretel configuration template string such as “synthetics/default”. First, this function will treat string input as a location on disk and attempt to read the file and parse it as YAML or JSON. If this is successful, a dict of the config is returned. If the provided model_config string is not a file on disk, the function will attempt to resolve the config as a shortcut path from the URL provided by base_url.
base_url – A base HTTP URL that should be used to construct a fully qualified path to a configuration template. This URL will be used to resolve a config shortcut string to the fully qualified URL.
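
The lookup order described for model_config (local file first, then template shortcut) can be sketched as follows. This only decides where a config would be loaded from and performs no network access. The helper name locate_config and the ".yml" suffix appended to the shortcut are assumptions, not confirmed details of read_model_config.

```python
from pathlib import Path
from typing import Tuple, Union

DEFAULT_BASE_URL = (
    "https://raw.githubusercontent.com/gretelai/gretel-blueprints/"
    "main/config_templates/gretel"
)


def locate_config(
    model_config: Union[str, Path],
    base_url: str = DEFAULT_BASE_URL,
) -> Tuple[str, str]:
    """Return ("file", path) or ("url", url) for a config reference.

    Hypothetical helper that mirrors the documented lookup order.
    """
    path = Path(model_config)
    # First, treat the input as a location on disk.
    if path.is_file():
        return "file", str(path)
    # Otherwise, treat it as a shortcut relative to base_url.
    # Assumption: templates are stored with a ".yml" extension.
    return "url", f"{base_url}/{model_config}.yml"


kind, location = locate_config("synthetics/default")
print(kind, location)
```

Passing a different base_url would point the same shortcut string at another template repository, which appears to be the purpose of that keyword argument.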