Models

Classes and methods for working with Gretel Models

class gretel_client.projects.models.Model(project: None, model_config: str | Path | dict | None = None, model_id: str | None = None)

Represents a Gretel Model. This class can be used to train new models or to run and look up existing ones.

property artifact_types: List[str]

Returns a list of artifact types associated with the model.

property billing_details: dict

Get billing details for the current job.

cancel()

Cancels the active job.

property container_image: str

Return the container image for the job.

create_record_handler_obj(data_source: str | _DataFrameT | None = None, params: dict | None = None, ref_data: str | Dict[str, str] | List[str] | Tuple[str] | _DataFrameT | List[_DataFrameT] | None = None) RecordHandler

Creates a new record handler for the model.

Parameters:
  • data_source – A data source to upload to the record handler.

  • params – Any custom params for the record handler. These params are specific to the upstream model.
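
The call above might be used along the following lines. This is a minimal sketch, assuming `model` is an already trained Model instance and that the returned RecordHandler can be submitted with `submit_cloud()` in the same way as a Model. The helper name `start_generation` and the `num_records` param are illustrative assumptions; the params that are actually accepted depend on the upstream model.

```python
from typing import Any, Optional


def start_generation(model: Any, num_records: int, data_source: Optional[str] = None) -> Any:
    """Create a record handler for ``model`` and submit it to Gretel Cloud.

    ``num_records`` is an example param only (an assumption); which params
    are accepted depends on the upstream model type.
    """
    handler = model.create_record_handler_obj(
        data_source=data_source,
        params={"num_records": num_records},
    )
    # Assumption: the record handler is a job and is submitted like a model.
    handler.submit_cloud()
    return handler
```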

property data_source: str | _DataFrameT | None

Retrieves the configured data source from the model config.

If the model config has a local data_source, we’ll try to resolve that path relative to the location of the model config.
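
The relative-path resolution described above can be pictured with plain `pathlib`. This is an illustration of the idea only, not the client's actual implementation; the helper name `resolve_local_data_source` is hypothetical.

```python
from pathlib import Path
from typing import Union


def resolve_local_data_source(config_path: Union[str, Path], data_source: str) -> Path:
    """Resolve a data_source path relative to the model config's directory.

    Illustration only: this is not the client's actual implementation.
    Absolute data_source paths come back unchanged, because joining onto
    an absolute path discards the left-hand side.
    """
    return Path(config_path).parent / data_source
```

For example, a config at `configs/model.yml` that names `data/train.csv` as its data source would resolve to `configs/data/train.csv` under this sketch.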

delete() dict | None

Deletes the remote model.

download_artifacts(target_dir: str | Path)

Given a target directory, either as a string or a Path object, attempt to enumerate and download all artifacts associated with this Job.

Parameters:

target_dir – The target directory to store artifacts in. If the directory does not exist, it will be created for you.

property errors

Return any errors associated with the model.

property external_data_source: bool

Returns True if the data source is external to Gretel Cloud. If the data source is a Gretel Artifact, returns False.

property external_ref_data: bool

Returns True if the data refs are external to Gretel Cloud. If the data refs are Gretel Artifacts, returns False.

get_artifact_handle(artifact_key: str) BinaryIO

Returns a reference to a remote artifact that can be used to read binary data within a context manager.

>>> with job.get_artifact_handle("report_json") as file:
...   print(file.read())

Parameters:

artifact_key – Artifact type to download.

Returns:

A file-like object.
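
Building on the example above, the handle can be combined with the standard `json` module. This is a sketch under the assumption that the artifact content read from the handle is plain (already decompressed) JSON; the helper name `read_json_artifact` is hypothetical.

```python
import json
from typing import Any


def read_json_artifact(job: Any, artifact_key: str = "report_json") -> dict:
    """Read an artifact through its handle and parse it as JSON.

    Assumes the bytes returned by the handle are plain JSON text.
    """
    with job.get_artifact_handle(artifact_key) as fh:
        raw = fh.read()  # bytes, since the handle is a BinaryIO
    return json.loads(raw)
```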

get_artifact_link(artifact_key: str) str

Retrieves a signed S3 link that will download the specified artifact type.

Parameters:

artifact_key – Artifact type to download.

get_artifacts() Iterator[Tuple[str, str]]

List artifact links for all known artifact types.
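
Because the method returns an iterator of pairs, the results can be collected into a dictionary. This sketch assumes each tuple is ordered as (artifact type, link), which the signature does not state explicitly; the helper name `artifact_links` is hypothetical.

```python
from typing import Any, Dict


def artifact_links(model: Any) -> Dict[str, str]:
    """Collect the pairs yielded by get_artifacts() into a dictionary.

    Assumption: each pair is (artifact type, link).
    """
    return dict(model.get_artifacts())
```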

get_artifacts_by_artifact_types(artifact_types: List[str]) Iterator[Tuple[str, str]]

List artifact links for the given artifact types.

get_record_handlers() Iterator[RecordHandler]

Returns an iterator over the record handlers associated with the model.

get_report_summary(report_path: str | None = None) dict | None

Return a summary of the job results.

Parameters:

report_path – If a report_path is passed, that report will be used for the summary. If no report path is passed, the function will check for a cloud report artifact.

property instance_type: str

Returns CPU or GPU based on the model being trained.

property is_cloud_model

Returns True if the model was created to run in Gretel’s Cloud. False otherwise.

property logs

Returns run logs for the job.

property model_config: dict

Returns the model config used to create the model.

property model_type: str

Returns the type of model, e.g. synthetics, transforms, or classify.

property name: str | None

Gets the name of the model. If no name is specified, a random name will be selected when the model is submitted to the backend.

Getter:

Returns the model name.

Setter:

Sets the model name.

peek_report(report_path: str | None = None) dict | None

Return a summary of the job results.

Parameters:

report_path – If a report_path is passed, that report will be used for the summary. If no report path is passed, the function will check for a cloud based artifact.

poll_logs_status(wait: int = -1, callback: Callable | None = None) Iterator[LogStatus]

Returns an iterator that may be used to tail the logs of a running Model.

Parameters:
  • wait – The time in seconds to wait before closing the iterator. If wait is -1 (WAIT_UNTIL_DONE), the iterator will run until the model has reached a “completed” or “error” state.

  • callback – This function will be executed on every polling loop. A callback is useful for checking some external state that is working on a Job.
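
The iterator can be consumed with an ordinary `for` loop. The sketch below records each distinct status it sees; it assumes every yielded LogStatus item exposes a `status` attribute, which is not documented in this section. The helper name `tail_model_logs` and its `on_poll` argument are hypothetical.

```python
from typing import Any, Callable, List, Optional


def tail_model_logs(model: Any, wait: int = -1, on_poll: Optional[Callable] = None) -> List[str]:
    """Consume the polling iterator and return the distinct statuses in order."""
    seen: List[str] = []
    for update in model.poll_logs_status(wait=wait, callback=on_poll):
        # Assumption: each LogStatus item has a `status` attribute.
        status = str(getattr(update, "status", update))
        # Record a status only when it differs from the previous one.
        if not seen or seen[-1] != status:
            seen.append(status)
    return seen
```

With the default `wait=-1`, the loop ends only once the model reaches a “completed” or “error” state, so a positive `wait` may be preferable in interactive sessions.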

property print_obj: dict

Returns a printable object representation of the job.

project: Project

Project associated with the job.

property ref_data: RefData

Retrieves configured ref data from the model config. If there are local ref data sources, we will try to resolve those paths relative to the location of the model config.

refresh()

Update internal state of the job by making an API call to Gretel Cloud.

property runner_mode: str

Returns the runner_mode of the job. May be one of hybrid, manual or cloud.

property status: Status

The status of the job. One of the values of gretel_client.projects.jobs.Status.

submit(runner_mode: str | RunnerMode | None = None, dry_run: bool = False) Job

Submit this Job to the Gretel Cloud API.

Parameters:
  • runner_mode – Determines where to run the model. If not specified, the runner mode of the project (if configured) is used, otherwise the default runner mode of the session is used.

  • dry_run – If set to True the model config will be submitted for validation, but won’t be run. Ignored for record handlers.
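
A sketch of how the two parameters might be combined with `refresh()` and `status`, using only members documented in this section. The helper name `submit_and_report_status` is hypothetical, and skipping the refresh after a dry run is an assumption based on the note that a dry run is validated but not run.

```python
from typing import Any, Optional


def submit_and_report_status(model: Any, runner_mode: Optional[str] = None, dry_run: bool = False) -> Any:
    """Submit ``model`` and return its status, or None for a dry run."""
    model.submit(runner_mode=runner_mode, dry_run=dry_run)
    if dry_run:
        # Assumption: a dry run only validates the config, so no job
        # status is available afterwards.
        return None
    # Pull the latest job state from Gretel Cloud before reading status.
    model.refresh()
    return model.status
```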

submit_cloud(dry_run: bool = False) Job

Submit this Job to the Gretel Cloud API to be scheduled for running in Gretel Cloud.

Returns:

The response from the Gretel API.

submit_hybrid(dry_run: bool = False) Job

Submit this Job to the Gretel Cloud API to be scheduled for running in a hybrid deployment.

Returns:

The response from the Gretel API.

submit_local(dry_run: bool = False) Job

Submit this Job to the Gretel Cloud API to be scheduled for running in a local container.

Returns:

The response from the Gretel API.

submit_manual(dry_run: bool = False) Job

Submit this Job to the Gretel Cloud API, which will create the job metadata but no runner will be started. The Model instance can now be passed into a dedicated runner.

Returns:

The response from the Gretel API.

property traceback: str | None

Returns the traceback associated with any job errors.

upload_data_source(_validate: bool = True, _artifacts_handler: CloudArtifactsHandler | HybridArtifactsHandler | None = None) str | None

Resolves and uploads the data source specified in the model config.

If the data source is already a Gretel artifact, the artifact will not be uploaded.

Returns:

A Gretel artifact key.

upload_ref_data(_validate: bool = True, _artifacts_handler: ArtifactsHandler | None = None) RefData

Resolves and uploads ref data sources specified in the model config.

If the ref data are already Gretel artifacts, we’ll return the ref data as-is.

Returns:

A RefData instance that contains the new Gretel artifact values.

validate_data_source()

Tests that the attached data source is a valid CSV or JSON file. If the data source is a Gretel cloud artifact OR a hybrid artifact and the runner mode is hybrid, data validation will be skipped.

Raises:

worker_key: str | None

Worker key used to launch the job.

gretel_client.projects.models.read_model_config(model_config: str | Path | dict, *, base_url: str = 'https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/config_templates/gretel') dict

Load a Gretel configuration into a dictionary.

Parameters:
  • model_config – This argument may be a string path to a file on disk or a Gretel configuration template string such as “synthetics/default”. First, this function will treat string input as a location on disk and attempt to read the file and parse it as YAML or JSON. If this is successful, a dict of the config is returned. If the provided model_config str is not a file on disk, the function will attempt to resolve the config as a shortcut path from the URL provided by base_url.

  • base_url – A base HTTP URL that should be used to construct a fully qualified path to a configuration template. This URL will be used to resolve a config shortcut string to the fully qualified URL.
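
The lookup order described above can be sketched without reading or downloading anything. This is not the library's implementation: the helper name `classify_model_config` is hypothetical, passing a dict through unchanged is an assumption, and so is the `<base_url>/<shortcut>.yml` URL layout.

```python
from pathlib import Path
from typing import Tuple, Union


def classify_model_config(model_config: Union[str, Path, dict], base_url: str) -> Tuple[str, str]:
    """Report how a model_config value would be resolved, in lookup order."""
    if isinstance(model_config, dict):
        # Assumption: a dict is already a parsed config and needs no lookup.
        return ("dict", "")
    if Path(model_config).is_file():
        # A file on disk takes precedence over a template shortcut.
        return ("file", str(Path(model_config)))
    # Assumption: shortcuts map to "<base_url>/<shortcut>.yml".
    return ("template", f"{base_url.rstrip('/')}/{model_config}.yml")
```

Under these assumptions, a string such as “synthetics/default” that does not exist on disk would be classified as a template and paired with a URL below base_url.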