BigQuery DataFrames
Interfaces for using Gretel with Google BigQuery. This module assumes that the bigframes package is already installed as a transitive dependency.
- class gretel_client.bigquery.BigFrames(gretel: Gretel)
This interface enables using Gretel Transforms, Gretel Synthetics, and Gretel Navigator with Google BigFrames.
- Parameters:
gretel – An instance of the Gretel interface. Import the class with from gretel_client import Gretel.
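As an illustration of the constructor, here is a minimal sketch. It assumes gretel_client is installed with BigQuery support and that Gretel accepts api_key and project_name keyword arguments (an assumption; check your client version). The helper name build_bigframes_client is hypothetical.

```python
def build_bigframes_client(api_key: str, project_name: str):
    """Return a BigFrames interface bound to a Gretel session.

    Imports are deferred so that this helper can be defined even where
    gretel_client is not installed.
    """
    from gretel_client import Gretel
    from gretel_client.bigquery import BigFrames

    # Assumption: Gretel() accepts these keyword arguments.
    gretel = Gretel(api_key=api_key, project_name=project_name)

    # BigFrames takes the Gretel instance as its only argument.
    return BigFrames(gretel)
```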
- display_dataframe_in_notebook(dataframe: DataFrame, settings: dict | None = None) None
Display a BigFrames DataFrame in a Notebook.
- Parameters:
dataframe – A BigFrames DataFrame
settings – Any valid settings that are accepted by the method pandas.DataFrame.style.set_properties
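A short sketch of passing display settings. It assumes client is a BigFrames interface instance and that the settings dict holds CSS property/value pairs of the kind the pandas Styler set_properties method accepts. The helper name show_left_aligned is hypothetical.

```python
def show_left_aligned(client, dataframe):
    """Render a BigFrames DataFrame in a notebook with left-aligned cells."""
    # Assumption: "text-align" is one example of a CSS property that
    # set_properties accepts; any other valid property could be used.
    client.display_dataframe_in_notebook(
        dataframe,
        settings={"text-align": "left"},
    )
```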
- fetch_generate_job_results(model_id: str, record_id: str) ModelGenerationResult
Given the Model ID and Job ID (record ID), return a ModelGenerationResult instance which allows for checking the generation job status and retrieving the generated data.
- fetch_train_job_results(model_id: str) ModelTrainResult
Given a Gretel Model ID, return a ModelTrainResult instance. This allows for checking model training status, retrieving model quality report(s) and retrieving generated data.
- fetch_transform_results(model_id: str) BigQueryTransformResults
Given a Transforms model ID, return a BigQueryTransformResults instance, which is used to retrieve transformed data and check job status.
Create an instance of Gretel’s Navigator API and store it on this instance. Only Navigator’s Tabular mode is supported.
- Parameters:
name – The name of the Navigator instance you want to use. When using this Navigator instance, you will reference this name.
The additional **kwargs are identical to what is supported in Gretel.factories.initialize_navigator_api().
Edit a BigQuery Table using Gretel Navigator.
- Parameters:
name – The name of a registered Navigator instance. This should have been created using the init_navigator() method.
The other *args and **kwargs are what is supported by TabularInferenceAPI.edit(). Streaming responses are not supported at this time.
Generate a BigQuery Table using Gretel Navigator.
- Parameters:
name – The name of a registered Navigator instance. This should have been created using the init_navigator() method.
The other *args and **kwargs are what is supported by TabularInferenceAPI.generate(). Streaming responses are not supported at this time.
- submit_generate(model_id: str, *, seed_data: DataFrame | None = None, wait: bool = False, **kwargs) ModelGenerationResult
Given a fine-tuned model ID, request the generation of more data.
If the model supports conditional generation, a partial DataFrame may be provided as input to inference. This method supports the same additional kwargs as Gretel.submit_generate().
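The call above might be used as in the following sketch. It assumes client is a BigFrames interface instance and that num_records is one of the extra kwargs forwarded to Gretel.submit_generate() (an assumption not stated here). The helper name generate_records is hypothetical.

```python
def generate_records(client, model_id, num_records, seed_data=None):
    """Request more data from a fine-tuned model and wait for the job."""
    # seed_data is optional: pass a partial BigFrames DataFrame only if
    # the model supports conditional generation.
    # Assumption: num_records is accepted via **kwargs.
    return client.submit_generate(
        model_id,
        seed_data=seed_data,
        wait=True,
        num_records=num_records,
    )
```

The returned ModelGenerationResult can be refreshed later with refresh() if wait is left as False.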
- submit_train(base_config: str | Path | dict, *, dataframe: DataFrame, wait: bool = False, **kwargs) ModelTrainResult
Fine-tune a Gretel model on an existing BigFrames DataFrame.
- Parameters:
base_config – Base Gretel config name, yaml file path, yaml string, or dict.
dataframe – The BigFrames DataFrame to use as the training data.
wait – If True, wait for the job to complete before returning.
- NOTE: The remaining kwargs are the same ones that are supported by Gretel.submit_train().
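A minimal sketch of training and then retrieving the report data. It assumes client is a BigFrames interface instance. The default config name "tabular-actgan" is an assumed example of a base Gretel config name, and the helper name train_and_fetch_report_data is hypothetical.

```python
def train_and_fetch_report_data(client, dataframe, base_config="tabular-actgan"):
    """Fine-tune a model on a BigFrames DataFrame and return the synthetic
    DataFrame that was generated for the model report."""
    # dataframe and wait are keyword-only in the documented signature.
    result = client.submit_train(base_config, dataframe=dataframe, wait=True)

    # ModelTrainResult exposes the report's synthetic data through
    # fetch_report_synthetic_data().
    return result.fetch_report_synthetic_data()
```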
- submit_transform(config: str | Path | dict, dataframe: DataFrame, *, wait: bool = False, **kwargs) BigQueryTransformResults
Run a Gretel Transform job against the provided dataframe. A Transform model will be created and then immediately used to apply row, column, or cell level transforms against a dataframe.
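A sketch of a blocking transform run. It assumes client is a BigFrames interface instance and config is a valid Transforms config (a name, YAML path, or dict) supplied by the caller. The helper name transform_dataframe is hypothetical.

```python
def transform_dataframe(client, config, dataframe):
    """Run a Transform job and return the transformed BigFrames DataFrame."""
    # With wait=True the call returns after the job completes.
    results = client.submit_transform(config, dataframe, wait=True)

    # transformed_df stays None unless the job succeeded.
    return results.transformed_df
```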
- class gretel_client.bigquery.BigQueryTransformResults(project: Project, model: Model, transform_logs: List[dict] | None = None, transformed_df: DataFrame | None = None, transformed_data_link: str | None = None, report: GretelReport | None = None)
Should not be used directly.
Stores metadata and a transformed BigFrames DataFrame that was created from a Gretel Transforms job.
- refresh() None
Refresh the transform job result attributes.
- transformed_df: bpd.DataFrame | None = None
A BigQuery DataFrame of the transformed table. This will not be populated until the Transforms job succeeds.
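Because transformed_df is only populated once the job succeeds, a job submitted without waiting can be polled with refresh(). The following is a sketch under that assumption. The helper name wait_for_transformed_df and its polling parameters are hypothetical.

```python
import time


def wait_for_transformed_df(results, poll_seconds=10.0, max_polls=60):
    """Poll a BigQueryTransformResults object until transformed_df is set.

    Raises TimeoutError if the DataFrame is still missing after max_polls
    refreshes, which also covers a job that failed.
    """
    for _ in range(max_polls):
        results.refresh()
        if results.transformed_df is not None:
            return results.transformed_df
        time.sleep(poll_seconds)
    raise TimeoutError("transformed_df was not populated in time")
```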
- class gretel_client.bigquery.JobLabel(value)
An enumeration.
- class gretel_client.bigquery.ModelGenerationResult(project: Project, model: Model, record_handler: RecordHandler, synthetic_data_link: str | None = None, synthetic_data: DataFrame | None = None)
Should not be used directly.
An instance of this class is returned when generating more data from an existing model or retrieving generated data from an existing model.
- refresh() None
Refresh the generate job results attributes.
- class gretel_client.bigquery.ModelTrainResult(project: Project, model: Model, model_config: dict | None = None, report: GretelReport | None = None, model_logs: List[dict] | None = None)
Should not be used directly.
An instance of this class is returned when creating a new synthetic model or retrieving an existing one.
- fetch_report_synthetic_data() DataFrame
Fetch the synthetic BigQuery DataFrame that was created as part of the model training process. This DataFrame is what is used to create the model report.