Quality Report

class gretel_client.evaluation.quality_report.QualityReport(*, project: Project | None = None, name: str | None = None, data_source: str | Path | DataFrame, ref_data: Path | str | DataFrame, output_dir: str | Path | None = None, runner_mode: RunnerMode | None = None, record_count: int | None = 5000, correlation_column_count: int | None = 75, column_count: int | None = 250, mandatory_columns: List[str] | None = [], session: ClientConfig | None = None, test_data: Path | str | DataFrame | None = None)

Represents a Quality Report. This class can be used to create a report.

Parameters:
  • project – Optional project associated with the report. If no project is passed, a temp project (gretel_client.projects.projects.tmp_project) will be used.

  • data_source – Data source used for the report.

  • ref_data – Reference data used for the report.

  • output_dir – Optional directory path to write the report to. If the directory does not exist, the path will be created for you.

  • runner_mode – Determines where to run the model. See gretel_client.config.RunnerMode for a list of valid modes. Manual mode is not explicitly supported.

  • record_count – Number of rows to use from the data sets, 5000 by default. A value of 0 means “use as many rows/columns as possible.” We still attempt to maintain parity between the data sets for “fair” comparisons, i.e. we will use min(len(train), len(synth)) rows.

  • correlation_column_count – Similar to record_count, but for number of columns used for correlation calculations.

  • column_count – Similar to record_count, but for number of columns used for all other calculations.

  • mandatory_columns – Use in conjunction with correlation_column_count and column_count. The columns listed will be included in the sample of columns. Any additional requested columns will be selected randomly.

  • session – The client session to use, or None to use the session associated with the project (if any), or the default session otherwise.

  • test_data – Optional reference data used for the Privacy Metrics of the report.
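
The sampling rules described for record_count, correlation_column_count, column_count and mandatory_columns can be sketched in plain Python. This is an illustration only, not the library's implementation. The helper names effective_row_count and pick_columns are hypothetical. Two things are assumed from the parameter descriptions rather than confirmed: a limit of 0 also means "no limit" for columns, and mandatory columns are kept even when they outnumber the limit.

```python
import random
from typing import List, Optional, Sequence


def effective_row_count(n_train: int, n_synth: int, record_count: int = 5000) -> int:
    """Rows to compare from each data set (hypothetical helper)."""
    # Parity: never use more rows than the smaller data set provides.
    available = min(n_train, n_synth)
    # 0 means "use as many rows as possible".
    if record_count == 0:
        return available
    return min(record_count, available)


def pick_columns(
    columns: Sequence[str],
    limit: int,
    mandatory: Optional[Sequence[str]] = None,
    seed: Optional[int] = None,
) -> List[str]:
    """Choose up to `limit` columns, always keeping `mandatory` (hypothetical helper)."""
    required = list(mandatory or [])
    missing = [c for c in required if c not in columns]
    if missing:
        raise ValueError(f"mandatory columns not in data: {missing}")
    # Assumption: 0 means "no limit", mirroring record_count.
    if limit == 0 or limit >= len(columns):
        return list(columns)
    # Fill the remaining slots with a random sample of the other columns.
    # Assumption: if there are more mandatory columns than slots, keep them all.
    others = [c for c in columns if c not in required]
    extra = max(limit - len(required), 0)
    rng = random.Random(seed)
    return required + rng.sample(others, extra)
```

With 8000 training rows and 3000 synthetic rows, effective_row_count returns 3000 under the default record_count, because the smaller data set caps the comparison.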

property as_dict: Dict[str, Any]

Returns a dictionary representation of the report.

property as_html: str

Returns an HTML representation of the report.
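
If you want to persist the values of as_dict and as_html yourself, for example when no output_dir was given, a small helper along these lines may be enough. It is a sketch using only the standard library. The function name save_report and the file names report.json and report.html are hypothetical, and the helper takes the two values as plain arguments rather than calling the client.

```python
import json
from pathlib import Path
from typing import Any, Dict


def save_report(report_dict: Dict[str, Any], report_html: str, output_dir: Path) -> Dict[str, Path]:
    """Write a report dictionary and its HTML to output_dir (hypothetical helper)."""
    # Create the directory, including parents, if it does not exist yet.
    output_dir.mkdir(parents=True, exist_ok=True)
    json_path = output_dir / "report.json"  # hypothetical file name
    html_path = output_dir / "report.html"  # hypothetical file name
    # default=str keeps the dump from failing on values JSON cannot encode natively.
    json_path.write_text(json.dumps(report_dict, indent=2, default=str), encoding="utf-8")
    html_path.write_text(report_html, encoding="utf-8")
    return {"json": json_path, "html": html_path}
```

Given a report object, the call would look like save_report(report.as_dict, report.as_html, Path("reports")).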

data_source: DataSourceTypes

Data source used for the report.

output_dir: Path | None

Optional directory path to write the report to. If the directory does not exist, the path will be created for you.

peek() → Dict[str, Any] | None

Returns a dictionary representation of the top level report scores.
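
Because the return type allows None, callers should check for it before reading scores. The sketch below shows one way to do that. The helper name format_peek is hypothetical, and it makes no assumption about which keys the dictionary contains, since they are not listed here.

```python
from typing import Any, Dict, Optional


def format_peek(scores: Optional[Dict[str, Any]]) -> str:
    """Render the result of peek() as text, tolerating None (hypothetical helper)."""
    if scores is None:
        # The conditions under which peek() returns None are not stated here.
        return "No scores available."
    # One "key: value" line per top-level entry, in dictionary order.
    return "\n".join(f"{key}: {value}" for key, value in scores.items())
```

Given a report object, print(format_peek(report.peek())) prints either the placeholder message or one line per top-level score.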

project: Project | None

Optional project associated with the report. If no project is passed, a temp project (tmp_project) will be used.

ref_data: RefDataTypes

Reference data used for the report.

runner_mode: RunnerMode

Determines where to run the model. See RunnerMode for a list of valid modes. Manual mode is not explicitly supported.

test_data: RefDataTypes

Additional optional test data used for MQS reports.