Visualizer

class qsar.utils.visualizer.Visualizer(figsize: Tuple[int, int] = (10, 6))

Bases: object

A class to visualize various aspects of QSAR models.

static display_atom_count_distribution(atom_counts)

Plot the distribution of atom counts in a dataset.

Parameters:

atom_counts (List[int] or similar) – List or array of atom counts.
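The distribution that this method plots can be previewed without any plotting backend. The following is a minimal sketch, not the library's implementation: the helpers `tally_atom_counts` and `render_text_histogram` are hypothetical names introduced here, and the sample values are made up.

```python
from collections import Counter
from typing import Dict, Iterable


def tally_atom_counts(atom_counts: Iterable[int]) -> Dict[int, int]:
    """Map each distinct atom count to the number of molecules that have it."""
    # Sort by atom count so the result reads like the x-axis of a histogram.
    return dict(sorted(Counter(atom_counts).items()))


def render_text_histogram(tally: Dict[int, int]) -> str:
    """Render one row per atom count, with one '#' per molecule."""
    return "\n".join(f"{count:>4} | {'#' * freq}" for count, freq in tally.items())


# Made-up atom counts for six molecules (illustrative only).
example_counts = [12, 15, 12, 20, 15, 12]
tally = tally_atom_counts(example_counts)
print(render_text_histogram(tally))
```

Any list-like sequence of integers works as input, which matches the loosely typed `atom_counts` parameter described above.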

display_cv_folds(df: DataFrame, y: str, n_folds: int)

Plot the distribution of data across different cross-validation folds.

Parameters:
  • df (pd.DataFrame) – The DataFrame containing the data.

  • y (str) – The target column name.

  • n_folds (int) – Number of folds.
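To make concrete what "distribution of data across folds" can mean, the sketch below assigns samples to folds and summarizes the target per fold. It is an illustration under stated assumptions, not the method's actual logic: the shuffle-then-deal assignment, the helper names `assign_folds` and `target_mean_per_fold`, and the target values are all hypothetical.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence


def assign_folds(n_samples: int, n_folds: int, seed: int = 0) -> List[int]:
    """Return a fold index for every sample: shuffle, then deal round-robin."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [0] * n_samples
    for position, sample_index in enumerate(order):
        folds[sample_index] = position % n_folds
    return folds


def target_mean_per_fold(
    targets: Sequence[float], folds: Sequence[int], n_folds: int
) -> Dict[int, float]:
    """Average the target values that landed in each fold."""
    grouped: Dict[int, List[float]] = {fold: [] for fold in range(n_folds)}
    for value, fold in zip(targets, folds):
        grouped[fold].append(value)
    # Skip empty folds, which can occur when n_folds exceeds the sample count.
    return {fold: mean(values) for fold, values in grouped.items() if values}


# Made-up target values for ten samples (illustrative only).
targets = [0.1, 0.4, 0.35, 0.8, 0.55, 0.9, 0.2, 0.65, 0.7, 0.3]
folds = assign_folds(len(targets), n_folds=5)
summary = target_mean_per_fold(targets, folds, n_folds=5)
for fold, fold_mean in summary.items():
    print(f"fold {fold}: mean target = {fold_mean:.3f}")
```

Fold means that differ widely suggest the target is unevenly spread across folds, which is the kind of imbalance a per-fold distribution plot is meant to reveal.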

display_data_cluster(df_corr: DataFrame, n_clusters: int = 8) → None

Display the correlated features as a clustered graph.

Parameters:
  • df_corr (pd.DataFrame) – The correlation matrix to be clustered.

  • n_clusters (int) – The number of clusters to be created. Defaults to 8.

Returns:

None

display_elbow(df: DataFrame, max_num_clusters: int = 15) → None

Display the elbow curve for the given DataFrame, based on its within-cluster sum of squares (WCSS).

Parameters:
  • df (pd.DataFrame) – A correlation DataFrame.

  • max_num_clusters (int) – The maximum number of clusters to evaluate. Defaults to 15.

Return type:

None
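The quantity behind an elbow curve, the within-cluster sum of squares for each candidate number of clusters, can be sketched in plain Python. This is an assumption-laden illustration rather than the library's code: it uses a small hand-written k-means with a deterministic initialization, the function names `kmeans_wcss` and `elbow_curve` are hypothetical, and the points are made up.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wcss(points: Sequence[Point], k: int, n_iter: int = 50) -> float:
    """Run a basic k-means (Lloyd's algorithm) and return the final WCSS."""
    # Deterministic initialization: pick k points evenly spaced in the input.
    step = len(points) / k
    centroids: List[Point] = [points[int(i * step)] for i in range(k)]
    for _ in range(n_iter):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        new_centroids: List[Point] = []
        for j, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def elbow_curve(points: Sequence[Point], max_num_clusters: int) -> List[Tuple[int, float]]:
    """Return (k, WCSS) pairs for k = 1 .. max_num_clusters, capped at len(points)."""
    upper = min(max_num_clusters, len(points))
    return [(k, kmeans_wcss(points, k)) for k in range(1, upper + 1)]


# Two made-up, well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
curve = elbow_curve(points, max_num_clusters=15)
for k, wcss in curve:
    print(f"k={k}: WCSS={wcss:.3f}")
```

On data like this, WCSS drops sharply from one cluster to two and only slightly afterwards; that bend is the "elbow" used to pick a cluster count. A production implementation would more likely rely on an established clustering library than on a hand-written loop.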

display_model_performance(model_name: str, metrics: dict, metric_precision: int = 4)

Display the scores of the model in a table format.

Parameters:
  • model_name (str) – The name of the model to be evaluated.

  • metrics (dict) – Dictionary containing the scores.

  • metric_precision (int) – Precision of the metric values. Defaults to 4.
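The effect of `metric_precision` can be illustrated with a small text-table formatter. This is a sketch only: `format_metrics_table` is a hypothetical helper, the table layout is an assumption rather than the method's real output, and the model name and scores are made up.

```python
from typing import Dict


def format_metrics_table(
    model_name: str, metrics: Dict[str, float], metric_precision: int = 4
) -> str:
    """Format a metrics dict as a plain-text table with fixed decimal precision."""
    rows = [(name, f"{value:.{metric_precision}f}") for name, value in metrics.items()]
    name_width = max([len("Metric")] + [len(name) for name, _ in rows])
    value_width = max([len("Value")] + [len(text) for _, text in rows])
    lines = [
        f"Model: {model_name}",
        f"{'Metric':<{name_width}}  {'Value':>{value_width}}",
        "-" * (name_width + value_width + 2),
    ]
    lines += [f"{name:<{name_width}}  {text:>{value_width}}" for name, text in rows]
    return "\n".join(lines)


# Made-up model name and scores (illustrative only).
table = format_metrics_table("ridge", {"R2": 0.8123456, "RMSE": 0.4321}, metric_precision=3)
print(table)
```

Rounding happens only at display time here, so the underlying `metrics` values stay unchanged for later use.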

display_true_vs_predicted(model_name: str, y_train: DataFrame, y_test: DataFrame, y_pred_train: DataFrame, y_pred_test: DataFrame)

Display a scatter plot of true vs. predicted values for training and test sets.

Parameters:
  • model_name (str) – The name of the model used for prediction.

  • y_train (pd.DataFrame) – True values for the training set.

  • y_test (pd.DataFrame) – True values for the test set.

  • y_pred_train (pd.DataFrame) – Predicted values for the training set.

  • y_pred_test (pd.DataFrame) – Predicted values for the test set.

static draw_generated_molecules(molecules: List[Mol])

Draw the generated molecules.

Parameters:

molecules (List[Chem.Mol]) – List of molecules to be visualized.