uplift_analysis.visualization module

This module implements functional utilities which are helpful for visualizing the evaluation and analysis results.

uplift_analysis.visualization.visualize_selection_distribution(eval_res: Union[pandas.core.frame.DataFrame, uplift_analysis.data.EvalSet], column_name: Optional[str] = None) → matplotlib.axes._axes.Axes

Implements a utility function for displaying the selection distribution of multiple treatments, inferred from the outputs of an evaluated model. This visualization describes how the selections are divided among the different treatments, cumulatively and gradually, as the acceptance threshold (with respect to the model's scores) is lowered. The x-axis corresponds to the quantile of scores, in descending order, so that each step to the right includes a wider upper quantile. The y-axis corresponds to the number of appearances of each treatment, i.e. the number of decisions made by the model to recommend that treatment.

Parameters
  • eval_res (Union[pd.DataFrame, EvalSet]) – The evaluated dataset, whether it is represented by a dataframe or an EvalSet object.

  • column_name (Optional[Union[str, None]]) – The name of the field containing the recommended action according to the evaluated model. If not provided, inferred according to the input EvalSet.

Returns

The axes in which the visualization is created.

Return type

Axes
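To make the plotted quantity concrete, the following minimal sketch computes cumulative selection counts per treatment with plain pandas. The column names score and recommended_action are hypothetical, and the code illustrates the description above rather than the library's internal implementation.

```python
import pandas as pd

# Hypothetical evaluated data: one row per observation.
df = pd.DataFrame({
    "score": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4],
    "recommended_action": [1, 2, 1, 1, 2, 2],
})

# Sort by score, highest first, so each row widens the accepted upper quantile.
ranked = df.sort_values("score", ascending=False).reset_index(drop=True)

# One indicator column per treatment, accumulated along the ranking.
cumulative = pd.get_dummies(ranked["recommended_action"]).astype(int).cumsum()

# Index by the fraction of the population included so far (the x-axis).
cumulative.index = (ranked.index + 1) / len(ranked)
```

Each column of cumulative is one curve of the kind described above: the number of times a treatment was recommended within the top fraction of scores.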

uplift_analysis.visualization.chart_display_template(eval_res: Union[Dict[str, uplift_analysis.data.EvalSet], uplift_analysis.data.EvalSet], metric: str, func: Optional[Callable] = None, num_sets: Union[None, int] = None, average: Optional[bool] = False, min_quantile: Optional[float] = None, max_quantile: Optional[float] = None) → Tuple[matplotlib.axes._axes.Axes, List[matplotlib.lines.Line2D]]

Implements a utility function called by evaluation.Evaluator, for handling the abstraction required to plot signals from a single data.EvalSet object or from multiple ones.

Parameters
  • eval_res (Union[Dict[str, EvalSet], EvalSet]) – A single data.EvalSet object, or a collection of such, containing the signal(s) to be plotted.

  • metric (str) – The name of the signal. Generally, corresponds to a column on the dataframe hosted in the EvalSet object.

  • func (Optional[Union[Callable, None]]) – In cases where the required signal is not part of the mentioned dataframe, the caller can provide a Callable which will be responsible for generating a custom pd.Series.

  • num_sets (Optional[Union[None, int]]) – The number of EvalSet objects the input argument eval_res holds.

  • average (Optional[bool]) – A boolean indicating whether averaging of the signals is required. Relevant only in case num_sets > 1.

  • min_quantile (Optional[Union[float, None]]) – The quantile from which the quantile-dependent signal(s) will be plotted, for avoiding the noise on the edge of the signal.

  • max_quantile (Optional[Union[float, None]]) – The quantile up to which the quantile-dependent signal(s) will be plotted, for avoiding the noise on the edge of the signal.

Returns

  • ax (Axes) – The axes object on which the lines were plotted.

  • lines (List[Line2D]) – The list of Line2D objects plotted by the function, for further editing, if required.

uplift_analysis.visualization.area_fill(ax: matplotlib.axes._axes.Axes, s: pandas.core.series.Series, base: Union[int, float] = 0.0, **kwargs) → None

A visualization utility, employed by evaluation.Evaluator for filling areas under curves. The color of the fill varies according to a configurable horizontal line, base. Areas above base will be filled in green, and areas below base will be filled in red.

Parameters
  • ax (Axes) – The axes object on which the visualization will be appended.

  • s (pd.Series) – The curve, according to which an area will be filled.

  • base (Union[int, float]) – The y-axis value of the horizontal line according to which the area will be colored.

  • kwargs – Arbitrary keyword arguments, which will be sent to matplotlib backend.
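The two-color fill described above can be approximated directly with matplotlib's fill_between. This is a sketch under assumptions (toy data, fixed alpha, interpolation at crossings), not the library's own code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# A toy curve that crosses the base line.
s = pd.Series([0.3, 0.1, -0.2, -0.1, 0.2], index=[0.2, 0.4, 0.6, 0.8, 1.0])
base = 0.0

fig, ax = plt.subplots()
ax.plot(s.index, s.values, color="black")

# Green where the curve is at or above the base, red where it is below.
above = ax.fill_between(s.index, s.values, base, where=s.values >= base,
                        color="green", alpha=0.3, interpolate=True)
below = ax.fill_between(s.index, s.values, base, where=s.values < base,
                        color="red", alpha=0.3, interpolate=True)
```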

uplift_analysis.visualization.emphasize_axes(ax: matplotlib.axes._axes.Axes, base: Tuple[Union[float, int], Union[float, int]]) → None

A visualization utility, employed by evaluation.Evaluator for emphasizing axes, by plotting bold lines on designated coordinates.

Parameters
  • ax (Axes) – The axes object on which the visualization will be appended.

  • base (Tuple[Union[float, int], Union[float, int]]) – The tuple of values defining the axes, (x_value, y_value).
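A comparable effect can be obtained with matplotlib's axvline and axhline. The sketch below assumes black lines of width 2; the styling actually used by the library is not stated here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0.0, 0.5, 1.0], [-0.1, 0.2, 0.1])

base = (0.0, 0.0)  # (x_value, y_value)

# Bold vertical line at x_value and bold horizontal line at y_value.
vline = ax.axvline(x=base[0], color="black", linewidth=2)
hline = ax.axhline(y=base[1], color="black", linewidth=2)
```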

uplift_analysis.visualization.plot_points(ax: matplotlib.axes._axes.Axes, points: List[Dict], legend_prefix: str, value_key: Optional[str] = 'value') → None

A visualization utility, employed by evaluation.Evaluator for plotting a set of independent markers, labeled by their corresponding values.

Parameters
  • ax (Axes) – The axes object on which the visualization will be appended.

  • points (List[Dict]) – A list of dictionaries, containing the points to be plotted. Each dictionary must contain a key point, for which the corresponding value will be a 2D tuple, containing the coordinates of the point. It also must contain a key corresponding to the argument value_key. The value of this key will be used for labeling this point.

  • legend_prefix (str) – The string which will be added to the beginning of the label of each plotted point, followed by the corresponding value.

  • value_key (Optional[str]) – The name of the key, in each of the dict objects listed under points, in which the value of the point is stored. This value will be used for labeling the point.
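The structure expected for points is easiest to see in an example. The sketch below builds such a list and plots it with plain matplotlib; the prefix text and the label format (prefix, colon, value rounded to two decimals) are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt

# Each dictionary holds the coordinates under "point" and a labeling value.
points = [
    {"point": (0.2, 0.15), "value": 0.15},
    {"point": (0.6, 0.31), "value": 0.31},
]
legend_prefix = "max uplift"  # hypothetical prefix
value_key = "value"

fig, ax = plt.subplots()
for p in points:
    x, y = p["point"]
    # One independent marker per point, labeled by prefix and value.
    ax.plot(x, y, marker="o", linestyle="None",
            label=f"{legend_prefix}: {p[value_key]:.2f}")
ax.legend()
```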

uplift_analysis.visualization.display_sleeve(ax: matplotlib.axes._axes.Axes, eval_set: uplift_analysis.data.EvalSet, metric: str, margin: pandas.core.series.Series, color: str, min_quantile: Optional[float] = 0.001, max_quantile: Optional[float] = 1.0) → None

A visualization utility, employed by evaluation.Evaluator for displaying an uncertainty sleeve around a specified line.

Parameters
  • ax (Axes) – The axes object on which the visualization will be appended.

  • eval_set (EvalSet) – The EvalSet object containing the signal around which the sleeve will be visualized.

  • metric (str) – The name of the signal, corresponding to the column in the dataframe of the provided eval_set. This will be the reference signal, around which the sleeve will be visualized.

  • margin (pd.Series) – The one-sided margin, which will be added, and subtracted from the reference signal, for creating the sleeve.

  • color (str) – The color of the sleeve.

  • min_quantile (Optional[float]) – The quantile from which the quantile-dependent signal(s) will be plotted, for avoiding the noise on the edge of the signal.

  • max_quantile (Optional[float]) – The quantile up to which the quantile-dependent signal(s) will be plotted, for avoiding the noise on the edge of the signal.
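The sleeve is the band between the reference signal minus the margin and the reference signal plus the margin. A minimal sketch with toy data, assuming a semi-transparent fill, could look like this:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Toy reference signal and one-sided margin, both indexed by quantile.
signal = pd.Series([0.30, 0.25, 0.20, 0.15], index=[0.25, 0.5, 0.75, 1.0])
margin = pd.Series([0.10, 0.06, 0.04, 0.03], index=signal.index)

lower = signal - margin
upper = signal + margin

fig, ax = plt.subplots()
ax.plot(signal.index, signal.values, color="blue")

# The sleeve: a shaded band of total width 2 * margin around the signal.
sleeve = ax.fill_between(signal.index, lower.values, upper.values,
                         color="blue", alpha=0.2)
```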

uplift_analysis.visualization.single_curve_plot(signal: Union[pandas.core.frame.DataFrame, pandas.core.series.Series], ax: matplotlib.axes._axes.Axes, metric: Optional[str] = None, func: Optional[Callable] = None, **kwargs) → Tuple[pandas.core.series.Series, matplotlib.lines.Line2D]

A visualization utility for plotting a single curve on a given Axes object. The signal can be an independent pandas.Series, a column of a pandas.DataFrame, or the result of a function applied to an input pandas.DataFrame to create a new pandas.Series.

Parameters
  • signal (Union[pd.DataFrame, pd.Series]) – The input data according to which the curve will be plotted.

  • ax (Axes) – The axes object on which the visualization will be appended.

  • metric (Optional[Union[str, None]]) – The name of the column holding the desired signal, on the input dataframe. Irrelevant when signal is a pandas.Series object.

  • func (Optional[Union[Callable, None]]) – A function to apply on the input dataframe, for generating a new pandas.Series. Irrelevant when signal is a pandas.Series object.

  • kwargs – Arbitrary keyword arguments.

Returns

  • pd.Series – The series which was eventually plotted by the function.

  • Line2D – The plotted line object, for further manipulation.
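The three input modes described above can be sketched as a small resolution step. The helper name resolve_signal is hypothetical, and giving func precedence over metric is an assumption, since the order is not specified here.

```python
import pandas as pd

def resolve_signal(signal, metric=None, func=None) -> pd.Series:
    """Hypothetical helper: pick the series to plot from the given input."""
    if isinstance(signal, pd.Series):
        return signal              # metric and func are irrelevant here
    if func is not None:
        return func(signal)        # custom series derived from the dataframe
    if metric is None:
        raise ValueError("a dataframe input needs either metric or func")
    return signal[metric]          # plain column lookup

df = pd.DataFrame({"uplift": [0.3, 0.2], "gain": [1.0, 2.0]})
from_column = resolve_signal(df, metric="gain")
from_func = resolve_signal(df, func=lambda d: d["uplift"] * d["gain"])
```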

uplift_analysis.visualization.chop_lower_quantiles(s: pandas.core.series.Series, q: Optional[float] = None) → pandas.core.series.Series

A utility function for filtering out lower quantiles of a given signal, represented as a pandas.Series, and indexed by quantiles in a descending manner. This is done for avoiding noisy estimations which might occur in lower quantiles.

Parameters
  • s (pd.Series) – The original pandas.Series, indexed by quantiles in a descending manner.

  • q (Optional[float]) – The quantile up to which values of the original series will be filtered out.

Returns

The series after filtering out the specified quantiles.

Return type

pd.Series

uplift_analysis.visualization.chop_upper_quantiles(s: pandas.core.series.Series, q: Optional[float] = None) → pandas.core.series.Series

A utility function for filtering out upper quantiles of a given signal, represented as a pandas.Series, and indexed by quantiles in a descending manner. This is done for avoiding noisy estimations which might occur in upper quantiles.

Parameters
  • s (pd.Series) – The original pandas.Series, indexed by quantiles in a descending manner.

  • q (Optional[float]) – The quantile from which values of the original series will be filtered out.

Returns

The series after filtering out the specified quantiles.

Return type

pd.Series
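Trimming both edges of a quantile-indexed series amounts to filtering on its index. The sketch below uses hypothetical helper names (keep_from, keep_up_to) and assumes inclusive bounds and a no-op when q is None; the library's exact boundary handling may differ.

```python
import pandas as pd

# Toy signal indexed by the fraction of the population included.
s = pd.Series([0.9, 0.5, 0.4, 0.3, 0.1], index=[0.01, 0.25, 0.5, 0.75, 1.0])

def keep_from(series, q=None):
    """Drop entries whose quantile index is below q (assumed inclusive)."""
    return series if q is None else series[series.index >= q]

def keep_up_to(series, q=None):
    """Drop entries whose quantile index is above q (assumed inclusive)."""
    return series if q is None else series[series.index <= q]

# Remove the noisy edges on both sides.
trimmed = keep_up_to(keep_from(s, 0.25), 0.75)
```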

uplift_analysis.visualization.get_bin_quantity(s: pandas.core.series.Series, max_bins: Optional[int] = 200, bin_rate: Optional[float] = 0.05) → int

A utility function for retrieving the number of bins to use for a density plot or a histogram. This number is based on the number of unique values in the signal, and is bounded from above by a configured maximal value.

Parameters
  • s (pd.Series) – The series for which a density plot is required.

  • max_bins (Optional[int]) – The maximal number of bins to allow, regardless of the number of distinct values.

  • bin_rate (Optional[float]) – The fraction between the number of bins and the number of distinct values in the input series.

Returns

The number of bins to use in a density plot / histogram.

Return type

int
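One plausible reading of this rule is: multiply the number of distinct values by bin_rate, then cap the result at max_bins. The sketch below follows that reading; the helper name, the truncation to an integer, and the lower bound of one bin are assumptions rather than documented behavior.

```python
import pandas as pd

def bin_quantity_sketch(s: pd.Series, max_bins: int = 200,
                        bin_rate: float = 0.05) -> int:
    """Hypothetical re-implementation of the bin-count rule."""
    proposed = int(s.nunique() * bin_rate)   # bins proportional to distinct values
    return min(max_bins, max(1, proposed))   # at least 1, at most max_bins

few = bin_quantity_sketch(pd.Series(range(100)))      # 100 distinct values
many = bin_quantity_sketch(pd.Series(range(10000)))   # hits the max_bins cap
```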

uplift_analysis.visualization.should_average(eval_res: Union[Dict[str, uplift_analysis.data.EvalSet], uplift_analysis.data.EvalSet], average: Optional[bool] = False)

A function for determining the number of EvalSet objects in the input eval_res, and, accordingly, whether averaging is required.

Parameters
  • eval_res (Union[Dict[str, EvalSet], EvalSet]) – The input data, which might contain single or multiple EvalSet objects.

  • average (Optional[bool]) – The caller's request for averaging, which is honored only when eval_res holds more than one EvalSet object.

Returns

  • average (bool) – Averaging is required (True) only if the input average=True, and num_sets > 1.

  • num_sets (int) – The number of EvalSet objects in the input eval_res.
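The decision rule stated in the return values can be written in a few lines. In this sketch the helper name is hypothetical, plain object() instances stand in for EvalSet, and a dictionary input is assumed to be the only multi-set form.

```python
def count_and_decide(eval_res, average=False):
    """Hypothetical sketch: return (average, num_sets) as described above."""
    # A dictionary holds several sets; anything else counts as a single set.
    num_sets = len(eval_res) if isinstance(eval_res, dict) else 1
    # Averaging only makes sense when requested and more than one set exists.
    return bool(average) and num_sets > 1, num_sets

multi = count_and_decide({"a": object(), "b": object()}, average=True)
single = count_and_decide(object(), average=True)
```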