uplift_analysis.visualization module¶
This module implements functional utilities which are helpful for visualizing the evaluation and analysis results.
- uplift_analysis.visualization.visualize_selection_distribution(eval_res: Union[pandas.core.frame.DataFrame, uplift_analysis.data.EvalSet], column_name: Optional[str] = None) → matplotlib.axes._axes.Axes¶
Implements a utility function for displaying the selection distribution of multiple treatments, inferred from the outputs of an evaluated model. The visualization describes how selections diverge among the different treatments, cumulatively and gradually, as the acceptance threshold is lowered (with respect to the model's scores). The x-axis corresponds to the quantile of scores in descending order, so each step to the right includes a wider upper quantile. The y-axis corresponds to the number of appearances of each treatment, i.e. the number of decisions made by the model to recommend each treatment.
- Parameters
eval_res (Union[pd.DataFrame, EvalSet]) – The evaluated dataset, whether it is represented by a dataframe or an EvalSet object.
column_name (Optional[Union[str, None]]) – The name of the field containing the recommended action according to the evaluated model. If not provided, inferred according to the input EvalSet.
- Returns
The axes in which the visualization is created.
- Return type
Axes
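The cumulative counting behind this chart can be sketched with plain pandas. The snippet below is an illustration only and is not the library's implementation: the column names score and recommended_action, and the toy data, are assumptions made for the example.

```python
import pandas as pd

# Toy evaluated dataset; both column names are hypothetical.
df = pd.DataFrame({
    "score": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4],
    "recommended_action": ["A", "B", "A", "A", "B", "B"],
})

# Sort by score, highest first, so that row i closes the upper quantile (i + 1) / n.
ranked = df.sort_values("score", ascending=False).reset_index(drop=True)

# One indicator column per treatment, accumulated down the ranking.
cumulative_counts = pd.get_dummies(ranked["recommended_action"]).astype(int).cumsum()
cumulative_counts.index = (ranked.index + 1) / len(ranked)
cumulative_counts.index.name = "quantile"

print(cumulative_counts)
```

Each row of cumulative_counts holds, for one upper quantile, how many times each treatment was recommended so far; plotting its columns against the index gives a chart of the kind described above.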
- uplift_analysis.visualization.chart_display_template(eval_res: Union[Dict[str, uplift_analysis.data.EvalSet], uplift_analysis.data.EvalSet], metric: str, func: Optional[Callable] = None, num_sets: Union[None, int] = None, average: Optional[bool] = False, min_quantile: Optional[float] = None, max_quantile: Optional[float] = None) → Tuple[matplotlib.axes._axes.Axes, List[matplotlib.lines.Line2D]]¶
Implements a utility function, called by evaluation.Evaluator, which handles the abstraction required for plotting signals from a single data.EvalSet object or from multiple ones.
- Parameters
eval_res (Union[Dict[str, EvalSet], EvalSet]) – A single data.EvalSet object, or a collection of such objects, containing the signal(s) to be plotted.
metric (str) – The name of the signal. Generally corresponds to a column of the dataframe hosted in the EvalSet object.
func (Optional[Union[Callable, None]]) – When the required signal is not part of that dataframe, the caller can provide a Callable which is responsible for generating a custom pd.Series.
num_sets (Optional[Union[None, int]]) – The number of EvalSet objects held by the input argument eval_res.
average (Optional[bool]) – A boolean indicating whether averaging of the signals is required. Relevant only when num_sets > 1.
min_quantile (Optional[Union[float, None]]) – The quantile from which the quantile-dependent signal(s) will be plotted, to avoid the noise at the edge of the signal.
max_quantile (Optional[float]) – The quantile up to which the quantile-dependent signal(s) will be plotted, to avoid the noise at the edge of the signal.
- Returns
ax (Axes) – The axes object on which the lines were plotted.
lines (List[Line2D]) – The list of Line2D objects plotted by the function, for further editing if required.
- uplift_analysis.visualization.area_fill(ax: matplotlib.axes._axes.Axes, s: pandas.core.series.Series, base: Union[int, float] = 0.0, **kwargs) → None¶
A visualization utility, employed by evaluation.Evaluator, for filling areas under curves. The fill color varies according to a configurable horizontal line, base. Areas above base are filled in green, and areas below base are filled in red.
- Parameters
ax (Axes) – The axes object on which the visualization will be appended.
s (pd.Series) – The curve according to which an area will be filled.
base (Union[int, float]) – The y-axis value of the horizontal line according to which the area will be colored.
kwargs – Arbitrary keyword arguments, which are passed on to the matplotlib backend.
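The two-color fill described here can be reproduced with matplotlib's fill_between and its where mask. This is a minimal sketch under assumptions (toy data, alpha value, and the use of interpolate=True are choices made for the example), not the body of area_fill itself.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy curve indexed by quantile; values cross the base line twice.
s = pd.Series([0.3, 0.1, -0.2, -0.1, 0.2], index=[0.2, 0.4, 0.6, 0.8, 1.0])
base = 0.0

fig, ax = plt.subplots()
ax.plot(s.index, s.values, color="black")

# Green where the curve is at or above the base line, red where it is below.
ax.fill_between(s.index, s.values, base, where=s.values >= base,
                color="green", alpha=0.3, interpolate=True)
ax.fill_between(s.index, s.values, base, where=s.values < base,
                color="red", alpha=0.3, interpolate=True)
```

With interpolate=True the colored regions meet exactly where the curve crosses base, rather than stopping at the nearest data point.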
- uplift_analysis.visualization.emphasize_axes(ax: matplotlib.axes._axes.Axes, base: Tuple[Union[float, int], Union[float, int]]) → None¶
A visualization utility, employed by evaluation.Evaluator, for emphasizing axes by plotting bold lines at designated coordinates.
- Parameters
ax (Axes) – The axes object on which the visualization will be appended.
base (Tuple[Union[float, int], Union[float, int]]) – The tuple of values defining the axes, (x_value, y_value).
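One way to draw such emphasized axes is with matplotlib's axvline and axhline. The sketch below assumes the base tuple is unpacked as (x_value, y_value); the line width and color are arbitrary choices for the example and are not taken from the library.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

base = (0.0, 0.0)  # (x_value, y_value)
x_value, y_value = base

fig, ax = plt.subplots()
ax.plot([-1, 0, 1], [1, 0, 1])

# Bold vertical and horizontal lines through the designated coordinates.
vertical = ax.axvline(x=x_value, color="black", linewidth=2)
horizontal = ax.axhline(y=y_value, color="black", linewidth=2)
```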
- uplift_analysis.visualization.plot_points(ax: matplotlib.axes._axes.Axes, points: List[Dict], legend_prefix: str, value_key: Optional[str] = 'value') → None¶
A visualization utility, employed by evaluation.Evaluator, for plotting a set of independent markers, labeled by their corresponding values.
- Parameters
ax (Axes) – The axes object on which the visualization will be appended.
points (List[Dict]) – A list of dictionaries containing the points to be plotted. Each dictionary must contain a key point, whose value is a 2D tuple with the coordinates of the point. It must also contain a key corresponding to the argument value_key. The value of this key will be used for labeling the point.
legend_prefix (str) – The string added to the beginning of the label of each plotted point, followed by the corresponding value.
value_key (Optional[str]) – The name of the key, in each of the dict objects listed under points, in which the value of the point is stored. This value will be used for labeling the point.
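The structure expected for points is easiest to see in a small example. The sketch below builds such a list and plots each entry as a labeled marker; the prefix text and the exact label format ("prefix: value") are assumptions made for illustration, and the library may format its labels differently.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

# Each dictionary holds the coordinates under "point" and the labeling value under value_key.
points = [
    {"point": (0.2, 1.5), "value": 1.5},
    {"point": (0.6, 0.7), "value": 0.7},
]
legend_prefix = "max gain"  # hypothetical prefix
value_key = "value"

fig, ax = plt.subplots()
for p in points:
    x, y = p["point"]
    # Assumed label format: the prefix followed by the stored value.
    ax.plot(x, y, marker="o", linestyle="", label=f"{legend_prefix}: {p[value_key]}")
ax.legend()

labels = [line.get_label() for line in ax.lines]
```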
- uplift_analysis.visualization.display_sleeve(ax: matplotlib.axes._axes.Axes, eval_set: uplift_analysis.data.EvalSet, metric: str, margin: pandas.core.series.Series, color: str, min_quantile: Optional[float] = 0.001, max_quantile: Optional[float] = 1.0) → None¶
A visualization utility, employed by evaluation.Evaluator, for displaying an uncertainty sleeve around a specified line.
- Parameters
ax (Axes) – The axes object on which the visualization will be appended.
eval_set (EvalSet) – The EvalSet object containing the signal around which the sleeve will be visualized.
metric (str) – The name of the signal, corresponding to a column in the dataframe of the provided eval_set. This is the reference signal around which the sleeve will be visualized.
margin (pd.Series) – The one-sided margin, which is added to and subtracted from the reference signal to create the sleeve.
color (str) – The color of the sleeve.
min_quantile (Optional[float]) – The quantile from which the quantile-dependent signal(s) will be plotted, to avoid the noise at the edge of the signal.
max_quantile (Optional[float]) – The quantile up to which the quantile-dependent signal(s) will be plotted, to avoid the noise at the edge of the signal.
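The sleeve itself is the band between the reference signal minus the margin and the reference signal plus the margin. A minimal sketch of that construction, using plain pandas series in place of an EvalSet (the data and the alpha value are assumptions for the example):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

quantiles = [0.2, 0.4, 0.6, 0.8, 1.0]
signal = pd.Series([0.30, 0.25, 0.20, 0.15, 0.10], index=quantiles)  # reference signal
margin = pd.Series([0.10, 0.06, 0.04, 0.03, 0.02], index=quantiles)  # one-sided margin

# The margin is applied symmetrically around the reference signal.
lower = signal - margin
upper = signal + margin

fig, ax = plt.subplots()
ax.plot(signal.index, signal.values, color="tab:blue")
ax.fill_between(signal.index, lower.values, upper.values, color="tab:blue", alpha=0.2)
```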
- uplift_analysis.visualization.single_curve_plot(signal: Union[pandas.core.frame.DataFrame, pandas.core.series.Series], ax: matplotlib.axes._axes.Axes, metric: Optional[str] = None, func: Optional[Callable] = None, **kwargs) → Tuple[pandas.core.series.Series, matplotlib.lines.Line2D]¶
A visualization utility for plotting a single curve on a given Axes object. The signal can be an independent pandas.Series, a column of a pandas.DataFrame, or the result of a function applied to an input pandas.DataFrame to create a new pandas.Series.
- Parameters
signal (Union[pd.DataFrame, pd.Series]) – The input data according to which the curve will be plotted.
ax (Axes) – The axes object on which the visualization will be appended.
metric (Optional[Union[str, None]]) – The name of the column holding the desired signal in the input dataframe. Irrelevant when signal is a pandas.Series object.
func (Optional[Union[Callable, None]]) – A function to apply to the input dataframe, for generating a new pandas.Series. Irrelevant when signal is a pandas.Series object.
kwargs – Arbitrary keyword arguments.
- Returns
pd.Series – The series which was eventually plotted by the function.
Line2D – The plotted line object, for further manipulation.
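The three input modes described above can be sketched as a small dispatch function. The name plot_curve_sketch is hypothetical, and the precedence of func over metric when both are given is an assumption; the documentation does not state which one wins.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_curve_sketch(signal, ax, metric=None, func=None, **kwargs):
    """Hypothetical stand-in that mirrors the documented input modes."""
    if isinstance(signal, pd.Series):
        series = signal            # an independent series is plotted as-is
    elif func is not None:
        series = func(signal)      # assumed to take precedence over metric
    else:
        series = signal[metric]    # a column of the input dataframe
    (line,) = ax.plot(series.index, series.values, **kwargs)
    return series, line


df = pd.DataFrame(
    {"treated": [0.30, 0.25, 0.20], "control": [0.10, 0.12, 0.15]},
    index=[0.25, 0.5, 1.0],
)
fig, ax = plt.subplots()
from_column, line_a = plot_curve_sketch(df, ax, metric="treated", color="tab:blue")
from_func, line_b = plot_curve_sketch(
    df, ax, func=lambda d: d["treated"] - d["control"], color="tab:orange"
)
```

Returning the plotted series together with the line object lets the caller restyle the line or reuse the computed signal afterwards.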
- uplift_analysis.visualization.chop_lower_quantiles(s: pandas.core.series.Series, q: Optional[float] = None) → pandas.core.series.Series¶
A utility function for filtering out the lower quantiles of a given signal, represented as a pandas.Series and indexed by quantiles in descending order. This avoids the noisy estimations which might occur in the lower quantiles.
- Parameters
s (pd.Series) – The original pandas.Series, indexed by quantiles in descending order.
q (Optional[float]) – The quantile up to which values of the original series will be filtered out.
- Returns
The series after filtering out the specified quantiles.
- Return type
pd.Series
- uplift_analysis.visualization.chop_upper_quantiles(s: pandas.core.series.Series, q: Optional[float] = None) → pandas.core.series.Series¶
A utility function for filtering out the upper quantiles of a given signal, represented as a pandas.Series and indexed by quantiles in descending order. This avoids the noisy estimations which might occur in the upper quantiles.
- Parameters
s (pd.Series) – The original pandas.Series, indexed by quantiles in descending order.
q (Optional[float]) – The quantile from which values of the original series will be filtered out.
- Returns
The series after filtering out the specified quantiles.
- Return type
pd.Series
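The two quantile filters, chop_lower_quantiles and chop_upper_quantiles, can be approximated with boolean masks on the index. The sketch below rests on several assumptions that the documentation does not confirm: the index holds the quantile values themselves, the boundary value q is kept, and q=None returns the series unchanged. The function names are hypothetical.

```python
import pandas as pd

# Toy signal indexed by the upper quantile of scores it covers (assumed layout).
s = pd.Series([0.9, 0.5, 0.4, 0.35, 0.3], index=[0.001, 0.25, 0.5, 0.75, 1.0])


def chop_lower_sketch(s, q=None):
    """Drop entries whose quantile is below q (assumed semantics)."""
    return s if q is None else s[s.index >= q]


def chop_upper_sketch(s, q=None):
    """Drop entries whose quantile is above q (assumed semantics)."""
    return s if q is None else s[s.index <= q]


trimmed = chop_upper_sketch(chop_lower_sketch(s, q=0.01), q=0.75)
print(trimmed)
```

In this toy series the entry at quantile 0.001 is based on very few samples, which is the kind of noisy edge the filters are meant to remove.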
- uplift_analysis.visualization.get_bin_quantity(s: pandas.core.series.Series, max_bins: Optional[int] = 200, bin_rate: Optional[float] = 0.05) → int¶
A utility function for retrieving the number of bins to use for a density plot or a histogram. The number is based on the number of unique values in the signal, and is bounded from above by a configured maximal value.
- Parameters
s (pd.Series) – The series for which a density plot is required.
max_bins (Optional[int]) – The maximal number of bins to allow, regardless of the number of distinct values.
bin_rate (Optional[float]) – The fraction between the number of bins and the number of distinct values in the input series.
- Returns
The number of bins to use in a density plot / histogram.
- Return type
int
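A rule consistent with this description is to scale the number of distinct values by bin_rate and cap the result at max_bins. The sketch below is one such rule and not necessarily the library's formula: the rounding and the floor of one bin are assumptions, and bin_quantity_sketch is a hypothetical name.

```python
import pandas as pd


def bin_quantity_sketch(s, max_bins=200, bin_rate=0.05):
    """Number of bins proportional to the distinct values, capped at max_bins."""
    n_unique = s.nunique()
    # Rounding and the minimum of one bin are assumptions of this sketch.
    return int(min(max_bins, max(1, round(bin_rate * n_unique))))


few = bin_quantity_sketch(pd.Series(range(100)))      # 100 distinct values
many = bin_quantity_sketch(pd.Series(range(10000)))   # 10000 distinct values
tiny = bin_quantity_sketch(pd.Series([1, 1, 2]))      # 2 distinct values
```

With the default arguments, 100 distinct values give 5 bins, while 10000 distinct values hit the cap of 200.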
- uplift_analysis.visualization.should_average(eval_res: Union[Dict[str, uplift_analysis.data.EvalSet], uplift_analysis.data.EvalSet], average: Optional[bool] = False)¶
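A function for determining the number of EvalSet objects in the input eval_res and, accordingly, whether averaging is required.
- Parameters
eval_res (Union[Dict[str, EvalSet], EvalSet]) – A single EvalSet object, or a collection of such objects.
average (Optional[bool]) – A boolean indicating whether averaging is requested.
- Returns
average (bool) – Averaging is required (True) only if the input average=True and num_sets > 1.
num_sets (int) – The number of EvalSet objects in the input eval_res.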
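The decision rule stated in the Returns section is small enough to sketch directly. The function name is hypothetical, and plain objects stand in for EvalSet instances so that the snippet runs without the library.

```python
def should_average_sketch(eval_res, average=False):
    """Return (average, num_sets) following the documented rule."""
    # A dictionary is treated as a collection of sets; anything else counts as a single set.
    num_sets = len(eval_res) if isinstance(eval_res, dict) else 1
    return bool(average) and num_sets > 1, num_sets


single = object()  # placeholder for one EvalSet
collection = {"fold_0": object(), "fold_1": object(), "fold_2": object()}

print(should_average_sketch(single, average=True))       # averaging a single set is not meaningful
print(should_average_sketch(collection, average=True))
print(should_average_sketch(collection, average=False))
```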