uplift_analysis.scoring module

This module implements a scoring utility as a class named Scorer. Each scoring configuration specifies the relevant fields and the function to apply to them; given one or more such configurations, Scorer scores every observation in the input dataset. When multiple scoring configurations are provided, the scores of the individual methods are combined into a new score by weighting the magnitude of the score associated with each action (relevant for a multiple-actions scenario).

Notes

Scorer also supports use-cases with multiple treatments.

class uplift_analysis.scoring.Scorer(scoring_configuration: Optional[Union[Dict, List[Dict]]] = None)

The Scorer class is used for scoring observations on a given dataset, according to a provided configuration, or a set of configurations.

Parameters: scoring_configuration (Optional[Union[Dict, List[Dict]]]) – A single configuration or a list of configurations (each represented as a dict) specifying the scoring methods.

set_scoring_config(scoring_configuration: Union[Dict, List[Dict]]) → None

A method for setting the scoring configuration associated with the object.

Parameters: scoring_configuration (Union[Dict, List[Dict]]) – A single configuration or a list of configurations (each represented as a dict) specifying the scoring methods.
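Because the configuration may be passed either as a single dict or as a list of dicts, downstream code typically wants one uniform shape. The following is a minimal sketch of such a normalization step; the helper name `as_config_list` and the `"name"` key in the usage lines are hypothetical and are not taken from the library.

```python
from typing import Dict, List, Optional, Union

ScoringConfig = Optional[Union[Dict, List[Dict]]]


def as_config_list(scoring_configuration: ScoringConfig) -> List[Dict]:
    """Return the configuration as a list of dicts (hypothetical helper)."""
    if scoring_configuration is None:
        # No configuration supplied: nothing to score with.
        return []
    if isinstance(scoring_configuration, dict):
        # A single configuration becomes a one-element list.
        return [scoring_configuration]
    # Already a collection of configurations: copy it into a plain list.
    return list(scoring_configuration)


# The "name" key is illustrative only.
single = as_config_list({"name": "method_a"})
several = as_config_list([{"name": "method_a"}, {"name": "method_b"}])
```

With this shape, a list of length one can be routed to the single-method path and a longer list to the multiple-methods path.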

calculate_scores(dataset: Union[Dict[str, numpy.ndarray], pandas.core.frame.DataFrame], scoring_configuration: Optional[Union[Dict, List[Dict]]] = None) → Tuple

This function serves as the primary interface of the class. Given a dataset and a scoring configuration, it returns the corresponding score for each observation in the set, accompanied by the recommended action.

Parameters

dataset (Union[Dict[str, np.ndarray], pd.DataFrame]) – the dataset to be scored.
scoring_configuration (Optional[Union[Dict, List[Dict]]]) – The configuration according to which the observations will be scored.

Returns

rankings (np.ndarray) – The relative rank of each observation in the dataset, in the range (0, 1]; a higher rank corresponds to a higher uplift score.
scored_actions (np.ndarray) – The serial index of the action corresponding to the highest score, per observation.
scores (np.ndarray) – The score for each observation, according to the provided configuration.
action_dim (int) – The quantity of actions taken into account.

multiple_scoring_methods_calc(dataset: Union[Dict[str, numpy.ndarray], pandas.core.frame.DataFrame], scoring_methods: List[Dict])

This function applies a set of scoring method configurations to the provided dataset, combines the computed scores, and returns the resulting scores together with the recommended actions.

Parameters

dataset (Union[Dict[str, np.ndarray], pd.DataFrame]) – The set of observations to score.
scoring_methods (List[Dict]) – A list of dictionaries representing the scoring method configurations.

Returns

rankings (np.ndarray) – The relative rank of each observation in the dataset, in the range (0, 1]; a higher rank corresponds to a higher uplift score.
scored_actions (np.ndarray) – The serial index of the action corresponding to the highest score, per observation.
scores (np.ndarray) – The score for each observation, according to the provided configuration.
action_dim (int) – The quantity of actions taken into account.

combine_scores(rankings: numpy.ndarray, scored_actions: numpy.ndarray, action_dim: int)

A function for combining the recommendations and scores resulting from multiple scoring methods, according to their relative rankings.

Parameters

rankings (np.ndarray) – An array containing the relative ranking for each observation (row), according to each scoring method (column).
scored_actions (np.ndarray) – An array containing the recommended action for each observation (row), according to each scoring method (column).
action_dim (int) – The cardinality of the action space.

Returns

combined_rankings (np.ndarray) – The relative rank of each observation in the dataset, in the range (0, 1]; a higher rank corresponds to a higher uplift score.
combined_score_action (np.ndarray) – The serial index of the action corresponding to the highest score, per observation.
combined_score (np.ndarray) – The score for each observation, according to the provided configuration.
action_dim (int) – The quantity of actions taken into account.
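One way to read "combining according to the relative rankings" is as a rank-weighted vote: each method adds its ranking for an observation to the action it recommends, and the action with the largest accumulated weight wins. The sketch below illustrates that idea only. The function name `combine_by_rank_weight`, the use of 0-based action indices, and the tie-breaking rule (lowest index wins) are assumptions, and the library's actual combination rule may differ.

```python
import numpy as np


def combine_by_rank_weight(rankings, scored_actions, action_dim):
    """Rank-weighted vote across scoring methods (illustrative sketch).

    rankings and scored_actions both have shape (n_observations, n_methods).
    """
    rankings = np.asarray(rankings, dtype=float)
    scored_actions = np.asarray(scored_actions, dtype=int)
    n_obs, n_methods = rankings.shape

    # weights[i, a] accumulates the rankings of all methods that
    # recommended action a for observation i.
    weights = np.zeros((n_obs, action_dim))
    rows = np.repeat(np.arange(n_obs), n_methods)
    np.add.at(weights, (rows, scored_actions.ravel()), rankings.ravel())

    # Pick the action with the largest accumulated weight; argmax breaks
    # ties in favour of the lowest action index.
    combined_action = weights.argmax(axis=1)
    combined_score = weights.max(axis=1)
    return combined_action, combined_score


rankings = np.array([[0.9, 0.2], [0.5, 0.6], [0.3, 0.3]])
scored_actions = np.array([[1, 0], [0, 0], [1, 2]])
combined_action, combined_score = combine_by_rank_weight(
    rankings, scored_actions, action_dim=3
)
```

In this example the two methods disagree on the first observation, and the method with the higher ranking (0.9) determines the combined action; on the second observation they agree, so their rankings add up.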

single_scoring_method_calc(dataset: Union[Dict[str, numpy.ndarray], pandas.core.frame.DataFrame], scoring_method: Dict)

This function applies a single scoring method configuration to the provided dataset, and returns the resulting scores together with the recommended actions derived from them.

Parameters

dataset (Union[Dict[str, np.ndarray], pd.DataFrame]) – The set of observations to score.
scoring_method (Dict) – A dictionary representing the scoring method configuration.

Returns

rankings (np.ndarray) – The relative rank of each observation in the dataset, in the range (0, 1]; a higher rank corresponds to a higher uplift score.
scored_action (np.ndarray) – The serial index of the action corresponding to the highest score, per observation.
observation_score (np.ndarray) – The score for each observation, according to the provided configuration.
action_dim (int) – The quantity of actions taken into account.
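The four return values listed above can be illustrated with a small sketch. It assumes the scoring step yields one score per action for every observation, as an array of shape (n_observations, action_dim), and that the recommended action is the 0-based column index of the maximum. Both are assumptions made for illustration, and `score_and_action` is a hypothetical name rather than a library function.

```python
import numpy as np


def score_and_action(per_action_scores):
    """Derive rankings, actions and scores from per-action scores (sketch)."""
    per_action_scores = np.asarray(per_action_scores, dtype=float)
    action_dim = per_action_scores.shape[1]

    # Recommended action: index of the highest score in each row.
    scored_action = per_action_scores.argmax(axis=1)
    # Observation score: the score of that recommended action.
    observation_score = per_action_scores.max(axis=1)

    # Relative rank in (0, 1]: ordinal position of each score divided by n.
    ordinal = np.argsort(np.argsort(observation_score, kind="stable"), kind="stable")
    rankings = (ordinal + 1) / observation_score.size
    return rankings, scored_action, observation_score, action_dim


rankings, scored_action, observation_score, action_dim = score_and_action(
    [[0.1, 0.4], [0.9, 0.2], [0.3, 0.3]]
)
```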

score_computation(dataset: Union[Dict[str, numpy.ndarray], pandas.core.frame.DataFrame], scoring_method: Dict) → numpy.ndarray

This function uses a single scoring method configuration and applies it to the provided dataset, for score computation.

Parameters

dataset (Union[Dict[str, np.ndarray], pd.DataFrame]) – The dataset containing the observations that require scoring.
scoring_method (Dict) – A dictionary specifying the scoring method configuration.

Returns

The resulting scores, corresponding to the provided scoring method configuration.

Return type

np.ndarray
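As the module description puts it, a scoring configuration names the relevant fields and the function to apply to them. A minimal sketch of that pattern follows. The key names `"fields"` and `"function"`, the column names, and the helper `compute_score` are all hypothetical; the actual configuration schema is not documented on this page.

```python
import numpy as np


def compute_score(dataset, scoring_method):
    """Apply a configured function to the configured fields (sketch)."""
    # "fields" and "function" are hypothetical key names for this sketch.
    columns = [np.asarray(dataset[name]) for name in scoring_method["fields"]]
    return np.asarray(scoring_method["function"](*columns))


# Hypothetical configuration: score = treated prediction - control prediction.
method = {
    "fields": ["treated_pred", "control_pred"],
    "function": lambda treated, control: treated - control,
}
dataset = {
    "treated_pred": np.array([0.6, 0.2]),
    "control_pred": np.array([0.5, 0.4]),
}
scores = compute_score(dataset, method)
```

Since `dataset[name]` also selects a column from a pandas DataFrame, the same sketch works unchanged for either of the two accepted dataset types.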

static rank_scores(observation_score: numpy.ndarray) → numpy.ndarray

A method for computing the relative rank of each observation (within the provided dataset), according to its computed score.

Parameters: observation_score (np.ndarray) – An array representing the score for each observation.
Returns: A relative value in the range (0, 1] indicating the score rank (within the given dataset) of each observation.
Return type: np.ndarray
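A relative rank in (0, 1] can be obtained by dividing each score's ordinal position by the number of observations. The sketch below shows one such computation with NumPy. The function name `relative_rank` is hypothetical, and the handling of ties (broken by input order here) is an assumption that may not match the library.

```python
import numpy as np


def relative_rank(observation_score):
    """Map each score to its relative rank in (0, 1]; the top score gets 1.0."""
    scores = np.asarray(observation_score, dtype=float)
    # argsort of argsort yields the 0-based ordinal position of each score.
    ordinal = np.argsort(np.argsort(scores, kind="stable"), kind="stable")
    # Shift to 1-based so the lowest score maps to 1/n rather than 0.
    return (ordinal + 1) / scores.size


ranks = relative_rank(np.array([0.2, -1.0, 3.5, 0.7]))
# ranks is [0.5, 0.25, 1.0, 0.75]
```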