uplift_analysis.scoring module
This module implements a scoring utility wrapped as a class named Scorer.
Given one or more scoring configurations, each specifying the relevant fields and the function to apply to them, every observation in the input dataset is scored.
When multiple scoring configurations are provided, the scores of the individual methods are combined into a new score by weighting the magnitude of the score associated with each action (relevant for scenarios with multiple actions).
Notes
Scorer also supports use cases with multiple treatments.
- class uplift_analysis.scoring.Scorer(scoring_configuration: Optional[Union[Dict, List[Dict]]] = None)
The Scorer class is used for scoring observations on a given dataset, according to a provided configuration, or a set of configurations.
- Parameters
scoring_configuration (Optional[Union[Dict, List[Dict]]]) – A single configuration or a list of configurations (each represented as a dict) specifying the scoring methods.
- set_scoring_config(scoring_configuration: Union[Dict, List[Dict]]) → None
A method for setting the scoring configuration associated with the object.
- Parameters
scoring_configuration (Union[Dict, List[Dict]]) – A single configuration or a list of configurations (each represented as a dict) specifying the scoring methods.
- calculate_scores(dataset: Union[Dict[str, numpy.ndarray], pandas.core.frame.DataFrame], scoring_configuration: Optional[Union[Dict, List[Dict]]] = None) → Tuple
This function serves as the primary interface of the class. Given a dataset and a scoring configuration, it returns the corresponding score for each observation in the set, accompanied by the recommended action.
- Parameters
dataset (Union[Dict[str, np.ndarray], pd.DataFrame]) – the dataset to be scored.
scoring_configuration (Optional[Union[Dict, List[Dict]]]) – The configuration according to which the observations will be scored.
- Returns
rankings (np.ndarray) – The relative rank, in the range (0, 1], of each observation in the dataset; a higher rank means a higher uplift score.
scored_actions (np.ndarray) – The serial index of the action corresponding to the highest score, per observation.
scores (np.ndarray) – The score for each observation, according to the provided configuration.
action_dim (int) – The quantity of actions taken into account.
- multiple_scoring_methods_calc(dataset: Union[Dict[str, numpy.ndarray], pandas.core.frame.DataFrame], scoring_methods: List[Dict])
This function applies a set of scoring method configurations to the provided dataset, and returns the resulting scores, and the recommended actions after combining the set of computed scores.
- Parameters
dataset (Union[Dict[str, np.ndarray], pd.DataFrame]) – The set of observations to score.
scoring_methods (List[Dict]) – A list of dictionaries representing the scoring method configurations.
- Returns
rankings (np.ndarray) – The relative rank, in the range (0, 1], of each observation in the dataset; a higher rank means a higher uplift score.
scored_actions (np.ndarray) – The serial index of the action corresponding to the highest score, per observation.
scores (np.ndarray) – The score for each observation, according to the provided configuration.
action_dim (int) – The quantity of actions taken into account.
- combine_scores(rankings: numpy.ndarray, scored_actions: numpy.ndarray, action_dim: int)
A function for combining the recommendations and scores resulting from multiple scoring methods, according to the relative rankings.
- Parameters
rankings (np.ndarray) – An array containing the relative ranking for each observation (row), according to each scoring method (column).
scored_actions (np.ndarray) – An array containing the recommended action for each observation (row), according to each scoring method (column).
action_dim (int) – The cardinality of the action space.
- Returns
combined_rankings (np.ndarray) – The relative rank, in the range (0, 1], of each observation in the dataset; a higher rank means a higher uplift score.
combined_score_action (np.ndarray) – The serial index of the action corresponding to the highest score, per observation.
combined_score (np.ndarray) – The score for each observation, according to the provided configuration.
action_dim (int) – The quantity of actions taken into account.
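The combination step can be illustrated with a minimal sketch. It is not the library's implementation: it assumes one plausible reading of "weighting the magnitude of the score associated with each action", in which each method's relative ranking is added to the action that method recommends, and the action with the largest accumulated weight wins. The function name combine_by_rank_weight and the use of 0-based action indices are assumptions made for this example.

```python
import numpy as np


def combine_by_rank_weight(rankings, scored_actions, action_dim):
    """Hypothetical combiner: accumulate per-method rankings on the recommended actions."""
    rankings = np.asarray(rankings, dtype=float)            # shape (n_obs, n_methods)
    scored_actions = np.asarray(scored_actions, dtype=int)  # shape (n_obs, n_methods)
    n_obs, n_methods = rankings.shape

    # weights[i, a] = sum of the rankings of all methods that recommended action a for row i.
    weights = np.zeros((n_obs, action_dim))
    rows = np.repeat(np.arange(n_obs), n_methods)
    np.add.at(weights, (rows, scored_actions.ravel()), rankings.ravel())

    combined_score_action = weights.argmax(axis=1)
    combined_score = weights.max(axis=1)

    # Relative rank in (0, 1]: 1-based position in sorted order, divided by n_obs.
    positions = combined_score.argsort(kind="stable").argsort(kind="stable")
    combined_rankings = (positions + 1) / n_obs
    return combined_rankings, combined_score_action, combined_score, action_dim


# Three observations, two scoring methods, two possible actions.
rankings = np.array([[0.9, 0.2], [0.4, 0.8], [0.3, 0.5]])
scored_actions = np.array([[0, 1], [1, 1], [0, 1]])
combined_rankings, combined_action, combined_score, action_dim = combine_by_rank_weight(
    rankings, scored_actions, action_dim=2
)
```

In the second observation both methods agree on action 1, so their rankings add up and that observation receives the highest combined rank.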
- single_scoring_method_calc(dataset: Union[Dict[str, numpy.ndarray], pandas.core.frame.DataFrame], scoring_method: Dict)
This function applies a single scoring method configuration to the provided dataset, and returns the resulting scores, and the recommended actions according to these scores.
- Parameters
dataset (Union[Dict[str, np.ndarray], pd.DataFrame]) – The set of observations to score.
scoring_method (Dict) – A dictionary representing the scoring method configuration.
- Returns
rankings (np.ndarray) – The relative rank, in the range (0, 1], of each observation in the dataset; a higher rank means a higher uplift score.
scored_action (np.ndarray) – The serial index of the action corresponding to the highest score, per observation.
observation_score (np.ndarray) – The score for each observation, according to the provided configuration.
action_dim (int) – The quantity of actions taken into account.
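As a rough illustration of how the four returned values relate to each other, the sketch below starts from an array of per-action scores and derives the recommended action, the observation score, the relative ranking, and the action count. The helper name score_and_recommend, the (n_observations, n_actions) input layout, and the 0-based action indices are assumptions for this example; the library's internal conventions may differ.

```python
import numpy as np


def score_and_recommend(action_scores):
    """Hypothetical helper: derive rankings, actions and scores from per-action scores."""
    action_scores = np.asarray(action_scores, dtype=float)
    if action_scores.ndim == 1:
        # Single-action case: treat the vector as one column.
        action_scores = action_scores[:, None]
    n_obs, action_dim = action_scores.shape

    # Recommended action = index of the highest score in each row.
    scored_action = action_scores.argmax(axis=1)
    observation_score = action_scores[np.arange(n_obs), scored_action]

    # Relative rank in (0, 1]: 1-based position in sorted order, divided by n_obs.
    positions = observation_score.argsort(kind="stable").argsort(kind="stable")
    rankings = (positions + 1) / n_obs
    return rankings, scored_action, observation_score, action_dim


# Three observations scored against two actions.
action_scores = np.array([[0.1, 0.3], [0.5, 0.2], [-0.2, 0.0]])
rankings, scored_action, observation_score, action_dim = score_and_recommend(action_scores)
```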
- score_computation(dataset: Union[Dict[str, numpy.ndarray], pandas.core.frame.DataFrame], scoring_method: Dict) → numpy.ndarray
This function uses a single scoring method configuration and applies it to the provided dataset, for score computation.
- Parameters
dataset (Union[Dict[str, np.ndarray], pd.DataFrame]) – The dataset containing the observations that require scoring.
scoring_method (Dict) – A dictionary specifying the scoring method configuration.
- Returns
The resulting scores, corresponding to the provided scoring method configuration.
- Return type
np.ndarray
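The idea of a configuration that names the relevant fields and the function to apply to them can be sketched as below. The keys "fields" and "func" and the helper apply_scoring_method are invented for this illustration and are not the library's configuration schema; consult the package documentation for the actual keys.

```python
import numpy as np


def apply_scoring_method(dataset, scoring_method):
    """Hypothetical helper: apply the configured function to the configured fields."""
    # "fields" and "func" are illustrative key names, not the library's schema.
    columns = [np.asarray(dataset[name], dtype=float) for name in scoring_method["fields"]]
    return np.asarray(scoring_method["func"](*columns))


# A dict of arrays; a pandas DataFrame with the same column names would be indexed the same way.
dataset = {
    "treated_pred": np.array([0.6, 0.3, 0.8]),
    "control_pred": np.array([0.5, 0.4, 0.2]),
}
# Score each observation as the difference between the two predictions.
config = {"fields": ["treated_pred", "control_pred"], "func": np.subtract}
scores = apply_scoring_method(dataset, config)
```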
- static rank_scores(observation_score: numpy.ndarray) → numpy.ndarray
A method for computing the relative rank of each observation (within the provided dataset), according to its computed score.
- Parameters
observation_score (np.ndarray) – An array representing the score for each observation.
- Returns
A relative value in the range (0, 1] indicating the score rank (within the given dataset) of each observation.
- Return type
np.ndarray
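A relative rank in (0, 1] can be obtained by dividing each observation's 1-based rank by the number of observations. The sketch below shows one way to do this with NumPy. The function name relative_rank is hypothetical, and ties are broken by input order here, which may not match how the library treats equal scores.

```python
import numpy as np


def relative_rank(observation_score):
    """Map each score to its 1-based rank divided by the number of observations."""
    scores = np.asarray(observation_score, dtype=float)
    # argsort of argsort gives each score's 0-based position in ascending order.
    positions = scores.argsort(kind="stable").argsort(kind="stable")
    # Shift to 1-based ranks and normalise, so the lowest value is 1/n and the highest is 1.
    return (positions + 1) / scores.size


scores = np.array([0.2, -0.1, 0.7, 0.4])
ranks = relative_rank(scores)
```

The highest score (0.7) maps to 1.0 and the lowest (-0.1) maps to 0.25, so no observation receives a rank of 0.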