probabilistic_model.learning.nyga_distribution

Classes#

`InductionStep`	Class for performing induction in the NygaDistributions.
`NygaDistribution`	A Nyga distribution is a way to learn a deterministic mixture of uniform distributions.

Module Contents#

class probabilistic_model.learning.nyga_distribution.InductionStep#

Class for performing induction in the NygaDistributions.

data: numpy.array#: All data points, sorted and with duplicates removed.

cumulative_weights: numpy.array#: The cumulative weights of the samples in the dataset.

cumulative_log_weights: numpy.array#: The cumulative logarithmic weights of the samples in the dataset.

begin_index: int#: Included index of the first sample.

end_index: int#: Excluded index of the last sample.

nyga_distribution: NygaDistribution#: The Nyga Distribution to mount the quantile distributions into and read the parameters from.

property variable#: The variable of the distribution.

property min_samples_per_quantile#: The minimal number of samples per quantile.

property min_likelihood_improvement#: The relative, minimal likelihood improvement needed to create a new quantile.

left_connecting_point() → float#: Calculate the left connecting point.

property number_of_samples#: The number of samples in the induction step.

property total_weights#: The total sum of the weights of the samples in the induction step.

property total_log_weights#: The total sum of the logarithmic weights of the samples in the induction step.

left_connecting_point_from_index(index) → float#

Calculate the left connecting point given some beginning index.

Parameters:: index – The index of the left datapoint.

right_connecting_point() → float#: Calculate the right connecting point.

right_connecting_point_from_index(index) → float#

Calculate the right connecting point given some ending index.

Parameters:: index – The index of the right datapoint.

create_uniform_distribution() → probabilistic_model.distributions.UniformDistribution#: Create a uniform distribution from this induction step.

create_uniform_distribution_from_indices(begin_index: int, end_index: int) → probabilistic_model.distributions.UniformDistribution#

Create a uniform distribution from the datapoint at begin_index to the datapoint at end_index.

Parameters:

begin_index – The index of the first datapoint.
end_index – The index of the last datapoint.

sum_weights_from_indices(begin_index: int, end_index: int) → float#: Sum the weights from begin_index to end_index.

sum_weights()#: Sum the weights of this induction step.

sum_log_weights_from_indices(begin_index: int, end_index: int) → float#: Sum the logarithmic weights from begin_index to end_index.

sum_log_weights()#: Sum the logarithmic weights of this induction step.
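The cumulative arrays suggest that range sums are answered by subtracting two prefix sums rather than by re-adding the samples. The following is a minimal sketch of that idea in plain Python. It assumes a half-open range (begin_index included, end_index excluded, as documented for the attributes above) and a leading zero in the cumulative array; both the sample weights and the array layout are illustrative assumptions, not the library's confirmed internals.

```python
from itertools import accumulate

# Hypothetical sample weights, chosen only for illustration.
weights = [0.5, 1.0, 2.0, 0.5, 1.0]

# cumulative[i] holds the sum of weights[0:i]. The leading 0.0 is an
# assumption of this sketch; it makes the subtraction below uniform.
cumulative = [0.0] + list(accumulate(weights))


def sum_weights_from_indices(begin_index: int, end_index: int) -> float:
    """Sum weights[begin_index:end_index] in constant time."""
    return cumulative[end_index] - cumulative[begin_index]


# Sum of the weights at indices 1, 2 and 3.
partial = sum_weights_from_indices(1, 4)
```

The same prefix-sum idea would apply unchanged to the logarithmic weights.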

compute_best_split() → Tuple[float, int | None]#

Compute the best split of the data.

The best split of the data is computed by evaluating the log likelihood of every possible split and memorizing the best one.

Returns:: The maximum log likelihood and the best split index.
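The exhaustive search described above can be sketched as follows. This is not the library's code: it assumes that the connecting point of a split is the midpoint between the two neighbouring data points, that the outer bounds are the first and last data point, and that each side is a uniform component whose mixture weight is its share of the total weight. The function names merely mirror the documented ones.

```python
import math


def uniform_log_likelihood(weight_sum, total_weight, lower, upper):
    """Weighted log likelihood of the samples covered by one uniform component."""
    density = (weight_sum / total_weight) / (upper - lower)
    return weight_sum * math.log(density)


def compute_best_split(data, weights, min_samples_per_quantile=2):
    """Return (maximum log likelihood, best split index or None).

    data must be sorted and free of duplicates.
    """
    n = len(data)
    total = sum(weights)
    best_log_likelihood, best_index = -math.inf, None

    # A split at index i puts data[:i] on the left and data[i:] on the right.
    for split in range(min_samples_per_quantile, n - min_samples_per_quantile + 1):
        # Assumed connecting point: midpoint between the neighbours.
        connecting_point = (data[split - 1] + data[split]) / 2
        # A real implementation would read this from cumulative sums.
        left_weight = sum(weights[:split])
        right_weight = total - left_weight

        log_likelihood = (
            uniform_log_likelihood(left_weight, total, data[0], connecting_point)
            + uniform_log_likelihood(right_weight, total, connecting_point, data[-1])
        )
        if log_likelihood > best_log_likelihood:
            best_log_likelihood, best_index = log_likelihood, split

    return best_log_likelihood, best_index


best_ll, best_index = compute_best_split([0.0, 1.0, 2.0, 10.0], [1.0, 1.0, 1.0, 1.0])
```

If no index satisfies the minimum sample count on both sides, the sketch returns None as the split index, which matches the optional index in the documented return type.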

log_likelihood_without_split() → float#

Calculate the log likelihood without splitting.

Returns:: The log likelihood without splitting.

log_likelihood_of_split_side(split_index: int, connecting_point: float) → float#

Calculate the log likelihood of a split side.

This method automatically determines if this is the left or right side of the split.

Parameters:

split_index – The index of the split.
connecting_point – The connecting point.

Returns:

The log likelihood of the split side.

construct_left_induction_step(split_index: int) → typing_extensions.Self#

Construct the left induction step.

Parameters:: split_index – The index of the split.

construct_right_induction_step(split_index: int) → typing_extensions.Self#

Construct the right induction step.

Parameters:: split_index – The index of the split.

improvement_is_good_enough(maximum_log_likelihood: float) → bool#

Check if the improvement is good enough.

Parameters:: maximum_log_likelihood – The improved maximum log likelihood.

Returns:: Whether the improvement is good enough.
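The documentation does not state how the relative improvement is measured. One plausible reading, shown here purely as an assumption, is that the likelihood with the split must exceed the likelihood without it by a factor of at least 1 + min_likelihood_improvement, which becomes an additive threshold in log space. The standalone function below takes both log likelihoods as arguments, unlike the documented method.

```python
import math


def improvement_is_good_enough(
    maximum_log_likelihood: float,
    log_likelihood_without_split: float,
    min_likelihood_improvement: float = 0.01,
) -> bool:
    """Assumed rule: likelihood ratio >= 1 + min_likelihood_improvement."""
    gain = maximum_log_likelihood - log_likelihood_without_split
    return gain >= math.log1p(min_likelihood_improvement)


accepted = improvement_is_good_enough(-7.86, -9.21)
rejected = improvement_is_good_enough(-9.205, -9.21)
```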

induce() → List[typing_extensions.Self]#

Perform one induction step.

Returns:: The (possibly empty) list of new induction steps.
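Since each induction step returns the steps that follow it, the whole fit can be pictured as a worklist that is processed until it is empty. The sketch below illustrates that control flow with unweighted samples. The midpoint connecting points, the log-space acceptance rule and the function name fit_uniform_mixture are assumptions made for this example, not the library's confirmed behaviour.

```python
import math
from collections import deque


def fit_uniform_mixture(data, min_samples=2, min_improvement=0.01):
    """Return sorted (lower, upper, mixture_weight) tuples of uniform components."""
    data = sorted(set(data))
    n = len(data)
    if n < 2:
        raise ValueError("at least two distinct data points are required")

    def boundary(index):
        # Assumed connecting point: midpoint between neighbours,
        # clamped to the data range at both ends.
        if index <= 0:
            return data[0]
        if index >= n:
            return data[-1]
        return (data[index - 1] + data[index]) / 2

    def log_likelihood(begin, end, lower, upper):
        count = end - begin
        return count * math.log((count / n) / (upper - lower))

    components = []
    queue = deque([(0, n)])  # half-open index ranges still to be processed
    while queue:
        begin, end = queue.popleft()
        lower, upper = boundary(begin), boundary(end)
        base = log_likelihood(begin, end, lower, upper)

        best, best_split = -math.inf, None
        for split in range(begin + min_samples, end - min_samples + 1):
            middle = boundary(split)
            candidate = (log_likelihood(begin, split, lower, middle)
                         + log_likelihood(split, end, middle, upper))
            if candidate > best:
                best, best_split = candidate, split

        if best_split is not None and best - base >= math.log1p(min_improvement):
            # The step yields two new induction steps.
            queue.append((begin, best_split))
            queue.append((best_split, end))
        else:
            # The step yields no new steps and mounts one uniform component.
            components.append((lower, upper, (end - begin) / n))
    return sorted(components)


components = fit_uniform_mixture([0.0, 1.0, 2.0, 10.0], min_samples=1)
```

Because every range is either split into two non-empty ranges or turned into exactly one component, the resulting intervals tile the data range without overlap, which is what makes the mixture deterministic.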

class probabilistic_model.learning.nyga_distribution.NygaDistribution#

Bases: random_events.utils.SubclassJSONSerializer

A Nyga distribution is a way to learn a deterministic mixture of uniform distributions.

variable: random_events.variable.Continuous#

min_likelihood_improvement: float = 0.01#: The relative, minimal likelihood improvement needed to create a new quantile.

min_samples_per_quantile: int = 2#: The minimal number of samples per quantile.

probabilistic_circuit: probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit.ProbabilisticCircuit#

fit(data: numpy.array, weights: numpy.array | None = None) → probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit.ProbabilisticCircuit#

Fit the distribution to the data.

Parameters:

data – The data to fit the distribution to.
weights – The optional weights of the data points.

Returns:

The fitted distribution.

to_json() → Dict[str, Any]#

classmethod _from_json(json_data: Dict[str, Any]) → typing_extensions.Self#

Create an instance of this class from a JSON dict. This method is called by from_json after the correct subclass has been determined and should be overridden by each subclass.

Parameters:: json_data – The JSON dict.
Returns:: The deserialized object.

empty_copy() → typing_extensions.Self#

static from_uniform_mixture(mixture: probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit.ProbabilisticCircuit) → probabilistic_model.probabilistic_circuit.rx.probabilistic_circuit.ProbabilisticCircuit#

Construct a Nyga Distribution from a mixture of uniform distributions. The mixture does not have to be deterministic.

Parameters:: mixture – An arbitrary, univariate mixture of uniform distributions
Returns:: A Nyga Distribution describing the same function.
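One generic way to turn an overlapping mixture of uniforms into a deterministic one is to cut the real line at every component boundary and add up the densities on each piece. The sketch below shows that technique on plain tuples. The (weight, lower, upper) representation and the name make_deterministic are hypothetical, and whether the library proceeds this way is not confirmed by this page.

```python
def make_deterministic(mixture):
    """Rewrite a mixture of uniforms so that its components do not overlap.

    mixture is a list of (weight, lower, upper) tuples whose weights sum to 1.
    The result uses the same tuple layout and describes the same density.
    """
    # Every lower and upper bound is a potential change point of the density.
    points = sorted({p for _, lower, upper in mixture for p in (lower, upper)})

    result = []
    for a, b in zip(points, points[1:]):
        # The density is constant on [a, b]: sum all components covering it.
        density = sum(
            weight / (upper - lower)
            for weight, lower, upper in mixture
            if lower <= a and b <= upper
        )
        if density > 0:
            result.append((density * (b - a), a, b))
    return result


overlapping = [(0.5, 0.0, 2.0), (0.5, 1.0, 3.0)]
deterministic = make_deterministic(overlapping)
```

Pieces on which no component has mass are dropped, so gaps in the support of the input remain gaps in the output.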

all_union_of_mixture_points_with(other: typing_extensions.Self)#

Compute all union intervals of mixture points that arise when combining this distribution with other.

Returns:: A list of closed intervals representing all mixture points between the two distributions.

event_of_higher_density(other: typing_extensions.Self, own_node_weights, other_node_weights) → random_events.product_algebra.Event#