autofunc.get_precision_recall

Find the accuracy of a prediction based on the known results in a testing set and the predicted results from a training set. The accuracy is represented in the F1 score.

Module Contents

Functions

precision_recall(thresh_results, test_records)

Imports the results of the frequency-finding and thresholding algorithm, which are the results predicted to be in

autofunc.get_precision_recall.precision_recall(thresh_results, test_records)

Imports the results of the frequency-finding and thresholding algorithm, which are the results predicted to be in future components. The functions and flows for the components in the testing set are predicted based on these results, which are compared with the actual results and used to calculate the accuracy of how well the prediction performed.

Predicted?

Yes

No

Actual?

Yes

True Positive

False Negative

No

False Positive

True Negative

TP = True Positive, FP = False Positive, FN = False Negative, TN = True Negative

Precision is the ratio of correct predictions to all predictions made by the classifier (TP/(TP + FP)). This number is the ratio of predictions that were identified as being in the product that are actually in the product.

Recall is the ratio of correct predictions to all actual results made by the classifier (TP/(TP + FN)). This number is the ratio of the actual results that were correctly predicted.

Recall is representative of the confidence that no positives have been missed and precision is representative of the confidence in the True Positives.

The F1 score is the harmonic mean of precision and recall

(2 * precision * recall) / (precision + recall)

Parameters
  • thresh_results (dict) –

  • results of the "find_top_thresh" function. These are the predicted function and flow combinations for each (The) –

  • component

  • test_records (list) –

  • function and flow combinations for each component in the testing set (The) –

  • in a list. (organized) –

Returns

  • learned_dict

  • Returns a dictionary of what was learned from the results of the data mining automation

  • matched

  • A dictionary of the functions and flows that were True Positives

  • overmatched

  • A dictionary of the functions and flows that were False Positives

  • unmatched

  • A dictionary of the functions and flows that were False Negatives

  • recall

  • A single number for the recall score for this combination of testing and training sets

  • precision

  • A single number for the precision score for this combination of testing and training sets

  • f1

  • A single number for the F1 score for this combination of testing and training sets