autofunc.find_similarities
Builds a similarity matrix with product IDs as both the row and column labels; each cell holds the similarity between the corresponding pair of products. The diagonal is 1 because each product is 100% similar to itself.
Similarity here is defined as the percentage of components that two products have in common. The matrix is not symmetric because products can have different numbers of components, so the same set of shared components can make up a different share of each product.
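The asymmetry is easiest to see on a small example. The following is a minimal sketch of the idea, not the module's actual implementation: the function name `sketch_similarity`, the column names `id` and `comp`, and the choice to divide by the row product's component count are all assumptions made for illustration.

```python
import pandas as pd


def sketch_similarity(df, id_col="id", comp_col="comp"):
    """Illustrative only: fraction of the row product's components
    that also appear in the column product (assumed normalisation)."""
    # Map each product ID to the set of its distinct components.
    components = {pid: set(group) for pid, group in df.groupby(id_col)[comp_col]}
    ids = list(components)

    # Start from an n x n frame of zeros labelled by product ID.
    sim = pd.DataFrame(0.0, index=ids, columns=ids)
    for row_id in ids:
        for col_id in ids:
            shared = components[row_id] & components[col_id]
            # Assumption: normalise by the size of the row product.
            sim.loc[row_id, col_id] = len(shared) / len(components[row_id])
    return sim


# Hypothetical data: product 1 has three components, product 2 has two.
example = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "comp": ["motor", "gear", "housing", "motor", "gear"],
})
matrix = sketch_similarity(example)
print(matrix)
```

Under these assumptions, product 2 shares all of its components with product 1, while product 1 shares only two of its three with product 2, so the two off-diagonal entries differ even though the diagonal is 1.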
Module Contents
Functions
find_similarities(input_dataframe) | Find the similarity between all products in a repository
autofunc.find_similarities.find_similarities(input_dataframe)
Find the similarity between all products in a repository
- Parameters
input_dataframe (Pandas dataframe) – A Pandas dataframe with the product information
- Returns
similarity_df – A Pandas dataframe in n×n matrix format with the similarity between each pair of products
- Return type
Pandas dataframe