graph2class module

graph2class.calc_bc(G, return_dict)

Parallel subprocess function to calculate the betweenness centrality.

graph2class.calc_graph_features(G)

Calculates several graph network features. If the graph is not connected, the largest connected subgraph is used. Uses multiprocessing for parallelism.

graph2class.calc_shortest_pthlen(G, return_dict)

Parallel subprocess function to calculate the average shortest path length.

graph2class.calc_similarity_score(G1_dict, G2_dict, feature_list)

Calculates the similarity score of two graphs.

Parameters:

G1_dict ([dict] or [pandas dataframe]) – graph 1 features dictionary or dataframe. Values must be accessible by key.
G2_dict ([dict] or [pandas dataframe]) – graph 2 features dictionary or dataframe. Values must be accessible by key.
feature_list ([list]) – list of graph features to compare. Each entry must be a key in the graph features dictionaries above.

Returns:

Similarity score between 0 and 1, where 1 means the two graphs are identical.

Return type:

[float]
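The description above can be illustrated with a minimal sketch. It is not the module's implementation: the function name `mean_feature_similarity`, the feature names `n_nodes` and `density`, the normalization by the larger magnitude, and the use of a plain mean to combine per-feature similarities are all assumptions made for illustration.

```python
def mean_feature_similarity(g1, g2, feature_list):
    """Hypothetical helper: average per-feature similarity of two feature dicts."""
    if not feature_list:
        raise ValueError("feature_list must contain at least one key")
    sims = []
    for key in feature_list:
        x1, x2 = g1[key], g2[key]
        # Assumed relative distance: |x1 - x2| scaled by the larger magnitude.
        scale = max(abs(x1), abs(x2))
        sims.append(1.0 if scale == 0 else 1.0 - abs(x1 - x2) / scale)
    # Assumed aggregation: unweighted mean over the selected features.
    return sum(sims) / len(sims)


# Hypothetical feature dictionaries for two graphs.
g1 = {"n_nodes": 100, "density": 0.2}
g2 = {"n_nodes": 80, "density": 0.2}
score = mean_feature_similarity(g1, g2, ["n_nodes", "density"])
print(score)
```

Because values are looked up by key, the same sketch would also work with a pandas Series or a single dataframe row in place of the dictionaries.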

graph2class.classify_graphs(class_file_list, sample_file_list, feature_list)

Computes the similarity score between each sample graph and each class graph, given lists of class and sample graph files.

Parameters:

class_file_list ([list]) – list of control/reference graph files (classes)
sample_file_list ([list]) – list of non-control/non-reference graph files (samples)
feature_list ([list]) – list of features to use for the similarity score. Each entry must be a valid key in the graph features dictionary/dataframe (above).

Returns:

Each column is a class and each row holds the similarity scores of one sample graph.

Return type:

[pandas dataframe]

graph2class.process_graphs(graph_fnames)

Takes a list of graph files, calculates their features, and returns them as a dataframe.

Parameters:

graph_fnames ([list]) – list of graph filenames to process

Returns:

dataframe containing graph features for each graph in the filename list

Return type:

[pandas dataframe]

graph2class.process_similarity_df(class_similarity_df)

Generates y_true and y_pred based on the similarity score dataframe.

y_true is a list where each index is a class and each value is the class value. E.g., class 1 is y_true[1] = 1, class 2 is y_true[2] = 2, etc.

y_pred is a list where each index is a sample and each value is the maximum similarity score for that sample.

Note: This assumes the correct classification is along the diagonal of the similarity matrix/dataframe.

Parameters:

class_similarity_df ([pandas dataframe]) – each column is a class graph and each row is a sample graph. A_ij is the similarity score between graphs i and j. The exception is one column ‘name’ which contains the names of the sampled graphs for each row.

Returns:

y_true, y_pred

Return type:

([tuple of lists])
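The diagonal assumption noted above can be sketched with plain Python rows standing in for the dataframe. This is a hedged illustration only: the function name `labels_from_similarity`, the column names, and the scores are hypothetical, and the sketch interprets the prediction as the index of the class with the highest score, which is an assumption rather than a statement of what `process_similarity_df` returns.

```python
def labels_from_similarity(rows, class_names):
    """Hypothetical helper: derive (y_true, y_pred) from similarity rows."""
    y_true, y_pred = [], []
    for i, row in enumerate(rows):
        # Ignore the 'name' column; keep only the class score columns.
        scores = [row[c] for c in class_names]
        # Diagonal assumption: sample i belongs to class i.
        y_true.append(i)
        # Assumed prediction: index of the class with the highest score.
        y_pred.append(max(range(len(scores)), key=scores.__getitem__))
    return y_true, y_pred


# Hypothetical similarity table: one row per sample, one column per class.
rows = [
    {"name": "sample_a", "class_0": 0.91, "class_1": 0.40},
    {"name": "sample_b", "class_0": 0.75, "class_1": 0.62},
]
y_true, y_pred = labels_from_similarity(rows, ["class_0", "class_1"])
print(y_true, y_pred)
```

In this made-up table the second sample scores highest against the first class, so it counts as a misclassification under the diagonal assumption.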

graph2class.similarity_measure(x1, x2)

Calculates the similarity between two feature values, defined as 1 minus the relative distance between the features (x1 and x2).

Parameters:

Returns:

Relative similarity between the two feature values.

Return type:

[float]
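The documentation does not state how the relative distance is normalized. The sketch below assumes the absolute difference is divided by the larger of the two magnitudes; the function name `relative_similarity` and the handling of two zero values are likewise assumptions, not the module's code.

```python
def relative_similarity(x1, x2):
    """Hypothetical helper: 1 minus an assumed relative distance."""
    # Assumed normalization: scale by the larger magnitude of the two values.
    scale = max(abs(x1), abs(x2))
    if scale == 0:
        # Both values are zero, so treat them as identical.
        return 1.0
    return 1.0 - abs(x1 - x2) / scale


print(relative_similarity(3.0, 3.0))
print(relative_similarity(2.0, 4.0))
```

With this normalization the result stays between 0 and 1 whenever both values have the same sign; values of opposite sign would give a negative result, so a real implementation may need a different choice.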
