Hublabel #239
base: master
Changes from 55 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,287 @@ | ||
| /* | ||
|
Member
Include guards are missing for this header. |
||
| file for quickly playing around with stuff | ||
|
Member
This needs a better description; we can't mainline a file for quickly playing around with stuff. The file might also need a more descriptive file name. What determines if a piece belongs here or in Also, if there's not a class that represents the hub labeling module, this is where you would document the whole module: the concepts involved, citations to the literature, what functions the user of the module is expected to call in sequence to actually use it, and so on. |
||
| */ | ||
| #include "landmark.hpp" | ||
| #include "hublabel.hpp" | ||
|
|
||
| #include <boost/graph/adjacency_list.hpp> | ||
| #include <boost/graph/filtered_graph.hpp> | ||
| #include <boost/graph/graphviz.hpp> | ||
| #include <boost/graph/biconnected_components.hpp> | ||
| #include <bdsg/snarl_distance_index.hpp> | ||
|
|
||
| #include <iostream> | ||
|
|
||
| //#define debug_binary_intersection | ||
| //#define debug_hhl_query | ||
|
|
||
| namespace bdsg { | ||
|
|
||
| /** | ||
| * For a handle graph indexed with HHL, get the HHL rank ("Boost graph ID") for | ||
| * an orientation of a node, as a source or destination. | ||
| */ | ||
| NODE_UINT bgid(const handle_t& h, const bdsg::HashGraph& hg); | ||
|
|
||
| /** | ||
| * For a net graph indexed with HHL, get the HHL rank for an orientation of a | ||
| * net graph element (snarl start node, snarl end node, child node, child | ||
| * chain), as either the source or destination of a query. | ||
| * | ||
| * Snarl start nodes and snarl end nodes are handled so that "forward" | ||
| * orientation runs along the snarl, regardless of the orientation that the | ||
| * underlying handle graph node is in as a snarl boundary. | ||
| * | ||
| * Child chains and nodes are also handled so that "forward" orientation is the | ||
| * orientation the thing has in the snarl. So if a node is reversed in the | ||
| * snarl, asking about forward is actually asking about that node in its local | ||
| * reverse orientation. | ||
| * | ||
| * For net graphs, we need to distinguish between source and destination status | ||
| * to allow turning around within a child chain without traversing the full | ||
| * length of the chain. Each child chain needs to be represented by a subgraph | ||
| * with different in and out "port" nodes in each orientation. The source port | ||
| * is the one you would leave the node from in that orientation. | ||
| */ | ||
| NODE_UINT bgid(size_t net_rank, bool is_reverse, bool is_source); | ||
|
|
||
| /** | ||
| * For a handle or net graph indexed with HHL, take the HHL rank of an orientation of | ||
| * a node and get that of the opposite orientation of the same node. | ||
| * | ||
| * For handle graphs, ranks are the same for source and destination. | ||
| * | ||
| * For net graphs, ranks differ between source and destination "ports" for a | ||
| * net graph element; this also swaps source and destination status. | ||
| */ | ||
| NODE_UINT rev_bgid(NODE_UINT n); | ||
|
|
||
|
|
||
| typedef struct NodeProp { | ||
| // This is initialized by make_boost_graph() | ||
| DIST_UINT seqlen; | ||
| DIST_UINT max_out = 0; | ||
| NODE_UINT contracted_neighbors = 0; | ||
| NODE_UINT level = 0; | ||
| NODE_UINT arc_cover = 1; | ||
| bool contracted = false; | ||
| // This is left uninitialized until make_contraction_hierarchy() is run. | ||
| NODE_UINT new_id; | ||
| } NodeProp; | ||
|
|
||
| typedef struct EdgeProp { | ||
| bool contracted = false; | ||
| DIST_UINT weight = 0; | ||
| NODE_UINT arc_cover = 1; | ||
| bool ori = true; | ||
| } EdgeProp; | ||
|
Member
These should have a little more documentation. |
||
|
|
||
| typedef boost::adjacency_list<boost::vecS, boost::vecS, boost::bidirectionalS, NodeProp, EdgeProp> CHOverlay; | ||
| typedef boost::filtered_graph<CHOverlay, function<bool(CHOverlay::edge_descriptor)>> ContractedGraph; | ||
|
|
||
| /// Allow outputting CHOverlay objects. Output text does not end with a | ||
| /// newline. | ||
| std::ostream& operator<<(std::ostream& out, const CHOverlay& ov); | ||
|
|
||
| /** | ||
| * Build the intermediate hub labeling computation data structure ("Boost | ||
| * graph") from a HashGraph. | ||
| * | ||
| * The nodes in the graph must have dense node IDs starting at 1. | ||
| * | ||
| * For later queries, orientations of nodes are assigned ranks as provided by | ||
| * the bgid() function. | ||
| */ | ||
| CHOverlay make_boost_graph(const bdsg::HashGraph& hg); | ||
|
Member
Can this have a better name than The names of the hub labeling API functions we actually use don't have much to tie them together (they don't belong to a class as static methods, or to a namespace, or share any words in their names). Talking here about a "Boost graph" is telling the user about the implementation detail that should be hidden (the fact that Boost is a dependency) and not about the fact that this is hub-labeling-related or what the graph that's implemented using Boost actually represents. |
||
| /** | ||
| * Build the intermediate hub labeling computation data structure ("Boost | ||
| * graph") for the net graph of a snarl in a TemporaryDistanceIndex. | ||
| * | ||
| * all_children must contain the child chains and nodes of the snarl, as well as the bounding nodes of the snarl, in any order. | ||
| * | ||
| * For later queries, orientations of children or the snarl boundary nodes are assigned query ranks based on their snarl distance index rank. | ||
| * | ||
| * The snarl distance index ranks are 0 and 1 for the start and end nodes of the snarl, and the rank_in_parent field of the temporary index for each child. | ||
| */ | ||
| CHOverlay make_boost_graph(const bdsg::SnarlDistanceIndex::TemporaryDistanceIndex& temp_index, const SnarlDistanceIndex::temp_record_ref_t& snarl_index, const SnarlDistanceIndex::TemporaryDistanceIndex::TemporarySnarlRecord& temp_snarl_record, const vector<pair<SnarlDistanceIndex::temp_record_t, size_t>>& all_children, const HandleGraph* graph); | ||
|
|
||
| int edge_diff(ContractedGraph::vertex_descriptor nid, ContractedGraph& ch, CHOverlay& ov, vector<DIST_UINT>& node_dists, int hop_limit); | ||
|
|
||
| void contract(CHOverlay::vertex_descriptor nid, ContractedGraph& ch, CHOverlay& ov, vector<DIST_UINT>& node_dists, vector<bool>& shouldnt_contract, int hop_limit); | ||
|
Member
These should maybe be |
||
|
|
||
| /** | ||
| * Find the contraction hierarchy order for the graph. | ||
| * | ||
| * Initializes the new_id field of each NodeProp in the graph. | ||
| */ | ||
| void make_contraction_hierarchy(CHOverlay& ov); | ||
|
|
||
| template <typename ItrType> | ||
| ItrType get_dist_itr(ItrType start_itr, ItrType hub_itr) { | ||
| auto node_count = *start_itr; | ||
| auto last_fwd_end_bound_itr = next(start_itr, 1+node_count); | ||
| if (hub_itr >= next(start_itr, *last_fwd_end_bound_itr)) { | ||
| //backwards label | ||
| auto first_back_bound_itr = next(start_itr, 1+node_count+1); | ||
| auto last_back_bound_itr = next(start_itr, 1+node_count+1+node_count); | ||
| auto jump_to_dist = (*last_back_bound_itr) - *first_back_bound_itr; | ||
| return next(hub_itr, jump_to_dist); | ||
| } else { | ||
| //forwards label | ||
| auto first_fwd_bound_itr = next(start_itr, 1); | ||
| auto last_fwd_bound_itr = next(start_itr, 1+node_count); | ||
| auto jump_to_dist = (*last_fwd_bound_itr) - *first_fwd_bound_itr; | ||
| return next(hub_itr, jump_to_dist); | ||
| } | ||
| } | ||
|
|
||
| DIST_UINT binary_intersection_ch(vector<HubRecord>& v1, vector<HubRecord>& v2); | ||
|
Member
This could use a doc comment. |
||
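For reference, a doc comment for the vector overload could state the sortedness precondition and what comes back when no hub is shared. The following is only a sketch of that idea, not the PR's implementation: `HubEntry`, `kNoPath`, and `min_shared_hub_distance` are hypothetical names, and the field layout of the real `HubRecord` is assumed rather than known. It uses the same approach as the templated version in this diff (iterate over the shorter label, `lower_bound` into the longer one).

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

namespace hl_sketch {

// Hypothetical stand-in for HubRecord; the real definition is not shown in this diff.
struct HubEntry {
    std::size_t hub;   // rank of the hub vertex
    std::size_t dist;  // distance between the labelled vertex and that hub
};

// Hypothetical "no path" sentinel, playing the role of INF_INT.
constexpr std::size_t kNoPath = std::numeric_limits<std::size_t>::max();

/**
 * Return the smallest dist1 + dist2 over all hubs that appear in both labels,
 * or kNoPath if the labels share no hub.
 *
 * Both labels must be sorted by hub in ascending order.
 */
inline std::size_t min_shared_hub_distance(const std::vector<HubEntry>& v1,
                                           const std::vector<HubEntry>& v2) {
    // Use the shorter label for keys and binary-search the longer one.
    const std::vector<HubEntry>& keys = v1.size() < v2.size() ? v1 : v2;
    const std::vector<HubEntry>& search = v1.size() < v2.size() ? v2 : v1;

    auto lo = search.begin();
    std::size_t best = kNoPath;
    for (const HubEntry& k : keys) {
        // Keys are ascending, so the search can resume from the last position.
        lo = std::lower_bound(lo, search.end(), k,
                              [](const HubEntry& e, const HubEntry& key) {
                                  return e.hub < key.hub;
                              });
        if (lo == search.end()) {
            break;  // every remaining key is larger than anything left to search
        }
        if (lo->hub == k.hub) {
            best = std::min(best, lo->dist + k.dist);
        }
    }
    return best;
}

}  // namespace hl_sketch
```

Resuming each search from the previous hit keeps the cost near `|keys| * log(|search|)`, which is why picking the shorter label as the key side matters.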
| /** | ||
| * Do binary intersection to find shared labels for two vertices. | ||
| * | ||
| * start_itr should point to the first slot of the packed label data returned | ||
| * by pack_labels(), which is the label count. | ||
| * | ||
| * start_bound_index variables are relative to start_itr, and give the | ||
| * positions of the stored start bounds for the two labels; the stored end | ||
| * bounds will be in the slots after. | ||
| */ | ||
| template <typename ItrType> | ||
| DIST_UINT binary_intersection_ch(ItrType start_itr, size_t v1_start_bound_index, size_t v2_start_bound_index) { | ||
| auto v1_start_bound_itr = next(start_itr, v1_start_bound_index); | ||
| auto v1_end_bound_itr = next(v1_start_bound_itr, 1); | ||
| auto v2_start_bound_itr = next(start_itr, v2_start_bound_index); | ||
| auto v2_end_bound_itr = next(v2_start_bound_itr, 1); | ||
|
|
||
| auto v1_start_itr = next(start_itr, *v1_start_bound_itr); | ||
| auto v1_end_itr = next(start_itr, *v1_end_bound_itr); | ||
|
|
||
| #ifdef debug_binary_intersection | ||
| std::cerr << "Found " << v1_end_itr - v1_start_itr << " labels for vertex 1" << std::endl; | ||
| #endif | ||
|
|
||
| auto v2_start_itr = next(start_itr, *v2_start_bound_itr); | ||
| auto v2_end_itr = next(start_itr, *v2_end_bound_itr); | ||
|
|
||
| #ifdef debug_binary_intersection | ||
| std::cerr << "Found " << v2_end_itr - v2_start_itr << " labels for vertex 2" << std::endl; | ||
| #endif | ||
|
|
||
| auto v1_range = ranges::subrange<ItrType>(v1_start_itr, v1_end_itr); | ||
| auto v2_range = ranges::subrange<ItrType>(v2_start_itr, v2_end_itr); | ||
|
|
||
| auto& key_vec = v1_range.size() < v2_range.size() ? v1_range : v2_range; | ||
| auto& search_vec = v1_range.size() < v2_range.size() ? v2_range : v1_range; | ||
|
|
||
| auto search_start_itr = search_vec.begin(); | ||
| auto search_end_itr = search_vec.end(); | ||
| DIST_UINT min_dist = INF_INT; | ||
| for (auto it = key_vec.begin(); it < key_vec.end(); it++) { | ||
| #ifdef debug_binary_intersection | ||
| cerr << "Performing key query" << endl; | ||
| #endif | ||
| auto k = *it; | ||
| auto k_dist_itr = get_dist_itr(start_itr, it); | ||
| #ifdef debug_binary_intersection | ||
| cerr << "Distance for k " << k << " is " << *k_dist_itr << ", at: " << distance(start_itr,k_dist_itr) << endl; | ||
| cerr << "searching for " << k << " between " << distance(start_itr,search_start_itr) << " & " << distance(start_itr,search_end_itr) << endl; | ||
| #endif | ||
| search_start_itr = lower_bound(search_start_itr, search_end_itr, k); | ||
| if (search_start_itr == search_end_itr) { | ||
| #ifdef debug_binary_intersection | ||
| std::cerr << "No more search results possible" << std::endl; | ||
| #endif | ||
| return min_dist; | ||
| } | ||
| if (*search_start_itr == k) { | ||
| #ifdef debug_binary_intersection | ||
| cerr << "match found, key: " << *search_start_itr << ", at " << distance(start_itr,search_start_itr) << endl; | ||
| #endif | ||
| auto dist_itr = get_dist_itr(start_itr, search_start_itr); | ||
| DIST_UINT d = *(dist_itr) + *(k_dist_itr); | ||
| #ifdef debug_binary_intersection | ||
| cerr << "dist for key is: " << *dist_itr << ", at " << distance(start_itr,dist_itr) << endl; | ||
| cerr << "total dist is: " << d << endl; | ||
| #endif | ||
| min_dist = min(min_dist, d); | ||
| } | ||
| } | ||
| return min_dist; | ||
| } | ||
|
|
||
| /** | ||
| * Query stored hub label data for a minimum distance. | ||
| * | ||
| * start_itr should point to the first slot of the packed label data returned | ||
| * by pack_labels(), which is the label count. | ||
| * | ||
| * The rank space covers both orientations of each node. | ||
| * | ||
| * Returns the minimum distance from the end of the node orientation at rank1 | ||
| * to the start of the node orientation at rank2. (If working in a net graph in | ||
| * a SnarlDistanceIndex, these "nodes" may really be child chains.) | ||
| * | ||
| * If rank1 == rank2, returns the minimum distance around that cycle, if any. | ||
| * | ||
| * If there is no known path between the given nodes, returns INF_INT. | ||
| */ | ||
| template <typename ItrType> | ||
| DIST_UINT hhl_query(ItrType start_itr, size_t rank1, size_t rank2) { | ||
| size_t label_count = *start_itr; | ||
|
|
||
| #ifdef debug_hhl_query | ||
| std::cerr << "Making hub label query on " << label_count << " labels" << std::endl; | ||
| #endif | ||
|
|
||
| // Bounds start after the label count, and at the rank of the first | ||
| // vertex past there we find the start bound for the first vertex. | ||
| auto start_index_1 = 1+rank1; | ||
|
|
||
| #ifdef debug_hhl_query | ||
| std::cerr << "Start bound for forward label for rank " << rank1 << " is at index " << start_index_1 << " past there" << std::endl; | ||
| #endif | ||
|
|
||
| // And there's a final end value for the first set of labels before we go on | ||
| // to the bounds where we would find the start bound for the second vertex. | ||
| auto start_index_2 = 1+label_count+1+rank2; | ||
|
|
||
| #ifdef debug_hhl_query | ||
| std::cerr << "Start bound for reverse label for rank " << rank2 << " is at index " << start_index_2 << " past there" << std::endl; | ||
| #endif | ||
|
|
||
| DIST_UINT dist = binary_intersection_ch(start_itr, start_index_1, start_index_2); | ||
|
|
||
|
|
||
| return dist; | ||
| } | ||
|
|
||
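The slot arithmetic in `hhl_query()` can be checked in isolation. A minimal sketch, assuming the packed layout described in the `pack_labels()` doc comment (one count slot, then `label_count + 1` forward bounds, then `label_count + 1` backward bounds); `BoundSlots` and `bound_slots` are hypothetical names used only for illustration:

```cpp
#include <cstddef>

namespace hl_sketch {

// Slots, relative to the start of the packed data, that hold the start bounds
// of the two labels a query intersects.
struct BoundSlots {
    std::size_t forward_start;   // start bound of the forward label for rank1
    std::size_t backward_start;  // start bound of the backward label for rank2
};

inline BoundSlots bound_slots(std::size_t label_count, std::size_t rank1, std::size_t rank2) {
    BoundSlots slots;
    // Slot 0 is the label count, so forward bounds begin at slot 1.
    slots.forward_start = 1 + rank1;
    // The forward bounds take label_count slots plus one final end bound,
    // so backward bounds begin at slot 1 + label_count + 1.
    slots.backward_start = 1 + label_count + 1 + rank2;
    return slots;
}

}  // namespace hl_sketch
```

With three labels, a query from rank 0 to rank 1 reads its forward start bound from slot 1 and its backward start bound from slot 6; the matching end bounds sit in slots 2 and 7.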
| void down_dijk(int node, CHOverlay& ov, vector<DIST_UINT>& node_dists, vector<vector<HubRecord>>& labels, vector<vector<HubRecord>>& labels_rev); | ||
|
|
||
| void down_dijk_rev(int node, CHOverlay& ov, vector<DIST_UINT>& node_dists, vector<vector<HubRecord>>& labels, vector<vector<HubRecord>>& labels_rev); | ||
|
Member
What's the difference between these? Also, it would be good to organize this file so the things that are part of the external interface are clearly separated from the ones (like these) that aren't. A class with public and private methods/types could be used to formalize that. |
||
|
|
||
| void test_dijk(int node, CHOverlay& ov, vector<DIST_UINT>& node_dists, vector<vector<HubRecord>>& labels, vector<vector<HubRecord>>& labels_rev); | ||
|
|
||
| void test_dijk_rev(int node, CHOverlay& ov, vector<DIST_UINT>& node_dists, vector<vector<HubRecord>>& labels, vector<vector<HubRecord>>& labels_rev); | ||
|
Member
Test code should move to the test files. |
||
|
|
||
| void create_labels(vector<vector<HubRecord>>& labels, vector<vector<HubRecord>>& labels_rev, CHOverlay& ov); | ||
|
Member
This should have a doc comment. |
||
|
|
||
| /** | ||
| * Puts hub labels in a flat vector form | ||
| * | ||
| * Structure: | ||
| * - offsets are relative to start of flat vector | ||
| * - extra offset in each of fwd and back offset sets at the end so that end of ranges can be found | ||
| * -- subtracting the first offset of a set from its extra end offset gives the jump from a hub's slot to the slot holding that hub's distance | ||
| * | ||
| * The layout is: | ||
| * label count | start offsets (fwd) | start offsets (back) | fwd label hubs | fwd label dists | back label hubs | back label dists | ||
| */ | ||
| vector<size_t> pack_labels(const vector<vector<HubRecord>>& labels, const vector<vector<HubRecord>>& labels_back); | ||
|
|
||
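The layout in the doc comment above can be made concrete with a small round trip. This is a sketch under stated assumptions, not the PR's `pack_labels()`: `Entry` is a hypothetical `(hub, distance)` pair standing in for `HubRecord`, `pack` and `query` are illustrative names, both label sets are assumed to have the same number of labels with hubs sorted ascending, and the intersection is a plain linear merge rather than the binary search used in the diff. Queries between identical ranks are not modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

namespace hl_sketch {

using Entry = std::pair<std::size_t, std::size_t>;  // (hub rank, distance); hypothetical
using Labels = std::vector<std::vector<Entry>>;

constexpr std::size_t kInf = std::numeric_limits<std::size_t>::max();

// Layout: count | fwd offsets (n+1) | back offsets (n+1) |
//         fwd hubs | fwd dists | back hubs | back dists
inline std::vector<std::size_t> pack(const Labels& fwd, const Labels& back) {
    const std::size_t n = fwd.size();  // assumes back.size() == n
    std::size_t total_fwd = 0, total_back = 0;
    for (const auto& label : fwd) total_fwd += label.size();
    for (const auto& label : back) total_back += label.size();

    const std::size_t fwd_hubs = 1 + 2 * (n + 1);
    const std::size_t back_hubs = fwd_hubs + 2 * total_fwd;
    std::vector<std::size_t> flat(back_hubs + 2 * total_back, 0);
    flat[0] = n;

    auto fill = [&flat](const Labels& labels, std::size_t offset_slot,
                        std::size_t hub_slot, std::size_t total) {
        for (const auto& label : labels) {
            flat[offset_slot++] = hub_slot;  // offsets are relative to flat[0]
            for (const auto& entry : label) {
                flat[hub_slot] = entry.first;
                flat[hub_slot + total] = entry.second;  // dists mirror hubs
                ++hub_slot;
            }
        }
        flat[offset_slot] = hub_slot;  // extra end offset closing the last range
    };
    fill(fwd, 1, fwd_hubs, total_fwd);
    fill(back, 1 + n + 1, back_hubs, total_back);
    return flat;
}

// Minimum of fwd_dist + back_dist over hubs shared by the forward label of
// rank1 and the backward label of rank2, or kInf if none is shared.
inline std::size_t query(const std::vector<std::size_t>& flat,
                         std::size_t rank1, std::size_t rank2) {
    const std::size_t n = flat[0];
    // End offset minus first offset = number of hubs = jump from hub to dist.
    const std::size_t fwd_jump = flat[1 + n] - flat[1];
    const std::size_t back_jump = flat[1 + n + 1 + n] - flat[1 + n + 1];
    std::size_t i = flat[1 + rank1], i_end = flat[1 + rank1 + 1];
    std::size_t j = flat[1 + n + 1 + rank2], j_end = flat[1 + n + 1 + rank2 + 1];

    std::size_t best = kInf;
    while (i < i_end && j < j_end) {
        if (flat[i] < flat[j]) {
            ++i;
        } else if (flat[j] < flat[i]) {
            ++j;
        } else {
            best = std::min(best, flat[i + fwd_jump] + flat[j + back_jump]);
            ++i;
            ++j;
        }
    }
    return best;
}

}  // namespace hl_sketch
```

For three labels holding five forward and four backward entries in total, the packed vector has 1 + 4 + 4 + 10 + 8 = 27 slots, and the final forward offset (slot 4) points at the first forward distance.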
| //not necessary stuff | ||
| void write_to_csv(CHOverlay& ov, string out_path); | ||
|
|
||
| void write_to_gr(CHOverlay& ov, string out_path); | ||
|
|
||
| vector<CHOverlay::vertex_descriptor> read_node_order(string in_path); | ||
|
Member
I think these would want to be cut, probably. |
||
| } | ||