Collaboration diagram for vect_optimize_slp_pass:

[legend]

Public Member Functions
	vect_optimize_slp_pass (vec_info *vinfo)
void	run ()

Private Member Functions
struct loop *	containing_loop (slp_tree)
bool	is_cfg_latch_edge (graph_edge *)
void	build_vertices (hash_set< slp_tree > &, slp_tree)
void	build_vertices ()
void	build_graph ()
void	create_partitions ()
template<typename T>
void	for_each_partition_edge (unsigned int, T)
bool	is_compatible_layout (slp_tree, unsigned int)
int	change_layout_cost (slp_tree, unsigned int, unsigned int)
slpg_partition_layout_costs &	partition_layout_costs (unsigned int, unsigned int)
void	change_vec_perm_layout (slp_tree, lane_permutation_t &, int, unsigned int)
int	internal_node_cost (slp_tree, int, unsigned int)
void	start_choosing_layouts ()
slpg_layout_cost	edge_layout_cost (graph_edge *, unsigned int, unsigned int, unsigned int)
slpg_layout_cost	total_in_cost (unsigned int)
slpg_layout_cost	forward_cost (graph_edge *, unsigned int, unsigned int)
slpg_layout_cost	backward_cost (graph_edge *, unsigned int, unsigned int)
void	forward_pass ()
void	backward_pass ()
slp_tree	get_result_with_layout (slp_tree, unsigned int)
void	materialize ()
void	remove_redundant_permutations ()
void	decide_masked_load_lanes ()
void	dump ()

Private Attributes
vec_info *	m_vinfo
bool	m_optimize_size
graph *	m_slpg = nullptr
auto_vec< slpg_vertex >	m_vertices
auto_vec< int >	m_leafs
auto_vec< vec< unsigned > >	m_perms
auto_vec< slpg_partition_info >	m_partitions
auto_vec< unsigned int >	m_partitioned_nodes
auto_vec< slpg_partition_layout_costs >	m_partition_layout_costs
auto_vec< slp_tree >	m_node_layouts

Detailed Description

This class tries to optimize the layout of vectors in order to avoid
unnecessary shuffling.  At the moment, the set of possible layouts are
restricted to bijective permutations.

The goal of the pass depends on whether we're optimizing for size or
for speed.  When optimizing for size, the goal is to reduce the overall
number of layout changes (including layout changes implied by things
like load permutations).  When optimizing for speed, the goal is to
reduce the maximum latency attributable to layout changes on any
non-cyclical path through the data flow graph.

For example, when optimizing a loop nest for speed, we will prefer
to make layout changes outside of a loop rather than inside of a loop,
and will prefer to make layout changes in parallel rather than serially,
even if that increases the overall number of layout changes.

The high-level procedure is:

(1) Build a graph in which edges go from uses (parents) to definitions
    (children).

(2) Divide the graph into a dag of strongly-connected components (SCCs).

(3) When optimizing for speed, partition the nodes in each SCC based
    on their containing cfg loop.  When optimizing for size, treat
    each SCC as a single partition.

    This gives us a dag of partitions.  The goal is now to assign a
    layout to each partition.

(4) Construct a set of vector layouts that are worth considering.
    Record which nodes must keep their current layout.

(5) Perform a forward walk over the partition dag (from loads to stores)
    accumulating the "forward" cost of using each layout.  When visiting
    each partition, assign a tentative choice of layout to the partition
    and use that choice when calculating the cost of using a different
    layout in successor partitions.

(6) Perform a backward walk over the partition dag (from stores to loads),
    accumulating the "backward" cost of using each layout.  When visiting
    each partition, make a final choice of layout for that partition based
    on the accumulated forward costs (from (5)) and backward costs
    (from (6)).

(7) Apply the chosen layouts to the SLP graph.

For example, consider the SLP statements:

S1:      a_1 = load
    loop:
S2:      a_2 = PHI<a_1, a_3>
S3:      b_1 = load
S4:      a_3 = a_2 + b_1
    exit:
S5:      a_4 = PHI<a_3>
S6:      store a_4

S2 and S4 form an SCC and are part of the same loop.  Every other
statement is in a singleton SCC.  In this example there is a one-to-one
mapping between SCCs and partitions and the partition dag looks like this;

     S1     S3
      \     /
       S2+S4
         |
        S5
         |
        S6

S2, S3 and S4 will have a higher execution frequency than the other
statements, so when optimizing for speed, the goal is to avoid any
layout changes:

- within S3
- within S2+S4
- on the S3->S2+S4 edge

For example, if S3 was originally a reversing load, the goal of the
pass is to make it an unreversed load and change the layout on the
S1->S2+S4 and S2+S4->S5 edges to compensate.  (Changing the layout
on S1->S2+S4 and S5->S6 would also be acceptable.)

The difference between SCCs and partitions becomes important if we
add an outer loop:

S1:      a_1 = ...
    loop1:
S2:      a_2 = PHI<a_1, a_6>
S3:      b_1 = load
S4:      a_3 = a_2 + b_1
    loop2:
S5:      a_4 = PHI<a_3, a_5>
S6:      c_1 = load
S7:      a_5 = a_4 + c_1
    exit2:
S8:      a_6 = PHI<a_5>
S9:      store a_6
    exit1:

Here, S2, S4, S5, S7 and S8 form a single SCC.  However, when optimizing
for speed, we usually do not want restrictions in the outer loop to "infect"
the decision for the inner loop.  For example, if an outer-loop node
in the SCC contains a statement with a fixed layout, that should not
prevent the inner loop from using a different layout.  Conversely,
the inner loop should not dictate a layout to the outer loop: if the
outer loop does a lot of computation, then it may not be efficient to
do all of that computation in the inner loop's preferred layout.

So when optimizing for speed, we partition the SCC into S2+S4+S8 (outer)
and S5+S7 (inner).  We also try to arrange partitions so that:

- the partition for an outer loop comes before the partition for
  an inner loop

- if a sibling loop A dominates a sibling loop B, A's partition
  comes before B's

This gives the following partition dag for the example above:

     S1        S3
      \        /
       S2+S4+S8   S6
        |   \\    /
        |    S5+S7
        |
       S9

There are two edges from S2+S4+S8 to S5+S7: one for the edge S4->S5 and
one for a reversal of the edge S7->S8.

The backward walk picks a layout for S5+S7 before S2+S4+S8.  The choice
for S2+S4+S8 therefore has to balance the cost of using the outer loop's
preferred layout against the cost of changing the layout on entry to the
inner loop (S4->S5) and on exit from the inner loop (S7->S8 reversed).

Although this works well when optimizing for speed, it has the downside
when optimizing for size that the choice of layout for S5+S7 is completely
independent of S9, which lessens the chance of reducing the overall number
of permutations.  We therefore do not partition SCCs when optimizing
for size.

To give a concrete example of the difference between optimizing
for size and speed, consider:

a[0] = (b[1] << c[3]) - d[1];
a[1] = (b[0] << c[2]) - d[0];
a[2] = (b[3] << c[1]) - d[3];
a[3] = (b[2] << c[0]) - d[2];

There are three different layouts here: one for a, one for b and d,
and one for c.  When optimizing for speed it is better to permute each
of b, c and d into the order required by a, since those permutations
happen in parallel.  But when optimizing for size, it is better to:

- permute c into the same order as b
- do the arithmetic
- permute the result into the order required by a

This gives 2 permutations rather than 3.

Constructor & Destructor Documentation

◆ vect_optimize_slp_pass()

vect_optimize_slp_pass::vect_optimize_slp_pass ( vec_info * vinfo )

inline

References m_vinfo.

Member Function Documentation

◆ backward_cost()

slpg_layout_cost vect_optimize_slp_pass::backward_cost	(	graph_edge *	ud,
		unsigned int	to_node_i,
		unsigned int	from_layout_i )

private

UD represents a use-def link between TO_NODE_I and a node in an earlier
partition; TO_NODE_I could be the definition node or the use node.
The node at the other end of the link wants to use layout FROM_LAYOUT_I;
return the cost of any necessary fix-ups on edge UD, or
slpg_layout_cost::impossible () if the choice cannot be made.

At this point, TO_NODE_I's partition has a fixed choice of layout.

References slpg_layout_cost::add_serial_cost(), graph_edge::dest, edge_layout_cost(), gcc_assert, slpg_layout_cost::impossible(), slpg_partition_info::in_degree, internal_node_cost(), slpg_partition_info::layout, m_optimize_size, m_partitions, m_vertices, partition_layout_costs(), SLP_TREE_CODE, slpg_layout_cost::split(), and graph_edge::src.

Referenced by backward_pass().

◆ backward_pass()

void vect_optimize_slp_pass::backward_pass ( )

private

Make a backward pass through the partitions, accumulating output costs.
Make a final choice of layout for each partition.

References add_cost(), slpg_layout_cost::add_serial_cost(), backward_cost(), edge_layout_cost(), for_each_partition_edge(), gcc_assert, slpg_layout_cost::impossible(), slpg_layout_cost::is_better_than(), slpg_layout_cost::is_possible(), m_optimize_size, m_partitioned_nodes, m_partitions, m_perms, m_vertices, and partition_layout_costs().

Referenced by run().

◆ build_graph()

void vect_optimize_slp_pass::build_graph ( )

private

Build the graph.  Mark edges that correspond to cfg loop latch edges with
a nonnull data field.

References add_edge(), build_vertices(), graph_edge::data, is_cfg_latch_edge(), m_optimize_size, m_slpg, m_vertices, new_graph(), and SLP_TREE_CHILDREN.

Referenced by run().

◆ build_vertices() [1/2]

void vect_optimize_slp_pass::build_vertices ( )

private


Fill the vertices and leafs vector with all nodes in the SLP graph.

References build_vertices(), FOR_EACH_VEC_ELT, i, m_leafs, m_vertices, m_vinfo, SLP_INSTANCE_TREE, and visited.

Referenced by build_graph(), build_vertices(), and build_vertices().

◆ build_vertices() [2/2]

void vect_optimize_slp_pass::build_vertices	(	hash_set< slp_tree > &	visited,
		slp_tree	node )

private

Fill the vertices and leafs vector with all nodes in the SLP graph.
Also record whether we should optimize anything for speed rather
than size.

References build_vertices(), FOR_EACH_VEC_ELT, gimple_bb(), i, m_leafs, m_optimize_size, m_vertices, optimize_bb_for_speed_p(), SLP_TREE_CHILDREN, SLP_TREE_REPRESENTATIVE, vect_orig_stmt(), _slp_tree::vertex, and visited.

◆ change_layout_cost()

int vect_optimize_slp_pass::change_layout_cost	(	slp_tree	node,
		unsigned int	from_layout_i,
		unsigned int	to_layout_i )

private

Return the cost (in arbtirary units) of going from layout FROM_LAYOUT_I
to layout TO_LAYOUT_I for a node like NODE.  Return -1 if either of the
layouts is incompatible with NODE or if the change is not possible for
some other reason.

The properties taken from NODE include the number of lanes and the
vector type.  The actual operation doesn't matter.

References count, i, is_compatible_layout(), m_perms, m_vinfo, MAX, SLP_TREE_LANES, vect_slp_permute(), and vectorizable_slp_permutation_1().

Referenced by edge_layout_cost().

◆ change_vec_perm_layout()

void vect_optimize_slp_pass::change_vec_perm_layout	(	slp_tree	node,
		lane_permutation_t &	perm,
		int	in_layout_i,
		unsigned int	out_layout_i )

private

Change PERM in one of two ways:

- if IN_LAYOUT_I < 0, accept input operand I in the layout that has been
  chosen for child I of NODE.

- if IN_LAYOUT >= 0, accept all inputs operands with that layout.

In both cases, arrange for the output to have layout OUT_LAYOUT_I

References m_partitions, m_perms, m_vertices, SLP_TREE_CHILDREN, vect_slp_permute(), and _slp_tree::vertex.

Referenced by internal_node_cost(), and materialize().

◆ containing_loop()

struct loop * vect_optimize_slp_pass::containing_loop ( slp_tree node )

private


Return the cfg loop that contains NODE.

References cfun, ENTRY_BLOCK_PTR_FOR_FN, gimple_bb(), basic_block_def::loop_father, SLP_TREE_REPRESENTATIVE, and vect_orig_stmt().

Referenced by create_partitions(), and is_cfg_latch_edge().

◆ create_partitions()

void vect_optimize_slp_pass::create_partitions ( )

private


Create the node partitions.

References containing_loop(), hash_map< KeyId, Value, Traits >::empty(), gcc_assert, hash_map< KeyId, Value, Traits >::get_or_insert(), graphds_dfs(), graphds_scc(), m_leafs, m_optimize_size, m_partitioned_nodes, m_partitions, m_slpg, m_vertices, NULL, skip_cfg_latch_edges(), SLP_TREE_DEF_TYPE, SLP_TREE_VEC_DEFS, vect_constant_def, and vect_external_def.

Referenced by run().

◆ decide_masked_load_lanes()

void vect_optimize_slp_pass::decide_masked_load_lanes ( )

private


Masked load lanes discovery.

References compare_step_with_zero(), DR_GROUP_FIRST_ELEMENT, DR_GROUP_SIZE, gcc_assert, gimple_call_internal_p(), is_a(), _slp_tree::ldst_lanes, m_slpg, m_vertices, m_vinfo, SLP_TREE_CHILDREN, SLP_TREE_CODE, SLP_TREE_DEF_TYPE, SLP_TREE_LANE_PERMUTATION, SLP_TREE_LANES, SLP_TREE_REF_COUNT, SLP_TREE_REPRESENTATIVE, SLP_TREE_VECTYPE, STMT_VINFO_GROUPED_ACCESS, STMT_VINFO_SLP_VECT_ONLY, STMT_VINFO_STMT, STMT_VINFO_STRIDED_P, vect_free_slp_tree(), vect_internal_def, vect_load_lanes_supported(), and _slp_tree::vertex.

Referenced by run().

◆ dump()

void vect_optimize_slp_pass::dump ( )

private


Print the partition graph and layout information to the dump file.

References slpg_layout_cost::add_serial_cost(), slpg_layout_cost::depth, dump_printf(), dump_printf_loc(), for_each_partition_edge(), m_partitioned_nodes, m_partitions, m_perms, m_vertices, MSG_NOTE, partition_layout_costs(), print_edge(), SLP_TREE_CODE, SLP_TREE_REPRESENTATIVE, TEMPLATE, sreal::to_double(), slpg_layout_cost::total, and vect_location.

Referenced by run().

◆ edge_layout_cost()

slpg_layout_cost vect_optimize_slp_pass::edge_layout_cost	(	graph_edge *	ud,
		unsigned int	node1_i,
		unsigned int	layout1_i,
		unsigned int	layout2_i )

private

Return the cost of switching between layout LAYOUT1_I (at node NODE1_I)
and layout LAYOUT2_I on cross-partition use-to-def edge UD.  Return
slpg_layout_cost::impossible () if the change isn't possible.

References change_layout_cost(), graph_edge::dest, slpg_layout_cost::impossible(), m_optimize_size, m_vertices, slpg_layout_cost::split(), and graph_edge::src.

Referenced by backward_cost(), backward_pass(), forward_cost(), and forward_pass().

◆ for_each_partition_edge()

template<typename T>

void vect_optimize_slp_pass::for_each_partition_edge	(	unsigned int	node_i,
		T	fn )

private

Look for edges from earlier partitions into node NODE_I and edges from
node NODE_I into later partitions.  Call:

   FN (ud, other_node_i)

for each such use-to-def edge ud, where other_node_i is the node at the
other end of the edge.

References m_slpg, m_vertices, and T.

Referenced by backward_pass(), dump(), forward_pass(), start_choosing_layouts(), and total_in_cost().

◆ forward_cost()

slpg_layout_cost vect_optimize_slp_pass::forward_cost	(	graph_edge *	ud,
		unsigned int	from_node_i,
		unsigned int	to_layout_i )

private

UD represents a use-def link between FROM_NODE_I and a node in a later
partition; FROM_NODE_I could be the definition node or the use node.
The node at the other end of the link wants to use layout TO_LAYOUT_I.
Return the cost of any necessary fix-ups on edge UD, or return
slpg_layout_cost::impossible () if the change isn't possible.

At this point, FROM_NODE_I's partition has chosen the cheapest
layout based on the information available so far, but this choice
is only provisional.

References slpg_layout_cost::add_serial_cost(), edge_layout_cost(), gcc_assert, slpg_layout_cost::impossible(), slpg_layout_cost::is_better_than(), slpg_layout_cost::is_possible(), slpg_partition_info::layout, m_optimize_size, m_partitions, m_vertices, slpg_partition_info::out_degree, partition_layout_costs(), and slpg_layout_cost::split().

Referenced by forward_pass().

◆ forward_pass()

void vect_optimize_slp_pass::forward_pass ( )

private

Make a forward pass through the partitions, accumulating input costs.
Make a tentative (provisional) choice of layout for each partition,
ensuring that this choice still allows later partitions to keep
their original layout.

References add_cost(), slpg_layout_cost::add_serial_cost(), edge_layout_cost(), for_each_partition_edge(), forward_cost(), gcc_assert, slpg_layout_cost::impossible(), internal_node_cost(), slpg_layout_cost::is_better_than(), is_compatible_layout(), slpg_layout_cost::is_possible(), m_optimize_size, m_partitioned_nodes, m_partitions, m_perms, m_vertices, partition_layout_costs(), SLP_TREE_CODE, total_in_cost(), and _slp_tree::vertex.

Referenced by run().

◆ get_result_with_layout()

slp_tree vect_optimize_slp_pass::get_result_with_layout	(	slp_tree	node,
		unsigned int	to_layout_i )

private

Return a node that applies layout TO_LAYOUT_I to the original form of NODE.
NODE already has the layout that was selected for its partition.

References dump_enabled_p(), dump_printf_loc(), is_empty(), m_node_layouts, m_partitions, m_perms, m_vertices, m_vinfo, MSG_NOTE, SLP_TREE_CHILDREN, SLP_TREE_CODE, SLP_TREE_DEF_TYPE, SLP_TREE_LANE_PERMUTATION, SLP_TREE_LANES, SLP_TREE_REPRESENTATIVE, SLP_TREE_SCALAR_OPS, SLP_TREE_SCALAR_STMTS, SLP_TREE_VEC_DEFS, SLP_TREE_VECTYPE, vect_constant_def, vect_create_new_slp_node(), vect_external_def, vect_location, vect_slp_permute(), vect_slp_tree_uniform_p(), vectorizable_slp_permutation_1(), and _slp_tree::vertex.

Referenced by materialize().

◆ internal_node_cost()

int vect_optimize_slp_pass::internal_node_cost	(	slp_tree	node,
		int	in_layout_i,
		unsigned int	out_layout_i )

private

Check whether the target allows NODE to be rearranged so that the node's
output has layout OUT_LAYOUT_I.  Return the cost of the change if so,
in the same arbitrary units as for change_layout_cost.  Return -1 otherwise.

If NODE is a VEC_PERM_EXPR and IN_LAYOUT_I < 0, also check whether
NODE can adapt to the layout changes that have (perhaps provisionally)
been chosen for NODE's children, so that no extra permutations are
needed on either the input or the output of NODE.

If NODE is a VEC_PERM_EXPR and IN_LAYOUT_I >= 0, instead assume
that all inputs will be forced into layout IN_LAYOUT_I beforehand.

IN_LAYOUT_I has no meaning for other types of node.

Keeping the node as-is is always valid.  If the target doesn't appear
to support the node as-is, but might realistically support other layouts,
then layout 0 instead has the cost of a worst-case permutation.  On the
one hand, this ensures that every node has at least one valid layout,
avoiding what would otherwise be an awkward special case.  On the other,
it still encourages the pass to change an invalid pre-existing layout
choice into a valid one.

References change_vec_perm_layout(), count, DR_GROUP_FIRST_ELEMENT, DR_GROUP_SIZE, DR_IS_READ, dyn_cast(), is_compatible_layout(), LOOP_VINFO_VECT_FACTOR, m_perms, m_vinfo, SLP_TREE_CHILDREN, SLP_TREE_CODE, SLP_TREE_LANE_PERMUTATION, SLP_TREE_LANES, SLP_TREE_LOAD_PERMUTATION, SLP_TREE_REPRESENTATIVE, STMT_VINFO_DATA_REF, STMT_VINFO_GROUPED_ACCESS, vect_slp_permute(), vect_transform_slp_perm_load_1(), vectorizable_slp_permutation_1(), and vNULL.

Referenced by backward_cost(), and forward_pass().

◆ is_cfg_latch_edge()

bool vect_optimize_slp_pass::is_cfg_latch_edge ( graph_edge * ud )

private

Return true if UD (an edge from a use to a definition) is associated
with a loop latch edge in the cfg.

References bb_loop_header_p(), containing_loop(), graph_edge::dest, gimple_bb(), is_a(), m_vertices, SLP_TREE_CODE, SLP_TREE_DEF_TYPE, SLP_TREE_REPRESENTATIVE, graph_edge::src, vect_internal_def, and vect_orig_stmt().

Referenced by build_graph().

◆ is_compatible_layout()

bool vect_optimize_slp_pass::is_compatible_layout	(	slp_tree	node,
		unsigned int	layout_i )

private

Return true if layout LAYOUT_I is compatible with the number of SLP lanes
that NODE would operate on.  This test is independent of NODE's actual
operation.

References m_perms, and SLP_TREE_LANES.

Referenced by change_layout_cost(), forward_pass(), and internal_node_cost().

◆ materialize()

void vect_optimize_slp_pass::materialize ( )

private


Apply the chosen vector layouts to the SLP graph.

References bitmap_bit_p, bitmap_clear(), bitmap_set_bit, change_vec_perm_layout(), dump_enabled_p(), dump_printf_loc(), FOR_EACH_VEC_ELT, gcc_assert, get_result_with_layout(), m_node_layouts, m_partition_layout_costs, m_partitioned_nodes, m_partitions, m_perms, m_vertices, m_vinfo, MSG_NOTE, _slp_tree::refcnt, remove_redundant_permutations(), SLP_TREE_CHILDREN, SLP_TREE_CODE, SLP_TREE_LANE_PERMUTATION, SLP_TREE_LOAD_PERMUTATION, SLP_TREE_SCALAR_STMTS, vect_free_slp_tree(), vect_location, vect_slp_permute(), and vectorizable_slp_permutation_1().

Referenced by run().

◆ partition_layout_costs()

slpg_partition_layout_costs & vect_optimize_slp_pass::partition_layout_costs	(	unsigned int	partition_i,
		unsigned int	layout_i )

inlineprivate


Return the costs of assigning layout LAYOUT_I to partition PARTITION_I.

References m_partition_layout_costs, and m_perms.

Referenced by backward_cost(), backward_pass(), dump(), forward_cost(), forward_pass(), and total_in_cost().

◆ remove_redundant_permutations()

void vect_optimize_slp_pass::remove_redundant_permutations ( )

private

Elide load permutations that are not necessary.  Such permutations might
be pre-existing, rather than created by the layout optimizations.

References as_a(), DR_GROUP_FIRST_ELEMENT, DR_GROUP_GAP, DR_GROUP_NEXT_ELEMENT, DR_GROUP_SIZE, FOR_EACH_VEC_ELT, is_a(), known_eq, LOOP_VINFO_VECT_FACTOR, m_leafs, m_vertices, m_vinfo, NULL, SLP_TREE_LANES, SLP_TREE_LOAD_PERMUTATION, SLP_TREE_SCALAR_STMTS, and STMT_VINFO_GROUPED_ACCESS.

Referenced by materialize(), and run().

◆ run()

void vect_optimize_slp_pass::run ( )


Main entry point for the SLP graph optimization pass.

References backward_pass(), build_graph(), create_partitions(), decide_masked_load_lanes(), dump(), dump_enabled_p(), forward_pass(), free_graph(), m_perms, m_slpg, materialize(), remove_redundant_permutations(), and start_choosing_layouts().

Referenced by vect_optimize_slp().

◆ start_choosing_layouts()

void vect_optimize_slp_pass::start_choosing_layouts ( )

private

Decide which element layouts we should consider using.  Calculate the
weights associated with inserting layout changes on partition edges.
Also mark partitions that cannot change layout, by setting their
layout to zero.

References bitmap_bit_p, bitmap_clear(), bitmap_set_bit, DR_GROUP_FIRST_ELEMENT, DR_GROUP_SIZE, for_each_partition_edge(), gcc_assert, hash_map< KeyId, Value, Traits >::get(), hash_map< KeyId, Value, Traits >::get_or_insert(), gimple_call_combined_fn(), gimple_get_lhs(), info_for_reduction(), is_gimple_call(), m_partition_layout_costs, m_partitioned_nodes, m_partitions, m_perms, m_slpg, m_vertices, m_vinfo, MAX, MIN, needs_fold_left_reduction_p(), slp_inst_kind_ctor, slp_inst_kind_reduc_chain, SLP_INSTANCE_KIND, SLP_INSTANCE_TREE, SLP_TREE_CHILDREN, SLP_TREE_CODE, SLP_TREE_LANE_PERMUTATION, SLP_TREE_LANES, SLP_TREE_LOAD_PERMUTATION, SLP_TREE_REPRESENTATIVE, SLP_TREE_VECTYPE, graph_edge::src, STMT_VINFO_DATA_REF, STMT_VINFO_GROUPED_ACCESS, STMT_VINFO_REDUC_CODE, STMT_VINFO_STMT, TREE_TYPE, TYPE_VECTOR_SUBPARTS(), vect_slp_node_weight(), and vNULL.

Referenced by run().

◆ total_in_cost()

slpg_layout_cost vect_optimize_slp_pass::total_in_cost ( unsigned int node_i )

private

Return the incoming costs for node NODE_I, assuming that each input keeps
its current (provisional) choice of layout.  The inputs do not necessarily
have the same layout as each other.

References add_cost(), slpg_layout_cost::add_parallel_cost(), slpg_layout_cost::add_serial_cost(), for_each_partition_edge(), m_partitions, m_vertices, partition_layout_costs(), and slpg_layout_cost::split().

Referenced by forward_pass().

Field Documentation

◆ m_leafs

auto_vec<int> vect_optimize_slp_pass::m_leafs

private

Referenced by build_vertices(), build_vertices(), create_partitions(), and remove_redundant_permutations().

◆ m_node_layouts

auto_vec<slp_tree> vect_optimize_slp_pass::m_node_layouts

private

Referenced by get_result_with_layout(), and materialize().

◆ m_optimize_size

bool vect_optimize_slp_pass::m_optimize_size

private

Referenced by backward_cost(), backward_pass(), build_graph(), build_vertices(), create_partitions(), edge_layout_cost(), forward_cost(), and forward_pass().

◆ m_partition_layout_costs

auto_vec<slpg_partition_layout_costs> vect_optimize_slp_pass::m_partition_layout_costs

private

Referenced by materialize(), partition_layout_costs(), and start_choosing_layouts().

◆ m_partitioned_nodes

auto_vec<unsigned int> vect_optimize_slp_pass::m_partitioned_nodes

private

Referenced by backward_pass(), create_partitions(), dump(), forward_pass(), materialize(), and start_choosing_layouts().

◆ m_partitions

auto_vec<slpg_partition_info> vect_optimize_slp_pass::m_partitions

private

Referenced by backward_cost(), backward_pass(), change_vec_perm_layout(), create_partitions(), dump(), forward_cost(), forward_pass(), get_result_with_layout(), materialize(), start_choosing_layouts(), and total_in_cost().

◆ m_perms

auto_vec<vec<unsigned> > vect_optimize_slp_pass::m_perms

private

Referenced by backward_pass(), change_layout_cost(), change_vec_perm_layout(), dump(), forward_pass(), get_result_with_layout(), internal_node_cost(), is_compatible_layout(), materialize(), partition_layout_costs(), run(), and start_choosing_layouts().

◆ m_slpg

graph* vect_optimize_slp_pass::m_slpg = nullptr

private

Referenced by build_graph(), create_partitions(), decide_masked_load_lanes(), for_each_partition_edge(), run(), and start_choosing_layouts().

◆ m_vertices

auto_vec<slpg_vertex> vect_optimize_slp_pass::m_vertices

private

Referenced by backward_cost(), backward_pass(), build_graph(), build_vertices(), build_vertices(), change_vec_perm_layout(), create_partitions(), decide_masked_load_lanes(), dump(), edge_layout_cost(), for_each_partition_edge(), forward_cost(), forward_pass(), get_result_with_layout(), is_cfg_latch_edge(), materialize(), remove_redundant_permutations(), start_choosing_layouts(), and total_in_cost().

◆ m_vinfo

vec_info* vect_optimize_slp_pass::m_vinfo

private

Referenced by build_vertices(), change_layout_cost(), decide_masked_load_lanes(), get_result_with_layout(), internal_node_cost(), materialize(), remove_redundant_permutations(), start_choosing_layouts(), and vect_optimize_slp_pass().

The documentation for this class was generated from the following file:

tree-vect-slp.cc

Public Member Functions

Private Member Functions

Private Attributes

Detailed Description

Constructor & Destructor Documentation

◆ vect_optimize_slp_pass()

Member Function Documentation

◆ backward_cost()

◆ backward_pass()

◆ build_graph()

◆ build_vertices() [1/2]

◆ build_vertices() [2/2]

◆ change_layout_cost()

◆ change_vec_perm_layout()

◆ containing_loop()

◆ create_partitions()

◆ decide_masked_load_lanes()

◆ dump()

◆ edge_layout_cost()

◆ for_each_partition_edge()

◆ forward_cost()

◆ forward_pass()

◆ get_result_with_layout()

◆ internal_node_cost()

◆ is_cfg_latch_edge()

◆ is_compatible_layout()

◆ materialize()

◆ partition_layout_costs()

◆ remove_redundant_permutations()

◆ run()

◆ start_choosing_layouts()

◆ total_in_cost()

Field Documentation

◆ m_leafs

◆ m_node_layouts

◆ m_optimize_size

◆ m_partition_layout_costs

◆ m_partitioned_nodes

◆ m_partitions

◆ m_perms

◆ m_slpg

◆ m_vertices

◆ m_vinfo