Line data Source code
1 : /* Integrated Register Allocator (IRA) entry point.
2 : Copyright (C) 2006-2026 Free Software Foundation, Inc.
3 : Contributed by Vladimir Makarov <vmakarov@redhat.com>.
4 :
5 : This file is part of GCC.
6 :
7 : GCC is free software; you can redistribute it and/or modify it under
8 : the terms of the GNU General Public License as published by the Free
9 : Software Foundation; either version 3, or (at your option) any later
10 : version.
11 :
12 : GCC is distributed in the hope that it will be useful, but WITHOUT ANY
13 : WARRANTY; without even the implied warranty of MERCHANTABILITY or
14 : FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
15 : for more details.
16 :
17 : You should have received a copy of the GNU General Public License
18 : along with GCC; see the file COPYING3. If not see
19 : <http://www.gnu.org/licenses/>. */
20 :
21 : /* The integrated register allocator (IRA) is a
22 : regional register allocator performing graph coloring on a top-down
23 : traversal of nested regions. Graph coloring in a region is based
24 : on Chaitin-Briggs algorithm. It is called integrated because
25 : register coalescing, register live range splitting, and choosing a
26 : better hard register are done on-the-fly during coloring. Register
27 : coalescing and choosing a cheaper hard register are done by hard
28 : register preferencing during hard register assigning. The live
29 : range splitting is a byproduct of the regional register allocation.
30 :
31 : Major IRA notions are:
32 :
33 : o *Region* is a part of CFG where graph coloring based on
34 : Chaitin-Briggs algorithm is done. IRA can work on any set of
35 : nested CFG regions forming a tree. Currently the regions are
36 : the entire function for the root region and natural loops for
37 : the other regions. Therefore data structure representing a
38 : region is called loop_tree_node.
39 :
40 : o *Allocno class* is a register class used for allocation of
41 : given allocno. It means that only hard register of given
42 : register class can be assigned to given allocno. In reality,
43 : even smaller subset of (*profitable*) hard registers can be
44 : assigned. In rare cases, the subset can be even smaller
45 : because our modification of Chaitin-Briggs algorithm requires
46 : that the sets of hard registers which can be assigned to allocnos form a
47 : forest, i.e. the sets can be ordered in a way where any
48 : previous set is not intersected with given set or is a superset
49 : of given set.
50 :
51 : o *Pressure class* is a register class belonging to a set of
52 : register classes containing all of the hard-registers available
53 : for register allocation. The set of all pressure classes for a
54 : target is defined in the corresponding machine-description file
55 : according to some criteria. Register pressure is calculated only
56 : for pressure classes and it affects some IRA decisions such as
57 : forming allocation regions.
58 :
59 : o *Allocno* represents the live range of a pseudo-register in a
60 : region. Besides the obvious attributes like the corresponding
61 : pseudo-register number, allocno class, conflicting allocnos and
62 : conflicting hard-registers, there are a few allocno attributes
63 : which are important for understanding the allocation algorithm:
64 :
65 : - *Live ranges*. This is a list of ranges of *program points*
66 : where the allocno lives. Program points represent places
67 : where a pseudo can be born or become dead (there are
68 : approximately two times more program points than the insns)
69 : and they are represented by integers starting with 0. The
70 : live ranges are used to find conflicts between allocnos.
71 : They also play very important role for the transformation of
72 : the IRA internal representation of several regions into a one
73 : region representation. The latter is used during the reload
74 : pass work because each allocno represents all of the
75 : corresponding pseudo-registers.
76 :
77 : - *Hard-register costs*. This is a vector of size equal to the
78 : number of available hard-registers of the allocno class. The
79 : cost of a callee-clobbered hard-register for an allocno is
80 : increased by the cost of save/restore code around the calls
81 : through the given allocno's life. If the allocno is a move
82 : instruction operand and another operand is a hard-register of
83 : the allocno class, the cost of the hard-register is decreased
84 : by the move cost.
85 :
86 : When an allocno is assigned, the hard-register with minimal
87 : full cost is used. Initially, a hard-register's full cost is
88 : the corresponding value from the hard-register's cost vector.
89 : If the allocno is connected by a *copy* (see below) to
90 : another allocno which has just received a hard-register, the
91 : cost of the hard-register is decreased. Before choosing a
92 : hard-register for an allocno, the allocno's current costs of
93 : the hard-registers are modified by the conflict hard-register
94 : costs of all of the conflicting allocnos which are not
95 : assigned yet.
96 :
97 : - *Conflict hard-register costs*. This is a vector of the same
98 : size as the hard-register costs vector. To permit an
99 : unassigned allocno to get a better hard-register, IRA uses
100 : this vector to calculate the final full cost of the
101 : available hard-registers. Conflict hard-register costs of an
102 : unassigned allocno are also changed with a change of the
103 : hard-register cost of the allocno when a copy involving the
104 : allocno is processed as described above. This is done to
105 : show other unassigned allocnos that a given allocno prefers
106 : some hard-registers in order to remove the move instruction
107 : corresponding to the copy.
108 :
109 : o *Cap*. If a pseudo-register does not live in a region but
110 : lives in a nested region, IRA creates a special allocno called
111 : a cap in the outer region. A region cap is also created for a
112 : subregion cap.
113 :
114 : o *Copy*. Allocnos can be connected by copies. Copies are used
115 : to modify hard-register costs for allocnos during coloring.
116 : Such modifications reflect a preference to use the same
117 : hard-register for the allocnos connected by copies. Usually
118 : copies are created for move insns (in this case it results in
119 : register coalescing). But IRA also creates copies for operands
120 : of an insn which should be assigned to the same hard-register
121 : due to constraints in the machine description (it usually
122 : results in removing a move generated in reload to satisfy
123 : the constraints) and copies referring to the allocno which is
124 : the output operand of an instruction and the allocno which is
125 : an input operand dying in the instruction (creation of such
126 : copies results in less register shuffling). IRA *does not*
127 : create copies between the same register allocnos from different
128 : regions because we use another technique for propagating
129 : hard-register preference on the borders of regions.
130 :
131 : Allocnos (including caps) for the upper region in the region tree
132 : *accumulate* information important for coloring from allocnos with
133 : the same pseudo-register from nested regions. This includes
134 : hard-register and memory costs, conflicts with hard-registers,
135 : allocno conflicts, allocno copies and more. *Thus, attributes for
136 : allocnos in a region have the same values as if the region had no
137 : subregions*. It means that attributes for allocnos in the
138 : outermost region corresponding to the function have the same values
139 : as though the allocation used only one region which is the entire
140 : function. It also means that we can look at IRA work as if IRA
141 : first did allocation for the whole function, then improved the
142 : allocation for loops, then their subloops, and so on.
143 :
144 : IRA major passes are:
145 :
146 : o Building IRA internal representation which consists of the
147 : following subpasses:
148 :
149 : * First, IRA builds regions and creates allocnos (file
150 : ira-build.cc) and initializes most of their attributes.
151 :
152 : * Then IRA finds an allocno class for each allocno and
153 : calculates its initial (non-accumulated) cost of memory and
154 : each hard-register of its allocno class (file ira-costs.cc).
155 :
156 : * IRA creates live ranges of each allocno, calculates register
157 : pressure for each pressure class in each region, sets up
158 : conflict hard registers for each allocno and info about calls
159 : the allocno lives through (file ira-lives.cc).
160 :
161 : * IRA removes low register pressure loops from the regions
162 : mostly to speed IRA up (file ira-build.cc).
163 :
164 : * IRA propagates accumulated allocno info from lower region
165 : allocnos to corresponding upper region allocnos (file
166 : ira-build.cc).
167 :
168 : * IRA creates all caps (file ira-build.cc).
169 :
170 : * Having live-ranges of allocnos and their classes, IRA creates
171 : conflicting allocnos for each allocno. Conflicting allocnos
172 : are stored as a bit vector or array of pointers to the
173 : conflicting allocnos whatever is more profitable (file
174 : ira-conflicts.cc). At this point IRA creates allocno copies.
175 :
176 : o Coloring. Now IRA has all necessary info to start graph coloring
177 : process. It is done in each region on top-down traversal of the
178 : region tree (file ira-color.cc). There are the following subpasses:
179 :
180 : * Finding profitable hard registers of corresponding allocno
181 : class for each allocno. For example, only callee-saved hard
182 : registers are frequently profitable for allocnos living
183 : through calls. If the profitable hard register set of
184 : allocno does not form a tree based on subset relation, we use
185 : some approximation to form the tree. This approximation is
186 : used to figure out trivial colorability of allocnos. The
187 : approximation is a pretty rare case.
188 :
189 : * Putting allocnos onto the coloring stack. IRA uses Briggs
190 : optimistic coloring which is a major improvement over
191 : Chaitin's coloring. Therefore IRA does not spill allocnos at
192 : this point. There is some freedom in the order of putting
193 : allocnos on the stack which can affect the final result of
194 : the allocation. IRA uses some heuristics to improve the
195 : order. The major one is to form *threads* from colorable
196 : allocnos and push them on the stack by threads. Thread is a
197 : set of non-conflicting colorable allocnos connected by
198 : copies. The thread contains allocnos from the colorable
199 : bucket or colorable allocnos already pushed onto the coloring
200 : stack. Pushing thread allocnos one after another onto the
201 : stack increases chances of removing copies when the allocnos
202 : get the same hard reg.
203 :
204 : We also use a modification of Chaitin-Briggs algorithm which
205 : works for intersected register classes of allocnos. To
206 : figure out trivial colorability of allocnos, the mentioned
207 : above tree of hard register sets is used. To get an idea of how
208 : the algorithm works, take i386 as an example and consider an
209 : allocno to which any general hard register can be assigned.
210 : If the allocno conflicts with eight allocnos to which only
211 : EAX register can be assigned, given allocno is still
212 : trivially colorable because all conflicting allocnos might be
213 : assigned only to EAX and all other general hard registers are
214 : still free.
215 :
216 : To get an idea of the used trivial colorability criterion, it
217 : is also useful to read article "Graph-Coloring Register
218 : Allocation for Irregular Architectures" by Michael D. Smith
219 : and Glenn Holloway. The major difference between the article's
220 : approach and the approach used in IRA is that Smith's approach
221 : takes register classes only from the machine description, while IRA
222 : calculates register classes from intermediate code too
223 : (e.g. an explicit usage of hard registers in RTL code for
224 : parameter passing can result in creation of additional
225 : register classes which contain or exclude the hard
226 : registers). That makes IRA approach useful for improving
227 : coloring even for architectures with regular register files
228 : and in fact some benchmarking shows the improvement for
229 : regular class architectures is even bigger than for irregular
230 : ones. Another difference is that Smith's approach chooses
231 : intersection of classes of all insn operands in which a given
232 : pseudo occurs. IRA can use bigger classes if it is still
233 : more profitable than memory usage.
234 :
235 : * Popping the allocnos from the stack and assigning them hard
236 : registers. If IRA cannot assign a hard register to an
237 : allocno and the allocno is coalesced, IRA undoes the
238 : coalescing and puts the uncoalesced allocnos onto the stack in
239 : the hope that some such allocnos will get a hard register
240 : separately. If IRA fails to assign hard register or memory
241 : is more profitable for it, IRA spills the allocno. IRA
242 : assigns the allocno the hard-register with minimal full
243 : allocation cost which reflects the cost of usage of the
244 : hard-register for the allocno and cost of usage of the
245 : hard-register for allocnos conflicting with given allocno.
246 :
247 : * Chaitin-Briggs coloring assigns as many pseudos as possible
248 : to hard registers. After coloring we try to improve
249 : allocation from the cost point of view. We improve the
250 : allocation by spilling some allocnos and assigning the freed
251 : hard registers to other allocnos if it decreases the overall
252 : allocation cost.
253 :
254 : * After allocno assigning in the region, IRA modifies the hard
255 : register and memory costs for the corresponding allocnos in
256 : the subregions to reflect the cost of possible loads, stores,
257 : or moves on the border of the region and its subregions.
258 : When default regional allocation algorithm is used
259 : (-fira-algorithm=mixed), IRA just propagates the assignment
260 : for allocnos if the register pressure in the region for the
261 : corresponding pressure class is less than number of available
262 : hard registers for given pressure class.
263 :
264 : o Spill/restore code moving. When IRA performs an allocation
265 : by traversing regions in top-down order, it does not know what
266 : happens below in the region tree. Therefore, sometimes IRA
267 : misses opportunities to perform a better allocation. A simple
268 : optimization tries to improve allocation in a region having
269 : subregions and contained in another region. If the
270 : corresponding allocnos in the subregion are spilled, it spills
271 : the region allocno if it is profitable. The optimization
272 : implements a simple iterative algorithm performing profitable
273 : transformations while they are still possible. It is fast in
274 : practice, so there is no real need for a better time complexity
275 : algorithm.
276 :
277 : o Code change. After coloring, two allocnos representing the
278 : same pseudo-register outside and inside a region respectively
279 : may be assigned to different locations (hard-registers or
280 : memory). In this case IRA creates and uses a new
281 : pseudo-register inside the region and adds code to move allocno
282 : values on the region's borders. This is done during top-down
283 : traversal of the regions (file ira-emit.cc). In some
284 : complicated cases IRA can create a new allocno to move allocno
285 : values (e.g. when a swap of values stored in two hard-registers
286 : is needed). At this stage, the new allocno is marked as
287 : spilled. IRA still creates the pseudo-register and the moves
288 : on the region borders even when both allocnos were assigned to
289 : the same hard-register. If the reload pass spills a
290 : pseudo-register for some reason, the effect will be smaller
291 : because another allocno will still be in the hard-register. In
292 : most cases, this is better than spilling both allocnos. If
293 : reload does not change the allocation for the two
294 : pseudo-registers, the trivial move will be removed by
295 : post-reload optimizations. IRA does not generate moves for
296 : allocnos assigned to the same hard register when the default
297 : regional allocation algorithm is used and the register pressure
298 : in the region for the corresponding pressure class is less than
299 : number of available hard registers for given pressure class.
300 : IRA also does some optimizations to remove redundant stores and
301 : to reduce code duplication on the region borders.
302 :
303 : o Flattening internal representation. After changing code, IRA
304 : transforms its internal representation for several regions into
305 : one region representation (file ira-build.cc). This process is
306 : called IR flattening. Such process is more complicated than IR
307 : rebuilding would be, but is much faster.
308 :
309 : o After IR flattening, IRA tries to assign hard registers to all
310 : spilled allocnos. This is implemented by a simple and fast
311 : priority coloring algorithm (see function
312 : ira_reassign_conflict_allocnos in ira-color.cc). Here new allocnos
313 : created during the code change pass can be assigned to hard
314 : registers.
315 :
316 : o At the end IRA calls the reload pass. The reload pass
317 : communicates with IRA through several functions in file
318 : ira-color.cc to improve its decisions in
319 :
320 : * sharing stack slots for the spilled pseudos based on IRA info
321 : about pseudo-register conflicts.
322 :
323 : * reassigning hard-registers to all spilled pseudos at the end
324 : of each reload iteration.
325 :
326 : * choosing a better hard-register to spill based on IRA info
327 : about pseudo-register live ranges and the register pressure
328 : in places where the pseudo-register lives.
329 :
330 : IRA uses a lot of data representing the target processors. These
331 : data are initialized in file ira.cc.
332 :
333 : If function has no loops (or the loops are ignored when
334 : -fira-algorithm=CB is used), we have classic Chaitin-Briggs
335 : coloring (only instead of separate pass of coalescing, we use hard
336 : register preferencing). In such case, IRA works much faster
337 : because many things are not done (like IR flattening, the
338 : spill/restore optimization, and the code change).
339 :
340 : Literature worth reading for a better understanding of the code:
341 :
342 : o Preston Briggs, Keith D. Cooper, Linda Torczon. Improvements to
343 : Graph Coloring Register Allocation.
344 :
345 : o David Callahan, Brian Koblenz. Register allocation via
346 : hierarchical graph coloring.
347 :
348 : o Keith Cooper, Anshuman Dasgupta, Jason Eckhardt. Revisiting Graph
349 : Coloring Register Allocation: A Study of the Chaitin-Briggs and
350 : Callahan-Koblenz Algorithms.
351 :
352 : o Guei-Yuan Lueh, Thomas Gross, and Ali-Reza Adl-Tabatabai. Global
353 : Register Allocation Based on Graph Fusion.
354 :
355 : o Michael D. Smith and Glenn Holloway. Graph-Coloring Register
356 : Allocation for Irregular Architectures.
357 :
358 : o Vladimir Makarov. The Integrated Register Allocator for GCC.
359 :
360 : o Vladimir Makarov. The top-down register allocator for irregular
361 : register file architectures.
362 :
363 : */
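The i386 example in the overview comment (an allocno that may use any general register and conflicts with eight allocnos restricted to EAX) can be checked with a deliberately simplified criterion. The sketch below is not IRA's forest-based computation from ira-color.cc. It assumes a hypothetical eight-register file in which bit 0 stands in for EAX, and it assumes every allocno needs exactly one hard register. `RegSet` and `trivially_colorable` are illustrative names, not GCC identifiers.

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical toy register file: eight registers, one bit per register.
constexpr std::size_t kNumRegs = 8;
using RegSet = std::bitset<kNumRegs>;

// Conservative (sufficient, not necessary) test.  The conflicting
// allocnos can occupy at most min (number of conflicts, number of
// registers of OWN that appear in some conflicting set) registers of
// OWN.  If OWN has more registers than that, one is always left free.
bool trivially_colorable(const RegSet &own,
                         const std::vector<RegSet> &conflicts)
{
  RegSet blocked;
  for (const RegSet &c : conflicts)
    blocked |= c & own;
  std::size_t max_blocked = std::min(conflicts.size(), blocked.count());
  return own.count() > max_blocked;
}
```

With eight conflicting allocnos that all accept only bit 0, at most one register of the unrestricted allocno can be blocked, so the test succeeds. With eight unrestricted conflicting allocnos it fails, which matches the classic degree-based criterion.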
364 :
365 :
366 : #include "config.h"
367 : #include "system.h"
368 : #include "coretypes.h"
369 : #include "backend.h"
370 : #include "target.h"
371 : #include "rtl.h"
372 : #include "tree.h"
373 : #include "df.h"
374 : #include "memmodel.h"
375 : #include "tm_p.h"
376 : #include "insn-config.h"
377 : #include "regs.h"
378 : #include "ira.h"
379 : #include "ira-int.h"
380 : #include "diagnostic-core.h"
381 : #include "cfgrtl.h"
382 : #include "cfgbuild.h"
383 : #include "cfgcleanup.h"
384 : #include "expr.h"
385 : #include "tree-pass.h"
386 : #include "output.h"
387 : #include "reload.h"
388 : #include "cfgloop.h"
389 : #include "lra.h"
390 : #include "dce.h"
391 : #include "dbgcnt.h"
392 : #include "rtl-iter.h"
393 : #include "shrink-wrap.h"
394 : #include "print-rtl.h"
395 :
396 : struct target_ira default_target_ira;
397 : class target_ira_int default_target_ira_int;
398 : #if SWITCHABLE_TARGET
399 : struct target_ira *this_target_ira = &default_target_ira;
400 : class target_ira_int *this_target_ira_int = &default_target_ira_int;
401 : #endif
402 :
403 : /* A modified value of flag `-fira-verbose' used internally. */
404 : int internal_flag_ira_verbose;
405 :
406 : /* Dump file of the allocator if it is not NULL. */
407 : FILE *ira_dump_file;
408 :
409 : /* The number of elements in the following array. */
410 : int ira_spilled_reg_stack_slots_num;
411 :
412 : /* The following array contains info about spilled pseudo-registers
413 : stack slots used in current function so far. */
414 : class ira_spilled_reg_stack_slot *ira_spilled_reg_stack_slots;
415 :
416 : /* Correspondingly overall cost of the allocation, overall cost before
417 : reload, cost of the allocnos assigned to hard-registers, cost of
418 : the allocnos assigned to memory, cost of loads, stores and register
419 : move insns generated for pseudo-register live range splitting (see
420 : ira-emit.cc). */
421 : int64_t ira_overall_cost, overall_cost_before;
422 : int64_t ira_reg_cost, ira_mem_cost;
423 : int64_t ira_load_cost, ira_store_cost, ira_shuffle_cost;
424 : int ira_move_loops_num, ira_additional_jumps_num;
425 :
426 : /* All registers that can be eliminated. */
427 :
428 : HARD_REG_SET eliminable_regset;
429 :
430 : /* Value of max_reg_num () before IRA work starts. This value helps
431 : us to recognize a situation when new pseudos were created during
432 : IRA work. */
433 : static int max_regno_before_ira;
434 :
435 : /* Temporary hard reg set used for a different calculation. */
436 : static HARD_REG_SET temp_hard_regset;
437 :
438 : #define last_mode_for_init_move_cost \
439 : (this_target_ira_int->x_last_mode_for_init_move_cost)
440 :
441 :
442 : /* The function sets up the map IRA_REG_MODE_HARD_REGSET. */
443 : static void
444 222045 : setup_reg_mode_hard_regset (void)
445 : {
446 222045 : int i, m, hard_regno;
447 :
448 27755625 : for (m = 0; m < NUM_MACHINE_MODES; m++)
449 2560622940 : for (hard_regno = 0; hard_regno < FIRST_PSEUDO_REGISTER; hard_regno++)
450 : {
451 2533089360 : CLEAR_HARD_REG_SET (ira_reg_mode_hard_regset[hard_regno][m]);
452 7665551448 : for (i = hard_regno_nregs (hard_regno, (machine_mode) m) - 1;
453 7665551448 : i >= 0; i--)
454 5132462088 : if (hard_regno + i < FIRST_PSEUDO_REGISTER)
455 4708368210 : SET_HARD_REG_BIT (ira_reg_mode_hard_regset[hard_regno][m],
456 : hard_regno + i);
457 : }
458 222045 : }
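As a rough illustration of what setup_reg_mode_hard_regset computes for a single (register, mode) pair, the following sketch builds the set of registers occupied by a value that starts at a given register and spans several consecutive ones. It assumes a hypothetical eight-register target and takes the register count directly as a parameter instead of calling hard_regno_nregs. `HardRegSet` and `occupied_regs` are illustrative names.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical target with eight hard registers.
constexpr int kNumHardRegs = 8;
using HardRegSet = std::bitset<kNumHardRegs>;

// Return the set of hard registers occupied by a value that starts at
// REGNO and needs NREGS consecutive registers.  Registers past the end
// of the register file are ignored, mirroring the bounds check in the
// function above.
HardRegSet occupied_regs(int regno, int nregs)
{
  assert(regno >= 0 && regno < kNumHardRegs && nregs >= 0);
  HardRegSet set;
  for (int i = nregs - 1; i >= 0; i--)
    if (regno + i < kNumHardRegs)
      set.set(regno + i);
  return set;
}
```

A two-register value starting at register 2 yields the set {2, 3}. The same value starting at register 7 yields only {7}, because register 8 does not exist in this toy target.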
459 :
460 :
461 : #define no_unit_alloc_regs \
462 : (this_target_ira_int->x_no_unit_alloc_regs)
463 :
464 : /* The function sets up the three arrays declared above. */
465 : static void
466 222045 : setup_class_hard_regs (void)
467 : {
468 222045 : int cl, i, hard_regno, n;
469 222045 : unsigned int j;
470 222045 : HARD_REG_SET processed_hard_reg_set;
471 :
472 222045 : ira_assert (SHRT_MAX >= FIRST_PSEUDO_REGISTER);
473 7771575 : for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
474 : {
475 7549530 : temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
476 7549530 : CLEAR_HARD_REG_SET (processed_hard_reg_set);
477 702106290 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
478 : {
479 694556760 : ira_non_ordered_class_hard_regs[cl][i] = -1;
480 694556760 : ira_class_hard_reg_index[cl][i] = -1;
481 : }
482 702106290 : for (n = 0, i = 0; i < FIRST_PSEUDO_REGISTER; i++)
483 : {
484 : #ifdef REG_ALLOC_ORDER
485 694556760 : hard_regno = reg_alloc_order[i];
486 : #else
487 : hard_regno = i;
488 : #endif
489 694556760 : if (TEST_HARD_REG_BIT (processed_hard_reg_set, hard_regno))
490 30198120 : continue;
491 664358640 : SET_HARD_REG_BIT (processed_hard_reg_set, hard_regno);
492 664358640 : if (! TEST_HARD_REG_BIT (temp_hard_regset, hard_regno))
493 587867306 : ira_class_hard_reg_index[cl][hard_regno] = -1;
494 : else
495 : {
496 76491334 : ira_class_hard_reg_index[cl][hard_regno] = n;
497 76491334 : ira_class_hard_regs[cl][n++] = hard_regno;
498 : }
499 : }
500 7549530 : ira_class_hard_regs_num[cl] = n;
501 7549530 : n = 0;
502 7549530 : j = 0;
503 7549530 : hard_reg_set_iterator hrsi;
504 84040864 : EXECUTE_IF_SET_IN_HARD_REG_SET (temp_hard_regset, 0, j, hrsi)
505 76491334 : ira_non_ordered_class_hard_regs[cl][n++] = j;
506 :
507 7549530 : ira_assert (ira_class_hard_regs_num[cl] == n);
508 : }
509 222045 : }
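The ordering logic of setup_class_hard_regs can be sketched for one register class as follows. This is a simplified model under stated assumptions: register sets are 32-bit masks, the allocation order is passed in as a vector (standing in for reg_alloc_order), and `ClassRegs` and `order_class_regs` are hypothetical names. It shows how duplicates in the allocation order are skipped and how the index map is filled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ClassRegs
{
  std::vector<int> regs;   // Hard registers of the class in allocation order.
  std::vector<int> index;  // index[r] is the position of r in regs, or -1.
};

// ALLOCATABLE is the class contents with unavailable registers already
// masked out.  ALLOC_ORDER may mention a register more than once; only
// its first occurrence counts.
ClassRegs order_class_regs(const std::vector<int> &alloc_order,
                           std::uint32_t allocatable, int num_regs)
{
  assert(num_regs > 0 && num_regs <= 32);
  ClassRegs result;
  result.index.assign(num_regs, -1);
  std::uint32_t processed = 0;
  for (int regno : alloc_order)
    {
      assert(regno >= 0 && regno < num_regs);
      std::uint32_t bit = std::uint32_t{1} << regno;
      if (processed & bit)
        continue;
      processed |= bit;
      if (allocatable & bit)
        {
          result.index[regno] = static_cast<int>(result.regs.size());
          result.regs.push_back(regno);
        }
    }
  return result;
}
```

For an allocation order of 3, 1, 1, 0, 2 and a class containing registers 1 and 3, the class registers come out as 3 then 1, and only those two registers receive a non-negative index.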
510 :
511 : /* Set up global variables defining info about hard registers for the
512 : allocation. These depend on USE_HARD_FRAME_P whose TRUE value means
513 : that we can use the hard frame pointer for the allocation. */
514 : static void
515 222045 : setup_alloc_regs (bool use_hard_frame_p)
516 : {
517 : #ifdef ADJUST_REG_ALLOC_ORDER
518 222045 : ADJUST_REG_ALLOC_ORDER;
519 : #endif
520 222045 : no_unit_alloc_regs = fixed_nonglobal_reg_set;
521 222045 : if (! use_hard_frame_p)
522 72226 : add_to_hard_reg_set (&no_unit_alloc_regs, Pmode,
523 : HARD_FRAME_POINTER_REGNUM);
524 222045 : setup_class_hard_regs ();
525 222045 : }
526 :
527 :
528 :
529 : #define alloc_reg_class_subclasses \
530 : (this_target_ira_int->x_alloc_reg_class_subclasses)
531 :
532 : /* Initialize the table of subclasses of each reg class. */
533 : static void
534 222045 : setup_reg_subclasses (void)
535 : {
536 222045 : int i, j;
537 222045 : HARD_REG_SET temp_hard_regset2;
538 :
539 7771575 : for (i = 0; i < N_REG_CLASSES; i++)
540 264233550 : for (j = 0; j < N_REG_CLASSES; j++)
541 256684020 : alloc_reg_class_subclasses[i][j] = LIM_REG_CLASSES;
542 :
543 7771575 : for (i = 0; i < N_REG_CLASSES; i++)
544 : {
545 7549530 : if (i == (int) NO_REGS)
546 222045 : continue;
547 :
548 7327485 : temp_hard_regset = reg_class_contents[i] & ~no_unit_alloc_regs;
549 14654970 : if (hard_reg_set_empty_p (temp_hard_regset))
550 441160 : continue;
551 241021375 : for (j = 0; j < N_REG_CLASSES; j++)
552 234135050 : if (i != j)
553 : {
554 227248725 : enum reg_class *p;
555 :
556 227248725 : temp_hard_regset2 = reg_class_contents[j] & ~no_unit_alloc_regs;
557 454497450 : if (! hard_reg_set_subset_p (temp_hard_regset,
558 : temp_hard_regset2))
559 173163731 : continue;
560 54084994 : p = &alloc_reg_class_subclasses[j][0];
561 520210029 : while (*p != LIM_REG_CLASSES) p++;
562 54084994 : *p = (enum reg_class) i;
563 : }
564 : }
565 222045 : }
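The subclass table built by setup_reg_subclasses can be modelled with plain bit masks. The sketch below assumes hypothetical register classes given as 32-bit masks and a mask of registers unavailable for allocation. `compute_subclasses` is an illustrative name, and the result uses growable vectors instead of the LIM_REG_CLASSES-terminated arrays of the real table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// For each class J, list every other class I whose allocatable
// registers are non-empty and form a subset of J's allocatable
// registers.  Classes are identified by their position in CLASS_REGS.
std::vector<std::vector<int>>
compute_subclasses(const std::vector<std::uint32_t> &class_regs,
                   std::uint32_t unavailable)
{
  std::vector<std::vector<int>> subclasses(class_regs.size());
  for (std::size_t i = 0; i < class_regs.size(); i++)
    {
      std::uint32_t set_i = class_regs[i] & ~unavailable;
      if (set_i == 0)
        continue;  // Empty classes (such as NO_REGS) are never subclasses.
      for (std::size_t j = 0; j < class_regs.size(); j++)
        {
          if (i == j)
            continue;
          std::uint32_t set_j = class_regs[j] & ~unavailable;
          if ((set_i & ~set_j) == 0)  // set_i is a subset of set_j.
            subclasses[j].push_back(static_cast<int>(i));
        }
    }
  return subclasses;
}
```

With classes {}, {0}, {0, 1} and {0, 1, 2, 3}, the last class gets the second and third as subclasses. Once register 0 is marked unavailable, the second class becomes empty and drops out of every list.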
566 :
567 :
568 :
569 : /* Set up IRA_CLASS_SUBSET_P, IRA_MEMORY_MOVE_COST and IRA_MAX_MEMORY_MOVE_COST. */
570 : static void
571 222045 : setup_class_subset_and_memory_move_costs (void)
572 : {
573 222045 : int cl, cl2, mode, cost;
574 222045 : HARD_REG_SET temp_hard_regset2;
575 :
576 27755625 : for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
577 27533580 : ira_memory_move_cost[mode][NO_REGS][0]
578 27533580 : = ira_memory_move_cost[mode][NO_REGS][1] = SHRT_MAX;
579 7771575 : for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
580 : {
581 7549530 : if (cl != (int) NO_REGS)
582 915935625 : for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
583 : {
584 1817216280 : ira_max_memory_move_cost[mode][cl][0]
585 908608140 : = ira_memory_move_cost[mode][cl][0]
586 908608140 : = memory_move_cost ((machine_mode) mode,
587 : (reg_class_t) cl, false);
588 1817216280 : ira_max_memory_move_cost[mode][cl][1]
589 908608140 : = ira_memory_move_cost[mode][cl][1]
590 908608140 : = memory_move_cost ((machine_mode) mode,
591 : (reg_class_t) cl, true);
592 : /* Costs for NO_REGS are used in cost calculation on the
593 : 1st pass when the preferred register classes are not
594 : known yet. In this case we take the best scenario. */
595 908608140 : if (!targetm.hard_regno_mode_ok (ira_class_hard_regs[cl][0],
596 : (machine_mode) mode))
597 669976509 : continue;
598 :
599 238631631 : if (ira_memory_move_cost[mode][NO_REGS][0]
600 238631631 : > ira_memory_move_cost[mode][cl][0])
601 13415669 : ira_max_memory_move_cost[mode][NO_REGS][0]
602 13415669 : = ira_memory_move_cost[mode][NO_REGS][0]
603 13415669 : = ira_memory_move_cost[mode][cl][0];
604 238631631 : if (ira_memory_move_cost[mode][NO_REGS][1]
605 238631631 : > ira_memory_move_cost[mode][cl][1])
606 13406516 : ira_max_memory_move_cost[mode][NO_REGS][1]
607 13406516 : = ira_memory_move_cost[mode][NO_REGS][1]
608 13406516 : = ira_memory_move_cost[mode][cl][1];
609 : }
610 : }
611 7771575 : for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
612 264233550 : for (cl2 = (int) N_REG_CLASSES - 1; cl2 >= 0; cl2--)
613 : {
614 256684020 : temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
615 256684020 : temp_hard_regset2 = reg_class_contents[cl2] & ~no_unit_alloc_regs;
616 256684020 : ira_class_subset_p[cl][cl2]
617 256684020 : = hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2);
618 256684020 : if (! hard_reg_set_empty_p (temp_hard_regset2)
619 490819070 : && hard_reg_set_subset_p (reg_class_contents[cl2],
620 : reg_class_contents[cl]))
621 6909607375 : for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
622 : {
623 6854330516 : cost = ira_memory_move_cost[mode][cl2][0];
624 6854330516 : if (cost > ira_max_memory_move_cost[mode][cl][0])
625 116901420 : ira_max_memory_move_cost[mode][cl][0] = cost;
626 6854330516 : cost = ira_memory_move_cost[mode][cl2][1];
627 6854330516 : if (cost > ira_max_memory_move_cost[mode][cl][1])
628 116952646 : ira_max_memory_move_cost[mode][cl][1] = cost;
629 : }
630 : }
631 7771575 : for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
632 943691250 : for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
633 : {
634 936141720 : ira_memory_move_cost[mode][cl][0]
635 936141720 : = ira_max_memory_move_cost[mode][cl][0];
636 936141720 : ira_memory_move_cost[mode][cl][1]
637 936141720 : = ira_max_memory_move_cost[mode][cl][1];
638 : }
639 222045 : setup_reg_subclasses ();
640 222045 : }
641 :
642 :
643 :
644 : /* Define the following macro if allocation through malloc is
645 : preferable. */
646 : #define IRA_NO_OBSTACK
647 :
648 : #ifndef IRA_NO_OBSTACK
649 : /* Obstack used for storing all dynamic data (except bitmaps) of the
650 : IRA. */
651 : static struct obstack ira_obstack;
652 : #endif
653 :
654 : /* Obstack used for storing all bitmaps of the IRA. */
655 : static struct bitmap_obstack ira_bitmap_obstack;
656 :
657 : /* Allocate memory of size LEN for IRA data. */
658 : void *
659 205902877 : ira_allocate (size_t len)
660 : {
661 205902877 : void *res;
662 :
663 : #ifndef IRA_NO_OBSTACK
664 : res = obstack_alloc (&ira_obstack, len);
665 : #else
666 205902877 : res = xmalloc (len);
667 : #endif
668 205902877 : return res;
669 : }
670 :
671 : /* Free memory ADDR allocated for IRA data. */
672 : void
673 205902877 : ira_free (void *addr ATTRIBUTE_UNUSED)
674 : {
675 : #ifndef IRA_NO_OBSTACK
676 : /* do nothing */
677 : #else
678 205902877 : free (addr);
679 : #endif
680 205902877 : }
681 :
682 :
683 : /* Allocate and return a bitmap for IRA. */
684 : bitmap
685 10621816 : ira_allocate_bitmap (void)
686 : {
687 10621816 : return BITMAP_ALLOC (&ira_bitmap_obstack);
688 : }
689 :
690 : /* Free bitmap B allocated for IRA. */
691 : void
692 10621816 : ira_free_bitmap (bitmap b ATTRIBUTE_UNUSED)
693 : {
694 : /* do nothing */
695 10621816 : }
696 :
697 :
698 :
699 : /* Output information about allocation of all allocnos (except for
700 : caps) into file F. */
701 : void
702 95 : ira_print_disposition (FILE *f)
703 : {
704 95 : int i, n, max_regno;
705 95 : ira_allocno_t a;
706 95 : basic_block bb;
707 :
708 95 : fprintf (f, "Disposition:");
709 95 : max_regno = max_reg_num ();
710 1973 : for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
711 1783 : for (a = ira_regno_allocno_map[i];
712 2378 : a != NULL;
713 595 : a = ALLOCNO_NEXT_REGNO_ALLOCNO (a))
714 : {
715 595 : if (n % 4 == 0)
716 178 : fprintf (f, "\n");
717 595 : n++;
718 595 : fprintf (f, " %4d:r%-4d", ALLOCNO_NUM (a), ALLOCNO_REGNO (a));
719 595 : if ((bb = ALLOCNO_LOOP_TREE_NODE (a)->bb) != NULL)
720 0 : fprintf (f, "b%-3d", bb->index);
721 : else
722 595 : fprintf (f, "l%-3d", ALLOCNO_LOOP_TREE_NODE (a)->loop_num);
723 595 : if (ALLOCNO_HARD_REGNO (a) >= 0)
724 594 : fprintf (f, " %3d", ALLOCNO_HARD_REGNO (a));
725 : else
726 1 : fprintf (f, " mem");
727 : }
728 95 : fprintf (f, "\n");
729 95 : }
730 :
731 : /* Output information about allocation of all allocnos into
732 : stderr. */
733 : void
734 0 : ira_debug_disposition (void)
735 : {
736 0 : ira_print_disposition (stderr);
737 0 : }
738 :
739 :
740 :
741 : /* Set up ira_stack_reg_pressure_class, which is the pressure
742 : register class containing the most stack registers, or NO_REGS if
743 : there are no stack registers. To find this class, we iterate
744 : through all register pressure classes and choose the first one
745 : whose intersection with the set of stack registers is the
746 : largest. */
747 : static void
748 222045 : setup_stack_reg_pressure_class (void)
749 : {
750 222045 : ira_stack_reg_pressure_class = NO_REGS;
751 : #ifdef STACK_REGS
752 222045 : {
753 222045 : int i, best, size;
754 222045 : enum reg_class cl;
755 222045 : HARD_REG_SET temp_hard_regset2;
756 :
757 222045 : CLEAR_HARD_REG_SET (temp_hard_regset);
758 1998405 : for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
759 1776360 : SET_HARD_REG_BIT (temp_hard_regset, i);
760 : best = 0;
761 1112022 : for (i = 0; i < ira_pressure_classes_num; i++)
762 : {
763 889977 : cl = ira_pressure_classes[i];
764 889977 : temp_hard_regset2 = temp_hard_regset & reg_class_contents[cl];
765 889977 : size = hard_reg_set_popcount (temp_hard_regset2);
766 889977 : if (best < size)
767 : {
768 221359 : best = size;
769 221359 : ira_stack_reg_pressure_class = cl;
770 : }
771 : }
772 : }
773 : #endif
774 222045 : }
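
The selection loop in setup_stack_reg_pressure_class can be illustrated outside GCC with plain bit masks. This is only a sketch under the assumption that a hard register set fits in a 64-bit mask; `popcount64` and `pick_stack_pressure_class` are hypothetical names, not GCC functions, and a result of -1 plays the role of NO_REGS.

```c
#include <stdint.h>

/* Hypothetical stand-in for hard_reg_set_popcount on a 64-bit mask.  */
static int
popcount64 (uint64_t x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
}

/* Return the index of the first class in CLASSES that shares the most
   registers with STACK_REGS, or -1 if no class shares any.  */
static int
pick_stack_pressure_class (const uint64_t *classes, int num,
			   uint64_t stack_regs)
{
  int best = 0, result = -1;

  for (int i = 0; i < num; i++)
    {
      int size = popcount64 (classes[i] & stack_regs);
      /* Strict comparison: on a tie the earlier class is kept.  */
      if (best < size)
	{
	  best = size;
	  result = i;
	}
    }
  return result;
}
```

Because the comparison is strict, a later class with the same overlap never replaces an earlier one, which mirrors the `best < size` test in the listing.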
775 :
776 : /* Find pressure classes which are register classes for which we
777 : calculate register pressure in IRA, register pressure sensitive
778 : insn scheduling, and register pressure sensitive loop invariant
779 : motion.
780 :
781 : To make register pressure calculation easy, we always use
782 : non-intersecting register pressure classes. A move between hard
783 : registers of one register pressure class is not more expensive
784 : than a load and store of the hard registers. Most likely an allocno
785 : class will be a subset of a register pressure class and in many
786 : cases a register pressure class itself. That makes register
787 : pressure classes a good approximation for finding high register
788 : pressure. */
789 : static void
790 222045 : setup_pressure_classes (void)
791 : {
792 222045 : int cost, i, n, curr;
793 222045 : int cl, cl2;
794 222045 : enum reg_class pressure_classes[N_REG_CLASSES];
795 222045 : int m;
796 222045 : HARD_REG_SET temp_hard_regset2;
797 222045 : bool insert_p;
798 :
799 222045 : if (targetm.compute_pressure_classes)
800 0 : n = targetm.compute_pressure_classes (pressure_classes);
801 : else
802 : {
803 : n = 0;
804 7771575 : for (cl = 0; cl < N_REG_CLASSES; cl++)
805 : {
806 7549530 : if (ira_class_hard_regs_num[cl] == 0)
807 663205 : continue;
808 6886325 : if (ira_class_hard_regs_num[cl] != 1
809 : /* A register class without subclasses may contain a few
810 : hard registers and movement between them is costly
811 : (e.g. SPARC FPCC registers). We still should consider it
812 : as a candidate for a pressure class. */
813 4890039 : && alloc_reg_class_subclasses[cl][0] < cl)
814 : {
815 : /* Check that the moves between any hard registers of the
816 : current class are not more expensive for a legal mode
817 : than load/store of the hard registers of the current
818 : class. Such class is a potential candidate to be a
819 : register pressure class. */
820 227085510 : for (m = 0; m < NUM_MACHINE_MODES; m++)
821 : {
822 225753908 : temp_hard_regset
823 225753908 : = (reg_class_contents[cl]
824 225753908 : & ~(no_unit_alloc_regs
825 225753908 : | ira_prohibited_class_mode_regs[cl][m]));
826 451507816 : if (hard_reg_set_empty_p (temp_hard_regset))
827 166863377 : continue;
828 58890531 : ira_init_register_move_cost_if_necessary ((machine_mode) m);
829 58890531 : cost = ira_register_move_cost[m][cl][cl];
830 58890531 : if (cost <= ira_max_memory_move_cost[m][cl][1]
831 55557369 : || cost <= ira_max_memory_move_cost[m][cl][0])
832 : break;
833 : }
834 4664764 : if (m >= NUM_MACHINE_MODES)
835 1331602 : continue;
836 : }
837 5554723 : curr = 0;
838 5554723 : insert_p = true;
839 5554723 : temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
840 : /* Remove so far added pressure classes which are subset of the
841 : current candidate class. Prefer GENERAL_REGS as a pressure
842 : register class to another class containing the same
843 : allocatable hard registers. We do this because machine
844 : dependent cost hooks might give wrong costs for the latter
845 : class but always give the right cost for the former class
846 : (GENERAL_REGS). */
847 19931073 : for (i = 0; i < n; i++)
848 : {
849 14376350 : cl2 = pressure_classes[i];
850 14376350 : temp_hard_regset2 = (reg_class_contents[cl2]
851 14376350 : & ~no_unit_alloc_regs);
852 14376350 : if (hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2)
853 15563488 : && (temp_hard_regset != temp_hard_regset2
854 1115336 : || cl2 == (int) GENERAL_REGS))
855 : {
856 737764 : pressure_classes[curr++] = (enum reg_class) cl2;
857 737764 : insert_p = false;
858 737764 : continue;
859 : }
860 17121834 : if (hard_reg_set_subset_p (temp_hard_regset2, temp_hard_regset)
861 17565568 : && (temp_hard_regset2 != temp_hard_regset
862 449374 : || cl == (int) GENERAL_REGS))
863 3483248 : continue;
864 10155338 : if (temp_hard_regset2 == temp_hard_regset)
865 443734 : insert_p = false;
866 10155338 : pressure_classes[curr++] = (enum reg_class) cl2;
867 : }
868 : /* If the current candidate is a subset of a so far added
869 : pressure class, don't add it to the list of the pressure
870 : classes. */
871 5554723 : if (insert_p)
872 4373225 : pressure_classes[curr++] = (enum reg_class) cl;
873 : n = curr;
874 : }
875 : }
876 : #ifdef ENABLE_IRA_CHECKING
877 222045 : {
878 222045 : HARD_REG_SET ignore_hard_regs;
879 :
880 : /* Check pressure classes correctness: here we check that hard
881 : registers from all register pressure classes contain all hard
882 : registers available for the allocation. */
883 888180 : CLEAR_HARD_REG_SET (temp_hard_regset);
884 222045 : CLEAR_HARD_REG_SET (temp_hard_regset2);
885 222045 : ignore_hard_regs = no_unit_alloc_regs;
886 7771575 : for (cl = 0; cl < LIM_REG_CLASSES; cl++)
887 : {
888 : /* For some targets (like MIPS with MD_REGS), there are some
889 : classes with hard registers available for allocation but
890 : not able to hold value of any mode. */
891 214552593 : for (m = 0; m < NUM_MACHINE_MODES; m++)
892 213889388 : if (contains_reg_of_mode[cl][m])
893 : break;
894 7549530 : if (m >= NUM_MACHINE_MODES)
895 : {
896 663205 : ignore_hard_regs |= reg_class_contents[cl];
897 663205 : continue;
898 : }
899 32283130 : for (i = 0; i < n; i++)
900 26286782 : if ((int) pressure_classes[i] == cl)
901 : break;
902 6886325 : temp_hard_regset2 |= reg_class_contents[cl];
903 6886325 : if (i < n)
904 7549530 : temp_hard_regset |= reg_class_contents[cl];
905 : }
906 20650185 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
907 : /* Some targets (like SPARC with ICC reg) have allocatable regs
908 : for which no reg class is defined. */
909 20428140 : if (REGNO_REG_CLASS (i) == NO_REGS)
910 444090 : SET_HARD_REG_BIT (ignore_hard_regs, i);
911 222045 : temp_hard_regset &= ~ignore_hard_regs;
912 222045 : temp_hard_regset2 &= ~ignore_hard_regs;
913 444090 : ira_assert (hard_reg_set_subset_p (temp_hard_regset2, temp_hard_regset));
914 : }
915 : #endif
916 222045 : ira_pressure_classes_num = 0;
917 1112022 : for (i = 0; i < n; i++)
918 : {
919 889977 : cl = (int) pressure_classes[i];
920 889977 : ira_reg_pressure_class_p[cl] = true;
921 889977 : ira_pressure_classes[ira_pressure_classes_num++] = (enum reg_class) cl;
922 : }
923 222045 : setup_stack_reg_pressure_class ();
924 222045 : }
925 :
926 : /* Set up IRA_UNIFORM_CLASS_P. Uniform class is a register class
927 : whose register move cost between any registers of the class is the
928 : same as for all its subclasses. We use the data to speed up the
929 : 2nd pass of calculations of allocno costs. */
930 : static void
931 222045 : setup_uniform_class_p (void)
932 : {
933 222045 : int i, cl, cl2, m;
934 :
935 7771575 : for (cl = 0; cl < N_REG_CLASSES; cl++)
936 : {
937 7549530 : ira_uniform_class_p[cl] = false;
938 7549530 : if (ira_class_hard_regs_num[cl] == 0)
939 663205 : continue;
940 : /* We cannot use alloc_reg_class_subclasses here because move
941 : cost hooks do not take into account that some registers are
942 : unavailable for the subtarget. E.g. for i686, INT_SSE_REGS
943 : is an element of alloc_reg_class_subclasses for GENERAL_REGS
944 : because SSE regs are unavailable. */
945 26857556 : for (i = 0; (cl2 = reg_class_subclasses[cl][i]) != LIM_REG_CLASSES; i++)
946 : {
947 21302834 : if (ira_class_hard_regs_num[cl2] == 0)
948 376 : continue;
949 2524303312 : for (m = 0; m < NUM_MACHINE_MODES; m++)
950 2504332457 : if (contains_reg_of_mode[cl][m] && contains_reg_of_mode[cl2][m])
951 : {
952 601784040 : ira_init_register_move_cost_if_necessary ((machine_mode) m);
953 601784040 : if (ira_register_move_cost[m][cl][cl]
954 601784040 : != ira_register_move_cost[m][cl2][cl2])
955 : break;
956 : }
957 21302458 : if (m < NUM_MACHINE_MODES)
958 : break;
959 : }
960 6886325 : if (cl2 == LIM_REG_CLASSES)
961 5554722 : ira_uniform_class_p[cl] = true;
962 : }
963 222045 : }
964 :
965 : /* Set up IRA_ALLOCNO_CLASSES, IRA_ALLOCNO_CLASSES_NUM,
966 : IRA_IMPORTANT_CLASSES, and IRA_IMPORTANT_CLASSES_NUM.
967 :
968 : A target may have many subtargets and not all target hard registers
969 : can be used for allocation (e.g. the x86 port in 32-bit mode cannot
970 : use hard registers introduced in x86-64 like r8-r15). Some classes
971 : might have the same allocatable hard registers, e.g. INDEX_REGS
972 : and GENERAL_REGS in x86 port in 32-bit mode. To reduce the effort
973 : of various calculations we introduce allocno classes which contain
974 : unique non-empty sets of allocatable hard registers.
975 :
976 : Pseudo class cost calculation in ira-costs.cc is very expensive.
977 : Therefore we try to decrease the number of classes involved in
978 : such calculation. Register classes used in the cost calculation
979 : are called important classes. They are allocno classes and other
980 : non-empty classes whose allocatable hard register sets are inside
981 : of an allocno class hard register set. At first sight, it
982 : looks as if they are just allocno classes. That is not true. In
983 : the example of the x86 port in 32-bit mode, allocno classes will
984 : contain GENERAL_REGS but not LEGACY_REGS (because allocatable hard
985 : registers are the same for both classes). The important
986 : classes will contain GENERAL_REGS and LEGACY_REGS. It is done
987 : because a machine description insn constraint may refer to
988 : LEGACY_REGS and code in ira-costs.cc is mostly based on investigation
989 : of the insn constraints. */
990 : static void
991 222045 : setup_allocno_and_important_classes (void)
992 : {
993 222045 : int i, j, n, cl;
994 222045 : bool set_p;
995 222045 : HARD_REG_SET temp_hard_regset2;
996 222045 : static enum reg_class classes[LIM_REG_CLASSES + 1];
997 :
998 222045 : n = 0;
999 : /* Collect classes which contain unique sets of allocatable hard
1000 : registers. Prefer GENERAL_REGS to other classes containing the
1001 : same set of hard registers. */
1002 7771575 : for (i = 0; i < LIM_REG_CLASSES; i++)
1003 : {
1004 7549530 : temp_hard_regset = reg_class_contents[i] & ~no_unit_alloc_regs;
1005 98234710 : for (j = 0; j < n; j++)
1006 : {
1007 92462870 : cl = classes[j];
1008 92462870 : temp_hard_regset2 = reg_class_contents[cl] & ~no_unit_alloc_regs;
1009 184925740 : if (temp_hard_regset == temp_hard_regset2)
1010 : break;
1011 : }
1012 7549530 : if (j >= n || targetm.additional_allocno_class_p (i))
1013 5771840 : classes[n++] = (enum reg_class) i;
1014 1777690 : else if (i == GENERAL_REGS)
1015 : /* Prefer general regs. For i386 example, it means that
1016 : we prefer GENERAL_REGS over INDEX_REGS or LEGACY_REGS
1017 : (all of them consist of the same available hard
1018 : registers). */
1019 5640 : classes[j] = (enum reg_class) i;
1020 : }
1021 222045 : classes[n] = LIM_REG_CLASSES;
1022 :
1023 : /* Set up classes which can be used for allocnos as classes
1024 : containing non-empty unique sets of allocatable hard
1025 : registers. */
1026 222045 : ira_allocno_classes_num = 0;
1027 5993885 : for (i = 0; (cl = classes[i]) != LIM_REG_CLASSES; i++)
1028 5771840 : if (ira_class_hard_regs_num[cl] > 0)
1029 5549795 : ira_allocno_classes[ira_allocno_classes_num++] = (enum reg_class) cl;
1030 222045 : ira_important_classes_num = 0;
1031 : /* Add non-allocno classes containing a non-empty set of
1032 : allocatable hard regs. */
1033 7771575 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1034 7549530 : if (ira_class_hard_regs_num[cl] > 0)
1035 : {
1036 6886325 : temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
1037 6886325 : set_p = false;
1038 106885302 : for (j = 0; j < ira_allocno_classes_num; j++)
1039 : {
1040 105548772 : temp_hard_regset2 = (reg_class_contents[ira_allocno_classes[j]]
1041 105548772 : & ~no_unit_alloc_regs);
1042 105548772 : if ((enum reg_class) cl == ira_allocno_classes[j])
1043 : break;
1044 99998977 : else if (hard_reg_set_subset_p (temp_hard_regset,
1045 : temp_hard_regset2))
1046 6951445 : set_p = true;
1047 : }
1048 6886325 : if (set_p && j >= ira_allocno_classes_num)
1049 1336530 : ira_important_classes[ira_important_classes_num++]
1050 1336530 : = (enum reg_class) cl;
1051 : }
1052 : /* Now add allocno classes to the important classes. */
1053 5771840 : for (j = 0; j < ira_allocno_classes_num; j++)
1054 5549795 : ira_important_classes[ira_important_classes_num++]
1055 5549795 : = ira_allocno_classes[j];
1056 7771575 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1057 : {
1058 7549530 : ira_reg_allocno_class_p[cl] = false;
1059 7549530 : ira_reg_pressure_class_p[cl] = false;
1060 : }
1061 5771840 : for (j = 0; j < ira_allocno_classes_num; j++)
1062 5549795 : ira_reg_allocno_class_p[ira_allocno_classes[j]] = true;
1063 222045 : setup_pressure_classes ();
1064 222045 : setup_uniform_class_p ();
1065 222045 : }
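
The first loop of setup_allocno_and_important_classes (collect classes with unique allocatable register sets, preferring GENERAL_REGS among duplicates) can be sketched with 64-bit masks. This is an illustration under stated assumptions, not the GCC code: `collect_unique_classes` and its parameters are hypothetical, the targetm.additional_allocno_class_p hook is left out, and classes with an empty allocatable set are kept here, whereas the listing filters them out in a later loop.

```c
#include <stdint.h>

/* Collect into OUT the indices of classes whose allocatable masks
   (CONTENTS[i] & ~UNAVAILABLE) are pairwise distinct.  If class
   GENERAL duplicates an already collected class, it takes that
   class's slot.  Return the number of entries written to OUT.  */
static int
collect_unique_classes (const uint64_t *contents, int num,
			uint64_t unavailable, int general, int *out)
{
  int n = 0;

  for (int i = 0; i < num; i++)
    {
      uint64_t set = contents[i] & ~unavailable;
      int j;

      /* Look for an already collected class with the same set.  */
      for (j = 0; j < n; j++)
	if ((contents[out[j]] & ~unavailable) == set)
	  break;
      if (j >= n)
	out[n++] = i;
      else if (i == general)
	out[j] = i;
    }
  return n;
}
```

With four classes whose masks are 0x0, 0x0F, 0xFF and 0xFFFF, an unavailable mask of 0xFF00 and class 3 as the general class, classes 2 and 3 have the same allocatable set, so the result keeps classes 0, 1 and 3.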
1066 :
1067 : /* Set up translation in CLASS_TRANSLATE of all classes into a class
1068 : given by array CLASSES of length CLASSES_NUM. The function is used
1069 : to translate any reg class to an allocno class or to a
1070 : pressure class. This translation is necessary for some
1071 : calculations when we can use only allocno or pressure classes and
1072 : such translation represents an approximate representation of all
1073 : classes.
1074 :
1075 : The translation in case when allocatable hard register set of a
1076 : given class is subset of allocatable hard register set of a class
1077 : in CLASSES is pretty simple. We use the smallest class from CLASSES
1078 : containing a given class. If allocatable hard register set of a
1079 : given class is not a subset of any corresponding set of a class
1080 : from CLASSES, we use the cheapest (from the load/store point of view)
1081 : class from CLASSES whose set intersects with given class set. */
1082 : static void
1083 444090 : setup_class_translate_array (enum reg_class *class_translate,
1084 : int classes_num, enum reg_class *classes)
1085 : {
1086 444090 : int cl, mode;
1087 444090 : enum reg_class aclass, best_class, *cl_ptr;
1088 444090 : int i, cost, min_cost, best_cost;
1089 :
1090 15543150 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1091 15099060 : class_translate[cl] = NO_REGS;
1092 :
1093 6883862 : for (i = 0; i < classes_num; i++)
1094 : {
1095 6439772 : aclass = classes[i];
1096 47283647 : for (cl_ptr = &alloc_reg_class_subclasses[aclass][0];
1097 47283647 : (cl = *cl_ptr) != LIM_REG_CLASSES;
1098 : cl_ptr++)
1099 40843875 : if (class_translate[cl] == NO_REGS)
1100 6292453 : class_translate[cl] = aclass;
1101 6439772 : class_translate[aclass] = aclass;
1102 : }
1103 : /* For classes which are not fully covered by one of given classes
1104 : (in other words covered by more than one given class), use the
1105 : cheapest class. */
1106 15543150 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1107 : {
1108 15099060 : if (cl == NO_REGS || class_translate[cl] != NO_REGS)
1109 13104606 : continue;
1110 : best_class = NO_REGS;
1111 : best_cost = INT_MAX;
1112 19108684 : for (i = 0; i < classes_num; i++)
1113 : {
1114 17114230 : aclass = classes[i];
1115 17114230 : temp_hard_regset = (reg_class_contents[aclass]
1116 17114230 : & reg_class_contents[cl]
1117 17114230 : & ~no_unit_alloc_regs);
1118 34228460 : if (! hard_reg_set_empty_p (temp_hard_regset))
1119 : {
1120 : min_cost = INT_MAX;
1121 361513125 : for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
1122 : {
1123 358621020 : cost = (ira_memory_move_cost[mode][aclass][0]
1124 358621020 : + ira_memory_move_cost[mode][aclass][1]);
1125 358621020 : if (min_cost > cost)
1126 : min_cost = cost;
1127 : }
1128 2892105 : if (best_class == NO_REGS || best_cost > min_cost)
1129 : {
1130 17114230 : best_class = aclass;
1131 17114230 : best_cost = min_cost;
1132 : }
1133 : }
1134 : }
1135 1994454 : class_translate[cl] = best_class;
1136 : }
1137 444090 : }
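
The two-step rule described in the comment before setup_class_translate_array (smallest containing class first, otherwise the cheapest intersecting class) can be sketched with bit masks. This is a simplified illustration, not the GCC implementation: the listing relies on the ordering of alloc_reg_class_subclasses rather than an explicit size comparison, and `mask_size`, `translate_class`, and the per-class `mem_cost` array are hypothetical. The sketch assumes the class being translated is non-empty.

```c
#include <limits.h>
#include <stdint.h>

/* Number of registers in mask X.  */
static int
mask_size (uint64_t x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    n++;
  return n;
}

/* Translate register mask CL to an index into CANDS (NUM entries).
   MEM_COST[i] is an assumed combined load/store cost of candidate i.
   Return -1 if no candidate intersects CL.  */
static int
translate_class (uint64_t cl, const uint64_t *cands, const int *mem_cost,
		 int num)
{
  int best = -1, best_size = INT_MAX, best_cost = INT_MAX;

  /* First choice: the smallest candidate that contains CL.  */
  for (int i = 0; i < num; i++)
    if ((cl & ~cands[i]) == 0 && mask_size (cands[i]) < best_size)
      {
	best = i;
	best_size = mask_size (cands[i]);
      }
  if (best >= 0)
    return best;

  /* Fallback: the cheapest candidate that intersects CL.  */
  for (int i = 0; i < num; i++)
    if ((cl & cands[i]) != 0 && mem_cost[i] < best_cost)
      {
	best = i;
	best_cost = mem_cost[i];
      }
  return best;
}
```

The fallback only runs when no single candidate covers the class, which corresponds to the second loop in the listing that handles classes still mapped to NO_REGS.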
1138 :
1139 : /* Set up array IRA_ALLOCNO_CLASS_TRANSLATE and
1140 : IRA_PRESSURE_CLASS_TRANSLATE. */
1141 : static void
1142 222045 : setup_class_translate (void)
1143 : {
1144 222045 : setup_class_translate_array (ira_allocno_class_translate,
1145 222045 : ira_allocno_classes_num, ira_allocno_classes);
1146 222045 : setup_class_translate_array (ira_pressure_class_translate,
1147 222045 : ira_pressure_classes_num, ira_pressure_classes);
1148 222045 : }
1149 :
1150 : /* Order numbers of allocno classes in original target allocno class
1151 : array, -1 for non-allocno classes. */
1152 : static int allocno_class_order[N_REG_CLASSES];
1153 :
1154 : /* The function used to sort the important classes. */
1155 : static int
1156 185095881 : comp_reg_classes_func (const void *v1p, const void *v2p)
1157 : {
1158 185095881 : enum reg_class cl1 = *(const enum reg_class *) v1p;
1159 185095881 : enum reg_class cl2 = *(const enum reg_class *) v2p;
1160 185095881 : enum reg_class tcl1, tcl2;
1161 185095881 : int diff;
1162 :
1163 185095881 : tcl1 = ira_allocno_class_translate[cl1];
1164 185095881 : tcl2 = ira_allocno_class_translate[cl2];
1165 185095881 : if (tcl1 != NO_REGS && tcl2 != NO_REGS
1166 185095881 : && (diff = allocno_class_order[tcl1] - allocno_class_order[tcl2]) != 0)
1167 : return diff;
1168 8385246 : return (int) cl1 - (int) cl2;
1169 : }
1170 :
1171 : /* For correct work of function setup_reg_class_relations we need to
1172 : reorder important classes according to the order of their allocno
1173 : classes. It places important classes containing the same
1174 : allocatable hard register set adjacent to each other and allocno
1175 : class with the allocatable hard register set right after the other
1176 : important classes with the same set.
1177 :
1178 : In the example from the comments of function
1179 : setup_allocno_and_important_classes, it places LEGACY_REGS and
1180 : GENERAL_REGS close to each other and GENERAL_REGS is after
1181 : LEGACY_REGS. */
1182 : static void
1183 222045 : reorder_important_classes (void)
1184 : {
1185 222045 : int i;
1186 :
1187 7771575 : for (i = 0; i < N_REG_CLASSES; i++)
1188 7549530 : allocno_class_order[i] = -1;
1189 5771840 : for (i = 0; i < ira_allocno_classes_num; i++)
1190 5549795 : allocno_class_order[ira_allocno_classes[i]] = i;
1191 222045 : qsort (ira_important_classes, ira_important_classes_num,
1192 : sizeof (enum reg_class), comp_reg_classes_func);
1193 7330415 : for (i = 0; i < ira_important_classes_num; i++)
1194 6886325 : ira_important_class_nums[ira_important_classes[i]] = i;
1195 222045 : }
1196 :
1197 : /* Set up IRA_REG_CLASS_SUBUNION, IRA_REG_CLASS_SUPERUNION,
1198 : IRA_REG_CLASS_SUPER_CLASSES, IRA_REG_CLASSES_INTERSECT, and
1199 : IRA_REG_CLASSES_INTERSECT_P. For the meaning of the relations,
1200 : please see corresponding comments in ira-int.h. */
1201 : static void
1202 222045 : setup_reg_class_relations (void)
1203 : {
1204 222045 : int i, cl1, cl2, cl3;
1205 222045 : HARD_REG_SET intersection_set, union_set, temp_set2;
1206 222045 : bool important_class_p[N_REG_CLASSES];
1207 :
1208 222045 : memset (important_class_p, 0, sizeof (important_class_p));
1209 7108370 : for (i = 0; i < ira_important_classes_num; i++)
1210 6886325 : important_class_p[ira_important_classes[i]] = true;
1211 7771575 : for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1212 : {
1213 7549530 : ira_reg_class_super_classes[cl1][0] = LIM_REG_CLASSES;
1214 264233550 : for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1215 : {
1216 256684020 : ira_reg_classes_intersect_p[cl1][cl2] = false;
1217 256684020 : ira_reg_class_intersect[cl1][cl2] = NO_REGS;
1218 256684020 : ira_reg_class_subset[cl1][cl2] = NO_REGS;
1219 256684020 : temp_hard_regset = reg_class_contents[cl1] & ~no_unit_alloc_regs;
1220 256684020 : temp_set2 = reg_class_contents[cl2] & ~no_unit_alloc_regs;
1221 256684020 : if (hard_reg_set_empty_p (temp_hard_regset)
1222 279232990 : && hard_reg_set_empty_p (temp_set2))
1223 : {
1224 : /* Both classes have no allocatable hard registers
1225 : -- take all class hard registers into account and use
1226 : reg_class_subunion and reg_class_superunion. */
1227 781714 : for (i = 0;; i++)
1228 : {
1229 2841631 : cl3 = reg_class_subclasses[cl1][i];
1230 2841631 : if (cl3 == LIM_REG_CLASSES)
1231 : break;
1232 781714 : if (reg_class_subset_p (ira_reg_class_intersect[cl1][cl2],
1233 : (enum reg_class) cl3))
1234 733368 : ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3;
1235 : }
1236 2059917 : ira_reg_class_subunion[cl1][cl2] = reg_class_subunion[cl1][cl2];
1237 2059917 : ira_reg_class_superunion[cl1][cl2] = reg_class_superunion[cl1][cl2];
1238 2059917 : continue;
1239 : }
1240 : ira_reg_classes_intersect_p[cl1][cl2]
1241 254624103 : = hard_reg_set_intersect_p (temp_hard_regset, temp_set2);
1242 234135050 : if (important_class_p[cl1] && important_class_p[cl2]
1243 468270100 : && hard_reg_set_subset_p (temp_hard_regset, temp_set2))
1244 : {
1245 : /* CL1 and CL2 are important classes and CL1 allocatable
1246 : hard register set is inside of CL2 allocatable hard
1247 : registers -- make CL2 a super class of CL1. */
1248 60971319 : enum reg_class *p;
1249 :
1250 60971319 : p = &ira_reg_class_super_classes[cl1][0];
1251 368001265 : while (*p != LIM_REG_CLASSES)
1252 307029946 : p++;
1253 60971319 : *p++ = (enum reg_class) cl2;
1254 60971319 : *p = LIM_REG_CLASSES;
1255 : }
1256 254624103 : ira_reg_class_subunion[cl1][cl2] = NO_REGS;
1257 254624103 : ira_reg_class_superunion[cl1][cl2] = NO_REGS;
1258 254624103 : intersection_set = (reg_class_contents[cl1]
1259 254624103 : & reg_class_contents[cl2]
1260 254624103 : & ~no_unit_alloc_regs);
1261 254624103 : union_set = ((reg_class_contents[cl1] | reg_class_contents[cl2])
1262 254624103 : & ~no_unit_alloc_regs);
1263 8911843605 : for (cl3 = 0; cl3 < N_REG_CLASSES; cl3++)
1264 : {
1265 8657219502 : temp_hard_regset = reg_class_contents[cl3] & ~no_unit_alloc_regs;
1266 17314439004 : if (hard_reg_set_empty_p (temp_hard_regset))
1267 759584115 : continue;
1268 :
1269 7897635387 : if (hard_reg_set_subset_p (temp_hard_regset, intersection_set))
1270 : {
1271 : /* CL3 allocatable hard register set is inside of
1272 : intersection of allocatable hard register sets
1273 : of CL1 and CL2. */
1274 675031211 : if (important_class_p[cl3])
1275 : {
1276 675031211 : temp_set2
1277 675031211 : = (reg_class_contents
1278 675031211 : [ira_reg_class_intersect[cl1][cl2]]);
1279 675031211 : temp_set2 &= ~no_unit_alloc_regs;
1280 675031211 : if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1281 : /* If the allocatable hard register sets are
1282 : the same, prefer GENERAL_REGS or the
1283 : smallest class for debugging
1284 : purposes. */
1285 784725201 : || (temp_hard_regset == temp_set2
1286 105300661 : && (cl3 == GENERAL_REGS
1287 104618215 : || ((ira_reg_class_intersect[cl1][cl2]
1288 : != GENERAL_REGS)
1289 35033812 : && hard_reg_set_subset_p
1290 35033812 : (reg_class_contents[cl3],
1291 : reg_class_contents
1292 : [(int)
1293 : ira_reg_class_intersect[cl1][cl2]])))))
1294 592889537 : ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3;
1295 : }
1296 675031211 : temp_set2
1297 675031211 : = (reg_class_contents[ira_reg_class_subset[cl1][cl2]]
1298 675031211 : & ~no_unit_alloc_regs);
1299 675031211 : if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1300 : /* Ignore unavailable hard registers and prefer
1301 : smallest class for debugging purposes. */
1302 784725201 : || (temp_hard_regset == temp_set2
1303 105300661 : && hard_reg_set_subset_p
1304 105300661 : (reg_class_contents[cl3],
1305 : reg_class_contents
1306 : [(int) ira_reg_class_subset[cl1][cl2]])))
1307 627254870 : ira_reg_class_subset[cl1][cl2] = (enum reg_class) cl3;
1308 : }
1309 7897635387 : if (important_class_p[cl3]
1310 15795270774 : && hard_reg_set_subset_p (temp_hard_regset, union_set))
1311 : {
1312 : /* CL3 allocatable hard register set is inside of
1313 : union of allocatable hard register sets of CL1
1314 : and CL2. */
1315 3515784829 : temp_set2
1316 3515784829 : = (reg_class_contents[ira_reg_class_subunion[cl1][cl2]]
1317 3515784829 : & ~no_unit_alloc_regs);
1318 3515784829 : if (ira_reg_class_subunion[cl1][cl2] == NO_REGS
1319 6776945555 : || (hard_reg_set_subset_p (temp_set2, temp_hard_regset)
1320 :
1321 1085655488 : && (temp_set2 != temp_hard_regset
1322 452225557 : || cl3 == GENERAL_REGS
1323 : /* If the allocatable hard register sets are the
1324 : same, prefer GENERAL_REGS or the smallest
1325 : class for debugging purposes. */
1326 448632913 : || (ira_reg_class_subunion[cl1][cl2] != GENERAL_REGS
1327 29893487 : && hard_reg_set_subset_p
1328 29893487 : (reg_class_contents[cl3],
1329 : reg_class_contents
1330 : [(int) ira_reg_class_subunion[cl1][cl2]])))))
1331 914821680 : ira_reg_class_subunion[cl1][cl2] = (enum reg_class) cl3;
1332 : }
1333 15795270774 : if (hard_reg_set_subset_p (union_set, temp_hard_regset))
1334 : {
1335 : /* CL3 allocatable hard register set contains union
1336 : of allocatable hard register sets of CL1 and
1337 : CL2. */
1338 1465354291 : temp_set2
1339 1465354291 : = (reg_class_contents[ira_reg_class_superunion[cl1][cl2]]
1340 1465354291 : & ~no_unit_alloc_regs);
1341 1465354291 : if (ira_reg_class_superunion[cl1][cl2] == NO_REGS
1342 2676084479 : || (hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1343 :
1344 208234159 : && (temp_set2 != temp_hard_regset
1345 207293936 : || cl3 == GENERAL_REGS
1346 : /* If the allocatable hard register sets are the
1347 : same, prefer GENERAL_REGS or the smallest
1348 : class for debugging purposes. */
1349 205732804 : || (ira_reg_class_superunion[cl1][cl2] != GENERAL_REGS
1350 20456402 : && hard_reg_set_subset_p
1351 20456402 : (reg_class_contents[cl3],
1352 : reg_class_contents
1353 : [(int) ira_reg_class_superunion[cl1][cl2]])))))
1354 271818787 : ira_reg_class_superunion[cl1][cl2] = (enum reg_class) cl3;
1355 : }
1356 : }
1357 : }
1358 : }
1359 222045 : }
1360 :
1361 : /* Output all uniform and important classes into file F. */
1362 : static void
1363 0 : print_uniform_and_important_classes (FILE *f)
1364 : {
1365 0 : int i, cl;
1366 :
1367 0 : fprintf (f, "Uniform classes:\n");
1368 0 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1369 0 : if (ira_uniform_class_p[cl])
1370 0 : fprintf (f, " %s", reg_class_names[cl]);
1371 0 : fprintf (f, "\nImportant classes:\n");
1372 0 : for (i = 0; i < ira_important_classes_num; i++)
1373 0 : fprintf (f, " %s", reg_class_names[ira_important_classes[i]]);
1374 0 : fprintf (f, "\n");
1375 0 : }
1376 :
1377 : /* Output all possible allocno or pressure classes and their
1378 : translation map into file F. */
1379 : static void
1380 0 : print_translated_classes (FILE *f, bool pressure_p)
1381 : {
1382 0 : int classes_num = (pressure_p
1383 0 : ? ira_pressure_classes_num : ira_allocno_classes_num);
1384 0 : enum reg_class *classes = (pressure_p
1385 0 : ? ira_pressure_classes : ira_allocno_classes);
1386 0 : enum reg_class *class_translate = (pressure_p
1387 : ? ira_pressure_class_translate
1388 : : ira_allocno_class_translate);
1389 0 : int i;
1390 :
1391 0 : fprintf (f, "%s classes:\n", pressure_p ? "Pressure" : "Allocno");
1392 0 : for (i = 0; i < classes_num; i++)
1393 0 : fprintf (f, " %s", reg_class_names[classes[i]]);
1394 0 : fprintf (f, "\nClass translation:\n");
1395 0 : for (i = 0; i < N_REG_CLASSES; i++)
1396 0 : fprintf (f, " %s -> %s\n", reg_class_names[i],
1397 0 : reg_class_names[class_translate[i]]);
1398 0 : }
1399 :
1400 : /* Output all uniform, important, allocno, and pressure classes and
1401 : the translation maps into stderr. */
1402 : void
1403 0 : ira_debug_allocno_classes (void)
1404 : {
1405 0 : print_uniform_and_important_classes (stderr);
1406 0 : print_translated_classes (stderr, false);
1407 0 : print_translated_classes (stderr, true);
1408 0 : }
1409 :
1410 : /* Set up different arrays concerning class subsets, allocno and
1411 : important classes. */
1412 : static void
1413 222045 : find_reg_classes (void)
1414 : {
1415 222045 : setup_allocno_and_important_classes ();
1416 222045 : setup_class_translate ();
1417 222045 : reorder_important_classes ();
1418 222045 : setup_reg_class_relations ();
1419 222045 : }
1420 :
1421 :
1422 :
1423 : /* Set up array ira_hard_regno_allocno_class. */
1424 : static void
1425 222045 : setup_hard_regno_aclass (void)
1426 : {
1427 222045 : int i;
1428 :
1429 20650185 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
1430 : {
1431 40856280 : ira_hard_regno_allocno_class[i]
1432 30798361 : = (TEST_HARD_REG_BIT (no_unit_alloc_regs, i)
1433 20428140 : ? NO_REGS
1434 10370221 : : ira_allocno_class_translate[REGNO_REG_CLASS (i)]);
1435 : }
1436 222045 : }
1437 :
1438 :
1439 :
1440 : /* Form IRA_REG_CLASS_MAX_NREGS and IRA_REG_CLASS_MIN_NREGS maps. */
1441 : static void
1442 222045 : setup_reg_class_nregs (void)
1443 : {
1444 222045 : int i, cl, cl2, m;
1445 :
1446 27755625 : for (m = 0; m < MAX_MACHINE_MODE; m++)
1447 : {
1448 963675300 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1449 1872283440 : ira_reg_class_max_nregs[cl][m]
1450 1872283440 : = ira_reg_class_min_nregs[cl][m]
1451 936141720 : = targetm.class_max_nregs ((reg_class_t) cl, (machine_mode) m);
1452 963675300 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1453 6706539256 : for (i = 0;
1454 7642680976 : (cl2 = alloc_reg_class_subclasses[cl][i]) != LIM_REG_CLASSES;
1455 : i++)
1456 6706539256 : if (ira_reg_class_min_nregs[cl2][m]
1457 6706539256 : < ira_reg_class_min_nregs[cl][m])
1458 53812148 : ira_reg_class_min_nregs[cl][m] = ira_reg_class_min_nregs[cl2][m];
1459 : }
1460 222045 : }
1461 :
1462 :
1463 :
1464 : /* Set up IRA_PROHIBITED_CLASS_MODE_REGS, IRA_EXCLUDE_CLASS_MODE_REGS, and
1465 : IRA_CLASS_SINGLETON. This function is called once IRA_CLASS_HARD_REGS has
1466 : been initialized. */
1467 : static void
1468 222045 : setup_prohibited_and_exclude_class_mode_regs (void)
1469 : {
1470 222045 : int j, k, hard_regno, cl, last_hard_regno, count;
1471 :
1472 7771575 : for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
1473 : {
1474 7549530 : temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
1475 943691250 : for (j = 0; j < NUM_MACHINE_MODES; j++)
1476 : {
1477 936141720 : count = 0;
1478 936141720 : last_hard_regno = -1;
1479 3744566880 : CLEAR_HARD_REG_SET (ira_prohibited_class_mode_regs[cl][j]);
1480 936141720 : CLEAR_HARD_REG_SET (ira_exclude_class_mode_regs[cl][j]);
1481 10421067136 : for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--)
1482 : {
1483 9484925416 : hard_regno = ira_class_hard_regs[cl][k];
1484 9484925416 : if (!targetm.hard_regno_mode_ok (hard_regno, (machine_mode) j))
1485 7150383270 : SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1486 : hard_regno);
1487 2334542146 : else if (in_hard_reg_set_p (temp_hard_regset,
1488 : (machine_mode) j, hard_regno))
1489 : {
1490 2235797881 : last_hard_regno = hard_regno;
1491 2235797881 : count++;
1492 : }
1493 : else
1494 : {
1495 98744265 : SET_HARD_REG_BIT (ira_exclude_class_mode_regs[cl][j], hard_regno);
1496 : }
1497 : }
1498 984727755 : ira_class_singleton[cl][j] = (count == 1 ? last_hard_regno : -1);
1499 : }
1500 : }
1501 222045 : }
1502 :
1503 : /* Clarify IRA_PROHIBITED_CLASS_MODE_REGS by excluding hard registers
1504 : spanning from one register pressure class to another one. It is
1505 : called after defining the pressure classes. */
1506 : static void
1507 222045 : clarify_prohibited_class_mode_regs (void)
1508 : {
1509 222045 : int j, k, hard_regno, cl, pclass, nregs;
1510 :
1511 7771575 : for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
1512 943691250 : for (j = 0; j < NUM_MACHINE_MODES; j++)
1513 : {
1514 936141720 : CLEAR_HARD_REG_SET (ira_useful_class_mode_regs[cl][j]);
1515 10421067136 : for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--)
1516 : {
1517 9484925416 : hard_regno = ira_class_hard_regs[cl][k];
1518 9484925416 : if (TEST_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j], hard_regno))
1519 7150383270 : continue;
1520 2334542146 : nregs = hard_regno_nregs (hard_regno, (machine_mode) j);
1521 2334542146 : if (hard_regno + nregs > FIRST_PSEUDO_REGISTER)
1522 : {
1523 10365 : SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1524 : hard_regno);
1525 10365 : continue;
1526 : }
1527 2334531781 : pclass = ira_pressure_class_translate[REGNO_REG_CLASS (hard_regno)];
1528 5129140244 : for (nregs-- ;nregs >= 0; nregs--)
1529 2846442649 : if (((enum reg_class) pclass
1530 2846442649 : != ira_pressure_class_translate[REGNO_REG_CLASS
1531 2846442649 : (hard_regno + nregs)]))
1532 : {
1533 51834186 : SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1534 : hard_regno);
1535 51834186 : break;
1536 : }
1537 2334531781 : if (!TEST_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1538 : hard_regno))
1539 2282697595 : add_to_hard_reg_set (&ira_useful_class_mode_regs[cl][j],
1540 : (machine_mode) j, hard_regno);
1541 : }
1542 : }
1543 222045 : }
1544 :
1545 : /* Allocate and initialize IRA_REGISTER_MOVE_COST, IRA_MAY_MOVE_IN_COST
1546 : and IRA_MAY_MOVE_OUT_COST for MODE. */
1547 : void
1548 10359615 : ira_init_register_move_cost (machine_mode mode)
1549 : {
1550 10359615 : static unsigned short last_move_cost[N_REG_CLASSES][N_REG_CLASSES];
1551 10359615 : bool all_match = true;
1552 10359615 : unsigned int i, cl1, cl2;
1553 10359615 : HARD_REG_SET ok_regs;
1554 :
1555 10359615 : ira_assert (ira_register_move_cost[mode] == NULL
1556 : && ira_may_move_in_cost[mode] == NULL
1557 : && ira_may_move_out_cost[mode] == NULL);
1558 963444195 : CLEAR_HARD_REG_SET (ok_regs);
1559 963444195 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
1560 953084580 : if (targetm.hard_regno_mode_ok (i, mode))
1561 433359359 : SET_HARD_REG_BIT (ok_regs, i);
1562 :
1563 : /* Note that we might be asked about the move costs of modes that
1564 : cannot be stored in any hard register, for example if an inline
1565 : asm tries to create a register operand with an impossible mode.
1566 : We therefore can't assert have_regs_of_mode[mode] here. */
1567 362586525 : for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1568 12327941850 : for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1569 : {
1570 11975714940 : int cost;
1571 11975714940 : if (!hard_reg_set_intersect_p (ok_regs, reg_class_contents[cl1])
1572 19796926938 : || !hard_reg_set_intersect_p (ok_regs, reg_class_contents[cl2]))
1573 : {
1574 6255918751 : if ((ira_reg_class_max_nregs[cl1][mode]
1575 6255918751 : > ira_class_hard_regs_num[cl1])
1576 4439649871 : || (ira_reg_class_max_nregs[cl2][mode]
1577 4439649871 : > ira_class_hard_regs_num[cl2]))
1578 : cost = 65535;
1579 : else
1580 3054783104 : cost = (ira_memory_move_cost[mode][cl1][0]
1581 3054783104 : + ira_memory_move_cost[mode][cl2][1]) * 2;
1582 : }
1583 : else
1584 : {
1585 5719796189 : cost = register_move_cost (mode, (enum reg_class) cl1,
1586 : (enum reg_class) cl2);
1587 5719796189 : ira_assert (cost < 65535);
1588 : }
1589 11975714940 : all_match &= (last_move_cost[cl1][cl2] == cost);
1590 11975714940 : last_move_cost[cl1][cl2] = cost;
1591 : }
1592 10359615 : if (all_match && last_mode_for_init_move_cost != -1)
1593 : {
1594 4301487 : ira_register_move_cost[mode]
1595 4301487 : = ira_register_move_cost[last_mode_for_init_move_cost];
1596 4301487 : ira_may_move_in_cost[mode]
1597 4301487 : = ira_may_move_in_cost[last_mode_for_init_move_cost];
1598 4301487 : ira_may_move_out_cost[mode]
1599 4301487 : = ira_may_move_out_cost[last_mode_for_init_move_cost];
1600 4301487 : return;
1601 : }
1602 6058128 : last_mode_for_init_move_cost = mode;
1603 6058128 : ira_register_move_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1604 6058128 : ira_may_move_in_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1605 6058128 : ira_may_move_out_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1606 212034480 : for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1607 7209172320 : for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1608 : {
1609 7003195968 : int cost;
1610 7003195968 : enum reg_class *p1, *p2;
1611 :
1612 7003195968 : if (last_move_cost[cl1][cl2] == 65535)
1613 : {
1614 1748707417 : ira_register_move_cost[mode][cl1][cl2] = 65535;
1615 1748707417 : ira_may_move_in_cost[mode][cl1][cl2] = 65535;
1616 1748707417 : ira_may_move_out_cost[mode][cl1][cl2] = 65535;
1617 : }
1618 : else
1619 : {
1620 5254488551 : cost = last_move_cost[cl1][cl2];
1621 :
1622 44028759181 : for (p2 = &reg_class_subclasses[cl2][0];
1623 44028759181 : *p2 != LIM_REG_CLASSES; p2++)
1624 38774270630 : if (ira_class_hard_regs_num[*p2] > 0
1625 38059300683 : && (ira_reg_class_max_nregs[*p2][mode]
1626 : <= ira_class_hard_regs_num[*p2]))
1627 31215087151 : cost = MAX (cost, ira_register_move_cost[mode][cl1][*p2]);
1628 :
1629 44028759181 : for (p1 = &reg_class_subclasses[cl1][0];
1630 44028759181 : *p1 != LIM_REG_CLASSES; p1++)
1631 38774270630 : if (ira_class_hard_regs_num[*p1] > 0
1632 38059300683 : && (ira_reg_class_max_nregs[*p1][mode]
1633 : <= ira_class_hard_regs_num[*p1]))
1634 31215087151 : cost = MAX (cost, ira_register_move_cost[mode][*p1][cl2]);
1635 :
1636 5254488551 : ira_assert (cost <= 65535);
1637 5254488551 : ira_register_move_cost[mode][cl1][cl2] = cost;
1638 :
1639 5254488551 : if (ira_class_subset_p[cl1][cl2])
1640 1586251632 : ira_may_move_in_cost[mode][cl1][cl2] = 0;
1641 : else
1642 3668236919 : ira_may_move_in_cost[mode][cl1][cl2] = cost;
1643 :
1644 5254488551 : if (ira_class_subset_p[cl2][cl1])
1645 1586251632 : ira_may_move_out_cost[mode][cl1][cl2] = 0;
1646 : else
1647 3668236919 : ira_may_move_out_cost[mode][cl1][cl2] = cost;
1648 : }
1649 : }
1650 : }
1651 :
1652 :
1653 :
1654 : /* This is called once during compiler work. It sets up
1655 : different arrays whose values don't depend on the compiled
1656 : function. */
1657 : void
1658 216567 : ira_init_once (void)
1659 : {
1660 216567 : ira_init_costs_once ();
1661 216567 : lra_init_once ();
1662 :
1663 216567 : ira_use_lra_p = targetm.lra_p ();
1664 216567 : }
1665 :
1666 : /* Free ira_register_move_cost, ira_may_move_in_cost and
1667 : ira_may_move_out_cost for each mode. */
1668 : void
1669 551229 : target_ira_int::free_register_move_costs (void)
1670 : {
1671 551229 : int mode, i;
1672 :
1673 : /* Reset move_cost and friends, making sure we only free shared
1674 : table entries once. */
1675 68903625 : for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
1676 68352396 : if (x_ira_register_move_cost[mode])
1677 : {
1678 619453297 : for (i = 0;
1679 628969633 : i < mode && (x_ira_register_move_cost[i]
1680 : != x_ira_register_move_cost[mode]);
1681 : i++)
1682 : ;
1683 9516336 : if (i == mode)
1684 : {
1685 5567758 : free (x_ira_register_move_cost[mode]);
1686 5567758 : free (x_ira_may_move_in_cost[mode]);
1687 5567758 : free (x_ira_may_move_out_cost[mode]);
1688 : }
1689 : }
1690 551229 : memset (x_ira_register_move_cost, 0, sizeof x_ira_register_move_cost);
1691 551229 : memset (x_ira_may_move_in_cost, 0, sizeof x_ira_may_move_in_cost);
1692 551229 : memset (x_ira_may_move_out_cost, 0, sizeof x_ira_may_move_out_cost);
1693 551229 : last_mode_for_init_move_cost = -1;
1694 551229 : }
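/* Illustration only, not part of ira.cc: a minimal sketch of the
   "free shared table entries once" scan used above, under the assumption
   that several slots may alias one allocation.  The helper name
   owning_slots and its vector-based interface are hypothetical; the sketch
   only identifies which slots own an allocation and leaves the actual
   free () calls to the caller.  */

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: return the indices of the slots that own an
// allocation, i.e. the first slot referring to each distinct non-null
// pointer.  Later slots holding the same pointer are treated as aliases.
std::vector<std::size_t>
owning_slots (const std::vector<const void *> &tables)
{
  std::vector<std::size_t> owners;
  for (std::size_t slot = 0; slot < tables.size (); slot++)
    {
      if (!tables[slot])
        continue;
      // Scan the earlier slots for one that shares this pointer.
      std::size_t i = 0;
      while (i < slot && tables[i] != tables[slot])
        i++;
      // Reaching SLOT means no earlier alias exists, so SLOT is the owner.
      if (i == slot)
        owners.push_back (slot);
    }
  return owners;
}
```

/* A caller would free only the slots reported as owners and then clear
   every slot, which mirrors the memset calls in the function above.  */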
1695 :
1696 329184 : target_ira_int::~target_ira_int ()
1697 : {
1698 329184 : free_ira_costs ();
1699 329184 : free_register_move_costs ();
1700 329184 : }
1701 :
1702 : /* This is called every time register-related information
1703 : changes. */
1704 : void
1705 222045 : ira_init (void)
1706 : {
1707 222045 : this_target_ira_int->free_register_move_costs ();
1708 222045 : setup_reg_mode_hard_regset ();
1709 222045 : setup_alloc_regs (flag_omit_frame_pointer != 0);
1710 222045 : setup_class_subset_and_memory_move_costs ();
1711 222045 : setup_reg_class_nregs ();
1712 222045 : setup_prohibited_and_exclude_class_mode_regs ();
1713 222045 : find_reg_classes ();
1714 222045 : clarify_prohibited_class_mode_regs ();
1715 222045 : setup_hard_regno_aclass ();
1716 222045 : ira_init_costs ();
1717 222045 : }
1718 :
1719 :
1720 : #define ira_prohibited_mode_move_regs_initialized_p \
1721 : (this_target_ira_int->x_ira_prohibited_mode_move_regs_initialized_p)
1722 :
1723 : /* Set up IRA_PROHIBITED_MODE_MOVE_REGS. */
1724 : static void
1725 1488370 : setup_prohibited_mode_move_regs (void)
1726 : {
1727 1488370 : int i, j;
1728 1488370 : rtx test_reg1, test_reg2, move_pat;
1729 1488370 : rtx_insn *move_insn;
1730 :
1731 1488370 : if (ira_prohibited_mode_move_regs_initialized_p)
1732 : return;
1733 219622 : ira_prohibited_mode_move_regs_initialized_p = true;
1734 219622 : test_reg1 = gen_rtx_REG (word_mode, LAST_VIRTUAL_REGISTER + 1);
1735 219622 : test_reg2 = gen_rtx_REG (word_mode, LAST_VIRTUAL_REGISTER + 2);
1736 219622 : move_pat = gen_rtx_SET (test_reg1, test_reg2);
1737 219622 : move_insn = gen_rtx_INSN (VOIDmode, 0, 0, 0, move_pat, 0, -1, 0);
1738 27672372 : for (i = 0; i < NUM_MACHINE_MODES; i++)
1739 : {
1740 27233128 : SET_HARD_REG_SET (ira_prohibited_mode_move_regs[i]);
1741 2532680904 : for (j = 0; j < FIRST_PSEUDO_REGISTER; j++)
1742 : {
1743 2505447776 : if (!targetm.hard_regno_mode_ok (j, (machine_mode) i))
1744 2072328454 : continue;
1745 433119322 : set_mode_and_regno (test_reg1, (machine_mode) i, j);
1746 433119322 : set_mode_and_regno (test_reg2, (machine_mode) i, j);
1747 433119322 : INSN_CODE (move_insn) = -1;
1748 433119322 : recog_memoized (move_insn);
1749 433119322 : if (INSN_CODE (move_insn) < 0)
1750 204827557 : continue;
1751 228291765 : extract_insn (move_insn);
1752 : /* We don't know whether the move will be in code that is optimized
1753 : for size or speed, so consider all enabled alternatives. */
1754 228291765 : if (! constrain_operands (1, get_enabled_alternatives (move_insn)))
1755 1347740 : continue;
1756 226944025 : CLEAR_HARD_REG_BIT (ira_prohibited_mode_move_regs[i], j);
1757 : }
1758 : }
1759 : }
1760 :
1761 :
1762 :
1763 : /* Extract INSN and return the set of alternatives that we should consider.
1764 : This excludes any alternatives whose constraints are obviously impossible
1765 : to meet (e.g. because the constraint requires a constant and the operand
1766 : is nonconstant). It also excludes alternatives that are bound to need
1767 : a spill or reload, as long as we have other alternatives that match
1768 : exactly. */
1769 : alternative_mask
1770 102784801 : ira_setup_alts (rtx_insn *insn)
1771 : {
1772 102784801 : int nop, nalt;
1773 102784801 : bool curr_swapped;
1774 102784801 : const char *p;
1775 102784801 : int commutative = -1;
1776 :
1777 102784801 : extract_insn (insn);
1778 102784801 : preprocess_constraints (insn);
1779 102784801 : alternative_mask preferred = get_preferred_alternatives (insn);
1780 102784801 : alternative_mask alts = 0;
1781 102784801 : alternative_mask exact_alts = 0;
1782 : /* Check that the hard reg set is large enough to hold all
1783 : alternatives. It is hard to imagine a situation in which the
1784 : assertion fails. */
1785 102784801 : ira_assert (recog_data.n_alternatives
1786 : <= (int) MAX (sizeof (HARD_REG_ELT_TYPE) * CHAR_BIT,
1787 : FIRST_PSEUDO_REGISTER));
1788 301720987 : for (nop = 0; nop < recog_data.n_operands; nop++)
1789 211023324 : if (recog_data.constraints[nop][0] == '%')
1790 : {
1791 : commutative = nop;
1792 : break;
1793 : }
1794 102784801 : for (curr_swapped = false;; curr_swapped = true)
1795 : {
1796 1379497843 : for (nalt = 0; nalt < recog_data.n_alternatives; nalt++)
1797 : {
1798 1264625904 : if (!TEST_BIT (preferred, nalt) || TEST_BIT (exact_alts, nalt))
1799 437200656 : continue;
1800 :
1801 827425248 : const operand_alternative *op_alt
1802 827425248 : = &recog_op_alt[nalt * recog_data.n_operands];
1803 827425248 : int this_reject = 0;
1804 2362459846 : for (nop = 0; nop < recog_data.n_operands; nop++)
1805 : {
1806 1724825026 : int c, len;
1807 :
1808 1724825026 : this_reject += op_alt[nop].reject;
1809 :
1810 1724825026 : rtx op = recog_data.operand[nop];
1811 1724825026 : p = op_alt[nop].constraint;
1812 1724825026 : if (*p == 0 || *p == ',')
1813 24060799 : continue;
1814 :
1815 : bool win_p = false;
1816 3450830090 : do
1817 3450830090 : switch (c = *p, len = CONSTRAINT_LEN (c, p), c)
1818 : {
1819 : case '#':
1820 : case ',':
1821 : c = '\0';
1822 : /* FALLTHRU */
1823 725975828 : case '\0':
1824 725975828 : len = 0;
1825 725975828 : break;
1826 :
1827 : case '%':
1828 : /* The commutative modifier is handled above. */
1829 : break;
1830 :
1831 73762455 : case '0': case '1': case '2': case '3': case '4':
1832 73762455 : case '5': case '6': case '7': case '8': case '9':
1833 73762455 : {
1834 73762455 : char *end;
1835 73762455 : unsigned long dup = strtoul (p, &end, 10);
1836 73762455 : rtx other = recog_data.operand[dup];
1837 73762455 : len = end - p;
1838 1804309 : if (MEM_P (other)
1839 73762455 : ? rtx_equal_p (other, op)
1840 71958146 : : REG_P (op) || SUBREG_P (op))
1841 51308917 : goto op_success;
1842 22453538 : win_p = true;
1843 : }
1844 22453538 : break;
1845 :
1846 10661792 : case 'g':
1847 10661792 : goto op_success;
1848 145 : break;
1849 :
1850 145 : case '{':
1851 145 : if (REG_P (op) || SUBREG_P (op))
1852 143 : goto op_success;
1853 : win_p = true;
1854 : break;
1855 :
1856 2621405299 : default:
1857 2621405299 : {
1858 2621405299 : enum constraint_num cn = lookup_constraint (p);
1859 2621405299 : rtx mem = NULL;
1860 2621405299 : switch (get_constraint_type (cn))
1861 : {
1862 2137224964 : case CT_REGISTER:
1863 3365963810 : if (reg_class_for_constraint (cn) != NO_REGS)
1864 : {
1865 1181810012 : if (REG_P (op) || SUBREG_P (op))
1866 770930894 : goto op_success;
1867 : win_p = true;
1868 : }
1869 : break;
1870 :
1871 4036955 : case CT_CONST_INT:
1872 4036955 : if (CONST_INT_P (op)
1873 6646825 : && (insn_const_int_ok_for_constraint
1874 2609870 : (INTVAL (op), cn)))
1875 1863703 : goto op_success;
1876 : break;
1877 :
1878 803437 : case CT_ADDRESS:
1879 803437 : goto op_success;
1880 :
1881 162071324 : case CT_MEMORY:
1882 162071324 : case CT_RELAXED_MEMORY:
1883 162071324 : mem = op;
1884 : /* Fall through. */
1885 162071324 : case CT_SPECIAL_MEMORY:
1886 162071324 : if (!mem)
1887 66326021 : mem = extract_mem_from_operand (op);
1888 228397345 : if (MEM_P (mem))
1889 69348634 : goto op_success;
1890 : win_p = true;
1891 : break;
1892 :
1893 250942598 : case CT_FIXED_FORM:
1894 250942598 : if (constraint_satisfied_p (op, cn))
1895 69870879 : goto op_success;
1896 : break;
1897 : }
1898 : break;
1899 : }
1900 : }
1901 2476041691 : while (p += len, c);
1902 725975828 : if (!win_p)
1903 : break;
1904 : /* We can make the alternative match by spilling a register
1905 : to memory or loading something into a register. Count a
1906 : cost of one reload (the equivalent of the '?' constraint). */
1907 536185400 : this_reject += 6;
1908 1535034598 : op_success:
1909 1535034598 : ;
1910 : }
1911 :
1912 827425248 : if (nop >= recog_data.n_operands)
1913 : {
1914 637634820 : alts |= ALTERNATIVE_BIT (nalt);
1915 637634820 : if (this_reject == 0)
1916 135357223 : exact_alts |= ALTERNATIVE_BIT (nalt);
1917 : }
1918 : }
1919 114871939 : if (commutative < 0)
1920 : break;
1921 : /* Swap forth and back to avoid changing recog_data. */
1922 24174276 : std::swap (recog_data.operand[commutative],
1923 24174276 : recog_data.operand[commutative + 1]);
1924 24174276 : if (curr_swapped)
1925 : break;
1926 : }
1927 102784801 : return exact_alts ? exact_alts : alts;
1928 : }
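/* Illustration only, not part of ira.cc: a minimal sketch of the final
   selection rule above (prefer exact matches, otherwise fall back to every
   usable alternative).  The type alt_summary and the function
   select_alternatives are hypothetical names; the per-operand constraint
   scan is assumed to have already produced one summary per alternative.  */

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of one alternative after all operands were checked.
struct alt_summary
{
  bool usable;  // every operand matched or could be fixed by a reload
  int reject;   // accumulated penalty; 0 means an exact match
};

// Return a bit mask of the alternatives to consider: the exact matches if
// any exist, otherwise every usable alternative.  Only the first 64
// entries are examined, since the mask is a 64-bit integer.
std::uint64_t
select_alternatives (const std::vector<alt_summary> &summaries)
{
  std::uint64_t alts = 0, exact_alts = 0;
  for (std::size_t n = 0; n < summaries.size () && n < 64; n++)
    {
      if (!summaries[n].usable)
        continue;
      alts |= std::uint64_t (1) << n;
      if (summaries[n].reject == 0)
        exact_alts |= std::uint64_t (1) << n;
    }
  return exact_alts ? exact_alts : alts;
}
```

/* With summaries {usable, 6}, {usable, 0}, {unusable, 0} the sketch keeps
   only alternative 1, because an exact match hides the ones that would
   need a reload.  */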
1929 :
1930 : /* Return the number of the output non-early clobber operand which
1931 : should be the same in any case as the operand with number OP_NUM (or
1932 : a negative value if there is no such operand). ALTS is the mask
1933 : of alternatives that we should consider. SINGLE_INPUT_OP_HAS_CSTR_P
1934 : is set in this function; it indicates whether there is only
1935 : a single input operand which has the matching constraint on the
1936 : output operand at the position given by the return value. If the
1937 : pattern allows any one of several input operands to hold the matching
1938 : constraint, it is set to false; one typical case is the destructive
1939 : FMA instruction on target rs6000. Note that for a non-NO_REGS preferred
1940 : register class with no free register move copy, if the parameter
1941 : PARAM_IRA_CONSIDER_DUP_IN_ALL_ALTS is set to one, this function
1942 : will check all available alternatives for matching constraints,
1943 : even if it has found or will find one alternative with a non-NO_REGS
1944 : regclass, so it can respect more cases with matching constraints. If
1945 : PARAM_IRA_CONSIDER_DUP_IN_ALL_ALTS is set to zero,
1946 : SINGLE_INPUT_OP_HAS_CSTR_P is always true, and the function stops
1947 : looking for a matching constraint relationship once it hits some
1948 : alternative with a non-NO_REGS regclass. */
1949 : int
1950 20298695 : ira_get_dup_out_num (int op_num, alternative_mask alts,
1951 : bool &single_input_op_has_cstr_p)
1952 : {
1953 20298695 : int curr_alt, c, original;
1954 20298695 : bool ignore_p, use_commut_op_p;
1955 20298695 : const char *str;
1956 :
1957 20298695 : if (op_num < 0 || recog_data.n_alternatives == 0)
1958 : return -1;
1959 : /* We should find duplications only for input operands. */
1960 20298695 : if (recog_data.operand_type[op_num] != OP_IN)
1961 : return -1;
1962 14659878 : str = recog_data.constraints[op_num];
1963 14659878 : use_commut_op_p = false;
1964 14659878 : single_input_op_has_cstr_p = true;
1965 :
1966 14659878 : rtx op = recog_data.operand[op_num];
1967 14659878 : int op_regno = reg_or_subregno (op);
1968 14659878 : enum reg_class op_pref_cl = reg_preferred_class (op_regno);
1969 14659878 : machine_mode op_mode = GET_MODE (op);
1970 :
1971 14659878 : ira_init_register_move_cost_if_necessary (op_mode);
1972 : /* If the preferred regclass isn't NO_REGS, continue to look for the matching
1973 : constraint in all available alternatives with the preferred regclass, even
1974 : if we have found or will find one alternative whose constraint stands
1975 : for a REG (non-NO_REGS) regclass. Note that it would be fine not to
1976 : respect the matching constraint if the register copy is free, so exclude
1977 : that case. */
1978 14659878 : bool respect_dup_despite_reg_cstr
1979 14659878 : = param_ira_consider_dup_in_all_alts
1980 470 : && op_pref_cl != NO_REGS
1981 14660344 : && ira_register_move_cost[op_mode][op_pref_cl][op_pref_cl] > 0;
1982 :
1983 : /* Record the alternatives whose constraint uses the same regclass as the
1984 : preferred regclass. Later, if we find a matching constraint for this
1985 : operand with the preferred regclass, we will visit these recorded
1986 : alternatives to check whether there is one alternative in which no
1987 : INPUT operand has the same matching constraint as our candidate.
1988 : If yes, it means there is one alternative which is perfectly fine
1989 : without satisfying this matching constraint. If no, it means that in
1990 : every such alternative some other INPUT operand holds this matching
1991 : constraint, so it's fine to respect this matching constraint and
1992 : create the constraint copy, since it would become harmless once some
1993 : other takes preference and it's interfered. */
1994 17076492 : alternative_mask pref_cl_alts;
1995 :
1996 17076492 : for (;;)
1997 : {
1998 17076492 : pref_cl_alts = 0;
1999 :
2000 17076492 : for (curr_alt = 0, ignore_p = !TEST_BIT (alts, curr_alt),
2001 17076492 : original = -1;;)
2002 : {
2003 101766318 : c = *str;
2004 101766318 : if (c == '\0')
2005 : break;
2006 96965825 : if (c == '#')
2007 : ignore_p = true;
2008 96965825 : else if (c == ',')
2009 : {
2010 32856432 : curr_alt++;
2011 32856432 : ignore_p = !TEST_BIT (alts, curr_alt);
2012 : }
2013 64109393 : else if (! ignore_p)
2014 20498545 : switch (c)
2015 : {
2016 922 : case 'g':
2017 922 : goto fail;
2018 15448135 : default:
2019 15448135 : {
2020 15448135 : enum constraint_num cn = lookup_constraint (str);
2021 15448135 : enum reg_class cl = reg_class_for_constraint (cn);
2022 12555757 : if (cl != NO_REGS && !targetm.class_likely_spilled_p (cl))
2023 : {
2024 12273615 : if (respect_dup_despite_reg_cstr)
2025 : {
2026 : /* If it's free to move from one preferred class to
2027 : the one without matching constraint, it doesn't
2028 : have to respect this constraint with costs. */
2029 665 : if (cl != op_pref_cl
2030 104 : && (ira_reg_class_intersect[cl][op_pref_cl]
2031 : != NO_REGS)
2032 92 : && (ira_may_move_in_cost[op_mode][op_pref_cl][cl]
2033 : == 0))
2034 76 : goto fail;
2035 589 : else if (cl == op_pref_cl)
2036 561 : pref_cl_alts |= ALTERNATIVE_BIT (curr_alt);
2037 : }
2038 : else
2039 12272950 : goto fail;
2040 : }
2041 3175109 : if (constraint_satisfied_p (op, cn))
2042 2051 : goto fail;
2043 : break;
2044 : }
2045 :
2046 5049488 : case '0': case '1': case '2': case '3': case '4':
2047 5049488 : case '5': case '6': case '7': case '8': case '9':
2048 5049488 : {
2049 5049488 : char *end;
2050 5049488 : int n = (int) strtoul (str, &end, 10);
2051 5049488 : str = end;
2052 5049488 : if (original != -1 && original != n)
2053 0 : goto fail;
2054 5049488 : gcc_assert (n < recog_data.n_operands);
2055 5049488 : if (respect_dup_despite_reg_cstr)
2056 : {
2057 216 : const operand_alternative *op_alt
2058 216 : = &recog_op_alt[curr_alt * recog_data.n_operands];
2059 : /* Only respect the one with the preferred regclass. Without
2060 : respect_dup_despite_reg_cstr it's possible to first get
2061 : one whose regclass isn't preferred,
2062 : but it would fail since there should be other
2063 : alternatives with the preferred regclass. */
2064 216 : if (op_alt[n].cl == op_pref_cl)
2065 5049431 : original = n;
2066 : }
2067 : else
2068 : original = n;
2069 5049488 : continue;
2070 5049488 : }
2071 : }
2072 79640338 : str += CONSTRAINT_LEN (c, str);
2073 : }
2074 4800493 : if (original == -1)
2075 1804113 : goto fail;
2076 2996380 : if (recog_data.operand_type[original] == OP_OUT)
2077 : {
2078 2996068 : if (pref_cl_alts == 0)
2079 : return original;
2080 : /* Visit these recorded alternatives to check whether
2081 : there is one alternative in which no INPUT operand
2082 : has the same matching constraint as our candidate.
2083 : Give up this candidate if so. */
2084 : int nop, nalt;
2085 376 : for (nalt = 0; nalt < recog_data.n_alternatives; nalt++)
2086 : {
2087 353 : if (!TEST_BIT (pref_cl_alts, nalt))
2088 254 : continue;
2089 99 : const operand_alternative *op_alt
2090 99 : = &recog_op_alt[nalt * recog_data.n_operands];
2091 99 : bool dup_in_other = false;
2092 365 : for (nop = 0; nop < recog_data.n_operands; nop++)
2093 : {
2094 309 : if (recog_data.operand_type[nop] != OP_IN)
2095 99 : continue;
2096 210 : if (nop == op_num)
2097 88 : continue;
2098 122 : if (op_alt[nop].matches == original)
2099 : {
2100 : dup_in_other = true;
2101 : break;
2102 : }
2103 : }
2104 99 : if (!dup_in_other)
2105 : return -1;
2106 : }
2107 23 : single_input_op_has_cstr_p = false;
2108 23 : return original;
2109 : }
2110 312 : fail:
2111 14080424 : if (use_commut_op_p)
2112 : break;
2113 12453064 : use_commut_op_p = true;
2114 12453064 : if (recog_data.constraints[op_num][0] == '%')
2115 1046943 : str = recog_data.constraints[op_num + 1];
2116 11406121 : else if (op_num > 0 && recog_data.constraints[op_num - 1][0] == '%')
2117 : str = recog_data.constraints[op_num - 1];
2118 : else
2119 : break;
2120 : }
2121 : return -1;
2122 : }
2123 :
2124 :
2125 :
2126 : /* Return true if a replacement of SRC by DEST does not lead to unsatisfiable
2127 : asm. A replacement is valid if SRC or DEST are not constrained in asm
2128 : inputs of a single asm statement. See match_asm_constraints_2() for more
2129 : details. TODO: As in match_asm_constraints_2() consider alternatives more
2130 : precisely. */
2131 :
2132 : static bool
2133 7606 : valid_replacement_for_asm_input_p_1 (const_rtx asmops, const_rtx src, const_rtx dest)
2134 : {
2135 7606 : int ninputs = ASM_OPERANDS_INPUT_LENGTH (asmops);
2136 7606 : rtvec inputs = ASM_OPERANDS_INPUT_VEC (asmops);
2137 38323 : for (int i = 0; i < ninputs; ++i)
2138 : {
2139 30717 : rtx input_src = RTVEC_ELT (inputs, i);
2140 30717 : const char *constraint_src
2141 30717 : = ASM_OPERANDS_INPUT_CONSTRAINT (asmops, i);
2142 30717 : if (rtx_equal_p (input_src, src)
2143 30717 : && strchr (constraint_src, '{') != nullptr)
2144 0 : for (int j = 0; j < ninputs; ++j)
2145 : {
2146 0 : rtx input_dest = RTVEC_ELT (inputs, j);
2147 0 : const char *constraint_dest
2148 0 : = ASM_OPERANDS_INPUT_CONSTRAINT (asmops, j);
2149 0 : if (rtx_equal_p (input_dest, dest)
2150 0 : && strchr (constraint_dest, '{') != nullptr)
2151 : return false;
2152 : }
2153 : }
2154 : return true;
2155 : }
2156 :
2157 : /* Return true if a replacement of SRC by DEST does not lead to unsatisfiable
2158 : asm. A replacement is valid if SRC or DEST are not constrained in asm
2159 : inputs of a single asm statement. The final check is done in function
2160 : valid_replacement_for_asm_input_p_1. */
2161 :
2162 : static bool
2163 517687 : valid_replacement_for_asm_input_p (const_rtx src, const_rtx dest)
2164 : {
2165 : /* Bail out early if there is no asm statement. */
2166 517687 : if (!crtl->has_asm_statement)
2167 : return true;
2168 26783 : for (df_ref use = DF_REG_USE_CHAIN (REGNO (src));
2169 785823 : use;
2170 759040 : use = DF_REF_NEXT_REG (use))
2171 : {
2172 759040 : struct df_insn_info *use_info = DF_REF_INSN_INFO (use);
2173 : /* Only check real uses, not artificial ones. */
2174 759040 : if (use_info)
2175 : {
2176 759040 : rtx_insn *insn = DF_REF_INSN (use);
2177 759040 : rtx pat = PATTERN (insn);
2178 759040 : if (asm_noperands (pat) <= 0)
2179 755742 : continue;
2180 3298 : if (GET_CODE (pat) == SET)
2181 : {
2182 0 : if (!valid_replacement_for_asm_input_p_1 (SET_SRC (pat), src, dest))
2183 : return false;
2184 : }
2185 3298 : else if (GET_CODE (pat) == PARALLEL)
2186 14250 : for (int i = 0, len = XVECLEN (pat, 0); i < len; ++i)
2187 : {
2188 10952 : rtx asmops = XVECEXP (pat, 0, i);
2189 10952 : if (GET_CODE (asmops) == SET)
2190 7572 : asmops = SET_SRC (asmops);
2191 10952 : if (GET_CODE (asmops) == ASM_OPERANDS
2192 10952 : && !valid_replacement_for_asm_input_p_1 (asmops, src, dest))
2193 : return false;
2194 : }
2195 0 : else if (GET_CODE (pat) == ASM_OPERANDS)
2196 : {
2197 0 : if (!valid_replacement_for_asm_input_p_1 (pat, src, dest))
2198 : return false;
2199 : }
2200 : else
2201 0 : gcc_unreachable ();
2202 : }
2203 : }
2204 : return true;
2205 : }
2206 :
2207 : /* Search forward to see if the source register of a copy insn dies
2208 : before either it or the destination register is modified, but don't
2209 : scan past the end of the basic block. If so, we can replace the
2210 : source with the destination and let the source die in the copy
2211 : insn.
2212 :
2213 : This will reduce the number of registers live in that range and may
2214 : enable the destination and the source coalescing, thus often saving
2215 : one register in addition to a register-register copy. */
2216 :
2217 : static void
2218 1488370 : decrease_live_ranges_number (void)
2219 : {
2220 1488370 : basic_block bb;
2221 1488370 : rtx_insn *insn;
2222 1488370 : rtx set, src, dest, dest_death, note;
2223 1488370 : rtx_insn *p, *q;
2224 1488370 : int sregno, dregno;
2225 :
2226 1488370 : if (! flag_expensive_optimizations)
2227 : return;
2228 :
2229 961264 : if (ira_dump_file)
2230 32 : fprintf (ira_dump_file, "Starting decreasing number of live ranges...\n");
2231 :
2232 11191813 : FOR_EACH_BB_FN (bb, cfun)
2233 131673152 : FOR_BB_INSNS (bb, insn)
2234 : {
2235 121442603 : set = single_set (insn);
2236 121442603 : if (! set)
2237 70619835 : continue;
2238 50822768 : src = SET_SRC (set);
2239 50822768 : dest = SET_DEST (set);
2240 13949134 : if (! REG_P (src) || ! REG_P (dest)
2241 59528118 : || find_reg_note (insn, REG_DEAD, src))
2242 48040434 : continue;
2243 2782334 : sregno = REGNO (src);
2244 2782334 : dregno = REGNO (dest);
2245 :
2246 : /* We don't want to mess with hard regs if register classes
2247 : are small. */
2248 5046981 : if (sregno == dregno
2249 2782300 : || (targetm.small_register_classes_for_mode_p (GET_MODE (src))
2250 2782300 : && (sregno < FIRST_PSEUDO_REGISTER
2251 2782300 : || dregno < FIRST_PSEUDO_REGISTER))
2252 : /* We don't see all updates to SP if they are in an
2253 : auto-inc memory reference, so we must disallow this
2254 : optimization on them. */
2255 517687 : || sregno == STACK_POINTER_REGNUM
2256 517687 : || dregno == STACK_POINTER_REGNUM
2257 3300021 : || !valid_replacement_for_asm_input_p (src, dest))
2258 2264647 : continue;
2259 :
2260 517687 : dest_death = NULL_RTX;
2261 :
2262 6812611 : for (p = NEXT_INSN (insn); p; p = NEXT_INSN (p))
2263 : {
2264 6806918 : if (! INSN_P (p))
2265 1075622 : continue;
2266 5731296 : if (BLOCK_FOR_INSN (p) != bb)
2267 : break;
2268 :
2269 10531584 : if (reg_set_p (src, p) || reg_set_p (dest, p)
2270 : /* If SRC is an asm-declared register, it must not be
2271 : replaced in any asm. Unfortunately, the REG_EXPR
2272 : tree for the asm variable may be absent in the SRC
2273 : rtx, so we can't check the actual register
2274 : declaration easily (the asm operand will have it,
2275 : though). To avoid complicating the test for a rare
2276 : case, we just don't perform register replacement
2277 : for a hard reg mentioned in an asm. */
2278 5225970 : || (sregno < FIRST_PSEUDO_REGISTER
2279 0 : && asm_noperands (PATTERN (p)) >= 0
2280 0 : && reg_overlap_mentioned_p (src, PATTERN (p)))
2281 : /* Don't change hard registers used by a call. */
2282 5225970 : || (CALL_P (p) && sregno < FIRST_PSEUDO_REGISTER
2283 0 : && find_reg_fusage (p, USE, src))
2284 : /* Don't change a USE of a register. */
2285 10504777 : || (GET_CODE (PATTERN (p)) == USE
2286 1340 : && reg_overlap_mentioned_p (src, XEXP (PATTERN (p), 0))))
2287 : break;
2288 :
2289 : /* See if all of SRC dies in P. This test is slightly
2290 : more conservative than it needs to be. */
2291 5225970 : if ((note = find_regno_note (p, REG_DEAD, sregno))
2292 5225970 : && GET_MODE (XEXP (note, 0)) == GET_MODE (src))
2293 : {
2294 6668 : int failed = 0;
2295 :
2296 : /* We can do the optimization. Scan forward from INSN
2297 : again, replacing regs as we go. Set FAILED if a
2298 : replacement can't be done. In that case, we can't
2299 : move the death note for SRC. This should be
2300 : rare. */
2301 :
2302 : /* Set to stop at next insn. */
2303 6668 : for (q = next_real_insn (insn);
2304 36710 : q != next_real_insn (p);
2305 30042 : q = next_real_insn (q))
2306 : {
2307 30042 : if (reg_overlap_mentioned_p (src, PATTERN (q)))
2308 : {
2309 : /* If SRC is a hard register, we might miss
2310 : some overlapping registers with
2311 : validate_replace_rtx, so we would have to
2312 : undo it. We can't if DEST is present in
2313 : the insn, so fail in that combination of
2314 : cases. */
2315 7671 : if (sregno < FIRST_PSEUDO_REGISTER
2316 7671 : && reg_mentioned_p (dest, PATTERN (q)))
2317 : failed = 1;
2318 :
2319 : /* Attempt to replace all uses. */
2320 7671 : else if (!validate_replace_rtx (src, dest, q))
2321 : failed = 1;
2322 :
2323 : /* If this succeeded, but some part of the
2324 : register is still present, undo the
2325 : replacement. */
2326 7671 : else if (sregno < FIRST_PSEUDO_REGISTER
2327 7671 : && reg_overlap_mentioned_p (src, PATTERN (q)))
2328 : {
2329 0 : validate_replace_rtx (dest, src, q);
2330 0 : failed = 1;
2331 : }
2332 : }
2333 :
2334 : /* If DEST dies here, remove the death note and
2335 : save it for later. Make sure ALL of DEST dies
2336 : here; again, this is overly conservative. */
2337 30042 : if (! dest_death
2338 30042 : && (dest_death = find_regno_note (q, REG_DEAD, dregno)))
2339 : {
2340 2 : if (GET_MODE (XEXP (dest_death, 0)) == GET_MODE (dest))
2341 2 : remove_note (q, dest_death);
2342 : else
2343 : {
2344 : failed = 1;
2345 : dest_death = 0;
2346 : }
2347 : }
2348 : }
2349 :
2350 6668 : if (! failed)
2351 : {
2352 : /* Move death note of SRC from P to INSN. */
2353 6668 : remove_note (p, note);
2354 6668 : XEXP (note, 1) = REG_NOTES (insn);
2355 6668 : REG_NOTES (insn) = note;
2356 : }
2357 :
2358 : /* DEST is also dead if INSN has a REG_UNUSED note for
2359 : DEST. */
2360 6668 : if (! dest_death
2361 6668 : && (dest_death
2362 6666 : = find_regno_note (insn, REG_UNUSED, dregno)))
2363 : {
2364 0 : PUT_REG_NOTE_KIND (dest_death, REG_DEAD);
2365 0 : remove_note (insn, dest_death);
2366 : }
2367 :
2368 : /* Put death note of DEST on P if we saw it die. */
2369 6668 : if (dest_death)
2370 : {
2371 2 : XEXP (dest_death, 1) = REG_NOTES (p);
2372 2 : REG_NOTES (p) = dest_death;
2373 : }
2374 : break;
2375 : }
2376 :
2377 : /* If SRC is a hard register which is set or killed in
2378 : some other way, we can't do this optimization. */
2379 5219302 : else if (sregno < FIRST_PSEUDO_REGISTER && dead_or_set_p (p, src))
2380 : break;
2381 : }
2382 : }
2383 : }
2384 :
2385 :
2386 :
2387 : /* Return true if REGNO is a particularly bad choice for reloading X. */
2388 : static bool
2389 0 : ira_bad_reload_regno_1 (int regno, rtx x)
2390 : {
2391 0 : int x_regno, n, i;
2392 0 : ira_allocno_t a;
2393 0 : enum reg_class pref;
2394 :
2395 : /* We only deal with pseudo regs. */
2396 0 : if (! x || GET_CODE (x) != REG)
2397 : return false;
2398 :
2399 0 : x_regno = REGNO (x);
2400 0 : if (x_regno < FIRST_PSEUDO_REGISTER)
2401 : return false;
2402 :
2403 : /* If the pseudo prefers REGNO explicitly, then do not consider
2404 : REGNO a bad spill choice. */
2405 0 : pref = reg_preferred_class (x_regno);
2406 0 : if (reg_class_size[pref] == 1)
2407 0 : return !TEST_HARD_REG_BIT (reg_class_contents[pref], regno);
2408 :
2409 : /* If the pseudo conflicts with REGNO, then we consider REGNO a
2410 : poor choice for a reload regno. */
2411 0 : a = ira_regno_allocno_map[x_regno];
2412 0 : n = ALLOCNO_NUM_OBJECTS (a);
2413 0 : for (i = 0; i < n; i++)
2414 : {
2415 0 : ira_object_t obj = ALLOCNO_OBJECT (a, i);
2416 0 : if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno))
2417 : return true;
2418 : }
2419 : return false;
2420 : }
2421 :
2422 : /* Return true if REGNO is a particularly bad choice for reloading
2423 : IN or OUT. */
2424 : bool
2425 0 : ira_bad_reload_regno (int regno, rtx in, rtx out)
2426 : {
2427 0 : return (ira_bad_reload_regno_1 (regno, in)
2428 0 : || ira_bad_reload_regno_1 (regno, out));
2429 : }
2430 :
2431 : /* Add register clobbers from asm statements. */
2432 : static void
2433 1518187 : compute_regs_asm_clobbered (void)
2434 : {
2435 1518187 : basic_block bb;
2436 :
2437 16150335 : FOR_EACH_BB_FN (bb, cfun)
2438 : {
2439 14632148 : rtx_insn *insn;
2440 174042842 : FOR_BB_INSNS_REVERSE (bb, insn)
2441 : {
2442 159410694 : df_ref def;
2443 :
2444 159410694 : if (NONDEBUG_INSN_P (insn) && asm_noperands (PATTERN (insn)) >= 0)
2445 334260 : FOR_EACH_INSN_DEF (def, insn)
2446 : {
2447 223669 : unsigned int dregno = DF_REF_REGNO (def);
2448 223669 : if (HARD_REGISTER_NUM_P (dregno))
2449 306996 : add_to_hard_reg_set (&crtl->asm_clobbers,
2450 153498 : GET_MODE (DF_REF_REAL_REG (def)),
2451 : dregno);
2452 : }
2453 : }
2454 : }
2455 1518187 : }
2456 :
2457 :
2458 : /* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and
2459 : REGS_EVER_LIVE. */
2460 : void
2461 1518187 : ira_setup_eliminable_regset (void)
2462 : {
2463 1518187 : int i;
2464 1518187 : static const struct {const int from, to; } eliminables[] = ELIMINABLE_REGS;
2465 1648926 : int fp_reg_count = hard_regno_nregs (HARD_FRAME_POINTER_REGNUM, Pmode);
2466 :
2467 : /* Set up is_leaf, as frame_pointer_required may use it. This function
2468 : is called by sched_init before ira if scheduling is enabled. */
2469 1518187 : crtl->is_leaf = leaf_function_p ();
2470 :
2471 : /* FIXME: If EXIT_IGNORE_STACK is set, we will not save and restore
2472 : sp for alloca. So we can't eliminate the frame pointer in that
2473 : case. At some point, we should improve this by emitting the
2474 : sp-adjusting insns for this case. */
2475 1518187 : frame_pointer_needed
2476 3036374 : = (! flag_omit_frame_pointer
2477 1057386 : || (cfun->calls_alloca && EXIT_IGNORE_STACK)
2478 : /* We need the frame pointer to catch stack overflow exceptions if
2479 : the stack pointer is moving (as for the alloca case just above). */
2480 1047895 : || (STACK_CHECK_MOVING_SP
2481 1047895 : && flag_stack_check
2482 62 : && flag_exceptions
2483 26 : && cfun->can_throw_non_call_exceptions)
2484 1047891 : || crtl->accesses_prior_frames
2485 1044961 : || (SUPPORTS_STACK_ALIGNMENT && crtl->stack_realign_needed)
2486 2517199 : || targetm.frame_pointer_required ());
2487 :
2488 : /* The chance that FRAME_POINTER_NEEDED is changed from inspecting
2489 : RTL is very small. So if we use the frame pointer for RA and RTL
2490 : actually prevents this, we will spill pseudos assigned to the
2491 : frame pointer in LRA. */
2492 :
2493 1518187 : if (frame_pointer_needed)
2494 1038440 : for (i = 0; i < fp_reg_count; i++)
2495 519220 : df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM + i, true);
2496 :
2497 1518187 : ira_no_alloc_regs = no_unit_alloc_regs;
2498 1518187 : CLEAR_HARD_REG_SET (eliminable_regset);
2499 :
2500 1518187 : compute_regs_asm_clobbered ();
2501 :
2502 : /* Build the regset of all eliminable registers and show we can't
2503 : use those that we already know won't be eliminated. */
2504 7590935 : for (i = 0; i < (int) ARRAY_SIZE (eliminables); i++)
2505 : {
2506 6072748 : bool cannot_elim
2507 6072748 : = (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to)
2508 6072748 : || (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed));
2509 :
2510 6072748 : if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from))
2511 : {
2512 6072748 : SET_HARD_REG_BIT (eliminable_regset, eliminables[i].from);
2513 :
2514 6072748 : if (cannot_elim)
2515 1098651 : SET_HARD_REG_BIT (ira_no_alloc_regs, eliminables[i].from);
2516 : }
2517 0 : else if (cannot_elim)
2518 0 : error ("%s cannot be used in %<asm%> here",
2519 : reg_names[eliminables[i].from]);
2520 : else
2521 0 : df_set_regs_ever_live (eliminables[i].from, true);
2522 : }
2523 : if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
2524 : {
2525 3036374 : for (i = 0; i < fp_reg_count; i++)
2526 1518187 : if (global_regs[HARD_FRAME_POINTER_REGNUM + i])
2527 : /* Nothing to do: the register is already treated as live
2528 : where appropriate, and cannot be eliminated. */
2529 : ;
2530 1518166 : else if (!TEST_HARD_REG_BIT (crtl->asm_clobbers,
2531 : HARD_FRAME_POINTER_REGNUM + i))
2532 : {
2533 1516875 : SET_HARD_REG_BIT (eliminable_regset,
2534 : HARD_FRAME_POINTER_REGNUM + i);
2535 1516875 : if (frame_pointer_needed)
2536 519218 : SET_HARD_REG_BIT (ira_no_alloc_regs,
2537 : HARD_FRAME_POINTER_REGNUM + i);
2538 : }
2539 1291 : else if (frame_pointer_needed)
2540 0 : error ("%s cannot be used in %<asm%> here",
2541 : reg_names[HARD_FRAME_POINTER_REGNUM + i]);
2542 : else
2543 1291 : df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM + i, true);
2544 : }
2545 1518187 : }
2546 :
2547 :
2548 :
2549 : /* Vector of substitutions of register numbers,
2550 : used to map pseudo regs into hardware regs.
2551 : This is set up as a result of register allocation.
2552 : Element N is the hard reg assigned to pseudo reg N,
2553 : or is -1 if no hard reg was assigned.
2554 : If N is a hard reg number, element N is N. */
2555 : short *reg_renumber;
2556 :
2557 : /* Set up REG_RENUMBER and CALLER_SAVE_NEEDED (used by reload) from
2558 : the allocation found by IRA. */
2559 : static void
2560 1488370 : setup_reg_renumber (void)
2561 : {
2562 1488370 : int regno, hard_regno;
2563 1488370 : ira_allocno_t a;
2564 1488370 : ira_allocno_iterator ai;
2565 :
2566 1488370 : caller_save_needed = 0;
2567 38010757 : FOR_EACH_ALLOCNO (a, ai)
2568 : {
2569 36522387 : if (ira_use_lra_p && ALLOCNO_CAP_MEMBER (a) != NULL)
2570 3676953 : continue;
2571 : /* There are no caps at this point. */
2572 32845434 : ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL);
2573 32845434 : if (! ALLOCNO_ASSIGNED_P (a))
2574 : /* It can happen if A is not referenced but partially anticipated
2575 : somewhere in a region. */
2576 0 : ALLOCNO_ASSIGNED_P (a) = true;
2577 32845434 : ira_free_allocno_updated_costs (a);
2578 32845434 : hard_regno = ALLOCNO_HARD_REGNO (a);
2579 32845434 : regno = ALLOCNO_REGNO (a);
2580 32845434 : reg_renumber[regno] = (hard_regno < 0 ? -1 : hard_regno);
2581 32845434 : if (hard_regno >= 0)
2582 : {
2583 29519353 : int i, nwords;
2584 29519353 : enum reg_class pclass;
2585 29519353 : ira_object_t obj;
2586 :
2587 29519353 : pclass = ira_pressure_class_translate[REGNO_REG_CLASS (hard_regno)];
2588 29519353 : nwords = ALLOCNO_NUM_OBJECTS (a);
2589 60155731 : for (i = 0; i < nwords; i++)
2590 : {
2591 30636378 : obj = ALLOCNO_OBJECT (a, i);
2592 30636378 : OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)
2593 61272756 : |= ~reg_class_contents[pclass];
2594 : }
2595 29519353 : if (ira_need_caller_save_p (a, hard_regno))
2596 : {
2597 427169 : ira_assert (!optimize || flag_caller_saves
2598 : || (ALLOCNO_CALLS_CROSSED_NUM (a)
2599 : == ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a))
2600 : || regno >= ira_reg_equiv_len
2601 : || ira_equiv_no_lvalue_p (regno));
2602 427169 : caller_save_needed = 1;
2603 : }
2604 : }
2605 : }
2606 1488370 : }
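/* Illustration only, not part of the allocator; the register numbers
   are hypothetical.  Suppose IRA gave pseudo 90 hard register 3 and
   left pseudo 91 without a hard register.  After setup_reg_renumber
   we would expect

     reg_renumber[90] == 3
     reg_renumber[91] == -1

   and caller_save_needed stays 0 unless ira_need_caller_save_p
   returned true for some allocno that received a hard register.  */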
2607 :
2608 : /* Set up allocno assignment flags for further allocation
2609 : improvements. */
2610 : static void
2611 0 : setup_allocno_assignment_flags (void)
2612 : {
2613 0 : int hard_regno;
2614 0 : ira_allocno_t a;
2615 0 : ira_allocno_iterator ai;
2616 :
2617 0 : FOR_EACH_ALLOCNO (a, ai)
2618 : {
2619 0 : if (! ALLOCNO_ASSIGNED_P (a))
2620 : /* It can happen if A is not referenced but partially anticipated
2621 : somewhere in a region. */
2622 0 : ira_free_allocno_updated_costs (a);
2623 0 : hard_regno = ALLOCNO_HARD_REGNO (a);
2624 : /* Don't assign hard registers to allocnos which are the destination
2625 : of a removed store at the end of a loop. It makes no sense to keep
2626 : the same value in different hard registers. It is also
2627 : impossible to assign hard registers correctly to such
2628 : allocnos because the cost info and info about intersected
2629 : calls are incorrect for them. */
2630 0 : ALLOCNO_ASSIGNED_P (a) = (hard_regno >= 0
2631 0 : || ALLOCNO_EMIT_DATA (a)->mem_optimized_dest_p
2632 0 : || (ALLOCNO_MEMORY_COST (a)
2633 0 : - ALLOCNO_CLASS_COST (a)) < 0);
2634 0 : ira_assert
2635 : (hard_regno < 0
2636 : || ira_hard_reg_in_set_p (hard_regno, ALLOCNO_MODE (a),
2637 : reg_class_contents[ALLOCNO_CLASS (a)]));
2638 : }
2639 0 : }
2640 :
2641 : /* Evaluate overall allocation cost and the costs for using hard
2642 : registers and memory for allocnos. */
2643 : static void
2644 1488370 : calculate_allocation_cost (void)
2645 : {
2646 1488370 : int hard_regno, cost;
2647 1488370 : ira_allocno_t a;
2648 1488370 : ira_allocno_iterator ai;
2649 :
2650 1488370 : ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
2651 38010757 : FOR_EACH_ALLOCNO (a, ai)
2652 : {
2653 36522387 : hard_regno = ALLOCNO_HARD_REGNO (a);
2654 36522387 : ira_assert (hard_regno < 0
2655 : || (ira_hard_reg_in_set_p
2656 : (hard_regno, ALLOCNO_MODE (a),
2657 : reg_class_contents[ALLOCNO_CLASS (a)])));
2658 36522387 : if (hard_regno < 0)
2659 : {
2660 3724198 : cost = ALLOCNO_MEMORY_COST (a);
2661 3724198 : ira_mem_cost += cost;
2662 : }
2663 32798189 : else if (ALLOCNO_HARD_REG_COSTS (a) != NULL)
2664 : {
2665 8552872 : cost = (ALLOCNO_HARD_REG_COSTS (a)
2666 : [ira_class_hard_reg_index
2667 8552872 : [ALLOCNO_CLASS (a)][hard_regno]]);
2668 8552872 : ira_reg_cost += cost;
2669 : }
2670 : else
2671 : {
2672 24245317 : cost = ALLOCNO_CLASS_COST (a);
2673 24245317 : ira_reg_cost += cost;
2674 : }
2675 36522387 : ira_overall_cost += cost;
2676 : }
2677 :
2678 1488370 : if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
2679 : {
2680 95 : fprintf (ira_dump_file,
2681 : "+++Costs: overall %" PRId64
2682 : ", reg %" PRId64
2683 : ", mem %" PRId64
2684 : ", ld %" PRId64
2685 : ", st %" PRId64
2686 : ", move %" PRId64,
2687 : ira_overall_cost, ira_reg_cost, ira_mem_cost,
2688 : ira_load_cost, ira_store_cost, ira_shuffle_cost);
2689 95 : fprintf (ira_dump_file, "\n+++ move loops %d, new jumps %d\n",
2690 : ira_move_loops_num, ira_additional_jumps_num);
2691 : }
2692 :
2693 1488370 : }
2694 :
2695 : #ifdef ENABLE_IRA_CHECKING
2696 : /* Check the correctness of the allocation. We need this because
2697 : of the complicated code that transforms the internal representation
2698 : with more than one region into a one-region representation. */
2699 : static void
2700 0 : check_allocation (void)
2701 : {
2702 0 : ira_allocno_t a;
2703 0 : int hard_regno, nregs, conflict_nregs;
2704 0 : ira_allocno_iterator ai;
2705 :
2706 0 : FOR_EACH_ALLOCNO (a, ai)
2707 : {
2708 0 : int n = ALLOCNO_NUM_OBJECTS (a);
2709 0 : int i;
2710 :
2711 0 : if (ALLOCNO_CAP_MEMBER (a) != NULL
2712 0 : || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0)
2713 0 : continue;
2714 0 : nregs = hard_regno_nregs (hard_regno, ALLOCNO_MODE (a));
2715 0 : if (nregs == 1)
2716 : /* We allocated a single hard register. */
2717 : n = 1;
2718 0 : else if (n > 1)
2719 : /* We allocated multiple hard registers, and we will test
2720 : conflicts in a granularity of single hard regs. */
2721 0 : nregs = 1;
2722 :
2723 0 : for (i = 0; i < n; i++)
2724 : {
2725 0 : ira_object_t obj = ALLOCNO_OBJECT (a, i);
2726 0 : ira_object_t conflict_obj;
2727 0 : ira_object_conflict_iterator oci;
2728 0 : int this_regno = hard_regno;
2729 0 : if (n > 1)
2730 : {
2731 0 : if (REG_WORDS_BIG_ENDIAN)
2732 : this_regno += n - i - 1;
2733 : else
2734 0 : this_regno += i;
2735 : }
2736 0 : FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
2737 : {
2738 0 : ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
2739 0 : int conflict_hard_regno = ALLOCNO_HARD_REGNO (conflict_a);
2740 0 : if (conflict_hard_regno < 0)
2741 0 : continue;
2742 0 : if (ira_soft_conflict (a, conflict_a))
2743 0 : continue;
2744 :
2745 0 : conflict_nregs = hard_regno_nregs (conflict_hard_regno,
2746 0 : ALLOCNO_MODE (conflict_a));
2747 :
2748 0 : if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1
2749 0 : && conflict_nregs == ALLOCNO_NUM_OBJECTS (conflict_a))
2750 : {
2751 0 : if (REG_WORDS_BIG_ENDIAN)
2752 : conflict_hard_regno += (ALLOCNO_NUM_OBJECTS (conflict_a)
2753 : - OBJECT_SUBWORD (conflict_obj) - 1);
2754 : else
2755 0 : conflict_hard_regno += OBJECT_SUBWORD (conflict_obj);
2756 0 : conflict_nregs = 1;
2757 : }
2758 :
2759 0 : if ((conflict_hard_regno <= this_regno
2760 0 : && this_regno < conflict_hard_regno + conflict_nregs)
2761 0 : || (this_regno <= conflict_hard_regno
2762 0 : && conflict_hard_regno < this_regno + nregs))
2763 : {
2764 0 : fprintf (stderr, "bad allocation for %d and %d\n",
2765 : ALLOCNO_REGNO (a), ALLOCNO_REGNO (conflict_a));
2766 0 : gcc_unreachable ();
2767 : }
2768 : }
2769 : }
2770 : }
2771 0 : }
2772 : #endif
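/* Illustration only, with hypothetical register numbers.  The final
   test in check_allocation is an interval-overlap check between
   [this_regno, this_regno + nregs) and
   [conflict_hard_regno, conflict_hard_regno + conflict_nregs).

   With this_regno = 4, nregs = 2 (hard regs 4 and 5) and
   conflict_hard_regno = 5, conflict_nregs = 1:
     first arm:  5 <= 4          is false;
     second arm: 4 <= 5 && 5 < 6 is true,
   so the pair would be reported as a bad allocation.

   With conflict_hard_regno = 6 instead, both arms are false
   (6 <= 4 fails, and 6 < 6 fails), so no conflict is reported.  */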
2773 :
2774 : /* Allocate REG_EQUIV_INIT. Set it up from IRA_REG_EQUIV, which should
2775 : already be calculated. */
2776 : static void
2777 1488370 : setup_reg_equiv_init (void)
2778 : {
2779 1488370 : int i;
2780 1488370 : int max_regno = max_reg_num ();
2781 :
2782 205420867 : for (i = 0; i < max_regno; i++)
2783 202444127 : reg_equiv_init (i) = ira_reg_equiv[i].init_insns;
2784 1488370 : }
2785 :
2786 : /* Update equiv regno from movement of FROM_REGNO to TO_REGNO. INSNS
2787 : are insns which were generated for such movement. It is assumed
2788 : that FROM_REGNO and TO_REGNO always have the same value at the
2789 : point of any move containing such registers. This function is used
2790 : to update equiv info for register shuffles on the region borders
2791 : and for caller save/restore insns. */
2792 : void
2793 2147614 : ira_update_equiv_info_by_shuffle_insn (int to_regno, int from_regno, rtx_insn *insns)
2794 : {
2795 2147614 : rtx_insn *insn;
2796 2147614 : rtx x, note;
2797 :
2798 2147614 : if (! ira_reg_equiv[from_regno].defined_p
2799 2147614 : && (! ira_reg_equiv[to_regno].defined_p
2800 923 : || ((x = ira_reg_equiv[to_regno].memory) != NULL_RTX
2801 922 : && ! MEM_READONLY_P (x))))
2802 : return;
2803 41870 : insn = insns;
2804 41870 : if (NEXT_INSN (insn) != NULL_RTX)
2805 : {
2806 0 : if (! ira_reg_equiv[to_regno].defined_p)
2807 : {
2808 0 : ira_assert (ira_reg_equiv[to_regno].init_insns == NULL_RTX);
2809 : return;
2810 : }
2811 0 : ira_reg_equiv[to_regno].defined_p = false;
2812 0 : ira_reg_equiv[to_regno].caller_save_p = false;
2813 0 : ira_reg_equiv[to_regno].memory
2814 0 : = ira_reg_equiv[to_regno].constant
2815 0 : = ira_reg_equiv[to_regno].invariant
2816 0 : = ira_reg_equiv[to_regno].init_insns = NULL;
2817 0 : if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2818 0 : fprintf (ira_dump_file,
2819 : " Invalidating equiv info for reg %d\n", to_regno);
2820 0 : return;
2821 : }
2822 : /* It is possible that FROM_REGNO still has no equivalence because
2823 : in shuffles to_regno<-from_regno and from_regno<-to_regno the 2nd
2824 : insn was not processed yet. */
2825 41870 : if (ira_reg_equiv[from_regno].defined_p)
2826 : {
2827 41869 : ira_reg_equiv[to_regno].defined_p = true;
2828 41869 : if ((x = ira_reg_equiv[from_regno].memory) != NULL_RTX)
2829 : {
2830 41716 : ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX
2831 : && ira_reg_equiv[from_regno].constant == NULL_RTX);
2832 41716 : ira_assert (ira_reg_equiv[to_regno].memory == NULL_RTX
2833 : || rtx_equal_p (ira_reg_equiv[to_regno].memory, x));
2834 41716 : ira_reg_equiv[to_regno].memory = x;
2835 41716 : if (! MEM_READONLY_P (x))
2836 : /* We don't add the insn to the init insn list because a memory
2837 : equivalence only says which memory is better to use
2838 : when the pseudo is spilled. */
2839 : return;
2840 : }
2841 153 : else if ((x = ira_reg_equiv[from_regno].constant) != NULL_RTX)
2842 : {
2843 41 : ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX);
2844 41 : ira_assert (ira_reg_equiv[to_regno].constant == NULL_RTX
2845 : || rtx_equal_p (ira_reg_equiv[to_regno].constant, x));
2846 41 : ira_reg_equiv[to_regno].constant = x;
2847 : }
2848 : else
2849 : {
2850 112 : x = ira_reg_equiv[from_regno].invariant;
2851 112 : ira_assert (x != NULL_RTX);
2852 112 : ira_assert (ira_reg_equiv[to_regno].invariant == NULL_RTX
2853 : || rtx_equal_p (ira_reg_equiv[to_regno].invariant, x));
2854 112 : ira_reg_equiv[to_regno].invariant = x;
2855 : }
2856 170 : if (find_reg_note (insn, REG_EQUIV, x) == NULL_RTX)
2857 : {
2858 170 : note = set_unique_reg_note (insn, REG_EQUIV, copy_rtx (x));
2859 170 : gcc_assert (note != NULL_RTX);
2860 170 : if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2861 : {
2862 0 : fprintf (ira_dump_file,
2863 : " Adding equiv note to insn %u for reg %d ",
2864 0 : INSN_UID (insn), to_regno);
2865 0 : dump_value_slim (ira_dump_file, x, 1);
2866 0 : fprintf (ira_dump_file, "\n");
2867 : }
2868 : }
2869 : }
2870 171 : ira_reg_equiv[to_regno].init_insns
2871 342 : = gen_rtx_INSN_LIST (VOIDmode, insn,
2872 171 : ira_reg_equiv[to_regno].init_insns);
2873 171 : if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2874 0 : fprintf (ira_dump_file,
2875 : " Adding equiv init move insn %u to reg %d\n",
2876 0 : INSN_UID (insn), to_regno);
2877 : }
2878 :
2879 : /* Fix values of array REG_EQUIV_INIT after live range splitting done
2880 : by IRA. */
2881 : static void
2882 2083540 : fix_reg_equiv_init (void)
2883 : {
2884 2083540 : int max_regno = max_reg_num ();
2885 2083540 : int i, new_regno, max;
2886 2083540 : rtx set;
2887 2083540 : rtx_insn_list *x, *next, *prev;
2888 2083540 : rtx_insn *insn;
2889 :
2890 2083540 : if (max_regno_before_ira < max_regno)
2891 : {
2892 507965 : max = vec_safe_length (reg_equivs);
2893 507965 : grow_reg_equivs ();
2894 47406720 : for (i = FIRST_PSEUDO_REGISTER; i < max; i++)
2895 46898755 : for (prev = NULL, x = reg_equiv_init (i);
2896 51399388 : x != NULL_RTX;
2897 : x = next)
2898 : {
2899 4500633 : next = x->next ();
2900 4500633 : insn = x->insn ();
2901 4500633 : set = single_set (insn);
2902 4500633 : ira_assert (set != NULL_RTX
2903 : && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))));
2904 4500633 : if (REG_P (SET_DEST (set))
2905 4500633 : && ((int) REGNO (SET_DEST (set)) == i
2906 0 : || (int) ORIGINAL_REGNO (SET_DEST (set)) == i))
2907 : new_regno = REGNO (SET_DEST (set));
2908 495599 : else if (REG_P (SET_SRC (set))
2909 495599 : && ((int) REGNO (SET_SRC (set)) == i
2910 0 : || (int) ORIGINAL_REGNO (SET_SRC (set)) == i))
2911 : new_regno = REGNO (SET_SRC (set));
2912 : else
2913 0 : gcc_unreachable ();
2914 4500633 : if (new_regno == i)
2915 : prev = x;
2916 : else
2917 : {
2918 : /* Remove the wrong list element. */
2919 0 : if (prev == NULL_RTX)
2920 0 : reg_equiv_init (i) = next;
2921 : else
2922 0 : XEXP (prev, 1) = next;
2923 0 : XEXP (x, 1) = reg_equiv_init (new_regno);
2924 0 : reg_equiv_init (new_regno) = x;
2925 : }
2926 : }
2927 : }
2928 2083540 : }
2929 :
2930 : #ifdef ENABLE_IRA_CHECKING
2931 : /* Print redundant memory-memory copies. */
2932 : static void
2933 1041770 : print_redundant_copies (void)
2934 : {
2935 1041770 : int hard_regno;
2936 1041770 : ira_allocno_t a;
2937 1041770 : ira_copy_t cp, next_cp;
2938 1041770 : ira_allocno_iterator ai;
2939 :
2940 26007026 : FOR_EACH_ALLOCNO (a, ai)
2941 : {
2942 24965256 : if (ALLOCNO_CAP_MEMBER (a) != NULL)
2943 : /* It is a cap. */
2944 3676953 : continue;
2945 21288303 : hard_regno = ALLOCNO_HARD_REGNO (a);
2946 21288303 : if (hard_regno >= 0)
2947 18050638 : continue;
2948 4182811 : for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp)
2949 945146 : if (cp->first == a)
2950 371816 : next_cp = cp->next_first_allocno_copy;
2951 : else
2952 : {
2953 573330 : next_cp = cp->next_second_allocno_copy;
2954 573330 : if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL
2955 1 : && cp->insn != NULL_RTX
2956 0 : && ALLOCNO_HARD_REGNO (cp->first) == hard_regno)
2957 0 : fprintf (ira_dump_file,
2958 : " Redundant move from %d(freq %d):%d\n",
2959 0 : INSN_UID (cp->insn), cp->freq, hard_regno);
2960 : }
2961 : }
2962 1041770 : }
2963 : #endif
2964 :
2965 : /* Set up preferred and alternate classes for new pseudo-registers
2966 : created by IRA starting with START. */
2967 : static void
2968 1077382 : setup_preferred_alternate_classes_for_new_pseudos (int start)
2969 : {
2970 1077382 : int i, old_regno;
2971 1077382 : int max_regno = max_reg_num ();
2972 :
2973 2258859 : for (i = start; i < max_regno; i++)
2974 : {
2975 1181477 : old_regno = ORIGINAL_REGNO (regno_reg_rtx[i]);
2976 1181477 : ira_assert (i != old_regno);
2977 1181477 : setup_reg_classes (i, reg_preferred_class (old_regno),
2978 : reg_alternate_class (old_regno),
2979 : reg_allocno_class (old_regno));
2980 1181477 : if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
2981 0 : fprintf (ira_dump_file,
2982 : " New r%d: setting preferred %s, alternative %s\n",
2983 0 : i, reg_class_names[reg_preferred_class (old_regno)],
2984 0 : reg_class_names[reg_alternate_class (old_regno)]);
2985 : }
2986 1077382 : }
2987 :
2988 :
2989 : /* The number of entries allocated in reg_info. */
2990 : static int allocated_reg_info_size;
2991 :
2992 : /* Regional allocation can create new pseudo-registers. This function
2993 : expands some arrays for pseudo-registers. */
2994 : static void
2995 1077382 : expand_reg_info (void)
2996 : {
2997 1077382 : int i;
2998 1077382 : int size = max_reg_num ();
2999 :
3000 1077382 : resize_reg_info ();
3001 2258859 : for (i = allocated_reg_info_size; i < size; i++)
3002 1181477 : setup_reg_classes (i, GENERAL_REGS, ALL_REGS, GENERAL_REGS);
3003 1077382 : setup_preferred_alternate_classes_for_new_pseudos (allocated_reg_info_size);
3004 1077382 : allocated_reg_info_size = size;
3005 1077382 : }
3006 :
3007 : /* Return true if register pressure in the function is too high.
3008 : This is used to decide whether stack slot sharing is worthwhile. */
3009 : static bool
3010 1488370 : too_high_register_pressure_p (void)
3011 : {
3012 1488370 : int i;
3013 1488370 : enum reg_class pclass;
3014 :
3015 7479440 : for (i = 0; i < ira_pressure_classes_num; i++)
3016 : {
3017 5991072 : pclass = ira_pressure_classes[i];
3018 5991072 : if (ira_loop_tree_root->reg_pressure[pclass] > 10000)
3019 : return true;
3020 : }
3021 : return false;
3022 : }
3023 :
3024 :
3025 :
3026 : /* Indicate that hard register number FROM was eliminated and replaced with
3027 : an offset from hard register number TO. The status of hard registers live
3028 : at the start of a basic block is updated by replacing a use of FROM with
3029 : a use of TO. */
3030 :
3031 : void
3032 0 : mark_elimination (int from, int to)
3033 : {
3034 0 : basic_block bb;
3035 0 : bitmap r;
3036 :
3037 0 : FOR_EACH_BB_FN (bb, cfun)
3038 : {
3039 0 : r = DF_LR_IN (bb);
3040 0 : if (bitmap_bit_p (r, from))
3041 : {
3042 0 : bitmap_clear_bit (r, from);
3043 0 : bitmap_set_bit (r, to);
3044 : }
3045 0 : if (! df_live)
3046 0 : continue;
3047 0 : r = DF_LIVE_IN (bb);
3048 0 : if (bitmap_bit_p (r, from))
3049 : {
3050 0 : bitmap_clear_bit (r, from);
3051 0 : bitmap_set_bit (r, to);
3052 : }
3053 : }
3054 0 : }
3055 :
3056 :
3057 :
3058 : /* The length of the following array. */
3059 : int ira_reg_equiv_len;
3060 :
3061 : /* Equivalence info for each register. */
3062 : struct ira_reg_equiv_s *ira_reg_equiv;
3063 :
3064 : /* Expand ira_reg_equiv if necessary. */
3065 : void
3066 14886968 : ira_expand_reg_equiv (void)
3067 : {
3068 14886968 : int old = ira_reg_equiv_len;
3069 :
3070 14886968 : if (ira_reg_equiv_len > max_reg_num ())
3071 : return;
3072 1491613 : ira_reg_equiv_len = max_reg_num () * 3 / 2 + 1;
3073 1491613 : ira_reg_equiv
3074 2983226 : = (struct ira_reg_equiv_s *) xrealloc (ira_reg_equiv,
3075 1491613 : ira_reg_equiv_len
3076 : * sizeof (struct ira_reg_equiv_s));
3077 1491613 : gcc_assert (old < ira_reg_equiv_len);
3078 1491613 : memset (ira_reg_equiv + old, 0,
3079 1491613 : sizeof (struct ira_reg_equiv_s) * (ira_reg_equiv_len - old));
3080 : }
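/* Illustration only, with hypothetical values.  The growth policy
   above over-allocates by roughly half so that repeated calls rarely
   reallocate:

     ira_reg_equiv_len == 0,   max_reg_num () == 100
       -> 0 > 100 is false, so the array grows to 100 * 3 / 2 + 1 == 151
          and entries [0, 151) are zeroed.
     ira_reg_equiv_len == 151, max_reg_num () == 150
       -> 151 > 150 is true, so the function returns without resizing.
     ira_reg_equiv_len == 151, max_reg_num () == 151
       -> 151 > 151 is false, so the array grows to 151 * 3 / 2 + 1 == 227
          (integer division) and only entries [151, 227) are zeroed.  */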
3081 :
3082 : static void
3083 1488370 : init_reg_equiv (void)
3084 : {
3085 1488370 : ira_reg_equiv_len = 0;
3086 1488370 : ira_reg_equiv = NULL;
3087 0 : ira_expand_reg_equiv ();
3088 0 : }
3089 :
3090 : static void
3091 1488370 : finish_reg_equiv (void)
3092 : {
3093 1488370 : free (ira_reg_equiv);
3094 0 : }
3095 :
3096 :
3097 :
3098 : struct equivalence
3099 : {
3100 : /* Set when a REG_EQUIV note is found or created. Used to
3101 : keep track of what memory accesses might be created later,
3102 : e.g. by reload. */
3103 : rtx replacement;
3104 : rtx *src_p;
3105 :
3106 : /* The list of each instruction which initializes this register.
3107 :
3108 : NULL indicates we know nothing about this register's equivalence
3109 : properties.
3110 :
3111 : An INSN_LIST with a NULL insn indicates this pseudo is already
3112 : known to not have a valid equivalence. */
3113 : rtx_insn_list *init_insns;
3114 :
3115 : /* Loop depth is used to recognize equivalences which appear
3116 : to be present within the same loop (or in an inner loop). */
3117 : short loop_depth;
3118 : /* Nonzero if this had a preexisting REG_EQUIV note. */
3119 : unsigned char is_arg_equivalence : 1;
3120 : /* Set when an attempt should be made to replace a register
3121 : with the associated src_p entry. */
3122 : unsigned char replace : 1;
3123 : /* Set if this register has no known equivalence. */
3124 : unsigned char no_equiv : 1;
3125 : /* Set if this register is mentioned in a paradoxical subreg. */
3126 : unsigned char pdx_subregs : 1;
3127 : };
3128 :
3129 : /* reg_equiv[N] (where N is a pseudo reg number) is the equivalence
3130 : structure for that register. */
3131 : static struct equivalence *reg_equiv;
3132 :
3133 : /* Used for communication between the following two functions. */
3134 : struct equiv_mem_data
3135 : {
3136 : /* A MEM that we wish to ensure remains unchanged. */
3137 : rtx equiv_mem;
3138 :
3139 : /* Set true if EQUIV_MEM is modified. */
3140 : bool equiv_mem_modified;
3141 : };
3142 :
3143 : /* If EQUIV_MEM is modified by modifying DEST, indicate that it is modified.
3144 : Called via note_stores. */
3145 : static void
3146 14236506 : validate_equiv_mem_from_store (rtx dest, const_rtx set ATTRIBUTE_UNUSED,
3147 : void *data)
3148 : {
3149 14236506 : struct equiv_mem_data *info = (struct equiv_mem_data *) data;
3150 :
3151 14236506 : if ((REG_P (dest)
3152 10476559 : && reg_overlap_mentioned_p (dest, info->equiv_mem))
3153 24698025 : || (MEM_P (dest)
3154 3725999 : && anti_dependence (info->equiv_mem, dest)))
3155 320996 : info->equiv_mem_modified = true;
3156 14236506 : }
3157 :
3158 : static bool equiv_init_varies_p (rtx x);
3159 :
3160 : enum valid_equiv { valid_none, valid_combine, valid_reload };
3161 :
3162 : /* Verify that no store between START and the death of REG invalidates
3163 : MEMREF. MEMREF is invalidated by modifying a register used in MEMREF,
3164 : by storing into an overlapping memory location, or with a non-const
3165 : CALL_INSN.
3166 :
3167 : Return VALID_RELOAD if MEMREF remains valid for both reload and
3168 : combine_and_move_insns, VALID_COMBINE if only valid for
3169 : combine_and_move_insns, and VALID_NONE otherwise. */
3170 : static enum valid_equiv
3171 3974814 : validate_equiv_mem (rtx_insn *start, rtx reg, rtx memref)
3172 : {
3173 3974814 : rtx_insn *insn;
3174 3974814 : rtx note;
3175 3974814 : struct equiv_mem_data info = { memref, false };
3176 3974814 : enum valid_equiv ret = valid_reload;
3177 :
3178 : /* If the memory reference has side effects or is volatile, it isn't a
3179 : valid equivalence. */
3180 3974814 : if (side_effects_p (memref))
3181 : return valid_none;
3182 :
3183 21096713 : for (insn = start; insn; insn = NEXT_INSN (insn))
3184 : {
3185 21096497 : if (!INSN_P (insn))
3186 1398740 : continue;
3187 :
3188 19697757 : if (find_reg_note (insn, REG_DEAD, reg))
3189 : return ret;
3190 :
3191 16922053 : if (CALL_P (insn))
3192 : {
3193 : /* We can combine a reg def from one insn into a reg use in
3194 : another over a call if the memory is readonly or the call
3195 : const/pure. However, we can't set reg_equiv notes up for
3196 : reload over any call. The problem is the equivalent form
3197 : may reference a pseudo which gets assigned a call
3198 : clobbered hard reg. When we later replace REG with its
3199 : equivalent form, the value in the call-clobbered reg has
3200 : been changed and all hell breaks loose. */
3201 92421 : ret = valid_combine;
3202 92421 : if (!MEM_READONLY_P (memref)
3203 92421 : && (!RTL_CONST_OR_PURE_CALL_P (insn)
3204 7974 : || equiv_init_varies_p (XEXP (memref, 0))))
3205 86443 : return valid_none;
3206 : }
3207 :
3208 16835610 : note_stores (insn, validate_equiv_mem_from_store, &info);
3209 16835610 : if (info.equiv_mem_modified)
3210 : return valid_none;
3211 :
3212 : /* If a register mentioned in MEMREF is modified via an
3213 : auto-increment, we lose the equivalence. Do the same if one
3214 : dies; although we could extend the life, it doesn't seem worth
3215 : the trouble. */
3216 :
3217 23039714 : for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
3218 7139946 : if ((REG_NOTE_KIND (note) == REG_INC
3219 7139946 : || REG_NOTE_KIND (note) == REG_DEAD)
3220 5369222 : && REG_P (XEXP (note, 0))
3221 12509168 : && reg_overlap_mentioned_p (XEXP (note, 0), memref))
3222 : return valid_none;
3223 : }
3224 :
3225 : return valid_none;
3226 : }
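/* Illustrative note, not part of ira.cc: the forward scan performed by
   validate_equiv_mem can be modelled, under heavy simplification, as the
   toy below.  ToyInsn, Validity and toy_validate are hypothetical names
   invented for this sketch; the real routine works on RTL insns, REG
   notes and note_stores, and additionally rejects the equivalence when a
   register used in the address is modified, dies, or when the address may
   vary across a const/pure call.  */

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical three-state result mirroring valid_none / valid_combine /
// valid_reload.
enum class Validity { None, Combine, Reload };

// Hypothetical stand-in for one insn; every field is a simplification.
struct ToyInsn
{
  bool kills_reg;      // models a REG_DEAD note for REG on this insn
  bool is_call;        // models CALL_P
  bool call_is_const;  // models RTL_CONST_OR_PURE_CALL_P
  bool clobbers_mem;   // models a store that overlaps MEMREF
};

// Walk forward from START until REG dies, downgrading the answer on the way.
Validity
toy_validate (const std::vector<ToyInsn> &insns, std::size_t start,
              bool mem_readonly)
{
  Validity ret = Validity::Reload;
  for (std::size_t i = start; i < insns.size (); ++i)
    {
      const ToyInsn &insn = insns[i];

      // Reaching the death of REG ends the scan with whatever we still have.
      if (insn.kills_reg)
        return ret;

      if (insn.is_call)
        {
          // Any call rules out the reload-level equivalence.
          ret = Validity::Combine;
          if (!mem_readonly && !insn.call_is_const)
            return Validity::None;
        }

      if (insn.clobbers_mem)
        return Validity::None;
    }

  // REG never died in the scanned range, so answer conservatively.
  return Validity::None;
}
```

/* In this model a const call between the load and the death of REG yields
   Validity::Combine, while running off the end of the insn stream without
   seeing the death yields Validity::None, matching the final return of the
   function above.  */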
3227 :
3228 : /* Returns false if X is known to be invariant. */
3229 : static bool
3230 827703 : equiv_init_varies_p (rtx x)
3231 : {
3232 827703 : RTX_CODE code = GET_CODE (x);
3233 827703 : int i;
3234 827703 : const char *fmt;
3235 :
3236 827703 : switch (code)
3237 : {
3238 219868 : case MEM:
3239 219868 : return !MEM_READONLY_P (x) || equiv_init_varies_p (XEXP (x, 0));
3240 :
3241 : case CONST:
3242 : CASE_CONST_ANY:
3243 : case SYMBOL_REF:
3244 : case LABEL_REF:
3245 : return false;
3246 :
3247 197585 : case REG:
3248 197585 : return reg_equiv[REGNO (x)].replace == 0 && rtx_varies_p (x, 0);
3249 :
3250 0 : case ASM_OPERANDS:
3251 0 : if (MEM_VOLATILE_P (x))
3252 : return true;
3253 :
3254 : /* Fall through. */
3255 :
3256 132154 : default:
3257 132154 : break;
3258 : }
3259 :
3260 132154 : fmt = GET_RTX_FORMAT (code);
3261 324802 : for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3262 217963 : if (fmt[i] == 'e')
3263 : {
3264 216015 : if (equiv_init_varies_p (XEXP (x, i)))
3265 : return true;
3266 : }
3267 1948 : else if (fmt[i] == 'E')
3268 : {
3269 : int j;
3270 3276 : for (j = 0; j < XVECLEN (x, i); j++)
3271 2916 : if (equiv_init_varies_p (XVECEXP (x, i, j)))
3272 : return true;
3273 : }
3274 :
3275 : return false;
3276 : }
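/* Illustrative note, not part of ira.cc: equiv_init_varies_p follows a
   common RTL pattern in which a few codes are decided directly and every
   other code is decided by recursing into its operands, stopping at the
   first operand that varies.  The sketch below shows that shape on a
   hypothetical ToyExpr tree; it does not model GET_RTX_FORMAT, MEM
   handling, or reg_equiv.  */

```cpp
#include <cassert>
#include <vector>

// Hypothetical expression codes, loosely analogous to rtx codes.
enum ToyCode { TOY_CONST, TOY_REG, TOY_PLUS };

// Hypothetical expression node.
struct ToyExpr
{
  ToyCode code;
  bool reg_varies;                // only meaningful when code == TOY_REG
  std::vector<ToyExpr> operands;  // sub-expressions, empty for leaves
};

// Return true if X may vary, false if it is known to be invariant.
bool
toy_varies_p (const ToyExpr &x)
{
  switch (x.code)
    {
    case TOY_CONST:
      return false;          // constants are always invariant

    case TOY_REG:
      return x.reg_varies;   // leaf decision taken from side information

    default:
      break;
    }

  // Any varying operand makes the whole expression vary.
  for (const ToyExpr &op : x.operands)
    if (toy_varies_p (op))
      return true;

  return false;
}
```

/* The early return on the first varying operand corresponds to the
   "return true" inside the 'e' and 'E' operand loops above.  */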
3277 :
3278 : /* Returns true if X (used to initialize register REGNO) is movable.
3279 : X is only movable if the registers it uses have equivalent initializations
3280 : which appear to be within the same loop (or in an inner loop) and movable
3281 : or if they are not candidates for local_alloc and don't vary. */
3282 : static bool
3283 10285338 : equiv_init_movable_p (rtx x, int regno)
3284 : {
3285 13169623 : int i, j;
3286 13169623 : const char *fmt;
3287 13169623 : enum rtx_code code = GET_CODE (x);
3288 :
3289 13169623 : switch (code)
3290 : {
3291 2884285 : case SET:
3292 2884285 : return equiv_init_movable_p (SET_SRC (x), regno);
3293 :
3294 : case CLOBBER:
3295 : return false;
3296 :
3297 : case PRE_INC:
3298 : case PRE_DEC:
3299 : case POST_INC:
3300 : case POST_DEC:
3301 : case PRE_MODIFY:
3302 : case POST_MODIFY:
3303 : return false;
3304 :
3305 1744569 : case REG:
3306 1744569 : return ((reg_equiv[REGNO (x)].loop_depth >= reg_equiv[regno].loop_depth
3307 1289238 : && reg_equiv[REGNO (x)].replace)
3308 2963716 : || (REG_BASIC_BLOCK (REGNO (x)) < NUM_FIXED_BLOCKS
3309 1531117 : && ! rtx_varies_p (x, 0)));
3310 :
3311 : case UNSPEC_VOLATILE:
3312 : return false;
3313 :
3314 0 : case ASM_OPERANDS:
3315 0 : if (MEM_VOLATILE_P (x))
3316 : return false;
3317 :
3318 : /* Fall through. */
3319 :
3320 7901317 : default:
3321 7901317 : break;
3322 : }
3323 :
3324 7901317 : fmt = GET_RTX_FORMAT (code);
3325 18700984 : for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3326 12794836 : switch (fmt[i])
3327 : {
3328 5895813 : case 'e':
3329 5895813 : if (! equiv_init_movable_p (XEXP (x, i), regno))
3330 : return false;
3331 : break;
3332 756626 : case 'E':
3333 982956 : for (j = XVECLEN (x, i) - 1; j >= 0; j--)
3334 865788 : if (! equiv_init_movable_p (XVECEXP (x, i, j), regno))
3335 : return false;
3336 : break;
3337 : }
3338 :
3339 : return true;
3340 : }
3341 :
3342 : static bool memref_referenced_p (rtx memref, rtx x, bool read_p);
3343 :
3344 : /* Auxiliary function for memref_referenced_p. Process setting X for
3345 : MEMREF store. */
3346 : static bool
3347 834763 : process_set_for_memref_referenced_p (rtx memref, rtx x)
3348 : {
3349 : /* If we are setting a MEM, it doesn't count (its address does), but any
3350 : other SET_DEST that has a MEM in it is referencing the MEM. */
3351 834763 : if (MEM_P (x))
3352 : {
3353 675924 : if (memref_referenced_p (memref, XEXP (x, 0), true))
3354 : return true;
3355 : }
3356 158839 : else if (memref_referenced_p (memref, x, false))
3357 : return true;
3358 :
3359 : return false;
3360 : }
3361 :
3362 : /* TRUE if X references a memory location (as a read if READ_P) that
3363 : would be affected by a store to MEMREF. */
3364 : static bool
3365 3963723 : memref_referenced_p (rtx memref, rtx x, bool read_p)
3366 : {
3367 3963723 : int i, j;
3368 3963723 : const char *fmt;
3369 3963723 : enum rtx_code code = GET_CODE (x);
3370 :
3371 3963723 : switch (code)
3372 : {
3373 : case CONST:
3374 : case LABEL_REF:
3375 : case SYMBOL_REF:
3376 : CASE_CONST_ANY:
3377 : case PC:
3378 : case HIGH:
3379 : case LO_SUM:
3380 : return false;
3381 :
3382 1588740 : case REG:
3383 1588740 : return (reg_equiv[REGNO (x)].replacement
3384 1660340 : && memref_referenced_p (memref,
3385 71600 : reg_equiv[REGNO (x)].replacement, read_p));
3386 :
3387 129443 : case MEM:
3388 : /* Memory X might have another effective type than MEMREF. */
3389 129443 : if (read_p || true_dependence (memref, VOIDmode, x))
3390 117064 : return true;
3391 : break;
3392 :
3393 818635 : case SET:
3394 818635 : if (process_set_for_memref_referenced_p (memref, SET_DEST (x)))
3395 : return true;
3396 :
3397 803913 : return memref_referenced_p (memref, SET_SRC (x), true);
3398 :
3399 16128 : case CLOBBER:
3400 16128 : if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3401 : return true;
3402 :
3403 : return false;
3404 :
3405 0 : case PRE_DEC:
3406 0 : case POST_DEC:
3407 0 : case PRE_INC:
3408 0 : case POST_INC:
3409 0 : if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3410 : return true;
3411 :
3412 0 : return memref_referenced_p (memref, XEXP (x, 0), true);
3413 :
3414 0 : case POST_MODIFY:
3415 0 : case PRE_MODIFY:
3416 : /* op0 = op0 + op1 */
3417 0 : if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3418 : return true;
3419 :
3420 0 : if (memref_referenced_p (memref, XEXP (x, 0), true))
3421 : return true;
3422 :
3423 0 : return memref_referenced_p (memref, XEXP (x, 1), true);
3424 :
3425 : default:
3426 : break;
3427 : }
3428 :
3429 737810 : fmt = GET_RTX_FORMAT (code);
3430 2143503 : for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3431 1437255 : switch (fmt[i])
3432 : {
3433 1394971 : case 'e':
3434 1394971 : if (memref_referenced_p (memref, XEXP (x, i), read_p))
3435 : return true;
3436 : break;
3437 19898 : case 'E':
3438 56823 : for (j = XVECLEN (x, i) - 1; j >= 0; j--)
3439 39863 : if (memref_referenced_p (memref, XVECEXP (x, i, j), read_p))
3440 : return true;
3441 : break;
3442 : }
3443 :
3444 : return false;
3445 : }
3446 :
3447 : /* TRUE if some insn in the range (START, END] references a memory location
3448 : that would be affected by a store to MEMREF.
3449 :
3450 : Callers should not call this routine if START is after END in the
3451 : RTL chain. */
3452 :
3453 : static bool
3454 622462 : memref_used_between_p (rtx memref, rtx_insn *start, rtx_insn *end)
3455 : {
3456 622462 : rtx_insn *insn;
3457 :
3458 2076323 : for (insn = NEXT_INSN (start);
3459 4136953 : insn && insn != NEXT_INSN (end);
3460 1453861 : insn = NEXT_INSN (insn))
3461 : {
3462 1570925 : if (!NONDEBUG_INSN_P (insn))
3463 752312 : continue;
3464 :
3465 818613 : if (memref_referenced_p (memref, PATTERN (insn), false))
3466 : return true;
3467 :
3468 : /* Nonconst functions may access memory. */
3469 701549 : if (CALL_P (insn) && (! RTL_CONST_CALL_P (insn)))
3470 : return true;
3471 : }
3472 :
3473 505398 : gcc_assert (insn == NEXT_INSN (end));
3474 : return false;
3475 : }
3476 :
3477 : /* Mark REG as having no known equivalence.
3478 : Some instructions might have been processed before and furnished
3479 : with REG_EQUIV notes for this register; these notes will have to be
3480 : removed.
3481 : STORE is the piece of RTL that does the non-constant / conflicting
3482 : assignment - a SET, CLOBBER or REG_INC note. It is currently not used,
3483 : but needs to be there because this function is called from note_stores. */
3484 : static void
3485 50532603 : no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED,
3486 : void *data ATTRIBUTE_UNUSED)
3487 : {
3488 50532603 : int regno;
3489 50532603 : rtx_insn_list *list;
3490 :
3491 50532603 : if (!REG_P (reg))
3492 : return;
3493 34937555 : regno = REGNO (reg);
3494 34937555 : reg_equiv[regno].no_equiv = 1;
3495 34937555 : list = reg_equiv[regno].init_insns;
3496 63563228 : if (list && list->insn () == NULL)
3497 : return;
3498 7087941 : reg_equiv[regno].init_insns = gen_rtx_INSN_LIST (VOIDmode, NULL_RTX, NULL);
3499 7087941 : reg_equiv[regno].replacement = NULL_RTX;
3500 : /* This doesn't matter for equivalences made for argument registers, we
3501 : should keep their initialization insns. */
3502 7087941 : if (reg_equiv[regno].is_arg_equivalence)
3503 : return;
3504 7082681 : ira_reg_equiv[regno].defined_p = false;
3505 7082681 : ira_reg_equiv[regno].caller_save_p = false;
3506 7082681 : ira_reg_equiv[regno].init_insns = NULL;
3507 7892428 : for (; list; list = list->next ())
3508 : {
3509 809747 : rtx_insn *insn = list->insn ();
3510 809747 : remove_note (insn, find_reg_note (insn, REG_EQUIV, NULL_RTX));
3511 : }
3512 : }
3513 :
3514 : /* Check whether the SUBREG is a paradoxical subreg and set the result
3515 : in PDX_SUBREGS. */
3516 :
3517 : static void
3518 83397276 : set_paradoxical_subreg (rtx_insn *insn)
3519 : {
3520 83397276 : subrtx_iterator::array_type array;
3521 528267719 : FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
3522 : {
3523 444870443 : const_rtx subreg = *iter;
3524 444870443 : if (GET_CODE (subreg) == SUBREG)
3525 : {
3526 2917212 : const_rtx reg = SUBREG_REG (subreg);
3527 2917212 : if (REG_P (reg) && paradoxical_subreg_p (subreg))
3528 829946 : reg_equiv[REGNO (reg)].pdx_subregs = true;
3529 : }
3530 : }
3531 83397276 : }
3532 :
3533 : /* In DEBUG_INSN location adjust REGs from CLEARED_REGS bitmap to the
3534 : equivalent replacement. */
3535 :
3536 : static rtx
3537 39972823 : adjust_cleared_regs (rtx loc, const_rtx old_rtx ATTRIBUTE_UNUSED, void *data)
3538 : {
3539 39972823 : if (REG_P (loc))
3540 : {
3541 6080867 : bitmap cleared_regs = (bitmap) data;
3542 6080867 : if (bitmap_bit_p (cleared_regs, REGNO (loc)))
3543 17376 : return simplify_replace_fn_rtx (copy_rtx (*reg_equiv[REGNO (loc)].src_p),
3544 17376 : NULL_RTX, adjust_cleared_regs, data);
3545 : }
3546 : return NULL_RTX;
3547 : }
3548 :
3549 : /* Given register REGNO is set only once, return true if the defining
3550 : insn dominates all uses. */
3551 :
3552 : static bool
3553 49797 : def_dominates_uses (int regno)
3554 : {
3555 49797 : df_ref def = DF_REG_DEF_CHAIN (regno);
3556 :
3557 49797 : struct df_insn_info *def_info = DF_REF_INSN_INFO (def);
3558 : /* If this is an artificial def (eh handler regs, hard frame pointer
3559 : for non-local goto, regs defined on function entry) then def_info
3560 : is NULL and the reg is always live before any use. We might
3561 : reasonably return true in that case, but since the only call
3562 : of this function is currently here in ira.cc when we are looking
3563 : at a defining insn we can't have an artificial def as that would
3564 : bump DF_REG_DEF_COUNT. */
3565 49797 : gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && def_info != NULL);
3566 :
3567 49797 : rtx_insn *def_insn = DF_REF_INSN (def);
3568 49797 : basic_block def_bb = BLOCK_FOR_INSN (def_insn);
3569 :
3570 49797 : for (df_ref use = DF_REG_USE_CHAIN (regno);
3571 143112 : use;
3572 93315 : use = DF_REF_NEXT_REG (use))
3573 : {
3574 93315 : struct df_insn_info *use_info = DF_REF_INSN_INFO (use);
3575 : /* Only check real uses, not artificial ones. */
3576 93315 : if (use_info)
3577 : {
3578 93315 : rtx_insn *use_insn = DF_REF_INSN (use);
3579 93315 : if (!DEBUG_INSN_P (use_insn))
3580 : {
3581 93061 : basic_block use_bb = BLOCK_FOR_INSN (use_insn);
3582 93061 : if (use_bb != def_bb
3583 93061 : ? !dominated_by_p (CDI_DOMINATORS, use_bb, def_bb)
3584 54813 : : DF_INSN_INFO_LUID (use_info) < DF_INSN_INFO_LUID (def_info))
3585 : return false;
3586 : }
3587 : }
3588 : }
3589 : return true;
3590 : }
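/* Illustrative note, not part of ira.cc: the test in def_dominates_uses
   has two cases.  A use in another block is acceptable only if that block
   is dominated by the defining block; a use in the same block is
   acceptable only if it does not come before the definition.  The sketch
   below reproduces that decision on a hypothetical toy CFG described by an
   immediate-dominator array; ToyRef, toy_dominated_by and
   toy_def_dominates_uses are invented names, and the integer position
   field stands in for the DF LUID.  */

```cpp
#include <cassert>
#include <vector>

// Hypothetical reference to a register: the block it is in and its
// position within that block.
struct ToyRef
{
  int block;
  int luid;
};

// idom[b] is the immediate dominator of block b; the entry block is
// assumed to be its own immediate dominator.
bool
toy_dominated_by (const std::vector<int> &idom, int bb, int dom)
{
  for (;;)
    {
      if (bb == dom)
        return true;
      if (idom[bb] == bb)
        return false;   // reached the entry block without meeting DOM
      bb = idom[bb];
    }
}

// Return true if DEF dominates every use in USES.
bool
toy_def_dominates_uses (const std::vector<int> &idom, ToyRef def,
                        const std::vector<ToyRef> &uses)
{
  for (const ToyRef &use : uses)
    if (use.block != def.block
        ? !toy_dominated_by (idom, use.block, def.block)
        : use.luid < def.luid)
      return false;
  return true;
}
```

/* With idom = {0, 0, 1, 1} and a definition at position 5 of block 1, a
   use in block 2 passes, while a use at position 3 of block 1 or any use
   in block 0 fails.  */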
3591 :
3592 : /* Scan the instructions before update_equiv_regs. Record which registers
3593 : are referenced as paradoxical subregs. Also check for cases in which
3594 : the current function needs to save a register that one of its call
3595 : instructions clobbers.
3596 :
3597 : These things are logically unrelated, but it's more efficient to do
3598 : them together. */
3599 :
3600 : static void
3601 1488370 : update_equiv_regs_prescan (void)
3602 : {
3603 1488370 : basic_block bb;
3604 1488370 : rtx_insn *insn;
3605 1488370 : function_abi_aggregator callee_abis;
3606 :
3607 15780720 : FOR_EACH_BB_FN (bb, cfun)
3608 171491835 : FOR_BB_INSNS (bb, insn)
3609 157199485 : if (NONDEBUG_INSN_P (insn))
3610 : {
3611 83397276 : set_paradoxical_subreg (insn);
3612 83397276 : if (CALL_P (insn))
3613 5963612 : callee_abis.note_callee_abi (insn_callee_abi (insn));
3614 : }
3615 :
3616 1488370 : HARD_REG_SET extra_caller_saves = callee_abis.caller_save_regs (*crtl->abi);
3617 :
3618 1488370 : hard_reg_set_iterator hrsi;
3619 1488370 : unsigned int regno = 0;
3620 2976740 : if (!hard_reg_set_empty_p (extra_caller_saves))
3621 : {
3622 0 : EXECUTE_IF_SET_IN_HARD_REG_SET (extra_caller_saves, 0, regno, hrsi)
3623 0 : df_set_regs_ever_live (regno, true);
3624 : }
3625 1488370 : }
3626 :
3627 : /* Find registers that are equivalent to a single value throughout the
3628 : compilation (either because they can be referenced in memory or are
3629 : set once from a single constant). Lower their priority for
3630 : receiving a hard register.
3631 :
3632 : If such a register is only referenced once, try substituting its
3633 : value into the using insn. If it succeeds, we can eliminate the
3634 : register completely.
3635 :
3636 : Initialize init_insns in ira_reg_equiv array. */
3637 : static void
3638 1488370 : update_equiv_regs (void)
3639 : {
3640 1488370 : rtx_insn *insn;
3641 1488370 : basic_block bb;
3642 :
3643 : /* Scan the insns and find which registers have equivalences. Do this
3644 : in a separate scan of the insns because (due to -fcse-follow-jumps)
3645 : a register can be set below its use. */
3646 1488370 : bitmap setjmp_crosses = regstat_get_setjmp_crosses ();
3647 15780720 : FOR_EACH_BB_FN (bb, cfun)
3648 : {
3649 14292350 : int loop_depth = bb_loop_depth (bb);
3650 :
3651 171491835 : for (insn = BB_HEAD (bb);
3652 171491835 : insn != NEXT_INSN (BB_END (bb));
3653 157199485 : insn = NEXT_INSN (insn))
3654 : {
3655 157199485 : rtx note;
3656 157199485 : rtx set;
3657 157199485 : rtx dest, src;
3658 157199485 : int regno;
3659 :
3660 157199485 : if (! INSN_P (insn))
3661 26392955 : continue;
3662 :
3663 215829983 : for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
3664 85023453 : if (REG_NOTE_KIND (note) == REG_INC)
3665 0 : no_equiv (XEXP (note, 0), note, NULL);
3666 :
3667 130806530 : set = single_set (insn);
3668 :
3669 : /* If this insn contains more (or less) than a single SET,
3670 : only mark all destinations as having no known equivalence. */
3671 186496936 : if (set == NULL_RTX
3672 130806530 : || side_effects_p (SET_SRC (set)))
3673 : {
3674 55690406 : note_pattern_stores (PATTERN (insn), no_equiv, NULL);
3675 55690406 : continue;
3676 : }
3677 75116124 : else if (GET_CODE (PATTERN (insn)) == PARALLEL)
3678 : {
3679 10511433 : int i;
3680 :
3681 31737812 : for (i = XVECLEN (PATTERN (insn), 0) - 1; i >= 0; i--)
3682 : {
3683 21226379 : rtx part = XVECEXP (PATTERN (insn), 0, i);
3684 21226379 : if (part != set)
3685 10714946 : note_pattern_stores (part, no_equiv, NULL);
3686 : }
3687 : }
3688 :
3689 75116124 : dest = SET_DEST (set);
3690 75116124 : src = SET_SRC (set);
3691 :
3692 : /* See if this is setting up the equivalence between an argument
3693 : register and its stack slot. */
3694 75116124 : note = find_reg_note (insn, REG_EQUIV, NULL_RTX);
3695 75116124 : if (note)
3696 : {
3697 231236 : gcc_assert (REG_P (dest));
3698 231236 : regno = REGNO (dest);
3699 :
3700 : /* Note that we don't want to clear init_insns in
3701 : ira_reg_equiv even if there are multiple sets of this
3702 : register. */
3703 231236 : reg_equiv[regno].is_arg_equivalence = 1;
3704 :
3705 : /* The insn result can have equivalence memory although
3706 : the equivalence is not set up by the insn. We add
3707 : this insn to init insns as it is a flag for now that
3708 : regno has an equivalence. We will remove the insn
3709 : from init insn list later. */
3710 231236 : if (rtx_equal_p (src, XEXP (note, 0)) || MEM_P (XEXP (note, 0)))
3711 231236 : ira_reg_equiv[regno].init_insns
3712 231236 : = gen_rtx_INSN_LIST (VOIDmode, insn,
3713 231236 : ira_reg_equiv[regno].init_insns);
3714 :
3715 : /* Continue normally in case this is a candidate for
3716 : replacements. */
3717 : }
3718 :
3719 75116124 : if (!optimize)
3720 23095205 : continue;
3721 :
3722 : /* We only handle the case of a pseudo register being set
3723 : once, or always to the same value. */
3724 : /* ??? The mn10200 port breaks if we add equivalences for
3725 : values that need an ADDRESS_REGS register and set them equivalent
3726 : to a MEM of a pseudo. The actual problem is in the over-conservative
3727 : handling of INPADDR_ADDRESS / INPUT_ADDRESS / INPUT triples in
3728 : calculate_needs, but we traditionally work around this problem
3729 : here by rejecting equivalences when the destination is in a register
3730 : that's likely spilled. This is fragile, of course, since the
3731 : preferred class of a pseudo depends on all instructions that set
3732 : or use it. */
3733 :
3734 85929656 : if (!REG_P (dest)
3735 36053590 : || (regno = REGNO (dest)) < FIRST_PSEUDO_REGISTER
3736 20642782 : || (reg_equiv[regno].init_insns
3737 3352107 : && reg_equiv[regno].init_insns->insn () == NULL)
3738 70133164 : || (targetm.class_likely_spilled_p (reg_preferred_class (regno))
3739 350 : && MEM_P (src) && ! reg_equiv[regno].is_arg_equivalence))
3740 : {
3741 : /* This might be setting a SUBREG of a pseudo, a pseudo that is
3742 : also set somewhere else to a constant. */
3743 33908737 : note_pattern_stores (set, no_equiv, NULL);
3744 33908737 : continue;
3745 : }
3746 :
3747 : /* Don't set reg mentioned in a paradoxical subreg
3748 : equivalent to a mem. */
3749 18112182 : if (MEM_P (src) && reg_equiv[regno].pdx_subregs)
3750 : {
3751 17764 : note_pattern_stores (set, no_equiv, NULL);
3752 17764 : continue;
3753 : }
3754 :
3755 18094418 : note = find_reg_note (insn, REG_EQUAL, NULL_RTX);
3756 :
3757 : /* cse sometimes generates function invariants, but doesn't put a
3758 : REG_EQUAL note on the insn. Since this note would be redundant,
3759 : there's no point creating it earlier than here. */
3760 18094418 : if (! note && ! rtx_varies_p (src, 0))
3761 2683609 : note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
3762 :
3763 : /* Don't bother considering a REG_EQUAL note containing an EXPR_LIST
3764 : since it represents a function call. */
3765 18094418 : if (note && GET_CODE (XEXP (note, 0)) == EXPR_LIST)
3766 14389676 : note = NULL_RTX;
3767 :
3768 18094418 : if (DF_REG_DEF_COUNT (regno) != 1)
3769 : {
3770 2841578 : bool equal_p = true;
3771 2841578 : rtx_insn_list *list;
3772 :
3773 : /* If we have already processed this pseudo and determined it
3774 : cannot have an equivalence, then honor that decision. */
3775 2841578 : if (reg_equiv[regno].no_equiv)
3776 0 : continue;
3777 :
3778 4613783 : if (! note
3779 1102376 : || rtx_varies_p (XEXP (note, 0), 0)
3780 3910951 : || (reg_equiv[regno].replacement
3781 0 : && ! rtx_equal_p (XEXP (note, 0),
3782 : reg_equiv[regno].replacement)))
3783 : {
3784 1772205 : no_equiv (dest, set, NULL);
3785 1772205 : continue;
3786 : }
3787 :
3788 1069373 : list = reg_equiv[regno].init_insns;
3789 2011018 : for (; list; list = list->next ())
3790 : {
3791 1026375 : rtx note_tmp;
3792 1026375 : rtx_insn *insn_tmp;
3793 :
3794 1026375 : insn_tmp = list->insn ();
3795 1026375 : note_tmp = find_reg_note (insn_tmp, REG_EQUAL, NULL_RTX);
3796 1026375 : gcc_assert (note_tmp);
3797 1026375 : if (! rtx_equal_p (XEXP (note, 0), XEXP (note_tmp, 0)))
3798 : {
3799 : equal_p = false;
3800 : break;
3801 : }
3802 : }
3803 :
3804 1069373 : if (! equal_p)
3805 : {
3806 84730 : no_equiv (dest, set, NULL);
3807 84730 : continue;
3808 : }
3809 : }
3810 :
3811 : /* Record this insn as initializing this register. */
3812 16237483 : reg_equiv[regno].init_insns
3813 16237483 : = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv[regno].init_insns);
3814 :
3815 : /* If this register is known to be equal to a constant, record that
3816 : it is always equivalent to the constant.
3817 : Note that it is possible to have a register use before
3818 : the def in loops (see gcc.c-torture/execute/pr79286.c)
3819 : where the reg is undefined on first use. If the def insn
3820 : won't trap we can use it as an equivalence, effectively
3821 : choosing the "undefined" value for the reg to be the
3822 : same as the value set by the def. */
3823 16237483 : if (DF_REG_DEF_COUNT (regno) == 1
3824 15252840 : && note
3825 2602366 : && !rtx_varies_p (XEXP (note, 0), 0)
3826 18413324 : && (!may_trap_or_fault_p (XEXP (note, 0))
3827 49797 : || def_dominates_uses (regno)))
3828 : {
3829 2175841 : rtx note_value = XEXP (note, 0);
3830 2175841 : remove_note (insn, note);
3831 2175841 : set_unique_reg_note (insn, REG_EQUIV, note_value);
3832 : }
3833 :
3834 : /* If this insn introduces a "constant" register, decrease the priority
3835 : of that register. Record this insn if the register is only used once
3836 : more and the equivalence value is the same as our source.
3837 :
3838 : The latter condition is checked for two reasons: First, it is an
3839 : indication that it may be more efficient to actually emit the insn
3840 : as written (if no registers are available, reload will substitute
3841 : the equivalence). Secondly, it avoids problems with any registers
3842 : dying in this insn whose death notes would be missed.
3843 :
3844 : If we don't have a REG_EQUIV note, see if this insn is loading
3845 : a register used only in one basic block from a MEM. If so, and the
3846 : MEM remains unchanged for the life of the register, add a REG_EQUIV
3847 : note. */
3848 16237483 : note = find_reg_note (insn, REG_EQUIV, NULL_RTX);
3849 :
3850 16237483 : rtx replacement = NULL_RTX;
3851 16237483 : if (note)
3852 2399659 : replacement = XEXP (note, 0);
3853 13837824 : else if (REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
3854 9928252 : && MEM_P (SET_SRC (set)))
3855 : {
3856 2913029 : enum valid_equiv validity;
3857 2913029 : validity = validate_equiv_mem (insn, dest, SET_SRC (set));
3858 2913029 : if (validity != valid_none)
3859 : {
3860 2153019 : replacement = copy_rtx (SET_SRC (set));
3861 2153019 : if (validity == valid_reload)
3862 : {
3863 2152220 : note = set_unique_reg_note (insn, REG_EQUIV, replacement);
3864 : }
3865 799 : else if (ira_use_lra_p)
3866 : {
3867 : /* We still can use this equivalence for caller save
3868 : optimization in LRA. Mark this. */
3869 799 : ira_reg_equiv[regno].caller_save_p = true;
3870 799 : ira_reg_equiv[regno].init_insns
3871 799 : = gen_rtx_INSN_LIST (VOIDmode, insn,
3872 799 : ira_reg_equiv[regno].init_insns);
3873 : }
3874 : }
3875 : }
3876 :
3877 : /* If we haven't done so, record for reload that this is an
3878 : equivalencing insn. */
3879 16237483 : if (note && !reg_equiv[regno].is_arg_equivalence)
3880 4328061 : ira_reg_equiv[regno].init_insns
3881 4328061 : = gen_rtx_INSN_LIST (VOIDmode, insn,
3882 4328061 : ira_reg_equiv[regno].init_insns);
3883 :
3884 16237483 : if (replacement)
3885 : {
3886 4552678 : reg_equiv[regno].replacement = replacement;
3887 4552678 : reg_equiv[regno].src_p = &SET_SRC (set);
3888 4552678 : reg_equiv[regno].loop_depth = (short) loop_depth;
3889 :
3890 : /* Don't mess with things live during setjmp. */
3891 4552678 : if (optimize && !bitmap_bit_p (setjmp_crosses, regno))
3892 : {
3893 : /* If the register is referenced exactly twice, meaning it is
3894 : set once and used once, indicate that the reference may be
3895 : replaced by the equivalence we computed above. Do this
3896 : even if the register is only used in one block so that
3897 : dependencies can be handled where the last register is
3898 : used in a different block (i.e. HIGH / LO_SUM sequences)
3899 : and to reduce the number of registers alive across
3900 : calls. */
3901 :
3902 4552627 : if (REG_N_REFS (regno) == 2
3903 3596714 : && (rtx_equal_p (replacement, src)
3904 380934 : || ! equiv_init_varies_p (src))
3905 3523737 : && NONJUMP_INSN_P (insn)
3906 8076364 : && equiv_init_movable_p (PATTERN (insn), regno))
3907 2126613 : reg_equiv[regno].replace = 1;
3908 : }
3909 : }
3910 : }
3911 : }
3912 1488370 : }
3913 :
3914 : /* For insns that set a MEM to the contents of a REG that is only used
3915 : in a single basic block, see if the register is always equivalent
3916 : to that memory location and if moving the store from INSN to the
3917 : insn that sets REG is safe. If so, put a REG_EQUIV note on the
3918 : initializing insn. */
3919 : static void
3920 961263 : add_store_equivs (void)
3921 : {
3922 961263 : auto_sbitmap seen_insns (get_max_uid () + 1);
3923 961263 : bitmap_clear (seen_insns);
3924 :
3925 125807295 : for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
3926 : {
3927 124846032 : rtx set, src, dest;
3928 124846032 : unsigned regno;
3929 124846032 : rtx_insn *init_insn;
3930 :
3931 124846032 : bitmap_set_bit (seen_insns, INSN_UID (insn));
3932 :
3933 124846032 : if (! INSN_P (insn))
3934 23425455 : continue;
3935 :
3936 101420577 : set = single_set (insn);
3937 101420577 : if (! set)
3938 50840239 : continue;
3939 :
3940 50580338 : dest = SET_DEST (set);
3941 50580338 : src = SET_SRC (set);
3942 :
3943 : /* Don't add a REG_EQUIV note if the insn already has one. The existing
3944 : REG_EQUIV is likely more useful than the one we are adding. */
3945 7724078 : if (MEM_P (dest) && REG_P (src)
3946 5191173 : && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER
3947 5124832 : && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
3948 2930259 : && DF_REG_DEF_COUNT (regno) == 1
3949 2864849 : && ! reg_equiv[regno].pdx_subregs
3950 2721571 : && reg_equiv[regno].init_insns != NULL
3951 2721571 : && (init_insn = reg_equiv[regno].init_insns->insn ()) != 0
3952 2667559 : && bitmap_bit_p (seen_insns, INSN_UID (init_insn))
3953 2667559 : && ! find_reg_note (init_insn, REG_EQUIV, NULL_RTX)
3954 1061785 : && validate_equiv_mem (init_insn, src, dest) == valid_reload
3955 622462 : && ! memref_used_between_p (dest, init_insn, insn)
3956 : /* Attaching a REG_EQUIV note will fail if INIT_INSN has
3957 : multiple sets. */
3958 51085736 : && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest)))
3959 : {
3960 : /* This insn makes the equivalence, not the one initializing
3961 : the register. */
3962 504909 : ira_reg_equiv[regno].init_insns
3963 504909 : = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
3964 504909 : df_notes_rescan (init_insn);
3965 504909 : if (dump_file)
3966 88 : fprintf (dump_file,
3967 : "Adding REG_EQUIV to insn %d for source of insn %d\n",
3968 88 : INSN_UID (init_insn),
3969 88 : INSN_UID (insn));
3970 : }
3971 : }
3972 961263 : }
3973 :
3974 : /* Scan all regs killed in an insn to see if any of them are registers
3975 : only used that once. If so, see if we can replace the reference
3976 : with the equivalent form. If we can, delete the initializing
3977 : reference and this register will go away. If we can't replace the
3978 : reference, and the initializing reference is within the same loop
3979 : (or in an inner loop), then move the register initialization just
3980 : before the use, so that they are in the same basic block. */
3981 : static void
3982 1041717 : combine_and_move_insns (void)
3983 : {
3984 1041717 : auto_bitmap cleared_regs;
3985 1041717 : int max = max_reg_num ();
3986 :
3987 51431716 : for (int regno = FIRST_PSEUDO_REGISTER; regno < max; regno++)
3988 : {
3989 50389999 : if (!reg_equiv[regno].replace)
3990 48263599 : continue;
3991 :
3992 2126400 : rtx_insn *use_insn = 0;
3993 2126400 : bool multiple_insns = false;
3994 2126400 : for (df_ref use = DF_REG_USE_CHAIN (regno);
3995 4273225 : use;
3996 2146825 : use = DF_REF_NEXT_REG (use))
3997 2146825 : if (DF_REF_INSN_INFO (use))
3998 : {
3999 2146825 : if (DEBUG_INSN_P (DF_REF_INSN (use)))
4000 20423 : continue;
4001 2126402 : if (use_insn && DF_REF_INSN (use) != use_insn)
4002 : {
4003 : multiple_insns = true;
4004 : break;
4005 : }
4006 : use_insn = DF_REF_INSN (use);
4007 : }
4008 2126400 : gcc_assert (use_insn);
4009 :
4010 : /* If a register is used by more than one insn, we cannot trivially move
4011 : or delete the definition anymore. */
4012 2126400 : if (multiple_insns)
4013 0 : continue;
4014 :
4015 : /* Don't substitute into jumps. indirect_jump_optimize does
4016 : this for anything we are prepared to handle. */
4017 2126400 : if (JUMP_P (use_insn))
4018 400 : continue;
4019 :
4020 : /* Also don't substitute into a conditional trap insn -- it can become
4021 : an unconditional trap, and that is a flow control insn. */
4022 2126000 : if (GET_CODE (PATTERN (use_insn)) == TRAP_IF)
4023 0 : continue;
4024 :
4025 2126000 : df_ref def = DF_REG_DEF_CHAIN (regno);
4026 2126000 : gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && DF_REF_INSN_INFO (def));
4027 2126000 : rtx_insn *def_insn = DF_REF_INSN (def);
4028 :
4029 : /* We may not move instructions that can throw, since that
4030 : changes basic block boundaries and we are not prepared to
4031 : adjust the CFG to match. */
4032 2126000 : if (can_throw_internal (def_insn))
4033 0 : continue;
4034 :
4035 : /* Instructions with multiple sets can only be moved if DF analysis is
4036 : performed for all of the registers set. See PR91052. */
4037 2126000 : if (multiple_sets (def_insn))
4038 0 : continue;
4039 :
4040 2126000 : basic_block use_bb = BLOCK_FOR_INSN (use_insn);
4041 2126000 : basic_block def_bb = BLOCK_FOR_INSN (def_insn);
4042 2126000 : if (bb_loop_depth (use_bb) > bb_loop_depth (def_bb))
4043 134778 : continue;
4044 :
4045 1991222 : if (asm_noperands (PATTERN (def_insn)) < 0
4046 3982444 : && validate_replace_rtx (regno_reg_rtx[regno],
4047 1991222 : *reg_equiv[regno].src_p, use_insn))
4048 : {
4049 377261 : rtx link;
4050 : /* Append the REG_DEAD notes from def_insn. */
4051 755417 : for (rtx *p = &REG_NOTES (def_insn); (link = *p) != 0; )
4052 : {
4053 378156 : if (REG_NOTE_KIND (link) == REG_DEAD)
4054 : {
4055 344 : *p = XEXP (link, 1);
4056 344 : XEXP (link, 1) = REG_NOTES (use_insn);
4057 344 : REG_NOTES (use_insn) = link;
4058 : }
4059 : else
4060 377812 : p = &XEXP (link, 1);
4061 : }
4062 :
4063 377261 : remove_death (regno, use_insn);
4064 377261 : SET_REG_N_REFS (regno, 0);
4065 377261 : REG_FREQ (regno) = 0;
4066 377261 : df_ref use;
4067 455574 : FOR_EACH_INSN_USE (use, def_insn)
4068 : {
4069 78313 : unsigned int use_regno = DF_REF_REGNO (use);
4070 78313 : if (!HARD_REGISTER_NUM_P (use_regno))
4071 1247 : reg_equiv[use_regno].replace = 0;
4072 : }
4073 :
4074 377261 : delete_insn (def_insn);
4075 :
4076 377261 : reg_equiv[regno].init_insns = NULL;
4077 377261 : ira_reg_equiv[regno].init_insns = NULL;
4078 377261 : bitmap_set_bit (cleared_regs, regno);
4079 : }
4080 :
4081 : /* Move the initialization of the register to just before
4082 : USE_INSN. Update the flow information. */
4083 1613961 : else if (prev_nondebug_insn (use_insn) != def_insn)
4084 : {
4085 312294 : rtx_insn *new_insn;
4086 :
4087 312294 : new_insn = emit_insn_before (PATTERN (def_insn), use_insn);
4088 312294 : REG_NOTES (new_insn) = REG_NOTES (def_insn);
4089 312294 : REG_NOTES (def_insn) = 0;
4090 : /* Rescan it to process the notes. */
4091 312294 : df_insn_rescan (new_insn);
4092 :
4093 : /* Make sure this insn is recognized before reload begins,
4094 : otherwise eliminate_regs_in_insn will die. */
4095 312294 : INSN_CODE (new_insn) = INSN_CODE (def_insn);
4096 :
4097 312294 : delete_insn (def_insn);
4098 :
4099 312294 : XEXP (reg_equiv[regno].init_insns, 0) = new_insn;
4100 :
4101 312294 : REG_BASIC_BLOCK (regno) = use_bb->index;
4102 312294 : REG_N_CALLS_CROSSED (regno) = 0;
4103 :
4104 312294 : if (use_insn == BB_HEAD (use_bb))
4105 0 : BB_HEAD (use_bb) = new_insn;
4106 :
4107 : /* We know regno dies in use_insn, but inside a loop
4108 : REG_DEAD notes might be missing when def_insn was in
4109 : another basic block. However, when we move def_insn into
4110 : this bb we'll definitely get a REG_DEAD note and reload
4111 : will see the death. It's possible that update_equiv_regs
4112 : set up an equivalence referencing regno for a reg set by
4113 : use_insn, when regno was seen as non-local. Now that
4114 : regno is local to this block, and dies, such an
4115 : equivalence is invalid. */
4116 312294 : if (find_reg_note (use_insn, REG_EQUIV, regno_reg_rtx[regno]))
4117 : {
4118 0 : rtx set = single_set (use_insn);
4119 0 : if (set && REG_P (SET_DEST (set)))
4120 0 : no_equiv (SET_DEST (set), set, NULL);
4121 : }
4122 :
4123 312294 : ira_reg_equiv[regno].init_insns
4124 312294 : = gen_rtx_INSN_LIST (VOIDmode, new_insn, NULL_RTX);
4125 312294 : bitmap_set_bit (cleared_regs, regno);
4126 : }
4127 : }
4128 :
4129 1041717 : if (!bitmap_empty_p (cleared_regs))
4130 : {
4131 221368 : basic_block bb;
4132 :
4133 5643864 : FOR_EACH_BB_FN (bb, cfun)
4134 : {
4135 10844992 : bitmap_and_compl_into (DF_LR_IN (bb), cleared_regs);
4136 10844992 : bitmap_and_compl_into (DF_LR_OUT (bb), cleared_regs);
4137 5422496 : if (!df_live)
4138 5422496 : continue;
4139 0 : bitmap_and_compl_into (DF_LIVE_IN (bb), cleared_regs);
4140 0 : bitmap_and_compl_into (DF_LIVE_OUT (bb), cleared_regs);
4141 : }
4142 :
4143 : /* Last pass - adjust debug insns referencing cleared regs. */
4144 221368 : if (MAY_HAVE_DEBUG_BIND_INSNS)
4145 59460048 : for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
4146 59339265 : if (DEBUG_BIND_INSN_P (insn))
4147 : {
4148 21950108 : rtx old_loc = INSN_VAR_LOCATION_LOC (insn);
4149 21950108 : INSN_VAR_LOCATION_LOC (insn)
4150 43900216 : = simplify_replace_fn_rtx (old_loc, NULL_RTX,
4151 : adjust_cleared_regs,
4152 21950108 : (void *) cleared_regs);
4153 21950108 : if (old_loc != INSN_VAR_LOCATION_LOC (insn))
4154 17037 : df_insn_rescan (insn);
4155 : }
4156 : }
4157 1041717 : }
4158 :
4159 : /* A pass over indirect jumps, converting simple cases to direct jumps.
4160 : Combine does this optimization too, but only within a basic block. */
4161 : static void
4162 1488370 : indirect_jump_optimize (void)
4163 : {
4164 1488370 : basic_block bb;
4165 1488370 : bool rebuild_p = false;
4166 :
4167 15780722 : FOR_EACH_BB_REVERSE_FN (bb, cfun)
4168 : {
4169 14292352 : rtx_insn *insn = BB_END (bb);
4170 19932901 : if (!JUMP_P (insn)
4171 14292352 : || find_reg_note (insn, REG_NON_LOCAL_GOTO, NULL_RTX))
4172 5640549 : continue;
4173 :
4174 8651803 : rtx x = pc_set (insn);
4175 8651803 : if (!x || !REG_P (SET_SRC (x)))
4176 8650350 : continue;
4177 :
4178 1453 : int regno = REGNO (SET_SRC (x));
4179 1453 : if (DF_REG_DEF_COUNT (regno) == 1)
4180 : {
4181 1342 : df_ref def = DF_REG_DEF_CHAIN (regno);
4182 1342 : if (!DF_REF_IS_ARTIFICIAL (def))
4183 : {
4184 1342 : rtx_insn *def_insn = DF_REF_INSN (def);
4185 1342 : rtx lab = NULL_RTX;
4186 1342 : rtx set = single_set (def_insn);
4187 1342 : if (set && GET_CODE (SET_SRC (set)) == LABEL_REF)
4188 : lab = SET_SRC (set);
4189 : else
4190 : {
4191 1341 : rtx eqnote = find_reg_note (def_insn, REG_EQUAL, NULL_RTX);
4192 1341 : if (eqnote && GET_CODE (XEXP (eqnote, 0)) == LABEL_REF)
4193 : lab = XEXP (eqnote, 0);
4194 : }
4195 1 : if (lab && validate_replace_rtx (SET_SRC (x), lab, insn))
4196 : rebuild_p = true;
4197 : }
4198 : }
4199 : }
4200 :
4201 1488370 : if (rebuild_p)
4202 : {
4203 1 : timevar_push (TV_JUMP);
4204 1 : rebuild_jump_labels (get_insns ());
4205 1 : if (purge_all_dead_edges ())
4206 1 : delete_unreachable_blocks ();
4207 1 : timevar_pop (TV_JUMP);
4208 : }
4209 1488370 : }
4210 :
4211 : /* Set up fields memory, constant, and invariant from init_insns in
4212 : the structures of array ira_reg_equiv. */
4213 : static void
4214 1488370 : setup_reg_equiv (void)
4215 : {
4216 1488370 : int i;
4217 1488370 : rtx_insn_list *elem, *prev_elem, *next_elem;
4218 1488370 : rtx_insn *insn;
4219 1488370 : rtx set, x;
4220 :
4221 169146851 : for (i = FIRST_PSEUDO_REGISTER; i < ira_reg_equiv_len; i++)
4222 167658481 : for (prev_elem = NULL, elem = ira_reg_equiv[i].init_insns;
4223 172220535 : elem;
4224 : prev_elem = elem, elem = next_elem)
4225 : {
4226 4685480 : next_elem = elem->next ();
4227 4685480 : insn = elem->insn ();
4228 4685480 : set = single_set (insn);
4229 :
4230 : /* Init insns can set up equivalence when the reg is a destination or
4231 : a source (in this case the destination is memory). */
4232 4685480 : if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
4233 : {
4234 4685480 : if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL)
4235 : {
4236 4179849 : x = XEXP (x, 0);
4237 4179849 : if (REG_P (SET_DEST (set))
4238 4179849 : && REGNO (SET_DEST (set)) == (unsigned int) i
4239 8359698 : && ! rtx_equal_p (SET_SRC (set), x) && MEM_P (x))
4240 : {
4241 : /* This insn reports the equivalence but does
4242 : not actually set it. Remove it from the
4243 : list. */
4244 30401 : if (prev_elem == NULL)
4245 30401 : ira_reg_equiv[i].init_insns = next_elem;
4246 : else
4247 0 : XEXP (prev_elem, 1) = next_elem;
4248 : elem = prev_elem;
4249 : }
4250 : }
4251 505631 : else if (REG_P (SET_DEST (set))
4252 505631 : && REGNO (SET_DEST (set)) == (unsigned int) i)
4253 722 : x = SET_SRC (set);
4254 : else
4255 : {
4256 504909 : gcc_assert (REG_P (SET_SRC (set))
4257 : && REGNO (SET_SRC (set)) == (unsigned int) i);
4258 : x = SET_DEST (set);
4259 : }
4260 : /* If PIC is enabled and the equiv is not a LEGITIMATE_PIC_OPERAND,
4261 : we can't use it. */
4262 4685480 : if (! CONSTANT_P (x)
4263 983956 : || ! flag_pic
4264 : /* A function invariant is often CONSTANT_P but may
4265 : include a register. We promise to only pass
4266 : CONSTANT_P objects to LEGITIMATE_PIC_OPERAND_P. */
4267 4809073 : || LEGITIMATE_PIC_OPERAND_P (x))
4268 : {
4269 : /* It can happen that a REG_EQUIV note contains a MEM
4270 : that is not a legitimate memory operand. As later
4271 : stages of reload assume that all addresses found in
4272 : the lra_regno_equiv_* arrays were originally
4273 : legitimate, we ignore such REG_EQUIV notes. */
4274 4641083 : if (memory_operand (x, VOIDmode))
4275 : {
4276 2749070 : ira_reg_equiv[i].defined_p = !ira_reg_equiv[i].caller_save_p;
4277 2749070 : ira_reg_equiv[i].memory = x;
4278 2749070 : continue;
4279 : }
4280 1892013 : else if (function_invariant_p (x))
4281 : {
4282 1813491 : machine_mode mode;
4283 :
4284 1813491 : mode = GET_MODE (SET_DEST (set));
4285 1813491 : if (GET_CODE (x) == PLUS
4286 940745 : || x == frame_pointer_rtx || x == arg_pointer_rtx)
4287 : /* This is PLUS of frame pointer and a constant,
4288 : or fp, or argp. */
4289 873932 : ira_reg_equiv[i].invariant = x;
4290 939559 : else if (targetm.legitimate_constant_p (mode, x))
4291 697016 : ira_reg_equiv[i].constant = x;
4292 : else
4293 : {
4294 242543 : ira_reg_equiv[i].memory = force_const_mem (mode, x);
4295 242543 : if (ira_reg_equiv[i].memory == NULL_RTX)
4296 : {
4297 507 : ira_reg_equiv[i].defined_p = false;
4298 507 : ira_reg_equiv[i].caller_save_p = false;
4299 507 : ira_reg_equiv[i].init_insns = NULL;
4300 507 : break;
4301 : }
4302 : }
4303 1812984 : ira_reg_equiv[i].defined_p = true;
4304 1812984 : continue;
4305 1812984 : }
4306 : }
4307 : }
4308 122919 : ira_reg_equiv[i].defined_p = false;
4309 122919 : ira_reg_equiv[i].caller_save_p = false;
4310 122919 : ira_reg_equiv[i].init_insns = NULL;
4311 122919 : break;
4312 : }
4313 1488370 : }
4314 :
4315 :
4316 :
4317 : /* Print chain C to FILE. */
4318 : static void
4319 0 : print_insn_chain (FILE *file, class insn_chain *c)
4320 : {
4321 0 : fprintf (file, "insn=%d, ", INSN_UID (c->insn));
4322 0 : bitmap_print (file, &c->live_throughout, "live_throughout: ", ", ");
4323 0 : bitmap_print (file, &c->dead_or_set, "dead_or_set: ", "\n");
4324 0 : }
4325 :
4326 :
4327 : /* Print all reload_insn_chains to FILE. */
4328 : static void
4329 0 : print_insn_chains (FILE *file)
4330 : {
4331 0 : class insn_chain *c;
4332 0 : for (c = reload_insn_chain; c ; c = c->next)
4333 0 : print_insn_chain (file, c);
4334 0 : }
4335 :
4336 : /* Return true if pseudo REGNO should be added to set live_throughout
4337 : or dead_or_set of the insn chains for reload consideration. */
4338 : static bool
4339 0 : pseudo_for_reload_consideration_p (int regno)
4340 : {
4341 : /* Consider spilled pseudos too for IRA because they still have a
4342 : chance to get hard-registers in the reload when IRA is used. */
4343 0 : return (reg_renumber[regno] >= 0 || ira_conflicts_p);
4344 : }
4345 :
4346 : /* Return true if we can track the individual bytes of subreg X.
4347 : When returning true, set *OUTER_SIZE to the number of bytes in
4348 : X itself, *INNER_SIZE to the number of bytes in the inner register
4349 : and *START to the offset of the first byte. */
4350 : static bool
4351 0 : get_subreg_tracking_sizes (rtx x, HOST_WIDE_INT *outer_size,
4352 : HOST_WIDE_INT *inner_size, HOST_WIDE_INT *start)
4353 : {
4354 0 : rtx reg = regno_reg_rtx[REGNO (SUBREG_REG (x))];
4355 0 : return (GET_MODE_SIZE (GET_MODE (x)).is_constant (outer_size)
4356 0 : && GET_MODE_SIZE (GET_MODE (reg)).is_constant (inner_size)
4357 0 : && SUBREG_BYTE (x).is_constant (start));
4358 : }
4359 :
4360 : /* Init LIVE_SUBREGS[ALLOCNUM] and LIVE_SUBREGS_USED[ALLOCNUM] for
4361 : a register with SIZE bytes, making the register live if INIT_VALUE. */
4362 : static void
4363 0 : init_live_subregs (bool init_value, sbitmap *live_subregs,
4364 : bitmap live_subregs_used, int allocnum, int size)
4365 : {
4366 0 : gcc_assert (size > 0);
4367 :
4368 : /* Been there, done that. */
4369 0 : if (bitmap_bit_p (live_subregs_used, allocnum))
4370 : return;
4371 :
4372 : /* Create a new one. */
4373 0 : if (live_subregs[allocnum] == NULL)
4374 0 : live_subregs[allocnum] = sbitmap_alloc (size);
4375 :
4376 : /* If the entire reg was live before blasting into subregs, we need
4377 : to init all of the subregs to ones; otherwise init them to 0. */
4378 0 : if (init_value)
4379 0 : bitmap_ones (live_subregs[allocnum]);
4380 : else
4381 0 : bitmap_clear (live_subregs[allocnum]);
4382 :
4383 0 : bitmap_set_bit (live_subregs_used, allocnum);
4384 : }
4385 :
4386 : /* Walk the insns of the current function and build reload_insn_chain,
4387 : and record register life information. */
4388 : static void
4389 0 : build_insn_chain (void)
4390 : {
4391 0 : unsigned int i;
4392 0 : class insn_chain **p = &reload_insn_chain;
4393 0 : basic_block bb;
4394 0 : class insn_chain *c = NULL;
4395 0 : class insn_chain *next = NULL;
4396 0 : auto_bitmap live_relevant_regs;
4397 0 : auto_bitmap elim_regset;
4398 : /* live_subregs is a vector used to keep accurate information about
4399 : which hardregs are live in multiword pseudos. live_subregs and
4400 : live_subregs_used are indexed by pseudo number. The live_subregs
4401 : entry for a particular pseudo is only used if the corresponding
4402 : element is nonzero in live_subregs_used. The sbitmap size of
4403 : live_subregs[allocno] is the number of bytes that the pseudo can
4404 : occupy. */
4405 0 : sbitmap *live_subregs = XCNEWVEC (sbitmap, max_regno);
4406 0 : auto_bitmap live_subregs_used;
4407 :
4408 0 : hard_reg_set_iterator hrsi;
4409 0 : EXECUTE_IF_SET_IN_HARD_REG_SET (eliminable_regset, 0, i, hrsi)
4410 0 : bitmap_set_bit (elim_regset, i);
4411 0 : FOR_EACH_BB_REVERSE_FN (bb, cfun)
4412 : {
4413 0 : bitmap_iterator bi;
4414 0 : rtx_insn *insn;
4415 :
4416 0 : CLEAR_REG_SET (live_relevant_regs);
4417 0 : bitmap_clear (live_subregs_used);
4418 :
4419 0 : EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb), 0, i, bi)
4420 : {
4421 0 : if (i >= FIRST_PSEUDO_REGISTER)
4422 : break;
4423 0 : bitmap_set_bit (live_relevant_regs, i);
4424 : }
4425 :
4426 0 : EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb),
4427 : FIRST_PSEUDO_REGISTER, i, bi)
4428 : {
4429 0 : if (pseudo_for_reload_consideration_p (i))
4430 0 : bitmap_set_bit (live_relevant_regs, i);
4431 : }
4432 :
4433 0 : FOR_BB_INSNS_REVERSE (bb, insn)
4434 : {
4435 0 : if (!NOTE_P (insn) && !BARRIER_P (insn))
4436 : {
4437 0 : struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4438 0 : df_ref def, use;
4439 :
4440 0 : c = new_insn_chain ();
4441 0 : c->next = next;
4442 0 : next = c;
4443 0 : *p = c;
4444 0 : p = &c->prev;
4445 :
4446 0 : c->insn = insn;
4447 0 : c->block = bb->index;
4448 :
4449 0 : if (NONDEBUG_INSN_P (insn))
4450 0 : FOR_EACH_INSN_INFO_DEF (def, insn_info)
4451 : {
4452 0 : unsigned int regno = DF_REF_REGNO (def);
4453 :
4454 : /* Ignore may clobbers because these are generated
4455 : from calls. However, every other kind of def is
4456 : added to dead_or_set. */
4457 0 : if (!DF_REF_FLAGS_IS_SET (def, DF_REF_MAY_CLOBBER))
4458 : {
4459 0 : if (regno < FIRST_PSEUDO_REGISTER)
4460 : {
4461 0 : if (!fixed_regs[regno])
4462 0 : bitmap_set_bit (&c->dead_or_set, regno);
4463 : }
4464 0 : else if (pseudo_for_reload_consideration_p (regno))
4465 0 : bitmap_set_bit (&c->dead_or_set, regno);
4466 : }
4467 :
4468 0 : if ((regno < FIRST_PSEUDO_REGISTER
4469 0 : || reg_renumber[regno] >= 0
4470 0 : || ira_conflicts_p)
4471 0 : && (!DF_REF_FLAGS_IS_SET (def, DF_REF_CONDITIONAL)))
4472 : {
4473 0 : rtx reg = DF_REF_REG (def);
4474 0 : HOST_WIDE_INT outer_size, inner_size, start;
4475 :
4476 : /* We can usually track the liveness of individual
4477 : bytes within a subreg. The only exceptions are
4478 : subregs wrapped in ZERO_EXTRACTs and subregs whose
4479 : size is not known; in those cases we need to be
4480 : conservative and treat the definition as a partial
4481 : definition of the full register rather than a full
4482 : definition of a specific part of the register. */
4483 0 : if (GET_CODE (reg) == SUBREG
4484 0 : && !DF_REF_FLAGS_IS_SET (def, DF_REF_ZERO_EXTRACT)
4485 0 : && get_subreg_tracking_sizes (reg, &outer_size,
4486 : &inner_size, &start))
4487 : {
4488 0 : HOST_WIDE_INT last = start + outer_size;
4489 :
4490 0 : init_live_subregs
4491 0 : (bitmap_bit_p (live_relevant_regs, regno),
4492 : live_subregs, live_subregs_used, regno,
4493 : inner_size);
4494 :
4495 0 : if (!DF_REF_FLAGS_IS_SET
4496 : (def, DF_REF_STRICT_LOW_PART))
4497 : {
4498 : /* Expand the range to cover entire words.
4499 : Bytes added here are "don't care". */
4500 0 : start
4501 0 : = start / UNITS_PER_WORD * UNITS_PER_WORD;
4502 0 : last = ((last + UNITS_PER_WORD - 1)
4503 0 : / UNITS_PER_WORD * UNITS_PER_WORD);
4504 : }
4505 :
4506 : /* Ignore the paradoxical bits. */
4507 0 : if (last > SBITMAP_SIZE (live_subregs[regno]))
4508 : last = SBITMAP_SIZE (live_subregs[regno]);
4509 :
4510 0 : while (start < last)
4511 : {
4512 0 : bitmap_clear_bit (live_subregs[regno], start);
4513 0 : start++;
4514 : }
4515 :
4516 0 : if (bitmap_empty_p (live_subregs[regno]))
4517 : {
4518 0 : bitmap_clear_bit (live_subregs_used, regno);
4519 0 : bitmap_clear_bit (live_relevant_regs, regno);
4520 : }
4521 : else
4522 : /* Set live_relevant_regs here because
4523 : that bit has to be true to get us to
4524 : look at the live_subregs fields. */
4525 0 : bitmap_set_bit (live_relevant_regs, regno);
4526 : }
4527 : else
4528 : {
4529 : /* DF_REF_PARTIAL is generated for
4530 : subregs, STRICT_LOW_PART, and
4531 : ZERO_EXTRACT. We handle the subreg
4532 : case above so here we have to keep from
4533 : modeling the def as a killing def. */
4534 0 : if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL))
4535 : {
4536 0 : bitmap_clear_bit (live_subregs_used, regno);
4537 0 : bitmap_clear_bit (live_relevant_regs, regno);
4538 : }
4539 : }
4540 : }
4541 : }
4542 :
4543 0 : bitmap_and_compl_into (live_relevant_regs, elim_regset);
4544 0 : bitmap_copy (&c->live_throughout, live_relevant_regs);
4545 :
4546 0 : if (NONDEBUG_INSN_P (insn))
4547 0 : FOR_EACH_INSN_INFO_USE (use, insn_info)
4548 : {
4549 0 : unsigned int regno = DF_REF_REGNO (use);
4550 0 : rtx reg = DF_REF_REG (use);
4551 :
4552 : /* DF_REF_READ_WRITE on a use means that this use
4553 : is fabricated from a def that is a partial set
4554 : to a multiword reg. Here, we only model the
4555 : subreg case that is not wrapped in ZERO_EXTRACT
4556 : precisely so we do not need to look at the
4557 : fabricated use. */
4558 0 : if (DF_REF_FLAGS_IS_SET (use, DF_REF_READ_WRITE)
4559 0 : && !DF_REF_FLAGS_IS_SET (use, DF_REF_ZERO_EXTRACT)
4560 0 : && DF_REF_FLAGS_IS_SET (use, DF_REF_SUBREG))
4561 0 : continue;
4562 :
4563 : /* Add the last use of each var to dead_or_set. */
4564 0 : if (!bitmap_bit_p (live_relevant_regs, regno))
4565 : {
4566 0 : if (regno < FIRST_PSEUDO_REGISTER)
4567 : {
4568 0 : if (!fixed_regs[regno])
4569 0 : bitmap_set_bit (&c->dead_or_set, regno);
4570 : }
4571 0 : else if (pseudo_for_reload_consideration_p (regno))
4572 0 : bitmap_set_bit (&c->dead_or_set, regno);
4573 : }
4574 :
4575 0 : if (regno < FIRST_PSEUDO_REGISTER
4576 0 : || pseudo_for_reload_consideration_p (regno))
4577 : {
4578 0 : HOST_WIDE_INT outer_size, inner_size, start;
4579 0 : if (GET_CODE (reg) == SUBREG
4580 0 : && !DF_REF_FLAGS_IS_SET (use,
4581 : DF_REF_SIGN_EXTRACT
4582 : | DF_REF_ZERO_EXTRACT)
4583 0 : && get_subreg_tracking_sizes (reg, &outer_size,
4584 : &inner_size, &start))
4585 : {
4586 0 : HOST_WIDE_INT last = start + outer_size;
4587 :
4588 0 : init_live_subregs
4589 0 : (bitmap_bit_p (live_relevant_regs, regno),
4590 : live_subregs, live_subregs_used, regno,
4591 : inner_size);
4592 :
4593 : /* Ignore the paradoxical bits. */
4594 0 : if (last > SBITMAP_SIZE (live_subregs[regno]))
4595 : last = SBITMAP_SIZE (live_subregs[regno]);
4596 :
4597 0 : while (start < last)
4598 : {
4599 0 : bitmap_set_bit (live_subregs[regno], start);
4600 0 : start++;
4601 : }
4602 : }
4603 : else
4604 : /* Resetting the live_subregs_used is
4605 : effectively saying do not use the subregs
4606 : because we are reading the whole
4607 : pseudo. */
4608 0 : bitmap_clear_bit (live_subregs_used, regno);
4609 0 : bitmap_set_bit (live_relevant_regs, regno);
4610 : }
4611 : }
4612 : }
4613 : }
4614 :
4615 : /* FIXME!! The following code is a disaster. Reload needs to see the
4616 : labels and jump tables that are just hanging out in between
4617 : the basic blocks. See pr33676. */
4618 0 : insn = BB_HEAD (bb);
4619 :
4620 : /* Skip over the barriers and cruft. */
4621 0 : while (insn && (BARRIER_P (insn) || NOTE_P (insn)
4622 0 : || BLOCK_FOR_INSN (insn) == bb))
4623 0 : insn = PREV_INSN (insn);
4624 :
4625 : /* While we add anything except barriers and notes, the focus is
4626 : to get the labels and jump tables into the
4627 : reload_insn_chain. */
4628 0 : while (insn)
4629 : {
4630 0 : if (!NOTE_P (insn) && !BARRIER_P (insn))
4631 : {
4632 0 : if (BLOCK_FOR_INSN (insn))
4633 : break;
4634 :
4635 0 : c = new_insn_chain ();
4636 0 : c->next = next;
4637 0 : next = c;
4638 0 : *p = c;
4639 0 : p = &c->prev;
4640 :
4641 : /* The block makes no sense here, but it is what the old
4642 : code did. */
4643 0 : c->block = bb->index;
4644 0 : c->insn = insn;
4645 0 : bitmap_copy (&c->live_throughout, live_relevant_regs);
4646 : }
4647 0 : insn = PREV_INSN (insn);
4648 : }
4649 : }
4650 :
4651 0 : reload_insn_chain = c;
4652 0 : *p = NULL;
4653 :
4654 0 : for (i = 0; i < (unsigned int) max_regno; i++)
4655 0 : if (live_subregs[i] != NULL)
4656 0 : sbitmap_free (live_subregs[i]);
4657 0 : free (live_subregs);
4658 :
4659 0 : if (dump_file)
4660 0 : print_insn_chains (dump_file);
4661 0 : }
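
The byte-range handling for a subreg definition in build_insn_chain can be sketched in isolation. This is a minimal illustration, not GCC code: `toy_kill_subreg_def` and its parameters are hypothetical names, the per-byte liveness map is modelled with `std::vector<bool>` rather than an sbitmap, and `units_per_word` stands in for the target's UNITS_PER_WORD.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper for illustration; it mirrors the arithmetic only.
// LIVE holds one flag per byte of the inner register.
// Returns true if any byte of the register is still live afterwards.
bool toy_kill_subreg_def(std::vector<bool> &live, long start, long outer_size,
                         long units_per_word, bool strict_low_part) {
  long last = start + outer_size;
  if (!strict_low_part) {
    // Widen [start, last) to whole words; the extra bytes are "don't care".
    start = start / units_per_word * units_per_word;
    last = (last + units_per_word - 1) / units_per_word * units_per_word;
  }
  // Ignore paradoxical bytes beyond the end of the inner register.
  last = std::min(last, static_cast<long>(live.size()));
  for (long b = start; b < last; ++b)
    live[b] = false;
  return std::find(live.begin(), live.end(), true) != live.end();
}
```

When the helper returns false, the caller would presumably drop the register from the live set, which corresponds to the branch above that clears both live_subregs_used and live_relevant_regs.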
4662 :
4663 : /* Examine the rtx found in *LOC, which is read or written to as determined
4664 : by TYPE. Return false if we find a reason why an insn containing this
4665 : rtx should not be moved (such as accesses to non-constant memory), true
4666 : otherwise. */
4667 : static bool
4668 6694449 : rtx_moveable_p (rtx *loc, enum op_type type)
4669 : {
4670 6703488 : const char *fmt;
4671 6703488 : rtx x = *loc;
4672 6703488 : int i, j;
4673 :
4674 6703488 : enum rtx_code code = GET_CODE (x);
4675 6703488 : switch (code)
4676 : {
4677 : case CONST:
4678 : CASE_CONST_ANY:
4679 : case SYMBOL_REF:
4680 : case LABEL_REF:
4681 : return true;
4682 :
4683 0 : case PC:
4684 0 : return type == OP_IN;
4685 :
4686 2370950 : case REG:
4687 2370950 : if (x == frame_pointer_rtx)
4688 : return true;
4689 2369802 : if (HARD_REGISTER_P (x))
4690 : return false;
4691 :
4692 : return true;
4693 :
4694 622839 : case MEM:
4695 622839 : if (type == OP_IN && MEM_READONLY_P (x))
4696 8964 : return rtx_moveable_p (&XEXP (x, 0), OP_IN);
4697 : return false;
4698 :
4699 2065170 : case SET:
4700 2065170 : return (rtx_moveable_p (&SET_SRC (x), OP_IN)
4701 2065170 : && rtx_moveable_p (&SET_DEST (x), OP_OUT));
4702 :
4703 10 : case STRICT_LOW_PART:
4704 10 : return rtx_moveable_p (&XEXP (x, 0), OP_OUT);
4705 :
4706 487 : case ZERO_EXTRACT:
4707 487 : case SIGN_EXTRACT:
4708 487 : return (rtx_moveable_p (&XEXP (x, 0), type)
4709 487 : && rtx_moveable_p (&XEXP (x, 1), OP_IN)
4710 974 : && rtx_moveable_p (&XEXP (x, 2), OP_IN));
4711 :
4712 65 : case CLOBBER:
4713 65 : return rtx_moveable_p (&SET_DEST (x), OP_OUT);
4714 :
4715 : case UNSPEC_VOLATILE:
4716 : /* It is a bad idea to consider insns with such rtl
4717 : as moveable. The insn scheduler also treats them as barriers
4718 : for a reason. */
4719 : return false;
4720 :
4721 0 : case ASM_OPERANDS:
4722 : /* The same is true for volatile asm: it has unknown side effects, so it
4723 : cannot be moved at will. */
4724 0 : if (MEM_VOLATILE_P (x))
4725 : return false;
4726 :
4727 1098451 : default:
4728 1098451 : break;
4729 : }
4730 :
4731 1098451 : fmt = GET_RTX_FORMAT (code);
4732 2830309 : for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
4733 : {
4734 1901938 : if (fmt[i] == 'e')
4735 : {
4736 1498743 : if (!rtx_moveable_p (&XEXP (x, i), type))
4737 : return false;
4738 : }
4739 403195 : else if (fmt[i] == 'E')
4740 532646 : for (j = XVECLEN (x, i) - 1; j >= 0; j--)
4741 : {
4742 406301 : if (!rtx_moveable_p (&XVECEXP (x, i, j), type))
4743 : return false;
4744 : }
4745 : }
4746 : return true;
4747 : }
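
The recursive shape of rtx_moveable_p can be illustrated on a toy expression tree. The sketch below is a simplification under stated assumptions: `ToyExpr`, `ToyCode`, `ToyOp` and `toy_moveable` are hypothetical names, only a handful of codes are modelled, and the frame-pointer exception for hard registers is left out. It assumes C++17.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are GCC types.
enum class ToyCode { Const, PseudoReg, HardReg, Mem, Set, Plus };
enum class ToyOp { In, Out };

struct ToyExpr {
  ToyCode code;
  bool readonly = false;     // meaningful for Mem only
  std::vector<ToyExpr> ops;  // operands; for Set, ops[0] is dest, ops[1] is src
};

// True if nothing in X prevents the containing insn from being moved.
bool toy_moveable(const ToyExpr &x, ToyOp type) {
  switch (x.code) {
  case ToyCode::Const:
  case ToyCode::PseudoReg:
    return true;
  case ToyCode::HardReg:
    return false;
  case ToyCode::Mem:
    // Only loads from read-only memory may move, and only if the address may.
    if (type == ToyOp::In && x.readonly)
      return toy_moveable(x.ops[0], ToyOp::In);
    return false;
  case ToyCode::Set:
    return toy_moveable(x.ops[1], ToyOp::In)
           && toy_moveable(x.ops[0], ToyOp::Out);
  case ToyCode::Plus:
    break;  // generic case: every operand must be moveable
  }
  for (const ToyExpr &op : x.ops)
    if (!toy_moveable(op, type))
      return false;
  return true;
}
```

As in the function above, the operand type is passed down unchanged in the generic case, while SET fixes the source as an input and the destination as an output.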
4748 :
4749 : /* A wrapper around dominated_by_p, which uses the information in UID_LUID
4750 : to give dominance relationships between two insns I1 and I2. */
4751 : static bool
4752 20996961 : insn_dominated_by_p (rtx i1, rtx i2, int *uid_luid)
4753 : {
4754 20996961 : basic_block bb1 = BLOCK_FOR_INSN (i1);
4755 20996961 : basic_block bb2 = BLOCK_FOR_INSN (i2);
4756 :
4757 20996961 : if (bb1 == bb2)
4758 11102165 : return uid_luid[INSN_UID (i2)] < uid_luid[INSN_UID (i1)];
4759 9894796 : return dominated_by_p (CDI_DOMINATORS, bb1, bb2);
4760 : }
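
As a rough illustration of the check above, the following standalone sketch models the two cases: a program-order comparison inside one block, and a block-level dominance query otherwise. `ToyInsn`, `ToyCfg` and `toy_insn_dominated_by` are hypothetical stand-ins, not GCC APIs, and the walk up an immediate-dominator chain only approximates what dominated_by_p answers.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are GCC types.
struct ToyInsn { int uid; int block; };

struct ToyCfg {
  // idom[b] is the immediate dominator of block b; the entry maps to itself.
  std::vector<int> idom;

  // True if block A is dominated by block B (a block dominates itself).
  bool block_dominated_by(int a, int b) const {
    for (;;) {
      if (a == b) return true;
      if (idom[a] == a) return false;  // reached the entry block
      a = idom[a];
    }
  }
};

// True if I1 is dominated by I2. Within one block this compares the linear
// positions recorded in UID_LUID; otherwise it is a question about blocks.
bool toy_insn_dominated_by(const ToyInsn &i1, const ToyInsn &i2,
                           const std::vector<int> &uid_luid,
                           const ToyCfg &cfg) {
  if (i1.block == i2.block)
    return uid_luid[i2.uid] < uid_luid[i1.uid];
  return cfg.block_dominated_by(i1.block, i2.block);
}
```

Note that the same-block comparison is strict, so an insn is not reported as dominated by itself.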
4761 :
4762 : /* Record the range of register numbers added by find_moveable_pseudos. */
4763 : int first_moveable_pseudo, last_moveable_pseudo;
4764 :
4765 : /* These two vectors hold data for every register added by
4766 : find_moveable_pseudos, with index 0 holding data for the
4767 : first_moveable_pseudo. */
4768 : /* The original home register. */
4769 : static vec<rtx> pseudo_replaced_reg;
4770 :
4771 : /* Look for instances where we have an instruction that is known to increase
4772 : register pressure, and whose result is not used immediately. If it is
4773 : possible to move the instruction downwards to just before its first use,
4774 : split its lifetime into two ranges. We create a new pseudo to compute the
4775 : value, and emit a move instruction just before the first use. If, after
4776 : register allocation, the new pseudo remains unallocated, the function
4777 : move_unallocated_pseudos then deletes the move instruction and places
4778 : the computation just before the first use.
4779 :
4780 : Such a move is safe and profitable if all the input registers remain live
4781 : and unchanged between the original computation and its first use. In such
4782 : a situation, the computation is known to increase register pressure, and
4783 : moving it is known to at least not worsen it.
4784 :
4785 : We restrict moves to only those cases where a register remains unallocated,
4786 : in order to avoid interfering too much with the instruction schedule. As
4787 : an exception, we may move insns which only modify their input register
4788 : (typically induction variables), as this increases the freedom for our
4789 : intended transformation, and does not limit the second instruction
4790 : scheduler pass. */
4791 :
4792 : static void
4793 1041770 : find_moveable_pseudos (void)
4794 : {
4795 1041770 : unsigned i;
4796 1041770 : int max_regs = max_reg_num ();
4797 1041770 : int max_uid = get_max_uid ();
4798 1041770 : basic_block bb;
4799 1041770 : int *uid_luid = XNEWVEC (int, max_uid);
4800 1041770 : rtx_insn **closest_uses = XNEWVEC (rtx_insn *, max_regs);
4801 : /* A set of registers which are live but not modified throughout a block. */
4802 1041770 : bitmap_head *bb_transp_live = XNEWVEC (bitmap_head,
4803 : last_basic_block_for_fn (cfun));
4804 : /* A set of registers which only exist in a given basic block. */
4805 1041770 : bitmap_head *bb_local = XNEWVEC (bitmap_head,
4806 : last_basic_block_for_fn (cfun));
4807 : /* A set of registers which are set once, in an instruction that can be
4808 : moved freely downwards, but are otherwise transparent to a block. */
4809 1041770 : bitmap_head *bb_moveable_reg_sets = XNEWVEC (bitmap_head,
4810 : last_basic_block_for_fn (cfun));
4811 1041770 : auto_bitmap live, used, set, interesting, unusable_as_input;
4812 1041770 : bitmap_iterator bi;
4813 :
4814 1041770 : first_moveable_pseudo = max_regs;
4815 1041770 : pseudo_replaced_reg.release ();
4816 1041770 : pseudo_replaced_reg.safe_grow_cleared (max_regs, true);
4817 :
4818 1041770 : df_analyze ();
4819 1041770 : calculate_dominance_info (CDI_DOMINATORS);
4820 :
4821 1041770 : i = 0;
4822 11862724 : FOR_EACH_BB_FN (bb, cfun)
4823 : {
4824 10820954 : rtx_insn *insn;
4825 10820954 : bitmap transp = bb_transp_live + bb->index;
4826 10820954 : bitmap moveable = bb_moveable_reg_sets + bb->index;
4827 10820954 : bitmap local = bb_local + bb->index;
4828 :
4829 10820954 : bitmap_initialize (local, 0);
4830 10820954 : bitmap_initialize (transp, 0);
4831 10820954 : bitmap_initialize (moveable, 0);
4832 10820954 : bitmap_copy (live, df_get_live_out (bb));
4833 10820954 : bitmap_and_into (live, df_get_live_in (bb));
4834 10820954 : bitmap_copy (transp, live);
4835 10820954 : bitmap_clear (moveable);
4836 10820954 : bitmap_clear (live);
4837 10820954 : bitmap_clear (used);
4838 10820954 : bitmap_clear (set);
4839 137090898 : FOR_BB_INSNS (bb, insn)
4840 126269944 : if (NONDEBUG_INSN_P (insn))
4841 : {
4842 57908455 : df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4843 57908455 : df_ref def, use;
4844 :
4845 57908455 : uid_luid[INSN_UID (insn)] = i++;
4846 :
4847 57908455 : def = df_single_def (insn_info);
4848 57908455 : use = df_single_use (insn_info);
4849 57908455 : if (use
4850 57908455 : && def
4851 19019090 : && DF_REF_REGNO (use) == DF_REF_REGNO (def)
4852 622698 : && !bitmap_bit_p (set, DF_REF_REGNO (use))
4853 57990772 : && rtx_moveable_p (&PATTERN (insn), OP_IN))
4854 : {
4855 34576 : unsigned regno = DF_REF_REGNO (use);
4856 34576 : bitmap_set_bit (moveable, regno);
4857 34576 : bitmap_set_bit (set, regno);
4858 34576 : bitmap_set_bit (used, regno);
4859 34576 : bitmap_clear_bit (transp, regno);
4860 34576 : continue;
4861 34576 : }
4862 129820454 : FOR_EACH_INSN_INFO_USE (use, insn_info)
4863 : {
4864 71946575 : unsigned regno = DF_REF_REGNO (use);
4865 71946575 : bitmap_set_bit (used, regno);
4866 71946575 : if (bitmap_clear_bit (moveable, regno))
4867 15892 : bitmap_clear_bit (transp, regno);
4868 : }
4869 :
4870 480403844 : FOR_EACH_INSN_INFO_DEF (def, insn_info)
4871 : {
4872 422529965 : unsigned regno = DF_REF_REGNO (def);
4873 422529965 : bitmap_set_bit (set, regno);
4874 422529965 : bitmap_clear_bit (transp, regno);
4875 422529965 : bitmap_clear_bit (moveable, regno);
4876 : }
4877 : }
4878 : }
4879 :
4880 11862724 : FOR_EACH_BB_FN (bb, cfun)
4881 : {
4882 10820954 : bitmap local = bb_local + bb->index;
4883 10820954 : rtx_insn *insn;
4884 :
4885 137090898 : FOR_BB_INSNS (bb, insn)
4886 126269944 : if (NONDEBUG_INSN_P (insn))
4887 : {
4888 57908455 : df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4889 57908455 : rtx_insn *def_insn;
4890 57908455 : rtx closest_use, note;
4891 57908455 : df_ref def, use;
4892 57908455 : unsigned regno;
4893 57908455 : bool all_dominated, all_local;
4894 57908455 : machine_mode mode;
4895 :
4896 57908455 : def = df_single_def (insn_info);
4897 : /* There must be exactly one def in this insn. */
4898 31599285 : if (!def || !single_set (insn))
4899 26393811 : continue;
4900 : /* This must be the only definition of the reg. We also limit
4901 : which modes we deal with so that we can assume we can generate
4902 : move instructions. */
4903 31514644 : regno = DF_REF_REGNO (def);
4904 31514644 : mode = GET_MODE (DF_REF_REG (def));
4905 31514644 : if (DF_REG_DEF_COUNT (regno) != 1
4906 12056438 : || !DF_REF_INSN_INFO (def)
4907 12056438 : || HARD_REGISTER_NUM_P (regno)
4908 12023895 : || DF_REG_EQ_USE_COUNT (regno) > 0
4909 11482643 : || (!INTEGRAL_MODE_P (mode)
4910 : && !FLOAT_MODE_P (mode)
4911 : && !OPAQUE_MODE_P (mode)))
4912 20032001 : continue;
4913 11482643 : def_insn = DF_REF_INSN (def);
4914 :
4915 19436747 : for (note = REG_NOTES (def_insn); note; note = XEXP (note, 1))
4916 10524012 : if (REG_NOTE_KIND (note) == REG_EQUIV && MEM_P (XEXP (note, 0)))
4917 : break;
4918 :
4919 11482643 : if (note)
4920 : {
4921 2569908 : if (dump_file)
4922 68 : fprintf (dump_file, "Ignoring reg %d, has equiv memory\n",
4923 : regno);
4924 2569908 : bitmap_set_bit (unusable_as_input, regno);
4925 2569908 : continue;
4926 : }
4927 :
4928 8912735 : use = DF_REG_USE_CHAIN (regno);
4929 8912735 : all_dominated = true;
4930 8912735 : all_local = true;
4931 8912735 : closest_use = NULL_RTX;
4932 26265158 : for (; use; use = DF_REF_NEXT_REG (use))
4933 : {
4934 17352423 : rtx_insn *insn;
4935 17352423 : if (!DF_REF_INSN_INFO (use))
4936 : {
4937 : all_dominated = false;
4938 : all_local = false;
4939 : break;
4940 : }
4941 17352423 : insn = DF_REF_INSN (use);
4942 17352423 : if (DEBUG_INSN_P (insn))
4943 2174651 : continue;
4944 15177772 : if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (def_insn))
4945 6027382 : all_local = false;
4946 15177772 : if (!insn_dominated_by_p (insn, def_insn, uid_luid))
4947 9618 : all_dominated = false;
4948 15177772 : if (closest_use != insn && closest_use != const0_rtx)
4949 : {
4950 13358611 : if (closest_use == NULL_RTX)
4951 : closest_use = insn;
4952 4509019 : else if (insn_dominated_by_p (closest_use, insn, uid_luid))
4953 : closest_use = insn;
4954 1310170 : else if (!insn_dominated_by_p (insn, closest_use, uid_luid))
4955 618113 : closest_use = const0_rtx;
4956 : }
4957 : }
4958 8912735 : if (!all_dominated)
4959 : {
4960 4874 : if (dump_file)
4961 0 : fprintf (dump_file, "Reg %d not all uses dominated by set\n",
4962 : regno);
4963 4874 : continue;
4964 : }
4965 8907861 : if (all_local)
4966 6355410 : bitmap_set_bit (local, regno);
4967 8291424 : if (closest_use == const0_rtx || closest_use == NULL
4968 17136142 : || next_nonnote_nondebug_insn (def_insn) == closest_use)
4969 : {
4970 5601121 : if (dump_file)
4971 99 : fprintf (dump_file, "Reg %d uninteresting%s\n", regno,
4972 99 : closest_use == const0_rtx || closest_use == NULL
4973 : ? " (no unique first use)" : "");
4974 5601121 : continue;
4975 : }
4976 :
4977 3306740 : bitmap_set_bit (interesting, regno);
4978 : /* If we get here, we know closest_use is a non-NULL insn
4979 : (as opposed to const0_rtx). */
4980 3306740 : closest_uses[regno] = as_a <rtx_insn *> (closest_use);
4981 :
4982 3306740 : if (dump_file && (all_local || all_dominated))
4983 : {
4984 78 : fprintf (dump_file, "Reg %u:", regno);
4985 78 : if (all_local)
4986 14 : fprintf (dump_file, " local to bb %d", bb->index);
4987 78 : if (all_dominated)
4988 78 : fprintf (dump_file, " def dominates all uses");
4989 78 : if (closest_use != const0_rtx)
4990 78 : fprintf (dump_file, " has unique first use");
4991 78 : fputs ("\n", dump_file);
4992 : }
4993 : }
4994 : }
4995 :
4996 4348510 : EXECUTE_IF_SET_IN_BITMAP (interesting, 0, i, bi)
4997 : {
4998 3306740 : df_ref def = DF_REG_DEF_CHAIN (i);
4999 3306740 : rtx_insn *def_insn = DF_REF_INSN (def);
5000 3306740 : basic_block def_block = BLOCK_FOR_INSN (def_insn);
5001 3306740 : bitmap def_bb_local = bb_local + def_block->index;
5002 3306740 : bitmap def_bb_moveable = bb_moveable_reg_sets + def_block->index;
5003 3306740 : bitmap def_bb_transp = bb_transp_live + def_block->index;
5004 3306740 : bool local_to_bb_p = bitmap_bit_p (def_bb_local, i);
5005 3306740 : rtx_insn *use_insn = closest_uses[i];
5006 3306740 : df_ref use;
5007 3306740 : bool all_ok = true;
5008 3306740 : bool all_transp = true;
5009 :
5010 3306740 : if (!REG_P (DF_REF_REG (def)))
5011 49905 : continue;
5012 :
5013 3256835 : if (!local_to_bb_p)
5014 : {
5015 1227861 : if (dump_file)
5016 64 : fprintf (dump_file, "Reg %u not local to one basic block\n",
5017 : i);
5018 1227861 : continue;
5019 : }
5020 2028974 : if (reg_equiv_init (i) != NULL_RTX)
5021 : {
5022 46092 : if (dump_file)
5023 0 : fprintf (dump_file, "Ignoring reg %u with equiv init insn\n",
5024 : i);
5025 46092 : continue;
5026 : }
5027 1982882 : if (!rtx_moveable_p (&PATTERN (def_insn), OP_IN))
5028 : {
5029 1374530 : if (dump_file)
5030 14 : fprintf (dump_file, "Found def insn %d for %d to be not moveable\n",
5031 14 : INSN_UID (def_insn), i);
5032 1374530 : continue;
5033 : }
5034 608352 : if (dump_file)
5035 0 : fprintf (dump_file, "Examining insn %d, def for %d\n",
5036 0 : INSN_UID (def_insn), i);
5037 1366250 : FOR_EACH_INSN_USE (use, def_insn)
5038 : {
5039 816669 : unsigned regno = DF_REF_REGNO (use);
5040 816669 : if (bitmap_bit_p (unusable_as_input, regno))
5041 : {
5042 58771 : all_ok = false;
5043 58771 : if (dump_file)
5044 0 : fprintf (dump_file, " found unusable input reg %u.\n", regno);
5045 : break;
5046 : }
5047 757898 : if (!bitmap_bit_p (def_bb_transp, regno))
5048 : {
5049 707321 : if (bitmap_bit_p (def_bb_moveable, regno)
5050 707321 : && !control_flow_insn_p (use_insn))
5051 : {
5052 35 : if (modified_between_p (DF_REF_REG (use), def_insn, use_insn))
5053 : {
5054 0 : rtx_insn *x = NEXT_INSN (def_insn);
5055 0 : while (!modified_in_p (DF_REF_REG (use), x))
5056 : {
5057 0 : gcc_assert (x != use_insn);
5058 0 : x = NEXT_INSN (x);
5059 : }
5060 0 : if (dump_file)
5061 0 : fprintf (dump_file, " input reg %u modified but insn %d moveable\n",
5062 0 : regno, INSN_UID (x));
5063 0 : emit_insn_after (PATTERN (x), use_insn);
5064 0 : set_insn_deleted (x);
5065 : }
5066 : else
5067 : {
5068 35 : if (dump_file)
5069 0 : fprintf (dump_file, " input reg %u modified between def and use\n",
5070 : regno);
5071 : all_transp = false;
5072 : }
5073 : }
5074 : else
5075 : all_transp = false;
5076 : }
5077 : }
5078 0 : if (!all_ok)
5079 58771 : continue;
5080 549581 : if (!dbg_cnt (ira_move))
5081 : break;
5082 549581 : if (dump_file)
5083 0 : fprintf (dump_file, " all ok%s\n", all_transp ? " and transp" : "");
5084 :
5085 549581 : if (all_transp)
5086 : {
5087 13858 : rtx def_reg = DF_REF_REG (def);
5088 13858 : rtx newreg = ira_create_new_reg (def_reg);
5089 13858 : if (validate_change (def_insn, DF_REF_REAL_LOC (def), newreg, 0))
5090 : {
5091 13858 : unsigned nregno = REGNO (newreg);
5092 13858 : emit_insn_before (gen_move_insn (def_reg, newreg), use_insn);
5093 13858 : nregno -= max_regs;
5094 13858 : pseudo_replaced_reg[nregno] = def_reg;
5095 : }
5096 : }
5097 : }
5098 :
5099 11862724 : FOR_EACH_BB_FN (bb, cfun)
5100 : {
5101 10820954 : bitmap_clear (bb_local + bb->index);
5102 10820954 : bitmap_clear (bb_transp_live + bb->index);
5103 10820954 : bitmap_clear (bb_moveable_reg_sets + bb->index);
5104 : }
5105 1041770 : free (uid_luid);
5106 1041770 : free (closest_uses);
5107 1041770 : free (bb_local);
5108 1041770 : free (bb_transp_live);
5109 1041770 : free (bb_moveable_reg_sets);
5110 :
5111 1041770 : last_moveable_pseudo = max_reg_num ();
5112 :
5113 1041770 : fix_reg_equiv_init ();
5114 1041770 : expand_reg_info ();
5115 1041770 : regstat_free_n_sets_and_refs ();
5116 1041770 : regstat_free_ri ();
5117 1041770 : regstat_init_n_sets_and_refs ();
5118 1041770 : regstat_compute_ri ();
5119 1041770 : free_dominance_info (CDI_DOMINATORS);
5120 1041770 : }
5121 :
5122 : /* If SET pattern SET is an assignment from a hard register to a pseudo which
5123 : is live at CALL_DOM (if non-NULL, otherwise this check is omitted), return
5124 : the destination. Otherwise return NULL. */
5125 :
5126 : static rtx
5127 2095706 : interesting_dest_for_shprep_1 (rtx set, basic_block call_dom)
5128 : {
5129 2095706 : rtx src = SET_SRC (set);
5130 2095706 : rtx dest = SET_DEST (set);
5131 704655 : if (!REG_P (src) || !HARD_REGISTER_P (src)
5132 550453 : || !REG_P (dest) || HARD_REGISTER_P (dest)
5133 2625137 : || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))
5134 1655326 : return NULL;
5135 : return dest;
5136 : }
5137 :
5138 : /* If insn is interesting for parameter range-splitting shrink-wrapping
5139 : preparation, i.e. it is a single set from a hard register to a pseudo, which
5140 : is live at CALL_DOM (if non-NULL, otherwise this check is omitted), or a
5141 : parallel statement with only one such statement, return the destination.
5142 : Otherwise return NULL. */
5143 :
5144 : static rtx
5145 4285301 : interesting_dest_for_shprep (rtx_insn *insn, basic_block call_dom)
5146 : {
5147 4285301 : if (!INSN_P (insn))
5148 : return NULL;
5149 3418595 : rtx pat = PATTERN (insn);
5150 3418595 : if (GET_CODE (pat) == SET)
5151 1862466 : return interesting_dest_for_shprep_1 (pat, call_dom);
5152 :
5153 1556129 : if (GET_CODE (pat) != PARALLEL)
5154 : return NULL;
5155 : rtx ret = NULL;
5156 609082 : for (int i = 0; i < XVECLEN (pat, 0); i++)
5157 : {
5158 411567 : rtx sub = XVECEXP (pat, 0, i);
5159 411567 : if (GET_CODE (sub) == USE || GET_CODE (sub) == CLOBBER)
5160 171009 : continue;
5161 240558 : if (GET_CODE (sub) != SET
5162 240558 : || side_effects_p (sub))
5163 7318 : return NULL;
5164 233240 : rtx dest = interesting_dest_for_shprep_1 (sub, call_dom);
5165 233240 : if (dest && ret)
5166 : return NULL;
5167 233240 : if (dest)
5168 404249 : ret = dest;
5169 : }
5170 : return ret;
5171 : }
5172 :
5173 : /* Split live ranges of pseudos that are loaded from hard registers in the
5174 : first BB in a BB that dominates all non-sibling calls if such a BB can be
5175 : found and is not in a loop. Return true if the function has made any
5176 : changes. */
5177 :
5178 : static bool
5179 1041770 : split_live_ranges_for_shrink_wrap (void)
5180 : {
5181 1041770 : basic_block bb, call_dom = NULL;
5182 1041770 : basic_block first = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
5183 1041770 : rtx_insn *insn, *last_interesting_insn = NULL;
5184 1041770 : auto_bitmap need_new, reachable;
5185 1041770 : vec<basic_block> queue;
5186 :
5187 1041770 : if (!SHRINK_WRAPPING_ENABLED)
5188 246 : return false;
5189 :
5190 1041524 : queue.create (n_basic_blocks_for_fn (cfun));
5191 :
5192 7380502 : FOR_EACH_BB_FN (bb, cfun)
5193 73341427 : FOR_BB_INSNS (bb, insn)
5194 68340892 : if (CALL_P (insn) && !SIBLING_CALL_P (insn))
5195 : {
5196 1746871 : if (bb == first)
5197 : {
5198 408428 : queue.release ();
5199 408428 : return false;
5200 : }
5201 :
5202 1338443 : bitmap_set_bit (need_new, bb->index);
5203 1338443 : bitmap_set_bit (reachable, bb->index);
5204 1338443 : queue.quick_push (bb);
5205 1338443 : break;
5206 : }
5207 :
5208 633096 : if (queue.is_empty ())
5209 : {
5210 379889 : queue.release ();
5211 379889 : return false;
5212 : }
5213 :
5214 4094027 : while (!queue.is_empty ())
5215 : {
5216 3840820 : edge e;
5217 3840820 : edge_iterator ei;
5218 :
5219 3840820 : bb = queue.pop ();
5220 9215568 : FOR_EACH_EDGE (e, ei, bb->succs)
5221 5374748 : if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
5222 5374748 : && bitmap_set_bit (reachable, e->dest->index))
5223 2502377 : queue.quick_push (e->dest);
5224 : }
5225 253207 : queue.release ();
5226 :
5227 3847256 : FOR_BB_INSNS (first, insn)
5228 : {
5229 3594323 : rtx dest = interesting_dest_for_shprep (insn, NULL);
5230 3594323 : if (!dest)
5231 3208150 : continue;
5232 :
5233 386173 : if (DF_REG_DEF_COUNT (REGNO (dest)) > 1)
5234 : return false;
5235 :
5236 385899 : for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
5237 2757894 : use;
5238 2371995 : use = DF_REF_NEXT_REG (use))
5239 : {
5240 2371995 : int ubbi = DF_REF_BB (use)->index;
5241 :
5242 : /* Only non-debug insns should be taken into account. */
5243 2371995 : if (NONDEBUG_INSN_P (DF_REF_INSN (use))
5244 2371995 : && bitmap_bit_p (reachable, ubbi))
5245 1076913 : bitmap_set_bit (need_new, ubbi);
5246 : }
5247 : last_interesting_insn = insn;
5248 : }
5249 :
5250 252933 : if (!last_interesting_insn)
5251 : return false;
5252 :
5253 180752 : call_dom = nearest_common_dominator_for_set (CDI_DOMINATORS, need_new);
5254 180752 : if (call_dom == first)
5255 : return false;
5256 :
5257 95346 : loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
5258 213479 : while (bb_loop_depth (call_dom) > 0)
5259 22787 : call_dom = get_immediate_dominator (CDI_DOMINATORS, call_dom);
5260 95346 : loop_optimizer_finalize ();
5261 :
5262 95346 : if (call_dom == first)
5263 : return false;
5264 :
5265 83539 : calculate_dominance_info (CDI_POST_DOMINATORS);
5266 83539 : if (dominated_by_p (CDI_POST_DOMINATORS, first, call_dom))
5267 : {
5268 7812 : free_dominance_info (CDI_POST_DOMINATORS);
5269 7812 : return false;
5270 : }
5271 75727 : free_dominance_info (CDI_POST_DOMINATORS);
5272 :
5273 75727 : if (dump_file)
5274 2 : fprintf (dump_file, "Will split live ranges of parameters at BB %i\n",
5275 : call_dom->index);
5276 :
5277 75727 : bool ret = false;
5278 742948 : FOR_BB_INSNS (first, insn)
5279 : {
5280 690978 : rtx dest = interesting_dest_for_shprep (insn, call_dom);
5281 690978 : if (!dest || dest == pic_offset_table_rtx)
5282 636771 : continue;
5283 :
5284 54207 : bool need_newreg = false;
5285 54207 : df_ref use, next;
5286 67651 : for (use = DF_REG_USE_CHAIN (REGNO (dest)); use; use = next)
5287 : {
5288 67581 : rtx_insn *uin = DF_REF_INSN (use);
5289 67581 : next = DF_REF_NEXT_REG (use);
5290 :
5291 67581 : if (DEBUG_INSN_P (uin))
5292 376 : continue;
5293 :
5294 67205 : basic_block ubb = BLOCK_FOR_INSN (uin);
5295 67205 : if (ubb == call_dom
5296 67205 : || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
5297 : {
5298 : need_newreg = true;
5299 : break;
5300 : }
5301 : }
5302 :
5303 54207 : if (need_newreg)
5304 : {
5305 54137 : rtx newreg = ira_create_new_reg (dest);
5306 :
5307 425221 : for (use = DF_REG_USE_CHAIN (REGNO (dest)); use; use = next)
5308 : {
5309 371084 : rtx_insn *uin = DF_REF_INSN (use);
5310 371084 : next = DF_REF_NEXT_REG (use);
5311 :
5312 371084 : basic_block ubb = BLOCK_FOR_INSN (uin);
5313 371084 : if (ubb == call_dom
5314 371084 : || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
5315 271716 : validate_change (uin, DF_REF_REAL_LOC (use), newreg, true);
5316 : }
5317 :
5318 54137 : rtx_insn *new_move = gen_move_insn (newreg, dest);
5319 54137 : emit_insn_after (new_move, bb_note (call_dom));
5320 54137 : if (dump_file)
5321 : {
5322 2 : fprintf (dump_file, "Split live-range of register ");
5323 2 : print_rtl_single (dump_file, dest);
5324 : }
5325 : ret = true;
5326 : }
5327 :
5328 54207 : if (insn == last_interesting_insn)
5329 : break;
5330 : }
5331 75727 : apply_change_group ();
5332 75727 : return ret;
5333 1041770 : }
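
The worklist loop in split_live_ranges_for_shrink_wrap above computes the set of blocks reachable from the blocks that contain a non-sibling call. The following is a minimal standalone sketch of that idea, not the GCC implementation: `Graph` and `reachable_from` are hypothetical names, and a plain `std::vector<bool>` stands in for the bitmap. Marking a block before pushing it plays the role of the "bit changed" result of bitmap_set_bit, which the listing uses to avoid queuing a block twice.

```cpp
#include <vector>

// Hypothetical stand-in for a CFG: succs[b] lists the successors of block b.
using Graph = std::vector<std::vector<int>>;

// Return one flag per block telling whether it is reachable from any seed
// block.  The seeds themselves count as reachable.
std::vector<bool>
reachable_from (const Graph &succs, const std::vector<int> &seeds)
{
  std::vector<bool> reachable (succs.size (), false);
  std::vector<int> worklist;

  // Seed the worklist, skipping duplicates.
  for (int s : seeds)
    if (!reachable[s])
      {
        reachable[s] = true;
        worklist.push_back (s);
      }

  // Pop a block, then mark and queue every successor not seen before.
  while (!worklist.empty ())
    {
      int b = worklist.back ();
      worklist.pop_back ();
      for (int d : succs[b])
        if (!reachable[d])
          {
            reachable[d] = true;
            worklist.push_back (d);
          }
    }
  return reachable;
}
```

Because each block is marked at most once, the loop pushes each block at most once, so a queue sized to the number of blocks (as in the listing) cannot overflow.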
5334 :
5335 : /* Perform the second half of the transformation started in
5336 : find_moveable_pseudos. We look for instances where the newly introduced
5337 : pseudo remains unallocated, and remove it by moving the definition to
5338 : just before its use, replacing the move instruction generated by
5339 : find_moveable_pseudos. */
5340 : static void
5341 1041770 : move_unallocated_pseudos (void)
5342 : {
5343 1041770 : int i;
5344 1055628 : for (i = first_moveable_pseudo; i < last_moveable_pseudo; i++)
5345 13858 : if (reg_renumber[i] < 0)
5346 : {
5347 3455 : int idx = i - first_moveable_pseudo;
5348 3455 : rtx other_reg = pseudo_replaced_reg[idx];
5349 : /* The iteration range [first_moveable_pseudo, last_moveable_pseudo)
5350 : covers every new pseudo created in find_moveable_pseudos,
5351 : regardless of whether the validation using it succeeded.
5352 : So we need to skip the pseudos which were used in failed
5353 : validations to avoid unexpected DF info and a consequent ICE.
5354 : Since pseudo_replaced_reg[] is set only when the validation
5355 : succeeds in find_moveable_pseudos, checking it here is enough. */
5356 3455 : if (!other_reg)
5357 0 : continue;
5358 3455 : rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (i));
5359 : /* The use must follow all definitions of OTHER_REG, so we can
5360 : insert the new definition immediately after any of them. */
5361 3455 : df_ref other_def = DF_REG_DEF_CHAIN (REGNO (other_reg));
5362 3455 : rtx_insn *move_insn = DF_REF_INSN (other_def);
5363 3455 : rtx_insn *newinsn = emit_insn_after (PATTERN (def_insn), move_insn);
5364 3455 : rtx set;
5365 3455 : int success;
5366 :
5367 3455 : if (dump_file)
5368 0 : fprintf (dump_file, "moving def of %d (insn %d now) ",
5369 0 : REGNO (other_reg), INSN_UID (def_insn));
5370 :
5371 3455 : delete_insn (move_insn);
5372 6910 : while ((other_def = DF_REG_DEF_CHAIN (REGNO (other_reg))))
5373 0 : delete_insn (DF_REF_INSN (other_def));
5374 3455 : delete_insn (def_insn);
5375 :
5376 3455 : set = single_set (newinsn);
5377 3455 : success = validate_change (newinsn, &SET_DEST (set), other_reg, 0);
5378 3455 : gcc_assert (success);
5379 3455 : if (dump_file)
5380 0 : fprintf (dump_file, " %d) rather than keep unallocated replacement %d\n",
5381 0 : INSN_UID (newinsn), i);
5382 3455 : SET_REG_N_REFS (i, 0);
5383 : }
5384 :
5385 1041770 : first_moveable_pseudo = last_moveable_pseudo = 0;
5386 1041770 : }
5387 :
5388 :
5389 :
5390 : /* Code dealing with scratches (changing them into
5391 : pseudos and restoring them from the pseudos).
5392 :
5393 : We change scratches into pseudos at the beginning of IRA to
5394 : simplify dealing with them (conflicts, hard register assignments).
5395 :
5396 : If the pseudo denoting scratch was spilled it means that we do not
5397 : need a hard register for it. Such pseudos are transformed back to
5398 : scratches at the end of LRA. */
5399 :
5400 : /* Description of location of a former scratch operand. */
5401 : struct sloc
5402 : {
5403 : rtx_insn *insn; /* Insn where the scratch was. */
5404 : int nop; /* Number of the operand which was a scratch. */
5405 : unsigned regno; /* regno generated instead of scratch */
5406 : int icode; /* Original icode from which scratch was removed. */
5407 : };
5408 :
5409 : typedef struct sloc *sloc_t;
5410 :
5411 : /* Locations of the former scratches. */
5412 : static vec<sloc_t> scratches;
5413 :
5414 : /* Bitmap of scratch regnos. */
5415 : static bitmap_head scratch_bitmap;
5416 :
5417 : /* Bitmap of scratch operands. */
5418 : static bitmap_head scratch_operand_bitmap;
5419 :
5420 : /* Return true if pseudo REGNO is made of SCRATCH. */
5421 : bool
5422 369441226 : ira_former_scratch_p (int regno)
5423 : {
5424 369441226 : return bitmap_bit_p (&scratch_bitmap, regno);
5425 : }
5426 :
5427 : /* Return true if the operand NOP of INSN is a former scratch. */
5428 : bool
5429 0 : ira_former_scratch_operand_p (rtx_insn *insn, int nop)
5430 : {
5431 0 : return bitmap_bit_p (&scratch_operand_bitmap,
5432 0 : INSN_UID (insn) * MAX_RECOG_OPERANDS + nop) != 0;
5433 : }
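
ira_former_scratch_operand_p above identifies an (insn, operand) pair by the single bitmap index INSN_UID (insn) * MAX_RECOG_OPERANDS + nop. The sketch below illustrates that packing in isolation. It is an assumption-laden illustration: `kMaxOperands`, `operand_key`, `key_uid` and `key_nop` are hypothetical names, and the value 30 is chosen only for the example, since the real MAX_RECOG_OPERANDS depends on the target.

```cpp
// Assumed operand limit, for illustration only; the real
// MAX_RECOG_OPERANDS is target-dependent.
constexpr unsigned kMaxOperands = 30;

// Pack an insn UID and an operand number into one index.  Distinct pairs
// map to distinct indices as long as nop < kMaxOperands.
constexpr unsigned
operand_key (unsigned insn_uid, unsigned nop)
{
  return insn_uid * kMaxOperands + nop;
}

// Recover the two components from a packed index.
constexpr unsigned key_uid (unsigned key) { return key / kMaxOperands; }
constexpr unsigned key_nop (unsigned key) { return key % kMaxOperands; }
```

The packing lets one sparse bitmap answer "is operand nop of this insn a former scratch?" without a separate per-insn structure.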
5434 :
5435 : /* Register operand NOP in INSN as a former scratch. It will be
5436 : changed back to a scratch, if necessary, at the end of LRA. */
5437 : void
5438 90793 : ira_register_new_scratch_op (rtx_insn *insn, int nop, int icode)
5439 : {
5440 90793 : rtx op = *recog_data.operand_loc[nop];
5441 90793 : sloc_t loc = XNEW (struct sloc);
5442 90793 : ira_assert (REG_P (op));
5443 90793 : loc->insn = insn;
5444 90793 : loc->nop = nop;
5445 90793 : loc->regno = REGNO (op);
5446 90793 : loc->icode = icode;
5447 90793 : scratches.safe_push (loc);
5448 90793 : bitmap_set_bit (&scratch_bitmap, REGNO (op));
5449 181586 : bitmap_set_bit (&scratch_operand_bitmap,
5450 90793 : INSN_UID (insn) * MAX_RECOG_OPERANDS + nop);
5451 90793 : add_reg_note (insn, REG_UNUSED, op);
5452 90793 : }
5453 :
5454 : /* Return true if string STR contains constraint 'X'. */
5455 : static bool
5456 90793 : contains_X_constraint_p (const char *str)
5457 : {
5458 90793 : int c;
5459 :
5460 375750 : while ((c = *str))
5461 : {
5462 293997 : str += CONSTRAINT_LEN (c, str);
5463 293997 : if (c == 'X') return true;
5464 : }
5465 : return false;
5466 : }
5467 :
5468 : /* Change INSN's scratches into pseudos and save their location.
5469 : Return true if we changed any scratch. */
5470 : bool
5471 270716890 : ira_remove_insn_scratches (rtx_insn *insn, bool all_p, FILE *dump_file,
5472 : rtx (*get_reg) (rtx original))
5473 : {
5474 270716890 : int i;
5475 270716890 : bool insn_changed_p;
5476 270716890 : rtx reg, *loc;
5477 :
5478 270716890 : extract_insn (insn);
5479 270716890 : insn_changed_p = false;
5480 916741589 : for (i = 0; i < recog_data.n_operands; i++)
5481 : {
5482 375307809 : loc = recog_data.operand_loc[i];
5483 375307809 : if (GET_CODE (*loc) == SCRATCH && GET_MODE (*loc) != VOIDmode)
5484 : {
5485 99833 : if (! all_p && contains_X_constraint_p (recog_data.constraints[i]))
5486 9040 : continue;
5487 90793 : insn_changed_p = true;
5488 90793 : *loc = reg = get_reg (*loc);
5489 90793 : ira_register_new_scratch_op (insn, i, INSN_CODE (insn));
5490 90793 : if (dump_file != NULL)
5491 0 : fprintf (dump_file,
5492 : "Removing SCRATCH to p%u in insn #%u (nop %d)\n",
5493 0 : REGNO (reg), INSN_UID (insn), i);
5494 : }
5495 : }
5496 270716890 : return insn_changed_p;
5497 : }
5498 :
5499 : /* Return new register of the same mode as ORIGINAL. Used in
5500 : remove_scratches. */
5501 : static rtx
5502 81753 : get_scratch_reg (rtx original)
5503 : {
5504 81753 : return gen_reg_rtx (GET_MODE (original));
5505 : }
5506 :
5507 : /* Change scratches into pseudos and save their location. Return true
5508 : if we changed any scratch. */
5509 : static bool
5510 1488370 : remove_scratches (void)
5511 : {
5512 1488370 : bool change_p = false;
5513 1488370 : basic_block bb;
5514 1488370 : rtx_insn *insn;
5515 :
5516 1488370 : scratches.create (get_max_uid ());
5517 1488370 : bitmap_initialize (&scratch_bitmap, &reg_obstack);
5518 1488370 : bitmap_initialize (&scratch_operand_bitmap, &reg_obstack);
5519 15780720 : FOR_EACH_BB_FN (bb, cfun)
5520 171114574 : FOR_BB_INSNS (bb, insn)
5521 156822224 : if (INSN_P (insn)
5522 156822224 : && ira_remove_insn_scratches (insn, false, ira_dump_file, get_scratch_reg))
5523 : {
5524 : /* Because we might use DF, we need to keep DF info up to date. */
5525 80587 : df_insn_rescan (insn);
5526 80587 : change_p = true;
5527 : }
5528 1488370 : return change_p;
5529 : }
5530 :
5531 : /* Change pseudos created by remove_scratches back into scratches. */
5532 : void
5533 1488370 : ira_restore_scratches (FILE *dump_file)
5534 : {
5535 1488370 : int regno, n;
5536 1488370 : unsigned i;
5537 1488370 : rtx *op_loc;
5538 1488370 : sloc_t loc;
5539 :
5540 1579163 : for (i = 0; scratches.iterate (i, &loc); i++)
5541 : {
5542 : /* Ignore already deleted insns. */
5543 90793 : if (NOTE_P (loc->insn)
5544 0 : && NOTE_KIND (loc->insn) == NOTE_INSN_DELETED)
5545 0 : continue;
5546 90793 : extract_insn (loc->insn);
5547 90793 : if (loc->icode != INSN_CODE (loc->insn))
5548 : {
5549 : /* The icode doesn't match, which means the insn has been modified
5550 : (e.g. register elimination). The scratch cannot be restored. */
5551 0 : continue;
5552 : }
5553 90793 : op_loc = recog_data.operand_loc[loc->nop];
5554 90793 : if (REG_P (*op_loc)
5555 90793 : && ((regno = REGNO (*op_loc)) >= FIRST_PSEUDO_REGISTER)
5556 181586 : && reg_renumber[regno] < 0)
5557 : {
5558 : /* This should be the only case: a scratch register with chosen
5559 : constraint 'X' got neither memory nor a hard register. */
5560 5352 : ira_assert (ira_former_scratch_p (regno));
5561 5352 : *op_loc = gen_rtx_SCRATCH (GET_MODE (*op_loc));
5562 5352 : for (n = 0; n < recog_data.n_dups; n++)
5563 0 : *recog_data.dup_loc[n]
5564 0 : = *recog_data.operand_loc[(int) recog_data.dup_num[n]];
5565 5352 : if (dump_file != NULL)
5566 0 : fprintf (dump_file, "Restoring SCRATCH in insn #%u(nop %d)\n",
5567 0 : INSN_UID (loc->insn), loc->nop);
5568 : }
5569 : }
5570 1579163 : for (i = 0; scratches.iterate (i, &loc); i++)
5571 90793 : free (loc);
5572 1488370 : scratches.release ();
5573 1488370 : bitmap_clear (&scratch_bitmap);
5574 1488370 : bitmap_clear (&scratch_operand_bitmap);
5575 1488370 : }
5576 :
5577 :
5578 :
5579 : /* If the backend knows where to allocate pseudos for hard
5580 : register initial values, register these allocations now. */
5581 : static void
5582 1488370 : allocate_initial_values (void)
5583 : {
5584 1488370 : if (targetm.allocate_initial_value)
5585 : {
5586 : rtx hreg, preg, x;
5587 : int i, regno;
5588 :
5589 0 : for (i = 0; HARD_REGISTER_NUM_P (i); i++)
5590 : {
5591 0 : if (! initial_value_entry (i, &hreg, &preg))
5592 : break;
5593 :
5594 0 : x = targetm.allocate_initial_value (hreg);
5595 0 : regno = REGNO (preg);
5596 0 : if (x && REG_N_SETS (regno) <= 1)
5597 : {
5598 0 : if (MEM_P (x))
5599 0 : reg_equiv_memory_loc (regno) = x;
5600 : else
5601 : {
5602 0 : basic_block bb;
5603 0 : int new_regno;
5604 :
5605 0 : gcc_assert (REG_P (x));
5606 0 : new_regno = REGNO (x);
5607 0 : reg_renumber[regno] = new_regno;
5608 : /* Poke the regno right into regno_reg_rtx so that even
5609 : fixed regs are accepted. */
5610 0 : SET_REGNO (preg, new_regno);
5611 : /* Update global register liveness information. */
5612 0 : FOR_EACH_BB_FN (bb, cfun)
5613 : {
5614 0 : if (REGNO_REG_SET_P (df_get_live_in (bb), regno))
5615 0 : SET_REGNO_REG_SET (df_get_live_in (bb), new_regno);
5616 0 : if (REGNO_REG_SET_P (df_get_live_out (bb), regno))
5617 0 : SET_REGNO_REG_SET (df_get_live_out (bb), new_regno);
5618 : }
5619 : }
5620 : }
5621 : }
5622 :
5623 0 : gcc_checking_assert (! initial_value_entry (FIRST_PSEUDO_REGISTER,
5624 : &hreg, &preg));
5625 : }
5626 1488370 : }
5627 :
5628 :
5629 :
5630 :
5631 : /* True when we use LRA instead of reload pass for the current
5632 : function. */
5633 : bool ira_use_lra_p;
5634 :
5635 : /* True if we have allocno conflicts. It is false for non-optimized
5636 : mode or when the conflict table is too big. */
5637 : bool ira_conflicts_p;
5638 :
5639 : /* Saved between IRA and reload. */
5640 : static int saved_flag_ira_share_spill_slots;
5641 :
5642 : /* Set to true while in IRA. */
5643 : bool ira_in_progress = false;
5644 :
5645 : /* Set up array ira_hard_regno_nrefs. */
5646 : static void
5647 1488370 : setup_hard_regno_nrefs (void)
5648 : {
5649 1488370 : int i;
5650 :
5651 138418410 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
5652 : {
5653 136930040 : ira_hard_regno_nrefs[i] = 0;
5654 136930040 : for (df_ref use = DF_REG_USE_CHAIN (i);
5655 252439443 : use != NULL;
5656 115509403 : use = DF_REF_NEXT_REG (use))
5657 115509403 : if (DF_REF_CLASS (use) != DF_REF_ARTIFICIAL
5658 53049991 : && !(DF_REF_INSN_INFO (use) && DEBUG_INSN_P (DF_REF_INSN (use))))
5659 49968543 : ira_hard_regno_nrefs[i]++;
5660 136930040 : for (df_ref def = DF_REG_DEF_CHAIN (i);
5661 686422076 : def != NULL;
5662 549492036 : def = DF_REF_NEXT_REG (def))
5663 549492036 : if (DF_REF_CLASS (def) != DF_REF_ARTIFICIAL
5664 521855574 : && !(DF_REF_INSN_INFO (def) && DEBUG_INSN_P (DF_REF_INSN (def))))
5665 521855574 : ira_hard_regno_nrefs[i]++;
5666 : }
5667 1488370 : }
5668 :
5669 : /* This is the main entry of IRA. */
5670 : static void
5671 1488370 : ira (FILE *f)
5672 : {
5673 1488370 : bool loops_p;
5674 1488370 : int ira_max_point_before_emit;
5675 1488370 : bool saved_flag_caller_saves = flag_caller_saves;
5676 1488370 : enum ira_region saved_flag_ira_region = flag_ira_region;
5677 1488370 : basic_block bb;
5678 1488370 : edge_iterator ei;
5679 1488370 : edge e;
5680 1488370 : bool output_jump_reload_p = false;
5681 :
5682 1488370 : setup_hard_regno_nrefs ();
5683 1488370 : if (ira_use_lra_p)
5684 : {
5685 : /* First put potential jump output reloads on the output edges
5686 : as USE which will be removed at the end of LRA. The major
5687 : goal is actually to create BBs for critical edges for LRA and
5688 : populate them later by live info. In LRA it will be
5689 : difficult to do this. */
5690 15780711 : FOR_EACH_BB_FN (bb, cfun)
5691 : {
5692 14292341 : rtx_insn *end = BB_END (bb);
5693 14292341 : if (!JUMP_P (end))
5694 5639660 : continue;
5695 8652681 : extract_insn (end);
5696 23477818 : for (int i = 0; i < recog_data.n_operands; i++)
5697 14825288 : if (recog_data.operand_type[i] != OP_IN)
5698 : {
5699 151 : bool skip_p = false;
5700 509 : FOR_EACH_EDGE (e, ei, bb->succs)
5701 727 : if (EDGE_CRITICAL_P (e)
5702 11 : && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
5703 369 : && (e->flags & EDGE_ABNORMAL))
5704 : {
5705 : skip_p = true;
5706 : break;
5707 : }
5708 151 : if (skip_p)
5709 : break;
5710 151 : output_jump_reload_p = true;
5711 509 : FOR_EACH_EDGE (e, ei, bb->succs)
5712 727 : if (EDGE_CRITICAL_P (e)
5713 369 : && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
5714 : {
5715 11 : start_sequence ();
5716 : /* We need to put some no-op insn here. We cannot
5717 : put a note, as commit_edge_insertions will
5718 : fail. */
5719 11 : emit_insn (gen_rtx_USE (VOIDmode, const1_rtx));
5720 11 : rtx_insn *insns = end_sequence ();
5721 11 : insert_insn_on_edge (insns, e);
5722 : }
5723 : break;
5724 : }
5725 : }
5726 1488370 : if (output_jump_reload_p)
5727 145 : commit_edge_insertions ();
5728 : }
5729 :
5730 1488370 : if (flag_ira_verbose < 10)
5731 : {
5732 1488370 : internal_flag_ira_verbose = flag_ira_verbose;
5733 1488370 : ira_dump_file = f;
5734 : }
5735 : else
5736 : {
5737 0 : internal_flag_ira_verbose = flag_ira_verbose - 10;
5738 0 : ira_dump_file = stderr;
5739 : }
5740 :
5741 1488370 : clear_bb_flags ();
5742 :
5743 : /* Determine if the current function is a leaf before running IRA
5744 : since this can impact optimizations done by the prologue and
5745 : epilogue thus changing register elimination offsets.
5746 : Other target callbacks may use crtl->is_leaf too, including
5747 : SHRINK_WRAPPING_ENABLED, so initialize as early as possible. */
5748 1488370 : crtl->is_leaf = leaf_function_p ();
5749 :
5750 : /* Perform target specific PIC register initialization. */
5751 1488370 : targetm.init_pic_reg ();
5752 :
5753 1488370 : ira_conflicts_p = optimize > 0;
5754 :
5755 : /* Determine the number of pseudos actually requiring coloring. */
5756 1488370 : unsigned int num_used_regs = 0;
5757 66864839 : for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
5758 65376469 : if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
5759 29907816 : num_used_regs++;
5760 :
5761 : /* If there are too many pseudos and/or basic blocks (e.g. 10K pseudos and
5762 : 10K blocks or 100K pseudos and 1K blocks) or we have too many function
5763 : insns, we will use simplified and faster algorithms in LRA. */
5764 1488370 : lra_simple_p
5765 1488370 : = (ira_use_lra_p
5766 1488370 : && (num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun)
5767 : /* max uid is a good estimate of the number of insns as most
5768 : optimizations are done on tree-SSA level. */
5769 1488365 : || ((uint64_t) get_max_uid ()
5770 1488365 : > (uint64_t) param_ira_simple_lra_insn_threshold * 1000)));
5771 :
5772 1488370 : if (lra_simple_p)
5773 : {
5774 : /* This permits skipping live range splitting in LRA. */
5775 5 : flag_caller_saves = false;
5776 : /* There is no sense in doing regional allocation when we use
5777 : simplified LRA. */
5778 5 : flag_ira_region = IRA_REGION_ONE;
5779 5 : ira_conflicts_p = false;
5780 : }
5781 :
5782 : #ifndef IRA_NO_OBSTACK
5783 : gcc_obstack_init (&ira_obstack);
5784 : #endif
5785 1488370 : bitmap_obstack_initialize (&ira_bitmap_obstack);
5786 :
5787 : /* LRA uses its own infrastructure to handle caller save registers. */
5788 1488370 : if (flag_caller_saves && !ira_use_lra_p)
5789 0 : init_caller_save ();
5790 :
5791 1488370 : setup_prohibited_mode_move_regs ();
5792 1488370 : decrease_live_ranges_number ();
5793 1488370 : df_note_add_problem ();
5794 :
5795 : /* DF_LIVE can't be used in the register allocator, too many other
5796 : parts of the compiler depend on using the "classic" liveness
5797 : interpretation of the DF_LR problem. See PR38711.
5798 : Remove the problem, so that we don't spend time updating it in
5799 : any of the df_analyze() calls during IRA/LRA. */
5800 1488370 : if (optimize > 1)
5801 961301 : df_remove_problem (df_live);
5802 1488370 : gcc_checking_assert (df_live == NULL);
5803 :
5804 1488370 : if (flag_checking)
5805 1488350 : df->changeable_flags |= DF_VERIFY_SCHEDULED;
5806 :
5807 1488370 : df_analyze ();
5808 :
5809 1488370 : init_reg_equiv ();
5810 1488370 : if (ira_conflicts_p)
5811 : {
5812 1041770 : calculate_dominance_info (CDI_DOMINATORS);
5813 :
5814 1041770 : if (split_live_ranges_for_shrink_wrap ())
5815 27078 : df_analyze ();
5816 :
5817 1041770 : free_dominance_info (CDI_DOMINATORS);
5818 : }
5819 :
5820 1488370 : df_clear_flags (DF_NO_INSN_RESCAN);
5821 :
5822 1488370 : indirect_jump_optimize ();
5823 1488370 : if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
5824 5391 : df_analyze ();
5825 :
5826 1488370 : regstat_init_n_sets_and_refs ();
5827 1488370 : regstat_compute_ri ();
5828 :
5829 : /* If we are not optimizing, then this is the only place before
5830 : register allocation where dataflow is done. And that is needed
5831 : to generate these warnings. */
5832 1488370 : if (warn_clobbered)
5833 135501 : generate_setjmp_warnings ();
5834 :
5835 : /* update_equiv_regs can use reg classes of pseudos and they are set up in
5836 : register pressure sensitive scheduling and loop invariant motion and in
5837 : live range shrinking. This info can become obsolete if we add new pseudos
5838 : since the last set up. Recalculate it again if the new pseudos were
5839 : added. */
5840 1488370 : if (resize_reg_info () && (flag_sched_pressure || flag_live_range_shrinkage
5841 1488275 : || flag_ira_loop_pressure))
5842 43 : ira_set_pseudo_classes (true, ira_dump_file);
5843 :
5844 1488370 : init_alias_analysis ();
5845 1488370 : loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
5846 1488370 : reg_equiv = XCNEWVEC (struct equivalence, max_reg_num ());
5847 1488370 : update_equiv_regs_prescan ();
5848 1488370 : update_equiv_regs ();
5849 :
5850 : /* Don't move insns if live range shrinkage or register
5851 : pressure-sensitive scheduling were done because it will not
5852 : improve allocation but likely worsen insn scheduling. */
5853 1488370 : if (optimize
5854 1041770 : && !flag_live_range_shrinkage
5855 1041739 : && !(flag_sched_pressure && flag_schedule_insns))
5856 1041717 : combine_and_move_insns ();
5857 :
5858 : /* Gather additional equivalences with memory. */
5859 1488370 : if (optimize && flag_expensive_optimizations)
5860 961263 : add_store_equivs ();
5861 :
5862 1488370 : loop_optimizer_finalize ();
5863 1488370 : free_dominance_info (CDI_DOMINATORS);
5864 1488370 : end_alias_analysis ();
5865 1488370 : free (reg_equiv);
5866 :
5867 : /* Once max_regno changes, we need to free and re-init/re-compute
5868 : some data structures like regstat_n_sets_and_refs and reg_info_p. */
5869 1561144 : auto regstat_recompute_for_max_regno = []() {
5870 72774 : regstat_free_n_sets_and_refs ();
5871 72774 : regstat_free_ri ();
5872 72774 : regstat_init_n_sets_and_refs ();
5873 72774 : regstat_compute_ri ();
5874 72774 : resize_reg_info ();
5875 72774 : };
5876 :
5877 1488370 : int max_regno_before_rm = max_reg_num ();
5878 1488370 : if (ira_use_lra_p && remove_scratches ())
5879 : {
5880 37581 : ira_expand_reg_equiv ();
5881 : /* For now remove_scratches is supposed to create pseudos when it
5882 : succeeds; assert that this always happens. Once it no longer
5883 : holds, we should guard the regstat recompute on max_regno
5884 : actually changing. */
5885 37581 : gcc_assert (max_regno_before_rm != max_reg_num ());
5886 37581 : regstat_recompute_for_max_regno ();
5887 : }
5888 :
5889 1488370 : setup_reg_equiv ();
5890 1488370 : grow_reg_equivs ();
5891 1488370 : setup_reg_equiv_init ();
5892 :
5893 1488370 : allocated_reg_info_size = max_reg_num ();
5894 :
5895 : /* It is not worth doing such an improvement when we use simple
5896 : allocation, either because of -O0 or because the function is too
5897 : big. */
5898 1488370 : if (ira_conflicts_p)
5899 1041770 : find_moveable_pseudos ();
5900 :
5901 1488370 : max_regno_before_ira = max_reg_num ();
5902 1488370 : ira_setup_eliminable_regset ();
5903 :
5904 1488370 : ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
5905 1488370 : ira_load_cost = ira_store_cost = ira_shuffle_cost = 0;
5906 1488370 : ira_move_loops_num = ira_additional_jumps_num = 0;
5907 :
5908 1488370 : ira_assert (current_loops == NULL);
5909 1488370 : if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == IRA_REGION_MIXED)
5910 995570 : loop_optimizer_init (AVOID_CFG_MODIFICATIONS | LOOPS_HAVE_RECORDED_EXITS);
5911 :
5912 1488370 : if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
5913 95 : fprintf (ira_dump_file, "Building IRA IR\n");
5914 1488370 : loops_p = ira_build ();
5915 :
5916 1488370 : ira_assert (ira_conflicts_p || !loops_p);
5917 :
5918 1488370 : saved_flag_ira_share_spill_slots = flag_ira_share_spill_slots;
5919 1488370 : if (too_high_register_pressure_p () || cfun->calls_setjmp)
5920 : /* Packing spilled pseudos into stack slots just wastes compiler
5921 : time in this case, so prohibit it. We also do this if there is
5922 : a setjmp call, because the compiler is required to preserve the
5923 : value of a variable not modified between setjmp and longjmp, and
5924 : sharing slots does not guarantee that. */
5925 1367 : flag_ira_share_spill_slots = false;
5926 :
5927 1488370 : ira_color ();
5928 :
5929 1488370 : ira_max_point_before_emit = ira_max_point;
5930 :
5931 1488370 : ira_initiate_emit_data ();
5932 :
5933 1488370 : ira_emit (loops_p);
5934 :
5935 1488370 : max_regno = max_reg_num ();
5936 1488370 : if (ira_conflicts_p)
5937 : {
5938 1041770 : if (! loops_p)
5939 : {
5940 1006158 : if (! ira_use_lra_p)
5941 0 : ira_initiate_assign ();
5942 : }
5943 : else
5944 : {
5945 35612 : expand_reg_info ();
5946 :
5947 35612 : if (ira_use_lra_p)
5948 : {
5949 35612 : ira_allocno_t a;
5950 35612 : ira_allocno_iterator ai;
5951 :
5952 11667465 : FOR_EACH_ALLOCNO (a, ai)
5953 : {
5954 11596241 : int old_regno = ALLOCNO_REGNO (a);
5955 11596241 : int new_regno = REGNO (ALLOCNO_EMIT_DATA (a)->reg);
5956 :
5957 11596241 : ALLOCNO_REGNO (a) = new_regno;
5958 :
5959 11596241 : if (old_regno != new_regno)
5960 1282857 : setup_reg_classes (new_regno, reg_preferred_class (old_regno),
5961 : reg_alternate_class (old_regno),
5962 : reg_allocno_class (old_regno));
5963 : }
5964 : }
5965 : else
5966 : {
5967 0 : if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
5968 0 : fprintf (ira_dump_file, "Flattening IR\n");
5969 0 : ira_flattening (max_regno_before_ira, ira_max_point_before_emit);
5970 : }
5971 : /* New insns were generated: add notes and recalculate live
5972 : info. */
5973 35612 : df_analyze ();
5974 :
5975 : /* ??? Rebuild the loop tree, but why? Does the loop tree
5976 : change if new insns were generated? Can that be handled
5977 : by updating the loop tree incrementally? */
5978 35612 : loop_optimizer_finalize ();
5979 35612 : free_dominance_info (CDI_DOMINATORS);
5980 35612 : loop_optimizer_init (AVOID_CFG_MODIFICATIONS
5981 : | LOOPS_HAVE_RECORDED_EXITS);
5982 :
5983 35612 : if (! ira_use_lra_p)
5984 : {
5985 0 : setup_allocno_assignment_flags ();
5986 0 : ira_initiate_assign ();
5987 0 : ira_reassign_conflict_allocnos (max_regno);
5988 : }
5989 : }
5990 : }
5991 :
5992 1488370 : ira_finish_emit_data ();
5993 :
5994 1488370 : setup_reg_renumber ();
5995 :
5996 1488370 : calculate_allocation_cost ();
5997 :
5998 : #ifdef ENABLE_IRA_CHECKING
5999 1488370 : if (ira_conflicts_p && ! ira_use_lra_p)
6000 : /* Unlike the reload pass, LRA does not use any conflict info
6001 : from IRA. We don't rebuild conflict info for LRA (through an
6002 : ira_flattening call) and cannot use the check here. We could
6003 : rebuild this info for LRA in the check mode but there is a risk
6004 : that code generated with the check and without it will be a bit
6005 : different. Calling ira_flattening in any mode would waste
6006 : CPU time. So do not check the allocation for LRA. */
6007 0 : check_allocation ();
6008 : #endif
6009 :
6010 1488370 : if (max_regno != max_regno_before_ira)
6011 35193 : regstat_recompute_for_max_regno ();
6012 :
6013 1488370 : overall_cost_before = ira_overall_cost;
6014 1488370 : if (! ira_conflicts_p)
6015 446600 : grow_reg_equivs ();
6016 : else
6017 : {
6018 1041770 : fix_reg_equiv_init ();
6019 :
6020 : #ifdef ENABLE_IRA_CHECKING
6021 1041770 : print_redundant_copies ();
6022 : #endif
6023 1041770 : if (! ira_use_lra_p)
6024 : {
6025 0 : ira_spilled_reg_stack_slots_num = 0;
6026 0 : ira_spilled_reg_stack_slots
6027 0 : = ((class ira_spilled_reg_stack_slot *)
6028 0 : ira_allocate (max_regno
6029 : * sizeof (class ira_spilled_reg_stack_slot)));
6030 0 : memset ((void *)ira_spilled_reg_stack_slots, 0,
6031 0 : max_regno * sizeof (class ira_spilled_reg_stack_slot));
6032 : }
6033 : }
6034 1488370 : allocate_initial_values ();
6035 :
6036 : /* See comment for find_moveable_pseudos call. */
6037 1488370 : if (ira_conflicts_p)
6038 1041770 : move_unallocated_pseudos ();
6039 :
6040 : /* Restore original values. */
6041 1488370 : if (lra_simple_p)
6042 : {
6043 5 : flag_caller_saves = saved_flag_caller_saves;
6044 5 : flag_ira_region = saved_flag_ira_region;
6045 : }
6046 1488370 : }
6047 :
6048 : /* Modify asm goto to avoid further trouble with this insn. We
6049 : cannot replace the insn with a USE as for other asm insns because we
6050 : still need to keep the CFG consistent. */
6051 : void
6052 6 : ira_nullify_asm_goto (rtx_insn *insn)
6053 : {
6054 6 : ira_assert (JUMP_P (insn) && INSN_CODE (insn) < 0);
6055 6 : rtx tmp = extract_asm_operands (PATTERN (insn));
6056 6 : PATTERN (insn) = gen_rtx_ASM_OPERANDS (VOIDmode, ggc_strdup (""), "", 0,
6057 : rtvec_alloc (0),
6058 : rtvec_alloc (0),
6059 : ASM_OPERANDS_LABEL_VEC (tmp),
6060 : ASM_OPERANDS_SOURCE_LOCATION (tmp));
6061 6 : }
6062 :
6063 : static void
6064 1488370 : do_reload (void)
6065 : {
6066 1488370 : basic_block bb;
6067 1488370 : bool need_dce;
6068 1488370 : unsigned pic_offset_table_regno = INVALID_REGNUM;
6069 :
6070 1488370 : if (flag_ira_verbose < 10)
6071 1488370 : ira_dump_file = dump_file;
6072 :
6073 : /* If pic_offset_table_rtx is a pseudo register, then keep it so
6074 : after reload to avoid possible wrong usages of hard reg assigned
6075 : to it. */
6076 1488370 : if (pic_offset_table_rtx
6077 1488370 : && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER)
6078 : pic_offset_table_regno = REGNO (pic_offset_table_rtx);
6079 :
6080 1488370 : timevar_push (TV_RELOAD);
6081 1488370 : if (ira_use_lra_p)
6082 : {
6083 1488370 : if (current_loops != NULL)
6084 : {
6085 995570 : loop_optimizer_finalize ();
6086 995570 : free_dominance_info (CDI_DOMINATORS);
6087 : }
6088 18861751 : FOR_ALL_BB_FN (bb, cfun)
6089 17373381 : bb->loop_father = NULL;
6090 1488370 : current_loops = NULL;
6091 :
6092 1488370 : ira_destroy ();
6093 :
6094 1488370 : lra (ira_dump_file, internal_flag_ira_verbose);
6095 : /* ???!!! Move it before lra () when we use ira_reg_equiv in
6096 : LRA. */
6097 1488370 : vec_free (reg_equivs);
6098 1488370 : reg_equivs = NULL;
6099 1488370 : need_dce = false;
6100 : }
6101 : else
6102 : {
6103 0 : df_set_flags (DF_NO_INSN_RESCAN);
6104 0 : build_insn_chain ();
6105 :
6106 0 : need_dce = reload (get_insns (), ira_conflicts_p);
6107 : }
6108 :
6109 1488370 : timevar_pop (TV_RELOAD);
6110 :
6111 1488370 : timevar_push (TV_IRA);
6112 :
6113 1488370 : if (ira_conflicts_p && ! ira_use_lra_p)
6114 : {
6115 0 : ira_free (ira_spilled_reg_stack_slots);
6116 0 : ira_finish_assign ();
6117 : }
6118 :
6119 1488370 : if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL
6120 96 : && overall_cost_before != ira_overall_cost)
6121 0 : fprintf (ira_dump_file, "+++Overall after reload %" PRId64 "\n",
6122 : ira_overall_cost);
6123 :
6124 1488370 : flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots;
6125 :
6126 1488370 : if (! ira_use_lra_p)
6127 : {
6128 0 : ira_destroy ();
6129 0 : if (current_loops != NULL)
6130 : {
6131 0 : loop_optimizer_finalize ();
6132 0 : free_dominance_info (CDI_DOMINATORS);
6133 : }
6134 0 : FOR_ALL_BB_FN (bb, cfun)
6135 0 : bb->loop_father = NULL;
6136 0 : current_loops = NULL;
6137 :
6138 0 : regstat_free_ri ();
6139 0 : regstat_free_n_sets_and_refs ();
6140 : }
6141 :
6142 1488370 : if (optimize)
6143 1041770 : cleanup_cfg (CLEANUP_EXPENSIVE);
6144 :
6145 1488370 : finish_reg_equiv ();
6146 :
6147 1488370 : bitmap_obstack_release (&ira_bitmap_obstack);
6148 : #ifndef IRA_NO_OBSTACK
6149 : obstack_free (&ira_obstack, NULL);
6150 : #endif
6151 :
6152 : /* The code after the reload has changed so much that at this point
6153 : we might as well just rescan everything. Note that
6154 : df_rescan_all_insns is not going to help here because it does not
6155 : touch the artificial uses and defs. */
6156 1488370 : df_finish_pass (true);
6157 1488370 : df_scan_alloc (NULL);
6158 1488370 : df_scan_blocks ();
6159 :
6160 1488370 : if (optimize > 1)
6161 : {
6162 961301 : df_live_add_problem ();
6163 961301 : df_live_set_all_dirty ();
6164 : }
6165 :
6166 1488370 : if (optimize)
6167 1041770 : df_analyze ();
6168 :
6169 1488370 : if (need_dce && optimize)
6170 0 : run_fast_dce ();
6171 :
6172 : /* Diagnose uses of the hard frame pointer when it is used as a global
6173 : register. Often we can get away with letting the user appropriate
6174 : the frame pointer, but we should let them know when code generation
6175 : makes that impossible. */
6176 1488370 : if (global_regs[HARD_FRAME_POINTER_REGNUM] && frame_pointer_needed)
6177 : {
6178 2 : tree decl = global_regs_decl[HARD_FRAME_POINTER_REGNUM];
6179 2 : error_at (DECL_SOURCE_LOCATION (current_function_decl),
6180 : "frame pointer required, but reserved");
6181 2 : inform (DECL_SOURCE_LOCATION (decl), "for %qD", decl);
6182 : }
6183 :
6184 : /* If we are doing generic stack checking, give a warning if this
6185 : function's frame size is larger than we expect. */
6186 1488370 : if (flag_stack_check == GENERIC_STACK_CHECK)
6187 : {
6188 49 : poly_int64 size = get_frame_size () + STACK_CHECK_FIXED_FRAME_SIZE;
6189 :
6190 4557 : for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
6191 4508 : if (df_regs_ever_live_p (i)
6192 235 : && !fixed_regs[i]
6193 4679 : && !crtl->abi->clobbers_full_reg_p (i))
6194 84 : size += UNITS_PER_WORD;
6195 :
6196 49 : if (constant_lower_bound (size) > STACK_CHECK_MAX_FRAME_SIZE)
6197 1 : warning (0, "frame size too large for reliable stack checking");
6198 : }
6199 :
6200 1488370 : if (pic_offset_table_regno != INVALID_REGNUM)
6201 80871 : pic_offset_table_rtx = gen_rtx_REG (Pmode, pic_offset_table_regno);
6202 :
6203 1488370 : timevar_pop (TV_IRA);
6204 1488370 : }
6205 :
6206 : /* Run the integrated register allocator. */
6207 :
6208 : namespace {
6209 :
6210 : const pass_data pass_data_ira =
6211 : {
6212 : RTL_PASS, /* type */
6213 : "ira", /* name */
6214 : OPTGROUP_NONE, /* optinfo_flags */
6215 : TV_IRA, /* tv_id */
6216 : 0, /* properties_required */
6217 : 0, /* properties_provided */
6218 : 0, /* properties_destroyed */
6219 : 0, /* todo_flags_start */
6220 : TODO_do_not_ggc_collect, /* todo_flags_finish */
6221 : };
6222 :
6223 : class pass_ira : public rtl_opt_pass
6224 : {
6225 : public:
6226 298828 : pass_ira (gcc::context *ctxt)
6227 597656 : : rtl_opt_pass (pass_data_ira, ctxt)
6228 : {}
6229 :
6230 : /* opt_pass methods: */
6231 1488378 : bool gate (function *) final override
6232 : {
6233 1488378 : return !targetm.no_register_allocation;
6234 : }
6235 1488370 : unsigned int execute (function *) final override
6236 : {
6237 1488370 : ira_in_progress = true;
6238 1488370 : ira (dump_file);
6239 1488370 : ira_in_progress = false;
6240 1488370 : return 0;
6241 : }
6242 :
6243 : }; // class pass_ira
6244 :
6245 : } // anon namespace
6246 :
6247 : rtl_opt_pass *
6248 298828 : make_pass_ira (gcc::context *ctxt)
6249 : {
6250 298828 : return new pass_ira (ctxt);
6251 : }
6252 :
6253 : namespace {
6254 :
6255 : const pass_data pass_data_reload =
6256 : {
6257 : RTL_PASS, /* type */
6258 : "reload", /* name */
6259 : OPTGROUP_NONE, /* optinfo_flags */
6260 : TV_RELOAD, /* tv_id */
6261 : 0, /* properties_required */
6262 : 0, /* properties_provided */
6263 : 0, /* properties_destroyed */
6264 : 0, /* todo_flags_start */
6265 : 0, /* todo_flags_finish */
6266 : };
6267 :
6268 : class pass_reload : public rtl_opt_pass
6269 : {
6270 : public:
6271 298828 : pass_reload (gcc::context *ctxt)
6272 597656 : : rtl_opt_pass (pass_data_reload, ctxt)
6273 : {}
6274 :
6275 : /* opt_pass methods: */
6276 1488378 : bool gate (function *) final override
6277 : {
6278 1488378 : return !targetm.no_register_allocation;
6279 : }
6280 1488370 : unsigned int execute (function *) final override
6281 : {
6282 1488370 : do_reload ();
6283 1488370 : return 0;
6284 : }
6285 :
6286 : }; // class pass_reload
6287 :
6288 : } // anon namespace
6289 :
6290 : rtl_opt_pass *
6291 298828 : make_pass_reload (gcc::context *ctxt)
6292 : {
6293 298828 : return new pass_reload (ctxt);
6294 : }