/* Integrated Register Allocator (IRA) entry point.
   Copyright (C) 2006-2026 Free Software Foundation, Inc.
   Contributed by Vladimir Makarov <vmakarov@redhat.com>.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* The integrated register allocator (IRA) is a
   regional register allocator performing graph coloring on a top-down
   traversal of nested regions.  Graph coloring in a region is based
   on the Chaitin-Briggs algorithm.  It is called integrated because
   register coalescing, register live range splitting, and choosing a
   better hard register are done on-the-fly during coloring.  Register
   coalescing and choosing a cheaper hard register are done by
   hard-register preferencing during hard-register assignment.  The
   live range splitting is a byproduct of the regional register
   allocation.

   Major IRA notions are:

     o *Region* is a part of the CFG where graph coloring based on
       the Chaitin-Briggs algorithm is done.  IRA can work on any set
       of nested CFG regions forming a tree.  Currently the regions
       are the entire function for the root region and natural loops
       for the other regions.  Therefore the data structure
       representing a region is called loop_tree_node.

     o *Allocno class* is a register class used for allocation of a
       given allocno.  It means that only a hard register of the given
       register class can be assigned to the given allocno.  In
       reality, an even smaller subset of (*profitable*) hard
       registers can be assigned.  In rare cases, the subset can be
       even smaller because our modification of the Chaitin-Briggs
       algorithm requires that the sets of hard registers that can be
       assigned to allocnos form a forest, i.e. the sets can be
       ordered in a way where any previous set either does not
       intersect a given set or is a superset of it.

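To make the forest condition concrete, here is a minimal standalone sketch (this is not IRA's actual code: `hard_reg_set` is a plain std::bitset here and the function names are hypothetical) that checks whether a collection of hard-register sets satisfies the condition, i.e. every pair of sets is either disjoint or nested:

```cpp
#include <bitset>
#include <vector>

/* Hypothetical illustration of the forest condition described above:
   any two sets of assignable hard registers must either not intersect
   or be related by inclusion.  */
typedef std::bitset<64> hard_reg_set;

static bool
subset_p (const hard_reg_set &a, const hard_reg_set &b)
{
  return (a & ~b).none ();
}

bool
sets_form_forest_p (const std::vector<hard_reg_set> &sets)
{
  for (size_t i = 0; i < sets.size (); i++)
    for (size_t j = i + 1; j < sets.size (); j++)
      {
	hard_reg_set isect = sets[i] & sets[j];
	/* Partially overlapping sets cannot be ordered as required.  */
	if (isect.any ()
	    && ! subset_p (sets[i], sets[j])
	    && ! subset_p (sets[j], sets[i]))
	  return false;
      }
  return true;
}
```

For example, the sets {1111, 0011, 1100} form a forest, while {0110, 0011} do not, because they overlap without one containing the other.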
     o *Pressure class* is a register class belonging to a set of
       register classes containing all of the hard-registers available
       for register allocation.  The set of all pressure classes for a
       target is defined in the corresponding machine-description file
       according to some criteria.  Register pressure is calculated
       only for pressure classes and it affects some IRA decisions
       such as forming allocation regions.

     o *Allocno* represents the live range of a pseudo-register in a
       region.  Besides the obvious attributes like the corresponding
       pseudo-register number, allocno class, conflicting allocnos and
       conflicting hard-registers, there are a few allocno attributes
       which are important for understanding the allocation algorithm:

       - *Live ranges*.  This is a list of ranges of *program points*
         where the allocno lives.  Program points represent places
         where a pseudo can be born or become dead (there are
         approximately two times more program points than insns)
         and they are represented by integers starting with 0.  The
         live ranges are used to find conflicts between allocnos.
         They also play a very important role in the transformation of
         the IRA internal representation of several regions into a
         one-region representation.  The latter is used during the
         reload pass because each allocno then represents all of the
         corresponding pseudo-registers.

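The conflict test on live ranges can be sketched as follows (an illustration only, with hypothetical names; IRA's real representation in ira-lives.cc differs): two allocnos conflict when any of their program-point intervals overlap.

```cpp
#include <algorithm>
#include <vector>

/* Hypothetical sketch: a live range as a list of inclusive
   [start, finish] program-point intervals.  */
struct live_range { int start, finish; };

bool
ranges_intersect_p (std::vector<live_range> a, std::vector<live_range> b)
{
  auto by_start = [] (const live_range &x, const live_range &y)
    { return x.start < y.start; };
  std::sort (a.begin (), a.end (), by_start);
  std::sort (b.begin (), b.end (), by_start);
  /* Merge-style scan: advance whichever interval ends first.  */
  for (size_t i = 0, j = 0; i < a.size () && j < b.size (); )
    {
      if (a[i].finish < b[j].start)
	i++;
      else if (b[j].finish < a[i].start)
	j++;
      else
	return true;  /* The intervals overlap at some program point.  */
    }
  return false;
}
```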
       - *Hard-register costs*.  This is a vector of size equal to the
         number of available hard-registers of the allocno class.  The
         cost of a callee-clobbered hard-register for an allocno is
         increased by the cost of the save/restore code around the
         calls through which the given allocno lives.  If the allocno
         is a move instruction operand and another operand is a
         hard-register of the allocno class, the cost of the
         hard-register is decreased by the move cost.

         When an allocno is assigned, the hard-register with minimal
         full cost is used.  Initially, a hard-register's full cost is
         the corresponding value from the hard-register's cost vector.
         If the allocno is connected by a *copy* (see below) to
         another allocno which has just received a hard-register, the
         cost of the hard-register is decreased.  Before choosing a
         hard-register for an allocno, the allocno's current costs of
         the hard-registers are modified by the conflict hard-register
         costs of all of the conflicting allocnos which are not
         assigned yet.

       - *Conflict hard-register costs*.  This is a vector of the same
         size as the hard-register costs vector.  To permit an
         unassigned allocno to get a better hard-register, IRA uses
         this vector to calculate the final full cost of the
         available hard-registers.  The conflict hard-register costs
         of an unassigned allocno are also changed when the
         hard-register cost of the allocno changes during the
         processing of a copy involving the allocno, as described
         above.  This is done to show other unassigned allocnos that a
         given allocno prefers some hard-registers, in order to remove
         the move instruction corresponding to the copy.

     o *Cap*.  If a pseudo-register does not live in a region but
       lives in a nested region, IRA creates a special allocno called
       a cap in the outer region.  A region cap is also created for a
       subregion cap.

     o *Copy*.  Allocnos can be connected by copies.  Copies are used
       to modify hard-register costs for allocnos during coloring.
       Such modifications reflect a preference to use the same
       hard-register for the allocnos connected by copies.  Usually
       copies are created for move insns (in this case it results in
       register coalescing).  But IRA also creates copies for operands
       of an insn which should be assigned to the same hard-register
       due to constraints in the machine description (it usually
       results in removing a move generated in reload to satisfy
       the constraints) and copies referring to the allocno which is
       the output operand of an instruction and the allocno which is
       an input operand dying in the instruction (creation of such
       copies results in less register shuffling).  IRA *does not*
       create copies between allocnos for the same register from
       different regions because we use another technique for
       propagating hard-register preferences on the borders of
       regions.

   Allocnos (including caps) for the upper region in the region tree
   *accumulate* information important for coloring from allocnos with
   the same pseudo-register from nested regions.  This includes
   hard-register and memory costs, conflicts with hard-registers,
   allocno conflicts, allocno copies and more.  *Thus, attributes for
   allocnos in a region have the same values as if the region had no
   subregions*.  It means that attributes for allocnos in the
   outermost region corresponding to the function have the same values
   as though the allocation used only one region which is the entire
   function.  It also means that we can look at IRA work as if IRA
   first did the allocation for the whole function and then improved
   it for loops, then for their subloops, and so on.

   IRA major passes are:

     o Building the IRA internal representation, which consists of the
       following subpasses:

       * First, IRA builds regions and creates allocnos (file
         ira-build.cc) and initializes most of their attributes.

       * Then IRA finds an allocno class for each allocno and
         calculates its initial (non-accumulated) cost of memory and
         each hard-register of its allocno class (file ira-costs.cc).

       * IRA creates live ranges for each allocno, calculates register
         pressure for each pressure class in each region, and sets up
         conflict hard registers for each allocno and info about the
         calls the allocno lives through (file ira-lives.cc).

       * IRA removes low register pressure loops from the regions,
         mostly to speed IRA up (file ira-build.cc).

       * IRA propagates accumulated allocno info from lower region
         allocnos to corresponding upper region allocnos (file
         ira-build.cc).

       * IRA creates all caps (file ira-build.cc).

       * Having live ranges of allocnos and their classes, IRA creates
         conflicting allocnos for each allocno.  Conflicting allocnos
         are stored as a bit vector or an array of pointers to the
         conflicting allocnos, whichever is more profitable (file
         ira-conflicts.cc).  At this point IRA creates allocno copies.

     o Coloring.  Now IRA has all the necessary info to start the
       graph coloring process.  It is done in each region on a
       top-down traversal of the region tree (file ira-color.cc).
       There are the following subpasses:

       * Finding profitable hard registers of the corresponding
         allocno class for each allocno.  For example, only
         callee-saved hard registers are frequently profitable for
         allocnos living through calls.  If the profitable hard
         register set of an allocno does not form a tree based on the
         subset relation, we use some approximation to form the tree.
         This approximation is used to figure out the trivial
         colorability of allocnos.  The approximation is a pretty
         rare case.

       * Putting allocnos onto the coloring stack.  IRA uses Briggs
         optimistic coloring, which is a major improvement over
         Chaitin's coloring.  Therefore IRA does not spill allocnos at
         this point.  There is some freedom in the order of putting
         allocnos on the stack, which can affect the final result of
         the allocation.  IRA uses some heuristics to improve the
         order.  The major one is to form *threads* from colorable
         allocnos and push them onto the stack by threads.  A thread
         is a set of non-conflicting colorable allocnos connected by
         copies.  The thread contains allocnos from the colorable
         bucket or colorable allocnos already pushed onto the coloring
         stack.  Pushing thread allocnos one after another onto the
         stack increases the chances of removing copies when the
         allocnos get the same hard reg.

         We also use a modification of the Chaitin-Briggs algorithm
         which works for intersected register classes of allocnos.  To
         figure out the trivial colorability of allocnos, the
         above-mentioned tree of hard register sets is used.  To get
         an idea of how the algorithm works, consider an i386 example:
         an allocno to which any general hard register can be
         assigned.  If the allocno conflicts with eight allocnos to
         which only the EAX register can be assigned, the given
         allocno is still trivially colorable because all conflicting
         allocnos might be assigned only to EAX and all other general
         hard registers are still free.

         To get an idea of the used trivial colorability criterion, it
         is also useful to read the article "Graph-Coloring Register
         Allocation for Irregular Architectures" by Michael D. Smith
         and Glenn Holloway.  The major difference between the
         article's approach and the approach used in IRA is that
         Smith's approach takes register classes only from the machine
         description while IRA calculates register classes from the
         intermediate code too (e.g. an explicit usage of hard
         registers in RTL code for parameter passing can result in the
         creation of additional register classes which contain or
         exclude the hard registers).  That makes the IRA approach
         useful for improving coloring even for architectures with
         regular register files, and in fact some benchmarking shows
         that the improvement for regular class architectures is even
         bigger than for irregular ones.  Another difference is that
         Smith's approach chooses the intersection of the classes of
         all insn operands in which a given pseudo occurs.  IRA can
         use bigger classes if it is still more profitable than memory
         usage.

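The EAX example above can be put numerically (an illustration of the idea only; the hypothetical trivially_colorable_p below is a simplification, not the real criterion implemented in ira-color.cc): conflicting allocnos restricted to a small subset of our profitable registers can take away at most that subset, no matter how many of them there are.

```cpp
#include <algorithm>
#include <bitset>
#include <map>

typedef std::bitset<32> hard_reg_set;

/* Hypothetical sketch: PROFITABLE is the allocno's set of profitable
   hard registers; CONFLICTS maps a conflicting profitable set (a bit
   mask, assumed to be a subset of PROFITABLE) to how many conflicting
   allocnos have that set.  Allocnos sharing a small set can occupy at
   most min (count, size of that set) of our registers.  */
bool
trivially_colorable_p (const hard_reg_set &profitable,
		       const std::map<unsigned long, int> &conflicts)
{
  int used = 0;
  for (const auto &entry : conflicts)
    used += std::min (entry.second,
		      (int) hard_reg_set (entry.first).count ());
  /* A register is always left over, whatever the conflicts choose.  */
  return used < (int) profitable.count ();
}
```

With eight general registers (mask 0xff) and eight conflicting EAX-only allocnos (mask 0x1), at most one register is lost, so the allocno stays trivially colorable.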
       * Popping the allocnos from the stack and assigning them hard
         registers.  If IRA cannot assign a hard register to an
         allocno and the allocno is coalesced, IRA undoes the
         coalescing and puts the uncoalesced allocnos onto the stack
         in the hope that some such allocnos will get a hard register
         separately.  If IRA fails to assign a hard register or memory
         is more profitable for the allocno, IRA spills the allocno.
         IRA assigns the allocno the hard-register with minimal full
         allocation cost, which reflects the cost of usage of the
         hard-register for the allocno and the cost of usage of the
         hard-register for allocnos conflicting with the given
         allocno.

       * Chaitin-Briggs coloring assigns as many pseudos as possible
         to hard registers.  After coloring we try to improve the
         allocation from a cost point of view.  We improve the
         allocation by spilling some allocnos and assigning the freed
         hard registers to other allocnos if it decreases the overall
         allocation cost.

       * After allocno assignment in the region, IRA modifies the hard
         register and memory costs for the corresponding allocnos in
         the subregions to reflect the cost of possible loads, stores,
         or moves on the border of the region and its subregions.
         When the default regional allocation algorithm is used
         (-fira-algorithm=mixed), IRA just propagates the assignment
         for allocnos if the register pressure in the region for the
         corresponding pressure class is less than the number of
         available hard registers for the given pressure class.

     o Spill/restore code moving.  When IRA performs an allocation
       by traversing regions in top-down order, it does not know what
       happens below in the region tree.  Therefore, sometimes IRA
       misses opportunities to perform a better allocation.  A simple
       optimization tries to improve allocation in a region having
       subregions and contained in another region.  If the
       corresponding allocnos in the subregion are spilled, it spills
       the region allocno if it is profitable.  The optimization
       implements a simple iterative algorithm performing profitable
       transformations while they are still possible.  It is fast in
       practice, so there is no real need for a better time complexity
       algorithm.

     o Code change.  After coloring, two allocnos representing the
       same pseudo-register outside and inside a region respectively
       may be assigned to different locations (hard-registers or
       memory).  In this case IRA creates and uses a new
       pseudo-register inside the region and adds code to move allocno
       values on the region's borders.  This is done during a top-down
       traversal of the regions (file ira-emit.cc).  In some
       complicated cases IRA can create a new allocno to move allocno
       values (e.g. when a swap of values stored in two hard-registers
       is needed).  At this stage, the new allocno is marked as
       spilled.  IRA still creates the pseudo-register and the moves
       on the region borders even when both allocnos were assigned to
       the same hard-register.  If the reload pass spills a
       pseudo-register for some reason, the effect will be smaller
       because another allocno will still be in the hard-register.  In
       most cases, this is better than spilling both allocnos.  If
       reload does not change the allocation for the two
       pseudo-registers, the trivial move will be removed by
       post-reload optimizations.  IRA does not generate moves for
       allocnos assigned to the same hard register when the default
       regional allocation algorithm is used and the register pressure
       in the region for the corresponding pressure class is less than
       the number of available hard registers for the given pressure
       class.  IRA also does some optimizations to remove redundant
       stores and to reduce code duplication on the region borders.

     o Flattening the internal representation.  After changing code,
       IRA transforms its internal representation for several regions
       into a one-region representation (file ira-build.cc).  This
       process is called IR flattening.  It is more complicated than
       IR rebuilding would be, but is much faster.

     o After IR flattening, IRA tries to assign hard registers to all
       spilled allocnos.  This is implemented by a simple and fast
       priority coloring algorithm (see function
       ira_reassign_conflict_allocnos in file ira-color.cc).  Here new
       allocnos created during the code change pass can be assigned to
       hard registers.

     o At the end IRA calls the reload pass.  The reload pass
       communicates with IRA through several functions in file
       ira-color.cc to improve its decisions in

       * sharing stack slots for the spilled pseudos based on IRA info
         about pseudo-register conflicts.

       * reassigning hard-registers to all spilled pseudos at the end
         of each reload iteration.

       * choosing a better hard-register to spill based on IRA info
         about pseudo-register live ranges and the register pressure
         in places where the pseudo-register lives.

   IRA uses a lot of data representing the target processors.  These
   data are initialized in file ira.cc.

   If the function has no loops (or the loops are ignored when
   -fira-algorithm=CB is used), we have classic Chaitin-Briggs
   coloring (only instead of a separate pass of coalescing, we use
   hard register preferencing).  In such a case, IRA works much faster
   because many things are not done (like IR flattening, the
   spill/restore optimization, and the code change).

   The following literature is worth reading for a better
   understanding of the code:

     o Preston Briggs, Keith D. Cooper, Linda Torczon.  Improvements
       to Graph Coloring Register Allocation.

     o David Callahan, Brian Koblenz.  Register allocation via
       hierarchical graph coloring.

     o Keith Cooper, Anshuman Dasgupta, Jason Eckhardt.  Revisiting
       Graph Coloring Register Allocation: A Study of the
       Chaitin-Briggs and Callahan-Koblenz Algorithms.

     o Guei-Yuan Lueh, Thomas Gross, and Ali-Reza Adl-Tabatabai.
       Global Register Allocation Based on Graph Fusion.

     o Michael D. Smith and Glenn Holloway.  Graph-Coloring Register
       Allocation for Irregular Architectures.

     o Vladimir Makarov.  The Integrated Register Allocator for GCC.

     o Vladimir Makarov.  The top-down register allocator for
       irregular register file architectures.

*/


#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "target.h"
#include "rtl.h"
#include "tree.h"
#include "df.h"
#include "memmodel.h"
#include "tm_p.h"
#include "insn-config.h"
#include "regs.h"
#include "ira.h"
#include "ira-int.h"
#include "diagnostic-core.h"
#include "cfgrtl.h"
#include "cfgbuild.h"
#include "cfgcleanup.h"
#include "expr.h"
#include "tree-pass.h"
#include "output.h"
#include "reload.h"
#include "cfgloop.h"
#include "lra.h"
#include "dce.h"
#include "dbgcnt.h"
#include "rtl-iter.h"
#include "shrink-wrap.h"
#include "print-rtl.h"

struct target_ira default_target_ira;
class target_ira_int default_target_ira_int;
#if SWITCHABLE_TARGET
struct target_ira *this_target_ira = &default_target_ira;
class target_ira_int *this_target_ira_int = &default_target_ira_int;
#endif

/* A modified value of flag `-fira-verbose' used internally.  */
int internal_flag_ira_verbose;

/* Dump file of the allocator if it is not NULL.  */
FILE *ira_dump_file;

/* The number of elements in the following array.  */
int ira_spilled_reg_stack_slots_num;

/* The following array contains info about spilled pseudo-registers
   stack slots used in current function so far.  */
class ira_spilled_reg_stack_slot *ira_spilled_reg_stack_slots;

/* Correspondingly overall cost of the allocation, overall cost before
   reload, cost of the allocnos assigned to hard-registers, cost of
   the allocnos assigned to memory, cost of loads, stores and register
   move insns generated for pseudo-register live range splitting (see
   ira-emit.cc).  */
int64_t ira_overall_cost, overall_cost_before;
int64_t ira_reg_cost, ira_mem_cost;
int64_t ira_load_cost, ira_store_cost, ira_shuffle_cost;
int ira_move_loops_num, ira_additional_jumps_num;

/* All registers that can be eliminated.  */

HARD_REG_SET eliminable_regset;

/* Value of max_reg_num () before IRA work start.  This value helps
   us to recognize a situation when new pseudos were created during
   IRA work.  */
static int max_regno_before_ira;

/* Temporary hard reg set used for a different calculation.  */
static HARD_REG_SET temp_hard_regset;

#define last_mode_for_init_move_cost \
  (this_target_ira_int->x_last_mode_for_init_move_cost)


/* The function sets up the map IRA_REG_MODE_HARD_REGSET.  */
static void
setup_reg_mode_hard_regset (void)
{
  int i, m, hard_regno;

  for (m = 0; m < NUM_MACHINE_MODES; m++)
    for (hard_regno = 0; hard_regno < FIRST_PSEUDO_REGISTER; hard_regno++)
      {
	CLEAR_HARD_REG_SET (ira_reg_mode_hard_regset[hard_regno][m]);
	for (i = hard_regno_nregs (hard_regno, (machine_mode) m) - 1;
	     i >= 0; i--)
	  if (hard_regno + i < FIRST_PSEUDO_REGISTER)
	    SET_HARD_REG_BIT (ira_reg_mode_hard_regset[hard_regno][m],
			      hard_regno + i);
      }
}


#define no_unit_alloc_regs \
  (this_target_ira_int->x_no_unit_alloc_regs)

/* The function sets up the three arrays declared above.  */
static void
setup_class_hard_regs (void)
{
  int cl, i, hard_regno, n;
  HARD_REG_SET processed_hard_reg_set;

  ira_assert (SHRT_MAX >= FIRST_PSEUDO_REGISTER);
  for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
    {
      temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
      CLEAR_HARD_REG_SET (processed_hard_reg_set);
      for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
	{
	  ira_non_ordered_class_hard_regs[cl][i] = -1;
	  ira_class_hard_reg_index[cl][i] = -1;
	}
      for (n = 0, i = 0; i < FIRST_PSEUDO_REGISTER; i++)
	{
#ifdef REG_ALLOC_ORDER
	  hard_regno = reg_alloc_order[i];
#else
	  hard_regno = i;
#endif
	  if (TEST_HARD_REG_BIT (processed_hard_reg_set, hard_regno))
	    continue;
	  SET_HARD_REG_BIT (processed_hard_reg_set, hard_regno);
	  if (! TEST_HARD_REG_BIT (temp_hard_regset, hard_regno))
	    ira_class_hard_reg_index[cl][hard_regno] = -1;
	  else
	    {
	      ira_class_hard_reg_index[cl][hard_regno] = n;
	      ira_class_hard_regs[cl][n++] = hard_regno;
	    }
	}
      ira_class_hard_regs_num[cl] = n;
      for (n = 0, i = 0; i < FIRST_PSEUDO_REGISTER; i++)
	if (TEST_HARD_REG_BIT (temp_hard_regset, i))
	  ira_non_ordered_class_hard_regs[cl][n++] = i;
      ira_assert (ira_class_hard_regs_num[cl] == n);
    }
}

/* Set up global variables defining info about hard registers for the
   allocation.  These depend on USE_HARD_FRAME_P whose TRUE value means
   that we can use the hard frame pointer for the allocation.  */
static void
setup_alloc_regs (bool use_hard_frame_p)
{
#ifdef ADJUST_REG_ALLOC_ORDER
  ADJUST_REG_ALLOC_ORDER;
#endif
  no_unit_alloc_regs = fixed_nonglobal_reg_set;
  if (! use_hard_frame_p)
    add_to_hard_reg_set (&no_unit_alloc_regs, Pmode,
			 HARD_FRAME_POINTER_REGNUM);
  setup_class_hard_regs ();
}


#define alloc_reg_class_subclasses \
  (this_target_ira_int->x_alloc_reg_class_subclasses)

/* Initialize the table of subclasses of each reg class.  */
static void
setup_reg_subclasses (void)
{
  int i, j;
  HARD_REG_SET temp_hard_regset2;

  for (i = 0; i < N_REG_CLASSES; i++)
    for (j = 0; j < N_REG_CLASSES; j++)
      alloc_reg_class_subclasses[i][j] = LIM_REG_CLASSES;

  for (i = 0; i < N_REG_CLASSES; i++)
    {
      if (i == (int) NO_REGS)
	continue;

      temp_hard_regset = reg_class_contents[i] & ~no_unit_alloc_regs;
      if (hard_reg_set_empty_p (temp_hard_regset))
	continue;
      for (j = 0; j < N_REG_CLASSES; j++)
	if (i != j)
	  {
	    enum reg_class *p;

	    temp_hard_regset2 = reg_class_contents[j] & ~no_unit_alloc_regs;
	    if (! hard_reg_set_subset_p (temp_hard_regset,
					 temp_hard_regset2))
	      continue;
	    p = &alloc_reg_class_subclasses[j][0];
	    while (*p != LIM_REG_CLASSES) p++;
	    *p = (enum reg_class) i;
	  }
    }
}


/* Set up IRA_MEMORY_MOVE_COST and IRA_MAX_MEMORY_MOVE_COST.  */
static void
setup_class_subset_and_memory_move_costs (void)
{
  int cl, cl2, mode, cost;
  HARD_REG_SET temp_hard_regset2;

  for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
    ira_memory_move_cost[mode][NO_REGS][0]
      = ira_memory_move_cost[mode][NO_REGS][1] = SHRT_MAX;
  for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
    {
      if (cl != (int) NO_REGS)
	for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
	  {
	    ira_max_memory_move_cost[mode][cl][0]
	      = ira_memory_move_cost[mode][cl][0]
	      = memory_move_cost ((machine_mode) mode,
				  (reg_class_t) cl, false);
	    ira_max_memory_move_cost[mode][cl][1]
	      = ira_memory_move_cost[mode][cl][1]
	      = memory_move_cost ((machine_mode) mode,
				  (reg_class_t) cl, true);
	    /* Costs for NO_REGS are used in cost calculation on the
	       1st pass when the preferred register classes are not
	       known yet.  In this case we take the best scenario.  */
	    if (!targetm.hard_regno_mode_ok (ira_class_hard_regs[cl][0],
					     (machine_mode) mode))
	      continue;

	    if (ira_memory_move_cost[mode][NO_REGS][0]
		> ira_memory_move_cost[mode][cl][0])
	      ira_max_memory_move_cost[mode][NO_REGS][0]
		= ira_memory_move_cost[mode][NO_REGS][0]
		= ira_memory_move_cost[mode][cl][0];
	    if (ira_memory_move_cost[mode][NO_REGS][1]
		> ira_memory_move_cost[mode][cl][1])
	      ira_max_memory_move_cost[mode][NO_REGS][1]
		= ira_memory_move_cost[mode][NO_REGS][1]
		= ira_memory_move_cost[mode][cl][1];
	  }
    }
  for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
    for (cl2 = (int) N_REG_CLASSES - 1; cl2 >= 0; cl2--)
      {
	temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
	temp_hard_regset2 = reg_class_contents[cl2] & ~no_unit_alloc_regs;
	ira_class_subset_p[cl][cl2]
	  = hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2);
	if (! hard_reg_set_empty_p (temp_hard_regset2)
	    && hard_reg_set_subset_p (reg_class_contents[cl2],
				      reg_class_contents[cl]))
	  for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
	    {
	      cost = ira_memory_move_cost[mode][cl2][0];
	      if (cost > ira_max_memory_move_cost[mode][cl][0])
		ira_max_memory_move_cost[mode][cl][0] = cost;
	      cost = ira_memory_move_cost[mode][cl2][1];
	      if (cost > ira_max_memory_move_cost[mode][cl][1])
		ira_max_memory_move_cost[mode][cl][1] = cost;
	    }
      }
  for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
    for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
      {
	ira_memory_move_cost[mode][cl][0]
	  = ira_max_memory_move_cost[mode][cl][0];
	ira_memory_move_cost[mode][cl][1]
	  = ira_max_memory_move_cost[mode][cl][1];
      }
  setup_reg_subclasses ();
}


/* Define the following macro if allocation through malloc is
   preferable.  */
#define IRA_NO_OBSTACK

#ifndef IRA_NO_OBSTACK
/* Obstack used for storing all dynamic data (except bitmaps) of the
   IRA.  */
static struct obstack ira_obstack;
#endif

/* Obstack used for storing all bitmaps of the IRA.  */
static struct bitmap_obstack ira_bitmap_obstack;

/* Allocate memory of size LEN for IRA data.  */
void *
ira_allocate (size_t len)
{
  void *res;

#ifndef IRA_NO_OBSTACK
  res = obstack_alloc (&ira_obstack, len);
#else
  res = xmalloc (len);
#endif
  return res;
}

/* Free memory ADDR allocated for IRA data.  */
void
ira_free (void *addr ATTRIBUTE_UNUSED)
{
#ifndef IRA_NO_OBSTACK
  /* do nothing */
#else
  free (addr);
#endif
}


/* Allocate and return a bitmap for IRA.  */
bitmap
ira_allocate_bitmap (void)
{
  return BITMAP_ALLOC (&ira_bitmap_obstack);
}

/* Free bitmap B allocated for IRA.  */
void
ira_free_bitmap (bitmap b ATTRIBUTE_UNUSED)
{
  /* do nothing */
}
693 :
694 :
695 : /* Output information about allocation of all allocnos (except for
696 : caps) into file F. */
697 : void
698 95 : ira_print_disposition (FILE *f)
699 : {
700 95 : int i, n, max_regno;
701 95 : ira_allocno_t a;
702 95 : basic_block bb;
703 :
704 95 : fprintf (f, "Disposition:");
705 95 : max_regno = max_reg_num ();
706 1973 : for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
707 1783 : for (a = ira_regno_allocno_map[i];
708 2378 : a != NULL;
709 595 : a = ALLOCNO_NEXT_REGNO_ALLOCNO (a))
710 : {
711 595 : if (n % 4 == 0)
712 178 : fprintf (f, "\n");
713 595 : n++;
714 595 : fprintf (f, " %4d:r%-4d", ALLOCNO_NUM (a), ALLOCNO_REGNO (a));
715 595 : if ((bb = ALLOCNO_LOOP_TREE_NODE (a)->bb) != NULL)
716 0 : fprintf (f, "b%-3d", bb->index);
717 : else
718 595 : fprintf (f, "l%-3d", ALLOCNO_LOOP_TREE_NODE (a)->loop_num);
719 595 : if (ALLOCNO_HARD_REGNO (a) >= 0)
720 594 : fprintf (f, " %3d", ALLOCNO_HARD_REGNO (a));
721 : else
722 1 : fprintf (f, " mem");
723 : }
724 95 : fprintf (f, "\n");
725 95 : }
726 :
727 : /* Output information about the allocation of all allocnos to
728 : stderr. */
729 : void
730 0 : ira_debug_disposition (void)
731 : {
732 0 : ira_print_disposition (stderr);
733 0 : }
734 :
735 :
736 :
737 : /* Set up ira_stack_reg_pressure_class which is the biggest pressure
738 : register class containing stack registers or NO_REGS if there are
739 : no stack registers. To find this class, we iterate through all
740 : register pressure classes and choose the first register pressure
741 : class containing the largest number of the stack
742 : registers. */
743 : static void
744 214527 : setup_stack_reg_pressure_class (void)
745 : {
746 214527 : ira_stack_reg_pressure_class = NO_REGS;
747 : #ifdef STACK_REGS
748 214527 : {
749 214527 : int i, best, size;
750 214527 : enum reg_class cl;
751 214527 : HARD_REG_SET temp_hard_regset2;
752 :
753 214527 : CLEAR_HARD_REG_SET (temp_hard_regset);
754 1930743 : for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++)
755 1716216 : SET_HARD_REG_BIT (temp_hard_regset, i);
756 : best = 0;
757 1074174 : for (i = 0; i < ira_pressure_classes_num; i++)
758 : {
759 859647 : cl = ira_pressure_classes[i];
760 859647 : temp_hard_regset2 = temp_hard_regset & reg_class_contents[cl];
761 859647 : size = hard_reg_set_size (temp_hard_regset2);
762 859647 : if (best < size)
763 : {
764 213856 : best = size;
765 213856 : ira_stack_reg_pressure_class = cl;
766 : }
767 : }
768 : }
769 : #endif
770 214527 : }
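The selection loop above can be sketched in isolation. This is a minimal model, not GCC's code: HARD_REG_SET is replaced by a 16-bit mask, hard_reg_set_size by a popcount, and the register file, class masks, and the NO_REGS stand-in are all invented here.

```c
#include <assert.h>
#include <stdint.h>

enum { NO_REGS = -1 };  /* stand-in for GCC's NO_REGS sentinel */

static int
popcount16 (uint16_t s)
{
  int n = 0;
  for (; s; s >>= 1)
    n += s & 1;
  return n;
}

/* Return the index of the first pressure class whose intersection
   with STACK_MASK has the most registers, or NO_REGS when every
   intersection is empty, mirroring setup_stack_reg_pressure_class.  */
static int
pick_stack_pressure_class (const uint16_t *classes, int n_classes,
                           uint16_t stack_mask)
{
  int best = 0, result = NO_REGS;
  for (int i = 0; i < n_classes; i++)
    {
      int size = popcount16 (classes[i] & stack_mask);
      if (best < size)   /* strict: the first biggest class wins */
        {
          best = size;
          result = i;
        }
    }
  return result;
}
```

As in the original, the strict comparison means a later class with the same intersection size never displaces an earlier one.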
771 :
772 : /* Find pressure classes which are register classes for which we
773 : calculate register pressure in IRA, register pressure sensitive
774 : insn scheduling, and register pressure sensitive loop invariant
775 : motion.
776 :
777 : To make register pressure calculation easy, we always use
778 : non-intersecting register pressure classes. Moving a hard
779 : register within a register pressure class is no more expensive
780 : than loading and storing it. Most likely an allocno class
781 : will be a subset of a register pressure class, and in many
782 : cases will itself be a register pressure class. That makes
783 : usage of register pressure classes a good approximation for
784 : finding high register pressure. */
785 : static void
786 214527 : setup_pressure_classes (void)
787 : {
788 214527 : int cost, i, n, curr;
789 214527 : int cl, cl2;
790 214527 : enum reg_class pressure_classes[N_REG_CLASSES];
791 214527 : int m;
792 214527 : HARD_REG_SET temp_hard_regset2;
793 214527 : bool insert_p;
794 :
795 214527 : if (targetm.compute_pressure_classes)
796 0 : n = targetm.compute_pressure_classes (pressure_classes);
797 : else
798 : {
799 : n = 0;
800 7508445 : for (cl = 0; cl < N_REG_CLASSES; cl++)
801 : {
802 7293918 : if (ira_class_hard_regs_num[cl] == 0)
803 641120 : continue;
804 6652798 : if (ira_class_hard_regs_num[cl] != 1
805 : /* A register class without subclasses may contain a few
806 : hard registers and movement between them is costly
807 : (e.g. SPARC FPCC registers). We should still consider it
808 : as a candidate for a pressure class. */
809 4724128 : && alloc_reg_class_subclasses[cl][0] < cl)
810 : {
811 : /* Check that the moves between any hard registers of the
812 : current class are not more expensive for a legal mode
813 : than load/store of the hard registers of the current
814 : class. Such a class is a potential candidate to be a
815 : register pressure class. */
816 219393113 : for (m = 0; m < NUM_MACHINE_MODES; m++)
817 : {
818 218106604 : temp_hard_regset
819 218106604 : = (reg_class_contents[cl]
820 218106604 : & ~(no_unit_alloc_regs
821 218106604 : | ira_prohibited_class_mode_regs[cl][m]));
822 436213208 : if (hard_reg_set_empty_p (temp_hard_regset))
823 161225234 : continue;
824 56881370 : ira_init_register_move_cost_if_necessary ((machine_mode) m);
825 56881370 : cost = ira_register_move_cost[m][cl][cl];
826 56881370 : if (cost <= ira_max_memory_move_cost[m][cl][1]
827 53661219 : || cost <= ira_max_memory_move_cost[m][cl][0])
828 : break;
829 : }
830 4506660 : if (m >= NUM_MACHINE_MODES)
831 1286509 : continue;
832 : }
833 5366289 : curr = 0;
834 5366289 : insert_p = true;
835 5366289 : temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
836 : /* Remove so-far-added pressure classes which are subsets of the
837 : current candidate class. Prefer GENERAL_REGS as a pressure
838 : register class to another class containing the same
839 : allocatable hard registers. We do this because machine
840 : dependent cost hooks might give wrong costs for the latter
841 : class but always give the right cost for the former class
842 : (GENERAL_REGS). */
843 19256969 : for (i = 0; i < n; i++)
844 : {
845 13890680 : cl2 = pressure_classes[i];
846 13890680 : temp_hard_regset2 = (reg_class_contents[cl2]
847 13890680 : & ~no_unit_alloc_regs);
848 13890680 : if (hard_reg_set_subset_p (temp_hard_regset, temp_hard_regset2)
849 15034667 : && (temp_hard_regset != temp_hard_regset2
850 1077917 : || cl2 == (int) GENERAL_REGS))
851 : {
852 709488 : pressure_classes[curr++] = (enum reg_class) cl2;
853 709488 : insert_p = false;
854 709488 : continue;
855 : }
856 16549489 : if (hard_reg_set_subset_p (temp_hard_regset2, temp_hard_regset)
857 16978346 : && (temp_hard_regset2 != temp_hard_regset
858 434499 : || cl == (int) GENERAL_REGS))
859 3368297 : continue;
860 9812895 : if (temp_hard_regset2 == temp_hard_regset)
861 428857 : insert_p = false;
862 9812895 : pressure_classes[curr++] = (enum reg_class) cl2;
863 : }
864 : /* If the current candidate is a subset of a so far added
865 : pressure class, don't add it to the list of the pressure
866 : classes. */
867 5366289 : if (insert_p)
868 4227944 : pressure_classes[curr++] = (enum reg_class) cl;
869 : n = curr;
870 : }
871 : }
872 : #ifdef ENABLE_IRA_CHECKING
873 214527 : {
874 214527 : HARD_REG_SET ignore_hard_regs;
875 :
876 : /* Check pressure class correctness: here we check that the hard
877 : registers from all register pressure classes contain all hard
878 : registers available for allocation. */
879 858108 : CLEAR_HARD_REG_SET (temp_hard_regset);
880 214527 : CLEAR_HARD_REG_SET (temp_hard_regset2);
881 214527 : ignore_hard_regs = no_unit_alloc_regs;
882 7508445 : for (cl = 0; cl < LIM_REG_CLASSES; cl++)
883 : {
884 : /* For some targets (like MIPS with MD_REGS), there are some
885 : classes with hard registers available for allocation but
886 : not able to hold a value of any mode. */
887 207327720 : for (m = 0; m < NUM_MACHINE_MODES; m++)
888 206686600 : if (contains_reg_of_mode[cl][m])
889 : break;
890 7293918 : if (m >= NUM_MACHINE_MODES)
891 : {
892 641120 : ignore_hard_regs |= reg_class_contents[cl];
893 641120 : continue;
894 : }
895 31183041 : for (i = 0; i < n; i++)
896 25389890 : if ((int) pressure_classes[i] == cl)
897 : break;
898 6652798 : temp_hard_regset2 |= reg_class_contents[cl];
899 6652798 : if (i < n)
900 7293918 : temp_hard_regset |= reg_class_contents[cl];
901 : }
902 19951011 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
903 : /* Some targets (like SPARC with ICC reg) have allocatable regs
904 : for which no reg class is defined. */
905 19736484 : if (REGNO_REG_CLASS (i) == NO_REGS)
906 429054 : SET_HARD_REG_BIT (ignore_hard_regs, i);
907 214527 : temp_hard_regset &= ~ignore_hard_regs;
908 214527 : temp_hard_regset2 &= ~ignore_hard_regs;
909 429054 : ira_assert (hard_reg_set_subset_p (temp_hard_regset2, temp_hard_regset));
910 : }
911 : #endif
912 214527 : ira_pressure_classes_num = 0;
913 1074174 : for (i = 0; i < n; i++)
914 : {
915 859647 : cl = (int) pressure_classes[i];
916 859647 : ira_reg_pressure_class_p[cl] = true;
917 859647 : ira_pressure_classes[ira_pressure_classes_num++] = (enum reg_class) cl;
918 : }
919 214527 : setup_stack_reg_pressure_class ();
920 214527 : }
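The subset-filtering pattern in the candidate loop above (drop already-kept sets that are proper subsets of the new candidate; drop the candidate when an existing set covers it) can be sketched on its own. A simplified model with invented 16-bit masks; the real code additionally prefers GENERAL_REGS among equal sets and checks move costs first.

```c
#include <assert.h>
#include <stdint.h>

/* Add CAND to SETS keeping only maximal sets: previously kept sets
   that are proper subsets of CAND are dropped, and CAND itself is
   dropped when it is covered by a kept set.  Updates SETS in place
   and returns the new count.  */
static int
add_maximal_set (uint16_t *sets, int n, uint16_t cand)
{
  int curr = 0, insert = 1;
  for (int i = 0; i < n; i++)
    {
      if ((sets[i] & ~cand) == 0 && sets[i] != cand)
        continue;                 /* proper subset of CAND: drop it */
      if ((cand & ~sets[i]) == 0)
        insert = 0;               /* CAND is covered: keep existing */
      sets[curr++] = sets[i];
    }
  if (insert)
    sets[curr++] = cand;
  return curr;
}
```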
921 :
922 : /* Set up IRA_UNIFORM_CLASS_P. A uniform class is a register class
923 : whose register move cost between any two of its registers is the
924 : same as for all its subclasses. We use this data to speed up the
925 : 2nd pass of allocno cost calculations. */
926 : static void
927 214527 : setup_uniform_class_p (void)
928 : {
929 214527 : int i, cl, cl2, m;
930 :
931 7508445 : for (cl = 0; cl < N_REG_CLASSES; cl++)
932 : {
933 7293918 : ira_uniform_class_p[cl] = false;
934 7293918 : if (ira_class_hard_regs_num[cl] == 0)
935 641120 : continue;
936 : /* We cannot use alloc_reg_class_subclasses here because the move
937 : cost hooks do not take into account that some registers are
938 : unavailable for the subtarget. E.g. for i686, INT_SSE_REGS
939 : is an element of alloc_reg_class_subclasses for GENERAL_REGS
940 : because SSE regs are unavailable. */
941 25947610 : for (i = 0; (cl2 = reg_class_subclasses[cl][i]) != LIM_REG_CLASSES; i++)
942 : {
943 20581322 : if (ira_class_hard_regs_num[cl2] == 0)
944 376 : continue;
945 2438804359 : for (m = 0; m < NUM_MACHINE_MODES; m++)
946 2419509923 : if (contains_reg_of_mode[cl][m] && contains_reg_of_mode[cl2][m])
947 : {
948 581238257 : ira_init_register_move_cost_if_necessary ((machine_mode) m);
949 581238257 : if (ira_register_move_cost[m][cl][cl]
950 581238257 : != ira_register_move_cost[m][cl2][cl2])
951 : break;
952 : }
953 20580946 : if (m < NUM_MACHINE_MODES)
954 : break;
955 : }
956 6652798 : if (cl2 == LIM_REG_CLASSES)
957 5366288 : ira_uniform_class_p[cl] = true;
958 : }
959 214527 : }
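The uniformity test above reduces to a cost comparison. A simplified standalone sketch: the cost table, class numbering, and mode count are invented, and unlike the real code it assumes every mode is held by both the class and its subclass.

```c
#include <assert.h>

/* Class CL is "uniform" when its within-class register-move cost
   equals that of every one of its subclasses in every mode.  COST is
   a flattened N_CLASSES x N_MODES table of within-class move costs.  */
static int
uniform_class_p (int cl, const int *subclasses, int n_sub,
                 const int *cost, int n_modes)
{
  for (int i = 0; i < n_sub; i++)
    for (int m = 0; m < n_modes; m++)
      if (cost[cl * n_modes + m] != cost[subclasses[i] * n_modes + m])
        return 0;
  return 1;
}
```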
960 :
961 : /* Set up IRA_ALLOCNO_CLASSES, IRA_ALLOCNO_CLASSES_NUM,
962 : IRA_IMPORTANT_CLASSES, and IRA_IMPORTANT_CLASSES_NUM.
963 :
964 : A target may have many subtargets, and not all target hard
965 : registers can be used for allocation (e.g. the x86 port in 32-bit
966 : mode cannot use the hard registers introduced in x86-64, like
967 : r8-r15). Some classes might have the same allocatable hard
968 : registers, e.g. INDEX_REGS and GENERAL_REGS in the x86 port in
969 : 32-bit mode. To reduce the effort of various calculations we
970 : introduce allocno classes, which contain unique non-empty sets of
971 : allocatable hard registers.
972 :
973 : Pseudo class cost calculation in ira-costs.cc is very expensive.
974 : Therefore we try to decrease the number of classes involved in
975 : such calculations. Register classes used in the cost calculation
976 : are called important classes. They are the allocno classes and
977 : other non-empty classes whose allocatable hard register sets are
978 : inside an allocno class hard register set. At first sight it
979 : looks like they are just the allocno classes, but this is not
980 : true: for the x86 port in 32-bit mode, the allocno classes will
981 : contain GENERAL_REGS but not LEGACY_REGS (the allocatable hard
982 : registers are the same for both classes), while the important
983 : classes will contain both. This is done because a machine
984 : description insn constraint may refer to LEGACY_REGS, and the code
985 : in ira-costs.cc is mostly based on investigation of the insn constraints. */
986 : static void
987 214527 : setup_allocno_and_important_classes (void)
988 : {
989 214527 : int i, j, n, cl;
990 214527 : bool set_p;
991 214527 : HARD_REG_SET temp_hard_regset2;
992 214527 : static enum reg_class classes[LIM_REG_CLASSES + 1];
993 :
994 214527 : n = 0;
995 : /* Collect classes which contain unique sets of allocatable hard
996 : registers. Prefer GENERAL_REGS to other classes containing the
997 : same set of hard registers. */
998 7508445 : for (i = 0; i < LIM_REG_CLASSES; i++)
999 : {
1000 7293918 : temp_hard_regset = reg_class_contents[i] & ~no_unit_alloc_regs;
1001 94893509 : for (j = 0; j < n; j++)
1002 : {
1003 89318013 : cl = classes[j];
1004 89318013 : temp_hard_regset2 = reg_class_contents[cl] & ~no_unit_alloc_regs;
1005 178636026 : if (temp_hard_regset == temp_hard_regset2)
1006 : break;
1007 : }
1008 7293918 : if (j >= n || targetm.additional_allocno_class_p (i))
1009 5575496 : classes[n++] = (enum reg_class) i;
1010 1718422 : else if (i == GENERAL_REGS)
1011 : /* Prefer general regs. For i386, for example, this means
1012 : we prefer GENERAL_REGS over INDEX_REGS or LEGACY_REGS
1013 : (all of them consist of the same available hard
1014 : registers). */
1015 5642 : classes[j] = (enum reg_class) i;
1016 : }
1017 214527 : classes[n] = LIM_REG_CLASSES;
1018 :
1019 : /* Set up classes which can be used for allocnos as classes
1020 : containing non-empty unique sets of allocatable hard
1021 : registers. */
1022 214527 : ira_allocno_classes_num = 0;
1023 5790023 : for (i = 0; (cl = classes[i]) != LIM_REG_CLASSES; i++)
1024 5575496 : if (ira_class_hard_regs_num[cl] > 0)
1025 5360969 : ira_allocno_classes[ira_allocno_classes_num++] = (enum reg_class) cl;
1026 214527 : ira_important_classes_num = 0;
1027 : /* Add non-allocno classes containing a non-empty set of
1028 : allocatable hard regs. */
1029 7508445 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1030 7293918 : if (ira_class_hard_regs_num[cl] > 0)
1031 : {
1032 6652798 : temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
1033 6652798 : set_p = false;
1034 103252158 : for (j = 0; j < ira_allocno_classes_num; j++)
1035 : {
1036 101960329 : temp_hard_regset2 = (reg_class_contents[ira_allocno_classes[j]]
1037 101960329 : & ~no_unit_alloc_regs);
1038 101960329 : if ((enum reg_class) cl == ira_allocno_classes[j])
1039 : break;
1040 96599360 : else if (hard_reg_set_subset_p (temp_hard_regset,
1041 : temp_hard_regset2))
1042 6714177 : set_p = true;
1043 : }
1044 6652798 : if (set_p && j >= ira_allocno_classes_num)
1045 1291829 : ira_important_classes[ira_important_classes_num++]
1046 1291829 : = (enum reg_class) cl;
1047 : }
1048 : /* Now add allocno classes to the important classes. */
1049 5575496 : for (j = 0; j < ira_allocno_classes_num; j++)
1050 5360969 : ira_important_classes[ira_important_classes_num++]
1051 5360969 : = ira_allocno_classes[j];
1052 7508445 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1053 : {
1054 7293918 : ira_reg_allocno_class_p[cl] = false;
1055 7293918 : ira_reg_pressure_class_p[cl] = false;
1056 : }
1057 5575496 : for (j = 0; j < ira_allocno_classes_num; j++)
1058 5360969 : ira_reg_allocno_class_p[ira_allocno_classes[j]] = true;
1059 214527 : setup_pressure_classes ();
1060 214527 : setup_uniform_class_p ();
1061 214527 : }
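The first loop of the function above, which collects classes with unique allocatable-register sets while preferring GENERAL_REGS over an earlier duplicate, can be modelled compactly. A sketch with invented masks and class numbering; GENERAL_REGS is represented by an index parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Collect indices of classes with unique allocatable-register masks
   (masks[i] & ~NO_ALLOC) into OUT, replacing an earlier duplicate
   when the later class is GENERAL_CLASS.  Returns the count.  */
static int
collect_unique_classes (const uint16_t *masks, int n_classes,
                        uint16_t no_alloc, int general_class, int *out)
{
  int n = 0;
  for (int i = 0; i < n_classes; i++)
    {
      uint16_t set = masks[i] & (uint16_t) ~no_alloc;
      int j;
      for (j = 0; j < n; j++)
        if (set == (uint16_t) (masks[out[j]] & ~no_alloc))
          break;
      if (j >= n)
        out[n++] = i;             /* new unique set */
      else if (i == general_class)
        out[j] = i;               /* prefer the general class */
    }
  return n;
}
```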
1062 :
1063 : /* Set up translation in CLASS_TRANSLATE of all classes into a class
1064 : given by array CLASSES of length CLASSES_NUM. The function is
1065 : used to translate any reg class to an allocno class or to a
1066 : pressure class. This translation is necessary for some
1067 : calculations when we can use only allocno or pressure classes;
1068 : the translation gives an approximate representation of all
1069 : classes.
1070 :
1071 : The translation is simple when the allocatable hard register set
1072 : of a given class is a subset of the allocatable hard register set
1073 : of a class in CLASSES: we use the smallest class from CLASSES
1074 : containing the given class. If the allocatable hard register set
1075 : of a given class is not a subset of any corresponding set of a
1076 : class from CLASSES, we use the cheapest (from the load/store point
1077 : of view) class from CLASSES whose set intersects the given class's set. */
1078 : static void
1079 429054 : setup_class_translate_array (enum reg_class *class_translate,
1080 : int classes_num, enum reg_class *classes)
1081 : {
1082 429054 : int cl, mode;
1083 429054 : enum reg_class aclass, best_class, *cl_ptr;
1084 429054 : int i, cost, min_cost, best_cost;
1085 :
1086 15016890 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1087 14587836 : class_translate[cl] = NO_REGS;
1088 :
1089 6649670 : for (i = 0; i < classes_num; i++)
1090 : {
1091 6220616 : aclass = classes[i];
1092 45671575 : for (cl_ptr = &alloc_reg_class_subclasses[aclass][0];
1093 45671575 : (cl = *cl_ptr) != LIM_REG_CLASSES;
1094 : cl_ptr++)
1095 39450959 : if (class_translate[cl] == NO_REGS)
1096 6076645 : class_translate[cl] = aclass;
1097 6220616 : class_translate[aclass] = aclass;
1098 : }
1099 : /* For classes which are not fully covered by one of the given classes
1100 : (in other words, covered by more than one given class), use the
1101 : cheapest class. */
1102 15016890 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1103 : {
1104 14587836 : if (cl == NO_REGS || class_translate[cl] != NO_REGS)
1105 12660333 : continue;
1106 : best_class = NO_REGS;
1107 : best_cost = INT_MAX;
1108 18470465 : for (i = 0; i < classes_num; i++)
1109 : {
1110 16542962 : aclass = classes[i];
1111 16542962 : temp_hard_regset = (reg_class_contents[aclass]
1112 16542962 : & reg_class_contents[cl]
1113 16542962 : & ~no_unit_alloc_regs);
1114 33085924 : if (! hard_reg_set_empty_p (temp_hard_regset))
1115 : {
1116 : min_cost = INT_MAX;
1117 349207500 : for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
1118 : {
1119 346413840 : cost = (ira_memory_move_cost[mode][aclass][0]
1120 346413840 : + ira_memory_move_cost[mode][aclass][1]);
1121 346413840 : if (min_cost > cost)
1122 : min_cost = cost;
1123 : }
1124 2793660 : if (best_class == NO_REGS || best_cost > min_cost)
1125 : {
1126 16542962 : best_class = aclass;
1127 16542962 : best_cost = min_cost;
1128 : }
1129 : }
1130 : }
1131 1927503 : class_translate[cl] = best_class;
1132 : }
1133 429054 : }
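The two-rule translation described above can be sketched in miniature. This is a deliberate simplification with invented masks and costs: it takes the first covering class rather than the smallest, and uses one memory cost per class instead of the real code's minimum over all machine modes.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Translate a class (given by its allocatable mask CL_MASK) to one
   of the GIVEN classes: the first one whose mask covers it, else the
   cheapest (by MEM_COST) one whose mask intersects it.  Returns the
   index into GIVEN, or -1 when nothing intersects.  */
static int
translate_class (uint16_t cl_mask, const uint16_t *given,
                 const int *mem_cost, int n_given)
{
  int best = -1, best_cost = INT_MAX;
  for (int i = 0; i < n_given; i++)
    if ((cl_mask & ~given[i]) == 0)     /* fully covered */
      return i;
  for (int i = 0; i < n_given; i++)
    if ((cl_mask & given[i]) != 0 && mem_cost[i] < best_cost)
      {
        best = i;
        best_cost = mem_cost[i];
      }
  return best;
}
```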
1134 :
1135 : /* Set up array IRA_ALLOCNO_CLASS_TRANSLATE and
1136 : IRA_PRESSURE_CLASS_TRANSLATE. */
1137 : static void
1138 214527 : setup_class_translate (void)
1139 : {
1140 214527 : setup_class_translate_array (ira_allocno_class_translate,
1141 214527 : ira_allocno_classes_num, ira_allocno_classes);
1142 214527 : setup_class_translate_array (ira_pressure_class_translate,
1143 214527 : ira_pressure_classes_num, ira_pressure_classes);
1144 214527 : }
1145 :
1146 : /* Order numbers of allocno classes in original target allocno class
1147 : array, -1 for non-allocno classes. */
1148 : static int allocno_class_order[N_REG_CLASSES];
1149 :
1150 : /* The function used to sort the important classes. */
1151 : static int
1152 178817656 : comp_reg_classes_func (const void *v1p, const void *v2p)
1153 : {
1154 178817656 : enum reg_class cl1 = *(const enum reg_class *) v1p;
1155 178817656 : enum reg_class cl2 = *(const enum reg_class *) v2p;
1156 178817656 : enum reg_class tcl1, tcl2;
1157 178817656 : int diff;
1158 :
1159 178817656 : tcl1 = ira_allocno_class_translate[cl1];
1160 178817656 : tcl2 = ira_allocno_class_translate[cl2];
1161 178817656 : if (tcl1 != NO_REGS && tcl2 != NO_REGS
1162 178817656 : && (diff = allocno_class_order[tcl1] - allocno_class_order[tcl2]) != 0)
1163 : return diff;
1164 8109983 : return (int) cl1 - (int) cl2;
1165 : }
1166 :
1167 : /* For setup_reg_class_relations to work correctly we need to
1168 : reorder the important classes according to the order of their
1169 : allocno classes. This places important classes containing the
1170 : same allocatable hard register set adjacent to each other, and the
1171 : allocno class with that allocatable hard register set right after
1172 : the other important classes with the same set.
1173 :
1174 : In the example from the comments of
1175 : setup_allocno_and_important_classes, it places LEGACY_REGS and
1176 : GENERAL_REGS close to each other, with GENERAL_REGS after
1177 : LEGACY_REGS. */
1178 : static void
1179 214527 : reorder_important_classes (void)
1180 : {
1181 214527 : int i;
1182 :
1183 7508445 : for (i = 0; i < N_REG_CLASSES; i++)
1184 7293918 : allocno_class_order[i] = -1;
1185 5575496 : for (i = 0; i < ira_allocno_classes_num; i++)
1186 5360969 : allocno_class_order[ira_allocno_classes[i]] = i;
1187 214527 : qsort (ira_important_classes, ira_important_classes_num,
1188 : sizeof (enum reg_class), comp_reg_classes_func);
1189 7081852 : for (i = 0; i < ira_important_classes_num; i++)
1190 6652798 : ira_important_class_nums[ira_important_classes[i]] = i;
1191 214527 : }
1192 :
1193 : /* Set up IRA_REG_CLASS_SUBUNION, IRA_REG_CLASS_SUPERUNION,
1194 : IRA_REG_CLASS_SUPER_CLASSES, IRA_REG_CLASSES_INTERSECT, and
1195 : IRA_REG_CLASSES_INTERSECT_P. For the meaning of the relations,
1196 : please see corresponding comments in ira-int.h. */
1197 : static void
1198 214527 : setup_reg_class_relations (void)
1199 : {
1200 214527 : int i, cl1, cl2, cl3;
1201 214527 : HARD_REG_SET intersection_set, union_set, temp_set2;
1202 214527 : bool important_class_p[N_REG_CLASSES];
1203 :
1204 214527 : memset (important_class_p, 0, sizeof (important_class_p));
1205 6867325 : for (i = 0; i < ira_important_classes_num; i++)
1206 6652798 : important_class_p[ira_important_classes[i]] = true;
1207 7508445 : for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1208 : {
1209 7293918 : ira_reg_class_super_classes[cl1][0] = LIM_REG_CLASSES;
1210 255287130 : for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1211 : {
1212 247993212 : ira_reg_classes_intersect_p[cl1][cl2] = false;
1213 247993212 : ira_reg_class_intersect[cl1][cl2] = NO_REGS;
1214 247993212 : ira_reg_class_subset[cl1][cl2] = NO_REGS;
1215 247993212 : temp_hard_regset = reg_class_contents[cl1] & ~no_unit_alloc_regs;
1216 247993212 : temp_set2 = reg_class_contents[cl2] & ~no_unit_alloc_regs;
1217 247993212 : if (hard_reg_set_empty_p (temp_hard_regset)
1218 269791292 : && hard_reg_set_empty_p (temp_set2))
1219 : {
1220 : /* Both classes have no allocatable hard registers
1221 : -- take all class hard registers into account and use
1222 : reg_class_subunion and reg_class_superunion. */
1223 757191 : for (i = 0;; i++)
1224 : {
1225 2749813 : cl3 = reg_class_subclasses[cl1][i];
1226 2749813 : if (cl3 == LIM_REG_CLASSES)
1227 : break;
1228 757191 : if (reg_class_subset_p (ira_reg_class_intersect[cl1][cl2],
1229 : (enum reg_class) cl3))
1230 709925 : ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3;
1231 : }
1232 1992622 : ira_reg_class_subunion[cl1][cl2] = reg_class_subunion[cl1][cl2];
1233 1992622 : ira_reg_class_superunion[cl1][cl2] = reg_class_superunion[cl1][cl2];
1234 1992622 : continue;
1235 : }
1236 : ira_reg_classes_intersect_p[cl1][cl2]
1237 246000590 : = hard_reg_set_intersect_p (temp_hard_regset, temp_set2);
1238 226195132 : if (important_class_p[cl1] && important_class_p[cl2]
1239 452390264 : && hard_reg_set_subset_p (temp_hard_regset, temp_set2))
1240 : {
1241 : /* CL1 and CL2 are important classes and CL1 allocatable
1242 : hard register set is inside of CL2 allocatable hard
1243 : registers -- record CL2 as a super class of CL1. */
1244 58906442 : enum reg_class *p;
1245 :
1246 58906442 : p = &ira_reg_class_super_classes[cl1][0];
1247 355528438 : while (*p != LIM_REG_CLASSES)
1248 296621996 : p++;
1249 58906442 : *p++ = (enum reg_class) cl2;
1250 58906442 : *p = LIM_REG_CLASSES;
1251 : }
1252 246000590 : ira_reg_class_subunion[cl1][cl2] = NO_REGS;
1253 246000590 : ira_reg_class_superunion[cl1][cl2] = NO_REGS;
1254 246000590 : intersection_set = (reg_class_contents[cl1]
1255 246000590 : & reg_class_contents[cl2]
1256 246000590 : & ~no_unit_alloc_regs);
1257 246000590 : union_set = ((reg_class_contents[cl1] | reg_class_contents[cl2])
1258 246000590 : & ~no_unit_alloc_regs);
1259 8610020650 : for (cl3 = 0; cl3 < N_REG_CLASSES; cl3++)
1260 : {
1261 8364020060 : temp_hard_regset = reg_class_contents[cl3] & ~no_unit_alloc_regs;
1262 16728040120 : if (hard_reg_set_empty_p (temp_hard_regset))
1263 734274768 : continue;
1264 :
1265 7629745292 : if (hard_reg_set_subset_p (temp_hard_regset, intersection_set))
1266 : {
1267 : /* CL3 allocatable hard register set is inside of
1268 : intersection of allocatable hard register sets
1269 : of CL1 and CL2. */
1270 652150434 : if (important_class_p[cl3])
1271 : {
1272 652150434 : temp_set2
1273 652150434 : = (reg_class_contents
1274 652150434 : [ira_reg_class_intersect[cl1][cl2]]);
1275 652150434 : temp_set2 &= ~no_unit_alloc_regs;
1276 652150434 : if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1277 : /* If the allocatable hard register sets are
1278 : the same, prefer GENERAL_REGS or the
1279 : smallest class for debugging
1280 : purposes. */
1281 758119964 : || (temp_hard_regset == temp_set2
1282 101786139 : && (cl3 == GENERAL_REGS
1283 101103451 : || ((ira_reg_class_intersect[cl1][cl2]
1284 : != GENERAL_REGS)
1285 33832989 : && hard_reg_set_subset_p
1286 33832989 : (reg_class_contents[cl3],
1287 : reg_class_contents
1288 : [(int)
1289 : ira_reg_class_intersect[cl1][cl2]])))))
1290 572823624 : ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3;
1291 : }
1292 652150434 : temp_set2
1293 652150434 : = (reg_class_contents[ira_reg_class_subset[cl1][cl2]]
1294 652150434 : & ~no_unit_alloc_regs);
1295 652150434 : if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1296 : /* Ignore unavailable hard registers and prefer
1297 : smallest class for debugging purposes. */
1298 758119964 : || (temp_hard_regset == temp_set2
1299 101786139 : && hard_reg_set_subset_p
1300 101786139 : (reg_class_contents[cl3],
1301 : reg_class_contents
1302 : [(int) ira_reg_class_subset[cl1][cl2]])))
1303 605971040 : ira_reg_class_subset[cl1][cl2] = (enum reg_class) cl3;
1304 : }
1305 7629745292 : if (important_class_p[cl3]
1306 15259490584 : && hard_reg_set_subset_p (temp_hard_regset, union_set))
1307 : {
1308 : /* CL3 allocatable hard register set is inside of
1309 : union of allocatable hard register sets of CL1
1310 : and CL2. */
1311 3396715626 : temp_set2
1312 3396715626 : = (reg_class_contents[ira_reg_class_subunion[cl1][cl2]]
1313 3396715626 : & ~no_unit_alloc_regs);
1314 3396715626 : if (ira_reg_class_subunion[cl1][cl2] == NO_REGS
1315 6547430662 : || (hard_reg_set_subset_p (temp_set2, temp_hard_regset)
1316 :
1317 1049252441 : && (temp_set2 != temp_hard_regset
1318 437157386 : || cl3 == GENERAL_REGS
1319 : /* If the allocatable hard register sets are the
1320 : same, prefer GENERAL_REGS or the smallest
1321 : class for debugging purposes. */
1322 433563468 : || (ira_reg_class_subunion[cl1][cl2] != GENERAL_REGS
1323 28883095 : && hard_reg_set_subset_p
1324 28883095 : (reg_class_contents[cl3],
1325 : reg_class_contents
1326 : [(int) ira_reg_class_subunion[cl1][cl2]])))))
1327 884179760 : ira_reg_class_subunion[cl1][cl2] = (enum reg_class) cl3;
1328 : }
1329 15259490584 : if (hard_reg_set_subset_p (union_set, temp_hard_regset))
1330 : {
1331 : /* CL3 allocatable hard register set contains union
1332 : of allocatable hard register sets of CL1 and
1333 : CL2. */
1334 1415973090 : temp_set2
1335 1415973090 : = (reg_class_contents[ira_reg_class_superunion[cl1][cl2]]
1336 1415973090 : & ~no_unit_alloc_regs);
1337 1415973090 : if (ira_reg_class_superunion[cl1][cl2] == NO_REGS
1338 2585945590 : || (hard_reg_set_subset_p (temp_hard_regset, temp_set2)
1339 :
1340 201419329 : && (temp_set2 != temp_hard_regset
1341 200553332 : || cl3 == GENERAL_REGS
1342 : /* If the allocatable hard register sets are the
1343 : same, prefer GENERAL_REGS or the smallest
1344 : class for debugging purposes. */
1345 198991646 : || (ira_reg_class_superunion[cl1][cl2] != GENERAL_REGS
1346 19830674 : && hard_reg_set_subset_p
1347 19830674 : (reg_class_contents[cl3],
1348 : reg_class_contents
1349 : [(int) ira_reg_class_superunion[cl1][cl2]])))))
1350 262678846 : ira_reg_class_superunion[cl1][cl2] = (enum reg_class) cl3;
1351 : }
1352 : }
1353 : }
1354 : }
1355 214527 : }
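One of the relations computed above can be modelled in isolation: for a pair of classes, the intersect relation picks the candidate class whose allocatable set is the largest one contained in the intersection. A sketch with invented 16-bit masks; the real code additionally prefers GENERAL_REGS or the smallest class among equal sets and handles the empty-set case separately.

```c
#include <assert.h>
#include <stdint.h>

/* Return the index of the candidate whose mask is the largest set
   contained in A & B, ties broken toward the earlier candidate, or
   -1 when no candidate fits inside the intersection.  */
static int
class_intersect (uint16_t a, uint16_t b, const uint16_t *cand, int n)
{
  int best = -1, best_size = -1;
  uint16_t inter = a & b;
  for (int i = 0; i < n; i++)
    {
      if ((cand[i] & ~inter) != 0)
        continue;                       /* not inside A & B */
      int size = 0;
      for (uint16_t s = cand[i]; s; s >>= 1)
        size += s & 1;
      if (size > best_size)
        {
          best = i;
          best_size = size;
        }
    }
  return best;
}
```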
1356 :
1357 : /* Output all uniform and important classes into file F. */
1358 : static void
1359 0 : print_uniform_and_important_classes (FILE *f)
1360 : {
1361 0 : int i, cl;
1362 :
1363 0 : fprintf (f, "Uniform classes:\n");
1364 0 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1365 0 : if (ira_uniform_class_p[cl])
1366 0 : fprintf (f, " %s", reg_class_names[cl]);
1367 0 : fprintf (f, "\nImportant classes:\n");
1368 0 : for (i = 0; i < ira_important_classes_num; i++)
1369 0 : fprintf (f, " %s", reg_class_names[ira_important_classes[i]]);
1370 0 : fprintf (f, "\n");
1371 0 : }
1372 :
1373 : /* Output all possible allocno or pressure classes and their
1374 : translation map into file F. */
1375 : static void
1376 0 : print_translated_classes (FILE *f, bool pressure_p)
1377 : {
1378 0 : int classes_num = (pressure_p
1379 0 : ? ira_pressure_classes_num : ira_allocno_classes_num);
1380 0 : enum reg_class *classes = (pressure_p
1381 0 : ? ira_pressure_classes : ira_allocno_classes);
1382 0 : enum reg_class *class_translate = (pressure_p
1383 : ? ira_pressure_class_translate
1384 : : ira_allocno_class_translate);
1385 0 : int i;
1386 :
1387 0 : fprintf (f, "%s classes:\n", pressure_p ? "Pressure" : "Allocno");
1388 0 : for (i = 0; i < classes_num; i++)
1389 0 : fprintf (f, " %s", reg_class_names[classes[i]]);
1390 0 : fprintf (f, "\nClass translation:\n");
1391 0 : for (i = 0; i < N_REG_CLASSES; i++)
1392 0 : fprintf (f, " %s -> %s\n", reg_class_names[i],
1393 0 : reg_class_names[class_translate[i]]);
1394 0 : }
1395 :
1396 : /* Output all possible allocno and pressure classes and the
1397 : translation maps into stderr. */
1398 : void
1399 0 : ira_debug_allocno_classes (void)
1400 : {
1401 0 : print_uniform_and_important_classes (stderr);
1402 0 : print_translated_classes (stderr, false);
1403 0 : print_translated_classes (stderr, true);
1404 0 : }
1405 :
1406 : /* Set up different arrays concerning class subsets, allocno and
1407 : important classes. */
1408 : static void
1409 214527 : find_reg_classes (void)
1410 : {
1411 214527 : setup_allocno_and_important_classes ();
1412 214527 : setup_class_translate ();
1413 214527 : reorder_important_classes ();
1414 214527 : setup_reg_class_relations ();
1415 214527 : }
1416 :
1417 :
1418 :
1419 : /* Set up array ira_hard_regno_allocno_class. */
1420 : static void
1421 214527 : setup_hard_regno_aclass (void)
1422 : {
1423 214527 : int i;
1424 :
1425 19951011 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
1426 : {
1427 39472968 : ira_hard_regno_allocno_class[i]
1428 29752200 : = (TEST_HARD_REG_BIT (no_unit_alloc_regs, i)
1429 19736484 : ? NO_REGS
1430 10015716 : : ira_allocno_class_translate[REGNO_REG_CLASS (i)]);
1431 : }
1432 214527 : }
1433 :
1434 :
1435 :
1436 : /* Form IRA_REG_CLASS_MAX_NREGS and IRA_REG_CLASS_MIN_NREGS maps. */
1437 : static void
1438 214527 : setup_reg_class_nregs (void)
1439 : {
1440 214527 : int i, cl, cl2, m;
1441 :
1442 26815875 : for (m = 0; m < MAX_MACHINE_MODE; m++)
1443 : {
1444 931047180 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1445 1808891664 : ira_reg_class_max_nregs[cl][m]
1446 1808891664 : = ira_reg_class_min_nregs[cl][m]
1447 904445832 : = targetm.class_max_nregs ((reg_class_t) cl, (machine_mode) m);
1448 931047180 : for (cl = 0; cl < N_REG_CLASSES; cl++)
1449 6479451856 : for (i = 0;
1450 7383897688 : (cl2 = alloc_reg_class_subclasses[cl][i]) != LIM_REG_CLASSES;
1451 : i++)
1452 6479451856 : if (ira_reg_class_min_nregs[cl2][m]
1453 6479451856 : < ira_reg_class_min_nregs[cl][m])
1454 51993376 : ira_reg_class_min_nregs[cl][m] = ira_reg_class_min_nregs[cl2][m];
1455 : }
1456 214527 : }
1457 :
1458 :
1459 :
1460 : /* Set up IRA_PROHIBITED_CLASS_MODE_REGS, IRA_EXCLUDE_CLASS_MODE_REGS, and
1461 : IRA_CLASS_SINGLETON. This function is called once IRA_CLASS_HARD_REGS has
1462 : been initialized. */
1463 : static void
1464 214527 : setup_prohibited_and_exclude_class_mode_regs (void)
1465 : {
1466 214527 : int j, k, hard_regno, cl, last_hard_regno, count;
1467 :
1468 7508445 : for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
1469 : {
1470 7293918 : temp_hard_regset = reg_class_contents[cl] & ~no_unit_alloc_regs;
1471 911739750 : for (j = 0; j < NUM_MACHINE_MODES; j++)
1472 : {
1473 904445832 : count = 0;
1474 904445832 : last_hard_regno = -1;
1475 3617783328 : CLEAR_HARD_REG_SET (ira_prohibited_class_mode_regs[cl][j]);
1476 904445832 : CLEAR_HARD_REG_SET (ira_exclude_class_mode_regs[cl][j]);
1477 10068441888 : for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--)
1478 : {
1479 9163996056 : hard_regno = ira_class_hard_regs[cl][k];
1480 9163996056 : if (!targetm.hard_regno_mode_ok (hard_regno, (machine_mode) j))
1481 6908377294 : SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1482 : hard_regno);
1483 2255618762 : else if (in_hard_reg_set_p (temp_hard_regset,
1484 : (machine_mode) j, hard_regno))
1485 : {
1486 2160193451 : last_hard_regno = hard_regno;
1487 2160193451 : count++;
1488 : }
1489 : else
1490 : {
1491 95425311 : SET_HARD_REG_BIT (ira_exclude_class_mode_regs[cl][j], hard_regno);
1492 : }
1493 : }
1494 951350115 : ira_class_singleton[cl][j] = (count == 1 ? last_hard_regno : -1);
1495 : }
1496 : }
1497 214527 : }
1498 :
1499 : /* Clarify IRA_PROHIBITED_CLASS_MODE_REGS by excluding hard registers
1500 : spanning from one register pressure class to another one. It is
1501 : called after defining the pressure classes. */
1502 : static void
1503 214527 : clarify_prohibited_class_mode_regs (void)
1504 : {
1505 214527 : int j, k, hard_regno, cl, pclass, nregs;
1506 :
1507 7508445 : for (cl = (int) N_REG_CLASSES - 1; cl >= 0; cl--)
1508 911739750 : for (j = 0; j < NUM_MACHINE_MODES; j++)
1509 : {
1510 904445832 : CLEAR_HARD_REG_SET (ira_useful_class_mode_regs[cl][j]);
1511 10068441888 : for (k = ira_class_hard_regs_num[cl] - 1; k >= 0; k--)
1512 : {
1513 9163996056 : hard_regno = ira_class_hard_regs[cl][k];
1514 9163996056 : if (TEST_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j], hard_regno))
1515 6908377294 : continue;
1516 2255618762 : nregs = hard_regno_nregs (hard_regno, (machine_mode) j);
1517 2255618762 : if (hard_regno + nregs > FIRST_PSEUDO_REGISTER)
1518 : {
1519 9765 : SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1520 : hard_regno);
1521 9765 : continue;
1522 : }
1523 2255608997 : pclass = ira_pressure_class_translate[REGNO_REG_CLASS (hard_regno)];
1524 4955832662 : for (nregs-- ;nregs >= 0; nregs--)
1525 2750410834 : if (((enum reg_class) pclass
1526 2750410834 : != ira_pressure_class_translate[REGNO_REG_CLASS
1527 2750410834 : (hard_regno + nregs)]))
1528 : {
1529 50187169 : SET_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1530 : hard_regno);
1531 50187169 : break;
1532 : }
1533 2255608997 : if (!TEST_HARD_REG_BIT (ira_prohibited_class_mode_regs[cl][j],
1534 : hard_regno))
1535 2205421828 : add_to_hard_reg_set (&ira_useful_class_mode_regs[cl][j],
1536 : (machine_mode) j, hard_regno);
1537 : }
1538 : }
1539 214527 : }
1540 :
1541 : /* Allocate and initialize IRA_REGISTER_MOVE_COST, IRA_MAY_MOVE_IN_COST
1542 : and IRA_MAY_MOVE_OUT_COST for MODE. */
1543 : void
1544 10005480 : ira_init_register_move_cost (machine_mode mode)
1545 : {
1546 10005480 : static unsigned short last_move_cost[N_REG_CLASSES][N_REG_CLASSES];
1547 10005480 : bool all_match = true;
1548 10005480 : unsigned int i, cl1, cl2;
1549 10005480 : HARD_REG_SET ok_regs;
1550 :
1551 10005480 : ira_assert (ira_register_move_cost[mode] == NULL
1552 : && ira_may_move_in_cost[mode] == NULL
1553 : && ira_may_move_out_cost[mode] == NULL);
1554 930509640 : CLEAR_HARD_REG_SET (ok_regs);
1555 930509640 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
1556 920504160 : if (targetm.hard_regno_mode_ok (i, mode))
1557 418579332 : SET_HARD_REG_BIT (ok_regs, i);
1558 :
1559 : /* Note that we might be asked about the move costs of modes that
1560 : cannot be stored in any hard register, for example if an inline
1561 : asm tries to create a register operand with an impossible mode.
1562 : We therefore can't assert have_regs_of_mode[mode] here. */
1563 350191800 : for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1564 11906521200 : for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1565 : {
1566 11566334880 : int cost;
1567 11566334880 : if (!hard_reg_set_intersect_p (ok_regs, reg_class_contents[cl1])
1568 19121252656 : || !hard_reg_set_intersect_p (ok_regs, reg_class_contents[cl2]))
1569 : {
1570 6040992650 : if ((ira_reg_class_max_nregs[cl1][mode]
1571 6040992650 : > ira_class_hard_regs_num[cl1])
1572 4286391389 : || (ira_reg_class_max_nregs[cl2][mode]
1573 4286391389 : > ira_class_hard_regs_num[cl2]))
1574 : cost = 65535;
1575 : else
1576 2948681300 : cost = (ira_memory_move_cost[mode][cl1][0]
1577 2948681300 : + ira_memory_move_cost[mode][cl2][1]) * 2;
1578 : }
1579 : else
1580 : {
1581 5525342230 : cost = register_move_cost (mode, (enum reg_class) cl1,
1582 : (enum reg_class) cl2);
1583 5525342230 : ira_assert (cost < 65535);
1584 : }
1585 11566334880 : all_match &= (last_move_cost[cl1][cl2] == cost);
1586 11566334880 : last_move_cost[cl1][cl2] = cost;
1587 : }
1588 10005480 : if (all_match && last_mode_for_init_move_cost != -1)
1589 : {
1590 4153253 : ira_register_move_cost[mode]
1591 4153253 : = ira_register_move_cost[last_mode_for_init_move_cost];
1592 4153253 : ira_may_move_in_cost[mode]
1593 4153253 : = ira_may_move_in_cost[last_mode_for_init_move_cost];
1594 4153253 : ira_may_move_out_cost[mode]
1595 4153253 : = ira_may_move_out_cost[last_mode_for_init_move_cost];
1596 4153253 : return;
1597 : }
1598 5852227 : last_mode_for_init_move_cost = mode;
1599 5852227 : ira_register_move_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1600 5852227 : ira_may_move_in_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1601 5852227 : ira_may_move_out_cost[mode] = XNEWVEC (move_table, N_REG_CLASSES);
1602 204827945 : for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
1603 6964150130 : for (cl2 = 0; cl2 < N_REG_CLASSES; cl2++)
1604 : {
1605 6765174412 : int cost;
1606 6765174412 : enum reg_class *p1, *p2;
1607 :
1608 6765174412 : if (last_move_cost[cl1][cl2] == 65535)
1609 : {
1610 1689758233 : ira_register_move_cost[mode][cl1][cl2] = 65535;
1611 1689758233 : ira_may_move_in_cost[mode][cl1][cl2] = 65535;
1612 1689758233 : ira_may_move_out_cost[mode][cl1][cl2] = 65535;
1613 : }
1614 : else
1615 : {
1616 5075416179 : cost = last_move_cost[cl1][cl2];
1617 :
1618 42529921543 : for (p2 = ®_class_subclasses[cl2][0];
1619 42529921543 : *p2 != LIM_REG_CLASSES; p2++)
1620 37454505364 : if (ira_class_hard_regs_num[*p2] > 0
1621 36763110975 : && (ira_reg_class_max_nregs[*p2][mode]
1622 : <= ira_class_hard_regs_num[*p2]))
1623 30150560697 : cost = MAX (cost, ira_register_move_cost[mode][cl1][*p2]);
1624 :
1625 42529921543 : for (p1 = ®_class_subclasses[cl1][0];
1626 42529921543 : *p1 != LIM_REG_CLASSES; p1++)
1627 37454505364 : if (ira_class_hard_regs_num[*p1] > 0
1628 36763110975 : && (ira_reg_class_max_nregs[*p1][mode]
1629 : <= ira_class_hard_regs_num[*p1]))
1630 30150560697 : cost = MAX (cost, ira_register_move_cost[mode][*p1][cl2]);
1631 :
1632 5075416179 : ira_assert (cost <= 65535);
1633 5075416179 : ira_register_move_cost[mode][cl1][cl2] = cost;
1634 :
1635 5075416179 : if (ira_class_subset_p[cl1][cl2])
1636 1532398260 : ira_may_move_in_cost[mode][cl1][cl2] = 0;
1637 : else
1638 3543017919 : ira_may_move_in_cost[mode][cl1][cl2] = cost;
1639 :
1640 5075416179 : if (ira_class_subset_p[cl2][cl1])
1641 1532398260 : ira_may_move_out_cost[mode][cl1][cl2] = 0;
1642 : else
1643 3543017919 : ira_may_move_out_cost[mode][cl1][cl2] = cost;
1644 : }
1645 : }
1646 : }
1647 :
1648 :
1649 :
1650 : /* This is called once during compiler work. It sets up
1651 : different arrays whose values don't depend on the compiled
1652 : function. */
1653 : void
1654 209218 : ira_init_once (void)
1655 : {
1656 209218 : ira_init_costs_once ();
1657 209218 : lra_init_once ();
1658 :
1659 209218 : ira_use_lra_p = targetm.lra_p ();
1660 209218 : }
1661 :
1662 : /* Free ira_register_move_cost, ira_may_move_in_cost and
1663 :    ira_may_move_out_cost for each mode.  */
1664 : void
1665 532308 : target_ira_int::free_register_move_costs (void)
1666 : {
1667 532308 : int mode, i;
1668 :
1669 : /* Reset move_cost and friends, making sure we only free shared
1670 : table entries once. */
1671 66538500 : for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
1672 66006192 : if (x_ira_register_move_cost[mode])
1673 : {
1674 600756238 : for (i = 0;
1675 609986654 : i < mode && (x_ira_register_move_cost[i]
1676 : != x_ira_register_move_cost[mode]);
1677 : i++)
1678 : ;
1679 9230416 : if (i == mode)
1680 : {
1681 5401861 : free (x_ira_register_move_cost[mode]);
1682 5401861 : free (x_ira_may_move_in_cost[mode]);
1683 5401861 : free (x_ira_may_move_out_cost[mode]);
1684 : }
1685 : }
1686 532308 : memset (x_ira_register_move_cost, 0, sizeof x_ira_register_move_cost);
1687 532308 : memset (x_ira_may_move_in_cost, 0, sizeof x_ira_may_move_in_cost);
1688 532308 : memset (x_ira_may_move_out_cost, 0, sizeof x_ira_may_move_out_cost);
1689 532308 : last_mode_for_init_move_cost = -1;
1690 532308 : }
1691 :
1692 317781 : target_ira_int::~target_ira_int ()
1693 : {
1694 317781 : free_ira_costs ();
1695 317781 : free_register_move_costs ();
1696 317781 : }
1697 :
1698 : /* This is called every time the register-related information is
1699 :    changed.  */
1700 : void
1701 214527 : ira_init (void)
1702 : {
1703 214527 : this_target_ira_int->free_register_move_costs ();
1704 214527 : setup_reg_mode_hard_regset ();
1705 214527 : setup_alloc_regs (flag_omit_frame_pointer != 0);
1706 214527 : setup_class_subset_and_memory_move_costs ();
1707 214527 : setup_reg_class_nregs ();
1708 214527 : setup_prohibited_and_exclude_class_mode_regs ();
1709 214527 : find_reg_classes ();
1710 214527 : clarify_prohibited_class_mode_regs ();
1711 214527 : setup_hard_regno_aclass ();
1712 214527 : ira_init_costs ();
1713 214527 : }
1714 :
1715 :
1716 : #define ira_prohibited_mode_move_regs_initialized_p \
1717 : (this_target_ira_int->x_ira_prohibited_mode_move_regs_initialized_p)
1718 :
1719 : /* Set up IRA_PROHIBITED_MODE_MOVE_REGS. */
1720 : static void
1721 1471362 : setup_prohibited_mode_move_regs (void)
1722 : {
1723 1471362 : int i, j;
1724 1471362 : rtx test_reg1, test_reg2, move_pat;
1725 1471362 : rtx_insn *move_insn;
1726 :
1727 1471362 : if (ira_prohibited_mode_move_regs_initialized_p)
1728 : return;
1729 212181 : ira_prohibited_mode_move_regs_initialized_p = true;
1730 212181 : test_reg1 = gen_rtx_REG (word_mode, LAST_VIRTUAL_REGISTER + 1);
1731 212181 : test_reg2 = gen_rtx_REG (word_mode, LAST_VIRTUAL_REGISTER + 2);
1732 212181 : move_pat = gen_rtx_SET (test_reg1, test_reg2);
1733 212181 : move_insn = gen_rtx_INSN (VOIDmode, 0, 0, 0, move_pat, 0, -1, 0);
1734 26734806 : for (i = 0; i < NUM_MACHINE_MODES; i++)
1735 : {
1736 26310444 : SET_HARD_REG_SET (ira_prohibited_mode_move_regs[i]);
1737 2446871292 : for (j = 0; j < FIRST_PSEUDO_REGISTER; j++)
1738 : {
1739 2420560848 : if (!targetm.hard_regno_mode_ok (j, (machine_mode) i))
1740 2002220631 : continue;
1741 418340217 : set_mode_and_regno (test_reg1, (machine_mode) i, j);
1742 418340217 : set_mode_and_regno (test_reg2, (machine_mode) i, j);
1743 418340217 : INSN_CODE (move_insn) = -1;
1744 418340217 : recog_memoized (move_insn);
1745 418340217 : if (INSN_CODE (move_insn) < 0)
1746 197961546 : continue;
1747 220378671 : extract_insn (move_insn);
1748 : /* We don't know whether the move will be in code that is optimized
1749 : for size or speed, so consider all enabled alternatives. */
1750 220378671 : if (! constrain_operands (1, get_enabled_alternatives (move_insn)))
1751 1303059 : continue;
1752 219075612 : CLEAR_HARD_REG_BIT (ira_prohibited_mode_move_regs[i], j);
1753 : }
1754 : }
1755 : }
1756 :
1757 :
1758 :
1759 : /* Extract INSN and return the set of alternatives that we should consider.
1760 : This excludes any alternatives whose constraints are obviously impossible
1761 : to meet (e.g. because the constraint requires a constant and the operand
1762 : is nonconstant). It also excludes alternatives that are bound to need
1763 : a spill or reload, as long as we have other alternatives that match
1764 : exactly. */
1765 : alternative_mask
1766 103135612 : ira_setup_alts (rtx_insn *insn)
1767 : {
1768 103135612 : int nop, nalt;
1769 103135612 : bool curr_swapped;
1770 103135612 : const char *p;
1771 103135612 : int commutative = -1;
1772 :
1773 103135612 : extract_insn (insn);
1774 103135612 : preprocess_constraints (insn);
1775 103135612 : alternative_mask preferred = get_preferred_alternatives (insn);
1776 103135612 : alternative_mask alts = 0;
1777 103135612 : alternative_mask exact_alts = 0;
1778 :   /* Check that a hard reg set is big enough to hold all the
1779 :      alternatives.  It is hard to imagine a situation in which the
1780 :      assertion could fail.  */
1781 103135612 : ira_assert (recog_data.n_alternatives
1782 : <= (int) MAX (sizeof (HARD_REG_ELT_TYPE) * CHAR_BIT,
1783 : FIRST_PSEUDO_REGISTER));
1784 302619876 : for (nop = 0; nop < recog_data.n_operands; nop++)
1785 211638318 : if (recog_data.constraints[nop][0] == '%')
1786 : {
1787 : commutative = nop;
1788 : break;
1789 : }
1790 103135612 : for (curr_swapped = false;; curr_swapped = true)
1791 : {
1792 1357954709 : for (nalt = 0; nalt < recog_data.n_alternatives; nalt++)
1793 : {
1794 1242665043 : if (!TEST_BIT (preferred, nalt) || TEST_BIT (exact_alts, nalt))
1795 414054174 : continue;
1796 :
1797 828610869 : const operand_alternative *op_alt
1798 828610869 : = &recog_op_alt[nalt * recog_data.n_operands];
1799 828610869 : int this_reject = 0;
1800 2365555114 : for (nop = 0; nop < recog_data.n_operands; nop++)
1801 : {
1802 1727284437 : int c, len;
1803 :
1804 1727284437 : this_reject += op_alt[nop].reject;
1805 :
1806 1727284437 : rtx op = recog_data.operand[nop];
1807 1727284437 : p = op_alt[nop].constraint;
1808 1727284437 : if (*p == 0 || *p == ',')
1809 24011540 : continue;
1810 :
1811 : bool win_p = false;
1812 3454328590 : do
1813 3454328590 : switch (c = *p, len = CONSTRAINT_LEN (c, p), c)
1814 : {
1815 : case '#':
1816 : case ',':
1817 : c = '\0';
1818 : /* FALLTHRU */
1819 726400646 : case '\0':
1820 726400646 : len = 0;
1821 726400646 : break;
1822 :
1823 : case '%':
1824 : /* The commutative modifier is handled above. */
1825 : break;
1826 :
1827 74055094 : case '0': case '1': case '2': case '3': case '4':
1828 74055094 : case '5': case '6': case '7': case '8': case '9':
1829 74055094 : {
1830 74055094 : char *end;
1831 74055094 : unsigned long dup = strtoul (p, &end, 10);
1832 74055094 : rtx other = recog_data.operand[dup];
1833 74055094 : len = end - p;
1834 1820480 : if (MEM_P (other)
1835 74055094 : ? rtx_equal_p (other, op)
1836 72234614 : : REG_P (op) || SUBREG_P (op))
1837 51477458 : goto op_success;
1838 22577636 : win_p = true;
1839 : }
1840 22577636 : break;
1841 :
1842 10596455 : case 'g':
1843 10596455 : goto op_success;
1844 145 : break;
1845 :
1846 145 : case '{':
1847 145 : if (REG_P (op) || SUBREG_P (op))
1848 143 : goto op_success;
1849 : win_p = true;
1850 : break;
1851 :
1852 2624160021 : default:
1853 2624160021 : {
1854 2624160021 : enum constraint_num cn = lookup_constraint (p);
1855 2624160021 : rtx mem = NULL;
1856 2624160021 : switch (get_constraint_type (cn))
1857 : {
1858 2139316851 : case CT_REGISTER:
1859 3369911376 : if (reg_class_for_constraint (cn) != NO_REGS)
1860 : {
1861 1183469338 : if (REG_P (op) || SUBREG_P (op))
1862 772766164 : goto op_success;
1863 : win_p = true;
1864 : }
1865 : break;
1866 :
1867 4028360 : case CT_CONST_INT:
1868 4028360 : if (CONST_INT_P (op)
1869 6634734 : && (insn_const_int_ok_for_constraint
1870 2606374 : (INTVAL (op), cn)))
1871 1851247 : goto op_success;
1872 : break;
1873 :
1874 814125 : case CT_ADDRESS:
1875 814125 : goto op_success;
1876 :
1877 162273465 : case CT_MEMORY:
1878 162273465 : case CT_RELAXED_MEMORY:
1879 162273465 : mem = op;
1880 : /* Fall through. */
1881 162273465 : case CT_SPECIAL_MEMORY:
1882 162273465 : if (!mem)
1883 66311285 : mem = extract_mem_from_operand (op);
1884 228584750 : if (MEM_P (mem))
1885 69415345 : goto op_success;
1886 : win_p = true;
1887 : break;
1888 :
1889 251415935 : case CT_FIXED_FORM:
1890 251415935 : if (constraint_satisfied_p (op, cn))
1891 69951314 : goto op_success;
1892 : break;
1893 : }
1894 : break;
1895 : }
1896 : }
1897 2477456339 : while (p += len, c);
1898 726400646 : if (!win_p)
1899 : break;
1900 : /* We can make the alternative match by spilling a register
1901 : to memory or loading something into a register. Count a
1902 : cost of one reload (the equivalent of the '?' constraint). */
1903 536060454 : this_reject += 6;
1904 1536944245 : op_success:
1905 1536944245 : ;
1906 : }
1907 :
1908 828610869 : if (nop >= recog_data.n_operands)
1909 : {
1910 638270677 : alts |= ALTERNATIVE_BIT (nalt);
1911 638270677 : if (this_reject == 0)
1912 135576263 : exact_alts |= ALTERNATIVE_BIT (nalt);
1913 : }
1914 : }
1915 115289666 : if (commutative < 0)
1916 : break;
1917 :       /* Swap back and forth to avoid changing recog_data.  */
1918 24308108 : std::swap (recog_data.operand[commutative],
1919 24308108 : recog_data.operand[commutative + 1]);
1920 24308108 : if (curr_swapped)
1921 : break;
1922 : }
1923 103135612 : return exact_alts ? exact_alts : alts;
1924 : }
1925 :
1926 : /* Return the number of the output non-early-clobber operand which
1927 :    should in any case be the same as the operand with number OP_NUM
1928 :    (or a negative value if there is no such operand).  ALTS is the
1929 :    mask of alternatives that we should consider.  This function sets
1930 :    SINGLE_INPUT_OP_HAS_CSTR_P, which indicates whether only a single
1931 :    input operand has the matching constraint on the output operand at
1932 :    the position given by the return value.  If the pattern allows any
1933 :    one of several input operands to hold the matching constraint, it
1934 :    is set to false; one typical case is the destructive FMA
1935 :    instruction on the rs6000 target.  Note that for a non-NO_REGS
1936 :    preferred register class with no free register-to-register copy,
1937 :    if the parameter PARAM_IRA_CONSIDER_DUP_IN_ALL_ALTS is set to one,
1938 :    this function checks all available alternatives for matching
1939 :    constraints, even after it has found one alternative with a
1940 :    non-NO_REGS regclass; this lets it respect more cases with
1941 :    matching constraints.  If PARAM_IRA_CONSIDER_DUP_IN_ALL_ALTS is
1942 :    set to zero, SINGLE_INPUT_OP_HAS_CSTR_P is always true and the
1943 :    search for a matching constraint stops as soon as it hits some
1944 :    alternative with a non-NO_REGS regclass.  */
1945 : int
1946 20449694 : ira_get_dup_out_num (int op_num, alternative_mask alts,
1947 : bool &single_input_op_has_cstr_p)
1948 : {
1949 20449694 : int curr_alt, c, original;
1950 20449694 : bool ignore_p, use_commut_op_p;
1951 20449694 : const char *str;
1952 :
1953 20449694 : if (op_num < 0 || recog_data.n_alternatives == 0)
1954 : return -1;
1955 : /* We should find duplications only for input operands. */
1956 20449694 : if (recog_data.operand_type[op_num] != OP_IN)
1957 : return -1;
1958 14759852 : str = recog_data.constraints[op_num];
1959 14759852 : use_commut_op_p = false;
1960 14759852 : single_input_op_has_cstr_p = true;
1961 :
1962 14759852 : rtx op = recog_data.operand[op_num];
1963 14759852 : int op_regno = reg_or_subregno (op);
1964 14759852 : enum reg_class op_pref_cl = reg_preferred_class (op_regno);
1965 14759852 : machine_mode op_mode = GET_MODE (op);
1966 :
1967 14759852 : ira_init_register_move_cost_if_necessary (op_mode);
1968 : /* If the preferred regclass isn't NO_REG, continue to find the matching
1969 : constraint in all available alternatives with preferred regclass, even
1970 : if we have found or will find one alternative whose constraint stands
1971 : for a REG (non-NO_REG) regclass. Note that it would be fine not to
1972 : respect matching constraint if the register copy is free, so exclude
1973 : it. */
1974 14759852 : bool respect_dup_despite_reg_cstr
1975 14759852 : = param_ira_consider_dup_in_all_alts
1976 472 : && op_pref_cl != NO_REGS
1977 14760320 : && ira_register_move_cost[op_mode][op_pref_cl][op_pref_cl] > 0;
1978 :
1979 :   /* Record the alternatives whose constraints use the same regclass as
1980 :      the preferred regclass.  If we later find a matching constraint for
1981 :      this operand with the preferred regclass, we will visit these
1982 :      recorded alternatives to check whether there is an alternative in
1983 :      which no INPUT operand has a matching constraint the same as our
1984 :      candidate.  If so, there is an alternative which is perfectly fine
1985 :      without satisfying this matching constraint.  If not, some other
1986 :      INPUT operand holds this matching constraint in every alternative,
1987 :      so it is fine to respect this matching constraint and create the
1988 :      constraint copy, since the copy becomes harmless once some other
1989 :      operand takes preference and it is interfered with.  */
1990 17224255 : alternative_mask pref_cl_alts;
1991 :
1992 17224255 : for (;;)
1993 : {
1994 17224255 : pref_cl_alts = 0;
1995 :
1996 17224255 : for (curr_alt = 0, ignore_p = !TEST_BIT (alts, curr_alt),
1997 17224255 : original = -1;;)
1998 : {
1999 101408483 : c = *str;
2000 101408483 : if (c == '\0')
2001 : break;
2002 96569972 : if (c == '#')
2003 : ignore_p = true;
2004 96569972 : else if (c == ',')
2005 : {
2006 31441635 : curr_alt++;
2007 31441635 : ignore_p = !TEST_BIT (alts, curr_alt);
2008 : }
2009 65128337 : else if (! ignore_p)
2010 20714148 : switch (c)
2011 : {
2012 922 : case 'g':
2013 922 : goto fail;
2014 15613784 : default:
2015 15613784 : {
2016 15613784 : enum constraint_num cn = lookup_constraint (str);
2017 15613784 : enum reg_class cl = reg_class_for_constraint (cn);
2018 12661457 : if (cl != NO_REGS && !targetm.class_likely_spilled_p (cl))
2019 : {
2020 12383362 : if (respect_dup_despite_reg_cstr)
2021 : {
2022 : /* If it's free to move from one preferred class to
2023 : the one without matching constraint, it doesn't
2024 : have to respect this constraint with costs. */
2025 667 : if (cl != op_pref_cl
2026 104 : && (ira_reg_class_intersect[cl][op_pref_cl]
2027 : != NO_REGS)
2028 92 : && (ira_may_move_in_cost[op_mode][op_pref_cl][cl]
2029 : == 0))
2030 76 : goto fail;
2031 591 : else if (cl == op_pref_cl)
2032 563 : pref_cl_alts |= ALTERNATIVE_BIT (curr_alt);
2033 : }
2034 : else
2035 12382695 : goto fail;
2036 : }
2037 3231013 : if (constraint_satisfied_p (op, cn))
2038 2051 : goto fail;
2039 : break;
2040 : }
2041 :
2042 5099442 : case '0': case '1': case '2': case '3': case '4':
2043 5099442 : case '5': case '6': case '7': case '8': case '9':
2044 5099442 : {
2045 5099442 : char *end;
2046 5099442 : int n = (int) strtoul (str, &end, 10);
2047 5099442 : str = end;
2048 5099442 : if (original != -1 && original != n)
2049 0 : goto fail;
2050 5099442 : gcc_assert (n < recog_data.n_operands);
2051 5099442 : if (respect_dup_despite_reg_cstr)
2052 : {
2053 217 : const operand_alternative *op_alt
2054 217 : = &recog_op_alt[curr_alt * recog_data.n_operands];
2055 : 		  /* Only respect the one with the preferred regclass;
2056 : 		     without respect_dup_despite_reg_cstr it is possible
2057 : 		     to pick one whose regclass isn't preferred first,
2058 : 		     but that would fail since there should be other
2059 : 		     alternatives with the preferred regclass.  */
2060 217 : if (op_alt[n].cl == op_pref_cl)
2061 5099384 : original = n;
2062 : }
2063 : else
2064 : original = n;
2065 5099442 : continue;
2066 5099442 : }
2067 : }
2068 79084786 : str += CONSTRAINT_LEN (c, str);
2069 : }
2070 4838511 : if (original == -1)
2071 1817882 : goto fail;
2072 3020629 : if (recog_data.operand_type[original] == OP_OUT)
2073 : {
2074 3020319 : if (pref_cl_alts == 0)
2075 : return original;
2076 : 	  /* Visit these recorded alternatives to check whether
2077 : 	     there is one alternative in which no INPUT operand
2078 : 	     has a matching constraint the same as our candidate.
2079 : 	     Give up this candidate if so.  */
2080 : int nop, nalt;
2081 361 : for (nalt = 0; nalt < recog_data.n_alternatives; nalt++)
2082 : {
2083 338 : if (!TEST_BIT (pref_cl_alts, nalt))
2084 239 : continue;
2085 99 : const operand_alternative *op_alt
2086 99 : = &recog_op_alt[nalt * recog_data.n_operands];
2087 99 : bool dup_in_other = false;
2088 365 : for (nop = 0; nop < recog_data.n_operands; nop++)
2089 : {
2090 309 : if (recog_data.operand_type[nop] != OP_IN)
2091 99 : continue;
2092 210 : if (nop == op_num)
2093 88 : continue;
2094 122 : if (op_alt[nop].matches == original)
2095 : {
2096 : dup_in_other = true;
2097 : break;
2098 : }
2099 : }
2100 99 : if (!dup_in_other)
2101 : return -1;
2102 : }
2103 23 : single_input_op_has_cstr_p = false;
2104 23 : return original;
2105 : }
2106 310 : fail:
2107 14203936 : if (use_commut_op_p)
2108 : break;
2109 12533753 : use_commut_op_p = true;
2110 12533753 : if (recog_data.constraints[op_num][0] == '%')
2111 1075302 : str = recog_data.constraints[op_num + 1];
2112 11458451 : else if (op_num > 0 && recog_data.constraints[op_num - 1][0] == '%')
2113 : str = recog_data.constraints[op_num - 1];
2114 : else
2115 : break;
2116 : }
2117 : return -1;
2118 : }
2119 :
2120 :
2121 :
2122 : /* Return true if a replacement of SRC by DEST does not lead to an
2123 :    unsatisfiable asm.  A replacement is valid unless both SRC and DEST
2124 :    are constrained in the asm inputs of a single asm statement.  See
2125 :    match_asm_constraints_2() for more details.  TODO: As in
2126 :    match_asm_constraints_2(), consider alternatives more precisely.  */
2127 :
2128 : static bool
2129 7604 : valid_replacement_for_asm_input_p_1 (const_rtx asmops, const_rtx src, const_rtx dest)
2130 : {
2131 7604 : int ninputs = ASM_OPERANDS_INPUT_LENGTH (asmops);
2132 7604 : rtvec inputs = ASM_OPERANDS_INPUT_VEC (asmops);
2133 38313 : for (int i = 0; i < ninputs; ++i)
2134 : {
2135 30709 : rtx input_src = RTVEC_ELT (inputs, i);
2136 30709 : const char *constraint_src
2137 30709 : = ASM_OPERANDS_INPUT_CONSTRAINT (asmops, i);
2138 30709 : if (rtx_equal_p (input_src, src)
2139 30709 : && strchr (constraint_src, '{') != nullptr)
2140 0 : for (int j = 0; j < ninputs; ++j)
2141 : {
2142 0 : rtx input_dest = RTVEC_ELT (inputs, j);
2143 0 : const char *constraint_dest
2144 0 : = ASM_OPERANDS_INPUT_CONSTRAINT (asmops, j);
2145 0 : if (rtx_equal_p (input_dest, dest)
2146 0 : && strchr (constraint_dest, '{') != nullptr)
2147 : return false;
2148 : }
2149 : }
2150 : return true;
2151 : }
2152 :
2153 : /* Return true if a replacement of SRC by DEST does not lead to an
2154 :    unsatisfiable asm.  A replacement is valid unless both SRC and DEST
2155 :    are constrained in the asm inputs of a single asm statement.  The
2156 :    final check is done in valid_replacement_for_asm_input_p_1.  */
2157 :
2158 : static bool
2159 522265 : valid_replacement_for_asm_input_p (const_rtx src, const_rtx dest)
2160 : {
2161 : /* Bail out early if there is no asm statement. */
2162 522265 : if (!crtl->has_asm_statement)
2163 : return true;
2164 25878 : for (df_ref use = DF_REG_USE_CHAIN (REGNO (src));
2165 780337 : use;
2166 754459 : use = DF_REF_NEXT_REG (use))
2167 : {
2168 754459 : struct df_insn_info *use_info = DF_REF_INSN_INFO (use);
2169 : /* Only check real uses, not artificial ones. */
2170 754459 : if (use_info)
2171 : {
2172 754459 : rtx_insn *insn = DF_REF_INSN (use);
2173 754459 : rtx pat = PATTERN (insn);
2174 754459 : if (asm_noperands (pat) <= 0)
2175 751162 : continue;
2176 3297 : if (GET_CODE (pat) == SET)
2177 : {
2178 0 : if (!valid_replacement_for_asm_input_p_1 (SET_SRC (pat), src, dest))
2179 : return false;
2180 : }
2181 3297 : else if (GET_CODE (pat) == PARALLEL)
2182 14246 : for (int i = 0, len = XVECLEN (pat, 0); i < len; ++i)
2183 : {
2184 10949 : rtx asmops = XVECEXP (pat, 0, i);
2185 10949 : if (GET_CODE (asmops) == SET)
2186 7570 : asmops = SET_SRC (asmops);
2187 10949 : if (GET_CODE (asmops) == ASM_OPERANDS
2188 10949 : && !valid_replacement_for_asm_input_p_1 (asmops, src, dest))
2189 : return false;
2190 : }
2191 0 : else if (GET_CODE (pat) == ASM_OPERANDS)
2192 : {
2193 0 : if (!valid_replacement_for_asm_input_p_1 (pat, src, dest))
2194 : return false;
2195 : }
2196 : else
2197 0 : gcc_unreachable ();
2198 : }
2199 : }
2200 : return true;
2201 : }
2202 :
2203 : /* Search forward to see if the source register of a copy insn dies
2204 : before either it or the destination register is modified, but don't
2205 : scan past the end of the basic block. If so, we can replace the
2206 : source with the destination and let the source die in the copy
2207 : insn.
2208 :
2209 : This will reduce the number of registers live in that range and may
2210 : enable the destination and the source coalescing, thus often saving
2211 : one register in addition to a register-register copy. */
2212 :
2213 : static void
2214 1471362 : decrease_live_ranges_number (void)
2215 : {
2216 1471362 : basic_block bb;
2217 1471362 : rtx_insn *insn;
2218 1471362 : rtx set, src, dest, dest_death, note;
2219 1471362 : rtx_insn *p, *q;
2220 1471362 : int sregno, dregno;
2221 :
2222 1471362 : if (! flag_expensive_optimizations)
2223 : return;
2224 :
2225 963944 : if (ira_dump_file)
2226 32 : fprintf (ira_dump_file, "Starting decreasing number of live ranges...\n");
2227 :
2228 11385684 : FOR_EACH_BB_FN (bb, cfun)
2229 134318562 : FOR_BB_INSNS (bb, insn)
2230 : {
2231 123896822 : set = single_set (insn);
2232 123896822 : if (! set)
2233 72433193 : continue;
2234 51463629 : src = SET_SRC (set);
2235 51463629 : dest = SET_DEST (set);
2236 14094998 : if (! REG_P (src) || ! REG_P (dest)
2237 60304603 : || find_reg_note (insn, REG_DEAD, src))
2238 48641758 : continue;
2239 2821871 : sregno = REGNO (src);
2240 2821871 : dregno = REGNO (dest);
2241 :
2242 : /* We don't want to mess with hard regs if register classes
2243 : are small. */
2244 5121477 : if (sregno == dregno
2245 2821837 : || (targetm.small_register_classes_for_mode_p (GET_MODE (src))
2246 2821837 : && (sregno < FIRST_PSEUDO_REGISTER
2247 2821837 : || dregno < FIRST_PSEUDO_REGISTER))
2248 : /* We don't see all updates to SP if they are in an
2249 : auto-inc memory reference, so we must disallow this
2250 : optimization on them. */
2251 522265 : || sregno == STACK_POINTER_REGNUM
2252 522265 : || dregno == STACK_POINTER_REGNUM
2253 3344136 : || !valid_replacement_for_asm_input_p (src, dest))
2254 2299606 : continue;
2255 :
2256 522265 : dest_death = NULL_RTX;
2257 :
2258 6252993 : for (p = NEXT_INSN (insn); p; p = NEXT_INSN (p))
2259 : {
2260 6248875 : if (! INSN_P (p))
2261 1072570 : continue;
2262 5176305 : if (BLOCK_FOR_INSN (p) != bb)
2263 : break;
2264 :
2265 9408407 : if (reg_set_p (src, p) || reg_set_p (dest, p)
2266 : /* If SRC is an asm-declared register, it must not be
2267 : replaced in any asm. Unfortunately, the REG_EXPR
2268 : tree for the asm variable may be absent in the SRC
2269 : rtx, so we can't check the actual register
2270 : declaration easily (the asm operand will have it,
2271 : though). To avoid complicating the test for a rare
2272 : case, we just don't perform register replacement
2273 : for a hard reg mentioned in an asm. */
2274 4665117 : || (sregno < FIRST_PSEUDO_REGISTER
2275 0 : && asm_noperands (PATTERN (p)) >= 0
2276 0 : && reg_overlap_mentioned_p (src, PATTERN (p)))
2277 : /* Don't change hard registers used by a call. */
2278 4665117 : || (CALL_P (p) && sregno < FIRST_PSEUDO_REGISTER
2279 0 : && find_reg_fusage (p, USE, src))
2280 : /* Don't change a USE of a register. */
2281 9382321 : || (GET_CODE (PATTERN (p)) == USE
2282 911 : && reg_overlap_mentioned_p (src, XEXP (PATTERN (p), 0))))
2283 : break;
2284 :
2285 : /* See if all of SRC dies in P. This test is slightly
2286 : more conservative than it needs to be. */
2287 4665117 : if ((note = find_regno_note (p, REG_DEAD, sregno))
2288 4665117 : && GET_MODE (XEXP (note, 0)) == GET_MODE (src))
2289 : {
2290 6959 : int failed = 0;
2291 :
2292 : /* We can do the optimization. Scan forward from INSN
2293 : again, replacing regs as we go. Set FAILED if a
2294 : replacement can't be done. In that case, we can't
2295 : move the death note for SRC. This should be
2296 : rare. */
2297 :
2298 : /* Set to stop at next insn. */
2299 6959 : for (q = next_real_insn (insn);
2300 37397 : q != next_real_insn (p);
2301 30438 : q = next_real_insn (q))
2302 : {
2303 30438 : if (reg_overlap_mentioned_p (src, PATTERN (q)))
2304 : {
2305 : /* If SRC is a hard register, we might miss
2306 : some overlapping registers with
2307 : validate_replace_rtx, so we would have to
2308 : undo it. We can't if DEST is present in
2309 : the insn, so fail in that combination of
2310 : cases. */
2311 8124 : if (sregno < FIRST_PSEUDO_REGISTER
2312 8124 : && reg_mentioned_p (dest, PATTERN (q)))
2313 : failed = 1;
2314 :
2315 : /* Attempt to replace all uses. */
2316 8124 : else if (!validate_replace_rtx (src, dest, q))
2317 : failed = 1;
2318 :
2319 : /* If this succeeded, but some part of the
2320 : register is still present, undo the
2321 : replacement. */
2322 8124 : else if (sregno < FIRST_PSEUDO_REGISTER
2323 8124 : && reg_overlap_mentioned_p (src, PATTERN (q)))
2324 : {
2325 0 : validate_replace_rtx (dest, src, q);
2326 0 : failed = 1;
2327 : }
2328 : }
2329 :
2330 : /* If DEST dies here, remove the death note and
2331 : save it for later. Make sure ALL of DEST dies
2332 : here; again, this is overly conservative. */
2333 30438 : if (! dest_death
2334 30438 : && (dest_death = find_regno_note (q, REG_DEAD, dregno)))
2335 : {
2336 9 : if (GET_MODE (XEXP (dest_death, 0)) == GET_MODE (dest))
2337 9 : remove_note (q, dest_death);
2338 : else
2339 : {
2340 : failed = 1;
2341 : dest_death = 0;
2342 : }
2343 : }
2344 : }
2345 :
2346 6959 : if (! failed)
2347 : {
2348 : /* Move death note of SRC from P to INSN. */
2349 6959 : remove_note (p, note);
2350 6959 : XEXP (note, 1) = REG_NOTES (insn);
2351 6959 : REG_NOTES (insn) = note;
2352 : }
2353 :
2354 : /* DEST is also dead if INSN has a REG_UNUSED note for
2355 : DEST. */
2356 6959 : if (! dest_death
2357 6959 : && (dest_death
2358 6950 : = find_regno_note (insn, REG_UNUSED, dregno)))
2359 : {
2360 0 : PUT_REG_NOTE_KIND (dest_death, REG_DEAD);
2361 0 : remove_note (insn, dest_death);
2362 : }
2363 :
2364 : /* Put death note of DEST on P if we saw it die. */
2365 6959 : if (dest_death)
2366 : {
2367 9 : XEXP (dest_death, 1) = REG_NOTES (p);
2368 9 : REG_NOTES (p) = dest_death;
2369 : }
2370 : break;
2371 : }
2372 :
2373 : /* If SRC is a hard register which is set or killed in
2374 : some other way, we can't do this optimization. */
2375 4658158 : else if (sregno < FIRST_PSEUDO_REGISTER && dead_or_set_p (p, src))
2376 : break;
2377 : }
2378 : }
2379 : }
2380 :
2381 :
2382 :
2383 : /* Return nonzero if REGNO is a particularly bad choice for reloading X. */
2384 : static bool
2385 0 : ira_bad_reload_regno_1 (int regno, rtx x)
2386 : {
2387 0 : int x_regno, n, i;
2388 0 : ira_allocno_t a;
2389 0 : enum reg_class pref;
2390 :
2391 : /* We only deal with pseudo regs. */
2392 0 : if (! x || GET_CODE (x) != REG)
2393 : return false;
2394 :
2395 0 : x_regno = REGNO (x);
2396 0 : if (x_regno < FIRST_PSEUDO_REGISTER)
2397 : return false;
2398 :
2399 : /* If the pseudo prefers REGNO explicitly, then do not consider
2400 : REGNO a bad spill choice. */
2401 0 : pref = reg_preferred_class (x_regno);
2402 0 : if (reg_class_size[pref] == 1)
2403 0 : return !TEST_HARD_REG_BIT (reg_class_contents[pref], regno);
2404 :
2405 : /* If the pseudo conflicts with REGNO, then we consider REGNO a
2406 : poor choice for a reload regno. */
2407 0 : a = ira_regno_allocno_map[x_regno];
2408 0 : n = ALLOCNO_NUM_OBJECTS (a);
2409 0 : for (i = 0; i < n; i++)
2410 : {
2411 0 : ira_object_t obj = ALLOCNO_OBJECT (a, i);
2412 0 : if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno))
2413 : return true;
2414 : }
2415 : return false;
2416 : }
2417 :
2418 : /* Return nonzero if REGNO is a particularly bad choice for reloading
2419 : IN or OUT. */
2420 : bool
2421 0 : ira_bad_reload_regno (int regno, rtx in, rtx out)
2422 : {
2423 0 : return (ira_bad_reload_regno_1 (regno, in)
2424 0 : || ira_bad_reload_regno_1 (regno, out));
2425 : }
2426 :
2427 : /* Add register clobbers from asm statements. */
2428 : static void
2429 1500777 : compute_regs_asm_clobbered (void)
2430 : {
2431 1500777 : basic_block bb;
2432 :
2433 16247075 : FOR_EACH_BB_FN (bb, cfun)
2434 : {
2435 14746298 : rtx_insn *insn;
2436 175974396 : FOR_BB_INSNS_REVERSE (bb, insn)
2437 : {
2438 161228098 : df_ref def;
2439 :
2440 161228098 : if (NONDEBUG_INSN_P (insn) && asm_noperands (PATTERN (insn)) >= 0)
2441 330072 : FOR_EACH_INSN_DEF (def, insn)
2442 : {
2443 220673 : unsigned int dregno = DF_REF_REGNO (def);
2444 220673 : if (HARD_REGISTER_NUM_P (dregno))
2445 304612 : add_to_hard_reg_set (&crtl->asm_clobbers,
2446 152306 : GET_MODE (DF_REF_REAL_REG (def)),
2447 : dregno);
2448 : }
2449 : }
2450 : }
2451 1500777 : }
2452 :
2453 :
2454 : /* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and
2455 : REGS_EVER_LIVE. */
2456 : void
2457 1500777 : ira_setup_eliminable_regset (void)
2458 : {
2459 1500777 : int i;
2460 1500777 : static const struct {const int from, to; } eliminables[] = ELIMINABLE_REGS;
2461 1631219 : int fp_reg_count = hard_regno_nregs (HARD_FRAME_POINTER_REGNUM, Pmode);
2462 :
 2463 : /* Set up is_leaf, as frame_pointer_required may use it. This function
 2464 : is called by sched_init before ira if scheduling is enabled. */
2465 1500777 : crtl->is_leaf = leaf_function_p ();
2466 :
2467 : /* FIXME: If EXIT_IGNORE_STACK is set, we will not save and restore
2468 : sp for alloca. So we can't eliminate the frame pointer in that
2469 : case. At some point, we should improve this by emitting the
2470 : sp-adjusting insns for this case. */
2471 1500777 : frame_pointer_needed
2472 3001554 : = (! flag_omit_frame_pointer
2473 1058896 : || (cfun->calls_alloca && EXIT_IGNORE_STACK)
2474 : /* We need the frame pointer to catch stack overflow exceptions if
2475 : the stack pointer is moving (as for the alloca case just above). */
2476 1049231 : || (STACK_CHECK_MOVING_SP
2477 1049231 : && flag_stack_check
2478 63 : && flag_exceptions
2479 26 : && cfun->can_throw_non_call_exceptions)
2480 1049227 : || crtl->accesses_prior_frames
2481 1046297 : || (SUPPORTS_STACK_ALIGNMENT && crtl->stack_realign_needed)
2482 2501274 : || targetm.frame_pointer_required ());
2483 :
 2484 : /* The chance that FRAME_POINTER_NEEDED changes from inspecting the
 2485 : RTL is very small. So if we use the frame pointer for RA and the
 2486 : RTL actually prevents this, we will spill the pseudos assigned to
 2487 : the frame pointer in LRA. */
2488 :
2489 1500777 : if (frame_pointer_needed)
2490 1000650 : for (i = 0; i < fp_reg_count; i++)
2491 500325 : df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM + i, true);
2492 :
2493 1500777 : ira_no_alloc_regs = no_unit_alloc_regs;
2494 1500777 : CLEAR_HARD_REG_SET (eliminable_regset);
2495 :
2496 1500777 : compute_regs_asm_clobbered ();
2497 :
2498 : /* Build the regset of all eliminable registers and show we can't
2499 : use those that we already know won't be eliminated. */
2500 7503885 : for (i = 0; i < (int) ARRAY_SIZE (eliminables); i++)
2501 : {
2502 6003108 : bool cannot_elim
2503 6003108 : = (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to)
2504 6003108 : || (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed));
2505 :
2506 6003108 : if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from))
2507 : {
2508 6003108 : SET_HARD_REG_BIT (eliminable_regset, eliminables[i].from);
2509 :
2510 6003108 : if (cannot_elim)
2511 1060584 : SET_HARD_REG_BIT (ira_no_alloc_regs, eliminables[i].from);
2512 : }
2513 0 : else if (cannot_elim)
2514 0 : error ("%s cannot be used in %<asm%> here",
2515 : reg_names[eliminables[i].from]);
2516 : else
2517 0 : df_set_regs_ever_live (eliminables[i].from, true);
2518 : }
2519 : if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
2520 : {
2521 3001554 : for (i = 0; i < fp_reg_count; i++)
2522 1500777 : if (global_regs[HARD_FRAME_POINTER_REGNUM + i])
2523 : /* Nothing to do: the register is already treated as live
2524 : where appropriate, and cannot be eliminated. */
2525 : ;
2526 1500756 : else if (!TEST_HARD_REG_BIT (crtl->asm_clobbers,
2527 : HARD_FRAME_POINTER_REGNUM + i))
2528 : {
2529 1499465 : SET_HARD_REG_BIT (eliminable_regset,
2530 : HARD_FRAME_POINTER_REGNUM + i);
2531 1499465 : if (frame_pointer_needed)
2532 500323 : SET_HARD_REG_BIT (ira_no_alloc_regs,
2533 : HARD_FRAME_POINTER_REGNUM + i);
2534 : }
2535 1291 : else if (frame_pointer_needed)
2536 0 : error ("%s cannot be used in %<asm%> here",
2537 : reg_names[HARD_FRAME_POINTER_REGNUM + i]);
2538 : else
2539 1291 : df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM + i, true);
2540 : }
2541 1500777 : }
2542 :
2543 :
2544 :
2545 : /* Vector of substitutions of register numbers,
2546 : used to map pseudo regs into hardware regs.
2547 : This is set up as a result of register allocation.
2548 : Element N is the hard reg assigned to pseudo reg N,
2549 : or is -1 if no hard reg was assigned.
2550 : If N is a hard reg number, element N is N. */
2551 : short *reg_renumber;
2552 :
2553 : /* Set up REG_RENUMBER and CALLER_SAVE_NEEDED (used by reload) from
2554 : the allocation found by IRA. */
2555 : static void
2556 1471362 : setup_reg_renumber (void)
2557 : {
2558 1471362 : int regno, hard_regno;
2559 1471362 : ira_allocno_t a;
2560 1471362 : ira_allocno_iterator ai;
2561 :
2562 1471362 : caller_save_needed = 0;
2563 37935239 : FOR_EACH_ALLOCNO (a, ai)
2564 : {
2565 36463877 : if (ira_use_lra_p && ALLOCNO_CAP_MEMBER (a) != NULL)
2566 3704601 : continue;
2567 : /* There are no caps at this point. */
2568 32759276 : ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL);
2569 32759276 : if (! ALLOCNO_ASSIGNED_P (a))
 2570 : /* This can happen if A is not referenced but is partially
 2571 : anticipated somewhere in a region. */
2572 0 : ALLOCNO_ASSIGNED_P (a) = true;
2573 32759276 : ira_free_allocno_updated_costs (a);
2574 32759276 : hard_regno = ALLOCNO_HARD_REGNO (a);
2575 32759276 : regno = ALLOCNO_REGNO (a);
2576 32759276 : reg_renumber[regno] = (hard_regno < 0 ? -1 : hard_regno);
2577 32759276 : if (hard_regno >= 0)
2578 : {
2579 29445808 : int i, nwords;
2580 29445808 : enum reg_class pclass;
2581 29445808 : ira_object_t obj;
2582 :
2583 29445808 : pclass = ira_pressure_class_translate[REGNO_REG_CLASS (hard_regno)];
2584 29445808 : nwords = ALLOCNO_NUM_OBJECTS (a);
2585 59998496 : for (i = 0; i < nwords; i++)
2586 : {
2587 30552688 : obj = ALLOCNO_OBJECT (a, i);
2588 30552688 : OBJECT_TOTAL_CONFLICT_HARD_REGS (obj)
2589 61105376 : |= ~reg_class_contents[pclass];
2590 : }
2591 29445808 : if (ira_need_caller_save_p (a, hard_regno))
2592 : {
2593 435724 : ira_assert (!optimize || flag_caller_saves
2594 : || (ALLOCNO_CALLS_CROSSED_NUM (a)
2595 : == ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a))
2596 : || regno >= ira_reg_equiv_len
2597 : || ira_equiv_no_lvalue_p (regno));
2598 435724 : caller_save_needed = 1;
2599 : }
2600 : }
2601 : }
2602 1471362 : }
2603 :
2604 : /* Set up allocno assignment flags for further allocation
2605 : improvements. */
2606 : static void
2607 0 : setup_allocno_assignment_flags (void)
2608 : {
2609 0 : int hard_regno;
2610 0 : ira_allocno_t a;
2611 0 : ira_allocno_iterator ai;
2612 :
2613 0 : FOR_EACH_ALLOCNO (a, ai)
2614 : {
2615 0 : if (! ALLOCNO_ASSIGNED_P (a))
 2616 : /* This can happen if A is not referenced but is partially
 2617 : anticipated somewhere in a region. */
2618 0 : ira_free_allocno_updated_costs (a);
2619 0 : hard_regno = ALLOCNO_HARD_REGNO (a);
 2620 : /* Don't assign hard registers to allocnos which are the
 2621 : destinations of a store removed at the end of a loop. It makes
 2622 : no sense to keep the same value in different hard registers. It
 2623 : is also impossible to assign hard registers correctly to such
 2624 : allocnos because the cost info and the info about intersecting
 2625 : calls are incorrect for them. */
2626 0 : ALLOCNO_ASSIGNED_P (a) = (hard_regno >= 0
2627 0 : || ALLOCNO_EMIT_DATA (a)->mem_optimized_dest_p
2628 0 : || (ALLOCNO_MEMORY_COST (a)
2629 0 : - ALLOCNO_CLASS_COST (a)) < 0);
2630 0 : ira_assert
2631 : (hard_regno < 0
2632 : || ira_hard_reg_in_set_p (hard_regno, ALLOCNO_MODE (a),
2633 : reg_class_contents[ALLOCNO_CLASS (a)]));
2634 : }
2635 0 : }
2636 :
2637 : /* Evaluate overall allocation cost and the costs for using hard
2638 : registers and memory for allocnos. */
2639 : static void
2640 1471362 : calculate_allocation_cost (void)
2641 : {
2642 1471362 : int hard_regno, cost;
2643 1471362 : ira_allocno_t a;
2644 1471362 : ira_allocno_iterator ai;
2645 :
2646 1471362 : ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
2647 37935239 : FOR_EACH_ALLOCNO (a, ai)
2648 : {
2649 36463877 : hard_regno = ALLOCNO_HARD_REGNO (a);
2650 36463877 : ira_assert (hard_regno < 0
2651 : || (ira_hard_reg_in_set_p
2652 : (hard_regno, ALLOCNO_MODE (a),
2653 : reg_class_contents[ALLOCNO_CLASS (a)])));
2654 36463877 : if (hard_regno < 0)
2655 : {
2656 3708600 : cost = ALLOCNO_MEMORY_COST (a);
2657 3708600 : ira_mem_cost += cost;
2658 : }
2659 32755277 : else if (ALLOCNO_HARD_REG_COSTS (a) != NULL)
2660 : {
2661 8594779 : cost = (ALLOCNO_HARD_REG_COSTS (a)
2662 : [ira_class_hard_reg_index
2663 8594779 : [ALLOCNO_CLASS (a)][hard_regno]]);
2664 8594779 : ira_reg_cost += cost;
2665 : }
2666 : else
2667 : {
2668 24160498 : cost = ALLOCNO_CLASS_COST (a);
2669 24160498 : ira_reg_cost += cost;
2670 : }
2671 36463877 : ira_overall_cost += cost;
2672 : }
2673 :
2674 1471362 : if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
2675 : {
2676 95 : fprintf (ira_dump_file,
2677 : "+++Costs: overall %" PRId64
2678 : ", reg %" PRId64
2679 : ", mem %" PRId64
2680 : ", ld %" PRId64
2681 : ", st %" PRId64
2682 : ", move %" PRId64,
2683 : ira_overall_cost, ira_reg_cost, ira_mem_cost,
2684 : ira_load_cost, ira_store_cost, ira_shuffle_cost);
2685 95 : fprintf (ira_dump_file, "\n+++ move loops %d, new jumps %d\n",
2686 : ira_move_loops_num, ira_additional_jumps_num);
2687 : }
2688 :
2689 1471362 : }
2690 :
2691 : #ifdef ENABLE_IRA_CHECKING
 2692 : /* Check the correctness of the allocation. We need this because of
 2693 : the complicated code that transforms a multi-region internal
 2694 : representation into a one-region representation. */
2695 : static void
2696 0 : check_allocation (void)
2697 : {
2698 0 : ira_allocno_t a;
2699 0 : int hard_regno, nregs, conflict_nregs;
2700 0 : ira_allocno_iterator ai;
2701 :
2702 0 : FOR_EACH_ALLOCNO (a, ai)
2703 : {
2704 0 : int n = ALLOCNO_NUM_OBJECTS (a);
2705 0 : int i;
2706 :
2707 0 : if (ALLOCNO_CAP_MEMBER (a) != NULL
2708 0 : || (hard_regno = ALLOCNO_HARD_REGNO (a)) < 0)
2709 0 : continue;
2710 0 : nregs = hard_regno_nregs (hard_regno, ALLOCNO_MODE (a));
2711 0 : if (nregs == 1)
2712 : /* We allocated a single hard register. */
2713 : n = 1;
2714 0 : else if (n > 1)
2715 : /* We allocated multiple hard registers, and we will test
2716 : conflicts in a granularity of single hard regs. */
2717 0 : nregs = 1;
2718 :
2719 0 : for (i = 0; i < n; i++)
2720 : {
2721 0 : ira_object_t obj = ALLOCNO_OBJECT (a, i);
2722 0 : ira_object_t conflict_obj;
2723 0 : ira_object_conflict_iterator oci;
2724 0 : int this_regno = hard_regno;
2725 0 : if (n > 1)
2726 : {
2727 0 : if (REG_WORDS_BIG_ENDIAN)
2728 : this_regno += n - i - 1;
2729 : else
2730 0 : this_regno += i;
2731 : }
2732 0 : FOR_EACH_OBJECT_CONFLICT (obj, conflict_obj, oci)
2733 : {
2734 0 : ira_allocno_t conflict_a = OBJECT_ALLOCNO (conflict_obj);
2735 0 : int conflict_hard_regno = ALLOCNO_HARD_REGNO (conflict_a);
2736 0 : if (conflict_hard_regno < 0)
2737 0 : continue;
2738 0 : if (ira_soft_conflict (a, conflict_a))
2739 0 : continue;
2740 :
2741 0 : conflict_nregs = hard_regno_nregs (conflict_hard_regno,
2742 0 : ALLOCNO_MODE (conflict_a));
2743 :
2744 0 : if (ALLOCNO_NUM_OBJECTS (conflict_a) > 1
2745 0 : && conflict_nregs == ALLOCNO_NUM_OBJECTS (conflict_a))
2746 : {
2747 0 : if (REG_WORDS_BIG_ENDIAN)
2748 : conflict_hard_regno += (ALLOCNO_NUM_OBJECTS (conflict_a)
2749 : - OBJECT_SUBWORD (conflict_obj) - 1);
2750 : else
2751 0 : conflict_hard_regno += OBJECT_SUBWORD (conflict_obj);
2752 0 : conflict_nregs = 1;
2753 : }
2754 :
2755 0 : if ((conflict_hard_regno <= this_regno
2756 0 : && this_regno < conflict_hard_regno + conflict_nregs)
2757 0 : || (this_regno <= conflict_hard_regno
2758 0 : && conflict_hard_regno < this_regno + nregs))
2759 : {
2760 0 : fprintf (stderr, "bad allocation for %d and %d\n",
2761 : ALLOCNO_REGNO (a), ALLOCNO_REGNO (conflict_a));
2762 0 : gcc_unreachable ();
2763 : }
2764 : }
2765 : }
2766 : }
2767 0 : }
2768 : #endif
2769 :
 2770 : /* Allocate REG_EQUIV_INIT. Set it up from IRA_REG_EQUIV, which
 2771 : should already be calculated. */
2772 : static void
2773 1471362 : setup_reg_equiv_init (void)
2774 : {
2775 1471362 : int i;
2776 1471362 : int max_regno = max_reg_num ();
2777 :
2778 203792478 : for (i = 0; i < max_regno; i++)
2779 200849754 : reg_equiv_init (i) = ira_reg_equiv[i].init_insns;
2780 1471362 : }
2781 :
 2782 : /* Update equiv info for TO_REGNO from the movement of FROM_REGNO
 2783 : to TO_REGNO. INSNS are the insns generated for the movement. It is assumed
2784 : that FROM_REGNO and TO_REGNO always have the same value at the
2785 : point of any move containing such registers. This function is used
2786 : to update equiv info for register shuffles on the region borders
2787 : and for caller save/restore insns. */
2788 : void
2789 2199278 : ira_update_equiv_info_by_shuffle_insn (int to_regno, int from_regno, rtx_insn *insns)
2790 : {
2791 2199278 : rtx_insn *insn;
2792 2199278 : rtx x, note;
2793 :
2794 2199278 : if (! ira_reg_equiv[from_regno].defined_p
2795 2199278 : && (! ira_reg_equiv[to_regno].defined_p
2796 913 : || ((x = ira_reg_equiv[to_regno].memory) != NULL_RTX
2797 912 : && ! MEM_READONLY_P (x))))
2798 : return;
2799 37315 : insn = insns;
2800 37315 : if (NEXT_INSN (insn) != NULL_RTX)
2801 : {
2802 0 : if (! ira_reg_equiv[to_regno].defined_p)
2803 : {
2804 0 : ira_assert (ira_reg_equiv[to_regno].init_insns == NULL_RTX);
2805 : return;
2806 : }
2807 0 : ira_reg_equiv[to_regno].defined_p = false;
2808 0 : ira_reg_equiv[to_regno].caller_save_p = false;
2809 0 : ira_reg_equiv[to_regno].memory
2810 0 : = ira_reg_equiv[to_regno].constant
2811 0 : = ira_reg_equiv[to_regno].invariant
2812 0 : = ira_reg_equiv[to_regno].init_insns = NULL;
2813 0 : if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2814 0 : fprintf (ira_dump_file,
2815 : " Invalidating equiv info for reg %d\n", to_regno);
2816 0 : return;
2817 : }
 2818 : /* It is possible that FROM_REGNO still has no equivalence because,
 2819 : in shuffles to_regno<-from_regno and from_regno<-to_regno, the 2nd
 2820 : insn has not been processed yet. */
2821 37315 : if (ira_reg_equiv[from_regno].defined_p)
2822 : {
2823 37314 : ira_reg_equiv[to_regno].defined_p = true;
2824 37314 : if ((x = ira_reg_equiv[from_regno].memory) != NULL_RTX)
2825 : {
2826 37173 : ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX
2827 : && ira_reg_equiv[from_regno].constant == NULL_RTX);
2828 37173 : ira_assert (ira_reg_equiv[to_regno].memory == NULL_RTX
2829 : || rtx_equal_p (ira_reg_equiv[to_regno].memory, x));
2830 37173 : ira_reg_equiv[to_regno].memory = x;
2831 37173 : if (! MEM_READONLY_P (x))
 2832 : /* We don't add the insn to the init insn list because a memory
 2833 : equivalence only indicates which memory is better to use
 2834 : when the pseudo is spilled. */
2835 : return;
2836 : }
2837 141 : else if ((x = ira_reg_equiv[from_regno].constant) != NULL_RTX)
2838 : {
2839 41 : ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX);
2840 41 : ira_assert (ira_reg_equiv[to_regno].constant == NULL_RTX
2841 : || rtx_equal_p (ira_reg_equiv[to_regno].constant, x));
2842 41 : ira_reg_equiv[to_regno].constant = x;
2843 : }
2844 : else
2845 : {
2846 100 : x = ira_reg_equiv[from_regno].invariant;
2847 100 : ira_assert (x != NULL_RTX);
2848 100 : ira_assert (ira_reg_equiv[to_regno].invariant == NULL_RTX
2849 : || rtx_equal_p (ira_reg_equiv[to_regno].invariant, x));
2850 100 : ira_reg_equiv[to_regno].invariant = x;
2851 : }
2852 158 : if (find_reg_note (insn, REG_EQUIV, x) == NULL_RTX)
2853 : {
2854 158 : note = set_unique_reg_note (insn, REG_EQUIV, copy_rtx (x));
2855 158 : gcc_assert (note != NULL_RTX);
2856 158 : if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2857 : {
2858 0 : fprintf (ira_dump_file,
2859 : " Adding equiv note to insn %u for reg %d ",
2860 0 : INSN_UID (insn), to_regno);
2861 0 : dump_value_slim (ira_dump_file, x, 1);
2862 0 : fprintf (ira_dump_file, "\n");
2863 : }
2864 : }
2865 : }
2866 159 : ira_reg_equiv[to_regno].init_insns
2867 318 : = gen_rtx_INSN_LIST (VOIDmode, insn,
2868 159 : ira_reg_equiv[to_regno].init_insns);
2869 159 : if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
2870 0 : fprintf (ira_dump_file,
2871 : " Adding equiv init move insn %u to reg %d\n",
2872 0 : INSN_UID (insn), to_regno);
2873 : }
2874 :
2875 : /* Fix values of array REG_EQUIV_INIT after live range splitting done
2876 : by IRA. */
2877 : static void
2878 2087370 : fix_reg_equiv_init (void)
2879 : {
2880 2087370 : int max_regno = max_reg_num ();
2881 2087370 : int i, new_regno, max;
2882 2087370 : rtx set;
2883 2087370 : rtx_insn_list *x, *next, *prev;
2884 2087370 : rtx_insn *insn;
2885 :
2886 2087370 : if (max_regno_before_ira < max_regno)
2887 : {
2888 508473 : max = vec_safe_length (reg_equivs);
2889 508473 : grow_reg_equivs ();
2890 47824289 : for (i = FIRST_PSEUDO_REGISTER; i < max; i++)
2891 47315816 : for (prev = NULL, x = reg_equiv_init (i);
2892 51881132 : x != NULL_RTX;
2893 : x = next)
2894 : {
2895 4565316 : next = x->next ();
2896 4565316 : insn = x->insn ();
2897 4565316 : set = single_set (insn);
2898 4565316 : ira_assert (set != NULL_RTX
2899 : && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))));
2900 4565316 : if (REG_P (SET_DEST (set))
2901 4565316 : && ((int) REGNO (SET_DEST (set)) == i
2902 0 : || (int) ORIGINAL_REGNO (SET_DEST (set)) == i))
2903 : new_regno = REGNO (SET_DEST (set));
2904 500482 : else if (REG_P (SET_SRC (set))
2905 500482 : && ((int) REGNO (SET_SRC (set)) == i
2906 0 : || (int) ORIGINAL_REGNO (SET_SRC (set)) == i))
2907 : new_regno = REGNO (SET_SRC (set));
2908 : else
2909 0 : gcc_unreachable ();
2910 4565316 : if (new_regno == i)
2911 : prev = x;
2912 : else
2913 : {
2914 : /* Remove the wrong list element. */
2915 0 : if (prev == NULL_RTX)
2916 0 : reg_equiv_init (i) = next;
2917 : else
2918 0 : XEXP (prev, 1) = next;
2919 0 : XEXP (x, 1) = reg_equiv_init (new_regno);
2920 0 : reg_equiv_init (new_regno) = x;
2921 : }
2922 : }
2923 : }
2924 2087370 : }
2925 :
2926 : #ifdef ENABLE_IRA_CHECKING
2927 : /* Print redundant memory-memory copies. */
2928 : static void
2929 1043685 : print_redundant_copies (void)
2930 : {
2931 1043685 : int hard_regno;
2932 1043685 : ira_allocno_t a;
2933 1043685 : ira_copy_t cp, next_cp;
2934 1043685 : ira_allocno_iterator ai;
2935 :
2936 26147365 : FOR_EACH_ALLOCNO (a, ai)
2937 : {
2938 25103680 : if (ALLOCNO_CAP_MEMBER (a) != NULL)
2939 : /* It is a cap. */
2940 3704601 : continue;
2941 21399079 : hard_regno = ALLOCNO_HARD_REGNO (a);
2942 21399079 : if (hard_regno >= 0)
2943 18172544 : continue;
2944 4187939 : for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp)
2945 961404 : if (cp->first == a)
2946 372735 : next_cp = cp->next_first_allocno_copy;
2947 : else
2948 : {
2949 588669 : next_cp = cp->next_second_allocno_copy;
2950 588669 : if (internal_flag_ira_verbose > 4 && ira_dump_file != NULL
2951 1 : && cp->insn != NULL_RTX
2952 0 : && ALLOCNO_HARD_REGNO (cp->first) == hard_regno)
2953 0 : fprintf (ira_dump_file,
2954 : " Redundant move from %d(freq %d):%d\n",
2955 0 : INSN_UID (cp->insn), cp->freq, hard_regno);
2956 : }
2957 : }
2958 1043685 : }
2959 : #endif
2960 :
 2961 : /* Set up preferred and alternative classes for new pseudo-registers
2962 : created by IRA starting with START. */
2963 : static void
2964 1079285 : setup_preferred_alternate_classes_for_new_pseudos (int start)
2965 : {
2966 1079285 : int i, old_regno;
2967 1079285 : int max_regno = max_reg_num ();
2968 :
2969 2263966 : for (i = start; i < max_regno; i++)
2970 : {
2971 1184681 : old_regno = ORIGINAL_REGNO (regno_reg_rtx[i]);
2972 1184681 : ira_assert (i != old_regno);
2973 1184681 : setup_reg_classes (i, reg_preferred_class (old_regno),
2974 : reg_alternate_class (old_regno),
2975 : reg_allocno_class (old_regno));
2976 1184681 : if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
2977 0 : fprintf (ira_dump_file,
2978 : " New r%d: setting preferred %s, alternative %s\n",
2979 0 : i, reg_class_names[reg_preferred_class (old_regno)],
2980 0 : reg_class_names[reg_alternate_class (old_regno)]);
2981 : }
2982 1079285 : }
2983 :
2984 :
2985 : /* The number of entries allocated in reg_info. */
2986 : static int allocated_reg_info_size;
2987 :
2988 : /* Regional allocation can create new pseudo-registers. This function
2989 : expands some arrays for pseudo-registers. */
2990 : static void
2991 1079285 : expand_reg_info (void)
2992 : {
2993 1079285 : int i;
2994 1079285 : int size = max_reg_num ();
2995 :
2996 1079285 : resize_reg_info ();
2997 2263966 : for (i = allocated_reg_info_size; i < size; i++)
2998 1184681 : setup_reg_classes (i, GENERAL_REGS, ALL_REGS, GENERAL_REGS);
2999 1079285 : setup_preferred_alternate_classes_for_new_pseudos (allocated_reg_info_size);
3000 1079285 : allocated_reg_info_size = size;
3001 1079285 : }
3002 :
 3003 : /* Return TRUE if register pressure in the function is too high.
 3004 : It is used to decide when stack slot sharing is worthwhile. */
3005 : static bool
3006 1471362 : too_high_register_pressure_p (void)
3007 : {
3008 1471362 : int i;
3009 1471362 : enum reg_class pclass;
3010 :
3011 7393831 : for (i = 0; i < ira_pressure_classes_num; i++)
3012 : {
3013 5922471 : pclass = ira_pressure_classes[i];
3014 5922471 : if (ira_loop_tree_root->reg_pressure[pclass] > 10000)
3015 : return true;
3016 : }
3017 : return false;
3018 : }
3019 :
3020 :
3021 :
3022 : /* Indicate that hard register number FROM was eliminated and replaced with
3023 : an offset from hard register number TO. The status of hard registers live
3024 : at the start of a basic block is updated by replacing a use of FROM with
3025 : a use of TO. */
3026 :
3027 : void
3028 0 : mark_elimination (int from, int to)
3029 : {
3030 0 : basic_block bb;
3031 0 : bitmap r;
3032 :
3033 0 : FOR_EACH_BB_FN (bb, cfun)
3034 : {
3035 0 : r = DF_LR_IN (bb);
3036 0 : if (bitmap_bit_p (r, from))
3037 : {
3038 0 : bitmap_clear_bit (r, from);
3039 0 : bitmap_set_bit (r, to);
3040 : }
3041 0 : if (! df_live)
3042 0 : continue;
3043 0 : r = DF_LIVE_IN (bb);
3044 0 : if (bitmap_bit_p (r, from))
3045 : {
3046 0 : bitmap_clear_bit (r, from);
3047 0 : bitmap_set_bit (r, to);
3048 : }
3049 : }
3050 0 : }
3051 :
3052 :
3053 :
3054 : /* The length of the following array. */
3055 : int ira_reg_equiv_len;
3056 :
3057 : /* Info about equiv. info for each register. */
3058 : struct ira_reg_equiv_s *ira_reg_equiv;
3059 :
3060 : /* Expand ira_reg_equiv if necessary. */
3061 : void
3062 14847546 : ira_expand_reg_equiv (void)
3063 : {
3064 14847546 : int old = ira_reg_equiv_len;
3065 :
3066 14847546 : if (ira_reg_equiv_len > max_reg_num ())
3067 : return;
3068 1474510 : ira_reg_equiv_len = max_reg_num () * 3 / 2 + 1;
3069 1474510 : ira_reg_equiv
3070 2949020 : = (struct ira_reg_equiv_s *) xrealloc (ira_reg_equiv,
3071 1474510 : ira_reg_equiv_len
3072 : * sizeof (struct ira_reg_equiv_s));
3073 1474510 : gcc_assert (old < ira_reg_equiv_len);
3074 1474510 : memset (ira_reg_equiv + old, 0,
3075 1474510 : sizeof (struct ira_reg_equiv_s) * (ira_reg_equiv_len - old));
3076 : }
3077 :
3078 : static void
3079 1471362 : init_reg_equiv (void)
3080 : {
3081 1471362 : ira_reg_equiv_len = 0;
3082 1471362 : ira_reg_equiv = NULL;
3083 0 : ira_expand_reg_equiv ();
3084 0 : }
3085 :
3086 : static void
3087 1471362 : finish_reg_equiv (void)
3088 : {
3089 1471362 : free (ira_reg_equiv);
3090 0 : }
3091 :
3092 :
3093 :
3094 : struct equivalence
3095 : {
 3096 : /* Set when a REG_EQUIV note is found or created. Used to
3097 : keep track of what memory accesses might be created later,
3098 : e.g. by reload. */
3099 : rtx replacement;
3100 : rtx *src_p;
3101 :
 3102 : /* The list of instructions which initialize this register.
3103 :
3104 : NULL indicates we know nothing about this register's equivalence
3105 : properties.
3106 :
3107 : An INSN_LIST with a NULL insn indicates this pseudo is already
3108 : known to not have a valid equivalence. */
3109 : rtx_insn_list *init_insns;
3110 :
3111 : /* Loop depth is used to recognize equivalences which appear
3112 : to be present within the same loop (or in an inner loop). */
3113 : short loop_depth;
3114 : /* Nonzero if this had a preexisting REG_EQUIV note. */
3115 : unsigned char is_arg_equivalence : 1;
3116 : /* Set when an attempt should be made to replace a register
3117 : with the associated src_p entry. */
3118 : unsigned char replace : 1;
3119 : /* Set if this register has no known equivalence. */
3120 : unsigned char no_equiv : 1;
3121 : /* Set if this register is mentioned in a paradoxical subreg. */
3122 : unsigned char pdx_subregs : 1;
3123 : };
3124 :
3125 : /* reg_equiv[N] (where N is a pseudo reg number) is the equivalence
3126 : structure for that register. */
3127 : static struct equivalence *reg_equiv;
3128 :
3129 : /* Used for communication between the following two functions. */
3130 : struct equiv_mem_data
3131 : {
3132 : /* A MEM that we wish to ensure remains unchanged. */
3133 : rtx equiv_mem;
3134 :
3135 : /* Set true if EQUIV_MEM is modified. */
3136 : bool equiv_mem_modified;
3137 : };
3138 :
3139 : /* If EQUIV_MEM is modified by modifying DEST, indicate that it is modified.
3140 : Called via note_stores. */
3141 : static void
3142 14146214 : validate_equiv_mem_from_store (rtx dest, const_rtx set ATTRIBUTE_UNUSED,
3143 : void *data)
3144 : {
3145 14146214 : struct equiv_mem_data *info = (struct equiv_mem_data *) data;
3146 :
3147 14146214 : if ((REG_P (dest)
3148 10454845 : && reg_overlap_mentioned_p (dest, info->equiv_mem))
3149 24585764 : || (MEM_P (dest)
3150 3657995 : && anti_dependence (info->equiv_mem, dest)))
3151 330969 : info->equiv_mem_modified = true;
3152 14146214 : }
3153 :
3154 : static bool equiv_init_varies_p (rtx x);
3155 :
3156 : enum valid_equiv { valid_none, valid_combine, valid_reload };
3157 :
3158 : /* Verify that no store between START and the death of REG invalidates
3159 : MEMREF. MEMREF is invalidated by modifying a register used in MEMREF,
3160 : by storing into an overlapping memory location, or with a non-const
3161 : CALL_INSN.
3162 :
3163 : Return VALID_RELOAD if MEMREF remains valid for both reload and
3164 : combine_and_move insns, VALID_COMBINE if only valid for
3165 : combine_and_move_insns, and VALID_NONE otherwise. */
3166 : static enum valid_equiv
3167 4032716 : validate_equiv_mem (rtx_insn *start, rtx reg, rtx memref)
3168 : {
3169 4032716 : rtx_insn *insn;
3170 4032716 : rtx note;
3171 4032716 : struct equiv_mem_data info = { memref, false };
3172 4032716 : enum valid_equiv ret = valid_reload;
3173 :
3174 : /* If the memory reference has side effects or is volatile, it isn't a
3175 : valid equivalence. */
3176 4032716 : if (side_effects_p (memref))
3177 : return valid_none;
3178 :
3179 21016347 : for (insn = start; insn; insn = NEXT_INSN (insn))
3180 : {
3181 21016126 : if (!INSN_P (insn))
3182 1407708 : continue;
3183 :
3184 19608418 : if (find_reg_note (insn, REG_DEAD, reg))
3185 : return ret;
3186 :
3187 16790685 : if (CALL_P (insn))
3188 : {
3189 : /* We can combine a reg def from one insn into a reg use in
3190 : another over a call if the memory is readonly or the call
3191 : const/pure. However, we can't set reg_equiv notes up for
3192 : reload over any call. The problem is the equivalent form
3193 : may reference a pseudo which gets assigned a call
3194 : clobbered hard reg. When we later replace REG with its
3195 : equivalent form, the value in the call-clobbered reg has
3196 : been changed and all hell breaks loose. */
3197 93482 : ret = valid_combine;
3198 93482 : if (!MEM_READONLY_P (memref)
3199 93482 : && (!RTL_CONST_OR_PURE_CALL_P (insn)
3200 8324 : || equiv_init_varies_p (XEXP (memref, 0))))
3201 87490 : return valid_none;
3202 : }
3203 :
3204 16703195 : note_stores (insn, validate_equiv_mem_from_store, &info);
3205 16703195 : if (info.equiv_mem_modified)
3206 : return valid_none;
3207 :
3208 : /* If a register mentioned in MEMREF is modified via an
3209 : auto-increment, we lose the equivalence. Do the same if one
3210 : dies; although we could extend the life, it doesn't seem worth
3211 : the trouble. */
3212 :
3213 22841694 : for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
3214 7090007 : if ((REG_NOTE_KIND (note) == REG_INC
3215 7090007 : || REG_NOTE_KIND (note) == REG_DEAD)
3216 5340955 : && REG_P (XEXP (note, 0))
3217 12430962 : && reg_overlap_mentioned_p (XEXP (note, 0), memref))
3218 : return valid_none;
3219 : }
3220 :
3221 : return valid_none;
3222 : }
3223 :
3224 : /* Returns false if X is known to be invariant. */
3225 : static bool
3226 865028 : equiv_init_varies_p (rtx x)
3227 : {
3228 865028 : RTX_CODE code = GET_CODE (x);
3229 865028 : int i;
3230 865028 : const char *fmt;
3231 :
3232 865028 : switch (code)
3233 : {
3234 240566 : case MEM:
3235 240566 : return !MEM_READONLY_P (x) || equiv_init_varies_p (XEXP (x, 0));
3236 :
3237 : case CONST:
3238 : CASE_CONST_ANY:
3239 : case SYMBOL_REF:
3240 : case LABEL_REF:
3241 : return false;
3242 :
3243 189918 : case REG:
3244 189918 : return reg_equiv[REGNO (x)].replace == 0 && rtx_varies_p (x, 0);
3245 :
3246 0 : case ASM_OPERANDS:
3247 0 : if (MEM_VOLATILE_P (x))
3248 : return true;
3249 :
3250 : /* Fall through. */
3251 :
3252 133659 : default:
3253 133659 : break;
3254 : }
3255 :
3256 133659 : fmt = GET_RTX_FORMAT (code);
3257 327900 : for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3258 220729 : if (fmt[i] == 'e')
3259 : {
3260 218804 : if (equiv_init_varies_p (XEXP (x, i)))
3261 : return true;
3262 : }
3263 1925 : else if (fmt[i] == 'E')
3264 : {
3265 : int j;
3266 3813 : for (j = 0; j < XVECLEN (x, i); j++)
3267 3451 : if (equiv_init_varies_p (XVECEXP (x, i, j)))
3268 : return true;
3269 : }
3270 :
3271 : return false;
3272 : }
3273 :
3274 : /* Returns true if X (used to initialize register REGNO) is movable.
3275 : X is only movable if the registers it uses have equivalent initializations
3276 :    which appear to be within the same loop (or in an inner loop) and are
3277 :    movable, or if they are not candidates for local_alloc and don't vary.  */
3278 : static bool
3279 10430886 : equiv_init_movable_p (rtx x, int regno)
3280 : {
3281 13380141 : int i, j;
3282 13380141 : const char *fmt;
3283 13380141 : enum rtx_code code = GET_CODE (x);
3284 :
3285 13380141 : switch (code)
3286 : {
3287 2949255 : case SET:
3288 2949255 : return equiv_init_movable_p (SET_SRC (x), regno);
3289 :
3290 : case CLOBBER:
3291 : return false;
3292 :
3293 : case PRE_INC:
3294 : case PRE_DEC:
3295 : case POST_INC:
3296 : case POST_DEC:
3297 : case PRE_MODIFY:
3298 : case POST_MODIFY:
3299 : return false;
3300 :
3301 1781111 : case REG:
3302 1781111 : return ((reg_equiv[REGNO (x)].loop_depth >= reg_equiv[regno].loop_depth
3303 1316149 : && reg_equiv[REGNO (x)].replace)
3304 3026479 : || (REG_BASIC_BLOCK (REGNO (x)) < NUM_FIXED_BLOCKS
3305 1559490 : && ! rtx_varies_p (x, 0)));
3306 :
3307 : case UNSPEC_VOLATILE:
3308 : return false;
3309 :
3310 0 : case ASM_OPERANDS:
3311 0 : if (MEM_VOLATILE_P (x))
3312 : return false;
3313 :
3314 : /* Fall through. */
3315 :
3316 8017194 : default:
3317 8017194 : break;
3318 : }
3319 :
3320 8017194 : fmt = GET_RTX_FORMAT (code);
3321 18984252 : for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3322 13004552 : switch (fmt[i])
3323 : {
3324 5994240 : case 'e':
3325 5994240 : if (! equiv_init_movable_p (XEXP (x, i), regno))
3326 : return false;
3327 : break;
3328 748544 : case 'E':
3329 970767 : for (j = XVECLEN (x, i) - 1; j >= 0; j--)
3330 854810 : if (! equiv_init_movable_p (XVECEXP (x, i, j), regno))
3331 : return false;
3332 : break;
3333 : }
3334 :
3335 : return true;
3336 : }
3337 :
3338 : static bool memref_referenced_p (rtx memref, rtx x, bool read_p);
3339 :
3340 : /* Auxiliary function for memref_referenced_p.  Process X, the
3341 :    destination of a store, checking whether it references MEMREF.  */
3342 : static bool
3343 837541 : process_set_for_memref_referenced_p (rtx memref, rtx x)
3344 : {
3345 : /* If we are setting a MEM, it doesn't count (its address does), but any
3346 : other SET_DEST that has a MEM in it is referencing the MEM. */
3347 837541 : if (MEM_P (x))
3348 : {
3349 678698 : if (memref_referenced_p (memref, XEXP (x, 0), true))
3350 : return true;
3351 : }
3352 158843 : else if (memref_referenced_p (memref, x, false))
3353 : return true;
3354 :
3355 : return false;
3356 : }
3357 :
3358 : /* TRUE if X references a memory location (as a read if READ_P) that
3359 : would be affected by a store to MEMREF. */
3360 : static bool
3361 3978865 : memref_referenced_p (rtx memref, rtx x, bool read_p)
3362 : {
3363 3978865 : int i, j;
3364 3978865 : const char *fmt;
3365 3978865 : enum rtx_code code = GET_CODE (x);
3366 :
3367 3978865 : switch (code)
3368 : {
3369 : case CONST:
3370 : case LABEL_REF:
3371 : case SYMBOL_REF:
3372 : CASE_CONST_ANY:
3373 : case PC:
3374 : case HIGH:
3375 : case LO_SUM:
3376 : return false;
3377 :
3378 1596954 : case REG:
3379 1596954 : return (reg_equiv[REGNO (x)].replacement
3380 1668878 : && memref_referenced_p (memref,
3381 71924 : reg_equiv[REGNO (x)].replacement, read_p));
3382 :
3383 132330 : case MEM:
3384 : /* Memory X might have another effective type than MEMREF. */
3385 132330 : if (read_p || true_dependence (memref, VOIDmode, x))
3386 120649 : return true;
3387 : break;
3388 :
3389 821834 : case SET:
3390 821834 : if (process_set_for_memref_referenced_p (memref, SET_DEST (x)))
3391 : return true;
3392 :
3393 806979 : return memref_referenced_p (memref, SET_SRC (x), true);
3394 :
3395 15707 : case CLOBBER:
3396 15707 : if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3397 : return true;
3398 :
3399 : return false;
3400 :
3401 0 : case PRE_DEC:
3402 0 : case POST_DEC:
3403 0 : case PRE_INC:
3404 0 : case POST_INC:
3405 0 : if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3406 : return true;
3407 :
3408 0 : return memref_referenced_p (memref, XEXP (x, 0), true);
3409 :
3410 0 : case POST_MODIFY:
3411 0 : case PRE_MODIFY:
3412 : /* op0 = op0 + op1 */
3413 0 : if (process_set_for_memref_referenced_p (memref, XEXP (x, 0)))
3414 : return true;
3415 :
3416 0 : if (memref_referenced_p (memref, XEXP (x, 0), true))
3417 : return true;
3418 :
3419 0 : return memref_referenced_p (memref, XEXP (x, 1), true);
3420 :
3421 : default:
3422 : break;
3423 : }
3424 :
3425 739601 : fmt = GET_RTX_FORMAT (code);
3426 2149807 : for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
3427 1441521 : switch (fmt[i])
3428 : {
3429 1402146 : case 'e':
3430 1402146 : if (memref_referenced_p (memref, XEXP (x, i), read_p))
3431 : return true;
3432 : break;
3433 19243 : case 'E':
3434 54825 : for (j = XVECLEN (x, i) - 1; j >= 0; j--)
3435 38472 : if (memref_referenced_p (memref, XVECEXP (x, i, j), read_p))
3436 : return true;
3437 : break;
3438 : }
3439 :
3440 : return false;
3441 : }
3442 :
3443 : /* TRUE if some insn in the range (START, END] references a memory location
3444 : that would be affected by a store to MEMREF.
3445 :
3446 : Callers should not call this routine if START is after END in the
3447 : RTL chain. */
3448 :
3449 : static bool
3450 630645 : memref_used_between_p (rtx memref, rtx_insn *start, rtx_insn *end)
3451 : {
3452 630645 : rtx_insn *insn;
3453 :
3454 2082426 : for (insn = NEXT_INSN (start);
3455 4149077 : insn && insn != NEXT_INSN (end);
3456 1451781 : insn = NEXT_INSN (insn))
3457 : {
3458 1572430 : if (!NONDEBUG_INSN_P (insn))
3459 750627 : continue;
3460 :
3461 821803 : if (memref_referenced_p (memref, PATTERN (insn), false))
3462 : return true;
3463 :
3464 : /* Nonconst functions may access memory. */
3465 701154 : if (CALL_P (insn) && (! RTL_CONST_CALL_P (insn)))
3466 : return true;
3467 : }
3468 :
3469 509996 : gcc_assert (insn == NEXT_INSN (end));
3470 : return false;
3471 : }
3472 :
3473 : /* Mark REG as having no known equivalence.
3474 : Some instructions might have been processed before and furnished
3475 : with REG_EQUIV notes for this register; these notes will have to be
3476 : removed.
3477 : STORE is the piece of RTL that does the non-constant / conflicting
3478 : assignment - a SET, CLOBBER or REG_INC note. It is currently not used,
3479 : but needs to be there because this function is called from note_stores. */
3480 : static void
3481 50973637 : no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED,
3482 : void *data ATTRIBUTE_UNUSED)
3483 : {
3484 50973637 : int regno;
3485 50973637 : rtx_insn_list *list;
3486 :
3487 50973637 : if (!REG_P (reg))
3488 : return;
3489 35195457 : regno = REGNO (reg);
3490 35195457 : reg_equiv[regno].no_equiv = 1;
3491 35195457 : list = reg_equiv[regno].init_insns;
3492 64083680 : if (list && list->insn () == NULL)
3493 : return;
3494 7090675 : reg_equiv[regno].init_insns = gen_rtx_INSN_LIST (VOIDmode, NULL_RTX, NULL);
3495 7090675 : reg_equiv[regno].replacement = NULL_RTX;
3496 :   /* This doesn't matter for equivalences made for argument registers;
3497 :      we should keep their initialization insns.  */
3498 7090675 : if (reg_equiv[regno].is_arg_equivalence)
3499 : return;
3500 7085433 : ira_reg_equiv[regno].defined_p = false;
3501 7085433 : ira_reg_equiv[regno].caller_save_p = false;
3502 7085433 : ira_reg_equiv[regno].init_insns = NULL;
3503 7906920 : for (; list; list = list->next ())
3504 : {
3505 821487 : rtx_insn *insn = list->insn ();
3506 821487 : remove_note (insn, find_reg_note (insn, REG_EQUIV, NULL_RTX));
3507 : }
3508 : }
3509 :
3510 : /* Check whether the SUBREG is a paradoxical subreg and set the result
3511 : in PDX_SUBREGS. */
3512 :
3513 : static void
3514 83536766 : set_paradoxical_subreg (rtx_insn *insn)
3515 : {
3516 83536766 : subrtx_iterator::array_type array;
3517 529167296 : FOR_EACH_SUBRTX (iter, array, PATTERN (insn), NONCONST)
3518 : {
3519 445630530 : const_rtx subreg = *iter;
3520 445630530 : if (GET_CODE (subreg) == SUBREG)
3521 : {
3522 2873514 : const_rtx reg = SUBREG_REG (subreg);
3523 2873514 : if (REG_P (reg) && paradoxical_subreg_p (subreg))
3524 828038 : reg_equiv[REGNO (reg)].pdx_subregs = true;
3525 : }
3526 : }
3527 83536766 : }
3528 :
3529 : /* In DEBUG_INSN location adjust REGs from CLEARED_REGS bitmap to the
3530 : equivalent replacement. */
3531 :
3532 : static rtx
3533 42250821 : adjust_cleared_regs (rtx loc, const_rtx old_rtx ATTRIBUTE_UNUSED, void *data)
3534 : {
3535 42250821 : if (REG_P (loc))
3536 : {
3537 6477932 : bitmap cleared_regs = (bitmap) data;
3538 6477932 : if (bitmap_bit_p (cleared_regs, REGNO (loc)))
3539 17356 : return simplify_replace_fn_rtx (copy_rtx (*reg_equiv[REGNO (loc)].src_p),
3540 17356 : NULL_RTX, adjust_cleared_regs, data);
3541 : }
3542 : return NULL_RTX;
3543 : }
3544 :
3545 : /* Given register REGNO is set only once, return true if the defining
3546 : insn dominates all uses. */
3547 :
3548 : static bool
3549 49396 : def_dominates_uses (int regno)
3550 : {
3551 49396 : df_ref def = DF_REG_DEF_CHAIN (regno);
3552 :
3553 49396 : struct df_insn_info *def_info = DF_REF_INSN_INFO (def);
3554 : /* If this is an artificial def (eh handler regs, hard frame pointer
3555 : for non-local goto, regs defined on function entry) then def_info
3556 : is NULL and the reg is always live before any use. We might
3557 : reasonably return true in that case, but since the only call
3558 : of this function is currently here in ira.cc when we are looking
3559 : at a defining insn we can't have an artificial def as that would
3560 : bump DF_REG_DEF_COUNT. */
3561 49396 : gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && def_info != NULL);
3562 :
3563 49396 : rtx_insn *def_insn = DF_REF_INSN (def);
3564 49396 : basic_block def_bb = BLOCK_FOR_INSN (def_insn);
3565 :
3566 49396 : for (df_ref use = DF_REG_USE_CHAIN (regno);
3567 141694 : use;
3568 92298 : use = DF_REF_NEXT_REG (use))
3569 : {
3570 92298 : struct df_insn_info *use_info = DF_REF_INSN_INFO (use);
3571 : /* Only check real uses, not artificial ones. */
3572 92298 : if (use_info)
3573 : {
3574 92298 : rtx_insn *use_insn = DF_REF_INSN (use);
3575 92298 : if (!DEBUG_INSN_P (use_insn))
3576 : {
3577 92046 : basic_block use_bb = BLOCK_FOR_INSN (use_insn);
3578 92046 : if (use_bb != def_bb
3579 92046 : ? !dominated_by_p (CDI_DOMINATORS, use_bb, def_bb)
3580 54223 : : DF_INSN_INFO_LUID (use_info) < DF_INSN_INFO_LUID (def_info))
3581 : return false;
3582 : }
3583 : }
3584 : }
3585 : return true;
3586 : }
3587 :
3588 : /* Scan the instructions before update_equiv_regs. Record which registers
3589 : are referenced as paradoxical subregs. Also check for cases in which
3590 : the current function needs to save a register that one of its call
3591 : instructions clobbers.
3592 :
3593 : These things are logically unrelated, but it's more efficient to do
3594 : them together. */
3595 :
3596 : static void
3597 1471362 : update_equiv_regs_prescan (void)
3598 : {
3599 1471362 : basic_block bb;
3600 1471362 : rtx_insn *insn;
3601 1471362 : function_abi_aggregator callee_abis;
3602 :
3603 15875447 : FOR_EACH_BB_FN (bb, cfun)
3604 173423809 : FOR_BB_INSNS (bb, insn)
3605 159019724 : if (NONDEBUG_INSN_P (insn))
3606 : {
3607 83536766 : set_paradoxical_subreg (insn);
3608 83536766 : if (CALL_P (insn))
3609 5946193 : callee_abis.note_callee_abi (insn_callee_abi (insn));
3610 : }
3611 :
3612 1471362 : HARD_REG_SET extra_caller_saves = callee_abis.caller_save_regs (*crtl->abi);
3613 2942724 : if (!hard_reg_set_empty_p (extra_caller_saves))
3614 0 : for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
3615 0 : if (TEST_HARD_REG_BIT (extra_caller_saves, regno))
3616 0 : df_set_regs_ever_live (regno, true);
3617 1471362 : }
3618 :
3619 : /* Find registers that are equivalent to a single value throughout the
3620 : compilation (either because they can be referenced in memory or are
3621 : set once from a single constant). Lower their priority for a
3622 : register.
3623 :
3624 : If such a register is only referenced once, try substituting its
3625 : value into the using insn. If it succeeds, we can eliminate the
3626 : register completely.
3627 :
3628 : Initialize init_insns in ira_reg_equiv array. */
3629 : static void
3630 1471362 : update_equiv_regs (void)
3631 : {
3632 1471362 : rtx_insn *insn;
3633 1471362 : basic_block bb;
3634 :
3635 : /* Scan the insns and find which registers have equivalences. Do this
3636 : in a separate scan of the insns because (due to -fcse-follow-jumps)
3637 : a register can be set below its use. */
3638 1471362 : bitmap setjmp_crosses = regstat_get_setjmp_crosses ();
3639 15875447 : FOR_EACH_BB_FN (bb, cfun)
3640 : {
3641 14404085 : int loop_depth = bb_loop_depth (bb);
3642 :
3643 173423809 : for (insn = BB_HEAD (bb);
3644 173423809 : insn != NEXT_INSN (BB_END (bb));
3645 159019724 : insn = NEXT_INSN (insn))
3646 : {
3647 159019724 : rtx note;
3648 159019724 : rtx set;
3649 159019724 : rtx dest, src;
3650 159019724 : int regno;
3651 :
3652 159019724 : if (! INSN_P (insn))
3653 26599953 : continue;
3654 :
3655 217482222 : for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
3656 85062451 : if (REG_NOTE_KIND (note) == REG_INC)
3657 0 : no_equiv (XEXP (note, 0), note, NULL);
3658 :
3659 132419771 : set = single_set (insn);
3660 :
3661 : 	  /* If this insn contains more (or fewer) than a single SET,
3662 : only mark all destinations as having no known equivalence. */
3663 189535284 : if (set == NULL_RTX
3664 132419771 : || side_effects_p (SET_SRC (set)))
3665 : {
3666 57115513 : note_pattern_stores (PATTERN (insn), no_equiv, NULL);
3667 57115513 : continue;
3668 : }
3669 75304258 : else if (GET_CODE (PATTERN (insn)) == PARALLEL)
3670 : {
3671 10517790 : int i;
3672 :
3673 31748094 : for (i = XVECLEN (PATTERN (insn), 0) - 1; i >= 0; i--)
3674 : {
3675 21230304 : rtx part = XVECEXP (PATTERN (insn), 0, i);
3676 21230304 : if (part != set)
3677 10712514 : note_pattern_stores (part, no_equiv, NULL);
3678 : }
3679 : }
3680 :
3681 75304258 : dest = SET_DEST (set);
3682 75304258 : src = SET_SRC (set);
3683 :
3684 : /* See if this is setting up the equivalence between an argument
3685 : register and its stack slot. */
3686 75304258 : note = find_reg_note (insn, REG_EQUIV, NULL_RTX);
3687 75304258 : if (note)
3688 : {
3689 230111 : gcc_assert (REG_P (dest));
3690 230111 : regno = REGNO (dest);
3691 :
3692 : /* Note that we don't want to clear init_insns in
3693 : ira_reg_equiv even if there are multiple sets of this
3694 : register. */
3695 230111 : reg_equiv[regno].is_arg_equivalence = 1;
3696 :
3697 : 	      /* The insn's result can have an equivalent memory
3698 : 		 location even though the equivalence is not set up
3699 : 		 by the insn itself.  We add this insn to the init
3700 : 		 insns list as a flag that regno has an equivalence;
3701 : 		 we will remove it from the list later.  */
3702 230111 : if (rtx_equal_p (src, XEXP (note, 0)) || MEM_P (XEXP (note, 0)))
3703 230111 : ira_reg_equiv[regno].init_insns
3704 230111 : = gen_rtx_INSN_LIST (VOIDmode, insn,
3705 230111 : ira_reg_equiv[regno].init_insns);
3706 :
3707 : /* Continue normally in case this is a candidate for
3708 : replacements. */
3709 : }
3710 :
3711 75304258 : if (!optimize)
3712 22689582 : continue;
3713 :
3714 : /* We only handle the case of a pseudo register being set
3715 : once, or always to the same value. */
3716 : /* ??? The mn10200 port breaks if we add equivalences for
3717 : values that need an ADDRESS_REGS register and set them equivalent
3718 : to a MEM of a pseudo. The actual problem is in the over-conservative
3719 : handling of INPADDR_ADDRESS / INPUT_ADDRESS / INPUT triples in
3720 : calculate_needs, but we traditionally work around this problem
3721 : here by rejecting equivalences when the destination is in a register
3722 : that's likely spilled. This is fragile, of course, since the
3723 : preferred class of a pseudo depends on all instructions that set
3724 : or use it. */
3725 :
3726 86924872 : if (!REG_P (dest)
3727 36502735 : || (regno = REGNO (dest)) < FIRST_PSEUDO_REGISTER
3728 20971756 : || (reg_equiv[regno].init_insns
3729 3508228 : && reg_equiv[regno].init_insns->insn () == NULL)
3730 70919219 : || (targetm.class_likely_spilled_p (reg_preferred_class (regno))
3731 350 : && MEM_P (src) && ! reg_equiv[regno].is_arg_equivalence))
3732 : {
3733 : /* This might be setting a SUBREG of a pseudo, a pseudo that is
3734 : also set somewhere else to a constant. */
3735 34310196 : note_pattern_stores (set, no_equiv, NULL);
3736 34310196 : continue;
3737 : }
3738 :
3739 : /* Don't set reg mentioned in a paradoxical subreg
3740 : equivalent to a mem. */
3741 18304480 : if (MEM_P (src) && reg_equiv[regno].pdx_subregs)
3742 : {
3743 17769 : note_pattern_stores (set, no_equiv, NULL);
3744 17769 : continue;
3745 : }
3746 :
3747 18286711 : note = find_reg_note (insn, REG_EQUAL, NULL_RTX);
3748 :
3749 : /* cse sometimes generates function invariants, but doesn't put a
3750 : REG_EQUAL note on the insn. Since this note would be redundant,
3751 : there's no point creating it earlier than here. */
3752 18286711 : if (! note && ! rtx_varies_p (src, 0))
3753 2703733 : note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
3754 :
3755 : /* Don't bother considering a REG_EQUAL note containing an EXPR_LIST
3756 : since it represents a function call. */
3757 18286711 : if (note && GET_CODE (XEXP (note, 0)) == EXPR_LIST)
3758 14558536 : note = NULL_RTX;
3759 :
3760 18286711 : if (DF_REG_DEF_COUNT (regno) != 1)
3761 : {
3762 2911483 : bool equal_p = true;
3763 2911483 : rtx_insn_list *list;
3764 :
3765 : /* If we have already processed this pseudo and determined it
3766 : cannot have an equivalence, then honor that decision. */
3767 2911483 : if (reg_equiv[regno].no_equiv)
3768 0 : continue;
3769 :
3770 4730369 : if (! note
3771 1120084 : || rtx_varies_p (XEXP (note, 0), 0)
3772 4004080 : || (reg_equiv[regno].replacement
3773 0 : && ! rtx_equal_p (XEXP (note, 0),
3774 : reg_equiv[regno].replacement)))
3775 : {
3776 1818886 : no_equiv (dest, set, NULL);
3777 1818886 : continue;
3778 : }
3779 :
3780 1092597 : list = reg_equiv[regno].init_insns;
3781 2048958 : for (; list; list = list->next ())
3782 : {
3783 1045820 : rtx note_tmp;
3784 1045820 : rtx_insn *insn_tmp;
3785 :
3786 1045820 : insn_tmp = list->insn ();
3787 1045820 : note_tmp = find_reg_note (insn_tmp, REG_EQUAL, NULL_RTX);
3788 1045820 : gcc_assert (note_tmp);
3789 1045820 : if (! rtx_equal_p (XEXP (note, 0), XEXP (note_tmp, 0)))
3790 : {
3791 : equal_p = false;
3792 : break;
3793 : }
3794 : }
3795 :
3796 1092597 : if (! equal_p)
3797 : {
3798 89459 : no_equiv (dest, set, NULL);
3799 89459 : continue;
3800 : }
3801 : }
3802 :
3803 : /* Record this insn as initializing this register. */
3804 16378366 : reg_equiv[regno].init_insns
3805 16378366 : = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv[regno].init_insns);
3806 :
3807 : /* If this register is known to be equal to a constant, record that
3808 : it is always equivalent to the constant.
3809 : Note that it is possible to have a register use before
3810 : the def in loops (see gcc.c-torture/execute/pr79286.c)
3811 : where the reg is undefined on first use. If the def insn
3812 : won't trap we can use it as an equivalence, effectively
3813 : choosing the "undefined" value for the reg to be the
3814 : same as the value set by the def. */
3815 16378366 : if (DF_REG_DEF_COUNT (regno) == 1
3816 15375228 : && note
3817 2608091 : && !rtx_varies_p (XEXP (note, 0), 0)
3818 18571099 : && (!may_trap_or_fault_p (XEXP (note, 0))
3819 49396 : || def_dominates_uses (regno)))
3820 : {
3821 2192733 : rtx note_value = XEXP (note, 0);
3822 2192733 : remove_note (insn, note);
3823 2192733 : set_unique_reg_note (insn, REG_EQUIV, note_value);
3824 : }
3825 :
3826 : /* If this insn introduces a "constant" register, decrease the priority
3827 : of that register. Record this insn if the register is only used once
3828 : more and the equivalence value is the same as our source.
3829 :
3830 : The latter condition is checked for two reasons: First, it is an
3831 : indication that it may be more efficient to actually emit the insn
3832 : as written (if no registers are available, reload will substitute
3833 : the equivalence). Secondly, it avoids problems with any registers
3834 : dying in this insn whose death notes would be missed.
3835 :
3836 : If we don't have a REG_EQUIV note, see if this insn is loading
3837 : a register used only in one basic block from a MEM. If so, and the
3838 : MEM remains unchanged for the life of the register, add a REG_EQUIV
3839 : note. */
3840 16378366 : note = find_reg_note (insn, REG_EQUIV, NULL_RTX);
3841 :
3842 16378366 : rtx replacement = NULL_RTX;
3843 16378366 : if (note)
3844 2415452 : replacement = XEXP (note, 0);
3845 13962914 : else if (REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
3846 10017035 : && MEM_P (SET_SRC (set)))
3847 : {
3848 2962304 : enum valid_equiv validity;
3849 2962304 : validity = validate_equiv_mem (insn, dest, SET_SRC (set));
3850 2962304 : if (validity != valid_none)
3851 : {
3852 2186837 : replacement = copy_rtx (SET_SRC (set));
3853 2186837 : if (validity == valid_reload)
3854 : {
3855 2186041 : note = set_unique_reg_note (insn, REG_EQUIV, replacement);
3856 : }
3857 796 : else if (ira_use_lra_p)
3858 : {
3859 : /* We still can use this equivalence for caller save
3860 : optimization in LRA. Mark this. */
3861 796 : ira_reg_equiv[regno].caller_save_p = true;
3862 796 : ira_reg_equiv[regno].init_insns
3863 796 : = gen_rtx_INSN_LIST (VOIDmode, insn,
3864 796 : ira_reg_equiv[regno].init_insns);
3865 : }
3866 : }
3867 : }
3868 :
3869 : /* If we haven't done so, record for reload that this is an
3870 : equivalencing insn. */
3871 16378366 : if (note && !reg_equiv[regno].is_arg_equivalence)
3872 4378774 : ira_reg_equiv[regno].init_insns
3873 4378774 : = gen_rtx_INSN_LIST (VOIDmode, insn,
3874 4378774 : ira_reg_equiv[regno].init_insns);
3875 :
3876 16378366 : if (replacement)
3877 : {
3878 4602289 : reg_equiv[regno].replacement = replacement;
3879 4602289 : reg_equiv[regno].src_p = &SET_SRC (set);
3880 4602289 : reg_equiv[regno].loop_depth = (short) loop_depth;
3881 :
3882 : /* Don't mess with things live during setjmp. */
3883 4602289 : if (optimize && !bitmap_bit_p (setjmp_crosses, regno))
3884 : {
3885 : /* If the register is referenced exactly twice, meaning it is
3886 : set once and used once, indicate that the reference may be
3887 : replaced by the equivalence we computed above. Do this
3888 : even if the register is only used in one block so that
3889 : dependencies can be handled where the last register is
3890 : used in a different block (i.e. HIGH / LO_SUM sequences)
3891 : and to reduce the number of registers alive across
3892 : calls. */
3893 :
3894 4602240 : if (REG_N_REFS (regno) == 2
3895 3646448 : && (rtx_equal_p (replacement, src)
3896 393887 : || ! equiv_init_varies_p (src))
3897 3581836 : && NONJUMP_INSN_P (insn)
3898 8184076 : && equiv_init_movable_p (PATTERN (insn), regno))
3899 2162051 : reg_equiv[regno].replace = 1;
3900 : }
3901 : }
3902 : }
3903 : }
3904 1471362 : }
3905 :
3906 : /* For insns that set a MEM to the contents of a REG that is only used
3907 : in a single basic block, see if the register is always equivalent
3908 : to that memory location and if moving the store from INSN to the
3909 : insn that sets REG is safe. If so, put a REG_EQUIV note on the
3910 : initializing insn. */
3911 : static void
3912 963943 : add_store_equivs (void)
3913 : {
3914 963943 : auto_sbitmap seen_insns (get_max_uid () + 1);
3915 963943 : bitmap_clear (seen_insns);
3916 :
3917 128339731 : for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
3918 : {
3919 127375788 : rtx set, src, dest;
3920 127375788 : unsigned regno;
3921 127375788 : rtx_insn *init_insn;
3922 :
3923 127375788 : bitmap_set_bit (seen_insns, INSN_UID (insn));
3924 :
3925 127375788 : if (! INSN_P (insn))
3926 23835626 : continue;
3927 :
3928 103540162 : set = single_set (insn);
3929 103540162 : if (! set)
3930 52317352 : continue;
3931 :
3932 51222810 : dest = SET_DEST (set);
3933 51222810 : src = SET_SRC (set);
3934 :
3935 : /* Don't add a REG_EQUIV note if the insn already has one. The existing
3936 : REG_EQUIV is likely more useful than the one we are adding. */
3937 7763997 : if (MEM_P (dest) && REG_P (src)
3938 5202652 : && (regno = REGNO (src)) >= FIRST_PSEUDO_REGISTER
3939 5136415 : && REG_BASIC_BLOCK (regno) >= NUM_FIXED_BLOCKS
3940 2940639 : && DF_REG_DEF_COUNT (regno) == 1
3941 2875184 : && ! reg_equiv[regno].pdx_subregs
3942 2731488 : && reg_equiv[regno].init_insns != NULL
3943 2731488 : && (init_insn = reg_equiv[regno].init_insns->insn ()) != 0
3944 2700548 : && bitmap_bit_p (seen_insns, INSN_UID (init_insn))
3945 2700548 : && ! find_reg_note (init_insn, REG_EQUIV, NULL_RTX)
3946 1070412 : && validate_equiv_mem (init_insn, src, dest) == valid_reload
3947 630645 : && ! memref_used_between_p (dest, init_insn, insn)
3948 : /* Attaching a REG_EQUIV note will fail if INIT_INSN has
3949 : multiple sets. */
3950 51732806 : && set_unique_reg_note (init_insn, REG_EQUIV, copy_rtx (dest)))
3951 : {
3952 : /* This insn makes the equivalence, not the one initializing
3953 : the register. */
3954 509510 : ira_reg_equiv[regno].init_insns
3955 509510 : = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
3956 509510 : df_notes_rescan (init_insn);
3957 509510 : if (dump_file)
3958 88 : fprintf (dump_file,
3959 : "Adding REG_EQUIV to insn %d for source of insn %d\n",
3960 88 : INSN_UID (init_insn),
3961 88 : INSN_UID (insn));
3962 : }
3963 : }
3964 963943 : }
3965 :
3966 : /* Scan all regs killed in an insn to see if any of them are registers
3967 : only used that once. If so, see if we can replace the reference
3968 : with the equivalent form. If we can, delete the initializing
3969 : reference and this register will go away. If we can't replace the
3970 : reference, and the initializing reference is within the same loop
3971 : (or in an inner loop), then move the register initialization just
3972 : before the use, so that they are in the same basic block. */
3973 : static void
3974 1043632 : combine_and_move_insns (void)
3975 : {
3976 1043632 : auto_bitmap cleared_regs;
3977 1043632 : int max = max_reg_num ();
3978 :
3979 51726471 : for (int regno = FIRST_PSEUDO_REGISTER; regno < max; regno++)
3980 : {
3981 50682839 : if (!reg_equiv[regno].replace)
3982 48520987 : continue;
3983 :
3984 2161852 : rtx_insn *use_insn = 0;
3985 2161852 : for (df_ref use = DF_REG_USE_CHAIN (regno);
3986 4344303 : use;
3987 2182451 : use = DF_REF_NEXT_REG (use))
3988 2182451 : if (DF_REF_INSN_INFO (use))
3989 : {
3990 2182451 : if (DEBUG_INSN_P (DF_REF_INSN (use)))
3991 20599 : continue;
3992 2161852 : gcc_assert (!use_insn);
3993 : use_insn = DF_REF_INSN (use);
3994 : }
3995 2161852 : gcc_assert (use_insn);
3996 :
3997 : /* Don't substitute into jumps. indirect_jump_optimize does
3998 : this for anything we are prepared to handle. */
3999 2161852 : if (JUMP_P (use_insn))
4000 400 : continue;
4001 :
4002 : /* Also don't substitute into a conditional trap insn -- it can become
4003 : an unconditional trap, and that is a flow control insn. */
4004 2161452 : if (GET_CODE (PATTERN (use_insn)) == TRAP_IF)
4005 0 : continue;
4006 :
4007 2161452 : df_ref def = DF_REG_DEF_CHAIN (regno);
4008 2161452 : gcc_assert (DF_REG_DEF_COUNT (regno) == 1 && DF_REF_INSN_INFO (def));
4009 2161452 : rtx_insn *def_insn = DF_REF_INSN (def);
4010 :
4011 : /* We may not move instructions that can throw, since that
4012 : changes basic block boundaries and we are not prepared to
4013 : adjust the CFG to match. */
4014 2161452 : if (can_throw_internal (def_insn))
4015 0 : continue;
4016 :
4017 : /* Instructions with multiple sets can only be moved if DF analysis is
4018 : performed for all of the registers set. See PR91052. */
4019 2161452 : if (multiple_sets (def_insn))
4020 0 : continue;
4021 :
4022 2161452 : basic_block use_bb = BLOCK_FOR_INSN (use_insn);
4023 2161452 : basic_block def_bb = BLOCK_FOR_INSN (def_insn);
4024 2161452 : if (bb_loop_depth (use_bb) > bb_loop_depth (def_bb))
4025 136791 : continue;
4026 :
4027 2024661 : if (asm_noperands (PATTERN (def_insn)) < 0
4028 4049322 : && validate_replace_rtx (regno_reg_rtx[regno],
4029 2024661 : *reg_equiv[regno].src_p, use_insn))
4030 : {
4031 375766 : rtx link;
4032 : /* Append the REG_DEAD notes from def_insn. */
4033       752423 : 	  for (rtx *p = &REG_NOTES (def_insn); (link = *p) != 0; )
4034 : {
4035 376657 : if (REG_NOTE_KIND (XEXP (link, 0)) == REG_DEAD)
4036 : {
4037 0 : *p = XEXP (link, 1);
4038 0 : XEXP (link, 1) = REG_NOTES (use_insn);
4039 0 : REG_NOTES (use_insn) = link;
4040 : }
4041 : else
4042 376657 : p = &XEXP (link, 1);
4043 : }
4044 :
4045 375766 : remove_death (regno, use_insn);
4046 375766 : SET_REG_N_REFS (regno, 0);
4047 375766 : REG_FREQ (regno) = 0;
4048 375766 : df_ref use;
4049 454020 : FOR_EACH_INSN_USE (use, def_insn)
4050 : {
4051 78254 : unsigned int use_regno = DF_REF_REGNO (use);
4052 78254 : if (!HARD_REGISTER_NUM_P (use_regno))
4053 1266 : reg_equiv[use_regno].replace = 0;
4054 : }
4055 :
4056 375766 : delete_insn (def_insn);
4057 :
4058 375766 : reg_equiv[regno].init_insns = NULL;
4059 375766 : ira_reg_equiv[regno].init_insns = NULL;
4060 375766 : bitmap_set_bit (cleared_regs, regno);
4061 : }
4062 :
4063 : /* Move the initialization of the register to just before
4064 : USE_INSN. Update the flow information. */
4065 1648895 : else if (prev_nondebug_insn (use_insn) != def_insn)
4066 : {
4067 318503 : rtx_insn *new_insn;
4068 :
4069 318503 : new_insn = emit_insn_before (PATTERN (def_insn), use_insn);
4070 318503 : REG_NOTES (new_insn) = REG_NOTES (def_insn);
4071 318503 : REG_NOTES (def_insn) = 0;
4072 : /* Rescan it to process the notes. */
4073 318503 : df_insn_rescan (new_insn);
4074 :
4075 : /* Make sure this insn is recognized before reload begins,
4076 : otherwise eliminate_regs_in_insn will die. */
4077 318503 : INSN_CODE (new_insn) = INSN_CODE (def_insn);
4078 :
4079 318503 : delete_insn (def_insn);
4080 :
4081 318503 : XEXP (reg_equiv[regno].init_insns, 0) = new_insn;
4082 :
4083 318503 : REG_BASIC_BLOCK (regno) = use_bb->index;
4084 318503 : REG_N_CALLS_CROSSED (regno) = 0;
4085 :
4086 318503 : if (use_insn == BB_HEAD (use_bb))
4087 0 : BB_HEAD (use_bb) = new_insn;
4088 :
4089 : /* We know regno dies in use_insn, but inside a loop
4090 : REG_DEAD notes might be missing when def_insn was in
4091 : another basic block. However, when we move def_insn into
4092 : this bb we'll definitely get a REG_DEAD note and reload
4093 : will see the death. It's possible that update_equiv_regs
4094 : set up an equivalence referencing regno for a reg set by
4095 : use_insn, when regno was seen as non-local. Now that
4096 : regno is local to this block, and dies, such an
4097 : equivalence is invalid. */
4098 318503 : if (find_reg_note (use_insn, REG_EQUIV, regno_reg_rtx[regno]))
4099 : {
4100 0 : rtx set = single_set (use_insn);
4101 0 : if (set && REG_P (SET_DEST (set)))
4102 0 : no_equiv (SET_DEST (set), set, NULL);
4103 : }
4104 :
4105 318503 : ira_reg_equiv[regno].init_insns
4106 318503 : = gen_rtx_INSN_LIST (VOIDmode, new_insn, NULL_RTX);
4107 318503 : bitmap_set_bit (cleared_regs, regno);
4108 : }
4109 : }
4110 :
4111 1043632 : if (!bitmap_empty_p (cleared_regs))
4112 : {
4113 222134 : basic_block bb;
4114 :
4115 5805452 : FOR_EACH_BB_FN (bb, cfun)
4116 : {
4117 11166636 : bitmap_and_compl_into (DF_LR_IN (bb), cleared_regs);
4118 11166636 : bitmap_and_compl_into (DF_LR_OUT (bb), cleared_regs);
4119 5583318 : if (!df_live)
4120 5583318 : continue;
4121 0 : bitmap_and_compl_into (DF_LIVE_IN (bb), cleared_regs);
4122 0 : bitmap_and_compl_into (DF_LIVE_OUT (bb), cleared_regs);
4123 : }
4124 :
4125 : /* Last pass - adjust debug insns referencing cleared regs. */
4126 222134 : if (MAY_HAVE_DEBUG_BIND_INSNS)
4127 61965040 : for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
4128 61842737 : if (DEBUG_BIND_INSN_P (insn))
4129 : {
4130 23133378 : rtx old_loc = INSN_VAR_LOCATION_LOC (insn);
4131 23133378 : INSN_VAR_LOCATION_LOC (insn)
4132 46266756 : = simplify_replace_fn_rtx (old_loc, NULL_RTX,
4133 : adjust_cleared_regs,
4134 23133378 : (void *) cleared_regs);
4135 23133378 : if (old_loc != INSN_VAR_LOCATION_LOC (insn))
4136 17017 : df_insn_rescan (insn);
4137 : }
4138 : }
4139 1043632 : }
4140 :
4141 : /* A pass over indirect jumps, converting simple cases to direct jumps.
4142 : Combine does this optimization too, but only within a basic block. */
4143 : static void
4144 1471362 : indirect_jump_optimize (void)
4145 : {
4146 1471362 : basic_block bb;
4147 1471362 : bool rebuild_p = false;
4148 :
4149 15875449 : FOR_EACH_BB_REVERSE_FN (bb, cfun)
4150 : {
4151 14404087 : rtx_insn *insn = BB_END (bb);
4152 20031047 : if (!JUMP_P (insn)
4153 14404087 : || find_reg_note (insn, REG_NON_LOCAL_GOTO, NULL_RTX))
4154 5626960 : continue;
4155 :
4156 8777127 : rtx x = pc_set (insn);
4157 8777127 : if (!x || !REG_P (SET_SRC (x)))
4158 8775912 : continue;
4159 :
4160 1215 : int regno = REGNO (SET_SRC (x));
4161 1215 : if (DF_REG_DEF_COUNT (regno) == 1)
4162 : {
4163 1104 : df_ref def = DF_REG_DEF_CHAIN (regno);
4164 1104 : if (!DF_REF_IS_ARTIFICIAL (def))
4165 : {
4166 1104 : rtx_insn *def_insn = DF_REF_INSN (def);
4167 1104 : rtx lab = NULL_RTX;
4168 1104 : rtx set = single_set (def_insn);
4169 1104 : if (set && GET_CODE (SET_SRC (set)) == LABEL_REF)
4170 : lab = SET_SRC (set);
4171 : else
4172 : {
4173 1103 : rtx eqnote = find_reg_note (def_insn, REG_EQUAL, NULL_RTX);
4174 1103 : if (eqnote && GET_CODE (XEXP (eqnote, 0)) == LABEL_REF)
4175 : lab = XEXP (eqnote, 0);
4176 : }
4177 1 : if (lab && validate_replace_rtx (SET_SRC (x), lab, insn))
4178 : rebuild_p = true;
4179 : }
4180 : }
4181 : }
4182 :
4183 1471362 : if (rebuild_p)
4184 : {
4185 1 : timevar_push (TV_JUMP);
4186 1 : rebuild_jump_labels (get_insns ());
4187 1 : if (purge_all_dead_edges ())
4188 1 : delete_unreachable_blocks ();
4189 1 : timevar_pop (TV_JUMP);
4190 : }
4191 1471362 : }
4192 :
4193 : /* Set up fields memory, constant, and invariant from init_insns in
4194 : the structures of array ira_reg_equiv. */
4195 : static void
4196 1471362 : setup_reg_equiv (void)
4197 : {
4198 1471362 : int i;
4199 1471362 : rtx_insn_list *elem, *prev_elem, *next_elem;
4200 1471362 : rtx_insn *insn;
4201 1471362 : rtx set, x;
4202 :
4203 168285406 : for (i = FIRST_PSEUDO_REGISTER; i < ira_reg_equiv_len; i++)
4204 166814044 : for (prev_elem = NULL, elem = ira_reg_equiv[i].init_insns;
4205 171432448 : elem;
4206 : prev_elem = elem, elem = next_elem)
4207 : {
4208 4741159 : next_elem = elem->next ();
4209 4741159 : insn = elem->insn ();
4210 4741159 : set = single_set (insn);
4211 :
4212 : /* Init insns can set up equivalence when the reg is a destination or
4213 : a source (in this case the destination is memory). */
4214 4741159 : if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set))))
4215 : {
4216 4741159 : if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL)
4217 : {
4218 4230930 : x = XEXP (x, 0);
4219 4230930 : if (REG_P (SET_DEST (set))
4220 4230930 : && REGNO (SET_DEST (set)) == (unsigned int) i
4221 8461860 : && ! rtx_equal_p (SET_SRC (set), x) && MEM_P (x))
4222 : {
4223 : /* This insn reports the equivalence but does not
4224 : actually set it. Remove it from the
4225 : list. */
4226 30376 : if (prev_elem == NULL)
4227 30376 : ira_reg_equiv[i].init_insns = next_elem;
4228 : else
4229 0 : XEXP (prev_elem, 1) = next_elem;
4230 : elem = prev_elem;
4231 : }
4232 : }
4233 510229 : else if (REG_P (SET_DEST (set))
4234 510229 : && REGNO (SET_DEST (set)) == (unsigned int) i)
4235 719 : x = SET_SRC (set);
4236 : else
4237 : {
4238 509510 : gcc_assert (REG_P (SET_SRC (set))
4239 : && REGNO (SET_SRC (set)) == (unsigned int) i);
4240 : x = SET_DEST (set);
4241 : }
4242 : /* If PIC is enabled and the equiv is not a LEGITIMATE_PIC_OPERAND,
4243 : we can't use it. */
4244 4741159 : if (! CONSTANT_P (x)
4245 1008826 : || ! flag_pic
4246 : /* A function invariant is often CONSTANT_P but may
4247 : include a register. We promise to only pass
4248 : CONSTANT_P objects to LEGITIMATE_PIC_OPERAND_P. */
4249 4863516 : || LEGITIMATE_PIC_OPERAND_P (x))
4250 : {
4251 : /* It can happen that a REG_EQUIV note contains a MEM
4252 : that is not a legitimate memory operand. As later
4253 : stages of reload assume that all addresses found in
4254 : the lra_regno_equiv_* arrays were originally
4255 : legitimate, we ignore such REG_EQUIV notes. */
4256 4697017 : if (memory_operand (x, VOIDmode))
4257 : {
4258 2785843 : ira_reg_equiv[i].defined_p = !ira_reg_equiv[i].caller_save_p;
4259 2785843 : ira_reg_equiv[i].memory = x;
4260 2785843 : continue;
4261 : }
4262 1911174 : else if (function_invariant_p (x))
4263 : {
4264 1833062 : machine_mode mode;
4265 :
4266 1833062 : mode = GET_MODE (SET_DEST (set));
4267 1833062 : if (GET_CODE (x) == PLUS
4268 965812 : || x == frame_pointer_rtx || x == arg_pointer_rtx)
4270 : /* This is a PLUS of the frame pointer and a
4271 : constant, or fp, or argp. */
4271 868378 : ira_reg_equiv[i].invariant = x;
4272 964684 : else if (targetm.legitimate_constant_p (mode, x))
4273 705060 : ira_reg_equiv[i].constant = x;
4274 : else
4275 : {
4276 259624 : ira_reg_equiv[i].memory = force_const_mem (mode, x);
4277 259624 : if (ira_reg_equiv[i].memory == NULL_RTX)
4278 : {
4279 501 : ira_reg_equiv[i].defined_p = false;
4280 501 : ira_reg_equiv[i].caller_save_p = false;
4281 501 : ira_reg_equiv[i].init_insns = NULL;
4282 501 : break;
4283 : }
4284 : }
4285 1832561 : ira_reg_equiv[i].defined_p = true;
4286 1832561 : continue;
4287 1832561 : }
4288 : }
4289 : }
4290 122254 : ira_reg_equiv[i].defined_p = false;
4291 122254 : ira_reg_equiv[i].caller_save_p = false;
4292 122254 : ira_reg_equiv[i].init_insns = NULL;
4293 122254 : break;
4294 : }
4295 1471362 : }
4296 :
4297 :
4298 :
4299 : /* Print chain C to FILE. */
4300 : static void
4301 0 : print_insn_chain (FILE *file, class insn_chain *c)
4302 : {
4303 0 : fprintf (file, "insn=%d, ", INSN_UID (c->insn));
4304 0 : bitmap_print (file, &c->live_throughout, "live_throughout: ", ", ");
4305 0 : bitmap_print (file, &c->dead_or_set, "dead_or_set: ", "\n");
4306 0 : }
4307 :
4308 :
4309 : /* Print all reload_insn_chains to FILE. */
4310 : static void
4311 0 : print_insn_chains (FILE *file)
4312 : {
4313 0 : class insn_chain *c;
4314 0 : for (c = reload_insn_chain; c ; c = c->next)
4315 0 : print_insn_chain (file, c);
4316 0 : }
4317 :
4318 : /* Return true if pseudo REGNO should be added to set live_throughout
4319 : or dead_or_set of the insn chains for reload consideration. */
4320 : static bool
4321 0 : pseudo_for_reload_consideration_p (int regno)
4322 : {
4323 : /* Consider spilled pseudos too for IRA because they still have a
4324 : chance to get hard-registers in the reload when IRA is used. */
4325 0 : return (reg_renumber[regno] >= 0 || ira_conflicts_p);
4326 : }
4327 :
4328 : /* Return true if we can track the individual bytes of subreg X.
4329 : When returning true, set *OUTER_SIZE to the number of bytes in
4330 : X itself, *INNER_SIZE to the number of bytes in the inner register
4331 : and *START to the offset of the first byte. */
4332 : static bool
4333 0 : get_subreg_tracking_sizes (rtx x, HOST_WIDE_INT *outer_size,
4334 : HOST_WIDE_INT *inner_size, HOST_WIDE_INT *start)
4335 : {
4336 0 : rtx reg = regno_reg_rtx[REGNO (SUBREG_REG (x))];
4337 0 : return (GET_MODE_SIZE (GET_MODE (x)).is_constant (outer_size)
4338 0 : && GET_MODE_SIZE (GET_MODE (reg)).is_constant (inner_size)
4339 0 : && SUBREG_BYTE (x).is_constant (start));
4340 : }
4341 :
4342 : /* Init LIVE_SUBREGS[ALLOCNUM] and LIVE_SUBREGS_USED[ALLOCNUM] for
4343 : a register with SIZE bytes, making the register live if INIT_VALUE. */
4344 : static void
4345 0 : init_live_subregs (bool init_value, sbitmap *live_subregs,
4346 : bitmap live_subregs_used, int allocnum, int size)
4347 : {
4348 0 : gcc_assert (size > 0);
4349 :
4350 : /* Been there, done that. */
4351 0 : if (bitmap_bit_p (live_subregs_used, allocnum))
4352 : return;
4353 :
4354 : /* Create a new one. */
4355 0 : if (live_subregs[allocnum] == NULL)
4356 0 : live_subregs[allocnum] = sbitmap_alloc (size);
4357 :
4358 : /* If the entire reg was live before blasting into subregs, we
4359 : need to init all of the subregs to ones; else init them to zero. */
4360 0 : if (init_value)
4361 0 : bitmap_ones (live_subregs[allocnum]);
4362 : else
4363 0 : bitmap_clear (live_subregs[allocnum]);
4364 :
4365 0 : bitmap_set_bit (live_subregs_used, allocnum);
4366 : }
4367 :
4368 : /* Walk the insns of the current function and build reload_insn_chain,
4369 : and record register life information. */
4370 : static void
4371 0 : build_insn_chain (void)
4372 : {
4373 0 : unsigned int i;
4374 0 : class insn_chain **p = &reload_insn_chain;
4375 0 : basic_block bb;
4376 0 : class insn_chain *c = NULL;
4377 0 : class insn_chain *next = NULL;
4378 0 : auto_bitmap live_relevant_regs;
4379 0 : auto_bitmap elim_regset;
4380 : /* live_subregs is a vector used to keep accurate information about
4381 : which hardregs are live in multiword pseudos. live_subregs and
4382 : live_subregs_used are indexed by pseudo number. The live_subregs
4383 : entry for a particular pseudo is only used if the corresponding
4384 : element is nonzero in live_subregs_used. The sbitmap size of
4385 : live_subregs[allocno] is the number of bytes that the pseudo can
4386 : occupy. */
4387 0 : sbitmap *live_subregs = XCNEWVEC (sbitmap, max_regno);
4388 0 : auto_bitmap live_subregs_used;
4389 :
4390 0 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
4391 0 : if (TEST_HARD_REG_BIT (eliminable_regset, i))
4392 0 : bitmap_set_bit (elim_regset, i);
4393 0 : FOR_EACH_BB_REVERSE_FN (bb, cfun)
4394 : {
4395 0 : bitmap_iterator bi;
4396 0 : rtx_insn *insn;
4397 :
4398 0 : CLEAR_REG_SET (live_relevant_regs);
4399 0 : bitmap_clear (live_subregs_used);
4400 :
4401 0 : EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb), 0, i, bi)
4402 : {
4403 0 : if (i >= FIRST_PSEUDO_REGISTER)
4404 : break;
4405 0 : bitmap_set_bit (live_relevant_regs, i);
4406 : }
4407 :
4408 0 : EXECUTE_IF_SET_IN_BITMAP (df_get_live_out (bb),
4409 : FIRST_PSEUDO_REGISTER, i, bi)
4410 : {
4411 0 : if (pseudo_for_reload_consideration_p (i))
4412 0 : bitmap_set_bit (live_relevant_regs, i);
4413 : }
4414 :
4415 0 : FOR_BB_INSNS_REVERSE (bb, insn)
4416 : {
4417 0 : if (!NOTE_P (insn) && !BARRIER_P (insn))
4418 : {
4419 0 : struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4420 0 : df_ref def, use;
4421 :
4422 0 : c = new_insn_chain ();
4423 0 : c->next = next;
4424 0 : next = c;
4425 0 : *p = c;
4426 0 : p = &c->prev;
4427 :
4428 0 : c->insn = insn;
4429 0 : c->block = bb->index;
4430 :
4431 0 : if (NONDEBUG_INSN_P (insn))
4432 0 : FOR_EACH_INSN_INFO_DEF (def, insn_info)
4433 : {
4434 0 : unsigned int regno = DF_REF_REGNO (def);
4435 :
4436 : /* Ignore may clobbers because these are generated
4437 : from calls. However, every other kind of def is
4438 : added to dead_or_set. */
4439 0 : if (!DF_REF_FLAGS_IS_SET (def, DF_REF_MAY_CLOBBER))
4440 : {
4441 0 : if (regno < FIRST_PSEUDO_REGISTER)
4442 : {
4443 0 : if (!fixed_regs[regno])
4444 0 : bitmap_set_bit (&c->dead_or_set, regno);
4445 : }
4446 0 : else if (pseudo_for_reload_consideration_p (regno))
4447 0 : bitmap_set_bit (&c->dead_or_set, regno);
4448 : }
4449 :
4450 0 : if ((regno < FIRST_PSEUDO_REGISTER
4451 0 : || reg_renumber[regno] >= 0
4452 0 : || ira_conflicts_p)
4453 0 : && (!DF_REF_FLAGS_IS_SET (def, DF_REF_CONDITIONAL)))
4454 : {
4455 0 : rtx reg = DF_REF_REG (def);
4456 0 : HOST_WIDE_INT outer_size, inner_size, start;
4457 :
4458 : /* We can usually track the liveness of individual
4459 : bytes within a subreg. The only exceptions are
4460 : subregs wrapped in ZERO_EXTRACTs and subregs whose
4461 : size is not known; in those cases we need to be
4462 : conservative and treat the definition as a partial
4463 : definition of the full register rather than a full
4464 : definition of a specific part of the register. */
4465 0 : if (GET_CODE (reg) == SUBREG
4466 0 : && !DF_REF_FLAGS_IS_SET (def, DF_REF_ZERO_EXTRACT)
4467 0 : && get_subreg_tracking_sizes (reg, &outer_size,
4468 : &inner_size, &start))
4469 : {
4470 0 : HOST_WIDE_INT last = start + outer_size;
4471 :
4472 0 : init_live_subregs
4473 0 : (bitmap_bit_p (live_relevant_regs, regno),
4474 : live_subregs, live_subregs_used, regno,
4475 : inner_size);
4476 :
4477 0 : if (!DF_REF_FLAGS_IS_SET
4478 : (def, DF_REF_STRICT_LOW_PART))
4479 : {
4480 : /* Expand the range to cover entire words.
4481 : Bytes added here are "don't care". */
4482 0 : start
4483 0 : = start / UNITS_PER_WORD * UNITS_PER_WORD;
4484 0 : last = ((last + UNITS_PER_WORD - 1)
4485 0 : / UNITS_PER_WORD * UNITS_PER_WORD);
4486 : }
4487 :
4488 : /* Ignore the paradoxical bits. */
4489 0 : if (last > SBITMAP_SIZE (live_subregs[regno]))
4490 : last = SBITMAP_SIZE (live_subregs[regno]);
4491 :
4492 0 : while (start < last)
4493 : {
4494 0 : bitmap_clear_bit (live_subregs[regno], start);
4495 0 : start++;
4496 : }
4497 :
4498 0 : if (bitmap_empty_p (live_subregs[regno]))
4499 : {
4500 0 : bitmap_clear_bit (live_subregs_used, regno);
4501 0 : bitmap_clear_bit (live_relevant_regs, regno);
4502 : }
4503 : else
4504 : /* Set live_relevant_regs here because
4505 : that bit has to be true to get us to
4506 : look at the live_subregs fields. */
4507 0 : bitmap_set_bit (live_relevant_regs, regno);
4508 : }
4509 : else
4510 : {
4511 : /* DF_REF_PARTIAL is generated for
4512 : subregs, STRICT_LOW_PART, and
4513 : ZERO_EXTRACT. We handle the subreg
4514 : case above so here we have to keep from
4515 : modeling the def as a killing def. */
4516 0 : if (!DF_REF_FLAGS_IS_SET (def, DF_REF_PARTIAL))
4517 : {
4518 0 : bitmap_clear_bit (live_subregs_used, regno);
4519 0 : bitmap_clear_bit (live_relevant_regs, regno);
4520 : }
4521 : }
4522 : }
4523 : }
4524 :
4525 0 : bitmap_and_compl_into (live_relevant_regs, elim_regset);
4526 0 : bitmap_copy (&c->live_throughout, live_relevant_regs);
4527 :
4528 0 : if (NONDEBUG_INSN_P (insn))
4529 0 : FOR_EACH_INSN_INFO_USE (use, insn_info)
4530 : {
4531 0 : unsigned int regno = DF_REF_REGNO (use);
4532 0 : rtx reg = DF_REF_REG (use);
4533 :
4534 : /* DF_REF_READ_WRITE on a use means that this use
4535 : is fabricated from a def that is a partial set
4536 : to a multiword reg. Here, we only model the
4537 : subreg case that is not wrapped in ZERO_EXTRACT
4538 : precisely so we do not need to look at the
4539 : fabricated use. */
4540 0 : if (DF_REF_FLAGS_IS_SET (use, DF_REF_READ_WRITE)
4541 0 : && !DF_REF_FLAGS_IS_SET (use, DF_REF_ZERO_EXTRACT)
4542 0 : && DF_REF_FLAGS_IS_SET (use, DF_REF_SUBREG))
4543 0 : continue;
4544 :
4545 : /* Add the last use of each var to dead_or_set. */
4546 0 : if (!bitmap_bit_p (live_relevant_regs, regno))
4547 : {
4548 0 : if (regno < FIRST_PSEUDO_REGISTER)
4549 : {
4550 0 : if (!fixed_regs[regno])
4551 0 : bitmap_set_bit (&c->dead_or_set, regno);
4552 : }
4553 0 : else if (pseudo_for_reload_consideration_p (regno))
4554 0 : bitmap_set_bit (&c->dead_or_set, regno);
4555 : }
4556 :
4557 0 : if (regno < FIRST_PSEUDO_REGISTER
4558 0 : || pseudo_for_reload_consideration_p (regno))
4559 : {
4560 0 : HOST_WIDE_INT outer_size, inner_size, start;
4561 0 : if (GET_CODE (reg) == SUBREG
4562 0 : && !DF_REF_FLAGS_IS_SET (use,
4563 : DF_REF_SIGN_EXTRACT
4564 : | DF_REF_ZERO_EXTRACT)
4565 0 : && get_subreg_tracking_sizes (reg, &outer_size,
4566 : &inner_size, &start))
4567 : {
4568 0 : HOST_WIDE_INT last = start + outer_size;
4569 :
4570 0 : init_live_subregs
4571 0 : (bitmap_bit_p (live_relevant_regs, regno),
4572 : live_subregs, live_subregs_used, regno,
4573 : inner_size);
4574 :
4575 : /* Ignore the paradoxical bits. */
4576 0 : if (last > SBITMAP_SIZE (live_subregs[regno]))
4577 : last = SBITMAP_SIZE (live_subregs[regno]);
4578 :
4579 0 : while (start < last)
4580 : {
4581 0 : bitmap_set_bit (live_subregs[regno], start);
4582 0 : start++;
4583 : }
4584 : }
4585 : else
4586 : /* Resetting the live_subregs_used is
4587 : effectively saying do not use the subregs
4588 : because we are reading the whole
4589 : pseudo. */
4590 0 : bitmap_clear_bit (live_subregs_used, regno);
4591 0 : bitmap_set_bit (live_relevant_regs, regno);
4592 : }
4593 : }
4594 : }
4595 : }
4596 :
4597 : /* FIXME!! The following code is a disaster. Reload needs to see the
4598 : labels and jump tables that are just hanging out in between
4599 : the basic blocks. See pr33676. */
4600 0 : insn = BB_HEAD (bb);
4601 :
4602 : /* Skip over the barriers and cruft. */
4603 0 : while (insn && (BARRIER_P (insn) || NOTE_P (insn)
4604 0 : || BLOCK_FOR_INSN (insn) == bb))
4605 0 : insn = PREV_INSN (insn);
4606 :
4607 : /* While we add anything except barriers and notes, the focus is
4608 : to get the labels and jump tables into the
4609 : reload_insn_chain. */
4610 0 : while (insn)
4611 : {
4612 0 : if (!NOTE_P (insn) && !BARRIER_P (insn))
4613 : {
4614 0 : if (BLOCK_FOR_INSN (insn))
4615 : break;
4616 :
4617 0 : c = new_insn_chain ();
4618 0 : c->next = next;
4619 0 : next = c;
4620 0 : *p = c;
4621 0 : p = &c->prev;
4622 :
4623 : /* The block makes no sense here, but it is what the old
4624 : code did. */
4625 0 : c->block = bb->index;
4626 0 : c->insn = insn;
4627 0 : bitmap_copy (&c->live_throughout, live_relevant_regs);
4628 : }
4629 0 : insn = PREV_INSN (insn);
4630 : }
4631 : }
4632 :
4633 0 : reload_insn_chain = c;
4634 0 : *p = NULL;
4635 :
4636 0 : for (i = 0; i < (unsigned int) max_regno; i++)
4637 0 : if (live_subregs[i] != NULL)
4638 0 : sbitmap_free (live_subregs[i]);
4639 0 : free (live_subregs);
4640 :
4641 0 : if (dump_file)
4642 0 : print_insn_chains (dump_file);
4643 0 : }
4644 :
4645 : /* Examine the rtx found in *LOC, which is read or written to as determined
4646 : by TYPE. Return false if we find a reason why an insn containing this
4647 : rtx should not be moved (such as accesses to non-constant memory), true
4648 : otherwise. */
4649 : static bool
4650 6713229 : rtx_moveable_p (rtx *loc, enum op_type type)
4651 : {
4652 6722433 : const char *fmt;
4653 6722433 : rtx x = *loc;
4654 6722433 : int i, j;
4655 :
4656 6722433 : enum rtx_code code = GET_CODE (x);
4657 6722433 : switch (code)
4658 : {
4659 : case CONST:
4660 : CASE_CONST_ANY:
4661 : case SYMBOL_REF:
4662 : case LABEL_REF:
4663 : return true;
4664 :
4665 0 : case PC:
4666 0 : return type == OP_IN;
4667 :
4668 2365155 : case REG:
4669 2365155 : if (x == frame_pointer_rtx)
4670 : return true;
4671 2363759 : if (HARD_REGISTER_P (x))
4672 : return false;
4673 :
4674 : return true;
4675 :
4676 630067 : case MEM:
4677 630067 : if (type == OP_IN && MEM_READONLY_P (x))
4678 9134 : return rtx_moveable_p (&XEXP (x, 0), OP_IN);
4679 : return false;
4680 :
4681 2068575 : case SET:
4682 2068575 : return (rtx_moveable_p (&SET_SRC (x), OP_IN)
4683 2068575 : && rtx_moveable_p (&SET_DEST (x), OP_OUT));
4684 :
4685 5 : case STRICT_LOW_PART:
4686 5 : return rtx_moveable_p (&XEXP (x, 0), OP_OUT);
4687 :
4688 508 : case ZERO_EXTRACT:
4689 508 : case SIGN_EXTRACT:
4690 508 : return (rtx_moveable_p (&XEXP (x, 0), type)
4691 508 : && rtx_moveable_p (&XEXP (x, 1), OP_IN)
4692 1016 : && rtx_moveable_p (&XEXP (x, 2), OP_IN));
4693 :
4694 65 : case CLOBBER:
4695 65 : return rtx_moveable_p (&SET_DEST (x), OP_OUT);
4696 :
4697 : case UNSPEC_VOLATILE:
4698 : /* It is a bad idea to consider insns with such rtl
4699 : as moveable ones. The insn scheduler also considers them as
4700 : barriers for a reason. */
4701 : return false;
4702 :
4703 0 : case ASM_OPERANDS:
4704 : /* The same is true for volatile asm: it has unknown side effects, it
4705 : cannot be moved at will. */
4706 0 : if (MEM_VOLATILE_P (x))
4707 : return false;
4708 :
4709 1096466 : default:
4710 1096466 : break;
4711 : }
4712 :
4713 1096466 : fmt = GET_RTX_FORMAT (code);
4714 2823443 : for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
4715 : {
4716 1890991 : if (fmt[i] == 'e')
4717 : {
4718 1485644 : if (!rtx_moveable_p (&XEXP (x, i), type))
4719 : return false;
4720 : }
4721 405347 : else if (fmt[i] == 'E')
4722 556695 : for (j = XVECLEN (x, i) - 1; j >= 0; j--)
4723 : {
4724 429693 : if (!rtx_moveable_p (&XVECEXP (x, i, j), type))
4725 : return false;
4726 : }
4727 : }
4728 : return true;
4729 : }
4730 :
4731 : /* A wrapper around dominated_by_p, which uses the information in UID_LUID
4732 : to give dominance relationships between two insns I1 and I2. */
4733 : static bool
4734 21061392 : insn_dominated_by_p (rtx i1, rtx i2, int *uid_luid)
4735 : {
4736 21061392 : basic_block bb1 = BLOCK_FOR_INSN (i1);
4737 21061392 : basic_block bb2 = BLOCK_FOR_INSN (i2);
4738 :
4739 21061392 : if (bb1 == bb2)
4740 11094770 : return uid_luid[INSN_UID (i2)] < uid_luid[INSN_UID (i1)];
4741 9966622 : return dominated_by_p (CDI_DOMINATORS, bb1, bb2);
4742 : }
4743 :
4744 : /* Record the range of register numbers added by find_moveable_pseudos. */
4745 : int first_moveable_pseudo, last_moveable_pseudo;
4746 :
4747 : /* These two vectors hold data for every register added by
4748 : find_moveable_pseudos, with index 0 holding data for the
4749 : first_moveable_pseudo. */
4750 : /* The original home register. */
4751 : static vec<rtx> pseudo_replaced_reg;
4752 :
4753 : /* Look for instances where we have an instruction that is known to increase
4754 : register pressure, and whose result is not used immediately. If it is
4755 : possible to move the instruction downwards to just before its first use,
4756 : split its lifetime into two ranges. We create a new pseudo to compute the
4757 : value, and emit a move instruction just before the first use. If, after
4758 : register allocation, the new pseudo remains unallocated, the function
4759 : move_unallocated_pseudos then deletes the move instruction and places
4760 : the computation just before the first use.
4761 :
4762 : Such a move is safe and profitable if all the input registers remain live
4763 : and unchanged between the original computation and its first use. In such
4764 : a situation, the computation is known to increase register pressure, and
4765 : moving it is known to at least not worsen it.
4766 :
4767 : We restrict moves to only those cases where a register remains unallocated,
4768 : in order to avoid interfering too much with the instruction schedule. As
4769 : an exception, we may move insns which only modify their input register
4770 : (typically induction variables), as this increases the freedom for our
4771 : intended transformation, and does not limit the second instruction
4772 : scheduler pass. */
4773 :
4774 : static void
4775 1043685 : find_moveable_pseudos (void)
4776 : {
4777 1043685 : unsigned i;
4778 1043685 : int max_regs = max_reg_num ();
4779 1043685 : int max_uid = get_max_uid ();
4780 1043685 : basic_block bb;
4781 1043685 : int *uid_luid = XNEWVEC (int, max_uid);
4782 1043685 : rtx_insn **closest_uses = XNEWVEC (rtx_insn *, max_regs);
4783 : /* A set of registers which are live but not modified throughout a block. */
4784 1043685 : bitmap_head *bb_transp_live = XNEWVEC (bitmap_head,
4785 : last_basic_block_for_fn (cfun));
4786 : /* A set of registers which only exist in a given basic block. */
4787 1043685 : bitmap_head *bb_local = XNEWVEC (bitmap_head,
4788 : last_basic_block_for_fn (cfun));
4789 : /* A set of registers which are set once, in an instruction that can be
4790 : moved freely downwards, but are otherwise transparent to a block. */
4791 1043685 : bitmap_head *bb_moveable_reg_sets = XNEWVEC (bitmap_head,
4792 : last_basic_block_for_fn (cfun));
4793 1043685 : auto_bitmap live, used, set, interesting, unusable_as_input;
4794 1043685 : bitmap_iterator bi;
4795 :
4796 1043685 : first_moveable_pseudo = max_regs;
4797 1043685 : pseudo_replaced_reg.release ();
4798 1043685 : pseudo_replaced_reg.safe_grow_cleared (max_regs, true);
4799 :
4800 1043685 : df_analyze ();
4801 1043685 : calculate_dominance_info (CDI_DOMINATORS);
4802 :
4803 1043685 : i = 0;
4804 12051971 : FOR_EACH_BB_FN (bb, cfun)
4805 : {
4806 11008286 : rtx_insn *insn;
4807 11008286 : bitmap transp = bb_transp_live + bb->index;
4808 11008286 : bitmap moveable = bb_moveable_reg_sets + bb->index;
4809 11008286 : bitmap local = bb_local + bb->index;
4810 :
4811 11008286 : bitmap_initialize (local, 0);
4812 11008286 : bitmap_initialize (transp, 0);
4813 11008286 : bitmap_initialize (moveable, 0);
4814 11008286 : bitmap_copy (live, df_get_live_out (bb));
4815 11008286 : bitmap_and_into (live, df_get_live_in (bb));
4816 11008286 : bitmap_copy (transp, live);
4817 11008286 : bitmap_clear (moveable);
4818 11008286 : bitmap_clear (live);
4819 11008286 : bitmap_clear (used);
4820 11008286 : bitmap_clear (set);
4821 139692749 : FOR_BB_INSNS (bb, insn)
4822 128684463 : if (NONDEBUG_INSN_P (insn))
4823 : {
4824 58520087 : df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4825 58520087 : df_ref def, use;
4826 :
4827 58520087 : uid_luid[INSN_UID (insn)] = i++;
4828 :
4829 58520087 : def = df_single_def (insn_info);
4830 58520087 : use = df_single_use (insn_info);
4831 58520087 : if (use
4832 58520087 : && def
4833 19310268 : && DF_REF_REGNO (use) == DF_REF_REGNO (def)
4834 618048 : && !bitmap_bit_p (set, DF_REF_REGNO (use))
4835 58600222 : && rtx_moveable_p (&PATTERN (insn), OP_IN))
4836 : {
4837 32436 : unsigned regno = DF_REF_REGNO (use);
4838 32436 : bitmap_set_bit (moveable, regno);
4839 32436 : bitmap_set_bit (set, regno);
4840 32436 : bitmap_set_bit (used, regno);
4841 32436 : bitmap_clear_bit (transp, regno);
4842 32436 : continue;
4843 32436 : }
4844 131006400 : FOR_EACH_INSN_INFO_USE (use, insn_info)
4845 : {
4846 72518749 : unsigned regno = DF_REF_REGNO (use);
4847 72518749 : bitmap_set_bit (used, regno);
4848 72518749 : if (bitmap_clear_bit (moveable, regno))
4849 15693 : bitmap_clear_bit (transp, regno);
4850 : }
4851 :
4852 483076207 : FOR_EACH_INSN_INFO_DEF (def, insn_info)
4853 : {
4854 424588556 : unsigned regno = DF_REF_REGNO (def);
4855 424588556 : bitmap_set_bit (set, regno);
4856 424588556 : bitmap_clear_bit (transp, regno);
4857 424588556 : bitmap_clear_bit (moveable, regno);
4858 : }
4859 : }
4860 : }
4861 :
4862 12051971 : FOR_EACH_BB_FN (bb, cfun)
4863 : {
4864 11008286 : bitmap local = bb_local + bb->index;
4865 11008286 : rtx_insn *insn;
4866 :
4867 139692749 : FOR_BB_INSNS (bb, insn)
4868 128684463 : if (NONDEBUG_INSN_P (insn))
4869 : {
4870 58520087 : df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
4871 58520087 : rtx_insn *def_insn;
4872 58520087 : rtx closest_use, note;
4873 58520087 : df_ref def, use;
4874 58520087 : unsigned regno;
4875 58520087 : bool all_dominated, all_local;
4876 58520087 : machine_mode mode;
4877 :
4878 58520087 : def = df_single_def (insn_info);
4879 : /* There must be exactly one def in this insn. */
4880 31946438 : if (!def || !single_set (insn))
4881 26657115 : continue;
4882 : /* This must be the only definition of the reg. We also limit
4883 : which modes we deal with so that we can assume we can generate
4884 : move instructions. */
4885 31862972 : regno = DF_REF_REGNO (def);
4886 31862972 : mode = GET_MODE (DF_REF_REG (def));
4887 31862972 : if (DF_REG_DEF_COUNT (regno) != 1
4888 12131169 : || !DF_REF_INSN_INFO (def)
4889 12131169 : || HARD_REGISTER_NUM_P (regno)
4890 12098743 : || DF_REG_EQ_USE_COUNT (regno) > 0
4891 11554046 : || (!INTEGRAL_MODE_P (mode)
4892 : && !FLOAT_MODE_P (mode)
4893 : && !OPAQUE_MODE_P (mode)))
4894 20308926 : continue;
4895 11554046 : def_insn = DF_REF_INSN (def);
4896 :
4897 19492703 : for (note = REG_NOTES (def_insn); note; note = XEXP (note, 1))
4898 10547140 : if (REG_NOTE_KIND (note) == REG_EQUIV && MEM_P (XEXP (note, 0)))
4899 : break;
4900 :
4901 11554046 : if (note)
4902 : {
4903 2608483 : if (dump_file)
4904 68 : fprintf (dump_file, "Ignoring reg %d, has equiv memory\n",
4905 : regno);
4906 2608483 : bitmap_set_bit (unusable_as_input, regno);
4907 2608483 : continue;
4908 : }
4909 :
4910 8945563 : use = DF_REG_USE_CHAIN (regno);
4911 8945563 : all_dominated = true;
4912 8945563 : all_local = true;
4913 8945563 : closest_use = NULL_RTX;
4914 26416198 : for (; use; use = DF_REF_NEXT_REG (use))
4915 : {
4916 17470635 : rtx_insn *insn;
4917 17470635 : if (!DF_REF_INSN_INFO (use))
4918 : {
4919 : all_dominated = false;
4920 : all_local = false;
4921 : break;
4922 : }
4923 17470635 : insn = DF_REF_INSN (use);
4924 17470635 : if (DEBUG_INSN_P (insn))
4925 2234471 : continue;
4926 15236164 : if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (def_insn))
4927 6077043 : all_local = false;
4928 15236164 : if (!insn_dominated_by_p (insn, def_insn, uid_luid))
4929 9739 : all_dominated = false;
4930 15236164 : if (closest_use != insn && closest_use != const0_rtx)
4931 : {
4932 13388499 : if (closest_use == NULL_RTX)
4933 : closest_use = insn;
4934 4506072 : else if (insn_dominated_by_p (closest_use, insn, uid_luid))
4935 : closest_use = insn;
4936 1319156 : else if (!insn_dominated_by_p (insn, closest_use, uid_luid))
4937 626414 : closest_use = const0_rtx;
4938 : }
4939 : }
4940 8945563 : if (!all_dominated)
4941 : {
4942 4889 : if (dump_file)
4943 0 : fprintf (dump_file, "Reg %d not all uses dominated by set\n",
4944 : regno);
4945 4889 : continue;
4946 : }
4947 8940674 : if (all_local)
4948 6368542 : bitmap_set_bit (local, regno);
4949 8315940 : if (closest_use == const0_rtx || closest_use == NULL
4950 17193478 : || next_nonnote_nondebug_insn (def_insn) == closest_use)
4951 : {
4952 5631075 : if (dump_file)
4953 99 : fprintf (dump_file, "Reg %d uninteresting%s\n", regno,
4954 99 : closest_use == const0_rtx || closest_use == NULL
4955 : ? " (no unique first use)" : "");
4956 5631075 : continue;
4957 : }
4958 :
4959 3309599 : bitmap_set_bit (interesting, regno);
4960 : /* If we get here, we know closest_use is a non-NULL insn
4961 : (as opposed to const0_rtx). */
4962 3309599 : closest_uses[regno] = as_a <rtx_insn *> (closest_use);
4963 :
4964 3309599 : if (dump_file && (all_local || all_dominated))
4965 : {
4966 78 : fprintf (dump_file, "Reg %u:", regno);
4967 78 : if (all_local)
4968 14 : fprintf (dump_file, " local to bb %d", bb->index);
4969 78 : if (all_dominated)
4970 78 : fprintf (dump_file, " def dominates all uses");
4971 78 : if (closest_use != const0_rtx)
4972 78 : fprintf (dump_file, " has unique first use");
4973 78 : fputs ("\n", dump_file);
4974 : }
4975 : }
4976 : }
4977 :
4978 4353284 : EXECUTE_IF_SET_IN_BITMAP (interesting, 0, i, bi)
4979 : {
4980 3309599 : df_ref def = DF_REG_DEF_CHAIN (i);
4981 3309599 : rtx_insn *def_insn = DF_REF_INSN (def);
4982 3309599 : basic_block def_block = BLOCK_FOR_INSN (def_insn);
4983 3309599 : bitmap def_bb_local = bb_local + def_block->index;
4984 3309599 : bitmap def_bb_moveable = bb_moveable_reg_sets + def_block->index;
4985 3309599 : bitmap def_bb_transp = bb_transp_live + def_block->index;
4986 3309599 : bool local_to_bb_p = bitmap_bit_p (def_bb_local, i);
4987 3309599 : rtx_insn *use_insn = closest_uses[i];
4988 3309599 : df_ref use;
4989 3309599 : bool all_ok = true;
4990 3309599 : bool all_transp = true;
4991 :
4992 3309599 : if (!REG_P (DF_REF_REG (def)))
4993 40546 : continue;
4994 :
4995 3269053 : if (!local_to_bb_p)
4996 : {
4997 1233745 : if (dump_file)
4998 64 : fprintf (dump_file, "Reg %u not local to one basic block\n",
4999 : i);
5000 1233745 : continue;
5001 : }
5002 2035308 : if (reg_equiv_init (i) != NULL_RTX)
5003 : {
5004 46836 : if (dump_file)
5005 0 : fprintf (dump_file, "Ignoring reg %u with equiv init insn\n",
5006 : i);
5007 46836 : continue;
5008 : }
5009 1988472 : if (!rtx_moveable_p (&PATTERN (def_insn), OP_IN))
5010 : {
5011 1376149 : if (dump_file)
5012 14 : fprintf (dump_file, "Found def insn %d for %d to be not moveable\n",
5013 14 : INSN_UID (def_insn), i);
5014 1376149 : continue;
5015 : }
5016 612323 : if (dump_file)
5017 0 : fprintf (dump_file, "Examining insn %d, def for %d\n",
5018 0 : INSN_UID (def_insn), i);
5019 1375700 : FOR_EACH_INSN_USE (use, def_insn)
5020 : {
5021 822850 : unsigned regno = DF_REF_REGNO (use);
5022 822850 : if (bitmap_bit_p (unusable_as_input, regno))
5023 : {
5024 59473 : all_ok = false;
5025 59473 : if (dump_file)
5026 0 : fprintf (dump_file, " found unusable input reg %u.\n", regno);
5027 : break;
5028 : }
5029 763377 : if (!bitmap_bit_p (def_bb_transp, regno))
5030 : {
5031 711600 : if (bitmap_bit_p (def_bb_moveable, regno)
5032 711600 : && !control_flow_insn_p (use_insn))
5033 : {
5034 35 : if (modified_between_p (DF_REF_REG (use), def_insn, use_insn))
5035 : {
5036 0 : rtx_insn *x = NEXT_INSN (def_insn);
5037 0 : while (!modified_in_p (DF_REF_REG (use), x))
5038 : {
5039 0 : gcc_assert (x != use_insn);
5040 0 : x = NEXT_INSN (x);
5041 : }
5042 0 : if (dump_file)
5043 0 : fprintf (dump_file, " input reg %u modified but insn %d moveable\n",
5044 0 : regno, INSN_UID (x));
5045 0 : emit_insn_after (PATTERN (x), use_insn);
5046 0 : set_insn_deleted (x);
5047 : }
5048 : else
5049 : {
5050 35 : if (dump_file)
5051 0 : fprintf (dump_file, " input reg %u modified between def and use\n",
5052 : regno);
5053 : all_transp = false;
5054 : }
5055 : }
5056 : else
5057 : all_transp = false;
5058 : }
5059 : }
5060 0 : if (!all_ok)
5061 59473 : continue;
5062 552850 : if (!dbg_cnt (ira_move))
5063 : break;
5064 552850 : if (dump_file)
5065 0 : fprintf (dump_file, " all ok%s\n", all_transp ? " and transp" : "");
5066 :
5067 552850 : if (all_transp)
5068 : {
5069 14472 : rtx def_reg = DF_REF_REG (def);
5070 14472 : rtx newreg = ira_create_new_reg (def_reg);
5071 14472 : if (validate_change (def_insn, DF_REF_REAL_LOC (def), newreg, 0))
5072 : {
5073 14472 : unsigned nregno = REGNO (newreg);
5074 14472 : emit_insn_before (gen_move_insn (def_reg, newreg), use_insn);
5075 14472 : nregno -= max_regs;
5076 14472 : pseudo_replaced_reg[nregno] = def_reg;
5077 : }
5078 : }
5079 : }
5080 :
5081 12051971 : FOR_EACH_BB_FN (bb, cfun)
5082 : {
5083 11008286 : bitmap_clear (bb_local + bb->index);
5084 11008286 : bitmap_clear (bb_transp_live + bb->index);
5085 11008286 : bitmap_clear (bb_moveable_reg_sets + bb->index);
5086 : }
5087 1043685 : free (uid_luid);
5088 1043685 : free (closest_uses);
5089 1043685 : free (bb_local);
5090 1043685 : free (bb_transp_live);
5091 1043685 : free (bb_moveable_reg_sets);
5092 :
5093 1043685 : last_moveable_pseudo = max_reg_num ();
5094 :
5095 1043685 : fix_reg_equiv_init ();
5096 1043685 : expand_reg_info ();
5097 1043685 : regstat_free_n_sets_and_refs ();
5098 1043685 : regstat_free_ri ();
5099 1043685 : regstat_init_n_sets_and_refs ();
5100 1043685 : regstat_compute_ri ();
5101 1043685 : free_dominance_info (CDI_DOMINATORS);
5102 1043685 : }
5103 :
5104 : /* If the SET pattern SET is an assignment from a hard register to a pseudo which
5105 : is live at CALL_DOM (if non-NULL, otherwise this check is omitted), return
5106 : the destination. Otherwise return NULL. */
5107 :
5108 : static rtx
5109 2085472 : interesting_dest_for_shprep_1 (rtx set, basic_block call_dom)
5110 : {
5111 2085472 : rtx src = SET_SRC (set);
5112 2085472 : rtx dest = SET_DEST (set);
5113 702799 : if (!REG_P (src) || !HARD_REGISTER_P (src)
5114 557640 : || !REG_P (dest) || HARD_REGISTER_P (dest)
5115 2622679 : || (call_dom && !bitmap_bit_p (df_get_live_in (call_dom), REGNO (dest))))
5116 1637131 : return NULL;
5117 : return dest;
5118 : }
5119 :
5120 : /* If INSN is interesting for parameter range-splitting shrink-wrapping
5121 :    preparation, i.e. it is a single SET from a hard register to a pseudo which
5122 :    is live at CALL_DOM (if non-NULL, otherwise this check is omitted), or a
5123 :    PARALLEL containing only one such SET, return the destination.
5124 :    Otherwise return NULL.  */
5125 :
5126 : static rtx
5127 4317777 : interesting_dest_for_shprep (rtx_insn *insn, basic_block call_dom)
5128 : {
5129 4317777 : if (!INSN_P (insn))
5130 : return NULL;
5131 3446182 : rtx pat = PATTERN (insn);
5132 3446182 : if (GET_CODE (pat) == SET)
5133 1851901 : return interesting_dest_for_shprep_1 (pat, call_dom);
5134 :
5135 1594281 : if (GET_CODE (pat) != PARALLEL)
5136 : return NULL;
5137 : rtx ret = NULL;
5138 611269 : for (int i = 0; i < XVECLEN (pat, 0); i++)
5139 : {
5140 412437 : rtx sub = XVECEXP (pat, 0, i);
5141 412437 : if (GET_CODE (sub) == USE || GET_CODE (sub) == CLOBBER)
5142 171955 : continue;
5143 240482 : if (GET_CODE (sub) != SET
5144 240482 : || side_effects_p (sub))
5145 6911 : return NULL;
5146 233571 : rtx dest = interesting_dest_for_shprep_1 (sub, call_dom);
5147 233571 : if (dest && ret)
5148 : return NULL;
5149 233571 : if (dest)
5150 405526 : ret = dest;
5151 : }
5152 : return ret;
5153 : }
5154 :
5155 : /* Split live ranges of pseudos that are loaded from hard registers in the
5156 :    first BB, in a BB that dominates all non-sibling calls, if such a BB can be
5157 : found and is not in a loop. Return true if the function has made any
5158 : changes. */
5159 :
5160 : static bool
5161 1043685 : split_live_ranges_for_shrink_wrap (void)
5162 : {
5163 1043685 : basic_block bb, call_dom = NULL;
5164 1043685 : basic_block first = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
5165 1043685 : rtx_insn *insn, *last_interesting_insn = NULL;
5166 1043685 : auto_bitmap need_new, reachable;
5167 1043685 : vec<basic_block> queue;
5168 :
5169 1043685 : if (!SHRINK_WRAPPING_ENABLED)
5170 246 : return false;
5171 :
5172 1043439 : queue.create (n_basic_blocks_for_fn (cfun));
5173 :
5174 7533537 : FOR_EACH_BB_FN (bb, cfun)
5175 75360327 : FOR_BB_INSNS (bb, insn)
5176 70226271 : if (CALL_P (insn) && !SIBLING_CALL_P (insn))
5177 : {
5178 1764023 : if (bb == first)
5179 : {
5180 407981 : queue.release ();
5181 407981 : return false;
5182 : }
5183 :
5184 1356042 : bitmap_set_bit (need_new, bb->index);
5185 1356042 : bitmap_set_bit (reachable, bb->index);
5186 1356042 : queue.quick_push (bb);
5187 1356042 : break;
5188 : }
5189 :
5190 635458 : if (queue.is_empty ())
5191 : {
5192 380773 : queue.release ();
5193 380773 : return false;
5194 : }
5195 :
5196 4188655 : while (!queue.is_empty ())
5197 : {
5198 3933970 : edge e;
5199 3933970 : edge_iterator ei;
5200 :
5201 3933970 : bb = queue.pop ();
5202 9451436 : FOR_EACH_EDGE (e, ei, bb->succs)
5203 5517466 : if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
5204 5517466 : && bitmap_set_bit (reachable, e->dest->index))
5205 2577928 : queue.quick_push (e->dest);
5206 : }
5207 254685 : queue.release ();
5208 :
5209 3876572 : FOR_BB_INSNS (first, insn)
5210 : {
5211 3622163 : rtx dest = interesting_dest_for_shprep (insn, NULL);
5212 3622163 : if (!dest)
5213 3229204 : continue;
5214 :
5215 392959 : if (DF_REG_DEF_COUNT (REGNO (dest)) > 1)
5216 : return false;
5217 :
5218 392683 : for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
5219 2875038 : use;
5220 2482355 : use = DF_REF_NEXT_REG (use))
5221 : {
5222 2482355 : int ubbi = DF_REF_BB (use)->index;
5223 :
5224 : 	  /* Only non-debug insns should be taken into account.  */
5225 2482355 : if (NONDEBUG_INSN_P (DF_REF_INSN (use))
5226 2482355 : && bitmap_bit_p (reachable, ubbi))
5227 1126158 : bitmap_set_bit (need_new, ubbi);
5228 : }
5229 : last_interesting_insn = insn;
5230 : }
5231 :
5232 254409 : if (!last_interesting_insn)
5233 : return false;
5234 :
5235 182522 : call_dom = nearest_common_dominator_for_set (CDI_DOMINATORS, need_new);
5236 182522 : if (call_dom == first)
5237 : return false;
5238 :
5239 95414 : loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
5240 213300 : while (bb_loop_depth (call_dom) > 0)
5241 22472 : call_dom = get_immediate_dominator (CDI_DOMINATORS, call_dom);
5242 95414 : loop_optimizer_finalize ();
5243 :
5244 95414 : if (call_dom == first)
5245 : return false;
5246 :
5247 83708 : calculate_dominance_info (CDI_POST_DOMINATORS);
5248 83708 : if (dominated_by_p (CDI_POST_DOMINATORS, first, call_dom))
5249 : {
5250 7574 : free_dominance_info (CDI_POST_DOMINATORS);
5251 7574 : return false;
5252 : }
5253 76134 : free_dominance_info (CDI_POST_DOMINATORS);
5254 :
5255 76134 : if (dump_file)
5256 2 : fprintf (dump_file, "Will split live ranges of parameters at BB %i\n",
5257 : call_dom->index);
5258 :
5259 76134 : bool ret = false;
5260 747747 : FOR_BB_INSNS (first, insn)
5261 : {
5262 695614 : rtx dest = interesting_dest_for_shprep (insn, call_dom);
5263 695614 : if (!dest || dest == pic_offset_table_rtx)
5264 640232 : continue;
5265 :
5266 55382 : bool need_newreg = false;
5267 55382 : df_ref use, next;
5268 70169 : for (use = DF_REG_USE_CHAIN (REGNO (dest)); use; use = next)
5269 : {
5270 70099 : rtx_insn *uin = DF_REF_INSN (use);
5271 70099 : next = DF_REF_NEXT_REG (use);
5272 :
5273 70099 : if (DEBUG_INSN_P (uin))
5274 411 : continue;
5275 :
5276 69688 : basic_block ubb = BLOCK_FOR_INSN (uin);
5277 69688 : if (ubb == call_dom
5278 69688 : || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
5279 : {
5280 : need_newreg = true;
5281 : break;
5282 : }
5283 : }
5284 :
5285 55382 : if (need_newreg)
5286 : {
5287 55312 : rtx newreg = ira_create_new_reg (dest);
5288 :
5289 434953 : for (use = DF_REG_USE_CHAIN (REGNO (dest)); use; use = next)
5290 : {
5291 379641 : rtx_insn *uin = DF_REF_INSN (use);
5292 379641 : next = DF_REF_NEXT_REG (use);
5293 :
5294 379641 : basic_block ubb = BLOCK_FOR_INSN (uin);
5295 379641 : if (ubb == call_dom
5296 379641 : || dominated_by_p (CDI_DOMINATORS, ubb, call_dom))
5297 275134 : validate_change (uin, DF_REF_REAL_LOC (use), newreg, true);
5298 : }
5299 :
5300 55312 : rtx_insn *new_move = gen_move_insn (newreg, dest);
5301 55312 : emit_insn_after (new_move, bb_note (call_dom));
5302 55312 : if (dump_file)
5303 : {
5304 2 : fprintf (dump_file, "Split live-range of register ");
5305 2 : print_rtl_single (dump_file, dest);
5306 : }
5307 : ret = true;
5308 : }
5309 :
5310 55382 : if (insn == last_interesting_insn)
5311 : break;
5312 : }
5313 76134 : apply_change_group ();
5314 76134 : return ret;
5315 1043685 : }
5316 :
5317 : /* Perform the second half of the transformation started in
5318 : find_moveable_pseudos. We look for instances where the newly introduced
5319 : pseudo remains unallocated, and remove it by moving the definition to
5320 : just before its use, replacing the move instruction generated by
5321 : find_moveable_pseudos. */
5322 : static void
5323 1043685 : move_unallocated_pseudos (void)
5324 : {
5325 1043685 : int i;
5326 1058157 : for (i = first_moveable_pseudo; i < last_moveable_pseudo; i++)
5327 14472 : if (reg_renumber[i] < 0)
5328 : {
5329 3431 : int idx = i - first_moveable_pseudo;
5330 3431 : rtx other_reg = pseudo_replaced_reg[idx];
5331 : 	/* The iteration range [first_moveable_pseudo, last_moveable_pseudo)
5332 : 	   covers every new pseudo created in find_moveable_pseudos,
5333 : 	   regardless of whether the validation with it succeeded or not.
5334 : 	   So we need to skip the pseudos which were used in those failed
5335 : 	   validations to avoid unexpected DF info and a consequent ICE.
5336 : 	   Since we only set pseudo_replaced_reg[] when the validation succeeds
5337 : 	   in find_moveable_pseudos, it's enough to check it here.  */
5338 3431 : if (!other_reg)
5339 0 : continue;
5340 3431 : rtx_insn *def_insn = DF_REF_INSN (DF_REG_DEF_CHAIN (i));
5341 : /* The use must follow all definitions of OTHER_REG, so we can
5342 : insert the new definition immediately after any of them. */
5343 3431 : df_ref other_def = DF_REG_DEF_CHAIN (REGNO (other_reg));
5344 3431 : rtx_insn *move_insn = DF_REF_INSN (other_def);
5345 3431 : rtx_insn *newinsn = emit_insn_after (PATTERN (def_insn), move_insn);
5346 3431 : rtx set;
5347 3431 : int success;
5348 :
5349 3431 : if (dump_file)
5350 0 : fprintf (dump_file, "moving def of %d (insn %d now) ",
5351 0 : REGNO (other_reg), INSN_UID (def_insn));
5352 :
5353 3431 : delete_insn (move_insn);
5354 6862 : while ((other_def = DF_REG_DEF_CHAIN (REGNO (other_reg))))
5355 0 : delete_insn (DF_REF_INSN (other_def));
5356 3431 : delete_insn (def_insn);
5357 :
5358 3431 : set = single_set (newinsn);
5359 3431 : success = validate_change (newinsn, &SET_DEST (set), other_reg, 0);
5360 3431 : gcc_assert (success);
5361 3431 : if (dump_file)
5362 0 : fprintf (dump_file, " %d) rather than keep unallocated replacement %d\n",
5363 0 : INSN_UID (newinsn), i);
5364 3431 : SET_REG_N_REFS (i, 0);
5365 : }
5366 :
5367 1043685 : first_moveable_pseudo = last_moveable_pseudo = 0;
5368 1043685 : }
5369 :
5370 :
5371 :
5372 : /* Code dealing with scratches (changing them into
5373 : pseudos and restoring them from the pseudos).
5374 :
5375 : We change scratches into pseudos at the beginning of IRA to
5376 : simplify dealing with them (conflicts, hard register assignments).
5377 :
5378 :    If the pseudo denoting a scratch was spilled, it means that we do
5379 : need a hard register for it. Such pseudos are transformed back to
5380 : scratches at the end of LRA. */
5381 :
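The remove/restore protocol just described can be sketched in miniature: replace each SCRATCH operand with a fresh pseudo while recording its location, then turn any pseudo that ended up without a hard register back into a scratch. This is a hedged toy model (plain ints stand in for rtxes, MiniSloc mirrors struct sloc below), not GCC's real representation:

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of the scratch bookkeeping: an insn is a vector of
// operand "regnos", and SCRATCH_MARK stands in for a SCRATCH rtx.
struct MiniSloc {
  int insn_uid;  // insn where the scratch was
  int nop;       // operand number that was a scratch
  int regno;     // pseudo generated instead of the scratch
};

static const int SCRATCH_MARK = -1;

// Replace every SCRATCH operand in INSNS with a fresh pseudo, recording the
// location so it can be restored later (cf. remove_scratches below).
std::vector<MiniSloc> remove_scratches_toy (std::vector<std::vector<int>> &insns,
                                            int &next_pseudo)
{
  std::vector<MiniSloc> locs;
  for (int uid = 0; uid < (int) insns.size (); uid++)
    for (int nop = 0; nop < (int) insns[uid].size (); nop++)
      if (insns[uid][nop] == SCRATCH_MARK)
        {
          insns[uid][nop] = next_pseudo;
          locs.push_back ({uid, nop, next_pseudo++});
        }
  return locs;
}

// Turn back into scratches the recorded pseudos that did not get a hard
// register (reg_renumber[regno] < 0), mirroring ira_restore_scratches.
void restore_scratches_toy (std::vector<std::vector<int>> &insns,
                            const std::vector<MiniSloc> &locs,
                            const std::vector<int> &reg_renumber)
{
  for (const MiniSloc &loc : locs)
    if (reg_renumber[loc.regno] < 0)
      insns[loc.insn_uid][loc.nop] = SCRATCH_MARK;
}
```

The essential invariant is the one stated above: a recorded pseudo that was spilled never needed a hard register, so it is safe to make it a SCRATCH again.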
5382 : /* Description of location of a former scratch operand. */
5383 : struct sloc
5384 : {
5385 : rtx_insn *insn; /* Insn where the scratch was. */
5386 : int nop; /* Number of the operand which was a scratch. */
5387 :   unsigned regno; /* regno generated instead of the scratch.  */
5388 : int icode; /* Original icode from which scratch was removed. */
5389 : };
5390 :
5391 : typedef struct sloc *sloc_t;
5392 :
5393 : /* Locations of the former scratches. */
5394 : static vec<sloc_t> scratches;
5395 :
5396 : /* Bitmap of scratch regnos. */
5397 : static bitmap_head scratch_bitmap;
5398 :
5399 : /* Bitmap of scratch operands. */
5400 : static bitmap_head scratch_operand_bitmap;
5401 :
5402 : /* Return true if pseudo REGNO was created from a SCRATCH.  */
5403 : bool
5404 370314746 : ira_former_scratch_p (int regno)
5405 : {
5406 370314746 : return bitmap_bit_p (&scratch_bitmap, regno);
5407 : }
5408 :
5409 : /* Return true if the operand NOP of INSN is a former scratch. */
5410 : bool
5411 0 : ira_former_scratch_operand_p (rtx_insn *insn, int nop)
5412 : {
5413 0 : return bitmap_bit_p (&scratch_operand_bitmap,
5414 0 : INSN_UID (insn) * MAX_RECOG_OPERANDS + nop) != 0;
5415 : }
5416 :
5417 : /* Register operand NOP in INSN as a former scratch. It will be
5418 :    changed back to a scratch, if necessary, at the end of LRA.  */
5419 : void
5420 91924 : ira_register_new_scratch_op (rtx_insn *insn, int nop, int icode)
5421 : {
5422 91924 : rtx op = *recog_data.operand_loc[nop];
5423 91924 : sloc_t loc = XNEW (struct sloc);
5424 91924 : ira_assert (REG_P (op));
5425 91924 : loc->insn = insn;
5426 91924 : loc->nop = nop;
5427 91924 : loc->regno = REGNO (op);
5428 91924 : loc->icode = icode;
5429 91924 : scratches.safe_push (loc);
5430 91924 : bitmap_set_bit (&scratch_bitmap, REGNO (op));
5431 183848 : bitmap_set_bit (&scratch_operand_bitmap,
5432 91924 : INSN_UID (insn) * MAX_RECOG_OPERANDS + nop);
5433 91924 : add_reg_note (insn, REG_UNUSED, op);
5434 91924 : }
5435 :
5436 : /* Return true if string STR contains constraint 'X'. */
5437 : static bool
5438 91924 : contains_X_constraint_p (const char *str)
5439 : {
5440 91924 : int c;
5441 :
5442 337592 : while ((c = *str))
5443 : {
5444 254626 : str += CONSTRAINT_LEN (c, str);
5445 254626 : if (c == 'X') return true;
5446 : }
5447 : return false;
5448 : }
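contains_X_constraint_p above walks the constraint string in CONSTRAINT_LEN-sized steps looking for 'X'. A self-contained sketch under the simplifying (and hedged) assumption that every constraint is one character long — real GCC constraints can be multi-letter, which is exactly why CONSTRAINT_LEN exists:

```cpp
#include <cassert>

// Stand-in for GCC's CONSTRAINT_LEN macro; we assume single-character
// constraints here, whereas the real macro can return lengths > 1.
static int constraint_len (char /*c*/, const char * /*str*/)
{
  return 1;
}

// Same scan as contains_X_constraint_p above, under that assumption:
// advance by the constraint's length, checking each leading letter.
static bool contains_X (const char *str)
{
  int c;
  while ((c = *str))
    {
      str += constraint_len (c, str);
      if (c == 'X')
        return true;
    }
  return false;
}
```

Advancing by the constraint length rather than one character at a time keeps the scan from misreading the tail of a multi-letter constraint as a fresh 'X'.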
5449 :
5450 : /* Change INSN's scratches into pseudos and save their location.
5451 : Return true if we changed any scratch. */
5452 : bool
5453 274001632 : ira_remove_insn_scratches (rtx_insn *insn, bool all_p, FILE *dump_file,
5454 : rtx (*get_reg) (rtx original))
5455 : {
5456 274001632 : int i;
5457 274001632 : bool insn_changed_p;
5458 274001632 : rtx reg, *loc;
5459 :
5460 274001632 : extract_insn (insn);
5461 274001632 : insn_changed_p = false;
5462 923891062 : for (i = 0; i < recog_data.n_operands; i++)
5463 : {
5464 375887798 : loc = recog_data.operand_loc[i];
5465 375887798 : if (GET_CODE (*loc) == SCRATCH && GET_MODE (*loc) != VOIDmode)
5466 : {
5467 100882 : if (! all_p && contains_X_constraint_p (recog_data.constraints[i]))
5468 8958 : continue;
5469 91924 : insn_changed_p = true;
5470 91924 : *loc = reg = get_reg (*loc);
5471 91924 : ira_register_new_scratch_op (insn, i, INSN_CODE (insn));
5472 91924 : if (ira_dump_file != NULL)
5473 0 : fprintf (dump_file,
5474 : "Removing SCRATCH to p%u in insn #%u (nop %d)\n",
5475 0 : REGNO (reg), INSN_UID (insn), i);
5476 : }
5477 : }
5478 274001632 : return insn_changed_p;
5479 : }
5480 :
5481 : /* Return new register of the same mode as ORIGINAL. Used in
5482 : remove_scratches. */
5483 : static rtx
5484 82966 : get_scratch_reg (rtx original)
5485 : {
5486 82966 : return gen_reg_rtx (GET_MODE (original));
5487 : }
5488 :
5489 : /* Change scratches into pseudos and save their location. Return true
5490 : if we changed any scratch. */
5491 : static bool
5492 1471362 : remove_scratches (void)
5493 : {
5494 1471362 : bool change_p = false;
5495 1471362 : basic_block bb;
5496 1471362 : rtx_insn *insn;
5497 :
5498 1471362 : scratches.create (get_max_uid ());
5499 1471362 : bitmap_initialize (&scratch_bitmap, ®_obstack);
5500 1471362 : bitmap_initialize (&scratch_operand_bitmap, ®_obstack);
5501 15875447 : FOR_EACH_BB_FN (bb, cfun)
5502 173048043 : FOR_BB_INSNS (bb, insn)
5503 158643958 : if (INSN_P (insn)
5504 158643958 : && ira_remove_insn_scratches (insn, false, ira_dump_file, get_scratch_reg))
5505 : {
5506 : /* Because we might use DF, we need to keep DF info up to date. */
5507 81799 : df_insn_rescan (insn);
5508 81799 : change_p = true;
5509 : }
5510 1471362 : return change_p;
5511 : }
5512 :
5513 : /* Change pseudos created by function remove_scratches back into scratches.  */
5514 : void
5515 1471362 : ira_restore_scratches (FILE *dump_file)
5516 : {
5517 1471362 : int regno, n;
5518 1471362 : unsigned i;
5519 1471362 : rtx *op_loc;
5520 1471362 : sloc_t loc;
5521 :
5522 1563286 : for (i = 0; scratches.iterate (i, &loc); i++)
5523 : {
5524 : /* Ignore already deleted insns. */
5525 91924 : if (NOTE_P (loc->insn)
5526 0 : && NOTE_KIND (loc->insn) == NOTE_INSN_DELETED)
5527 0 : continue;
5528 91924 : extract_insn (loc->insn);
5529 91924 : if (loc->icode != INSN_CODE (loc->insn))
5530 : {
5531 : /* The icode doesn't match, which means the insn has been modified
5532 : (e.g. register elimination). The scratch cannot be restored. */
5533 0 : continue;
5534 : }
5535 91924 : op_loc = recog_data.operand_loc[loc->nop];
5536 91924 : if (REG_P (*op_loc)
5537 91924 : && ((regno = REGNO (*op_loc)) >= FIRST_PSEUDO_REGISTER)
5538 183848 : && reg_renumber[regno] < 0)
5539 : {
5540 : 	  /* This should be the only case: a scratch register with the
5541 : 	     chosen constraint 'X' got neither memory nor a hard register.  */
5542 5283 : ira_assert (ira_former_scratch_p (regno));
5543 5283 : *op_loc = gen_rtx_SCRATCH (GET_MODE (*op_loc));
5544 5283 : for (n = 0; n < recog_data.n_dups; n++)
5545 0 : *recog_data.dup_loc[n]
5546 0 : = *recog_data.operand_loc[(int) recog_data.dup_num[n]];
5547 5283 : if (dump_file != NULL)
5548 0 : fprintf (dump_file, "Restoring SCRATCH in insn #%u(nop %d)\n",
5549 0 : INSN_UID (loc->insn), loc->nop);
5550 : }
5551 : }
5552 1563286 : for (i = 0; scratches.iterate (i, &loc); i++)
5553 91924 : free (loc);
5554 1471362 : scratches.release ();
5555 1471362 : bitmap_clear (&scratch_bitmap);
5556 1471362 : bitmap_clear (&scratch_operand_bitmap);
5557 1471362 : }
5558 :
5559 :
5560 :
5561 : /* If the backend knows where to allocate pseudos for hard
5562 : register initial values, register these allocations now. */
5563 : static void
5564 1471362 : allocate_initial_values (void)
5565 : {
5566 1471362 : if (targetm.allocate_initial_value)
5567 : {
5568 : rtx hreg, preg, x;
5569 : int i, regno;
5570 :
5571 0 : for (i = 0; HARD_REGISTER_NUM_P (i); i++)
5572 : {
5573 0 : if (! initial_value_entry (i, &hreg, &preg))
5574 : break;
5575 :
5576 0 : x = targetm.allocate_initial_value (hreg);
5577 0 : regno = REGNO (preg);
5578 0 : if (x && REG_N_SETS (regno) <= 1)
5579 : {
5580 0 : if (MEM_P (x))
5581 0 : reg_equiv_memory_loc (regno) = x;
5582 : else
5583 : {
5584 0 : basic_block bb;
5585 0 : int new_regno;
5586 :
5587 0 : gcc_assert (REG_P (x));
5588 0 : new_regno = REGNO (x);
5589 0 : reg_renumber[regno] = new_regno;
5590 : /* Poke the regno right into regno_reg_rtx so that even
5591 : fixed regs are accepted. */
5592 0 : SET_REGNO (preg, new_regno);
5593 : /* Update global register liveness information. */
5594 0 : FOR_EACH_BB_FN (bb, cfun)
5595 : {
5596 0 : if (REGNO_REG_SET_P (df_get_live_in (bb), regno))
5597 0 : SET_REGNO_REG_SET (df_get_live_in (bb), new_regno);
5598 0 : if (REGNO_REG_SET_P (df_get_live_out (bb), regno))
5599 0 : SET_REGNO_REG_SET (df_get_live_out (bb), new_regno);
5600 : }
5601 : }
5602 : }
5603 : }
5604 :
5605 0 : gcc_checking_assert (! initial_value_entry (FIRST_PSEUDO_REGISTER,
5606 : &hreg, &preg));
5607 : }
5608 1471362 : }
5609 :
5610 :
5611 :
5612 :
5613 : /* True when we use LRA instead of reload pass for the current
5614 : function. */
5615 : bool ira_use_lra_p;
5616 :
5617 : /* True if we have allocno conflicts. It is false for non-optimized
5618 : mode or when the conflict table is too big. */
5619 : bool ira_conflicts_p;
5620 :
5621 : /* Saved between IRA and reload. */
5622 : static int saved_flag_ira_share_spill_slots;
5623 :
5624 : /* Set to true while in IRA. */
5625 : bool ira_in_progress = false;
5626 :
5627 : /* Set up array ira_hard_regno_nrefs. */
5628 : static void
5629 1471362 : setup_hard_regno_nrefs (void)
5630 : {
5631 1471362 : int i;
5632 :
5633 136836666 : for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
5634 : {
5635 135365304 : ira_hard_regno_nrefs[i] = 0;
5636 135365304 : for (df_ref use = DF_REG_USE_CHAIN (i);
5637 251199416 : use != NULL;
5638 115834112 : use = DF_REF_NEXT_REG (use))
5639 115834112 : if (DF_REF_CLASS (use) != DF_REF_ARTIFICIAL
5640 52987067 : && !(DF_REF_INSN_INFO (use) && DEBUG_INSN_P (DF_REF_INSN (use))))
5641 49814116 : ira_hard_regno_nrefs[i]++;
5642 135365304 : for (df_ref def = DF_REG_DEF_CHAIN (i);
5643 683091933 : def != NULL;
5644 547726629 : def = DF_REF_NEXT_REG (def))
5645 547726629 : if (DF_REF_CLASS (def) != DF_REF_ARTIFICIAL
5646 520404889 : && !(DF_REF_INSN_INFO (def) && DEBUG_INSN_P (DF_REF_INSN (def))))
5647 520404889 : ira_hard_regno_nrefs[i]++;
5648 : }
5649 1471362 : }
5650 :
5651 : /* This is the main entry of IRA. */
5652 : static void
5653 1471362 : ira (FILE *f)
5654 : {
5655 1471362 : bool loops_p;
5656 1471362 : int ira_max_point_before_emit;
5657 1471362 : bool saved_flag_caller_saves = flag_caller_saves;
5658 1471362 : enum ira_region saved_flag_ira_region = flag_ira_region;
5659 1471362 : basic_block bb;
5660 1471362 : edge_iterator ei;
5661 1471362 : edge e;
5662 1471362 : bool output_jump_reload_p = false;
5663 :
5664 1471362 : setup_hard_regno_nrefs ();
5665 1471362 : if (ira_use_lra_p)
5666 : {
5667 : /* First put potential jump output reloads on the output edges
5668 : 	 as USEs which will be removed at the end of LRA.  The major
5669 : 	 goal is actually to create BBs for critical edges for LRA and
5670 : 	 populate them later with live info.  It would be difficult
5671 : 	 to do this in LRA.  */
5672 15875438 : FOR_EACH_BB_FN (bb, cfun)
5673 : {
5674 14404076 : rtx_insn *end = BB_END (bb);
5675 14404076 : if (!JUMP_P (end))
5676 5626071 : continue;
5677 8778005 : extract_insn (end);
5678 23796685 : for (int i = 0; i < recog_data.n_operands; i++)
5679 15018825 : if (recog_data.operand_type[i] != OP_IN)
5680 : {
5681 145 : bool skip_p = false;
5682 497 : FOR_EACH_EDGE (e, ei, bb->succs)
5683 715 : if (EDGE_CRITICAL_P (e)
5684 11 : && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
5685 363 : && (e->flags & EDGE_ABNORMAL))
5686 : {
5687 : skip_p = true;
5688 : break;
5689 : }
5690 145 : if (skip_p)
5691 : break;
5692 145 : output_jump_reload_p = true;
5693 497 : FOR_EACH_EDGE (e, ei, bb->succs)
5694 715 : if (EDGE_CRITICAL_P (e)
5695 363 : && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
5696 : {
5697 11 : start_sequence ();
5698 : 	      /* We need to put some no-op insn here.  We cannot
5699 : 		 put a note, as commit_edge_insertions will
5700 : 		 fail.  */
5701 11 : emit_insn (gen_rtx_USE (VOIDmode, const1_rtx));
5702 11 : rtx_insn *insns = end_sequence ();
5703 11 : insert_insn_on_edge (insns, e);
5704 : }
5705 : break;
5706 : }
5707 : }
5708 1471362 : if (output_jump_reload_p)
5709 139 : commit_edge_insertions ();
5710 : }
5711 :
5712 1471362 : if (flag_ira_verbose < 10)
5713 : {
5714 1471362 : internal_flag_ira_verbose = flag_ira_verbose;
5715 1471362 : ira_dump_file = f;
5716 : }
5717 : else
5718 : {
5719 0 : internal_flag_ira_verbose = flag_ira_verbose - 10;
5720 0 : ira_dump_file = stderr;
5721 : }
5722 :
5723 1471362 : clear_bb_flags ();
5724 :
5725 : /* Determine if the current function is a leaf before running IRA
5726 : since this can impact optimizations done by the prologue and
5727 : epilogue thus changing register elimination offsets.
5728 : Other target callbacks may use crtl->is_leaf too, including
5729 : SHRINK_WRAPPING_ENABLED, so initialize as early as possible. */
5730 1471362 : crtl->is_leaf = leaf_function_p ();
5731 :
5732 : /* Perform target specific PIC register initialization. */
5733 1471362 : targetm.init_pic_reg ();
5734 :
5735 1471362 : ira_conflicts_p = optimize > 0;
5736 :
5737 : /* Determine the number of pseudos actually requiring coloring. */
5738 1471362 : unsigned int num_used_regs = 0;
5739 66815839 : for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
5740 65344477 : if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
5741 29840987 : num_used_regs++;
5742 :
5743 : /* If there are too many pseudos and/or basic blocks (e.g. 10K pseudos and
5744 : 10K blocks or 100K pseudos and 1K blocks) or we have too many function
5745 : insns, we will use simplified and faster algorithms in LRA. */
5746 1471362 : lra_simple_p
5747 1471362 : = (ira_use_lra_p
5748 1471362 : && (num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun)
5749 :        /* max uid is a good estimate of the number of insns, as most
5750 : 	  optimizations are done at the tree-SSA level.  */
5751 1471357 : || ((uint64_t) get_max_uid ()
5752 1471357 : > (uint64_t) param_ira_simple_lra_insn_threshold * 1000)));
5753 :
5754 1471362 : if (lra_simple_p)
5755 : {
5756 :       /* It permits skipping live range splitting in LRA.  */
5757 5 : flag_caller_saves = false;
5758 :       /* There is no sense in doing regional allocation when we use
5759 : 	 simplified LRA.  */
5760 5 : flag_ira_region = IRA_REGION_ONE;
5761 5 : ira_conflicts_p = false;
5762 : }
5763 :
5764 : #ifndef IRA_NO_OBSTACK
5765 : gcc_obstack_init (&ira_obstack);
5766 : #endif
5767 1471362 : bitmap_obstack_initialize (&ira_bitmap_obstack);
5768 :
5769 : /* LRA uses its own infrastructure to handle caller save registers. */
5770 1471362 : if (flag_caller_saves && !ira_use_lra_p)
5771 0 : init_caller_save ();
5772 :
5773 1471362 : setup_prohibited_mode_move_regs ();
5774 1471362 : decrease_live_ranges_number ();
5775 1471362 : df_note_add_problem ();
5776 :
5777 : /* DF_LIVE can't be used in the register allocator, too many other
5778 : parts of the compiler depend on using the "classic" liveness
5779 : interpretation of the DF_LR problem. See PR38711.
5780 : Remove the problem, so that we don't spend time updating it in
5781 : any of the df_analyze() calls during IRA/LRA. */
5782 1471362 : if (optimize > 1)
5783 963981 : df_remove_problem (df_live);
5784 1471362 : gcc_checking_assert (df_live == NULL);
5785 :
5786 1471362 : if (flag_checking)
5787 1471342 : df->changeable_flags |= DF_VERIFY_SCHEDULED;
5788 :
5789 1471362 : df_analyze ();
5790 :
5791 1471362 : init_reg_equiv ();
5792 1471362 : if (ira_conflicts_p)
5793 : {
5794 1043685 : calculate_dominance_info (CDI_DOMINATORS);
5795 :
5796 1043685 : if (split_live_ranges_for_shrink_wrap ())
5797 27289 : df_analyze ();
5798 :
5799 1043685 : free_dominance_info (CDI_DOMINATORS);
5800 : }
5801 :
5802 1471362 : df_clear_flags (DF_NO_INSN_RESCAN);
5803 :
5804 1471362 : indirect_jump_optimize ();
5805 1471362 : if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
5806 5398 : df_analyze ();
5807 :
5808 1471362 : regstat_init_n_sets_and_refs ();
5809 1471362 : regstat_compute_ri ();
5810 :
5811 : /* If we are not optimizing, then this is the only place before
5812 : register allocation where dataflow is done. And that is needed
5813 : to generate these warnings. */
5814 1471362 : if (warn_clobbered)
5815 134888 : generate_setjmp_warnings ();
5816 :
5817 : /* update_equiv_regs can use reg classes of pseudos and they are set up in
5818 : register pressure sensitive scheduling and loop invariant motion and in
5819 : live range shrinking. This info can become obsolete if we add new pseudos
5820 : since the last set up. Recalculate it again if the new pseudos were
5821 : added. */
5822 1471362 : if (resize_reg_info () && (flag_sched_pressure || flag_live_range_shrinkage
5823 1471267 : || flag_ira_loop_pressure))
5824 43 : ira_set_pseudo_classes (true, ira_dump_file);
5825 :
5826 1471362 : init_alias_analysis ();
5827 1471362 : loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
5828 1471362 : reg_equiv = XCNEWVEC (struct equivalence, max_reg_num ());
5829 1471362 : update_equiv_regs_prescan ();
5830 1471362 : update_equiv_regs ();
5831 :
5832 : /* Don't move insns if live range shrinkage or register
5833 : pressure-sensitive scheduling were done because it will not
5834 : improve allocation but likely worsen insn scheduling. */
5835 1471362 : if (optimize
5836 1043685 : && !flag_live_range_shrinkage
5837 1043654 : && !(flag_sched_pressure && flag_schedule_insns))
5838 1043632 : combine_and_move_insns ();
5839 :
5840 : /* Gather additional equivalences with memory. */
5841 1471362 : if (optimize && flag_expensive_optimizations)
5842 963943 : add_store_equivs ();
5843 :
5844 1471362 : loop_optimizer_finalize ();
5845 1471362 : free_dominance_info (CDI_DOMINATORS);
5846 1471362 : end_alias_analysis ();
5847 1471362 : free (reg_equiv);
5848 :
5849 : /* Once max_regno changes, we need to free and re-init/re-compute
5850 : some data structures like regstat_n_sets_and_refs and reg_info_p. */
5851 1544110 : auto regstat_recompute_for_max_regno = []() {
5852 72748 : regstat_free_n_sets_and_refs ();
5853 72748 : regstat_free_ri ();
5854 72748 : regstat_init_n_sets_and_refs ();
5855 72748 : regstat_compute_ri ();
5856 72748 : resize_reg_info ();
5857 72748 : };
5858 :
5859 1471362 : int max_regno_before_rm = max_reg_num ();
5860 1471362 : if (ira_use_lra_p && remove_scratches ())
5861 : {
5862 37557 : ira_expand_reg_equiv ();
5863 :       /* For now remove_scratches is supposed to create pseudos when it
5864 :	  succeeds; assert that this always happens.  Once that no longer
5865 :	  holds, we should guard the regstat recompute for the case where
5866 :	  max_regno changes.  */
5867 37557 : gcc_assert (max_regno_before_rm != max_reg_num ());
5868 37557 : regstat_recompute_for_max_regno ();
5869 : }
5870 :
5871 1471362 : setup_reg_equiv ();
5872 1471362 : grow_reg_equivs ();
5873 1471362 : setup_reg_equiv_init ();
5874 :
5875 1471362 : allocated_reg_info_size = max_reg_num ();
5876 :
5877 :   /* It is not worth doing this improvement when we use simple
5878 :      allocation, either because of -O0 usage or because the function
5879 :      is too big.  */
5880 1471362 : if (ira_conflicts_p)
5881 1043685 : find_moveable_pseudos ();
5882 :
5883 1471362 : max_regno_before_ira = max_reg_num ();
5884 1471362 : ira_setup_eliminable_regset ();
5885 :
5886 1471362 : ira_overall_cost = ira_reg_cost = ira_mem_cost = 0;
5887 1471362 : ira_load_cost = ira_store_cost = ira_shuffle_cost = 0;
5888 1471362 : ira_move_loops_num = ira_additional_jumps_num = 0;
5889 :
5890 1471362 : ira_assert (current_loops == NULL);
5891 1471362 : if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == IRA_REGION_MIXED)
5892 998009 : loop_optimizer_init (AVOID_CFG_MODIFICATIONS | LOOPS_HAVE_RECORDED_EXITS);
5893 :
5894 1471362 : if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
5895 95 : fprintf (ira_dump_file, "Building IRA IR\n");
5896 1471362 : loops_p = ira_build ();
5897 :
5898 1471362 : ira_assert (ira_conflicts_p || !loops_p);
5899 :
5900 1471362 : saved_flag_ira_share_spill_slots = flag_ira_share_spill_slots;
5901 1471362 : if (too_high_register_pressure_p () || cfun->calls_setjmp)
5902 :     /* It just wastes the compiler's time to pack spilled pseudos into
5903 :        stack slots in this case -- prohibit it.  We also do this if
5904 :        there is a setjmp call, because the compiler is required to
5905 :        preserve the value of a variable not modified between setjmp
5906 :        and longjmp, and sharing slots does not guarantee that.  */
5907 1337 : flag_ira_share_spill_slots = false;
5908 :
5909 1471362 : ira_color ();
5910 :
5911 1471362 : ira_max_point_before_emit = ira_max_point;
5912 :
5913 1471362 : ira_initiate_emit_data ();
5914 :
5915 1471362 : ira_emit (loops_p);
5916 :
5917 1471362 : max_regno = max_reg_num ();
5918 1471362 : if (ira_conflicts_p)
5919 : {
5920 1043685 : if (! loops_p)
5921 : {
5922 1008085 : if (! ira_use_lra_p)
5923 0 : ira_initiate_assign ();
5924 : }
5925 : else
5926 : {
5927 35600 : expand_reg_info ();
5928 :
5929 35600 : if (ira_use_lra_p)
5930 : {
5931 35600 : ira_allocno_t a;
5932 35600 : ira_allocno_iterator ai;
5933 :
5934 11758159 : FOR_EACH_ALLOCNO (a, ai)
5935 : {
5936 11686959 : int old_regno = ALLOCNO_REGNO (a);
5937 11686959 : int new_regno = REGNO (ALLOCNO_EMIT_DATA (a)->reg);
5938 :
5939 11686959 : ALLOCNO_REGNO (a) = new_regno;
5940 :
5941 11686959 : if (old_regno != new_regno)
5942 1284374 : setup_reg_classes (new_regno, reg_preferred_class (old_regno),
5943 : reg_alternate_class (old_regno),
5944 : reg_allocno_class (old_regno));
5945 : }
5946 : }
5947 : else
5948 : {
5949 0 : if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
5950 0 : fprintf (ira_dump_file, "Flattening IR\n");
5951 0 : ira_flattening (max_regno_before_ira, ira_max_point_before_emit);
5952 : }
5953 : /* New insns were generated: add notes and recalculate live
5954 : info. */
5955 35600 : df_analyze ();
5956 :
5957 : /* ??? Rebuild the loop tree, but why? Does the loop tree
5958 : change if new insns were generated? Can that be handled
5959 : by updating the loop tree incrementally? */
5960 35600 : loop_optimizer_finalize ();
5961 35600 : free_dominance_info (CDI_DOMINATORS);
5962 35600 : loop_optimizer_init (AVOID_CFG_MODIFICATIONS
5963 : | LOOPS_HAVE_RECORDED_EXITS);
5964 :
5965 35600 : if (! ira_use_lra_p)
5966 : {
5967 0 : setup_allocno_assignment_flags ();
5968 0 : ira_initiate_assign ();
5969 0 : ira_reassign_conflict_allocnos (max_regno);
5970 : }
5971 : }
5972 : }
5973 :
5974 1471362 : ira_finish_emit_data ();
5975 :
5976 1471362 : setup_reg_renumber ();
5977 :
5978 1471362 : calculate_allocation_cost ();
5979 :
5980 : #ifdef ENABLE_IRA_CHECKING
5981 1471362 : if (ira_conflicts_p && ! ira_use_lra_p)
5982 :     /* Unlike the reload pass, LRA does not use any conflict info
5983 :        from IRA.  We don't rebuild conflict info for LRA (through the
5984 :        ira_flattening call) and so cannot use the check here.  We could
5985 :        rebuild this info for LRA in the check mode, but there is a risk
5986 :        that code generated with the check and without it would differ
5987 :        slightly.  Calling ira_flattening in any mode would just waste
5988 :        CPU time.  So do not check the allocation for LRA.  */
5989 0 : check_allocation ();
5990 : #endif
5991 :
5992 1471362 : if (max_regno != max_regno_before_ira)
5993 35191 : regstat_recompute_for_max_regno ();
5994 :
5995 1471362 : overall_cost_before = ira_overall_cost;
5996 1471362 : if (! ira_conflicts_p)
5997 427677 : grow_reg_equivs ();
5998 : else
5999 : {
6000 1043685 : fix_reg_equiv_init ();
6001 :
6002 : #ifdef ENABLE_IRA_CHECKING
6003 1043685 : print_redundant_copies ();
6004 : #endif
6005 1043685 : if (! ira_use_lra_p)
6006 : {
6007 0 : ira_spilled_reg_stack_slots_num = 0;
6008 0 : ira_spilled_reg_stack_slots
6009 0 : = ((class ira_spilled_reg_stack_slot *)
6010 0 : ira_allocate (max_regno
6011 : * sizeof (class ira_spilled_reg_stack_slot)));
6012 0 : memset ((void *)ira_spilled_reg_stack_slots, 0,
6013 0 : max_regno * sizeof (class ira_spilled_reg_stack_slot));
6014 : }
6015 : }
6016 1471362 : allocate_initial_values ();
6017 :
6018 : /* See comment for find_moveable_pseudos call. */
6019 1471362 : if (ira_conflicts_p)
6020 1043685 : move_unallocated_pseudos ();
6021 :
6022 : /* Restore original values. */
6023 1471362 : if (lra_simple_p)
6024 : {
6025 5 : flag_caller_saves = saved_flag_caller_saves;
6026 5 : flag_ira_region = saved_flag_ira_region;
6027 : }
6028 1471362 : }
6029 :
6030 : /* Modify an asm goto to avoid further trouble with this insn.  We
6031 :    cannot replace the insn by a USE as for other asm insns, because
6032 :    we still need to keep the CFG consistent.  */
6033 : void
6034 6 : ira_nullify_asm_goto (rtx_insn *insn)
6035 : {
6036 6 : ira_assert (JUMP_P (insn) && INSN_CODE (insn) < 0);
6037 6 : rtx tmp = extract_asm_operands (PATTERN (insn));
6038 6 : PATTERN (insn) = gen_rtx_ASM_OPERANDS (VOIDmode, ggc_strdup (""), "", 0,
6039 : rtvec_alloc (0),
6040 : rtvec_alloc (0),
6041 : ASM_OPERANDS_LABEL_VEC (tmp),
6042 :					 ASM_OPERANDS_SOURCE_LOCATION (tmp));
6043 6 : }
6044 :
6045 : static void
6046 1471362 : do_reload (void)
6047 : {
6048 1471362 : basic_block bb;
6049 1471362 : bool need_dce;
6050 1471362 : unsigned pic_offset_table_regno = INVALID_REGNUM;
6051 :
6052 1471362 : if (flag_ira_verbose < 10)
6053 1471362 : ira_dump_file = dump_file;
6054 :
6055 :   /* If pic_offset_table_rtx is a pseudo register, then keep it a
6056 :      pseudo after reload to avoid possible wrong uses of the hard
6057 :      reg assigned to it.  */
6058 1471362 : if (pic_offset_table_rtx
6059 1471362 : && REGNO (pic_offset_table_rtx) >= FIRST_PSEUDO_REGISTER)
6060 : pic_offset_table_regno = REGNO (pic_offset_table_rtx);
6061 :
6062 1471362 : timevar_push (TV_RELOAD);
6063 1471362 : if (ira_use_lra_p)
6064 : {
6065 1471362 : if (current_loops != NULL)
6066 : {
6067 998009 : loop_optimizer_finalize ();
6068 998009 : free_dominance_info (CDI_DOMINATORS);
6069 : }
6070 18926800 : FOR_ALL_BB_FN (bb, cfun)
6071 17455438 : bb->loop_father = NULL;
6072 1471362 : current_loops = NULL;
6073 :
6074 1471362 : ira_destroy ();
6075 :
6076 1471362 : lra (ira_dump_file, internal_flag_ira_verbose);
6077 : /* ???!!! Move it before lra () when we use ira_reg_equiv in
6078 : LRA. */
6079 1471362 : vec_free (reg_equivs);
6080 1471362 : reg_equivs = NULL;
6081 1471362 : need_dce = false;
6082 : }
6083 : else
6084 : {
6085 0 : df_set_flags (DF_NO_INSN_RESCAN);
6086 0 : build_insn_chain ();
6087 :
6088 0 : need_dce = reload (get_insns (), ira_conflicts_p);
6089 : }
6090 :
6091 1471362 : timevar_pop (TV_RELOAD);
6092 :
6093 1471362 : timevar_push (TV_IRA);
6094 :
6095 1471362 : if (ira_conflicts_p && ! ira_use_lra_p)
6096 : {
6097 0 : ira_free (ira_spilled_reg_stack_slots);
6098 0 : ira_finish_assign ();
6099 : }
6100 :
6101 1471362 : if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL
6102 96 : && overall_cost_before != ira_overall_cost)
6103 0 : fprintf (ira_dump_file, "+++Overall after reload %" PRId64 "\n",
6104 : ira_overall_cost);
6105 :
6106 1471362 : flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots;
6107 :
6108 1471362 : if (! ira_use_lra_p)
6109 : {
6110 0 : ira_destroy ();
6111 0 : if (current_loops != NULL)
6112 : {
6113 0 : loop_optimizer_finalize ();
6114 0 : free_dominance_info (CDI_DOMINATORS);
6115 : }
6116 0 : FOR_ALL_BB_FN (bb, cfun)
6117 0 : bb->loop_father = NULL;
6118 0 : current_loops = NULL;
6119 :
6120 0 : regstat_free_ri ();
6121 0 : regstat_free_n_sets_and_refs ();
6122 : }
6123 :
6124 1471362 : if (optimize)
6125 1043685 : cleanup_cfg (CLEANUP_EXPENSIVE);
6126 :
6127 1471362 : finish_reg_equiv ();
6128 :
6129 1471362 : bitmap_obstack_release (&ira_bitmap_obstack);
6130 : #ifndef IRA_NO_OBSTACK
6131 : obstack_free (&ira_obstack, NULL);
6132 : #endif
6133 :
6134 : /* The code after the reload has changed so much that at this point
6135 : we might as well just rescan everything. Note that
6136 : df_rescan_all_insns is not going to help here because it does not
6137 : touch the artificial uses and defs. */
6138 1471362 : df_finish_pass (true);
6139 1471362 : df_scan_alloc (NULL);
6140 1471362 : df_scan_blocks ();
6141 :
6142 1471362 : if (optimize > 1)
6143 : {
6144 963981 : df_live_add_problem ();
6145 963981 : df_live_set_all_dirty ();
6146 : }
6147 :
6148 1471362 : if (optimize)
6149 1043685 : df_analyze ();
6150 :
6151 1471362 : if (need_dce && optimize)
6152 0 : run_fast_dce ();
6153 :
6154 : /* Diagnose uses of the hard frame pointer when it is used as a global
6155 : register. Often we can get away with letting the user appropriate
6156 : the frame pointer, but we should let them know when code generation
6157 : makes that impossible. */
6158 1471362 : if (global_regs[HARD_FRAME_POINTER_REGNUM] && frame_pointer_needed)
6159 : {
6160 2 : tree decl = global_regs_decl[HARD_FRAME_POINTER_REGNUM];
6161 2 : error_at (DECL_SOURCE_LOCATION (current_function_decl),
6162 : "frame pointer required, but reserved");
6163 2 : inform (DECL_SOURCE_LOCATION (decl), "for %qD", decl);
6164 : }
6165 :
6166 : /* If we are doing generic stack checking, give a warning if this
6167 : function's frame size is larger than we expect. */
6168 1471362 : if (flag_stack_check == GENERIC_STACK_CHECK)
6169 : {
6170 49 : poly_int64 size = get_frame_size () + STACK_CHECK_FIXED_FRAME_SIZE;
6171 :
6172 4557 : for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
6173 4508 : if (df_regs_ever_live_p (i)
6174 235 : && !fixed_regs[i]
6175 4679 : && !crtl->abi->clobbers_full_reg_p (i))
6176 84 : size += UNITS_PER_WORD;
6177 :
6178 49 : if (constant_lower_bound (size) > STACK_CHECK_MAX_FRAME_SIZE)
6179 1 : warning (0, "frame size too large for reliable stack checking");
6180 : }
6181 :
6182 1471362 : if (pic_offset_table_regno != INVALID_REGNUM)
6183 80395 : pic_offset_table_rtx = gen_rtx_REG (Pmode, pic_offset_table_regno);
6184 :
6185 1471362 : timevar_pop (TV_IRA);
6186 1471362 : }
6187 :
6188 : /* Run the integrated register allocator. */
6189 :
6190 : namespace {
6191 :
6192 : const pass_data pass_data_ira =
6193 : {
6194 : RTL_PASS, /* type */
6195 : "ira", /* name */
6196 : OPTGROUP_NONE, /* optinfo_flags */
6197 : TV_IRA, /* tv_id */
6198 : 0, /* properties_required */
6199 : 0, /* properties_provided */
6200 : 0, /* properties_destroyed */
6201 : 0, /* todo_flags_start */
6202 : TODO_do_not_ggc_collect, /* todo_flags_finish */
6203 : };
6204 :
6205 : class pass_ira : public rtl_opt_pass
6206 : {
6207 : public:
6208 285722 : pass_ira (gcc::context *ctxt)
6209 571444 : : rtl_opt_pass (pass_data_ira, ctxt)
6210 : {}
6211 :
6212 : /* opt_pass methods: */
6213 1471370 : bool gate (function *) final override
6214 : {
6215 1471370 : return !targetm.no_register_allocation;
6216 : }
6217 1471362 : unsigned int execute (function *) final override
6218 : {
6219 1471362 : ira_in_progress = true;
6220 1471362 : ira (dump_file);
6221 1471362 : ira_in_progress = false;
6222 1471362 : return 0;
6223 : }
6224 :
6225 : }; // class pass_ira
6226 :
6227 : } // anon namespace
6228 :
6229 : rtl_opt_pass *
6230 285722 : make_pass_ira (gcc::context *ctxt)
6231 : {
6232 285722 : return new pass_ira (ctxt);
6233 : }
6234 :
6235 : namespace {
6236 :
6237 : const pass_data pass_data_reload =
6238 : {
6239 : RTL_PASS, /* type */
6240 : "reload", /* name */
6241 : OPTGROUP_NONE, /* optinfo_flags */
6242 : TV_RELOAD, /* tv_id */
6243 : 0, /* properties_required */
6244 : 0, /* properties_provided */
6245 : 0, /* properties_destroyed */
6246 : 0, /* todo_flags_start */
6247 : 0, /* todo_flags_finish */
6248 : };
6249 :
6250 : class pass_reload : public rtl_opt_pass
6251 : {
6252 : public:
6253 285722 : pass_reload (gcc::context *ctxt)
6254 571444 : : rtl_opt_pass (pass_data_reload, ctxt)
6255 : {}
6256 :
6257 : /* opt_pass methods: */
6258 1471370 : bool gate (function *) final override
6259 : {
6260 1471370 : return !targetm.no_register_allocation;
6261 : }
6262 1471362 : unsigned int execute (function *) final override
6263 : {
6264 1471362 : do_reload ();
6265 1471362 : return 0;
6266 : }
6267 :
6268 : }; // class pass_reload
6269 :
6270 : } // anon namespace
6271 :
6272 : rtl_opt_pass *
6273 285722 : make_pass_reload (gcc::context *ctxt)
6274 : {
6275 285722 : return new pass_reload (ctxt);
6276 : }
|