

9.16.5 Optimizations specific to low level code


--try-switch-size N

The number of alternatives in a try/retry chain switch must be at least this number (default: 3).


--binary-switch-size N

The number of alternatives in a binary search switch must be at least this number (default: 4).


--middle-rec

Enable the middle recursion optimization.


Optimization levels 1 to 6 automatically set --middle-rec.


--simple-neg

Generate simplified code for simple negations.


Optimization levels 2 to 6 automatically set --simple-neg.


--llds-optimize
--llds-optimise

Enable the LLDS->LLDS optimization passes.


Optimization levels 0 to 6 automatically set --llds-optimize.


--optimize-repeat N
--optimise-repeat N

Iterate most LLDS->LLDS optimizations at most N times (default: 3).


Optimization levels 0 to 1 automatically set --optimize-repeat=1.


Optimization level 2 automatically sets --optimize-repeat=3.


Optimization levels 3 to 4 automatically set --optimize-repeat=4.


Optimization levels 5 to 6 automatically set --optimize-repeat=5.
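
The iteration bound can be pictured with a small Python model. This is only a sketch: the function names, the list-of-strings "code", and the drop_first_nop pass are hypothetical, and stopping early once a round changes nothing is an assumption of the sketch rather than documented compiler behaviour. Only the level-to-N mapping is taken from the text above.

```python
def repeat_limit_for_level(level):
    """Return the --optimize-repeat value that optimization level 0..6 sets."""
    if not 0 <= level <= 6:
        raise ValueError("optimization level must be in 0..6")
    if level <= 1:
        return 1
    if level == 2:
        return 3
    if level <= 4:
        return 4
    return 5


def iterate_passes(code, passes, max_iterations):
    """Apply every pass in order, at most max_iterations times.

    Assumption of this sketch: stop early once a full round of passes
    leaves the code unchanged.
    """
    rounds = 0
    for _ in range(max_iterations):
        before = code
        for run_pass in passes:
            code = run_pass(code)
        rounds += 1
        if code == before:
            break
    return code, rounds


def drop_first_nop(code):
    """Hypothetical pass: remove at most one 'nop' per invocation."""
    if "nop" in code:
        i = code.index("nop")
        return code[:i] + code[i + 1:]
    return code


program = ["nop", "nop", "add", "nop"]
# With a bound of 1, only one round runs and two 'nop's remain.
print(iterate_passes(program, [drop_first_nop], repeat_limit_for_level(0)))
# With a bound of 5, the loop stops once nothing changes any more.
print(iterate_passes(program, [drop_first_nop], repeat_limit_for_level(6)))
```

A larger N gives each pass more chances to exploit what the others exposed, which is why the higher optimization levels raise it.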


--optimize-peep
--optimise-peep

Enable local peephole optimizations.


Optimization levels 0 to 6 automatically set --optimize-peep.


--optimize-labels
--optimise-labels

Delete dead labels, and the unreachable code following them.


Optimization levels 0 to 6 automatically set --optimize-labels.
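
As an illustration of the idea, here is a toy Python model of dead-label deletion. It is a sketch, not the compiler's implementation: the (kind, argument) instruction encoding, the instruction kinds "label", "goto", "if_goto" and "op", and the entry_labels parameter are all invented for this example.

```python
def delete_dead_labels(code, entry_labels=()):
    """Drop labels nobody jumps to, and code that thereby becomes unreachable.

    code is a list of (kind, argument) pairs in a made-up instruction set.
    entry_labels names labels that are assumed to be reachable from outside.
    """
    referenced = set(entry_labels)
    for kind, arg in code:
        if kind in ("goto", "if_goto"):
            referenced.add(arg)

    result = []
    reachable = True
    for kind, arg in code:
        if kind == "label":
            if arg not in referenced:
                continue          # dead label: drop it, reachability unchanged
            reachable = True      # live label: control can arrive here
        if not reachable:
            continue              # unreachable code following a dead label
        result.append((kind, arg))
        if kind == "goto":
            reachable = False     # nothing falls through an unconditional jump
    return result


program = [
    ("label", "entry"),
    ("op", "r1 = 1"),
    ("goto", "done"),
    ("label", "unused"),   # no jump targets this label
    ("op", "r1 = 2"),      # so this can never execute
    ("label", "done"),
    ("op", "return"),
]
for instr in delete_dead_labels(program, entry_labels=["entry"]):
    print(instr)
```

In this model, a label that is referenced only from code deleted in the same run survives until the pass is run again.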


--optimize-jumps
--optimise-jumps

Enable the short-circuiting of jumps to jumps.


Optimization levels 0 to 6 automatically set --optimize-jumps.
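
Short-circuiting a jump to a jump means retargeting a branch whose destination does nothing except jump again. The Python sketch below shows the general technique on a made-up instruction encoding; the (kind, argument) pairs and the function name are assumptions for illustration and do not reflect the compiler's internal data structures.

```python
def short_circuit_jumps(code):
    """Retarget jumps whose destination label is followed directly by a goto.

    code is a list of (kind, argument) pairs in a made-up instruction set
    with the kinds "label", "goto", "if_goto" and "op".
    """
    # Map each label that is immediately followed by an unconditional goto
    # to that goto's target.
    forward = {}
    for (kind, arg), (next_kind, next_arg) in zip(code, code[1:]):
        if kind == "label" and next_kind == "goto":
            forward[arg] = next_arg

    def final_target(label):
        # Follow the chain of forwarding labels; 'seen' guards against cycles.
        seen = set()
        while label in forward and label not in seen:
            seen.add(label)
            label = forward[label]
        return label

    return [
        (kind, final_target(arg)) if kind in ("goto", "if_goto") else (kind, arg)
        for kind, arg in code
    ]


program = [
    ("if_goto", "L1"),
    ("op", "r1 = 0"),
    ("goto", "L3"),
    ("label", "L1"),
    ("goto", "L2"),
    ("label", "L2"),
    ("goto", "L3"),
    ("label", "L3"),
    ("op", "return"),
]
for instr in short_circuit_jumps(program):
    print(instr)
```

After the rewrite, the conditional branch goes straight to L3, and the labels L1 and L2 are no longer referenced by anything outside their own chain.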


--optimize-fulljumps
--optimise-fulljumps

Enable the elimination of jumps to ordinary code.


Optimization levels 2 to 6 automatically set --optimize-fulljumps.


--checked-nondet-tailcalls

Convert nondet calls into tail calls whenever possible, even when this requires a runtime check. This option tries to minimize stack consumption, possibly at the expense of speed.


--pessimize-tailcalls

Disable the optimization of tail calls. This option tries to minimize code size at the expense of speed.


--optimize-delay-slot
--optimise-delay-slot

Enable branch delay slot optimizations. This option is meaningful only if the target architecture has delay slots.


Optimization levels 1 to 6 automatically set --optimize-delay-slot.


--optimize-frames
--optimise-frames

Optimize the operations that maintain stack frames.


Optimization levels 1 to 6 automatically set --optimize-frames.


--optimize-reassign
--optimise-reassign

Optimize away assignments to memory locations that already hold the to-be-assigned value.


Optimization levels 3 to 6 automatically set --optimize-reassign.
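
The effect of removing redundant assignments can be sketched in Python on straight-line code. Everything here is hypothetical: the ("assign", destination, source) encoding, the convention that integers are constants and strings are location names, and the decision to forget all knowledge at any non-assignment instruction are simplifying assumptions, not a description of the compiler's analysis.

```python
def remove_redundant_assignments(code):
    """Drop assignments whose destination already holds the assigned value.

    code is a list of tuples in a made-up instruction set. An assignment is
    ("assign", dest, src), where src is an int constant or the name (str) of
    another location. Any other instruction is treated conservatively.
    """
    known = {}    # location name -> value it is currently known to hold
    result = []
    for instr in code:
        if instr[0] != "assign":
            known.clear()         # assume nothing survives labels, calls, etc.
            result.append(instr)
            continue
        _, dest, src = instr
        if dest in known and known[dest] == src:
            continue              # dest already holds src: assignment is redundant
        # dest is about to change, so facts of the form "loc holds dest" expire.
        known = {loc: val for loc, val in known.items() if val != dest}
        known[dest] = src
        result.append(instr)
    return result


program = [
    ("assign", "stack1", 42),
    ("assign", "r1", "stack1"),
    ("assign", "stack1", 42),   # redundant: stack1 still holds 42
    ("label", "L1"),
    ("assign", "stack1", 42),   # kept: knowledge is discarded at the label
]
for instr in remove_redundant_assignments(program):
    print(instr)
```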


--use-local-vars

Use local variables in C code blocks wherever possible.


Optimization levels 1 to 6 automatically set --use-local-vars.


--optimize-dups
--optimise-dups

Enable elimination of duplicate code within procedures.


Optimization levels 2 to 6 automatically set --optimize-dups.


--optimize-proc-dups
--optimise-proc-dups

Enable elimination of duplicate procedures.


--common-data

Enable optimization of common data structures.


Optimization levels 0 to 6 automatically set --common-data.


--no-common-layout-data

Disable optimization of common subsequences in layout structures.


--layout-compression-limit N

Attempt to compress the layout structures used by the debugger only as long as the arrays involved have at most N elements (default: 4000).


--emit-c-loops

Use C loop constructs to implement loops. With ‘--no-emit-c-loops’, use only gotos.


Optimization levels 1 to 6 automatically set --emit-c-loops.


--procs-per-c-function N
--procs-per-C-function N

Put the code for up to N Mercury procedures in a single C function. The default value of N is one. Increasing N can produce slightly more efficient code, but makes compilation slower.


--no-local-thread-engine-base

Do not copy the thread-local Mercury engine base address into a local variable, even when this would be appropriate. This option is effective only in low-level parallel grades that do not use the GNU C global register variables extension.


--inline-alloc

Inline calls to GC_malloc(). This can improve performance a fair bit, but may significantly increase code size. This option is meaningful only if the selected garbage collector is boehm, and if the C compiler is gcc.


Optimization level 6 automatically sets --inline-alloc.


--use-macro-for-redo-fail

Emit the fail or redo macro instead of a branch to the fail or redo code in the runtime system. This produces slightly bigger but slightly faster code.


Optimization level 6 automatically sets --use-macro-for-redo-fail.

