Optimization
On this page we will dive into optimization techniques commonly used by compilers.
Inter-procedural constant propagation
Inter-procedural constant propagation is, at its core, just simulating operations at compile time so they don't need to be performed at runtime.
Assuming we have an operation such as:

```
x = 5 + 5
```

this optimization technique would look for operations like this, which in bytecode would look roughly like:

```
iconst_5
iconst_5
iadd
istore x
```

The first step is to simulate the stack: when we see iconst_5 we perform push(5) on our local stack. Then, when we meet an arithmetic instruction such as iadd, we pop two elements from our local stack and add them together as dictated by the arithmetic instruction, resulting in a single constant which we can use to replace our non-optimized bytecode.
Meaning we replace the operation with an instruction pushing the constant that resulted from our simulation.
So our 4 instructions turn into two:

```
bipush 10
istore x
```

We do this sort of optimization for every arithmetic and bit-wise operation: *SHL, *SHR, *ADD, *SUB, *DIV, *MUL, *XOR, *AND, *OR, etc.
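The stack simulation described above can be sketched in plain Java. This is a minimal illustration, not a real folder: the instruction names mirror real bytecode mnemonics, but instructions are modeled as plain strings and only a handful of opcodes are handled; a real pass would operate on parsed bytecode.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal constant-folding sketch: constant-push instructions push onto a
// local stack, arithmetic instructions pop two operands and push the folded
// result. The final constant replaces the whole instruction sequence.
public class ConstantFolder {
    public static int fold(String[] instructions) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : instructions) {
            if (insn.startsWith("iconst_")) {
                stack.push(Integer.parseInt(insn.substring("iconst_".length())));
            } else if (insn.startsWith("bipush ")) {
                stack.push(Integer.parseInt(insn.substring("bipush ".length())));
            } else {
                int b = stack.pop(); // second operand (pushed last)
                int a = stack.pop(); // first operand
                switch (insn) {
                    case "iadd": stack.push(a + b); break;
                    case "isub": stack.push(a - b); break;
                    case "imul": stack.push(a * b); break;
                    case "ixor": stack.push(a ^ b); break;
                    default: throw new IllegalArgumentException("unhandled: " + insn);
                }
            }
        }
        // What remains is the constant a folder would re-emit, e.g. as "bipush 10".
        return stack.pop();
    }
}
```

Feeding in the sequence from above, `fold(new String[]{"iconst_5", "iconst_5", "iadd"})` yields 10, which would then be emitted as `bipush 10`.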
Now, this on its own is just called constant propagation or constant folding; the inter-procedural version is additionally able to cross boundaries such as locals, methods, and fields.
Meaning it can propagate constants through local variables, method calls, and field accesses, which makes it useful for things like compilers, deobfuscators, etc.
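As a hypothetical source-level view of what this buys us (the class and method names below are invented for illustration): the constant crosses both a field and a method call before reaching its use site, yet an inter-procedural pass can still reduce the whole expression to a constant.

```java
// Illustration of what inter-procedural propagation can see through:
// the constant 42 flows field -> method -> call site.
public class Propagation {
    static final int KEY = 42;   // field holding a constant

    static int key() {           // pure method: does nothing but return a constant
        return KEY;
    }

    static int decode(int value) {
        // After propagation, key() is known to be 42, so this whole
        // expression folds to (value ^ 42) with no call at runtime.
        return value ^ key();
    }
}
```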
It is useful for a compiler because it removes the need to compute these operations at runtime, which would slow down the program.
For deobfuscators it can be useful for undoing number mutations that use operations like *XOR, as well as reversing the outlining of constants (into methods and fields).
Dead block & code elimination
This technique simulates jumps into blocks, marking the ones that can never be reached; it works alongside constant propagation.
Meaning conditional jumps can be resolved by first propagating the constants used in the condition and then simulating the jump.
This can be useful for eliminating dead paths of switch statements that don't lead anywhere, or conditional jumps like if* in general: by checking whether the jump will ever be taken and, if not, pruning it completely.
Consider a conditional whose operands are known constants: the comparison would be propagated, the conditional would be marked as never branching, and the dead block behind it would, hence the name, be eliminated.
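A minimal source-level sketch of such a dead branch (the method below is made up for illustration): once the operands of the comparison are propagated, the condition is known to always take the same path, and the other block can be pruned.

```java
// Sketch of a branch that dead-block elimination can remove entirely.
public class DeadBranch {
    static int example() {
        int x = 5 + 5;     // folded to 10 by constant propagation
        if (x == 10) {     // resolves to "always true" -> never branches
            return 1;
        } else {
            return 0;      // dead block: provably never executed
        }
    }
}
```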
But this is only step one of the technique, as it also optimizes things like unused variable assignments and in-lines calls to pure methods (methods that do nothing but return a constant value).
Loop unrolling
Loop unrolling eliminates the need for a loop by performing the operations the loop would do directly, one after another.
Take, for example, a loop that prints a message a fixed number of times. Loop unrolling would simply emit the print statements one after another, preventing the need for any jumps.
Now even though this does not look very pretty, it performs better in terms of speed, because the jumps that drive the loop are no longer needed.
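As a rough before/after sketch (using a sum instead of print statements so the result is checkable; both methods are invented for illustration), the unrolled version does the same work with no counter and no jumps:

```java
// Before/after view of unrolling a loop whose trip count is known at
// compile time. The unrolled form trades code size for removed jumps.
public class Unroll {
    static int sumLoop() {
        int sum = 0;
        for (int i = 1; i <= 4; i++) {  // trip count known: 4 iterations
            sum += i;
        }
        return sum;
    }

    static int sumUnrolled() {
        // the same work flattened out: no loop counter, no conditional jumps
        int sum = 0;
        sum += 1;
        sum += 2;
        sum += 3;
        sum += 4;
        return sum;
    }
}
```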
Peephole optimization
This is a very basic optimization technique that usually consists of multiple transformations to the bytecode.
The java compiler is notorious for generating some slow bytecode. A very real case is storing a value into a variable and then loading that variable again immediately after, just to return it. This wastes performance when in reality we could remove the store and load altogether, making the method return a pure constant.
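A minimal sketch of a pass that removes this pattern, again modeling instructions as plain strings for illustration. Note that dropping an istore/iload pair is only safe if the slot is never read again later, which a real pass must verify first.

```java
import java.util.ArrayList;
import java.util.List;

// Peephole sketch: an istore immediately followed by an iload of the same
// slot leaves the value on the stack anyway, so the pair can be dropped
// (assuming the slot is not read elsewhere -- not checked in this sketch).
public class StoreLoadPeephole {
    public static List<String> optimize(List<String> code) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            if (i + 1 < code.size()
                    && code.get(i).startsWith("istore ")
                    && code.get(i + 1).equals("iload " + code.get(i).substring(7))) {
                i++; // skip both instructions of the store/load pair
                continue;
            }
            out.add(code.get(i));
        }
        return out;
    }
}
```

Running it on `iconst_5, istore 0, iload 0, ireturn` leaves just `iconst_5, ireturn`: a pure constant return.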
This is one example of peephole optimization. Another is optimizing arithmetic into bit-wise operations: for example, instead of multiplying our constant (5) by two with an imul, we could perform the slightly faster bit-wise operation of shifting the constant to the left by 1 with an ishl.
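In source form the two are equivalent; the hypothetical helper methods below compile down to an imul and an ishl respectively, and always produce the same result:

```java
// Strength reduction: multiplication by a power of two vs. a left shift.
public class StrengthReduction {
    static int timesTwoMul(int x)   { return x * 2; }   // compiles to imul
    static int timesTwoShift(int x) { return x << 1; }  // compiles to ishl
}
```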
Another optimization I will give you is also a very simple one, but that is kind of the intent of peephole optimizations: they're very easy to implement and each saves a little bit of performance.
Instead of using instructions like LDC, BIPUSH, or SIPUSH for int constants between -1 and 5, we could use the immediate *const_* instructions.
Now, I believe this optimization is already done by the java compiler, but it doesn't hurt to implement it quickly anyway, in case, say, an obfuscator has undone it.
So an operation like bipush 5, sipush 5, or ldc 5 would be replaced with the immediate version: iconst_5.
This is especially useful when the code uses LDC, because LDC requires a constant-pool entry in the class file, which the compiler has to emit (taking up space), and the JVM has to look the constant up in the pool, which wastes performance.
And the last peephole optimization I will provide you with is another simple one: two consecutive load instructions for the same variable can be replaced with one load instruction and a dup to duplicate the element, which is faster for the JVM than looking up the variable again. So code computing x squared with iload_0 iload_0 imul may be replaced with iload_0 dup imul.
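A minimal sketch of this rewrite, once more modeling instructions as strings for illustration: two consecutive loads of the same variable become a single load followed by a dup.

```java
import java.util.ArrayList;
import java.util.List;

// Peephole sketch: duplicate loads of the same variable are collapsed
// into one load plus a dup of the value already on the stack, e.g.
//   iload_0, iload_0, imul  ->  iload_0, dup, imul   (x squared)
public class DupPeephole {
    public static List<String> optimize(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String insn : code) {
            if (!out.isEmpty() && insn.startsWith("iload")
                    && insn.equals(out.get(out.size() - 1))) {
                out.add("dup"); // duplicate the value instead of reloading it
            } else {
                out.add(insn);
            }
        }
        return out;
    }
}
```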