andrea.piolanti's picture
09/09/2013 - 14:42

If you are developing a Java application, it is important to know that Java class files can easily be reverse engineered and decompiled into code very close to the original source, using Java decompilers such as JD-GUI or JAD. The usual way to hinder reverse engineering is to obfuscate the class files. In general, "obfuscation" is the practice of intentionally making something more difficult to understand. In a programming context, it means making code harder to read or understand, generally for privacy or security purposes. A software tool called an obfuscator converts a straightforward program into one that works in the same way but is much harder to understand. Generally, an obfuscator performs three different steps:

  • Shrinking
  • Optimization
  • Obfuscation

The shrinking step recursively determines which classes and class members are used. All classes and class members that are not used are discarded. If your application is distributed with a third-party library, the number of classes, fields and methods that are not actually used can be quite significant.
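The idea behind shrinking can be sketched as a reachability search over class references. The following is only a toy model, not how any real obfuscator is implemented: the class names, the reference map and the ShrinkSketch class are all made up for illustration, and a real tool works on class members as well as classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of shrinking: keep only the classes reachable from the entry points.
// All names below are hypothetical and exist only for this illustration.
final class ShrinkSketch {

    // Returns the entry points plus every class reachable from them by following references.
    static Set<String> reachable(Map<String, List<String>> references, Set<String> entryPoints) {
        Set<String> used = new LinkedHashSet<>(entryPoints);
        Deque<String> pending = new ArrayDeque<>(entryPoints);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : references.getOrDefault(current, List.of())) {
                // add() returns true only the first time a class is seen
                if (used.add(target)) {
                    pending.push(target);
                }
            }
        }
        return used;
    }

    public static void main(String[] args) {
        // Which class references which: the application uses only part of the library.
        Map<String, List<String>> references = Map.of(
                "app.Main", List.of("app.Service", "lib.Json"),
                "app.Service", List.of("lib.Json"),
                "lib.Json", List.of(),
                "lib.Xml", List.of("lib.XmlHelper"),
                "lib.XmlHelper", List.of());

        Set<String> used = reachable(references, Set.of("app.Main"));
        System.out.println("kept: " + used);
    }
}
```

In this made-up example lib.Xml and lib.XmlHelper are never reached from app.Main, so a shrinker would be free to discard them.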
The optimization step optimizes the code. In particular, it merges classes, removes attributes irrelevant to execution, sorts local variables, inlines method bodies, removes unused constants, simplifies statements such as if, for and switch and, in general, makes the code more compact and less readable.
The obfuscation step gives meaningless names to the classes and class members that are not entry points. Keeping the entry points untouched ensures that they can still be accessed by their original names. Every Java project has at least one entry point: for example, the class that contains the main method of a Java application, or every class that is part of the public API of a Java library. In general, an entry point is a class or class member that, if renamed or removed, would prevent the library or the application from working properly.
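The renaming rule can be illustrated with a small sketch: every class receives a short meaningless name unless it is listed as an entry point. This is a simplified illustration under stated assumptions, not the algorithm of any particular tool. RenameSketch, its method names and the class names are hypothetical, and the sketch ignores details a real obfuscator must handle, such as packages, class members and collisions between generated and preserved names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the renaming step. All names are hypothetical.
final class RenameSketch {

    // Maps every class to a short generated name, except the entry points, which keep their name.
    static Map<String, String> rename(List<String> classNames, Set<String> entryPoints) {
        Map<String, String> mapping = new LinkedHashMap<>();
        int counter = 0;
        for (String name : classNames) {
            if (entryPoints.contains(name)) {
                mapping.put(name, name);                 // preserved: still reachable by its original name
            } else {
                mapping.put(name, shortName(counter++)); // renamed: a, b, c, ...
            }
        }
        return mapping;
    }

    // Generates a, b, ..., z, aa, ab, ... for index 0, 1, ..., 25, 26, 27, ...
    static String shortName(int index) {
        StringBuilder sb = new StringBuilder();
        int n = index;
        do {
            sb.insert(0, (char) ('a' + n % 26));
            n = n / 26 - 1;
        } while (n >= 0);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> mapping = rename(
                List.of("app.Main", "app.Service", "app.Dao"),
                Set.of("app.Main"));
        System.out.println(mapping);
    }
}
```

Running the sketch prints {app.Main=app.Main, app.Service=a, app.Dao=b}: the entry point keeps its name, while the other classes lose theirs.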

Various obfuscators can be found on the web, but one of the most solid open source options is ProGuard (http://proguard.sourceforge.net/#), developed and maintained by Eric Lafortune. ProGuard also has complete documentation and a useful, active forum.

ProGuard takes two kinds of input:

  • input jars, specified with the -injars option. These are the jars that ProGuard processes and obfuscates, writing the results into one or more output jars (or wars).
  • library jars (or wars), specified with the -libraryjars option. These are essentially the dependency libraries of the input jars. ProGuard uses them to reconstruct the class dependencies that are necessary for proper processing. The library jars themselves always remain unchanged.

In addition to these inputs, it is possible to specify a list of options that configure ProGuard and therefore the output jar. All of these options can be specified in a configuration file that ProGuard takes as input.
Below is a brief description of the main options that are worth considering whenever you use an obfuscator.

As described previously, every Java project has one or more entry points that must be preserved from obfuscation. ProGuard lets you specify the entry points to exclude from obfuscation with the -keep family of options.
For example, the option:

-keepclasseswithmembers public class * {
    public static void main(java.lang.String[]);
}

preserves every public class that contains a main method, together with the main method itself. Note that a plain -keep with the same class specification would preserve all public classes, whether or not they declare a main method.

There are other classes or pieces of code that must also be preserved:

  • native method entry points, if you use JNI
    -keepclasseswithmembernames class * {
        native <methods>;
    }
  • annotations, if the software relies on them. This is required, for instance, if the application uses annotation-based JSON serialization or JAXB
    -keepattributes *Annotation*
  • line numbers, which are needed to produce useful stack traces or logs (because of the optimization step, the classes in the final jar have a different structure from the original ones)
    -keepattributes LineNumberTable
  • the entities that are invoked through reflection, if the application uses reflection (the class and package names here are placeholders)
    -keep public class mypackage.MyClass
    -keep public class mypackage.**
  • in general, every class that is created or invoked dynamically (that is, by name) with constructs such as Class.forName() or SomeClass.class. In fact, Class.forName() constructs may refer to any class at run-time, for example through a name read from a configuration file.
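The following sketch shows why classes loaded by name are fragile. It is a plain Java illustration, independent of any obfuscator: ReflectionSketch, Plugin and the name no.such.RenamedPlugin are hypothetical. The class name travels as an ordinary string, as it would when read from a configuration file, so the lookup succeeds only as long as the class still has exactly that name at run-time.

```java
// Illustration of a lookup by name. All names are hypothetical.
final class ReflectionSketch {

    // Stand-in for a class that the application only ever loads by name.
    static final class Plugin {
        @Override
        public String toString() {
            return "plugin loaded";
        }
    }

    // Instantiates the class with the given name and describes the result.
    static String describe(String className) {
        try {
            Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
            return instance.toString();
        } catch (ClassNotFoundException e) {
            // This is what happens when the stored name no longer matches any class.
            return "not found: " + className;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // The name as it might be stored in a configuration file before obfuscation.
        String configuredName = Plugin.class.getName();
        System.out.println(describe(configuredName));

        // A name that matches no class, as the stored name would after the class is renamed.
        System.out.println(describe("no.such.RenamedPlugin"));
    }
}
```

The second lookup fails in the same way a lookup would fail after a class referenced only by an external string had been renamed, which is why such classes have to be preserved explicitly.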

Finally, an important remark about the obfuscation of a Java library. If you distribute the library standalone, you must preserve all of its entry points (the public API), otherwise the library will be useless. However, if the library is only needed by your own application and is distributed together with it, you may want to obfuscate the whole library, since nothing except your application will use it. In this situation you can obfuscate the library and the application in the same ProGuard run. For example, if the Java application A (A.jar) uses the library B (B.jar), you could configure ProGuard as follows:

  • -injars B.jar
  • -outjars B_obfuscated.jar
  • -injars A.jar
  • -outjars A_obfuscated.jar

In this way ProGuard computes the dependencies between the input jars and processes them together, so the new names given to the classes of B are also applied wherever A references them. Library B can therefore be fully obfuscated.

In summary, ProGuard is a useful tool to shrink, optimize and obfuscate Java code. Although its configuration is not trivial, it lets you tune the output jar to match a wide range of requirements. As with every obfuscator, the output can still be decompiled, but understanding the code will cost a lot of time and patience, and will probably drive you crazy!
