Recently I’ve been interested in the bytecode structure of Java and Dalvik, and I’ve found some useful tools for playing with them.
Destination Byte Code
Java bytecode is simple to reverse engineer because it is an intermediate format rather than machine code: the JVM interprets it (or compiles it just in time) at run time. This makes Java code cross-platform, but it can run slower than code compiled directly to machine code (for example C++ compiled with gcc).
Compiling from Java Source Code
Reversing Java bytecode is simpler than reversing machine code. Oracle publishes documentation on Java bytecode. In Java, each source file (.java file) is compiled to a class file (.class) using the following command:
javac HelloWorld.java
This creates the HelloWorld.class file in the same folder as the Java source. You can use javap, the class file disassembler that ships with the JDK, to play with class files.
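The source file itself is not shown in this post. As a rough guide, a HelloWorld.java along the following lines would be consistent with the javap output (a public class with a default constructor, a static main method, and the string "Hello World!"). The exact layout is an assumption.

```java
public class HelloWorld { // assumed source, reconstructed from the javap output
    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}
```

Compiling this with javac HelloWorld.java and running java HelloWorld should print the greeting.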
Running javap HelloWorld.class prints:
Compiled from "HelloWorld.java"
public class HelloWorld {
public HelloWorld();
public static void main(java.lang.String[]);
}
If you prefer more verbose results, run javap -v HelloWorld.class:
Classfile HelloWorld.class
Last modified Aug 28, 2016; size 426 bytes
MD5 checksum 90469fd7405d19947ba6e767de8a8b5f
Compiled from "HelloWorld.java"
public class HelloWorld
SourceFile: "HelloWorld.java"
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #6.#15 // java/lang/Object."<init>":()V
#2 = Fieldref #16.#17 // java/lang/System.out:Ljava/io/PrintStream;
#3 = String #18 // Hello World!
#4 = Methodref #19.#20 // java/io/PrintStream.println:(Ljava/lang/String;)V
#5 = Class #21 // HelloWorld
#6 = Class #22 // java/lang/Object
#7 = Utf8 <init>
#8 = Utf8 ()V
#9 = Utf8 Code
#10 = Utf8 LineNumberTable
#11 = Utf8 main
#12 = Utf8 ([Ljava/lang/String;)V
#13 = Utf8 SourceFile
#14 = Utf8 HelloWorld.java
#15 = NameAndType #7:#8 // "<init>":()V
#16 = Class #23 // java/lang/System
#17 = NameAndType #24:#25 // out:Ljava/io/PrintStream;
#18 = Utf8 Hello World!
#19 = Class #26 // java/io/PrintStream
#20 = NameAndType #27:#28 // println:(Ljava/lang/String;)V
#21 = Utf8 HelloWorld
#22 = Utf8 java/lang/Object
#23 = Utf8 java/lang/System
#24 = Utf8 out
#25 = Utf8 Ljava/io/PrintStream;
#26 = Utf8 java/io/PrintStream
#27 = Utf8 println
#28 = Utf8 (Ljava/lang/String;)V
{
public HelloWorld();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 1: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=1, args_size=1
0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #3 // String Hello World!
5: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
LineNumberTable:
line 3: 0
line 4: 8
}
Reversing Byte Code
When you open the class file in a hex editor, you will see nothing but raw bytes.
But how can javap get all of this information from those bytes? This is where reversing begins …
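If no hex editor is at hand, a few lines of Java can produce a similar view. This is only a sketch: the class name HexDump and the method hexDump are made up for this illustration, and the built-in sample contains just the ten header bytes quoted later in this post, not the whole class file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HexDump {
    /** Formats up to `limit` bytes as lowercase hex, 16 per row, each row prefixed with its offset. */
    public static String hexDump(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, data.length);
        for (int i = 0; i < n; i++) {
            if (i % 16 == 0) {
                if (i > 0) sb.append('\n');
                sb.append(String.format("%08x:", i));
            }
            sb.append(String.format(" %02x", data[i] & 0xff)); // mask to avoid sign extension
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Fallback sample: magic, minor, major and constant_pool_count as quoted in the post.
        byte[] data = {(byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe, 0, 0, 0, 0x34, 0, 0x1d};
        if (args.length > 0) {
            data = Files.readAllBytes(Paths.get(args[0])); // e.g. HelloWorld.class
        }
        System.out.println(hexDump(data, 64));
    }
}
```

Running it as java HexDump HelloWorld.class dumps the first 64 bytes of the real file. Without an argument it prints only the sample header.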
Each class file has the following structure:
ClassFile {
u4 magic;
u2 minor_version;
u2 major_version;
u2 constant_pool_count;
cp_info constant_pool[constant_pool_count-1];
u2 access_flags;
u2 this_class;
u2 super_class;
u2 interfaces_count;
u2 interfaces[interfaces_count];
u2 fields_count;
field_info fields[fields_count];
u2 methods_count;
method_info methods[methods_count];
u2 attributes_count;
attribute_info attributes[attributes_count];
}
We will go through the byte codes step by step. Today we analyze some basic parts:
Magic
As you can see, the first four bytes are ca fe ba be: Java uses 0xCAFEBABE as the magic number that identifies a class file.
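The magic check is easy to reproduce in code. The sketch below reads the first four bytes as a big-endian u4 and compares them with 0xCAFEBABE. MagicCheck and hasClassMagic are names invented for this example, and the "bad" sample (an ELF header) is only there for contrast.

```java
import java.nio.ByteBuffer;

public class MagicCheck {
    static final int CLASS_MAGIC = 0xCAFEBABE;

    /** Returns true when the first four bytes, read as a big-endian u4, equal 0xCAFEBABE. */
    public static boolean hasClassMagic(byte[] data) {
        // ByteBuffer reads big-endian by default, which matches the class file format.
        return data.length >= 4 && ByteBuffer.wrap(data).getInt() == CLASS_MAGIC;
    }

    public static void main(String[] args) {
        byte[] classHeader = {(byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe};
        byte[] elfHeader = {0x7f, 'E', 'L', 'F'}; // illustrative non-class input
        System.out.println(hasClassMagic(classHeader)); // true
        System.out.println(hasClassMagic(elfHeader));   // false
    }
}
```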
Minor and Major
The next four bytes, 00 00 (0 in decimal) and 00 34 (52 in decimal), give the minor and major versions. I compiled the code with Java SE 8, which emits major version 52, so you can’t run it on Java SE 7: an older JVM refuses to load class files with a newer major version.
So the minor version is 0 and the major version is 52.
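As a small sketch of this step, the two u2 values can be read at offsets 4 and 6 and the major version mapped to a release name. ClassVersion, readVersion and javaRelease are hypothetical names for this illustration. The mapping relies on the fact that major 45 corresponds to JDK 1.1 and each later release adds one, so 52 is Java SE 8.

```java
import java.nio.ByteBuffer;

public class ClassVersion {
    /** Returns {minor, major}, read as big-endian u2 values at offsets 4 and 6. */
    public static int[] readVersion(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default
        int minor = buf.getShort(4) & 0xFFFF;   // mask to treat the short as unsigned
        int major = buf.getShort(6) & 0xFFFF;
        return new int[] {minor, major};
    }

    /** Maps a major version to the release that introduced it (45 = JDK 1.1, 52 = Java SE 8). */
    public static String javaRelease(int major) {
        if (major < 45) return "unknown";
        return major >= 49 ? "Java SE " + (major - 44) : "JDK 1." + (major - 44);
    }

    public static void main(String[] args) {
        // First eight bytes of the class file discussed in the post.
        byte[] header = {(byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe, 0, 0, 0, 0x34};
        int[] v = readVersion(header);
        System.out.println("minor=" + v[0] + " major=" + v[1] + " (" + javaRelease(v[1]) + ")");
    }
}
```

For the header bytes above this prints minor=0 major=52 (Java SE 8).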
Constant Pool
The constant pool is where all the constants used in the class file are stored. The next two bytes give the constant pool count (00 1d = 29 in decimal). Index 0 is reserved and is not stored in the file, so the entries run from constant_pool[1] to constant_pool[28], matching #1 to #28 in the javap output above.
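To make the indexing concrete, here is a rough sketch of walking the pool: read the count, then loop from 1 to count-1, reading a one-byte tag and a tag-dependent payload for each entry. ConstantPoolWalk and listEntries are names chosen for this example, the sample bytes in main are hand-made rather than taken from HelloWorld.class, and the output format only loosely imitates javap.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ConstantPoolWalk {
    /** Lists constant pool entries; data must hold the class file at least through the pool. */
    public static List<String> listEntries(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);   // big-endian by default
        buf.position(8);                          // skip magic (u4), minor (u2), major (u2)
        int count = buf.getShort() & 0xFFFF;      // constant_pool_count
        List<String> out = new ArrayList<>();
        for (int i = 1; i < count; i++) {         // entries are numbered 1 .. count-1
            int tag = buf.get() & 0xFF;
            String text;
            switch (tag) {
                case 1: {                         // Utf8: u2 length, then that many bytes
                    byte[] raw = new byte[buf.getShort() & 0xFFFF];
                    buf.get(raw);
                    // Class files use "modified UTF-8"; plain UTF-8 is close enough for ASCII names.
                    text = "Utf8 " + new String(raw, StandardCharsets.UTF_8);
                    break;
                }
                case 7: case 8: case 16: case 19: case 20: { // payload is one u2 index
                    text = name(tag) + " #" + (buf.getShort() & 0xFFFF);
                    break;
                }
                case 9: case 10: case 11: case 12: case 17: case 18: { // payload is two u2 indexes
                    int a = buf.getShort() & 0xFFFF;
                    int b = buf.getShort() & 0xFFFF;
                    text = name(tag) + " #" + a + (tag == 12 ? ":#" : ".#") + b;
                    break;
                }
                case 3: case 4:                   // Integer, Float: 4 bytes of data (skipped here)
                    buf.getInt();
                    text = name(tag);
                    break;
                case 5: case 6:                   // Long, Double: 8 bytes of data (skipped here)
                    buf.getLong();
                    text = name(tag);
                    break;
                case 15:                          // MethodHandle: u1 kind + u2 index (skipped here)
                    buf.get();
                    buf.getShort();
                    text = name(tag);
                    break;
                default:
                    throw new IllegalArgumentException("unknown constant pool tag " + tag);
            }
            out.add("#" + i + " = " + text);
            if (tag == 5 || tag == 6) i++;        // Long and Double occupy two pool slots
        }
        return out;
    }

    /** Names only the tags that appear in the javap listing; others fall back to "tagN". */
    private static String name(int tag) {
        switch (tag) {
            case 7: return "Class";
            case 8: return "String";
            case 9: return "Fieldref";
            case 10: return "Methodref";
            case 12: return "NameAndType";
            default: return "tag" + tag;
        }
    }

    public static void main(String[] args) {
        // Hand-made sample, not the real HelloWorld.class: header, count = 3, Utf8 "Hi", Class #1.
        byte[] sample = {(byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe, 0, 0, 0, 0x34,
                         0, 3, 1, 0, 2, 'H', 'i', 7, 0, 1};
        listEntries(sample).forEach(System.out::println);
    }
}
```

Note that a count of 3 yields two entries (#1 and #2), just as the count of 29 in HelloWorld.class yields 28.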
Continue reading in the second part: