Apache + Yarn + Spark: Play with Twitter data!

Apache + Yarn + Spark: Play with Twitter data!

In this tutorial I want to write about using Apache Spark on Ubuntu machines where you can develop big data analysis apps with it.

First of all, I want to write a small and quick introduction to Hadoop + Spark environment. Hadoop makes it possible to work with lots of computers in a cluster. Work can be: storing files in cluster (HDFS – Hadoop Distributed File System), storing database in cluster (Apache HBase), or run software in cluster (MapReduce, Spark).

2

Reversing Java: Part I

Recently I’ve interested in byte code structure of Java and Dalvik. I’ve found some useful tools for playing with them.

Destination Byte Code

Java byte codes are simple to reverse engineering because they compile in run time. i.e. JVM will execute the byte codes in run time, thus Java code is cross platform but executes with more delay than direct compiled machine codes (for example using C++ and gcc).