Project - Stage 1

In this post I'll be planning out the steps I'll take to optimize an open source software package.

Software Package
John the Ripper

**update November 7, 2018**
bzip2

Description of the software
John the Ripper is a fast password cracker. Its primary purpose is to detect weak Unix passwords.
- Openwall

**update November 7, 2018**
bzip2 is a file compression program.

Available platforms
Unix, Windows, DOS, and OpenVMS

**update November 7, 2018**
aarch64, armv7hl, i686, ppc64, ppc64le, s390x, x86_64

What type of function is used
The software uses hash functions.

**update November 7, 2018**
The software uses compression based on the Burrows-Wheeler algorithm.
-bzip.org
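
To get a feel for what the Burrows-Wheeler step does, here is a minimal sketch in Python. This is a naive textbook version that sorts every rotation of the input, not the code bzip2 actually uses. The function names and the '$' end marker are my own choices for illustration. bzip2 also combines the transform with other stages such as run-length encoding, move-to-front, and Huffman coding, which are not shown here.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel  # the sentinel marks the end of the original string
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original string plus the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

The transform itself does not shrink anything. It only groups similar characters together so that the later stages can compress the data better.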

Strategy
Software Source Code Makeup
Assembly Files .[sS] = 5
Files with Inline Assembly Code = 2
C Files = 54

Makefile Option
CC = gcc
CFLAGS = -c -Wall -Wdeclaration-after-statement -O2 -fomit-frame-pointer

**update November 7, 2018**
The software does not contain any assembly code. It is written in C and compiled at the -O2 optimization level.

Based on the information above, the following are my plans for Stage 2 and Stage 3 to optimize the software.

**Data to be used**
After reading through the README and FAQ files, it seems that the software works on a hash file made from the password file. The only password file I can use is from my RPi3, but it only has one password in it. For benchmarking and perf purposes I plan to use a larger file. The software includes a file, 'password.lst', that contains a list of passwords. My plan is to use the password.lst file to create a hash file of passwords that mimics the /etc/shadow file. I'll most likely need to write a program to create this file.

**update November 7, 2018**
To benchmark performance, I'll be using three text files of different sizes, three images of different sizes, and three PDFs of different sizes.

-Wishlist-
If bzip2 is capable of handling them, I will also test with mp3 and mp4 files.

Stage 2 Plans - Profiling
In this stage the goal is to benchmark the software with sample data.
I will use perf to get the performance report and dot to visualize the call graph.
The benchmark will be run numerous times to ensure that the performance report is consistent.
The tests will be done on two aarch64 platforms (the school server aarchie and my RPi3) and one x86_64 platform (the school server xerxes or my PC).
The data I'll be using will be the files I mentioned above.
Currently my plan is to optimize for time, but this may change after profiling.
My plan is to optimize for the time it takes to compress and decompress, and to improve the compression so that the compressed file is smaller.
This plan is expected to change based on the outcome of the performance profiling.
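
As a rough idea of what the repeated benchmark could look like, here is a minimal sketch in Python. It is an assumption on my part to use Python's built-in bz2 module here instead of timing the bzip2 command-line program directly, so the numbers will not match the real tool exactly. The function name, the number of runs, and the sample data are all placeholders.

```python
import bz2
import statistics
import time


def bench_bzip2(data: bytes, runs: int = 5, level: int = 9) -> dict:
    """Time compression and decompression several times and report the spread."""
    if not data:
        raise ValueError("data must not be empty")
    compress_times = []
    decompress_times = []
    compressed = b""
    for _ in range(runs):
        start = time.perf_counter()
        compressed = bz2.compress(data, compresslevel=level)
        compress_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        restored = bz2.decompress(compressed)
        decompress_times.append(time.perf_counter() - start)

        if restored != data:
            raise RuntimeError("round trip changed the data")
    return {
        "compress_mean_s": statistics.mean(compress_times),
        # A small standard deviation suggests the runs are consistent.
        "compress_stdev_s": statistics.stdev(compress_times) if runs > 1 else 0.0,
        "decompress_mean_s": statistics.mean(decompress_times),
        "decompress_stdev_s": statistics.stdev(decompress_times) if runs > 1 else 0.0,
        # Compressed size divided by original size; smaller is better.
        "ratio": len(compressed) / len(data),
    }


if __name__ == "__main__":
    sample = b"The quick brown fox jumps over the lazy dog.\n" * 20000
    for name, value in bench_bzip2(sample).items():
        print(f"{name}: {value:.6f}")
```

Reporting the mean together with the standard deviation makes it easier to tell whether a later speedup is real or just noise between runs.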

Stage 3 Plans - Optimization
In this stage the goal is to implement the changes in the code, test that the output is still correct, profile the changes, and compare the results.
My current plan is to optimize the code in a C file, hopefully a function or two in it. This plan may change after profiling.
The plan for optimization is to work with the C code and algorithms to improve the performance as much as possible. If this turns out to be too limiting, I'll look into making certain part(s) of a function platform specific using inline assembly code.
After the changes have been made, I will run multiple tests using the same data used in Stage 2 and compare the results to ensure that the changes I made do not affect the output.
My proof of success will be a shorter time to crack the passwords.
My proof of success will be faster compression/decompression and a smaller compressed file size.
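
The output check described above could be sketched like this in Python. Again, using Python's bz2 module and hashlib is my own assumption for illustration, and the function name and sample data are hypothetical. In practice the two inputs would be the compressed files produced by the original build and the modified build.

```python
import bz2
import hashlib


def compare_outputs(original: bytes, baseline: bytes, candidate: bytes) -> dict:
    """Compare compressed output from a baseline build and a modified build."""
    return {
        # Identical checksums mean the modified build produced the same bytes.
        "identical_bytes": hashlib.sha256(baseline).hexdigest()
        == hashlib.sha256(candidate).hexdigest(),
        # The essential check: decompressing must give back the original data.
        "round_trip_ok": bz2.decompress(candidate) == original,
        # Negative means the candidate output is smaller than the baseline.
        "size_difference": len(candidate) - len(baseline),
    }


if __name__ == "__main__":
    data = b"sample text for the comparison\n" * 5000
    baseline = bz2.compress(data, compresslevel=9)
    # Stand-in for the modified build: a different block size changes the bytes.
    candidate = bz2.compress(data, compresslevel=1)
    print(compare_outputs(data, baseline, candidate))
```

If the optimization only targets speed, the compressed bytes should stay identical. If it also changes how well the file compresses, the bytes will differ, so the round trip check is the one that has to pass.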

At this stage I have yet to profile the software. As mentioned earlier, I need to create the data files required to run the tests. Once I have the data files and am ready to profile, I'll update this stage with more details of my plans.

I plan to finish performance profiling by November 12, 2018.


**Reason for change**
November 7, 2018
I was having problems getting John the Ripper to work on my local machine and on the school server. After spending about a week trying to figure out why I wasn't able to crack any passwords, I came to the conclusion that it might be a platform issue. John the Ripper's INSTALL file states that after running the 'make' command, another command, 'make clean SYSTEM', needs to be run, where 'SYSTEM' is the target platform. The list of SYSTEMs in the Makefile did not have aarch64 specified; the aarch64 platform is handled by the 'generic' target. According to JtR's documentation, the 'generic' target should build a working program. After digging around on Google and YouTube, I came to realize that the program wasn't working normally on my machine. The successful runs of JtR that I found were usually on Ubuntu or Kali.
Based on these findings, I decided not to spend any more time on it and to change the software.
