Lab 6 - Inline Assembler

In this post I'll be exploring inline assembly.

Part A
In this part I'll be using the provided code, which is the same volume scaling program used in the Algorithm Selection lab.
A1
Let's compare the runtime of the program built in the Algorithm Selection lab with the runtime of the provided program that uses inline assembler.
Algorithm Selection Program Runtime
Data Size: 100 million
real  5.155s
user 4.882s
sys   0.270s

Inline Assembler Program Runtime
Data Size: 100 million
real  5.061s
user 4.848s
sys   0.210s

We can see that the inline assembler program is slightly faster (about 0.09s of real time, roughly 2%), but not significantly.

A2
Q1: What is an alternate approach?

Instead of manually declaring and using a specific register for each variable in the code, as we have seen until now, we can let the gcc compiler decide which register to use for each variable.


Q2: Should we use 32767 or 32768 in next line? why?
vol_int = (int16_t) (0.75*32767.0);

Here we should use 32767.0. An int16_t has a range of -32768 to 32767, so 32768 does not fit: a volume factor of 1.0 would scale to 32768, which is out of range.


Q3: What does it mean to "duplicate" values in the next line?

It means copying the contents of a general-purpose register into every element of a vector register.
Here, each of the eight 16-bit elements of v1 (v1.8h) receives the value held in the input register w0.
Source: keil.com


Q4: What happens if we remove the following?
: [in]"+r"(in_cursor)
:"0"(in_cursor),[out]"r"(out_cursor)

The code does not compile, because the assembly template still refers to the named operands that these lines define. gcc generates these error messages:
:undefined named operand 'in'
:undefined named operand 'out'


Q5: Are the results usable? Are they correct?

The result is usable and correct.


Part B
OpenBLAS
Q1: How much assembly-language code is present?
There are 210 files with inline-assembler code (asm, __asm__) and 1001 .s/.S files.
In total, there are 1211 files containing assembly code.

Q2: Is the assembly code in its own file or inline?
The assembly code is both in its own .s/.S files and inline within .c files.

Q3: Which platform(s) is the assembler used on?
The platforms with assembler code in .s/.S files are:
- alpha
- arm
- arm64
- ia64
- mips64
- power
- sparc
- x86
- x86_64
- zarch

Q4: What happens on other platforms?
It seems that the cmake directory is responsible for running the code that checks the system platform. I think the file system_check.cmake inside the cmake directory is one of the files used to select the Makefile according to the system platform. In the root folder there are different Makefiles for each platform, and based on the result from the cmake directory, the specific Makefile is run to build the software.

Q5: Why is it there (what does it do)?
The .s/.S files inside the kernel directory are system specific. They ensure that any system-specific components of the software can run on the different platforms. They also allow the software to be optimized as fully as each platform allows.

Q6: Your opinion of the value of the assembler code, especially when contrasted with the loss of portability and increase in complexity of the code.
My opinion of the value of the assembler code in software depends on how that software is used. Software like Photoshop, which requires a lot of CPU usage, would run better with assembler code handling the core components. If the software isn't meant to be very taxing on the CPU, I think having generalized code is sufficient.
