The main aspects that should be considered for optimizing a cryptographic system are:

  • Exploring parallelism in the algorithm
  • Multi-sample processing and split summation
  • Speed optimization at the expense of code size
  • Function-call overhead, which grows with argument passing
  • Function inlining by the compiler
  • Task priority
  • Interrupt service management
  • Time-sliced multi-tasking
  • I/O queues management
  • Interrupt disable while generating the key

Structural Partitioning

Input and output buffers were kept separate from the cryptographic core so that the core could run without being interrupted. External tasks must not be allowed to enter the critical path.

Critical Paths

Many programs have a critical path that dominates the execution cost. Optimization effort should be concentrated on the critical paths rather than spread evenly over the less critical ones.

Computational Complexity

Many programs need to perform highly complex sets of arithmetic functions. Such functions can often be simplified by using cheaper alternatives such as look-up tables and bit manipulation.
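
As an illustration that is not taken from the design described here, consider doubling a byte in the finite field GF(2^8) with the AES reduction polynomial. The sketch below shows a bit-manipulation version and a look-up-table version of the same operation. The function and table names (gf_double, gf_double_table, and so on) are hypothetical.

```c
#include <stdint.h>

/* Bit-manipulation version: multiply by 2 in GF(2^8) using the AES
 * reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).
 * (x >> 7) is 0 or 1, so the multiplication selects 0x00 or 0x1B
 * without an if statement. */
uint8_t gf_double(uint8_t x)
{
    return (uint8_t)((x << 1) ^ ((x >> 7) * 0x1B));
}

/* Look-up-table version: all 256 results are computed once,
 * after which each doubling is a single indexed load. */
uint8_t gf_double_table[256];

void gf_double_table_init(void)
{
    for (int i = 0; i < 256; i++) {
        gf_double_table[i] = gf_double((uint8_t)i);
    }
}

uint8_t gf_double_lookup(uint8_t x)
{
    return gf_double_table[x];
}
```

Which version is faster depends on the target: the table costs 256 bytes of memory and one load per call, while the bit-manipulation version needs no memory access at all.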

Reusability and Functionality

Program modules should be written flexibly enough that they can be reused elsewhere in the application.

Parallel Tasks

Hardware typically offers a higher level of parallelism than software. The design of an embedded device should take the parallelism available in the hardware into account.

Instruction-level Parallelism

A single operation is broken into several independent instructions that can be executed in parallel. Different registers should be used for the individual instructions so that they do not depend on one another. The resulting instruction-level parallelism makes the code efficient on processors that can execute more than one instruction at a time.

Recursive Tasks

Some tasks in a program need to be executed a finite number of times in a loop. Such tasks are called recursive tasks. Each pass through a recursive task carries the overhead of checking whether the instruction sequence should jump out of the loop.

  • Loop Unrolling: For a small, fixed number of repetitions, the overhead can be removed altogether by replacing the loop with its body written out that number of times. This technique is called loop unrolling.
  • Loop Merging: When two loops run over the same range and their bodies can be executed one after the other, it is better to combine them into a single loop. This technique is called loop merging. It reduces the total overhead of executing multiple loops to the overhead of a single loop.
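
The two techniques can be sketched as follows. This is a minimal illustration, assuming a hypothetical 4-word block that is XORed with a key and then masked. None of the names below come from the original design.

```c
#include <stdint.h>

#define BLOCK_WORDS 4  /* assumed block size, for illustration only */

/* Rolled loop: one counter update and one exit test per iteration. */
void xor_block_loop(uint32_t *state, const uint32_t *key)
{
    for (int i = 0; i < BLOCK_WORDS; i++) {
        state[i] ^= key[i];
    }
}

/* Loop unrolling: the body is written out BLOCK_WORDS times,
 * so no counter and no exit test remain. */
void xor_block_unrolled(uint32_t *state, const uint32_t *key)
{
    state[0] ^= key[0];
    state[1] ^= key[1];
    state[2] ^= key[2];
    state[3] ^= key[3];
}

/* Two separate loops over the same range: loop overhead is paid twice. */
void mix_two_loops(uint32_t *state, const uint32_t *key,
                   const uint32_t *mask, int n)
{
    for (int i = 0; i < n; i++) {
        state[i] ^= key[i];
    }
    for (int i = 0; i < n; i++) {
        state[i] &= mask[i];
    }
}

/* Loop merging: both bodies run in one loop, so the overhead is paid once. */
void mix_merged(uint32_t *state, const uint32_t *key,
                const uint32_t *mask, int n)
{
    for (int i = 0; i < n; i++) {
        state[i] = (state[i] ^ key[i]) & mask[i];
    }
}
```

Unrolling trades code size for speed, which is the trade-off listed at the start of this section. An optimizing compiler may already perform either transformation, so the generated code is worth inspecting before unrolling by hand.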

Pipelining Tasks

Two pipelining tasks were considered in order to reduce code execution time.

  • Multi-sample Processing: Samples that do not depend on one another can sometimes be processed simultaneously. This saves valuable clock cycles.
  • Split Summation: A complex equation, such as a long summation, can be simplified by dividing it into smaller components that can be executed in parallel. A further advantage is that each component can be kept in its own register. This minimizes the number of memory transfers, which consume more cycles than register accesses.
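
A minimal sketch of both ideas, assuming the samples are unsigned 32-bit words: two samples are handled per iteration, and the sum is split across two accumulators that do not depend on each other. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Single accumulator: every addition depends on the previous one. */
uint32_t sum_single(const uint32_t *x, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
    }
    return acc;
}

/* Split summation: even-indexed and odd-indexed samples go to separate
 * accumulators, which the compiler can keep in separate registers and
 * the processor can update independently. */
uint32_t sum_split(const uint32_t *x, size_t n)
{
    uint32_t acc0 = 0;
    uint32_t acc1 = 0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {   /* two samples per iteration */
        acc0 += x[i];
        acc1 += x[i + 1];
    }
    if (i < n) {                  /* leftover sample when n is odd */
        acc0 += x[i];
    }
    return acc0 + acc1;
}
```

Because unsigned addition wraps modulo 2^32 and is associative, both functions return the same value for any input. The same rearrangement is not generally exact for floating-point data.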

Conditional Tasks

Some tasks use conditional statements such as if-then-else, which can consume many cycles. It is better to remove conditional statements wherever possible.
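
One common way to remove a conditional is to replace it with a bit mask. The sketch below is an illustration rather than the method used in the original design. It selects one of two 32-bit values without an if statement, and the function names are hypothetical.

```c
#include <stdint.h>

/* Conditional version: the compiler may emit a branch here. */
uint32_t select_branch(uint32_t cond, uint32_t a, uint32_t b)
{
    if (cond) {
        return a;
    }
    return b;
}

/* Mask version: returns a when cond is nonzero, b otherwise. */
uint32_t select_mask(uint32_t cond, uint32_t a, uint32_t b)
{
    /* (cond != 0) is 1 or 0; subtracting it from 0 gives a mask of
     * all ones or all zeros. */
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}
```

Whether the mask version is actually faster, or compiles to branch-free code at all, depends on the compiler and the target processor, so the generated code should be checked.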

Critical Issues

Interrupt Service Management

Cryptography-related modules should be given the highest priority. If some other critical task must be performed, its interrupt routine should be programmed to check whether any cryptographic module is running at that time. If so, all cryptographic data should be deleted before the interrupt routine proceeds, and the cryptographic module should be executed again once the routine has completed. Under no circumstances should cryptographic data be pushed onto the stack in order to service an interrupt.
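
The check-and-delete step could look roughly like the sketch below. It is a sketch under stated assumptions, not the original implementation: the crypto_ctx structure, its fields, and the function names are all hypothetical, and a real device would also have to clear processor registers in a platform-specific way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical module state, for illustration only. */
typedef struct {
    uint8_t round_key[16];
    uint8_t state[16];
    volatile int busy;     /* nonzero while a cryptographic operation runs */
    volatile int restart;  /* set when the operation must be run again */
} crypto_ctx;

/* Overwrite a buffer with zeros. The volatile pointer is intended to keep
 * the compiler from removing the writes as dead stores. */
void wipe_sensitive(void *buf, size_t len)
{
    volatile uint8_t *p = (volatile uint8_t *)buf;
    while (len--) {
        *p++ = 0;
    }
}

/* Intended to be called at the start of an interrupt routine: if a
 * cryptographic operation is in progress, delete its data and mark the
 * operation to be run again afterwards. */
void crypto_preempt(crypto_ctx *ctx)
{
    if (ctx->busy) {
        wipe_sensitive(ctx->round_key, sizeof ctx->round_key);
        wipe_sensitive(ctx->state, sizeof ctx->state);
        ctx->busy = 0;
        ctx->restart = 1;
    }
}
```

After the interrupt routine returns, the main code would test the restart flag, reload the key material from its protected source, and run the cryptographic module again.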

Time-sliced Multi-tasking

Time-sliced multitasking of a cryptographic module with other applications also exposes the module to attack. Time slicing could allow an attacker to read the contents of the registers and obtain crucial information, which could lead to knowledge of the key.

I/O Queues Management

To run the cryptographic modules efficiently, the input and output modules should be structurally separated from them. When the embedded device has multi-processor capability, a separate processor should be dedicated to I/O data management.

Optimization Metrics

Many optimization metrics apply to embedded systems, such as:

  • Production cost
  • Execution speed
  • Memory size
  • Data throughput
  • Power consumption
  • Robustness
