Friday, August 24, 2012

ARM quick reference

ARM has three processor families: application processors (Cortex-A), real-time processors (Cortex-R), and microcontrollers (Cortex-M). From here on we will discuss only the application processors.

ARM arch & Mode

Cortex A5
Cortex A7
Cortex A8
Cortex A9
Cortex A15
big.LITTLE processing - tasks migrate between the Cortex-A7 (energy efficient) and the Cortex-A15 (high performance) to give high performance as well as good battery life (ARM claims processor energy savings of up to 70%)

SIMD - first introduced in ARMv6
Thumb
Thumb-2
TrustZone
Jazelle
AMBA
Floating point
DSP & SIMD
Virtualization
ARM Physical IP
ARM Classic Processors - ARM11, ARM9, ARM7
ARM SecureCore

CPU Mode:



-User mode
The only non-privileged mode.
-System mode
The only privileged mode that is not entered by an exception. It can only be entered by executing an instruction that explicitly writes to the mode bits of the CPSR.
-Supervisor (svc) mode
A privileged mode entered whenever the CPU is reset or when a SWI instruction is executed.
-Abort mode
A privileged mode that is entered whenever a prefetch abort or data abort exception occurs.
-Undefined mode
A privileged mode that is entered whenever an undefined instruction exception occurs.
-Interrupt mode
A privileged mode that is entered whenever the processor accepts an IRQ interrupt.
-Fast Interrupt mode
A privileged mode that is entered whenever the processor accepts an FIQ interrupt.
-Hyp mode
A hypervisor mode introduced with the ARMv7-A Virtualization Extensions, as implemented in the Cortex-A15 processor, to provide hardware virtualization support.
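
The current mode is held in the low five bits (M[4:0]) of the CPSR. As a rough illustration, the following portable C sketch decodes that field for the modes listed above. The function and enum names are made up for this example; only the numeric encodings come from the architecture.

```c
#include <stdint.h>

/* Values of the CPSR mode field M[4:0] for the modes listed above. */
enum arm_mode {
    ARM_MODE_USR = 0x10, /* User */
    ARM_MODE_FIQ = 0x11, /* Fast Interrupt */
    ARM_MODE_IRQ = 0x12, /* Interrupt */
    ARM_MODE_SVC = 0x13, /* Supervisor */
    ARM_MODE_ABT = 0x17, /* Abort */
    ARM_MODE_HYP = 0x1A, /* Hyp */
    ARM_MODE_UND = 0x1B, /* Undefined */
    ARM_MODE_SYS = 0x1F  /* System */
};

/* Return a printable name for the mode encoded in a CPSR value. */
const char *cpsr_mode_name(uint32_t cpsr)
{
    switch (cpsr & 0x1Fu) {
    case ARM_MODE_USR: return "User";
    case ARM_MODE_FIQ: return "Fast Interrupt";
    case ARM_MODE_IRQ: return "Interrupt";
    case ARM_MODE_SVC: return "Supervisor";
    case ARM_MODE_ABT: return "Abort";
    case ARM_MODE_HYP: return "Hyp";
    case ARM_MODE_UND: return "Undefined";
    case ARM_MODE_SYS: return "System";
    default:           return "Unknown";
    }
}

/* User mode is the only non-privileged mode, so anything else counts
 * as privileged here (reserved encodings are not checked). */
int cpsr_is_privileged(uint32_t cpsr)
{
    return (cpsr & 0x1Fu) != ARM_MODE_USR;
}
```

For example, a CPSR value of 0x600000D3 has mode bits 0x13, so cpsr_mode_name() reports Supervisor mode.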


Switching Mode:

In the case of system calls on ARM, normally the system call causes a SWI instruction to be executed. Any time the processor executes a SWI (software interrupt) instruction, it goes into SVC mode, which is privileged, and jumps to the SWI exception handler. The SWI handler then reads the SWI number (embedded in the instruction) and does whatever the OS programmer decided it should do.
(On entry, in ARM state, the core saves the address of the instruction after the SWI, i.e. PC-4, to LR_svc and copies the CPSR to SPSR_svc. It then sets the I bit to disable IRQs, clears the T bit to enter ARM state, and loads the PC from the SWI entry of the exception vector table, which runs the handler. On return the CPSR is restored from SPSR_svc, which restores the original I and T bits.)
The other exceptions - reset, undefined instruction, prefetch abort, data abort, interrupt, and fast interrupt - all also cause the processor to enter privileged modes.
How file handling works is entirely up to whoever wrote your operating system - there's nothing ARM specific about that at all.
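
To make "embedded in the instruction" concrete: in ARM state the SWI encoding carries a 24-bit immediate in bits [23:0], and a handler typically finds the instruction one word before the address saved in LR_svc. The sketch below models that lookup in portable C on an array of instruction words. The function names are invented for this example, and a real handler would be written partly in assembly.

```c
#include <stdint.h>

/* ARM-state SWI encoding: cond[31:28], 1111 in bits [27:24], imm24[23:0].
 * The condition field is ignored in this simplified check. */
int is_arm_swi(uint32_t instr)
{
    return (instr & 0x0F000000u) == 0x0F000000u;
}

/* Extract the 24-bit SWI number from the instruction word. */
uint32_t arm_swi_number(uint32_t instr)
{
    return instr & 0x00FFFFFFu;
}

/* lr_svc points at the instruction after the SWI, so the SWI itself
 * sits one 32-bit word earlier (ARM state only). */
uint32_t swi_number_from_lr(const uint32_t *lr_svc)
{
    return arm_swi_number(lr_svc[-1]);
}
```

With the word 0xEF000011 (SWI 0x11, condition "always") followed by any other instruction, passing the address of the second word to swi_number_from_lr() yields 0x11.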

Switching to Thumb:

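Take the address of the target instruction, set its least significant bit (address + 1), and branch to it with the BX (Branch and Exchange) instruction. When BX executes, if the LSB of the target address is set, the processor switches to Thumb state and continues at the target address with that bit cleared.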
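
The effect of a branch-and-exchange (BX) can be modelled in a few lines of portable C. This is only a simulation of the state change, assuming a toy cpu_state structure invented for this example. The T bit really is bit 5 of the CPSR.

```c
#include <stdint.h>

#define CPSR_T_BIT (1u << 5) /* Thumb state bit in the CPSR */

/* Hypothetical, minimal model of the state that BX touches. */
struct cpu_state {
    uint32_t pc;
    uint32_t cpsr;
};

/* Model of BX: bit 0 of the target selects the instruction set,
 * and the PC is loaded with that bit cleared. */
void bx(struct cpu_state *cpu, uint32_t target)
{
    if (target & 1u)
        cpu->cpsr |= CPSR_T_BIT;   /* continue in Thumb state */
    else
        cpu->cpsr &= ~CPSR_T_BIT;  /* continue in ARM state */
    cpu->pc = target & ~1u;
}
```

Calling bx(&cpu, 0x8001) leaves pc at 0x8000 with the T bit set. Calling bx(&cpu, 0x9000) afterwards clears the T bit again.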

Thumb Mode Execution:

The execution unit does not differentiate between Thumb code and ARM code. On classic cores such as the ARM7TDMI, separate decompression hardware within the processor core transforms each 16-bit Thumb instruction into the equivalent 32-bit ARM instruction without costing a single clock cycle. This happens in the decode stage of the pipeline (fetch -> decode -> execute).
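
To give a feel for what "transforming" a Thumb instruction means, the sketch below expands two simple Thumb encodings into their ARM equivalents in portable C. It illustrates the mapping only and does not describe the actual hardware logic. The function name is invented for this example, and only the MOV and ADD immediate forms are handled.

```c
#include <stdint.h>

/* Expand a 16-bit Thumb instruction into the equivalent 32-bit ARM
 * instruction. Returns 1 and writes *arm on success, or 0 when the
 * encoding is not one of the two forms handled by this sketch. */
int thumb_to_arm(uint16_t thumb, uint32_t *arm)
{
    uint32_t rd   = (thumb >> 8) & 0x7u; /* destination register r0-r7 */
    uint32_t imm8 = thumb & 0xFFu;       /* 8-bit immediate */

    switch (thumb >> 11) {
    case 0x04: /* 00100: MOV Rd, #imm8  ->  MOVS Rd, #imm8 */
        *arm = 0xE3B00000u | (rd << 12) | imm8;
        return 1;
    case 0x06: /* 00110: ADD Rd, #imm8  ->  ADDS Rd, Rd, #imm8 */
        *arm = 0xE2900000u | (rd << 16) | (rd << 12) | imm8;
        return 1;
    default:
        return 0;
    }
}
```

For instance, the Thumb halfword 0x2005 (MOV r0, #5) expands to the ARM word 0xE3B00005 (MOVS r0, #5).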



DSP:

The ARM DSP instruction set extensions increase the DSP processing capability of ARM solutions in high-performance applications, while offering the low power consumption required by portable, battery-powered devices.

Features:

  • Single-cycle 16x16 and 32x16 MAC implementations
  • 2-3x DSP performance improvement over ARM7™ processor-based CPU products
  • Zero overhead saturation extension support
  • New instructions to load and store pairs of registers, with enhanced addressing modes
  • New CLZ instruction improves normalization in arithmetic operations and improves divide performance
  • Full support in the ARMv5TE, ARMv6 and ARMv7 architectures

Applications:

  • Audio encode/decode (MP3, AAC, WMA)
  • Servo motor control (HDD/DVD)
  • MPEG4 decode
  • Voice and handwriting recognition
  • Embedded control
  • Bit exact algorithms (GSM-AMR)

The ARM DSP extensions enable increased DSP performance without the need for very high clock frequencies. This performance comes with almost no increase in power consumption on a typical implementation. The DSP extensions can often eliminate the need for additional hardware accelerators.
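
Two of the features listed above, saturating arithmetic and CLZ, are easy to picture in plain C. The sketch below emulates what the QADD and CLZ instructions compute, assuming nothing beyond standard C. On a core with the DSP extensions each function corresponds to a single instruction, and the sticky Q flag that QADD sets on saturation is not modelled.

```c
#include <stdint.h>

/* Emulates QADD: signed 32-bit addition that clamps to the
 * representable range instead of wrapping around on overflow. */
int32_t qadd(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Emulates CLZ: number of leading zero bits; CLZ of 0 is 32. */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}
```

Normalizing a value before a divide or a fixed-point multiply amounts to shifting it left by clz32(x), which is why CLZ helps those operations.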

SIMD Extensions for Multimedia


The SIMD extensions in the ARMv6 and ARMv7 architectures are optimized for a broad range of software applications, including video and audio codecs, where they increase performance by up to 75%.

ARMv6 SIMD Features:

  • 75% performance increase for audio and video processing
  • Simultaneous computation of 2x16-bit or 4x8-bit operands
  • Fractional arithmetic
  • User definable saturation modes (arbitrary word-width)
  • Dual 16x16 multiply-add/subtract 32x32 fractional MAC
  • Simultaneous 8/16-bit select operations
  • Performance up to 3.2 GOPS at 800MHz
  • Performance is achieved with a "near zero" increase in power consumption on a typical implementation
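
"Simultaneous computation of 4x8-bit operands" means one 32-bit register is treated as four independent byte lanes. The portable C sketch below emulates the results of the ARMv6 UADD8 (wrapping) and UQADD8 (saturating) instructions lane by lane. On real hardware each is a single instruction, and the GE flags that UADD8 also updates are left out here.

```c
#include <stdint.h>

/* Emulates UADD8: four independent unsigned 8-bit additions,
 * each wrapping around modulo 256. */
uint32_t uadd8(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;
        result |= ((x + y) & 0xFFu) << (8 * lane);
    }
    return result;
}

/* Emulates UQADD8: the same four additions, but each lane
 * saturates at 255 instead of wrapping. */
uint32_t uqadd8(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t sum = ((a >> (8 * lane)) & 0xFFu)
                     + ((b >> (8 * lane)) & 0xFFu);
        if (sum > 0xFFu)
            sum = 0xFFu;
        result |= sum << (8 * lane);
    }
    return result;
}
```

Saturating lanes are what pixel and audio code usually wants: adding brightness to a pixel that is already near white should clamp at 255 rather than wrap to a dark value.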

Applications:

  • Media streaming
  • Internet appliance
  • MPEG4 and H264 encode/decode
  • Voice and handwriting recognition
  • FFT processing
  • Complex arithmetic
  • Viterbi processing


NEON:

NEON™ technology builds on the concept of SIMD with a dedicated module that provides 128-bit wide vector operations, compared to the 32-bit wide SIMD in the ARMv6 architecture. NEON, also known as Advanced SIMD, was introduced in the ARMv7 architecture and is only available on ARM Cortex-A class processors. Its instructions are supported in both the ARM and Thumb instruction sets.
The Cortex-A8 was the first processor to support it.

Vector Floating Point (VFP):

ARM Floating Point architecture (VFP) provides hardware support for floating point operations in half-, single- and double-precision floating point arithmetic.
Using hardware floating point combined with the NEON™ multimedia processing capability, performance of imaging applications such as scaling, 2D and 3D transforms, font generation, and digital filters can be increased.


Thumb:

Thumb gives high code density: Thumb instructions are 16 bits wide, whereas ARM instructions are 32 bits. It is mainly worth using when memory footprint is a concern, because Thumb does not offer every feature of the ARM instruction set, such as access to coprocessor registers. The execution unit treats Thumb and ARM code the same way. When a Thumb instruction is fetched, separate hardware within the core converts the 16-bit instruction into the equivalent 32-bit instruction without costing any clock cycles. The disadvantage shows up when an operation has no single Thumb instruction: several Thumb instructions are then needed to do a job that ARM code could do in one.

How do you switch to Thumb state? Do not do it by setting the T bit of the CPSR directly; that is dangerous. One way is a branch instruction such as BX: give it a target address with the LSB set to 1. When the CPU sees that the LSB is 1, it executes the code at the target location in Thumb state. There are other ways too.

Thumb-2

It is a blended instruction set combining both 16-bit and 32-bit instructions, designed to deliver the best balance of code density and performance.
Thumb-2 overcomes the limitations of the original Thumb instruction set.

The main enhancements are:
  • 32-bit instructions added to the Thumb instruction set to:
    • provide support for exception handling in Thumb state
    • provide access to coprocessors
    • include Digital Signal Processing (DSP) and media instructions
    • improve performance in cases where a single 16-bit instruction restricts functions available to the compiler.

AMBA:

AMBA plays a role somewhat like the PCI architecture on Intel platforms. It is an open standard from ARM intended to make designs more reliable.
The AMBA protocol is an open standard, on-chip interconnect specification for the connection and management of functional blocks in a System-on-Chip (SoC). It facilitates right-first-time development of multi-processor designs with large numbers of controllers and peripherals.
AMBA was introduced by ARM Ltd in 1996. The first AMBA buses were Advanced System Bus (ASB) and Advanced Peripheral Bus (APB).
In its 2nd version, AMBA 2, ARM added the AMBA High-performance Bus (AHB), which is a single clock-edge protocol.
In 2003, ARM introduced the 3rd generation, AMBA 3, including AXI to reach even higher performance interconnect and the Advanced Trace Bus (ATB) as part of the CoreSight on-chip debug and trace solution. These protocols are today the de-facto standard for 32-bit embedded processors because they are well documented and can be used without royalties.


TrustZone:

TrustZone technology, tightly integrated into Cortex™-A processors, extends throughout the system via the AMBA® AXI™ bus and specific TrustZone System IP blocks. This system approach means that it is possible to secure peripherals such as secure memory, crypto blocks, keyboard and screen to ensure they can be protected from software attack.

Jazelle:

ARM Jazelle DBX (Direct Bytecode eXecution) technology for direct bytecode execution of Java™ delivers an unparalleled combination of Java performance and the world's leading 32-bit embedded RISC architecture, giving platform developers the freedom to run Java applications alongside established OS, middleware and application code on a single processor.

ARM Jazelle RCT (Runtime Compilation Target) technology supports efficient ahead-of-time (AOT) and just-in-time (JIT) compilation with Java and other execution environments.




--------------------

References:

ARM references for all of the technologies above:
http://www.arm.com/products/processors/technologies/instruction-set-architectures.php
http://www.arm.com/products/processors/cortex-a/index.php
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0066d/Bcegdedj.html
IQ: http://embedded-telecom-interview.blogspot.in/2011/06/arm-processor-interview-questions.html
