Friday, August 24, 2012

ARM quick reference

ARM's Cortex family has three branches: application processors (Cortex-A), real-time processors (Cortex-R) and microcontrollers (Cortex-M). From here on we will discuss the application processors.

ARM arch & Mode

Cortex A5
Cortex A7
Cortex A8
Cortex A9
Cortex A15
big.LITTLE processing - the system switches work between the Cortex-A7 (energy efficient) and the Cortex-A15 (high performance) to give high performance as well as good battery life (battery life is said to increase by up to 70%)

SIMD - came 1st in ARMv6
Thumb - understanding
Thumb2
TrustZone
Jazelle
AMBA
Floating point
DSP & SIMD
Virtualization
ARM Physical IP
ARM Classic Processors - ARM11, ARM9, ARM7
ARM SecureCore

CPU Mode:



-User mode
The only non-privileged mode.
-System mode
The only privileged mode that is not entered by an exception. It can only be entered by executing an instruction that explicitly writes to the mode bits of the CPSR.
-Supervisor (svc) mode
A privileged mode entered whenever the CPU is reset or when a SWI instruction is executed.
-Abort mode
A privileged mode that is entered whenever a prefetch abort or data abort exception occurs.
-Undefined mode
A privileged mode that is entered whenever an undefined instruction exception occurs.
-Interrupt mode
A privileged mode that is entered whenever the processor accepts an IRQ interrupt.
-Fast Interrupt mode
A privileged mode that is entered whenever the processor accepts an FIQ interrupt.
-Hyp mode
A hypervisor mode introduced with the ARMv7-A Virtualization Extensions, implemented in processors such as the Cortex-A15, to provide hardware virtualization support.
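
To make the mode list concrete, here is a minimal sketch that decodes the mode field of a CPSR value. The mode encodings and the T, I and F bit positions are the architectural ones; the function name `decode_cpsr` and the returned dictionary layout are only assumptions made for this illustration.

```python
# Mode field values (CPSR bits [4:0]) for the modes listed above.
CPSR_MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11010: "Hyp",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(cpsr):
    """Return the mode name and a few control flags for a CPSR value."""
    mode_bits = cpsr & 0x1F
    if mode_bits not in CPSR_MODES:
        # Encodings not covered by the list above are rejected in this sketch.
        raise ValueError("mode encoding not in the list: %#04x" % mode_bits)
    mode = CPSR_MODES[mode_bits]
    return {
        "mode": mode,
        "privileged": mode != "User",        # User is the only non-privileged mode
        "thumb": bool(cpsr & (1 << 5)),      # T bit
        "fiq_masked": bool(cpsr & (1 << 6)),  # F bit
        "irq_masked": bool(cpsr & (1 << 7)),  # I bit
    }
```

For example, `decode_cpsr(0xD3)` reports Supervisor mode with both IRQ and FIQ masked, which is the state the text above describes after reset.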


Switching Mode:

In the case of system calls on ARM, the system call normally causes a SWI instruction to be executed. Any time the processor executes a SWI (software interrupt) instruction, it goes into SVC mode, which is privileged, and jumps to the SWI exception handler. The SWI handler then looks at the SWI number (embedded in the instruction) and does whatever the OS programmer decided it should do.
(On entry the core saves the return address, i.e. the instruction after the SWI, in LR_svc and the CPSR in SPSR_svc, switches to SVC mode, sets the I flag to mask IRQs, clears the T flag to enter ARM state, then fetches the SWI vector from the vector table and executes the handler/top half. On return the CPSR is restored from SPSR_svc, which restores the original I and T flags.)
The other exceptions - reset, undefined instruction, prefetch abort, data abort, interrupt, and fast interrupt - all also cause the processor to enter privileged modes.
How file handling works is entirely up to whoever wrote your operating system - there's nothing ARM specific about that at all.
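
The SWI entry sequence can be modelled in a few lines. This is only a sketch under stated assumptions: the vector table sits at address 0 (so the SWI vector is 0x08), exceptions are taken in ARM state as on classic cores, and the function name `take_swi` is made up for the illustration.

```python
MODE_MASK = 0x1F      # CPSR bits [4:0]
MODE_SVC = 0b10011    # Supervisor mode encoding
T_BIT = 1 << 5        # Thumb state bit
I_BIT = 1 << 7        # IRQ mask bit
SWI_VECTOR = 0x08     # assumes the vector table is at address 0


def take_swi(cpsr, swi_address):
    """Model what the core does when it executes a SWI at swi_address."""
    # A Thumb SWI is 2 bytes long, an ARM SWI is 4 bytes long.
    insn_size = 2 if cpsr & T_BIT else 4
    # Switch to SVC mode, clear T (ARM state), set I (IRQs masked).
    new_cpsr = (cpsr & ~(MODE_MASK | T_BIT)) | MODE_SVC | I_BIT
    return {
        "lr_svc": swi_address + insn_size,  # return to the next instruction
        "spsr_svc": cpsr,                   # old CPSR saved for the return
        "cpsr": new_cpsr,
        "pc": SWI_VECTOR,
    }
```

Returning from the handler amounts to copying `spsr_svc` back into the CPSR and `lr_svc` into the PC, which is why the original I and T bits come back without any extra work.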

Switching to Thumb:

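Take the address of the code you want to run and add 1 to it, i.e. set its least significant bit, and then branch to it with the BX (branch and exchange) instruction. When BX executes, if the LSB of the target address is set, the processor continues execution at the target in Thumb mode.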

Thumb Mode Execution:

The execution core doesn't differentiate between Thumb mode and ARM mode. Separate hardware within the processor core expands each 16-bit Thumb instruction into the equivalent 32-bit ARM instruction without costing a single clock cycle. This happens at the front of the pipeline, before the execute stage (fetch -> decode -> execute).



DSP:

The ARM DSP instruction set extensions increase the DSP processing capability of ARM solutions in high-performance applications, while offering the low power consumption required by portable, battery-powered devices.

Features:

  • Single-cycle 16x16 and 32x16 MAC implementations
  • 2-3 x DSP performance improvement over ARM7™ processor-based CPU products
  • Zero overhead saturation extension support
  • New instructions to load and store pairs of registers, with enhanced addressing modes
  • New CLZ instruction improves normalization in arithmetic operations and improves divide performance
  • Full support in the ARMv5TE, ARMv6 and ARMv7 architectures

Applications:

  • Audio encode/decode (MP3, AAC, WMA)
  • Servo motor control (HDD/DVD)
  • MPEG4 decode
  • Voice and handwriting recognition
  • Embedded control
  • Bit exact algorithms (GSM-AMR)

The ARM DSP extensions enable increased DSP performance without the need for very high clock frequencies. This performance comes with almost no increase in power consumption on a typical implementation. The DSP extensions can often eliminate the need for additional hardware accelerators.

SIMD Extensions for Multimedia


The SIMD extensions in the ARMv6 and ARMv7 architectures are optimized for a broad range of software applications including video and audio codecs, where they increase performance by up to 75%.

ARMv6 SIMD Features:

  • 75% performance increase for audio and video processing
  • Simultaneous computation of 2x16-bit or 4x8-bit operands
  • Fractional arithmetic
  • User definable saturation modes (arbitrary word-width)
  • Dual 16x16 multiply-add/subtract
  • 32x32 fractional MAC
  • Simultaneous 8/16-bit select operations
  • Performance up to 3.2 GOPS at 800MHz
  • Performance is achieved with a "near zero" increase in power consumption on a typical implementation
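
As an illustration of "simultaneous computation of 4x8-bit operands" with saturation, the sketch below models in plain Python what the ARMv6 `UQADD8` instruction computes on one 32-bit register pair. It is a software model written for this note, not how the hardware implements it.

```python
def uqadd8(a, b):
    """Model of UQADD8: four independent unsigned 8-bit saturating adds."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        total = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        # Saturate instead of wrapping: anything above 255 becomes 255.
        result |= min(total, 0xFF) << shift
    return result
```

On the processor the four lanes are added by one instruction; the loop here only spells out what happens in each lane.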

Applications:

  • Media streaming
  • Internet appliance
  • MPEG4 and H264 encode/decode
  • Voice and handwriting recognition
  • FFT processing
  • Complex arithmetic
  • Viterbi processing


NEON:

NEON™ technology builds on the concept of SIMD with a dedicated module that provides 128-bit wide vector operations, compared to the 32-bit wide SIMD in the ARMv6 architecture. NEON, also called Advanced SIMD, is available on ARM Cortex-A class processors and is supported by both the ARM and Thumb instruction sets.
It came with ARMv7, and the Cortex-A8 was the first processor to support it.

Vector Floating point(VFP):

ARM Floating Point architecture (VFP) provides hardware support for floating point operations in half-, single- and double-precision floating point arithmetic.
Using hardware floating point combined with the NEON™ multimedia processing capability, performance of imaging applications such as scaling, 2D and 3D transforms, font generation, and digital filters can be increased.


Thumb :

Good for high code density: Thumb instructions are 16 bits wide whereas ARM instructions are 32 bits. Thumb is mainly worth using when memory is a concern, because it doesn't offer all the features of ARM, such as talking to coprocessor registers. Execution-wise, Thumb and ARM code work the same way in the CPU. Only while fetching a Thumb instruction does the core use separate hardware to convert the 16-bit instruction into its 32-bit equivalent, without losing a clock cycle. The disadvantage of Thumb shows up when we want an operation that Thumb doesn't support: we then need several instructions to do a job that ARM mode code could do in one.

How to jump to Thumb mode? Well, don't jump by setting the T bit of the CPSR directly; it is dangerous. One way is to use a branch instruction (BX): give it a target address with the LSB set to 1. When the CPU sees the LSB as 1, it understands that the code at the target must run in Thumb mode. There are other ways too.
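
The LSB rule can be written down as a tiny model. This is a sketch of what the branch-and-exchange decision looks like, with the function name `bx` chosen only for the example:

```python
def bx(target):
    """Model of BX: bit 0 of the target selects the instruction set."""
    state = "Thumb" if target & 1 else "ARM"
    # Bit 0 is only a state selector; execution starts at the address
    # with that bit cleared.
    return state, target & ~1
```

So `bx(0x8001)` runs the code at 0x8000 in Thumb state, while `bx(0x8000)` runs the same address in ARM state.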

Thumb-2

It is a blended instruction set combining both 16-bit and 32-bit instructions, designed to deliver the best balance of code density and performance.
Thumb-2 overcomes the limitations of the original Thumb instruction set.

The main enhancements are:
  • 32-bit instructions added to the Thumb instruction set to:
    • provide support for exception handling in Thumb state
    • provide access to coprocessors
    • include Digital Signal Processing (DSP) and media instructions
    • improve performance in cases where a single 16-bit instruction restricts functions available to the compiler.
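
Because Thumb-2 mixes both sizes in one instruction stream, a decoder has to tell from the first halfword whether a second one follows. A minimal sketch of that rule (the function name is an assumption for this example): a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction, anything else is a complete 16-bit instruction.

```python
def thumb_insn_length(first_halfword):
    """Return the instruction length in bytes, given its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2
```

For instance, 0x4770 (`BX LR`) is a 16-bit instruction, while a halfword such as 0xF000 is the first half of a 32-bit one.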

AMBA:

Somewhat like the PCI architecture on Intel platforms. It is an open standard from ARM intended to make SoC designs more reliable.
The AMBA protocol is an open standard, on-chip interconnect specification for the connection and management of functional blocks in a System-on-Chip (SoC). It facilitates right-first-time development of multi-processor designs with large numbers of controllers and peripherals.
AMBA was introduced by ARM Ltd in 1996. The first AMBA buses were Advanced System Bus (ASB) and Advanced Peripheral Bus (APB).
In its 2nd version, AMBA 2, ARM added the AMBA High-performance Bus (AHB), which is a single clock-edge protocol.
In 2003, ARM introduced the 3rd generation, AMBA 3, including AXI to reach even higher performance interconnect and the Advanced Trace Bus (ATB) as part of the CoreSight on-chip debug and trace solution. These protocols are today the de-facto standard for 32-bit embedded processors because they are well documented and can be used without royalties.


TrustZone:

TrustZone technology, tightly integrated into Cortex™-A processors, extends throughout the system via the AMBA® AXI™ bus and specific TrustZone System IP blocks. This system approach means that it is possible to secure peripherals such as secure memory, crypto blocks, keyboard and screen to ensure they can be protected from software attack.

Jazelle:

ARM Jazelle DBX (Direct Bytecode eXecution) technology for direct bytecode execution of Java™ delivers an unparalleled combination of Java performance and the world's leading 32-bit embedded RISC architecture - giving platform developers the freedom to run Java applications alongside established OS, middleware and application code on a single processor.

ARM Jazelle RCT (Runtime Compilation Target) technology supports efficient ahead-of-time (AOT) and just-in-time (JIT) compilation with Java and other execution environments.




--------------------

References:

ARM references for all the technologies above:
http://www.arm.com/products/processors/technologies/instruction-set-architectures.php
http://www.arm.com/products/processors/cortex-a/index.php
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0066d/Bcegdedj.html
IQ: http://embedded-telecom-interview.blogspot.in/2011/06/arm-processor-interview-questions.html

Thursday, August 16, 2012

Let's go to User Space


Compile busybox statically linked:

$make ARCH=arm CROSS_COMPILE=/home/abhishek/arm-2009q3/bin/arm-none-linux-gnueabi- menuconfig

$LDFLAGS="--static" make ARCH=arm CROSS_COMPILE=/home/abhishek/arm-2009q3/bin/arm-none-linux-gnueabi- 

$make ARCH=arm CROSS_COMPILE=/home/abhishek/arm-2009q3/bin/arm-none-linux-gnueabi- CONFIG_PREFIX=/home/abhishek/rpi/busybox/INSTALL-DIR install

Running Script from your application:

http://www.tuxation.com/setuid-on-shell-scripts.html



http://www.tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html



---------------------------------------------------------------------------------------------------------



Ethernet setup:

Check the output of ifconfig on the target system; if you can see the ethernet interface there, your ethernet link is up.
If you can't find the ethernet interface, you need to bring the interface up.
$ ip link show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
    link/ether 00:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
3: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
6: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 3
    link/ppp 


$ ip link show eth0

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
    link/ether 00:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff


$ ip link set eth0 up

$ifconfig eth0

Now you can see the ethernet device is up, but you won't see an IP address yet. So here you have to assign the device an IP address:


$ ifconfig eth0 10.0.0.1 netmask 255.255.255.0 up
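
If you want to script the check instead of reading the output by eye, the `ip link show` listing shown earlier is easy to parse. The sketch below is only an illustration: the function name, the regular expression and the returned layout are my own assumptions, and the sample text is a shortened copy of the output above.

```python
import re

# Matches the first line of each interface entry, e.g.
# "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ... state DOWN qlen 1000"
LINK_RE = re.compile(r"^(\d+):\s+([^:]+):\s+<([^>]*)>.*?\bstate\s+(\S+)")


def parse_ip_link(text):
    """Map interface name -> flags and operational state."""
    links = {}
    for line in text.splitlines():
        match = LINK_RE.match(line)
        if match:  # indented "link/ether ..." lines do not match and are skipped
            _, name, flags, state = match.groups()
            links[name] = {"flags": flags.split(","), "state": state}
    return links


SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
    link/ether 00:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
3: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff
"""

links = parse_ip_link(SAMPLE)
```

An interface that has been brought up with `ip link set ... up` carries the `UP` flag inside the angle brackets, which is the thing to test for before assigning an address.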


Starting GUI from console:

Your system has booted to a shell prompt and you want to launch the GUI from there. Whatever the reason, you basically have to put the system into the correct run level. The command is simple:

$init <run level>
e.g. $ init 5

So let's understand the run levels. These may vary slightly between distributions (especially run levels 3-5); the definitions can be seen in /etc/inittab.

init 0 : shutdown/halt
init 1 : single user/admin task
init 2 : multi-user without network
init 3 : start normally with console/multi-user with network
init 4 : <user defined>
init 5 : start normally with GUI
init 6 : re-boot



References:


Working with Ethernet: http://linux-ip.net/html/tools-ip-link.html
http://people.debian.org/~ultrotter/talks/dc10/networking.html


Serial over Ethernet: http://www.novell.com/communities/node/4753/netconsole-howto-send-kernel-boot-messages-over-ethernet


---------------------------------------------------------------------------------------------------

Kernel

kprobes: A good tool for placing breakpoints (probes) in running kernel code for debugging.

Proc file system:

Build own linux based system:

--------------------------------------------------------------------------------------------------
Firefox OS

http://www.ifadey.com/2012/07/firefox-os-boot-to-gecko/
http://www.youtube.com/watch?v=yNWRMTqtQEI
https://wiki.mozilla.org/Gaia


-------------------------------------------------------------------------------------------------
Windowing System for new OS

http://xwinman.org/others.php
https://help.ubuntu.com/community/Installation/LowMemorySystems

Linux Distributions

http://distrowatch.com

--------------------------------------------------------------------------------------------------
Linux Boot-up



--------------------------------------------------------------------------------------------------
Worth Reading


Porting U-Boot: 

Monday, August 6, 2012

Raspberry Pi Getting started

First Look

A couple of days ago I received my Raspberry Pi, which I had pre-ordered a month earlier.
In size it exactly matches my credit card, as they said. The finish and build quality of the device are average, but no one else will give you such a board at such a cost. The design is really appreciable: it looks worthwhile and logical, both for usability and for reducing cost. A few of the design choices I appreciate are:

- Power supply through micro USB (handy, as I already had my mobile charger rated at 5V-700mA)
- Serial connection through the expansion pins
- No power on/off button
- JTAG and SD card muxed to the same pins
- The first level of booting is only through the SD card, so you will be happy to know that you cannot brick your board during development work.

I couldn't get detailed internals for the BCM2835 SoC, which disappointed me.

Let's Explore the device

Let's first try to understand how the boot sequence goes on the board; you can get this information from the Raspberry Pi forum.
- When the device is powered on, only the GPU core is turned on (the ARM core, SDRAM and SD card controller are still off).
- The bootloader in the BCM2835 SoC ROM starts executing and scans the SD card for bootcode.bin.
- A chain-loading sequence follows with the Broadcom firmware. The ROM bootloader loads bootcode.bin, the first bootloader visible to the user and present in the FAT32 partition of the SD card, into the L2 cache (note that SDRAM is still off and not available). So you can say the first stage initiates communication with the SD card controller and understands only the FAT32 file system, like many other first-stage bootloaders I have seen.
- bootcode.bin now enables SDRAM and loads loader.bin, which basically understands the ELF format.
- loader.bin then loads start.elf at the top of memory. Up to this point nothing is visible in the boot log.
- start.elf then does its jobs: it splits memory between the ARM CPU and the GPU and reads cmdline.txt to get the root partition, its format and the kernel parameters. I guess start.elf also turns on the peripherals for console/display, as it can show a colourful screen over HDMI which seems to be un-initialized. I guess the display subsystem is mapped to a RAM area here.
- start.elf finds kernel.img on the boot partition and loads it into SDRAM. Up to this point all instructions have been running on the GPU core.
- Now control goes to the ARM core to execute the kernel image.

In short: GPU core(GPU start) -> BL from SoC ROM -> bootcode.bin -> loader.bin -> start.elf(GPU end) -> (CPU start) kernel.img

This boot sequence is fundamentally the same across different boards, but the way start.elf reads cmdline.txt is similar to the GRUB bootloader.
We can play with our own code only from kernel.img onwards. It is better to put u-boot in the system to have more fun, as there we can add debugging traces.

Running Linux on the board

So now let's start porting Linux to the board:
Take an SD card that meets the Raspberry Pi specification. Format it as bootable FAT32.

$ mkfs.vfat /dev/mmcxxxx
Now make two partitions on the mmc card, one for the boot device with a FAT32 file system and another for the root partition with an ext4 file system.

/boot  -- in FAT32 format, to keep the Raspberry Pi GPU firmware files bootcode.bin, loader.bin and start.elf, and the kernel directive file cmdline.txt.

/root -- the proper root device should be mentioned in the cmdline.txt file.

We need a cross-compilation toolchain to build Linux for the board on an Ubuntu host.

Install git and the cross-compilation toolchain:
$ sudo apt-get install git-core gcc-4.6-arm-linux-gnueabi

Make a directory for the sources and tools, then clone them with git:
$mkdir raspberrypi
$cd raspberrypi
$git clone https://github.com/raspberrypi/tools.git
$git clone https://github.com/raspberrypi/linux.git
$cd linux
Generate the .config file from the pre-packaged raspberry pi one:
$make ARCH=arm CROSS_COMPILE=/usr/bin/arm-linux-gnueabi- bcmrpi_cutdown_defconfig
For doing any configuration changes, open configuration menu and mark your changes,
$make ARCH=arm CROSS_COMPILE=/usr/bin/arm-linux-gnueabi- menuconfig
Once configuration file is prepared, start compiling kernel for r-pi device,
$make ARCH=arm CROSS_COMPILE=/usr/bin/arm-linux-gnueabi- -k -j5

$mkdir ../modules

Then compile and ‘install’ the loadable modules to the temp directory:
$make modules_install ARCH=arm CROSS_COMPILE=/usr/bin/arm-linux-gnueabi- INSTALL_MOD_PATH=../modules/
Now we need to use imagetool-uncompressed.py to generate un-compressed kernel image for r-pi.
See below for necessary download links
$cd ../tools/mkimage/
$./imagetool-uncompressed.py ../../linux/arch/arm/boot/Image

Here you will find generated kernel.img.
Now we can copy kernel.img in mmc card.

$sudo cp -f kernel.img /media/<boot partition>/
Now copy the generated modules and firmware from the modules directory to the root partition of the mmc card
$cd ../../modules/
$sudo cp -rf lib/ /media/<root partition>/
$sync
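
Before pulling the card out, it can save a boot attempt to confirm that every file named in the boot sequence is really on the boot partition. A minimal sketch, assuming the partition is mounted somewhere under /media; the function name and the file list (taken from the boot sequence described above) are mine:

```python
import os

# Files the boot chain described above expects on the FAT32 boot partition.
REQUIRED_BOOT_FILES = [
    "bootcode.bin",
    "loader.bin",
    "start.elf",
    "cmdline.txt",
    "kernel.img",
]


def missing_boot_files(boot_dir):
    """Return the required files that are not present in boot_dir."""
    present = set(os.listdir(boot_dir))
    return [name for name in REQUIRED_BOOT_FILES if name not in present]
```

Call it with the mount point of your boot partition; an empty list means all the expected files are in place.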

Now put the mmc card into the r-pi mmc slot and power on the device. An HDMI cable was my first preference for seeing the output messages, as my serial setup was not working initially. For a minimal indication that your code is executing, observe the green LED next to the red power LED on the r-pi board. If your instructions are executing, you will see the green LED blink continuously.

So, if you see your kernel messages, you are done with one stage, and now you have to move ahead to executing user space code. At the end of the boot you will probably see a kernel panic, as your kernel will not be able to find the root partition if you have not set it up yet.


Raspberry Pi on QEMU

Install latest QEMU version in your system. I have verified support in QEMU v1.5.

# qemu-system-arm -cpu ?|grep arm11

If you find arm1176 in the list then you have support for the rpi CPU in qemu.

Download the Arch Linux image from the official Raspberry Pi site and verify that it boots well.

# qemu-system-arm -cpu arm1176 -kernel kernel-qemu -m 256 -M versatilepb -no-reboot -serial stdio -append "root=/dev/sda2 panic=0" -hda archlinux-hf-2013-02-11.img

If it boots then you are all right to go for your custom image. Just use the same Arch Linux image, mount it somewhere and put in your custom kernel, boot loader, or libraries.

Link: http://www.v13.gr/blog/?p=276

Links



Downloads

Follow https://github.com/raspberrypi/ link and download most of the required tools,


$git clone https://github.com/raspberrypi/tools.git
$git clone https://github.com/raspberrypi/linux.git
$git clone https://github.com/gonzoua/u-boot-pi.git
$git clone https://github.com/raspberrypi/firmware/



http://kernelnomicon.org/?m=201205
http://wiki.gentoo.org/wiki/Raspberry_Pi


References: http://raspberrypi.org/phpBB3/viewtopic.php?f=63&t=6685