November 12, 2008

Intel Attacks ARM

Filed under: Industry News — Tags: , , — sglass @ 1:44 pm

The Atom was the reason why Intel had to sell the XScale division. Unfortunately the XScale CPU wasn’t all it should have been, lacking debug capability or the performance leap promised by its StrongARM heritage. While Intel sold a few chips to people for WinCE PDAs, and even a few Motorola cellphones, the market was small compared to that available to TI and the like.

Free from its ownership of a competing architecture, one which has wiped the floor with Intel, its execs obviously feel comfortable letting rip at ARM. Intel is no doubt hugely frustrated at its inability to compete in the fast-growing cellphone market, and the Apple iPhone is just another sign of ARM’s dominance in this sector.

So here is the comment I’m referring to:

Kedia didn’t just stop at the iPhone, claiming ARM was a malaise afflicting smartphones in general. “The smartphone of today is not very smart,” he said. “The problem they have today is they use ARM.”

Wall believed the situation was unlikely to change anytime soon, saying Intel was two years ahead of the rival company. He didn’t believe fast, full internet would receive a debut with ARM-based devices in the near future. “Even if they do have full capability, the performance will be so poor,” he said.

Of course this guy is just venting, during a trip to Taiwan. Perhaps he met with a number of potential customers there who all told him they were using ARM and very happy with it.

But also, it simply isn’t true. Tom’s Hardware shows Atom’s power consumption (for CPU alone) of about 2.5W, with 5W including the required companion chip.

We should point out, though, that the two chipsets to be used with the Atom N200s are power users: the Atom 230s use an i945GC that consumes 22W (4W for the CPU) and the Atom N270s ship with an i945GSE that burns 5.5W (2.4W for the CPU).

This is for a 1.6GHz CPU. By comparison the OMAP3530, a dual core 600MHz CPU with integrated video DSP, 3D graphics, NEON SIMD machine, DDR interface (i.e. Atom + support chip) consumes under 2W total (and that’s the maximum from the datasheet and the Beagle board - with power management OFF!).

It is a mystery why Intel chips consume so much power. Some say it is the byzantine x86 instruction set. Others say that Intel aims for speed rather than power. Who knows…

So in terms of power consumption, Intel isn’t even able to play the game yet. It is perhaps 3-5 years behind ARM on this one.

The claim that the Internet isn’t usable on an ARM CPU is also bogus. From what little I have seen of the iPhone it seems usable enough. My Nokia E90 certainly runs ok on the web, although I agree it could do with more speed (it is an ARM11 design). I think Intel will be shocked at the capability of the Cortex-A8 devices when they come out in the new year.

Of course Intel needs to attack ARM - ARM owns the lower power market space and it is the only way that Intel can make inroads into it. But Intel needs to get its products in order first. Perhaps Intel should swallow its pride, take an ARM architectural license and put its A team on the project. The C team didn’t do a great job, but everyone knows Intel has great chip engineers - just look at the Pentium range. Take away the x86 baggage and who knows what might be achieved?

November 10, 2008

Advanced Realview ICE usage

Filed under: ARM Tools News — Tags: , , , — andre @ 10:44 pm

When using the RealView ICE, some parts of the development cycle are repeated often: loading and executing a specific image, for example, or configuring the SDRAM controller so that normal development can proceed. These steps are typically done either in a simple compiled program or with the built-in scripting of the RealView Debugger. In the graphical environment, however, even with a single executable, clicking through all the buttons needed to run the task can be laborious. There is another method available: using the RDDI network protocol to communicate directly with the ICE unit.


This protocol allows for all of the standard JTAG operations to be performed automatically from code. At Bluewater we have developed a minimal scripting interface which communicates over this API to automate the tasks which we often perform. This means that with a single click we can bring up a completely un-programmed board into a running state, upload the latest version of our code, and begin executing it with no further interaction. When dealing with a large variety of different development boards, this can be vital in ensuring that information on how to deal with each one is not lost - the script provides all the information necessary. As the details of the actual JTAG operations are still left with the ICE, all the performance advantages of the ICE are still in effect.

Operations available from the RDDI network protocol include:

  • Resetting the target
  • Setting & retrieving register values
  • Uploading images to memory
  • Downloading images from memory

A sample library is provided by ARM for these purposes, available from http://www.arm.com/products/DevTools/RDDIRVI.html.
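
To give a feel for what such a bring-up script can look like, here is a minimal sketch in C. It is not our scripting interface and it does not use the RDDI API: every name in it (script_step, target_ops, run_script, the dry-run backend) is hypothetical. It only shows the idea of describing a board as a table of steps and running them through a small set of callbacks, which a real backend would map onto whatever calls the ARM library provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical step kinds, mirroring the operations listed above.
 * These are NOT RDDI calls. */
typedef enum { STEP_RESET, STEP_WRITE_REG, STEP_LOAD_IMAGE, STEP_RUN } step_kind;

typedef struct {
    step_kind   kind;
    uint32_t    addr;   /* register, load or entry address */
    uint32_t    value;  /* register value (STEP_WRITE_REG only) */
    const char *image;  /* image file name (STEP_LOAD_IMAGE only) */
} script_step;

/* Backend interface: one callback per operation, each returning 0 on success. */
typedef struct {
    int (*reset)(void *ctx);
    int (*write_reg)(void *ctx, uint32_t addr, uint32_t value);
    int (*load_image)(void *ctx, const char *image, uint32_t addr);
    int (*run)(void *ctx, uint32_t entry);
    void *ctx;
} target_ops;

/* Run the steps in order, stopping at the first failure.
 * Returns the number of steps completed (== count on full success). */
size_t run_script(const target_ops *ops, const script_step *steps, size_t count)
{
    assert(ops != NULL && steps != NULL);
    for (size_t i = 0; i < count; i++) {
        const script_step *s = &steps[i];
        int rc;
        switch (s->kind) {
        case STEP_RESET:      rc = ops->reset(ops->ctx); break;
        case STEP_WRITE_REG:  rc = ops->write_reg(ops->ctx, s->addr, s->value); break;
        case STEP_LOAD_IMAGE: rc = ops->load_image(ops->ctx, s->image, s->addr); break;
        case STEP_RUN:        rc = ops->run(ops->ctx, s->addr); break;
        default:              rc = -1; break;
        }
        if (rc != 0)
            return i;
    }
    return count;
}

/* Dry-run backend: only counts calls, so a script can be checked without hardware. */
typedef struct { int resets, reg_writes, loads, runs; } dry_run_log;

static int dry_reset(void *ctx) { ((dry_run_log *)ctx)->resets++; return 0; }
static int dry_write_reg(void *ctx, uint32_t addr, uint32_t value)
{ (void)addr; (void)value; ((dry_run_log *)ctx)->reg_writes++; return 0; }
static int dry_load_image(void *ctx, const char *image, uint32_t addr)
{ (void)addr; ((dry_run_log *)ctx)->loads++; return image != NULL ? 0 : -1; }
static int dry_run(void *ctx, uint32_t entry)
{ (void)entry; ((dry_run_log *)ctx)->runs++; return 0; }

target_ops dry_run_ops(dry_run_log *log)
{
    target_ops ops = { dry_reset, dry_write_reg, dry_load_image, dry_run, log };
    return ops;
}
```

A board description then becomes a plain array of script_step entries (reset, a few SDRAM controller register writes, load, run), which is the part that captures how to deal with each board.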

Xnets & PCB Layout tool documentation

While routing the DDR memory on our Snapper-DV board, which now uses a Texas Instruments (TI) OMAP3530 applications processor, we terminated all DDR memory bus lines, including control signals, with series resistors between the processor and the two DDR ICs, which are routed in a balanced-T topology. As with all DDR routing, extensive length matching is applied to these memory traces (which are further grouped into classes by function). Normally, the Cadence Allegro Performance tool can handle point-to-point (single net) length matching very well via Constraint Manager. Series termination resistors complicate matters, because they have to be interpreted as part of the connection from the processor to the DDR ICs (see figure below) and thus included in the length matching. When the path of a signal traverses a discrete device (resistor, inductor or capacitor), each segment is represented by an individual net entity (net1, net2), and the whole path is called an “Extended Net” or Xnet.

CPU ———— series term resistors ———— DDR ICs

net1                                          net2

|——– Extended Nets (Xnets) ———|

Cadence handles Xnets very well, and the process is well documented for their high-end schematic/PCB layout and simulation tools, which are very costly. Xnets have to be created so that Constraint Manager can treat the whole path as a single trace length for matching. They can only be created by attaching a “signal model” to the discrete device (a resistor in our case), which is a very easy step in the front-to-back flow of the high-end Cadence tools. With the basic ConceptHDL and Allegro Performance tools, however, it took some fiddling to achieve the task. Although the Allegro Performance datasheet shows that Xnets are supported, getting Constraint Manager to recognize them was not easy. Assigning a “signal_model” property using the “Edit>Properties” menu was easy, but it didn’t create the expected Xnet, and there are no explicit instructions on how to do this in this mid-range tool. The trick was to go through “Setup Advisor” as far as “SI Model Assignments” and assign the “signal_model” property to the resistors there. This procedure worked: the Xnets were created, Constraint Manager recognized them, and length matching can now be handled with ease. In summary, and from experience with schematic/PCB software tools not just from Cadence, reading through the documentation is only the start. There’s a lot more to learn when applying the tools in real-life work.
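
The reason the Xnet matters for matching can be shown with a little arithmetic. The sketch below is plain C, not anything Constraint Manager runs, and the type, function names and lengths are made up for illustration. It assumes the matched length is simply net1 + net2 and ignores the resistor body and any pin delays.

```c
#include <assert.h>
#include <stddef.h>

/* One processor-to-DDR connection, split in two by its series resistor.
 * Lengths are in micrometres; any values used with this are illustrative. */
typedef struct {
    const char *name;
    long net1_um;   /* processor pin to resistor */
    long net2_um;   /* resistor to DDR pin */
} xnet;

/* The length that has to be matched is the sum of both segments. */
long xnet_length_um(const xnet *x)
{
    assert(x != NULL);
    return x->net1_um + x->net2_um;
}

/* Count the Xnets whose total length is further than tol_um from target_um. */
size_t count_mismatched(const xnet *nets, size_t count, long target_um, long tol_um)
{
    size_t bad = 0;
    for (size_t i = 0; i < count; i++) {
        long delta = xnet_length_um(&nets[i]) - target_um;
        if (delta < 0)
            delta = -delta;
        if (delta > tol_um)
            bad++;
    }
    return bad;
}
```

Two connections with very different net1 lengths can still be matched as long as their sums agree, which is exactly what cannot be expressed when the tool only sees net1 and net2 as unrelated nets.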

Automated Snapper CL15 Test System

Filed under: New Development — Tags: , , , , — ryan @ 9:28 pm

In preparation for manufacturing a large number of Snapper CL15 modules, we have been developing an automated test system for quickly and accurately finding assembly faults in Snapper CL15 modules.  The test system will be sent to the manufacturer so that the modules can be tested and, if necessary, repaired at the point of assembly.

In the past we have used our Autotester software, combined with a Rig 200 baseboard, for testing Snapper modules.  While this approach works reasonably well, it has a number of limitations which make it unsuitable for this task:

  • It is too interactive.  Many of the tests require user interaction or confirmation. For example, the audio test requires the user to confirm that a sound was played correctly, and the USB tests require the user to insert and remove a USB device.  For large builds this interaction becomes both tedious and error prone.
  • It cannot accurately find faults.  A failed test in the Autotester only tells the user which sub-system is faulty, but not where the specific problem is. For example, the video sub-system of the Snapper CL15 comprises more than 20 pins, but a failed video test in the Autotester does not give any information about which of these may be faulty.
  • The Rig 200 baseboard does not expose all of the features of the Snapper CL15 in an easy to test way.  While the major features of the Snapper CL15 are accessible on the Rig 200 baseboard, many of them require some additional hardware to actually test the functionality.

To solve these problems, we have developed the Snapper CL15 test jig.  The test jig is a standard Snapper CL15 baseboard which has been designed to fully automate the testing of all of the Snapper CL15 features. In the event of a failure, very specific information about the nature of the fault, often referencing a single pin, is given. The problem of requiring external hardware for testing some peripherals is solved by having an FPGA which can monitor and drive pins on the Snapper CL15.

The FPGA enables tests which previously required user interaction, such as video test, to be fully automated. For the video testing, the Snapper CL15 runs an application which configures a video mode and displays a test pattern.  The FPGA has registers which contain information such as the number of clocks per line and the sum of the pixel data in the frame. The software running on the Snapper CL15 can verify that this information is correct.
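
As a rough illustration of the checking step, here is a sketch in C. The register layout (fpga_video_regs), the ramp test pattern and the 16-bit pixel format are assumptions made for the example; the real jig's registers and pattern are not described here. The idea is only that the software knows what it drew, so it can compute the expected sum and compare it with what the FPGA saw on the pins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of what the FPGA measured for one frame. */
typedef struct {
    uint32_t clocks_per_line;
    uint32_t pixel_sum;        /* sum of all pixel values, modulo 2^32 */
} fpga_video_regs;

typedef enum { VIDEO_OK, VIDEO_BAD_TIMING, VIDEO_BAD_DATA } video_result;

/* Fill a frame with a simple diagonal ramp test pattern. */
void fill_test_pattern(uint16_t *frame, size_t width, size_t height)
{
    for (size_t y = 0; y < height; y++)
        for (size_t x = 0; x < width; x++)
            frame[y * width + x] = (uint16_t)((x + y) & 0xFFFFu);
}

/* Sum of all pixel values; unsigned arithmetic wraps modulo 2^32. */
uint32_t frame_sum(const uint16_t *frame, size_t width, size_t height)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < width * height; i++)
        sum += frame[i];
    return sum;
}

/* Compare the FPGA's measurements with what the software expects. */
video_result check_video(const fpga_video_regs *regs, uint32_t expected_clocks,
                         const uint16_t *frame, size_t width, size_t height)
{
    assert(regs != NULL && frame != NULL);
    if (regs->clocks_per_line != expected_clocks)
        return VIDEO_BAD_TIMING;
    if (regs->pixel_sum != frame_sum(frame, width, height))
        return VIDEO_BAD_DATA;
    return VIDEO_OK;
}
```

A data pin that is stuck or shorted changes the pixel values the FPGA receives, and so in general the measured sum, while a clock or sync problem shows up in the timing count. That split is what lets a failure be reported more specifically than "video is broken".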

Many of the test procedures use loop-backs so that no user interaction is required. For example, the audio tests, which in the Autotester setup required a user confirmation that sound played correctly, loop the line-out to the line-in via an analogue switch. After initially testing that the left and right capture channels are working correctly by feeding a 1kHz tone (generated by the baseboard) to them, the line-out and high-power out are tested by looping them back to the line-in and comparing the captured audio with what was played.
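
One simple way to check a captured tone without a human listening is to measure its frequency. The sketch below is an assumption about how such a check could be done, not a description of the jig's actual comparison: it times rising zero crossings in signed 16-bit samples, and the function names and tolerance are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Estimate the frequency of a captured tone by timing rising zero crossings.
 * Returns 0 if fewer than two crossings were seen (e.g. a silent channel). */
uint32_t estimate_tone_hz(const int16_t *samples, size_t count, uint32_t sample_rate_hz)
{
    size_t first = 0, last = 0, crossings = 0;

    assert(samples != NULL);
    for (size_t i = 1; i < count; i++) {
        if (samples[i - 1] < 0 && samples[i] >= 0) {
            if (crossings == 0)
                first = i;
            last = i;
            crossings++;
        }
    }
    if (crossings < 2)
        return 0;
    /* (crossings - 1) whole periods span (last - first) samples. */
    return (uint32_t)(((uint64_t)(crossings - 1) * sample_rate_hz) / (last - first));
}

/* Non-zero if the measured tone is within tolerance_hz of expected_hz. */
int tone_matches(const int16_t *samples, size_t count, uint32_t sample_rate_hz,
                 uint32_t expected_hz, uint32_t tolerance_hz)
{
    uint32_t hz = estimate_tone_hz(samples, count, sample_rate_hz);
    uint32_t diff = hz > expected_hz ? hz - expected_hz : expected_hz - hz;
    return hz != 0 && diff <= tolerance_hz;
}
```

A dead capture channel returns 0 and fails immediately. A real test would also want to check amplitude and, for the looped-back outputs, compare against the played signal rather than a fixed frequency.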

The test jig system will be able to fully test a Snapper CL15 module in around 2-3 minutes, with no user interaction other than inserting the Snapper CL15 module and checking the result of the test. Because the information about failures is highly specific, it enables the manufacturer to quickly find and correct any assembly faults with the modules.

November 5, 2008

More On Sky Challenge

Filed under: Uncategorized — admin @ 7:24 pm

As you know, Bluewater recently participated in Sky Challenge, developing the heads-up display system which the pilot used for flying the course, and co-developing a microwave communication link which sends the positional data for the aircraft to the ground.

Here are a few more links that we like that show you what Sky Challenge is all about.

http://news.bbc.co.uk/1/hi/technology/7633110.stm

http://news.bbc.co.uk/2/hi/technology/7651327.stm

http://www.astrofiammante.net/blog/sky-challenge-better-than-the-nintendo-wii-post279/

ARM Tools Seminar

Filed under: Uncategorized — Tags: — Amanda @ 7:00 pm

Bluewater is hosting our first ARM Tools Seminar in Christchurch on Thursday 27th November 2008 and we’d like for you to join us!

The seminar will cover topics such as the sales of ARM Development Tools, the benefits and any new features of those tools, as well as new features that have been added to the existing line of available ARM chips. As part of the afternoon’s discussions, Simon will be presenting information on the current state of ARM technology.

If you would like to attend our Christchurch seminar, please RSVP your full name, as well as the names of all staff members who would like to attend, and your company name by the 14th November 2008.

Date: Thursday 27th November 2008
Time: 9:00am - 2:00pm, lunch and refreshments will be provided
Venue: To be determined (details will follow on location information)
RSVP: Friday 14th November 2008
Contact: Amanda Gardner on (03) 377 9127 x202
or amanda@bluewatersys.com

In addition, further ARM Tools seminars will be held in Auckland and Wellington, dates to be determined.

If you would like any additional information, please do not hesitate to contact me on the details provided above.  Also, please reach out and register your interest if you would like to attend a seminar that is to be held in Auckland or Wellington.

October 21, 2008

Fun with C

Filed under: Uncategorized — theuns @ 10:29 pm

A few fun things you can do with C structures:

1. Bitfields

struct {
    uint32_t    year    : 7,
                month   : 4,
                day     : 5,
                        : 5,
                hour    : 5,
                min     : 6;
} compact_date;

Notes:
1. Turns out that C99 defines fixed-size integer types (in <stdint.h>): no more need to define them yourself.
2. The order in which bits are placed into a bitfield (MSB or LSB first) is not standardised - which is ok if you’re only on a single platform, but may get messy otherwise.
3. To leave reserved blank spaces, just leave them blank.
4. If you need fewer bits, just use fewer.

This structure can store a date (up to 127 years) using only 4 bytes (as opposed to the naive implementation which would require 5 or more). Not a massive win, but convenient when alignment is a concern.
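
If the packed value has to leave the machine (a file, a network packet), the bit-ordering caveat in note 2 starts to matter. A common alternative is to do the packing by hand with shifts and masks. The sketch below uses the same field widths as compact_date, but the choice of putting year in the low bits, and all the names, are ours for the sake of the example.

```c
#include <assert.h>
#include <stdint.h>

/* Same field widths as compact_date, with an explicitly chosen layout so the
 * 32-bit value is identical on every compiler. Bits 16..20 are reserved. */
#define YEAR_SHIFT   0   /* 7 bits */
#define MONTH_SHIFT  7   /* 4 bits */
#define DAY_SHIFT   11   /* 5 bits */
#define HOUR_SHIFT  21   /* 5 bits */
#define MIN_SHIFT   26   /* 6 bits */

/* Pack the fields into one 32-bit word; each value must fit its field. */
uint32_t pack_date(unsigned year, unsigned month, unsigned day,
                   unsigned hour, unsigned min)
{
    assert(year < 128 && month < 16 && day < 32 && hour < 32 && min < 64);
    return ((uint32_t)year  << YEAR_SHIFT)  |
           ((uint32_t)month << MONTH_SHIFT) |
           ((uint32_t)day   << DAY_SHIFT)   |
           ((uint32_t)hour  << HOUR_SHIFT)  |
           ((uint32_t)min   << MIN_SHIFT);
}

/* Field accessors: shift the field down, then mask to its width. */
unsigned date_year(uint32_t d)  { return (d >> YEAR_SHIFT)  & 0x7Fu; }
unsigned date_month(uint32_t d) { return (d >> MONTH_SHIFT) & 0x0Fu; }
unsigned date_day(uint32_t d)   { return (d >> DAY_SHIFT)   & 0x1Fu; }
unsigned date_hour(uint32_t d)  { return (d >> HOUR_SHIFT)  & 0x1Fu; }
unsigned date_min(uint32_t d)   { return (d >> MIN_SHIFT)   & 0x3Fu; }
```

It is more typing than the bitfield version, but the layout is now stated in the code rather than left to the compiler.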

2. Unions

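union {
    struct {
        uint16_t    isDate  : 1,
                    month   : 4,
                    day     : 5;
    };
    struct {
        uint16_t            : 1,
                    hour    : 5,
                    min     : 6;
    };
} datetime;

This little beauty allows overlapping bitfield definitions: depending on the value of isDate, you can access either month/day or hour:min inside the same compact storage area. Note that each group of bitfields is wrapped in its own struct (anonymous structs are standard from C11 and a common extension before that): members placed directly in a union all start at the same position, so without the wrappers every field would overlap isDate.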

3. Head and trailer structures

typedef struct {
    uint16_t    type;
    uint16_t    len;
    uint8_t     data[1];
} tlv;

This is a fairly common construct but still rather useful. Because C doesn’t check array bounds, you can readily access additional data[] elements - the only trick becomes handling sizeof() operations cleanly. Note the typedef: the expressions that follow use tlv as a type name, and offsetof() comes from <stddef.h>.

To get the size of the header (only):

offsetof(tlv, data)

To get the size of the complete structure:

offsetof(tlv, data) + sizeof(((tlv*)0)->data[0]) * data_count
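
C99 also has a purpose-built version of this trick, the flexible array member, where the trailing array is declared with no size at all. The sketch below shows it with a small allocation helper; tlv_flex, tlv_new and tlv_total_size are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: data[] adds no elements to sizeof(tlv_flex). */
typedef struct {
    uint16_t type;
    uint16_t len;
    uint8_t  data[];
} tlv_flex;

/* Allocate a TLV holding a copy of len payload bytes. The caller frees it. */
tlv_flex *tlv_new(uint16_t type, const void *payload, uint16_t len)
{
    tlv_flex *t = malloc(sizeof *t + len);
    if (t == NULL)
        return NULL;
    t->type = type;
    t->len = len;
    if (len != 0)
        memcpy(t->data, payload, len);
    return t;
}

/* Header plus payload, in bytes. */
size_t tlv_total_size(const tlv_flex *t)
{
    assert(t != NULL);
    return sizeof *t + t->len;
}
```

With this form there is no "- 1" to remember, because the structure no longer contains a placeholder element.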

4. And finally, just for fun:

#define STDIO "stdio.h"
#define STRING <string.h>
#include STDIO
#include STRING

int main (int argc, char** argv) {
  printf("Don't let me catch you doing this!\n");
}

October 20, 2008

Bayer Filtering with RealView versus GCC

Filed under: ARM Tools News — Tags: , , , — sglass @ 3:50 pm

For our Big-Eye project (a 3.1 Megapixel network camera for security and remote monitoring applications), we needed an efficient software algorithm for converting the Bayer-format data used by the image sensor to RGB data suitable for sending across the network / JPEG compression. A simple C algorithm was coded which did the job. However, it was very slow - about 1.7 seconds to convert a single frame. This was using GCC 4.1.1.

The code was fairly well written, using word operations to move through the three lines of the image, two pixels at a time. It was not looking good.
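
For readers who have not met Bayer data before, here is about the simplest possible conversion, written as a sketch in C. It is not the Big-Eye algorithm (which works on three lines at a time with word operations): it assumes an RGGB pattern with one byte per pixel and simply turns each 2x2 cell into one colour, which costs resolution but shows what the conversion has to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert an RGGB Bayer image (one byte per pixel) to packed RGB888.
 * width and height must be even. Each 2x2 cell
 *     R G
 *     G B
 * produces one colour that is written to all four output pixels. */
void bayer_rggb_to_rgb(const uint8_t *bayer, uint8_t *rgb,
                       size_t width, size_t height)
{
    assert(bayer != NULL && rgb != NULL);
    assert(width % 2 == 0 && height % 2 == 0);

    for (size_t y = 0; y < height; y += 2) {
        const uint8_t *row0 = bayer + y * width;
        const uint8_t *row1 = row0 + width;
        for (size_t x = 0; x < width; x += 2) {
            uint8_t r = row0[x];
            /* Average the two green samples, rounding to nearest. */
            uint8_t g = (uint8_t)((row0[x + 1] + row1[x] + 1) / 2);
            uint8_t b = row1[x + 1];
            for (size_t dy = 0; dy < 2; dy++) {
                for (size_t dx = 0; dx < 2; dx++) {
                    uint8_t *out = rgb + ((y + dy) * width + (x + dx)) * 3;
                    out[0] = r;
                    out[1] = g;
                    out[2] = b;
                }
            }
        }
    }
}
```

Even in this toy version the inner loop keeps several pointers and pixel values live at once, which gives some idea of why register allocation matters so much for this kind of code.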

After spending a bit of time trying a few basic optimisation techniques, we decided to try the RealView compiler. This now supports most of the GNU options and it is fairly easy to build software using it. The result was pretty astounding. With no code changes, RealView produced an execution time of around 0.3 seconds!

A bit of additional fiddling suggested that further improvements might be possible. The main difference seems to be that RealView makes much better use of registers, and thus needs a lot less load/store on the stack.

This is probably a common feature of many image processing algorithms. If written with efficiency in mind, they will push the register set quite hard. Much assembler optimisation relies on using the register set more efficiently. But if the compiler can do it for you, so much the better.

We could easily have spent a few days working on this part of the code, and with RealView it was just a re-compile.

In the end we have moved all the code over to RealView and this is now the standard build environment.

October 5, 2008

Sky Challenge Participation

Filed under: Uncategorized — ryan @ 2:14 pm

Bluewater Systems participated in a project called Sky Challenge which allows high performance aerobatic aircraft to race through virtual courses.  Using high precision GPS and inertial systems the aircraft’s position is sent to a heads up display, which displays a series of objects for the pilot to fly through.  Spectators on the ground were able to watch the actual race in the sky, and a real time computer enhanced version which showed the virtual objects.

Bluewater Systems’ part in the project was the development of a heads-up display system which the pilot used for flying the course, and co-development of a microwave communication link which sends the positional data for the aircraft to the ground.  The heads-up display software was developed by working closely with a number of aerobatic pilots.  The main goals of the software were to provide the pilots with as much information as possible, without cluttering up the screen.

Since Sky Challenge was open to an invite-only audience, have a look at what we were able to experience firsthand.

http://news.bbc.co.uk/2/hi/technology/7651500.stm

September 28, 2008

RVDS vs MDK

Filed under: ARM Tools News — Tags: , , , , , — andre @ 5:05 pm

As an ARM authorised reseller, we tend to use the ARM RealView tools (RVDS) whenever appropriate. With ARM’s recent acquisition of Keil there is now another option available, the Microcontroller Development Kit - MDK.

MDK is aimed at the lower-end market - ARM7, Cortex-M3, and some ARM9. It is also targeted towards either OS-less or RTX-based designs. RTX is ARM’s simple realtime OS. It includes all of the features normally supplied with a small embedded OS: threads, a scheduler, mutexes, memory pools, mailboxes, delays, events etc. When the RTX kernel and the MDK tools are used together, a large number of advanced debugging options become available, such as the ability to graph thread stack usage and thread execution time, which makes some aspects of profiling an application trivial. MDK also provides a full simulation system for a large number of CPUs (see the Keil Device List for more details). These simulated CPUs generally include the full peripheral suite as well, which makes prototype development easy, even in the absence of actual hardware. MDK also includes a full IDE, build system and large library of examples, making initial project creation a breeze.

RVDS is targeted at a different market. It aims to support the latest ARM CPUs, with recent additions for the Cortex-A9. Its simulation model is geared more towards ARM cores rather than specific SOC chips, and as such it is not suitable for full system simulation. It can be used very well for algorithm simulation however, and provides all the optimisation feedback necessary for this purpose.

Both of these platforms share the same compiler suite, RVCT. RVCT generally provides the best code generation and optimisation features of all the ARM compilers, certainly far better than the commonly used GCC compiler. It also supports a lot of the extensions provided by other tool chains, making it relatively easy to transition from one compiler to another.

Generally, the decision between the two is quite apparent - if a more traditional embedded design is being done, such as an embedded control application, then a lower-end CPU, such as the Cortex-M3 or ARM7, will be chosen. In this case, MDK is more appropriate. If a more advanced design is to be used, involving a complex modern operating system such as Linux or WinCE and an ARM9 or greater CPU, then RVDS will provide faster code download & flexibility.

