Having just finished working on the UI for the YouView IPTV Set Top Box, I thought I’d share some of my insights into the best practices when building applications for such resource constrained devices. The YouView UI is AIR based, written in AS3 and runs in Stagecraft 2 (AKA AIR for TV). As the name suggests, AIR for TV is a special version of the AIR runtime for embedded systems, such as Set Top Boxes. The first incarnation of the YouView UI (back when it was code-named ‘Project Canvas’) was for Stagecraft version 1, which means coding in AS2 and suffering the abysmal performance that comes with running ActionScript Virtual Machine 1.
Despite the delays and the need to code the UI from scratch in AS3, I think it was ultimately the right decision. Stagecraft 2 is a much better platform – Stagecraft 2.5.1 to be precise. It was a great opportunity to learn how to write optimal code and use hardware acceleration effectively on a resource constrained device. Regardless of which technology you’re using, here are some key things to be aware of when developing for such platforms:
- Limit your pre-composite calculations
In AIR/Stagecraft we’re talking about limiting display list hierarchy complexity; in HTML5 we’re talking about reducing DOM complexity. Stagecraft (or whatever display engine you’re using) needs to traverse the display list (or DOM), working out which areas of the screen to redraw. This is somewhat similar to how the desktop Flash Player handles redraws, but with some key differences in how it decides what needs redrawing, how it tackles moving display objects and how it delegates the work of updating the frame buffer – a subject for another time. Most importantly, if you’re developing for a resource constrained device (such as a mobile or Set Top Box), you’ll have very limited CPU power, even if the device’s GPU (Graphics Processing Unit) affords you great hardware acceleration capabilities. So, before you can delegate any graphics compositing work to hardware, you must enumerate changes in the display list in software. Complex display lists are a headache for the low-powered CPUs found in mobiles and Set Top Boxes, and this will show up as rocketing CPU usage, low framerates and few spare work cycles – AKA ‘DoPlays’ in Stagecraft. By keeping your display list shallow, with only the bare minimum of display objects on stage at any one time, you’ll be making life easier for the CPU – whether the graphics are ultimately drawn in software or hardware.
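A minimal sketch of the idea (the class and method names here are illustrative, not from the YouView codebase): rather than toggling visible, take off-screen display objects out of the list altogether, so the traversal the CPU has to do stays short.

```actionscript
package
{
    import flash.display.Sprite;

    // Illustrative view that keeps only visible renderers on the display list.
    public class CarouselView extends Sprite
    {
        // Off-screen renderers are retained here, out of the display list walk.
        private var _spare:Vector.<Sprite> = new Vector.<Sprite>();

        public function showItem(renderer:Sprite):void
        {
            addChild(renderer); // a child only while it's actually visible
        }

        public function hideItem(renderer:Sprite):void
        {
            // removeChild() takes it out of the traversal entirely;
            // visible = false would still leave it in the display list.
            removeChild(renderer);
            _spare.push(renderer);
        }
    }
}
```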
- Benchmark everything
When building an application for a resource constrained device, you should be able to run each component in isolation to assess its drain on CPU and system/video memory. There’s no point optimising the hell out of one component when another is actually the source of your performance bottleneck.
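A bare-bones probe you can drop next to a component under test might look like this (a sketch; note that System.totalMemory reports the ActionScript heap, not video memory):

```actionscript
package
{
    import flash.display.Sprite;
    import flash.events.Event;
    import flash.system.System;
    import flash.utils.getTimer;

    // Traces frames-per-second and heap usage once a second.
    public class PerfProbe extends Sprite
    {
        private var _frames:int = 0;
        private var _last:int = 0;

        public function PerfProbe()
        {
            addEventListener(Event.ENTER_FRAME, onFrame);
        }

        private function onFrame(e:Event):void
        {
            _frames++;
            var now:int = getTimer();
            if (now - _last >= 1000)
            {
                trace("fps:", _frames, "mem:", System.totalMemory / 1024, "KB");
                _frames = 0;
                _last = now;
            }
        }
    }
}
```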
- Know thine hardware acceleration capabilities
There’s no point blindly using cacheAsBitmap and cacheAsBitmapMatrix everywhere if it’s not going to speed things up on the target device. Worse still, use too many cacheAsBitmaps and you may just be wasting valuable video memory, or causing unnecessary redraws (again, the subject of a future article). A lot of platforms will accelerate bitmaps, even if stretched, but not necessarily if flipped or rotated. Alpha on bitmaps (or anything cached as a bitmap) will usually be accelerated too, but this is not necessarily the case with all colour transforms. Benchmarking any component you’re building will quickly tell you where you might have pushed it too far, but you should also have a way of verifying that a particular set of transforms is indeed hardware accelerated. Stagecraft provides this with its --showblit command line parameter. I’ll be going into more detail about this in another post.
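In code, the pairing looks like this (a sketch – the panel-building helper is illustrative, and whether the transforms are actually accelerated is something to verify on the target device):

```actionscript
import flash.display.Sprite;
import flash.geom.Matrix;

// Assumed helper that builds a vector-heavy panel worth caching.
var panel:Sprite = buildComplexVectorPanel();

// Cache the rendered vectors as a bitmap surface.
panel.cacheAsBitmap = true;

// Without this, scaling/rotating the panel forces the cache to re-render.
// With it, the cached surface itself is transformed -- but confirm the
// device accelerates rotated/flipped bitmaps before relying on it.
panel.cacheAsBitmapMatrix = new Matrix();
```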
- Mind your memory
When using various hardware acceleration tricks, especially on resource constrained devices, video memory is at a premium and usually in limited supply. You will need to know the limits and have a way of seeing how much video memory your application is using at any one time – ensuring you dispose and dereference any bitmaps you’re finished with too. If your platform uses DirectFB for its rendering, as YouView does, the executable ‘dfdump’ can show you just where your video memory is going. This is something else I’ll get into in another article.
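Disposal is worth doing deterministically when a view is torn down, rather than leaving it to the garbage collector – a sketch:

```actionscript
import flash.display.Bitmap;
import flash.display.BitmapData;

// Tear down a Bitmap and release its pixel memory immediately.
function destroyThumbnail(bitmap:Bitmap):void
{
    var data:BitmapData = bitmap.bitmapData;
    bitmap.bitmapData = null;           // dereference from the display object

    if (bitmap.parent)
    {
        bitmap.parent.removeChild(bitmap);
    }

    if (data)
    {
        data.dispose();                 // frees the pixels now, rather than
    }                                   // whenever the GC gets around to it
}
```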
- Blit blit blit
This refers to blitting, where blocks of pixels are copied from one bitmap to another. The technique is used a lot in games, where graphics performance is critical. Arm yourself with the basics of how old video games blitted multiple graphics into a single bitmap for performance and video memory efficiency.
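The classic pattern is BitmapData.copyPixels – stamping frames from a sprite sheet onto one canvas, instead of keeping many display objects alive (a sketch; the sprite sheet variable is assumed to be loaded elsewhere):

```actionscript
import flash.display.BitmapData;
import flash.geom.Point;
import flash.geom.Rectangle;

// One canvas for the whole scene.
var canvas:BitmapData = new BitmapData(1280, 720, false, 0x000000);

// Assumed: a previously loaded sheet of 64x64 sprite frames in a strip.
var sheet:BitmapData = loadedSpriteSheet;

// Blit frame 3 of the strip onto the canvas at (100, 50).
var frame:Rectangle = new Rectangle(3 * 64, 0, 64, 64);
canvas.copyPixels(sheet, frame, new Point(100, 50));
```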
I’ll probably go into more depth on each of these things in forthcoming posts. Stay tuned…
See also: Bitmap ‘folding’ trick
A common oversight when using Bitmaps with loaded content is that Flash will revert a Bitmap’s smoothing property to false when you replace its bitmapData. It’s simple enough to fix, but since you may not know whether someone is going to replace the bitmapData of a Bitmap you have created, it’s often better to code defensively around it.
This little SmoothBitmap class is for just such an occasion. Instantiate it like a regular Bitmap and, no matter what another developer does with it, pixels will remain smooth when scaling/rotating.
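The gist of the technique is a one-method override – re-assert smoothing whenever the bitmapData is swapped (a minimal sketch; the actual class may differ in detail):

```actionscript
package
{
    import flash.display.Bitmap;
    import flash.display.BitmapData;

    // A Bitmap whose smoothing survives bitmapData replacement.
    public class SmoothBitmap extends Bitmap
    {
        public function SmoothBitmap(bitmapData:BitmapData = null)
        {
            super(bitmapData, "auto", true);
        }

        override public function set bitmapData(value:BitmapData):void
        {
            super.bitmapData = value;
            smoothing = true; // undo Flash's reset to false
        }
    }
}
```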
I was recently creating an API that required extending TextField and happened across the getRawText() method. I assumed this returned the text from the field without formatting or something – so I looked up the AS3 docs for flash.text.TextField.
Nothing there – gee thanks Adobe. A quick search turned up this which, it turns out, isn’t quite accurate.
So, with a tad of testing, it appears that getRawText() returns the text, stripped of any HTML tags (if you had set htmlText). I now wonder whether this is faster than using a RegEx to strip the tags – and why Adobe didn’t document it.
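For illustration, the behaviour described above boils down to this (a sketch – getRawText() is undocumented, so verify its output on your own runtime before depending on it):

```actionscript
import flash.text.TextField;

var field:TextField = new TextField();
field.htmlText = "<b>Hello</b> <i>world</i>";

trace(field.text);         // the plain text, "Hello world"
trace(field.htmlText);     // full markup, including tags Flash injects
trace(field.getRawText()); // reportedly: the text stripped of HTML tags
```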
A couple of years ago, I created an object pooling utility for a games project I was building in AS3. Since then, I’ve used it quite a few times, in order to speed up apps and improve resource management, easing the load on the garbage collector by reusing objects instead of recreating them.
While object pooling isn’t a magic bullet to speed up every use case, it works especially well on things that are heavy to continually construct and destroy. A good example is my History of the World project, which uses an object pool for item renderers, instead of creating and destroying them as you navigate around – press ALT+CTRL to bring up the resource debugger, which shows a little information on its usage.
I recently updated the utility, improving its performance, adding features and putting loads of unit tests around it. It’s now hosted over at GitHub. Using it is as simple as:
var pool:LoanShark = new LoanShark(SomeClass);
var someInstance:SomeClass = pool.borrowObject();
// Instead of nullifying an object, check it back into the pool
I thought I’d give a quick insight into how the animation effects in one of my projects were achieved.
Scott Bedford, former Creative Director at Carlson Marketing, posted this video of a project we worked on a while back, for the Lurpak Breakfast campaign. I created all the animation prototypes for the various effects used throughout the site, some of which can be seen here. The site won two DMA awards, but I’m most proud of the crumbs animation and the code-generated interactive steam effect – similar to the one you’ll see on my homepage.
Continue reading Lurpak Breakfast – how it was done
I had a conversation yesterday with a friend and colleague about how his company should standardise their development environment for all Flashers – be they contract or permanent, junior or senior. Continue reading How the hell do I build this?
In a simple situation, where you wish to add many elements to an Array or Vector, you might just do:
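Something along these lines – shifting each element off the input and pushing it onto the output, so both Arrays change size on every iteration (a reconstruction; the exact original snippet is implied by the surrounding text):

```actionscript
// Naive: input shrinks and output grows on every single iteration.
while (input.length > 0)
{
    output.push(input.shift());
}
```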
However, the sizes of both Arrays are manipulated on every iteration of the loop, which has an adverse impact on speed and memory usage. So, we could cache the length of the input Array and avoid manipulating it:
var inputLength:uint = input.length;
for (var i:int = 0; i < inputLength; i++)
{
    output.push(input[i]);
}
But we’re still growing the size of the output Array incrementally, which is very bad. Since we know input.length in advance, we could grow the output Array to its new size just once, before the loop:
var inputLength:uint = input.length;
var outputLength:uint = output.length;
var newOutputLength:uint = outputLength + inputLength;
output.length = newOutputLength;
for (var i:int = 0; i < inputLength; i++)
{
    output[i + outputLength] = input[i];
}
This is OK, but still involves a loop. If only we could pass multiple elements to the push method in one go. Well, we can – enter the apply method. Since Array.push accepts multiple arguments (something rarely used) and apply allows us to pass an Array of arguments to any Function, one line and we’re done:
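The one-liner in question – apply spreads the input Array across push’s rest arguments:

```actionscript
// Append every element of input to output in a single call.
output.push.apply(output, input);
```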
This works out faster and more memory efficient than the other methods. It works nicely for Vectors, too. If anyone has a faster method of doing this, do let me know.
If you’re still churning out Flash banners, please use this!
I created this simple utility, called SWFIdle, to enable the Flash Player to lower its CPU usage while the user is not interacting with it. Since it’s possible to have multiple Flash instances embedded in one page (for example, a game and a couple of banners), I recommend that everyone uses this in their projects, so that players needn’t fight for CPU and give Flash a worse name than it has already.
I know there’s the hasPriority embed attribute now. But:
- That assumes you have access to the HTML that embeds your SWF
- If no other players are present, it has no effect
- There’s still usually little reason to be running your SWF at a high framerate if the user isn’t interacting with it
- Flash banners with wastefully unoptimised drawing routines are probably one of the key reasons that Flash got poo-pooed off of mobile platforms and disabled on everyone’s laptops – CPU usage = battery usage!
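The core idea behind this kind of throttling can be sketched in a few lines (illustrative only, not the actual SWFIdle source – the constants and timings are assumptions):

```actionscript
// Timeline-style sketch: drop the frame rate after a period of no
// interaction, and restore it the moment the mouse moves again.
import flash.events.Event;
import flash.events.MouseEvent;
import flash.utils.getTimer;

const ACTIVE_FPS:Number = 30;
const IDLE_FPS:Number = 2;
const IDLE_AFTER_MS:int = 3000;

var lastActivity:int = getTimer();

stage.addEventListener(MouseEvent.MOUSE_MOVE, function(e:MouseEvent):void
{
    lastActivity = getTimer();
    stage.frameRate = ACTIVE_FPS; // user is back: full speed
});

stage.addEventListener(Event.ENTER_FRAME, function(e:Event):void
{
    if (getTimer() - lastActivity > IDLE_AFTER_MS)
    {
        stage.frameRate = IDLE_FPS; // idle: ease off the CPU
    }
});
```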