NEON shortcut for flat colour blending into 16-bit

This is a shortcut for the needs descriptor
00000077:03515104_00000000_00000000.  It requires blending a single 32-bit
colour value into a 16-bit framebuffer.
It's used when fading out the screen, e.g. when a modal requester pops up.
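
For reference, the operation can be sketched in plain C as below. This is
only an illustration, not the code added by this patch: the function name,
the 0xAABBGGRR premultiplied-alpha colour layout, and the exact rounding are
assumptions made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (name and signature are illustrative).
 * Blends one 32-bit colour over `count` RGB565 pixels.
 * Assumed colour layout: 0xAABBGGRR with premultiplied alpha. */
static void blend_colour_into_rgb565(uint16_t *dst, uint32_t colour,
                                     size_t count)
{
    uint32_t a  = colour >> 24;
    uint32_t sr = colour & 0xff;
    uint32_t sg = (colour >> 8) & 0xff;
    uint32_t sb = (colour >> 16) & 0xff;

    /* Map alpha 0..255 to 0..256, then take the inverse for the dst term. */
    uint32_t inv = 256 - (a + (a >> 7));

    for (size_t i = 0; i < count; i++) {
        uint32_t d  = dst[i];
        uint32_t dr = d >> 11;          /* 5 bits */
        uint32_t dg = (d >> 5) & 0x3f;  /* 6 bits */
        uint32_t db = d & 0x1f;         /* 5 bits */

        /* Both terms carry 8 fractional bits: dst * inv, and the 8-bit
         * source component rescaled to the 5- or 6-bit channel width. */
        uint32_t r = (dr * inv + (sr << 5)) >> 8;
        uint32_t g = (dg * inv + (sg << 6)) >> 8;
        uint32_t b = (db * inv + (sb << 5)) >> 8;

        /* Clamp in case the input colour was not premultiplied. */
        if (r > 0x1f) r = 0x1f;
        if (g > 0x3f) g = 0x3f;
        if (b > 0x1f) b = 0x1f;

        dst[i] = (uint16_t)((r << 11) | (g << 5) | b);
    }
}
```

The colour-dependent values (the source components and the inverse alpha) are
computed once outside the loop, so the per-pixel work is limited to
unpacking, three multiply-adds, and repacking.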

The PF JIT produces code for this using 24 instructions/pixel. The NEON
implementation requires 2.1 instructions/pixel. Performance hasn't been
benchmarked, but the improvement is quite visible.

This code has only been tested by inspection of the fading effect described
above, when pressing and holding a finger on the home screen to pop up the
Shortcuts/Widgets/Folders/Wallpaper requester.

Along with the NEON version, a fallback v5TE implementation is also provided.

This ARM version of col32cb16blend is not fully optimised, but is a reasonable
implementation, and better than the version produced by the JIT. It is
provided as a fallback for when NEON is not available.
diff --git a/libpixelflinger/Android.mk b/libpixelflinger/Android.mk
index 0cc85d9..6491d24 100644
--- a/libpixelflinger/Android.mk
+++ b/libpixelflinger/Android.mk
@@ -40,7 +40,13 @@
 	buffer.cpp
 
 ifeq ($(TARGET_ARCH),arm)
+ifeq ($(TARGET_ARCH_VERSION),armv7-a)
+PIXELFLINGER_SRC_FILES += col32cb16blend_neon.S
+PIXELFLINGER_SRC_FILES += col32cb16blend.S
+else
 PIXELFLINGER_SRC_FILES += t32cb16blend.S
+PIXELFLINGER_SRC_FILES += col32cb16blend.S
+endif
 endif
 
 ifeq ($(TARGET_ARCH),arm)