Here are profiler screenshots from the same scene, first with NGUI 3.0.4, then with 3.0.6 f3.
There were other changes between those builds besides the NGUI version, e.g. the spikes you see in 3.0.4 are gone because of those.
But AFAIK the CPU time difference (2.32 -> 1.22) and the absence of memory allocations should be directly attributable to NGUI optimizations.