So, I've taken a closer look at this, and I think the performance changes you are observing are mostly a consequence of how NGUI's internals changed from 2.7 to 3+.
Analysis
I tested in the Editor (5.0.1p2) on "Example 14 - Endless Scroll" by making about 200 entries in the middle green scroll list. While dragging or scrolling, UIPanel.LateUpdate balloons up to ~65% CPU usage, as you also saw in your example, newlife. I'm testing without culling enabled on either UIPanel or UIWrapContent.
Using the deep profiler, we can dig a little deeper to see what could be causing this. Note, as always, that the numbers will be skewed as the deep profiler adds overhead to each method, which will also trigger a bit more garbage collection etc. That said, the picture looks very similar to the previous one.
6.5 ms to UIPanel.UpdateWidgets
4.8 ms to UIPanel.FillDrawCall
Looking deeper into UpdateWidgets we see this:
ApplyTransform seems to take the largest chunk of the work, UpdateGeometry itself second and UpdateGeometry and UpdateTransform about even. (Caveat: You see the List.Add and List.Clear instead of BetterList because I experimented with switching back to the regular list to see if there was a difference - there's not really.)
Let's take a look at the code for these methods:
void UpdateWidgets()
{
bool changed = false;
bool forceVisible = false;
bool clipped = hasCumulativeClipping;
if (!cullWhileDragging)
{
for (int i = 0; i < UIScrollView.list.size; ++i)
{
UIScrollView sv = UIScrollView.list[i];
if (sv.panel == this && sv.isDragging) forceVisible = true;
}
}
if (mForced != forceVisible)
{
mForced = forceVisible;
mResized = true;
}
// Update all widgets
for (int i = 0, imax = widgets.Count; i < imax; ++i)
{
UIWidget w = widgets[i];
// If the widget is visible, update it
if (w.panel == this && w.enabled)
{
int frame = Time.frameCount;
// First update the widget's transform
if (w.UpdateTransform(frame) || mResized)
{
// Only proceed to checking the widget's visibility if it actually moved
bool vis = forceVisible || (w.CalculateCumulativeAlpha(frame) > 0.001f);
w.UpdateVisibility(vis, forceVisible || ((clipped || w.hideIfOffScreen) ? IsVisible(w) : true));
}
// Update the widget's geometry if necessary
if (w.UpdateGeometry(frame))
{
changed = true;
//Debug.Log("Geometry changed: " + w.name + " " + frame, w);
if (!mRebuild)
{
// Find an existing draw call, if possible
if (w.drawCall != null) w.drawCall.isDirty = true;
else FindDrawCall(w);
}
}
}
}
// Inform the changed event listeners
if (changed && onGeometryUpdated != null) onGeometryUpdated();
mResized = false;
}
For each widget drawn, it calls UpdateTransform and UpdateGeometry once, consistent with the deep profile. There seem to be no immediate "bad things"™ going on, unless some of the properties have side effects. There are minor things we can do to improve it, like moving the int frame = Time.frameCount outside the loop (assuming Time.frameCount has some overhead) and checking which of (w.panel == this) and (w.enabled) is heavier, since the second is not evaluated if the first one fails. We can also simplify the ternary conditional ( ?: ) to save a conditional.
w.UpdateVisibility(vis, forceVisible || ((!clipped && !w.hideIfOffScreen) || IsVisible(w)));
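As a quick sanity check that the rewritten expression really is equivalent to the ternary, here is a minimal sketch that compares both forms over every input combination. The class and method names (VisibilityCheck, Original, Simplified, FormsAgree) are made up for illustration, and IsVisible(w) is replaced by a plain bool, so this only checks the boolean logic and not any side effects of the real call.

```csharp
using System;

// Hypothetical helper, not part of NGUI: compares the two boolean forms.
public static class VisibilityCheck
{
    // The expression as it appears in UIPanel.UpdateWidgets.
    public static bool Original(bool forceVisible, bool clipped, bool hideIfOffScreen, bool isVisible)
    {
        return forceVisible || ((clipped || hideIfOffScreen) ? isVisible : true);
    }

    // The simplified expression without the ternary.
    public static bool Simplified(bool forceVisible, bool clipped, bool hideIfOffScreen, bool isVisible)
    {
        return forceVisible || ((!clipped && !hideIfOffScreen) || isVisible);
    }

    // Returns true when both forms agree on all 16 input combinations.
    public static bool FormsAgree()
    {
        for (int bits = 0; bits < 16; ++bits)
        {
            bool f = (bits & 1) != 0;
            bool c = (bits & 2) != 0;
            bool h = (bits & 4) != 0;
            bool v = (bits & 8) != 0;
            if (Original(f, c, h, v) != Simplified(f, c, h, v)) return false;
        }
        return true;
    }
}
```

Both forms short-circuit in the same places, so the real IsVisible(w) call would be skipped in the same cases either way.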
Not much else to do in that method, from my perspective. So we move on to UIWidget.UpdateTransform (following the method structure).
public bool UpdateTransform (int frame)
{
Transform trans = cachedTransform;
mPlayMode = Application.isPlaying;
#if UNITY_EDITOR
if (mMoved || !mPlayMode)
#else
if (mMoved)
#endif
{
mMoved = true;
mMatrixFrame = -1;
trans.hasChanged = false;
Vector2 offset = pivotOffset;
float x0 = -offset.x * mWidth;
float y0 = -offset.y * mHeight;
float x1 = x0 + mWidth;
float y1 = y0 + mHeight;
mOldV0 = panel.worldToLocal.MultiplyPoint3x4(trans.TransformPoint(x0, y0, 0f));
mOldV1 = panel.worldToLocal.MultiplyPoint3x4(trans.TransformPoint(x1, y1, 0f));
}
else if (!panel.widgetsAreStatic && trans.hasChanged)
{
mMoved = true;
mMatrixFrame = -1;
trans.hasChanged = false;
Vector2 offset = pivotOffset;
float x0 = -offset.x * mWidth;
float y0 = -offset.y * mHeight;
float x1 = x0 + mWidth;
float y1 = y0 + mHeight;
Vector3 v0 = panel.worldToLocal.MultiplyPoint3x4(trans.TransformPoint(x0, y0, 0f));
Vector3 v1 = panel.worldToLocal.MultiplyPoint3x4(trans.TransformPoint(x1, y1, 0f));
if (Vector3.SqrMagnitude(mOldV0 - v0) > 0.000001f ||
Vector3.SqrMagnitude(mOldV1 - v1) > 0.000001f)
{
mMoved = true;
mOldV0 = v0;
mOldV1 = v1;
}
}
// Notify the listeners
if (mMoved && onChange != null) onChange();
return mMoved || mChanged;
}
We can see in the picture above (although it's slightly cut off) that we call Vector3.SqrMagnitude, which means we're in the second condition in this instance. That is what we would expect, as none of the individual widgets have moved and only the parent panel is scrolling. Now, I think it's a bit weird that it even gets in here, but trans.hasChanged must take the parent into account, which is a little annoying as it causes overhead for us here.
If you KNOW your widgets will not change while scrolling, you can set the panel to be static (widgetsAreStatic), which will escape all the calculations above.
I'm wondering if the first mMoved = true in that second branch should actually be mMoved = false, as there is a second check for movement with SqrMagnitude. That would potentially save whatever is hooked up to onChange and the follow-up work back in UIPanel. I'll test this out later. Also, we don't use the frame parameter at all - I imagine this is a leftover.
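To make the idea concrete, here is a minimal standalone sketch of "only report movement when the panel-relative corners actually changed". It is not NGUI code: MovedTracker and its Update method are hypothetical names, it uses System.Numerics instead of UnityEngine, and the corner positions are passed in rather than computed from a transform.

```csharp
using System.Numerics;

// Hypothetical stand-in for the movement check in UIWidget.UpdateTransform.
public sealed class MovedTracker
{
    private Vector3 oldV0;
    private Vector3 oldV1;

    // transformHasChanged mimics trans.hasChanged, which is also raised when only
    // a parent moved. v0 and v1 are the widget corners in panel space.
    public bool Update(bool transformHasChanged, Vector3 v0, Vector3 v1)
    {
        if (!transformHasChanged) return false;

        // Same threshold as the SqrMagnitude comparison in the original method.
        bool moved = Vector3.DistanceSquared(oldV0, v0) > 0.000001f ||
                     Vector3.DistanceSquared(oldV1, v1) > 0.000001f;

        if (moved)
        {
            oldV0 = v0;
            oldV1 = v1;
        }
        return moved; // stays false when only the parent panel scrolled
    }
}
```

With this shape, a widget whose transform is flagged as changed because its parent panel scrolled, but whose corners are unchanged relative to the panel, would not trigger the onChange callback or the later ApplyTransform. Whether that is safe in every NGUI scenario is exactly what still needs testing.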
Moving on to UIWidget.UpdateGeometry
public bool UpdateGeometry (int frame)
{
// Has the alpha changed?
float finalAlpha = CalculateFinalAlpha(frame);
if (mIsVisibleByAlpha && mLastAlpha != finalAlpha) mChanged = true;
mLastAlpha = finalAlpha;
if (mChanged)
{
mChanged = false;
if (mIsVisibleByAlpha && finalAlpha > 0.001f && shader != null)
{
bool hadVertices = geometry.hasVertices;
if (fillGeometry)
{
geometry.Clear();
OnFill(geometry.verts, geometry.uvs, geometry.cols);
}
if (geometry.hasVertices)
{
// Want to see what's being filled? Uncomment this line.
//Debug.Log("Fill " + name + " (" + Time.frameCount + ")");
if (mMatrixFrame != frame)
{
mLocalToPanel = panel.worldToLocal * cachedTransform.localToWorldMatrix;
mMatrixFrame = frame;
}
geometry.ApplyTransform(mLocalToPanel);
mMoved = false;
return true;
}
return hadVertices;
}
else if (geometry.hasVertices)
{
if (fillGeometry) geometry.Clear();
mMoved = false;
return true;
}
}
else if (mMoved && geometry.hasVertices)
{
if (mMatrixFrame != frame)
{
mLocalToPanel = panel.worldToLocal * cachedTransform.localToWorldMatrix;
mMatrixFrame = frame;
}
geometry.ApplyTransform(mLocalToPanel);
mMoved = false;
return true;
}
mMoved = false;
return false;
}
First, we saw in the deep profiler that the two culprits are the method itself and the ApplyTransform, which means we can largely ignore any other method calls inside the method.
The Matrix4x4 multiply operator leads us to the condition:
else if (mMoved && geometry.hasVertices)
Remember that mMatrixFrame was set to -1 in UpdateTransform and mMoved was set to true. This means this branch will always run, assuming the widget has vertices (sprites, labels). There doesn't seem to be any way to improve this code directly, apart from avoiding some of the calls earlier by not having mMoved set if nothing has moved.
Looking inside the UIGeometry.ApplyTransform
public void ApplyTransform (Matrix4x4 widgetToPanel)
{
if (verts.size > 0)
{
mRtpVerts.Clear();
for (int i = 0, imax = verts.size; i < imax; ++i) mRtpVerts.Add(widgetToPanel.MultiplyPoint3x4(verts[i]));
// Calculate the widget's normal and tangent
mRtpNormal = widgetToPanel.MultiplyVector(Vector3.back).normalized;
Vector3 tangent = widgetToPanel.MultiplyVector(Vector3.right).normalized;
mRtpTan = new Vector4(tangent.x, tangent.y, tangent.z, -1f);
}
else mRtpVerts.Clear();
}
The relative-to-panel vertices, normals and tangents are calculated and saved. Doing this every frame means clearing a list and refilling it each time. The only potential optimization I can see is to avoid generating normals and tangents unless they are needed (set in the UIPanel drawing the widget). Other than that, the best optimization is to avoid re-applying the transform at all.
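A rough sketch of what skipping the normal and tangent work could look like, written against System.Numerics so it runs outside Unity. GeometrySketch and the generateNormals parameter are assumptions for illustration, not the actual UIGeometry API.

```csharp
using System.Collections.Generic;
using System.Numerics;

// Hypothetical stand-in for UIGeometry, using System.Numerics types.
public sealed class GeometrySketch
{
    public readonly List<Vector3> Verts = new List<Vector3>();
    public readonly List<Vector3> RtpVerts = new List<Vector3>();
    public Vector3 RtpNormal;
    public Vector4 RtpTan;

    public void ApplyTransform(Matrix4x4 widgetToPanel, bool generateNormals)
    {
        // The relative-to-panel vertices are always rebuilt.
        RtpVerts.Clear();
        for (int i = 0, imax = Verts.Count; i < imax; ++i)
            RtpVerts.Add(Vector3.Transform(Verts[i], widgetToPanel));

        // Skip the normal and tangent work unless the panel asks for it.
        if (!generateNormals || Verts.Count == 0) return;

        RtpNormal = Vector3.Normalize(Vector3.TransformNormal(-Vector3.UnitZ, widgetToPanel));
        Vector3 tangent = Vector3.Normalize(Vector3.TransformNormal(Vector3.UnitX, widgetToPanel));
        RtpTan = new Vector4(tangent, -1f);
    }
}
```

The saving is two vector transforms and two normalizations per widget, which is small next to the per-vertex loop, so a large gain should not be expected from this alone.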
Applying improvements
Using times on UIPanel.LateUpdate from the regular profiler (not deep):
Stock:
1.75 - 2.0 ms
move Time.frameCount: no significant difference
UpdateWidgets optimizations:
simplify conditional:
1.80+ ms slightly slower (boo), removed again.
UpdateTransform optimizations:
mMoved default to false :
0.45 - 1.40ms - A significant improvement. This may have side effects, however.
UpdateGeometry optimizations:
pass generateNormals to ApplyTransform: no significant difference.
UIPanel Inspector optimizations:
Turning on Static:
0.20 - 0.60ms A very significant improvement.
Before drawing any conclusions, keep in mind that these numbers depend on the computer I am using, what's running in the background and so on. This means you can only compare the raw numbers with other tests from the same setup. I've also only tested with some 200 elements in the list, and have not tested how it scales (say to 2000). That said, the relative differences do speak for themselves, given these caveats.
1) There is an optimization to be had in changing the UIWidget.UpdateTransform to not set mMoved every frame unless the transform itself has moved or changed - this needs some more tests, to make sure it doesn't break functionality.
2) If you know your widgets do not change while scrolling, you can switch widgetsAreStatic to true on the UIPanel to gain a significant performance increase.
This has been a fun little trip debugging performance; I'll test the potential optimizations some more in other scenarios to make sure they don't break anything, then submit a pull request to the main trunk.