Win class performance, cause (NEW), and a test for all

Listing issues addressed in beta version 4.05
BUCKAROO
Posts: 206
Joined: Sun Oct 24, 2010 3:13 am

Post by BUCKAROO » Mon Aug 05, 2013 3:20 am

tzuk wrote:Here, you can try the revised DLL for version 4.05.03. 32-bit only.
http://www.sandboxie.com/SbieDll.dll
So that it doesn't reveal itself at some other juncture, I'm just reporting an obscure window bug that is now probably being, well, further obscured by this revised DLL. This DLL worked around a dialog bug with WinVICE-2.4-x86.zip\x64.exe # Settings -> Joystick settings... (Dialog 122), whose window Width was 32767 and where the positive Left offsets of the controls within the BS_GROUPBOX were somewhere beyond the right edge of my screen.

tzuk
Sandboxie Founder
Posts: 16076
Joined: Tue Jun 22, 2004 12:57 pm

Post by tzuk » Mon Aug 05, 2013 5:47 am

Not sure I understand your language. Are you saying this WinVICE works better or worse with the new DLL ?
tzuk

DR_LaRRY_PEpPeR
Posts: 291
Joined: Wed Jul 04, 2012 6:40 pm
Location: St. Louis area

Post by DR_LaRRY_PEpPeR » Mon Aug 05, 2013 6:12 am

I didn't really have time to test the new DLL yesterday, but I couldn't resist :P, and had to wait to reply...
tzuk wrote:Alright, I withdraw my objections to your invalid handle test, and I apologize.
No prob. :)
tzuk wrote:About CreateWindow, very little code there was candidate for being a time sink in the first place.
One thing was going through the SbieSvc GUI process, for window handles in the sandbox, so I changed that now.
Hopefully that change goes well without any surprise problems.
Ahh, that makes sense that it was going through GuiProxy even for stuff in the sandbox. It seems like that would have been a clue when I said with CreateWindow the SbieSvc CPU usage is the same as when attempting to "access" non-sandboxed handles. :o

tzuk wrote:Here, you can try the revised DLL for version 4.05.03. 32-bit only.
http://www.sandboxie.com/SbieDll.dll
My initial reaction: :shock:

Good news: Compared to default settings before, this improved CreateWindow almost 20x! That's about 3x better than 3.76. Superb. :) That should of course be plenty fast, and much better than the 100x slower than UNsandboxed/* I mentioned (~46x).

Also, using the different OpenWinClass settings has no effect on performance now -- same as defaults. :)

BUT the bad news: Those excellent CreateWindow results only occur with just GUI Bench running in the sandbox! Open up just 1 IE 6 window with it, and it's barely 2x faster than previous default settings. :( Open 10 IE windows and it's over 2x slower than before!

What's going on with extra window handles/objects having such an impact?

If you notice the "access" benchmark, accessing a sandboxed handle (or not for that matter), the performance is not affected by how much stuff is open. The time remains constant. Shouldn't CreateWindow be the same...?

Also, like in 3.76, instead of SbieSvc CPU usage, now CreateWindow causes some CPU usage in the other sandboxed process(es)... Again, that doesn't happen with the "access" benchmark -- just GUIBench.exe itself maxed out.


In conclusion, vs old defaults, it's probably a wash at best in practice, and can easily be slower with the rapid slowdown... :? But at least we're getting somewhere, thanks! If you can get that initial performance to remain constant (like "access"), it would be mission accomplished! :D

Post by BUCKAROO » Mon Aug 05, 2013 6:57 am

tzuk wrote:Not sure I understand your language.
The dialog box displays correctly with the performance revised DLL. I suspect the SbieSvc GUI process, or some code in the new DLL which now bypassed the bug, still carries this obscure bug, that's me jumping to conclusions.
tzuk wrote:Are you saying this WinVICE works better or worse with the new DLL ?
Better ! :)

Post by tzuk » Mon Aug 05, 2013 11:06 am

BUCKAROO, well if the new DLL fixes a problem then so much the better! :)

DR_LaRRY_PEpPeR, not sure I understand your complicated number scheme.
Slower x100 than X but faster x46 than Y but slower x2 than Z.

How long does it take you to start PowerArchiver with the new DLL, and open the configuration window?
For me, the new DLL makes the configuration window open in half the time, around one second, I think.
Startup time is unchanged at 3 seconds.

When a window is created, CreateWindow tells all other windows about it. This is part of the emulation of
some other API which tracks open windows. Makes sense that with more open windows, it would take longer,
but that's just the way it is.
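As a rough illustration of why a "notify every existing window" step would scale with the number of open windows, here is a toy model in C. It is only a sketch under stated assumptions: the `ToySandbox` type and the `toy_*` functions are hypothetical names invented for this example, and nothing here reflects how SbieDll is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, NOT Sandboxie code: every newly created window
   causes one notification to each window that is already open. */
typedef struct {
    size_t open_windows;                   /* windows currently open */
    unsigned long long notifications_sent; /* running total of notifications */
} ToySandbox;

/* Create one window: notify all existing windows, then register the new one. */
static void toy_create_window(ToySandbox *sb)
{
    assert(sb != NULL);
    sb->notifications_sent += sb->open_windows;
    sb->open_windows += 1;
}

/* Destroy one window (assumed to cost no notifications in this model). */
static void toy_destroy_window(ToySandbox *sb)
{
    assert(sb != NULL && sb->open_windows > 0);
    sb->open_windows -= 1;
}

/* Run a create/destroy loop with `background` windows already open and
   return how many notifications the loop caused in total. */
static unsigned long long toy_benchmark(size_t background, size_t iterations)
{
    ToySandbox sb = { background, 0 };
    for (size_t i = 0; i < iterations; ++i) {
        toy_create_window(&sb);
        toy_destroy_window(&sb);
    }
    return sb.notifications_sent;
}
```

In this model the total work is `background * iterations`, so a create/destroy loop gets slower in direct proportion to how many other windows are open, while a loop that only reads an existing handle would not.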
tzuk

Post by tzuk » Wed Aug 07, 2013 4:05 am

The updated DLL is included in version 4.05.04.
tzuk

Post by DR_LaRRY_PEpPeR » Thu Aug 08, 2013 3:04 pm

Sorry didn't reply sooner! Nothing yesterday, but the previous 2 days, I thought I had some extra feedback, and then shortly later I realized something else or that I was wrong... Anyway, so not much different to report than my last post, although there is one big mistake (or not) I made with my CreateWindow benchmark!

I guess you didn't see my code, but I realized Tuesday when checking something else. This was the benchmark code:

Code:

HWND dummy = CreateWindowA("Static", "Dummy test", WS_CHILD, 0, 0, 128, 64, NULL, NULL, NULL, NULL);
DestroyWindow(dummy);
I said any CreateWindow call seemed to behave the same in Sandboxie -- didn't matter if custom class or standard Static/Edit/etc., or what the other parameters were. So I was just using 0s everywhere. Except I noticed if I didn't use WS_CHILD, it was much slower UNsandboxed, so I figured it needed some style flag. BUT I've now realized that WS_CHILD without a parent HWND is also invalid which is why it runs SO FAST UNsandboxed! (10^6 in 1.3s). I've updated it with a parent param.

Updated GUIBench.zip and source/package.

Oops, sorry! :oops: Now what I realized doesn't really change anything in Sandboxie, it's just that the UNsandboxed result was way off, and made Sandboxie look bad with a lot of overhead that's not really there! e.g. Sandboxed is actually only a few times slower than UNsandboxed (expected) up to 10's of times slower at worst, NOT 100s or 1000s of times slower. :? :o


The bigger speed difference I saw with CreateWindow in API Monitor (before creating "supporting" benchmark) must be coming from other stuff within the CreateWindow calls, rather than only the CreateWindow function itself...

BTW, I also verified the slowness I see using API Monitor itself is not related to CreateWindow (not that I thought it was).


PowerArchiver: I was only checking laptop before, which hasn't had PA installed for a long while. I've since upgraded to the latest betas on main system to check my 2011 PA version. The results are about what I expected (like the IE window opening speed).

In my main active sandbox (with OE, 2 IEs, Pale Moon running), Configuration window took just over 3 seconds to open with old DLL (defaults). Just over 2 seconds with new DLL -- and no OpenWinClass settings make it slower, so no 50 second extreme. :)

However, in a sandbox by itself, it's right around 1 second (about half-second UNsandboxed). So those few other things open in other sandbox add a second. :?


CreateWindow benchmark: 10^3 iterations on the main system take 1.25s UNsandboxed, ~10s with the old DLL, and ~6s with the new DLL (main active sandbox; ~2s on its own). So instead of adding the 0.01s I mentioned, the old DLL added "just" 0.009s per call and the new DLL adds 0.005s per call. The savings of 0.004s per call match the times I see: PA Configuration, 270 CreateWindow calls * 4ms is about a second saved.
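The per-call arithmetic in this paragraph can be reproduced with a few lines of C. This is a minimal sketch: the function names are made up for illustration, and the inputs are the approximate timings quoted in the post (1.25s, ~10s, ~6s for 10^3 calls, and 270 calls for the PA Configuration window).

```c
#include <assert.h>

/* Extra seconds that sandboxing adds to each call, derived from the total
   run time of the same loop inside and outside the sandbox. */
static double overhead_per_call(double sandboxed_s, double unsandboxed_s, int calls)
{
    assert(calls > 0);
    return (sandboxed_s - unsandboxed_s) / calls;
}

/* Seconds saved over `calls` calls when the per-call overhead drops
   from old_overhead_s to new_overhead_s. */
static double seconds_saved(double old_overhead_s, double new_overhead_s, int calls)
{
    assert(calls >= 0);
    return (old_overhead_s - new_overhead_s) * calls;
}
```

With the posted figures, `overhead_per_call(10.0, 1.25, 1000)` is about 0.00875s and `overhead_per_call(6.0, 1.25, 1000)` is about 0.00475s, a difference of 0.004s per call. Passing those two values to `seconds_saved` with 270 calls gives roughly 1.08s, which is consistent with the "about a second saved" estimate.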

IE window opening: from 0.7s to 0.58, which again is right for 30-ish CreateWindows.


Now, about those drastic slowdowns I see with the new DLL (using hopefully valid CreateWindow calls :)). 10^4 iterations on laptop takes ~1.5s UNsandboxed. I guess there is a slight slowdown in Windows since the slower laptop is doing 10x more calls in almost the same time as main system with a lot more window stuff open...

Same 10^4 iterations:
- 3.76
4.6s alone
6.4s with 1 IE window
17.8s with 10 IE windows

- 4.05.03
15.8s alone
19.6s with 1 IE window
42.6s with 10 IE windows

- 4.05.04
~4s alone
7.5~10s with 1 IE window
44~49s with 10 IE windows


Besides the rapid degradation itself, something doesn't seem right that it becomes slower than the previous DLL/version if the same "window notification" stuff is happening, but without the GuiProxy overhead? The new version can beat 3.76 initially. It would be nice if the algorithm, etc. performance could stay with it. 8)
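To put numbers on the degradation, the posted timings can be reduced to two figures per version: how many times slower the loop gets with 10 IE windows open, and the average extra time per additional IE window. The sketch below is illustrative only. The struct and function names are hypothetical, and for 4.05.04 it assumes the lower end of each posted range (4s, 7.5s, 44s).

```c
#include <assert.h>

/* One row of the posted 10^4-iteration timings, in seconds. */
typedef struct {
    const char *version;
    double alone_s;   /* only GUI Bench in the sandbox */
    double one_ie_s;  /* plus 1 IE window */
    double ten_ie_s;  /* plus 10 IE windows */
} BenchRow;

/* Values copied from the post; 4.05.04 uses the low end of each range
   (an assumption made for this example). */
static const BenchRow posted_rows[] = {
    { "3.76",     4.6,  6.4, 17.8 },
    { "4.05.03", 15.8, 19.6, 42.6 },
    { "4.05.04",  4.0,  7.5, 44.0 },
};

/* How many times slower the run gets going from "alone" to 10 IE windows. */
static double slowdown_factor(const BenchRow *row)
{
    assert(row != 0 && row->alone_s > 0.0);
    return row->ten_ie_s / row->alone_s;
}

/* Average extra seconds per additional IE window between 1 and 10 windows. */
static double extra_seconds_per_window(const BenchRow *row)
{
    assert(row != 0);
    return (row->ten_ie_s - row->one_ie_s) / 9.0;
}
```

Under those assumptions the slowdown factors come out at roughly 3.9x for 3.76, 2.7x for 4.05.03, and 11x for 4.05.04, and the extra cost per IE window is roughly 1.3s, 2.6s, and 4.1s respectively. That matches the observation that the newest DLL starts fastest but degrades most steeply.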

Post by tzuk » Fri Aug 09, 2013 4:14 am

Alright, reading between the numbers, I think you're fairly happy with the new version and we can put this subject behind us.
At least I hope so.
I'm glad you kept at it and spent time investigating and developing these benchmarks.
Thanks!

* * *

Here's a possible explanation for the "faster at first and then slower" scenario.

The "job" wrapper with its Win32 restrictions has the effect that it checks and re-checks access rights to every window handle.
This probably contributes to some performance degradation, once the number of handles grows and the checks take more time.

Where there are few windows in the sandbox, CreateWindow v4 sees fewer total handles than v3.76,
and the handle checks are still fast.
Result is faster than v3.76.

As the number of windows in the sandbox grows, the number of windows outside the sandbox becomes negligible,
to the point that we can say that CreateWindow v4 sees a similar number of total handles as version 3.76,
but now the handle checks are slow.
Result is slower than v3.76.
tzuk

Post by DR_LaRRY_PEpPeR » Fri Aug 09, 2013 6:30 am

Yeah, the ideal potential improvement is nice, but I'm just trying to make some sense of the slowdown that's occurring now (which you tried to explain). :) I had kinda thought of something along the line of what you describe...

You're talking about the Job Object and its "USER Handles" restriction (_UILIMIT_HANDLES), right? And Windows' mechanism checking access rights to window handles...?


I just included v3.76 for reference, but really looking at the difference compared to the previous DLL and use of GuiProxy -- if the same stuff is happening (notifications about CreateWindow, etc.) as before, shouldn't the times be the same, minus X amount of overhead from skipping GuiProxy now??

e.g. On laptop IE window opening test, I see the expected small improvement with 1 window open, but with 10 windows (1 iexplore process), the old DLL/GuiProxy method is a hair faster than after your change! :? (As the posted Create bench numbers show.) I'd be REALLY happy to hold onto the improvement, instead of losing it and then some with the new version. :o


BTW, I notice there's still a spike in SbieSvc CPU (both processes) when opening IE windows (closing also), for one example. (Doesn't happen with Create bench or PowerArchiver Configuration...) So can you see if something else is still going through GuiProxy that shouldn't be when you get a chance? :) Thanks!

Post by tzuk » Fri Aug 09, 2013 7:42 am

I was indeed referring to UILIMIT_HANDLES which checks handles at both the application level (USER32.DLL) and the kernel level (WIN32K.SYS). To emphasize this, there is one more system call (NtUserValidateHandleSecure) that happens for each Win32 handle access by a program in a restricted job.

Perhaps as the number of windows in the sandbox grows, it is cheaper to make a single call to SbieSvc GuiProxy (which is outside the job) to get a list of windows, than to get the list from inside the job. Perhaps a little counter-intuitive but it's the only guess I have. In any case I don't think it is interesting to look into a 2-6 second slowdown over 10,000 calls because this doesn't happen outside your benchmarks.
tzuk

Post by DR_LaRRY_PEpPeR » Fri Aug 09, 2013 10:46 am

It's actually the 30+ seconds I'm talking about, 1 vs 10 IE windows -- if you mean 2-6s between old and new DLLs, well that's not an improvement, thus my curiosity. :) With PA's Configuration window, it seems an awful lot that running with just a few other usual things (OE + Pale Moon + 2x IE) adds a solid second before the window appears (compared to having only PA in a sandbox). And it adds over 1/8th to IE window opening (0.45 to 0.58).


Anyway here, I don't think the explanation about Windows' handle checks is right! I just tried 4.02 with OpenWinClass=* which still keeps the "USER Handles" restricted Job. And I already knew 4.02, with that setting, let GUI Bench's CreateWindow run as fast as UNsandboxed (just by itself). So I opened 12 IE windows in 2 IE processes, plus UNsandboxed and sandboxed GUI Bench.

Result? Whether sandboxed or not, they both run 10^4 in 2.8 seconds! So it looks like there's NO measurable overhead from the restricted Job itself or Windows' checking. I would expect that Windows' mechanisms are very optimized and wouldn't have a severe degradation like that...


That's why I was thinking (and it appears the case), that something in SbieDll is adding a ton of extra time (maybe before your change also, but whatever). Even in 3.76 for that matter, just not as much...

Obviously without Open=* SbieDll or such will add some amount of overhead (understandable), but what is IT doing that adds 40 more seconds with just 10 IE windows?

Actually it seems even less would need to be done than in 3.76, if the Job restrictions (zero overhead!?) are already protecting stuff...


It's only SbieDll I guess now (as in 3.76) that is causing CPU usage in other sandboxed processes (instead of SbieSvc). That doesn't happen with only the restricted Job (4.02's Open=*).

You had mentioned "emulation of some other API which tracks open windows," so I figured that was some Sandboxie thing, and thus might not be performing/scaling optimally. :?

Post by tzuk » Sat Aug 10, 2013 2:17 pm

I haven't looked into your findings but I think we can say that for the normal case, there is some improvement in version 4.05.04. Greater improvements might have to involve reworking some aspects of the new GUI logic in Sandboxie and I'm not going to do that. Even this minor change had a side effect as BUCKAROO reported, and while it seems to be a positive side effect, the point is that every minor change has the potential to break stuff. And for performance gains which cannot be measured in a practical sense, I don't think it is worth continuing to spend time on this.
tzuk

Post by DR_LaRRY_PEpPeR » Sat Aug 10, 2013 5:00 pm

Hopefully you understand that the restricted Job and Windows are NOT an issue with performance. I did some other checking after our posts yesterday and thinking about the logic of stuff. :) I am now VERY confident that Sandboxie (SbieDll I guess) is doing some completely unnecessary stuff with only CreateWindow! That's my logical conclusion anyway. I REALLY think CreateWindow can basically be as fast as UNsandboxed (yes, the Job has effectively no impact), and that would obviously be a HUGE improvement over what we have now! :D

I obviously believed what you said about the Windows Job object access checks on window handles, but wanted to know more myself (besides using 4.02 with Open=*). Yes, there IS an overhead from the checks, but they are so small that it cannot be measured with CreateWindow -- a few milliseconds at most over 10,000 calls -- might as well be zero. :)

BTW, as I have said previously (before your change), Sandboxie doesn't have much effect on [resource-based] dialog boxes. Why is that? Creating and "case WM_INITDIALOG: EndDialog(...)" with the GUI Bench dialog (13 controls) is WAY faster than the CreateWindow function. And it does NOT slow down! It's only about 1.6-2x slower than UNsandboxed, which is fine (although I think even it may have unneeded overhead; see below). So again, why are dialogs unaffected by the possibilities you were explaining...? :?


I stuck GUI Bench into a Job myself (with UILIMIT_HANDLES of course) and it was interesting what I observed. The window "access" benchmark ("THIS window's handle") was ~12x slower than no Job. Hmm, Sandboxie is ~24x slower accessing sandboxed handles. ;) (So yes, with much faster functions like IsWindowVisible, the Job/Windows access checks are measurable, FYI, but nothing to worry about here...)


How I WAS thinking Sandboxie works with window "accessing" functions like IsWindowVisible: Try it; if it "fails," go through GuiProxy...

But then I added IsWindow with IsWindowVisible before running in the Job, and..... The "access" time was about equal to Sandboxie! :D

SO now I wonder if the logic is thus:

Code:

if (IsWindow(some_HWND)) // Fails for non-sandboxed handle
{
	IntendedWinFunc(... some_HWND ...);
}
else
{
	// Go through GuiProxy
}
Which would make sense... Is that close to correct? Anyway, the "access" time would suggest yes; OK cool, that's all good, should be fine performance.


Those details to say/ask: Why can't CreateWindow work like that?? As a "blind outsider," I can't understand anything else it needs to do which is really slowing it down. Maybe you can help me "see" if I'm wrong. In other words, just let CreateWindow run "naturally," as you do with IsWindowVisible, et al. Massive, massive performance increase!! :shock:

I hope you can explain and/or look into it (maybe a simple oversight, etc.). Thanks!

pastuch

thanks

Post by pastuch » Sat Aug 10, 2013 8:07 pm

I have a low-performance problem with SBIE 4 too.
DR_LaRRY_PEpPeR, thank you very much for your hard detective work :)

Post by tzuk » Sun Aug 11, 2013 3:12 am

While I am glad to see that DR_LaRRY_PEpPeR is having fun investigating job objects, it does not seem that he is going to fix anyone's performance problems as he is focused on the wrong stuff.

Other than DR_LaRRY_PEpPeR, who is going into way too much detail, my impression is the three or so other people who reported performance issues just make a "me too" style of post, and don't give any details at all, and never follow up on anything.

I think I've had enough of this non-issue for the time being.
tzuk