Add the means to detect physical CPU cores vs. Hyperthreads to WinRT
Clearly much work has gone into making multithreaded programming a natural part of modern Windows development. However, there appears to be a crucial bit of information that is not available to the WinRT/Metro developer. Specifically, I have no means of detecting the number of physical CPU cores in the user’s machine.
A simple RunAsync will meet many multithreading needs, but it is not a universal solution and does not suit every architecture. Consider for example a compute-intensive application that does most of its work as tasks dispatched to a concurrent work pool. That pool will contain N threads, with N ideally equal to the number of physical CPU cores in the user’s machine. The problem is that the quantity N is simply not available to the Metro developer.
The Win32 API function GetNativeSystemInfo is allowed in Metro apps. However, this function only returns the total number of logical processors; it does not distinguish between physical cores and HT. If it says for example 8, I have no idea what that really means. For all I know that could be 2 cores and 6 Hyperthreads. The Win32 function GetLogicalProcessorInformation was designed to return exactly this info, but for some reason this function is off-limits to Metro. WinRT itself does not appear to offer any alternative.
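For readers unfamiliar with the call, here is a minimal sketch of how GetLogicalProcessorInformation is typically used on desktop Win32 (where it is permitted) to separate physical cores from logical processors. The names CpuCounts, query_cpu_counts, and count_set_bits are hypothetical helpers made up for this illustration, and the non-Windows branch simply reports "unknown".

```cpp
#include <cstddef>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Counts the bits set in an affinity mask (one bit per logical processor).
unsigned count_set_bits(unsigned long long mask) {
    unsigned n = 0;
    while (mask) {
        mask &= mask - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical result type; a value of 0 means "could not be determined".
struct CpuCounts {
    unsigned physical_cores = 0;
    unsigned logical_processors = 0;
};

CpuCounts query_cpu_counts() {
    CpuCounts counts;
#ifdef _WIN32
    // First call with a null buffer to learn the required size in bytes.
    DWORD bytes = 0;
    GetLogicalProcessorInformation(nullptr, &bytes);
    if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) return counts;

    std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(
        bytes / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
    if (!GetLogicalProcessorInformation(info.data(), &bytes)) return counts;
    info.resize(bytes / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));

    for (const auto& entry : info) {
        // One RelationProcessorCore entry per physical core; its mask
        // holds one bit for each hardware thread on that core.
        if (entry.Relationship == RelationProcessorCore) {
            ++counts.physical_cores;
            counts.logical_processors += count_set_bits(entry.ProcessorMask);
        }
    }
#endif
    return counts;
}
```

On a Hyperthreaded desktop machine this would be expected to report fewer physical cores than logical processors. It is exactly this distinction that a Metro app has no sanctioned way to obtain.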
To understand why this is a problem, I refer you to Microsoft’s own documentation on the matter. Specifically, the Core Detection sample of the DirectX SDK has this to say:
"...More significantly, SMT or HT Technology threads share the L1 instruction and data caches. If their memory access patterns are incompatible, they can end up fighting over the cache and causing many cache misses. In the worst case, the total performance for the CPU core can actually decrease when a second thread is run.
...On Windows, the situation is more complicated. The number of threads and their configuration will vary from computer to computer, and determining the configuration is complicated. The function GetLogicalProcessorInformation gives information about the relationship between different hardware threads, and this function is available on Windows Vista, Windows 7, and Windows XP SP3.
...The safest assumption is to have no more than one CPU-intensive thread per CPU core."
So you see, if I create a thread pool equal in size to dwNumberOfProcessors, the value returned to me in the SYSTEM_INFO structure, I am virtually guaranteed that this is the wrong number of threads to create for the optimal use of an SMT/Hyperthreading CPU.
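To make the sizing rule concrete, here is a small sketch of how a work pool might pick its thread count. The function choose_worker_count is a hypothetical helper, not part of any Windows API, and it assumes the caller passes 0 for the physical core count when the platform cannot report it.

```cpp
#include <algorithm>

// Hypothetical helper: picks the number of CPU-intensive worker threads.
// physical_cores == 0 means the platform could not report a physical count.
unsigned choose_worker_count(unsigned physical_cores,
                             unsigned logical_processors) {
    // Preferred case: one CPU-intensive thread per physical core.
    if (physical_cores > 0) {
        return physical_cores;
    }
    // Fallback: only the logical count is known, so the pool may be
    // oversubscribed on an SMT/Hyperthreading CPU. Never return 0.
    return std::max(1u, logical_processors);
}
```

With 4 physical cores and 8 logical processors this returns 4. When only the logical count is known, as in a Metro app today, the fallback branch returns 8, which is the oversized pool described above.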
I cannot envision how to support Metro as a developer of modern video games; and by modern I mean multithreaded, parallel, and compute-intensive. How many threads shall I create to suit the user’s CPU? I don’t know, because the new API won’t tell me, and because I will fail certification if I use the Win32 function that was created specifically to answer this question.
Please either add equivalent functionality to WinRT or lift the ban on Win32 GetLogicalProcessorInformation. This is information that I must have.
I have brought this topic up already in the Metro development forums and there was no resolution. For reference see here:
I've seen systems where even GetLogicalProcessorInformation doesn't return the correct data, so we had to resort to thread affinity (not in WinRT), counting cores, and looking at cpuid data in depth. This is crazy compared to Mac OS X, where you have one call that returns the physical, logical, or active core count (sysctlbyname). Tasks like SIMD are also highly dependent on the physical and not the logical core count. Logical cores do nothing for SIMD. Also, why are there no physical memory totals in WinRT?
Scott Bruno commented
I should add that the MSFT rep who responded on the forums was very helpful, even though there was no solution found for the problem.