Category Archives: Technical

xPU Thermals. What's burning?

After a small re-ignition of a spark (or more like a nuke going off..) for my interest in hardware and tinkering – especially the building part, going all expensive-LEGO over everything – I remembered that when I started out once upon a time, actual heat output wasn't really an issue – things didn't even need a heatsink… It was coooool.

Now, when I for the first time ever own _three_ (too many..) AMD Ryzen systems, and having overclocked since my i5 system from way back (actually, since my Prescott P4, but never mind – and non-stock coolers before that, of course..), I realized I had never really given it that much thought: just bump some settings and let it roll.
I've even mostly been using stock cooling solutions (remember kids, in the server farm, in the way-back machine – stock coolers were cool, and for the Intel PII/Celeron with the fancy Slot 1 config there were not that many options without a hacksaw). Then came the turn – cooling solutions for our rigs, with our biiig 80mm chassis fans.. we pushed them hard – dust filters? Tsk!

They started burning up the place at about the same time as a (to use a now overused term) viral video clip, or an actual link to Tom's Hardware, where they tested how the stuff behaved without a cooler – the AMD CPUs turned into ovens (someone has the clip on YouTube). Fun times!

So, onwards to graphics.. NVIDIA and ATI already in the ring boxing – well, that's a separate story I can't really remember. Graphics were expensive back then too (right now, at the time of writing, it's silly..). But quite a bit more oomph was being crammed in, and the hype-plane was flying.

So, we were starting to produce more and more heat, and to be honest – airflow was a concept handled mostly by putting a big desk fan next to your rig, or by picking parts from household AC equipment.. Around this time (of the Slot 1 Intel setups), I built my first fanless rig – as my second server – living under a bed 24/7.. It lived on for a long time – but damn, it was a hard thing to do even back then.

Now, however, we have auto-adjusting CPUs and GPUs – throttling down clock speeds when we hit a thermal limit. That works, in a fashion. But we also have power-saving modes that clock down from the get-go when you don't demand that much – we are no longer locked to a specific speed, and we might get a boost on a few cores (yes, this I love), because even though each core might not dissipate that much heat, once you cram a few of them together – we have a toasty situation.

Our software handles this, with some small help. The OS layer just kicks around and becomes the kid that keeps asking for money (or clock-power) – but also the responsible adult that says "that's it, thanks for the loan".
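
(For the curious, the OS side is easy to peek at on Linux – a rough sketch, assuming the standard cpufreq sysfs interface is exposed; paths and governors vary with the scaling driver and distro:)

# Which governor is steering the clocks right now?
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# The floor and ceiling (in kHz) the OS is currently allowed to request
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
# Or, if the linux-tools/cpupower package is installed, a per-policy summary
cpupower frequency-info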

The application layer, however.. not that much thinking there. It's like the stuck-up teenager that just does whatever it wants. Well, usually per the instructions of the code and its functions. But some do concern themselves with whether we're on battery or not – so they do think a tiny bit.

But it leaves an impact window – for those of us who remember an overall slower way of doing things, at least I get the feeling that I have to wait until it fires off and performs what I asked, after the whole chain of clock ramp-ups has finished.
(Yes, I have lost the rant-concept right now..).

So, what is really burning? Seems nothing is – we are clocking down, calling it turbo boost, and the advertised clock values are often just the baseline where the hardware keeps the thermal/performance balance the best – even memory does this.

How much impact do these power-saving plans have? Do they save us a few bucks on a workstation, or have the laptop-segment features and trickery simply been ported over?

I don’t know right now. Because this rant is over for now.

CUDA-playtime

# ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1050 Ti"
CUDA Driver Version / Runtime Version 10.0 / 10.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 4039 MBytes (4234936320 bytes)
( 6) Multiprocessors, (128) CUDA Cores/MP: 768 CUDA Cores
GPU Max Clock rate: 1418 MHz (1.42 GHz)
Memory Clock rate: 3504 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 1048576 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS
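
(For reference, deviceQuery isn't shipped as a ready-made binary – it's built from the CUDA samples. Roughly like this on a CUDA 10.0 toolkit install; the sample path is an assumption, adjust to wherever your samples ended up:)

# Copy the bundled samples somewhere writable, then build and run the utility
cp -r /usr/local/cuda-10.0/samples ~/cuda-samples
cd ~/cuda-samples/1_Utilities/deviceQuery
make
./deviceQuery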

The Ryzen 5 1600 / B350 under Linux (Kubuntu with kernel 4.15.0-22) and NVIDIA drivers…

Every other day one thinks it's cool and nice..
And alas – I notice all cores and threads are stuck at 1374 MHz.. No wonder it stays around 41 degrees C under full load..
It seems something is missing in the performance-junka-mado..

However, "clocking down" in the BIOS to 3775 MHz as a max target seems to make it boost more often and not lock itself down; it finally jumps to a stable 3768 MHz on all six cores and their threads.
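
(To actually see whether the cores sit at 1374 or climb towards the target, I just watch the reported clocks while something heavy is running – a quick-and-dirty sketch:)

# In one terminal: run your load. In another: refresh the per-core clocks once a second
watch -n 1 'grep "cpu MHz" /proc/cpuinfo'
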
Now, the B350 "chipset" is not what I would call anything special, at all. A bit sad that it was a mid-tier chipset that seems to have caught most of the attention.

There's more… All the AMD chipsets seem to be not that much fun for keeping score on temps.. One needs to compile it87 as an out-of-tree module to get the values..

All these points have stayed valid over several updates, and sadly – since it's first-gen Ryzen – none of this is new – but I fail to see the light at the end of the tunnel for getting everything working correctly.

Yes, this is a rant.

I am still happy, since it's paired with an ASUS 'blower-style' GTX 1060 6GB. It works for all the things I want. But I do want it to listen to my clock settings.. though I am also guessing the firmware is partly to blame.

And talking about blame.. WHY THE H*CK can't NVIDIA ensure there is proper fan profiling in the damned locked drivers straight from the get-go? The damn thing clocks down because it more or less reaches max temp before max performance, with the fans locked at half speed :S.
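
(The workaround I've been poking at, assuming an X session with the proprietary driver: enable the Coolbits option and push a manual fan target via nvidia-settings – treat the exact attribute names as per-driver-version, they have shifted over the years:)

# Enable manual fan control in xorg.conf (restart X afterwards)
sudo nvidia-xconfig --cool-bits=4
# Then pin the fan to a fixed target, e.g. 75%
nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=75"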

Rant over.

However, getting the temps is straightforward enough – https://github.com/groeck/it87. Go and Git it.
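
(The dance is roughly the usual out-of-tree module routine – check the repo's README for the current steps, and have the headers for your running kernel installed; this is just how it looked on my end:)

git clone https://github.com/groeck/it87.git
cd it87
make                # builds against the running kernel's headers
sudo make install   # installs the module
sudo modprobe it87  # load it, then read the values with lm-sensors
sensors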

Jumping around the OS train. /Rant

All aboard!

Next to me, the work laptop, turned off for the day – runs Windows 10 (*sob sob*).
In front of me, the workstation running *buntu 17.10, now 18.04 LTS.
Below it, a MacBook Pro running macOS.
Next to it, my "Pi tower" running Raspbian-ish distros.
Next to the screen, the gaming rig running Windows 10.
BSD stuff runs virtually from another location in the apartment.
Chromecasts here and there, older i5 and i7 machines littering the place with a plethora of flavours installed.

No wonder I never get anything done at home, I keep fiddling. And if I am not actually fiddling – I am ranting about fiddling.

For the love of something – no wonder it's hard to keep track of where I do what – I am all over the place!

The need to know and to tinker always seems greater than the need to sleep, or to sometimes do something.. serious.
Like being social, or going on a date, or taking a walk.. Phaw. No time! So much to tinker with!

/rant over.

Twiddling-series.

This year has been quite.. eventful, just as last year.
I am hoping that things will calm down and that my head will soon start working in a truer sense of the word.

When this happens (if it happens..), I will start up a small 'twiddling' series – targeting the more amusing and fun toys that exist for those of us interested in, well, fun stuffs.

The baseline will simply be a presentation of the "what", the "how", the "why", and my own small experiences twiddling with it.

Look at it this way, it’s a getting-started-and-start-getting-your-mods-and-ideas-rolling series.

Traffic Baseline. Apps/OS.

Rant.

Many "NGFW" products look into the application stack ("layer 8"). I am, however, pondering this: since many also seem to identify the underlying OS (enabling better and easier rule-sets per device category, for example) – why not also provide a baseline for that specific OS: what to expect, the normally permitted traffic, and the underlying connection points for it. With that reasoning, one could filter out lots of garbage traffic that otherwise needs to be looked at with all the possible UTM profiles.

This would be something we could all benefit from – easier exclusions on a per-OS basis, and so on. If we learn what normal is, we do not have to look at it all the time – it would only be needed from a fully forensic perspective, to determine a complete timeline and the like.
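
(A toy illustration of the idea – not any vendor's actual feature, and the address range and the "learned baseline" set are made up for the example: let the traffic an OS is expected to produce pass cheaply, and only mark the leftovers for the heavy inspection path.)

# Hypothetical per-OS baseline with nftables: a set of learned "expected" endpoints
nft add table inet baseline
nft add set inet baseline os_expected '{ type ipv4_addr; flags interval; }'
nft add element inet baseline os_expected '{ 192.0.2.0/24 }'   # placeholder for learned update/telemetry ranges
nft add chain inet baseline pre '{ type filter hook forward priority 0; }'
# Known-baseline traffic: accept without dragging it through the UTM profiles
nft add rule inet baseline pre ip daddr @os_expected tcp dport 443 accept
# Everything else: count it and mark it so the inspection path knows to take a closer look
nft add rule inet baseline pre counter meta mark set 0x1 accept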