Performance versus productivity: Getting high performance by using compilers, simple code rewrites and professional libraries. Evaluation using dense matrix transpose.

Figure 1: Regular Memory (RM) node from Bridges 2 [7].
Figure 2: Naive dense matrix transpose.
Figure 3: Makefile extract for compiling program using the naive implementation.
Figure 4: Performance evaluation of intuitive implementation of transpose without and with compiler optimizations.
Figure 5: Controlling which code blocks are optimized and which are not [13].
Figure 6: Signatures for various loop unrolling factors.
Figure 7: Transpose with loop unrolling using #pragma.
Figure 8: Evaluating performance with different unroll factors, including without unrolling and with complete loop unrolling.
Figure 9: Blocking implementation with 4 nested loops.
Figure 10: Blocking implementation, performance evaluation.
Figure 11: Implementation combining OpenMP parallelization with blocking.
Figure 13: Using MKL to apply transpose.
Figure 14: Performance of the MKL library, using the best implementation as the baseline.
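
The code listings behind Figures 2 and 9 are not reproduced in this text, so the following is only a minimal sketch of what a naive transpose and a blocked transpose with four nested loops might look like in C. The function names, the row-major out-of-place layout, the `double` element type and the `block` parameter are all assumptions made for illustration, not the author's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Naive out-of-place transpose (hypothetical helper).
 * src is rows x cols, dst is cols x rows, both stored row-major. */
void transpose_naive(const double *src, double *dst,
                     size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            dst[j * rows + i] = src[i * cols + j];
}

/* Blocked (tiled) transpose with four nested loops (hypothetical helper).
 * The two outer loops walk over tiles of size block x block; the two
 * inner loops transpose one tile, so the working set of each tile is
 * more likely to stay in cache. Edge tiles are clipped to the matrix. */
void transpose_blocked(const double *src, double *dst,
                       size_t rows, size_t cols, size_t block)
{
    assert(block > 0); /* a zero block size would never advance the loops */

    for (size_t ii = 0; ii < rows; ii += block) {
        size_t i_end = (ii + block < rows) ? ii + block : rows;
        for (size_t jj = 0; jj < cols; jj += block) {
            size_t j_end = (jj + block < cols) ? jj + block : cols;
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * rows + i] = src[i * cols + j];
        }
    }
}
```

Both functions write the same result for any input; only the order in which memory is visited differs. A suitable `block` value depends on the cache sizes of the target machine and would need to be measured rather than assumed.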




On technology, Africa, dissent, culture and life.

R. F. Rweye
