Reading List March 2021
SIMD and Apple M1
AArch64 latency / throughput benchmark report: Exact cycles per instruction on Apple M1.
Brief notes on Apple M1 Firestorm microarchitecture: I totally agree. No more gimmicks or weird shit instructions like in x86_64, just a straightforward design. Simple is best.
Armv8 A64 Assembly & Intrinsics Guide Server: This is very helpful for me. Honestly, I just hate the ARM website: it's so slooowwww and contains too little information.
SIMD vectorization: A very good lecture.
CS377 Spring 2021: This course is very good. Nice visualization.
LCU14-504: Taming ARMv8 NEON: from theory to benchmark results: The tips section is nice.
Performance Engineering of Software Systems: A useful resource. I've read the following chapters:
Measurement and Timing: Next time I benchmark software, reporting the minimum, the average, and the geometric mean is a fair way to do it.
Storage Allocation: This section introduces the heap, the stack, and garbage collection. Stack allocation is very efficient.
Parallel Storage Allocation: For multi-threaded systems, supermalloc provides better performance. The default allocator in C only reaches about 0.97 M/s with 32 threads, while supermalloc reaches 131.7 M/s.
The Cilk Runtime System: I don't know much about Cilk, so I only skimmed this chapter. The work-stealing idea is a nice novelty, and it shows why Cilk is efficient.
Caching and Cache-Efficient Algorithms: Visits matrix multiplication again, with 2-level caches, 3-level caches, and parameter tuning for different matrix and vector sizes. I think that for my use case, a SIMD implementation, the parameter is easy to pick, since it should be a multiple of 2 and fit within the total number of available registers.
Cache-Oblivious Algorithms: The demonstration shows that Merge Sort can be cache friendly, but I didn't understand much beyond that.
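The tuning parameter from the Caching and Cache-Efficient Algorithms note is the tile size of a blocked matrix multiplication. The following is a rough sketch under my own assumptions, not the lecture's code: matmul_tiled and its tile argument are hypothetical names, the matrices are square and row-major, and no SIMD intrinsics are used.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Computes C += A * B for n x n row-major matrices, one tile at a time.
 * `tile` is the tuning parameter; it does not have to divide n. */
void matmul_tiled(size_t n, size_t tile,
                  const double *A, const double *B, double *C) {
    assert(tile > 0);
    for (size_t ii = 0; ii < n; ii += tile) {
        size_t i_end = min_size(ii + tile, n);
        for (size_t kk = 0; kk < n; kk += tile) {
            size_t k_end = min_size(kk + tile, n);
            for (size_t jj = 0; jj < n; jj += tile) {
                size_t j_end = min_size(jj + tile, n);
                /* Work inside one tile so the touched data stays in cache. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < j_end; j++) {
                            C[i * n + j] += a * B[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

In a SIMD version the innermost loop over j is the natural one to vectorize, which is why a tile size that is a multiple of the vector width and small enough to fit in registers is easy to settle on.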
FPGA + HLS
David Patterson: Computer Architecture and Data Storage | Lex Fridman Podcast #104: David Patterson is a hardware pioneer and a father of RISC. This podcast mentions the term "software 2.0", and the talk gave me a lot of insights. Man, I wish I knew a startup that does crypto acceleration.