Alen is a PhD candidate in computer science at the National University of Singapore. His research interests lie broadly in the areas of computer architecture, performance analysis, and distributed systems. His current research focus is on speeding up the architectural simulation of multi-threaded applications.
A tutorial on LoopPoint Tools has been accepted to the IEEE International Symposium on High-Performance Computer Architecture (HPCA) 2023, which will be held in Montreal, Canada from Feb 25 to Mar 01, 2023. We look forward to sharing this work and to in-person discussion during the tutorial session. The tutorial is based on our recently published research and demonstrates how to reduce the simulation time of large multi-threaded applications to a practical duration. This is a key issue for future large-system exploration in both industry and academia.
In this tutorial, we will demonstrate a collection of tools and techniques that are intended to help computer architecture researchers simulate complex applications on future hardware (with a focus on our most recent publication, LoopPoint, from HPCA 2022). The tutorial targets researchers interested in simulation methodologies, workload sampling, application analysis, and computer architecture in general. The tutorial covers several interesting and novel methodologies developed in industry as well as academia.
You can find the source code at GitHub: LoopPoint and ELFies
Slides
The tutorial slides are posted here.
Release of SPEC ELFies
We are releasing a set of ELFies for simulating on gem5/Sniper. The ELFies are representative checkpoints of SPEC CPU2017 benchmarks (reference inputs, 8 threads), generated using the LoopPoint methodology. The ELFies of each benchmark can be downloaded from the individual links below. We anticipate releasing the remaining SPEC CPU2017 benchmarks in the upcoming months.
These ELFies can be simulated on both gem5 and Sniper. Check this example configuration script for simulating ELFies on gem5. On Sniper, the following command can be used to simulate ELFies:
run-sniper -v -n 8 -c gainestown -g scheduler/type=static -s simuserroi --roi-script --trace-args="-pinplay:control start:address:<start-pc>:count1:global,stop:address:<stop-pc>:count<stop-count>:global" -sprogresstrace -- /path/to/app.sim.elfie
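When simulating many regions, it can help to assemble this command programmatically. The following is a minimal sketch, not part of the released tools: the helper name `build_sniper_command` and the example PC and count values are hypothetical placeholders, and the flags are copied verbatim from the command above.

```python
import shlex


def build_sniper_command(elfie_path, start_pc, stop_pc, stop_count, threads=8):
    """Return the run-sniper argument list for one ELFie region.

    Hypothetical helper: it only fills the <start-pc>, <stop-pc>, and
    <stop-count> placeholders of the command shown above.
    """
    control = (
        f"start:address:{start_pc}:count1:global,"
        f"stop:address:{stop_pc}:count{stop_count}:global"
    )
    return [
        "run-sniper", "-v",
        "-n", str(threads),
        "-c", "gainestown",
        "-g", "scheduler/type=static",
        "-s", "simuserroi",
        "--roi-script",
        # One argument; the embedded space is quoted when printed below.
        f"--trace-args=-pinplay:control {control}",
        "-sprogresstrace",
        "--", elfie_path,
    ]


# Placeholder values for illustration only; use the region's real markers.
cmd = build_sniper_command("/path/to/app.sim.elfie", "0x4007a0", "0x4009c4", 1500)
print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal
```

Returning a list rather than a single string lets the same value be passed directly to `subprocess.run(cmd)` without further quoting.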
Please refer to the SPEC Fair Use Rules before using these checkpoints. If used as the basis for prediction of SPEC run time or a SPEC metric, any results published must be very clearly tagged as “Estimated” or “Estimated by simulation of ELFies for representative simulation regions (looppoints)”. By downloading these ELFies, you confirm that you agree to the license policy outlined above.
Tools Used
- Compiler for building SPEC CPU2017 benchmarks: Intel Compiler Toolchain v2021.5.0
- SDE version: 9.14
- LoopPoint
- Pinball2Elf
Benchmark Settings
- Benchmark suite: Multi-threaded subset of SPEC CPU2017 benchmarks
- Input class: Reference inputs
- Number of threads: 8 OpenMP threads
- OpenMP wait-policy: active (spin-loops enabled)
- OpenMP schedule: static
- Compiler settings: Instructions of Nehalem architecture (SSE4.2); optimizations (O3)
- Sampling methodology: LoopPoint
- Sample settings: Detailed Regions of ~800M instructions (ignoring spin-loops), no warmup, maxK=20
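The OpenMP settings listed above correspond to standard OpenMP environment variables. The sketch below shows one way such an environment might be prepared before launching a natively built benchmark. It is an assumption about how the settings could be applied, not a description of how the ELFies were generated, and the helper name `openmp_environment` is hypothetical. Note that `OMP_SCHEDULE` only affects loops declared with `schedule(runtime)`.

```python
import os


def openmp_environment(threads=8, wait_policy="active", schedule="static"):
    """Return a copy of the current environment with OpenMP settings applied.

    Hypothetical helper; the variable names are the standard OpenMP ones.
    """
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(threads)   # 8 OpenMP threads
    env["OMP_WAIT_POLICY"] = wait_policy    # "active" keeps waiting threads spinning
    env["OMP_SCHEDULE"] = schedule          # used by schedule(runtime) loops only
    return env


env = openmp_environment()
print({k: v for k, v in env.items() if k.startswith("OMP_")})
```

The returned dictionary can be passed as `env=env` to `subprocess.run` when starting a benchmark binary.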
Agenda
The tutorial is scheduled for Saturday, Feb 25, 2023, after lunch, in Outremont 1 at Hotel Bonaventure Montreal.
Time (EST) | Speaker | Topic |
13.20 to 13.30 | Trevor E. Carlson | Overview of the tutorial |
13.30 to 14.20 | Akanksha Chaudhari | Performance analysis, simulation, sampling |
14.20 to 15.20 | Harish Patil | Using tools: Pin, PinPlay, SDE, ELFies |
15.20 to 15.40 | Break | |
15.40 to 16.20 | Alen Sabu | Multi-threaded sampling and LoopPoint |
16.20 to 17.00 | Changxi Liu | Sniper and LoopPoint demo |
17.00 to 17.40 | Zhantong Qiu | Using LoopPoint with gem5 |
Speakers
LoopPoint and ELFies were developed as a collaborative project between the National University of Singapore (NUS) and Intel Corporation. Several people from both NUS and Intel are involved in the project. The primary contributors are listed below.
Changxi Liu is a PhD student at the National University of Singapore. His research interests include simulation, compilers, high-performance computing, and computer architecture exploration. He has published papers in ICS and FGCS in the area of high-performance computing. He received both his Master’s and Bachelor’s degrees in computer science from Beihang University, Beijing.
Akanksha is a Research Assistant at the School of Computing, National University of Singapore. Her research interests include operating systems, computer architecture, cyber-physical systems, and storage technologies. She received her bachelor's degree in electronics and communication engineering from BITS Pilani, India.
Zhantong is an undergraduate student at the University of California, Davis, interested in computer architecture and simulation research.
Jason Lowe-Power is an Assistant Professor at the Department of Computer Science, University of California, Davis. Lowe-Power’s focus is hardware-software co-design to improve the efficiency and programmability of modern computer systems. As part of the Davis Computer Architecture Group, Lowe-Power investigates improving the efficiency and usability of heterogeneous systems, enhancing system security using hardware extensions, and developing open-source simulation methodology to support computer architecture research. Lowe-Power is a member of the ECE graduate group and chairs the project management committee of the open-source computer architecture simulator gem5.
Harish Patil is a Principal Engineer in the Technology Path-finding and Innovation group at Intel Corporation. His areas of interest include static/dynamic program analysis (using “Pin/SDE” and “LLVM”), simulation point selection (“PinPoints”), record/replay (“PinPlay”), and debugging (“DrDebug”). He is a recipient of the ACM Programming Languages Software Award (2020) for co-developing the Pin program instrumentation framework, and a co-author of two papers with “Test-of-Time” awards based on Pin (PLDI 2005-2015) and PinPlay (CGO 2010-2020). He has a Ph.D. from the University of Wisconsin, Madison, a B.Tech. and an M.Tech. from Indian Institute of Technology, Bombay, and an MBA from Babson College.
Wim Heirman is a Principal Engineer at Intel Corporation. His research interests include fast and accurate simulation, and computer architecture design and exploration. He co-authored the Sniper Multi-Core Simulator, has written 100+ papers in scientific conferences and journals, and has 10 granted US patents. He received an M.Sc. (2003) and a Ph.D. (2008) in computer engineering from Ghent University, Belgium.
Trevor E. Carlson is an assistant professor at the National University of Singapore (NUS). He received his B.S. and M.S. degrees from Carnegie Mellon University in 2002 and 2003 and his Ph.D. from Ghent University in 2014, and worked for three years as a postdoctoral researcher at Uppsala University until 2017. He has over 13 years of computer architecture experience covering both industry and academia. His work on microarchitecture, simulation, sampling, and modeling has received three Best Paper Awards and three nominations for Best Paper. He has been developing techniques for high-efficiency processors that improve energy efficiency and performance by taking into account Memory Level Parallelism (MLP) together with unique architectural designs and software techniques. He co-authored the Sniper Multi-Core Simulator, which has been used by hundreds of researchers to evaluate the performance and power efficiency of next-generation systems. His research interests include highly efficient microarchitectures, hardware/software co-design, performance modeling, and fast and scalable simulation methodologies.