Homework 6: MemPath (RISC-V + Cache Simulator)
Computer Architecture - CS110 @ ShanghaiTech University
GitHub Classroom: Fetch the starter code from this repo and work on it (you already know how to do this).
Introduction
This homework models one shared instruction/data memory path for a simplified RV32I system.
The framework loads a machine-code image, runs an untimed in-order RV32I core, and generates one instruction-fetch request for every executed instruction. Your part is to complete the memory side of the lab.
Figure 1: Execution-stage and replay-stage dataflow. The framework core and LSU determine the dynamic request stream, while replay drives the shared memory path and accumulates final timing statistics.
Your Tasks
Work mainly in src/hw/lsu.c, src/hw/cache.c, and src/hw/mem_ctrl.c.
The framework already handles image loading, untimed core execution, trace storage, DRAM timing, and replay.
Overall cache requirements:
- one shared cache for instruction fetches and data accesses
- set-associative organization controlled by the runtime parameters
- deterministic LRU replacement
- write-back policy
- write-allocate policy
- blocking behavior
LSU
In src/hw/lsu.c, decide whether the current instruction creates a data-memory request. For this lab, that means lw and sw only.
When it does, compute the address and size, return the request metadata, and append the matching L or S trace entry. Otherwise return valid = false.
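As a rough illustration of the decision above, here is a minimal sketch of lw/sw detection and address computation. The struct and function names (lsu_req_sketch, lsu_decode_sketch, sext12) are hypothetical and are not the starter code's interface; only the RV32I encodings (LOAD opcode 0x03, STORE opcode 0x23, funct3 = 010 for word accesses) come from the ISA. Appending the trace entry is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request record; the starter code's own struct may differ. */
typedef struct {
    bool     valid;     /* false when the instruction makes no data request */
    bool     is_store;  /* true for sw, false for lw */
    uint32_t addr;      /* byte address: x[rs1] + sign-extended immediate */
    uint32_t size;      /* access size in bytes (4 for lw and sw) */
} lsu_req_sketch;

/* Sign-extend a 12-bit immediate to 32 bits. */
static int32_t sext12(uint32_t v)
{
    return (int32_t)((v & 0xfffu) ^ 0x800u) - 0x800;
}

/* Decode one RV32I instruction word; rs1_val is the current value of x[rs1]. */
static lsu_req_sketch lsu_decode_sketch(uint32_t inst, uint32_t rs1_val)
{
    lsu_req_sketch req = { false, false, 0u, 0u };
    uint32_t opcode = inst & 0x7fu;
    uint32_t funct3 = (inst >> 12) & 0x7u;
    int32_t imm;

    if (funct3 != 0x2u)            /* only word-sized accesses count here */
        return req;

    if (opcode == 0x03u) {         /* lw, I-type: imm = inst[31:20] */
        imm = sext12(inst >> 20);
    } else if (opcode == 0x23u) {  /* sw, S-type: imm = inst[31:25]:inst[11:7] */
        imm = sext12(((inst >> 25) << 5) | ((inst >> 7) & 0x1fu));
        req.is_store = true;
    } else {
        return req;                /* every other instruction: valid = false */
    }

    req.valid = true;
    req.addr  = rs1_val + (uint32_t)imm;  /* wraps modulo 2^32 */
    req.size  = 4u;
    return req;
}
```

For example, lw x5, 8(x10) with x10 = 0x100 would yield a load request at 0x108 of size 4, while an add instruction would yield valid = false.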
Cache
In src/hw/cache.c, make the cache answer the replayed requests correctly. Detect hits and misses, use an invalid line before evicting, keep LRU deterministic, and update cache state after each access.
Your cache should work for all supported cache organizations.
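The lookup rules above can be sketched for a single set as follows. This is an illustration under assumptions, not the starter code's data layout: line_sketch, set_index_of, tag_of, and access_set_sketch are hypothetical names, s and b are assumed to be the number of set-index and block-offset bits, and LRU is tracked with a caller-supplied increasing timestamp. Ties are broken toward the lowest way index so that replacement stays deterministic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-line state; the real struct in cache.c may differ. */
typedef struct {
    bool     valid;
    bool     dirty;
    uint32_t tag;
    uint64_t last_use;  /* timestamp of the most recent access, for LRU */
} line_sketch;

/* Assumes s set-index bits and b block-offset bits, with s + b < 32. */
static uint32_t set_index_of(uint32_t addr, unsigned s, unsigned b)
{
    return (addr >> b) & ((1u << s) - 1u);
}

static uint32_t tag_of(uint32_t addr, unsigned s, unsigned b)
{
    return (uint32_t)((uint64_t)addr >> (s + b));
}

/* Access one set of E lines. Returns true on a hit. On a miss, fills the
 * first invalid way if there is one, otherwise the least recently used way,
 * and sets *wrote_back when the replaced line was valid and dirty. */
static bool access_set_sketch(line_sketch *set, unsigned E, uint32_t tag,
                              bool is_write, uint64_t now, bool *wrote_back)
{
    unsigned victim = 0;
    *wrote_back = false;

    for (unsigned w = 0; w < E; w++) {
        if (set[w].valid && set[w].tag == tag) {
            set[w].last_use = now;
            set[w].dirty = set[w].dirty || is_write;  /* write-back: mark only */
            return true;
        }
    }

    for (unsigned w = 0; w < E; w++) {
        if (!set[w].valid) {       /* an invalid line always wins */
            victim = w;
            break;
        }
        if (set[w].last_use < set[victim].last_use)
            victim = w;            /* strict '<' keeps the lowest index on ties */
    }

    *wrote_back = set[victim].valid && set[victim].dirty;
    /* write-allocate: a store miss still fills the line, then dirties it */
    set[victim] = (line_sketch){ true, is_write, tag, now };
    return false;
}
```

With E = 2, the access sequence tag 1 (read), tag 2 (write), tag 1 (read), tag 3 (read) gives miss, miss, hit, miss, and the last miss evicts the dirty line holding tag 2.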
Memory Controller
mc_access() handles one logical replay request at a time.
Figure 2: One-request service path in the memory controller. Replay issues one logical request through mc_access(), the controller services that request through the cache and DRAM helpers, and the completed per-request result returns to replay.
In src/hw/mem_ctrl.c, connect one replay request to the cache and DRAM helpers.
Keep the controller request-based: one call to mc_access() is one logical request. If that request touches more than one block, handle the blocks in order but still treat the whole access as one memory_request.
When the controller talks to DRAM, use whole cache blocks. On a dirty eviction, write the old block back before refilling the new one.
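To make the block handling concrete, here is a small sketch of the two orderings described above: which blocks one request touches, and which DRAM operations a miss needs. All names (blocks_touched_sketch, plan_miss_sketch, dram_op_sketch) are hypothetical and do not correspond to the framework's helpers, b is assumed to be the number of block-offset bits, and no timing is modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Write the base address of every block touched by [addr, addr + size)
 * into out, in ascending order. Returns the count, capped at max_out. */
static size_t blocks_touched_sketch(uint32_t addr, uint32_t size, unsigned b,
                                    uint32_t *out, size_t max_out)
{
    if (size == 0)
        return 0;

    uint64_t first = (uint64_t)addr >> b;
    uint64_t last  = ((uint64_t)addr + size - 1) >> b;
    size_t n = 0;

    for (uint64_t blk = first; blk <= last && n < max_out; blk++)
        out[n++] = (uint32_t)(blk << b);
    return n;
}

/* Whole-block DRAM operations, in the order they must be issued. */
typedef enum { DRAM_WRITE_BLOCK, DRAM_READ_BLOCK } dram_op_sketch;

/* On a miss: write the dirty victim back first, then refill the new block. */
static size_t plan_miss_sketch(bool victim_dirty, dram_op_sketch ops[2])
{
    size_t n = 0;
    if (victim_dirty)
        ops[n++] = DRAM_WRITE_BLOCK;
    ops[n++] = DRAM_READ_BLOCK;
    return n;
}
```

With 16-byte blocks (b = 4), a 4-byte access at 0x3e touches blocks 0x30 and 0x40 in that order, yet it would still be counted as one request.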
Build and Test
Use make to build the simulator.
Use make test to run the public self-test. This checks the handout example and compares your output against the expected result.
Key Rules
The core is untimed and in-order. Every executed instruction generates one I request. Only lw and sw generate additional data requests.
Use byte-addressed trace entries: I 0xADDR 4, L 0xADDR SIZE, and S 0xADDR SIZE.
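A trace entry of this shape could be formatted as in the sketch below. The function name format_trace_sketch is hypothetical, and the hex style (lowercase, no zero padding) is an assumption; the handout does not state the exact width or case here, so compare against the expected output before relying on it.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format one trace entry as "<kind> 0x<addr> <size>" into buf.
 * kind is 'I', 'L', or 'S'. Hex width and case are assumptions. */
static int format_trace_sketch(char *buf, size_t n, char kind,
                               uint32_t addr, uint32_t size)
{
    return snprintf(buf, n, "%c 0x%" PRIx32 " %" PRIu32, kind, addr, size);
}
```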
The runtime parameters are passed in as:
mempath_sim <image> <s> <E> <b> <hit_lat> <dram_read_lat> <dram_write_lat> <cpu_freq_ghz> <cache_freq_ghz> <dram_freq_ghz>
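The meaning of s, E, and b is not spelled out in this section. Assuming the common convention (s set-index bits, E lines per set, b block-offset bits), the cache geometry would follow as in this sketch; geometry_sketch and cache_geometry_sketch are hypothetical names, and the guide should be checked to confirm the convention.

```c
#include <stdint.h>

/* Derived cache geometry under the assumed convention. */
typedef struct {
    uint64_t sets;            /* 2^s */
    uint64_t block_bytes;     /* 2^b */
    uint64_t capacity_bytes;  /* sets * E * block_bytes */
} geometry_sketch;

/* Assumption: s and b are bit counts, E is the associativity. */
static geometry_sketch cache_geometry_sketch(unsigned s, unsigned E, unsigned b)
{
    geometry_sketch g;
    g.sets           = 1ull << s;
    g.block_bytes    = 1ull << b;
    g.capacity_bytes = g.sets * E * g.block_bytes;
    return g;
}
```

Under that assumption, s = 4, E = 2, b = 6 would describe 16 sets of two 64-byte lines, 2048 bytes in total.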
Extra Reading
For the longer explanation and the input-image example, see this guide.

