
Computer Architecture I (CS110 / CS110P) Document
Reference: CS110 Course Page

Homework 6: MemPath (RISC-V + Cache Simulator)

Computer Architecture - CS110 @ ShanghaiTech University

GitHub Classroom: fetch the starter code from this repo and work on it (you already know how to do it).

Introduction

This homework models one shared instruction/data memory path for a simplified RV32I system.

The framework loads a machine-code image, runs an untimed in-order RV32I core, and generates one instruction-fetch request for every executed instruction. Your part is to complete the memory side of the lab.

fig1.png

Figure 1: Execution-stage and replay-stage dataflow. The framework core and LSU determine the dynamic request stream, while replay drives the shared memory path and accumulates final timing statistics.

Your Tasks

Work mainly in src/hw/lsu.c, src/hw/cache.c, and src/hw/mem_ctrl.c.

The framework already handles image loading, untimed core execution, trace storage, DRAM timing, and replay.

Overall cache requirements:

  • one shared cache for instruction fetches and data accesses
  • set-associative organization controlled by the runtime parameters
  • deterministic LRU replacement
  • write-back policy
  • write-allocate policy
  • blocking behavior

LSU

In src/hw/lsu.c, decide whether the current instruction creates a data-memory request. For this lab, that means lw and sw only.

When it does, compute the address and size, return the request metadata, and append the matching L or S trace entry. Otherwise return valid = false.
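The decode step described above can be sketched as follows. This is only an illustration under stated assumptions: the type sketch_lsu_req and the function sketch_lsu_decode are hypothetical names, not the starter code's interface, and the trace append is left out. The opcode, funct3, and immediate layouts are the standard RV32I encodings for lw and sw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request record; the starter code defines its own type. */
typedef struct {
    bool     valid;     /* false when the instruction makes no data request */
    bool     is_store;  /* true for sw, false for lw */
    uint32_t addr;      /* byte address: rs1 value + sign-extended immediate */
    uint32_t size;      /* access size in bytes (4 for both lw and sw) */
} sketch_lsu_req;

/* Sign-extend the low `bits` bits of v (v must have no higher bits set). */
static int32_t sketch_sign_extend(uint32_t v, unsigned bits)
{
    assert(bits > 0 && bits < 32);
    uint32_t m = 1u << (bits - 1);
    return (int32_t)((v ^ m) - m);
}

/* Decode one instruction word; rs1_val is the current value of register rs1. */
static sketch_lsu_req sketch_lsu_decode(uint32_t inst, uint32_t rs1_val)
{
    sketch_lsu_req r = { false, false, 0, 0 };
    uint32_t opcode = inst & 0x7fu;
    uint32_t funct3 = (inst >> 12) & 0x7u;

    if (opcode == 0x03u && funct3 == 0x2u) {
        /* lw: I-type, immediate in inst[31:20] */
        int32_t imm = sketch_sign_extend(inst >> 20, 12);
        r.valid = true;
        r.addr  = rs1_val + (uint32_t)imm;
        r.size  = 4;
    } else if (opcode == 0x23u && funct3 == 0x2u) {
        /* sw: S-type, immediate split across inst[31:25] and inst[11:7] */
        uint32_t raw = ((inst >> 25) << 5) | ((inst >> 7) & 0x1fu);
        int32_t  imm = sketch_sign_extend(raw, 12);
        r.valid    = true;
        r.is_store = true;
        r.addr     = rs1_val + (uint32_t)imm;
        r.size     = 4;
    }
    /* Every other instruction, including lb/lh/sb/sh, leaves valid == false. */
    return r;
}
```

For example, the word 0x00852283 (lw x5, 8(x10)) with x10 = 0x1000 would yield a load request at 0x1008, while 0x00000013 (nop) yields valid = false.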

Cache

In src/hw/cache.c, make the cache answer the replayed requests correctly. Detect hits and misses, use an invalid line before evicting, keep LRU deterministic, and update cache state after each access.

Your cache should work for all supported cache organizations.
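One possible shape for the hit/miss and replacement logic is sketched below. It is not the starter code's interface: every sketch_ name is hypothetical, and it assumes that s and b are bit counts (set-index bits and block-offset bits) and that E is the number of lines per set. LRU is tracked with a per-cache access counter, and a miss takes the lowest-numbered invalid way before it considers evicting, which keeps victim selection deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    bool     valid, dirty;
    uint32_t tag;
    uint64_t last_used;      /* access counter value at the last touch */
} sketch_line;

typedef struct {
    unsigned     s, E, b;    /* assumed: set bits, lines per set, offset bits */
    uint64_t     tick;       /* incremented once per access */
    sketch_line *lines;      /* (1 << s) * E entries, stored set by set */
} sketch_cache;

typedef struct {
    bool     hit;
    bool     evicted_dirty;  /* a dirty victim must be written back */
    uint32_t victim_addr;    /* block address of that victim */
} sketch_result;

static sketch_cache sketch_cache_new(unsigned s, unsigned E, unsigned b)
{
    assert(E > 0 && s + b < 32);
    sketch_cache c = { s, E, b, 0, NULL };
    c.lines = calloc(((size_t)1 << s) * E, sizeof *c.lines);
    assert(c.lines != NULL);
    return c;
}

static sketch_result sketch_cache_access(sketch_cache *c, uint32_t addr,
                                         bool is_write)
{
    sketch_result r = { false, false, 0 };
    uint32_t set = (addr >> c->b) & ((1u << c->s) - 1u);
    uint32_t tag = addr >> (c->s + c->b);
    sketch_line *ways = &c->lines[(size_t)set * c->E];
    sketch_line *victim = NULL;

    c->tick++;
    for (unsigned w = 0; w < c->E; w++) {
        if (ways[w].valid && ways[w].tag == tag) {   /* hit */
            ways[w].last_used = c->tick;
            ways[w].dirty = ways[w].dirty || is_write;
            r.hit = true;
            return r;
        }
    }
    /* Miss: prefer the first invalid way, else the least recently used one. */
    for (unsigned w = 0; w < c->E; w++) {
        if (!ways[w].valid) { victim = &ways[w]; break; }
        if (victim == NULL || ways[w].last_used < victim->last_used)
            victim = &ways[w];
    }
    if (victim->valid && victim->dirty) {            /* write-back needed */
        r.evicted_dirty = true;
        r.victim_addr = ((victim->tag << c->s) | set) << c->b;
    }
    /* Write-allocate: a store miss also fills the line and marks it dirty. */
    victim->valid = true;
    victim->tag = tag;
    victim->dirty = is_write;
    victim->last_used = c->tick;
    return r;
}
```

A counter-based LRU is only one option; an explicit age order per set would serve equally well, as long as ties can never depend on anything nondeterministic.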

Memory Controller

mc_access() handles one logical replay request at a time.

fig2.png

Figure 2: One-request service path in the memory controller. Replay issues one logical request through mc_access(), the controller services that request through the cache and DRAM helpers, and the completed per-request result returns to replay.

In src/hw/mem_ctrl.c, connect one replay request to the cache and DRAM helpers.

Keep the controller request-based: one call to mc_access() is one logical request. If that request touches more than one block, handle the blocks in order but still treat the whole access as one memory_request.

When the controller talks to DRAM, use whole cache blocks. On a dirty eviction, write the old block back before refilling the new one.
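The two rules above (split a request by block, and write back before refilling) can be illustrated with a small sketch. The names sketch_block_count, sketch_dram_op, and sketch_miss_ops are hypothetical and do not come from the framework, and b is assumed to be the number of block-offset bits. The sketch only records the order of DRAM operations; it does not model timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of cache blocks that the byte range [addr, addr + size) touches. */
static uint32_t sketch_block_count(uint32_t addr, uint32_t size, unsigned b)
{
    assert(size > 0 && b < 32);
    assert(addr + size - 1u >= addr);        /* no 32-bit wrap-around */
    uint32_t first = addr >> b;
    uint32_t last  = (addr + size - 1u) >> b;
    return last - first + 1u;
}

typedef enum { SKETCH_DRAM_WRITE, SKETCH_DRAM_READ } sketch_dram_kind;

typedef struct {
    sketch_dram_kind kind;
    uint32_t         block_addr;             /* always block-aligned */
} sketch_dram_op;

/*
 * DRAM operations for one missed block, in issue order.
 * A dirty victim is written back first; the refill read follows.
 * Returns the number of entries written to out (1 or 2).
 */
static unsigned sketch_miss_ops(uint32_t block_addr, bool victim_dirty,
                                uint32_t victim_addr, sketch_dram_op out[2])
{
    unsigned n = 0;
    if (victim_dirty) {
        out[n].kind = SKETCH_DRAM_WRITE;
        out[n].block_addr = victim_addr;
        n++;
    }
    out[n].kind = SKETCH_DRAM_READ;
    out[n].block_addr = block_addr;
    n++;
    return n;
}
```

With 16-byte blocks (b = 4), a 4-byte access at 0x0e spans two blocks, so the controller would walk block 0x00 and then block 0x10 while still reporting a single request result.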

Build and Test

Use make to build the simulator.

Use make test to run the public self-test. This checks the handout example and compares your output against the expected result.

Key Rules

The core is untimed and in-order. Every executed instruction generates one I request. Only lw and sw generate additional data requests.

Use byte-addressed trace entries: I 0xADDR 4, L 0xADDR SIZE, and S 0xADDR SIZE.
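As an illustration of the entry layout, the helper below formats one trace line. The function name sketch_format_trace is hypothetical, and the zero-padded eight-digit lowercase hex address is an assumption; the handout only fixes the overall shape, so match whatever the expected output of make test uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format one trace entry as "<kind> 0x<addr> <size>".
 * kind is 'I' (fetch), 'L' (load), or 'S' (store); addr is a byte address.
 * Returns the number of characters snprintf would have written.
 */
static int sketch_format_trace(char *buf, size_t len, char kind,
                               uint32_t addr, uint32_t size)
{
    assert(kind == 'I' || kind == 'L' || kind == 'S');
    /* Assumed address format: 8 hex digits, zero-padded. */
    return snprintf(buf, len, "%c 0x%08x %u", kind,
                    (unsigned)addr, (unsigned)size);
}
```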

The runtime parameters are passed in as:

mempath_sim <image> <s> <E> <b> <hit_lat> <dram_read_lat> <dram_write_lat> <cpu_freq_ghz> <cache_freq_ghz> <dram_freq_ghz>
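The handout does not spell out s, E, and b at this point. The sketch below assumes the common convention in which s is the number of set-index bits, E is the number of lines per set, and b is the number of block-offset bits; check the guide before relying on it. The names sketch_addr_parts, sketch_split_addr, and sketch_cache_bytes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t tag;      /* remaining high bits */
    uint32_t set;      /* s bits above the offset */
    uint32_t offset;   /* low b bits: byte position inside the block */
} sketch_addr_parts;

/* Split a 32-bit byte address into tag, set index, and block offset. */
static sketch_addr_parts sketch_split_addr(uint32_t addr, unsigned s, unsigned b)
{
    assert(s + b < 32);
    sketch_addr_parts p;
    p.offset = addr & ((1u << b) - 1u);
    p.set    = (addr >> b) & ((1u << s) - 1u);
    p.tag    = addr >> (s + b);
    return p;
}

/* Total data capacity in bytes: sets * lines per set * block size. */
static uint32_t sketch_cache_bytes(unsigned s, unsigned E, unsigned b)
{
    assert(s + b < 32);
    return (1u << s) * E * (1u << b);
}
```

Under that assumption, s = 2, E = 2, b = 4 describes a 128-byte cache with 4 sets of 2 lines and 16-byte blocks, and address 0x1234 falls in set 3 at offset 4 with tag 0x48.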

Extra Reading

For the longer explanation and the input-image example, see this guide.