Basic 1-Stage Processor

Status: Verified and packaged
HDL: SystemVerilog
Verification: Python (cocotb) with Icarus Verilog

Overview

This project implements a single-cycle (1-stage) processor: the entire instruction lifecycle (Fetch, Decode, Execute, and Writeback) completes within one clock cycle.

The design features a custom minimal ISA, word-addressed memory, and a simplified control path with no pipelining, no interrupts, and no hazard detection or forwarding logic. It is fully synthesizable and is verified against a Python-based scoreboard reference model.

Architecture

 Data Width : 32-bit
 Registers  : 16 GPRs (x0 hardwired to 0)
 Pipeline   : Single-stage (non-pipelined)
 PC Step    : +1 per cycle (word-addressed)
 Memories   : 256-word IMEM (ROM), 256-word DMEM (RAM)
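
As a rough illustration of the register-file row above, here is a minimal Python sketch of a 16-entry, 32-bit register file in which `x0` always reads as zero. The class name `RegFile` and its methods are hypothetical and are not taken from the repository; whether the RTL drops writes to `x0` or forces reads of `x0` to zero is not stated here, so the sketch does both.

```python
class RegFile:
    """Hypothetical model of a 16 x 32-bit register file with x0 hardwired to 0."""

    def __init__(self, num_regs=16):
        self._regs = [0] * num_regs

    def read(self, idx):
        # x0 always reads as zero, regardless of what was written.
        return 0 if idx == 0 else self._regs[idx]

    def write(self, idx, value):
        # Writes to x0 are ignored; other values are truncated to 32 bits.
        if idx != 0:
            self._regs[idx] = value & 0xFFFFFFFF


rf = RegFile()
rf.write(0, 123)   # has no effect
rf.write(1, -1)    # stored as 0xFFFFFFFF
```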

Datapath Block Diagram

graph LR
    subgraph FETCH
        PC["PC<br/>(register)"]
        ADDER["+1"]
        IMEM["Instruction<br/>Memory<br/><i>u_imem</i>"]
    end

    subgraph DECODE
        DEC["Instruction<br/>Decoder"]
        CTRL["Control<br/>Unit<br/><i>u_ctrl</i>"]
        SEXT["Sign-Extend<br/>imm14 → 32"]
        RF["Register File<br/>16 × 32<br/><i>u_regfile</i>"]
    end

    subgraph EXECUTE
        MUX_B{"MUX<br/>alu_src_imm"}
        ALU["ALU<br/><i>u_alu</i>"]
    end

    subgraph MEMORY
        DMEM["Data<br/>Memory<br/><i>u_dmem</i>"]
    end

    subgraph WRITEBACK
        MUX_WB{"MUX<br/>mem_to_reg"}
    end

    PC -->|addr| IMEM
    PC --> ADDER -->|pc_next| PC
    IMEM -->|instr| DEC
    DEC -->|opcode| CTRL
    DEC -->|rs1, rs2| RF
    DEC -->|imm14| SEXT
    CTRL -->|alu_op| ALU
    CTRL -->|alu_src_imm| MUX_B
    CTRL -->|mem_write| DMEM
    CTRL -->|reg_write| RF
    CTRL -->|mem_to_reg| MUX_WB
    RF -->|rs1_data| ALU
    RF -->|rs2_data| MUX_B
    SEXT -->|imm32| MUX_B
    MUX_B -->|alu_b| ALU
    ALU -->|alu_result| DMEM
    ALU -->|alu_result| MUX_WB
    RF -->|rs2_data| DMEM
    DMEM -->|dmem_rdata| MUX_WB
    MUX_WB -->|write_back_data| RF
    DEC -->|rd| RF

    style PC fill:#4a90d9,color:#fff,stroke:#2c5f8a
    style IMEM fill:#6ab04c,color:#fff,stroke:#3e7a28
    style RF fill:#f0932b,color:#fff,stroke:#b5700e
    style ALU fill:#eb4d4b,color:#fff,stroke:#b33230
    style DMEM fill:#6ab04c,color:#fff,stroke:#3e7a28
    style CTRL fill:#be2edd,color:#fff,stroke:#8c1aab
    style MUX_B fill:#f9ca24,color:#333,stroke:#c9a00c
    style MUX_WB fill:#f9ca24,color:#333,stroke:#c9a00c

Execution Flow (Single Clock Cycle)

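Every instruction passes through all four steps within a single clock cycle; the results (PC, register file, and data memory updates) are committed on the rising clock edge: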

flowchart LR
    F["1. FETCH<br/>────────<br/>Read instr<br/>from IMEM[PC]"]
    D["2. DECODE<br/>────────<br/>Extract fields<br/>Read registers<br/>Generate control"]
    E["3. EXECUTE<br/>────────<br/>ALU computes<br/>result or address"]
    W["4. WRITEBACK<br/>────────<br/>Write result<br/>to register or<br/>data memory"]
    N["PC ← PC + 1"]

    F --> D --> E --> W --> N
    N -.->|next cycle| F

    style F fill:#4a90d9,color:#fff,stroke:#2c5f8a
    style D fill:#be2edd,color:#fff,stroke:#8c1aab
    style E fill:#eb4d4b,color:#fff,stroke:#b33230
    style W fill:#f0932b,color:#fff,stroke:#b5700e
    style N fill:#4a90d9,color:#fff,stroke:#2c5f8a
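
The four steps in the flow above can be sketched as one Python function that models a single clock cycle. This is an illustrative model, not the repository's scoreboard: the function name `step`, the wrap-around of PC and data addresses at the memory size, and the treatment of unknown opcodes as no-ops are all assumptions.

```python
MASK32 = 0xFFFFFFFF

def step(pc, regs, dmem, imem):
    """Model one clock cycle; mutates regs/dmem in place and returns the next PC."""
    # 1. Fetch: read the instruction word at the current PC.
    word = imem[pc]

    # 2. Decode: extract fields, sign-extend imm14, read source registers.
    opcode = (word >> 26) & 0x3F
    rd, rs1, rs2 = (word >> 22) & 0xF, (word >> 18) & 0xF, (word >> 14) & 0xF
    imm = word & 0x3FFF
    if imm & 0x2000:
        imm -= 0x4000
    a, b = regs[rs1], regs[rs2]

    # 3. Execute: the ALU produces a result or an effective address.
    if opcode == 1:        # ADD
        result = (a + b) & MASK32
    elif opcode == 2:      # SUB
        result = (a - b) & MASK32
    else:                  # ADDI / LOAD / STORE (value is unused for NOP)
        result = (a + imm) & MASK32

    # 4. Writeback: update data memory or the register file.
    addr = result % len(dmem)          # assumption: addresses wrap at memory size
    if opcode == 5:                    # STORE
        dmem[addr] = b
    elif opcode in (1, 2, 3, 4) and rd != 0:   # x0 stays 0
        regs[rd] = dmem[addr] if opcode == 4 else result
    # Opcodes 0 (NOP) and anything above 5 change no state (assumption for > 5).

    return (pc + 1) % len(imem)        # assumption: PC wraps at IMEM size
```

Calling `step` in a loop with `pc = 0`, sixteen zeroed registers, and zeroed memories advances the model one instruction per call, mirroring the one-instruction-per-cycle behavior described above.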

Module Hierarchy

graph TD
    TOP["cpu_top"]
    TOP --> IMEM["instruction_memory<br/><i>u_imem</i><br/>256 × 32 ROM"]
    TOP --> CTRL["control_unit<br/><i>u_ctrl</i><br/>Combinational"]
    TOP --> REGF["regfile<br/><i>u_regfile</i><br/>16 × 32, 2R/1W"]
    TOP --> ALUM["alu<br/><i>u_alu</i><br/>ADD / SUB / PASS"]
    TOP --> DMEM["data_memory<br/><i>u_dmem</i><br/>256 × 32 RAM"]
    ISA["isa_defs<br/><i>(package)</i>"]
    ISA -.->|imported by| TOP
    ISA -.->|imported by| CTRL
    ISA -.->|imported by| ALUM
    ISA -.->|imported by| REGF

    style TOP fill:#2c3e50,color:#fff,stroke:#1a252f
    style IMEM fill:#6ab04c,color:#fff,stroke:#3e7a28
    style CTRL fill:#be2edd,color:#fff,stroke:#8c1aab
    style REGF fill:#f0932b,color:#fff,stroke:#b5700e
    style ALUM fill:#eb4d4b,color:#fff,stroke:#b33230
    style DMEM fill:#6ab04c,color:#fff,stroke:#3e7a28
    style ISA fill:#535c68,color:#fff,stroke:#2d3436

Instruction Set Architecture (ISA)

The processor uses a fixed 32-bit instruction width with a custom encoding.

Instruction Encoding

  31      26 25    22 21    18 17    14 13                 0
 ┌──────────┬────────┬────────┬────────┬────────────────────┐
 │  opcode  │   rd   │  rs1   │  rs2   │      imm14         │
 │  (6 bit) │ (4 bit)│ (4 bit)│ (4 bit)│     (14 bit)       │
 └──────────┴────────┴────────┴────────┴────────────────────┘
       │         │        │        │            │
       │         │        │        │            └─ Signed immediate (two's complement)
       │         │        │        └────────────── Source register 2 / store data
       │         │        └─────────────────────── Source register 1 / base address
       │         └──────────────────────────────── Destination register
       └────────────────────────────────────────── Operation selector
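
The field layout above can be expressed as a pair of pack/unpack helpers. This is a minimal Python sketch; the function names `encode` and `decode` are illustrative and do not come from the repository.

```python
# Field layout: opcode[31:26] rd[25:22] rs1[21:18] rs2[17:14] imm14[13:0]

def encode(opcode, rd=0, rs1=0, rs2=0, imm=0):
    """Pack fields into a 32-bit word; imm is truncated to 14 bits (two's complement)."""
    return (((opcode & 0x3F) << 26) | ((rd & 0xF) << 22) | ((rs1 & 0xF) << 18)
            | ((rs2 & 0xF) << 14) | (imm & 0x3FFF))

def decode(word):
    """Split a 32-bit word into its raw (not yet sign-extended) fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rd": (word >> 22) & 0xF,
        "rs1": (word >> 18) & 0xF,
        "rs2": (word >> 14) & 0xF,
        "imm14": word & 0x3FFF,
    }

# ADDI x1, x0, 5 uses opcode 3
print(f"{encode(3, rd=1, rs1=0, imm=5):08x}")  # prints 0c400005
```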

Instruction List

All immediates (imm14) are sign-extended to 32 bits before use.

| Mnemonic | Opcode | Type | Assembly | Semantics |
|----------|--------|------|----------|-----------|
| NOP | `0` | - | `NOP` | No operation |
| ADD | `1` | R-type | `ADD rd, rs1, rs2` | `rd ← rs1 + rs2` |
| SUB | `2` | R-type | `SUB rd, rs1, rs2` | `rd ← rs1 - rs2` |
| ADDI | `3` | I-type | `ADDI rd, rs1, imm` | `rd ← rs1 + sext(imm14)` |
| LOAD | `4` | I-type | `LOAD rd, [rs1+imm]` | `rd ← MEM[rs1 + sext(imm14)]` |
| STORE | `5` | S-type | `STORE [rs1+imm], rs2` | `MEM[rs1 + sext(imm14)] ← rs2` |
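
The `sext(imm14)` operation used in the semantics above treats bit 13 as the sign bit. A small Python sketch of that rule follows; the helper names are illustrative.

```python
def sext14(imm14):
    """Interpret a 14-bit field as a signed two's-complement integer."""
    imm14 &= 0x3FFF
    return imm14 - 0x4000 if imm14 & 0x2000 else imm14

def sext14_u32(imm14):
    """Same value, shown as the 32-bit pattern the datapath would carry."""
    return sext14(imm14) & 0xFFFFFFFF

print(sext14(0x0005))            # 5
print(sext14(0x3FFF))            # -1
print(hex(sext14_u32(0x3FFE)))   # 0xfffffffe
```

The representable immediate range is therefore -8192 to 8191, which also bounds the offsets available to `LOAD` and `STORE`.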

Control Signal Matrix

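| Opcode | reg_write | mem_write | alu_src_imm | alu_op | mem_to_reg |
|--------|-----------|-----------|-------------|--------|------------|
| NOP | 0 | 0 | 0 | ADD | 0 |
| ADD | 1 | 0 | 0 | ADD | 0 |
| SUB | 1 | 0 | 0 | SUB | 0 |
| ADDI | 1 | 0 | 1 | ADD | 0 |
| LOAD | 1 | 0 | 1 | ADD | 1 |
| STORE | 0 | 1 | 1 | ADD | 0 |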
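
The matrix above is a pure function of the opcode, so it can be mirrored as a lookup table. The Python sketch below copies the rows verbatim; the `Ctrl` tuple and the `control` function are illustrative names, and mapping unknown opcodes to the NOP row is an assumption rather than documented RTL behavior.

```python
from typing import NamedTuple

class Ctrl(NamedTuple):
    reg_write: int
    mem_write: int
    alu_src_imm: int
    alu_op: str
    mem_to_reg: int

# One row per opcode, in the same column order as the matrix.
CONTROL = {
    0: Ctrl(0, 0, 0, "ADD", 0),  # NOP
    1: Ctrl(1, 0, 0, "ADD", 0),  # ADD
    2: Ctrl(1, 0, 0, "SUB", 0),  # SUB
    3: Ctrl(1, 0, 1, "ADD", 0),  # ADDI
    4: Ctrl(1, 0, 1, "ADD", 1),  # LOAD
    5: Ctrl(0, 1, 1, "ADD", 0),  # STORE
}

def control(opcode):
    # Assumption: opcodes outside 0..5 behave like NOP (no state change).
    return CONTROL.get(opcode, CONTROL[0])
```

Note that no row asserts both `reg_write` and `mem_write`, so each instruction updates at most one of the register file and data memory.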

Data Flow Per Instruction Type

flowchart TB
    subgraph R["R-type (ADD / SUB)"]
        direction LR
        R1["rs1_data"] --> RA["ALU"]
        R2["rs2_data"] --> RA
        RA -->|result| RW["rd"]
    end

    subgraph I["I-type (ADDI)"]
        direction LR
        I1["rs1_data"] --> IA["ALU"]
        I2["sext(imm14)"] --> IA
        IA -->|result| IW["rd"]
    end

    subgraph L["LOAD"]
        direction LR
        L1["rs1_data"] --> LA["ALU<br/>addr calc"]
        L2["sext(imm14)"] --> LA
        LA -->|addr| LM["DMEM"]
        LM -->|rdata| LW["rd"]
    end

    subgraph S["STORE"]
        direction LR
        S1["rs1_data"] --> SA["ALU<br/>addr calc"]
        S2["sext(imm14)"] --> SA
        SA -->|addr| SM["DMEM"]
        S3["rs2_data"] -->|wdata| SM
    end

    style R fill:#e8f5e9,stroke:#2e7d32
    style I fill:#e3f2fd,stroke:#1565c0
    style L fill:#fff3e0,stroke:#e65100
    style S fill:#fce4ec,stroke:#c62828

Verification

The testbench uses cocotb (coroutine-based co-simulation) with a Python reference model (scoreboard) that mirrors the RTL behavior cycle-by-cycle.

Verification Architecture

flowchart LR
    subgraph SIM["Icarus Verilog Simulation"]
        DUT["cpu_top<br/>(DUT)"]
    end

    subgraph COCOTB["cocotb (Python)"]
        DRV["Test Driver<br/>clock, reset,<br/>program inject"]
        SB["Scoreboard<br/>Reference Model"]
        CHK["Checker<br/>reg & mem compare"]
    end

    DRV -->|"drive clk/reset<br/>write IMEM"| DUT
    DUT -->|"read regs[0:15]<br/>read mem[0:31]"| CHK
    DRV -->|"exec_instr()"| SB
    SB -->|"expected state"| CHK
    CHK -->|"PASS / FAIL"| RESULT["results.xml"]

    style DUT fill:#2c3e50,color:#fff,stroke:#1a252f
    style SB fill:#be2edd,color:#fff,stroke:#8c1aab
    style DRV fill:#4a90d9,color:#fff,stroke:#2c5f8a
    style CHK fill:#27ae60,color:#fff,stroke:#1e8449
    style RESULT fill:#f39c12,color:#fff,stroke:#d68910
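
The checker step in the diagram above amounts to comparing two snapshots of architectural state. The sketch below works on plain Python lists so that it stays self-contained; in a real cocotb test the DUT values would be read through simulator handles, whose hierarchical paths are not given here. The function name `compare_state` and the 32-word memory window (taken from the `mem[0:31]` label in the diagram) are assumptions.

```python
def compare_state(dut_regs, dut_mem, model_regs, model_mem, mem_words=32):
    """Return a list of human-readable mismatches (empty when the states agree)."""
    mismatches = []
    for i, (got, exp) in enumerate(zip(dut_regs, model_regs)):
        if got != exp:
            mismatches.append(f"x{i}: dut=0x{got:08x} model=0x{exp:08x}")
    for addr in range(mem_words):
        if dut_mem[addr] != model_mem[addr]:
            mismatches.append(
                f"mem[{addr}]: dut=0x{dut_mem[addr]:08x} model=0x{model_mem[addr]:08x}"
            )
    return mismatches
```

Returning the full list of mismatches, rather than failing on the first one, tends to make a failing cycle easier to diagnose.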

Test Suites

| Test Module | Type | Description |
|-------------|------|-------------|
| `test_basic` | Directed | Reset behavior, `x0` immutability, ADD/SUB/ADDI, LOAD/STORE correctness against `program.hex` |
| `test_randomized` | Constrained random | 64 random instructions checked cycle-by-cycle against scoreboard model |
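
To illustrate what "constrained random" can mean for this ISA, the sketch below generates 64 instruction words from a seeded generator. The actual constraints used by `test_randomized` are not described here, so the ones shown (memory accesses use `x0` as the base with an offset in 0..31, other immediates span the full signed 14-bit range) are assumptions, as is the function name `random_instr`.

```python
import random

OPCODES = {"NOP": 0, "ADD": 1, "SUB": 2, "ADDI": 3, "LOAD": 4, "STORE": 5}

def random_instr(rng, max_addr=31):
    """Return one random 32-bit instruction word under the assumed constraints."""
    op = rng.choice(list(OPCODES.values()))
    rd, rs1, rs2 = rng.randrange(16), rng.randrange(16), rng.randrange(16)
    if op in (OPCODES["LOAD"], OPCODES["STORE"]):
        # Keep memory traffic inside a small, checkable window.
        rs1 = 0
        imm = rng.randint(0, max_addr)
    else:
        imm = rng.randint(-8192, 8191)
    return (op << 26) | (rd << 22) | (rs1 << 18) | (rs2 << 14) | (imm & 0x3FFF)

rng = random.Random(0xDEAD)          # fixed seed makes the program reproducible
program = [random_instr(rng) for _ in range(64)]
```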

Sample Program (`sim/program.hex`)

The directed test validates this 8-instruction program:

Addr  Hex         Assembly              Effect
───── ─────────── ───────────────────── ──────────────────────────
0     00000000    NOP                   (no-op)
1     0c400005    ADDI x1, x0, 5        x1 = 5
2     0c800003    ADDI x2, x0, 3        x2 = 3
3     04c48000    ADD  x3, x1, x2       x3 = 8
4     1400c000    STORE [x0+0], x3      MEM[0] = 8
5     11000000    LOAD x4, [x0+0]       x4 = MEM[0] = 8
6     09508000    SUB  x5, x4, x2       x5 = 5
7     00000000    NOP                   (no-op)

Expected final state: x1=5, x2=3, x3=8, x4=8, x5=5, MEM[0]=8
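
The hex words in the listing can be cross-checked against their assembly with a small disassembler. This Python sketch is illustrative only (the name `disassemble` and the `.word` fallback for unknown opcodes are not from the repository); it applies the instruction encoding and mnemonic table described earlier.

```python
MNEMONICS = {0: "NOP", 1: "ADD", 2: "SUB", 3: "ADDI", 4: "LOAD", 5: "STORE"}

def disassemble(word):
    """Render one 32-bit instruction word in the README's assembly syntax."""
    op = MNEMONICS.get((word >> 26) & 0x3F)
    rd, rs1, rs2 = (word >> 22) & 0xF, (word >> 18) & 0xF, (word >> 14) & 0xF
    imm = word & 0x3FFF
    if imm & 0x2000:          # sign-extend imm14
        imm -= 0x4000
    if op == "NOP":
        return "NOP"
    if op in ("ADD", "SUB"):
        return f"{op} x{rd}, x{rs1}, x{rs2}"
    if op == "ADDI":
        return f"ADDI x{rd}, x{rs1}, {imm}"
    if op == "LOAD":
        return f"LOAD x{rd}, [x{rs1}{imm:+d}]"
    if op == "STORE":
        return f"STORE [x{rs1}{imm:+d}], x{rs2}"
    return f".word 0x{word:08x}"   # unknown opcode

PROGRAM = [0x00000000, 0x0c400005, 0x0c800003, 0x04c48000,
           0x1400c000, 0x11000000, 0x09508000, 0x00000000]

for addr, word in enumerate(PROGRAM):
    print(addr, f"{word:08x}", disassemble(word))
```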

Running Tests

Prerequisites

Icarus Verilog (12.0+)
Python 3.9+
cocotb (pip install cocotb)

Commands

# Run directed tests
make -C tb SIM=icarus MODULE=test_basic sim

# Run randomized tests
make -C tb SIM=icarus MODULE=test_randomized sim

# Run both test suites
make test

# Run randomized tests with a custom seed
TEST_SEED=0xDEAD make -C tb SIM=icarus MODULE=test_randomized sim
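
# How a test might consume TEST_SEED is sketched below in Python. This is an
# assumption about the mechanism, not the repository's code: the helper name
# `make_rng` and the default seed are hypothetical.

```python
import os
import random

def make_rng(default_seed=1):
    """Build a seeded RNG from the TEST_SEED environment variable, if set."""
    raw = os.environ.get("TEST_SEED")
    # int(..., 0) accepts both decimal ("57005") and prefixed hex ("0xDEAD").
    seed = int(raw, 0) if raw else default_seed
    return seed, random.Random(seed)

seed, rng = make_rng()
print(f"using seed {seed:#x}")   # logging the seed lets a failing run be replayed
```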

Convenience Targets (from `tb/`)

cd tb
make basic        # directed tests only
make randomized   # randomized tests only
make all          # both suites

Continuous Integration

GitHub Actions runs both test suites automatically on every push and pull request to main.

flowchart LR
    PUSH["Push / PR<br/>to main"] --> J1["test_basic<br/>(icarus)"]
    PUSH --> J2["test_randomized<br/>(icarus)"]
    J1 --> R1{{"PASS /<br/>FAIL"}}
    J2 --> R2{{"PASS /<br/>FAIL"}}
    R1 -->|fail| A1["Upload VCD<br/>waveform"]
    R2 -->|fail| A2["Upload VCD<br/>waveform"]

    style PUSH fill:#4a90d9,color:#fff,stroke:#2c5f8a
    style J1 fill:#27ae60,color:#fff,stroke:#1e8449
    style J2 fill:#27ae60,color:#fff,stroke:#1e8449
    style A1 fill:#e74c3c,color:#fff,stroke:#c0392b
    style A2 fill:#e74c3c,color:#fff,stroke:#c0392b

The workflow installs Icarus Verilog and cocotb on ubuntu-latest with Python 3.11, runs each test module as a separate matrix job, and uploads VCD waveform artifacts on failure.

Project Structure

one-stage-processor/
├── rtl/                        # Synthesizable SystemVerilog
│   ├── isa_defs.sv             # Opcodes, types, ALU op enum
│   ├── cpu_top.sv              # Top-level: PC, decode, datapath wiring
│   ├── alu.sv                  # ADD / SUB / pass-through
│   ├── regfile.sv              # 16×32 register file (x0 = 0)
│   ├── control_unit.sv         # Combinational opcode → control signals
│   ├── instruction_memory.sv   # 256-word ROM (loaded from hex)
│   └── data_memory.sv          # 256-word RAM (sync write, async read)
├── tb/
│   ├── cocotb/
│   │   ├── test_basic.py       # Directed tests
│   │   ├── test_randomized.py  # Constrained random tests
│   │   └── scoreboard.py       # Python ISA reference model
│   └── Makefile                # cocotb simulation runner
├── sim/
│   └── program.hex             # Pre-loaded instruction memory image
├── .github/
│   └── workflows/
│       └── ci.yml              # GitHub Actions CI pipeline
├── Makefile                    # Top-level build entry point
└── README.md

License

This project is provided for educational and reference purposes.
