## Index
- [Description](#description)
- [Area usage and maximal frequency](#area-usage-and-maximal-frequency)
- [Dependencies](#dependencies)
- [CPU generation](#cpu-generation)
- [Regression tests](#regression-tests)
- [Interactive debug of the simulated CPU via GDB OpenOCD and Verilator](#interactive-debug-of-the-simulated-cpu-via-gdb-openocd-and-verilator)
- [Using eclipse to run the software and debug it](#using-eclipse-to-run-the-software-and-debug-it)
- [Briey SoC](#briey-soc)
- [Murax SoC](#murax-soc)
- [Build the RISC-V GCC](#build-the-risc-v-gcc)
- [CPU parametrization and instantiation example](#cpu-parametrization-and-instantiation-example)
- [Add a custom instruction to the CPU via the plugin system](#add-a-custom-instruction-to-the-cpu-via-the-plugin-system)

## Description
This repository hosts a RISC-V implementation written in SpinalHDL. Here are some specs:
- RV32IM instruction set
- Pipelined on 5 stages (Fetch, Decode, Execute, Memory, WriteBack)
- 1.16 DMIPS/Mhz when all features are enabled
- Optimized for FPGA
- AXI4 and Avalon ready
- Optional MUL/DIV extension
- Optional instruction and data caches
- Optional MMU
- Optional debug extension allowing Eclipse debugging via a GDB >> OpenOCD >> JTAG connection
- Optional interrupts and exception handling with the Machine and User modes from the riscv-privileged-v1.9.1 spec
- Two implementations of the shift instructions: single cycle / shiftNumber cycles
- Each stage can have bypass or interlock hazard logic
- FreeRTOS port: https://github.com/Dolu1990/FreeRTOS-RISCV

The hardware description of this CPU is done using a very software-oriented approach
(without any overhead in the generated hardware). Here is a list of the software concepts used:
- Very few things are fixed. Nearly everything is plugin based. The PC manager is a plugin, the register file is a plugin, the hazard controller is a plugin ...
- There is an automatic tool which allows plugins to insert data in the pipeline at a given stage, and allows other plugins to read it in other stages through automatic pipelining.
- There is a service system which provides a very dynamic framework. For instance, a plugin could provide an exception service which could then be used by other plugins to emit exceptions from the pipeline.
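
The service idea can be pictured in plain Scala. The following is only a sketch of the pattern, not VexRiscv code: `SketchPlugin`, `SketchPipeline`, `ExceptionServiceSketch`, `CsrLikePlugin` and `DecoderLikePlugin` are hypothetical names invented for this illustration, and the lookup by `Class[T]` is an assumption modelled on the `pipeline.service(classOf[DecoderService])` call used in the custom instruction example of this README.

```scala
import scala.collection.mutable.ArrayBuffer

// Plain-Scala illustration only: none of these names are VexRiscv APIs.
object ServiceSketch {
  // A plugin gets a setup callback in which it can look up services.
  trait SketchPlugin {
    def setup(pipeline: SketchPipeline): Unit = {}
  }

  // A service is just a trait that some plugin chooses to implement.
  trait ExceptionServiceSketch {
    def emitException(cause: Int): Unit
  }

  class SketchPipeline(val plugins: Seq[SketchPlugin]) {
    // Return the first plugin which implements the requested service type.
    def service[T](clazz: Class[T]): T =
      plugins.find(p => clazz.isInstance(p))
        .map(p => clazz.cast(p))
        .getOrElse(throw new NoSuchElementException("no plugin provides " + clazz.getName))
  }

  // Provider: implements the service and records the emitted causes.
  class CsrLikePlugin extends SketchPlugin with ExceptionServiceSketch {
    val emitted = ArrayBuffer[Int]()
    def emitException(cause: Int): Unit = emitted.append(cause)
  }

  // Consumer: asks the pipeline for the service during setup and uses it.
  class DecoderLikePlugin extends SketchPlugin {
    override def setup(pipeline: SketchPipeline): Unit =
      pipeline.service(classOf[ExceptionServiceSketch]).emitException(2)
  }

  def main(args: Array[String]): Unit = {
    val provider = new CsrLikePlugin
    val pipeline = new SketchPipeline(Seq(new DecoderLikePlugin, provider))
    pipeline.plugins.foreach(_.setup(pipeline))
    println(provider.emitted)
  }
}
```

The consumer never names the provider class; it only depends on the service trait, which is what lets plugins be swapped freely in this kind of design.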
## Area usage and maximal frequency
The following numbers were obtained by synthesizing the CPU as toplevel without any specific synthesis option to save area or to get a better maximal frequency (neutral).<br>
The clock constraint is set to an unattainable value, which tends to increase the design area.<br>
The corresponding CPU configurations can be found in src/main/scala/vexriscv/demo.
```
VexRiscv smallest (RV32I, 0.47 DMIPS/Mhz, no datapath bypass, no interrupt) ->
  Artix 7    -> 324 Mhz 478 LUT 539 FF
  Cyclone V  -> 187 Mhz 341 ALMs
  Cyclone IV -> 180 Mhz 736 LUT 529 FF
  Cyclone II -> 156 Mhz 740 LUT 528 FF

VexRiscv smallest (RV32I, 0.47 DMIPS/Mhz, no datapath bypass) ->
  Artix 7    -> 335 Mhz 560 LUT 589 FF
  Cyclone V  -> 182 Mhz 420 ALMs
  Cyclone IV -> 160 Mhz 852 LUT 579 FF
  Cyclone II -> 144 Mhz 844 LUT 578 FF

VexRiscv small and productive (RV32I, 0.78 DMIPS/Mhz) ->
  Artix 7    -> 330 Mhz 719 LUT 557 FF
  Cyclone V  -> 153 Mhz 539 ALMs
  Cyclone IV -> 148 Mhz 1,127 LUT 552 FF
  Cyclone II -> 114 Mhz 1,133 LUT 551 FF

VexRiscv full no cache (RV32IM, 1.14 DMIPS/Mhz, single cycle barrel shifter, debug module, catch exceptions, static branch) ->
  Artix 7    -> 291 Mhz 1403 LUT 936 FF
  Cyclone V  -> 147 Mhz 928 ALMs
  Cyclone IV -> 137 Mhz 1,910 LUT 959 FF
  Cyclone II -> 110 Mhz 1,940 LUT 958 FF

VexRiscv full (RV32IM, 1.14 DMIPS/Mhz, I$, D$, single cycle barrel shifter, debug module, catch exceptions, static branch) ->
  Artix 7    -> 249 Mhz 1862 LUT 1498 FF
  Cyclone V  -> 133 Mhz 1272 ALMs
  Cyclone IV -> 116 Mhz 2727 LUT 1759 FF
  Cyclone II -> 105 Mhz 2771 LUT 1758 FF

VexRiscv full with MMU (RV32IM, 1.16 DMIPS/Mhz, I$, D$, single cycle barrel shifter, debug module, catch exceptions, dynamic branch, MMU) ->
  Artix 7    -> 210 Mhz 2104 LUT 2017 FF
  Cyclone V  -> 115 Mhz 1503 ALMs
  Cyclone IV -> 100 Mhz 3145 LUT 2278 FF
  Cyclone II -> 92 Mhz 3195 LUT 2279 FF
```
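
The two figures given for each configuration can be combined: multiplying DMIPS/Mhz by the maximal frequency gives the Dhrystone throughput a configuration would reach at that frequency. The sketch below does this for three Artix 7 rows of the table. The `Row` class and the object name are illustrative only, and the result is an upper bound under the assumption that the design really runs at the reported maximal frequency.

```scala
object DmipsSketch {
  // One table row: configuration name, DMIPS/Mhz and Artix 7 maximal frequency in Mhz.
  final case class Row(name: String, dmipsPerMhz: Double, fmaxMhz: Double) {
    // Dhrystone throughput when clocked at the maximal frequency.
    def dmips: Double = dmipsPerMhz * fmaxMhz
  }

  // Values copied from the Artix 7 lines of the table above.
  val artix7 = Seq(
    Row("smallest, no interrupt", 0.47, 324.0),
    Row("full", 1.14, 249.0),
    Row("full with MMU", 1.16, 210.0)
  )

  def main(args: Array[String]): Unit =
    artix7.foreach(r => println(f"${r.name}%-24s ${r.dmips}%.1f DMIPS"))
}
```

With these numbers, the configuration with the best DMIPS/Mhz (full with MMU) does not have the best throughput at maximal frequency, because its clock is slower than that of the full configuration without MMU.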
## Dependencies
On Ubuntu 14 :
```sh
# JAVA JDK 7 or 8
sudo apt-get install openjdk-8-jdk
# SBT
echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
sudo apt-get update
sudo apt-get install sbt
# Verilator (for sim only)
sudo apt-get install git make autoconf g++ flex bison
git clone http://git.veripool.org/git/verilator # Only first time
unsetenv VERILATOR_ROOT # For csh; ignore error if on bash
unset VERILATOR_ROOT # For bash
cd verilator
git pull # Make sure we're up-to-date
git tag # See what versions exist
autoconf # Create ./configure script
./configure
make
sudo make install
```
VexRiscv needs the unreleased master-head of SpinalHDL:
```sh
# Compile and locally publish the latest SpinalHDL
rm -rf SpinalHDL
git clone https://github.com/SpinalHDL/SpinalHDL.git
cd SpinalHDL
sbt clean compile publish-local
cd ..
```
## CPU generation
You can find two examples of CPU instantiation in:
- src/main/scala/vexriscv/demo/GenFull.scala
- src/main/scala/vexriscv/demo/GenSmallest.scala

NOTE:
VexRiscv may need the unreleased master-head of SpinalHDL. If it fails to compile, get the SpinalHDL repository and run "sbt clean compile publish-local" in it, as described in the Dependencies chapter.

To generate the corresponding RTL as a VexRiscv.v file, run (it can take a while the first time you run it):

```sh
sbt "run-main vexriscv.demo.GenFull"
# or
sbt "run-main vexriscv.demo.GenSmallest"
```
## Regression tests
To run the tests (the Verilator simulator is required), go to the src/test/cpp/regression folder and run:
```sh
# To test the GenFull CPU
make clean run
# To test the GenSmallest CPU
make clean run IBUS=SIMPLE DBUS=SIMPLE CSR=no MMU=no DEBUG_PLUGIN=no MUL=no DIV=no
```
Those self-checking tests include:
- ISA tests from https://github.com/riscv/riscv-tests/tree/master/isa
- Dhrystone benchmark
- 24 FreeRTOS tests
- Some handwritten tests to check the CSR, debug module and MMU plugins

You can enable the FreeRTOS tests by adding 'FREERTOS=yes' to the command line, but they take time to run. The regression also uses THREAD_COUNT host CPU threads to run multiple tests in parallel.
## Interactive debug of the simulated CPU via GDB OpenOCD and Verilator
Proceed as described for the regression tests, but add DEBUG_PLUGIN_EXTERNAL=yes to the make arguments.
This works for GenFull, but not for GenSmallest, as that configuration has no debug module.

Then you can use the https://github.com/SpinalHDL/openocd_riscv tool to create a GDB server connected to the target (the simulated CPU):
```sh
# In the VexRiscv repository, run the simulation to which OpenOCD can connect =>
sbt "run-main vexriscv.demo.GenFull"
cd src/test/cpp/regression
make run DEBUG_PLUGIN_EXTERNAL=yes

# In the openocd git, after building it =>
src/openocd -c "set VEXRISCV_YAML PATH_TO_THE_GENERATED_CPU0_YAML_FILE" -f tcl/target/vexriscv_sim.cfg

# Run a GDB session with an ELF RISC-V executable (GenFull CPU)
YourRiscvToolsPath/bin/riscv32-unknown-elf-gdb VexRiscvRepo/src/test/resources/elf/uart.elf
target remote localhost:3333
monitor reset halt
load
continue

# Now it should print messages in the Verilator simulation of the CPU
```
## Using eclipse to run the software and debug it
You can use Eclipse with the Zylin Embedded CDT plugin to do it (http://opensource.zylin.com/embeddedcdt.html). Tested with Helios Service Release 2 and the corresponding Zylin plugin.

## Briey SoC
As a demonstrator, a SoC named Briey is implemented in src/main/scala/vexriscv/demo/Briey.scala. This SoC is very similar to the Pinsec one :
<img src="http://cdn.rawgit.com/SpinalHDL/SpinalDoc/dd17971aa549ccb99168afd55aad274bbdff1e88/asset/picture/pinsec_hardware.svg" align="middle" width="300">
To generate the Briey SoC Hardware :
```sh
sbt "run-main vexriscv.demo.Briey"
```
To run the Verilator simulation of the Briey SoC, which can then be connected to OpenOCD/GDB, first install these dependencies:
```sh
sudo apt-get install build-essential xorg-dev libudev-dev libts-dev libgl1-mesa-dev libglu1-mesa-dev libasound2-dev libpulse-dev libopenal-dev libogg-dev libvorbis-dev libaudiofile-dev libpng12-dev libfreetype6-dev libusb-dev libdbus-1-dev zlib1g-dev libdirectfb-dev libsdl2-dev
```
Then go in src/test/cpp/briey and run the simulation with (UART TX is printed in the terminal, VGA is displayed in a GUI):
```sh
make clean run
```
To connect OpenOCD (https://github.com/SpinalHDL/openocd_riscv) to the simulation :
```sh
src/openocd -f tcl/interface/jtag_tcp.cfg -c "set BRIEY_CPU0_YAML /home/spinalvm/Spinal/VexRiscv/cpu0.yaml" -f tcl/target/briey.cfg
```
You can find multiple software examples and demos here: https://github.com/SpinalHDL/BrieySoftware

You can find some FPGA projects which instantiate the Briey SoC here (DE1-SoC, DE0-Nano): https://drive.google.com/drive/folders/0B-CqLXDTaMbKZGdJZlZ5THAxRTQ?usp=sharing

Here are some measurements of the Briey SoC timings and area:
```
Artix 7 -> 256 Mhz 3302 LUT 3524 FF
Cyclone V -> 126 Mhz 2,295 ALMs
Cyclone IV -> 121 Mhz 4,781 LUT 3,713 FF
Cyclone II -> 104 Mhz 4,902 LUT 3,718 FF
```
## Murax SoC
Murax is a very light SoC (it fits in an ICE40 FPGA) which can work without any external component:
- VexRiscv RV32I[M]
- JTAG debugger (Eclipse/GDB/OpenOCD ready)
- 8 kB of on-chip RAM
- Interrupt support
- APB bus for peripherals
- 32 GPIO pins
- One 16-bit prescaler, two 16-bit timers
- One UART with TX/RX FIFO

Depending on the CPU configuration, on the ICE40-hx8k FPGA with icestorm for synthesis, the full SoC gets the following area/performance:
- RV32I interlocked stages => 51 Mhz, 2387 LC, 0.37 DMIPS/Mhz
- RV32I bypassed stages => 45 Mhz, 2718 LC, 0.55 DMIPS/Mhz

You can find its implementation here: src/main/scala/vexriscv/demo/Murax.scala

To generate the Murax SoC Hardware :
```sh
sbt "run-main vexriscv.demo.Murax"
```
Then go in src/test/cpp/murax and run the simulation with :
```sh
make clean run
```
To connect OpenOCD (https://github.com/SpinalHDL/openocd_riscv) to the simulation :
```sh
src/openocd -f tcl/interface/jtag_tcp.cfg -c "set MURAX_CPU0_YAML /home/spinalvm/Spinal/VexRiscv/cpu0.yaml" -f tcl/target/murax.cfg
```
Here are some measurements of the Murax SoC timings and area for both SoC versions:
```
Murax interlocked stages (0.37 DMIPS/Mhz) ->
  Artix 7    -> 306 Mhz 1021 LUT 1291 FF
  Cyclone V  -> 173 Mhz 752 ALMs
  Cyclone IV -> 140 Mhz 1483 LUT 1,250 FF
  Cyclone II -> 127 Mhz 1484 LUT 1,249 FF
  ICE40-HX   -> 51 Mhz 2387 LC (icestorm)

MuraxFast bypassed stages (0.55 DMIPS/Mhz) ->
  Artix 7    -> 310 Mhz 1192 LUT 1388 FF
  Cyclone V  -> 160 Mhz 893 ALMs
  Cyclone IV -> 142 Mhz 1726 LUT 1,284 FF
  Cyclone II -> 106 Mhz 1714 LUT 1,283 FF
  ICE40-HX   -> 45 Mhz 2718 LC (icestorm)
```
There are some scripts to generate the SoC and call the icestorm toolchain here: scripts/Murax/

## Build the RISC-V GCC
To install in /opt/ the rv32i and rv32im gcc, do the following (will take hours):
```sh
# Be careful, the git clone sometimes fails to fetch riscv-gnu-toolchain completely.
sudo apt-get install autoconf automake autotools-dev curl libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev -y
git clone --recursive https://github.com/riscv/riscv-gnu-toolchain riscv-gnu-toolchain
cd riscv-gnu-toolchain
echo "Starting RISC-V Toolchain build process"
ARCH=rv32im
rm -rf $ARCH
mkdir $ARCH; cd $ARCH
../configure --prefix=/opt/$ARCH --with-arch=$ARCH --with-abi=ilp32
sudo make -j4
cd ..
ARCH=rv32i
rm -rf $ARCH
mkdir $ARCH; cd $ARCH
../configure --prefix=/opt/$ARCH --with-arch=$ARCH --with-abi=ilp32
sudo make -j4
cd ..
echo -e "\\nRISC-V Toolchain installation completed!"
```
## CPU parametrization and instantiation example
You can find many examples of different configurations in the https://github.com/SpinalHDL/VexRiscv/tree/master/src/main/scala/vexriscv/demo folder. Here is one:

```scala
import vexriscv._
import vexriscv.plugin._

//Instantiate one VexRiscv
val cpu = new VexRiscv(
  //Provide a configuration instance
  config = VexRiscvConfig(
    //Provide a list of plugins which will further add their logic into the CPU
    plugins = List(
      new PcManagerSimplePlugin(
        resetVector = 0x00000000l,
        relaxedPcCalculation = true
      ),
      new IBusSimplePlugin(
        interfaceKeepData = false,
        catchAccessFault = false
      ),
      new DBusSimplePlugin(
        catchAddressMisaligned = false,
        catchAccessFault = false
      ),
      new DecoderSimplePlugin(
        catchIllegalInstruction = false
      ),
      new RegFilePlugin(
        regFileReadyKind = plugin.SYNC,
        zeroBoot = true
      ),
      new IntAluPlugin,
      new SrcPlugin(
        separatedAddSub = false,
        executeInsertion = false
      ),
      new LightShifterPlugin,
      new HazardSimplePlugin(
        bypassExecute = false,
        bypassMemory = false,
        bypassWriteBack = false,
        bypassWriteBackBuffer = false
      ),
      new BranchPlugin(
        earlyBranch = false,
        catchAddressMisaligned = false,
        prediction = NONE
      ),
      new YamlPlugin("cpu0.yaml")
    )
  )
)
```
## Add a custom instruction to the CPU via the plugin system
Here is an example of a simple plugin which adds a simple SIMD_ADD instruction:

```scala
import spinal.core._
import vexriscv.plugin.Plugin
import vexriscv.{Stageable, DecoderService, VexRiscv}

//This plugin example adds a new instruction named SIMD_ADD which does the following :
//
//RD : Regfile Destination, RS : Regfile Source
//RD( 7 downto  0) = RS1( 7 downto  0) + RS2( 7 downto  0)
//RD(15 downto  8) = RS1(15 downto  8) + RS2(15 downto  8)
//RD(23 downto 16) = RS1(23 downto 16) + RS2(23 downto 16)
//RD(31 downto 24) = RS1(31 downto 24) + RS2(31 downto 24)
//
//Instruction encoding :
//0000011----------000-----0110011
//       |RS2||RS1|   |RD |
//
//Note : RS1, RS2, RD positions follow the RISC-V spec and are common to all instructions of the ISA

class SimdAddPlugin extends Plugin[VexRiscv]{
  //Define the concept of IS_SIMD_ADD signals, which specify if the current instruction is destined for this plugin
  object IS_SIMD_ADD extends Stageable(Bool)

  //Callback to setup the plugin and ask for different services
  override def setup(pipeline: VexRiscv): Unit = {
    import pipeline.config._

    //Retrieve the DecoderService instance
    val decoderService = pipeline.service(classOf[DecoderService])

    //Specify the IS_SIMD_ADD default value when instructions are decoded
    decoderService.addDefault(IS_SIMD_ADD, False)

    //Specify the instruction decoding which should be applied when the instruction matches the 'key' pattern
    decoderService.add(
      //Bit pattern of the new SIMD_ADD instruction
      key = M"0000011----------000-----0110011",

      //Decoding specification when the 'key' pattern is recognized in the instruction
      List(
        IS_SIMD_ADD              -> True,
        REGFILE_WRITE_VALID      -> True, //Enable the register file write
        BYPASSABLE_EXECUTE_STAGE -> True, //Notify the hazard management unit that the instruction result is already accessible in the EXECUTE stage (Bypass ready)
        BYPASSABLE_MEMORY_STAGE  -> True, //Same as above but for the memory stage
        RS1_USE                  -> True, //Notify the hazard management unit that this instruction uses the RS1 value
        RS2_USE                  -> True  //Same as above but for RS2
      )
    )
  }

  override def build(pipeline: VexRiscv): Unit = {
    import pipeline._
    import pipeline.config._

    //Define some signals used internally by the plugin
    val rs1 = execute.input(RS1).asUInt //32 bits UInt value of the regfile[RS1]
    val rs2 = execute.input(RS2).asUInt
    val rd = UInt(32 bits)

    //Do some computation
    rd( 7 downto  0) := rs1( 7 downto  0) + rs2( 7 downto  0)
    rd(15 downto  8) := rs1(15 downto  8) + rs2(15 downto  8)
    rd(23 downto 16) := rs1(23 downto 16) + rs2(23 downto 16)
    rd(31 downto 24) := rs1(31 downto 24) + rs2(31 downto 24)

    //When the instruction is a SIMD_ADD one, write the result into the register file data path
    when(execute.input(IS_SIMD_ADD)){
      execute.output(REGFILE_WRITE_DATA) := rd.asBits
    }
  }
}
```
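
To check the hardware against an expected result, or to emit the instruction from software, a small plain-Scala model can help. The sketch below is not part of VexRiscv: `SimdAddModel`, `simdAdd` and `encode` are hypothetical helper names, and it assumes four non-overlapping 8-bit lanes whose carries are dropped, together with the bit layout given in the "Instruction encoding" comment of the plugin.

```scala
object SimdAddModel {
  // Reference behaviour: four independent 8-bit additions, carries do not cross lanes.
  def simdAdd(rs1: Int, rs2: Int): Int =
    (0 until 4).map { lane =>
      val shift = lane * 8
      val sum = ((rs1 >>> shift) & 0xFF) + ((rs2 >>> shift) & 0xFF)
      (sum & 0xFF) << shift
    }.reduce(_ | _)

  // Instruction word for the pattern 0000011 | rs2 | rs1 | 000 | rd | 0110011.
  def encode(rd: Int, rs1: Int, rs2: Int): Int = {
    require(Seq(rd, rs1, rs2).forall(r => r >= 0 && r < 32), "register index out of range")
    (0x03 << 25) | (rs2 << 20) | (rs1 << 15) | (0 << 12) | (rd << 7) | 0x33
  }

  def main(args: Array[String]): Unit = {
    // Lane-wise sum of two sample operands
    println(f"0x${simdAdd(0x01FF7F80, 0x01010180)}%08X")
    // Instruction word for SIMD_ADD x10, x11, x12
    println(f"0x${encode(rd = 10, rs1 = 11, rs2 = 12)}%08X")
  }
}
```

Since the toolchain does not know this custom instruction, one way to use the encoded word in a test program is to place it with an assembler data directive such as `.word`, using the value returned by `encode`.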
Then, if you want to add this plugin to a given CPU, you just need to add it to its parameterized plugin list.

This example is a very simple one, but each plugin can really have access to the whole CPU:
- Halt a given stage of the CPU
- Unschedule instructions
- Emit an exception
- Introduce a new instruction decoding specification
- Ask to jump the PC somewhere
- Read signals published by other plugins
- Override published signal values
- Provide an alternative implementation
- ...