Merge branch 'rework_fetch' into dev

Dolu1990 2020-03-07 18:22:46 +01:00
commit 97db4f02a0
22 changed files with 1301 additions and 983 deletions

@ -65,55 +65,54 @@ dhrystone binaries which fit inside a 4KB I$ and 4KB D$ (I already had this case
The CPU configurations used below can be found in the `src/scala/vexriscv/demo` directory.
```
VexRiscv smallest (RV32I, 0.52 DMIPS/Mhz, no datapath bypass, no interrupt) ->
Artix 7 -> 233 Mhz 494 LUT 505 FF
Cyclone V -> 193 Mhz 347 ALMs
Cyclone IV -> 179 Mhz 730 LUT 494 FF
VexRiscv small (RV32I, 0.52 DMIPS/Mhz, no datapath bypass, no interrupt) ->
Artix 7 -> 243 Mhz 504 LUT 505 FF
Cyclone V -> 174 Mhz 352 ALMs
Cyclone IV -> 179 Mhz 731 LUT 494 FF
iCE40 -> 92 Mhz 1130 LC
VexRiscv smallest (RV32I, 0.52 DMIPS/Mhz, no datapath bypass) ->
Artix 7 -> 232 Mhz 538 LUT 562 FF
Cyclone V -> 189 Mhz 387 ALMs
Cyclone IV -> 175 Mhz 829 LUT 550 FF
VexRiscv small (RV32I, 0.52 DMIPS/Mhz, no datapath bypass) ->
Artix 7 -> 240 Mhz 556 LUT 566 FF
Cyclone V -> 194 Mhz 394 ALMs
Cyclone IV -> 174 Mhz 831 LUT 555 FF
iCE40 -> 85 Mhz 1292 LC
VexRiscv small and productive (RV32I, 0.82 DMIPS/Mhz) ->
Artix 7 -> 226 Mhz 689 LUT 531 FF
Cyclone V -> 145 Mhz 499 ALMs
Cyclone IV -> 150 Mhz 1,111 LUT 525 FF
Artix 7 -> 232 Mhz 816 LUT 534 FF
Cyclone V -> 155 Mhz 492 ALMs
Cyclone IV -> 155 Mhz 1,111 LUT 530 FF
iCE40 -> 63 Mhz 1596 LC
VexRiscv small and productive with I$ (RV32I, 0.70 DMIPS/Mhz, 4KB-I$) ->
Artix 7 -> 230 Mhz 734 LUT 564 FF
Cyclone V -> 145 Mhz 511 ALMs
Cyclone IV -> 144 Mhz 1,145 LUT 531 FF
Artix 7 -> 220 Mhz 730 LUT 570 FF
Cyclone V -> 142 Mhz 501 ALMs
Cyclone IV -> 150 Mhz 1,139 LUT 536 FF
iCE40 -> 66 Mhz 1680 LC
VexRiscv full no cache (RV32IM, 1.21 DMIPS/Mhz 2.30 Coremark/Mhz, single cycle barrel shifter, debug module, catch exceptions, static branch) ->
Artix 7 -> 219 Mhz 1537 LUT 977 FF
Cyclone V -> 139 Mhz 958 ALMs
Cyclone IV -> 135 Mhz 2,011 LUT 968 FF
Artix 7 -> 216 Mhz 1418 LUT 949 FF
Cyclone V -> 133 Mhz 933 ALMs
Cyclone IV -> 143 Mhz 2,076 LUT 972 FF
VexRiscv full (RV32IM, 1.21 DMIPS/Mhz 2.30 Coremark/Mhz with cache thrashing, 4KB-I$,4KB-D$, single cycle barrel shifter, debug module, catch exceptions, static branch) ->
Artix 7 -> 193 Mhz 1706 LUT 1172 FF
Cyclone V -> 144 Mhz 1,128 ALMs
Cyclone IV -> 133 Mhz 2,298 LUT 1,096 FF
Artix 7 -> 199 Mhz 1840 LUT 1158 FF
Cyclone V -> 141 Mhz 1,166 ALMs
Cyclone IV -> 131 Mhz 2,407 LUT 1,067 FF
VexRiscv full max dmips/mhz -> (RV32IM, 1.44 DMIPS/Mhz 2.70 Coremark/Mhz, 16KB-I$,16KB-D$, single cycle barrel shifter, debug module, catch exceptions, dynamic branch prediction in the fetch stage, branch and shift operations done in the Execute stage) ->
Artix 7 -> 140 Mhz 1767 LUT 1128 FF
Cyclone V -> 90 Mhz 1,089 ALMs
Cyclone IV -> 79 Mhz 2,336 LUT 1,048 FF
VexRiscv full max perf (HZ*IPC) -> (RV32IM, 1.38 DMIPS/Mhz 2.57 Coremark/Mhz, 8KB-I$,8KB-D$, single cycle barrel shifter, debug module, catch exceptions, dynamic branch prediction in the fetch stage, branch and shift operations done in the Execute stage) ->
Artix 7 -> 200 Mhz 1935 LUT 1216 FF
Cyclone V -> 130 Mhz 1,166 ALMs
Cyclone IV -> 126 Mhz 2,484 LUT 1,120 FF
VexRiscv full with MMU (RV32IM, 1.24 DMIPS/Mhz 2.35 Coremark/Mhz, with cache thrashing, 4KB-I$, 4KB-D$, single cycle barrel shifter, debug module, catch exceptions, dynamic branch, MMU) ->
Artix 7 -> 161 Mhz 1985 LUT 1585 FF
Cyclone V -> 124 Mhz 1,319 ALMs
Cyclone IV -> 122 Mhz 2,710 LUT 1,501 FF
Artix 7 -> 151 Mhz 2021 LUT 1541 FF
Cyclone V -> 124 Mhz 1,368 ALMs
Cyclone IV -> 128 Mhz 2,826 LUT 1,474 FF
VexRiscv linux balanced (RV32IMA, 1.21 DMIPS/Mhz 2.27 Coremark/Mhz, with cache thrashing, 4KB-I$, 4KB-D$, single cycle barrel shifter, catch exceptions, static branch, MMU, Supervisor, Compatible with mainstream linux) ->
Artix 7 -> 170 Mhz 2530 LUT 2013 FF
Cyclone V -> 125 Mhz 1,618 ALMs
Cyclone IV -> 116 Mhz 3,314 LUT 2,016 FF
Artix 7 -> 180 Mhz 2883 LUT 2130 FF
Cyclone V -> 131 Mhz 1,764 ALMs
Cyclone IV -> 121 Mhz 3,608 LUT 2,082 FF
```
The following configuration results in 1.44 DMIPS/MHz:
@ -319,9 +318,9 @@ You can find some FPGA projects which instantiate the Briey SoC here (DE1-SoC, D
Here are some timing and area measurements of the Briey SoC:
```
Artix 7 -> 186 Mhz 3138 LUT 3328 FF
Cyclone V -> 139 Mhz 2,175 ALMs
Cyclone IV -> 129 Mhz 4,337 LUT 3,170 FF
Artix 7 -> 181 Mhz 3220 LUT 3181 FF
Cyclone V -> 142 Mhz 2,222 ALMs
Cyclone IV -> 130 Mhz 4,538 LUT 3,211 FF
```
## Murax SoC
@ -374,16 +373,16 @@ Here are some timing and area measurements of the Murax SoC:
```
Murax interlocked stages (0.45 DMIPS/Mhz, 8 bits GPIO) ->
Artix 7 -> 215 Mhz 1044 LUT 1202 FF
Cyclone V -> 173 Mhz 737 ALMs
Cyclone IV -> 144 Mhz 1,484 LUT 1,206 FF
iCE40 -> 64 Mhz 2422 LC (nextpnr)
Artix 7 -> 216 Mhz 1109 LUT 1201 FF
Cyclone V -> 182 Mhz 725 ALMs
Cyclone IV -> 147 Mhz 1,551 LUT 1,223 FF
iCE40 -> 64 Mhz 2422 LC (nextpnr)
MuraxFast bypassed stages (0.65 DMIPS/Mhz, 8 bits GPIO) ->
Artix 7 -> 229 Mhz 1269 LUT 1302 FF
Cyclone V -> 159 Mhz 864 ALMs
Cyclone IV -> 137 Mhz 1,688 LUT 1,241 FF
iCE40 -> 66 Mhz 2799 LC (nextpnr)
Artix 7 -> 224 Mhz 1278 LUT 1300 FF
Cyclone V -> 173 Mhz 867 ALMs
Cyclone IV -> 143 Mhz 1,755 LUT 1,258 FF
iCE40 -> 66 Mhz 2799 LC (nextpnr)
```
Some scripts to generate the SoC and call the icestorm toolchain can be found here: `scripts/Murax/`

@ -84,6 +84,8 @@ Composed of 2 stream :
Used by the interconnect to order masters to change the status of their memory copies and to retrieve data from the owners of memory copies.
Composed of 2 streams:
| Name | Direction | Description |
|----------|-----------|----------|
| probeCmd | M <- S | Used for cache management |
@ -129,7 +131,7 @@ Emitted on the readAck channel (master -> slave), it carry no information, just
| Name | From command | Description |
|--------------|---------------|----------|
| readSuccess | * | - |
| readSuccess | * | - |
### Write commands

@ -13,7 +13,6 @@ trait JumpService{
trait IBusFetcher{
def haltIt() : Unit
def flushIt() : Unit
def incoming() : Bool
def pcValid(stage : Stage) : Bool
def getInjectionPort() : Stream[Bits]

@ -36,7 +36,6 @@ case class VexRiscvConfig(){
object PC extends Stageable(UInt(32 bits))
object PC_CALC_WITHOUT_JUMP extends Stageable(UInt(32 bits))
object INSTRUCTION extends Stageable(Bits(32 bits))
object INSTRUCTION_READY extends Stageable(Bool)
object INSTRUCTION_ANTICIPATED extends Stageable(Bits(32 bits))
object LEGAL_INSTRUCTION extends Stageable(Bool)
object REGFILE_WRITE_VALID extends Stageable(Bool)

@ -20,7 +20,7 @@ object GenFullNoMmuMaxPerf extends App{
prediction = DYNAMIC_TARGET,
historyRamSizeLog2 = 8,
config = InstructionCacheConfig(
cacheSize = 4096*4,
cacheSize = 4096*2,
bytePerLine =32,
wayCount = 1,
addressWidth = 32,
@ -29,13 +29,13 @@ object GenFullNoMmuMaxPerf extends App{
catchIllegalAccess = true,
catchAccessFault = true,
asyncTagMemory = false,
twoCycleRam = true,
twoCycleRam = false,
twoCycleCache = true
)
),
new DBusCachedPlugin(
config = new DataCacheConfig(
cacheSize = 4096*4,
cacheSize = 4096*2,
bytePerLine = 32,
wayCount = 1,
addressWidth = 32,
@ -76,7 +76,7 @@ object GenFullNoMmuMaxPerf extends App{
new CsrPlugin(CsrPluginConfig.small),
new DebugPlugin(ClockDomain.current.clone(reset = Bool().setName("debugReset"))),
new BranchPlugin(
earlyBranch = true,
earlyBranch = false,
catchAddressMisaligned = true
),
new YamlPlugin("cpu0.yaml")

@ -56,7 +56,7 @@ object GenNoCacheNoMmuMaxPerf extends App{
new CsrPlugin(CsrPluginConfig.small),
new DebugPlugin(ClockDomain.current.clone(reset = Bool().setName("debugReset"))),
new BranchPlugin(
earlyBranch = true,
earlyBranch = false,
catchAddressMisaligned = true
),
new YamlPlugin("cpu0.yaml")

@ -272,7 +272,7 @@ object LinuxGen {
// wfiGenAsNop = true,
// ucycleAccess = CsrAccess.NONE
// )),
// new DebugPlugin(ClockDomain.current.clone(reset = Bool().setName("debugReset"))),
new DebugPlugin(ClockDomain.current.clone(reset = Bool().setName("debugReset"))),
new BranchPlugin(
earlyBranch = false,
catchAddressMisaligned = true,
@ -310,7 +310,7 @@ object LinuxGen {
// }
// }
SpinalConfig(mergeAsyncProcess = true, anonymSignalPrefix = "_zz").generateVerilog {
SpinalConfig(mergeAsyncProcess = false, anonymSignalPrefix = "_zz").generateVerilog {
val toplevel = new VexRiscv(configFull(

@ -4,8 +4,9 @@ import spinal.core._
import spinal.lib._
import spinal.lib.eda.bench._
import spinal.lib.eda.icestorm.IcestormStdTargets
import spinal.lib.io.InOutWrapper
import vexriscv.VexRiscv
import vexriscv.plugin.{DecoderSimplePlugin}
import vexriscv.plugin.DecoderSimplePlugin
import scala.collection.mutable.ArrayBuffer
import scala.util.Random
@ -96,7 +97,7 @@ object VexRiscvSynthesisBench {
}
val full = new Rtl {
override def getName(): String = "VexRiscv full"
override def getName(): String = "VexRiscv full with MMU"
override def getRtlPath(): String = "VexRiscvFull.v"
SpinalVerilog(wrap(GenFull.cpu()).setDefinitionName(getRtlPath().split("\\.").head))
}
@ -113,15 +114,10 @@ object VexRiscvSynthesisBench {
// val rtls = List(smallAndProductive, smallAndProductiveWithICache, fullNoMmuMaxPerf, fullNoMmu, full)
// val rtls = List(smallAndProductive)
val targets = XilinxStdTargets(
vivadoArtix7Path = "/media/miaou/HD/linux/Xilinx/Vivado/2018.3/bin"
) ++ AlteraStdTargets(
quartusCycloneIVPath = "/media/miaou/HD/linux/intelFPGA_lite/18.1/quartus/bin",
quartusCycloneVPath = "/media/miaou/HD/linux/intelFPGA_lite/18.1/quartus/bin"
) ++ IcestormStdTargets().take(1)
val targets = XilinxStdTargets() ++ AlteraStdTargets() ++ IcestormStdTargets().take(1)
// val targets = IcestormStdTargets()
Bench(rtls, targets, "/media/miaou/HD/linux/tmp")
Bench(rtls, targets)
}
}
@ -132,7 +128,7 @@ object BrieySynthesisBench {
override def getName(): String = "Briey"
override def getRtlPath(): String = "Briey.v"
SpinalVerilog({
val briey = new Briey(BrieyConfig.default).setDefinitionName(getRtlPath().split("\\.").head)
val briey = InOutWrapper(new Briey(BrieyConfig.default).setDefinitionName(getRtlPath().split("\\.").head))
briey.io.axiClk.setName("clk")
briey
})
@ -141,14 +137,9 @@ object BrieySynthesisBench {
val rtls = List(briey)
val targets = XilinxStdTargets(
vivadoArtix7Path = "/media/miaou/HD/linux/Xilinx/Vivado/2018.3/bin"
) ++ AlteraStdTargets(
quartusCycloneIVPath = "/media/miaou/HD/linux/intelFPGA_lite/18.1/quartus/bin",
quartusCycloneVPath = "/media/miaou/HD/linux/intelFPGA_lite/18.1/quartus/bin"
)
val targets = XilinxStdTargets() ++ AlteraStdTargets() ++ IcestormStdTargets().take(1)
Bench(rtls, targets, "/media/miaou/HD/linux/tmp")
Bench(rtls, targets)
}
}
@ -161,7 +152,7 @@ object MuraxSynthesisBench {
override def getName(): String = "Murax"
override def getRtlPath(): String = "Murax.v"
SpinalVerilog({
val murax = new Murax(MuraxConfig.default.copy(gpioWidth = 8)).setDefinitionName(getRtlPath().split("\\.").head)
val murax = InOutWrapper(new Murax(MuraxConfig.default.copy(gpioWidth = 8)).setDefinitionName(getRtlPath().split("\\.").head))
murax.io.mainClk.setName("clk")
murax
})
@ -172,7 +163,7 @@ object MuraxSynthesisBench {
override def getName(): String = "MuraxFast"
override def getRtlPath(): String = "MuraxFast.v"
SpinalVerilog({
val murax = new Murax(MuraxConfig.fast.copy(gpioWidth = 8)).setDefinitionName(getRtlPath().split("\\.").head)
val murax = InOutWrapper(new Murax(MuraxConfig.fast.copy(gpioWidth = 8)).setDefinitionName(getRtlPath().split("\\.").head))
murax.io.mainClk.setName("clk")
murax
})
@ -180,14 +171,9 @@ object MuraxSynthesisBench {
val rtls = List(murax, muraxFast)
val targets = IcestormStdTargets().take(1) ++ XilinxStdTargets(
vivadoArtix7Path = "/media/miaou/HD/linux/Xilinx/Vivado/2018.3/bin"
) ++ AlteraStdTargets(
quartusCycloneIVPath = "/media/miaou/HD/linux/intelFPGA_lite/18.1/quartus/bin",
quartusCycloneVPath = "/media/miaou/HD/linux/intelFPGA_lite/18.1/quartus/bin"
)
val targets = XilinxStdTargets() ++ AlteraStdTargets() ++ IcestormStdTargets().take(1)
Bench(rtls, targets, "/media/miaou/HD/linux/tmp")
Bench(rtls, targets)
}
}

@ -114,7 +114,7 @@ case class InstructionCacheCpuFetch(p : InstructionCacheConfig) extends Bundle w
val mmuBus = MemoryTranslatorBus()
val physicalAddress = UInt(p.addressWidth bits)
val cacheMiss, error, mmuRefilling, mmuException, isUser = ifGen(!p.twoCycleCache)(Bool)
val haltIt = Bool
val haltIt = Bool() //Used to wait on the MMU rsp busy
override def asMaster(): Unit = {
out(isValid, isStuck, isRemoved, pc)

@ -69,6 +69,7 @@ case class CsrPluginConfig(
midelegAccess : CsrAccess = CsrAccess.NONE,
pipelineCsrRead : Boolean = false,
pipelinedInterrupt : Boolean = true,
csrOhDecoder : Boolean = true,
deterministicInteruptionEntry : Boolean = false, //Only used for simulation purposes
wfiOutput : Boolean = false
){
@ -263,7 +264,7 @@ case class CsrReadToWriteOverride(that : Data, bitOffset : Int) //Used for speci
case class CsrOnWrite(doThat :() => Unit)
case class CsrOnRead(doThat : () => Unit)
case class CsrMapping() extends CsrInterface{
val mapping = mutable.HashMap[Int,ArrayBuffer[Any]]()
val mapping = mutable.LinkedHashMap[Int,ArrayBuffer[Any]]()
def addMappingAt(address : Int,that : Any) = mapping.getOrElseUpdate(address,new ArrayBuffer[Any]) += that
override def r(csrAddress : Int, bitOffset : Int, that : Data): Unit = addMappingAt(csrAddress, CsrRead(that,bitOffset))
override def w(csrAddress : Int, bitOffset : Int, that : Data): Unit = addMappingAt(csrAddress, CsrWrite(that,bitOffset))
@ -447,9 +448,7 @@ class CsrPlugin(val config: CsrPluginConfig) extends Plugin[VexRiscv] with Excep
if(supervisorGen) {
redoInterface = pcManagerService.createJumpInterface(pipeline.execute)
redoInterface.valid := False
redoInterface.payload.assignDontCare()
redoInterface = pcManagerService.createJumpInterface(pipeline.execute, -1)
}
exceptionPendings = Vec(Bool, pipeline.stages.length)
@ -643,10 +642,13 @@ class CsrPlugin(val config: CsrPluginConfig) extends Plugin[VexRiscv] with Excep
satpAccess(CSR.SATP, 31 -> satp.MODE, 22 -> satp.ASID, 0 -> satp.PPN)
if(supervisorGen) onWrite(CSR.SATP){
execute.arbitration.flushNext := True
redoInterface.valid := True
redoInterface.payload := execute.input(PC) + 4
if(supervisorGen) {
redoInterface.valid := False
redoInterface.payload := decode.input(PC)
onWrite(CSR.SATP){
execute.arbitration.flushNext := True
redoInterface.valid := True
}
}
}
}
@ -685,7 +687,7 @@ class CsrPlugin(val config: CsrPluginConfig) extends Plugin[VexRiscv] with Excep
//Aggregate all exception port and remove required instructions
val exceptionPortCtrl = if(exceptionPortsInfos.nonEmpty) new Area{
val exceptionPortCtrl = exceptionPortsInfos.nonEmpty generate new Area{
val firstStageIndexWithExceptionPort = exceptionPortsInfos.map(i => indexOf(i.stage)).min
val exceptionValids = Vec(stages.map(s => Bool().setPartialName(s.getName())))
val exceptionValidsRegs = Vec(stages.map(s => Reg(Bool).init(False).setPartialName(s.getName()))).allowUnsetRegToAvoidLatch
@ -762,7 +764,7 @@ class CsrPlugin(val config: CsrPluginConfig) extends Plugin[VexRiscv] with Excep
//Prevent the PC register of the last stage from changing during exception handling (used to fill Xepc)
stages.last.dontSample.getOrElseUpdate(PC, ArrayBuffer[Bool]()) += exceptionValids.last
exceptionPendings := exceptionValidsRegs
} else null
}
@ -808,17 +810,29 @@ class CsrPlugin(val config: CsrPluginConfig) extends Plugin[VexRiscv] with Excep
//Used to make the pipeline empty softly (for interrupts)
val pipelineLiberator = new Area{
when(interrupt.valid && allowInterrupts){
decode.arbitration.haltByOther := decode.arbitration.isValid
val pcValids = Vec(RegInit(False), stagesFromExecute.length)
val active = interrupt.valid && allowInterrupts && decode.arbitration.isValid
when(active){
decode.arbitration.haltByOther := True
for((stage, reg, previous) <- (stagesFromExecute, pcValids, True :: pcValids.toList).zipped){
when(!stage.arbitration.isStuck){
reg := previous
}
}
}
when(!active || decode.arbitration.isRemoved) {
pcValids.foreach(_ := False)
}
val done = !stagesFromExecute.map(_.arbitration.isValid).orR && fetcher.pcValid(mepcCaptureStage)
// val pcValids = for(stage <- stagesFromExecute) yield RegInit(False) clearWhen(!started) setWhen(!stage.arbitration.isValid)
val done = CombInit(pcValids.last)
if(exceptionPortCtrl != null) done.clearWhen(exceptionPortCtrl.exceptionValidsRegs.tail.orR)
}
//Interrupt/Exception entry logic
val interruptJump = Bool.addTag(Verilator.public)
interruptJump := interrupt.valid && pipelineLiberator.done && allowInterrupts
if(pipelinedInterrupt) interrupt.valid clearWhen(interruptJump) //avoid double firing
val hadException = RegNext(exception) init(False)
pipelineLiberator.done.clearWhen(hadException)
@ -984,7 +998,7 @@ class CsrPlugin(val config: CsrPluginConfig) extends Plugin[VexRiscv] with Excep
val imm = IMM(input(INSTRUCTION))
def writeSrc = input(SRC1)
// val readDataValid = True
val readData = B(0, 32 bits)
val readData = Bits(32 bits)
val writeInstruction = arbitration.isValid && input(IS_CSR) && input(CSR_WRITE_OPCODE)
val readInstruction = arbitration.isValid && input(IS_CSR) && input(CSR_READ_OPCODE)
val writeEnable = writeInstruction && ! blockedBySideEffects && !arbitration.isStuckByOthers// && readDataRegValid
@ -1030,50 +1044,84 @@ class CsrPlugin(val config: CsrPluginConfig) extends Plugin[VexRiscv] with Excep
//Translation of the csrMapping into real logic
val csrAddress = input(INSTRUCTION)(csrRange)
Component.current.addPrePopTask(() => {
switch(csrAddress) {
for ((address, jobs) <- csrMapping.mapping) {
is(address) {
val withWrite = jobs.exists(j => j.isInstanceOf[CsrWrite] || j.isInstanceOf[CsrOnWrite])
val withRead = jobs.exists(j => j.isInstanceOf[CsrRead] || j.isInstanceOf[CsrOnRead])
if(withRead && withWrite) {
illegalAccess := False
} else {
if (withWrite) illegalAccess.clearWhen(input(CSR_WRITE_OPCODE))
if (withRead) illegalAccess.clearWhen(input(CSR_READ_OPCODE))
}
Component.current.afterElaboration{
def doJobs(jobs : ArrayBuffer[Any]): Unit ={
val withWrite = jobs.exists(j => j.isInstanceOf[CsrWrite] || j.isInstanceOf[CsrOnWrite])
val withRead = jobs.exists(j => j.isInstanceOf[CsrRead] || j.isInstanceOf[CsrOnRead])
if(withRead && withWrite) {
illegalAccess := False
} else {
if (withWrite) illegalAccess.clearWhen(input(CSR_WRITE_OPCODE))
if (withRead) illegalAccess.clearWhen(input(CSR_READ_OPCODE))
}
when(writeEnable) {
for (element <- jobs) element match {
case element: CsrWrite => element.that.assignFromBits(writeData(element.bitOffset, element.that.getBitsWidth bits))
case element: CsrOnWrite =>
element.doThat()
case _ =>
}
}
when(writeEnable) {
for (element <- jobs) element match {
case element: CsrWrite => element.that.assignFromBits(writeData(element.bitOffset, element.that.getBitsWidth bits))
case element: CsrOnWrite =>
element.doThat()
case _ =>
}
}
for (element <- jobs) element match {
case element: CsrRead if element.that.getBitsWidth != 0 => readData(element.bitOffset, element.that.getBitsWidth bits) := element.that.asBits
case _ =>
}
when(readEnable) {
for (element <- jobs) element match {
case element: CsrOnRead =>
element.doThat()
case _ =>
}
}
when(readEnable) {
for (element <- jobs) element match {
case element: CsrOnRead =>
element.doThat()
case _ =>
}
}
}
switch(csrAddress) {
for ((address, jobs) <- csrMapping.mapping if jobs.exists(_.isInstanceOf[CsrReadToWriteOverride])) {
is(address) {
for (element <- jobs) element match {
case element: CsrReadToWriteOverride if element.that.getBitsWidth != 0 => readToWriteData(element.bitOffset, element.that.getBitsWidth bits) := element.that.asBits
case _ =>
def doJobsOverride(jobs : ArrayBuffer[Any]): Unit ={
for (element <- jobs) element match {
case element: CsrReadToWriteOverride if element.that.getBitsWidth != 0 => readToWriteData(element.bitOffset, element.that.getBitsWidth bits) := element.that.asBits
case _ =>
}
}
csrOhDecoder match {
case false => {
readData := 0
switch(csrAddress) {
for ((address, jobs) <- csrMapping.mapping) {
is(address) {
doJobs(jobs)
for (element <- jobs) element match {
case element: CsrRead if element.that.getBitsWidth != 0 => readData(element.bitOffset, element.that.getBitsWidth bits) := element.that.asBits
case _ =>
}
}
}
}
switch(csrAddress) {
for ((address, jobs) <- csrMapping.mapping if jobs.exists(_.isInstanceOf[CsrReadToWriteOverride])) {
is(address) {
doJobsOverride(jobs)
}
}
}
}
case true => {
val oh = csrMapping.mapping.keys.toList.distinct.map(address => address -> RegNextWhen(decode.input(INSTRUCTION)(csrRange) === address, !execute.arbitration.isStuck).setCompositeName(this, "csr_" + address)).toMap
val readDatas = ArrayBuffer[Bits]()
for ((address, jobs) <- csrMapping.mapping) {
when(oh(address)){
doJobs(jobs)
}
if(jobs.exists(_.isInstanceOf[CsrRead])) {
val masked = B(0, 32 bits)
when(oh(address)) (for (element <- jobs) element match {
case element: CsrRead if element.that.getBitsWidth != 0 => masked(element.bitOffset, element.that.getBitsWidth bits) := element.that.asBits
case _ =>
})
readDatas += masked
}
}
readData := readDatas.reduceBalancedTree(_ | _)
for ((address, jobs) <- csrMapping.mapping) {
when(oh(address)){
doJobsOverride(jobs)
}
}
}
@ -1081,7 +1129,7 @@ class CsrPlugin(val config: CsrPluginConfig) extends Plugin[VexRiscv] with Excep
illegalAccess setWhen(privilege < csrAddress(9 downto 8).asUInt)
illegalAccess clearWhen(!arbitration.isValid || !input(IS_CSR))
})
}
}
}
}

@ -145,7 +145,7 @@ class DBusCachedPlugin(val config : DataCacheConfig,
decoderService.add(FENCE, Nil)
mmuBus = pipeline.service(classOf[MemoryTranslator]).newTranslationPort(MemoryTranslatorPort.PRIORITY_DATA ,memoryTranslatorPortConfig)
redoBranch = pipeline.service(classOf[JumpService]).createJumpInterface(if(pipeline.writeBack != null) pipeline.writeBack else pipeline.execute)
redoBranch = pipeline.service(classOf[JumpService]).createJumpInterface(if(pipeline.writeBack != null) pipeline.writeBack else pipeline.memory)
if(catchSomething)
exceptionBus = pipeline.service(classOf[ExceptionService]).newExceptionPort(if(pipeline.writeBack == null) pipeline.memory else pipeline.writeBack)

@ -201,7 +201,6 @@ class DebugPlugin(val debugClockDomain : ClockDomain, hardwareBreakpointCount :
execute.arbitration.haltByOther := True
busReadDataReg := execute.input(PC).asBits
when(stagesFromExecute.tail.map(_.arbitration.isValid).orR === False){
iBusFetcher.flushIt()
iBusFetcher.haltIt()
execute.arbitration.flushIt := True
execute.arbitration.flushNext := True
@ -214,10 +213,11 @@ class DebugPlugin(val debugClockDomain : ClockDomain, hardwareBreakpointCount :
iBusFetcher.haltIt()
}
when(stepIt && iBusFetcher.incoming()) {
iBusFetcher.haltIt()
when(stepIt) {
//Assume nothing will stop the CPU in the decode stage
when(decode.arbitration.isValid) {
haltIt := True
decode.arbitration.flushNext := True
}
}

@ -172,7 +172,7 @@ class DecoderSimplePlugin(catchIllegalInstruction : Boolean = false,
}
if(catchIllegalInstruction){
decodeExceptionPort.valid := arbitration.isValid && input(INSTRUCTION_READY) && !input(LEGAL_INSTRUCTION) // ?? HaltIt to allow the decoder stage to wait for valid data from the 2-stage cache ??
decodeExceptionPort.valid := arbitration.isValid && !input(LEGAL_INSTRUCTION) // ?? HaltIt to allow the decoder stage to wait for valid data from the 2-stage cache ??
decodeExceptionPort.code := 2
decodeExceptionPort.badAddr := input(INSTRUCTION).asUInt
}

@ -16,12 +16,14 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
val decodePcGen : Boolean,
val compressedGen : Boolean,
val cmdToRspStageCount : Int,
val pcRegReusedForSecondStage : Boolean,
val allowPcRegReusedForSecondStage : Boolean,
val injectorReadyCutGen : Boolean,
val prediction : BranchPrediction,
val historyRamSizeLog2 : Int,
val injectorStage : Boolean,
val relaxPredictorAddress : Boolean) extends Plugin[VexRiscv] with JumpService with IBusFetcher{
val relaxPredictorAddress : Boolean,
val fetchRedoGen : Boolean,
val predictionBuffer : Boolean = true) extends Plugin[VexRiscv] with JumpService with IBusFetcher{
var prefetchExceptionPort : Flow[ExceptionCause] = null
var decodePrediction : DecodePredictionBus = null
var fetchPrediction : FetchPredictionBus = null
@ -31,7 +33,6 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
// assert(!(cmdToRspStageCount == 1 && !injectorStage))
assert(!(compressedGen && !decodePcGen))
var fetcherHalt : Bool = null
var fetcherflushIt : Bool = null
var pcValids : Vec[Bool] = null
def pcValid(stage : Stage) = pcValids(pipeline.indexOf(stage))
var incomingInstruction : Bool = null
@ -42,12 +43,10 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
injectionPort = Stream(Bits(32 bits))
injectionPort
}
def pcRegReusedForSecondStage = allowPcRegReusedForSecondStage && prediction != DYNAMIC_TARGET //TODO might not be required for DYNAMIC_TARGET
var predictionJumpInterface : Flow[UInt] = null
override def haltIt(): Unit = fetcherHalt := True
override def flushIt(): Unit = fetcherflushIt := True
case class JumpInfo(interface : Flow[UInt], stage: Stage, priority : Int)
val jumpInfos = ArrayBuffer[JumpInfo]()
override def createJumpInterface(stage: Stage, priority : Int = 0): Flow[UInt] = {
@ -61,7 +60,6 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
// var decodeExceptionPort : Flow[ExceptionCause] = null
override def setup(pipeline: VexRiscv): Unit = {
fetcherHalt = False
fetcherflushIt = False
incomingInstruction = False
if(resetVector == null) externalResetVector = in(UInt(32 bits).setName("externalResetVector"))
@ -75,21 +73,33 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
}
case DYNAMIC_TARGET => {
fetchPrediction = pipeline.service(classOf[PredictionInterface]).askFetchPrediction()
if(compressedGen && cmdToRspStageCount > 1){
dynamicTargetFailureCorrection = createJumpInterface(pipeline.decode)
}
}
}
pcValids = Vec(Bool, pipeline.stages.size)
}
object IBUS_RSP
object DECOMPRESSOR
object INJECTOR_M2S
def isDrivingDecode(s : Any): Boolean = {
if(injectorStage) return s == INJECTOR_M2S
s == IBUS_RSP || s == DECOMPRESSOR
}
class FetchArea(pipeline : VexRiscv) extends Area {
import pipeline._
import pipeline.config._
val externalFlush = stages.map(_.arbitration.flushNext).orR
//JumpService hardware implementation
def getFlushAt(s : Any, lastCond : Boolean = true): Bool = {
if(isDrivingDecode(s) && lastCond) pipeline.decode.arbitration.isRemoved else externalFlush
}
//Arbitrate jump requests into pcLoad
val jump = new Area {
val sortedByStage = jumpInfos.sortWith((a, b) => {
(pipeline.indexOf(a.stage) > pipeline.indexOf(b.stage)) ||
@ -103,7 +113,7 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
pcLoad.payload := MuxOH(OHMasking.first(valids.asBits), pcs)
}
fetcherflushIt setWhen(stages.map(_.arbitration.flushNext).orR)
//The fetchPC pcReg can also be used for the second stage of the fetch
//When the fetcherHalt is set and the pipeline isn't stalled, the pc is propagated to the pcReg, which allows
@ -112,12 +122,16 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
//PC calculation without Jump
val output = Stream(UInt(32 bits))
val pcReg = Reg(UInt(32 bits)) init(if(resetVector != null) resetVector else externalResetVector) addAttribute(Verilator.public)
val corrected = False
val correction = False
val correctionReg = RegInit(False) setWhen(correction) clearWhen(output.fire)
val corrected = correction || correctionReg
val pcRegPropagate = False
val booted = RegNext(True) init (False)
val inc = RegInit(False) clearWhen(corrected || pcRegPropagate) setWhen(output.fire) clearWhen(!output.valid && output.ready)
val inc = RegInit(False) clearWhen(correction || pcRegPropagate) setWhen(output.fire) clearWhen(!output.valid && output.ready)
val pc = pcReg + (inc ## B"00").asUInt
val predictionPcLoad = ifGen(prediction == DYNAMIC_TARGET) (Flow(UInt(32 bits)))
val redo = (fetchRedoGen || prediction == DYNAMIC_TARGET) generate Flow(UInt(32 bits))
val flushed = False
if(compressedGen) when(inc) {
pc(1) := False
@ -125,22 +139,27 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
if(predictionPcLoad != null) {
when(predictionPcLoad.valid) {
corrected := True
correction := True
pc := predictionPcLoad.payload
}
}
if(redo != null) when(redo.valid){
correction := True
pc := redo.payload
flushed := True
}
when(jump.pcLoad.valid) {
corrected := True
correction := True
pc := jump.pcLoad.payload
flushed := True
}
when(booted && (output.ready || fetcherflushIt || pcRegPropagate)){
when(booted && (output.ready || correction || pcRegPropagate)){
pcReg := pc
}
pc(0) := False
if(!pipeline(RVC_GEN)) pc(1) := False
if(!compressedGen) pc(1) := False
output.valid := !fetcherHalt && booted
output.payload := pc
@ -148,6 +167,7 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
val decodePc = ifGen(decodePcGen)(new Area {
//PC calculation without Jump
val flushed = False
val pcReg = Reg(UInt(32 bits)) init(if(resetVector != null) resetVector else externalResetVector) addAttribute(Verilator.public)
val pcPlus = if(compressedGen)
pcReg + ((decode.input(IS_RVC)) ? U(2) | U(4))
@ -170,6 +190,7 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
//application of the selected jump request
when(jump.pcLoad.valid && (!decode.arbitration.isStuck || decode.arbitration.isRemoved)) {
pcReg := jump.pcLoad.payload
flushed := True
}
})
@ -177,116 +198,109 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
case class FetchRsp() extends Bundle {
val pc = UInt(32 bits)
val rsp = IBusSimpleRsp()
val isRvc = Bool
val isRvc = Bool()
}
val iBusRsp = new Area {
// val input = Stream(UInt(32 bits))
// val inputPipeline = Vec(Stream(UInt(32 bits)), cmdToRspStageCount)
// val inputPipelineHalt = Vec(False, cmdToRspStageCount-1)
// for(i <- 0 until cmdToRspStageCount) {
// inputPipeline(i) << {i match {
// case 0 => input.m2sPipeWithFlush(flush, false, collapsBubble = false)
// case _ => inputPipeline(i-1).haltWhen(inputPipelineHalt(i-1)).m2sPipeWithFlush(flush,collapsBubble = false)
// }}
// }
// val stages = Array.fill(cmdToRspStageCount)(Stream(UInt(32 bits)))
val redoFetch = False
val stages = Array.fill(cmdToRspStageCount + 1)(new Bundle {
val input = Stream(UInt(32 bits))
val output = Stream(UInt(32 bits))
val halt = Bool
val inputSample = Bool
val halt = Bool()
})
stages(0).input << fetchPc.output
stages(0).inputSample := True
for(s <- stages) {
s.halt := False
s.output << s.input.haltWhen(s.halt)
}
if(fetchPc.redo != null) {
fetchPc.redo.valid := redoFetch
fetchPc.redo.payload := stages.last.input.payload
}
val flush = (if(isDrivingDecode(IBUS_RSP)) pipeline.decode.arbitration.isRemoved || decode.arbitration.flushNext && !decode.arbitration.isStuck else externalFlush) || redoFetch
for((s,sNext) <- (stages, stages.tail).zipped) {
val sFlushed = if(s != stages.head) flush else False
val sNextFlushed = flush
if(s == stages.head && pcRegReusedForSecondStage) {
sNext.input.arbitrationFrom(s.output.toEvent().m2sPipeWithFlush(fetcherflushIt, s != stages.head, collapsBubble = false))
sNext.input.arbitrationFrom(s.output.toEvent().m2sPipeWithFlush(sNextFlushed, false, collapsBubble = false, flushInput = sFlushed))
sNext.input.payload := fetchPc.pcReg
fetchPc.pcRegPropagate setWhen(sNext.input.ready)
} else {
sNext.input << s.output.m2sPipeWithFlush(fetcherflushIt, s != stages.head, collapsBubble = false)
sNext.input << s.output.m2sPipeWithFlush(sNextFlushed, false, collapsBubble = false, flushInput = sFlushed)
}
}
//
// val pipeline = Vec(Stream(UInt(32 bits)), cmdToRspStageCount + 1)
// val halts = Vec(False, cmdToRspStageCount)
// for(i <- 0 until cmdToRspStageCount + 1) {
// pipeline(i) << {i match {
// case 0 => pipeline(0) << fetchPc.output.haltWhen(halts(i))
// case 1 => pipeline(1).m2sPipeWithFlush(flush, false, collapsBubble = false)
// case _ => inputPipeline(i-1).haltWhen(inputPipelineHalt(i-1)).m2sPipeWithFlush(flush,collapsBubble = false)
// }}
// }
// ...
val readyForError = True
val output = Stream(FetchRsp())
incomingInstruction setWhen(stages.tail.map(_.input.valid).reduce(_ || _))
}
val decompressor = ifGen(decodePcGen)(new Area{
def input = iBusRsp.output
val input = iBusRsp.output.clearValidWhen(iBusRsp.redoFetch)
val output = Stream(FetchRsp())
val flush = getFlushAt(DECOMPRESSOR)
val flushNext = if(isDrivingDecode(DECOMPRESSOR)) decode.arbitration.flushNext else False
val consumeCurrent = if(isDrivingDecode(DECOMPRESSOR)) flushNext && output.ready else False
val bufferValid = RegInit(False)
val bufferData = Reg(Bits(16 bits))
val isInputLowRvc = input.rsp.inst(1 downto 0) =/= 3
val isInputHighRvc = input.rsp.inst(17 downto 16) =/= 3
val throw2BytesReg = RegInit(False)
val throw2Bytes = throw2BytesReg || input.pc(1)
val unaligned = throw2Bytes || bufferValid
def aligned = !unaligned
val raw = Mux(
sel = bufferValid,
whenTrue = input.rsp.inst(15 downto 0) ## bufferData,
whenFalse = input.rsp.inst(31 downto 16) ## (input.pc(1) ? input.rsp.inst(31 downto 16) | input.rsp.inst(15 downto 0))
whenFalse = input.rsp.inst(31 downto 16) ## (throw2Bytes ? input.rsp.inst(31 downto 16) | input.rsp.inst(15 downto 0))
)
val isRvc = raw(1 downto 0) =/= 3
val decompressed = RvcDecompressor(raw(15 downto 0))
output.valid := (isRvc ? (bufferValid || input.valid) | (input.valid && (bufferValid || !input.pc(1))))
output.valid := input.valid && !(throw2Bytes && !bufferValid && !isInputHighRvc)
output.pc := input.pc
output.isRvc := isRvc
output.rsp.inst := isRvc ? decompressed | raw
// input.ready := (bufferValid ? (!isRvc && output.ready) | (input.pc(1) || output.ready))
input.ready := !output.valid || !(!output.ready || (isRvc && !input.pc(1) && input.rsp.inst(16, 2 bits) =/= 3) || (!isRvc && bufferValid && input.rsp.inst(16, 2 bits) =/= 3))
addPrePopTask(() => {
when(!input.ready && output.fire && !fetcherflushIt /* && ((isRvc && !bufferValid && !input.pc(1)) || (!isRvc && bufferValid && input.rsp.inst(16, 2 bits) =/= 3))*/) {
input.pc.getDrivingReg(1) := True
}
})
input.ready := output.ready && (!iBusRsp.stages.last.input.valid || flushNext || (!(bufferValid && isInputHighRvc) && !(aligned && isInputLowRvc && isInputHighRvc)))
bufferValid clearWhen(output.fire)
val bufferFill = False
when(input.fire){
when(!(!isRvc && !input.pc(1) && !bufferValid) && !(isRvc && input.pc(1) && output.ready)) {
bufferValid := True
bufferFill := True
} otherwise {
bufferValid := False
}
bufferData := input.rsp.inst(31 downto 16)
when(output.fire){
throw2BytesReg := (aligned && isInputLowRvc && isInputHighRvc) || (bufferValid && isInputHighRvc)
}
val bufferFill = (aligned && isInputLowRvc && !isInputHighRvc) || (bufferValid && !isInputHighRvc) || (throw2Bytes && !isRvc && !isInputHighRvc)
when(output.ready && input.valid){
bufferValid := False
}
when(output.ready && input.valid){
bufferData := input.rsp.inst(31 downto 16)
bufferValid setWhen(bufferFill)
}
when(flush || consumeCurrent){
throw2BytesReg := False
bufferValid := False
}
if(fetchPc.redo != null) {
fetchPc.redo.payload(1) setWhen(throw2BytesReg)
}
bufferValid.clearWhen(fetcherflushIt)
iBusRsp.readyForError.clearWhen(bufferValid && isRvc) //Can't emit error, as there is an earlier instruction pending
incomingInstruction setWhen(bufferValid && bufferData(1 downto 0) =/= 3)
})
def condApply[T](that : T, cond : Boolean)(func : (T) => T) = if(cond)func(that) else that
val injector = new Area {
val inputBeforeStage = condApply(if (decodePcGen) decompressor.output else iBusRsp.output, injectorReadyCutGen)(_.s2mPipe(fetcherflushIt))
val inputBeforeStage = condApply(if (decodePcGen) decompressor.output else iBusRsp.output, injectorReadyCutGen)(_.s2mPipe(externalFlush))
if (injectorReadyCutGen) {
iBusRsp.readyForError.clearWhen(inputBeforeStage.valid) //Can't emit error if there is an instruction pending in the s2mPipe
incomingInstruction setWhen (inputBeforeStage.valid)
}
val decodeInput = (if (injectorStage) {
val decodeInput = inputBeforeStage.m2sPipeWithFlush(fetcherflushIt, collapsBubble = false)
val flushStage = getFlushAt(INJECTOR_M2S)
val decodeInput = inputBeforeStage.m2sPipeWithFlush(flushStage, false, collapsBubble = false, flushInput = externalFlush)
decode.insert(INSTRUCTION_ANTICIPATED) := Mux(decode.arbitration.isStuck, decode.input(INSTRUCTION), inputBeforeStage.rsp.inst)
iBusRsp.readyForError.clearWhen(decodeInput.valid) //Can't emit error when there is an instruction pending in the injector stage buffer
incomingInstruction setWhen (decodeInput.valid)
@ -298,16 +312,16 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
if(!decodePcGen) iBusRsp.readyForError.clearWhen(!pcValid(decode)) //Need to wait for a valid PC on the decode stage, as it is used to fill CSR xEPC
def pcUpdatedGen(input : Bool, stucks : Seq[Bool], relaxedInput : Boolean) : Seq[Bool] = {
def pcUpdatedGen(input : Bool, stucks : Seq[Bool], relaxedInput : Boolean, flush : Bool) : Seq[Bool] = {
stucks.scanLeft(input)((i, stuck) => {
val reg = RegInit(False)
if(!relaxedInput) when(fetcherflushIt) {
if(!relaxedInput) when(flush) {
reg := False
}
when(!stuck) {
reg := i
}
if(relaxedInput || i != input) when(fetcherflushIt) {
if(relaxedInput || i != input) when(flush) {
reg := False
}
reg
@ -316,20 +330,17 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
val stagesFromExecute = stages.dropWhile(_ != execute).toList
val nextPcCalc = if (decodePcGen) new Area{
val valids = pcUpdatedGen(True, False :: stagesFromExecute.map(_.arbitration.isStuck), true)
val valids = pcUpdatedGen(True, False :: stagesFromExecute.map(_.arbitration.isStuck), true, decodePc.flushed)
pcValids := Vec(valids.takeRight(stages.size))
} else new Area{
val valids = pcUpdatedGen(True, iBusRsp.stages.tail.map(!_.input.ready) ++ (if (injectorStage) List(!decodeInput.ready) else Nil) ++ stagesFromExecute.map(_.arbitration.isStuck), false)
val valids = pcUpdatedGen(True, iBusRsp.stages.tail.map(!_.input.ready) ++ (if (injectorStage) List(!decodeInput.ready) else Nil) ++ stagesFromExecute.map(_.arbitration.isStuck), false, fetchPc.flushed)
pcValids := Vec(valids.takeRight(stages.size))
}
val decodeRemoved = RegInit(False) setWhen(decode.arbitration.isRemoved) clearWhen(fetcherflushIt) //!decode.arbitration.isStuck || decode.arbitration.isFlushed
decodeInput.ready := !decode.arbitration.isStuck
decode.arbitration.isValid := decodeInput.valid && !decodeRemoved
decode.arbitration.isValid := decodeInput.valid
decode.insert(PC) := (if (decodePcGen) decodePc.pcReg else decodeInput.pc)
decode.insert(INSTRUCTION) := decodeInput.rsp.inst
decode.insert(INSTRUCTION_READY) := True
if (compressedGen) decode.insert(IS_RVC) := decodeInput.isRvc
if (injectionPort != null) {
@ -415,18 +426,15 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
}
}
def stage1ToInjectorPipe[T <: Data](input : T): (T,T) ={
def stage1ToInjectorPipe[T <: Data](input : T): (T, T, T) ={
val iBusRspContext = iBusRsp.stages.drop(1).dropRight(1).foldLeft(input)((data,stage) => RegNextWhen(data, stage.output.ready))
// val decompressorContext = ifGen(compressedGen)(new Area{
// val lastContext = RegNextWhen(iBusRspContext, decompressor.input.fire)
// val output = decompressor.bufferValid ? lastContext | iBusRspContext
// })
val decompressorContext = cloneOf(input)
decompressorContext := iBusRspContext
val injectorContext = Delay(if(compressedGen) decompressorContext else iBusRspContext, cycleCount=if(injectorStage) 1 else 0, when=injector.decodeInput.ready)
val iBusRspContextOutput = cloneOf(input)
iBusRspContextOutput := iBusRspContext
val injectorContext = Delay(iBusRspContextOutput, cycleCount=if(injectorStage) 1 else 0, when=injector.decodeInput.ready)
val injectorContextWire = cloneOf(input) //Allow combinatorial override
injectorContextWire := injectorContext
(ifGen(compressedGen)(decompressorContext), injectorContextWire)
(iBusRspContext, iBusRspContextOutput, injectorContextWire)
}
val predictor = prediction match {
@ -449,10 +457,10 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
}
val fetchContext = DynamicContext()
fetchContext.hazard := hazard
fetchContext.line := historyCache.readSync((fetchPc.output.payload >> 2).resized, iBusRsp.stages(0).output.ready || fetcherflushIt)
fetchContext.line := historyCache.readSync((fetchPc.output.payload >> 2).resized, iBusRsp.stages(0).output.ready || externalFlush)
object PREDICTION_CONTEXT extends Stageable(DynamicContext())
decode.insert(PREDICTION_CONTEXT) := stage1ToInjectorPipe(fetchContext)._2
decode.insert(PREDICTION_CONTEXT) := stage1ToInjectorPipe(fetchContext)._3
val decodeContextPrediction = decode.input(PREDICTION_CONTEXT).line.history.msb
val branchStage = decodePrediction.stage
@ -488,13 +496,11 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
}
//TODO no more firing dependencies
predictionJumpInterface.valid := decode.arbitration.isValid && decodePrediction.cmd.hadBranch //TODO OH duplicate priority
predictionJumpInterface.valid := decode.arbitration.isValid && decodePrediction.cmd.hadBranch
predictionJumpInterface.payload := decode.input(PC) + ((decode.input(BRANCH_CTRL) === BranchCtrlEnum.JAL) ? imm.j_sext | imm.b_sext).asUInt
if(relaxPredictorAddress) KeepAttribute(predictionJumpInterface.payload)
decode.arbitration.flushNext setWhen(predictionJumpInterface.valid)
when(predictionJumpInterface.valid && decode.arbitration.isFiring){
flushIt()
}
if(relaxPredictorAddress) KeepAttribute(predictionJumpInterface.payload)
}
case DYNAMIC_TARGET => new Area{
// assert(!compressedGen || cmdToRspStageCount == 1, "Can't combine DYNAMIC_TARGET and RVC as it could stop the instruction fetch mid-air")
@ -502,26 +508,40 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
case class BranchPredictorLine() extends Bundle{
val source = Bits(30 - historyRamSizeLog2 bits)
val branchWish = UInt(2 bits)
val last2Bytes = ifGen(compressedGen)(Bool)
val target = UInt(32 bits)
val unaligned = ifGen(compressedGen)(Bool)
}
val history = Mem(BranchPredictorLine(), 1 << historyRamSizeLog2)
val historyWrite = history.writePort
val historyWriteDelayPatched = history.writePort
val historyWrite = cloneOf(historyWriteDelayPatched)
historyWriteDelayPatched.valid := historyWrite.valid
historyWriteDelayPatched.address := (if(predictionBuffer) historyWrite.address - 1 else historyWrite.address)
historyWriteDelayPatched.data := historyWrite.data
val line = history.readSync((iBusRsp.stages(0).input.payload >> 2).resized, iBusRsp.stages(0).output.ready || fetcherflushIt)
val hit = line.source === (iBusRsp.stages(1).input.payload.asBits >> 2 + historyRamSizeLog2) && (if(compressedGen)(!(!line.unaligned && iBusRsp.stages(1).input.payload(1))) else True)
//Patch to avoid stopping the instruction fetch in the middle
if(compressedGen && cmdToRspStageCount == 1){
hit clearWhen(!decompressor.output.valid)
}
val writeLast = RegNextWhen(historyWriteDelayPatched, iBusRsp.stages(0).output.ready)
//Avoid write to read hazard
val historyWriteLast = RegNextWhen(historyWrite, iBusRsp.stages(0).output.ready)
val hazard = historyWriteLast.valid && historyWriteLast.address === (iBusRsp.stages(1).input.payload >> 2).resized
//TODO improve predictionPcLoad way of doing things
fetchPc.predictionPcLoad.valid := line.branchWish.msb && hit && !hazard && iBusRsp.stages(1).output.valid //XXX && !(!line.unaligned && iBusRsp.inputPipeline(0).payload(1))
val buffer = predictionBuffer generate new Area{
val line = history.readSync((iBusRsp.stages(0).input.payload >> 2).resized, iBusRsp.stages(0).output.ready)
val pcCorrected = RegNextWhen(fetchPc.corrected, iBusRsp.stages(0).input.ready)
val hazard = (writeLast.valid && writeLast.address === (iBusRsp.stages(1).input.payload >> 2).resized)
}
val (line, hazard) = predictionBuffer match {
case true =>
(RegNextWhen(buffer.line, iBusRsp.stages(0).output.ready),
RegNextWhen(buffer.hazard, iBusRsp.stages(0).output.ready) || buffer.pcCorrected)
case false =>
(history.readSync((iBusRsp.stages(0).input.payload >> 2).resized,
iBusRsp.stages(0).output.ready), writeLast.valid && writeLast.address === (iBusRsp.stages(1).input.payload >> 2).resized)
}
val hit = line.source === (iBusRsp.stages(1).input.payload.asBits >> 2 + historyRamSizeLog2)
if(compressedGen) hit clearWhen(!line.last2Bytes && iBusRsp.stages(1).input.payload(1))
fetchPc.predictionPcLoad.valid := line.branchWish.msb && hit && !hazard && iBusRsp.stages(1).input.valid
fetchPc.predictionPcLoad.payload := line.target
case class PredictionResult() extends Bundle{
@ -535,22 +555,7 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
fetchContext.hit := hit
fetchContext.line := line
val (decompressorContext, injectorContext) = stage1ToInjectorPipe(fetchContext)
if(compressedGen) {
//prediction hit on the right instruction into words
decompressorContext.hit clearWhen(decompressorContext.line.unaligned && (decompressor.bufferValid || (decompressor.isRvc && !decompressor.input.pc(1))))
// if(compressedGen) injectorContext.hit clearWhen(decodePc.pcReg(1) =/= injectorContext.line.unaligned)
decodePc.predictionPcLoad.valid := injectorContext.line.branchWish.msb && injectorContext.hit && !injectorContext.hazard && injector.decodeInput.fire
decodePc.predictionPcLoad.payload := injectorContext.line.target
when(decompressorContext.line.branchWish.msb && decompressorContext.hit && !decompressorContext.hazard && decompressor.output.fire){
decompressor.bufferValid := False
decompressor.input.ready := True
}
}
val (iBusRspContext, iBusRspContextOutput, injectorContext) = stage1ToInjectorPipe(fetchContext)
object PREDICTION_CONTEXT extends Stageable(PredictionResult())
pipeline.decode.insert(PREDICTION_CONTEXT) := injectorContext
@ -565,7 +570,7 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
historyWrite.address := fetchPrediction.rsp.sourceLastWord(2, historyRamSizeLog2 bits)
historyWrite.data.source := fetchPrediction.rsp.sourceLastWord.asBits >> 2 + historyRamSizeLog2
historyWrite.data.target := fetchPrediction.rsp.finalPc
if(compressedGen) historyWrite.data.unaligned := !fetchPrediction.stage.input(PC)(1) ^ fetchPrediction.stage.input(IS_RVC)
if(compressedGen) historyWrite.data.last2Bytes := fetchPrediction.stage.input(PC)(1) && fetchPrediction.stage.input(IS_RVC)
when(fetchPrediction.rsp.wasRight) {
historyWrite.valid := branchContext.hit
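
The `historyWrite.address` / `historyWrite.data.source` assignments above split the word address of the branch into a RAM index and a tag, and the read side compares the stored `source` against the same upper bits to compute `hit`. A minimal plain-Scala sketch of that split follows (no SpinalHDL involved; `BtbAddressing`, `index` and `tag` are illustrative names that do not exist in the plugin, and the PC values are made up):

```scala
// Software model of the index/tag split used by the DYNAMIC_TARGET history RAM.
// Assumption: 32-bit PCs, bits 1..0 are dropped, the next historyRamSizeLog2
// bits select the line, the remaining 30 - historyRamSizeLog2 bits form the tag.
object BtbAddressing {
  // Line index: historyRamSizeLog2 bits starting at bit 2
  def index(pc: Long, historyRamSizeLog2: Int): Long =
    ((pc & 0xFFFFFFFFL) >>> 2) & ((1L << historyRamSizeLog2) - 1)

  // Tag: everything above the index (mirrors `>> 2 + historyRamSizeLog2`,
  // where `+` binds tighter than `>>` in Scala)
  def tag(pc: Long, historyRamSizeLog2: Int): Long =
    (pc & 0xFFFFFFFFL) >>> (2 + historyRamSizeLog2)

  def main(args: Array[String]): Unit = {
    val log2 = 5
    for (pc <- Seq(0x80000104L, 0x80000184L))
      println(f"pc=0x$pc%08x index=${index(pc, log2)}%d tag=0x${tag(pc, log2)}%x")
  }
}
```

The two example PCs land on the same line but carry different tags, which is the case the `hit` comparison has to reject.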
@ -582,27 +587,31 @@ abstract class IBusFetcherImpl(val resetVector : BigInt,
historyWrite.valid clearWhen(branchContext.hazard || !branchStage.arbitration.isFiring)
val compressor = compressedGen generate new Area{
val predictionBranch = iBusRspContext.hit && !iBusRspContext.hazard && iBusRspContext.line.branchWish(1)
val unalignedWordIssue = iBusRsp.output.valid && predictionBranch && iBusRspContext.line.last2Bytes && Mux(decompressor.unaligned, !decompressor.isInputHighRvc, decompressor.isInputLowRvc && !decompressor.isInputHighRvc)
val predictionFailure = ifGen(compressedGen && cmdToRspStageCount > 1)(new Area{
val predictionBranch = decompressorContext.hit && !decompressorContext.hazard && decompressorContext.line.branchWish(1)
val unalignedWordIssue = decompressor.bufferFill && decompressor.input.rsp.inst(17 downto 16) === 3 && predictionBranch
val decompressorFailure = RegInit(False) setWhen(unalignedWordIssue) clearWhen(fetcherflushIt)
val injectorFailure = Delay(decompressorFailure, cycleCount=if(injectorStage) 1 else 0, when=injector.decodeInput.ready)
val bypassFailure = if(!injectorStage) False else decompressorFailure && !injector.decodeInput.valid
dynamicTargetFailureCorrection.valid := False
dynamicTargetFailureCorrection.payload := decode.input(PC)
when(injectorFailure || bypassFailure){
when(unalignedWordIssue){
historyWrite.valid := True
historyWrite.address := (decode.input(PC) >> 2).resized
historyWrite.address := (iBusRsp.stages(1).input.payload >> 2).resized
historyWrite.data.branchWish := 0
decode.arbitration.isValid := False
decode.arbitration.flushNext := True
dynamicTargetFailureCorrection.valid := True
iBusRsp.redoFetch := True
}
})
//Do not trigger prediction hit when it is one for the upper RVC word and we aren't there yet
iBusRspContextOutput.hit clearWhen(iBusRspContext.line.last2Bytes && (decompressor.bufferValid || (!decompressor.throw2Bytes && decompressor.isInputLowRvc)))
decodePc.predictionPcLoad.valid := injectorContext.line.branchWish.msb && injectorContext.hit && !injectorContext.hazard && injector.decodeInput.fire
decodePc.predictionPcLoad.payload := injectorContext.line.target
//Clean the RVC buffer when a prediction was made
when(iBusRspContext.line.branchWish.msb && iBusRspContextOutput.hit && !iBusRspContext.hazard && decompressor.output.fire){
decompressor.bufferValid := False
decompressor.throw2BytesReg := False
decompressor.input.ready := True //Drop the remaining byte if any
}
}
}
}
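
The decompressor and the `compressor` area above lean on `isInputLowRvc` / `isInputHighRvc`, which test whether the two low bits of each 16-bit half of the fetched word differ from `3`. A small plain-Scala sketch of that classification (the object name `RvcHalfwords` and the sample words are illustrative assumptions, not taken from the plugin):

```scala
// Software model of the RVC test applied to each 16-bit half of a fetched word.
// A parcel whose two low bits are not 0b11 starts a compressed instruction.
object RvcHalfwords {
  def isRvc(parcel: Int): Boolean = (parcel & 0x3) != 0x3

  // Mirrors input.rsp.inst(1 downto 0) =/= 3
  def isInputLowRvc(word: Int): Boolean = isRvc(word & 0xFFFF)

  // Mirrors input.rsp.inst(17 downto 16) =/= 3
  def isInputHighRvc(word: Int): Boolean = isRvc((word >>> 16) & 0xFFFF)

  def main(args: Array[String]): Unit = {
    val word = 0x00130001 // low half 0x0001 (compressed), high half 0x0013 (start of a 32-bit instruction)
    println(s"low RVC: ${isInputLowRvc(word)}, high RVC: ${isInputHighRvc(word)}")
  }
}
```

With such a word the upper half only holds the first 16 bits of a 32-bit instruction, which is the situation where the hardware has to keep those bits in `bufferData` until the next word arrives.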


@ -35,18 +35,21 @@ class IBusCachedPlugin(resetVector : BigInt = 0x80000000l,
memoryTranslatorPortConfig : Any = null,
injectorStage : Boolean = false,
withoutInjectorStage : Boolean = false,
relaxPredictorAddress : Boolean = true) extends IBusFetcherImpl(
relaxPredictorAddress : Boolean = true,
predictionBuffer : Boolean = true) extends IBusFetcherImpl(
resetVector = resetVector,
keepPcPlus4 = keepPcPlus4,
decodePcGen = compressedGen,
compressedGen = compressedGen,
cmdToRspStageCount = (if(config.twoCycleCache) 2 else 1) + (if(relaxedPcCalculation) 1 else 0),
pcRegReusedForSecondStage = true,
allowPcRegReusedForSecondStage = true,
injectorReadyCutGen = false,
prediction = prediction,
historyRamSizeLog2 = historyRamSizeLog2,
injectorStage = (!config.twoCycleCache && !withoutInjectorStage) || injectorStage,
relaxPredictorAddress = relaxPredictorAddress){
relaxPredictorAddress = relaxPredictorAddress,
fetchRedoGen = true,
predictionBuffer = predictionBuffer){
import config._
assert(isPow2(cacheSize))
@ -58,7 +61,6 @@ class IBusCachedPlugin(resetVector : BigInt = 0x80000000l,
var iBus : InstructionCacheMemBus = null
var mmuBus : MemoryTranslatorBus = null
var privilegeService : PrivilegeService = null
var redoBranch : Flow[UInt] = null
var decodeExceptionPort : Flow[ExceptionCause] = null
val tightlyCoupledPorts = ArrayBuffer[TightlyCoupledPort]()
def tightlyGen = tightlyCoupledPorts.nonEmpty
@ -86,9 +88,6 @@ class IBusCachedPlugin(resetVector : BigInt = 0x80000000l,
FLUSH_ALL -> True
))
redoBranch = pipeline.service(classOf[JumpService]).createJumpInterface(pipeline.decode, priority = 1) //Priority 1 will win against branch predictor
if(catchSomething) {
val exceptionService = pipeline.service(classOf[ExceptionService])
decodeExceptionPort = exceptionService.newExceptionPort(pipeline.decode,1)
@ -157,7 +156,7 @@ class IBusCachedPlugin(resetVector : BigInt = 0x80000000l,
stages(0).halt setWhen (cache.io.cpu.prefetch.haltIt)
cache.io.cpu.fetch.isRemoved := fetcherflushIt
cache.io.cpu.fetch.isRemoved := externalFlush
}
@ -237,19 +236,9 @@ class IBusCachedPlugin(resetVector : BigInt = 0x80000000l,
decodeExceptionPort.code := 1
}
when(!iBusRsp.readyForError){
redoFetch := False
cache.io.cpu.fill.valid := False
when(redoFetch) {
iBusRsp.redoFetch := True
}
// when(pipeline.stages.map(_.arbitration.flushIt).orR){
// cache.io.cpu.fill.valid := False
// }
redoBranch.valid := redoFetch
redoBranch.payload := (if (decodePcGen) decode.input(PC) else cacheRsp.pc)
decode.arbitration.flushNext setWhen(redoBranch.valid)
cacheRspArbitration.halt setWhen (issueDetected || iBusRspOutputHalt)


@ -233,27 +233,31 @@ class IBusSimplePlugin( resetVector : BigInt,
val rspHoldValue : Boolean = false,
val singleInstructionPipeline : Boolean = false,
val memoryTranslatorPortConfig : Any = null,
relaxPredictorAddress : Boolean = true
relaxPredictorAddress : Boolean = true,
predictionBuffer : Boolean = true
) extends IBusFetcherImpl(
resetVector = resetVector,
keepPcPlus4 = keepPcPlus4,
decodePcGen = compressedGen,
compressedGen = compressedGen,
cmdToRspStageCount = busLatencyMin + (if(cmdForkOnSecondStage) 1 else 0),
pcRegReusedForSecondStage = !(cmdForkOnSecondStage && cmdForkPersistence),
allowPcRegReusedForSecondStage = !(cmdForkOnSecondStage && cmdForkPersistence),
injectorReadyCutGen = false,
prediction = prediction,
historyRamSizeLog2 = historyRamSizeLog2,
injectorStage = injectorStage,
relaxPredictorAddress = relaxPredictorAddress){
relaxPredictorAddress = relaxPredictorAddress,
fetchRedoGen = memoryTranslatorPortConfig != null,
predictionBuffer = predictionBuffer){
var iBus : IBusSimpleBus = null
var decodeExceptionPort : Flow[ExceptionCause] = null
val catchSomething = memoryTranslatorPortConfig != null || catchAccessFault
var mmuBus : MemoryTranslatorBus = null
var redoBranch : Flow[UInt] = null
if(rspHoldValue) assert(busLatencyMin <= 1)
// if(rspHoldValue) assert(busLatencyMin <= 1)
assert(!rspHoldValue, "rspHoldValue not supported yet")
assert(!singleInstructionPipeline)
override def setup(pipeline: VexRiscv): Unit = {
super.setup(pipeline)
@ -268,7 +272,6 @@ class IBusSimplePlugin( resetVector : BigInt,
if(memoryTranslatorPortConfig != null) {
mmuBus = pipeline.service(classOf[MemoryTranslator]).newTranslationPort(MemoryTranslatorPort.PRIORITY_INSTRUCTION, memoryTranslatorPortConfig)
redoBranch = pipeline.service(classOf[JumpService]).createJumpInterface(pipeline.decode, priority = 1) //Priority 1 will win against branch predictor
}
}
@ -282,9 +285,12 @@ class IBusSimplePlugin( resetVector : BigInt,
iBus.cmd << (if(cmdWithS2mPipe) cmd.s2mPipe() else cmd)
//Avoid sending too many iBus cmd
val pendingCmd = Reg(UInt(log2Up(pendingMax + 1) bits)) init (0)
val pendingCmdNext = pendingCmd + cmd.fire.asUInt - iBus.rsp.fire.asUInt
pendingCmd := pendingCmdNext
val pending = new Area{
val inc, dec = Bool()
val value = Reg(UInt(log2Up(pendingMax + 1) bits)) init (0)
val next = value + U(inc) - U(dec)
value := next
}
val secondStagePersistence = cmdForkPersistence && cmdForkOnSecondStage && !cmdWithS2mPipe
def cmdForkStage = if(!secondStagePersistence) iBusRsp.stages(if(cmdForkOnSecondStage) 1 else 0) else iBusRsp.stages(1)
@ -292,41 +298,43 @@ class IBusSimplePlugin( resetVector : BigInt,
val cmdFork = if(!secondStagePersistence) new Area {
//This implementation keeps the cmd on the bus until it's executed or the pipeline is flushed
def stage = cmdForkStage
stage.halt setWhen(stage.input.valid && (!cmd.valid || !cmd.ready))
if(singleInstructionPipeline) {
cmd.valid := stage.input.valid && pendingCmd =/= pendingMax && !stages.map(_.arbitration.isValid).orR
assert(injectorStage == false)
assert(iBusRsp.stages.dropWhile(_ != stage).length <= 2)
}else {
cmd.valid := stage.input.valid && stage.output.ready && pendingCmd =/= pendingMax
}
val canEmit = stage.output.ready && pending.value =/= pendingMax
stage.halt setWhen(stage.input.valid && (!canEmit || !cmd.ready))
cmd.valid := stage.input.valid && canEmit
pending.inc := cmd.fire
} else new Area{
//This implementation keeps the cmd on the bus until it's executed, even if the pipeline is flushed
def stage = cmdForkStage
val pendingFull = pendingCmd === pendingMax
val cmdKeep = RegInit(False) setWhen(cmd.valid) clearWhen(cmd.ready)
val pendingFull = pending.value === pendingMax
val enterTheMarket = Bool()
val cmdKeep = RegInit(False) setWhen(enterTheMarket) clearWhen(cmd.ready)
val cmdFired = RegInit(False) setWhen(cmd.fire) clearWhen(stage.input.ready)
stage.halt setWhen(cmd.isStall || (pendingFull && !cmdFired))
cmd.valid := (stage.input.valid || cmdKeep) && !pendingFull && !cmdFired
enterTheMarket := stage.input.valid && !pendingFull && !cmdFired && !cmdKeep
// stage.halt setWhen(cmd.isStall || (pendingFull && !cmdFired)) //(cmd.isStall)
stage.halt setWhen(pendingFull && !cmdFired && !cmdKeep)
stage.halt setWhen(!cmd.ready && !cmdFired)
cmd.valid := enterTheMarket || cmdKeep
pending.inc := enterTheMarket
}
val mmu = (mmuBus != null) generate new Area {
mmuBus.cmd.isValid := cmdForkStage.input.valid
mmuBus.cmd.virtualAddress := cmdForkStage.input.payload
mmuBus.cmd.bypassTranslation := False
mmuBus.end := cmdForkStage.output.fire || fetcherflushIt
mmuBus.end := cmdForkStage.output.fire || externalFlush
cmd.pc := mmuBus.rsp.physicalAddress(31 downto 2) @@ U"00"
//do not emit memory request if MMU miss
when(mmuBus.rsp.exception || mmuBus.rsp.refilling){
cmdForkStage.halt := False
cmd.valid := False
}
when(mmuBus.busy){
cmdForkStage.input.valid := False
cmdForkStage.input.ready := False
//do not emit memory request if MMU had issues
when(cmdForkStage.input.valid) {
when(mmuBus.rsp.refilling) {
cmdForkStage.halt := True
cmd.valid := False
}
when(mmuBus.rsp.exception) {
cmdForkStage.halt := False
cmd.valid := False
}
}
val joinCtx = stageXToIBusRsp(cmdForkStage, mmuBus.rsp)
@ -339,51 +347,43 @@ class IBusSimplePlugin( resetVector : BigInt,
val rspJoin = new Area {
import iBusRsp._
//Manage flush for iBus transactions in flight
val discardCounter = Reg(UInt(log2Up(pendingMax + 1) bits)) init (0)
discardCounter := discardCounter - (iBus.rsp.fire && discardCounter =/= 0).asUInt
when(fetcherflushIt) {
if(secondStagePersistence)
discardCounter := pendingCmd + cmd.valid.asUInt - iBus.rsp.fire.asUInt
else
discardCounter := (if(cmdForkOnSecondStage) pendingCmdNext else pendingCmd - iBus.rsp.fire.asUInt)
}
val rspBufferOutput = Stream(IBusSimpleRsp())
val rspBuffer = if(!rspHoldValue) new Area{
val rspBuffer = new Area {
val output = Stream(IBusSimpleRsp())
val c = StreamFifoLowLatency(IBusSimpleRsp(), busLatencyMin + (if(cmdForkOnSecondStage && cmdForkPersistence) 1 else 0))
c.io.push << iBus.rsp.throwWhen(discardCounter =/= 0).toStream
c.io.flush := fetcherflushIt
rspBufferOutput << c.io.pop
} else new Area{
val rspStream = iBus.rsp.throwWhen(discardCounter =/= 0).toStream
val validReg = RegInit(False) setWhen(rspStream.valid) clearWhen(rspBufferOutput.ready)
rspBufferOutput << rspStream
rspBufferOutput.valid setWhen(validReg)
val discardCounter = Reg(UInt(log2Up(pendingMax + 1) bits)) init (0)
discardCounter := discardCounter - (c.io.pop.valid && discardCounter =/= 0).asUInt
when(iBusRsp.flush) {
discardCounter := (if(cmdForkOnSecondStage) pending.next else pending.value - U(pending.dec))
}
c.io.push << iBus.rsp.toStream
// if(compressedGen) c.io.flush setWhen(decompressor.consumeCurrent)
// if(!compressedGen && isDrivingDecode(IBUS_RSP)) c.io.flush setWhen(decode.arbitration.flushNext && iBusRsp.output.ready)
val flush = discardCounter =/= 0 || iBusRsp.flush
output.valid := c.io.pop.valid && discardCounter === 0
output.payload := c.io.pop.payload
c.io.pop.ready := output.ready || flush
pending.dec := c.io.pop.fire // iBus.rsp.valid && flush || c.io.pop.valid && output.ready instead to avoid unnecessary dependencies ?
}
val fetchRsp = FetchRsp()
fetchRsp.pc := stages.last.output.payload
fetchRsp.rsp := rspBufferOutput.payload
fetchRsp.rsp.error.clearWhen(!rspBufferOutput.valid) //Avoid interference with instruction injection from the debug plugin
fetchRsp.rsp := rspBuffer.output.payload
fetchRsp.rsp.error.clearWhen(!rspBuffer.output.valid) //Avoid interference with instruction injection from the debug plugin
val join = Stream(FetchRsp())
val exceptionDetected = False
val redoRequired = False
join.valid := stages.last.output.valid && rspBufferOutput.valid
join.valid := stages.last.output.valid && rspBuffer.output.valid
join.payload := fetchRsp
stages.last.output.ready := stages.last.output.valid ? join.fire | join.ready
rspBufferOutput.ready := join.fire
output << join.haltWhen(exceptionDetected || redoRequired)
rspBuffer.output.ready := join.fire
output << join.haltWhen(exceptionDetected)
if(memoryTranslatorPortConfig != null){
redoRequired setWhen( stages.last.input.valid && mmu.joinCtx.refilling)
redoBranch.valid := redoRequired && iBusRsp.readyForError
redoBranch.payload := decode.input(PC)
decode.arbitration.flushIt setWhen(redoBranch.valid)
decode.arbitration.flushNext setWhen(redoBranch.valid)
when(stages.last.input.valid && mmu.joinCtx.refilling) {
iBusRsp.redoFetch := True
}
}
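
The new `pending` area replaces `pendingCmd` with an explicit `inc`/`dec` pair, and `cmdFork` only emits while `pending.value =/= pendingMax`. The bookkeeping can be sketched in plain Scala as below. This is a behavioural model under the assumption that one increment and one decrement may happen in the same cycle; `Pending` as a case class and its method names are illustrative, not the plugin's API:

```scala
// Cycle-level model of the in-flight command counter.
final case class Pending(value: Int, pendingMax: Int) {
  // Mirrors `canEmit = stage.output.ready && pending.value =/= pendingMax`
  def canEmit(stageOutputReady: Boolean): Boolean =
    stageOutputReady && value != pendingMax

  // Mirrors `next = value + U(inc) - U(dec)`
  def next(inc: Boolean, dec: Boolean): Pending =
    copy(value = value + (if (inc) 1 else 0) - (if (dec) 1 else 0))
}

object PendingDemo {
  def main(args: Array[String]): Unit = {
    var p = Pending(value = 0, pendingMax = 2)
    p = p.next(inc = true, dec = false)  // one command issued
    p = p.next(inc = true, dec = false)  // second command issued, counter is now full
    println(s"value=${p.value} canEmit=${p.canEmit(stageOutputReady = true)}")
    p = p.next(inc = false, dec = true)  // one response popped
    println(s"value=${p.value} canEmit=${p.canEmit(stageOutputReady = true)}")
  }
}
```

Keeping `inc` and `dec` as separate signals is what lets the two `cmdFork` variants and `rspBuffer` each drive their own side of the counter.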


@ -122,7 +122,7 @@ object StreamForkVex{
object StreamVexPimper{
implicit class StreamFlushPimper[T <: Data](pimped : Stream[T]){
def m2sPipeWithFlush(flush : Bool, discardInput : Boolean = true, collapsBubble : Boolean = true): Stream[T] = {
def m2sPipeWithFlush(flush : Bool, discardInput : Boolean = true, collapsBubble : Boolean = true, flushInput : Bool = null): Stream[T] = {
val ret = cloneOf(pimped)
val rValid = RegInit(False)
@ -132,7 +132,10 @@ object StreamVexPimper{
pimped.ready := (Bool(collapsBubble) && !ret.valid) || ret.ready
when(pimped.ready) {
rValid := pimped.valid
if(flushInput == null)
rValid := pimped.valid
else
rValid := pimped.valid && !flushInput
rData := pimped.payload
}
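
The added `flushInput` parameter only changes what the valid register samples when the stage accepts a new element. A plain-Scala sketch of that single update rule follows. It models only the lines visible in this hunk; how `flush` and `discardInput` clear `rValid` is outside the hunk and deliberately not modelled. `M2sValidModel` and the use of `Option` for the `null` default are illustrative choices:

```scala
// Model of the rValid update inside m2sPipeWithFlush, restricted to this hunk.
object M2sValidModel {
  // Mirrors `pimped.ready := (Bool(collapsBubble) && !ret.valid) || ret.ready`
  def pimpedReady(collapsBubble: Boolean, retValid: Boolean, retReady: Boolean): Boolean =
    (collapsBubble && !retValid) || retReady

  // Next value of rValid; flushInput = None stands for `flushInput == null`
  def nextValid(rValid: Boolean, pimpedValid: Boolean, ready: Boolean,
                flushInput: Option[Boolean]): Boolean =
    if (!ready) rValid                                   // register holds its value
    else pimpedValid && !flushInput.getOrElse(false)     // sample input unless it is being flushed

  def main(args: Array[String]): Unit = {
    val ready = pimpedReady(collapsBubble = false, retValid = true, retReady = true)
    println(nextValid(rValid = true, pimpedValid = true, ready = ready, flushInput = Some(true)))
  }
}
```

In the fetcher this lets the first stage pass `flushInput = sFlushed` so that an element entering the register during a flush is dropped without flushing the stage's input side unconditionally.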


@ -2,6 +2,7 @@ package vexriscv.plugin
import vexriscv._
import vexriscv.VexRiscv
import spinal.core._
import spinal.lib.KeepAttribute
//Input buffer generally prevents FPGA synthesis from duplicating registers inside the DSP cell, which could stress timings quite a lot.
class MulPlugin(inputBuffer : Boolean = false) extends Plugin[VexRiscv]{
@ -94,6 +95,12 @@ class MulPlugin(inputBuffer : Boolean = false) extends Plugin[VexRiscv]{
insert(MUL_LH) := aSLow * bHigh
insert(MUL_HL) := aHigh * bSLow
insert(MUL_HH) := aHigh * bHigh
Component.current.afterElaboration{
//Prevent synthesis tools from retiming RS1 RS2 from execute stage to decode stage, which leads to bad timings (ex : Vivado, even if retiming is disabled)
KeepAttribute(input(RS1))
KeepAttribute(input(RS2))
}
}
//First aggregation of partial multiplication
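
The `MUL_LH`, `MUL_HL` and `MUL_HH` inserts above are 16-bit partial products that later stages aggregate. As a rough illustration of why that decomposition works, here is an unsigned-only plain-Scala sketch. It assumes a fourth low-by-low product exists outside this hunk and ignores the signed handling suggested by `aSLow` / `bSLow`; `PartialProducts` and `mul32` are illustrative names:

```scala
// Unsigned 32x32 -> 64 bit multiplication rebuilt from four 16x16 partial products.
object PartialProducts {
  def mul32(a: Long, b: Long): BigInt = {
    val aLow  = a & 0xFFFF
    val aHigh = (a >>> 16) & 0xFFFF
    val bLow  = b & 0xFFFF
    val bHigh = (b >>> 16) & 0xFFFF

    val ll = BigInt(aLow * bLow)    // weight 2^0
    val lh = BigInt(aLow * bHigh)   // weight 2^16
    val hl = BigInt(aHigh * bLow)   // weight 2^16
    val hh = BigInt(aHigh * bHigh)  // weight 2^32

    ll + ((lh + hl) << 16) + (hh << 32)
  }

  def main(args: Array[String]): Unit = {
    val a = 0x12345678L
    val b = 0x9ABCDEF0L
    println(mul32(a, b) == BigInt(a) * BigInt(b))
  }
}
```

Each 16x16 product fits the multiplier width of a typical FPGA DSP cell, which is why keeping `RS1` / `RS2` from being retimed into those cells matters for timing.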

File diff suppressed because it is too large


@ -1,122 +1,174 @@
:0200000480007A
:10000000930E1000970000009380C06F73905030E3
:10001000970000009380807273905010B71001F029
:100020001301000023A02000130E1000170F000082
:10003000130FCF0073000000130E2000B720000044
:10004000938000801301000073B0003073200130F2
:1000500097000000938040017390103473002030AB
:100060006F008068170F0000130F4F02730000002D
:100070006F008067130E3000170F0000130F0F0181
:10008000832010006F004066130E4000B720000070
:1000900093800080371100001301018073B000309D
:1000A000732001309700000093804001739010345A
:1000B000730020306F004063170F0000130F0F0113
:1000C000832010006F004062130E5000B720000024
:1000D000938000801301000073B000307320013062
:1000E000970000009380400173901034730020301B
:1000F0006F00805F170F0000130F0F0183201000A7
:100100006F00805E130E600093000001739020303A
:10011000130E7000170F0000130F0F018320100043
:100120006F00805C130E8000170F0000130FCF03C9
:10013000B720000093800080371100001301018078
:1001400073B00030732001309700000093804001AD
:1001500073901034730020306F000059832010001A
:100160006F008058130E9000170F0000130F8F03BD
:10017000B7200000938000801301000073B00030AE
:100180007320013097000000938040017390103479
:10019000730020306F004055832010006F00C05462
:1001A000130EA000170F0000130FCF03B71001F0BC
:1001B0001301000023A02000930080007390003002
:1001C000B71000009380008073904030B71001F0AA
:1001D0001301100023A02000730050106F00C050C6
:1001E000130EB000170F0000130F8F06B71001F0A9
:1001F0001301000023A020009300800073900030C2
:10020000B71000009380008073904030B72000004A
:1002100093800080371100001301018073B000301B
:1002200073200130970000009380400173901034D8
:10023000730020306F00404BB71001F01301100025
:1002400023A02000730050106F00004A130EC0005E
:10025000170F0000130F4F06B71001F01301000035
:1002600023A020009300800073900030B71000009E
:100270009380008073904030B7200000938000800E
:100280001301000073B000307320013097000000AC
:100290009380400173901034730020306F00C0448D
:1002A000B71001F01301100023A0200073005010BC
:1002B0006F0080439300200073900010130EE00045
:1002C000170F0000130F0F04B72001F013010000F7
:1002D00023A02000930020007390003093000020A2
:1002E00073904030930E0000B72001F0130110000E
:1002F00023A02000930040069380F0FFE34E10FE01
:10030000130EF000170F0000130F8F06B72001F037
:100310001301000023A02000930020007390003000
:100320009300002073904030B7200000938000803D
:10033000371100001301018073B0003073200130C9
:1003400097000000938040017390103473002030B8
:100350006F008039930E1000B72001F013011000D8
:1003600023A02000730050106F000038130E00010E
:10037000170F0000130F0F06B72001F01301000044
:1003800023A02000930020007390003093000020F1
:1003900073904030B720000093800080130100006C
:1003A00073B000307320013097000000938040014B
:1003B00073901034730020306F000033B72001F0C9
:1003C0001301100023A02000730050106F00C031F3
:1003D000130E10019300002073903030170F0000AF
:1003E000130F0F04B72001F01301000023A0200019
:1003F00093002000739000309300002073904030F1
:10040000930E0000B72001F01301100023A020007C
:10041000930040069380F0FFE34E10FE130E200180
:10042000170F0000130F8F06B72001F01301000013
:1004300023A0200093002000739000309300002040
:1004400073904030B7200000938000803711000087
:100450001301018073B00030732001309700000059
:100460009380400173901034730020306F00C027D8
:10047000930E1000B72001F01301100023A02000FC
:10048000730050106F004026130E3001170F00004C
:10049000130F0F06B72001F01301000023A0200066
:1004A0009300200073900030930000207390403040
:1004B000B7200000938000801301000073B000306B
:1004C0007320013097000000938040017390103436
:1004D000730020306F004021B72001F0130110009D
:1004E00023A02000730050106F000020B72001F0FF
:1004F0001301000023A02000130E4001170F00007D
:10050000130F0F039300200073900030930000201E
:1005100073904030930E00009300002073A04014AD
:10052000930040069380F0FFE34E10FE130E50013F
:10053000170F0000130F0F069300002073B0401434
:10054000930020007390003093000020739040309F
:10055000B720000093800080371100001301018054
:1005600073B0003073200130970000009380400189
:1005700073901034730020306F000017930E10003A
:100580009300002073A04014730050106F00C0153A
:10059000130E6001170F0000130F8F05930000204A
:1005A00073B040149300200073900030930000203B
:1005B000739040309300002073A04014B7200000D7
:1005C000938000801301000073B00030732001306D
:1005D0009700000093804001739010347300203026
:1005E0006F008010730050106F000010130E700128
:1005F000930E0000B72001F01301000023A020009B
:100600009300002073B04014F3214034B72001F070
:100610001301100023A020009300002073B04014A9
:10062000F3214034B72001F01301000023A0200083
:100630009300002073B04014F3214034B72001F040
:100640001301000023A020009300002073A0401499
:10065000F3214034B72001F01301100023A0200043
:100660009300002073A04014F3214034B72001F020
:100670001301000023A02000130E8001930020002E
:1006800073A0403073A0403473A00030930E10006C
:10069000170F0000130FCF03B720000093800080D6
:1006A000371100001301018073B000307320013056
:1006B0009700000093804001739010347300203045
:1006C0006F008002730050106F000002130E900143
:1006D000170F0000130F4F017360043073005010A8
:1006E0006F0080006F000001370110F0130141F22C
:1006F0002320C101370110F0130101F22320010072
:10070000E3840EFEF3202034F3201034F320003075
:10071000F32030349300000873B0003093002000C1
:10072000E38A1EFCB72000009380008073A0003095
:1007300073101F3473002030E3880EFAF320201466
:10074000F3201014F3200010F32030147300000085
:10075000130000001300000013000000130000004D
:0807600013000000130000006B
:10000000930E1000971000009380C0A3739050309F
:1000100097100000938080A673905010B71001F0E5
:100020001301000023A020001300000013000000B3
:100030001300000013000000130000001300000074
:100040001300000013000000130E1000170F000033
:10005000130FCF0073000000130E2000B720000024
:10006000938000801301000073B0003073200130D2
:10007000970000009380400173901034730020308B
:100080006F00901A170F0000130F4F02730000004B
:100090006F009019130E3000170F0000130F0F019F
:1000A000832010006F005018130E4000B72000008E
:1000B00093800080371100001301018073B000307D
:1000C000732001309700000093804001739010343A
:1000D000730020306F005015170F0000130F0F0131
:1000E000832010006F005014130E5000B720000042
:1000F000938000801301000073B000307320013042
:1001000097000000938040017390103473002030FA
:100110006F009011170F0000130F0F0183201000C4
:100120006F009010130E6000930000017390203058
:10013000130E7000170F0000130F0F018320100023
:100140006F00900E130E8000170F0000130FCF03E7
:10015000B720000093800080371100001301018058
:1001600073B000307320013097000000938040018D
:1001700073901034730020306F00100B8320100038
:100180006F00900A130E9000170F0000130F8F03DB
:10019000B7200000938000801301000073B000308E
:1001A0007320013097000000938040017390103459
:1001B000730020306F005007832010006F00D006BE
:1001C000130EA000170F0000130FCF07B71001F098
:1001D0001301000023A02000130000001300000002
:1001E00013000000130000001300000013000000C3
:1001F0001300000013000000930080007390003093
:10020000B71000009380008073904030B71001F069
:100210001301100023A020001300000013000000B1
:100220001300000013000000130000001300000082
:100230001300000013000000730050106F00C07E18
:10024000130EB000170F0000130F8F0AB71001F044
:100250001301000023A02000130000001300000081
:100260001300000013000000130000001300000042
:100270001300000013000000930080007390003012
:10028000B71000009380008073904030B7200000CA
:1002900093800080371100001301018073B000309B
:1002A0007320013097000000938040017390103458
:1002B000730020306F004077B71001F01301100079
:1002C00023A0200013000000130000001300000012
:1002D00013000000130000001300000013000000D2
:1002E00013000000730050106F000074130EC00064
:1002F000170F0000130F4F0AB71001F01301000091
:1003000023A02000130000001300000013000000D1
:100310001300000013000000130000001300000091
:10032000130000009300800073900030B7100000AD
:100330009380008073904030B7200000938000804D
:100340001301000073B000307320013097000000EB
:100350009380400173901034730020306F00C06CA4
:10036000B71001F01301100023A0200013000000BB
:100370001300000013000000130000001300000031
:100380001300000013000000130000007300501061
:100390006F0080699300200073900010130EE0003E
:1003A000170F0000130F0F08B72001F01301000012
:1003B00023A0200013000000130000001300000021
:1003C00013000000130000001300000013000000E1
:1003D0001300000093002000739000309300002071
:1003E00073904030930E0000B72001F0130110000D
:1003F00023A02000130000001300000013000000E1
:1004000013000000130000001300000013000000A0
:1004100013000000930040069380F0FFE34E10FEAF
:10042000130EF000170F0000130F8F0AB72001F012
:100430001301000023A0200013000000130000009F
:100440001300000013000000130000001300000060
:100450001300000013000000930020007390003090
:100460009300002073904030B720000093800080FC
:10047000371100001301018073B000307320013088
:100480009700000093804001739010347300203077
:100490006F008059930E1000B72001F01301100077
:1004A00023A0200013000000130000001300000030
:1004B00013000000130000001300000013000000F0
:1004C00013000000730050106F000056130E00015F
:1004D000170F0000130F0F0AB72001F013010000DF
:1004E00023A02000130000001300000013000000F0
:1004F00013000000130000001300000013000000B0
:10050000130000009300200073900030930000203F
:1005100073904030B72000009380008013010000EA
:1005200073B00030732001309700000093804001C9
:1005300073901034730020306F00004FB72001F02B
:100540001301100023A0200013000000130000007E
:10055000130000001300000013000000130000004F
:100560001300000013000000730050106F00C04B18
:10057000130E10019300002073903030170F00000D
:10058000130F0F08B72001F01301000023A0200073
:10059000130000001300000013000000130000000F
:1005A00013000000130000001300000013000000FF
:1005B000930020007390003093000020739040302F
:1005C000930E0000B72001F01301100023A02000BB
:1005D00013000000130000001300000013000000CF
:1005E00013000000130000001300000013000000BF
:1005F000930040069380F0FFE34E10FE130E20019F
:10060000170F0000130F8F0AB72001F0130100002D
:1006100023A02000130000001300000013000000BE
:10062000130000001300000013000000130000007E
:10063000130000009300200073900030930000200E
:1006400073904030B7200000938000803711000085
:100650001301018073B00030732001309700000057
:100660009380400173901034730020306F00C03BC2
:10067000930E1000B72001F01301100023A02000FA
:10068000130000001300000013000000130000001E
:10069000130000001300000013000000130000000E
:1006A000730050106F004038130E3001170F000018
:1006B000130F0F0AB72001F01301000023A0200040
:1006C00013000000130000001300000013000000DE
:1006D00013000000130000001300000013000000CE
:1006E00093002000739000309300002073904030FE
:1006F000B7200000938000801301000073B0003029
:1007000073200130970000009380400173901034F3
:10071000730020306F004031B72001F0130110004A
:1007200023A02000130000001300000013000000AD
:10073000130000001300000013000000130000006D
:1007400013000000730050106F00002EB72001F05E
:100750001301000023A0200013000000130000007C
:10076000130000001300000013000000130000003D
:100770001300000013000000130E4001170F0000CB
:10078000130F0F039300200073900030930000209C
:1007900073904030930E00009300002073A040142B
:1007A000930040069380F0FFE34E10FE130E5001BD
:1007B000170F0000130F0F069300002073B04014B2
:1007C000930020007390003093000020739040301D
:1007D000B7200000938000803711000013010180D2
:1007E00073B0003073200130970000009380400107
:1007F00073901034730020306F000023930E1000AC
:100800009300002073A04014730050106F00C021AB
:10081000130E6001170F0000130F8F0593000020C7
:1008200073B04014930020007390003093000020B8
:10083000739040309300002073A04014B720000054
:10084000938000801301000073B0003073200130EA
:1008500097000000938040017390103473002030A3
:100860006F00801C730050106F00001C130E70018D
:10087000930E0000B72001F01301000023A0200018
:10088000130000001300000013000000130000001C
:10089000130000001300000013000000130000000C
:1008A0009300002073B04014F3214034B72001F0CE
:1008B0001301100023A0200013000000130000000B
:1008C00013000000130000001300000013000000DC
:1008D00013000000130000009300002073B04014C8
:1008E000F3214034B72001F01301000023A02000C1
:1008F00013000000130000001300000013000000AC
:10090000130000001300000013000000130000009B
:100910009300002073B04014F3214034B72001F05D
:100920001301000023A020001300000013000000AA
:10093000130000001300000013000000130000006B
:1009400013000000130000009300002073A0401467
:10095000F3214034B72001F01301100023A0200040
:10096000130000001300000013000000130000003B
:10097000130000001300000013000000130000002B
:100980009300002073A04014F3214034B72001F0FD
:100990001301000023A0200013000000130000003A
:1009A00013000000130000001300000013000000FB
:1009B0001300000013000000130E800193002000BC
:1009C00073A0403073A0403473A00030930E100029
:1009D000170F0000130FCF03B72000009380008093
:1009E000371100001301018073B000307320013013
:1009F0009700000093804001739010347300203002
:100A00006F008002730050106F000002130E9001FF
:100A1000170F0000130F4F01736004307300501064
:100A20006F0080006F000001370110F0130141F2E8
:100A30002320C101370110F0130101F2232001002E
:100A4000E3840EFEF3202034F3201034F320003032
:100A5000F32030349300000873B00030930020007E
:100A6000E38A1EFCB72000009380008073A0003052
:100A700073101F3473002030E3880EFAF320201423
:100A8000F3201014F3200010F32030147300000042
:100A9000130000001300000013000000130000000A
:080AA000130000001300000028
:040000058000000077
:00000001FF

View File

@ -10,11 +10,28 @@
li x1, 0xF0011000; \
li x2, value; \
sw x2, 0(x1); \
nop; \
nop; \
nop; \
nop; \
nop; \
nop; \
nop; \
nop; \

#define externalInterruptS(value) \
li x1, 0xF0012000; \
li x2, value; \
sw x2, 0(x1); \
nop; \
nop; \
nop; \
nop; \
nop; \
nop; \
nop; \
nop; \

View File

@ -102,9 +102,9 @@ class DhrystoneBench extends FunSuite{
getDmips(
name = "GenLinuxBalenced",
gen = LinuxGen.main(Array.fill[String](0)("")),
testCmd = "make clean run IBUS=CACHED DBUS=CACHED DEBUG_PLUGIN=STD DHRYSTONE=yes SUPERVISOR=yes MMU=no CSR=yes CSR_SKIP_TEST=yes DEBUG_PLUGIN=no COMPRESSED=no MUL=yes DIV=yes LRSC=yes AMO=yes REDO=10 TRACE=no COREMARK=yes LINUX_REGRESSION=no"
testCmd = "make clean run IBUS=CACHED DBUS=CACHED DEBUG_PLUGIN=STD DHRYSTONE=yes SUPERVISOR=yes MMU=no CSR=yes CSR_SKIP_TEST=yes COMPRESSED=no MUL=yes DIV=yes LRSC=yes AMO=yes REDO=10 TRACE=no COREMARK=yes LINUX_REGRESSION=no"
)
//make run IBUS=CACHED DBUS=CACHED DEBUG_PLUGIN=STD DHRYSTONE=yess SUPERVISOR=yes CSR=yes COMPRESSED=no MUL=yes DIV=yes LRSC=yes AMO=yes REDO=1 TRACE=no LINUX_REGRESSION=yes SEED=42
test("final_report") {