RISC-V requires stack to be aligned to 16 bytes.
1d5384e669/riscv-elf.md?plain=1#L183
Right now, in bios/linker.ld, `_fstack` is being set to 8 bytes
before the end of sram region.
```
PROVIDE(_fstack = ORIGIN(sram) + LENGTH(sram) - 8);
```
Removing ` - 8` makes it aligned to 16.
Also there are changes in crt0.S for vexriscv,
vexriscv_smp and cv32e40p.
Code that was setting up stack, was adding 4 to its address
for some reason.
Removing it makes it aligned to 8 bytes, and with change in
bios/linker.ld to 16 bytes.
It also fixes `printf` with long long integers on 32bit
CPUs ([relevant issue](https://github.com/riscv/riscv-gcc/issues/63)).
A regular CPU can provides specific mapping constraints and we are overriding provided mapping
with these constraints.
The case of CPUNone is different and we can do the opposite: Give priority to User's mapping.
For the regular CPU case, the override was done silently, it is now logged during the build.
- Rename optional #define and allow defining them externally.
- Add comments.
- Rename FLASH_CHIP_MX25L12833F_QUAD to SPIFLASH_MODULE_QUAD_CAPABLE.
- Rename FLASH_CHIP_MX25L12833F_QPI to SPIFLASH_MODULE_QPI_CAPABLE.
The instructions used for QUAD/QPI are probably different between chips, we could
imagine providing them through the LiteX integration based on the passed SPI Flash
module.
One small FPGAs running the BIOS from SPI Flash, the default divisor of 9 was slowing down too
much BIOS boot time (It was OK on reboot after liblitespi auto-calibration). Reduce the default
divisor to avoid this.
The implementation was causing regressions on actual designs, rework done:
- Only keep a common iteration loop as before.
- Add iteration on CLKO dividers (to fall in the VCO range).
- Do the iterations as before, if while doing it we find a clock suitable for feedback: just use it.
- If no feedback clock has been found: create it (if at least one free output available, if not raise an error).
- Shift in _csr_rd_buf should only been done when buf is set.
- When CSR size is not an exact multiple of the CSR data-width, the gap is in
the low addresses, not the high ones. So offset is introduced to take this into
account.
Dynamically adjusting the phase of a feedback will cause it to unlock.
The phase adjust ports are shared by all the outputs, so there is no
technical way to prevent this. Allow the user to indicate that they
will not adjust a clock when requesting an output by setting
uses_dpa=False, and only consider those that the user has promised not
to use.