rewrite pins
parent
addd660bf2
commit
2b698fc08a
|
@ -75,16 +75,19 @@ If you do not delete the container you can run
|
||||||
|
|
||||||
when you need to rebuild. If you need shell access, run `make $NAME-shell`.
|
when you need to rebuild. If you need shell access, run `make $NAME-shell`.
|
||||||
|
|
||||||
|
## Launch TFTP Server
|
||||||
|
|
||||||
|
Install py3tftp (`pip3 install --user py3tftp`). Then run `make tftp` to
|
||||||
|
launch the TFTP server. Keep this terminal open.
|
||||||
|
|
||||||
## Flash FPGA
|
## Flash FPGA
|
||||||
|
|
||||||
Plug in your FPGA into the USB slot. Then run
|
Plug in your FPGA into the USB slot. Then run
|
||||||
|
|
||||||
openFPGALoader -c digilent upsilon/boot/digilent_arty.bit
|
openFPGALoader -c digilent upsilon/boot/digilent_arty.bit
|
||||||
|
|
||||||
## Launch TFTP Server
|
In a second you should see messages in the TFTP terminal. This means your
|
||||||
|
controller is successfully connected to your computer.
|
||||||
Install py3tftp (`pip3 install --user py3tftp`). Then run `make tftp` to
|
|
||||||
launch the TFTP server. Keep this terminal open.
|
|
||||||
|
|
||||||
## SSH Access
|
## SSH Access
|
||||||
|
|
||||||
|
|
|
@ -1,14 +0,0 @@
|
||||||
This document is for recording notes on measurements done on Upsilon
|
|
||||||
running on actual FPGAs.
|
|
||||||
|
|
||||||
Commit: `9c2731ad8d794d0b3c46999a40f0064f2b020c69`
|
|
||||||
FPGA: Arty A7-100T
|
|
||||||
F4PGA commit: `f43bb728b1bd9ef3807ef65bcf6b6629e0fa71f5`
|
|
||||||
|
|
||||||
ADCs:
|
|
||||||
|
|
||||||
SPI clocks take about 10ns to start going up and down from low voltage. They
|
|
||||||
rise to about 500mV in that time. MISO oscillates up and down up to 50mV with
|
|
||||||
no data present, rising and stays at that until it oscillates down. Should not
|
|
||||||
be a problem. Probably capacitance/crosstalk. Ringing of about 40mV on clock
|
|
||||||
and SS.
|
|
|
@ -1,582 +0,0 @@
|
||||||
Upsilon Maintenance Manual. This document may be distributed under your choice
|
|
||||||
of the GNU GPL v3.0 (or any later version), or under the [CC BY-SA 4.0][CC].
|
|
||||||
|
|
||||||
[CC]: https://creativecommons.org/licenses/by-sa/4.0/legalcode
|
|
||||||
|
|
||||||
# Introduction
|
|
||||||
|
|
||||||
This document is aimed at maintainers of this software who are not
|
|
||||||
experienced programmers (in either software or hardware). Its goal is
|
|
||||||
to contain any pertinent information to the development process of Upsilon.
|
|
||||||
|
|
||||||
This manual is (hopefully) modular enough that you can just skip to the
|
|
||||||
section you need without having to read the entire thing.
|
|
||||||
|
|
||||||
## Organization of the Project
|
|
||||||
|
|
||||||
Upsilon uses LiteX and Linux for its FPGA code. LiteX generates HDL
|
|
||||||
and glues it together. It also forms the build system of the hardware
|
|
||||||
portion of Upsilon. Linux is the kernel portion, which deals with
|
|
||||||
communication between the computer that receives scan data and the
|
|
||||||
hardware that is executing the scan.
|
|
||||||
|
|
||||||
LiteX further uses F4PGA to compile the HDL code. F4PGA is primarily
|
|
||||||
made up of Yosys (synthesis) and nextpnr (place and route).
|
|
||||||
|
|
||||||
## Required Knowledge
|
|
||||||
|
|
||||||
This document is written under the assumption that you are using Linux.
|
|
||||||
You can make this work on other platforms but I don't know how to.
|
|
||||||
|
|
||||||
Verilog is critical for writing hardware. You should hopefully not have
|
|
||||||
to write much of it.
|
|
||||||
|
|
||||||
The kernel is written in C. This C is different than C you have written
|
|
||||||
before because it is running "freestanding."
|
|
||||||
|
|
||||||
You do not need to know about Linux kernel development. You will need
|
|
||||||
to know the basics of ssh, vi, and how to use Unix as a user.
|
|
||||||
|
|
||||||
Tests are written in C++ and Verilog. You will not have to write C++
|
|
||||||
unless you modify the Verilog files.
|
|
||||||
|
|
||||||
The macro processing language GNU m4 is used occasionally. You will
|
|
||||||
need to know how to use m4 if you modify the main `base.v.m4` file
|
|
||||||
(e.g. adding more software-accessible ports).
|
|
||||||
|
|
||||||
Python is used for the SoC generator. The SoC generator uses a library called
|
|
||||||
LiteX, which in turn uses migen. You do not need to know a lot about migen,
|
|
||||||
but LiteX's documentation is poor so you will need to know some migen in order
|
|
||||||
to read the code and understand how some modules work.
|
|
||||||
|
|
||||||
# Compile Process
|
|
||||||
|
|
||||||
Although each component uses a different build system, you can run everything
|
|
||||||
with
|
|
||||||
1. `make` (compile everything in this folder)
|
|
||||||
2. `make clean` (clean up all compiled files)
|
|
||||||
|
|
||||||
## Setting up the Toolchain
|
|
||||||
|
|
||||||
The toolchain is primarily designed around modern Linux. It may not work
|
|
||||||
properly on Windows or MacOS. If you have access to a computational
|
|
||||||
cluster (if you are at FSU physics, ask the Physics department) then
|
|
||||||
you should set up the toolchain on their servers. You will be able to
|
|
||||||
compile things on any computer with an internet connection.
|
|
||||||
|
|
||||||
### F4PGA
|
|
||||||
|
|
||||||
1. Clone [F4PGA](https://github.com/chipsalliance/f4pga) (if you want,
|
|
||||||
checkout commit `b6c5fff`, but you should try checking out master
|
|
||||||
first)
|
|
||||||
2. Run `scripts/prepare_environment.sh`. Note that you will need to change
|
|
||||||
the environment variable `$F4PGA_INSTALL_DIR` if you do not have access
|
|
||||||
to the default directory (which is root access).
|
|
||||||
3. Run `scripts/activate.sh`. If you run into problems, open the file and
|
|
||||||
copy the `source` and `conda` commands manually into your terminal.
|
|
||||||
4. Install meson and ninja through pip.
|
|
||||||
|
|
||||||
All commands should be done in the conda environment.
|
|
||||||
|
|
||||||
### LiteX
|
|
||||||
|
|
||||||
1. Download `litex_setup.py` from the [LiteX repository][litex_repo] (Upsilon
uses 2022.08) to some directory (don't put it in your home directory because
there will be a bunch of downloaded repositories).
|
|
||||||
2. Run `litex_setup.py --init --install --user --tag 2022.08`
|
|
||||||
3. Download a GCC RISC-V cross compiler. If you have root access to the build
|
|
||||||
machine, then you can probably install this with your package manager. Users
|
|
||||||
of Ubuntu 14 can download the [sifive][sifive_gcc] GCC. Otherwise you will have
|
|
||||||
to compile a cross compiler (`x86_64` host to RV32I target) manually.
|
|
||||||
4. Put the GCC RISC-V cross compiler in your `$PATH` variable.
|
|
||||||
|
|
||||||
[litex_repo]: https://github.com/enjoy-digital/litex
|
|
||||||
[sifive_gcc]: https://github.com/sifive/freedom-tools/releases
|
|
||||||
|
|
||||||
### Buildroot
|
|
||||||
|
|
||||||
Buildroot builds a Linux system for the FPGA. To build the images, download a stable
|
|
||||||
version of Buildroot that the config files support and run
|
|
||||||
|
|
||||||
make BR2_EXTERNAL=/upsilon_directory/buildroot litex_vexriscv_defconfig
|
|
||||||
|
|
||||||
### OpenSBI
|
|
||||||
|
|
||||||
OpenSBI is a platform independent interface between the hardware and the kernel.
|
|
||||||
Download the latest version of OpenSBI that the config files support. Copy
|
|
||||||
the files in the `opensbi` directory to the `targets` directory and run
|
|
||||||
|
|
||||||
make CROSS_COMPILE=riscv64-linux-gnu- PLATFORM=litex/vexriscv
|
|
||||||
|
|
||||||
## FPGA Build System
|
|
||||||
|
|
||||||
Make sure F4PGA and a RISC-V GCC compiler are in your path. Then just go into
|
|
||||||
the `firmware` folder and run `make`. This should generate everything you need
|
|
||||||
and compile the software. The synthesis suite is single threaded. This will
|
|
||||||
take about 15-20 minutes on a good computer.
|
|
||||||
|
|
||||||
The FPGA firmware (aka gateware) build system is designed in a recursive
|
|
||||||
manner. That means that each directory has a Makefile that processes all the
|
|
||||||
files in the directory. There is a `common.makefile` in the `rtl/` directory
|
|
||||||
that is used when a rule (such as preprocessing a Verilog source file)
|
|
||||||
is used in multiple Makefiles.
|
|
||||||
|
|
||||||
For the Arty A7, the bitstream is `firmware/build/digilent_arty/gateware/digilent_arty.bit`.
|
|
||||||
|
|
||||||
## Software Build System
|
|
||||||
|
|
||||||
It is recommended to use the [docker files][docker].
|
|
||||||
|
|
||||||
[docker]: https://software.mcgoron.com/peter/upsilon-docker
|
|
||||||
|
|
||||||
# Loading the Software and Firmware
|
|
||||||
|
|
||||||
## Network Setup
|
|
||||||
|
|
||||||
You will need the FPGA and the controlling computer on the same wired
|
|
||||||
network. **DO NOT CONNECT THE FPGA TO A WIDE NETWORK. USE A PRIVATE LAN
|
|
||||||
THAT ONLY CONTAINS THE CONTROLLING COMPUTER AND THE FPGA. DO NOT ATTEMPT
|
|
||||||
TO CONNECT THE FPGA TO THE INTERNET.** The controlling computer can
|
|
||||||
still connect to the internet, but through another LAN port. The best
|
|
||||||
thing to do is to buy a USB to Ethernet adapter.
|
|
||||||
|
|
||||||
The default TFTP client connects to 192.168.1.100.
|
|
||||||
|
|
||||||
## Connecting to the FPGA Over USB
|
|
||||||
|
|
||||||
Connect to the FPGA over USB and run `litex_term /dev/ttyUSB1` (or whatever
|
|
||||||
connection it should be) and you should see the LiteX BIOS come up.
|
|
||||||
|
|
||||||
## Loading the Firmware
|
|
||||||
|
|
||||||
Connect the FPGA to a computer using a Micro-USB to USB cable. Run
|
|
||||||
`openFPGALoader -c digilent digilent_arty.bit` to upload the firmware
|
|
||||||
(gateware) to the controller.
|
|
||||||
|
|
||||||
You can load the software using serial boot but this is very slow. The
|
|
||||||
better thing to do is to use TFTP boot, which goes over Ethernet.
|
|
||||||
**WHEN YOU RUN TFTP, DO NOT EXPOSE IT ON AN INTERNET-CONNECTED
NETWORK INTERFACE. THIS IS A BIG SECURITY RISK. ONLY RUN
TFTP FOR THE AMOUNT OF TIME REQUIRED TO BOOT THE CONTROL SOFTWARE.**
|
|
||||||
You can read about how to set up a TFTP server on the [OpenWRT wiki][owrt_wiki].
|
|
||||||
|
|
||||||
Using DNSMasq on Linux, run
|
|
||||||
|
|
||||||
dnsmasq -d --port=0 --enable-tftp --tftp-root=/path/to/firmware/directory --user=root --group=root --interface=$INTERFACE
|
|
||||||
|
|
||||||
Do not use `--tftp-no-blocksize`. The controller will only read the first
|
|
||||||
512 bytes of the kernel.
|
|
||||||
|
|
||||||
In the root of the TFTP server, have `boot.bin` be the kernel binary
|
|
||||||
(`zephyr.bin`).
|
|
||||||
|
|
||||||
[owrt_wiki]: https://openwrt.org/docs/guide-user/troubleshooting/tftpserver
|
|
||||||
|
|
||||||
# FPGA
|
|
||||||
|
|
||||||
Upsilon runs on a Field Programmable Gate Array (FPGA). FPGAs are sets
|
|
||||||
of logic gates and other peripherals that can be changed by a computer.
|
|
||||||
FPGAs can implement CPUs, digital filters, and control code at a much
|
|
||||||
higher speed than a computer. The downside is that FPGAs are much more
|
|
||||||
difficult to program for.
|
|
||||||
|
|
||||||
A large part of Upsilon is written in Verilog. Verilog is a Hardware
|
|
||||||
Description Language (HDL), which is similar to a programming language
|
|
||||||
(such as C++ or Python).
|
|
||||||
|
|
||||||
The difference is that Verilog compiles to a *piece of hardware* that
|
|
||||||
deals with individual bits executing operations in sync with a clock. This
|
|
||||||
differs from a *piece of software*, which is a set of instructions that a
|
|
||||||
computer follows. Verilog is usually much less abstract than regular code.
|
|
||||||
|
|
||||||
Regular code is tested on the system in which it is run. Hardware,
|
|
||||||
on the other hand, is very difficult to test on the device that it
|
|
||||||
is actually running on. Hardware is usually *simulated*. This project
|
|
||||||
primarily simulates Verilog code using the program Verilator, where the
|
|
||||||
code that runs the simulation is written in C++.
|
|
||||||
|
|
||||||
Instead of strings, integers, and classes, the basic components of all
|
|
||||||
Verilog code are the wire and the register, which store bits (1 and 0).
|
|
||||||
Wires connect components together, and registers store data, in a similar
|
|
||||||
way to variables in software. Unlike usual programming languages, where
|
|
||||||
code executes one step at a time, most FPGA code runs at the tick of
|
|
||||||
the system clock in parallel.
|
|
||||||
|
|
||||||
To compile Verilog to a format suitable for execution on an FPGA, you
|
|
||||||
*synthesize* the Verilog into a low-level format that uses the specific
|
|
||||||
resources of the FPGA you are using, and then you run a *place and route*
|
|
||||||
program to allocate resources on the FPGA to fit your design. Running
|
|
||||||
synthesis on its own can help you understand how many resources a module
|
|
||||||
uses. Place-and-route gives you *timing reports*, which tell you about
|
|
||||||
major design problems that outstrip the capabilities of the FPGA (or the
|
|
||||||
programs you are using). You should look up what "timing" on an FPGA is
|
|
||||||
and learn as much as you can about it, because it is an issue that does
|
|
||||||
not happen in standard software and can be very difficult to fix when
|
|
||||||
you run into it.
|
|
||||||
|
|
||||||
Once a bitstream is generated, it is loaded onto an FPGA through a cable
using a loader program (for this project, openFPGALoader).
|
|
||||||
|
|
||||||
## Recommendations to Learners
|
|
||||||
|
|
||||||
[Gisselquist Technology][GT] is the best free online resource for FPGA
|
|
||||||
programming out there. These articles will help you understand how to
|
|
||||||
write *good* FPGA code, not just valid code.
|
|
||||||
|
|
||||||
[GT]: https://zipcpu.com/
|
|
||||||
|
|
||||||
Here are some exercises for you to ease yourself into FPGA programming.
|
|
||||||
|
|
||||||
* Write an FPGA program that implements addition without using the `+`
|
|
||||||
operator. This program should add each number bit by bit, handling
|
|
||||||
carried digits properly. The one-bit building block for this is called a *full adder*.
|
|
||||||
* Write an FPGA program that multiplies two signed integers together,
|
|
||||||
without using the `*` operator. The width of these integers should
|
|
||||||
not be hard-coded: it should be easy to change. What you write in
|
|
||||||
this is something that is actually a part of this project: see
|
|
||||||
`boothmul.v`. You do not (and should not!) write it just like Upsilon
|
|
||||||
has written it.
|
|
||||||
* Write an FPGA program that communicates over SPI. For simplicity,
|
|
||||||
you only need to write it for a single SPI mode: look up on the internet
|
|
||||||
for details. There is an SPI slave device in this repository that you
|
|
||||||
can use to simulate an end for the SPI master you write, but you should
|
|
||||||
write the SPI slave yourself. For bonus points, connect your SPI master
|
|
||||||
to a real SPI device and confirm that your communication works.
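For the first exercise, it can help to have a software reference model to compare your simulation against. The following is a minimal Python sketch of bit-by-bit addition, not a Verilog solution; the function names `full_adder` and `ripple_add` are made up for this illustration.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x, y, width=8):
    """Add two unsigned `width`-bit integers one bit at a time.

    Returns (result, carry_out), where result is truncated to `width` bits.
    """
    carry = 0
    result = 0
    for i in range(width):
        # Feed bit i of each operand and the carry from the previous bit.
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry
```

A simulation can then check the hardware output for every input pair against `ripple_add` (or simply against `+` with the result masked to the bit width).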
|
|
||||||
|
|
||||||
For each of these exercises, follow the complete "Design Testing Process"
|
|
||||||
below. At the very least, write simulations and test your programs on
|
|
||||||
real hardware.
|
|
||||||
|
|
||||||
## Control and Status Registers in Hardware
|
|
||||||
|
|
||||||
LiteX uses "Control and Status Registers" (CSRs) to communicate between
|
|
||||||
the CPU and any Verilog modules. (RISC-V CPUs have something with the
|
|
||||||
same name, but Upsilon does not use that.)
|
|
||||||
|
|
||||||
## Design Testing Process
|
|
||||||
|
|
||||||
### Simulation
|
|
||||||
|
|
||||||
When you write or modify a Verilog module, the first thing you should do
|
|
||||||
is write/run a simulation of that module. A simulation of that module
|
|
||||||
should at the minimum compare the execution of the module with known
|
|
||||||
results (called "Ground truth testing"). A simulation should also consider
|
|
||||||
edge cases that you might overlook when writing Verilog.
|
|
||||||
|
|
||||||
For example, a module that multiplies two signed integers together should
|
|
||||||
have a simulation that sends the module many pairs of integers, taking
|
|
||||||
care to ensure that all possible permutations of sign are tested (i.e.
|
|
||||||
positive times positive, negative times positive, etc.) and also that
|
|
||||||
special cases are handled (e.g. largest 32-bit integer multiplied by
|
|
||||||
largest negative 32-bit integer, multiplication by 0 and 1, etc.).
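As a rough illustration of ground truth testing, the sketch below builds a list of test vectors for a signed multiplier. It is written in Python only to show the idea; the real tests in this project are C++ run under Verilator, and the names `signed_range` and `multiplier_test_vectors` are hypothetical.

```python
import itertools
import random


def signed_range(width):
    """Smallest and largest two's-complement values of the given width."""
    return -(1 << (width - 1)), (1 << (width - 1)) - 1


def multiplier_test_vectors(width=32, random_pairs=100, seed=0):
    """Return (a, b, expected_product) tuples covering edge cases and signs."""
    lo, hi = signed_range(width)
    # Special cases: extremes, zero, and plus/minus one, in every combination.
    special = [lo, -1, 0, 1, hi]
    pairs = list(itertools.product(special, repeat=2))
    # Random pairs for each of the four sign combinations.
    rng = random.Random(seed)
    for sign_a, sign_b in itertools.product((1, -1), repeat=2):
        for _ in range(random_pairs):
            pairs.append((sign_a * rng.randint(1, hi),
                          sign_b * rng.randint(1, hi)))
    # Python integers do not overflow, so a * b is the exact expected result.
    return [(a, b, a * b) for a, b in pairs]
```

Each tuple would be fed to the module under test and the module's output compared with the third element. A fixed seed keeps failures reproducible.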
|
|
||||||
|
|
||||||
Writing simulation code is a very boring task, but you *must* do it.
|
|
||||||
Otherwise there is no way for you to check that
|
|
||||||
|
|
||||||
1. Your code does what you want it to do
|
|
||||||
2. Any changes you make to your code don't break it
|
|
||||||
|
|
||||||
If you find a bug that isn't covered by your simulation, make sure you
|
|
||||||
add that case to the simulation.
|
|
||||||
|
|
||||||
The file `firmware/rtl/testbench.hpp` contains a class that you should
|
|
||||||
use to organize individual tests. Make a derived class of `TB` and
|
|
||||||
use the `posedge()` function to encode what default actions your test
|
|
||||||
should take at every positive edge of the clock. Remember, in C++ each
|
|
||||||
action is blocking: there is no equivalent to the non-blocking `<=`.
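To make the difference concrete, here is a small Python model (an illustration only, not project code) of two registers that exchange values on a clock edge. The register names `a` and `b` and both function names are made up for this example.

```python
def step_blocking(regs):
    """Sequential updates: the second assignment sees the first one's result."""
    regs = dict(regs)
    regs["a"] = regs["b"]
    regs["b"] = regs["a"]  # reads the already-updated value of a
    return regs


def step_nonblocking(regs):
    """Verilog-style `<=`: every right-hand side is read before any write."""
    return {"a": regs["b"], "b": regs["a"]}
```

With `{"a": 1, "b": 2}` as input, the blocking version leaves both registers equal, while the non-blocking version swaps them. Test code that needs the second behaviour has to save the old values first, as `step_nonblocking` does.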
|
|
||||||
|
|
||||||
If you have to do a lot of non-blocking code for your test, you
|
|
||||||
should write a Verilog wrapper for your test that implements
|
|
||||||
the non-blocking code. **Verilator only supports a subset of
|
|
||||||
non-synthesizable Verilog. Unless you really need to, use synthesizable
|
|
||||||
Verilog only.** See `firmware/rtl/waveform/waveform_sim.v` and
|
|
||||||
`firmware/rtl/waveform/dma_sim.v` for an example of Verilog files only
|
|
||||||
used for tests.
|
|
||||||
|
|
||||||
### Test Synthesis
|
|
||||||
|
|
||||||
**Yosys only accepts a subset of Verilog. You might write a bunch of
|
|
||||||
code that Verilator will happily simulate but that will fail to go
|
|
||||||
through Yosys.**
|
|
||||||
|
|
||||||
Once you have simulated your design, you should use Yosys to synthesize it.
|
|
||||||
This will allow you to understand how much and what resources the module
|
|
||||||
is taking up. To do this, you can put the following in a script file:
|
|
||||||
|
|
||||||
read_verilog module_1.v
|
|
||||||
read_verilog module_2.v
|
|
||||||
...
|
|
||||||
read_verilog top_module.v
|
|
||||||
synth_xilinx -flatten -nosrl -noclkbuf -nodsp -iopad -nowidelut
|
|
||||||
write_verilog yosys_synth_output.v
|
|
||||||
|
|
||||||
and run `yosys -s scriptfile`. The options to `synth_xilinx` reflect
|
|
||||||
the current limitations that F4PGA has. The file `xc7.f4pga.tcl` that
|
|
||||||
F4PGA downloads is the complete synthesis script; read it to understand
the internals of what F4PGA does to compile your Verilog.
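If you synthesize many modules, writing the script file by hand gets tedious. The following Python sketch generates the script text shown above from a list of source files. The helper name `yosys_script` is an assumption for illustration and is not part of this repository; the Yosys commands are the ones listed above.

```python
def yosys_script(sources, output="yosys_synth_output.v"):
    """Build a Yosys script that reads `sources` in order and synthesizes them.

    List the top module last, as in the hand-written script.
    """
    lines = [f"read_verilog {src}" for src in sources]
    # Same options as the hand-written script, reflecting F4PGA's limitations.
    lines.append("synth_xilinx -flatten -nosrl -noclkbuf -nodsp -iopad -nowidelut")
    lines.append(f"write_verilog {output}")
    return "\n".join(lines) + "\n"
```

Write the returned string to a file and run `yosys -s` on that file as described above.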
|
|
||||||
|
|
||||||
### Test Compilation
|
|
||||||
|
|
||||||
I haven't been able to do this for most of this project. The basic idea
|
|
||||||
is to use `firmware/rtl/soc.py` to load only the module to test, and
|
|
||||||
to use LiteScope to write and read values from the module. For more
|
|
||||||
information, you can look at
|
|
||||||
[the boothmul test](https://software.mcgoron.com/peter/boothmul/src/branch/master/arty_test).
|
|
||||||
|
|
||||||
# Software Programming
|
|
||||||
|
|
||||||
The "software" is the code written in C that runs on the FPGA. This
|
|
||||||
handles access to hardware components, running scripts sent by the
|
|
||||||
controlling computer, and sending information between the hardware and
|
|
||||||
the controlling computer.
|
|
||||||
|
|
||||||
## Crash Course in Multithreaded Programming
|
|
||||||
|
|
||||||
Each script (up to 32 by default, change by redefining a macro) runs in
|
|
||||||
a separate thread. This allows for multiple scripts to execute without
|
|
||||||
having to explicitly hand control from one component to another, but
|
|
||||||
since there is no defined execution path (one thread may execute before
|
|
||||||
or after another thread), the program must handle scripts attempting to
|
|
||||||
access the same component.
|
|
||||||
|
|
||||||
Upsilon handles multiple threads using
|
|
||||||
|
|
||||||
1. Mutexes
|
|
||||||
2. Thread Local Storage
|
|
||||||
|
|
||||||
Mutexes ("mutual exclusion") are objects that only allow for one thread
|
|
||||||
to access them at a time. When one thread locks a mutex, other threads
|
|
||||||
attempting to lock the mutex sleep until the thread unlocks the mutex.
|
|
||||||
After the thread that locked the mutex unlocks it, some other thread gets
|
|
||||||
the mutex.
|
|
||||||
|
|
||||||
Mutex management is important because if multiple threads attempt to
|
|
||||||
read or write to a converter at the same time, the scripts could deadlock,
|
|
||||||
requiring a hard reset of the system. (You could add manual deadlock
|
|
||||||
aborting by adding new commands that call `k_thread_abort`, as long as
|
|
||||||
all threads are not deadlocked. This is a hack but may be necessary.)
|
|
||||||
|
|
||||||
Each thread can lock the mutex as many times as it wants, but it must
|
|
||||||
unlock the mutex the same number of times. Thread local storage (the
|
|
||||||
`__thread` modifier) is used to count the number of times that each mutex
|
|
||||||
is locked by a thread. Since (as the name implies) TLS is thread-local,
|
|
||||||
there is no need to control access to it by mutexes: each thread gets
|
|
||||||
its own local version of the thread local variables.
|
|
||||||
|
|
||||||
The software has to count the number of recursive locks because when
|
|
||||||
the thread finally releases control of the mutex, another thread must
|
|
||||||
be able to access the hardware in a well defined state: it should not
|
|
||||||
attempt to write to hardware while the hardware is running (certain
|
|
||||||
specific exceptions apply). When the unlock routines (see for example
|
|
||||||
`waveform_release()`) reach the final unlock
|
|
||||||
(e.g. `waveform_locked[i] == 1`), the software waits for the hardware
|
|
||||||
to finish what it's doing before unlocking.
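The counting scheme can be sketched in Python as follows. This is only an analogue of the idea, not the C code in the kernel: `CountedLock`, `take`, `release`, and the `wait_for_hardware` callback are hypothetical names, and `threading.local` stands in for the `__thread` counter.

```python
import threading


class CountedLock:
    """A lock that one thread may take repeatedly, with a per-thread count."""

    def __init__(self, wait_for_hardware=lambda: None):
        self._lock = threading.Lock()
        self._local = threading.local()  # each thread sees its own depth
        self._wait_for_hardware = wait_for_hardware

    def _depth(self):
        return getattr(self._local, "depth", 0)

    def take(self):
        # Only the first take by this thread touches the real lock.
        if self._depth() == 0:
            self._lock.acquire()
        self._local.depth = self._depth() + 1

    def release(self):
        depth = self._depth()
        if depth == 0:
            raise RuntimeError("release without matching take")
        if depth == 1:
            # Final unlock: let the hardware finish before handing it over.
            self._wait_for_hardware()
            self._lock.release()
        self._local.depth = depth - 1
```

Because the depth counter is thread-local, it needs no lock of its own. Only the final `release` waits for the hardware, so nested take/release pairs stay cheap.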
|
|
||||||
|
|
||||||
The kernel implements "time-slicing", which means that each running
|
|
||||||
program executes in chunks. After each chunk is finished, another
|
|
||||||
program can execute. The amount of time for each thread is controlled
|
|
||||||
by `CONFIG_TIMESLICE_SIZE` in `prj.conf`. When executing critical code,
|
|
||||||
use `k_sched_lock` and `k_sched_unlock`.
|
|
||||||
|
|
||||||
TODO: Use `k_thread_time_slice_set` to implement an abort check for
|
|
||||||
threads.
|
|
||||||
|
|
||||||
## Crash Course in Network Programming
|
|
||||||
|
|
||||||
The kernel communicates with the controlling computer using a TCP/IP
|
|
||||||
connection. You should connect the controller and the computer to a
|
|
||||||
router and assign the kernel a static IP.
|
|
||||||
|
|
||||||
Each script that runs on the kernel is a separate connection. Each
|
|
||||||
connection runs on a separate thread, because each thread runs a Creole
|
|
||||||
interpreter.
|
|
||||||
|
|
||||||
TCP can usually detect when a connection breaks, but you should gracefully
|
|
||||||
shutdown all connections. Otherwise dead connections can hang around for
|
|
||||||
minutes at a time.
|
|
||||||
|
|
||||||
### Static IPs
|
|
||||||
|
|
||||||
The client and controller IPs are baked into the software *and firmware*
|
|
||||||
at build time. The software configuration is in `software/prj.conf`. The
|
|
||||||
firmware configuration is in `firmware/soc.py` (see `local_ip` and `remote_ip`
|
|
||||||
settings in `SoCCore`).
|
|
||||||
|
|
||||||
The controlling computer must have its static IP on the interface connected
|
|
||||||
to the controller to be the same as `remote_ip`. By default this is `192.168.1.100`.
|
|
||||||
|
|
||||||
## Logging
|
|
||||||
|
|
||||||
TODO: Do logging via UDP?
|
|
||||||
|
|
||||||
Logging is done via UART. Connect the micro-USB slot to the controlling
|
|
||||||
computer to get debug output.
|
|
||||||
|
|
||||||
All you need to know is
|
|
||||||
|
|
||||||
* Use `LOG_WRN` for errors that you can recover from (i.e. closing a
|
|
||||||
connection)
|
|
||||||
* Use `LOG_ERR` for errors that are fatal and halt the firmware,
|
|
||||||
requiring a reset
|
|
||||||
* Use `LOG_INF` for misc information (i.e. initialization completed,
|
|
||||||
accepted connection, closing connection)
|
|
||||||
* Use `LOG_DBG` for debugging output
|
|
||||||
|
|
||||||
If you need debugging output, add a line of the form
|
|
||||||
|
|
||||||
set_source_file_properties(src_file PROPERTIES COMPILE_FLAGS -DFILE_LOG_LEVEL=4)
|
|
||||||
|
|
||||||
This will enable debugging output for this file only. Do not enable
|
|
||||||
debugging output for the entire system! This will make the debugging
|
|
||||||
output unusable.
|
|
||||||
|
|
||||||
When you are done, set `4` to `3` in that line.
|
|
||||||
|
|
||||||
TODO: Ethernet debugging output.
|
|
||||||
|
|
||||||
## Control and Status Registers in Software
|
|
||||||
|
|
||||||
CSR read and write functions are generated by `/firmware/generate_csr_locations.py`.
|
|
||||||
You should not need to directly call `write` and `read` on raw addresses.
|
|
||||||
If you add a new CSR, add it to the generator script.
|
|
||||||
|
|
||||||
### Implementation Information
|
|
||||||
|
|
||||||
CSRs can be used in software by using `litex_write8`,
|
|
||||||
`litex_read16`, etc. In the Zephyr source, look at
|
|
||||||
`soc/riscv/litex-vexriscv/soc.h` for the complete implementation.
|
|
||||||
Also look at `include/zephyr/arch/common/sys_io.h` to see how these
|
|
||||||
functions are implemented.
|
|
||||||
|
|
||||||
Do not directly write to CSR ports without using `litex_writeN` and
|
|
||||||
`litex_readN`, and do not directly use `sys_io.h` functions. If you are
|
|
||||||
not careful you will not access the registers correctly and you will
|
|
||||||
crash the software.
|
|
||||||
|
|
||||||
# Controlling Computer
|
|
||||||
|
|
||||||
## Creole
|
|
||||||
|
|
||||||
Creole is the bytecode that the kernel runs. It is written using a
|
|
||||||
python library. It looks very similar to assembly, but is custom built
|
|
||||||
to make it easier to write direct assembly code.
|
|
||||||
|
|
||||||
Creole programs are the scripts run by the kernel to communicate with
|
|
||||||
hardware and send messages over Ethernet to the controlling computer.
|
|
||||||
Each creole program should do one thing: i.e. monitor an ADC, run
|
|
||||||
the raster scan, output waveforms, etc.
|
|
||||||
|
|
||||||
Creole programs should reserve the hardware modules (DAC, ADC, CLOOP,
|
|
||||||
waveforms) that they use explicitly. This makes your program faster
|
|
||||||
and less error prone.
|
|
||||||
|
|
||||||
Since the Creole assembler is a python library, you can use things
|
|
||||||
like Python format strings to automate production of Creole code. You
|
|
||||||
can also add virtual instructions (by directly modifying the library)
|
|
||||||
easily.
|
|
||||||
|
|
||||||
Creole has a concept of data blocks, assigned using the `DB` command.
|
|
||||||
These blocks are used for waveforms and for printing sets of data out
|
|
||||||
to the datastream.
|
|
||||||
|
|
||||||
Creole uses a [self-synchronizing code][ssc] to detect encoding and
|
|
||||||
transmission errors. This makes programs bigger, but you should not
|
|
||||||
write big Creole programs.
|
|
||||||
|
|
||||||
[ssc]: https://en.wikipedia.org/wiki/Self-synchronizing_code
|
|
||||||
|
|
||||||
The controlling computer sends a 16 bit little endian unsigned integer
|
|
||||||
(the size of the Creole program in bytes) followed by Creole bytecode.
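On the controlling computer, that framing could look like the Python sketch below. The function name `frame_creole_program` is made up for this example, and the bytes passed to it in any test are placeholders, not real Creole bytecode.

```python
import struct


def frame_creole_program(bytecode: bytes) -> bytes:
    """Prefix bytecode with its length as a 16-bit little-endian unsigned int."""
    if len(bytecode) > 0xFFFF:
        raise ValueError("program too large for a 16-bit length prefix")
    # "<H" is little-endian, unsigned, 2 bytes.
    return struct.pack("<H", len(bytecode)) + bytecode
```

The result is what would be written to the TCP connection. The 16-bit prefix caps a program at 65535 bytes.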
|
|
||||||
|
|
||||||
# Hacks and Pitfalls
|
|
||||||
|
|
||||||
The open source software stack that Upsilon uses is novel and unstable.
|
|
||||||
|
|
||||||
## LiteX
|
|
||||||
|
|
||||||
Set `compile_software` to `False` in `soc.py` when checking for Verilog
|
|
||||||
compile errors. Set it back when you do an actual compile run, or your
|
|
||||||
program will not boot.
|
|
||||||
|
|
||||||
If LiteX complains about not having a RISC-V compiler, that is because
your system does not have a compatible RISC-V compiler in your `$PATH`.
|
|
||||||
Refer to the LiteX install instructions above to see how to set up the
|
|
||||||
SiFive GCC, which will work.
|
|
||||||
|
|
||||||
## F4PGA
|
|
||||||
|
|
||||||
This is really a Yosys bug (and really, an abc bug). F4PGA defaults to using
|
|
||||||
the ABC flow, which can break, especially for block RAM. To fix, edit out
|
|
||||||
`-abc` in the tcl script (find it before you install it...)
|
|
||||||
|
|
||||||
## Yosys
|
|
||||||
|
|
||||||
Yosys fails to calculate computed parameter values correctly. For instance,
|
|
||||||
|
|
||||||
parameter CTRLVAL = 5;
|
|
||||||
localparam VALUE = CTRLVAL + 1;
|
|
||||||
|
|
||||||
Yosys will *silently* fail to compile this, setting `VALUE` to be equal
|
|
||||||
to 0. The solution is to use macros.
|
|
||||||
|
|
||||||
## Reset Pins
|
|
||||||
|
|
||||||
On the Arty A7 there is a Reset button. This is connected to the CPU and only
|
|
||||||
resets the CPU. Possibly due to timing issues modules get screwed up if they
|
|
||||||
share a reset pin with the CPU. The code currently connects button 0 to reset
|
|
||||||
the modules separately from the CPU.
|
|
||||||
|
|
||||||
## Clock Speeds
|
|
||||||
|
|
||||||
The output pins on the FPGA (except for the high speed PMOD outputs) cannot
|
|
||||||
switch fast enough to
|
|
||||||
|
|
||||||
## Macros
|
|
||||||
|
|
||||||
Verilog's preprocessor is awful. F4PGA (through yosys) barely supports it.
|
|
||||||
|
|
||||||
You should only use Verilog macros as a replacement for `localparam`.
|
|
||||||
When you need to do so, you must preprocess the file with
|
|
||||||
Verilator. For example, if you have a file called `mod.v` in the folder
|
|
||||||
`firmware/rtl/mod/`, then in the file `firmware/rtl/mod/Makefile` add
|
|
||||||
|
|
||||||
codegen: [...] mod_preprocessed.v
|
|
||||||
|
|
||||||
(putting it after all other generated files). The file
|
|
||||||
`firmware/rtl/common.makefile` should automatically generate the
|
|
||||||
preprocessed file for you.
|
|
||||||
|
|
||||||
## If The Controlling Computer Cannot Connect to the Internet
|
|
||||||
|
|
||||||
When you connect your computer to the controller over Ethernet, your computer
|
|
||||||
may attempt to route all traffic over the controller network (since it is
|
|
||||||
wired) instead of another network (like a wireless network). This means that
|
|
||||||
your computer can't connect to the internet (or your connection is really slow).
|
|
||||||
If this happens to you on a Linux machine, you can change the routing table.
|
|
||||||
|
|
||||||
Run `route -n` (or `ip route` if this does not work) to print the routing table.
|
|
||||||
Find the entry named `default via [...] dev eth-interface`. This is the default route
|
|
||||||
for the ethernet device. Remove it using `ip route del default via [...] dev eth-interface`.
|
|
||||||
|
|
||||||
If the route keeps on reappearing, delete it and quickly enter
|
|
||||||
`ip route add default via [...] dev eth0 metric 65534`. This will make the
|
|
||||||
route the last priority.
|
|
||||||
|
|
||||||
## Getting The Correct IP for the Controlling Computer
|
|
||||||
|
|
||||||
Some routers can automatically assign IPs based on MAC address. If your router
|
|
||||||
can do that, great. Otherwise you will need to configure your computer with a
|
|
||||||
static ip.
|
|
||||||
|
|
||||||
1. Remove your computer from the DHCP list that the router has.
|
|
||||||
2. Run `ip link set eth-interface up`.
|
|
||||||
3. Then run `ip addr` and run `ip addr del [ip] dev eth-interface` on
|
|
||||||
each ip on the ethernet interface that is connected to the controller.
|
|
||||||
3. Run `ip addr add 192.168.1.100/24 dev eth-interface` (or whatever ip + subnet
|
|
||||||
mask you need)
|
|
||||||
4. If `ip route` does not give a routing entry for `192.168.1.0/24`, run
|
|
||||||
`ip route add 192.168.1.0/24 dev eth0 proto kernel scope link` (again,
|
|
||||||
change depending on different situations)
|
|
||||||
|
|
||||||
This will use the static ip `192.168.1.100`, which is the default TFTP boot
|
|
||||||
IP.
|
|
|
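The relationship between the address in step 4 and the route in step 5 can be checked with Python's standard `ipaddress` module. This is a minimal sketch, not part of Upsilon: the helper name `static_ip_commands` and the interface name `eth0` are assumptions made for the example.

```python
import ipaddress


def static_ip_commands(address, interface):
    """Build the `ip` commands for a static address such as 192.168.1.100/24."""
    iface = ipaddress.ip_interface(address)
    return [
        f"ip link set {interface} up",
        f"ip addr add {iface.with_prefixlen} dev {interface}",
        # iface.network is the subnet the address belongs to (host bits cleared).
        f"ip route add {iface.network} dev {interface} proto kernel scope link",
    ]


for command in static_ip_commands("192.168.1.100/24", "eth0"):
    print(command)
```

Deriving the route from the address avoids a mismatch between the two when a different IP or subnet mask is needed.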
@@ -119,8 +119,8 @@ if __name__ == "__main__":
 {"read_only": False, "name": "dac_sel", "total": dac_num},
 {"read_only": True, "name": "dac_finished", "total": dac_num},
 {"read_only": False, "name": "dac_arm", "total": dac_num},
-{"read_only": True, "name": "from_dac", "total": dac_num},
-{"read_only": False, "name": "to_dac", "total": dac_num},
+{"read_only": True, "name": "dac_recv_buf", "total": dac_num},
+{"read_only": False, "name": "dac_send_buf", "total": dac_num},
 # {"read_only": False, "name": "wf_arm", "total": dac_num},
 # {"read_only": False, "name": "wf_halt_on_finish", "total": dac_num},
 # {"read_only": True, "name": "wf_finished", "total": dac_num},
@@ -132,9 +132,9 @@ if __name__ == "__main__":

 {"read_only": True, "name": "adc_finished", "total": adc_num},
 {"read_only": False, "name": "adc_arm", "total": adc_num},
-{"read_only": True, "name": "from_adc", "total": adc_num},
+{"read_only": True, "name": "adc_recv_buf", "total": adc_num},

 {"read_only": False, "name": "adc_sel", "total": adc_num},

 {"read_only": True, "name": "cl_in_loop", "total": 1},
 {"read_only": False, "name": "cl_cmd", "total": 1},
 {"read_only": False, "name": "cl_word_in", "total": 1},

@@ -35,8 +35,8 @@ m4_define(m4_dac_wires, ⟨
 input [$1-1:0] dac_sel_$2,
 output dac_finished_$2,
 input dac_arm_$2,
-output [DAC_WID-1:0] from_dac_$2,
-input [DAC_WID-1:0] to_dac_$2
+output [DAC_WID-1:0] dac_recv_buf_$2,
+input [DAC_WID-1:0] dac_send_buf_$2

 /*
 input wf_arm_$2,
@@ -61,7 +61,7 @@ m4_define(m4_adc_wires, ⟨
 input [$3-1:0] adc_sel_$2,
 output adc_finished_$2,
 input adc_arm_$2,
-output [$1-1:0] from_adc_$2
+output [$1-1:0] adc_recv_buf_$2
 ⟩)

 /* This is used in the body of the module. It declares the interconnect
@@ -109,8 +109,8 @@ m4_define(m4_dac_switch, ⟨
 .ss_L(ss_L_port_$2[0]),
 .finished(dac_finished_$2),
 .arm(dac_arm_$2),
-.from_slave(from_dac_$2),
-.to_slave(to_dac_$2)
+.from_slave(dac_recv_buf_$2),
+.to_slave(dac_send_buf_$2)
 )

 /*
@@ -192,7 +192,7 @@ m4_define(m4_adc_switch, ⟨
 .ss_L(adc_conv_L_port_$2[0]),
 .finished(adc_finished_$2),
 .arm(adc_arm_$2),
-.from_slave(from_adc_$2)
+.from_slave(adc_recv_buf_$2)
 );

 /* 2nd option for each ADC is the non-converting option.
@@ -296,7 +296,7 @@ m4_define(CL_DATA_WID, CL_CONSTS_WID)
 input [CL_DATA_WID-1:0] cl_word_in,
 output reg [CL_DATA_WID-1:0] cl_word_out,
 input cl_start_cmd,
-output reg cl_finish_cmd
+output cl_finish_cmd

 ,output reg test_clock
 );

@@ -165,8 +165,8 @@ class Base(Module, AutoCSR):
 self._make_csr("dac_sel", CSRStorage, 3, f"Select DAC {i} Output", num=i)
 self._make_csr("dac_finished", CSRStatus, 1, f"DAC {i} Transmission Finished Flag", num=i)
 self._make_csr("dac_arm", CSRStorage, 1, f"DAC {i} Arm Flag", num=i)
-self._make_csr("from_dac", CSRStatus, 24, f"DAC {i} Received Data", num=i)
-self._make_csr("to_dac", CSRStorage, 24, f"DAC {i} Data to Send", num=i)
+self._make_csr("dac_recv_buf", CSRStatus, 24, f"DAC {i} Received Data", num=i)
+self._make_csr("dac_send_buf", CSRStorage, 24, f"DAC {i} Data to Send", num=i)
 # self._make_csr("wf_arm", CSRStorage, 1, f"Waveform {i} Arm Flag", num=i)
 # self._make_csr("wf_halt_on_finish", CSRStorage, 1, f"Waveform {i} Halt on Finish Flag", num=i)
 # self._make_csr("wf_finished", CSRStatus, 1, f"Waveform {i} Finished Flag", num=i)
@@ -187,7 +187,7 @@ class Base(Module, AutoCSR):

 self._make_csr("adc_finished", CSRStatus, 1, f"ADC {i} Finished Flag", num=i)
 self._make_csr("adc_arm", CSRStorage, 1, f"ADC {i} Arm Flag", num=i)
-self._make_csr("from_adc", CSRStatus, 32, f"ADC {i} Received Data", num=i)
+self._make_csr("adc_recv_buf", CSRStatus, 32, f"ADC {i} Received Data", num=i)

 self._make_csr("cl_in_loop", CSRStatus, 1, "Control Loop Loop Enabled Flag")
 self._make_csr("cl_cmd", CSRStorage, 8, "Control Loop Command Input")

@@ -0,0 +1,25 @@
from mmio import *

def dac_write_value(val, num):
    # Bit 20 selects the data register; only the low 20 bits of val are sent.
    write_dac_send_buf(1 << 20 | val & 0xFFFFF, num) # 20 bit DAC
    write_dac_arm(1, num)
    write_dac_arm(0, num)

def dac_read_value(val, num):
    # Bit 23 marks the word as a read request for the register given in val.
    write_dac_send_buf(1 << 23 | val, num)
    write_dac_arm(1, num)
    write_dac_arm(0, num)
    return read_dac_recv_buf(num)

def dac_init(num):
    write_dac_sel(0,num)
    dac_write_value(0, num)
    # Write 1 << 2 to the register selected by bit 22, then read it back.
    write_dac_send_buf(1 << 22 | 1 << 2, num)
    write_dac_arm(1, num)
    write_dac_arm(0, num)
    return dac_read_value(1 << 22, num)

def adc_read_value(num):
    write_adc_arm(1, num)
    write_adc_arm(0, num)
    # The CSR is named adc_recv_buf (formerly from_adc), so use its accessor.
    return read_adc_recv_buf(num)

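The word layout used by `dac_write_value` and `dac_read_value` can be tried without hardware. The sketch below only mirrors the bit positions that appear in the code above (bit 20, bit 22, bit 23 and a 20-bit data field); the constant and function names are hypothetical and nothing here touches the `mmio` module.

```python
# Bit positions copied from the script above; the names are illustrative only.
DAC_DATA_MASK = 0xFFFFF  # low 20 bits carry the DAC value
DAC_REG_DATA = 1 << 20   # register selector used by dac_write_value
DAC_REG_CTRL = 1 << 22   # register selector used by dac_init
DAC_READ = 1 << 23       # read flag used by dac_read_value


def dac_write_word(val):
    """Pack a 24-bit word that writes the low 20 bits of val."""
    return DAC_REG_DATA | (val & DAC_DATA_MASK)


def dac_read_word(reg):
    """Pack a 24-bit word that requests a read of the register selected by reg."""
    return DAC_READ | reg


print(hex(dac_write_word(0x2ABCDE)))      # upper bits of val are masked off
print(hex(dac_read_word(DAC_REG_CTRL)))   # the word dac_init uses for read-back
```

Masking before combining keeps an out-of-range value from overwriting the register-select bits, which is the same reason the original code applies `& 0xFFFFF`.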
@@ -1,36 +0,0 @@
from micropython import const
from time import sleep_us
import machine

dac_sel = const(4026531844)
dac_arm = const(4026531852)
dac_fin = const(4026531848)
dac_from = const(4026531856)
dac_to = const(4026531860)

machine.mem8[dac_sel] = 1

def dac_comm(val):
    machine.mem32[dac_to] = val
    machine.mem8[dac_arm] = 1
    while machine.mem8[dac_fin] == 0:
        pass
    machine.mem8[dac_arm] = 0

def dac_read(val):
    dac_comm(1 << 23 | val)
    dac_comm(0)
    v = bin(machine.mem32[dac_from])
    print(v, len(v) - 2)

# dac_comm(0b11010010001)
dac_comm(1 << 22 | 1 << 2)
dac_comm(1 << 21 | (1 << 1))
dac_read(1 << 21)

def dac_ramp_up(fromval,toval,step,ival):
    assert step > 0
    while fromval <= toval:
        dac_comm(1 << 20 | fromval)
        sleep_us(ival)
        fromval = fromval + step