Sync velia

It seems one of our boxes started misbehaving. It reports 2.3 kW of
input power on one of its PSUs, while the usual values are around 21 W.
The value is way beyond what the ietf-hardware YANG model can represent,
so libyang throws a validation error which kills velia and our whole
NETCONF stack.
Logs are attached below (thanks to Jan Kundrát for those).

This brings in the patch that addresses the crash when trying to write
an invalid value into sysrepo. It *does not* fix the HW issue; rather,
it logs an error and signals an overflow/underflow in the sensor data.

Some logs:

 * sysfs reads of the power_input values; note the readings at
   14:11:29 (and the 0 at 14:12:02):

  add-drop-DQ000VOT ~ # while true; do date; cat /sys/class/hwmon/hwmon8/power*_input ; sleep 1; done
  ...
  Mon Dec 11 14:11:27 UTC 2023
  25250000
  21000000
  Mon Dec 11 14:11:28 UTC 2023
  25500000
  21000000
  Mon Dec 11 14:11:29 UTC 2023
  2316000000
  20000000
  Mon Dec 11 14:11:30 UTC 2023
  25250000
  21000000
  ...
  Mon Dec 11 14:12:01 UTC 2023
  25250000
  20000000
  Mon Dec 11 14:12:02 UTC 2023
  0
  20000000
  Mon Dec 11 14:12:03 UTC 2023
  25000000
  20000000
  ...

 * and the original velia crash:

  Dec 11 13:59:46 add-drop-DQ000VOT veliad-hardware[7997]: terminate called after throwing an instance of 'libyang::ErrorWithCode'
  Dec 11 13:59:46 add-drop-DQ000VOT veliad-hardware[7997]:   what():  Couldn't create a node with path '/ietf-hardware:hardware/component[name='ne:psu2:power-in']/sensor-data/value': LY_EVALID
  ...
  Dec 11 13:59:46 add-drop-DQ000VOT main[7997]: Processing node update /ietf-hardware:hardware/component[name='ne:psu2:power-in']/sensor-data/value -> 2316000000
  Dec 11 13:59:47 add-drop-DQ000VOT systemd[1]: velia-hardware-g2.service: Main process exited, code=dumped, status=6/ABRT
  Dec 11 13:59:47 add-drop-DQ000VOT systemd[1]: velia-hardware-g2.service: Failed with result 'core-dump'.
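
For context on the numbers above: the Linux hwmon sysfs interface reports power*_input in microwatts, and the sensor-value type in ietf-hardware (RFC 8348) is an int32 restricted to the range -1000000000 .. 1000000000. The shell sketch below is not velia's code; the function names classify_reading and microwatts_to_watts are made up for illustration. It only shows why a reading of 2316000000 cannot be stored while the usual readings can.

```shell
#!/bin/sh
# Illustrative only, not velia's implementation.
# Limits of the ietf-hardware sensor-value type (RFC 8348).
SENSOR_MIN=-1000000000
SENSOR_MAX=1000000000

# classify_reading RAW: print "ok", "overflow" or "underflow" for a raw reading.
classify_reading() {
    if [ "$1" -gt "$SENSOR_MAX" ]; then
        echo overflow
    elif [ "$1" -lt "$SENSOR_MIN" ]; then
        echo underflow
    else
        echo ok
    fi
}

# hwmon reports power in microwatts; integer division gives whole watts.
microwatts_to_watts() {
    echo $(( $1 / 1000000 ))
}

classify_reading 21000000        # prints: ok
classify_reading 2316000000      # prints: overflow
microwatts_to_watts 2316000000   # prints: 2316
```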

Depends-on: https://gerrit.cesnet.cz/c/CzechLight/velia/+/6703
Change-Id: I07ef810a69842e3e37910ae3a427eaf432e1c00e
1 file changed
README.md

How to use this

This repository contains CzechLight-specific bits for Buildroot. Buildroot is a tool which produces system images for flashing to embedded devices. It has good documentation which explains everything that one might need.

The system architecture is described in another document. This is a quick build HOWTO.

Quick Start

Everything is in Gerrit. One should not need to clone anything from anywhere else. The build will download source tarballs of various open source components, though.

By default, each change to this repo uploaded to Gerrit causes the CI system to produce a firmware update. On Gerrit, the change will get a comment from Zuul with a link to the CI log server. Next to the logs, there is a file named artifacts/update.raucb which can be used for updating devices.

Behind the scenes, the system uses Zuul with a configuration tracked in git.

Developer Workflow

Here's how to reproduce the build on a developer's workstation:

git clone ssh://$YOUR_LOGIN@cesnet.cz@gerrit.cesnet.cz:29418/CzechLight/br2-external czechlight
pushd czechlight
git submodule update --init --recursive
popd
mkdir build-clearfog
cd build-clearfog
../czechlight/dev-setup-git.sh
make czechlight_clearfog_defconfig
make -j8

A full rebuild takes about half an hour on a modern laptop.

WARNING: Buildroot is fragile. It is not safe to perform incremental builds after changing an "important" setting. Please check their manual for details.

Installing

Updates via RAUC

Apart from the traditional way of re-flashing the SD card or the eMMC from scratch, it's also possible to update via RAUC. This method preserves the U-Boot version and U-Boot's environment. Everything else, starting with the kernel and the DTB file and including the root FS, is updated. Configuration stored in /cfg is brought along and preserved as well.

To install an update:

# build node
make
rsync -avP images/update.raucb somewhere.example.org:path/to/web/root

# target, perhaps via a USB console or over SSH
rauc install http://somewhere.example.org/update.raucb
reboot

Once the updated FW slot boots, the configuration in /cfg will be automatically upgraded ("migrated") to the newest layout. A downgrade to an incompatible OS version might therefore fail during the next reboot. Completely removing all data in the newly updated slot's cfg partition will restore functionality, but it is effectively a factory reset.

Initial installation

Clearfog

On a regular Clearfog Base with an eMMC, one has to bootstrap the device first. If recovering a totally bricked board (or one that is fresh from the factory), upload an initial, sufficiently recent U-Boot via the serial console. Ensure that the jumpers are set to 0 1 0 0 1 (the default for eMMC boot is 0 0 1 1 1), and then use U-Boot's kwboot tool:

./host/bin/kwboot -b ./u-boot-spl.kwb -t -p /dev/ttyUSB0

Prepare a USB flash disk with a raw bootable image, images/usb-flash.img. Use a tool such as dd to overwrite the raw block device; do not copy the image file onto a filesystem. Once in U-Boot, plug in the USB flash disk and execute:

usb start; fatload usb 0:1 00800000 boot.scr; source 00800000

The system will boot and flash the eMMC from the USB drive. Once the status LED starts blinking in yellow, data are being transferred to the eMMC. The light changes to solid yellow in later phases of the flashing process. Once everything is done, the status LED shows a solid white light and the system reboots automatically.
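
For the earlier step of preparing the USB flash disk, a minimal sketch of writing images/usb-flash.img with dd might look as follows. The helper name write_image and the /dev/sdX placeholder are hypothetical, the bs=4M block size is an arbitrary choice, and conv=fsync assumes GNU dd. The block-device check is only a guard against typos; it cannot tell the USB disk apart from a system disk, so double-check the device name.

```shell
#!/bin/sh
# Hypothetical helper: write a raw image to a block device with dd.
# Usage: write_image images/usb-flash.img /dev/sdX
write_image() {
    img=$1
    dev=$2
    if [ ! -f "$img" ]; then
        echo "image not found: $img" >&2
        return 1
    fi
    # Refuse anything that is not a block device, e.g. a directory
    # or a regular file created by a typo.
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    # Overwrites the whole device; conv=fsync flushes data before dd exits.
    dd if="$img" of="$dev" bs=4M conv=fsync
}
```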

Turn off power, remove the USB flash disk, re-jumper the board (0 0 1 1 1), power-cycle, and configure MAC addresses at the U-Boot prompt. The MAC addresses are found on the label on the front panel.

=> setenv eth1addr 00:11:17:01:XX:XX
=> setenv eth2addr 00:11:17:01:XX:YY
=> setenv eth3addr 00:11:17:01:XX:ZZ

Also set up the system type:

Model                    czechlight variable value
ROADM Line Degree        sdn-roadm-line-g2
WSS Add/Drop             sdn-roadm-add-drop-g2
Hi-resolution Add/Drop   sdn-roadm-hires-add-drop-g2
Coherent Add/Drop        sdn-roadm-coherent-a-d-g2
Inline EDFA Amplifier    sdn-inline-g2

Some prototypes have deprecated PCBs (blue). On these, skip the -g2 suffix. All red PCBs are -g2.

=> setenv czechlight sdn-roadm-line-g2
=> saveenv
Saving Environment to MMC... Writing to redundant MMC(0)... OK
=> boot
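
The mapping from model and PCB colour to the czechlight variable can also be written down as a small lookup, for example when preparing a provisioning checklist. This is only a sketch: the function name czechlight_value is hypothetical, and it assumes that every model follows the same "blue PCB means no -g2 suffix" rule, which the text above states only in general terms.

```shell
#!/bin/sh
# czechlight_value MODEL [PCB]: print the value for the "czechlight" U-Boot variable.
# PCB is "red" (default, -g2 hardware) or "blue" (deprecated prototypes, no suffix).
czechlight_value() {
    case $1 in
        "ROADM Line Degree")       base=sdn-roadm-line ;;
        "WSS Add/Drop")            base=sdn-roadm-add-drop ;;
        "Hi-resolution Add/Drop")  base=sdn-roadm-hires-add-drop ;;
        "Coherent Add/Drop")       base=sdn-roadm-coherent-a-d ;;
        "Inline EDFA Amplifier")   base=sdn-inline ;;
        *) echo "unknown model: $1" >&2; return 1 ;;
    esac
    case ${2:-red} in
        red)  echo "${base}-g2" ;;
        blue) echo "$base" ;;
        *) echo "unknown PCB colour: $2" >&2; return 1 ;;
    esac
}
```

For example, czechlight_value "ROADM Line Degree" prints sdn-roadm-line-g2, which is the value used in the setenv command above.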

Once the system boots (which currently requires a reboot for some unknown reason -- fsck, perhaps?), configure hostname, plug in the network cable, and update SW:

# hostnamectl set-hostname line-XYZSERIALNO
# cp /etc/hostname /cfg/etc/
# rauc install http://somewhere.example.org/update.raucb
# reboot

Beaglebone Black

Obtain a reasonable Linux distro image for the BBB and flash it to a µSD card. Boot from it and unlock the eMMC boot partitions (echo 0 > /sys/class/block/mmcblk1boot0/force_ro; echo 0 > /sys/class/block/mmcblk1boot1/force_ro). Clean the eMMC data (blkdiscard /dev/mmcblk1). Flash the content of images/emmc.img to the device's /dev/mmcblk1. Flash what fits into /dev/mmcblk1boot0 and /dev/mmcblk1boot1. Fetching the image over HTTP (python3 -m http.server on the build host and wget http://...:8000/emmc.img -O - | dd of=/dev/mmcblk1 conv=sparse on the target) works well.
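
The sequence above can be collected into a script along these lines. It is a sketch only: the names run and flash_bbb_emmc are hypothetical, the image URL is supplied by the caller, and the step for /dev/mmcblk1boot0 and /dev/mmcblk1boot1 is deliberately left out because the text does not say what exactly goes there. By default the script only prints the commands; set DRY_RUN=0 to actually run them on the target.

```shell
#!/bin/sh
# Sketch of the eMMC steps above, meant for the µSD-booted system on the BBB.
# With DRY_RUN=1 (the default here) commands are only printed.
DRY_RUN=${DRY_RUN:-1}

# run CMD: print the command line in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        sh -c "$*"
    fi
}

# flash_bbb_emmc URL: URL points at emmc.img, e.g. served by
# "python3 -m http.server" on the build host.
flash_bbb_emmc() {
    url=$1
    # Unlock the eMMC boot partitions for writing.
    run "echo 0 > /sys/class/block/mmcblk1boot0/force_ro"
    run "echo 0 > /sys/class/block/mmcblk1boot1/force_ro"
    # Discard all existing data on the eMMC.
    run "blkdiscard /dev/mmcblk1"
    # Stream the image straight onto the eMMC.
    run "wget $url -O - | dd of=/dev/mmcblk1 conv=sparse"
}
```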