2025-03-11 Ordinary Meeting Notes
Mar 11, 2025 | RV-LFX Self-hosted Trace (1 hour)
Attendees: Beeman Strong
Notes
Attendees: Beeman, Bruce, Artem, Daniel, Andres, Greg, Iain
Intro: Andres - CPU Architect at Google, worked with ARM trace. Worked at ARM and Codasip, contributed to Cheri.
Agenda
Trace page faults
Trace buffer management
Trace page faults
Discussed, via email, the option of not supporting trace page faults
Downsides are memory utilization costs and virtualization challenges
Propose new local interrupt (LAMBI) when trace page fault (TPF) occurs
Simplifies pending the interrupt for the appropriate privilege mode
Base+wrptr provides addr of page to service
Considered updating xtval instead, but that risks race conditions with near-simultaneous memory faults
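The base+wrptr point above can be illustrated with a small sketch. It assumes wrptr is a byte offset from the buffer base and that pages are 4 KiB; neither detail is stated in these notes, and the function name is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages (not specified in the notes)


def tpf_page_addr(base: int, wrptr: int, page_size: int = PAGE_SIZE) -> int:
    """Return the address of the page that the handler needs to service.

    Assumes `wrptr` is a byte offset from `base` (hypothetical layout),
    so the current write address is base + wrptr, rounded down to a
    page boundary.
    """
    write_addr = base + wrptr
    return write_addr & ~(page_size - 1)


if __name__ == "__main__":
    # Example values are made up purely for illustration.
    print(hex(tpf_page_addr(0x8000_1000, 0x2345)))
```

Because the handler derives the address from trace state rather than from xtval, a near-simultaneous memory fault cannot overwrite it.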
An implementation may want to walk trace pages early, so that the translation is available by the time trace writes to the page
Avoids dropping trace that cannot be written out while waiting for the walk to complete
Recommended, but not required
If the early walk hits a page fault, can add an indication that the TPF is for the next page rather than the current page
Do we have data on how long it takes to service a page walk, on average? How many pages in advance must be walked to avoid trace loss?
AI (Beeman): find data on this
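Once latency and bandwidth data are available, the "how many pages in advance" question reduces to simple arithmetic. The sketch below is only a back-of-envelope model: it assumes a constant trace output rate and a fixed service latency, and the function name and all inputs are hypothetical placeholders, not measurements.

```python
import math


def pages_to_prewalk(service_latency_s: float,
                     trace_bw_bytes_per_s: float,
                     page_size: int = 4096) -> int:
    """Estimate how many pages ahead a translation must be requested.

    Model (an assumption, not from the notes): trace is produced at a
    constant rate, so the bytes written while one walk or fault is being
    serviced is latency * bandwidth. The lookahead must cover at least
    that many bytes, rounded up to whole pages, and never less than one.
    """
    bytes_in_flight = service_latency_s * trace_bw_bytes_per_s
    return max(1, math.ceil(bytes_in_flight / page_size))


if __name__ == "__main__":
    # Placeholder inputs only; real values are the subject of the action item.
    print(pages_to_prewalk(service_latency_s=10e-6, trace_bw_bytes_per_s=1e9))
```

Real workloads are bursty, so a figure from this model is a lower bound at best.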
Does an implementation without H need to include all this?
Even without H, can have regular kernel page faults
Don’t want to include optionality such that some implementations require kernel to pin pages and others don’t
Disagree
SiFive just has simple physical buffer
Self-hosted trace (SHT) is not required, can continue to use existing trace extensions as-is
SHT just layers a new interface and output buffer on top of the existing trace extension
But may want to include the CSR interface
Perhaps Beeman and Bruce can talk offline to understand SiFive usage, to see if there’s a good solution for everyone
May have >1 instance of output buffer, for sampling, so will need indication of which buffer incurred TPF
Does RV trace support stalling hart?
Yes, includes lossless stall
Implementations need to be able to take TPF during lossless stall, otherwise can hang
Can trace include indication of lossless stall?
For use with perf analysis
If packets lost, will get timestamp on resume
No standardized cycle-accurate trace yet, but existing implementations don’t have stall cause info
Need perf counters for that
SiFive has an event to count trace-induced stall cycles, Events TG has one planned as well
When cycle-accurate is standardized, could consider this
Because of need to load translations, trace may not start immediately following enable
Propose a status bit to indicate current translation loaded, so SW can poll and ensure following code is traced
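The proposed poll-after-enable flow might look like the following from software's point of view. This is a simulation only: the register layout, bit positions, and the FakeTraceCtl class are all hypothetical stand-ins, since no encoding was defined in the meeting.

```python
ENABLE_BIT = 1 << 0       # hypothetical: trace enable
XLATE_READY_BIT = 1 << 1  # hypothetical: "current translation loaded" status


class FakeTraceCtl:
    """Toy model of a trace control register (not a real interface).

    After enable is written, the ready bit stays clear for `walk_polls`
    reads, standing in for the time taken to load the translation.
    """

    def __init__(self, walk_polls: int):
        self._ctl = 0
        self._polls_left = walk_polls

    def write(self, value: int) -> None:
        self._ctl = value

    def read(self) -> int:
        if self._ctl & ENABLE_BIT:
            if self._polls_left > 0:
                self._polls_left -= 1
            else:
                return self._ctl | XLATE_READY_BIT
        return self._ctl


def enable_trace_and_wait(ctl: FakeTraceCtl, max_polls: int = 1000) -> bool:
    """Enable trace, then poll until the translation is reported loaded.

    Returns True once the status bit is seen, or False if it is still
    clear after max_polls reads (the caller decides how to handle that).
    """
    ctl.write(ENABLE_BIT)
    for _ in range(max_polls):
        if ctl.read() & XLATE_READY_BIT:
            return True
    return False


if __name__ == "__main__":
    if enable_trace_and_wait(FakeTraceCtl(walk_polls=3)):
        print("translation loaded; following code will be traced")
```

Bounding the poll loop is a design choice in this sketch, so that software does not spin forever if the translation load itself faults.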
Action items
Beeman Strong - Mar 11, 2025 - Collect data on PF handling latency and trace BW
RISC-V International