TODO: the membox page table is currently never cleaned up
Fix: page table entries were not initialized when a page table was created
Feature: use log.c for log output
Similar to a CUDA stream
TODO:
Handle memory alloc/free between consecutive kernels within a task
Improve the finish-report chain: warp - block - kernel - task
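std::endl forces a flush of the cout buffer, so the output appears on screen immediately. Since cout is currently used alongside the printf buffer in log.c, this is what keeps the output in the correct order.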
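A minimal sketch of the ordering concern, assuming cout and the C stdio buffer behind printf are flushed independently (for example after `std::ios::sync_with_stdio(false)`). The helper name `emit_in_order` is hypothetical and is not part of log.c.

```cpp
#include <cstdio>
#include <iostream>

// Hypothetical helper: writes one line through cout and one through printf,
// flushing each buffer right away so the lines appear in program order.
// Returns the number of lines written.
int emit_in_order(int cycle) {
    // std::endl writes '\n' and then flushes the cout buffer.
    std::cout << "[cout]   cycle " << cycle << std::endl;
    // The printf side has its own buffer; flush it explicitly as well.
    std::printf("[printf] cycle %d\n", cycle);
    std::fflush(stdout);
    return 2;
}
```

Replacing `std::endl` with `'\n'` would leave the cout line in its buffer, so it could show up after later printf output.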
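Patch: MemoryBox allows repeated allocation at the same location (detected automatically, leaving the existing state unchanged)
Feature: activation, run, and finish callbacks for host, task, kernel, etc.
Feature: change the command-line argument format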
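The repeated-allocation behavior might look roughly like the sketch below. `ToyMemoryBox` and its methods are hypothetical stand-ins and do not reflect the real MemoryBox interface; the only point illustrated is that a second allocation at the same base address is detected and leaves the existing contents untouched.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for MemoryBox: base address -> backing storage.
class ToyMemoryBox {
    std::map<uint64_t, std::vector<uint8_t>> regions_;

public:
    // Returns true if a new zero-filled region was created, false if a region
    // already existed at `base` and was left as it was.
    bool allocate(uint64_t base, std::size_t size) {
        if (regions_.find(base) != regions_.end()) return false;
        regions_.emplace(base, std::vector<uint8_t>(size, 0));
        return true;
    }

    // Bounds-checked byte access inside a region (throws if out of range).
    uint8_t& at(uint64_t base, std::size_t offset) {
        return regions_.at(base).at(offset);
    }
};
```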
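Previously, whether a whole kernel had finished could only be determined indirectly: by checking that every hardware warp of every SM running the kernel had become idle.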
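One way to report completion directly, instead of polling for idle warps, is a chain of counters in which each level notifies its parent once its last child finishes. This is only a sketch of that idea; `FinishCounter` and `simulate_kernel` are hypothetical names, and the task level is omitted for brevity.

```cpp
#include <functional>
#include <utility>

// Hypothetical counter: fires `on_finish` once, when the last child reports.
struct FinishCounter {
    int remaining;                    // children still running
    std::function<void()> on_finish;  // callback to the parent level

    FinishCounter(int n, std::function<void()> cb)
        : remaining(n), on_finish(std::move(cb)) {}

    void child_done() {
        if (remaining > 0 && --remaining == 0 && on_finish) on_finish();
    }
};

// Toy run: every warp of every block finishes; returns how many times the
// kernel-level callback fired.
int simulate_kernel(int blocks, int warps_per_block) {
    int kernel_finished = 0;
    FinishCounter kernel(blocks, [&] { ++kernel_finished; });
    for (int b = 0; b < blocks; ++b) {
        // warp -> block: the block reports to the kernel after its last warp.
        FinishCounter block(warps_per_block, [&] { kernel.child_done(); });
        for (int w = 0; w < warps_per_block; ++w) block.child_done();
    }
    return kernel_finished;
}
```

A task-level counter could be chained in the same way by having the kernel callback call `child_done()` on it.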
fix scalar lw writeback bug (wrongly used branch mask)
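Add hardware block slots that record which hardware warps a thread block is running on and its barrier state; this implements the barrier function and is compatible with running multiple kernels on a single SM
Fix a bug where an incorrect I_TYPE operator== caused sc_signal not to update its value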
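A block slot of this kind could be sketched as below. The struct and field names are hypothetical, at most 32 hardware warps per SM are assumed, and the case where a warp exits while others wait at a barrier is deliberately not handled.

```cpp
#include <cstdint>

// Hypothetical block slot: one per thread block resident on an SM.
struct BlockSlot {
    bool valid = false;
    uint32_t kernel_id = 0;     // lets slots of different kernels coexist
    uint32_t warp_mask = 0;     // bit i set: hardware warp i runs this block
    uint32_t barrier_mask = 0;  // bit i set: warp i is waiting at the barrier

    void allocate(uint32_t kid, uint32_t warps) {
        valid = true;
        kernel_id = kid;
        warp_mask = warps;
        barrier_mask = 0;
    }

    // Records that `hw_warp` reached the barrier. Returns true when this
    // arrival completes the barrier, which is then reset for reuse.
    bool arrive(unsigned hw_warp) {
        barrier_mask |= (1u << hw_warp) & warp_mask;
        if (barrier_mask != warp_mask) return false;
        barrier_mask = 0;
        return true;
    }

    // Removes a finished warp. Returns true when the whole block is done.
    bool warp_finished(unsigned hw_warp) {
        warp_mask &= ~(1u << hw_warp);
        barrier_mask &= warp_mask;
        if (warp_mask != 0) return false;
        valid = false;
        return true;
    }
};
```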
Move the logic that parses .metadata and .data to generate test cases out of kernel_info_t, so that the mini driver no longer depends on SystemC
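Change how the regext instruction extends the src3 register index: for instructions such as vmadd, where src3 is the same as dst, the s3 field of regext is ignored and the d field is used as the effective s3, which ensures that such instructions always have dst = src3
This aligns with spike's behavior
The compiler's behavior still needs to be verified
Once verified, the instruction definition needs to be completed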
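The selection rule can be sketched as follows. The field widths are assumptions made for illustration (a 5-bit base index from the instruction, with the regext field supplying the upper bits), and `RegExt`, `extend_index`, and `effective_src3` are hypothetical names rather than the simulator's own.

```cpp
#include <cstdint>

// Hypothetical view of the extension fields carried by a regext instruction.
struct RegExt {
    uint8_t s1, s2, s3, d;  // upper index bits for each operand
};

// Assumption: the instruction encodes a 5-bit base index and regext supplies
// the bits above it.
inline unsigned extend_index(unsigned base5, unsigned ext) {
    return (ext << 5) | (base5 & 0x1Fu);
}

// For vmadd-like instructions (src3 is also dst) the s3 field is ignored and
// the d field is used instead, so the effective src3 always equals dst.
inline unsigned effective_src3(unsigned base5, const RegExt& ext,
                               bool src3_is_dst) {
    return extend_index(base5, src3_is_dst ? ext.d : ext.s3);
}
```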
bug fixed: scoreboard judge logic improved for writeback to regfile with mask
(V)FSGNJ* series instructions implemented
update membox; add ramulator2
* free private mem after kernel finished
* add some logging
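* Introduce ramulator2's memory controller & DDR timing model
* Rewrite the LSU to support timing modeling, aligning as closely as possible with the timing of the Chisel RTL implementation
* Introduce spdlog logging in the SM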
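1. LSU wordOffset read-request alignment issue: for example, wordOffset1H='b1100 while the actual data is located at 0011
2. Incorrect illegal-instruction assert in the decode module. These illegal instructions may never be issued afterwards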
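If the OPC outputs out of order, loads/stores to the same address may be reordered; the LSU currently has no replay mechanism for same-address data dependencies
The Chisel implementation also appears to handle this in the Scoreboard (OpColRegX, OpColRegV)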
Use .insn in a .cl source file to inline a custom instruction, as in the following example:
__asm__ __volatile__(".insn r 0x0, 0, 0, zero, %0, %1"
:
: "r"('f'), "vr"(*data));
Here RS1 provides one printf-style format character, and VRS2 provides the data to print
Previously only the SYSTEMC_HOME environment variable could be used
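To add the icache, the pipeline model in front of the ibuf was changed. The current pipeline stages are: PC - FETCH - FETCH2 - Decode&Ibuffer_input
PC is private to each warp
FETCH: warp_sche arbitrates and selects one warp's PC to fetch. It first looks up the L0 icache inside the subcore (a very small, fully associative cache). On a hit, the instruction data is read from the L0 icache starting from the next cycle; on a miss, a request is sent to the L1 icache immediately
FETCH2 handles the results returned by the L0/L1 icache. The L0 icache was already checked in the previous stage, so its result is always a hit. The L1 icache may return a miss, in which case the PC of the corresponding warp is rolled back in the next cycle and the fetch request is reissued later.
Decode is purely combinational, except for the regext handling, which is sequential; its output is registered directly into the ibuf. If the register at the end of the FETCH2 stage indicates an L1 icache miss, or the ibuf input indicates that the ibuf is full, the PC of this warp is rolled back
TODO: contention among requests from multiple subcores to the L1 icache is not modeled yet and should be added later. One option is to add PC rewind at the FETCH stage as well, rolling back directly when the L1 icache rejects this subcore's request; alternatively, a rejection could be treated as an L1 icache miss and rolled back later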
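The L0 lookup and the PC rollback rule described above can be sketched as a toy model. Everything here is an assumption made for illustration: the names `L0ICache` and `next_pc`, the 4-entry size, the round-robin replacement, and the fixed 4-byte instruction width are not taken from the simulator.

```cpp
#include <array>
#include <cstdint>

// Hypothetical L0 icache: tiny and fully associative, so a lookup compares
// the PC against every entry.
struct L0ICache {
    static constexpr int kEntries = 4;  // assumed size
    struct Entry {
        bool valid = false;
        uint32_t pc = 0;
        uint32_t inst = 0;
    };
    std::array<Entry, kEntries> entries{};
    int next_victim = 0;  // assumed round-robin replacement

    bool lookup(uint32_t pc, uint32_t& inst) const {
        for (const auto& e : entries) {
            if (e.valid && e.pc == pc) {
                inst = e.inst;
                return true;
            }
        }
        return false;
    }

    void fill(uint32_t pc, uint32_t inst) {
        Entry& e = entries[next_victim];
        e.valid = true;
        e.pc = pc;
        e.inst = inst;
        next_victim = (next_victim + 1) % kEntries;
    }
};

// PC update seen by a warp after FETCH2/Decode: on an L1 icache miss or a
// full ibuf the PC is rolled back to the fetched address so the fetch is
// reissued later; otherwise it advances by one 4-byte instruction.
inline uint32_t next_pc(uint32_t fetched_pc, bool l1_miss, bool ibuf_full) {
    return (l1_miss || ibuf_full) ? fetched_pc : fetched_pc + 4;
}
```

If rejection by the L1 icache is later modeled as a miss, as the TODO suggests, it could feed the same `l1_miss` input without changing the rollback rule.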
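Batch-open PRs from the development branches of the Ventus-related repositories to their main branches
This time, the OpenCL CTS tests will be run on the current latest commit of the ventus-env main branch
It has been verified that the commit IDs of the sub-repositories currently in ventus-env are the latest commits included in this PR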