Off-policy linear temporal difference learning algorithms with a generalized oblique projection