Off-policy linear temporal difference learning algorithms with a generalized oblique projection
Wu Yushuang¹, Chen Xiaoyu¹, Ma Jingwen², Chen Xingguo²,³*
Journal of Nanjing University (Natural Sciences), 2017, Vol. 53, Issue 6: 1052.