Myopic policy for opportunistic access in cognitive radio networks by exploiting primary user feedbacks

The authors consider a cognitive radio network overlaid on a legacy primary network, in which a secondary user is allowed to access the primary channels by overhearing the feedback signals sent over them. Each channel is modelled as a two-state Markovian process. With the aim of maximising the expected accumulated discounted network throughput, the sequential decision-making problem can be cast as a restless multi-armed bandit (RMAB) problem, which is well known to be PSPACE-hard; a natural alternative is therefore to seek a simple myopic policy. This study presents a theoretical analysis of the optimality of the proposed myopic policy for this special RMAB problem, considering four different cases: negatively correlated homogeneous channels, heterogeneous channels, positively correlated heterogeneous channels and negatively correlated heterogeneous channels. More specifically, the authors establish closed-form conditions that guarantee the optimality of the myopic policy in each of the four cases. Combined with the case of positively correlated homogeneous channels, these results constitute a complete paradigm for the optimality of the myopic policy.
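To make the notion of a myopic policy over two-state Markovian channels concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a simplified observation model in which the state of the accessed channel is learned exactly after each slot; the feedback-based observation model of the study is not reproduced here. The names `propagate`, `myopic_choice`, `update_beliefs`, `p11` (probability of staying in the good state) and `p01` (probability of moving from the bad to the good state) are hypothetical and introduced only for illustration. Under this convention a channel is positively correlated when `p11 > p01` and negatively correlated when `p11 < p01`.

```python
from typing import List


def propagate(belief: float, p11: float, p01: float) -> float:
    """One-step Markov prediction of the probability that a channel is good."""
    return belief * p11 + (1.0 - belief) * p01


def myopic_choice(beliefs: List[float]) -> int:
    """Myopic rule: pick the channel with the highest current belief,
    i.e. the one maximising the expected immediate reward."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])


def update_beliefs(
    beliefs: List[float],
    chosen: int,
    observed_good: bool,
    p11: List[float],
    p01: List[float],
) -> List[float]:
    """Advance all beliefs by one slot.

    Assumption (not from the source): the chosen channel's state is
    observed perfectly, so its next belief is p11 or p01. Unobserved
    channels are propagated through their own transition probabilities,
    which allows heterogeneous channels.
    """
    nxt = []
    for i, b in enumerate(beliefs):
        if i == chosen:
            nxt.append(p11[i] if observed_good else p01[i])
        else:
            nxt.append(propagate(b, p11[i], p01[i]))
    return nxt


if __name__ == "__main__":
    beliefs = [0.3, 0.6, 0.5]
    p11 = [0.8, 0.8, 0.8]  # homogeneous, positively correlated example
    p01 = [0.2, 0.2, 0.2]
    k = myopic_choice(beliefs)
    print("access channel", k)
    print(update_beliefs(beliefs, k, True, p11, p01))
```

The policy is called myopic because `myopic_choice` ignores the effect of the current decision on future beliefs. The conditions referred to in the abstract concern when such a greedy rule is nevertheless optimal for the discounted objective.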