Constructing Dynamic Treatment Regimes for a Binary Outcome with Partial SMART Data
A dynamic treatment regime is a sequence of decision rules that specify how the dosage and/or type of treatment should be adjusted over time in response to an individual's changing needs. Q-learning is often used on data from sequential multiple assignment randomized trials (SMARTs) to develop the optimal treatment regime. It is an iterative two-step procedure: first, regression is used to model the conditional mean outcome at each stage; second, the estimated regime is derived by maximizing the estimated conditional mean functions. We propose to generalize Q-learning to the case of a binary outcome with data from a partial SMART study, in which only a proportion of patients completed the full course of the trial. The method is illustrated using data from a web-based smoking cessation study.
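To make the two-step Q-learning procedure concrete, below is a minimal, hypothetical sketch for a two-stage trial with a binary outcome. It is not the authors' proposed method (which handles partial SMART data); it only illustrates standard Q-learning by backward induction. All variable names, the simulated data, and the model choices (logistic regression for the stage-2 Q-function, linear regression for the stage-1 pseudo-outcome) are illustrative assumptions.

```python
# Illustrative two-stage Q-learning sketch with a binary outcome.
# Simulated data; covariates, coefficients, and models are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(0)
n = 500
# Stage 1: baseline covariate X1 and randomized treatment A1 in {0, 1}
X1 = rng.normal(size=n)
A1 = rng.integers(0, 2, size=n)
# Stage 2: intermediate covariate X2 and randomized treatment A2
X2 = 0.5 * X1 + rng.normal(size=n)
A2 = rng.integers(0, 2, size=n)
# Binary end-of-study outcome Y (e.g., abstinence indicator)
logit = -0.5 + X1 + A1 * (1 - X1) + A2 * X2
Y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Step 1 (stage 2): regress Y on the stage-2 history and treatment
H2 = np.column_stack([X1, A1, X2, A2, A2 * X2])
q2 = LogisticRegression().fit(H2, Y)

def q2_prob(a2):
    # Estimated success probability if everyone received A2 = a2
    H = np.column_stack([X1, A1, X2, np.full(n, a2), a2 * X2])
    return q2.predict_proba(H)[:, 1]

# Step 2 (stage 2): the estimated rule picks the treatment that
# maximizes the estimated conditional success probability
p0, p1 = q2_prob(0), q2_prob(1)
a2_opt = (p1 > p0).astype(int)
v2 = np.maximum(p0, p1)  # pseudo-outcome carried back to stage 1

# Step 1 (stage 1): regress the pseudo-outcome on the stage-1 history;
# v2 lies in [0, 1], so a linear model is used here for simplicity
H1 = np.column_stack([X1, A1, A1 * X1])
q1 = LinearRegression().fit(H1, v2)

def q1_pred(a1):
    H = np.column_stack([X1, np.full(n, a1), a1 * X1])
    return q1.predict(H)

# Step 2 (stage 1): maximize the estimated stage-1 Q-function
a1_opt = (q1_pred(1) > q1_pred(0)).astype(int)
```

Together, the fitted rules `a1_opt` and `a2_opt` form the estimated regime: each patient's recommended treatment at every stage, as a function of their history up to that stage.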