Item Details

Print View

Non-Stationary Contextual Multi-Armed Bandit With Application in Online Recommendations

Hassan, Muhammad
Thesis/Dissertation; Online
Hassan, Muhammad
Beling, Peter
The only constant in the online social behavior is forever changing user intent and preferences. These changes could be inspired by myriad of factors but still have an overall trend e.g. todays popular news will be stale tomorrow etc. Such patterns are especially noticeable in viral trends where an immediate gain in popularity is followed by gradual lost of holistic interest. In this study we focus on design of a recommendation system which accounts for this non-stationary behavior of declining popularity. Contextual multi-armed bandit (contextual MAB) is a popular framework for learning user behavior and personalized recommendations based on the past behavior. Fundamentally MAB solves the exploitation-exploration dilemma which aims at minimal guided experimentation required to gain certain level of confidence in its recommendation. Traditionally contextual MABs (e.g. LinUCB) have been used to model stationary user behavior that is not appropriate for the target environment where LinUCB can accumulate linear regret. Here we extend this LinUCB type algorithm to model a decaying environment. We present three algorithms with variable levels of specificity in the assumptions they make about the non-stationary environment. We show by simulation the effectiveness of our methods which illustrates the usefulness of modeling meta-trends in user behavior.
University of Virginia, Department of Systems Engineering, MS, 2015
Published Date
Libra ETD Repository
In CopyrightIn Copyright
▾See more
▴See less


Read Online