Item Details

Non-Stationary Contextual Multi-Armed Bandit With Application in Online Recommendations

Hassan, Muhammad
Format
Thesis/Dissertation; Online
Author
Hassan, Muhammad
Advisor
Beling, Peter
Abstract
The only constant in online social behavior is ever-changing user intent and preferences. These changes can be driven by a myriad of factors yet still follow an overall trend; for example, today's popular news will be stale tomorrow. Such patterns are especially noticeable in viral trends, where an immediate gain in popularity is followed by a gradual loss of overall interest. In this study, we focus on the design of a recommendation system that accounts for this non-stationary behavior of declining popularity. The contextual multi-armed bandit (contextual MAB) is a popular framework for learning user behavior and making personalized recommendations based on past behavior. Fundamentally, a MAB addresses the exploration-exploitation dilemma: it aims to perform the minimal guided experimentation required to gain a certain level of confidence in its recommendations. Traditionally, contextual MABs (e.g., LinUCB) have been used to model stationary user behavior, an assumption that is not appropriate for the target environment, where LinUCB can accumulate linear regret. Here, we extend LinUCB-type algorithms to model a decaying environment. We present three algorithms with varying levels of specificity in the assumptions they make about the non-stationary environment. We show by simulation the effectiveness of our methods, which illustrates the usefulness of modeling meta-trends in user behavior.
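For readers unfamiliar with LinUCB, the following is a minimal Python sketch of a disjoint LinUCB arm. It is not taken from the thesis and does not reproduce any of its three algorithms. The class name `LinUCBArm`, the function `select_arm`, and the `discount` parameter are hypothetical names introduced for illustration. The `discount` factor is a generic exponential-forgetting device, assumed here only to show one simple way old observations could be down-weighted in a decaying environment; with `discount=1.0` the sketch reduces to standard stationary LinUCB.

```python
import numpy as np


class LinUCBArm:
    """One arm of disjoint LinUCB with an optional forgetting factor (assumption)."""

    def __init__(self, dim, alpha=1.0, discount=1.0):
        self.alpha = alpha            # width of the confidence bonus
        self.discount = discount      # 1.0 = stationary LinUCB; < 1.0 forgets old data
        self.S = np.zeros((dim, dim))  # discounted sum of x x^T
        self.b = np.zeros(dim)         # discounted sum of reward * x
        self.I = np.eye(dim)           # ridge term, kept fixed so A stays invertible

    def ucb(self, x):
        """Upper confidence bound on the expected reward for context x."""
        A_inv = np.linalg.inv(self.I + self.S)
        theta = A_inv @ self.b                      # ridge-regression estimate
        bonus = self.alpha * np.sqrt(x @ A_inv @ x)  # exploration bonus
        return float(theta @ x + bonus)

    def update(self, x, reward):
        """Down-weight past evidence, then add the new observation."""
        self.S = self.discount * self.S + np.outer(x, x)
        self.b = self.discount * self.b + reward * x


def select_arm(arms, x):
    """Pick the index of the arm with the highest upper confidence bound."""
    return int(np.argmax([arm.ucb(x) for arm in arms]))
```

In use, each recommendation round calls `select_arm(arms, x)` with the current context vector, observes the reward for the chosen arm, and passes it to that arm's `update`. Setting `discount` below 1.0 makes older rewards count for less, so an item whose popularity has faded is gradually ranked lower.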
Language
English
Date Received
20150803
Published
University of Virginia, Department of Systems Engineering, MS (Master of Science), 2015
Published Date
2015-07-30
Degree
MS (Master of Science)
Collection
Libra ETD Repository
In Copyright
