Mahdikhani, Maryam. Essays on retail operations and the recent pandemic (COVID-19): using mathematical and text-mining approaches. Retrieved from https://doi.org/doi:10.7282/t3-66as-ts56
Description: This dissertation consists of three essays. The first essay examines auction design, and the last two essays apply sentiment analysis methodologies to big data.
The first paper of my dissertation examines auction design with negative externalities and its impact on optimal mechanism design. In light of previous studies, our research shows that auctioning a good may affect the seller's payoff as well as the bidders who lose the object. We simplify the potential mechanism by depriving buyers of their right to absolute non-participation. Our characterizations are thus tailored towards understanding the bidders' type space and the information structure of single-object auctions with negative externalities.
The second paper of my dissertation aims to predict helpful reviews of Amazon Fashion products and to identify the most frequent terms in such reviews. We draw features from topics extracted with the latent Dirichlet allocation (LDA) model, and from topics combined with bigrams weighted by a TF-IDF vectorizer. We then use these features to enhance the performance of a support vector machine (SVM) classifier that predicts the helpfulness of reviews. The research is performed on a large corpus of Amazon Fashion reviews. We find that reviews get more votes when they are more specific about product quality and the return experience.
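As a rough illustration of what identifying the most frequent terms in helpful reviews can look like, the sketch below counts bigrams over reviews flagged as helpful. The reviews, the boolean helpful flag, and the names `bigrams` and `helpful_counts` are made up for this example and are not taken from the dissertation; the actual study may tokenize and rank terms differently.

```python
import re
from collections import Counter

# Made-up reviews with a hypothetical "helpful" flag; not data from the dissertation.
reviews = [
    ("The fabric quality is great and the return was easy", True),
    ("Poor fabric quality, but the return was easy", True),
    ("Nice", False),
    ("Love it", False),
]


def bigrams(text):
    """Lowercase the text, keep alphabetic tokens, and pair adjacent tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return list(zip(tokens, tokens[1:]))


# Count bigrams only over the reviews flagged as helpful.
helpful_counts = Counter(
    gram for text, helpful in reviews if helpful for gram in bigrams(text)
)

# The three most frequent bigrams among helpful reviews.
top_bigrams = helpful_counts.most_common(3)
print(top_bigrams)
```

On this toy input the bigrams about fabric quality and the return process occur in both helpful reviews, which mirrors the kind of pattern the essay reports.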
The third essay of my dissertation is motivated by tweets on COVID-19 and retweeting behavior. Our research objective is to predict a tweet's popularity based on the volume of retweets, regardless of how many followers the user has. We examine feature selection using (i) topics from LDA, (ii) N-grams from a TF-IDF vectorizer, and (iii) topics plus bigrams from a TF-IDF vectorizer. We feed the extracted features to Random Forest (RF), SVM, and Logistic Regression (LR) classifiers. We find that RF achieves the highest accuracy in predicting the volume of retweets, particularly with the topics plus bigram TF-IDF features.
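The "topics plus bigrams" feature set and the three classifiers can be sketched as follows. This is a minimal sketch assuming scikit-learn; the toy texts, the binary labels, the number of topics, and every hyperparameter are placeholders chosen for illustration, and `LinearSVC` stands in for the SVM classifier. None of these choices are confirmed by the dissertation.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline, make_pipeline
from sklearn.svm import LinearSVC

# Made-up texts and labels (1 = widely retweeted); not data from the dissertation.
texts = [
    "new covid vaccine trial results announced today",
    "covid vaccine trial shows strong results",
    "stay home and wash your hands often",
    "please stay home and wear a mask",
    "covid vaccine trial results shared widely",
    "wash your hands and stay home today",
]
labels = [1, 1, 0, 0, 1, 0]


def build_features(n_topics=2):
    """Concatenate LDA topic proportions with bigram TF-IDF weights."""
    topics = make_pipeline(
        CountVectorizer(),  # LDA expects raw term counts
        LatentDirichletAllocation(n_components=n_topics, random_state=0),
    )
    bigram_tfidf = TfidfVectorizer(ngram_range=(2, 2))  # bigrams only
    return FeatureUnion([("topics", topics), ("bigrams", bigram_tfidf)])


# Assumed settings; the dissertation's hyperparameters are not stated here.
classifiers = {
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "SVM": LinearSVC(),
    "LR": LogisticRegression(max_iter=1000),
}

# Fit one pipeline per classifier on the same combined feature set.
fitted = {}
for name, clf in classifiers.items():
    model = Pipeline([("features", build_features()), ("clf", clf)])
    fitted[name] = model.fit(texts, labels)

# Inspect the combined feature matrix: topic columns followed by bigram columns.
feature_matrix = build_features().fit_transform(texts)
print(feature_matrix.shape)
```

Comparing the classifiers would additionally require a held-out split or cross-validation, for example with `sklearn.model_selection.cross_val_score`, which is omitted here because the toy corpus is too small to give meaningful accuracy figures.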