Abstract
Numerous industries have benefited from machine learning, and the fashion industry is no exception. By gaining a better understanding of what makes a good outfit, companies can provide useful product recommendations to their users. In this project, we follow two existing approaches that represent outfits as graphs and use modified versions of the graph neural network (GNN) framework. Both the Node-wise Graph Neural Network (NGNN) and the Hypergraph Neural Network (HGNN) aim to score a set of items according to how compatible the items are as an outfit. The data used is the Polyvore dataset, which consists of curated outfits with a product image and text description for each product in an outfit. We recreate the analysis on a subset of this data and compare the two existing models on two tasks: fill in the blank (FITB), which means finding an item that completes an outfit, and compatibility prediction, which means estimating the compatibility of different items grouped as an outfit. We replicate the results directionally and find that HGNN performs slightly better on both tasks. In addition to replicating the results of the two papers, we also tried embeddings generated from a vision transformer and observed improved prediction accuracy across the board.
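To make the FITB task concrete, here is a minimal sketch of how FITB accuracy could be computed once some compatibility scorer is available. Everything named here is an assumption for illustration: `toy_compatibility` (mean pairwise cosine similarity of item embeddings) is only a stand-in and is not the NGNN or HGNN scorer, and the function names and data layout are hypothetical rather than taken from the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Outfit = List[Vector]
# One FITB question: (partial outfit, candidate items, index of the true item).
Question = Tuple[Outfit, List[Vector], int]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_compatibility(outfit: Outfit) -> float:
    """Stand-in scorer (assumption): mean pairwise cosine similarity.

    A trained model such as NGNN or HGNN would replace this function.
    """
    pairs = [(i, j) for i in range(len(outfit)) for j in range(i + 1, len(outfit))]
    if not pairs:
        return 0.0
    return sum(cosine(outfit[i], outfit[j]) for i, j in pairs) / len(pairs)


def fitb_choice(
    partial: Outfit,
    candidates: List[Vector],
    score: Callable[[Outfit], float] = toy_compatibility,
) -> int:
    """Return the index of the candidate that yields the highest outfit score."""
    scores = [score(list(partial) + [candidate]) for candidate in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def fitb_accuracy(
    questions: List[Question],
    score: Callable[[Outfit], float] = toy_compatibility,
) -> float:
    """Fraction of questions where the top-scoring candidate is the true item."""
    correct = sum(
        fitb_choice(partial, candidates, score) == answer
        for partial, candidates, answer in questions
    )
    return correct / len(questions)
```

Compatibility prediction would use the same kind of scorer directly: each outfit receives a single score, which is then compared against a label marking the outfit as compatible or not.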
URL
https://arxiv.org/abs/2404.18040