Sketch2code: Generating a website from a paper mockup

2019-05-09 10:15:13
Alex Robinson

Abstract

An early stage of developing user-facing applications is creating a wireframe to lay out the interface. Once a wireframe has been created, it is given to a developer to implement in code. Developing boilerplate user interface code is time-consuming work, yet it still requires an experienced developer. In this dissertation we present two approaches that automate this process: one using classical computer vision techniques, and another using a novel application of deep semantic segmentation networks. We release a dataset of websites which can be used to train and evaluate these approaches. Further, we have designed a novel evaluation framework which allows empirical evaluation by creating synthetic sketches. Our evaluation illustrates that our deep learning approach outperforms our classical computer vision approach, and we conclude that deep learning is the most promising direction for future research.

URL

https://arxiv.org/abs/1905.13750

PDF

https://arxiv.org/pdf/1905.13750.pdf

