tract: Making accurate motion prediction of surrounding agents such as pedestrians and vehicles is a critical task when robots are trying to perform autonomous navigation tasks. Recent research on multi-modal trajectory prediction, including regression and classification approaches, perform very well at short-term prediction. However, when it comes to long-term prediction, most Long Short-Term Memory (LSTM) based models tend to diverge far away from the ground truth. Therefore, in this work, we present a two-stage framework for long-term trajectory prediction, which is named as Long-Term Network (LTN). Our Long-Term Network integrates both the regression and classification approaches. We first generate a set of proposed trajectories with our proposed distribution using a Conditional Variational Autoencoder (CVAE), and then classify them with binary labels, and output the trajectories with the highest score. We demonstrate our Long-Term Network's performance with experiments on two real-world pedestrian datasets: ETH/UCY, Stanford Drone Dataset (SDD), and one challenging real-world driving forecasting dataset: nuScenes. The results show that our method outperforms multiple state-of-the-art approaches in long-term trajectory prediction in terms of accuracy.