Abstract: Autonomous drone swarms are an emerging technology with significant applications in mapping, inspection, transportation, and monitoring. To complete a task, each drone must accomplish a sub-goal within the overall task and navigate the environment while avoiding collisions with obstacles and with other agents. In this work, we address optimal coverage of an environment with a drone swarm, where the goal states and their positions are known globally but the obstacles are not. The drones must choose which Points of Interest (PoI) in the environment to visit, and in what order, to ensure fast coverage. We model this task in simulation and use an agent-oriented approach to solve it. We evaluate different policy networks trained with reinforcement learning algorithms on their effectiveness, i.e., the time taken to map the area, and their efficiency, i.e., their computational requirements. We couple task assignment with path planning in a unique way to perform collision avoidance during navigation, and compare a grid-based global planning algorithm, Wavefront, with a gradient-based local planning algorithm, Potential Field. We also evaluate the Potential Field planner with different cost functions, propose a method to adaptively modify the velocity of the drone when using the Huber loss function for collision avoidance, and observe its effect on the drones' trajectories. We demonstrate our experiments in 2D and 3D simulations.
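As an illustration of the grid-based global planner named above, the following is a minimal sketch of a textbook Wavefront planner on a 2D occupancy grid. It is not the implementation evaluated in this work. The function names `wavefront` and `extract_path`, the 4-connected neighbourhood, and the convention that `1` marks an obstacle cell are all assumptions made for the example. The grid stands for whatever occupancy information a drone currently holds, since obstacles are not known in advance.

```python
from collections import deque

# 4-connected neighbourhood (an assumption; 8-connected is also common).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def wavefront(grid, goal):
    """Propagate a breadth-first wave from the goal over free cells.

    grid[r][c] == 1 marks an obstacle, 0 marks free space (assumed convention).
    Returns a grid of step counts to the goal; unreachable cells stay None.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def extract_path(dist, start):
    """Follow strictly decreasing wave values from start down to the goal."""
    if dist[start[0]][start[1]] is None:
        return None  # start is an obstacle or cut off from the goal
    rows, cols = len(dist), len(dist[0])
    path = [start]
    r, c = start
    while dist[r][c] != 0:
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and dist[nr][nc] == dist[r][c] - 1:
                r, c = nr, nc
                path.append((r, c))
                break
    return path


# Hypothetical 3x3 map with a single obstacle in the centre cell.
grid = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
dist = wavefront(grid, goal=(2, 2))
path = extract_path(dist, start=(0, 0))
```

Because the wave is computed once per goal, every free cell can read off a shortest grid path to that PoI, at the cost of recomputing the wave whenever newly observed obstacles change the map. A gradient-based local planner such as Potential Field avoids that recomputation but does not carry the same shortest-path guarantee.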