Three-dimensional urban reconstruction requires combining data from different sensors, such as cameras, inertial systems, GPS, and laser sensors. This technical report presents a complete system for generating textured volumetric global maps (deep vision). Our acquisition platform is terrestrial and digitizes different urban environments as it moves through them. The report focuses on the three main problems identified in this type of work: (1) the acquisition of three-dimensional data with high precision, (2) the extraction of the texture and its correlation with the 3D data, and (3) the generation of the surfaces that describe the components of the urban environment. It also describes the methods implemented to extrinsically calibrate the acquisition platform, the methods developed to remove radial and tangential image distortion, and the subsequent generation of a panoramic image. Procedures for sampling and smoothing the 3D data are developed. The process for generating textured global maps with negligible uncertainty is then described, and its results are presented. Finally, the surface generation process and the post-processing step that fills certain holes and occlusions in the meshes are reported. Each section presents the results obtained. The methods presented here for geometric and photorealistic reconstruction of urban environments generate high-quality 3D models. The results meet the stated objective: generating global textured models that preserve the geometry of the scanned scenes.