Should Dockerfile execute “npm install” and “npm run build” or should it only copy those files over?
TL;DR: It should always execute all necessary build commands in the "build" stage of the multi-stage image!
Long answer:
In the first example "tutorial" Dockerfile that you posted, a multi-stage build is used. With multi-stage builds you can discard artifacts created in earlier stages and keep only the files and changes in the final image that you really need. In this case, the installed "dev" packages are not copied into the final image and therefore consume no space there. The build folder will only contain the code and the node modules required at runtime, without any of the dev dependencies that were needed in the first stage of the build to compile the project.
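For illustration, a minimal multi-stage Dockerfile along these lines could look like the sketch below (the nginx runtime image, the /app paths and the build output folder are assumptions for a typical React-style app, not taken from your post):

    # ---- build stage: node toolchain and dev dependencies live only here ----
    FROM node:14.15.4-alpine AS build-deps
    WORKDIR /app
    COPY package.json package-lock.json ./
    # installs the dev dependencies that are needed only to compile the project
    RUN npm install
    COPY . .
    # emits the production bundle to /app/build
    RUN npm run build

    # ---- final stage: only the compiled artifacts are copied over ----
    FROM nginx:1.21-alpine
    COPY --from=build-deps /app/build /usr/share/nginx/html
    EXPOSE 80
    CMD ["nginx", "-g", "daemon off;"]

Everything installed in the build-deps stage (including node_modules with all its dev dependencies) is discarded; only the contents of /app/build end up in the final image.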
In your second approach, you are running npm install && npm run build outside your Dockerfile and then copying the results into the final image. While this works, it is not a good idea from a DevOps perspective: you want to keep all required build instructions consistently in one place (preferably in one Dockerfile), so the next person building your image does not have to figure out how the compilation process works. Another problem with copying the build results from your local machine is that you may be running a different OS with a different Node version etc., and this can affect the build result. If you instead, as with the "tutorial" Dockerfile, run the build within the Dockerfile, you have full control over the OS and the environment (Node version, node-sass libraries etc.), and everybody executing docker build will get the same compilation results (provided that you pinned the Node version of your Dockerfile base image, e.g. using FROM node:14.15.4-alpine as build-deps instead of merely FROM node:alpine as build-deps).
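By contrast, the Dockerfile of the copy-only approach typically boils down to something like this (again just a sketch, assuming the bundle was produced locally into ./build):

    # final image only; the build/ folder must already exist on the host,
    # produced by a local "npm install && npm run build" whose node version
    # and OS are outside of Docker's control
    FROM nginx:1.21-alpine
    COPY build /usr/share/nginx/html
    EXPOSE 80
    CMD ["nginx", "-g", "daemon off;"]

The image itself says nothing about how build/ was created, which is exactly the reproducibility problem described above.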
One last note on the evolution of Dockerfiles. In the past, performing the compilation outside the Dockerfile (or in a separate Dockerfile) and then copying all the results into your final image was indeed the recommended approach. This matches the second approach mentioned in your OP. But because of all the shortcomings mentioned above, the Docker architects introduced multi-stage builds in 2017. Here are some enlightening quotes from the Docker blog:
Before multi-stage build, Docker users would use a script to compile the applications on the host machine, then use Dockerfiles to build the images. Multi-stage builds, however, facilitate the creation of small and significantly more efficient containers since the final image can be free of any build tools. [And] External scripts are no longer needed to orchestrate a build.
The same idea is reiterated in the official docs:
It was actually very common to have one Dockerfile to use for development (which contained everything needed to build your application), and a slimmed-down one to use for production, which only contained your application and exactly what was needed to run it. This has been referred to as the “builder pattern”. Maintaining two Dockerfiles is not ideal. […] Multi-stage builds vastly simplify this situation! […] You only need the single Dockerfile. You don’t need a separate build script, either. Just run docker build. The end result is the same tiny production image as before, with a significant reduction in complexity. You don’t need to create any intermediate images and you don’t need to extract any artifacts to your local system at all.