It's because of the nearest filtering. Depending on the amount of zoom, some rows or columns of artwork pixels straddle screen-pixel boundaries, so they get drawn one screen pixel wider than their neighbors. As the camera moves, the rounding works out differently on each frame of animation, so different lines are drawn wider from frame to frame.
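To make the rounding effect concrete, here is a small sketch in plain Java (no graphics library) that counts how many screen columns end up sampling each art column under nearest filtering. The class name, the method name and the idea of expressing the camera position as a sub-pixel offset are my own assumptions for illustration, not anything from your code.

```java
public class NearestWidths {
    /**
     * Counts how many screen columns sample each art column with nearest filtering.
     * zoom is screen pixels per art pixel; offsetScreenPx is a hypothetical
     * camera shift measured in screen pixels.
     */
    static int[] columnWidths(int artColumns, double zoom, double offsetScreenPx) {
        int[] widths = new int[artColumns];
        int first = (int) Math.floor(-offsetScreenPx) - 1;
        int last = (int) Math.ceil(artColumns * zoom - offsetScreenPx) + 1;
        for (int x = first; x <= last; x++) {
            // Nearest filtering picks the art pixel under the screen pixel's center.
            double center = x + 0.5 + offsetScreenPx;
            int art = (int) Math.floor(center / zoom);
            if (art >= 0 && art < artColumns) {
                widths[art]++;
            }
        }
        return widths;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(columnWidths(4, 2.5, 0.0)));
        System.out.println(java.util.Arrays.toString(columnWidths(4, 2.5, 0.5)));
        System.out.println(java.util.Arrays.toString(columnWidths(4, 2.0, 0.0)));
    }
}
```

At a zoom of 2.5 the columns alternate between 2 and 3 screen pixels wide, and shifting the camera by half a screen pixel swaps which columns are the wide ones. At an integer zoom of 2.0 every column is the same width, which is why the problem only shows up at fractional zoom levels.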
If you aren't going for a retro low-res aesthetic, you could use linear filtering with mip maps (MipMapLinearLinear or MipMapLinearNearest) and start with higher-resolution art. MipMapLinearLinear looks better if you are smoothly transitioning between zoom levels, at a possible performance cost.
Otherwise, you could round the camera's position to an exact multiple of the size of one screen pixel in world units. Then the same enlarged columns will always correspond to the same art pixels as the camera moves, which would cut down on perceived flickering considerably. You said you were casting the camera translation to an int, but that only works if the art is scaled so that one pixel of art corresponds exactly to one pixel of screen.
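A minimal sketch of that snapping, taking "pixel" to mean a screen pixel. The class, method and parameter names are hypothetical. How you get the size of a screen pixel in world units depends on your setup; with an orthographic camera that fills the window it would typically be the visible world width divided by the window width in pixels, but treat that as an assumption to check against your own camera.

```java
public class PixelSnap {
    /**
     * Rounds a world coordinate to the nearest multiple of one screen pixel.
     * worldUnitsPerScreenPixel is assumed to be computed by the caller,
     * e.g. visible world width / window width in pixels.
     */
    static float snap(float worldCoord, float worldUnitsPerScreenPixel) {
        return Math.round(worldCoord / worldUnitsPerScreenPixel) * worldUnitsPerScreenPixel;
    }

    public static void main(String[] args) {
        float pixel = 0.25f; // hypothetical: 4 screen pixels per world unit
        System.out.println(snap(10.37f, pixel)); // 10.25
        System.out.println(snap(-0.3f, pixel));  // -0.25
    }
}
```

You would apply this to both the x and y of the camera position each frame, after computing where the camera wants to be and before updating it. A plain cast to int is the special case where one world unit is exactly one screen pixel.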
This doesn't fix the other problem: certain lines of pixels are drawn wider, so they appear to have greater visual weight than similar nearby lines, as can be seen in both your screenshots. One way around that would be to do the zoom as a secondary step, so you can control the appearance with an upscaling shader. Draw your scene to a frame buffer that is sized so one pixel of texture corresponds to one world pixel (with no zoom), and also lock your camera to integer locations. Then draw the frame buffer's contents to the screen, and do your zooming at this stage. Use a specialized upscaling shader for drawing the frame buffer texture to the screen to minimize blurriness and avoid nearest filtering artifacts. There are various shaders for this purpose that you can find by searching online. Many have been developed for use with emulators.
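To give a feel for what such an upscaling shader does, here is a CPU-side sketch in plain Java of one idea that some of them use: keep texels sharp, but where a screen pixel straddles the boundary between two art pixels, blend the two in proportion to how much of the screen pixel each one covers. This is not any particular published shader, and all names here are made up for illustration. In a real shader the same arithmetic would run per fragment on the texture coordinate, with linear filtering enabled on the frame buffer texture.

```java
public class SharpUpscale {
    /**
     * Weight that art column `art` contributes to screen column `x`,
     * where zoom is screen pixels per art pixel (assumed >= 1).
     */
    static double weight(int art, int x, double zoom) {
        double texelsPerPixel = 1.0 / zoom;
        double u = (x + 0.5) * texelsPerPixel; // screen pixel center, in art-pixel units
        double seam = Math.floor(u + 0.5);     // nearest boundary between two art pixels
        // Distance from the seam in screen pixels, clamped to half a pixel either side.
        double t = Math.max(-0.5, Math.min(0.5, (u - seam) / texelsPerPixel));
        double adjusted = seam + t;            // coordinate handed to a linear-filtered lookup
        // Emulate linear filtering between texel centers, which sit at k + 0.5.
        double pos = adjusted - 0.5;
        int left = (int) Math.floor(pos);
        double frac = pos - left;
        if (art == left) return 1.0 - frac;
        if (art == left + 1) return frac;
        return 0.0;
    }

    /** Sums the weight of one art column over a row of screen columns. */
    static double totalWeight(int art, int screenColumns, double zoom) {
        double sum = 0.0;
        for (int x = 0; x < screenColumns; x++) {
            sum += weight(art, x, zoom);
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int art = 0; art < 4; art++) {
            System.out.println(art + ": " + totalWeight(art, 10, 2.5));
        }
    }
}
```

At a zoom of 2.5, every art column ends up with a total weight of about 2.5 screen pixels, instead of alternating between 2 and 3 as with nearest filtering, so no line looks heavier than its neighbors. The cost is a single blended screen pixel at some boundaries, which is far less blur than plain linear filtering.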