Different software needs to operate on data of different types and sizes, and often there's a need to work with several of them in the same code.
Being able to access those variables directly and as a whole, irrespective of their size, simplifies programming: you don't need to glue together, say, four 8-bit bytes to form a 32-bit value, or extract individual 8-bit values from a 32-bit memory location.
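To make the "gluing" concrete, here is a minimal sketch in C of what you'd have to write by hand if only 8-bit accesses were available. The function names `glue_u32` and `extract_u8` are made up for illustration, and the least-significant-byte-first order is an assumption, not something a particular CPU dictates.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four 8-bit bytes.
   Assumption for this sketch: b[0] is the least significant byte. */
uint32_t glue_u32(const uint8_t b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Extract one 8-bit byte from a 32-bit value.
   index must be 0..3, where 0 selects the least significant byte. */
uint8_t extract_u8(uint32_t value, unsigned index)
{
    return (uint8_t)(value >> (8u * index));
}
```

With native support for 32-bit accesses, all of this collapses into a single load or store of a `uint32_t` variable.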
There exist processors that aren't very flexible in terms of the natively supported data sizes, for example fixed-point digital signal processors. Some can only directly access memory as 16-bit words and 32-bit double words. I think the absence of 8-bit byte addressing isn't a big problem for them: they are expected to do lots of signal processing rather than be versatile general-purpose machines, and signal samples are rarely 8-bit (that's too coarse); most often they're 16-bit.
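If you do need 8-bit values on such a machine, you end up emulating them in software. Below is a rough sketch of that, assuming memory is an array of 16-bit words and that even byte indices map to the low half of a word; both the packing order and the names `load_byte` and `store_byte` are my own choices for illustration. I use `uint8_t` here so the sketch compiles on an ordinary host compiler; a compiler for a DSP like this may not provide that type at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Read the 8-bit value at byte index i from memory that can only be
   accessed as 16-bit words.
   Assumption: even i is the low half of a word, odd i is the high half. */
uint8_t load_byte(const uint16_t *mem, size_t i)
{
    uint16_t word = mem[i / 2];
    return (i % 2) ? (uint8_t)(word >> 8) : (uint8_t)(word & 0xFFu);
}

/* Write an 8-bit value at byte index i: read the whole word, replace
   one half, write the whole word back (a read-modify-write sequence). */
void store_byte(uint16_t *mem, size_t i, uint8_t v)
{
    uint16_t word = mem[i / 2];
    if (i % 2)
        word = (uint16_t)((word & 0x00FFu) | ((uint16_t)v << 8));
    else
        word = (uint16_t)((word & 0xFF00u) | v);
    mem[i / 2] = word;
}
```

Note that every byte store turns into a load, some masking and shifting, and a store, which is the extra work the hardware would otherwise do for you.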
Supporting fewer data sizes and other features in hardware makes the hardware simpler and cheaper (including in terms of energy consumption), which becomes important when we're talking about thousands or millions of devices.
Different problems need different solutions, hence the variety.