I'll do my best--this is hairy without LING terminology.
in an imaginary language with 5 consonants, say, you could have two different types of 't' and 3 different types of 'p', or, on the other end of the spectrum, each of the 5 in totally different places (e.g. one is bilabial, one is dental, one is velar, one is uvular and the last uses a totally different airstream mechanism--non-pulmonic, such as a click). while the first situation describes a consonant set which is the most natural for our mouths to produce, it's actually harder on the brain to tell those consonants apart; a set whose consonants are very different from each other is easier. does that make sense? distinguishing 'pa' from 'ba' (which is, by the way, really hard for a lot of language groups) is more difficult than distinguishing, say, 'ta' from '!a'.
there are different axes of ease and difficulty in the human linguistic process. something which is intuitive and simple on one level creates problems on another level, and vice versa.
Uselessness, that does actually sound kind of ethnocentric to me. I bet the Hawaiians think we went out of our way to have too many hard-to-pronounce sounds. I would be interested to find out if the languages with "clicks" are closer to each other than to all other languages (evolutionarily speaking, a clade), or if clicks have arisen independently multiple times.
the human mouth is also naturally inclined toward contrastivity, though--it's necessary to reach outside the 'unmarked' places of articulation to make as many clear distinctions between sounds as possible, if that makes sense.
I hope this isn't too ethnocentric of me, but I have to wonder how such a counterintuitive language ever developed. The human mouth and vocal cords make certain sounds naturally, and it seems like you'd really have to go out of your way to communicate like this instead.